Determining Long-Term Change in Tourism Research Language with Text-Mining Methods

MU faculty are at the forefront of both tourism research and new media technology, and a new paper by Josef Mazanec of the Department of Tourism and Service Management combines these two disciplines to investigate long-term change in tourism research language. The paper "Determining Long-Term Change in Tourism Research Language with Text-Mining Methods" was published in the 22nd volume of the journal Tourism Analysis.

Tourism researchers, particularly those in later stages of their careers, may sometimes wonder whether, over the decades, the study of tourism has changed focus and touched upon new issues or has largely been reiterating traditional viewpoints. The frequent impression of encountering old wine in new bottles may be a correlate of age, but there are also means of investigating this matter objectively. In this study I employed two nontrivial methods that have not yet been applied to the question of measuring this kind of change. The underlying hypothesis guiding this text mining study posits (i) that article abstracts contain single word items and latent topics that assist in discriminating between earlier and later publications, and (ii) that quantitative text mining methods are capable of discovering change. Annals of Tourism Research served as a representative source spanning four decades of research activity. This journal aims at portraying all social science perspectives on tourism and has not specialized in fields like economics, marketing, or analytical methods. The overall sample comprised 858 abstracts from ‘old’ journal volumes (from 1975, when abstracts became customary, to 1994) and from ‘new’ volumes (2009 to 2015).

The quantitative text mining methods indeed proved sensitive enough to recognize change in the language of tourism research. The study of tourism has shifted focus during the past four decades, and this shift is reflected in the abstracts of articles published in a journal of particularly long tradition. The two text mining methods detected significant change in language between early and recent article abstracts. The study investigated discriminant word items and latent topic structures. The double approach with two computationally unrelated methods (penSVM, Penalized Support Vector Machines, and LDA, Latent Dirichlet Allocation) analyzed (i) the single word items that differentiated between earlier and later article abstracts, and (ii) the relevance of the latent topics underlying older and newer abstracts.

The first analysis with penSVM explored the predictive power of individual word items (terms) for distinguishing between old and new abstracts. According to the classification matrix in Table 1, the percentage of correctly classified abstracts in the validation sample reached (152 + 149) / 438 = .69, or 69%. Sensitivity practically equals specificity, which means the classifier performs about equally well on old and new abstracts. This may not qualify as an overwhelming predictive success. However, the proportional chance criterion p² + (1 − p)² amounts to a correctly classified share of (218 / 438)² + (220 / 438)² ≈ .50, or 50%. In other words, the surplus of 19 percentage points over a random classification reflects the small but non-negligible predictive value of the information residing in the selected word items.
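As a quick check on the arithmetic, both the hit rate and the proportional chance criterion can be recomputed from the figures reported above; a minimal Python sketch (the variable names are illustrative, not from the study):

```python
# Recompute the classification figures reported in the text:
# validation sample n = 438, with 218 old and 220 new abstracts,
# of which 152 and 149 were classified correctly.
n_old, n_new = 218, 220
correct_old, correct_new = 152, 149
n = n_old + n_new

# Hit rate of the penSVM classifier on the validation sample
hit_rate = (correct_old + correct_new) / n

# Proportional chance criterion p^2 + (1 - p)^2: the accuracy expected
# when classifying at random in proportion to the two class sizes
p = n_old / n
chance = p ** 2 + (1 - p) ** 2

print(round(hit_rate, 2), round(chance, 2))  # 0.69 0.5
```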

Table 1: Classification matrix of a penSVM run with L1 penalty (validation sample, n = 438)

                  Predicted 'old'   Predicted 'new'   Total
Observed 'old'          152                66          218
Observed 'new'           71               149          220

The second approach, using LDA to extract latent topics, started with five topics. Table 2 presents the absolute frequencies of the most likely topics relevant for old and new abstracts. Despite the very coarse resolution of only five latent topics, the frequencies differ significantly (χ² = 30.06, p < .0001). Of course, the very small number of topics causes many terms to appear in more than one topic, though with different frequency. The finding that old and new abstracts are fairly incommensurate in terms of latent topics was strengthened by all further LDA runs estimating models of 20, 30, 40, and 50 topics. The differences in frequency between ‘old’ and ‘new’ were consistently significant; all χ²- and Mann-Whitney tests (the latter applied where one or more cell frequencies were < 5) rejected the no-difference assumption with p < .0001.
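The χ² homogeneity test used to compare topic frequencies can be sketched as follows. The per-topic counts below are invented placeholders (the article reports only the test statistic); only the computation itself is illustrated:

```python
# Chi-squared test of homogeneity for a 2 x k table of topic frequencies.
# The counts are hypothetical stand-ins, not the study's data.
old_counts = [60, 40, 55, 35, 28]   # topic frequencies, old abstracts
new_counts = [30, 65, 40, 50, 35]   # topic frequencies, new abstracts

def chi_squared(a, b):
    """Chi-squared statistic for a 2 x k contingency table given as two rows."""
    total = sum(a) + sum(b)
    stat = 0.0
    for i in range(len(a)):
        col = a[i] + b[i]
        for obs, row_total in ((a[i], sum(a)), (b[i], sum(b))):
            exp = row_total * col / total  # expected count under homogeneity
            stat += (obs - exp) ** 2 / exp
    return stat

stat = chi_squared(old_counts, new_counts)
# With df = (2 - 1)(5 - 1) = 4, the p < .0001 critical value is about 23.5,
# so a statistic like the reported 30.06 clearly rejects the
# no-difference hypothesis.
print(round(stat, 2))
```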

Table 2: Absolute frequencies of the most likely topics (5), by topic nr., for old and new abstracts (χ² = 30.06, p < .0001)

With a growing number of topics, the diversity of word items associated with each topic increased, and more and more uncommon terms appeared in the top frequency ranks. However, interpreting the latent topics, besides lying outside the scope of this study, would have been highly speculative and subjective. The results call for future qualitative analyses to pursue the reasons for, and the substantive content of, this change.