{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T07:47:35Z","timestamp":1776930455515,"version":"3.51.2"},"reference-count":52,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,6,2]],"date-time":"2023-06-02T00:00:00Z","timestamp":1685664000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100007707","name":"University of California, Davis","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100007707","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Res. Metr. Anal."],"abstract":"<jats:p>This study compares three different methods commonly employed for the determination and interpretation of the subject matter of large corpuses of textual data. The methods reviewed are: (1) topic modeling, (2) community or group detection, and (3) cluster analysis of semantic networks. Two different datasets related to health topics were gathered from Twitter posts to compare the methods. The first dataset includes 16,138 original tweets concerning HIV pre-exposure prophylaxis (PrEP) from April 3, 2019 to April 3, 2020. The second dataset is comprised of 12,613 tweets about childhood vaccination from July 1, 2018 to October 15, 2018. Our findings suggest that the separate \u201ctopics\u201d suggested by semantic networks (community detection) and\/or cluster analysis (Ward's method) are more clearly identified than the topic modeling results. Topic modeling produced more subjects, but these tended to overlap. This study offers a better understanding of how results may vary based on method to determine subject matter chosen.<\/jats:p>","DOI":"10.3389\/frma.2023.1104691","type":"journal-article","created":{"date-parts":[[2023,6,2]],"date-time":"2023-06-02T13:43:24Z","timestamp":1685713404000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["A comparison of three methods to determine the subject matter in textual data"],"prefix":"10.3389","volume":"8","author":[{"given":"George A.","family":"Barnett","sequence":"first","affiliation":[]},{"given":"Christopher","family":"Calabrese","sequence":"additional","affiliation":[]},{"given":"Jeanette B.","family":"Ruiz","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2023,6,2]]},"reference":[{"key":"B1","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4614-3223-4","volume-title":"Mining Text Data","author":"Aggarwal","year":"2012"},{"key":"B2","volume-title":"The Use of the Internet for Health Information and Social","author":"Barnett","year":"2006"},{"key":"B3","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1515\/9781501500060-005","article-title":"5. Issues in intercultural communication: a semantic network analysis","volume":"9","author":"Barnett","year":"2017","journal-title":"Interc. Commun."},{"key":"B4","doi-asserted-by":"publisher","first-page":"721","DOI":"10.1007\/s13278-013-0117-9","article-title":"An examination of the relationship between international telecommunication networks, terrorism and global news coverage","volume":"3","author":"Barnett","year":"2013","journal-title":"Social Netw. Anal. Mining"},{"key":"B5","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1609\/icwsm.v3i1.13937","article-title":"Gephi: an open source software for exploring and manipulating networks","volume":"10","author":"Bastian","year":"2009","journal-title":"Proc. Conf. Web Soc. Media"},{"key":"B6","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1145\/2133806.2133826","article-title":"Probabilistic topic models","volume":"55","author":"Blei","year":"2012","journal-title":"Commun. ACM"},{"key":"B7","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Mac. Learning Res."},{"key":"B8","doi-asserted-by":"publisher","first-page":"10008","DOI":"10.1088\/1742-5468\/2008\/10\/P10008","article-title":"Fast unfolding of communities in large networks","volume":"2008","author":"Blondel","year":"2008","journal-title":"J. Stat. Mech. Exp."},{"key":"B9","doi-asserted-by":"publisher","first-page":"222","DOI":"10.1177\/1075547018824709","article-title":"Online representations of \u201cgenome editing\u201d uncover opportunities for encouraging engagement: a semantic network analysis","volume":"41","author":"Calabrese","year":"2019","journal-title":"Sci. Commun."},{"key":"B10","doi-asserted-by":"publisher","first-page":"954","DOI":"10.1080\/17524032.2019.1699135","article-title":"The uproar over gene-edited babies: a semantic network analysis of CRISPR on Twitter","volume":"14","author":"Calabrese","year":"2020","journal-title":"Environ. Commun."},{"key":"B11","doi-asserted-by":"publisher","first-page":"65","DOI":"10.22720\/hnmr.2022.6.1.065","article-title":"Perceptions of PrEP on Twitter: a theoretically guided content analysis on the behavioral determinants of PrEP uptake","volume":"6","author":"Calabrese","year":"2022","journal-title":"Health New Media Res."},{"key":"B12","volume-title":"Automap User's Guide 2013","author":"Carley","year":"2013"},{"key":"B13","first-page":"198","article-title":"Network analysis of message content","volume":"12","author":"Danowski","year":"1993","journal-title":"Prog. Commun. Sci."},{"key":"B14","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1177\/009365085012002005","article-title":"Crisis effects on intraorganizational computer-based communication","volume":"12","author":"Danowski","year":"1985","journal-title":"Commun. Res."},{"key":"B15","doi-asserted-by":"publisher","first-page":"72","DOI":"10.4324\/9781003120100-4","article-title":"Cable news channels' partisan ideology and market share growth as predictors of social distancing sentiment during the COVID-19 pandemic","volume":"17","author":"Danowski","year":"2021","journal-title":"Semantic Netw. Anal. Soc. Sci"},{"key":"B16","volume-title":"ConText: Software for the Integrated Analysis of Text Data and Network Data","author":"Diesner","year":"2014"},{"key":"B17","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1111\/j.1468-2958.1999.tb00463.x","article-title":"A semantic network analysis of the international communication association","volume":"25","author":"Doerfel","year":"1999","journal-title":"Hum. Commun. Res."},{"key":"B18","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1002\/asi.20950","article-title":"Semantic networks and competition: election year winners and losers in U.S. televised presidential debates, 1960\u20132004","volume":"60","author":"Doerfel","year":"2009","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"B19","doi-asserted-by":"publisher","first-page":"100105","DOI":"10.1016\/j.osnem.2020.100105","article-title":"Exploring childhood anti-vaccine and pro-vaccine communities on twitter \u2013 a perspective from influential users","volume":"20","author":"Featherstone","year":"2020","journal-title":"Online Social Netw. Media"},{"key":"B20","unstructured":"FeinererI.\n          Introduction to the tm Package Text Mining in R2013"},{"key":"B21","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1080\/08824090409359985","article-title":"The use of semantic network analysis to manage customer complaints","volume":"21","author":"Fitzgerald","year":"2004","journal-title":"Commun. Res. Rep."},{"key":"B22","doi-asserted-by":"publisher","first-page":"7821","DOI":"10.1073\/pnas.122653799","article-title":"Community structure in social and biological networks","volume":"99","author":"Girvan","year":"2002","journal-title":"Proc. Nat. Acad. Sci."},{"key":"B23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v040.i13","article-title":"Topicmodels: An R package for fitting topic models","volume":"40","author":"Gr\u00fcn","year":"2011","journal-title":"J. Stat. Software"},{"key":"B24","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1016\/j.neucom.2021.01.059","article-title":"'A multi-level clustering technique for community detection'","volume":"441","author":"Inuwa-Dutse","year":"2021","journal-title":"Neurocomputing"},{"key":"B25","doi-asserted-by":"publisher","first-page":"e98679","DOI":"10.1371\/journal.pone.0098679","article-title":"ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the gephi software","volume":"9","author":"Jacomy","year":"2014","journal-title":"PLoS ONE"},{"key":"B26","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1177\/075910639404400104","article-title":"Cultural differences in organizational communication: a semantic network analysis 1","volume":"44","author":"Jang","year":"1994","journal-title":"Bullet. Sociol. Methodol."},{"key":"B27","doi-asserted-by":"publisher","first-page":"e0267406","DOI":"10.1371\/journal.pone.0267406","article-title":"Comparison of public discussions of gene editing on social media between the United States and China","volume":"17","author":"Ji","year":"2022","journal-title":"PLoS ONE"},{"key":"B28","doi-asserted-by":"publisher","first-page":"1700082","DOI":"10.1002\/gch2.201700082","article-title":"Semantic network analysis reveals opposing online representations of the search term \u201cGMO\u201d","volume":"2","author":"Jiang","year":"2018","journal-title":"Global Challenges"},{"key":"B29","first-page":"31","article-title":"\u201cThe structure of the International Communication Association-2016: A network analysis,\u201d","author":"Jiang","year":"2018","journal-title":"Interventions: Communication Theory and Practice, International Communication Association, Annual Conference Theme Book Series, Vol. 5"},{"key":"B30","first-page":"3710","article-title":"News framing in an international context: A semantic network analysis","volume":"10","author":"Jiang","year":"2016","journal-title":"Int. J. Commun."},{"key":"B31","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1080\/17513050902759488","article-title":"Assessing cultural differences in translations: a semantic network analysis of the universal declaration of human rights","volume":"2","author":"Kwon","year":"2009","journal-title":"J. Int. Inter. Commun."},{"key":"B32","unstructured":"MabeyB.\n          pyLDAvis: Python library for interactive topic model visualization. Port of the R LDAvis package2018"},{"key":"B33","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1080\/19312458.2018.1430754","article-title":"Applying LDA topic modeling in communication research: Toward a valid and reliable methodology","volume":"12","author":"Maier","year":"2018","journal-title":"Commun. Methods Measures"},{"key":"B34","doi-asserted-by":"publisher","first-page":"81","DOI":"10.1037\/h0043158","article-title":"The magical number seven, plus or minus two: some limits on our capacity for processing information","volume":"63","author":"Miller","year":"1956","journal-title":"Psychol. Rev."},{"key":"B35","doi-asserted-by":"publisher","first-page":"15","DOI":"10.20982\/tqmp.09.1.p015","article-title":"The k-means clustering technique: general considerations and implementation in Mathematica","volume":"9","author":"Morissette","year":"2013","journal-title":"Tutorials Q. Methods Psychol."},{"key":"B36","first-page":"100","article-title":"Automatic evaluation of topic coherence. in Human language technologies: The 2010 annual conference of the North American chapter of the association for computational linguistics","volume":"2010","author":"Newman","year":"2010","journal-title":"Assoc. Comput."},{"key":"B37","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1140\/epjb\/e2004-00124-y","article-title":"Detecting community structure in networks","volume":"38","author":"Newman","year":"2004","journal-title":"The European Physical Journal B - Condensed Matter"},{"key":"B38","unstructured":"Introducing ChatGPT2022"},{"key":"B39","doi-asserted-by":"publisher","first-page":"369","DOI":"10.1177\/002194369303000401","article-title":"Is it really just like a fancy answering machine? Comparing semantic networks of different types of voice mail users","volume":"30","author":"Rice","year":"1993","journal-title":"J. Bus. Commun."},{"key":"B40","doi-asserted-by":"publisher","first-page":"A07","DOI":"10.22323\/2.20050207","article-title":"Understanding knowledge and perceptions of genome editing technologies: a textual analysis of major agricultural stakeholder groups","volume":"20","author":"Robbins","year":"2021","journal-title":"J. Sci. Commun."},{"key":"B41","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v091.i02","article-title":"Stm: An R package for structural topic models","volume":"91","author":"Roberts","year":"2019","journal-title":"J. Stat. Software"},{"key":"B42","volume-title":"Communication Networks: Toward A New Paradigm for Research.","author":"Rogers","year":"1981"},{"key":"B43","doi-asserted-by":"publisher","first-page":"3354","DOI":"10.1016\/j.vaccine.2015.05.017","article-title":"Exploring the presentation of HPV information online: a semantic network analysis of websites","volume":"33","author":"Ruiz","year":"2015","journal-title":"Vaccine"},{"key":"B44","doi-asserted-by":"publisher","first-page":"63","DOI":"10.3115\/v1\/W14-3110","article-title":"LDAvis: a method for visualizing and interpreting topics","volume":"27","author":"Sievert","year":"2014","journal-title":"in Proc. Workshop Interactive Lang. Learn. Visual. Interfaces"},{"key":"B45","doi-asserted-by":"publisher","first-page":"411","DOI":"10.1111\/1467-9868.00293","article-title":"Estimating the number of clusters in a data set via the gap statistic","volume":"63","author":"Tibshirani","year":"2001","journal-title":"J. Royal Stat. Soc."},{"key":"B46","doi-asserted-by":"publisher","first-page":"236","DOI":"10.1080\/01621459.1963.10500845","article-title":"Hierarchical grouping to optimize an objective function","volume":"58","author":"Ward","year":"1963","journal-title":"J. Am. Stat. Assoc."},{"key":"B47","first-page":"213","article-title":"Attitudes as nonhierarchical clusters in neural networks","volume":"43","author":"Woelfel","year":"1997","journal-title":"Prog. Commun. Sci."},{"key":"B48","volume-title":"CATPAC: A Neural Network for Qualitative Analysis of Text. Artificial Neural Networks for Advertising and Marketing Research","author":"Woelfel","year":"1995"},{"key":"B49","doi-asserted-by":"publisher","first-page":"1141","DOI":"10.3390\/e24081141","article-title":"Community detection in semantic networks: a multi-view approach","volume":"24","author":"Yang","year":"2022","journal-title":"Entropy"},{"key":"B50","doi-asserted-by":"publisher","first-page":"8","DOI":"10.20982\/tqmp.11.1.p008","article-title":"Hierarchical cluster analysis: comparison of three linkage measures and application to psychological data","volume":"11","author":"Yim","year":"2015","journal-title":"Q. Methods Psychol."},{"key":"B51","doi-asserted-by":"publisher","first-page":"1011","DOI":"10.1111\/jcom.12058","article-title":"Privacy in semantic networks on chinese social media: the case of Sina Weibo","volume":"63","author":"Yuan","year":"2013","journal-title":"J. Commun."},{"key":"B52","unstructured":"ZachariasC.\n          twint: An advanced Twitter scraping and OSINT tool2020"}],"container-title":["Frontiers in Research Metrics and Analytics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frma.2023.1104691\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,2]],"date-time":"2023-06-02T13:43:35Z","timestamp":1685713415000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frma.2023.1104691\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,2]]},"references-count":52,"alternative-id":["10.3389\/frma.2023.1104691"],"URL":"https:\/\/doi.org\/10.3389\/frma.2023.1104691","relation":{},"ISSN":["2504-0537"],"issn-type":[{"value":"2504-0537","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,2]]},"article-number":"1104691"}}