{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T08:01:49Z","timestamp":1768032109868,"version":"3.49.0"},"reference-count":17,"publisher":"Oxford University Press (OUP)","license":[{"start":{"date-parts":[[2022,8,11]],"date-time":"2022-08-11T00:00:00Z","timestamp":1660176000000},"content-version":"vor","delay-in-days":222,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100006108","name":"National Center for Advancing Translational Sciences","doi-asserted-by":"publisher","award":["UL1TR002649"],"award-info":[{"award-number":["UL1TR002649"]}],"id":[{"id":"10.13039\/100006108","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,8,11]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:label\/>\n                  <jats:p>TopEx is a natural language processing application developed to facilitate the exploration of topics and key words in a set of texts through a user interface that requires no programming or natural language processing knowledge, thus enhancing the ability of nontechnical researchers to explore and analyze textual data. The underlying algorithm groups semantically similar sentences together followed by a topic analysis on each group to identify the key topics discussed in a collection of texts. Implementation is achieved via a Python library back end and a web application front end built with React and D3.js for visualizations. TopEx has been successfully used to identify themes, topics and key words in a variety of corpora, including Coronavirus disease 2019 (COVID-19) discharge summaries and tweets. 
Feedback from the BioCreative VII Challenge Track 4 concludes that TopEx is a useful tool for text exploration for a variety of users and tasks.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Database URL<\/jats:title>\n                  <jats:p>http:\/\/topex.cctr.vcu.edu<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/database\/baac063","type":"journal-article","created":{"date-parts":[[2022,8,11]],"date-time":"2022-08-11T14:48:27Z","timestamp":1660229307000},"source":"Crossref","is-referenced-by-count":3,"title":["TopEx: topic exploration of COVID-19 corpora - Results from the BioCreative VII Challenge Track 4"],"prefix":"10.1093","volume":"2022","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8064-521X","authenticated-orcid":false,"given":"Amy L","family":"Olex","sequence":"first","affiliation":[{"name":"C. Kenneth and Diane Wright Center for Clinical and Translational Research, Virginia Commonwealth University , 203 E. Cary St, Richmond, VA 23291, USA"},{"name":"Department of Computer Science, Virginia Commonwealth University , 401 S. Main St, Richmond, VA 23284, USA"}]},{"given":"Evan","family":"French","sequence":"additional","affiliation":[{"name":"C. Kenneth and Diane Wright Center for Clinical and Translational Research, Virginia Commonwealth University , 203 E. Cary St, Richmond, VA 23291, USA"},{"name":"Department of Computer Science, Virginia Commonwealth University , 401 S. Main St, Richmond, VA 23284, USA"}]},{"given":"Peter","family":"Burdette","sequence":"additional","affiliation":[{"name":"C. Kenneth and Diane Wright Center for Clinical and Translational Research, Virginia Commonwealth University , 203 E. Cary St, Richmond, VA 23291, USA"}]},{"given":"Srilakshmi","family":"Sagiraju","sequence":"additional","affiliation":[{"name":"C. Kenneth and Diane Wright Center for Clinical and Translational Research, Virginia Commonwealth University , 203 E. 
Cary St, Richmond, VA 23291, USA"}]},{"given":"Thomas","family":"Neumann","sequence":"additional","affiliation":[{"name":"Massey Cancer Center, Virginia Commonwealth University , 401 S. Main St, Richmond, VA 23284, USA"}]},{"given":"Tamas S","family":"Gal","sequence":"additional","affiliation":[{"name":"C. Kenneth and Diane Wright Center for Clinical and Translational Research, Virginia Commonwealth University , 203 E. Cary St, Richmond, VA 23291, USA"},{"name":"Massey Cancer Center, Virginia Commonwealth University , 401 S. Main St, Richmond, VA 23284, USA"}]},{"given":"Bridget T","family":"McInnes","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Virginia Commonwealth University , 401 S. Main St, Richmond, VA 23284, USA"}]}],"member":"286","published-online":{"date-parts":[[2022,8,11]]},"reference":[{"key":"2022081114481312800_R1","article-title":"Exploring Text Datasets by Visualizing Relevant Words","volume-title":"arXiv:170705261 [cs] [Internet]","author":"Horn"},{"key":"2022081114481312800_R2","doi-asserted-by":"publisher","first-page":"993","DOI":"10.5555\/944919.944937","article-title":"Latent Dirichlet\u00a0allocation","volume":"3","author":"Blei","year":"2003","journal-title":"J. Machine Learn. Res."},{"key":"2022081114481312800_R3","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2020.101582","article-title":"A review of topic modeling methods","volume":"94","author":"Vayansky","year":"2020","journal-title":"Inf. 
Syst."},{"key":"2022081114481312800_R4","doi-asserted-by":"publisher","first-page":"63","DOI":"10.3115\/v1\/W14-3110","article-title":"LDAvis: a method for visualizing and interpreting topics","author":"Sievert","year":"2014"},{"key":"2022081114481312800_R5","article-title":"LDAExplore: Visualizing Topic Models Generated Using Latent Dirichlet Allocation","volume-title":"arXiv:150706593","author":"Ganesan"},{"key":"2022081114481312800_R6","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1109\/VAST.2014.7042493","article-title":"Serendip: topic model-driven visual exploration of text corpora","author":"Alexander","year":"2014"},{"key":"2022081114481312800_R7","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1016\/j.visinf.2017.01.005","article-title":"VISTopic: a visual analytics system for making sense of large document collections using hierarchical topic modeling","volume":"1","author":"Yang","year":"2017","journal-title":"Visual Inform."},{"key":"2022081114481312800_R8","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1109\/TVCG.2010.225","article-title":"EventRiver: visually exploring text collections with temporal references","volume":"18","author":"Luo","year":"2012","journal-title":"IEEE Trans. Vis. Comput. Graph"},{"key":"2022081114481312800_R9","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1109\/INFVIS.2000.885098","article-title":"ThemeRiver: visualizing theme changes over time","author":"Havre","year":"2000"},{"key":"2022081114481312800_R10","first-page":"459","article-title":"Local topic mining for reflective medical writing","volume":"2020","author":"Olex","year":"2020","journal-title":"AMIA Jt. Summits Transl. Sci. 
Proc."},{"key":"2022081114481312800_R11","doi-asserted-by":"publisher","first-page":"435","DOI":"10.1145\/1390334.1390409","article-title":"TF-IDF uncovered: a study of theories and probabilities","author":"Roelleke","year":"2008"},{"key":"2022081114481312800_R12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.48550\/arXiv.1802.03426","article-title":"UMAP: uniform manifold approximation and projection for dimension reduction","author":"McInnes","year":"2018","journal-title":"arXiv"},{"key":"2022081114481312800_R13","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Van der Maaten","year":"2008","journal-title":"J. Mach. Learn Res."},{"key":"2022081114481312800_R14","doi-asserted-by":"publisher","first-page":"315","DOI":"10.3390\/epidemiologia2030024","article-title":"A large-scale COVID-19 twitter chatter dataset for open scientific research\u2014an international collaboration","volume":"2","author":"Banda","year":"2021","journal-title":"Epidemiologia"},{"key":"2022081114481312800_R15","doi-asserted-by":"publisher","DOI":"10.1093\/abm\/kaac014","article-title":"Linguistic characteristics of COVID-19 pandemic control and mitigation communications in South Korea","author":"Kim","year":"2022"},{"key":"2022081114481312800_R16","article-title":"Overview of the COVID-19 text mining tool interactive demo track. 
In: Proceedings of the BioCreative VII Challenge Evaluation Workshop, 227","author":"Chatr-aryamontri"},{"key":"2022081114481312800_R17","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/w19-5034","article-title":"ScispaCy: fast and robust models for biomedical natural language processing","author":"Neumann","year":"2019"}],"container-title":["Database"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baac063\/45335416\/baac063.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/database\/article-pdf\/doi\/10.1093\/database\/baac063\/45335416\/baac063.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,11]],"date-time":"2022-08-11T14:48:34Z","timestamp":1660229314000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/database\/article\/doi\/10.1093\/database\/baac063\/6661265"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,1]]},"references-count":17,"URL":"https:\/\/doi.org\/10.1093\/database\/baac063","relation":{},"ISSN":["1758-0463"],"issn-type":[{"value":"1758-0463","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022,1,1]]},"published":{"date-parts":[[2022,1,1]]},"article-number":"baac063"}}