{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,9]],"date-time":"2025-11-09T11:07:38Z","timestamp":1762686458709},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":480,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2015,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Summary: Diseases caused by zoonotic viruses (viruses transmittable between humans and animals) are a major threat to public health throughout the world. By studying virus migration and mutation patterns, the field of phylogeography provides a valuable tool for improving their surveillance. A key component in phylogeographic analysis of zoonotic viruses involves identifying the specific locations of relevant viral sequences. This is usually accomplished by querying public databases such as GenBank and examining the geospatial metadata in the record. When sufficient detail is not available, a logical next step is for the researcher to conduct a manual survey of the corresponding published articles.<\/jats:p><jats:p>Motivation: In this article, we present a system for detection and disambiguation of locations (toponym resolution) in full-text articles to automate the retrieval of sufficient metadata. Our system has been tested on a manually annotated corpus of journal articles related to phylogeography using integrated heuristics for location disambiguation including a distance heuristic, a population heuristic and a novel heuristic utilizing knowledge obtained from GenBank metadata (i.e. a \u2018metadata heuristic\u2019).<\/jats:p><jats:p>Results: For detecting and disambiguating locations, our system performed best using the metadata heuristic (0.54 Precision, 0.89 Recall and 0.68 F-score). Precision reaches 0.88 when examining only the disambiguation of location names. Our error analysis showed that a noticeable increase in the accuracy of toponym resolution is possible by improving the geospatial location detection. By improving these fundamental automated tasks, our system can be a useful resource to phylogeographers that rely on geospatial metadata of GenBank sequences.<\/jats:p><jats:p>Contact: \u00a0davy.weissenbacher@asu.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv259","type":"journal-article","created":{"date-parts":[[2015,6,13]],"date-time":"2015-06-13T17:12:36Z","timestamp":1434215556000},"page":"i348-i356","source":"Crossref","is-referenced-by-count":17,"title":["Knowledge-driven geospatial location resolution for phylogeographic models of virus migration"],"prefix":"10.1093","volume":"31","author":[{"given":"Davy","family":"Weissenbacher","sequence":"first","affiliation":[{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"}]},{"given":"Tasnia","family":"Tahsin","sequence":"additional","affiliation":[{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"}]},{"given":"Rachel","family":"Beard","sequence":"additional","affiliation":[{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"},{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"}]},{"given":"Mari","family":"Figaro","sequence":"additional","affiliation":[{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"},{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"}]},{"given":"Robert","family":"Rivera","sequence":"additional","affiliation":[{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"}]},{"given":"Matthew","family":"Scotch","sequence":"additional","affiliation":[{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"},{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"}]},{"given":"Graciela","family":"Gonzalez","sequence":"additional","affiliation":[{"name":"1 Department of Biomedical Informatics, Arizona State University, Scottsdale, AZ 85259, USA and 2Center for Environmental Security, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5904, USA"}]}],"member":"286","published-online":{"date-parts":[[2015,6,10]]},"reference":[{"key":"2023020115413329400_btv259-B1","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1007\/978-94-007-4587-2_12","article-title":"Inferring thematic places from spatially referenced natural language descriptions","volume-title":"Crowdsourcing Geographic Knowledge","author":"Adams","year":"2013"},{"key":"2023020115413329400_btv259-B2","doi-asserted-by":"crossref","DOI":"10.2307\/j.ctv1nzfgj7","volume-title":"Phylogeography: The History and Formation of Species","author":"Avise","year":"2000"},{"key":"2023020115413329400_btv259-B3","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1093\/nar\/gkq1079","article-title":"Genbank","volume":"39","author":"Benson","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"2023020115413329400_btv259-B4","article-title":"Bionlp shared task 2011\u2014bacteria biotope","volume-title":"Proceedings of the BioNLP 2011 Workshop Companion Volume for Shared Task","author":"Bossy","year":"2011"},{"key":"2023020115413329400_btv259-B5","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1145\/2047296.2047300","article-title":"Approaches to disambiguating toponyms","volume":"3","author":"Buscaldi","year":"2011","journal-title":"SIGSPATIAL Special"},{"key":"2023020115413329400_btv259-B6","first-page":"296","article-title":"Agreement, the f-measure, and reliability in information retrieval","volume":"12","author":"Hripcsak","year":"2005","journal-title":"JAMIA"},{"key":"2023020115413329400_btv259-B7","doi-asserted-by":"crossref","DOI":"10.1145\/1328964.1328989","volume-title":"Toponym Resolution in Text: Annotation, Evaluation and Applications of Spatial Grounding of Place Names","author":"Leidner","year":"2007"},{"key":"2023020115413329400_btv259-B8","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1145\/2047296.2047298","article-title":"Detecting geographical references in the form of place names and associated spatial natural language","volume":"3","author":"Leidner","year":"2011","journal-title":"SIGSPATIAL"},{"key":"2023020115413329400_btv259-B9","article-title":"Spatialml: Annotation scheme, corpora, and tools","author":"Mani","year":"2008"},{"key":"2023020115413329400_btv259-B10","first-page":"188","article-title":"Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons","volume-title":"Proceedings of CoNLL-2013","author":"McCallum","year":"2013"},{"key":"2023020115413329400_btv259-B11","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1186\/1756-0500-2-101","article-title":"Genbank and pubmed: how connected are they?","volume":"2","author":"Miller","year":"2009","journal-title":"BMC Res. Notes"},{"key":"2023020115413329400_btv259-B12","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1007\/978-3-642-28569-1_2","article-title":"Information extraction: past, present and future","volume-title":"Multi-source, multilingual information extraction and summarization, theory and applications of natural language processing","author":"Piskorski","year":"2013"},{"key":"2023020115413329400_btv259-B13","first-page":"1","article-title":"Toponym disambiguation using events","volume-title":"FLAIRS Conference\u201910","author":"Roberts","year":"2010"},{"key":"2023020115413329400_btv259-B14","first-page":"1","article-title":"Using machine learning methods for disambiguating place references in textual documents","author":"Santos","year":"2014","journal-title":"GeoJournal"},{"key":"2023020115413329400_btv259-B15","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/j.jbi.2011.06.005","article-title":"Enhancing phylogeography by improving geographical information from genbank","volume":"44","author":"Scotch","year":"2011","journal-title":"J. Biomed. Inf."},{"key":"2023020115413329400_btv259-B16","volume-title":"Methods and Applications of Text-Driven Toponym Resolution with Indirect Supervision","author":"Speriosu","year":"2013"},{"key":"2023020115413329400_btv259-B17","first-page":"102","article-title":"Natural language processing methods for enhancing geographic metadata for phylogeography of zoonotic viruses","volume":"2014","author":"Tahsin","year":"2014","journal-title":"AMIA Jt. Summits Transl. Sci. Proc."},{"key":"2023020115413329400_btv259-B18","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1186\/1471-2105-11-294","article-title":"Envmine: a text-mining system for the automatic extraction of contextual information","volume":"11","author":"Tamames","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023020115413329400_btv259-B19","first-page":"217","article-title":"Cermine\u2014automatic extraction of metadata and references from scientific literature","volume-title":"Proceedings of 11th IAPR International Workshop on Document Analysis Systems","author":"Tkaczyk","year":"2014"},{"key":"2023020115413329400_btv259-B20","doi-asserted-by":"crossref","DOI":"10.1145\/1722080.1722089","article-title":"Evaluation of georeferencing","volume-title":"Proceedings of the 6th Workshop on Geographic Information Retrieval, GIR \u201910","author":"Tobin","year":"2010"},{"key":"2023020115413329400_btv259-B21","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1145\/1460007.1460012","article-title":"A system for the automatic comparison of machine and human geocoded documents","volume-title":"Proceedings of the 2nd International Workshop on Geographic Information Retrieval, GIR \u201908","author":"Turton","year":"2008"},{"key":"2023020115413329400_btv259-B22","doi-asserted-by":"crossref","first-page":"e32171","DOI":"10.1371\/journal.pone.0032171","article-title":"Text mining improves prediction of protein functional sites","volume":"7","author":"Verspoor","year":"2012","journal-title":"PLoS One"},{"key":"2023020115413329400_btv259-B23","first-page":"37","article-title":"Geocoding location expressions in Twitter messages: A preference learning method","volume":"9","author":"Zhang","year":"2014","journal-title":"J. Spatial Inf. Sci."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/12\/i348\/49014363\/bioinformatics_31_12_i348.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/31\/12\/i348\/49014363\/bioinformatics_31_12_i348.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,11]],"date-time":"2023-08-11T12:03:01Z","timestamp":1691755381000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/31\/12\/i348\/216444"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,6,10]]},"references-count":23,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2015,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv259","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2015,6,15]]},"published":{"date-parts":[[2015,6,10]]}}}