{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,26]],"date-time":"2025-10-26T22:48:08Z","timestamp":1761518888719},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"16","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,8,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Text mining in the biomedical domain aims at helping researchers to access information contained in scientific publications in a faster, easier and more complete way. One step towards this aim is the recognition of named entities and their subsequent normalization to database identifiers. Normalization helps to link objects of potential interest, such as genes, to detailed information not contained in a publication; it is also key for integrating different knowledge sources. From an information retrieval perspective, normalization facilitates indexing and querying. Gene mention normalization (GN) is particularly challenging given the high ambiguity of gene names: they refer to orthologous or entirely different genes, are named after phenotypes and other biomedical terms, or they resemble common English words.<\/jats:p>\n               <jats:p>Results: We present the first publicly available system, GNAT, reported to handle inter-species GN. Our method uses extensive background knowledge on genes to resolve ambiguous names to EntrezGene identifiers. It performs comparably to single-species approaches proposed by us and others. On a benchmark set derived from BioCreative 1 and 2 data that contains genes from 13 species, GNAT achieves an F-measure of 81.4% (90.8% precision at 73.8% recall). For the single-species task, we report an F-measure of 85.4% on human genes.<\/jats:p>\n               <jats:p>Availability: A web-frontend is available at http:\/\/cbioc.eas.asu.edu\/gnat\/. 
GNAT will also be available within the BioCreative MetaService project, see http:\/\/bcms.bioinfo.cnio.es.<\/jats:p>\n               <jats:p>Contact: \u00a0joerg.hakenberg@asu.edu<\/jats:p>\n               <jats:p>Supplementary information: The test data set, lexica, and links to external data are available at http:\/\/cbioc.eas.asu.edu\/gnat\/<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn299","type":"journal-article","created":{"date-parts":[[2008,8,9]],"date-time":"2008-08-09T13:08:02Z","timestamp":1218287282000},"page":"i126-i132","source":"Crossref","is-referenced-by-count":78,"title":["Inter-species normalization of gene mentions with GNAT"],"prefix":"10.1093","volume":"24","author":[{"given":"J\u00f6rg","family":"Hakenberg","sequence":"first","affiliation":[{"name":"1 Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA, 2Biotechnological Centre, Technische Universit\u00e4t Dresden, Tatzberg 47\u201351, 01307 Dresden, 3Transinsight GmbH, Tatzberg 47\u201351, 01307 Dresden, Germany and 4Department of Biomedical Informatics, Arizona State University, Phoenix, AZ 85004, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Conrad","family":"Plake","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA, 2Biotechnological Centre, Technische Universit\u00e4t Dresden, Tatzberg 47\u201351, 01307 Dresden, 3Transinsight GmbH, Tatzberg 47\u201351, 01307 Dresden, Germany and 4Department of Biomedical Informatics, Arizona State University, Phoenix, AZ 85004, USA"},{"name":"1 Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA, 2Biotechnological Centre, Technische Universit\u00e4t Dresden, Tatzberg 47\u201351, 01307 Dresden, 3Transinsight GmbH, Tatzberg 47\u201351, 01307 Dresden, Germany and 4Department of Biomedical Informatics, Arizona State University, Phoenix, AZ 85004, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robert","family":"Leaman","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA, 2Biotechnological Centre, Technische Universit\u00e4t Dresden, Tatzberg 47\u201351, 01307 Dresden, 3Transinsight GmbH, Tatzberg 47\u201351, 01307 Dresden, Germany and 4Department of Biomedical Informatics, Arizona State University, Phoenix, AZ 85004, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Schroeder","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA, 2Biotechnological Centre, Technische Universit\u00e4t Dresden, Tatzberg 47\u201351, 01307 Dresden, 3Transinsight GmbH, Tatzberg 47\u201351, 01307 Dresden, Germany and 4Department of Biomedical Informatics, Arizona State University, Phoenix, AZ 85004, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Graciela","family":"Gonzalez","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85287, USA, 2Biotechnological Centre, Technische Universit\u00e4t Dresden, Tatzberg 47\u201351, 01307 Dresden, 3Transinsight GmbH, Tatzberg 47\u201351, 01307 Dresden, Germany and 4Department of Biomedical Informatics, Arizona State University, Phoenix, AZ 85004, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2008,8,9]]},"reference":[{"key":"2023020210500751500_B1","first-page":"257","article-title":"An integrated approach to concept recognition in biomedical text","volume-title":"Proceedings of Second BioCreativeWorkshop.","author":"Baumgartner","year":"2007"},{"key":"2023020210500751500_B2","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1186\/1471-2105-9-69","article-title":"The strength of 
co-authorship in gene name disambiguation","volume":"9","author":"Farkas","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023020210500751500_B3","first-page":"149","article-title":"ProMiner: recognition of human gene and protein names using regularly updated dictionaries","volume-title":"Proceedings of Second BioCreative Challenge Evaluation Workshop.","author":"Fluck","year":"2007"},{"key":"2023020210500751500_B4","first-page":"153","article-title":"Human gene normalization by an integrated approach including abbreviation resolution and disambiguation","volume-title":"Proceedings of Second BioCreative Challenge Evaluation Workshop.","author":"Fundel","year":"2007"},{"key":"2023020210500751500_B5","doi-asserted-by":"crossref","first-page":"D440","DOI":"10.1093\/nar\/gkm883","article-title":"The Gene Ontology project in 2008","volume":"36","author":"Gene Ontology Consortium","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023020210500751500_B6","first-page":"111","article-title":"A Robust Parsing Algorithm for Link Grammars","volume-title":"Proceedings of International Workshop on Parsing Technologies.","author":"Grinberg","year":"1995"},{"key":"2023020210500751500_B7","first-page":"153","article-title":"What's in a gene name? 
Automated refinement of gene name dictionaries","volume-title":"Proceedings of BioNLP at ACL 2007.","author":"Hakenberg","year":"2007"},{"key":"2023020210500751500_B8","doi-asserted-by":"crossref","first-page":"S14","DOI":"10.1186\/gb-2008-9-s2-s14","article-title":"Gene mention normalization and interaction extraction with context models and sentence motifs","volume":"9","author":"Hakenberg","year":"2008","journal-title":"Genome Biol"},{"key":"2023020210500751500_B9","doi-asserted-by":"crossref","first-page":"S14","DOI":"10.1186\/1471-2105-6-S1-S14","article-title":"ProMiner: rule-based protein and gene entity recognition","volume":"6","author":"Hanisch","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023020210500751500_B10","doi-asserted-by":"crossref","first-page":"S11","DOI":"10.1186\/1471-2105-6-S1-S11","article-title":"Overview of BioCreAtIvE task 1B: normalized gene lists","volume":"6","author":"Hirschman","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023020210500751500_B11","first-page":"652","article-title":"BANNER: an executable survey of advances in biomedical named entity recognition","volume":"13","author":"Leaman","year":"2008","journal-title":"Pac. Symp. 
Biocomput"},{"key":"2023020210500751500_B12","doi-asserted-by":"crossref","first-page":"S6","DOI":"10.1186\/gb-2008-9-s2-s6","article-title":"Introducing meta-services for biomedical information extraction","volume":"9","author":"Leitner","year":"2008","journal-title":"Genome Biol"},{"key":"2023020210500751500_B13","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/gb-2008-9-s2-s3","article-title":"Overview of BioCreative II Gene Normalization","volume":"9","author":"Morgan","year":"2008","journal-title":"Genome Biol"},{"key":"2023020210500751500_B14","doi-asserted-by":"crossref","first-page":"2444","DOI":"10.1093\/bioinformatics\/btl408","article-title":"AliBaba: PubMed as a graph","volume":"22","author":"Plake","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020210500751500_B15","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1613\/jair.514","article-title":"Semantic similarity in a taxonomy: an information based measure and its application to problems of ambiguity in natural language","volume":"11","author":"Resnik","year":"1999","journal-title":"J. Artif. Intell. 
Res"},{"key":"2023020210500751500_B16","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1186\/1471-2105-7-302","article-title":"A new measure for functional similarity of gene products based on Gene Ontology","volume":"7","author":"Schlicker","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020210500751500_B17","first-page":"451","article-title":"A simple algorithm for identifying abbreviation definitions in biomedical text","author":"Schwartz","year":"2003"},{"key":"2023020210500751500_B18","doi-asserted-by":"crossref","first-page":"402","DOI":"10.1186\/gb-2006-7-5-402","article-title":"The success (or not) of HUGO nomenclature","volume":"7","author":"Tamames","year":"2006","journal-title":"Genome Biol"},{"key":"2023020210500751500_B19","first-page":"41","article-title":"Combining multiple evidence for gene symbol disambiguation","volume-title":"Proceedings of BioNLP at ACL 2007.","author":"Xu","year":"2007"},{"key":"2023020210500751500_B20","first-page":"7","article-title":"BioCreative 2. 
Gene mention task","volume-title":"Proceedings of Second BioCreative Challenge Evaluation Workshop.","author":"Wilbur","year":"2007"},{"key":"2023020210500751500_B21","doi-asserted-by":"crossref","first-page":"358","DOI":"10.1093\/bib\/bbm045","article-title":"Frontiers of biomedical text mining: current progress","volume":"8","author":"Zweigenbaum","year":"2007","journal-title":"Brief Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/16\/i126\/49051402\/bioinformatics_24_16_i126.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/16\/i126\/49051402\/bioinformatics_24_16_i126.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T12:46:28Z","timestamp":1675341988000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/16\/i126\/201746"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,8,9]]},"references-count":21,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2008,8,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn299","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,8,15]]},"published":{"date-parts":[[2008,8,9]]}}}