{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T03:25:10Z","timestamp":1761708310823},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"6","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The recognition and normalization of textual mentions of gene and protein names is both particularly important and challenging. Its importance lies in the fact that they constitute the crucial conceptual entities in biomedicine. Their recognition and normalization remains a challenging task because of widespread gene name ambiguities within species, across species, with common English words and with medical sublanguage terms.<\/jats:p>\n               <jats:p>Results: We present GeNo, a highly competitive system for gene name normalization, which obtains an F-measure performance of 86.4% (precision: 87.8%, recall: 85.0%) on the BioCreAtIvE-II test set, thus being on a par with the best system on that task. Our system tackles the complex gene normalization problem by employing a carefully crafted suite of symbolic and statistical methods, and by fully relying on publicly available software and data resources, including extensive background knowledge based on semantic profiling. A major goal of our work is to present GeNo's architecture in a lucid and perspicuous way to pave the way to full reproducibility of our results.<\/jats:p>\n               <jats:p>Availability: GeNo, including its underlying resources, will be available from www.julielab.de. 
It is also currently deployed in the Semedico search engine at www.semedico.org.<\/jats:p>\n               <jats:p>Contact: \u00a0joachim.wermter@uni-jena.de<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp071","type":"journal-article","created":{"date-parts":[[2009,2,3]],"date-time":"2009-02-03T01:14:05Z","timestamp":1233623645000},"page":"815-821","source":"Crossref","is-referenced-by-count":70,"title":["High-performance gene name normalization with G<scp>e<\/scp>N<scp>o<\/scp>"],"prefix":"10.1093","volume":"25","author":[{"given":"Joachim","family":"Wermter","sequence":"first","affiliation":[{"name":"Jena University Language and Information Engineering (JULIE) Lab, Friedrich-Schiller-Universit\u00e4t Jena, F\u00fcrstengraben 30, 07743 Jena, Germany"}]},{"given":"Katrin","family":"Tomanek","sequence":"additional","affiliation":[{"name":"Jena University Language and Information Engineering (JULIE) Lab, Friedrich-Schiller-Universit\u00e4t Jena, F\u00fcrstengraben 30, 07743 Jena, Germany"}]},{"given":"Udo","family":"Hahn","sequence":"additional","affiliation":[{"name":"Jena University Language and Information Engineering (JULIE) Lab, Friedrich-Schiller-Universit\u00e4t Jena, F\u00fcrstengraben 30, 07743 Jena, Germany"}]}],"member":"286","published-online":{"date-parts":[[2009,2,2]]},"reference":[{"key":"2023051209121910200_B1","first-page":"257","article-title":"An integrated approach to concept recognition in biomedical text","volume-title":"Proceedings of the 2nd BioCreative Challenge Evaluation Workshop.","author":"Baumgartner","year":"2007"},{"key":"2023051209121910200_B2","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.artmed.2004.07.016","article-title":"Comparative experiments on learning information extractors for proteins and their interactions","volume":"33","author":"Bunescu","year":"2005","journal-title":"Artif. Intell. 
Med."},{"key":"2023051209121910200_B3","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1093\/bioinformatics\/bth496","article-title":"Gene name ambiguity of eukaryotic nomenclatures","volume":"21","author":"Chen","year":"2005","journal-title":"Bioinformatics"},{"key":"2023051209121910200_B4","first-page":"1","article-title":"An overview of JCoRe, the JulieLab UIMA Component Repository","volume-title":"Proceedings of the LREC'08 Workshop \u2018Towards Enhanced Interoperability for Large HLT Systems: UIMA for NLP\u2019.","author":"Hahn","year":"2008"},{"key":"2023051209121910200_B5","first-page":"153","article-title":"What's in a gene name? Automated refinement of gene name dictionaries","volume-title":"Proceedings of the BioNLP Workshop at ACL 2007.","author":"Hakenberg","year":"2007"},{"issue":"Suppl. 2","key":"2023051209121910200_B6","doi-asserted-by":"crossref","first-page":"S14","DOI":"10.1186\/gb-2008-9-s2-s14","article-title":"Gene mention normalization and interaction extraction with context models and sentence motifs","volume":"9","author":"Hakenberg","year":"2008","journal-title":"Genome Biol."},{"key":"2023051209121910200_B7","doi-asserted-by":"crossref","first-page":"i126","DOI":"10.1093\/bioinformatics\/btn299","article-title":"Inter-species normalization of gene mentions with Gnat","volume":"24","author":"Hakenberg","year":"2008","journal-title":"Bioinformatics"},{"issue":"Suppl. 
1","key":"2023051209121910200_B8","doi-asserted-by":"crossref","first-page":"S14","DOI":"10.1186\/1471-2105-6-S1-S14","article-title":"ProMiner: rule-based protein and gene entity recognition","volume":"6","author":"Hanisch","year":"2005","journal-title":"BMC Bioinform"},{"issue":"Suppl 1","key":"2023051209121910200_B9","doi-asserted-by":"crossref","first-page":"S1","DOI":"10.1186\/1471-2105-6-S1-S1","article-title":"Overview of BioCreAtIvE: critical assessment of information extraction for biology","volume":"6","author":"Hirschman","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023051209121910200_B10","volume-title":"Proceedings of the Second BioCreative Challenge Evaluation Workshop.","author":"Hirschman","year":"2007"},{"key":"2023051209121910200_B11","doi-asserted-by":"crossref","first-page":"i180","DOI":"10.1093\/bioinformatics\/btg1023","article-title":"Genia corpus: a semantically annotated corpus for biotextmining","volume":"19","author":"Kim","year":"2003","journal-title":"Bioinformatics"},{"key":"2023051209121910200_B12","first-page":"61","article-title":"Integrated annotation for biomedical information extraction","volume-title":"Proceedings of the BioLink 2004 Workshop \u2018Linking Biological Literature, Ontologies and Databases: Tools for Users\u2019 at NAACL\/HLT 2004.","author":"Kulick","year":"2004"},{"key":"2023051209121910200_B13","first-page":"282","article-title":"Conditional Random Fields: Probabilistic models for segmenting and labeling sequence data","volume-title":"ICML'01: Proceedings of the 18th International Conference on Machine Learning.","author":"Lafferty","year":"2001"},{"key":"2023051209121910200_B14","first-page":"652","article-title":"Banner: an executable survey of advances in biomedical named entity recognition","volume-title":"PSB-2008: Proceedings of the Pacific Symposium on Biocomputing 
2008.","author":"Leaman","year":"2008"},{"key":"2023051209121910200_B15","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1093\/bioinformatics\/bti749","article-title":"BioThesaurus: A web-based thesaurus of protein and gene names","volume":"22","author":"Liu","year":"2006","journal-title":"Bioinformatics"},{"key":"2023051209121910200_B16","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1002\/cfg.452","article-title":"Protein name tagging guidelines: lessons learned","volume":"6","author":"Mani","year":"2005","journal-title":"Comp. Funct. Genomics"},{"issue":"Suppl 2","key":"2023051209121910200_B17","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/gb-2008-9-s2-s3","article-title":"Overview of BioCreative II gene normalization","volume":"9","author":"Morgan","year":"2008","journal-title":"Genome Biol."},{"key":"2023051209121910200_B18","first-page":"107","article-title":"Biomedical named entity recognition using Conditional Random Fields and rich feature sets","volume-title":"Proceedings of the COLING 2004 NLPBA\/BioNLP Workshop.","author":"Settles","year":"2004"},{"issue":"Suppl 1","key":"2023051209121910200_B19","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/1471-2105-6-S1-S3","article-title":"GeneTag: a tagged corpus for gene\/protein named entity recognition","volume":"6","author":"Tanabe","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023051209121910200_B20","doi-asserted-by":"crossref","first-page":"2768","DOI":"10.1093\/bioinformatics\/btm393","article-title":"Learning string similarity measures for gene\/protein name dictionary look-up using logistic regression","volume":"23","author":"Tsuruoka","year":"2007","journal-title":"Bioinformatics"},{"key":"2023051209121910200_B21","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1093\/bioinformatics\/btm056","article-title":"Gene symbol disambiguation using knowledge-based 
profiles","volume":"23","author":"Xu","year":"2007","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/6\/815\/50286361\/bioinformatics_25_6_815.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/6\/815\/50286361\/bioinformatics_25_6_815.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,12]],"date-time":"2023-05-12T09:12:57Z","timestamp":1683882777000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/6\/815\/252663"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,2,2]]},"references-count":21,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2009,3,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp071","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,3,15]]},"published":{"date-parts":[[2009,2,2]]}}}