{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,29]],"date-time":"2025-05-29T04:11:11Z","timestamp":1748491871883,"version":"3.41.0"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"S1","license":[{"start":{"date-parts":[[2015,1,19]],"date-time":"2015-01-19T00:00:00Z","timestamp":1421625600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2015,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Small chemical molecules regulate biological processes at the molecular level. Those molecules are often involved in causing or treating pathological states. Automatically identifying such molecules in biomedical text is difficult due to both the diverse morphology of chemical names and the alternative types of nomenclature that are simultaneously used to describe them. To address these issues, the last BioCreAtIvE challenge proposed a CHEMDNER task, which is a Named Entity Recognition (NER) challenge that aims at labelling different types of chemical names in biomedical text.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>To address this challenge we tested various approaches to recognizing chemical entities in biomedical documents. These approaches range from linear Conditional Random Fields (CRFs) to a combination of CRFs with regular expression and dictionary matching, followed by a post-processing step to tag those chemical names in a corpus of Medline abstracts. We named our best performing systems CheNER.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We evaluate the performance of the various approaches using the F-score statistics. 
Higher F-scores indicate better performance. The highest F-score we obtain in identifying unique chemical entities is 72.88%. The highest F-score we obtain in identifying all chemical entities is 73.07%. We also evaluate the F-Score of combining our system with ChemSpot, and find an increase from 72.88% to 73.83%.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>CheNER presents a valid alternative for automated annotation of chemical entities in biomedical documents. In addition, CheNER may be used to derive new features to train newer methods for tagging chemical entities. CheNER can be downloaded from <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/metres.udl.cat\" ext-link-type=\"uri\">http:\/\/metres.udl.cat<\/jats:ext-link> and included in text annotation pipelines.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1758-2946-7-s1-s15","type":"journal-article","created":{"date-parts":[[2015,6,18]],"date-time":"2015-06-18T08:45:36Z","timestamp":1434617136000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["CheNER: a tool for the identification of chemical entities and their classes in biomedical literature"],"prefix":"10.1186","volume":"7","author":[{"given":"Anabel","family":"Usi\u00e9","sequence":"first","affiliation":[]},{"given":"Joaquim","family":"Cruz","sequence":"additional","affiliation":[]},{"given":"Jorge","family":"Comas","sequence":"additional","affiliation":[]},{"given":"Francesc","family":"Solsona","sequence":"additional","affiliation":[]},{"given":"Rui","family":"Alves","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2015,1,19]]},"reference":[{"key":"634_CR1","doi-asserted-by":"publisher","first-page":"S1","DOI":"10.1186\/1471-2105-6-S1-S1","volume":"6","author":"L Hirschman","year":"2005","unstructured":"Hirschman L, Yeh A, Blaschke C, Valencia A: Overview of BioCreAtIvE: 
critical assessment of information extraction for biology. BMC Bioinformatics. 2005, 6: S1-","journal-title":"BMC Bioinformatics"},{"key":"634_CR2","doi-asserted-by":"publisher","first-page":"S1","DOI":"10.1186\/gb-2008-9-s2-s1","volume":"9","author":"M Krallinger","year":"2008","unstructured":"Krallinger M, Morgan A, Smith L, Leitner F, Tanabe L, Wilbur J, Hirschman L, Valencia A: Evaluation of text-mining systems for biology: overview of the Second BioCreative community challenge. Genome Biol. 2008, 9: S1-","journal-title":"Genome Biol"},{"key":"634_CR3","doi-asserted-by":"publisher","first-page":"385","DOI":"10.1109\/TCBB.2010.61","volume":"7","author":"F Leitner","year":"2010","unstructured":"Leitner F, Mardis SA, Krallinger M, Cesareni G, Hirschman LA, Valencia A: An Overview of BioCreative II.5. IEEEACM Trans Comput Biol Bioinforma IEEE ACM. 2010, 7: 385-399.","journal-title":"IEEEACM Trans Comput Biol Bioinforma IEEE ACM"},{"key":"634_CR4","doi-asserted-by":"publisher","first-page":"S1","DOI":"10.1186\/1471-2105-12-S8-S1","volume":"12","author":"C Arighi","year":"2011","unstructured":"Arighi C, Lu Z, Krallinger M, Cohen K, Wilbur W, Valencia A, Hirschman L, Wu C: Overview of the BioCreative III Workshop. BMC Bioinformatics. 2011, 12: S1-","journal-title":"BMC Bioinformatics"},{"issue":"Suppl 1","key":"634_CR5","doi-asserted-by":"publisher","first-page":"S1","DOI":"10.1186\/1758-2946-7-S1-S1","volume":"7","author":"M Krallinger","year":"2015","unstructured":"Krallinger M, Leitner F, Rabal O, Vazquez M, Oyarzabal J, Valencia A: CHEMDNER: The drugs and chemical names extraction challenge. J Cheminform. 2015, 7 (Suppl 1): S1-","journal-title":"J Cheminform"},{"key":"634_CR6","doi-asserted-by":"crossref","unstructured":"Kim J-D, Ohta T, Pyysalo S, Kano Y, Tsujii J: Overview of BioNLP'09 shared task on event extraction. Proc Work Curr Trends Biomed Nat Lang Process Shar Task. 
1-9.","DOI":"10.3115\/1572340.1572342"},{"key":"634_CR7","first-page":"1","volume-title":"Proc BioNLP Shar Task 2011 Work","author":"J-D Kim","year":"2011","unstructured":"Kim J-D, Pyysalo S, Ohta T, Bossy R, Nguyen N, Tsujii J: Overview of BioNLP Shared Task 2011. Proc BioNLP Shar Task 2011 Work. 2011, Portland, Oregon, USA: Association for Computational Linguistics, 1-6."},{"key":"634_CR8","first-page":"1","volume-title":"Proc BioNLP Shar Task 2013 Work","author":"C N\u00e9dellec","year":"2013","unstructured":"N\u00e9dellec C, Bossy R, Kim J-D, Kim J, Ohta T, Pyysalo S, Zweigenbaum P: Overview of BioNLP Shared Task 2013. Proc BioNLP Shar Task 2013 Work. 2013, Sofia, Bulgaria: Association for Computational Linguistics, 1-7."},{"key":"634_CR9","doi-asserted-by":"publisher","first-page":"506","DOI":"10.1002\/minf.201100005","volume":"30","author":"M Vazquez","year":"2011","unstructured":"Vazquez M, Krallinger M, Leitner F, Valencia A: Text Mining for Drugs and Chemical Compounds: Methods, Tools and Applications. Mol Informatics. 2011, 30: 506-519. 10.1002\/minf.201100005.","journal-title":"Mol Informatics"},{"key":"634_CR10","doi-asserted-by":"publisher","first-page":"S14","DOI":"10.1186\/1471-2105-6-S1-S14","volume":"6","author":"D Hanisch","year":"2005","unstructured":"Hanisch D, Fundel K, Mevissen H-T, Zimmer R, Fluck J: ProMiner: rule-based protein and gene entity recognition. BMC Bioinformatics. 2005, 6: S14-","journal-title":"BMC Bioinformatics"},{"key":"634_CR11","doi-asserted-by":"publisher","first-page":"296","DOI":"10.1093\/bioinformatics\/btm557","volume":"24","author":"D Rebholz-Schuhmann","year":"2008","unstructured":"Rebholz-Schuhmann D, Arregui M, Gaudan S, Kirsch H, Jimeno A: Text processing through Web services: calling Whatizit. Bioinformatics. 2008, 24: 296-298. 
10.1093\/bioinformatics\/btm557.","journal-title":"Bioinformatics"},{"key":"634_CR12","doi-asserted-by":"publisher","first-page":"122","DOI":"10.1021\/ci00066a004","volume":"30","author":"DI Cooke-Fox","year":"1990","unstructured":"Cooke-Fox DI, Kirby GH, Lord MR, Rayner JD: Computer translation of IUPAC systematic organic chemical nomenclature. 4. Concise connection tables to structure diagrams. J Chem Inf Comput Sci. 1990, 30: 122-127. 10.1021\/ci00066a004.","journal-title":"J Chem Inf Comput Sci"},{"key":"634_CR13","first-page":"107","volume-title":"Comput Life Sci II","author":"P Corbett","year":"2006","unstructured":"Corbett P, Murray-Rust P: High-Throughput Identification of Chemistry in Life Science Texts. Comput Life Sci II. Edited by: R Berthold M, Glen RC, Fischer I. 2006, Berlin, Heidelberg: Springer Berlin Heidelberg, 4216: 107-118."},{"key":"634_CR14","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1186\/1758-2946-3-41","volume":"3","author":"D Jessop","year":"2011","unstructured":"Jessop D, Adams S, Willighagen E, Hawizy L, Murray-Rust P: OSCAR4: a flexible architecture for chemical text-mining. J Cheminformatics. 2011, 3: 41-10.1186\/1758-2946-3-41.","journal-title":"J Cheminformatics"},{"key":"634_CR15","doi-asserted-by":"publisher","first-page":"i268","DOI":"10.1093\/bioinformatics\/btn181","volume":"24","author":"R Klinger","year":"2008","unstructured":"Klinger R, Kol\u00e1\u0159ik C, Fluck J, Hofmann-Apitius M, Friedrich CM: Detection of IUPAC and IUPAC-like chemical names. Bioinformatics. 2008, 24: i268-i276. 10.1093\/bioinformatics\/btn181.","journal-title":"Bioinformatics"},{"key":"634_CR16","volume-title":"Chemical Names: Terminological Resources and Corpora Annotation","author":"C Kol\u00e1\u0159ik","year":"2008","unstructured":"Kol\u00e1\u0159ik C, Klinger R, Friedrich CM, Hofmann-apitius M, Fluck J: Chemical Names: Terminological Resources and Corpora Annotation. 
2008"},{"key":"634_CR17","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1186\/1758-2946-3-17","volume":"3","author":"L Hawizy","year":"2011","unstructured":"Hawizy L, Jessop D, Adams N, Murray-Rust P: ChemicalTagger: A tool for semantic text-mining in chemistry. J Cheminformatics. 2011, 3: 17-10.1186\/1758-2946-3-17.","journal-title":"J Cheminformatics"},{"key":"634_CR18","unstructured":"SureChem - Chemical Patent Search. [http:\/\/surechem.com\/]"},{"key":"634_CR19","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1021\/ci00062a009","volume":"29","author":"DI Cooke-Fox","year":"1989","unstructured":"Cooke-Fox DI, Kirby GH, Rayner JD: Computer translation of IUPAC systematic organic chemical nomenclature. 1. Introduction and background to a grammar-based approach. J Chem Inf Comput Sci. 1989, 29: 101-105. 10.1021\/ci00062a009.","journal-title":"J Chem Inf Comput Sci"},{"key":"634_CR20","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1021\/ci00062a010","volume":"29","author":"DI Cooke-Fox","year":"1989","unstructured":"Cooke-Fox DI, Kirby GH, Rayner JD: Computer translation of IUPAC systematic organic chemical nomenclature. 2. Development of a formal grammar. J Chem Inf Comput Sci. 1989, 29: 106-112. 10.1021\/ci00062a010.","journal-title":"J Chem Inf Comput Sci"},{"key":"634_CR21","volume-title":"Bioinformatics","author":"T Rockt\u00e4schel","year":"2012","unstructured":"Rockt\u00e4schel T, Weidlich M, Leser U: ChemSpot: A Hybrid System for Chemical Named Entity Recognition. Bioinformatics. 2012"},{"key":"634_CR22","volume-title":"Bioinformatics","author":"A Usie","year":"2013","unstructured":"Usie A, Alves R, Solsona F, Vazquez M, Valencia A: CheNER: chemical named entity recognizer. Bioinformatics. 
2013"},{"issue":"Suppl 1","key":"634_CR23","doi-asserted-by":"publisher","first-page":"S8","DOI":"10.1186\/1758-2946-7-S1-S8","volume":"7","author":"B Tang","year":"2015","unstructured":"Tang B, Feng Y, Wang X, Wu Y, Zhang Y, Jiang M, Wang J, Xu H: A comparison of conditional random fields and structured support vector machines for chemical entity recognition in biomedical literature. J Cheminform. 2015, 7 (Suppl 1): S8-","journal-title":"J Cheminform"},{"key":"634_CR24","first-page":"14","volume":"17","author":"C Blaschke","year":"2002","unstructured":"Blaschke C, Valencia A: The frame-based module of the SUISEKI information extraction system. IEEE Intell Syst. 2002, 17: 14-20.","journal-title":"IEEE Intell Syst"},{"key":"634_CR25","first-page":"17","volume-title":"Proc AMIA Annu Symp AMIA Symp","author":"AR Aronson","year":"2001","unstructured":"Aronson AR: Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. Proc AMIA Annu Symp AMIA Symp. 2001, 17-21."},{"key":"634_CR26","doi-asserted-by":"publisher","first-page":"816","DOI":"10.1016\/j.drudis.2008.06.001","volume":"13","author":"I Segura-Bedmar","year":"2008","unstructured":"Segura-Bedmar I, Mart\u00ednez P, Segura-Bedmar M: Drug name recognition and classification in biomedical texts. Drug Discov Today. 2008, 13: 816-823. 10.1016\/j.drudis.2008.06.001.","journal-title":"Drug Discov Today"},{"key":"634_CR27","doi-asserted-by":"publisher","first-page":"S1","DOI":"10.1186\/1471-2105-11-S2-S1","volume":"11","author":"I Segura-Bedmar","year":"2010","unstructured":"Segura-Bedmar I, Crespo M, de Pablo-S\u00e1nchez C, Mart\u00ednez P: Resolving anaphoras for the extraction of drug-drug interactions in pharmacological documents. BMC Bioinformatics. 
2010, 11: S1-","journal-title":"BMC Bioinformatics"},{"key":"634_CR28","first-page":"S5","volume":"11","author":"I Segura-Bedmar","year":"2010","unstructured":"Segura-Bedmar I, Mart\u00ednez P, de Pablo-S\u00e1nchez C: Extracting drug-drug interactions from biomedical text. BMC Bioinformatics. 2010, 11: S5-","journal-title":"BMC Bioinformatics"},{"issue":"I5","key":"634_CR29","doi-asserted-by":"publisher","first-page":"914","DOI":"10.1016\/j.jbi.2013.07.011","volume":"46","author":"M Herrero-Zazo","year":"2013","unstructured":"Herrero-Zazo M, Segura-Bedmar I, Mart\u00ednez P, Declerck T: The DDI corpus: an annotated corpus with pharmacological substance and drug-drug interactions. Journal of Biomedical Informatics. 2013, 46 (I5): 914-920.","journal-title":"Journal of Biomedical Informatics"},{"key":"634_CR30","unstructured":"Mallet: A machine learning for language toolkit. [http:\/\/mallet.cs.umass.edu\/about.php]"},{"key":"634_CR31","doi-asserted-by":"publisher","first-page":"D344","DOI":"10.1093\/nar\/gkm791","volume":"36","author":"K Degtyarenko","year":"2007","unstructured":"Degtyarenko K, de Matos P, Ennis M, Hastings J, Zbinden M, McNaught A, Alcantara R, Darsow M, Guedj M, Ashburner M: ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 2007, 36: D344-D350. 10.1093\/nar\/gkm791.","journal-title":"Nucleic Acids Res"},{"key":"634_CR32","doi-asserted-by":"publisher","first-page":"D456","DOI":"10.1093\/nar\/gks1146","volume":"41","author":"J Hastings","year":"2013","unstructured":"Hastings J, de Matos P, Dekker A, Ennis M, Harsha B, Kale N, Muthukrishnan V, Owen G, Turner S, Williams M, Steinbeck C: The ChEBI reference database and ontology for biologically relevant chemistry: enhancements for 2013. Nucleic Acids Res. 2013, 41: D456-D463. 
10.1093\/nar\/gks1146.","journal-title":"Nucleic Acids Res"},{"key":"634_CR33","doi-asserted-by":"publisher","first-page":"2983","DOI":"10.1093\/bioinformatics\/btp535","volume":"25","author":"KM Hettne","year":"2009","unstructured":"Hettne KM, Stierum RH, Schuemie MJ, Hendriksen PJM, Schijvenaars BJA, Mulligen EM, Kleinjans J, Kors JA: A dictionary to identify small molecules and drugs in free text. Bioinformatics. 2009, 25: 2983-2991. 10.1093\/bioinformatics\/btp535.","journal-title":"Bioinformatics"},{"key":"634_CR34","doi-asserted-by":"publisher","first-page":"1052","DOI":"10.1016\/j.drudis.2010.10.003","volume":"15","author":"Q Li","year":"2010","unstructured":"Li Q, Cheng T, Wang Y, Bryant SH: PubChem as a public resource for drug discovery. Drug Discov Today. 2010, 15: 1052-1057. 10.1016\/j.drudis.2010.10.003.","journal-title":"Drug Discov Today"},{"key":"634_CR35","first-page":"97","volume":"2","author":"M Choi","year":"2013","unstructured":"Choi M, Yepes AJ, Zobel J, Verspoor K: NEROC: Named Entity Recognizer of Chemicals. Proc Fourth BioCreative Chall Eval Work. Bethesda, Maryland. 2013, 2: 97-104.","journal-title":"Proc Fourth BioCreative Chall Eval Work. Bethesda, Maryland"},{"issue":"Suppl 1","key":"634_CR36","doi-asserted-by":"publisher","first-page":"S3","DOI":"10.1186\/1758-2946-7-S1-S3","volume":"7","author":"R Leaman","year":"2015","unstructured":"Leaman R, Wei C-H, Lu Z: tmChem: a high performance approach for chemical named entity recognition and normalization. J Cheminform. 2015, 7 (Suppl 1): S3-","journal-title":"J Cheminform"},{"issue":"Suppl 1","key":"634_CR37","doi-asserted-by":"publisher","first-page":"S5","DOI":"10.1186\/1758-2946-7-S1-S5","volume":"7","author":"DM Lowe","year":"2015","unstructured":"Lowe DM, Sayle RA: LeadMine: A grammar and dictionary driven approach to chemical entity recognition. J Cheminform. 
2015, 7 (Suppl 1): S5-","journal-title":"J Cheminform"},{"key":"634_CR38","first-page":"55","volume-title":"Proc Fourth BioCreative Chall Eval Work","author":"RT Batista-Navarro","year":"2013","unstructured":"Batista-Navarro RT, Rak R, Ananiadou S: Chemistry-specific Features and Heuristics for Developing a CRF-based Chemical Named Entity Recogniser. Proc Fourth BioCreative Chall Eval Work. 2013, Bethesda, Maryland: Association for Computational Linguistics, 2: 55-59."},{"key":"634_CR39","first-page":"88","volume-title":"Proc Fourth BioCreative Chall Eval Work","author":"T Huber","year":"2013","unstructured":"Huber T, Rockt\u00e4schel T, Weidlich M, Thomas P, Leser U: Extended Feature Set for Chemical Named Entity Recognition and Indexing. Proc Fourth BioCreative Chall Eval Work. 2013, Bethesda, Maryland: Association for Computational Linguistics, 2: 88-91."},{"key":"634_CR40","first-page":"105","volume-title":"Proc Fourth BioCreative Chall Eval Work","author":"M Khabsa","year":"2013","unstructured":"Khabsa M, Giles CL: An Ensemble Information Extraction Approach to the BioCreative CHEMDNER Task. Proc Fourth BioCreative Chall Eval Work. 2013, Bethesda, Maryland: Association for Computational Linguistics, 2: 105-112."},{"issue":"Suppl 1","key":"634_CR41","doi-asserted-by":"publisher","first-page":"S10","DOI":"10.1186\/1758-2946-7-S1-S10","volume":"7","author":"SA Akhondi","year":"2015","unstructured":"Akhondi SA, Hettne M, van der Host E, van Mulligen E, Kors JA: Recognition of chemical entities: combining dictionary-based and grammar-based approaches. J Cheminform. 2015, 7 (Suppl 1): S10-","journal-title":"J Cheminform"},{"key":"634_CR42","first-page":"121","volume-title":"Proc Fourth BioCreative Chall Eval Work","author":"S Lana-Serrano","year":"2013","unstructured":"Lana-Serrano S, Sanchez-Cisneros D, Campillos L, Segura-Bedmar I: Recognizing Chemical Compounds and Drugs: a Rule-Based Approach Using Semantic Information. Proc Fourth BioCreative Chall Eval Work. 
2013, Bethesda, Maryland: Association for Computational Linguistics, 2: 121-128."},{"key":"634_CR43","first-page":"162","volume-title":"Proc Fourth BioCreative Chall Eval Work","author":"M Yoshioka","year":"2013","unstructured":"Yoshioka M, Dieb TM: Ensemble Approach to Extract Chemical Named Entity by Using Results of Multiple CNER Systems with Different Characteristic. Proc Fourth BioCreative Chall Eval Work. 2013, Bethesda, Maryland: Association for Computational Linguistics, 2: 162-167."},{"key":"634_CR44","first-page":"171","volume-title":"Proc Fourth BioCreative Chall Eval Work","author":"L Li","year":"2013","unstructured":"Li L, Guo R, Liu S, Zhang P, Zheng T, Huang D, Zhou H: Combining Machine Learning with Dictionary Lookup for Chemical Compound and Drug Name Recognition Task. Proc Fourth BioCreative Chall Eval Work. 2013, Bethesda, Maryland: Association for Computational Linguistics, 2: 171-177."}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/1758-2946-7-S1-S15.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/1758-2946-7-S1-S15\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1758-2946-7-S1-S15.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,28]],"date-time":"2025-05-28T18:31:17Z","timestamp":1748457077000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/1758-2946-7-S1-S15"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,1,19]]},"references-count":44,"journal-issue":{"issue":"S1","published-print":{"date-parts":[[2015,12]]}},"alternative-id":["634"],"URL":"https:\/\/doi.org\
/10.1186\/1758-2946-7-s1-s15","relation":{},"ISSN":["1758-2946"],"issn-type":[{"type":"electronic","value":"1758-2946"}],"subject":[],"published":{"date-parts":[[2015,1,19]]},"assertion":[{"value":"19 January 2015","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S15"}}