{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T01:47:35Z","timestamp":1772502455848,"version":"3.50.1"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"S17","license":[{"start":{"date-parts":[[2012,12,1]],"date-time":"2012-12-01T00:00:00Z","timestamp":1354320000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/2.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2012,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Manual curation of chemical data from publications is error-prone and time-consuming, and it makes up-to-date data sets hard to maintain. Automatic information extraction can help reduce these problems. Since chemical structures are usually described in images, information extraction needs to combine structure image recognition with text mining.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We have developed ChemEx, a chemical information extraction system. ChemEx processes both text and images in publications. The text annotator extracts compound, organism, and assay entities from text content, while structure image recognition translates chemical raster images into a machine-readable format. 
A user can view the annotated text along with summarized information on compounds, the organisms that produce those compounds, and assay tests.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>ChemEx facilitates and speeds up chemical data curation by extracting compounds, organisms, and assays from a large collection of publications. The software and corpus can be downloaded from <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/www.biotec.or.th\/isl\/ChemEx\" ext-link-type=\"uri\">http:\/\/www.biotec.or.th\/isl\/ChemEx<\/jats:ext-link>.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-13-s17-s9","type":"journal-article","created":{"date-parts":[[2019,12,11]],"date-time":"2019-12-11T01:59:27Z","timestamp":1576029567000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":23,"title":["ChemEx: information extraction system for chemical data curation"],"prefix":"10.1186","volume":"13","author":[{"given":"Atima","family":"Tharatipyakul","sequence":"first","affiliation":[]},{"given":"Somrak","family":"Numnark","sequence":"additional","affiliation":[]},{"given":"Duangdao","family":"Wichadakul","sequence":"additional","affiliation":[]},{"given":"Supawadee","family":"Ingsriswang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2012,12,13]]},"reference":[{"key":"5473_CR1","unstructured":"ChemBank. [http:\/\/chembank.broadinstitute.org\/]"},{"key":"5473_CR2","volume-title":"Annual Reports in Computational Chemistry","author":"Evan E Bolton","year":"2008","unstructured":"Bolton Evan, Wang Yanli, Thiessen Paul, Bryant Stephen: PubChem: integrated platform of small molecules and biological activities. Annual Reports in Computational Chemistry. 
2008, 4:"},{"issue":"Suppl 1","key":"5473_CR3","doi-asserted-by":"publisher","first-page":"S14","DOI":"10.1186\/1471-2105-6-S1-S14","volume":"6","author":"D Hanisch","year":"2005","unstructured":"Hanisch D, Fundel K, Mevissen H-T, Zimmer R, Fluck J: ProMiner: rule-based protein and gene entity recognition. BMC Bioinformatics. 2005, 6 (Suppl 1): S14-10.1186\/1471-2105-6-S1-S14.","journal-title":"BMC Bioinformatics"},{"key":"5473_CR4","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1093\/bib\/6.1.57","volume":"6","author":"AM Cohen","year":"2005","unstructured":"Cohen AM, Hersh WR: A survey of current work in biomedical text mining. Briefings in Bioinformatics. 2005, 6: 57-71. 10.1093\/bib\/6.1.57.","journal-title":"Briefings in Bioinformatics"},{"key":"5473_CR5","doi-asserted-by":"publisher","first-page":"S4","DOI":"10.1186\/gb-2008-9-s2-s4","volume":"9","author":"M Krallinger","year":"2008","unstructured":"Krallinger M, Leitner F, Rodriguez-Penagos C, Valencia A: Overview of the protein-protein interaction annotation extraction task of BioCreative II. Genome Biology. 2008, 9: S4-","journal-title":"Genome Biology"},{"key":"5473_CR6","unstructured":"GENIA tagger. [http:\/\/www.nactem.ac.uk\/tsujii\/GENIA\/tagger\/]"},{"key":"5473_CR7","doi-asserted-by":"publisher","first-page":"373","DOI":"10.1021\/ci00008a018","volume":"32","author":"JR McDaniel","year":"1992","unstructured":"McDaniel JR, Balmuth JR: Kekule: OCR-optical chemical (structure) recognition. Journal of Chemical Information and Computer Sciences. 1992, 32: 373-378. 10.1021\/ci00008a018.","journal-title":"Journal of Chemical Information and Computer Sciences"},{"key":"5473_CR8","doi-asserted-by":"publisher","first-page":"338","DOI":"10.1021\/ci00013a010","volume":"33","author":"P Ibison","year":"1993","unstructured":"Ibison P, Jacquot M, Kam F, Neville AG, Simpson RW, Tonnelier C, Venczel T, Johnson AP: Chemical literature data extraction: The CLiDE Project. 
Journal of Chemical Information and Computer Sciences. 1993, 33: 338-344. 10.1021\/ci00013a010.","journal-title":"Journal of Chemical Information and Computer Sciences"},{"key":"5473_CR9","doi-asserted-by":"publisher","first-page":"780","DOI":"10.1021\/ci800449t","volume":"49","author":"AT Valko","year":"2009","unstructured":"Valko AT, Johnson AP: CLiDE Pro: The Latest Generation of CLiDE, a Tool for Optical Chemical Structure Recognition. Journal of Chemical Information and Modeling. 2009, 49: 780-787. 10.1021\/ci800449t.","journal-title":"Journal of Chemical Information and Modeling"},{"key":"5473_CR10","doi-asserted-by":"publisher","first-page":"4609","DOI":"10.1109\/IEMBS.2007.4353366","volume-title":"29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2007. EMBS 2007. IEEE","author":"M-E Algorri","year":"2007","unstructured":"Algorri M-E, Zimmermann M, Friedrich CM, Akle S, Hofmann-Apitius M: Reconstruction of chemical molecules from images. 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2007. EMBS 2007. IEEE. 2007, 4609-4612."},{"key":"5473_CR11","doi-asserted-by":"publisher","first-page":"740","DOI":"10.1021\/ci800067r","volume":"49","author":"IV Filippov","year":"2009","unstructured":"Filippov IV, Nicklaus MC: Optical structure recognition software to recover chemical information: OSRA, an open source solution. Journal of Chemical Information and Modeling. 2009, 49: 740-743. 10.1021\/ci800067r.","journal-title":"Journal of Chemical Information and Modeling"},{"key":"5473_CR12","doi-asserted-by":"crossref","unstructured":"Park J, Rosania GR, Shedden KA, Nguyen M, Lyu N, Saitou K: Automated extraction of chemical structure information from digital raster images. Chem Cent J. 
3: 4-4.","DOI":"10.1186\/1752-153X-3-4"},{"key":"5473_CR13","doi-asserted-by":"publisher","first-page":"i268","DOI":"10.1093\/bioinformatics\/btn181","volume":"24","author":"R Klinger","year":"2008","unstructured":"Klinger R, Kol\u00e1\u0159ik C, Fluck J, Hofmann-Apitius M, Friedrich CM: Detection of IUPAC and IUPAC-like chemical names. Bioinformatics. 2008, 24: i268-i276. 10.1093\/bioinformatics\/btn181.","journal-title":"Bioinformatics"},{"key":"5473_CR14","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1145\/1242572.1242607","volume-title":"Proceedings of the 16th international conference on World Wide Web","author":"B Sun","year":"2007","unstructured":"Sun B, Tan Q, Mitra P, Giles CL: Extraction and search of chemical formulae in text documents on the web. Proceedings of the 16th international conference on World Wide Web. 2007, New York, NY, USA: ACM, 251-260."},{"key":"5473_CR15","doi-asserted-by":"publisher","first-page":"549","DOI":"10.1136\/jamia.2010.004036","volume":"17","author":"T Hamon","year":"2010","unstructured":"Hamon T, Grabar N: Linguistic approach for identification of medication names and related information in clinical narratives. Journal of the American Medical Informatics Association. 2010, 17: 549-554. 10.1136\/jamia.2010.004036.","journal-title":"Journal of the American Medical Informatics Association"},{"key":"5473_CR16","volume-title":"AAAI","author":"S Yan","year":"2011","unstructured":"Yan S, Spangler WS, Chen Y: Cross media entity extraction and linkage for chemical documents. AAAI. Edited by: Burgard W, Roth D. 2011, AAAI Press"},{"key":"5473_CR17","doi-asserted-by":"publisher","first-page":"461","DOI":"10.1021\/np068054v","volume":"70","author":"DJ Newman","year":"2007","unstructured":"Newman DJ, Cragg GM: Natural products as sources of new drugs over the last 25 years. Journal of Natural Products. 2007, 70: 461-477. 
10.1021\/np068054v.","journal-title":"Journal of Natural Products"},{"key":"5473_CR18","unstructured":"Poppler - PDF rendering library. [http:\/\/poppler.freedesktop.org\/]"},{"key":"5473_CR19","unstructured":"Simplified molecular-input line-entry system. [http:\/\/en.wikipedia.org\/wiki\/SMILES]"},{"key":"5473_CR20","unstructured":"Chemical table file. [http:\/\/en.wikipedia.org\/wiki\/Chemical_table_file]"},{"key":"5473_CR21","unstructured":"GOCR: open-source character recognition. [http:\/\/jocr.sourceforge.net\/]"},{"key":"5473_CR22","unstructured":"Apache UIMA - Unstructured Information Management applications. [http:\/\/uima.apache.org\/]"},{"key":"5473_CR23","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1186\/1758-2946-3-41","volume":"3","author":"D Jessop","year":"2011","unstructured":"Jessop D, Adams S, Willighagen E, Hawizy L, Murray-Rust P: OSCAR4: a flexible architecture for chemical text-mining. Journal of Cheminformatics. 2011, 3: 41-10.1186\/1758-2946-3-41.","journal-title":"Journal of Cheminformatics"},{"key":"5473_CR24","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1186\/1758-2946-3-17","volume":"3","author":"L Hawizy","year":"2011","unstructured":"Hawizy L, Jessop D, Adams N, Murray-Rust P: ChemicalTagger: A tool for semantic text-mining in chemistry. Journal of Cheminformatics. 2011, 3: 17-10.1186\/1758-2946-3-17.","journal-title":"Journal of Cheminformatics"},{"key":"5473_CR25","doi-asserted-by":"publisher","first-page":"S4","DOI":"10.1186\/1471-2105-9-S11-S4","volume":"9","author":"P Corbett","year":"2008","unstructured":"Corbett P, Copestake A: Cascaded classifiers for confidence-based chemical named entity recognition. BMC Bioinformatics. 2008, 9: S4-","journal-title":"BMC Bioinformatics"},{"key":"5473_CR26","unstructured":"Apache UIMA ConceptMapper Annotator Documentation. 
[http:\/\/uima.apache.org\/d\/uima-addons-current\/ConceptMapper\/ConceptMapperAnnotatorUserGuide.html]"},{"key":"5473_CR27","unstructured":"Integrated Taxonomic Information System. [http:\/\/www.itis.gov\/]"},{"key":"5473_CR28","unstructured":"List of Prokaryotic names with Standing in Nomenclature LPSN. [http:\/\/www.bacterio.cict.fr\/]"},{"key":"5473_CR29","unstructured":"Catalogue of Life. [http:\/\/www.catalogueoflife.org\/]"},{"key":"5473_CR30","doi-asserted-by":"publisher","first-page":"D344","DOI":"10.1093\/nar\/gkm791","volume":"36","author":"K Degtyarenko","year":"2008","unstructured":"Degtyarenko K, de Matos P, Ennis M, Hastings J, Zbinden M, McNaught A, Alcantara R, Darsow M, Guedj M, Ashburner M: ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Research. 2008, 36: D344-D350.","journal-title":"Nucleic Acids Research"},{"key":"5473_CR31","unstructured":"JChemPaint. [http:\/\/sourceforge.net\/apps\/mediawiki\/cdk\/index.php?title=JChemPaint]"},{"key":"5473_CR32","doi-asserted-by":"publisher","first-page":"2498","DOI":"10.1093\/bioinformatics\/btm363","volume":"23","author":"S Ingsriswang","year":"2007","unstructured":"Ingsriswang S, Pacharawongsakda E: sMOL Explorer: an open source, web-enabled database and exploration tool for small MOLecules datasets. Bioinformatics. 2007, 23: 2498-2500. 10.1093\/bioinformatics\/btm363.","journal-title":"Bioinformatics"},{"key":"5473_CR33","unstructured":"ACS Publications. [http:\/\/pubs.acs.org\/]"},{"key":"5473_CR34","unstructured":"CACTVS Chemoinformatics Toolkit Academic. [http:\/\/xemistry.com\/]"},{"key":"5473_CR35","unstructured":"IUPAC - International Union of Pure and Applied Chemistry: The IUPAC International Chemical Identifier (InChI). 
[http:\/\/www.iupac.org\/home\/publications\/e-resources\/inchi.html]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-13-S17-S9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/1471-2105-13-S17-S9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-13-S17-S9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T21:13:36Z","timestamp":1630530816000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-13-S17-S9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,12]]},"references-count":35,"journal-issue":{"issue":"S17","published-print":{"date-parts":[[2012,12]]}},"alternative-id":["5473"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-13-s17-s9","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,12]]},"assertion":[{"value":"13 December 2012","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S9"}}