{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,3]],"date-time":"2026-07-03T14:46:38Z","timestamp":1783089998890,"version":"3.54.6"},"reference-count":29,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,6,9]],"date-time":"2022-06-09T00:00:00Z","timestamp":1654732800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,6,9]],"date-time":"2022-06-09T00:00:00Z","timestamp":1654732800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100007569","name":"Carl-Zeiss-Stiftung","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100007569","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["CRC1127 ChemBioSys"],"award-info":[{"award-number":["CRC1127 ChemBioSys"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100012957","name":"Friedrich-Schiller-Universit\u00e4t Jena","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100012957","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    The translation of images of chemical structures into machine-readable representations of the depicted molecules is known as optical chemical structure recognition (OCSR). There has been a lot of progress over the last three decades in this field, but the development of systems for the recognition of complex hand-drawn structure depictions is still at the beginning. Currently, there is no data for the systematic evaluation of OCSR methods on hand-drawn structures available. Here we present\n                    <jats:italic>DECIMER\u00a0\u2014\u00a0Hand-drawn molecule images<\/jats:italic>\n                    , a standardised, openly available benchmark dataset of 5088 hand-drawn depictions of diversely picked chemical structures. Every structure depiction in the dataset is mapped to a machine-readable representation of the underlying molecule. The dataset is openly available and published under the CC-BY 4.0 licence which applies very few limitations. We hope that it will contribute to the further development of the field.\n                  <\/jats:p>\n                  <jats:p>\n                    <jats:bold>Graphical Abstract<\/jats:bold>\n                  <\/jats:p>","DOI":"10.1186\/s13321-022-00620-9","type":"journal-article","created":{"date-parts":[[2022,6,9]],"date-time":"2022-06-09T10:02:48Z","timestamp":1654768968000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["DECIMER\u2014hand-drawn molecule images dataset"],"prefix":"10.1186","volume":"14","author":[{"given":"Henning Otto","family":"Brinkhaus","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Achim","family":"Zielesny","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Christoph","family":"Steinbeck","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kohulan","family":"Rajan","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2022,6,9]]},"reference":[{"key":"620_CR1","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1186\/s13321-020-00465-0","volume":"12","author":"K Rajan","year":"2020","unstructured":"Rajan K, Brinkhaus HO, Zielesny A, Steinbeck C (2020) A review of optical chemical structure recognition tools. J Cheminform 12:60 [cito:cites] [cito:citesAsAuthority]","journal-title":"J Cheminform"},{"key":"620_CR2","doi-asserted-by":"publisher","first-page":"373","DOI":"10.1021\/ci00008a018","volume":"32","author":"JR McDaniel","year":"1992","unstructured":"McDaniel JR, Balmuth JR (1992) Kekule: OCR-optical chemical (structure) recognition. J Chem Inf Comput Sci 32:373\u2013378 [cito:cites]","journal-title":"J Chem Inf Comput Sci"},{"key":"620_CR3","doi-asserted-by":"crossref","unstructured":"Casey R, Boyer S, Healey P, Miller A, Oudot B, Zilles K (1993) Optical recognition of chemical graphics. In: Proceedings of 2nd International Conference on Document Analysis and Recognition (ICDAR \u201993), pp 627\u2013631 [cito:cites]","DOI":"10.1109\/ICDAR.1993.395658"},{"key":"620_CR4","doi-asserted-by":"publisher","first-page":"338","DOI":"10.1021\/ci00013a010","volume":"33","author":"P Ibison","year":"1993","unstructured":"Ibison P, Jacquot M, Kam F, Neville AG, Simpson RW, Tonnelier C, Venczel T, Johnson AP (1993) Chemical literature data extraction: the CLiDE project. J Chem Inf Comput Sci 33:338\u2013344 [cito:cites]","journal-title":"J Chem Inf Comput Sci"},{"key":"620_CR5","doi-asserted-by":"publisher","first-page":"780","DOI":"10.1021\/ci800449t","volume":"49","author":"AT Valko","year":"2009","unstructured":"Valko AT, Johnson AP (2009) CLiDE Pro: the latest generation of CLiDE, a tool for optical chemical structure recognition. J Chem Inf Model 49:780\u2013787 [cito:cites]","journal-title":"J Chem Inf Model"},{"key":"620_CR6","doi-asserted-by":"crossref","unstructured":"Zimmermann M (2011) Chemical structure reconstruction with chemoCR. In: The Twentieth Text REtrieval conference (TREC 2011) Proceedings [cito:cites]","DOI":"10.6028\/NIST.SP.500-296.chemical-chemoCR"},{"key":"620_CR7","doi-asserted-by":"publisher","first-page":"740","DOI":"10.1021\/ci800067r","volume":"49","author":"IV Filippov","year":"2009","unstructured":"Filippov IV, Nicklaus MC (2009) Optical structure recognition software to recover chemical information: OSRA, an open-source solution. J Chem Inf Model 49:740\u2013743 [cito:cites]","journal-title":"J Chem Inf Model"},{"key":"620_CR8","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/1752-153X-3-4","volume":"3","author":"J Park","year":"2009","unstructured":"Park J, Rosania GR, Shedden KA, Nguyen M, Lyu N, Saitou K (2009) Automated extraction of chemical structure information from digital raster images. Chem Cent J 3:4 [cito:cites]","journal-title":"Chem Cent J"},{"key":"620_CR9","unstructured":"Sadawi N (2009) Recognising chemical formulas from molecule depictions. In: Pre-proceedings of the 8th IAPR international workshop on graphics recognition (GREC 2009). pp 167\u2013175 [cito:cites]"},{"issue":"Suppl 17","key":"620_CR10","doi-asserted-by":"publisher","first-page":"S9","DOI":"10.1186\/1471-2105-13-S17-S9","volume":"13","author":"A Tharatipyakul","year":"2012","unstructured":"Tharatipyakul A, Numnark S, Wichadakul D, Ingsriswang S (2012) ChemEx: information extraction system for chemical data curation. BMC Bioinformatics 13(Suppl 17):S9 [cito:cites]","journal-title":"BMC Bioinformatics"},{"key":"620_CR11","doi-asserted-by":"publisher","first-page":"2059","DOI":"10.1021\/acs.jcim.0c00042","volume":"60","author":"EJ Beard","year":"2020","unstructured":"Beard EJ, Cole JM (2020) Chemschematicresolver: a toolkit to decode 2D chemical diagrams with labels and R-groups into annotated chemical named entities. J Chem Inf Model 60:2059\u20132072 [cito:cites]","journal-title":"J Chem Inf Model"},{"key":"620_CR12","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1186\/s13321-021-00538-8","volume":"13","author":"K Rajan","year":"2021","unstructured":"Rajan K, Zielesny A, Steinbeck C (2021) DECIMER 1.0: deep learning for chemical image recognition using transformers. J Cheminform 13:61 [cito:cites] [cito:citesAsAuthority] [cito:extends]","journal-title":"J Cheminform"},{"key":"620_CR13","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1186\/s13321-020-00469-w","volume":"12","author":"K Rajan","year":"2020","unstructured":"Rajan K, Zielesny A, Steinbeck C (2020) DECIMER: towards deep learning for chemical image recognition. J Cheminform 12:65 [cito:cites] [cito:citesAsAuthority] [cito:extends]","journal-title":"J Cheminform"},{"key":"620_CR14","doi-asserted-by":"publisher","DOI":"10.1039\/D1SC01839F","author":"D-A Clevert","year":"2021","unstructured":"Clevert D-A, Le T, Winter R, Montanari F (2021) Img2Mol\u2014accurate SMILES recognition from molecular graphical depictions. Chem Sci. https:\/\/doi.org\/10.1039\/D1SC01839F [cito:cites] [cito:agreesWith]","journal-title":"Chem Sci"},{"key":"620_CR15","doi-asserted-by":"publisher","first-page":"10622","DOI":"10.1039\/D1SC02957F","volume":"12","author":"H Weir","year":"2021","unstructured":"Weir H, Thompson K, Woodward A, Choi B, Braun A, Mart\u00ednez TJ (2021) ChemPix: automated recognition of hand-drawn hydrocarbon structures using deep learning. Chem Sci 12:10622\u201310633 [cito:cites]","journal-title":"Chem Sci"},{"key":"620_CR16","doi-asserted-by":"publisher","first-page":"4506","DOI":"10.1021\/acs.jcim.0c00459","volume":"60","author":"M Oldenhof","year":"2020","unstructured":"Oldenhof M, Arany A, Moreau Y, Simm J (2020) Chemgrapher: optical graph recognition of chemical compounds by deep learning. J Chem Inf Model 60:4506\u20134517 [cito:cites]","journal-title":"J Chem Inf Model"},{"key":"620_CR17","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbac033","author":"X-C Zhang","year":"2022","unstructured":"Zhang X-C, Yi J-C, Yang G-P, Wu C-K, Hou T-J, Cao D-S (2022) ABC-Net: a divide-and-conquer based deep learning architecture for SMILES recognition from molecular images. Brief Bioinform. https:\/\/doi.org\/10.1093\/bib\/bbac033 [cito:cites]","journal-title":"Brief Bioinform"},{"key":"620_CR18","doi-asserted-by":"publisher","DOI":"10.1002\/cmtd.202100069","author":"I Khokhlov","year":"2022","unstructured":"Khokhlov I, Krasnov L, Fedorov MV, Sosnin S (2022) Image2SMILES: transformer-based molecular optical recognition engine. Chem Methods. https:\/\/doi.org\/10.1002\/cmtd.202100069 [cito:cites]","journal-title":"Chem Methods"},{"key":"620_CR19","unstructured":"Osra (2022) https:\/\/sourceforge.net\/p\/osra\/wiki\/Validation\/. Accessed 30 Mar 2022 [cito:cites] [cito:citesAsDataSource]"},{"key":"620_CR20","first-page":"846","volume":"7","author":"TY Ouyang","year":"2007","unstructured":"Ouyang TY, Davis R (2007) Recognition of hand drawn chemical diagrams. AAAI 7:846\u2013851 [cito:cites]","journal-title":"AAAI"},{"key":"620_CR21","doi-asserted-by":"crossref","unstructured":"Ramel J-Y, Boissier G, Emptoz H (1999) Automatic reading of handwritten chemical formulas from a structural representation of the image. In: Proceedings of the 5th International Conference on Document Analysis and Recognition, ICDAR \u201999 (Cat. No.PR00318), pp 83\u201386 [cito:cites]","DOI":"10.1109\/ICDAR.1999.791730"},{"key":"620_CR22","unstructured":"Vision Arcanum: InkToMolecule online. https:\/\/visionarcanum.com\/ink2mol\/. Accessed 30 Mar 2022 [cito:cites]"},{"key":"620_CR23","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31\u201336 [cito:usesMethodIn]","journal-title":"J Chem Inf Comput Sci"},{"key":"620_CR24","doi-asserted-by":"publisher","first-page":"D1388","DOI":"10.1093\/nar\/gkaa971","volume":"49","author":"S Kim","year":"2021","unstructured":"Kim S, Chen J, Cheng T et al (2021) PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Res 49:D1388\u2013D1395 [cito:citesAsDataSource] [cito:usesDataFrom]","journal-title":"Nucleic Acids Res"},{"key":"620_CR25","doi-asserted-by":"publisher","first-page":"598","DOI":"10.1002\/qsar.200290002","volume":"21","author":"M Ashton","year":"2002","unstructured":"Ashton M, Barnard J, Casset F, Charlton M, Downs G, Gorse D, Holliday J, Lahana R, Willett P (2002) Identification of diverse database subsets using property-based and fragment-based molecular descriptions. Quant struct-act relatsh 21:598\u2013604 [cito:usesMethodIn] [cito:cites]","journal-title":"Quant struct-act relatsh"},{"key":"620_CR26","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1021\/c160017a018","volume":"5","author":"HL Morgan","year":"1965","unstructured":"Morgan HL (1965) The generation of a unique machine description for chemical structures-A technique developed at chemical abstracts service. J Chem Doc 5:107\u2013113 [cito:usesMethodIn] [cito:cites]","journal-title":"J Chem Doc"},{"key":"620_CR27","unstructured":"Mayfield J, Swain M, Willighagen E (2022) CDK Depict. In: GitHub. https:\/\/github.com\/cdk\/depict. Accessed 4 Mar 2022 [cito:cites] [cito:usesMethodIn]"},{"key":"620_CR28","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1021\/ci025584y","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen E (2003) The chemistry development kit (CDK): an open-source Java library for chemo- and bioinformatics. J Chem Inf Comput Sci 43:493\u2013500 [cito:usesMethodIn]","journal-title":"J Chem Inf Comput Sci"},{"key":"620_CR29","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1162\/dint_r_00024","volume":"2","author":"A Jacobsen","year":"2020","unstructured":"Jacobsen A, de Miranda AR, Juty N et al (2020) FAIR principles: Interpretations and implementation considerations. Data Intelligence 2:10\u201329 [cito:agreesWith]","journal-title":"Data Intelligence"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-022-00620-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-022-00620-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-022-00620-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,26]],"date-time":"2024-09-26T14:22:00Z","timestamp":1727360520000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-022-00620-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,9]]},"references-count":29,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["620"],"URL":"https:\/\/doi.org\/10.1186\/s13321-022-00620-9","relation":{"has-preprint":[{"id-type":"doi","id":"10.26434\/chemrxiv-2022-6gfch","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,9]]},"assertion":[{"value":"20 April 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 May 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 June 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 July 2022","order":4,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Update","order":5,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"I CiTO annotations were added in the References.","order":6,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"AZ is co-founder of GNWI\u2014Gesellschaft f\u00fcr naturwissenschaftliche Informatik mbH, Dortmund, Germany.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"36"}}