{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T09:24:54Z","timestamp":1766049894526,"version":"3.41.2"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2022,2,25]],"date-time":"2022-02-25T00:00:00Z","timestamp":1645747200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"HKBU Strategic Development Fund","award":["SDF19\u20130402-P02"],"award-info":[{"award-number":["SDF19\u20130402-P02"]}]},{"name":"Changsha Science and Technology Bureau","award":["kq2001034"],"award-info":[{"award-number":["kq2001034"]}]},{"name":"Changsha Municipal Natural Science Foundation","award":["kq2014144"],"award-info":[{"award-number":["kq2014144"]}]},{"name":"Science and Technology innovation Program of Hunan Province","award":["2021RC4011"],"award-info":[{"award-number":["2021RC4011"]}]},{"name":"Hunan Provincial Science Fund for Distinguished Young Scholars","award":["2021JJ10068"],"award-info":[{"award-number":["2021JJ10068"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U1811462","22173118"],"award-info":[{"award-number":["U1811462","22173118"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2021YFF1201400"],"award-info":[{"award-number":["2021YFF1201400"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               
<jats:p>In scientific documents, structural information for chemical compounds is usually conveyed as images, which computers cannot easily interpret or manipulate. This makes optical chemical structure recognition (OCSR) an essential tool for automatically mining knowledge from an enormous amount of literature. However, existing OCSR methods fall far short of realistic requirements because of their poor recovery accuracy. In this paper, we develop a deep neural network model named ABC-Net (Atom and Bond Center Network) to predict graph structures directly. Following the divide-and-conquer principle, we model each atom or bond as a single point at its center. This lets us use a fully convolutional neural network (CNN) to generate a series of heat-maps that identify these points and predict their properties, such as atom types, atom charges and bond types. The molecular structure can then be recovered by assembling the detected atoms and bonds. Our approach integrates all detection and property prediction tasks into a single fully convolutional network, which is scalable and processes molecular images efficiently. Experimental results demonstrate that our method achieves a significant improvement in recognition performance compared with publicly available tools. 
The proposed method could be considered as a promising solution to OCSR problems and a starting point for the acquisition of molecular information in the literature.<\/jats:p>","DOI":"10.1093\/bib\/bbac033","type":"journal-article","created":{"date-parts":[[2022,1,25]],"date-time":"2022-01-25T12:08:37Z","timestamp":1643112517000},"source":"Crossref","is-referenced-by-count":15,"title":["ABC-Net: a divide-and-conquer based deep learning architecture for SMILES recognition from molecular images"],"prefix":"10.1093","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7187-435X","authenticated-orcid":false,"given":"Xiao-Chen","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Computer Science, National University of Defense Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6823-1882","authenticated-orcid":false,"given":"Jia-Cai","family":"Yi","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, National University of Defense Technology, China"}]},{"given":"Guo-Ping","family":"Yang","sequence":"additional","affiliation":[{"name":"Center of Clinical Pharmacology, the Third Xiangya Hospital, Central South University, China"}]},{"given":"Cheng-Kun","family":"Wu","sequence":"additional","affiliation":[{"name":"Institute for Quantum Information & State Key Laboratory of High-Performance Computing, College of Computer Science and Technology, National University of Defense Technology, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7227-2580","authenticated-orcid":false,"given":"Ting-Jun","family":"Hou","sequence":"additional","affiliation":[{"name":"College of Pharmaceutical Sciences, Zhejiang University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3604-3785","authenticated-orcid":false,"given":"Dong-Sheng","family":"Cao","sequence":"additional","affiliation":[{"name":"Xiangya School of Pharmaceutical Sciences, Central South University, 
China"}]}],"member":"286","published-online":{"date-parts":[[2022,2,25]]},"reference":[{"key":"2022031506402264900_ref1","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1021\/acs.jcim.8b00669","article-title":"Molecular structure extraction from documents using deep learning","volume":"59","author":"Staker","year":"2019","journal-title":"J Chem Inf Model"},{"key":"2022031506402264900_ref2","doi-asserted-by":"crossref","first-page":"D1121","DOI":"10.1093\/nar\/gkx1076","article-title":"Therapeutic target database update 2018: enriched resource for facilitating bench-to-clinic research of targeted therapeutics","volume":"46","author":"Li","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2022031506402264900_ref3","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1021\/ci00057a005","article-title":"SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules","volume":"28","author":"Weininger","year":"1988","journal-title":"J Chem Inf Comput Sci"},{"key":"2022031506402264900_ref4","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1186\/s13321-015-0068-4","article-title":"InChI, the IUPAC international chemical identifier","volume":"7","author":"Heller","year":"2015","journal-title":"J Chem"},{"key":"2022031506402264900_ref5","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1021\/ci00007a012","article-title":"Description of several chemical structure file formats used by computer programs developed at molecular design limited","volume":"32","author":"Dalby","year":"1992","journal-title":"J Chem Inf Comput Sci"},{"key":"2022031506402264900_ref6","first-page":"1","article-title":"A review of optical chemical structure recognition tools","volume":"12","author":"Rajan","year":"2020","journal-title":"J Chem"},{"key":"2022031506402264900_ref7","doi-asserted-by":"crossref","first-page":"829","DOI":"10.1038\/nrg3337","article-title":"Text-mining solutions for biomedical research: enabling 
integrative biology","volume":"13","author":"Rebholz-Schuhmann","year":"2012","journal-title":"Nat Rev Genet"},{"issue":"3","key":"2022031506402264900_ref8","doi-asserted-by":"crossref","first-page":"740","DOI":"10.1021\/ci800067r","article-title":"Optical structure recognition software to recover chemical information: OSRA, an open source solution","volume":"49","author":"Filippov","year":"2009","journal-title":"J Chem Inf Model"},{"key":"2022031506402264900_ref9","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1021\/ci00008a018","article-title":"Kekule: OCR-optical chemical (structure) recognition","volume":"32","author":"McDaniel","year":"1992","journal-title":"J Chem Inf Comput Sci"},{"volume-title":"Abstracts of Papers of the American Chemical Society","year":"2019","author":"Peryea","key":"2022031506402264900_ref10"},{"key":"2022031506402264900_ref11","first-page":"1","article-title":"DECIMER: towards deep learning for chemical image recognition","volume":"12","author":"Rajan","year":"2020","journal-title":"J Chem"},{"key":"2022031506402264900_ref12","doi-asserted-by":"crossref","first-page":"780","DOI":"10.1021\/ci800449t","article-title":"CLiDE Pro: the latest generation of CLiDE, a tool for optical chemical structure recognition","volume":"49","author":"Valko","year":"2009","journal-title":"J Chem Inf Model"},{"volume-title":"Proceedings of The Twentieth Text REtrieval Conference","year":"2011","author":"Smolov","key":"2022031506402264900_ref13"},{"key":"2022031506402264900_ref14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1752-153X-3-4","article-title":"Automated extraction of chemical structure information from digital raster images","volume":"3","author":"Park","year":"2009","journal-title":"Chem Cent J"},{"key":"2022031506402264900_ref15","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/nature24270","article-title":"Mastering the game of go without human 
knowledge","volume":"550","author":"Silver","year":"2017","journal-title":"Nature"},{"key":"2022031506402264900_ref16","first-page":"234","volume-title":"International Conference on Medical Image Computing and Computer-Assisted Intervention","author":"Ronneberger","year":"2015"},{"key":"2022031506402264900_ref17","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2022031506402264900_ref18","doi-asserted-by":"crossref","first-page":"1825","DOI":"10.1093\/bib\/bbz120","article-title":"Convolutional neural network-based annotation of bacterial type IV secretion system effectors with enhanced accuracy and reduced false discovery","volume":"21","author":"Hong","year":"2020","journal-title":"Brief Bioinform"},{"key":"2022031506402264900_ref19","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1016\/j.tips.2017.12.002","article-title":"Clinical success of drug targets prospectively predicted by in silico study","volume":"39","author":"Zhu","year":"2018","journal-title":"Trends Pharmacol Sci"},{"key":"2022031506402264900_ref20","article-title":"Google's neural machine translation system: bridging the gap between human and machine translation","author":"Wu","year":"2016","journal-title":"arXiv preprint arXiv:08144"},{"key":"2022031506402264900_ref21","first-page":"630","volume-title":"European Conference on Computer Vision","author":"He","year":"2016"},{"key":"2022031506402264900_ref22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3295748","article-title":"A comprehensive survey of deep learning for image captioning","volume":"51","author":"Hossain","year":"2019","journal-title":"ACM Comput Surv"},{"key":"2022031506402264900_ref23","first-page":"2048","volume-title":"Proceedings of the 32nd International Conference on Machine 
Learning","author":"Xu","year":"2015"},{"key":"2022031506402264900_ref24","doi-asserted-by":"crossref","DOI":"10.1039\/D1SC01839F","article-title":"Img2Mol \u2013 accurate SMILES recognition from molecular graphical depictions","volume":"12","author":"Clevert","year":"2021","journal-title":"Chem Sci"},{"key":"2022031506402264900_ref25","doi-asserted-by":"crossref","first-page":"bbab327","DOI":"10.1093\/bib\/bbab327","article-title":"Learning to SMILES: BAN-based strategies to improve latent representation learning from molecules","volume":"22","author":"Wu","year":"2021","journal-title":"Brief Bioinform"},{"key":"2022031506402264900_ref26","first-page":"734","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV)","author":"Law","year":"2018"},{"volume-title":"Objects as points, arXiv preprint arXiv:.07850","year":"2019","author":"Zhou","key":"2022031506402264900_ref27"},{"key":"2022031506402264900_ref28","first-page":"5693","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Sun","year":"2019"},{"key":"2022031506402264900_ref29","first-page":"3431","volume-title":"Fully convolutional networks for semantic segmentation","author":"Long","year":"2015"},{"key":"2022031506402264900_ref30","doi-asserted-by":"crossref","first-page":"D1100","DOI":"10.1093\/nar\/gkr777","article-title":"ChEMBL: a large-scale bioactivity database for drug discovery","volume":"40","author":"Gaulton","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2022031506402264900_ref31","first-page":"7482","volume-title":"Multi-task learning using uncertainty to weigh losses for scene geometry and semantics","author":"Kendall","year":"2018"},{"key":"2022031506402264900_ref32","article-title":"Adam: a method for stochastic optimization","author":"Kingma","year":"2014","journal-title":"arXiv preprint arXiv"},{"key":"2022031506402264900_ref33","first-page":"4","article-title":"Rdkit 
documentation","volume":"1","author":"Landrum","year":"2013","journal-title":"Release"},{"key":"2022031506402264900_ref34","doi-asserted-by":"crossref","first-page":"P4","DOI":"10.1186\/1758-2946-3-S1-P4","article-title":"Indigo: universal cheminformatics API","volume":"3","author":"Pavlov","year":"2011","journal-title":"J Chem"},{"key":"2022031506402264900_ref35","first-page":"248","volume-title":"Imagenet: a large-scale hierarchical image database","author":"Deng","year":"2009"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbac033\/42806252\/bbac033.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/2\/bbac033\/42806252\/bbac033.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,15]],"date-time":"2022-03-15T06:51:55Z","timestamp":1647327115000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbac033\/6535678"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,25]]},"references-count":35,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,3,10]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbac033","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"type":"print","value":"1467-5463"},{"type":"electronic","value":"1477-4054"}],"subject":[],"published-other":{"date-parts":[[2022,3]]},"published":{"date-parts":[[2022,2,25]]},"article-number":"bbac033"}}