{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,9]],"date-time":"2026-06-09T07:45:08Z","timestamp":1780991108884,"version":"3.54.1"},"reference-count":20,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,6,6]],"date-time":"2022-06-06T00:00:00Z","timestamp":1654473600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,6,6]],"date-time":"2022-06-06T00:00:00Z","timestamp":1654473600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100007569","name":"Carl-Zeiss-Stiftung","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100007569","id-type":"DOI","asserted-by":"publisher"}]},{"name":"ChemBioSys","award":["CRC1127"],"award-info":[{"award-number":["CRC1127"]}]},{"name":"ChemBioSys","award":["CRC1127"],"award-info":[{"award-number":["CRC1127"]}]},{"DOI":"10.13039\/100012957","name":"Friedrich-Schiller-Universit\u00e4t Jena","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100012957","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The development of deep learning-based optical chemical structure recognition (OCSR) systems has led to a need for datasets of chemical structure depictions. The diversity of the features in the training data is an important factor for the generation of deep learning systems that generalise well and are not overfit to a specific type of input. In the case of chemical structure depictions, these features are defined by the depiction parameters such as bond length, line thickness, label font style and many others.\u00a0Here we present RanDepict, a toolkit for the creation of diverse sets of chemical structure depictions. The diversity of the image features is generated by making use of all available depiction parameters in the depiction functionalities of the CDK, RDKit, and Indigo. Furthermore, there is the option to enhance and augment the image with features such as curved arrows, chemical labels around the structure, or other kinds of distortions.\u00a0Using depiction feature fingerprints, RanDepict ensures diversely picked image features. Here, the depiction and augmentation features are summarised in binary vectors and the MaxMin algorithm is used to pick diverse samples out of all valid options.\u00a0By making all resources described herein publicly available, we hope to contribute to the development of deep learning-based OCSR systems.<\/jats:p>\n                  <jats:p>\n                    <jats:bold>Graphical Abstract<\/jats:bold>\n                  <\/jats:p>","DOI":"10.1186\/s13321-022-00609-4","type":"journal-article","created":{"date-parts":[[2022,6,6]],"date-time":"2022-06-06T05:47:09Z","timestamp":1654494429000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["RanDepict: Random chemical structure depiction generator"],"prefix":"10.1186","volume":"14","author":[{"given":"Henning Otto","family":"Brinkhaus","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Kohulan","family":"Rajan","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Achim","family":"Zielesny","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Christoph","family":"Steinbeck","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2022,6,6]]},"reference":[{"key":"609_CR1","doi-asserted-by":"publisher","first-page":"4506","DOI":"10.1021\/acs.jcim.0c00459","volume":"60","author":"M Oldenhof","year":"2020","unstructured":"Oldenhof M, Arany A, Moreau Y, Simm J (2020) ChemGrapher: optical graph recognition of chemical compounds by deep learning. J Chem Inf Model 60:4506\u20134517","journal-title":"J Chem Inf Model"},{"key":"609_CR2","doi-asserted-by":"publisher","DOI":"10.1002\/cmtd.202100069","author":"I Khokhlov","year":"2022","unstructured":"Khokhlov I, Krasnov L, Fedorov M, Sosnin S (2022) Image2SMILES: transformer-based molecular optical recognition engine. Chem Methods. https:\/\/doi.org\/10.1002\/cmtd.202100069","journal-title":"Chem Methods"},{"key":"609_CR3","doi-asserted-by":"publisher","first-page":"14174","DOI":"10.1039\/D1SC01839F","volume":"12","author":"D-A Clevert","year":"2021","unstructured":"Clevert D-A, Le T, Winter R, Montanari F (2021) Img2Mol - accurate SMILES recognition from molecular graphical depictions. Chem Sci 12:14174\u201314181","journal-title":"Chem Sci"},{"key":"609_CR4","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1186\/s13321-021-00538-8","volume":"13","author":"K Rajan","year":"2021","unstructured":"Rajan K, Zielesny A, Steinbeck C (2021) DECIMER 1.0: deep learning for chemical image recognition using transformers. J Cheminform 13:61","journal-title":"J Cheminform"},{"key":"609_CR5","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1186\/s13321-020-00469-w","volume":"12","author":"K Rajan","year":"2020","unstructured":"Rajan K, Zielesny A, Steinbeck C (2020) DECIMER: towards deep learning for chemical image recognition. J Cheminform 12:65","journal-title":"J Cheminform"},{"key":"609_CR6","doi-asserted-by":"publisher","first-page":"10622","DOI":"10.1039\/D1SC02957F","volume":"12","author":"H Weir","year":"2021","unstructured":"Weir H, Thompson K, Woodward A, Choi B, Braun A, Mart\u00ednez TJ (2021) ChemPix: automated recognition of hand-drawn hydrocarbon structures using deep learning. Chem Sci 12:10622\u201310633","journal-title":"Chem Sci"},{"key":"609_CR7","doi-asserted-by":"publisher","first-page":"1017","DOI":"10.1021\/acs.jcim.8b00669","volume":"59","author":"J Staker","year":"2019","unstructured":"Staker J, Marshall K, Abel R, McQuaw CM (2019) Molecular structure extraction from documents using deep learning. J Chem Inf Model 59:1017\u20131029","journal-title":"J Chem Inf Model"},{"key":"609_CR8","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1186\/s13321-020-00465-0","volume":"12","author":"K Rajan","year":"2020","unstructured":"Rajan K, Brinkhaus HO, Zielesny A, Steinbeck C (2020) A review of optical chemical structure recognition tools. J Cheminform 12:60","journal-title":"J Cheminform"},{"key":"609_CR9","doi-asserted-by":"publisher","DOI":"10.1109\/iciecs.2009.5362936","author":"H Wang","year":"2009","unstructured":"Wang H, Ma C, Zhou L (2009) A brief review of machine learning and its application. 2009 Int Conf Inf Eng Comput Sci. https:\/\/doi.org\/10.1109\/iciecs.2009.5362936","journal-title":"2009 Int Conf Inf Eng Comput Sci"},{"key":"609_CR10","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1186\/s13321-021-00496-1","volume":"13","author":"K Rajan","year":"2021","unstructured":"Rajan K, Brinkhaus HO, Sorokina M, Zielesny A, Steinbeck C (2021) DECIMER-Segmentation: automated extraction of chemical structure depictions from scientific literature. J Cheminform 13:20","journal-title":"J Cheminform"},{"key":"609_CR11","doi-asserted-by":"publisher","DOI":"10.3390\/molecules25051160","author":"PA Runeberg","year":"2020","unstructured":"Runeberg PA, Agustin D, Eklund PC (2020) Formation of tetrahydrofurano-, aryltetralin, and butyrolactone norlignans through the epoxidation of 9-norlignans. Molecules. https:\/\/doi.org\/10.3390\/molecules25051160","journal-title":"Molecules"},{"key":"609_CR12","doi-asserted-by":"publisher","DOI":"10.3390\/molecules25051228","author":"G Zhang","year":"2020","unstructured":"Zhang G, Li Y, Wei W, Li J, Li H, Huang Y, Guo D-A (2020) Metabolomics combined with multivariate statistical analysis for screening of chemical markers between andgentiana scabra and gentiana rigescens. Molecules. https:\/\/doi.org\/10.3390\/molecules25051228","journal-title":"Molecules"},{"key":"609_CR13","doi-asserted-by":"publisher","DOI":"10.3390\/molecules25051237","author":"X-W Luo","year":"2020","unstructured":"Luo X-W, Gao C-H, Lu H-M, Wang J-M, Su Z-Q, Tao H-M, Zhou X-F, Yang B, Liu Y-H (2020) HPLC-DAD-guided isolation of diversified chaetoglobosins from the coral-associated fungus C2F17. Molecules. https:\/\/doi.org\/10.3390\/molecules25051237","journal-title":"Molecules"},{"key":"609_CR14","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1021\/ci025584y","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen E (2003) The chemistry development kit (CDK): an open-source Java library for chemo- and bioinformatics. J Chem Inf Comput Sci 43:493\u2013500","journal-title":"J Chem Inf Comput Sci"},{"key":"609_CR15","unstructured":"RDKit: Open-source cheminformatics. https:\/\/www.rdkit.org\/. Accessed 16 May 2022"},{"key":"609_CR16","unstructured":"Indigo Toolkit. https:\/\/lifescience.opensource.epam.com\/indigo\/. Accessed 25 Jun 2020"},{"key":"609_CR17","doi-asserted-by":"publisher","first-page":"598","DOI":"10.1002\/qsar.200290002","volume":"21","author":"M Ashton","year":"2002","unstructured":"Ashton M, Barnard J, Casset F, Charlton M, Downs G, Gorse D, Holliday J, Lahana R, Willett P (2002) Identification of diverse database subsets using property-based and fragment-based molecular descriptions. Quant Struct Act Relatsh 21:598\u2013604","journal-title":"Quant Struct Act Relatsh"},{"key":"609_CR18","volume-title":"Python 3 reference manual","author":"RG Van","year":"2009","unstructured":"Van RG, Drake F (2009) Python 3 reference manual. CreateSpace, Scotts Valley"},{"key":"609_CR19","volume-title":"JPype","author":"KE Nelson","year":"2020","unstructured":"Nelson KE, Scherer MK, Others (2020) JPype. Lawrence Livermore National Lab (LLNL), Livermore"},{"key":"609_CR20","doi-asserted-by":"publisher","first-page":"740","DOI":"10.1021\/ci800067r","volume":"49","author":"IV Filippov","year":"2009","unstructured":"Filippov IV, Nicklaus MC (2009) Optical structure recognition software to recover chemical information: OSRA, an open source solution. J Chem Inf Model 49:740\u2013743","journal-title":"J Chem Inf Model"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-022-00609-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-022-00609-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-022-00609-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,6]],"date-time":"2022-06-06T17:05:18Z","timestamp":1654535118000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-022-00609-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,6]]},"references-count":20,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["609"],"URL":"https:\/\/doi.org\/10.1186\/s13321-022-00609-4","relation":{"has-preprint":[{"id-type":"doi","id":"10.26434\/chemrxiv-2022-t1kbb","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,6]]},"assertion":[{"value":"28 February 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 May 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"AZ is co-founder of GNWI\u2014Gesellschaft f\u00fcr Naturwissenschaftliche Informatik mbH, Dortmund, Germany.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"31"}}