{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,6]],"date-time":"2026-06-06T16:36:11Z","timestamp":1780763771452,"version":"3.54.1"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,12,27]],"date-time":"2024-12-27T00:00:00Z","timestamp":1735257600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,12,27]],"date-time":"2024-12-27T00:00:00Z","timestamp":1735257600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["239748522"],"award-info":[{"award-number":["239748522"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100012957","name":"Friedrich-Schiller-Universit\u00e4t Jena","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100012957","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Naming chemical compounds systematically is a complex task governed by a set of rules established by the International Union of Pure and Applied Chemistry (IUPAC). These rules are universal and widely accepted by chemists worldwide, but their complexity makes it challenging for individuals to consistently apply them accurately. A translation method can be employed to address this challenge. Accurate translation of chemical compounds from SMILES notation into their corresponding IUPAC names is crucial, as it can significantly streamline the laborious process of naming chemical structures. Here, we present STOUT (SMILES-TO-IUPAC-name translator) V2, which addresses this challenge by introducing a transformer-based model that translates string representations of chemical structures into IUPAC names. Trained on a dataset of nearly 1 billion SMILES strings and their corresponding IUPAC names, STOUT V2 demonstrates exceptional accuracy in generating IUPAC names, even for complex chemical structures. The model's ability to capture intricate patterns and relationships within chemical structures enables it to generate precise and standardised IUPAC names. While established deterministic algorithms remain the gold standard for systematic chemical naming, our work, enabled by access to OpenEye\u2019s Lexichem software through an academic license, demonstrates the potential of neural approaches to complement existing tools in chemical nomenclature.<\/jats:p>\n                  <jats:p>\n                    <jats:bold>Scientific contribution<\/jats:bold>\n                    STOUT V2, built upon transformer-based models, is a significant advancement from our previous work. The web application enhances its accessibility and utility. By making the model and source code fully open and well-documented, we aim to promote unrestricted use and encourage further development.\n                  <\/jats:p>\n                  <jats:p>\n                    <jats:bold>Graphical Abstract<\/jats:bold>\n                  <\/jats:p>","DOI":"10.1186\/s13321-024-00941-x","type":"journal-article","created":{"date-parts":[[2024,12,27]],"date-time":"2024-12-27T06:11:16Z","timestamp":1735279876000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["STOUT V2.0: SMILES to IUPAC name conversion using transformer models"],"prefix":"10.1186","volume":"16","author":[{"given":"Kohulan","family":"Rajan","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Achim","family":"Zielesny","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Christoph","family":"Steinbeck","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2024,12,27]]},"reference":[{"key":"941_CR1","doi-asserted-by":"publisher","unstructured":"Favre HA, Powell WH (2014). Nomenclature of organic chemistry: IUPAC recommendations and preferred names 2013. RSC Publishing. https:\/\/doi.org\/10.1039\/9781849733069.","DOI":"10.1039\/9781849733069"},{"key":"941_CR2","volume-title":"International union of pure and applied chemistry. nomenclature of inorganic chemistry: IUPAC recommendations 2005","author":"T Damhus","year":"2005","unstructured":"Connelly NG, Damhus T, Hartshorn RM, Hutton AT (Eds.). Nomenclature of Inorganic Chemistry: IUPAC Recommendations 2005. RSC Publishing; 2005."},{"key":"941_CR3","volume-title":"A guide to IUPAC nomenclature of organic compounds: recommendations 1993 (including revisions, published and hitherto unpublished, to the 1979 edition of nomenclature of organic chemistry","author":"R Panico","year":"1993","unstructured":"Panico R, Powell WH, Richer JC (1993) A guide to IUPAC nomenclature of organic compounds: recommendations 1993 (including revisions, published and hitherto unpublished, to the 1979 edition of nomenclature of organic chemistry. Wiley-Blackwell, Hoboken"},{"key":"941_CR4","unstructured":"Tinley EH (2013) Naming organic compounds: A guide to the nomenclature used in organic chemistry. Literary Licensing, LLC."},{"key":"941_CR5","unstructured":"Incz\u00e9dy J, Lengyel T (1998) International union of pure and applied chemistry compendium of analytical nomenclature definitive rules 1997. Institut d\u2019Estudis Catalans: Barcelona."},{"key":"941_CR6","unstructured":"Werd S. Mnova 15.0.1. https:\/\/mestrelab.com\/download_file\/mnova-15-0-1\/. Accessed 1 July 2024."},{"key":"941_CR7","unstructured":"Molconvert. https:\/\/docs.chemaxon.com\/display\/lts-lithium\/molconvert.md . Accessed 1 July 2024."},{"key":"941_CR8","unstructured":"Convert chemical structures and chemical names. https:\/\/www.eyesopen.com\/lexichem-tk. Accessed 1 July 2024."},{"key":"941_CR9","unstructured":"Generate IUPAC names for chemical structures. https:\/\/www.acdlabs.com\/products\/name\/. Accessed 1 July 2024."},{"key":"941_CR10","unstructured":"Website available online: ChemAxon\u2014software solutions and services for chemistry & biology. https:\/\/www.chemaxon.com."},{"key":"941_CR11","unstructured":"OpenEye toolkits 2023.1. OpenEye, cadence molecular sciences, Santa Fe, NM. http:\/\/www.eyesopen.com."},{"key":"941_CR12","doi-asserted-by":"publisher","first-page":"1071","DOI":"10.1007\/s11831-019-09344-w","volume":"27","author":"S Dargan","year":"2020","unstructured":"Dargan S, Kumar M, Ayyagari MR, Kumar G (2020) A survey of deep learning and its applications: a new paradigm to machine learning. Arch Comput Method Eng 27:1071\u20131092. https:\/\/doi.org\/10.1007\/s11831-019-09344-w","journal-title":"Arch Comput Method Eng"},{"key":"941_CR13","doi-asserted-by":"publisher","first-page":"91","DOI":"10.3390\/computers12050091","volume":"12","author":"MM Taye","year":"2023","unstructured":"Taye MM (2023) Understanding of machine learning with deep learning: architectures, workflow. Appl Fut Dir Comput 12:91. https:\/\/doi.org\/10.3390\/computers12050091","journal-title":"Appl Fut Dir Comput"},{"key":"941_CR14","doi-asserted-by":"publisher","DOI":"10.1016\/j.nlp.2023.100026","volume":"4","author":"W Khan","year":"2023","unstructured":"Khan W, Daud A, Khan K, Muhammad S, Haq R (2023) Exploring the frontiers of deep learning and natural language processing: a comprehensive overview of key challenges and emerging trends. Nat Lang Process J 4:100026. https:\/\/doi.org\/10.1016\/j.nlp.2023.100026","journal-title":"Nat Lang Process J"},{"key":"941_CR15","unstructured":"Yang S, Wang Y, Chu X. (2020) A survey of deep learning techniques for neural machine translation. arXiv [cs.CL]."},{"key":"941_CR16","unstructured":"Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A et al. (2020) Language models are few-shot learners. arXiv [cs.CL]"},{"key":"941_CR17","doi-asserted-by":"publisher","unstructured":"Chang EY. (2023) Examining GPT-4\u2019s capabilities and enhancement with socrasynth. In: proceedings of the 2023 international conference on computational science and computational intelligence (CSCI). IEEE. Pp. 7\u201314. https:\/\/doi.org\/10.1109\/CSCI62032.2023.00009.","DOI":"10.1109\/CSCI62032.2023.00009"},{"key":"941_CR18","doi-asserted-by":"publisher","first-page":"6091","DOI":"10.1039\/c8sc02339e","volume":"9","author":"P Schwaller","year":"2018","unstructured":"Schwaller P, Gaudin T, L\u00e1nyi D, Bekas C, Laino T (2018) \u2018Found in translation\u2019: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models. Chem Sci 9:6091\u20136098. https:\/\/doi.org\/10.1039\/c8sc02339e","journal-title":"Chem Sci"},{"key":"941_CR19","doi-asserted-by":"publisher","first-page":"5045","DOI":"10.1038\/s41467-023-40782-0","volume":"14","author":"K Rajan","year":"2023","unstructured":"Rajan K, Brinkhaus HO, Agea MI, Zielesny A, Steinbeck C (2023) DECIMER. ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications. Nat Commun 14:5045. https:\/\/doi.org\/10.1038\/s41467-023-40782-0","journal-title":"Nat Commun"},{"key":"941_CR20","doi-asserted-by":"publisher","DOI":"10.3390\/ph16060891","author":"A Blanco-Gonz\u00e1lez","year":"2023","unstructured":"Blanco-Gonz\u00e1lez A, Cabez\u00f3n A, Seco-Gonz\u00e1lez A, Conde-Torres D, Antelo-Riveiro P, Pi\u00f1eiro \u00c1, Garcia-Fandino R (2023) The role of AI in drug discovery: challenges, opportunities, and strategies. Pharmaceuticals. https:\/\/doi.org\/10.3390\/ph16060891","journal-title":"Pharmaceuticals"},{"key":"941_CR21","unstructured":"Ertl P, Lewis R, Martin E, Polyakov V. (2017) In Silico generation of novel, drug-like chemical matter using the LSTM neural network. arXiv [cs.LG]."},{"key":"941_CR22","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1186\/s13321-017-0235-x","volume":"9","author":"M Olivecrona","year":"2017","unstructured":"Olivecrona M, Blaschke T, Engkvist O, Chen H (2017) Molecular de-novo design through deep reinforcement learning. J Cheminform 9:48. https:\/\/doi.org\/10.1186\/s13321-017-0235-x","journal-title":"J Cheminform"},{"key":"941_CR23","doi-asserted-by":"publisher","first-page":"695","DOI":"10.1021\/acs.jcim.2c01191","volume":"63","author":"YA Ivanenkov","year":"2023","unstructured":"Ivanenkov YA, Polykovskiy D, Bezrukov D, Zagribelnyy B, Aladinskiy V, Kamya P, Aliper A, Ren F, Zhavoronkov A (2023) Chemistry42: an ai-driven platform for molecular design and optimization. J Chem Inf Model 63:695\u2013701. https:\/\/doi.org\/10.1021\/acs.jcim.2c01191","journal-title":"J Chem Inf Model"},{"key":"941_CR24","doi-asserted-by":"publisher","first-page":"3197","DOI":"10.1021\/acs.jcim.1c00619","volume":"61","author":"ZJ Baum","year":"2021","unstructured":"Baum ZJ, Yu X, Ayala PY, Zhao Y, Watkins SP, Zhou Q (2021) Artificial intelligence in chemistry: current trends and future directions. J Chem Inf Model 61:3197\u20133212. https:\/\/doi.org\/10.1021\/acs.jcim.1c00619","journal-title":"J Chem Inf Model"},{"key":"941_CR25","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1186\/s13321-021-00535-x","volume":"13","author":"J Handsel","year":"2021","unstructured":"Handsel J, Matthews B, Knight NJ, Coles SJ (2021) Translating the InChI: adapting neural machine translation to predict IUPAC names from a chemical identifier. J Cheminform 13:79. https:\/\/doi.org\/10.1186\/s13321-021-00535-x","journal-title":"J Cheminform"},{"issue":"1","key":"941_CR26","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1186\/s13321-021-00512-4","volume":"13","author":"K Rajan","year":"2021","unstructured":"Rajan K, Zielesny A, Steinbeck C. (2021) STOUT: SMILES to IUPAC names using neural machine translation. J Cheminform 13(1):34. https:\/\/doi.org\/10.1186\/s13321-021-00512-4.","journal-title":"J Cheminform"},{"key":"941_CR27","doi-asserted-by":"publisher","first-page":"14798","DOI":"10.1038\/s41598-021-94082-y","volume":"11","author":"L Krasnov","year":"2021","unstructured":"Krasnov L, Khokhlov I, Fedorov MV, Sosnin S (2021) Transformer-based artificial neural networks for the conversion between chemical notations. Sci Rep 11:14798. https:\/\/doi.org\/10.1038\/s41598-021-94082-y","journal-title":"Sci Rep"},{"key":"941_CR28","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1021\/ci100384d","volume":"51","author":"DM Lowe","year":"2011","unstructured":"Lowe DM, Corbett PT, Murray-Rust P, Glen RC (2011) Chemical name to structure: OPSIN, an open source solution. J Chem Inf Model 51:739\u2013753. https:\/\/doi.org\/10.1021\/ci100384d","journal-title":"J Chem Inf Model"},{"key":"941_CR29","doi-asserted-by":"publisher","first-page":"D1373","DOI":"10.1093\/nar\/gkac956","volume":"51","author":"S Kim","year":"2023","unstructured":"Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, Li Q, Shoemaker BA, Thiessen PA, Yu B et al (2023) PubChem 2023 update. Nucl Acid Res 51:D1373\u2013D1380. https:\/\/doi.org\/10.1093\/nar\/gkac956","journal-title":"Nucl Acid Res"},{"key":"941_CR30","unstructured":"ChemAxon. Molconvert: part of Marvin Suite 20.15: Cheminformatics toolkit for structure file conversion and rendering [Software]. Available online: https:\/\/chemaxon.com. Accessed on 14 Oct 2024."},{"key":"941_CR31","volume-title":"An elementary mathematical theory of classification and prediction","author":"TT Tanimoto","year":"1958","unstructured":"Tanimoto TT (1958) An elementary mathematical theory of classification and prediction. International Business Machines Corporation, New York"},{"key":"941_CR32","unstructured":"Molecular Modeling Software. http:\/\/www.eyesopen.com. Accessed 5 August 2024."},{"key":"941_CR33","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1186\/s13321-019-0398-8","volume":"11","author":"A Dalke","year":"2019","unstructured":"Dalke A (2019) The Chemfp Project. J Cheminform 11:76. https:\/\/doi.org\/10.1186\/s13321-019-0398-8","journal-title":"J Cheminform"},{"key":"941_CR34","doi-asserted-by":"publisher","first-page":"598","DOI":"10.1002\/qsar.200290002","volume":"21","author":"M Ashton","year":"2002","unstructured":"Ashton M, Barnard J, Casset F, Charlton M, Downs G, Gorse D, Holliday J, Lahana R, Willett P (2002) Identification of diverse database subsets using property-based and fragment-based molecular descriptions. Quant Struct Act Relatsh 21:598\u2013604. https:\/\/doi.org\/10.1002\/qsar.200290002","journal-title":"Quant Struct Act Relatsh"},{"key":"941_CR35","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I. (2017) Attention is all you need. arXiv [cs.CL]."},{"key":"941_CR36","unstructured":"Yu T, Zhu H. (2020) Hyper-parameter optimization: a review of algorithms and applications. arXiv [cs.LG]."},{"key":"941_CR37","unstructured":"Vue.js. https:\/\/vuejs.org. Accessed 14 Oct 2024."},{"key":"941_CR38","unstructured":"FastAPI. https:\/\/fastapi.tiangolo.com. Accessed 14 Oct 2024."},{"key":"941_CR39","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1186\/s13321-020-00469-w","volume":"12","author":"K Rajan","year":"2020","unstructured":"Rajan K, Zielesny A, Steinbeck C (2020) DECIMER: towards deep learning for chemical image recognition. J Cheminform 12:65. https:\/\/doi.org\/10.1186\/s13321-020-00469-w","journal-title":"J Cheminform"},{"key":"941_CR40","doi-asserted-by":"publisher","unstructured":"Papineni K, Roukos S, Ward T, Zhu WJ. (2002) Bleu: A method for automatic evaluation of machine translation. In: proceedings of the proceedings of the 40th annual meeting of the association for computational linguistics. Pp. 311\u2013318. https:\/\/doi.org\/10.3115\/1073083.1073135.","DOI":"10.3115\/1073083.1073135"},{"key":"941_CR41","volume-title":"Dataset shift in machine learning","author":"J Quinonero-Candela","year":"2022","unstructured":"Quinonero-Candela J, Sugiyama M, Schwaighofer A, Lawrence ND (2022) Dataset shift in machine learning. MIT Press, Cambridge"},{"key":"941_CR42","doi-asserted-by":"publisher","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","volume":"22","author":"SJ Pan","year":"2010","unstructured":"Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22:1345\u20131359. https:\/\/doi.org\/10.1109\/TKDE.2009.191","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"941_CR43","doi-asserted-by":"publisher","DOI":"10.1186\/s13321-017-0220-4","author":"EL Willighagen","year":"2017","unstructured":"Willighagen EL, Mayfield JW, Alvarsson J, Berg A, Carlsson L, Jeliazkova N, Kuhn S, Pluskal T, Rojas-Chert\u00f3 M, Spjuth O et al (2017) The chemistry development kit (CDK) v20.: atom typing, depiction, molecular formulas, and substructure searching. J Cheminform. https:\/\/doi.org\/10.1186\/s13321-017-0220-4","journal-title":"J Cheminform"},{"key":"941_CR44","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1021\/ci025584y","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen E (2003) The chemistry development kit (CDK): an open-source java library for chemo- and bioinformatics. J Chem Inf Comput Sci 43:493\u2013500. https:\/\/doi.org\/10.1021\/ci025584y","journal-title":"J Chem Inf Comput Sci"},{"key":"941_CR45","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-3-S1-P3","volume":"3","author":"B Karulin","year":"2011","unstructured":"Karulin B, Kozhevnikov M (2011) Ketcher: web-based chemical structure editor. J Cheminform 3:1\u20131. https:\/\/doi.org\/10.1186\/1758-2946-3-S1-P3","journal-title":"J Cheminform"},{"key":"941_CR46","unstructured":"Chollet, F., et al. Keras. https:\/\/keras.io."}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-024-00941-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-024-00941-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-024-00941-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,27]],"date-time":"2024-12-27T07:04:56Z","timestamp":1735283096000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-024-00941-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,27]]},"references-count":46,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["941"],"URL":"https:\/\/doi.org\/10.1186\/s13321-024-00941-x","relation":{"has-preprint":[{"id-type":"doi","id":"10.26434\/chemrxiv-2024-089vs","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,27]]},"assertion":[{"value":"14 August 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 December 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 December 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"The authors have given their consent for the work to be published.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"AZ is co-founder of GNWI\u2014Gesellschaft f\u00fcr naturwissenschaftliche Informatik mbH, Dortmund, Germany. The remaining authors declare no financial and non-financial competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"146"}}