{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T02:38:15Z","timestamp":1773196695561,"version":"3.50.1"},"reference-count":28,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,4,27]],"date-time":"2021-04-27T00:00:00Z","timestamp":1619481600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,4,27]],"date-time":"2021-04-27T00:00:00Z","timestamp":1619481600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Carl-Zeiss-Foundation"},{"DOI":"10.13039\/100012957","name":"Friedrich-Schiller-Universit\u00e4t Jena","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100012957","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Chemical compounds can be identified through a graphical depiction, a suitable string representation, or a chemical name. A universally accepted naming scheme for chemistry was established by the International Union of Pure and Applied Chemistry (IUPAC) based on a set of rules. Due to the complexity of this ruleset a correct chemical name assignment remains challenging for human beings and there are only a few rule-based cheminformatics toolkits available that support this task in an automated manner. Here we present STOUT (\n                    <jats:bold>S<\/jats:bold>\n                    MILES-\n                    <jats:bold>TO<\/jats:bold>\n                    -I\n                    <jats:bold>U<\/jats:bold>\n                    PAC-name\n                    <jats:bold>t<\/jats:bold>\n                    ranslator), a deep-learning neural machine translation approach to generate the IUPAC name for a given molecule from its SMILES string as well as the reverse translation, i.e. predicting the SMILES string from the IUPAC name. In both cases, the system is able to predict with an average BLEU score of about 90% and a Tanimoto similarity index of more than 0.9. Also incorrect predictions show a remarkable similarity between true and predicted compounds.\n                  <\/jats:p>","DOI":"10.1186\/s13321-021-00512-4","type":"journal-article","created":{"date-parts":[[2021,4,27]],"date-time":"2021-04-27T04:03:41Z","timestamp":1619496221000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":45,"title":["STOUT: SMILES to IUPAC names using neural machine translation"],"prefix":"10.1186","volume":"13","author":[{"given":"Kohulan","family":"Rajan","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Achim","family":"Zielesny","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6966-0814","authenticated-orcid":false,"given":"Christoph","family":"Steinbeck","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,4,27]]},"reference":[{"key":"512_CR1","unstructured":"Contributors to Wikimedia projects (2004) List of chemical compounds with unusual names. https:\/\/en.wikipedia.org\/wiki\/List_of_chemical_compounds_with_unusual_names. Accessed 1 Dec 2020"},{"key":"512_CR2","doi-asserted-by":"crossref","DOI":"10.1039\/9781849733069","volume-title":"Nomenclature of Organic Chemistry: IUPAC Recommendations and Preferred Names 2013","author":"HA Favre","year":"2013","unstructured":"Favre HA, Powell WH (2013) Nomenclature of Organic Chemistry: IUPAC Recommendations and Preferred Names 2013. Royal Society of Chemistry, London"},{"key":"512_CR3","doi-asserted-by":"crossref","unstructured":"Nomenclature of Inorganic Chemistry \u2013 IUPAC Recommendations 2005. Chem Int 27:25\u201326","DOI":"10.1515\/ci.2005.27.1.22b"},{"key":"512_CR4","volume-title":"Compendium of analytical nomenclature","author":"J Inczedy","year":"1998","unstructured":"Inczedy J, Lengyel T, Ure AM, Gelencs\u00e9r A, Hulanicki A, Others, (1998) Compendium of analytical nomenclature. Blackwell Science, Hoboken"},{"key":"512_CR5","unstructured":"Union internationale de chimie pure et appliqu\u00e9e. Physical, International Union of Pure and Applied Chemistry. Physical and Biophysical Chemistry Division (2007) Quantities, Units and Symbols in Physical Chemistry. Royal Society of Chemistry"},{"key":"512_CR6","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31\u201336","journal-title":"J Chem Inf Comput Sci"},{"key":"512_CR7","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1186\/s13321-015-0068-4","volume":"7","author":"SR Heller","year":"2015","unstructured":"Heller SR, McNaught A, Pletnev I, Stein S, Tchekhovskoi D (2015) InChI, the IUPAC international chemical identifier. J Cheminform 7:23","journal-title":"J Cheminform"},{"key":"512_CR8","doi-asserted-by":"publisher","first-page":"2294","DOI":"10.1021\/ci7004687","volume":"48","author":"RW Homer","year":"2008","unstructured":"Homer RW, Swanson J, Jilek RJ, Hurst T, Clark RD (2008) SYBYL line notation (SLN): a single notation to represent chemical structures, queries, reactions, and virtual libraries. J ChemInf Model 48:2294\u20132307","journal-title":"J ChemInf Model"},{"key":"512_CR9","volume-title":"A line-formula chemical notation","author":"WJ Wiswesser","year":"1954","unstructured":"Wiswesser WJ (1954) A line-formula chemical notation. Thomas Crowell Company publishers, Washington"},{"key":"512_CR10","unstructured":"Website. Daylight Inc. 4. SMARTS\u2014a language for describing molecular patterns. http:\/\/www.daylight.com\/dayhtml\/doc\/theory\/theory.smarts.html. Accessed 16 Dec 2020"},{"key":"512_CR11","unstructured":"ChemAxon - Software Solutions and Services for Chemistry & Biology. https:\/\/www.chemaxon.com. Accessed 23 Nov 2020"},{"key":"512_CR12","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1021\/ci025584y","volume":"43","author":"C Steinbeck","year":"2003","unstructured":"Steinbeck C, Han Y, Kuhn S, Horlacher O, Luttmann E, Willighagen E (2003) The chemistry development kit (CDK): an open-source Java library for chemo- and bioinformatics. J Chem Inf Comput Sci 43:493\u2013500","journal-title":"J Chem Inf Comput Sci"},{"key":"512_CR13","unstructured":"Website. RDKit: open-source cheminformatics. https:\/\/www.rdkit.org. Accessed 26 Nov 2020"},{"key":"512_CR14","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1186\/1758-2946-3-33","volume":"3","author":"NM O\u2019Boyle","year":"2011","unstructured":"O\u2019Boyle NM, Banck M, James CA, Morley C, Vandermeersch T, Hutchison GR (2011) Open Babel: an open chemical toolbox. J Cheminform 3:33","journal-title":"J Cheminform"},{"key":"512_CR15","doi-asserted-by":"publisher","first-page":"D1102","DOI":"10.1093\/nar\/gky1033","volume":"47","author":"S Kim","year":"2019","unstructured":"Kim S, Chen J, Cheng T et al (2019) PubChem 2019 update: improved access to chemical data. Nucleic Acids Res 47:D1102\u2013D1109","journal-title":"Nucleic Acids Res"},{"key":"512_CR16","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1186\/s13321-020-00469-w","volume":"12","author":"K Rajan","year":"2020","unstructured":"Rajan K, Zielesny A, Steinbeck C (2020) DECIMER: towards deep learning for chemical image recognition. J Cheminform 12:65. https:\/\/doi.org\/10.1186\/s13321-020-00469-w","journal-title":"J Cheminform"},{"key":"512_CR17","doi-asserted-by":"publisher","unstructured":"O\u2019Boyle N, Dalke A DeepSMILES: An Adaptation of SMILES for Use in Machine-Learning of Chemical Structures. Doi: https:\/\/doi.org\/10.26434\/chemrxiv.7097960","DOI":"10.26434\/chemrxiv.7097960"},{"key":"512_CR18","doi-asserted-by":"publisher","first-page":"045024","DOI":"10.1088\/2632-2153\/aba947","volume":"1","author":"M Krenn","year":"2020","unstructured":"Krenn M, H\u00e4se F, Nigam A, Friederich P, Aspuru-Guzik A (2020) Self-referencing embedded strings (SELFIES): A 100% robust molecular string representation. Mach Learn: Sci Technol 1:045024","journal-title":"Mach Learn: Sci Technol"},{"key":"512_CR19","doi-asserted-by":"crossref","unstructured":"Luong M-T, Pham H, Manning CD (2015) Effective Approaches to Attention-based Neural Machine Translation. arXiv:1508.04025[cs.CL]","DOI":"10.18653\/v1\/D15-1166"},{"key":"512_CR20","unstructured":"Bahdanau D, Cho K, Bengio Y (2014) Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473[cs.CL]"},{"key":"512_CR21","unstructured":"Abadi M, Agarwal A, Barham P, et al (2016) TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv:1603.04467[cs.DC]"},{"key":"512_CR22","unstructured":"Papineni K, Roukos S, Ward T, Zhu W-J (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics. pp 311\u2013318"},{"key":"512_CR23","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1021\/ci100384d","volume":"51","author":"DM Lowe","year":"2011","unstructured":"Lowe DM, Corbett PT, Murray-Rust P, Glen RC (2011) Chemical name to structure: OPSIN, an open source solution. J ChemInf Model 51:739\u2013753","journal-title":"J ChemInf Model"},{"key":"512_CR24","unstructured":"nltk.translate package \u2014 NLTK 3.5 documentation. https:\/\/www.nltk.org\/api\/nltk.translate.html. Accessed 18 Mar 2021"},{"key":"512_CR25","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K (2018) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805v2[cs.CL]"},{"key":"512_CR26","doi-asserted-by":"publisher","DOI":"10.26434\/chemrxiv.13274732.v2","author":"L Krasnov","year":"2021","unstructured":"Krasnov L, Khokhlov I, Fedorov M, Sosnin S (2021) Struct2IUPAC \u2013 transformer-based artificial neural network for the conversion between chemical notations. ChemRxiv. https:\/\/doi.org\/10.26434\/chemrxiv.13274732.v2","journal-title":"ChemRxiv"},{"key":"512_CR27","doi-asserted-by":"publisher","DOI":"10.26434\/chemrxiv.14170472.v1","author":"J Handsel","year":"2021","unstructured":"Handsel J, Matthews B, Knight N, Coles S (2021) Translating the molecules: adapting neural machine translation to predict IUPAC names from a chemical identifier. ChemRxiv. https:\/\/doi.org\/10.26434\/chemrxiv.14170472.v1","journal-title":"ChemRxiv"},{"key":"512_CR28","volume-title":"Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit","author":"S Bird","year":"2009","unstructured":"Bird S, Klein E, Loper E (2009) Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O\u2019Reilly Media Inc, Newton"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00512-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-021-00512-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-021-00512-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,2]],"date-time":"2023-11-02T16:29:32Z","timestamp":1698942572000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-021-00512-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,27]]},"references-count":28,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["512"],"URL":"https:\/\/doi.org\/10.1186\/s13321-021-00512-4","relation":{"has-preprint":[{"id-type":"doi","id":"10.26434\/chemrxiv.13469202","asserted-by":"object"},{"id-type":"doi","id":"10.26434\/chemrxiv.13469202.v2","asserted-by":"object"},{"id-type":"doi","id":"10.26434\/chemrxiv.13469202.v1","asserted-by":"object"}]},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,27]]},"assertion":[{"value":"21 December 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 April 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 April 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"AZ is co-founder of GNWI-Gesellschaft f\u00fcr naturwissenschaftliche Informatik mbH, Dortmund, Germany.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"34"}}