{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T10:51:22Z","timestamp":1772275882836,"version":"3.50.1"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T00:00:00Z","timestamp":1761782400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T00:00:00Z","timestamp":1761782400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006769","name":"Russian Science Foundation","doi-asserted-by":"publisher","award":["23-11-00358"],"award-info":[{"award-number":["23-11-00358"]}],"id":[{"id":"10.13039\/501100006769","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Cheminform"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>The recent integration of natural language processing into chemistry has advanced drug discovery. Molecule representations in language models (LMs) are crucial for enhancing chemical understanding. We explored the ability of models to match the same chemical structures despite their different representations. Recognizing the same substance across different representations is an important component of emulating an understanding of how chemistry works. We propose Augmented Molecular Retrieval (AMORE), a flexible zero-shot framework for assessing chemistry LMs of different types. The framework is based on SMILES augmentations that preserve the underlying chemical structure. The proposed method evaluates the similarity between the embedding representations of a molecule, its SMILES variations, and those of other molecules. 
Experiments indicate that the tested ChemLLMs are still not robust to different SMILES representations. We evaluated the models on various tasks, including molecular captioning on the ChEBI-20 benchmark and classification and regression tasks from the MoleculeNet benchmark. We show that changes in the results after SMILES string variations align with the proposed AMORE framework.<\/jats:p>","DOI":"10.1186\/s13321-025-01079-0","type":"journal-article","created":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T10:30:14Z","timestamp":1761820214000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Measuring Chemical\u00a0LLM robustness to molecular representations: a SMILES variation-based framework"],"prefix":"10.1186","volume":"17","author":[{"given":"Veronika","family":"Ganeeva","sequence":"first","affiliation":[]},{"given":"Kuzma","family":"Khrabrov","sequence":"additional","affiliation":[]},{"given":"Artur","family":"Kadurin","sequence":"additional","affiliation":[]},{"given":"Elena","family":"Tutubalina","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,10,30]]},"reference":[{"key":"1079_CR1","first-page":"1","volume":"30","author":"A Vaswani","year":"2017","unstructured":"Vaswani A, Shazeer N, Parmar N et al (2017) Attention is all you need. Adv Neural Inform Proc Syst 30:1","journal-title":"Adv Neural Inform Proc Syst"},{"key":"1079_CR2","unstructured":"Chilingaryan G, Tamoyan H, Tevosyan A, et\u00a0al (2022) Bartsmiles: generative masked language models for molecular representations. arXiv preprint arXiv:2211.16349"},{"key":"1079_CR3","unstructured":"Chithrananda S, Grand G, Ramsundar B (2020) Chemberta: large-scale self-supervised pretraining for molecular property prediction. 
arXiv preprint arXiv:2010.09885"},{"issue":"1","key":"1079_CR4","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ac3ffb","volume":"3","author":"R Irwin","year":"2022","unstructured":"Irwin R, Dimitriadis S, He J et al (2022) Chemformer: a pre-trained transformer for computational chemistry. Mach Learn Sci Technol 3(1):015022","journal-title":"Mach Learn Sci Technol"},{"issue":"6","key":"1079_CR5","doi-asserted-by":"publisher","first-page":"1376","DOI":"10.1021\/acs.jcim.1c01467","volume":"62","author":"J Lu","year":"2022","unstructured":"Lu J, Zhang Y (2022) Unified deep learning model for multitask reaction predictions with explanation. J Chem Inf Model 62(6):1376\u20131387","journal-title":"J Chem Inf Model"},{"key":"1079_CR6","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules. J Chem Inf Comput Sci 28:31\u201336 (https:\/\/api.semanticscholar.org\/CorpusID:5445756)","journal-title":"J Chem Inf Comput Sci"},{"issue":"11","key":"1079_CR7","doi-asserted-by":"publisher","first-page":"2324","DOI":"10.1021\/acs.jcim.5b00559","volume":"55","author":"T Sterling","year":"2015","unstructured":"Sterling T, Irwin JJ (2015) Zinc 15\u2013ligand discovery for everyone. J Chem Inf Model 55(11):2324\u20132337. https:\/\/doi.org\/10.1021\/acs.jcim.5b00559. (pMID: 26479676)","journal-title":"J Chem Inf Model"},{"key":"1079_CR8","doi-asserted-by":"publisher","unstructured":"Lowe D (2017) Chemical reactions from US patents (1976-Sep2016). https:\/\/doi.org\/10.6084\/m9.figshare.5104873.v1, https:\/\/figshare.com\/articles\/dataset\/Chemical_reactions_from_US_patents_1976-Sep2016_\/5104873","DOI":"10.6084\/m9.figshare.5104873.v1"},{"key":"1079_CR9","unstructured":"Lowe DM (2012) Extraction of chemical structures and reactions from the literature. 
PhD thesis, University of Cambridge"},{"issue":"2","key":"1079_CR10","doi-asserted-by":"publisher","first-page":"513","DOI":"10.1039\/C7SC02664A","volume":"9","author":"Z Wu","year":"2018","unstructured":"Wu Z, Ramsundar B, Feinberg EN et al (2018) MoleculeNet: a benchmark for molecular machine learning. Chem Sci 9(2):513\u2013530","journal-title":"Chem Sci"},{"key":"1079_CR11","doi-asserted-by":"publisher","unstructured":"Edwards C, Lai T, Ros K, et\u00a0al (2022) Translation between molecules and natural language. Abu Dhabi, United Arab Emirates, pp 375\u2013413, https:\/\/doi.org\/10.18653\/v1\/2022.emnlp-main.26","DOI":"10.18653\/v1\/2022.emnlp-main.26"},{"key":"1079_CR12","unstructured":"Christofidellis D, Giannone G, Born J, et\u00a0al (2023) Unifying molecular and textual representations via multi-task language modelling. In: Krause A, Brunskill E, Cho K, et\u00a0al (eds) International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, Proceedings of Machine Learning Research, vol 202. PMLR, pp 6140\u20136157, https:\/\/proceedings.mlr.press\/v202\/christofidellis23a.html"},{"key":"1079_CR13","doi-asserted-by":"publisher","first-page":"8380","DOI":"10.1039\/D4SC00966E","volume":"15","author":"M Livne","year":"2024","unstructured":"Livne M, Miftahutdinov Z, Tutubalina E et al (2024) nach0: multimodal natural and chemical languages foundation model. Chem Sci 15:8380\u20138389. https:\/\/doi.org\/10.1039\/D4SC00966E","journal-title":"Chem Sci"},{"issue":"140","key":"1079_CR14","first-page":"1","volume":"21","author":"C Raffel","year":"2020","unstructured":"Raffel C, Shazeer N, Roberts A et al (2020) Exploring the limits of transfer learning with a unified text-to-text transformer. 
J Mach Learn Res 21(140):1\u201367 (http:\/\/jmlr.org\/papers\/v21\/20-074.html)","journal-title":"J Mach Learn Res"},{"key":"1079_CR15","doi-asserted-by":"publisher","unstructured":"Devlin J, Chang MW, Lee K, et\u00a0al (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. In: Burstein J, Doran C, Solorio T (eds) Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, pp 4171\u20134186. https:\/\/doi.org\/10.18653\/v1\/N19-1423","DOI":"10.18653\/v1\/N19-1423"},{"key":"1079_CR16","first-page":"1877","volume-title":"Advances in Neural Information Processing Systems","author":"T Brown","year":"2020","unstructured":"Brown T, Mann B, Ryder N et al (2020) Language models are few-shot learners. In: Larochelle H, Ranzato M, Hadsell R et al (eds) Advances in Neural Information Processing Systems, vol 33. Curran Associates Inc, Red Hook, pp 1877\u20131901 (https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf)"},{"key":"1079_CR17","doi-asserted-by":"crossref","unstructured":"Ganeeva V, Sakhovskiy A, Khrabrov K, et\u00a0al (2024) Lost in translation: Chemical language models and the misunderstanding of molecule structures. In: Findings of the Association for Computational Linguistics: EMNLP 2024. Association for Computational Linguistics, Miami, Florida, USA, pp 12994\u201313013, https:\/\/aclanthology.org\/2024.findings-emnlp.760","DOI":"10.18653\/v1\/2024.findings-emnlp.760"},{"key":"1079_CR18","doi-asserted-by":"crossref","unstructured":"Papineni K, Roukos S, Ward T, et\u00a0al (2002) Bleu: a method for automatic evaluation of machine translation. 
In: Proceedings of the 40th annual meeting of the Association for Computational Linguistics, pp 311\u2013318","DOI":"10.3115\/1073083.1073135"},{"key":"1079_CR19","unstructured":"Lin CY (2004) ROUGE: a package for automatic evaluation of summaries. Barcelona, Spain, pp 74\u201381, W04-1013"},{"key":"1079_CR20","unstructured":"Banerjee S, Lavie A (2005) METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. Ann Arbor, Michigan, pp 65\u201372, W05-0909"},{"key":"1079_CR21","unstructured":"Radev DR, Qi H, Wu H, et\u00a0al (2002) Evaluating web-based question answering systems. In: LREC"},{"issue":"03","key":"1079_CR22","doi-asserted-by":"publisher","first-page":"535","DOI":"10.1109\/TBDATA.2019.2921572","volume":"7","author":"J Johnson","year":"2019","unstructured":"Johnson J, Douze M, Jegou H (2019) Billion-scale similarity search with gpus. IEEE Trans Big Data 7(03):535\u2013547","journal-title":"IEEE Trans Big Data"},{"key":"1079_CR23","unstructured":"Ganeeva V, Khrabrov K, Kadurin A, et\u00a0al (2024) Chemical language models have problems with chemistry: A case study on molecule captioning task. In: The Second Tiny Papers Track at ICLR 2024"},{"key":"1079_CR24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-020-00456-1","volume":"12","author":"AP Bento","year":"2020","unstructured":"Bento AP, Hersey A, F\u00e9lix E et al (2020) An open source chemical structure curation pipeline using RDKit. J Cheminform 12:1\u201316","journal-title":"J Cheminform"},{"key":"1079_CR25","unstructured":"Greg L, et\u00a0al (2022) RDKit: open-source cheminformatics. https:\/\/www.rdkit.org\/"},{"key":"1079_CR26","unstructured":"Marino D, Marino D, Peruzzo P, et\u00a0al (2001) Qsar carcinogenic study of methylated polycyclic aromatic hydrocarbons based on topological descriptors derived from distance matrices and correlation weights of local graph invariants. 
Sci Direct Working Paper (S1574-0331):04"},{"key":"1079_CR27","doi-asserted-by":"crossref","unstructured":"Edwards C, Zhai C, Ji H (2021) Text2Mol: Cross-modal molecule retrieval with natural language queries. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp 595\u2013607, https:\/\/aclanthology.org\/2021.emnlp-main.47\/","DOI":"10.18653\/v1\/2021.emnlp-main.47"},{"issue":"1","key":"1079_CR28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2014.22","volume":"1","author":"R Ramakrishnan","year":"2014","unstructured":"Ramakrishnan R, Dral PO, Rupp M et al (2014) Quantum chemistry structures and properties of 134 kilo molecules. Sci Data 1(1):1\u20137","journal-title":"Sci Data"},{"issue":"11","key":"1079_CR29","doi-asserted-by":"publisher","first-page":"2864","DOI":"10.1021\/ci300415d","volume":"52","author":"L Ruddigkeit","year":"2012","unstructured":"Ruddigkeit L, Van Deursen R, Blum LC et al (2012) Enumeration of 166 billion organic small molecules in the chemical universe database gdb-17. J Chem Inf Model 52(11):2864\u20132875","journal-title":"J Chem Inf Model"},{"issue":"D1","key":"1079_CR30","doi-asserted-by":"publisher","first-page":"D1202","DOI":"10.1093\/nar\/gkv951","volume":"44","author":"S Kim","year":"2016","unstructured":"Kim S, Thiessen PA, Bolton EE et al (2016) Pubchem substance and compound databases. Nucleic Acids Res 44(D1):D1202\u2013D1213","journal-title":"Nucleic Acids Res"},{"key":"1079_CR31","unstructured":"Schuh MG, Boldini D, Sieber SA (2024) Twinbooster: Synergising large language models with barlow twins and gradient boosting for enhanced molecular property prediction. arXiv preprint arXiv:2401.04478"},{"key":"1079_CR32","unstructured":"He P, Gao J, Chen W (2023) Debertav3: Improving deberta using electra-style pre-training with gradient-disentangled embedding sharing. 
In: The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, https:\/\/openreview.net\/pdf?id=sE7-XhLxHA"},{"issue":"D1","key":"1079_CR33","doi-asserted-by":"publisher","first-page":"1373","DOI":"10.1093\/NAR\/GKAC956","volume":"51","author":"S Kim","year":"2023","unstructured":"Kim S, Chen J, Cheng T et al (2023) Pubchem 2023 update. Nucleic Acids Res 51(D1):1373\u20131380. https:\/\/doi.org\/10.1093\/NAR\/GKAC956","journal-title":"Nucleic Acids Res"},{"issue":"Database\u2013Issue","key":"1079_CR34","doi-asserted-by":"publisher","first-page":"1100","DOI":"10.1093\/NAR\/GKR777","volume":"40","author":"A Gaulton","year":"2012","unstructured":"Gaulton A, Bellis LJ, Bento AP et al (2012) Chembl: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40(Database\u2013Issue):1100\u20131107. https:\/\/doi.org\/10.1093\/NAR\/GKR777","journal-title":"Nucleic Acids Res"},{"key":"1079_CR35","unstructured":"Liu Y, Ott M, Goyal N, et\u00a0al (2019) Roberta: a robustly optimized BERT pretraining approach. CoRR arxiv:1907.11692"},{"key":"1079_CR36","doi-asserted-by":"publisher","unstructured":"Lewis M, Liu Y, Goyal N, et\u00a0al (2020) BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. In: Jurafsky D, Chai J, Schluter N, et\u00a0al (eds) Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, pp 7871\u20137880, https:\/\/doi.org\/10.18653\/v1\/2020.acl-main.703,","DOI":"10.18653\/v1\/2020.acl-main.703"},{"issue":"12","key":"1079_CR37","doi-asserted-by":"publisher","first-page":"6065","DOI":"10.1021\/ACS.JCIM.0C00675","volume":"60","author":"JJ Irwin","year":"2020","unstructured":"Irwin JJ, Tang KG, Young J et al (2020) ZINC20\u2014a free ultralarge-scale chemical database for ligand discovery. J Chem Inf Model 60(12):6065\u20136073. 
https:\/\/doi.org\/10.1021\/ACS.JCIM.0C00675","journal-title":"J Chem Inf Model"},{"issue":"8","key":"1079_CR38","first-page":"9","volume":"1","author":"A Radford","year":"2019","unstructured":"Radford A, Wu J, Child R et al (2019) Language models are unsupervised multitask learners. OpenAI blog 1(8):9","journal-title":"OpenAI blog"},{"key":"1079_CR39","unstructured":"Phan LN, Anibal JT, Tran H, et\u00a0al (2021) Scifive: a text-to-text transformer model for biomedical literature. arXiv preprint arXiv:2106.03598"},{"key":"1079_CR40","doi-asserted-by":"publisher","first-page":"0004","DOI":"10.34133\/research.0004","volume":"2022","author":"XC Zhang","year":"2022","unstructured":"Zhang XC, Wu CK, Yi JC et al (2022) Pushing the boundaries of molecular property prediction for drug discovery with multitask learning bert enhanced by smiles enumeration. Research 2022:0004. https:\/\/doi.org\/10.34133\/research.0004","journal-title":"Research"},{"key":"1079_CR41","unstructured":"Karl (2024) Gpt2 zinc 87m. https:\/\/huggingface.co\/entropy\/gpt2_zinc_87m"},{"issue":"4","key":"1079_CR42","doi-asserted-by":"publisher","first-page":"824","DOI":"10.1109\/TPAMI.2018.2889473","volume":"42","author":"YA Malkov","year":"2018","unstructured":"Malkov YA, Yashunin DA (2018) Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Trans Pattern Anal Mach Intell 42(4):824\u2013836","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1079_CR43","doi-asserted-by":"publisher","first-page":"513","DOI":"10.1039\/C7SC02664A","volume":"9","author":"Z Wu","year":"2018","unstructured":"Wu Z, Ramsundar B, Feinberg E et al (2018) Moleculenet: a benchmark for molecular machine learning. Chem Sci 9:513\u2013530. 
https:\/\/doi.org\/10.1039\/C7SC02664A","journal-title":"Chem Sci"},{"key":"1079_CR44","doi-asserted-by":"publisher","unstructured":"Rofin M, Mikhailov V, Florinsky M, et\u00a0al (2023) Vote\u2019n\u2019rank: Revision of benchmarking with social choice theory. Dubrovnik, Croatia, pp 670\u2013686. https:\/\/doi.org\/10.18653\/v1\/2023.eacl-main.48","DOI":"10.18653\/v1\/2023.eacl-main.48"},{"key":"1079_CR45","unstructured":"Aizerman M, Aleskerov F (1995) Theory of choice. vol. 38. Studies in Mathematical and Managerial Economics North-Holland. pp 136"},{"issue":"9","key":"1079_CR46","doi-asserted-by":"publisher","first-page":"1066","DOI":"10.3390\/sym11091066","volume":"11","author":"M Kaya","year":"2019","unstructured":"Kaya M, Bilge H\u015e (2019) Deep metric learning A survey. Symmetry 11(9):1066","journal-title":"Symmetry"},{"issue":"10","key":"1079_CR47","doi-asserted-by":"publisher","DOI":"10.1016\/j.patter.2022.100588","volume":"3","author":"M Krenn","year":"2022","unstructured":"Krenn M, Ai Q, Barthel S et al (2022) Selfies and the future of molecular string representations. Patterns 3(10):100588. 
https:\/\/doi.org\/10.1016\/j.patter.2022.100588","journal-title":"Patterns"}],"container-title":["Journal of Cheminformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-01079-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13321-025-01079-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13321-025-01079-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,30]],"date-time":"2025-10-30T10:30:21Z","timestamp":1761820221000},"score":1,"resource":{"primary":{"URL":"https:\/\/jcheminf.biomedcentral.com\/articles\/10.1186\/s13321-025-01079-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,30]]},"references-count":47,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["1079"],"URL":"https:\/\/doi.org\/10.1186\/s13321-025-01079-0","relation":{},"ISSN":["1758-2946"],"issn-type":[{"value":"1758-2946","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,30]]},"assertion":[{"value":"30 November 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 August 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 October 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"First, we evaluated models that are publicly available on HuggingFace (HF). 
We note that there are other popular models, such as Chemformer, Molformer, and T5Chem, which we could not load as HF checkpoints. Second, the evaluated models focus primarily on sequence formats of molecules; other formats, such as 3D structures, are also important and should be considered in future work. Third, we emphasize that the evaluated models were developed for research purposes and may contain unintended biases, and any molecules generated by them should undergo thorough evaluation through standard clinical testing. Furthermore, SELFIES and other molecular naming systems are also widespread in the chemical field. In our research, we have focused on SMILES due to its popularity, but augmentations of other representation systems are yet to be explored.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Limitations"}},{"value":"The models and datasets used in this work are publicly available for research purposes. The incorporation of AI into applied chemistry brings forth a variety of risks and ethical dilemmas. First, the direct implementation of potentially hazardous AI-generated predictions without rigorous validation could result in human injuries, casualties, and damage to laboratory facilities. Second, the absence of proper oversight could lead to the misuse of chemical language models and AI in general, potentially facilitating the production of dangerous and illegal chemical compounds, with significant ethical and societal consequences. 
To address these concerns, it is essential to establish and enforce clear ethical guidelines for the development and deployment of AI in chemistry.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"164"}}