{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,16]],"date-time":"2026-06-16T17:49:04Z","timestamp":1781632144773,"version":"3.54.5"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"21","license":[{"start":{"date-parts":[[2023,8,12]],"date-time":"2023-08-12T00:00:00Z","timestamp":1691798400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,8,12]],"date-time":"2023-08-12T00:00:00Z","timestamp":1691798400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Appl Intell"],"published-print":{"date-parts":[[2023,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Structure-constrained molecular optimisation aims to improve the target pharmacological properties of input molecules through small perturbations of the molecular structures. Previous studies have exploited various optimisation techniques to satisfy the requirements of structure-constrained molecular optimisation tasks. However, several studies have encountered difficulties in producing property-improved and synthetically feasible molecules. To achieve both property improvement and synthetic feasibility of molecules, we proposed a molecular structure editing model called SELF-EdiT that uses self-referencing embedded strings (SELFIES) and Levenshtein transformer models. The SELF-EdiT generates new molecules that resemble the seed molecule by iteratively applying fragment-based deletion-and-insertion operations to SELFIES. The SELF-EdiT exploits a grammar-based SELFIES tokenization method and the Levenshtein transformer model to efficiently learn deletion-and-insertion operations for editing SELFIES. 
Our results demonstrated that SELF-EdiT outperformed existing structure-constrained molecular optimisation models by a considerable margin of success and total scores on the two benchmark datasets. Furthermore, we confirmed that the proposed model could improve the pharmacological properties without large perturbations of the molecular structures through edit-path analysis. Moreover, our fragment-based approach significantly relieved the SELFIES collapse problem compared to the existing SELFIES-based model. SELF-EdiT is the first attempt to apply editing operations to the SELFIES to design an effective editing-based optimisation, which can be helpful for fellow researchers planning to utilise the SELFIES.<\/jats:p>","DOI":"10.1007\/s10489-023-04915-8","type":"journal-article","created":{"date-parts":[[2023,8,12]],"date-time":"2023-08-12T13:02:00Z","timestamp":1691845320000},"page":"25868-25880","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["SELF-EdiT: Structure-constrained molecular optimisation using SELFIES editing transformer"],"prefix":"10.1007","volume":"53","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2358-9797","authenticated-orcid":false,"given":"Shengmin","family":"Piao","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8429-4135","authenticated-orcid":false,"given":"Jonghwan","family":"Choi","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4883-3987","authenticated-orcid":false,"given":"Sangmin","family":"Seo","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5196-6193","authenticated-orcid":false,"given":"Sanghyun","family":"Park","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2023,8,12]]},"reference":[{"issue":"12","key":"4915_CR1","first-page":"877","volume":"13","author":"A Mullard","year":"2014","unstructured":"Mullard A (2014) New drugs cost US \\$2.6 billion to develop. Nature Rev Drug Discov 13(12):877","journal-title":"Nature Rev Drug Discov"},{"issue":"3","key":"4915_CR2","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1038\/nrd3078","volume":"9","author":"SM Paul","year":"2010","unstructured":"Paul SM, Mytelka DS, Dunwiddie CT, Persinger CC, Munos BH, Lindborg SR, Schacht AL (2010) How to improve R&D productivity: the pharmaceutical industry\u2019s grand challenge. Nature Rev Drug Discov 9(3):203\u2013214","journal-title":"Nature Rev Drug Discov"},{"issue":"4","key":"4915_CR3","first-page":"404","volume":"7","author":"ML Verdonk","year":"2004","unstructured":"Verdonk ML, Hartshorn MJ (2004) Structure-guided fragment screening for lead discovery. 
Curr Opin Drug Discov Dev 7(4):404\u2013410","journal-title":"Curr Opin Drug Discov Dev"},{"issue":"5","key":"4915_CR4","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1038\/nrd.2018.53","volume":"17","author":"CJ Gerry","year":"2018","unstructured":"Gerry CJ, Schreiber SL (2018) Chemical probes and drug leads from advances in synthetic planning and methodology. Nature Rev Drug Discov 17(5):333\u2013352","journal-title":"Nature Rev Drug Discov"},{"issue":"8","key":"4915_CR5","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1007\/s10822-013-9672-4","volume":"27","author":"PG Polishchuk","year":"2013","unstructured":"Polishchuk PG, Madzhidov TI, Varnek A (2013) Estimation of the size of drug-like chemical space based on GDB-17 data. J Comput-Aided Mol Des 27(8):675\u2013679","journal-title":"J Comput-Aided Mol Des"},{"issue":"5","key":"4915_CR6","first-page":"1608","volume":"12","author":"C Bilodeau","year":"2022","unstructured":"Bilodeau C, Jin W, Jaakkola T, Barzilay R, Jensen KF (2022) Generative models for molecular discovery: Recent advances and challenges. Wiley Interdiscip Rev: Comput Mol Sci 12(5):1608","journal-title":"Wiley Interdiscip Rev: Comput Mol Sci"},{"issue":"4","key":"4915_CR7","doi-asserted-by":"publisher","first-page":"455","DOI":"10.3390\/e24040455","volume":"24","author":"S Yang","year":"2022","unstructured":"Yang S, Tan J, Chen B (2022) Robust spike-based continual meta-learning improved by restricted minimum error entropy criterion. Entropy 24(4):455","journal-title":"Entropy"},{"key":"4915_CR8","doi-asserted-by":"crossref","unstructured":"Yang S, Linares-Barranco B, Chen B (2022) Heterogeneous ensemble-based spike-driven few-shot online learning. 
Frontiers in Neuroscience 16","DOI":"10.3389\/fnins.2022.850932"},{"issue":"9","key":"4915_CR9","doi-asserted-by":"publisher","first-page":"4398","DOI":"10.1109\/TNNLS.2021.3057070","volume":"33","author":"S Yang","year":"2021","unstructured":"Yang S, Wang J, Zhang N, Deng B, Pang Y, Azghadi MR (2021) Cerebellumorphic: large-scale neuromorphic model and architecture for supervised motor learning. IEEE Trans Neural Netw Learn Syst 33(9):4398\u20134412","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"4915_CR10","doi-asserted-by":"crossref","unstructured":"Yang S, Tan J, Lei T, Linares-Barranco B (2023) Smart traffic navigation system for fault-tolerant edge computing of internet of vehicle in intelligent transportation gateway. IEEE Transactions on Intelligent Transportation Systems","DOI":"10.1109\/TITS.2022.3232231"},{"issue":"1","key":"4915_CR11","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","volume":"28","author":"D Weininger","year":"1988","unstructured":"Weininger D (1988) SMILES, a chemical language and information system. 1. introduction to methodology and encoding rules. J Chem Inf Comput Sci 28(1):31\u201336","journal-title":"J Chem Inf Comput Sci"},{"issue":"4","key":"4915_CR12","first-page":"045024","volume":"1","author":"M Krenn","year":"2020","unstructured":"Krenn M, H\u00e4se F, Nigam A, Friederich P, Aspuru-Guzik A (2020) Self-referencing embedded strings (SELFIES): A 100% robust molecular string representation. Mach Learn: Sci Technol 1(4):045024","journal-title":"Mach Learn: Sci Technol"},{"key":"4915_CR13","doi-asserted-by":"crossref","unstructured":"Deng J, Yang Z, Ojima I, Samaras D, Wang F (2022) Artificial intelligence in drug discovery: applications and techniques. 
Briefings in Bioinformatics 23(1)","DOI":"10.1093\/bib\/bbab430"},{"key":"4915_CR14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-1-8","volume":"1","author":"P Ertl","year":"2009","unstructured":"Ertl P, Schuffenhauer A (2009) Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J Cheminformatics 1:1\u201311","journal-title":"J Cheminformatics"},{"issue":"12","key":"4915_CR15","doi-asserted-by":"publisher","first-page":"2973","DOI":"10.1021\/acs.jcim.2c00038","volume":"62","author":"J Yu","year":"2022","unstructured":"Yu J, Wang J, Zhao H, Gao J, Kang Y, Cao D, Wang Z, Hou T (2022) Organic compound synthetic accessibility prediction based on the graph attention mechanism. J Chem Inf Model 62(12):2973-2986","journal-title":"J Chem Inf Model"},{"key":"4915_CR16","unstructured":"Jin W, Yang K, Barzilay R, Jaakkola T (2019) Learning multimodal graph-to-graph translation for molecular optimization. Paper presented at International Conference on Learning Representations 2019"},{"key":"4915_CR17","unstructured":"Jin W, Barzilay R, Jaakkola T (2018) Junction tree variational autoencoder for molecular graph generation. In: International conference on machine learning, pp 2323\u20132332. PMLR"},{"key":"4915_CR18","unstructured":"Jin W, Barzilay R, Jaakkola T (2020) Hierarchical generation of molecular graphs using structural motifs. In: International conference on machine learning, pp 4839\u20134848. PMLR"},{"key":"4915_CR19","unstructured":"Ji C, Zheng Y, Wang R, Cai Y, Wu H (2021) Graph polish: A novel graph generation paradigm for molecular optimization. 
IEEE Transactions on Neural Networks and Learning Systems"},{"issue":"20","key":"4915_CR20","doi-asserted-by":"publisher","first-page":"7079","DOI":"10.1039\/D1SC00231G","volume":"12","author":"A Nigam","year":"2021","unstructured":"Nigam A, Pollice R, Krenn M, dos Passos Gomes G, Aspuru-Guzik A (2021) Beyond generative models: superfast traversal, optimization, novelty, exploration and discovery (STONED) algorithm for molecules using SELFIES. Chem Sci 12(20):7079\u20137090","journal-title":"Chem Sci"},{"key":"4915_CR21","first-page":"21342","volume":"35","author":"W Gao","year":"2022","unstructured":"Gao W, Fu T, Sun J, Coley CW (2022) Sample efficiency matters: a benchmark for practical molecular optimization. Adv Neural Inf Process Syst 35:21342\u201321357","journal-title":"Adv Neural Inf Process Syst"},{"issue":"30","key":"4915_CR22","doi-asserted-by":"publisher","first-page":"5128","DOI":"10.2174\/092986712803530467","volume":"19","author":"A Kumar","year":"2012","unstructured":"Kumar A, Voet A, Zhang KY (2012) Fragment based drug design: from experimental to computational approaches. Curr Med Chem 19(30):5128\u20135147","journal-title":"Curr Med Chem"},{"key":"4915_CR23","doi-asserted-by":"crossref","unstructured":"Gao T, Yao X, Chen D (2021) Simcse: Simple contrastive learning of sentence embeddings. In: Proceedings of the 2021 Conference on empirical methods in natural language processing, pp 6894\u20136910","DOI":"10.18653\/v1\/2021.emnlp-main.552"},{"key":"4915_CR24","unstructured":"Gu J, Wang C, Zhao J (2019) Levenshtein transformer. Advances in Neural Information Processing Systems 32"},{"key":"4915_CR25","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser, \u0141., Polosukhin I (2017) Attention is all you need. Advances in Neural Information Processing Systems 30"},{"key":"4915_CR26","unstructured":"Levenshtein VI, et al (1966) Binary codes capable of correcting deletions, insertions, and reversals. 
In: Soviet Physics Doklady, vol 10, pp 707\u2013710. Soviet Union"},{"key":"4915_CR27","unstructured":"You J, Liu B, Ying Z, Pande V, Leskovec J (2018) Graph convolutional policy network for goal-directed molecular graph generation. Advances in Neural Information Processing Systems 31"},{"issue":"1","key":"4915_CR28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-018-37186-2","volume":"9","author":"Z Zhou","year":"2019","unstructured":"Zhou Z, Kearnes S, Li L, Zare RN, Riley P (2019) Optimization of molecules via deep reinforcement learning. Sci Rep 9(1):1\u201310","journal-title":"Sci Rep"},{"key":"4915_CR29","unstructured":"Bjorck J, Gomes CP, Weinberger KQ (2022) Is high variance unavoidable in rl? a case study in continuous control. Paper presented at International conference on learning representations 2022"},{"issue":"2","key":"4915_CR30","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","volume":"4","author":"R G\u00f3mez-Bombarelli","year":"2018","unstructured":"G\u00f3mez-Bombarelli R, Wei JN, Duvenaud D, Hern\u00e1ndez-Lobato JM, S\u00e1nchez-Lengeling B, Sheberla D, Aguilera-Iparraguirre J, Hirzel TD, Adams RP, Aspuru-Guzik A (2018) Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Science 4(2):268\u2013276","journal-title":"ACS Central Science"},{"issue":"2","key":"4915_CR31","doi-asserted-by":"publisher","first-page":"577","DOI":"10.1039\/C9SC04026A","volume":"11","author":"R-R Griffiths","year":"2020","unstructured":"Griffiths R-R, Hern\u00e1ndez-Lobato JM (2020) Constrained Bayesian optimization for automatic chemical design using variational autoencoders. 
Chem Sci 11(2):577\u2013586","journal-title":"Chem Sci"},{"key":"4915_CR32","doi-asserted-by":"publisher","first-page":"1925","DOI":"10.1007\/s10994-020-05899-z","volume":"109","author":"R Moriconi","year":"2020","unstructured":"Moriconi R, Deisenroth MP, Sesh Kumar K (2020) High-dimensional Bayesian optimization using low-dimensional feature spaces. Mach Learn 109:1925\u20131943","journal-title":"Mach Learn"},{"issue":"4","key":"4915_CR33","doi-asserted-by":"publisher","first-page":"390","DOI":"10.1039\/D2DD00003B","volume":"1","author":"A Nigam","year":"2022","unstructured":"Nigam A, Pollice R, Aspuru-Guzik A (2022) Parallel tempered genetic algorithm guided by deep neural networks for inverse molecular design. Digital Discov 1(4):390\u2013404","journal-title":"Digital Discov"},{"issue":"1","key":"4915_CR34","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1016\/j.commatsci.2008.04.033","volume":"45","author":"W Paszkowicz","year":"2009","unstructured":"Paszkowicz W (2009) Properties of a genetic algorithm equipped with a dynamic penalty function. Comput Mater Sci 45(1):77\u201383","journal-title":"Comput Mater Sci"},{"issue":"5292","key":"4915_CR35","doi-asserted-by":"publisher","first-page":"1531","DOI":"10.1126\/science.274.5292.1531","volume":"274","author":"SB Shuker","year":"1996","unstructured":"Shuker SB, Hajduk PJ, Meadows RP, Fesik SW (1996) Discovering high-affinity ligands for proteins: SAR by NMR. Sci 274(5292):1531\u20131534","journal-title":"Sci"},{"issue":"3","key":"4915_CR36","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1038\/nchem.217","volume":"1","author":"CW Murray","year":"2009","unstructured":"Murray CW, Rees DC (2009) The rise of fragment-based drug discovery. Nature Chem 1(3):187\u2013192","journal-title":"Nature Chem"},{"key":"4915_CR37","doi-asserted-by":"crossref","unstructured":"Hadsell R, Chopra S, LeCun Y (2006) Dimensionality reduction by learning an invariant mapping. 
In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201906), vol 2, pp 1735\u20131742. IEEE","DOI":"10.1109\/CVPR.2006.100"},{"issue":"1","key":"4915_CR38","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-017-0235-x","volume":"9","author":"M Olivecrona","year":"2017","unstructured":"Olivecrona M, Blaschke T, Engkvist O, Chen H (2017) Molecular de-novo design through deep reinforcement learning. J Cheminformatics 9(1):1\u201314","journal-title":"J Cheminformatics"},{"issue":"2","key":"4915_CR39","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1038\/nchem.1243","volume":"4","author":"GR Bickerton","year":"2012","unstructured":"Bickerton GR, Paolini GV, Besnard J, Muresan S, Hopkins AL (2012) Quantifying the chemical beauty of drugs. Nature Chem 4(2):90\u201398","journal-title":"Nature Chem"},{"key":"4915_CR40","unstructured":"Landrum G, et al (2013) RDKit: cheminformatics and machine learning software. RDKIT, ORG, p 405"},{"issue":"1","key":"4915_CR41","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-015-0069-3","volume":"7","author":"D Bajusz","year":"2015","unstructured":"Bajusz D, R\u00e1cz A, H\u00e9berger K (2015) Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J Cheminformatics 7(1):1\u201313","journal-title":"J Cheminformatics"},{"issue":"5","key":"4915_CR42","doi-asserted-by":"publisher","first-page":"902","DOI":"10.1021\/acs.jcim.8b00173","volume":"58","author":"A Dalke","year":"2018","unstructured":"Dalke A, Hert J, Kramer C (2018) mmpdb: An open-source matched molecular pair platform for large multiproperty data sets. J Chem Inf Model 58(5):902\u2013910","journal-title":"J Chem Inf Model"},{"key":"4915_CR43","doi-asserted-by":"crossref","unstructured":"Barshatski G, Radinsky K (2021) Unpaired generative molecule-to-molecule translation for lead optimization. 
In: Proceedings of the 27th ACM SIGKDD Conference on knowledge discovery & data mining, pp 2554\u20132564","DOI":"10.1145\/3447548.3467120"}],"container-title":["Applied Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-023-04915-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10489-023-04915-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-023-04915-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,26]],"date-time":"2024-10-26T02:58:34Z","timestamp":1729911514000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10489-023-04915-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,12]]},"references-count":43,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2023,11]]}},"alternative-id":["4915"],"URL":"https:\/\/doi.org\/10.1007\/s10489-023-04915-8","relation":{},"ISSN":["0924-669X","1573-7497"],"issn-type":[{"value":"0924-669X","type":"print"},{"value":"1573-7497","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,12]]},"assertion":[{"value":"24 July 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 August 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this 
paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflicts of interest"}}]}}