{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T07:02:28Z","timestamp":1778828548685,"version":"3.51.4"},"reference-count":36,"publisher":"IOP Publishing","issue":"1","license":[{"start":{"date-parts":[[2022,1,21]],"date-time":"2022-01-21T00:00:00Z","timestamp":1642723200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,1,21]],"date-time":"2022-01-21T00:00:00Z","timestamp":1642723200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2022,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>We expand the recent work on clustering of synthetic routes and train a deep learning model to predict the distances between arbitrary routes. The model is based on a long short-term memory representation of a synthetic route and is trained as a twin network to reproduce the tree edit distance (TED) between two routes. The machine learning approach is approximately two orders of magnitude faster than the TED approach and enables clustering many more routes from a retrosynthesis route prediction. The clusters have a high degree of similarity to the clusters given by the TED-based approach and are accordingly intuitive and explainable. We provide the developed model as open-source.<\/jats:p>","DOI":"10.1088\/2632-2153\/ac4a91","type":"journal-article","created":{"date-parts":[[2022,1,12]],"date-time":"2022-01-12T17:16:11Z","timestamp":1642007771000},"page":"015018","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":12,"title":["Fast prediction of distances between synthetic routes with deep learning"],"prefix":"10.1088","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7624-7363","authenticated-orcid":false,"given":"Samuel","family":"Genheden","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ola","family":"Engkvist","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1614-7376","authenticated-orcid":false,"given":"Esben","family":"Bjerrum","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"266","published-online":{"date-parts":[[2022,1,21]]},"reference":[{"key":"mlstac4a91bib1","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1016\/j.ddtec.2020.06.002","article-title":"AI-assisted synthesis prediction","volume":"32\u201333","author":"Johansson","year":"2020","journal-title":"Drug Discov. Today Technol."},{"key":"mlstac4a91bib2","doi-asserted-by":"publisher","first-page":"178","DOI":"10.1126\/science.166.3902.178","article-title":"Computer-assisted design of complex organic syntheses","volume":"166","author":"Corey","year":"1969","journal-title":"Science"},{"key":"mlstac4a91bib3","doi-asserted-by":"publisher","first-page":"1281","DOI":"10.1021\/acs.accounts.8b00087","article-title":"Machine learning in computer-aided synthesis planning","volume":"51","author":"Coley","year":"2018","journal-title":"Acc. Chem. Res."},{"key":"mlstac4a91bib4","first-page":"1564","article-title":"Construction of new medicines via game proof search","author":"Heifets","year":"2012"},{"key":"mlstac4a91bib5","doi-asserted-by":"publisher","first-page":"604","DOI":"10.1038\/nature25978","article-title":"Planning chemical syntheses with deep neural networks and symbolic AI","volume":"555","author":"Segler","year":"2018","journal-title":"Nature"},{"key":"mlstac4a91bib6","doi-asserted-by":"publisher","first-page":"522","DOI":"10.1016\/j.chempr.2018.02.002","article-title":"Efficient syntheses of diverse, medicinally relevant targets planned by computer and executed in the laboratory","volume":"4","author":"Klucznik","year":"2018","journal-title":"Chemistry"},{"key":"mlstac4a91bib7","doi-asserted-by":"publisher","first-page":"eaax1566","DOI":"10.1126\/science.aax1566","article-title":"A robotic platform for flow synthesis of organic compounds informed by AI planning","volume":"365","author":"Coley","year":"2019","journal-title":"Science"},{"key":"mlstac4a91bib8","doi-asserted-by":"publisher","first-page":"3355","DOI":"10.1039\/C9SC03666K","article-title":"Automatic retrosynthetic route planning using template-free models","volume":"11","author":"Lin","year":"2020","journal-title":"Chem. Sci."},{"key":"mlstac4a91bib9","doi-asserted-by":"publisher","first-page":"3316","DOI":"10.1039\/C9SC05704H","article-title":"Predicting retrosynthetic pathways using transformer-based models and a hyper-graph exploration strategy","volume":"11","author":"Schwaller","year":"2020","journal-title":"Chem. Sci."},{"key":"mlstac4a91bib10","article-title":"Retro*: learning retrosynthetic planning with neural guided A* search","author":"Chen","year":"2020"},{"key":"mlstac4a91bib11","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1186\/s13321-020-00452-5","article-title":"CompRet: a comprehensive recommendation framework for chemical synthesis planning with algorithmic enumeration","volume":"12","author":"Shibukawa","year":"2020","journal-title":"J. Cheminform."},{"key":"mlstac4a91bib12","doi-asserted-by":"publisher","first-page":"1469","DOI":"10.1039\/D0SC05078D","article-title":"Evaluating and clustering retrosynthesis pathways with learned strategy","volume":"12","author":"Mo","year":"2021","journal-title":"Chem. Sci."},{"key":"mlstac4a91bib13","doi-asserted-by":"publisher","DOI":"10.26434\/chemrxiv.13372475.v1","article-title":"Clustering of synthetic routes using tree edit distance","author":"Genheden","year":"2020"},{"key":"mlstac4a91bib14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2699485","article-title":"Efficient computation of the tree edit distance","volume":"40","author":"Pawlik","year":"2015","journal-title":"ACM Trans. Database Syst."},{"key":"mlstac4a91bib15","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1016\/j.is.2015.08.004","article-title":"Tree edit distance: robust and memory-efficient","volume":"56","author":"Pawlik","year":"2016","journal-title":"Inf. Syst."},{"key":"mlstac4a91bib16","doi-asserted-by":"publisher","first-page":"D1100","DOI":"10.1093\/nar\/gkr777","article-title":"ChEMBL: a large-scale bioactivity database for drug discovery","volume":"40","author":"Gaulton","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"mlstac4a91bib17","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1038\/nchem.1243","article-title":"Quantifying the chemical beauty of drugs","volume":"4","author":"Bickerton","year":"2012","journal-title":"Nat. Chem."},{"key":"mlstac4a91bib18","article-title":"RDKit: open-source cheminformatics","author":"Landrum"},{"key":"mlstac4a91bib19","doi-asserted-by":"publisher","DOI":"10.1002\/minf.201900031","article-title":"Medicinal chemistry aware database GDBMedChem","volume":"38","author":"Awale","year":"2019","journal-title":"Mol. Inform."},{"key":"mlstac4a91bib20","doi-asserted-by":"publisher","first-page":"46","DOI":"10.3389\/fchem.2020.00046","article-title":"ChEMBL-likeness score and database GDBChEMBL","volume":"8","author":"B\u00fchlmann","year":"2020","journal-title":"Front. Chem."},{"key":"mlstac4a91bib21","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1186\/s13321-020-00472-1","article-title":"AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning","volume":"12","author":"Genheden","year":"2020","journal-title":"J. Cheminform."},{"key":"mlstac4a91bib22","doi-asserted-by":"publisher","first-page":"154","DOI":"10.1039\/C9SC04944D","article-title":"Datasets and their influence on the development of computer assisted synthesis planning tools in the pharmaceutical domain","volume":"11","author":"Thakkar","year":"2020","journal-title":"Chem. Sci."},{"key":"mlstac4a91bib23","doi-asserted-by":"crossref","DOI":"10.26434\/chemrxiv.13280495.v1","article-title":"A quick policy to filter reactions based on feasibility in AI-guided retrosynthetic planning","author":"Genheden","year":"2020"},{"key":"mlstac4a91bib24","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"mlstac4a91bib25","doi-asserted-by":"publisher","first-page":"1556","DOI":"10.3115\/v1\/P15-1150","article-title":"Improved semantic representations from tree-structured long short-term memory networks","author":"Tai","year":"2015"},{"key":"mlstac4a91bib26","author":"Dawe"},{"key":"mlstac4a91bib27","doi-asserted-by":"publisher","first-page":"742","DOI":"10.1021\/ci100050t","article-title":"Extended-connectivity fingerprints","volume":"50","author":"Rogers","year":"2010","journal-title":"J. Chem. Inf. Model."},{"key":"mlstac4a91bib28","doi-asserted-by":"publisher","first-page":"73","DOI":"10.1007\/978-1-0716-0826-5_3","author":"Chicco","year":"2021"},{"key":"mlstac4a91bib29","article-title":"Adam: a method for stochastic optimization","author":"Kingma","year":"2015"},{"key":"mlstac4a91bib30","article-title":"Decoupled weight decay regularization","author":"Loshchilov","year":"2017"},{"key":"mlstac4a91bib31","doi-asserted-by":"publisher","first-page":"2623","DOI":"10.1145\/3292500.3330701","article-title":"Optuna: a next-generation hyperparameter optimization framework","author":"Akiba","year":"2019"},{"key":"mlstac4a91bib32","article-title":"PyTorch: an imperative style, high-performance deep learning library","author":"Paszke","year":"2019"},{"key":"mlstac4a91bib33","article-title":"PyTorchLightning\/pytorch-lightning: 0.7.6 release","author":"Falcon","year":"2020"},{"key":"mlstac4a91bib34","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Machine Learning Res."},{"key":"mlstac4a91bib35","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: a graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. Math."},{"key":"mlstac4a91bib36","article-title":"Elsevier limited except certain content provided by third parties, Reaxys is a trademark of Elsevier","author":"","year":"2019"}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,21]],"date-time":"2022-01-21T06:50:38Z","timestamp":1642747838000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac4a91"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,21]]},"references-count":36,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,1,21]]},"published-print":{"date-parts":[[2022,3,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/ac4a91","relation":{"has-preprint":[{"id-type":"doi","id":"10.26434\/chemrxiv.14778150.v1","asserted-by":"object"}]},"ISSN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,21]]},"assertion":[{"value":"Fast prediction of distances between synthetic routes with deep learning","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2022 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2021-07-27","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2022-01-12","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2022-01-21","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}