{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T00:38:51Z","timestamp":1773707931387,"version":"3.50.1"},"reference-count":34,"publisher":"IOP Publishing","issue":"3","license":[{"start":{"date-parts":[[2022,7,25]],"date-time":"2022-07-25T00:00:00Z","timestamp":1658707200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,7,25]],"date-time":"2022-07-25T00:00:00Z","timestamp":1658707200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"name":"Austrian Science Fund","award":["Erwin Schr\u00f6dinger fellowship No. J4309"],"award-info":[{"award-number":["Erwin Schr\u00f6dinger fellowship No. J4309"]}]},{"DOI":"10.13039\/501100004489","name":"Mitacs","doi-asserted-by":"crossref","award":["FR44234"],"award-info":[{"award-number":["FR44234"]}],"id":[{"id":"10.13039\/501100004489","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Anders G. Fr\u00f8seth"},{"DOI":"10.13039\/501100002790","name":"Canadian Network for Research and Innovation in Machining Technology, Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"crossref","award":["Banting Postdoctoral Fellowship"],"award-info":[{"award-number":["Banting Postdoctoral Fellowship"]}],"id":[{"id":"10.13039\/501100002790","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100013020","name":"Compute Canada","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100013020","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100000159","name":"Natural Resources Canada","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100000159","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. 
Technol."],"published-print":{"date-parts":[[2022,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Computer aided design of molecules has the potential to disrupt the field of drug and material discovery. Machine learning, and deep learning in particular, has made big strides in recent years and promises to greatly benefit computer aided methods. Reinforcement learning is a particularly promising approach since it enables de novo molecule design, that is, molecular design without providing any prior knowledge. However, the search space is vast, and therefore any reinforcement learning agent needs to perform efficient exploration. In this study, we examine three versions of intrinsic motivation to aid efficient exploration. The algorithms are adapted from intrinsic motivation methods in the literature that were developed in other settings, predominantly video games. We show that the <jats:italic>curious<\/jats:italic> agents find better performing molecules on two of three benchmarks. This indicates an exciting new research direction for reinforcement learning agents that can explore the chemical space out of their own motivation. 
This has the potential to eventually lead to unexpected new molecular designs no human has thought about so far.<\/jats:p>","DOI":"10.1088\/2632-2153\/ac7ddc","type":"journal-article","created":{"date-parts":[[2022,7,2]],"date-time":"2022-07-02T22:18:37Z","timestamp":1656800317000},"page":"035008","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["Curiosity in exploring chemical spaces: intrinsic rewards for molecular reinforcement learning"],"prefix":"10.1088","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1202-6809","authenticated-orcid":true,"given":"Luca A","family":"Thiede","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1620-9207","authenticated-orcid":true,"given":"Mario","family":"Krenn","sequence":"additional","affiliation":[]},{"given":"AkshatKumar","family":"Nigam","sequence":"additional","affiliation":[]},{"given":"Al\u00e1n","family":"Aspuru-Guzik","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2022,7,25]]},"reference":[{"key":"mlstac7ddcbib1","doi-asserted-by":"publisher","first-page":"360","DOI":"10.1126\/science.aat2663","article-title":"Inverse molecular design using machine learning: generative models for matter engineering","volume":"361","author":"Sanchez-Lengeling","year":"2018","journal-title":"Science"},{"key":"mlstac7ddcbib2","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1038\/s41570-018-0066-y","article-title":"How to explore chemical space using algorithms and automation","volume":"3","author":"Gromski","year":"2019","journal-title":"Nat. Rev. Chem."},{"key":"mlstac7ddcbib3","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1007\/s10822-013-9672-4","article-title":"Estimation of the size of drug-like chemical space based on GDB-17 data","volume":"27","author":"Polishchuk","year":"2013","journal-title":"J. Comput.-Aided Mol. 
Des."},{"key":"mlstac7ddcbib4","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","article-title":"Automatic chemical design using a data-driven continuous representation of molecules","volume":"4","author":"G\u00f3mez-Bombarelli","year":"2018","journal-title":"ACS Cent. Sci."},{"key":"mlstac7ddcbib5","article-title":"Junction tree variational autoencoder for molecular graph generation","author":"Jin","year":"2018"},{"key":"mlstac7ddcbib6","article-title":"Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models","author":"Guimaraes","year":"2017"},{"key":"mlstac7ddcbib7","article-title":"JANUS: parallel tempered genetic algorithm guided by deep neural networks for inverse molecular design","author":"Nigam","year":"2021"},{"key":"mlstac7ddcbib8","article-title":"Augmenting genetic algorithms with deep neural networks for exploring the chemical space","author":"Nigam","year":"2019"},{"key":"mlstac7ddcbib9","doi-asserted-by":"publisher","first-page":"3567","DOI":"10.1039\/C8SC05372C","article-title":"A graph-based genetic algorithm and generative model\/Monte Carlo tree search for the exploration of chemical space","volume":"10","author":"Jensen","year":"2019","journal-title":"Chem. Sci."},{"key":"mlstac7ddcbib10","doi-asserted-by":"publisher","first-page":"e11","DOI":"10.7717\/peerj-pchem.11","article-title":"Chemical space exploration: how genetic algorithms find the needle in the haystack","volume":"2","author":"Henault","year":"2020","journal-title":"PeerJ Phys. 
Chem."},{"key":"mlstac7ddcbib11","article-title":"Exploring the chemical space without bias: data-free molecule generation with DQN and SELFIES","author":"Gaudin","year":"2019"},{"key":"mlstac7ddcbib12","article-title":"Molecular generation with recurrent neural networks (RNNs)","author":"Bjerrum","year":"2017"},{"key":"mlstac7ddcbib13","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1021\/acscentsci.7b00512","article-title":"Generating focused molecule libraries for drug discovery with recurrent neural networks","volume":"4","author":"Segler","year":"2018","journal-title":"ACS Cent. Sci."},{"key":"mlstac7ddcbib14","article-title":"In silico generation of novel, drug-like chemical matter using the LSTM neural network","author":"Ertl","year":"2017"},{"key":"mlstac7ddcbib15","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1186\/s13321-017-0235-x","article-title":"Molecular de-novo design through deep reinforcement learning","volume":"9","author":"Olivecrona","year":"2017","journal-title":"J. Cheminform."},{"key":"mlstac7ddcbib16","first-page":"pp 488","article-title":"Curiosity-driven exploration by self-supervised prediction","author":"Pathak","year":"2017"},{"key":"mlstac7ddcbib17","article-title":"A survey on intrinsic motivation in reinforcement learning","author":"Aubret","year":"2019"},{"key":"mlstac7ddcbib18","doi-asserted-by":"publisher","first-page":"230","DOI":"10.1109\/TAMD.2010.2056368","article-title":"Formal theory of creativity, fun and intrinsic motivation (1990\u20132010)","volume":"2","author":"Schmidhuber","year":"2010","journal-title":"IEEE Trans. Auton. Ment. 
Dev."},{"key":"mlstac7ddcbib19","article-title":"Large-scale study of curiosity-driven learning","author":"Burda","year":"2018"},{"key":"mlstac7ddcbib20","article-title":"Proximal policy optimization algorithms","author":"Schulman","year":"2017"},{"key":"mlstac7ddcbib21","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/aba947","article-title":"Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation","volume":"1","author":"Krenn","year":"2020","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstac7ddcbib22","article-title":"Exploration strategies in deep reinforcement learning","author":"Weng","year":"2020"},{"key":"mlstac7ddcbib23","article-title":"Unifying count-based exploration and intrinsic motivation","author":"Bellemare","year":"2016"},{"key":"mlstac7ddcbib24","doi-asserted-by":"publisher","first-page":"1309","DOI":"10.1016\/j.jcss.2007.08.009","article-title":"An analysis of model-based interval estimation for Markov decision processes","volume":"74","author":"Strehl","year":"2008","journal-title":"J. Comput. Syst. Sci."},{"key":"mlstac7ddcbib25","article-title":"#Exploration: a study of count-based exploration for deep reinforcement learning","author":"Tang","year":"2017"},{"key":"mlstac7ddcbib26","article-title":"Incentivizing exploration in reinforcement learning with deep predictive models","author":"Stadie","year":"2015"},{"key":"mlstac7ddcbib27","article-title":"Never give up: learning directed exploration strategies","author":"Badia","year":"2020"},{"key":"mlstac7ddcbib28","doi-asserted-by":"publisher","first-page":"1009","DOI":"10.1080\/17460441.2021.1925247","article-title":"Assigning confidence to molecular property prediction","volume":"16","author":"Nigam","year":"2021","journal-title":"Expert Opin. 
Drug Discovery"},{"key":"mlstac7ddcbib29","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1186\/s13321-015-0069-3","article-title":"Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?","volume":"7","author":"Bajusz","year":"2015","journal-title":"J. Cheminform."},{"key":"mlstac7ddcbib30","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1038\/nchem.1243","article-title":"Quantifying the chemical beauty of drugs","volume":"4","author":"Richard Bickerton","year":"2012","journal-title":"Nat. Chem."},{"key":"mlstac7ddcbib31","doi-asserted-by":"publisher","first-page":"1096","DOI":"10.1021\/acs.jcim.8b00839","article-title":"GuacaMol: benchmarking models for de novo molecular design","volume":"59","author":"Brown","year":"2019","journal-title":"J. Chem. Inf. Model."},{"key":"mlstac7ddcbib32","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"mlstac7ddcbib33","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1186\/1758-2946-1-8","article-title":"Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions","volume":"1","author":"Ertl","year":"2009","journal-title":"J. 
Cheminform."},{"key":"mlstac7ddcbib34","article-title":"Amortized tree generation for bottom-up synthesis planning and synthesizable molecular design","author":"Gao","year":"2021"}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,7,25]],"date-time":"2022-07-25T15:39:47Z","timestamp":1658763587000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac7ddc"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,25]]},"references-count":34,"journal-issue":{"issue":"
3","published-online":{"date-parts":[[2022,7,25]]},"published-print":{"date-parts":[[2022,9,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/ac7ddc","relation":{},"ISSN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,25]]},"assertion":[{"value":"Curiosity in exploring chemical spaces: intrinsic rewards for molecular reinforcement learning","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2022 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2022-03-08","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2022-07-01","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2022-07-25","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}