{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T01:44:27Z","timestamp":1772675067397,"version":"3.50.1"},"reference-count":102,"publisher":"IOP Publishing","issue":"4","license":[{"start":{"date-parts":[[2023,12,6]],"date-time":"2023-12-06T00:00:00Z","timestamp":1701820800000},"content-version":"vor","delay-in-days":5,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,12,6]],"date-time":"2023-12-06T00:00:00Z","timestamp":1701820800000},"content-version":"tdm","delay-in-days":5,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/501100010785","name":"Canada First Research Excellence Fund","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100010785","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Science Foundation","award":["DMS-1925919"],"award-info":[{"award-number":["DMS-1925919"]}]},{"DOI":"10.13039\/100010663","name":"H2020 European Research Council","doi-asserted-by":"crossref","award":["772834"],"award-info":[{"award-number":["772834"]}],"id":[{"id":"10.13039\/100010663","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2023,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Despite the fundamental progress in autonomous molecular and materials discovery, data scarcity throughout chemical compound space still severely hampers the use of modern ready-made machine learning models as they rely heavily on the paradigm, \u2018the bigger the data the better\u2019. Presenting similarity based machine learning (SML), we show an approach to select data and train a model on-the-fly for specific queries, enabling decision making in data scarce scenarios in chemistry. By solely relying on query and training data proximity to choose training points, only a fraction of data is necessary to converge to competitive performance. After introducing SML for the harmonic oscillator and the Rosenbrock function, we describe applications to scarce data scenarios in chemistry which include quantum mechanics based molecular design and organic synthesis planning. Finally, we derive a relationship between the intrinsic dimensionality and volume of feature space, governing the overall model accuracy.<\/jats:p>","DOI":"10.1088\/2632-2153\/ad0fa3","type":"journal-article","created":{"date-parts":[[2023,11,24]],"date-time":"2023-11-24T22:22:52Z","timestamp":1700864572000},"page":"045043","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Improved decision making with similarity based machine learning: applications in chemistry"],"prefix":"10.1088","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8075-1765","authenticated-orcid":true,"given":"Dominik","family":"Lemm","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7987-4330","authenticated-orcid":false,"given":"Guido","family":"Falk von Rudorff","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7419-0466","authenticated-orcid":false,"given":"O","family":"Anatole von Lilienfeld","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2023,12,6]]},"reference":[{"key":"mlstad0fa3bib1","doi-asserted-by":"publisher","first-page":"554","DOI":"10.1136\/bmj.1.3923.554-a","article-title":"Design of experiments","volume":"1","author":"Fisher","year":"1936","journal-title":"Br. Med. J."},{"key":"mlstad0fa3bib2","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1214\/ss\/1177009939","article-title":"Bayesian experimental design: a review","volume":"10","author":"Chaloner","year":"1995","journal-title":"Stat. Sci."},{"key":"mlstad0fa3bib3","author":"Pukelsheim","year":"2006"},{"key":"mlstad0fa3bib4","doi-asserted-by":"publisher","first-page":"380","DOI":"10.1037\/h0053870","article-title":"The theory of decision making","volume":"51","author":"Edwards","year":"1954","journal-title":"Psychol. Bull."},{"key":"mlstad0fa3bib5","author":"Pratt","year":"1995"},{"key":"mlstad0fa3bib6","author":"Berger","year":"2013"},{"key":"mlstad0fa3bib7","article-title":"The statistical complexity of interactive decision making","author":"Foster","year":"2021"},{"key":"mlstad0fa3bib8","doi-asserted-by":"publisher","first-page":"291","DOI":"10.1016\/j.tics.2008.04.010","article-title":"Decision making, movement planning and statistical decision theory","volume":"12","author":"Trommersh\u00e4user","year":"2008","journal-title":"Trends Cogn. Sci."},{"key":"mlstad0fa3bib9","author":"Hey","year":"2009"},{"key":"mlstad0fa3bib10","doi-asserted-by":"publisher","first-page":"4164","DOI":"10.1002\/anie.201709686","article-title":"Quantum machine learning in chemical compound space","volume":"57","author":"von Lilienfeld","year":"2018","journal-title":"Angew. Chem., Int. Ed."},{"key":"mlstad0fa3bib11","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ab6d5d","article-title":"Introducing machine learning: science and technology","volume":"1","author":"von Lilienfeld","year":"2020","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstad0fa3bib12","doi-asserted-by":"publisher","first-page":"247","DOI":"10.1038\/nature02236","article-title":"Functional genomic hypothesis generation and experimentation by a robot scientist","volume":"427","author":"King","year":"2004","journal-title":"Nature"},{"key":"mlstad0fa3bib13","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1126\/science.1165620","article-title":"The automation of science","volume":"324","author":"King","year":"2009","journal-title":"Science"},{"key":"mlstad0fa3bib14","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1038\/s41586-020-2442-2","article-title":"A mobile robotic chemist","volume":"583","author":"Burger","year":"2020","journal-title":"Nature"},{"key":"mlstad0fa3bib15","doi-asserted-by":"publisher","first-page":"282","DOI":"10.1016\/j.trechm.2019.02.007","article-title":"Next-generation experimentation with self-driving laboratories","volume":"1","author":"H\u00e4se","year":"2019","journal-title":"Trends Chem."},{"key":"mlstad0fa3bib16","doi-asserted-by":"publisher","first-page":"377","DOI":"10.1038\/s41586-018-0307-8","article-title":"Controlling an organic synthesis robot with machine learning to search for new reactivity","volume":"559","author":"Granda","year":"2018","journal-title":"Nature"},{"key":"mlstad0fa3bib17","doi-asserted-by":"publisher","first-page":"732","DOI":"10.1039\/D2DD00028H","article-title":"Bayesian optimization with known experimental and design constraints for chemistry applications","volume":"1","author":"Hickman","year":"2022","journal-title":"Digit. Discovery"},{"key":"mlstad0fa3bib18","doi-asserted-by":"publisher","first-page":"170","DOI":"10.1126\/science.abn3445","article-title":"The central role of density functional theory in the AI age","volume":"381","author":"Huang","year":"2023","journal-title":"Science"},{"key":"mlstad0fa3bib19","doi-asserted-by":"publisher","first-page":"889","DOI":"10.1080\/03639045.2017.1291672","article-title":"Design of experiments (DoE) in pharmaceutical development","volume":"43","author":"Politis","year":"2017","journal-title":"Drug Dev. Ind. Pharm."},{"key":"mlstad0fa3bib20","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1016\/S1359-6446(04)03086-7","article-title":"Application of statistical \u2018design of experiments\u2019 methods in drug discovery","volume":"9","author":"Tye","year":"2004","journal-title":"Drug Discovery Today"},{"key":"mlstad0fa3bib21","first-page":"pp 37","article-title":"Decision theoretic generalizations of the pac model for neural net and other learning applications","author":"Haussler","year":"2018"},{"key":"mlstad0fa3bib22","doi-asserted-by":"publisher","first-page":"457","DOI":"10.1038\/s41570-023-00502-0","article-title":"The future of chemistry is language","volume":"7","author":"White","year":"2023","journal-title":"Nat. Rev. Chem."},{"key":"mlstad0fa3bib23","doi-asserted-by":"publisher","DOI":"10.26434\/chemrxiv-2023-fw8n4-v3","article-title":"Leveraging large language models for predictive chemistry","author":"Jablonka","year":"2023"},{"key":"mlstad0fa3bib24","article-title":"Emergent autonomous scientific research capabilities of large language models","author":"Boiko","year":"2023"},{"key":"mlstad0fa3bib25","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/acc928","article-title":"Encrypted machine learning of molecular quantum properties","volume":"4","author":"Weinreich","year":"2023","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstad0fa3bib26","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ab6ac4","article-title":"Machine learning the computational cost of quantum chemistry","volume":"1","author":"Heinen","year":"2020","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstad0fa3bib27","doi-asserted-by":"publisher","first-page":"1134","DOI":"10.1039\/D3DD00037K","article-title":"Improving molecular machine learning through adaptive subsampling with active learning","volume":"2","author":"Wen","year":"2023","journal-title":"Digit. Discovery"},{"key":"mlstad0fa3bib28","doi-asserted-by":"publisher","DOI":"10.1063\/1.5023802","article-title":"Less is more: sampling chemical space with active learning","volume":"148","author":"Smith","year":"2018","journal-title":"J. Chem. Phys."},{"key":"mlstad0fa3bib29","doi-asserted-by":"crossref","DOI":"10.1088\/2632-2153\/ad1626","article-title":"Synthetic pre-training for neural-network interatomic potentials","author":"Gardner","year":"2023"},{"key":"mlstad0fa3bib30","article-title":"Reducing training data needs with minimal multilevel machine learning (M3L)","author":"Heinen","year":"2023"},{"key":"mlstad0fa3bib31","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevMaterials.3.023804","article-title":"Active learning of uniformly accurate interatomic potentials for materials simulation","volume":"3","author":"Zhang","year":"2019","journal-title":"Phys. Rev. Mater."},{"key":"mlstad0fa3bib32","doi-asserted-by":"publisher","first-page":"1575","DOI":"10.1021\/acs.accounts.0c00868","article-title":"Development of multimodal machine learning potentials: toward a physics-aware artificial intelligence","volume":"54","author":"Zubatiuk","year":"2021","journal-title":"Acc. Chem. Res."},{"key":"mlstad0fa3bib33","author":"Johnson","year":"1990"},{"key":"mlstad0fa3bib34","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1186\/s13321-016-0148-0","article-title":"Comparing structural fingerprints using a literature-based similarity benchmark","volume":"8","author":"O\u2019Boyle","year":"2016","journal-title":"J. Cheminformatics"},{"key":"mlstad0fa3bib35","doi-asserted-by":"publisher","first-page":"3525","DOI":"10.1039\/d0cs00098a","article-title":"QSAR without borders","volume":"49","author":"Muratov","year":"2020","journal-title":"Chem. Soc. Rev."},{"key":"mlstad0fa3bib36","doi-asserted-by":"publisher","first-page":"888","DOI":"10.1162\/neco.1992.4.6.888","article-title":"Local learning algorithms","volume":"4","author":"Bottou","year":"1992","journal-title":"Neural Comput."},{"key":"mlstad0fa3bib37","doi-asserted-by":"publisher","first-page":"823","DOI":"10.1038\/432823a","article-title":"Chemical space","volume":"432","author":"Kirkpatrick","year":"2004","journal-title":"Nature"},{"key":"mlstad0fa3bib38","doi-asserted-by":"publisher","first-page":"1120","DOI":"10.1038\/nmat4717","article-title":"Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach","volume":"15","author":"G\u00f3mez-Bombarelli","year":"2016","journal-title":"Nat. Mater."},{"key":"mlstad0fa3bib39","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1038\/s43588-022-00391-1","article-title":"High-throughput property-driven generative design of functional organic molecules","volume":"3","author":"Westermayr","year":"2023","journal-title":"Nat. Comput. Sci."},{"key":"mlstad0fa3bib40","doi-asserted-by":"publisher","DOI":"10.1126\/sciadv.1603015","article-title":"Machine learning of accurate energy-conserving molecular force fields","volume":"3","author":"Chmiela","year":"2017","journal-title":"Sci. Adv."},{"key":"mlstad0fa3bib41","doi-asserted-by":"publisher","DOI":"10.1126\/sciadv.1701816","article-title":"Machine learning unifies the modeling of materials and molecules","volume":"3","author":"Bart\u00f3k","year":"2017","journal-title":"Sci. Adv."},{"key":"mlstad0fa3bib42","first-page":"pp 327","article-title":"Learning curves: asymptotic values and rate of convergence","author":"Cortes","year":"1994","edition":"ed"},{"key":"mlstad0fa3bib43","article-title":"The shape of learning curves: a review","author":"Viering","year":"2021"},{"key":"mlstad0fa3bib44","article-title":"The intrinsic dimension of images and its impact on learning","author":"Pope","year":"2021"},{"key":"mlstad0fa3bib45","article-title":"Intrinsic dimension of data representations in deep neural networks","volume":"vol 32","author":"Ansuini","year":"2019","edition":"ed"},{"key":"mlstad0fa3bib46","article-title":"The intrinsic dimension of images and its impact on learning","author":"Pope","year":"2021"},{"key":"mlstad0fa3bib47","doi-asserted-by":"publisher","first-page":"1085","DOI":"10.1162\/neco.1996.8.5.1085","article-title":"A numerical study on learning curves in stochastic multilayer feedforward networks","volume":"8","author":"M\u00fcller","year":"1996","journal-title":"Neural Comput."},{"key":"mlstad0fa3bib48","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2014.22","article-title":"Quantum chemistry structures and properties of 134 kilo molecules","volume":"1","author":"Ramakrishnan","year":"2014","journal-title":"Sci. Data"},{"key":"mlstad0fa3bib49","doi-asserted-by":"publisher","DOI":"10.1063\/1.5126701","article-title":"FCHL revisited: faster and more accurate quantum machine learning","volume":"152","author":"Christensen","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstad0fa3bib50","article-title":"Enamine REAL Compounds","year":"2022"},{"key":"mlstad0fa3bib51","article-title":"Enamine REAL Database","year":"2022"},{"key":"mlstad0fa3bib52","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.130.067401","article-title":"Intrinsic dimension estimation for discrete metrics","volume":"130","author":"Macocco","year":"2023","journal-title":"Phys. Rev. Lett."},{"key":"mlstad0fa3bib53","doi-asserted-by":"publisher","first-page":"294","DOI":"10.2174\/1573409912666160906111821","article-title":"Exploring intrinsic dimensionality of chemical spaces for robust QSAR model development: a comparison of several statistical approaches","volume":"12","author":"Majumdar","year":"2016","journal-title":"Curr. Comput. Aided Drug Des."},{"key":"mlstad0fa3bib54","first-page":"pp 29","article-title":"Estimating local intrinsic dimensionality","author":"Amsaleg","year":"2015"},{"key":"mlstad0fa3bib55","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1109\/TPAMI.1979.4766873","article-title":"An intrinsic dimensionality estimator from near-neighbor information","volume":"PAMI-1","author":"Pettis","year":"1979","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"mlstad0fa3bib56","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-017-11873-y","article-title":"Estimating the intrinsic dimension of datasets by a minimal neighborhood information","volume":"7","author":"Facco","year":"2017","journal-title":"Sci. Rep."},{"key":"mlstad0fa3bib57","article-title":"Maximum likelihood estimation of intrinsic dimension","volume":"vol 17","author":"Levina","year":"2004","edition":"ed"},{"key":"mlstad0fa3bib58","doi-asserted-by":"publisher","first-page":"3828","DOI":"10.1021\/acs.jpclett.0c00527","article-title":"Combining SchNet and SHARC: the SchNarc machine learning approach for excited-state dynamics","volume":"11","author":"Westermayr","year":"2020","journal-title":"J. Phys. Chem. Lett."},{"key":"mlstad0fa3bib59","doi-asserted-by":"publisher","first-page":"9873","DOI":"10.1021\/acs.chemrev.0c00749","article-title":"Machine learning for electronically excited states of molecules","volume":"121","author":"Westermayr","year":"2020","journal-title":"Chem. Rev."},{"key":"mlstad0fa3bib60","first-page":"pp 9323","article-title":"E(n) equivariant graph neural networks","volume":"vol 139","author":"Satorras","year":"2021","edition":"ed"},{"key":"mlstad0fa3bib61","doi-asserted-by":"publisher","first-page":"10775","DOI":"10.1039\/D2CP00834C","article-title":"\u0394-quantum machine-learning for medicinal chemistry","volume":"24","author":"Atz","year":"2022","journal-title":"Phys. Chem. Chem. Phys."},{"key":"mlstad0fa3bib62","article-title":"Equiformerv2: improved equivariant transformer for scaling to higher-degree representations","author":"Liao","year":"2023"},{"key":"mlstad0fa3bib63","article-title":"Equivariant transformers for neural network based molecular potentials","author":"Th\u00f6lke","year":"2022"},{"key":"mlstad0fa3bib64","doi-asserted-by":"publisher","first-page":"4895","DOI":"10.1038\/s41467-020-18556-9","article-title":"Retrospective on a decade of machine learning for chemical discovery","volume":"11","author":"von Lilienfeld","year":"2020","journal-title":"Nat. Commun."},{"key":"mlstad0fa3bib65","doi-asserted-by":"publisher","first-page":"347","DOI":"10.1038\/s41570-020-0189-9","article-title":"Exploring chemical compound space with quantum-based machine learning","volume":"4","author":"von Lilienfeld","year":"2020","journal-title":"Nat. Rev. Chem."},{"key":"mlstad0fa3bib66","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1186\/s13321-021-00512-4","article-title":"STOUT: SMILES to IUPAC names using neural machine translation","volume":"13","author":"Rajan","year":"2021","journal-title":"J. Cheminformatics"},{"key":"mlstad0fa3bib67","article-title":"Leruli.com, online molecular property predictions in real time and for free","author":"Lemm","year":"2021"},{"key":"mlstad0fa3bib68","doi-asserted-by":"publisher","first-page":"1094","DOI":"10.1021\/acs.accounts.0c00714","article-title":"Chemist ex machina: advanced synthesis planning by computers","volume":"54","author":"Molga","year":"2021","journal-title":"Acc. Chem. Res."},{"key":"mlstad0fa3bib69","doi-asserted-by":"publisher","first-page":"1281","DOI":"10.1021\/acs.accounts.8b00087","article-title":"Machine learning in computer-aided synthesis planning","volume":"51","author":"Coley","year":"2018","journal-title":"Acc. Chem. Res."},{"key":"mlstad0fa3bib70","doi-asserted-by":"publisher","first-page":"7747","DOI":"10.1038\/s41467-022-35422-y","article-title":"Merging enzymatic and synthetic chemistry with computational synthesis planning","volume":"13","author":"Levin","year":"2022","journal-title":"Nat. Commun."},{"key":"mlstad0fa3bib71","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1038\/s41586-020-2855-y","article-title":"Computational planning of the synthesis of complex natural products","volume":"588","author":"Mikulak-Klucznik","year":"2020","journal-title":"Nature"},{"key":"mlstad0fa3bib72","doi-asserted-by":"publisher","first-page":"1239","DOI":"10.1111\/j.1476-5381.2010.01127.x","article-title":"Principles of early drug discovery","volume":"162","author":"Hughes","year":"2011","journal-title":"Br. J. Pharmacol."},{"key":"mlstad0fa3bib73","doi-asserted-by":"publisher","DOI":"10.1002\/aic.16976","article-title":"Temperature-dependent vapor\u2013liquid equilibria and solvation free energy estimation from minimal data","volume":"66","author":"Chung","year":"2020","journal-title":"AIChE J."},{"key":"mlstad0fa3bib74","doi-asserted-by":"publisher","first-page":"433","DOI":"10.1021\/acs.jcim.1c01103","article-title":"Group contribution and machine learning approaches to predict abraham solute parameters, solvation free energy and solvation enthalpy","volume":"62","author":"Chung","year":"2022","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad0fa3bib75","first-page":"pp 1000","article-title":"Shape indexing using approximate nearest-neighbour search in high-dimensional spaces","author":"Beis","year":"1997"},{"key":"mlstad0fa3bib76","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ac8e4f","article-title":"Metric learning for kernel ridge regression: assessment of molecular similarity","volume":"3","author":"Fabregat","year":"2022","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstad0fa3bib77","doi-asserted-by":"publisher","first-page":"5373","DOI":"10.1021\/acs.jcim.2c00817","article-title":"Auto3d: automatic generation of the low-energy 3D structures with ANI neural network potentials","volume":"62","author":"Liu","year":"2022","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad0fa3bib78","doi-asserted-by":"publisher","DOI":"10.1063\/5.0112856","article-title":"Transition state search and geometry relaxation throughout chemical compound space with quantum machine learning","volume":"157","author":"Heinen","year":"2022","journal-title":"J. Chem. Phys."},{"key":"mlstad0fa3bib79","doi-asserted-by":"publisher","first-page":"4468","DOI":"10.1038\/s41467-021-24525-7","article-title":"Machine learning based energy-free structure predictions of molecules, transition states and solids","volume":"12","author":"Lemm","year":"2021","journal-title":"Nat. Commun."},{"key":"mlstad0fa3bib80","first-page":"pp 8867","article-title":"Equivariant diffusion for molecule generation in 3D","volume":"vol 162","author":"Hoogeboom","year":"2022","edition":"ed"},{"key":"mlstad0fa3bib81","first-page":"pp 38592","article-title":"Geometric latent diffusion models for 3D molecule generation","volume":"vol 202","author":"Xu","year":"2023","edition":"ed"},{"key":"mlstad0fa3bib82","article-title":"Torsional diffusion for molecular conformer generation","author":"Jing","year":"2022"},{"key":"mlstad0fa3bib83","doi-asserted-by":"publisher","first-page":"4769","DOI":"10.1021\/acs.jctc.1c00363","article-title":"Impact of the characteristics of quantum chemical databases on machine learning prediction of tautomerization energies","volume":"17","author":"Vazquez-Salazar","year":"2021","journal-title":"J. Chem. Theory Comput."},{"key":"mlstad0fa3bib84","doi-asserted-by":"publisher","first-page":"945","DOI":"10.1038\/s41557-020-0527-z","article-title":"Quantum machine learning using atom-in-molecule-based fragments selected on the fly","volume":"12","author":"Huang","year":"2020","journal-title":"Nat. Chem."},{"key":"mlstad0fa3bib85","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevB.105.165141","article-title":"Exploring the robust extrapolation of high-dimensional machine learning potentials","volume":"105","author":"Zeni","year":"2022","journal-title":"Phys. Rev. B"},{"key":"mlstad0fa3bib86","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1038\/s41529-018-0058-x","article-title":"A review of deep learning in the study of materials degradation","volume":"2","author":"Nash","year":"2018","journal-title":"npj Mater. Degrad."},{"key":"mlstad0fa3bib87","first-page":"pp 1","article-title":"Average life prediction for aero-engine fleet based on performance degradation data","author":"Fang","year":"2010"},{"key":"mlstad0fa3bib88","doi-asserted-by":"publisher","first-page":"2991","DOI":"10.1007\/s00521-022-07167-8","article-title":"A rare failure detection model for aircraft predictive maintenance using a deep hybrid learning approach","author":"Dangut","year":"2022","journal-title":"Neural Comput. Appl."},{"key":"mlstad0fa3bib89","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1002\/sam.10037","article-title":"Turbo similarity searching: effect of fingerprint and dataset on virtual-screening performance","volume":"2","author":"Gardiner","year":"2009","journal-title":"Stat. Anal. Data Min."},{"key":"mlstad0fa3bib90","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1186\/s13321-021-00505-3","article-title":"Extended similarity indices: the benefits of comparing more than two objects simultaneously. Part 1: theory and characteristics","volume":"13","author":"Miranda-Quintana","year":"2021","journal-title":"J. Cheminformatics"},{"key":"mlstad0fa3bib91","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1186\/s13321-021-00504-4","article-title":"Extended similarity indices: the benefits of comparing more than two objects simultaneously. Part 2: speed, consistency, diversity selection","volume":"13","author":"Miranda-Quintana","year":"2021","journal-title":"J. Cheminformatics"},{"key":"mlstad0fa3bib92","first-page":"119","article-title":"A statistical approach to some basic mine valuation problems on the Witwatersrand","volume":"52","author":"Krige","year":"1951","journal-title":"J. South. Afr. Inst. Min. Metall."},{"key":"mlstad0fa3bib93","author":"Vapnik","year":"2000"},{"key":"mlstad0fa3bib94","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.108.058301","article-title":"Fast and accurate modeling of molecular atomization energies with machine learning","volume":"108","author":"Rupp","year":"2012","journal-title":"Phys. Rev. Lett."},{"key":"mlstad0fa3bib95","doi-asserted-by":"publisher","first-page":"2326","DOI":"10.1021\/acs.jpclett.5b00831","article-title":"Machine learning predictions of molecular properties: accurate many-body potentials and nonlocality in chemical space","volume":"6","author":"Hansen","year":"2015","journal-title":"J. Phys. Chem. Lett."},{"key":"mlstad0fa3bib96","doi-asserted-by":"publisher","DOI":"10.1063\/1.3553717","article-title":"Atom-centered symmetry functions for constructing high-dimensional neural network potentials","volume":"134","author":"Behler","year":"2011","journal-title":"J. Chem. Phys."},{"key":"mlstad0fa3bib97","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevB.87.184115","article-title":"On representing chemical environments","volume":"87","author":"Bart\u00f3k","year":"2013","journal-title":"Phys. Rev. B"},{"key":"mlstad0fa3bib98","doi-asserted-by":"publisher","DOI":"10.1063\/5.0041548","article-title":"Machine learning of free energies in chemical compound space using ensemble representations: reaching experimental uncertainty for solvation","volume":"154","author":"Weinreich","year":"2021","journal-title":"J. Chem. Phys."},{"key":"mlstad0fa3bib99","doi-asserted-by":"publisher","first-page":"8732","DOI":"10.1021\/ja902302h","article-title":"970 million druglike small molecules for virtual screening in the chemical Universe database GDB-13","volume":"131","author":"Blum","year":"2009","journal-title":"J. Am. Chem. Soc."},{"key":"mlstad0fa3bib100","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","article-title":"SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules","volume":"28","author":"Weininger","year":"1988","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"mlstad0fa3bib101","doi-asserted-by":"publisher","first-page":"2562","DOI":"10.1021\/acs.jcim.5b00654","article-title":"Better informed distance geometry: using what we know to improve conformation generation","volume":"55","author":"Riniker","year":"2015","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad0fa3bib102","doi-asserted-by":"publisher","first-page":"1652","DOI":"10.1021\/acs.jctc.8b01176","article-title":"GFN2-xTB-An accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions","volume":"15","author":"Bannwarth","year":"2019","journal-title":"J. Chem. Theory Comput."}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,26]],"date-time":"2023-12-26T12:13:54Z","timestamp":1703592834000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad0fa3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,1]]},"references-count":102,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,12,6]]},"published-print":{"date-parts":[[2023,12,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/ad0fa3","relation":{},"ISSN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,1]]},"assertion":[{"value":"Improved decision making with similarity based machine learning: applications in chemistry","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2023 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2023-07-19","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2023-11-24","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2023-12-06","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}