{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,14]],"date-time":"2026-03-14T00:00:42Z","timestamp":1773446442026,"version":"3.50.1"},"reference-count":184,"publisher":"IOP Publishing","issue":"4","license":[{"start":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T00:00:00Z","timestamp":1731974400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T00:00:00Z","timestamp":1731974400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["W2433037"],"award-info":[{"award-number":["W2433037"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100010225","name":"National Outstanding Youth Foundation of China","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100010225","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2024,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    The field of computational chemistry is increasingly leveraging machine learning (ML) potentials to predict molecular properties with high accuracy and efficiency, providing a viable alternative to traditional quantum mechanical (QM) methods, which are often computationally intensive. Central to the success of ML models is the quality and comprehensiveness of the data sets on which they are trained. 
Quantum chemistry data sets and databases, comprising extensive information on molecular structures, energies, forces, and other properties derived from QM calculations, are crucial for developing robust and generalizable ML potentials. In this review, we provide an overview of the current landscape of quantum chemical data sets and databases. We examine key characteristics and functionalities of prominent resources, including the types of information they store, the level of electronic structure theory employed, the diversity of chemical space covered, and the methodologies used for data creation. Additionally, an updatable resource is provided to track new data sets and databases at\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/Arif-PhyChem\/datasets_and_databases_4_MLPs\">https:\/\/github.com\/Arif-PhyChem\/datasets_and_databases_4_MLPs<\/jats:ext-link>\n                    . This resource also has the overview in a machine-readable database format with the Jupyter notebook example for analysis. Looking forward, we discuss the challenges associated with the rapid growth of quantum chemical data sets and databases, emphasizing the need for updatable and accessible resources to ensure the long-term utility of them. We also address the importance of data format standardization and the ongoing efforts to align with the FAIR principles to enhance data interoperability and reusability. 
Drawing inspiration from established materials databases, we advocate for the development of user-friendly and sustainable platforms for these data sets and databases.\n                  <\/jats:p>","DOI":"10.1088\/2632-2153\/ad8f13","type":"journal-article","created":{"date-parts":[[2024,11,5]],"date-time":"2024-11-05T17:56:02Z","timestamp":1730829362000},"page":"041001","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Molecular quantum chemical data sets and databases for machine learning potentials"],"prefix":"10.1088","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1702-3463","authenticated-orcid":true,"given":"Arif","family":"Ullah","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8398-5690","authenticated-orcid":false,"given":"Yuxinxin","family":"Chen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2975-9876","authenticated-orcid":true,"given":"Pavlo O","family":"Dral","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2024,11,19]]},"reference":[{"key":"mlstad8f13bib1","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2014.22","article-title":"Quantum chemistry structures and properties of 134 kilo molecules","volume":"1","author":"Ramakrishnan","year":"2014","journal-title":"Sci. Data"},{"key":"mlstad8f13bib2","doi-asserted-by":"publisher","first-page":"347","DOI":"10.1038\/s41570-020-0189-9","article-title":"Exploring chemical compound space with quantum-based machine learning","volume":"4","author":"von Lilienfeld","year":"2020","journal-title":"Nat. Rev. 
Chem."},{"key":"mlstad8f13bib3","doi-asserted-by":"publisher","first-page":"134","DOI":"10.1038\/s41597-020-0473-z","article-title":"The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules","volume":"7","author":"Smith","year":"2020","journal-title":"Sci. Data"},{"key":"mlstad8f13bib4","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR guiding principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Sci. Data"},{"key":"mlstad8f13bib5","doi-asserted-by":"publisher","first-page":"945","DOI":"10.1038\/s41597-024-03723-0","article-title":"Quantum topological atomic properties of 44k molecules","volume":"11","author":"Meza-Gonz\u00e1lez","year":"2024","journal-title":"Sci. Data"},{"key":"mlstad8f13bib6","article-title":"Alchemy: a quantum chemistry dataset for benchmarking AI models","author":"Chen","year":"2024"},{"key":"mlstad8f13bib7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2017.193","article-title":"ANI-1, a data set of 20 million calculated off-equilibrium conformations for organic molecules","volume":"4","author":"Smith","year":"2017","journal-title":"Sci. Data"},{"key":"mlstad8f13bib8","doi-asserted-by":"publisher","first-page":"4192","DOI":"10.1021\/acs.jctc.0c00121","article-title":"Extending the applicability of the ANI deep learning molecular potential to sulfur and halogens","volume":"16","author":"Devereux","year":"2020","journal-title":"J. Chem. Theory Comput."},{"key":"mlstad8f13bib9","doi-asserted-by":"publisher","first-page":"689","DOI":"10.1039\/D1DD00031D","article-title":"The resolution-vs.-accuracy dilemma in machine learning modeling of electronic excitation spectra","volume":"1","author":"Kayastha","year":"2022","journal-title":"Digit. 
Discov."},{"key":"mlstad8f13bib10","doi-asserted-by":"publisher","DOI":"10.1038\/ncomms13890","article-title":"Quantum-chemical insights from deep tensor neural networks","volume":"8","author":"Sch\u00fctt","year":"2017","journal-title":"Nat. Commun."},{"key":"mlstad8f13bib11","article-title":"CheMFi: a multifidelity dataset of quantum chemical properties of diverse molecules","author":"Vinod","year":"2024"},{"key":"mlstad8f13bib12","doi-asserted-by":"publisher","first-page":"3704","DOI":"10.1021\/acs.jcim.2c00503","article-title":"The COMPAS project: a computational database of polycyclic aromatic systems. phase 1: cata-condensed polybenzenoid hydrocarbons","volume":"62","author":"Wahab","year":"2022","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib13","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1038\/s41597-024-02927-8","article-title":"COMPAS-2: a dataset of cata-condensed hetero-polycyclic aromatic systems","volume":"11","author":"Mayo Yanes","year":"2024","journal-title":"Sci. Data"},{"key":"mlstad8f13bib14","doi-asserted-by":"publisher","first-page":"15344","DOI":"10.1039\/D4CP01027B","article-title":"COMPAS-3: a dataset of peri-condensed polybenzenoid hydrocarbons","volume":"26","author":"Wahab","year":"2024","journal-title":"Phys. Chem. Chem. Phys."},{"key":"mlstad8f13bib15","doi-asserted-by":"publisher","first-page":"859","DOI":"10.1038\/s41597-024-03698-y","article-title":"CREMP: conformer-rotamer ensembles of macrocyclic peptides for machine learning","volume":"11","author":"Grambow","year":"2024","journal-title":"Sci. Data"},{"key":"mlstad8f13bib16","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1038\/s41597-022-01288-4","article-title":"GEOM, energy-annotated molecular conformations for property prediction and molecular generation","volume":"9","author":"Axelrod","year":"2022","journal-title":"Sci. 
Data"},{"key":"mlstad8f13bib17","first-page":"992","article-title":"Schnet: a continuous-filter convolutional neural network for modeling quantum interactions","author":"Sch\u00fctt","year":"2017"},{"key":"mlstad8f13bib18","doi-asserted-by":"publisher","DOI":"10.1126\/sciadv.1603015","article-title":"Machine learning of accurate energy-conserving molecular force fields","volume":"3","author":"Chmiela","year":"2017","journal-title":"Sci. Adv."},{"key":"mlstad8f13bib19","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/abba6f","article-title":"On the role of gradients for machine learning of molecular energies and forces","volume":"1","author":"Christensen","year":"2020","journal-title":"Mach. Learn. Sci. Technol."},{"key":"mlstad8f13bib20","doi-asserted-by":"publisher","first-page":"3887","DOI":"10.1038\/s41467-018-06169-2","article-title":"Towards exact molecular dynamics simulations with machine-learned force fields","volume":"9","author":"Chmiela","year":"2018","journal-title":"Nat. Commun."},{"key":"mlstad8f13bib21","doi-asserted-by":"publisher","first-page":"0873","DOI":"10.1126\/sciadv.adf0873","article-title":"Accurate global machine learning force fields for molecules with hundreds of atoms","volume":"9","author":"Chmiela","year":"2023","journal-title":"Sci. Adv."},{"key":"mlstad8f13bib22","doi-asserted-by":"publisher","first-page":"783","DOI":"10.1038\/s41597-023-02690-2","article-title":"MultiXC-QM9: large dataset of molecular and reaction energies from multi-level quantum chemical methods","volume":"10","author":"Nandi","year":"2023","journal-title":"Sci. 
Data"},{"key":"mlstad8f13bib23","article-title":"\u22072 DFT: a universal quantum chemistry dataset of drug-like molecules and a benchmark for neural network potentials","author":"Khrabrov","year":"2024"},{"key":"mlstad8f13bib24","doi-asserted-by":"publisher","first-page":"1201","DOI":"10.1021\/acs.jcim.3c01953","article-title":"From organic fragments to photoswitchable catalysts: the OFF-ON structural repository for transferable kernel-based potentials","volume":"64","author":"C\u00e9lerse","year":"2024","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib25","doi-asserted-by":"publisher","DOI":"10.1063\/5.0061990","article-title":"A machine learning potential for biological and organic chemistry with semi-empirical cost and DFT accuracy","volume":"155","author":"Christensen","year":"2021","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13321-019-0391-2","article-title":"Dataset\u2019s chemical diversity limits the generalizability of machine learning predictions","volume":"11","author":"Glavatskikh","year":"2019","journal-title":"J. Cheminf."},{"key":"mlstad8f13bib27","doi-asserted-by":"publisher","first-page":"5734","DOI":"10.1021\/acs.jcim.3c00899","article-title":"PubChemQC B3LYP\/6-31G*\/\/PM6 data set: the electronic structures of 86 million molecules using B3LYP\/6-31G* calculations","volume":"63","author":"Nakata","year":"2023","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib28","doi-asserted-by":"publisher","first-page":"1300","DOI":"10.1021\/acs.jcim.7b00083","article-title":"PubChemQC project: a large-scale first-principles electronic structure database for data-driven chemistry","volume":"57","author":"Nakata","year":"2017","journal-title":"J. Chem. Inf. 
Model."},{"key":"mlstad8f13bib29","doi-asserted-by":"publisher","first-page":"5891","DOI":"10.1021\/acs.jcim.0c00740","article-title":"PubChemQC PM6: data sets of 221 million molecules with optimized molecular geometries and electronic properties","volume":"60","author":"Nakata","year":"2020","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib30","doi-asserted-by":"publisher","first-page":"948","DOI":"10.1038\/s41597-024-03788-x","article-title":"Quantum chemistry dataset with ground-and excited-state properties of 450 kilo molecules","volume":"11","author":"Zhu","year":"2024","journal-title":"Sci. Data"},{"key":"mlstad8f13bib31","article-title":"Generating QM1B with PySCFIPU","volume":"vol 36","author":"Mathiasen","year":"2024"},{"key":"mlstad8f13bib32","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.108.058301","article-title":"Fast and accurate modeling of molecular atomization energies with machine learning","volume":"108","author":"Rupp","year":"2012","journal-title":"Phys. Rev. Lett."},{"key":"mlstad8f13bib33","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1038\/s41597-021-00812-2","article-title":"QM7-X, a comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules","volume":"8","author":"Hoja","year":"2021","journal-title":"Sci. Data"},{"key":"mlstad8f13bib34","doi-asserted-by":"publisher","DOI":"10.1088\/1367-2630\/15\/9\/095003","article-title":"Machine learning of molecular electronic properties in chemical compound space","volume":"15","author":"Montavon","year":"2013","journal-title":"New J. Phys."},{"key":"mlstad8f13bib35","doi-asserted-by":"publisher","DOI":"10.1063\/1.4928757","article-title":"Electronic spectra from TDDFT and machine learning in chemical space","volume":"143","author":"Ramakrishnan","year":"2015","journal-title":"J. Chem. 
Phys."},{"key":"mlstad8f13bib36","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1038\/s41597-019-0121-7","article-title":"Energy refinement and analysis of structures in the QM9 database via a highly accurate quantum chemical method","volume":"6","author":"Kim","year":"2019","journal-title":"Sci. Data"},{"key":"mlstad8f13bib37","doi-asserted-by":"publisher","first-page":"957","DOI":"10.1038\/s43588-023-00550-y","article-title":"A deep learning model for predicting selected organic molecular spectra","volume":"3","author":"Zou","year":"2023","journal-title":"Nat. Comput. Sci."},{"key":"mlstad8f13bib38","doi-asserted-by":"publisher","first-page":"213","DOI":"10.1038\/s41597-019-0237-9","article-title":"QM-sym, a symmetrized quantum chemistry database of 135 kilo molecules","volume":"6","author":"Liang","year":"2019","journal-title":"Sci. Data"},{"key":"mlstad8f13bib39","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1038\/s41597-020-00746-1","article-title":"QM-symex, update of the QM-sym database with excited state information for 173 kilo molecules","volume":"7","author":"Liang","year":"2020","journal-title":"Sci. Data"},{"key":"mlstad8f13bib40","doi-asserted-by":"publisher","DOI":"10.1063\/5.0089200","article-title":"The MD17 datasets from the perspective of datasets for gas-phase \u201csmall\u201d molecule potentials","volume":"156","author":"Bowman","year":"2022","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib41","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1038\/s41597-022-01390-7","article-title":"QMugs, quantum mechanical properties of drug-like molecules","volume":"9","author":"Isert","year":"2022","journal-title":"Sci. 
Data"},{"key":"mlstad8f13bib42","article-title":"Adaptive hybrid density functionals","author":"Khan","year":"2024"},{"key":"mlstad8f13bib43","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1038\/s41597-022-01882-6","article-title":"Spice, a dataset of drug-like molecules and peptides for training machine learning potentials","volume":"10","author":"Eastman","year":"2023","journal-title":"Sci. Data"},{"key":"mlstad8f13bib44","doi-asserted-by":"crossref","DOI":"10.1021\/acs.jctc.4c00794","article-title":"Nutmeg and SPICE: models and data for biomolecular machine learning","author":"Eastman","year":"2024"},{"key":"mlstad8f13bib45","doi-asserted-by":"publisher","first-page":"2261","DOI":"10.1039\/C7SC04934J","article-title":"The TensorMol-0.1 model chemistry: a neural network augmented with long-range physics","volume":"9","author":"Yao","year":"2018","journal-title":"Chem. Sci."},{"key":"mlstad8f13bib46","doi-asserted-by":"publisher","first-page":"6135","DOI":"10.1021\/acs.jcim.0c01041","article-title":"tmQM dataset\u2014quantum geometries and properties of 86k transition metal complexes","volume":"60","author":"Balcells","year":"2020","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib47","doi-asserted-by":"publisher","first-page":"779","DOI":"10.1038\/s41597-022-01870-w","article-title":"Transition1x-a dataset for building generalizable reactive machine learning potentials","volume":"9","author":"Schreiner","year":"2022","journal-title":"Sci. Data"},{"key":"mlstad8f13bib48","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1038\/s41597-022-01185-w","article-title":"VIB5 database with accurate ab initio quantum chemical molecular potential energy surfaces","volume":"9","author":"Zhang","year":"2022","journal-title":"Sci. 
Data"},{"key":"mlstad8f13bib49","article-title":"Towards comprehensive coverage of chemical space: quantum mechanical properties of 836k constitutional and conformational closed shell neutral isomers consisting of HCNOFSiPSClBr","author":"Khan","year":"2024"},{"key":"mlstad8f13bib50","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1038\/s41597-023-01998-3","article-title":"WS22 database, Wigner sampling and geometry interpolation for configurationally diverse molecular datasets","volume":"10","author":"Pinheiro","year":"2023","journal-title":"Sci. Data"},{"key":"mlstad8f13bib51","doi-asserted-by":"publisher","first-page":"222","DOI":"10.1038\/s41597-024-03019-3","article-title":"Beyond MD17: the reactive xxMD dataset","volume":"11","author":"Pengmei","year":"2024","journal-title":"Sci. Data"},{"key":"mlstad8f13bib52","doi-asserted-by":"publisher","first-page":"2864","DOI":"10.1021\/ci300415d","article-title":"Enumeration of 166 billion organic small molecules in the chemical Universe database GDB-17","volume":"52","author":"Ruddigkeit","year":"2012","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib53","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/BF00128336","article-title":"MOPAC: a semiempirical molecular orbital program","volume":"4","author":"Stewart","year":"1990","journal-title":"J. Comput. Aided Mol. Des."},{"key":"mlstad8f13bib54","doi-asserted-by":"publisher","first-page":"5648","DOI":"10.1063\/1.464913","article-title":"Density-functional thermochemistry. III. The role of exact exchange","volume":"98","author":"Becke","year":"1993","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib55","doi-asserted-by":"publisher","first-page":"11623","DOI":"10.1021\/j100096a001","article-title":"Ab initio calculation of vibrational absorption and circular dichroism spectra using density functional force fields","volume":"98","author":"Stephens","year":"1994","journal-title":"J. Phys. 
Chem."},{"key":"mlstad8f13bib56","doi-asserted-by":"publisher","first-page":"724","DOI":"10.1063\/1.1674902","article-title":"Self-consistent molecular-orbital methods. IX. An extended Gaussian-type basis for molecular-orbital studies of organic molecules","volume":"54","author":"Ditchfield","year":"1971","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib57","doi-asserted-by":"publisher","first-page":"650","DOI":"10.1063\/1.438955","article-title":"Self-consistent molecular orbital methods. XX. A basis set for correlated wave functions","volume":"72","author":"Krishnan","year":"1980","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib58","article-title":"Gaussian 09, Revision D.01","volume":"vol 201","author":"Frisch","year":"2009"},{"key":"mlstad8f13bib59","doi-asserted-by":"publisher","DOI":"10.1063\/1.2770701","article-title":"Gaussian-4 theory using reduced order perturbation theory","volume":"127","author":"Curtiss","year":"2007","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib60","doi-asserted-by":"publisher","DOI":"10.6084\/m9.Figshare.978904","article-title":"Quantum chemistry structures and properties of 134 kilo molecules","author":"Ramakrishnan","year":"2014"},{"key":"mlstad8f13bib61","doi-asserted-by":"publisher","DOI":"10.1063\/1.2436888","article-title":"Gaussian-4 theory","volume":"126","author":"Curtiss","year":"2007","journal-title":"J. Chem. 
Phys."},{"key":"mlstad8f13bib62","article-title":"Gaussian 16 Revision C.01","author":"Frisch","year":"2016"},{"key":"mlstad8f13bib63","doi-asserted-by":"publisher","DOI":"10.6084\/m9.Figshare.c.4351631.v1","article-title":"Highly accurate G4(MP2) benchmark on QM9 database: energy refinement and analysis of structures","author":"Kim","year":"2019"},{"key":"mlstad8f13bib64","doi-asserted-by":"publisher","first-page":"1493","DOI":"10.1002\/wcms.1493","article-title":"Extended tight-binding quantum chemistry methods","volume":"11","author":"Bannwarth","year":"2021","journal-title":"Wiley Interdiscip. Rev. Comput. Mol. Sci."},{"key":"mlstad8f13bib65","doi-asserted-by":"publisher","first-page":"931","DOI":"10.1002\/jcc.1056","article-title":"Chemistry with ADF","volume":"22","author":"Te Velde","year":"2001","journal-title":"J. Comput. Chem."},{"key":"mlstad8f13bib66","doi-asserted-by":"publisher","first-page":"3865","DOI":"10.1103\/PhysRevLett.77.3865","article-title":"Generalized gradient approximation made simple","volume":"77","author":"Perdew","year":"1996","journal-title":"Phys. Rev. Lett."},{"key":"mlstad8f13bib67","doi-asserted-by":"publisher","DOI":"10.11583\/DTU.c.6185986.v3","article-title":"MultiXC-QM9","author":"Nandi","year":"2023"},{"key":"mlstad8f13bib68","doi-asserted-by":"publisher","first-page":"1664","DOI":"10.1002\/jcc.23981","article-title":"Libcint: an efficient general integral library for Gaussian basis functions","volume":"36","author":"Sun","year":"2015","journal-title":"J. Comput. Chem."},{"key":"mlstad8f13bib69","doi-asserted-by":"publisher","first-page":"1340","DOI":"10.1002\/wcms.1340","article-title":"PySCF: the Python-based simulations of chemistry framework","volume":"8","author":"Sun","year":"2017","journal-title":"WIREs Comput. Mol. 
Sci."},{"key":"mlstad8f13bib70","doi-asserted-by":"publisher","DOI":"10.1063\/5.0006074","article-title":"Recent developments in the PySCF program package","volume":"153","author":"Sun","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib71","article-title":"aPBE0","author":"GKhan","year":"2024"},{"key":"mlstad8f13bib72","article-title":"Revised QM9 dataset","author":"Khan","year":"2024"},{"key":"mlstad8f13bib73","article-title":"AIMAll (version 19.10.12)","author":"Keith"},{"key":"mlstad8f13bib74","article-title":"AIMEl-DB data set at Zenodo","author":"Meza-Gonz\u00e1lez","year":"2024"},{"key":"mlstad8f13bib75","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1002\/(SICI)1097-461X(1996)58:23.0.CO;2-U","article-title":"Calculations of molecules, clusters and solids with a simplified LCAO-DFT-LDA scheme","volume":"58","author":"Seifert","year":"1996","journal-title":"Int. J. Quantum Chem."},{"key":"mlstad8f13bib76","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.108.236402","article-title":"Accurate and efficient method for many-body van der waals interactions","volume":"108","author":"Tkatchenko","year":"2012","journal-title":"Phys. Rev. Lett."},{"key":"mlstad8f13bib77","doi-asserted-by":"publisher","first-page":"8732","DOI":"10.1021\/ja902302h","article-title":"970 million druglike small molecules for virtual screening in the chemical Universe database gdb-13","volume":"131","author":"Blum","year":"2009","journal-title":"J. Am. Chem. Soc."},{"key":"mlstad8f13bib78","doi-asserted-by":"publisher","first-page":"490","DOI":"10.1002\/(SICI)1096-987X(199604)17:5\/63.0.CO;2-P","article-title":"Merck molecular force field. I. Basis, form, scope, parameterization and performance of mmff94","volume":"17","author":"Halgren","year":"1996","journal-title":"J. Comput. 
Chem."},{"key":"mlstad8f13bib79","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1758-2946-3-8","article-title":"Confab-systematic generation of diverse low-energy conformers","volume":"3","author":"O\u2019Boyle","year":"2011","journal-title":"J. Cheminf."},{"key":"mlstad8f13bib80","doi-asserted-by":"publisher","first-page":"2175","DOI":"10.1016\/j.cpc.2009.06.022","article-title":"Ab initio molecular simulations with numeric atom-centered orbitals","volume":"180","author":"Blum","year":"2009","journal-title":"Comput. Phys. Commun."},{"key":"mlstad8f13bib81","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.4288677","article-title":"QM7-X, a comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules","author":"Hoja","year":"2020"},{"key":"mlstad8f13bib82","doi-asserted-by":"publisher","first-page":"991","DOI":"10.1021\/ci050400b","article-title":"The Blue Obelisk\u2014interoperability in chemical informatics","volume":"46","author":"Guha","year":"2006","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib83","article-title":"QM7 dataset","author":"Rupp","year":"2012"},{"key":"mlstad8f13bib84","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1021\/ci00057a005","article-title":"SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules","volume":"28","author":"Weininger","year":"1988","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"mlstad8f13bib85","doi-asserted-by":"publisher","first-page":"10024","DOI":"10.1021\/ja00051a040","article-title":"UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations","volume":"114","author":"Rapp\u00e9","year":"1992","journal-title":"J. Am. Chem. 
Soc."},{"key":"mlstad8f13bib86","doi-asserted-by":"publisher","first-page":"796","DOI":"10.1103\/PhysRev.139.A796","article-title":"New method for calculating the one-particle green\u2019s function with application to the electron-gas problem","volume":"139","author":"Hedin","year":"1965","journal-title":"Phys. Rev."},{"key":"mlstad8f13bib87","doi-asserted-by":"publisher","first-page":"e1606","DOI":"10.1002\/wcms.1606","article-title":"Software update: The ORCA program system\u2014Version 5.0","volume":"12","author":"Neese","year":"2022","journal-title":"Wiley Interdiscip. Rev. Comput. Mol. Sci."},{"key":"mlstad8f13bib88","article-title":"QM7b dataset","author":"Montavon","year":"2013"},{"key":"mlstad8f13bib89","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1002\/wcms.1162","article-title":"Turbomole","volume":"4","author":"Furche","year":"2014","journal-title":"Wiley Interdiscip. Rev. Comput. Mol. Sci."},{"key":"mlstad8f13bib90","doi-asserted-by":"publisher","first-page":"7433","DOI":"10.1063\/1.1508368","article-title":"Adiabatic time-dependent density functional methods for excited state properties","volume":"117","author":"Furche","year":"2002","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib91","doi-asserted-by":"publisher","first-page":"9982","DOI":"10.1063\/1.472933","article-title":"Rationale for mixing exact exchange with density functional approximations","volume":"105","author":"Perdew","year":"1996","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib92","doi-asserted-by":"publisher","first-page":"3297","DOI":"10.1039\/b508541a","article-title":"Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: design and assessment of accuracy","volume":"7","author":"Weigend","year":"2005","journal-title":"Phys. Chem. Chem. 
Phys."},{"key":"mlstad8f13bib93","doi-asserted-by":"publisher","first-page":"5154","DOI":"10.1063\/1.1290013","article-title":"CC2 excitation energy calculations on large molecules using the resolution of the identity approximation","volume":"113","author":"H\u00e4ttig","year":"2000","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib94","article-title":"Alchemy data set","author":"Chen","year":"2024"},{"key":"mlstad8f13bib95","doi-asserted-by":"publisher","first-page":"1504","DOI":"10.1002\/anie.200462457","article-title":"Virtual exploration of the small-molecule chemical Universe below 160 daltons","volume":"44","author":"Fink","year":"2005","journal-title":"Angew. Chem., Int. Ed."},{"key":"mlstad8f13bib96","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1021\/ci600423u","article-title":"Virtual exploration of the chemical Universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes and drug discovery","volume":"47","author":"Fink","year":"2007","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib97","article-title":"RDKit","author":"Landrum","year":"2022"},{"key":"mlstad8f13bib98","article-title":"QM1B dataset","author":"Mathiasen","year":"2023"},{"key":"mlstad8f13bib99","doi-asserted-by":"publisher","DOI":"10.1063\/5.0006002","article-title":"PSI4 1.4: open-source software for high-throughput quantum chemistry","volume":"152","author":"Smith","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib100","doi-asserted-by":"publisher","first-page":"5725","DOI":"10.1021\/acs.jctc.8b00842","article-title":"The nonlocal kernel in van der Waals density functionals as an additive correction: an extensive analysis with special emphasis on the B97M-V and \u03c9B97M-V approaches","volume":"14","author":"Najibi","year":"2018","journal-title":"J. Chem. 
Theory Comput."},{"key":"mlstad8f13bib101","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.7338495","article-title":"SPICE 1.1.2","author":"Eastman","year":"2022"},{"key":"mlstad8f13bib102","doi-asserted-by":"publisher","first-page":"1373","DOI":"10.1093\/nar\/gkac956","article-title":"PubChem 2023 update","volume":"51","author":"Kim","year":"2023","journal-title":"Nucl. Acids Res."},{"key":"mlstad8f13bib103","article-title":"PubChemQC database","author":"Nakata","year":"2017"},{"key":"mlstad8f13bib104","article-title":"PubChemQC PM6 data sets","author":"Nakata","year":"2020"},{"key":"mlstad8f13bib105","article-title":"PubChemQC B3LYP\/6-31G*\/\/PM6","author":"Nakata","year":"2023"},{"key":"mlstad8f13bib106","doi-asserted-by":"publisher","DOI":"10.6084\/m9.figshare.9033977.v1","article-title":"PC9 dataset","author":"Glavatskikh","year":"2019"},{"key":"mlstad8f13bib107","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.3588370","article-title":"PC9 dataset","author":"Glavatskikh","year":"2019"},{"key":"mlstad8f13bib108","doi-asserted-by":"publisher","first-page":"5566","DOI":"10.1039\/D0SC05591C","article-title":"Troubleshooting unstable molecules in chemical space","volume":"12","author":"Senthil","year":"2021","journal-title":"Chem. Sci."},{"key":"mlstad8f13bib109","doi-asserted-by":"publisher","first-page":"6615","DOI":"10.1039\/b810189b","article-title":"Long-range corrected hybrid density functionals with damped atom\u2013atom dispersion corrections","volume":"10","author":"Chai","year":"2008","journal-title":"Phys. Chem. Chem. 
Phys."},{"key":"mlstad8f13bib110","doi-asserted-by":"publisher","DOI":"10.17172\/NOMAD\/2021.09.30-1","article-title":"The bigQM7\u03c9 dataset","author":"Kayastha","year":"2022"},{"key":"mlstad8f13bib111","doi-asserted-by":"publisher","first-page":"930","DOI":"10.1093\/nar\/gky1075","article-title":"ChEMBL: towards direct deposition of bioassay data","volume":"47","author":"Mendez","year":"2019","journal-title":"Nucl. 
Acids Res."},{"key":"mlstad8f13bib112","doi-asserted-by":"publisher","first-page":"1989","DOI":"10.1021\/acs.jctc.7b00118","article-title":"A robust and accurate tight-binding quantum chemical method for structures, vibrational frequencies and noncovalent interactions of large molecular systems parametrized for all spd-block elements (z = 1\u201386)","volume":"13","author":"Grimme","year":"2017","journal-title":"J. Chem. Theory Comput."},{"key":"mlstad8f13bib113","doi-asserted-by":"publisher","first-page":"1652","DOI":"10.1021\/acs.jctc.8b01176","article-title":"GFN2-xTB\u2014an accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions","volume":"15","author":"Bannwarth","year":"2019","journal-title":"J. Chem. Theory Comput."},{"key":"mlstad8f13bib114","doi-asserted-by":"publisher","DOI":"10.3929\/ethz-b-000482129","article-title":"QMugs, quantum mechanical properties of drug-like molecules","author":"Isert","year":"2022"},{"key":"mlstad8f13bib115","doi-asserted-by":"publisher","first-page":"1180","DOI":"10.1093\/nar\/gkad1004","article-title":"The ChEMBL database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods","volume":"52","author":"Zdrazil","year":"2024","journal-title":"Nucl. Acids Res."},{"key":"mlstad8f13bib116","doi-asserted-by":"publisher","DOI":"10.1002\/qua.26381","article-title":"Assessing conformer energies using electronic structure and machine learning methods","volume":"121","author":"Folmsbee","year":"2021","journal-title":"Int. J. Quantum Chem."},{"key":"mlstad8f13bib117","doi-asserted-by":"publisher","first-page":"1985","DOI":"10.1039\/B600027D","article-title":"Benchmark database of accurate (MP2 and CCSD(T) complete basis set limit) interaction energies of small model complexes, dna base pairs and amino acid pairs","volume":"8","author":"Jure\u010dka","year":"2006","journal-title":"Phys. Chem.
Chem. Phys."},{"key":"mlstad8f13bib118","doi-asserted-by":"publisher","DOI":"10.1063\/1.5001028","article-title":"The BioFragment Database (BFDb): an open-data platform for computational chemistry analysis of noncovalent interactions","volume":"147","author":"Burns","year":"2017","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib119","doi-asserted-by":"publisher","DOI":"10.6084\/m9.Figshare.14883867","article-title":"OrbNet denali training data","author":"Christensen","year":"2021"},{"key":"mlstad8f13bib120","doi-asserted-by":"publisher","first-page":"5029","DOI":"10.1063\/1.478401","article-title":"Assessment of the Perdew\u2013Burke\u2013Ernzerhof exchange-correlation functional","volume":"110","author":"Ernzerhof","year":"1999","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib121","article-title":"Original MD17","author":"Chmiela","year":"2017"},{"key":"mlstad8f13bib122","doi-asserted-by":"publisher","DOI":"10.6084\/m9.Figshare.12672038","article-title":"Original MD17","author":"Christensen","year":"2017"},{"key":"mlstad8f13bib123","article-title":"Fast graph representation learning with PyTorch geometric","author":"Fey","year":"2019"},{"key":"mlstad8f13bib124","doi-asserted-by":"publisher","first-page":"1370","DOI":"10.1002\/wcms.1370","article-title":"Nonadiabatic dynamics: the SHARC approach","volume":"8","author":"Mai","year":"2018","journal-title":"Wiley Interdiscip. Rev. Comput. Mol. Sci."},{"key":"mlstad8f13bib125","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1016\/0301-0104(80)80045-0","article-title":"A complete active space SCF method (CASSCF) using a density matrix formulated super-CI approach","volume":"48","author":"Roos","year":"1980","journal-title":"Chem. Phys."},{"key":"mlstad8f13bib126","doi-asserted-by":"publisher","first-page":"5925","DOI":"10.1021\/acs.jctc.9b00532","article-title":"OpenMolcas: from source code to insight","volume":"15","author":"Fdez. Galv\u00e1n","year":"2019","journal-title":"J. Chem.
Theory Comput."},{"key":"mlstad8f13bib127","doi-asserted-by":"publisher","first-page":"215","DOI":"10.1007\/s00214-007-0310-x","article-title":"The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states and transition elements: two new functionals and systematic testing of four M06-class functionals and 12 other functionals","volume":"120","author":"Zhao","year":"2008","journal-title":"Theor. Chem. Acc."},{"key":"mlstad8f13bib128","doi-asserted-by":"publisher","DOI":"10.1088\/1361-648X\/aa680e","article-title":"The atomic simulation environment\u2014a python library for working with atoms","volume":"29","author":"Hjorth Larsen","year":"2017","journal-title":"J. Condens. Matter Phys."},{"key":"mlstad8f13bib129","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.10393859","article-title":"Beyond MD17: the reactive xxMD dataset","author":"Pengmei","year":"2024"},{"key":"mlstad8f13bib130","article-title":"MD22","author":"Chmiela","year":"2023"},{"key":"mlstad8f13bib131","doi-asserted-by":"publisher","first-page":"894","DOI":"10.1007\/s10825-015-0737-6","article-title":"Comparing Wigner, Husimi and Bohmian distributions: which one is a true probability distribution in phase space?","volume":"14","author":"Colom\u00e9s","year":"2015","journal-title":"J. Comput. Electron."},{"key":"mlstad8f13bib132","doi-asserted-by":"publisher","DOI":"10.1063\/1.5090303","article-title":"Geodesic interpolation for reaction pathways","volume":"150","author":"Zhu","year":"2019","journal-title":"J. Chem. 
Phys."},{"key":"mlstad8f13bib133","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.7032334","article-title":"The WS22 database","author":"Pinheiro","year":"2023"},{"key":"mlstad8f13bib134","doi-asserted-by":"publisher","first-page":"242","DOI":"10.1002\/wcms.82","article-title":"Molpro: a general-purpose quantum chemistry program package","volume":"2","author":"Werner","year":"2012","journal-title":"Wiley Interdiscip. Rev. Comput. Mol. Sci."},{"key":"mlstad8f13bib135","doi-asserted-by":"publisher","DOI":"10.1063\/5.0005081","article-title":"The Molpro quantum chemistry package","volume":"152","author":"Werner","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib136","doi-asserted-by":"publisher","DOI":"10.1063\/5.0004837","article-title":"Coupled-cluster techniques for computational chemistry: the CFOUR program package","volume":"152","author":"Matthews","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib137","doi-asserted-by":"publisher","DOI":"10.6084\/m9.figshare.1690328879","article-title":"VIB5 database","author":"Zhang","year":"2022"},{"key":"mlstad8f13bib138","doi-asserted-by":"publisher","first-page":"3192","DOI":"10.1039\/C6SC05720A","article-title":"ANI-1: an extensible neural network potential with dft accuracy at force field computational cost","volume":"8","author":"Smith","year":"2017","journal-title":"Chem. Sci."},{"key":"mlstad8f13bib139","doi-asserted-by":"publisher","DOI":"10.6084\/m9.figshare.5287732.v1","article-title":"ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules","author":"Smith","year":"2017"},{"key":"mlstad8f13bib140","doi-asserted-by":"publisher","DOI":"10.1063\/1.5023802","article-title":"Less is more: sampling chemical space with active learning","volume":"148","author":"Smith","year":"2018","journal-title":"J. Chem.
Phys."},{"key":"mlstad8f13bib141","doi-asserted-by":"publisher","first-page":"2903","DOI":"10.1038\/s41467-019-10827-4","article-title":"Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning","volume":"10","author":"Smith","year":"2019","journal-title":"Nat. Commun."},{"key":"mlstad8f13bib142","doi-asserted-by":"publisher","DOI":"10.6084\/m9.figshare.c.4712477","article-title":"The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules","author":"Smith","year":"2020"},{"key":"mlstad8f13bib143","doi-asserted-by":"publisher","DOI":"10.1063\/5.0004608","article-title":"The ORCA quantum chemistry program package","volume":"152","author":"Neese","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib144","doi-asserted-by":"publisher","first-page":"20905","DOI":"10.1039\/C6CP00688D","article-title":"The S66x8 benchmark for noncovalent interactions revisited: explicitly correlated ab initio methods and density functional theory","volume":"18","author":"Brauer","year":"2016","journal-title":"Phys. Chem. Chem. Phys."},{"key":"mlstad8f13bib145","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.10108942","article-title":"ANI-2 data set","author":"Devereux","year":"2020"},{"key":"mlstad8f13bib146","doi-asserted-by":"publisher","DOI":"10.1063\/1.2841941","article-title":"Optimization methods for finding minimum energy paths","volume":"128","author":"Sheppard","year":"2008","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib147","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1038\/s41597-020-0460-4","article-title":"Reactants, products and transition states of elementary chemical reactions based on quantum chemistry","volume":"7","author":"Grambow","year":"2020","journal-title":"Sci.
Data"},{"key":"mlstad8f13bib148","doi-asserted-by":"publisher","DOI":"10.1063\/1.2834918","article-title":"Systematic optimization of long-range corrected hybrid density functionals","volume":"128","author":"Chai","year":"2008","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib149","doi-asserted-by":"publisher","first-page":"9901","DOI":"10.1063\/1.1329672","article-title":"A climbing image nudged elastic band method for finding saddle points and minimum energy paths","volume":"113","author":"Henkelman","year":"2000","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib150","doi-asserted-by":"publisher","DOI":"10.1063\/1.4878664","article-title":"Improved initial guess for minimum energy path calculations","volume":"140","author":"Smidstrup","year":"2014","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib151","doi-asserted-by":"publisher","DOI":"10.6084\/m9.figshare.19614657.v4","article-title":"Transition1x-a dataset for building generalizable reactive machine learning potentials","author":"Schreiner","year":"2022"},{"key":"mlstad8f13bib152","article-title":"QM-sym-database","author":"Liang","year":"2019"},{"key":"mlstad8f13bib153","doi-asserted-by":"publisher","DOI":"10.6084\/m9.Figshare.9638093","article-title":"QM-sym-database","author":"Liang","year":"2019"},{"key":"mlstad8f13bib154","doi-asserted-by":"publisher","DOI":"10.6084\/m9.Figshare.12815276","article-title":"QM-symex-database","author":"Liang","year":"2020"},{"key":"mlstad8f13bib155","doi-asserted-by":"publisher","first-page":"25853","DOI":"10.1039\/D2CP03966D","article-title":"nablaDFT: large-scale conformational energy and Hamiltonian prediction benchmark and dataset","volume":"24","author":"Khrabrov","year":"2022","journal-title":"Phys. Chem. Chem.
Phys."},{"key":"mlstad8f13bib156","doi-asserted-by":"publisher","DOI":"10.3389\/fphar.2020.565644","article-title":"Molecular sets (MOSES): a benchmarking platform for molecular generation models","volume":"11","author":"Polykovskiy","year":"2020","journal-title":"Front. Pharmacol."},{"key":"mlstad8f13bib157","doi-asserted-by":"publisher","first-page":"2887","DOI":"10.1021\/jm9602928","article-title":"The properties of known drugs. 1. Molecular frameworks","volume":"39","author":"Bemis","year":"1996","journal-title":"J. Med. Chem."},{"key":"mlstad8f13bib158","doi-asserted-by":"publisher","first-page":"1503","DOI":"10.1002\/cmdc.200800178","article-title":"On the art of compiling and using \u2018drug-like\u2019 chemical fragment spaces","volume":"3","author":"Degen","year":"2008","journal-title":"ChemMedChem"},{"key":"mlstad8f13bib159","doi-asserted-by":"publisher","first-page":"644","DOI":"10.1021\/ci00010a010","article-title":"Clustering of chemical structures on the basis of two-dimensional similarity measures","volume":"32","author":"Barnard","year":"1992","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"mlstad8f13bib160","first-page":"533","article-title":"CaGe - a virtual environment for studying some special classes of plane graphs - an update","volume":"63","author":"Brinkmann","year":"2010","journal-title":"MATCH Commun. Math. Comput. Chem."},{"key":"mlstad8f13bib161","article-title":"The COMPAS project","author":"Wahab","year":"2022"},{"key":"mlstad8f13bib162","doi-asserted-by":"publisher","first-page":"2240","DOI":"10.1021\/acs.jcim.2c01573","article-title":"CycPeptMPDB: a comprehensive database of membrane permeability of cyclic peptides","volume":"63","author":"Li","year":"2023","journal-title":"J. Chem. Inf. 
Model."},{"key":"mlstad8f13bib163","doi-asserted-by":"publisher","first-page":"2562","DOI":"10.1021\/acs.jcim.5b00654","article-title":"Better informed distance geometry: using what we know to improve conformation generation","volume":"55","author":"Riniker","year":"2015","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib164","doi-asserted-by":"publisher","first-page":"7169","DOI":"10.1039\/C9CP06869D","article-title":"Automated exploration of the low-energy chemical space with fast quantum chemical methods","volume":"22","author":"Pracht","year":"2020","journal-title":"Phys. Chem. Chem. Phys."},{"key":"mlstad8f13bib165","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.10798261","article-title":"CREMP data sets","author":"Grambow","year":"2024"},{"key":"mlstad8f13bib166","doi-asserted-by":"publisher","first-page":"513","DOI":"10.1039\/C7SC02664A","article-title":"Moleculenet: a benchmark for molecular machine learning","volume":"9","author":"Wu","year":"2018","journal-title":"Chem. Sci."},{"key":"mlstad8f13bib167","doi-asserted-by":"publisher","first-page":"4039","DOI":"10.1021\/acs.jpca.1c00971","article-title":"Efficient quantum chemical calculation of structure ensembles and free energies for nonrigid molecules","volume":"125","author":"Grimme","year":"2021","journal-title":"J. Phys. Chem. A"},{"key":"mlstad8f13bib168","article-title":"GEOM on GitHub","author":"Axelrod","year":"2022"},{"key":"mlstad8f13bib169","first-page":"171","article-title":"The Cambridge structural database","volume":"72","author":"Groom","year":"2016","journal-title":"Struct. Sci."},{"key":"mlstad8f13bib170","doi-asserted-by":"publisher","first-page":"618","DOI":"10.1039\/D2DD00129B","article-title":"Deep learning metal complex properties with natural quantum graphs","volume":"2","author":"Kneiding","year":"2023","journal-title":"Digit.
Discov."},{"key":"mlstad8f13bib171","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1038\/s43588-024-00616-5","article-title":"Directional multiobjective optimization of metal complexes at the billion-system scale","volume":"4","author":"Kneiding","year":"2024","journal-title":"Nat. Comput. Sci."},{"key":"mlstad8f13bib172","article-title":"The OFF-ON database","author":"C\u00e9lerse","year":"2024"},{"key":"mlstad8f13bib173","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1186\/s13321-022-00604-9","article-title":"Surge: a fast open-source chemical graph generator","volume":"14","author":"McKay","year":"2022","journal-title":"J. Cheminf."},{"key":"mlstad8f13bib174","doi-asserted-by":"publisher","DOI":"10.1063\/5.0004860","article-title":"QMCPACK: advances in the development, efficiency and application of auxiliary field and real-space variational and diffusion quantum Monte Carlo","volume":"152","author":"Kent","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib175","doi-asserted-by":"publisher","first-page":"1456","DOI":"10.1002\/jcc.21759","article-title":"Effect of the damping function in dispersion corrected density functional theory","volume":"32","author":"Grimme","year":"2011","journal-title":"J. Comput. Chem."},{"key":"mlstad8f13bib176","doi-asserted-by":"publisher","first-page":"1123","DOI":"10.1021\/ed100697w","article-title":"ChemSpider: an online chemical information resource","volume":"87","author":"Pence","year":"2010","journal-title":"J. Chem. Educ."},{"key":"mlstad8f13bib177","doi-asserted-by":"publisher","DOI":"10.1063\/1.5020067","article-title":"Metadynamics for training neural network model chemistries: a competitive assessment","volume":"148","author":"Herr","year":"2018","journal-title":"J. Chem. 
Phys."},{"key":"mlstad8f13bib178","doi-asserted-by":"publisher","first-page":"184","DOI":"10.1080\/00268976.2014.952696","article-title":"Advances in molecular quantum chemistry contained in the q-chem 4 program package","volume":"113","author":"Shao","year":"2015","journal-title":"Mol. Phys."},{"key":"mlstad8f13bib179","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1021\/ci500593j","article-title":"Managing the computational chemistry big data problem: the ioChem-BD platform","volume":"55","author":"Alvarez-Moreno","year":"2015","journal-title":"J. Chem. Inf. Model."},{"key":"mlstad8f13bib180","doi-asserted-by":"publisher","DOI":"10.1063\/5.0059356","article-title":"Quantum Chemistry Common Driver and Databases (QCDB) and Quantum Chemistry Engine (QCEngine): automation and interoperability among computational chemistry programs","volume":"155","author":"Smith","year":"2021","journal-title":"J. Chem. Phys."},{"key":"mlstad8f13bib181","doi-asserted-by":"publisher","first-page":"1193","DOI":"10.1021\/acs.jctc.3c01203","article-title":"MLatom 3: a platform for machine learning-enhanced computational chemistry simulations and workflows","volume":"20","author":"Dral","year":"2024","journal-title":"J. Chem. 
Theory Comput."},{"key":"mlstad8f13bib182","doi-asserted-by":"publisher","DOI":"10.26434\/chemrxiv-2024-ng3ws","article-title":"All-in-one foundational models learning across quantum chemical levels","author":"Chen","year":"2024"},{"key":"mlstad8f13bib183","doi-asserted-by":"publisher","DOI":"10.26434\/chemrxiv-2024-604wb","article-title":"Universal and updatable artificial intelligence-enhanced quantum chemical foundational models","author":"Chen","year":"2024"},{"key":"mlstad8f13bib184","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/978-3-319-42913-7_60-1","author":"Jain","year":"2018"}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13\/pdf","c
ontent-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T11:57:51Z","timestamp":1732017471000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad8f13"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,19]]},"references-count":184,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,11,19]]},"published-print":{"date-parts":[[2024,12,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/ad8f13","relation":{"has-preprint":[{"id-type":"doi","id":"10.26434\/chemrxiv-2024-w3ld0-v2","asserted-by":"object"},{"id-type":"doi","id":"10.26434\/chemrxiv-2024-w3ld0","asserted-by":"object"}]},"ISSN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,19]]},"assertion":[{"value":"Molecular quantum chemical data sets and databases for machine learning potentials","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2024 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2024-08-21","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-11-05","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-11-19","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}