{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T00:37:24Z","timestamp":1768005444671,"version":"3.49.0"},"reference-count":135,"publisher":"IOP Publishing","issue":"4","license":[{"start":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T00:00:00Z","timestamp":1672272000000},"content-version":"vor","delay-in-days":28,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T00:00:00Z","timestamp":1672272000000},"content-version":"tdm","delay-in-days":28,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"name":"Department of Science and Technology, India","award":["DST\/NSM\/ R&D_HPC_Applications\/2021\/05"],"award-info":[{"award-number":["DST\/NSM\/ R&D_HPC_Applications\/2021\/05"]}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2022,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Shear viscosity, though being a fundamental property of all fluids, is computationally expensive to calculate from equilibrium molecular dynamics simulations. Recently, machine learning (ML) methods have been used to augment molecular simulations in many contexts, thus showing promise to estimate viscosity too in a relatively inexpensive manner. However, ML methods face significant challenges\u2014such as overfitting, when the size of the data set is small, as is the case with viscosity. In this work, we train seven ML models to predict the shear viscosity of a Lennard\u2013Jones fluid, with particular emphasis on addressing issues arising from a small data set. Specifically, the issues related to model selection, performance estimation and uncertainty quantification were investigated. 
First, we show that the widely used performance estimation procedure of using a single unseen data set shows a wide variability in estimating the errors on small data sets. In this context, the common practice of using cross validation (CV) to select the hyperparameters (model selection) can be adapted to estimate the generalization error (performance estimation) as well. We compare two simple CV procedures for their ability to do both model selection and performance estimation, and find that the k-fold CV based procedure shows a lower variance of error estimates. Also, these CV procedures naturally lead to an ensemble of trained ML models. We discuss the role of performance metrics in training and evaluation and propose a method to rank the ML models based on multiple metrics. Finally, two methods for uncertainty quantification\u2014Gaussian process regression (GPR) and ensemble method\u2014were used to estimate the uncertainty on individual predictions. The uncertainty estimates from GPR were also used to construct an applicability domain using which the ML models provided even more reliable predictions on an independent viscosity data set generated in this work. 
Overall, the procedures prescribed in this work, together, lead to robust ML models for small data sets.<\/jats:p>","DOI":"10.1088\/2632-2153\/acac01","type":"journal-article","created":{"date-parts":[[2022,12,15]],"date-time":"2022-12-15T22:32:15Z","timestamp":1671143535000},"page":"045032","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Building robust machine learning models for small chemical science data: the case of shear viscosity of fluids"],"prefix":"10.1088","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6467-6040","authenticated-orcid":true,"given":"Nikhil V S","family":"Avula","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4671-9646","authenticated-orcid":false,"given":"Shivanand Kumar","family":"Veesam","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1025-0639","authenticated-orcid":false,"given":"Sudarshan","family":"Behera","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3355-6764","authenticated-orcid":false,"given":"Sundaram","family":"Balasubramanian","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"266","published-online":{"date-parts":[[2022,12,29]]},"reference":[{"key":"mlstacac01bib1","author":"March","year":"2002"},{"key":"mlstacac01bib2","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.106.115703","article-title":"Viscosity, shear waves and atomic-level stress-stress correlations","volume":"106","author":"Levashov","year":"2011","journal-title":"Phys. Rev. 
Lett."},{"key":"mlstacac01bib3","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1016\/j.jpgl.2008.03.038","article-title":"Viscosity of magmatic liquids: a model","volume":"271","author":"Giordano","year":"2008","journal-title":"Earth Planet. Sci. Lett."},{"key":"mlstacac01bib4","doi-asserted-by":"publisher","first-page":"805","DOI":"10.1038\/33905","article-title":"The viscosity of liquid iron at the physical conditions of the Earth\u2019s core","volume":"392","author":"de Wijs","year":"1998","journal-title":"Nature"},{"key":"mlstacac01bib5","first-page":"pp 91","article-title":"2.05 mineralogy of the Earth\u2014the Earth\u2019s core: iron and iron alloys","author":"Vo\u010dadlo","year":"2007"},{"key":"mlstacac01bib6","first-page":"pp 218","article-title":"Viscosity of the outer core","author":"Secco","year":"1995"},{"key":"mlstacac01bib7","doi-asserted-by":"publisher","DOI":"10.1126\/sciadv.1700399","article-title":"Identifying time scales for violation\/preservation of Stokes\u2013Einstein relation in supercooled water","volume":"3","author":"Kawasaki","year":"2017","journal-title":"Sci. Adv."},{"key":"mlstacac01bib8","doi-asserted-by":"publisher","first-page":"4070","DOI":"10.1073\/pnas.1815943116","article-title":"Probing the link between residual entropy and viscosity of molecular fluids and model potentials","volume":"116","author":"Bell","year":"2019","journal-title":"Proc. Natl Acad. Sci."},{"key":"mlstacac01bib9","doi-asserted-by":"publisher","first-page":"4300","DOI":"10.1038\/s41467-020-17948-1","article-title":"Excess-entropy scaling in supercooled binary mixtures","volume":"11","author":"Bell","year":"2020","journal-title":"Nat. Commun."},{"key":"mlstacac01bib10","doi-asserted-by":"publisher","first-page":"6411","DOI":"10.1021\/acs.jpclett.1c01594","article-title":"Dynamic crossover in fluids: from hard spheres to molecules","volume":"12","author":"Bell","year":"2021","journal-title":"J. Phys. Chem. 
Lett."},{"key":"mlstacac01bib11","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.129.074503","article-title":"Microscopic origins of the viscosity of a Lennard-Jones liquid","volume":"129","author":"Rizk","year":"2022","journal-title":"Phys. Rev. Lett."},{"key":"mlstacac01bib12","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1016\/j.fuel.2018.01.002","article-title":"Viscosity models for pure hydrocarbons at extreme conditions: a review and comparative study","volume":"218","author":"Baled","year":"2018","journal-title":"Fuel"},{"key":"mlstacac01bib13","doi-asserted-by":"publisher","first-page":"4987","DOI":"10.1021\/acs.iecr.0c05356","article-title":"Industrial requirements for thermodynamic and transport properties: 2020","volume":"60","author":"Kontogeorgis","year":"2021","journal-title":"Ind. Eng. Chem. Res."},{"key":"mlstacac01bib14","doi-asserted-by":"publisher","first-page":"6324","DOI":"10.33011\/livecoms.1.1.6324","article-title":"Best practices for computing transport properties 1. Self-diffusivity and viscosity from equilibrium molecular dynamics","volume":"1","author":"Maginn","year":"2019","journal-title":"Living J. Comput. Mol. Sci."},{"key":"mlstacac01bib15","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1063\/1.1421362","article-title":"Determining the shear viscosity of model liquids from molecular dynamics simulations","volume":"116","author":"Hess","year":"2002","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib16","doi-asserted-by":"publisher","first-page":"5161","DOI":"10.1103\/PhysRevLett.81.5161","article-title":"First-principles calculation of transport coefficients","volume":"81","author":"Alf\u00e8","year":"1998","journal-title":"Phys. Rev. 
Lett."},{"key":"mlstacac01bib17","doi-asserted-by":"publisher","first-page":"5959","DOI":"10.1021\/acs.jctc.8b00625","article-title":"Shear viscosity computed from the finite-size effects of self-diffusivity in equilibrium molecular dynamics","volume":"14","author":"Jamali","year":"2018","journal-title":"J. Chem. Theory Comput."},{"key":"mlstacac01bib18","doi-asserted-by":"publisher","DOI":"10.1063\/5.0062081","article-title":"Atomic transport properties of liquid iron at conditions of planetary cores","volume":"155","author":"Li","year":"2021","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib19","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1038\/s41524-022-00830-7","article-title":"Viscosity in water from first-principles and deep-neural-network simulations","volume":"8","author":"Malosso","year":"2022","journal-title":"npj Comput. Mater."},{"key":"mlstacac01bib20","doi-asserted-by":"publisher","DOI":"10.1088\/0953-8984\/24\/28\/284117","article-title":"Diffusion coefficient and shear viscosity of rigid water models","volume":"24","author":"Tazi","year":"2012","journal-title":"J. Phys.: Condens. Matter"},{"key":"mlstacac01bib21","doi-asserted-by":"publisher","DOI":"10.1063\/5.0023225","article-title":"Comparison of fixed charge and polarizable models for predicting the structural, thermodynamic and transport properties of molten alkali chlorides","volume":"153","author":"Wang","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib22","doi-asserted-by":"publisher","first-page":"11772","DOI":"10.1073\/pnas.1101210108","article-title":"Predicting human blood viscosity in silico","volume":"108","author":"Fedosov","year":"2011","journal-title":"Proc. Natl Acad. 
Sci."},{"key":"mlstacac01bib23","doi-asserted-by":"publisher","first-page":"3537","DOI":"10.1021\/acs.jctc.5b00351","article-title":"Reliable viscosity calculation from equilibrium molecular dynamics simulations: a time decomposition method","volume":"11","author":"Zhang","year":"2015","journal-title":"J. Chem. Theory Comput."},{"key":"mlstacac01bib24","doi-asserted-by":"publisher","first-page":"4894","DOI":"10.1103\/PhysRevE.59.4894","article-title":"Reversing the perturbation in nonequilibrium molecular dynamics: an easy way to calculate the shear viscosity of fluids","volume":"59","author":"M\u00fcller-Plathe","year":"1999","journal-title":"Phys. Rev. E"},{"key":"mlstacac01bib25","doi-asserted-by":"publisher","first-page":"349","DOI":"10.1007\/s40544-018-0207-9","article-title":"Advances in nonequilibrium molecular dynamics simulations of lubricants and additives","volume":"6","author":"Ewen","year":"2018","journal-title":"Friction"},{"key":"mlstacac01bib26","doi-asserted-by":"publisher","DOI":"10.1063\/1.5027681","article-title":"Incremental viscosity by non-equilibrium molecular dynamics and the Eyring model","volume":"148","author":"Heyes","year":"2018","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib27","doi-asserted-by":"publisher","first-page":"6604","DOI":"10.1021\/jp0456584","article-title":"Alternative view of self-diffusion and shear viscosity","volume":"109","author":"Stillinger","year":"2005","journal-title":"J. Phys. Chem. B"},{"key":"mlstacac01bib28","doi-asserted-by":"publisher","DOI":"10.1063\/1.3700344","article-title":"Adaptive Green-Kubo estimates of transport coefficients from molecular dynamics based on robust error analysis","volume":"136","author":"Jones","year":"2012","journal-title":"J. Chem. 
Phys."},{"key":"mlstacac01bib29","doi-asserted-by":"publisher","first-page":"485","DOI":"10.1016\/j.jcp.2015.09.021","article-title":"Quantification of sampling uncertainty for molecular dynamics simulation: time-dependent diffusion coefficient in simple fluids","volume":"302","author":"Kim","year":"2015","journal-title":"J. Comput. Phys."},{"key":"mlstacac01bib30","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.95.023308","article-title":"Method to manage integration error in the Green-Kubo method","volume":"95","author":"Oliveira","year":"2017","journal-title":"Phys. Rev. E"},{"key":"mlstacac01bib31","doi-asserted-by":"publisher","DOI":"10.1063\/1.5095501","article-title":"Shear stress relaxation and diffusion in simple liquids by molecular dynamics simulations: analytic expressions and paths to viscosity","volume":"150","author":"Heyes","year":"2019","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib32","doi-asserted-by":"publisher","DOI":"10.1063\/5.0005600","article-title":"Single trajectory transport coefficients and the energy landscape by molecular dynamics simulations","volume":"152","author":"Heyes","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib33","doi-asserted-by":"publisher","DOI":"10.1063\/5.0040106","article-title":"Viscuit and the fluctuation theorem investigation of shear viscosity by molecular dynamics simulations: the information and the noise","volume":"154","author":"Heyes","year":"2021","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib34","doi-asserted-by":"publisher","DOI":"10.1063\/5.0083228","article-title":"Intrinsic viscuit probability distribution functions for transport coefficients of liquids and solids","volume":"156","author":"Heyes","year":"2022","journal-title":"J. Chem. 
Phys."},{"key":"mlstacac01bib35","doi-asserted-by":"publisher","first-page":"4274","DOI":"10.1021\/acs.jctc.1c00268","article-title":"Efficient parametrization of force field for the quantitative prediction of the physical properties of ionic liquid electrolytes","volume":"17","author":"Avula","year":"2021","journal-title":"J. Chem. Theory Comput."},{"key":"mlstacac01bib36","doi-asserted-by":"publisher","DOI":"10.1016\/j.fluid.2021.113100","article-title":"Predicting shear viscosity of 1,1-diphenylethane at high pressures by molecular dynamics methods","volume":"544\u2013545","author":"Kondratyuk","year":"2021","journal-title":"Fluid Phase Equilib."},{"key":"mlstacac01bib37","doi-asserted-by":"publisher","first-page":"1606","DOI":"10.1021\/acs.jctc.0c01002","article-title":"Extension of the CL&Pol polarizable force field to electrolytes, protic ionic liquids and deep eutectic solvents","volume":"17","author":"Goloviznina","year":"2021","journal-title":"J. Chem. Theory Comput."},{"key":"mlstacac01bib38","doi-asserted-by":"publisher","first-page":"3718","DOI":"10.1021\/acs.jced.9b00050","article-title":"Extension of team force-field database to ionic liquids","volume":"64","author":"Gong","year":"2019","journal-title":"J. Chem. Eng. Data"},{"key":"mlstacac01bib39","doi-asserted-by":"publisher","DOI":"10.1063\/1.2219114","article-title":"Optimization of the anisotropic united atoms intermolecular potential for n-alkanes: improvement of transport properties","volume":"125","author":"Nieto-Draghi","year":"2006","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib40","doi-asserted-by":"publisher","DOI":"10.1088\/1742-6596\/1385\/1\/012048","article-title":"Comparing different force fields by viscosity prediction for branched alkane at 0.1 and 400 MPA","volume":"1385","author":"Kondratyuk","year":"2019","journal-title":"J. Phys.: Conf. 
Ser."},{"key":"mlstacac01bib41","doi-asserted-by":"publisher","DOI":"10.1016\/j.molliq.2020.112663","article-title":"Thermophysical properties of simple molecular liquid mixtures: on the limitations of some force fields","volume":"303","author":"Hamani","year":"2020","journal-title":"J. Mol. Liq."},{"key":"mlstacac01bib42","doi-asserted-by":"publisher","DOI":"10.1063\/1.5035119","article-title":"Nature of intrinsic uncertainties in equilibrium molecular dynamics estimation of shear viscosity for simple and complex fluids","volume":"149","author":"Kim","year":"2018","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib43","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1016\/j.sbi.2019.12.016","article-title":"Machine learning approaches for analyzing and enhancing molecular dynamics simulations","volume":"61","author":"Wang","year":"2020","journal-title":"Curr. Opin. Struct. Biol."},{"key":"mlstacac01bib44","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1146\/annurev-physchem-042018-052331","article-title":"Machine learning for molecular simulation","volume":"71","author":"No\u00e9","year":"2020","journal-title":"Annu. Rev. Phys. Chem."},{"key":"mlstacac01bib45","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/abfd96","article-title":"Strategies for the construction of machine-learning potentials for accurate and efficient atomic-scale simulations","volume":"2","author":"Miksch","year":"2021","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstacac01bib46","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1007\/s12039-021-01995-2","article-title":"Artificial intelligence: machine learning for chemical sciences","volume":"134","author":"Karthikeyan","year":"2021","journal-title":"J. Chem. 
Sci."},{"key":"mlstacac01bib47","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2113533118","article-title":"Deep learning the slow modes for rare events sampling","volume":"118","author":"Bonati","year":"2021","journal-title":"Proc. Natl Acad. Sci."},{"key":"mlstacac01bib48","doi-asserted-by":"publisher","first-page":"2355","DOI":"10.1021\/acs.jctc.0c01343","article-title":"Torchmd: a deep learning framework for molecular simulations","volume":"17","author":"Doerr","year":"2021","journal-title":"J. Chem. Theory Comput."},{"key":"mlstacac01bib49","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ac6ec6","article-title":"High-fidelity molecular dynamics trajectory reconstruction with bi-directional neural networks","volume":"3","author":"Winkler","year":"2022","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstacac01bib50","doi-asserted-by":"publisher","DOI":"10.1063\/5.0011512","article-title":"Machine learning prediction of self-diffusion in Lennard-Jones fluids","volume":"153","author":"Allers","year":"2020","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib51","doi-asserted-by":"publisher","first-page":"10375","DOI":"10.1021\/acs.jpclett.0c03108","article-title":"Machine learning-based upscaling of finite-size molecular dynamics diffusion simulations for binary fluids","volume":"11","author":"Leverant","year":"2020","journal-title":"J. Phys. Chem. Lett."},{"key":"mlstacac01bib52","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1039\/C7ME00094D","article-title":"Statistical models are able to predict ionic liquid viscosity across a wide range of chemical functionalities and experimental conditions","volume":"3","author":"Beckner","year":"2018","journal-title":"Mol. Syst. Des. 
Eng."},{"key":"mlstacac01bib53","doi-asserted-by":"publisher","first-page":"6820","DOI":"10.1039\/D1SC01000J","article-title":"A review on machine learning algorithms for the ionic liquid chemical space","volume":"12","author":"Koutsoukos","year":"2021","journal-title":"Chem. Sci."},{"key":"mlstacac01bib54","doi-asserted-by":"publisher","first-page":"1451","DOI":"10.1007\/s11814-010-0512-0","article-title":"Viscosity of ionic liquids using the concept of mass connectivity and artificial neural networks","volume":"28","author":"Valderrama","year":"2011","journal-title":"Korean J. Chem. Eng."},{"key":"mlstacac01bib55","doi-asserted-by":"publisher","first-page":"1600","DOI":"10.1080\/00986445.2012.756396","article-title":"Representation of ionic liquid viscosity-temperature data by generalized correlations and an artificial neural network (ANN) model","volume":"200","author":"Dutt","year":"2013","journal-title":"Chem. Eng. Commun."},{"key":"mlstacac01bib56","doi-asserted-by":"publisher","first-page":"1311","DOI":"10.1021\/ci500206u","article-title":"Viscosity of ionic liquids: an extensive database and a new group contribution model based on a feed-forward artificial neural network","volume":"54","author":"Paduszy\u0144ski","year":"2014","journal-title":"J. Chem. Inf. Model."},{"key":"mlstacac01bib57","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1016\/j.molliq.2016.11.133","article-title":"Estimation of viscosities of pure ionic liquids using an artificial neural network based on only structural characteristics","volume":"227","author":"Fatehi","year":"2017","journal-title":"J. Mol. Liq."},{"key":"mlstacac01bib58","doi-asserted-by":"publisher","first-page":"452","DOI":"10.1016\/j.molliq.2017.04.019","article-title":"Prediction viscosity of ionic liquids using a hybrid LSSVM and group contribution method","volume":"236","author":"Baghban","year":"2017","journal-title":"J. Mol. 
Liq."},{"key":"mlstacac01bib59","doi-asserted-by":"publisher","DOI":"10.1063\/5.0089568","article-title":"Conductivity prediction model for ionic liquids using machine learning","volume":"156","author":"Datta","year":"2022","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib60","doi-asserted-by":"publisher","DOI":"10.1063\/5.0085592","article-title":"Machine learning investigation of viscosity and ionic conductivity of protic ionic liquids in water mixtures","volume":"156","author":"Duong","year":"2022","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib61","doi-asserted-by":"publisher","first-page":"146","DOI":"10.1016\/j.trechm.2020.12.004","article-title":"Metrics for benchmarking and uncertainty quantification: quality, applicability and best practices for machine learning in chemistry","volume":"3","author":"Vishwakarma","year":"2021","journal-title":"Trends Chem."},{"key":"mlstacac01bib62","author":"Bishop","year":"2006"},{"key":"mlstacac01bib63","author":"Goodfellow","year":"2016"},{"key":"mlstacac01bib64","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1214\/09-SS054","article-title":"A survey of cross-validation procedures for model selection","volume":"4","author":"Arlot","year":"2010","journal-title":"Stat. Surv."},{"key":"mlstacac01bib65","doi-asserted-by":"publisher","first-page":"2079","DOI":"10.5555\/1756006.1859921","article-title":"On over-fitting in model selection and subsequent selection bias in performance evaluation","volume":"11","author":"Cawley","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"mlstacac01bib66","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1016\/j.jeconom.2015.02.006","article-title":"Cross-validation for selecting a model selection procedure","volume":"187","author":"Zhang","year":"2015","journal-title":"J. 
Econ."},{"key":"mlstacac01bib67","doi-asserted-by":"publisher","author":"Burnham","year":"2003","DOI":"10.1007\/b97636"},{"key":"mlstacac01bib68","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1007\/s41664-018-0068-2","article-title":"On splitting training and validation set: a comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning","volume":"2","author":"Xu","year":"2018","journal-title":"J. Anal. Test."},{"key":"mlstacac01bib69","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1016\/0169-2070(92)90008-W","article-title":"Error measures for generalizing about forecasting methods: empirical comparisons","volume":"8","author":"Armstrong","year":"1992","journal-title":"Int. J. Forecast."},{"key":"mlstacac01bib70","doi-asserted-by":"publisher","first-page":"746","DOI":"10.1198\/jasa.2011.r10138","article-title":"Making and evaluating point forecasts","volume":"106","author":"Gneiting","year":"2011","journal-title":"J. Am. Stat. Assoc."},{"key":"mlstacac01bib71","doi-asserted-by":"publisher","first-page":"407","DOI":"10.1021\/ci600205g","article-title":"Accurate solubility prediction with error bars for electrolytes: a machine learning approach","volume":"47","author":"Schwaighofer","year":"2007","journal-title":"J. Chem. Inf. Model."},{"key":"mlstacac01bib72","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ab7e1a","article-title":"Methods for comparing uncertainty quantifications for material property predictions","volume":"1","author":"Tran","year":"2020","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstacac01bib73","doi-asserted-by":"publisher","first-page":"2697","DOI":"10.1021\/acs.jcim.9b00975","article-title":"Evaluating scalable uncertainty estimation methods for deep learning-based molecular property prediction","volume":"60","author":"Scalia","year":"2020","journal-title":"J. Chem. Inf. 
Model."},{"key":"mlstacac01bib74","doi-asserted-by":"publisher","DOI":"10.1063\/5.0036522","article-title":"Uncertainty estimation for molecular dynamics and sampling","volume":"154","author":"Imbalzano","year":"2021","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib75","doi-asserted-by":"publisher","first-page":"32431","DOI":"10.1021\/acsomega.1c03752","article-title":"Uncertainty prediction for machine learning models of material properties","volume":"6","author":"Tavazza","year":"2021","journal-title":"ACS Omega"},{"key":"mlstacac01bib76","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/abee59","article-title":"Efficient hyperparameter tuning for kernel ridge regression with Bayesian optimization","volume":"2","author":"Stuke","year":"2021","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstacac01bib77","doi-asserted-by":"publisher","first-page":"208","DOI":"10.1016\/j.ijforecast.2019.02.017","article-title":"Why the \u201cbest\u201d point forecast depends on the error or accuracy measure","volume":"36","author":"Kolassa","year":"2020","journal-title":"Int. J. Forecast."},{"key":"mlstacac01bib78","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1016\/j.ijforecast.2019.04.014","article-title":"The M4 competition: 100,000 time series and 61 forecasting methods","volume":"36","author":"Makridakis","year":"2020","journal-title":"Int. J. Forecast."},{"key":"mlstacac01bib79","doi-asserted-by":"publisher","first-page":"3770","DOI":"10.1021\/acs.jcim.0c00502","article-title":"Uncertainty quantification using neural networks for molecular property prediction","volume":"60","author":"Hirschfeld","year":"2020","journal-title":"J. Chem. Inf. Model."},{"key":"mlstacac01bib80","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1109\/4235.585893","article-title":"No free lunch theorems for optimization","volume":"1","author":"Wolpert","year":"1997","journal-title":"IEEE Trans. Evol. 
Comput."},{"key":"mlstacac01bib81","doi-asserted-by":"publisher","first-page":"3404","DOI":"10.1021\/ct400195d","article-title":"Assessment and validation of machine learning methods for predicting molecular atomization energies","volume":"9","author":"Hansen","year":"2013","journal-title":"J. Chem. Theory Comput."},{"key":"mlstacac01bib82","doi-asserted-by":"publisher","first-page":"2864","DOI":"10.1021\/ci300415d","article-title":"Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17","volume":"52","author":"Ruddigkeit","year":"2012","journal-title":"J. Chem. Inf. Model."},{"key":"mlstacac01bib83","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/abe347","article-title":"Revving up 13 C NMR shielding predictions across chemical space: benchmarks for atoms-in-molecules kernel machine learning with new data for 134 kilo molecules","volume":"2","author":"Gupta","year":"2021","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstacac01bib84","doi-asserted-by":"publisher","first-page":"2667","DOI":"10.1021\/acs.jctc.8b00170","article-title":"Finite-size effects of binary mutual diffusion coefficients from molecular dynamics","volume":"14","author":"Jamali","year":"2018","journal-title":"J. Chem. Theory Comput."},{"key":"mlstacac01bib85","doi-asserted-by":"publisher","first-page":"14722","DOI":"10.1021\/jacs.2c05302","article-title":"Machine learning yield prediction from NiCOlit, a small-size literature data set of nickel catalyzed C\u2013O couplings","volume":"144","author":"Schleinitz","year":"2022","journal-title":"J. Am. Chem. 
Soc."},{"key":"mlstacac01bib86","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1016\/j.neuroimage.2017.06.061","article-title":"Cross-validation failure: small sample sizes lead to large error bars","volume":"180","author":"Varoquaux","year":"2018","journal-title":"NeuroImage"},{"key":"mlstacac01bib87","doi-asserted-by":"publisher","first-page":"14396","DOI":"10.1039\/D1SC03564A","article-title":"Choosing the right molecular machine learning potential","volume":"12","author":"Pinheiro","year":"2021","journal-title":"Chem. Sci."},{"key":"mlstacac01bib88","doi-asserted-by":"publisher","first-page":"12990","DOI":"10.1021\/acs.jpcb.1c07092","article-title":"Using computationally-determined properties for machine learning prediction of self-diffusion coefficients in pure liquids","volume":"125","author":"Allers","year":"2021","journal-title":"J. Phys. Chem. B"},{"key":"mlstacac01bib89","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pone.0224365","article-title":"Machine learning algorithm validation with a limited sample size","volume":"14","author":"Vabalas","year":"2019","journal-title":"PLoS One"},{"key":"mlstacac01bib90","doi-asserted-by":"publisher","first-page":"1529","DOI":"10.1021\/ci400197w","article-title":"Modeling, informatics and the quest for reproducibility","volume":"53","author":"Walters","year":"2013","journal-title":"J. Chem. Inf. Model."},{"key":"mlstacac01bib91","doi-asserted-by":"publisher","first-page":"1132","DOI":"10.1038\/s41592-021-01256-7","article-title":"Reproducibility standards for machine learning in the life sciences","volume":"18","author":"Heil","year":"2021","journal-title":"Nat. 
Methods"},{"key":"mlstacac01bib92","article-title":"Leakage and the reproducibility crisis in ML-based science","author":"Kapoor","year":"2022"},{"key":"mlstacac01bib93","doi-asserted-by":"publisher","first-page":"181","DOI":"10.1109\/72.914517","article-title":"An introduction to kernel-based learning algorithms","volume":"12","author":"Muller","year":"2001","journal-title":"IEEE Trans. Neural Netw."},{"key":"mlstacac01bib94","doi-asserted-by":"publisher","first-page":"1929","DOI":"10.5555\/2627435.2670313","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"mlstacac01bib95","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1186\/1471-2105-7-91","article-title":"Bias in error estimation when using cross-validation for model selection","volume":"7","author":"Varma","year":"2006","journal-title":"BMC Bioinform."},{"key":"mlstacac01bib96","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1186\/1758-2946-6-10","article-title":"Cross-validation pitfalls when selecting and assessing regression and classification models","volume":"6","author":"Krstajic","year":"2014","journal-title":"J. Cheminform."},{"key":"mlstacac01bib97","doi-asserted-by":"publisher","first-page":"2958","DOI":"10.1109\/IJCNN.2006.246632","article-title":"Performance prediction challenge","author":"Guyon","year":"2006"},{"key":"mlstacac01bib98","doi-asserted-by":"publisher","first-page":"717","DOI":"10.1007\/s10822-019-00274-0","article-title":"Validating the validation: reanalyzing a large-scale comparison of deep learning and machine learning models for bioactivity prediction","volume":"34","author":"Robinson","year":"2020","journal-title":"J. Comput. Aided Mol. 
Des."},{"key":"mlstacac01bib99","doi-asserted-by":"publisher","first-page":"6345","DOI":"10.1021\/acs.jpcb.9b05808","article-title":"Modified entropy scaling of the transport properties of the Lennard-Jones fluid","volume":"123","author":"Bell","year":"2019","journal-title":"J. Phys. Chem. B"},{"key":"mlstacac01bib100","doi-asserted-by":"publisher","author":"Allen","year":"2017","DOI":"10.1093\/oso\/9780198803195.001.0001"},{"key":"mlstacac01bib101","doi-asserted-by":"publisher","first-page":"9327","DOI":"10.1063\/1.474002","article-title":"Viscosity calculations of n-alkanes by equilibrium molecular dynamics","volume":"106","author":"Mondello","year":"1997","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib102","doi-asserted-by":"publisher","first-page":"5677","DOI":"10.1103\/PhysRevB.37.5677","article-title":"Transport coefficients of Lennard-Jones fluids: a molecular-dynamics and effective-hard-sphere treatment","volume":"37","author":"Heyes","year":"1988","journal-title":"Phys. Rev. B"},{"key":"mlstacac01bib103","doi-asserted-by":"publisher","first-page":"1109","DOI":"10.1007\/BF02575252","article-title":"Diffusion and viscosity equations of state for a Lennard-Jones fluid obtained from molecular dynamics simulations","volume":"18","author":"Rowley","year":"1997","journal-title":"Int. J. Thermophys."},{"key":"mlstacac01bib104","doi-asserted-by":"publisher","first-page":"3671","DOI":"10.1063\/1.1770695","article-title":"Transport coefficients of the Lennard-Jones model fluid. I. Viscosity","volume":"121","author":"Meier","year":"2004","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib105","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.83.061202","article-title":"Calculation of the second self-diffusion and viscosity virial coefficients of Lennard-Jones fluid by equilibrium molecular dynamics simulations","volume":"83","author":"Oderji","year":"2011","journal-title":"Phys. Rev. 
E"},{"key":"mlstacac01bib106","doi-asserted-by":"publisher","DOI":"10.1063\/1.4758806","article-title":"Metastable Lennard-Jones fluids. I. Shear viscosity","volume":"137","author":"Baidakov","year":"2012","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib107","doi-asserted-by":"publisher","DOI":"10.1063\/1.5022058","article-title":"Communication: simple liquids\u2019 high-density viscosity","volume":"148","author":"Costigliola","year":"2018","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib108","doi-asserted-by":"publisher","first-page":"38","DOI":"10.1016\/j.fluid.2018.10.019","article-title":"Transport properties of the Lennard-Jones truncated and shifted fluid from non-equilibrium molecular dynamics simulations","volume":"482","author":"Lautenschlaeger","year":"2019","journal-title":"Fluid Phase Equilib."},{"key":"mlstacac01bib109","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1016\/j.fluid.2005.05.016","article-title":"Influence of the mass ratio on viscosity in Lennard\u2013Jones mixtures: the one-fluid model revisited using nonequilibrium molecular dynamics","volume":"234","author":"Galli\u00e9ro","year":"2005","journal-title":"Fluid Phase Equilib."},{"key":"mlstacac01bib110","doi-asserted-by":"publisher","DOI":"10.1063\/1.5034779","article-title":"Viscosity of Lennard-Jones mixtures: a systematic study and empirical law","volume":"148","author":"Meyer","year":"2018","journal-title":"J. Chem. 
Phys."},{"key":"mlstacac01bib111","doi-asserted-by":"publisher","DOI":"10.1016\/j.fluid.2022.113459","article-title":"Mass effect on viscosity of mixtures in entropy scaling framework: application to Lennard-Jones mixtures","volume":"558","author":"Viet","year":"2022","journal-title":"Fluid Phase Equilib."},{"key":"mlstacac01bib112","doi-asserted-by":"publisher","DOI":"10.1063\/1.5113751","article-title":"Density-dependent finite system-size effects in equilibrium molecular dynamics estimation of shear viscosity: hydrodynamic and configurational study","volume":"151","author":"Kim","year":"2019","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib113","doi-asserted-by":"publisher","first-page":"15873","DOI":"10.1021\/jp0477147","article-title":"System-size dependence of diffusion coefficients and viscosities from molecular dynamics simulations with periodic boundary conditions","volume":"108","author":"Yeh","year":"2004","journal-title":"J. Phys. Chem. B"},{"key":"mlstacac01bib114","doi-asserted-by":"publisher","DOI":"10.1063\/1.4748352","article-title":"Computational studies of ionic liquids: size does matter and time too","volume":"137","author":"Gabl","year":"2012","journal-title":"J. Chem. Phys."},{"key":"mlstacac01bib115","doi-asserted-by":"publisher","first-page":"11202","DOI":"10.1063\/1.1818675","article-title":"Cooperative effects, transport and entropy in simple liquids","volume":"121","author":"Petravic","year":"2004","journal-title":"J. Chem. 
Phys."},{"key":"mlstacac01bib116","author":"Tukey","year":"1977"},{"key":"mlstacac01bib117","author":"Brillinger","year":"2011"},{"key":"mlstacac01bib118","doi-asserted-by":"publisher","first-page":"770","DOI":"10.1136\/bmj.312.7033.770","article-title":"Statistics notes: transforming data","volume":"312","author":"Bland","year":"1996","journal-title":"BMJ"},{"key":"mlstacac01bib119","doi-asserted-by":"publisher","first-page":"1912","DOI":"10.1021\/ci049782w","article-title":"Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR","volume":"44","author":"Sheridan","year":"2004","journal-title":"J. Chem. Inf. Comput. Sci."},{"key":"mlstacac01bib120","doi-asserted-by":"publisher","first-page":"839","DOI":"10.1021\/ci0500381","article-title":"A stepwise approach for defining the applicability domain of SAR and QSAR models","volume":"45","author":"Dimitrov","year":"2005","journal-title":"J. Chem. Inf. Model."},{"key":"mlstacac01bib121","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1007\/s10822-007-9160-9","article-title":"Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules","volume":"21","author":"Schroeter","year":"2007","journal-title":"J. Comput. Aided Mol. Des."},{"key":"mlstacac01bib122","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1186\/1758-2946-2-2","article-title":"Estimation of the applicability domain of kernel-based machine learning models for virtual screening","volume":"2","author":"Fechner","year":"2010","journal-title":"J. Cheminform."},{"key":"mlstacac01bib123","doi-asserted-by":"publisher","first-page":"5542","DOI":"10.3390\/ijms21155542","article-title":"Comprehensive analysis of applicability domains of QSPR models for chemical reactions","volume":"21","author":"Rakhimbekova","year":"2020","journal-title":"Int. J. Mol. 
Sci."},{"key":"mlstacac01bib124","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"mlstacac01bib125","article-title":"TensorFlow: large-scale machine learning on heterogeneous systems","author":"Abadi","year":"2015"},{"key":"mlstacac01bib126","article-title":"Keras","author":"Chollet","year":"2015"},{"key":"mlstacac01bib127","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"mlstacac01bib128","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","article-title":"SciPy 1.0: fundamental algorithms for scientific computing in Python","volume":"17","author":"Virtanen","year":"2020","journal-title":"Nat. Methods"},{"key":"mlstacac01bib129","article-title":"pandas-dev\/pandas: Pandas 1.3.4","author":"","year":"2021"},{"key":"mlstacac01bib130","doi-asserted-by":"publisher","first-page":"10978","DOI":"10.1039\/C7CP00375G","article-title":"Addressing uncertainty in atomistic machine learning","volume":"19","author":"Peterson","year":"2017","journal-title":"Phys. Chem. Chem. Phys."},{"key":"mlstacac01bib131","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1038\/s41524-020-0283-z","article-title":"On-the-fly active learning of interpretable Bayesian force fields for atomistic rare events","volume":"6","author":"Vandermause","year":"2020","journal-title":"npj Comput. 
Mater."},{"key":"mlstacac01bib132","article-title":"Uncertainty-aware molecular dynamics from Bayesian active learning: phase transformations and thermal transport in SiC","author":"Xie","year":"2022"},{"key":"mlstacac01bib133","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ac3de0","article-title":"Deeptime: a Python library for machine learning dynamical models from time series data","volume":"3","author":"Hoffmann","year":"2021","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstacac01bib134","doi-asserted-by":"publisher","first-page":"5595","DOI":"10.1021\/acs.jpcb.2c04498","article-title":"Correction to \u201cmodified entropy scaling of the transport properties of the Lennard-Jones fluid\u201d","volume":"126","author":"Bell","year":"2022","journal-title":"J. Phys. Chem. B"},{"key":"mlstacac01bib135","doi-asserted-by":"publisher","first-page":"422","DOI":"10.1038\/s42254-021-00314-5","article-title":"Physics-informed machine learning","volume":"3","author":"Karniadakis","year":"2021","journal-title":"Nat. Rev. 
Phys."}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T11:29:03Z","timestamp":1672313343000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acac01"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,1]]},"references-count":135,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2022,12,29]]},"published-print":{"date-parts":[[2022,12,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/acac01","relation":{},"IS
SN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,1]]},"assertion":[{"value":"Building robust machine learning models for small chemical science data: the case of shear viscosity of fluids","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2022 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2022-09-02","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2022-12-15","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2022-12-29","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}