{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T13:38:56Z","timestamp":1768916336033,"version":"3.49.0"},"reference-count":74,"publisher":"IOP Publishing","issue":"1","license":[{"start":{"date-parts":[[2021,11,24]],"date-time":"2021-11-24T00:00:00Z","timestamp":1637712000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,11,24]],"date-time":"2021-11-24T00:00:00Z","timestamp":1637712000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2022,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Bayesian optimisation is a sample-efficient search methodology that holds great promise for accelerating drug and materials discovery programs. A frequently-overlooked modelling consideration in Bayesian optimisation strategies however, is the representation of heteroscedastic aleatoric uncertainty. In many practical applications it is desirable to identify inputs with low aleatoric noise, an example of which might be a material composition which displays robust properties in response to a noisy fabrication process. In this paper, we propose a heteroscedastic Bayesian optimisation scheme capable of representing and minimising aleatoric noise across the input space. Our scheme employs a heteroscedastic Gaussian process surrogate model in conjunction with two straightforward adaptations of existing acquisition functions. First, we extend the augmented expected improvement heuristic to the heteroscedastic setting and second, we introduce the aleatoric noise-penalised expected improvement (ANPEI) heuristic. Both methodologies are capable of penalising aleatoric noise in the suggestions. In particular, the ANPEI acquisition yields improved performance relative to homoscedastic Bayesian optimisation and random sampling on toy problems as well as on two real-world scientific datasets. Code is available at: <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/Ryan-Rhys\/Heteroscedastic-BO\" xlink:type=\"simple\">https:\/\/github.com\/Ryan-Rhys\/Heteroscedastic-BO<\/jats:ext-link>\n               <\/jats:p>","DOI":"10.1088\/2632-2153\/ac298c","type":"journal-article","created":{"date-parts":[[2021,9,23]],"date-time":"2021-09-23T22:21:36Z","timestamp":1632435696000},"page":"015004","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":16,"title":["Achieving robustness to aleatoric uncertainty with heteroscedastic Bayesian optimisation"],"prefix":"10.1088","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3117-4559","authenticated-orcid":false,"given":"Ryan-Rhys","family":"Griffiths","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander","family":"A Aldrick","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Miguel","family":"Garcia-Ortegon","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vidhi","family":"Lalchand","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alpha A","family":"Lee","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"266","published-online":{"date-parts":[[2021,11,24]]},"reference":[{"key":"mlstac298cbib1","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1021\/acscentsci.7b00572","article-title":"Automatic chemical design using a data-driven continuous representation of molecules","volume":"4","author":"G\u00f3mez-Bombarelli","year":"2018","journal-title":"ACS Cent. Sci."},{"key":"mlstac298cbib2","doi-asserted-by":"publisher","first-page":"577","DOI":"10.1039\/C9SC04026A","article-title":"Constrained Bayesian optimization for automatic chemical design using variational autoencoders","volume":"11","author":"Griffiths","year":"2020","journal-title":"Chem. Sci."},{"key":"mlstac298cbib3","article-title":"Optimizing molecules using efficient queries from property evaluations","author":"Hoffman","year":"2020"},{"key":"mlstac298cbib4","article-title":"Gryffin: an algorithm for Bayesian optimization for categorical variables informed by physical intuition with applications to chemistry","author":"Hase","year":"2020"},{"key":"mlstac298cbib5","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/abedc8","article-title":"Olympus: a benchmarking framework for noisy optimization and experiment planning","volume":"2","author":"Hase","year":"2021","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstac298cbib6","doi-asserted-by":"publisher","first-page":"5959","DOI":"10.1039\/D0SC00982B","article-title":"Pushing property limits in materials discovery via boundless objective-free exploration","volume":"11","author":"Terayama","year":"2020","journal-title":"Chem. Sci."},{"key":"mlstac298cbib7","doi-asserted-by":"publisher","DOI":"10.26434\/chemrxiv.13250216.v2","article-title":"Multi-task Bayesian optimization of chemical reactions","author":"Felton","year":"2020","journal-title":"ChemRxiv"},{"key":"mlstac298cbib8","doi-asserted-by":"publisher","first-page":"116","DOI":"10.1002\/cmtd.202000051","article-title":"Summit: benchmarking machine learning methods for reaction optimisation","volume":"1","author":"Felton","year":"2020","journal-title":"Chemistry\u2010Methods"},{"key":"mlstac298cbib9","doi-asserted-by":"publisher","first-page":"2864","DOI":"10.1021\/acs.oprd.0c00376","article-title":"Solvent selection for Mitsunobu reaction driven by an active learning surrogate model","volume":"24","author":"Zhang","year":"2020","journal-title":"Org. Process Res. Develop."},{"key":"mlstac298cbib10","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1007\/s10472-015-9463-9","article-title":"Bayesian optimization for learning gaits under uncertainty","volume":"76","author":"Calandra","year":"2016","journal-title":"Ann. Math. Artif. Intell."},{"key":"mlstac298cbib11","first-page":"pp 2385","article-title":"Adaptive sensor placement for continuous spaces","author":"Grant","year":"2019"},{"key":"mlstac298cbib12","doi-asserted-by":"publisher","first-page":"727","DOI":"10.1109\/TBME.2018.2855404","article-title":"Bayesian multiobjective optimisation with mixed analytical and black-box functions: application to tissue engineering","volume":"66","author":"Olofsson","year":"2018","journal-title":"IEEE Trans. Biomed. Eng."},{"key":"mlstac298cbib13","article-title":"BOSS: Bayesian optimization over string spaces","volume":"p33","author":"Moss","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"mlstac298cbib14","doi-asserted-by":"publisher","first-page":"5574","DOI":"10.5555\/3295222.3295309","article-title":"What uncertainties do we need in Bayesian deep learning for computer vision?","author":"Kendall","year":"2017"},{"key":"mlstac298cbib15","article-title":"Dataset bias in the natural sciences: a case study in chemical reaction prediction and synthesis design","author":"Griffiths","year":"2018","journal-title":"ChemRxiv"},{"key":"mlstac298cbib16","doi-asserted-by":"publisher","first-page":"1559","DOI":"10.1021\/acs.jced.7b00104","article-title":"Approaches for calculating solvation free energies and enthalpies demonstrated with an update of the FreeSolv database","volume":"62","author":"Matos","year":"2017","journal-title":"J. Chem. Eng. Data"},{"key":"mlstac298cbib17","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2018.166","article-title":"A global dataset of plant available and unavailable phosphorus in natural soils derived by Hedley method","volume":"5","author":"Hou","year":"2018","journal-title":"Sci. Data"},{"key":"mlstac298cbib18","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1146\/annurev-matsci-070214-020823","article-title":"What is high-throughput virtual screening? A perspective from organic materials discovery","volume":"45","author":"Pyzer-Knapp","year":"2015","journal-title":"Ann. Rev. Mater. Res."},{"key":"mlstac298cbib19","first-page":"pp 1470","article-title":"Parallel and distributed Thompson sampling for large-scale accelerated exploration of chemical space","author":"Hern\u00e1ndez-Lobato","year":"2017"},{"key":"mlstac298cbib20","article-title":"Bayesian modeling for optimization and control in robotics","author":"Calandra","year":"2017"},{"key":"mlstac298cbib21","first-page":"pp 841","article-title":"Variational heteroscedastic Gaussian process regression","author":"L\u00e1zaro-Gredilla","year":"2011"},{"key":"mlstac298cbib22","doi-asserted-by":"publisher","first-page":"806","DOI":"10.1177\/0278364913476124","article-title":"Variable risk control via stochastic optimization","volume":"32","author":"Kuindersma","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"mlstac298cbib23","article-title":"Heteroscedastic treed Bayesian optimisation","author":"Assael","year":"2014"},{"key":"mlstac298cbib24","first-page":"pp 2230","article-title":"Expensive multiobjective optimization for robotics with consideration of heteroscedastic noise","author":"Ariizumi","year":"2014"},{"key":"mlstac298cbib25","first-page":"pp 997","article-title":"Safe exploration for optimization with Gaussian processes","author":"Sui","year":"2015"},{"key":"mlstac298cbib26","article-title":"Bayesian optimization with safety constraints: safe and automatic parameter tuning in robotics","author":"Berkenkamp","year":"2016"},{"key":"mlstac298cbib27","doi-asserted-by":"publisher","first-page":"599","DOI":"10.1287\/ijoc.1080.0314","article-title":"The knowledge-gradient policy for correlated normal beliefs","volume":"21","author":"Frazier","year":"2009","journal-title":"INFORMS J. Comput."},{"key":"mlstac298cbib28","doi-asserted-by":"publisher","first-page":"495","DOI":"10.1214\/18-BA1110","article-title":"Constrained Bayesian optimization with noisy experiments","volume":"14","author":"Letham","year":"2019","journal-title":"Bayesian Anal."},{"key":"mlstac298cbib29","article-title":"Heteroscedastic Bayesian optimisation in scientific discovery","author":"Griffiths","year":"2019"},{"key":"mlstac298cbib30","article-title":"Achieving robustness to aleatoric uncertainty with heteroscedastic Bayesian optimisation","author":"Griffiths","year":"2019"},{"key":"mlstac298cbib31","doi-asserted-by":"publisher","first-page":"97+","DOI":"10.1115\/1.3653121","article-title":"A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise","volume":"86","author":"Kushner","year":"1964","journal-title":"J. Basic Eng."},{"key":"mlstac298cbib32","author":"Tiesis","year":"1978"},{"key":"mlstac298cbib33","author":"Rasmussen","year":"2006"},{"key":"mlstac298cbib34","first-page":"pp 393","article-title":"Most likely heteroscedastic Gaussian process regression","author":"Kersting","year":"2007"},{"key":"mlstac298cbib35","doi-asserted-by":"publisher","first-page":"455","DOI":"10.1023\/A:1008306431147","article-title":"Efficient global optimization of expensive black-box functions","volume":"13","author":"Jones","year":"1998","journal-title":"J. Glob. Optim."},{"key":"mlstac298cbib36","doi-asserted-by":"publisher","first-page":"607","DOI":"10.1007\/s00158-013-0919-4","article-title":"A benchmark of kriging-based infill criteria for noisy optimization","volume":"48","author":"Picheny","year":"2013","journal-title":"Struct. Multidiscip. Optim."},{"key":"mlstac298cbib37","article-title":"Global optimization based on noisy evaluations: an empirical study of two statistical approaches","volume":"135","author":"Vazquez","year":"2008","journal-title":"J. Phys.: Conf. Ser."},{"key":"mlstac298cbib38","doi-asserted-by":"publisher","first-page":"441","DOI":"10.1007\/s10898-005-2454-3","article-title":"Global optimization of stochastic black-box systems via sequential kriging meta-models","volume":"34","author":"Huang","year":"2006","journal-title":"J. Glob. Optim."},{"key":"mlstac298cbib39","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"mlstac298cbib40","doi-asserted-by":"publisher","first-page":"550","DOI":"10.1145\/279232.279236","article-title":"Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization","volume":"23","author":"Zhu","year":"1997","journal-title":"ACM Trans. Math. Softw. (TOMS)"},{"key":"mlstac298cbib41","doi-asserted-by":"publisher","first-page":"711","DOI":"10.1007\/s10822-014-9747-x","article-title":"FreeSolv: a database of experimental and calculated hydration free energies, with input files","volume":"28","author":"Mobley","year":"2014","journal-title":"J. Comput.-Aided Mole. Des."},{"key":"mlstac298cbib42","article-title":"Rdkit: Open-source cheminformatics","author":"Landrum"},{"key":"mlstac298cbib43","first-page":"pp 489","article-title":"Heteroscedastic Gaussian process regression","author":"Le","year":"2005"},{"key":"mlstac298cbib44","doi-asserted-by":"publisher","first-page":"808","DOI":"10.1080\/10618600.2018.1458625","article-title":"Practical heteroscedastic Gaussian process modeling for large simulation experiments","volume":"27","author":"Binois","year":"2018","journal-title":"J. Computat. Graph. Stat."},{"key":"mlstac298cbib45","article-title":"Heteroscedastic Gaussian processes for uncertain and incomplete data","author":"Almosallam","year":"2017"},{"key":"mlstac298cbib46","first-page":"pp 1","article-title":"Heteroscedastic Gaussian process regression using expectation propagation","author":"Mu\u00f1oz-Gonz\u00e1lez","year":"2011"},{"key":"mlstac298cbib47","doi-asserted-by":"publisher","first-page":"10720","DOI":"10.1021\/acs.iecr.7b00867","article-title":"A novel surrogate-based optimization method for black-box simulation with heteroscedastic noise","volume":"56","author":"Wang","year":"2017","journal-title":"Ind. Eng. Chem. Res."},{"key":"mlstac298cbib48","article-title":"Gaussian process regression with heteroscedastic or non-Gaussian residuals","author":"Wang","year":"2012"},{"key":"mlstac298cbib49","doi-asserted-by":"publisher","first-page":"3450","DOI":"10.1109\/TSP.2020.2997940","article-title":"Improved most likely heteroscedastic Gaussian process regression via Bayesian residual moment estimator","volume":"68","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Signal Process."},{"key":"mlstac298cbib50","doi-asserted-by":"publisher","first-page":"636","DOI":"10.1016\/j.trc.2018.08.007","article-title":"Heteroscedastic Gaussian processes for uncertainty modeling in large-scale crowdsourced traffic data","volume":"95","author":"Rodrigues","year":"2018","journal-title":"Transp. Res. C"},{"key":"mlstac298cbib51","doi-asserted-by":"publisher","DOI":"10.1061\/(ASCE)EM.1943-7889.0001466","article-title":"Probabilistic modeling of heteroscedastic laboratory experiments using Gaussian process regression","volume":"144","author":"Tabor","year":"2018","journal-title":"J. Eng. Mech."},{"key":"mlstac298cbib52","doi-asserted-by":"publisher","first-page":"1124","DOI":"10.1016\/j.renene.2019.09.145","article-title":"Probabilistic modelling of wind turbine power curves with application of heteroscedastic Gaussian process regression","volume":"148","author":"Rogers","year":"2020","journal-title":"Renew. Energy"},{"key":"mlstac298cbib53","doi-asserted-by":"publisher","first-page":"3311","DOI":"10.3390\/s19153311","article-title":"Measurement and forecasting of high-speed rail track slab deformation under uncertain SHM data using variational heteroscedastic Gaussian process","volume":"19","author":"Wang","year":"2019","journal-title":"Sensors"},{"key":"mlstac298cbib54","first-page":"pp 380","article-title":"Distributed variational inference-based heteroscedastic Gaussian process metamodeling","author":"Wang","year":"2019"},{"key":"mlstac298cbib55","doi-asserted-by":"crossref","first-page":"708","DOI":"10.1109\/TNNLS.2020.2979188","article-title":"Large-scale heteroscedastic regression via Gaussian process","volume":"32","author":"Liu","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"mlstac298cbib56","article-title":"Efficiently sampling functions from Gaussian process posteriors","author":"Wilson","year":"2020"},{"key":"mlstac298cbib57","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1093\/biomet\/25.3-4.285","article-title":"On the likelihood that one unknown probability exceeds another in view of the evidence of two samples","volume":"25","author":"Thompson","year":"1933","journal-title":"Biometrika"},{"key":"mlstac298cbib58","article-title":"Approximate inference for fully Bayesian Gaussian process regression","author":"Lalchand","year":"2019"},{"key":"mlstac298cbib59","doi-asserted-by":"publisher","first-page":"1925","DOI":"10.1007\/s10994-020-05899-z","article-title":"High-dimensional Bayesian optimization using low-dimensional feature spaces","volume":"109","author":"Moriconi","year":"2020","journal-title":"Mach. Learn."},{"key":"mlstac298cbib60","first-page":"167","article-title":"Dimensionality reduction methods to scale Bayesian optimization up","author":"Candelieri","year":"2019"},{"key":"mlstac298cbib61","article-title":"High-dimensional Bayesian optimisation with variational autoencoders and deep metric learning","author":"Grosnit","year":"2021"},{"key":"mlstac298cbib62","first-page":"21524","article-title":"BoTorch: a framework for efficient Monte-Carlo Bayesian optimization","volume":"33","author":"Balandat","year":"2020","journal-title":"Adv. Neural Inform. Process. Syst."},{"key":"mlstac298cbib63","first-page":"1","article-title":"Tuning hyperparameters without grad students: scalable and robust Bayesian optimisation with dragonfly","volume":"21","author":"Kandasamy","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"mlstac298cbib64","first-page":"9884","article-title":"Maximizing acquisition functions for Bayesian optimization","volume":"31","author":"Wilson","year":"2018","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"mlstac298cbib65","article-title":"Are we forgetting about compositional optimisers in Bayesian optimisation?","author":"Grosnit","year":"2020"},{"key":"mlstac298cbib66","article-title":"Compositional adam: an adaptive compositional solver","author":"Tutunov","year":"2020"},{"key":"mlstac298cbib67","article-title":"Global optimization of Gaussian processes","author":"Schweidtmann","year":"2020"},{"key":"mlstac298cbib68","article-title":"A robust approach to warped Gaussian process-constrained optimization","author":"Wiebe","year":"2020"},{"key":"mlstac298cbib69","article-title":"An empirical study of assumptions in Bayesian optimisation","author":"Cowen-Rivers","year":"2020"},{"key":"mlstac298cbib70","article-title":"Gaussian process molecule property prediction with FlowMO","author":"Moss","year":"2020"},{"key":"mlstac298cbib71","doi-asserted-by":"crossref","DOI":"10.26434\/chemrxiv.12609899.v1","article-title":"The photoswitch dataset: a molecular machine learning benchmark for the advancement of synthetic chemistry","author":"Thawani","year":"2020"},{"key":"mlstac298cbib72","doi-asserted-by":"publisher","first-page":"1981","DOI":"10.1021\/acs.accounts.0c00403","article-title":"Mapping materials and molecules","volume":"53","author":"Cheng","year":"2020","journal-title":"Acc. Chem. Res."},{"key":"mlstac298cbib73","author":"Taleb","year":"2012","edition":"1st edn"},{"key":"mlstac298cbib74","doi-asserted-by":"publisher","first-page":"1495","DOI":"10.1039\/C8EE03559H","article-title":"Chemical stability and instability of inorganic halide perovskites","volume":"12","author":"Zhou","year":"2019","journal-title":"Energy Environ. Sci."}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,15]],"date-time":"2021-12-15T13:33:09Z","timestamp":1639575189000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ac298c"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,24]]},"references-count":74,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2021,11,24]]},"published-print":{"date-parts":[[2022,3,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/ac298c","relation":{},"ISSN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,24]]},"assertion":[{"value":"Achieving robustness to aleatoric uncertainty with heteroscedastic Bayesian optimisation","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2021 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2021-01-08","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2021-09-23","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2021-11-24","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}