{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,23]],"date-time":"2025-12-23T10:04:50Z","timestamp":1766484290665,"version":"3.37.3"},"reference-count":37,"publisher":"IOP Publishing","issue":"3","license":[{"start":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T00:00:00Z","timestamp":1721174400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T00:00:00Z","timestamp":1721174400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"crossref","award":["13N14906"],"award-info":[{"award-number":["13N14906"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100021130","name":"Bundesministerium f\u00fcr Wirtschaft und Klimaschutz","doi-asserted-by":"crossref","award":["50WK2272"],"award-info":[{"award-number":["50WK2272"]}],"id":[{"id":"10.13039\/100021130","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2024,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Acquiring a substantial number of data points for training accurate machine learning (ML) models is a big challenge in scientific fields where data collection is resource-intensive. Here, we propose a novel approach for constructing a minimal yet highly informative database for training ML models in complex multi-dimensional parameter spaces. To achieve this, we mimic the underlying relation between the output and input parameters using Gaussian process regression (GPR). Using a set of known data, GPR provides predictive means and standard deviation for the unknown data. Given the predicted standard deviation by GPR, we select data points using Bayesian optimization to obtain an efficient database for training ML models. We compare the performance of ML models trained on databases obtained through this method, with databases obtained using traditional approaches. Our results demonstrate that the ML models trained on the database obtained using Bayesian optimization approach consistently outperform the other two databases, achieving high accuracy with a significantly smaller number of data points. Our work contributes to the resource-efficient collection of data in high-dimensional complex parameter spaces, to achieve high precision ML predictions.<\/jats:p>","DOI":"10.1088\/2632-2153\/ad605f","type":"journal-article","created":{"date-parts":[[2024,7,8]],"date-time":"2024-07-08T22:42:18Z","timestamp":1720478538000},"page":"035013","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["Optimizing data acquisition: a Bayesian approach for efficient machine learning model training"],"prefix":"10.1088","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2557-6244","authenticated-orcid":true,"given":"M R","family":"Mahani","sequence":"first","affiliation":[]},{"given":"Igor A","family":"Nechepurenko","sequence":"additional","affiliation":[]},{"given":"Yasmin","family":"Rahimof","sequence":"additional","affiliation":[]},{"given":"Andreas","family":"Wicht","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2024,7,17]]},"reference":[{"key":"mlstad605fbib1","doi-asserted-by":"publisher","DOI":"10.1002\/aisy.202100067","article-title":"Recent advances in machine learning for fiber optic sensor applications","volume":"4","author":"Venketeswaran","year":"2022","journal-title":"Adv. Intell. Syst."},{"key":"mlstad605fbib2","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1038\/s42254-022-00441-7","article-title":"Scientific machine learning benchmarks","volume":"4","author":"Thiyagalingam","year":"2022","journal-title":"Nat. Rev. Phys."},{"key":"mlstad605fbib3","doi-asserted-by":"publisher","first-page":"339","DOI":"10.1515\/nanoph-2018-0183","article-title":"Intelligent nanophotonics: merging photonics and artificial intelligence at the nanoscale","volume":"8","author":"Yao","year":"2019","journal-title":"Nanophotonics"},{"volume":"vol 25","year":"2015","author":"Nielsen","key":"mlstad605fbib4"},{"key":"mlstad605fbib5","doi-asserted-by":"publisher","first-page":"27523","DOI":"10.1364\/OE.27.027523","article-title":"Deep learning for accelerated all-dielectric metasurface design","volume":"27","author":"Nadell","year":"2019","journal-title":"Opt. Express"},{"key":"mlstad605fbib6","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1038\/s41566-020-0685-y","article-title":"Deep learning for the design of photonic structures","volume":"15","author":"Ma","year":"2021","journal-title":"Nat. Photon."},{"key":"mlstad605fbib7","doi-asserted-by":"publisher","first-page":"29620","DOI":"10.1364\/OE.27.029620","article-title":"Designing integrated photonic devices using artificial neural networks","volume":"27","author":"Hammond","year":"2019","journal-title":"Opt. Express"},{"key":"mlstad605fbib8","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1007\/s11082-022-04284-5","article-title":"Demonstration of a fast-training feed-forward machine learning algorithm for studying key optical properties of FBG and predicting precisely the output spectrum","volume":"55","author":"Dey","year":"2023","journal-title":"Opt. Quantum Electron."},{"key":"mlstad605fbib9","doi-asserted-by":"publisher","first-page":"1007","DOI":"10.1039\/C9NA00656G","article-title":"Deep learning: a new tool for photonic nanostructure design","volume":"2","author":"Hegde","year":"2020","journal-title":"Nanoscale Adv."},{"key":"mlstad605fbib10","doi-asserted-by":"publisher","first-page":"864","DOI":"10.1364\/OPTICA.5.000864","article-title":"Training of photonic neural networks through in situ backpropagation and gradient measurement","volume":"5","author":"Hughes","year":"2018","journal-title":"Optica"},{"year":"2023","author":"Garnett","key":"mlstad605fbib11"},{"key":"mlstad605fbib12","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1109\/JPROC.2015.2494218","article-title":"Taking the human out of the loop: a review of Bayesian optimization","volume":"vol 104","author":"Shahriari","year":"2015"},{"key":"mlstad605fbib13","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1145\/3422622","article-title":"Generative adversarial networks","volume":"63","author":"Goodfellow","year":"2020","journal-title":"Commun. ACM"},{"key":"mlstad605fbib14","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1109\/MSP.2017.2765202","article-title":"Generative adversarial networks: an overview","volume":"35","author":"Creswell","year":"2018","journal-title":"IEEE Signal Process. Mag."},{"key":"mlstad605fbib15","doi-asserted-by":"publisher","first-page":"336","DOI":"10.1117\/12.590732","article-title":"Balancing accuracy against computation time: 3D FDTD for nanophotonics device optimization","volume":"5733","author":"Burr","year":"2005","journal-title":"Proc. SPIE"},{"key":"mlstad605fbib16","doi-asserted-by":"publisher","first-page":"1474","DOI":"10.1021\/acsaom.3c00198","article-title":"Data-efficient machine learning algorithms for the design of surface Bragg gratings","volume":"1","author":"Mahani","year":"2023","journal-title":"ACS Appl. Opt. Mater."},{"key":"mlstad605fbib17","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1038\/s43586-023-00257-4","article-title":"Finite-difference time-domain methods","volume":"3","author":"Teixeira","year":"2023","journal-title":"Nat. Rev. Methods Primers"},{"key":"mlstad605fbib18","doi-asserted-by":"publisher","first-page":"1381","DOI":"10.1364\/OPTICA.3.001381","article-title":"Space-borne frequency comb metrology","volume":"3","author":"Lezius","year":"2016","journal-title":"Optica"},{"key":"mlstad605fbib19","doi-asserted-by":"publisher","first-page":"391","DOI":"10.1038\/s41586-018-0605-1","article-title":"Space-Borne Bose\u2013Einstein condensation for precision interferometry","volume":"562","author":"Becker","year":"2018","journal-title":"Nature"},{"key":"mlstad605fbib20","doi-asserted-by":"publisher","first-page":"1589","DOI":"10.1109\/JLT.2020.2971240","article-title":"Enhancement of the multiplexing capacity and measurement accuracy of FBG sensor system using IWDM technique and deep learning algorithm","volume":"38","author":"Manie","year":"2020","journal-title":"J. Lightwave Technol."},{"key":"mlstad605fbib21","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1016\/j.snb.2012.06.018","article-title":"A review of developments in near infrared methane detection based on tunable diode laser","volume":"171","author":"Shemshad","year":"2012","journal-title":"Sens. Actuators B"},{"key":"mlstad605fbib22","doi-asserted-by":"publisher","first-page":"136","DOI":"10.3389\/fphy.2022.853966","article-title":"Improvement of the detection sensitivity for tunable diode laser absorption spectroscopy: a review","volume":"10","author":"Lin","year":"2022","journal-title":"Front. Phys."},{"key":"mlstad605fbib23","doi-asserted-by":"publisher","first-page":"1407","DOI":"10.1038\/s41467-018-03697-9","article-title":"High power surface emitting terahertz laser with hybrid second-and fourth-order Bragg gratings","volume":"9","author":"Jin","year":"2018","journal-title":"Nat. Commun."},{"key":"mlstad605fbib24","first-page":"69","article-title":"Designing rectangular surface Bragg gratings using machine learning models","author":"Mahani","year":"2023"},{"key":"mlstad605fbib25","first-page":"3","article-title":"Finite-difference time-domain simulations of surface Bragg gratings","author":"Nechepurenko","year":"2023"},{"year":"2013","author":"Agrawal","key":"mlstad605fbib26"},{"year":"2012","author":"Coldren","key":"mlstad605fbib27"},{"year":"2006","author":"Rasmussen","key":"mlstad605fbib28"},{"key":"mlstad605fbib29","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/3206.001.0001","article-title":"Gaussian processes for machine learning","author":"Rasmussen","year":"2005"},{"key":"mlstad605fbib30","article-title":"Practical Bayesian optimization of machine learning algorithms","volume":"vol 25","author":"Snoek","year":"2012"},{"article-title":"Gaussian process optimization in the bandit setting: no regret and experimental design","year":"2009","author":"Srinivas","key":"mlstad605fbib31"},{"key":"mlstad605fbib32","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn."},{"key":"mlstad605fbib33","first-page":"5","article-title":"Support vector machines for classification and regression","author":"Gunn","year":"1998"},{"key":"mlstad605fbib34","doi-asserted-by":"publisher","first-page":"785","DOI":"10.1145\/2939672.2939785","article-title":"Xgboost: a scalable tree boosting system","author":"Chen","year":"2016"},{"volume":"vol 1","year":"2002","author":"Sch\u00f6lkopf","key":"mlstad605fbib35"},{"key":"mlstad605fbib36","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},{"year":"2020","author":"Wade","key":"mlstad605fbib37"}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T09:54:01Z","timestamp":1721210041000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/ad605f"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,17]]},"references-count":37,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2024,7,17]]},"published-print":{"date-parts":[[2024,9,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/ad605f","relation":{},"ISSN":["2632-2153"],"issn-type":[{"type":"electronic","value":"2632-2153"}],"subject":[],"published":{"date-parts":[[2024,7,17]]},"assertion":[{"value":"Optimizing data acquisition: a Bayesian approach for efficient machine learning model training","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2024 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2023-12-19","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-07-08","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2024-07-17","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}