{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T07:24:01Z","timestamp":1767857041007,"version":"3.49.0"},"reference-count":22,"publisher":"IOP Publishing","issue":"2","license":[{"start":{"date-parts":[[2021,3,25]],"date-time":"2021-03-25T00:00:00Z","timestamp":1616630400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,3,25]],"date-time":"2021-03-25T00:00:00Z","timestamp":1616630400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/100006151","name":"Basic Energy Sciences","doi-asserted-by":"crossref","award":["BNL-LDRD 20-032"],"award-info":[{"award-number":["BNL-LDRD 20-032"]}],"id":[{"id":"10.13039\/100006151","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2021,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Beamline experiments at central facilities are increasingly demanding of remote, high-throughput, and adaptive operation conditions. To accommodate such needs, new approaches must be developed that enable on-the-fly decision making for data intensive challenges. Reinforcement learning (RL) is a domain of AI that holds the potential to enable autonomous operations in a feedback loop between beamline experiments and trained agents. Here, we outline the advanced data acquisition and control software of the Bluesky suite, and demonstrate its functionality with a canonical RL problem: cartpole. We then extend these methods to efficient use of beamline resources by using RL to develop an optimal measurement strategy for samples with different scattering characteristics. 
The RL agents converge on the empirically optimal policy when under-constrained with time. When resource limited, the agents outperform a naive or sequential measurement strategy, often by a factor of 100%. We interface these methods directly with the data storage and provenance technologies at the National Synchrotron Light Source II, thus demonstrating the potential for RL to increase the scientific output of beamlines, and lay out the framework for how to achieve this impact.<\/jats:p>","DOI":"10.1088\/2632-2153\/abc9fc","type":"journal-article","created":{"date-parts":[[2021,3,25]],"date-time":"2021-03-25T09:00:04Z","timestamp":1616662804000},"page":"025025","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Gaming the beamlines\u2014employing reinforcement learning to maximize scientific outcomes at large-scale user facilities"],"prefix":"10.1088","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7173-7972","authenticated-orcid":false,"given":"Phillip M","family":"Maffettone","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6032-841X","authenticated-orcid":false,"given":"Joshua K","family":"Lynch","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4692-608X","authenticated-orcid":false,"given":"Thomas A","family":"Caswell","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4187-273X","authenticated-orcid":false,"given":"Clara E","family":"Cook","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7079-0878","authenticated-orcid":false,"given":"Stuart 
I","family":"Campbell","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4611-4113","authenticated-orcid":false,"given":"Daniel","family":"Olds","sequence":"additional","affiliation":[]}],"member":"266","published-online":{"date-parts":[[2021,3,25]]},"reference":[{"key":"mlstabc9fcbib1","doi-asserted-by":"publisher","first-page":"1140","DOI":"10.1126\/science.aar6404","volume":"362","author":"Silver","year":"2018","journal-title":"Science"},{"key":"mlstabc9fcbib2","author":"Mnih","year":"2013"},{"key":"mlstabc9fcbib3","doi-asserted-by":"publisher","first-page":"350","DOI":"10.1038\/s41586-019-1724-z","article-title":"Grandmaster level in StarCraft II using multi-agent reinforcement learning","volume":"575","author":"Vinyals","year":"2019","journal-title":"Nature"},{"key":"mlstabc9fcbib4","author":"Howard","year":"1960","edition":"1st edn"},{"key":"mlstabc9fcbib5","doi-asserted-by":"publisher","first-page":"1238","DOI":"10.1177\/0278364913495721","volume":"32","author":"Kober","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"mlstabc9fcbib6","doi-asserted-by":"publisher","first-page":"3133","DOI":"10.1109\/COMST.2019.2916583","volume":"21","author":"Luong","year":"2019","journal-title":"IEEE Commun. Surv. Tutorials"},{"key":"mlstabc9fcbib7","doi-asserted-by":"publisher","first-page":"1337","DOI":"10.1021\/acscentsci.7b00492","volume":"3","author":"Zhou","year":"2017","journal-title":"ACS Cent. Sci."},{"key":"mlstabc9fcbib8","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1016\/S0921-8890(97)00043-2","volume":"22","author":"Benbrahim","year":"1997","journal-title":"Robot. Auton. Syst."},{"key":"mlstabc9fcbib9","doi-asserted-by":"publisher","first-page":"9990","DOI":"10.1038\/s41598-020-66435-6","volume":"10","author":"Kourousias","year":"2020","journal-title":"Sci. 
Rep."},{"key":"mlstabc9fcbib10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-019-48114-3","volume":"9","author":"Noack","year":"2019","journal-title":"Sci. Rep."},{"key":"mlstabc9fcbib11","doi-asserted-by":"publisher","first-page":"781","DOI":"10.3390\/electronics9050781","article-title":"","volume":"9","author":"Bruchon","year":"2020","journal-title":"Electronics"},{"key":"mlstabc9fcbib12","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1080\/08940886.2019.1608121","volume":"32","author":"Allan","year":"2019","journal-title":"Synchrot. Radiat. News"},{"key":"mlstabc9fcbib13","author":""},{"key":"mlstabc9fcbib14","doi-asserted-by":"publisher","first-page":"1698","DOI":"10.1109\/TIM.2019.2914711","volume":"69","author":"Koerner","year":"2020","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"mlstabc9fcbib15","first-page":"9478","volume":"3","author":"Thein","year":"2014","journal-title":"Int. J. Sci. Eng. Technol. Res."},{"key":"mlstabc9fcbib16","author":""},{"key":"mlstabc9fcbib17","author":"Lillicrap","year":"2015"},{"key":"mlstabc9fcbib18","author":"Brockman","year":"2016"},{"key":"mlstabc9fcbib19","article-title":"Tensorforce: a tensorflow library for applied reinforcement learning web page","author":"Kuhnle","year":"2017"},{"key":"mlstabc9fcbib20","doi-asserted-by":"publisher","DOI":"10.1142\/11389","volume":"vol 2","author":"Pouchard","year":"2019"},{"key":"mlstabc9fcbib21","first-page":"1","article-title":"Scientific Literature Mining for Experiment Information in Materials Design","author":"Park","year":"2019"},{"key":"mlstabc9fcbib22","article-title":"Figure descriptive text extraction using ontological representation","author":"Park","year":"2020"}],"container-title":["Machine Learning: Science and 
Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc9fc","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc9fc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc9fc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc9fc\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,23]],"date-time":"2022-01-23T00:10:28Z","timestamp":1642896628000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/abc9fc"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,25]]},"references-count":22,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2021,3,25]]},"published-print":{"date-parts":[[2021,6,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/abc9fc","relation":{},"ISSN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,25]]},"assertion":[{"value":"Gaming the beamlines\u2014employing reinforcement learning to maximize scientific outcomes at large-scale user facilities","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2021 US government","name":"copyright_information","label":"Copyright Information"},{"value":"2020-09-11","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication 
dates"}},{"value":"2020-11-12","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2021-03-25","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}