{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T16:37:15Z","timestamp":1776875835159,"version":"3.51.2"},"reference-count":65,"publisher":"IOP Publishing","issue":"3","license":[{"start":{"date-parts":[[2023,9,6]],"date-time":"2023-09-06T00:00:00Z","timestamp":1693958400000},"content-version":"vor","delay-in-days":5,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,6]],"date-time":"2023-09-06T00:00:00Z","timestamp":1693958400000},"content-version":"tdm","delay-in-days":5,"URL":"https:\/\/iopscience.iop.org\/info\/page\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/501100013699","name":"Bundesministerium f\u00fcr Bildung, Wissenschaft und Forschung","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100013699","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100010663","name":"H2020 European Research Council","doi-asserted-by":"crossref","award":["101055129"],"award-info":[{"award-number":["101055129"]}],"id":[{"id":"10.13039\/100010663","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100010665","name":"H2020 Marie Sk\u0142odowska-Curie Actions","doi-asserted-by":"crossref","award":["801110"],"award-info":[{"award-number":["801110"]}],"id":[{"id":"10.13039\/100010665","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Austrian Science Fund","award":["DK-ALM: W1259-N27"],"award-info":[{"award-number":["DK-ALM: W1259-N27"]}]},{"DOI":"10.13039\/501100001663","name":"Volkswagen Foundation","doi-asserted-by":"crossref","award":["Az:97721"],"award-info":[{"award-number":["Az:97721"]}],"id":[{"id":"10.13039\/501100001663","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["iopscience.iop.org"],"crossmark-restriction":false},"short-container-title":["Mach. Learn.: Sci. Technol."],"published-print":{"date-parts":[[2023,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent\u2019s learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: First, we use an RL agent to generate data, then, we employ a mining algorithm to extract gadgets and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent and environment agnostic and can yield interesting insights into any agent\u2019s policy.<\/jats:p>","DOI":"10.1088\/2632-2153\/acf098","type":"journal-article","created":{"date-parts":[[2023,8,15]],"date-time":"2023-08-15T22:44:57Z","timestamp":1692139497000},"page":"035043","update-policy":"https:\/\/doi.org\/10.1088\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Automated gadget discovery in the quantum domain"],"prefix":"10.1088","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5690-707X","authenticated-orcid":true,"given":"Lea M","family":"Trenkwalder","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0522-6610","authenticated-orcid":true,"given":"Andrea","family":"L\u00f3pez-Incera","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7815-7006","authenticated-orcid":true,"given":"Hendrik","family":"Poulsen Nautrup","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4999-2840","authenticated-orcid":true,"given":"Fulvio","family":"Flamini","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9065-1565","authenticated-orcid":false,"given":"Hans J","family":"Briegel","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"266","published-online":{"date-parts":[[2023,9,6]]},"reference":[{"key":"mlstacf098bib1","doi-asserted-by":"publisher","first-page":"350","DOI":"10.1038\/s41586-019-1724-z","article-title":"Grandmaster level in StarCraft II using multi-agent reinforcement learning","volume":"575","author":"Vinyals","year":"2019","journal-title":"Nature"},{"key":"mlstacf098bib2","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"mlstacf098bib3","doi-asserted-by":"publisher","first-page":"604","DOI":"10.1038\/s41586-020-03051-4","article-title":"Mastering Atari, Go, chess and shogi by planning with a learned model","volume":"588","author":"Schrittwieser","year":"2020","journal-title":"Nature"},{"key":"mlstacf098bib4","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2021.102193","article-title":"Deep reinforcement learning in medical imaging: a literature review","volume":"73","author":"Zhou","year":"2021","journal-title":"Med. Image Anal."},{"key":"mlstacf098bib5","doi-asserted-by":"publisher","first-page":"2063","DOI":"10.1109\/TNNLS.2018.2790388","article-title":"Applications of deep learning and reinforcement learning to biological data","volume":"29","author":"Mahmud","year":"2017","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"mlstacf098bib6","article-title":"Quantum circuit optimization with deep reinforcement learning","author":"F\u00f6sel","year":"2021"},{"key":"mlstacf098bib7","article-title":"Reinforcement learning for optimization of variational quantum circuit architectures","volume":"vol 34","author":"Ostaszewski","year":"2021"},{"key":"mlstacf098bib8","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ac9ae8","article-title":"Operationally meaningful representations of physical systems in neural networks","volume":"3","author":"Poulsen Nautrup","year":"2022","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstacf098bib9","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.124.010508","article-title":"Discovering physical concepts with neural networks","volume":"124","author":"Iten","year":"2020","journal-title":"Phys. Rev. Lett."},{"key":"mlstacf098bib10","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.116.090405","article-title":"Automated Search for new Quantum Experiments","volume":"116","author":"Krenn","year":"2016","journal-title":"Phys. Rev. Lett."},{"key":"mlstacf098bib11","doi-asserted-by":"publisher","first-page":"1221","DOI":"10.1073\/pnas.1714936115","article-title":"Active learning machine learns to create new quantum experiments","volume":"115","author":"Melnikov","year":"2018","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"mlstacf098bib12","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1007\/s11023-022-09619-5","article-title":"How a minimal learning agent can infer the existence of unobserved variables in a complex environment","volume":"33","author":"Eva","year":"2023","journal-title":"Minds Mach."},{"key":"mlstacf098bib13","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.100.033311","article-title":"Toward an artificial intelligence physicist for unsupervised learning","volume":"100","author":"Wu","year":"2019","journal-title":"Phys. Rev. E"},{"key":"mlstacf098bib14","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1140\/epjc\/s10052-019-6787-3","article-title":"Guiding new physics searches with unsupervised learning","volume":"79","author":"De Simone","year":"2019","journal-title":"Eur. Phys. J. C"},{"key":"mlstacf098bib15","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevD.99.015014","article-title":"Learning new physics from a machine","volume":"99","author":"D\u2019Agnolo","year":"2019","journal-title":"Phys. Rev. D"},{"key":"mlstacf098bib16","article-title":"Learning the arrow of time","author":"Rahaman","year":"2019"},{"key":"mlstacf098bib17","doi-asserted-by":"publisher","DOI":"10.1088\/2632-2153\/ac7ddc","article-title":"Curiosity in exploring chemical space: intrinsic rewards for deep molecular reinforcement learning","volume":"3","author":"Thiede","year":"2022","journal-title":"Mach. Learn.: Sci. Technol."},{"key":"mlstacf098bib18","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevX.11.031044","article-title":"Conceptual understanding through efficient automated design of quantum optical experiments","volume":"11","author":"Krenn","year":"2021","journal-title":"Phys. Rev. X"},{"key":"mlstacf098bib19","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1038\/s42254-022-00518-3","article-title":"On scientific understanding with artificial intelligence","volume":"4","author":"Krenn","year":"2022","journal-title":"Nat. Rev. Phys."},{"key":"mlstacf098bib20","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2020.103367","article-title":"Interestingness elements for explainable reinforcement learning: understanding agents\u2019 capabilities and limitations","volume":"288","author":"Sequeira","year":"2020","journal-title":"Artif. Intell."},{"key":"mlstacf098bib21","doi-asserted-by":"publisher","first-page":"248","DOI":"10.1038\/nphoton.2016.12","article-title":"Multi-photon entanglement in high dimensions","volume":"10","author":"Malik","year":"2016","journal-title":"Nat. Photon."},{"key":"mlstacf098bib22","doi-asserted-by":"publisher","first-page":"759","DOI":"10.1038\/s41566-018-0257-6","article-title":"Experimental Greenberger\u2013Horne\u2013Zeilinger entanglement beyond qubits","volume":"12","author":"Erhard","year":"2018","journal-title":"Nat. Photon."},{"key":"mlstacf098bib23","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1038\/nature24270","article-title":"Mastering the game of Go without human knowledge","volume":"550","author":"Silver","year":"2017","journal-title":"Nature"},{"key":"mlstacf098bib24","doi-asserted-by":"publisher","first-page":"1285","DOI":"10.1109\/TKDE.2015.2510010","article-title":"Pattern based sequence classification","volume":"28","author":"Zhou","year":"2016","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"mlstacf098bib25","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1023\/A:1007652502315","article-title":"SPADE: an efficient algorithm for mining frequent sequences","volume":"42","author":"Zaki","year":"2001","journal-title":"Mach. Learn."},{"key":"mlstacf098bib26","first-page":"pp 160","article-title":"Density-based clustering based on hierarchical density estimates","author":"Campello","year":"2013"},{"key":"mlstacf098bib27","first-page":"pp 77","article-title":"Explainable reinforcement learning: a survey","author":"Puiutta","year":"2020"},{"key":"mlstacf098bib28","article-title":"A survey on explainable reinforcement learning: concepts, algorithms, challenges","author":"Qing","year":"2022"},{"key":"mlstacf098bib29","doi-asserted-by":"publisher","first-page":"92:1","DOI":"10.1145\/3527448","article-title":"Explainable deep reinforcement learning: state of the art and challenges","volume":"55","author":"Vouros","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"mlstacf098bib30","article-title":"A survey of explainable reinforcement learning","author":"Milani","year":"2022"},{"key":"mlstacf098bib31","first-page":"pp 97","article-title":"\u201cWhy should I trust you?\u201d: explaining the predictions of any classifier","author":"Ribeiro","year":"2016"},{"key":"mlstacf098bib32","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2206625119","article-title":"Acquisition of chess knowledge in AlphaZero","volume":"119","author":"McGrath","year":"2022","journal-title":"Proc. Natl Acad. Sci."},{"key":"mlstacf098bib33","article-title":"Explainable reinforcement learning via reward decomposition","author":"Juozapaitis","year":"2019"},{"key":"mlstacf098bib34","doi-asserted-by":"publisher","first-page":"16693","DOI":"10.1007\/s00521-022-07280-8","article-title":"Hierarchical goals contextualize local reward decomposition explanations","volume":"35","author":"Rietz","year":"2023","journal-title":"Neural Comput. Appl."},{"key":"mlstacf098bib35","article-title":"Explainable reinforcement learning via model transforms","author":"Finkelstein","year":"2022"},{"key":"mlstacf098bib36","first-page":"pp 1168","article-title":"HIGHLIGHTS: summarizing agent behavior to people","author":"Amir","year":"2018"},{"key":"mlstacf098bib37","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-031-44064-9_20","article-title":"IxDRL: a novel explainable deep reinforcement learning toolkit based on analyses of interestingness","author":"Sequeira","year":"2023"},{"key":"mlstacf098bib38","first-page":"54","article-title":"A survey of sequential pattern mining","volume":"1","author":"Fournier Viger","year":"2017","journal-title":"Data Sci. Pattern Recogn."},{"key":"mlstacf098bib39","first-page":"pp 2094","article-title":"Deep reinforcement learning with double Q-learning","author":"Hasselt","year":"2015"},{"key":"mlstacf098bib40","doi-asserted-by":"publisher","first-page":"8185","DOI":"10.1103\/PhysRevA.45.8185","article-title":"Orbital angular momentum of light and the transformation of Laguerre\u2013Gaussian laser modes","volume":"45","author":"Allen","year":"1992","journal-title":"Phys. Rev. A"},{"key":"mlstacf098bib41","doi-asserted-by":"publisher","first-page":"865","DOI":"10.1103\/RevModPhys.81.865","article-title":"Quantum entanglement","volume":"81","author":"Horodecki","year":"2009","journal-title":"Rev. Mod. Phys."},{"key":"mlstacf098bib42","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.110.030501","article-title":"Structure of multidimensional entanglement in multipartite systems","volume":"110","author":"Huber","year":"2013","journal-title":"Phys. Rev. Lett."},{"key":"mlstacf098bib43","article-title":"Accelerating empowerment computation with UCT tree search","volume":"vol 2018","author":"Salge","year":"2018"},{"key":"mlstacf098bib44","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TCIAIG.2012.2186810","article-title":"A survey of Monte Carlo tree search methods","volume":"4","author":"Browne","year":"2012","journal-title":"IEEE Trans. Comput. Intell. AI Games"},{"key":"mlstacf098bib45","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1070\/PU1988v031n01ABEH002537","article-title":"A simple method of preparing pure states of an optical field, of implementing the Einstein\u2013Podolsky\u2013Rosen experiment and of demonstrating the complementarity principle","volume":"31","author":"Klyshko","year":"1988","journal-title":"Phys.-Usp."},{"key":"mlstacf098bib46","doi-asserted-by":"publisher","first-page":"547","DOI":"10.1080\/09500340.2014.899645","article-title":"Experimental demonstration of Klyshko\u2019s advanced-wave picture using a coincidence-count based, camera-enabled imaging system","volume":"61","author":"Aspden","year":"2014","journal-title":"J. Mod. Opt."},{"key":"mlstacf098bib47","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"mlstacf098bib48","doi-asserted-by":"publisher","first-page":"420","DOI":"10.1038\/nphys4053","article-title":"New tool in the box","volume":"13","author":"Zdeborov\u00e1","year":"2017","journal-title":"Nat. Phys."},{"key":"mlstacf098bib49","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3453160","article-title":"Hierarchical reinforcement learning: a comprehensive survey","volume":"54","author":"Pateria","year":"2021","journal-title":"ACM Comput. Surv."},{"key":"mlstacf098bib50","doi-asserted-by":"publisher","first-page":"64","DOI":"10.5539\/cis.v8n3p64","article-title":"Sequence pattern mining in data streams","volume":"8","author":"Hijawi","year":"2015","journal-title":"Comput. Inf. Sci."},{"key":"mlstacf098bib51","doi-asserted-by":"publisher","first-page":"886","DOI":"10.1109\/ICPADS.2009.64","article-title":"Sequential pattern mining in data streams using the weighted sliding window model","author":"Xu","year":"2009"},{"key":"mlstacf098bib52","doi-asserted-by":"publisher","first-page":"181","DOI":"10.1016\/S0004-3702(99)00052-1","article-title":"Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning","volume":"112","author":"Sutton","year":"1999","journal-title":"Artif. Intell."},{"key":"mlstacf098bib53","doi-asserted-by":"publisher","first-page":"1","DOI":"10.22331\/q-2019-12-16-215","article-title":"Optimizing quantum error correction codes with reinforcement learning","volume":"3","author":"Nautrup","year":"2018","journal-title":"Quantum"},{"key":"mlstacf098bib54","doi-asserted-by":"publisher","first-page":"907","DOI":"10.3389\/fpsyg.2013.00907","article-title":"Novelty or surprise?","volume":"4","author":"Barto","year":"2013","journal-title":"Frontiers Psychol."},{"key":"mlstacf098bib55","article-title":"Surprise-based intrinsic motivation for deep reinforcement learning","author":"Achiam","year":"2016"},{"key":"mlstacf098bib56","doi-asserted-by":"publisher","first-page":"25","DOI":"10.3389\/fnbot.2013.00025","article-title":"Curiosity driven reinforcement learning for motion planning on humanoids","volume":"7","author":"Frank","year":"2014","journal-title":"Front. Neurorobot."},{"key":"mlstacf098bib57","article-title":"VIME: variational information maximizing exploration","volume":"vol 29","author":"Houthooft","year":"2016"},{"key":"mlstacf098bib58","first-page":"pp 4261","article-title":"Curiosity-driven exploration by self-supervised prediction","volume":"vol 6","author":"Pathak","year":"2017"},{"key":"mlstacf098bib59","article-title":"Unifying count-based exploration and intrinsic motivation","volume":"vol 29","author":"Bellemare","year":"2016"},{"key":"mlstacf098bib60","first-page":"pp 2721","article-title":"Count-based exploration with neural density models","author":"Ostrovski","year":"2017"},{"key":"mlstacf098bib61","doi-asserted-by":"publisher","first-page":"42","DOI":"10.3389\/frobt.2020.00042","article-title":"Skill learning by autonomous robotic playing using active learning and exploratory behavior composition","volume":"7","author":"Hangl","year":"2020","journal-title":"Front. Robot. AI"},{"key":"mlstacf098bib62","first-page":"pp 1331","article-title":"CURIOUS: intrinsically motivated modular multi-goal reinforcement learning","author":"Colas","year":"2019"},{"key":"mlstacf098bib63","article-title":"Control what you can: intrinsically motivated task-planning agent","volume":"vol 32","author":"Blaes","year":"2019"},{"key":"mlstacf098bib64","article-title":"Variational information maximisation for intrinsically motivated reinforcement learning","volume":"vol 28","author":"Mohamed","year":"2015"},{"key":"mlstacf098bib65","article-title":"Variational intrinsic control","author":"Gregor","year":"2017"}],"container-title":["Machine Learning: Science and Technology"],"original-title":[],"link":[{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098","content-type":"text\/html","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098\/pdf","content-type":"application\/pdf","content-version":"am","intended-application":"similarity-checking"},{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098\/pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,19]],"date-time":"2023-12-19T03:12:51Z","timestamp":1702955571000},"score":1,"resource":{"primary":{"URL":"https:\/\/iopscience.iop.org\/article\/10.1088\/2632-2153\/acf098"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,1]]},"references-count":65,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,9,6]]},"published-print":{"date-parts":[[2023,9,1]]}},"URL":"https:\/\/doi.org\/10.1088\/2632-2153\/acf098","relation":{},"ISSN":["2632-2153"],"issn-type":[{"value":"2632-2153","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,1]]},"assertion":[{"value":"Automated gadget discovery in the quantum domain","name":"article_title","label":"Article Title"},{"value":"Machine Learning: Science and Technology","name":"journal_title","label":"Journal Title"},{"value":"paper","name":"article_type","label":"Article Type"},{"value":"\u00a9 2023 The Author(s). Published by IOP Publishing Ltd","name":"copyright_information","label":"Copyright Information"},{"value":"2023-01-30","name":"date_received","label":"Date Received","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2023-08-15","name":"date_accepted","label":"Date Accepted","group":{"name":"publication_dates","label":"Publication dates"}},{"value":"2023-09-06","name":"date_epub","label":"Online publication date","group":{"name":"publication_dates","label":"Publication dates"}}]}}