{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,12]],"date-time":"2026-05-12T22:57:08Z","timestamp":1778626628473,"version":"3.51.4"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2021,8,2]],"date-time":"2021-08-02T00:00:00Z","timestamp":1627862400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,8,2]],"date-time":"2021-08-02T00:00:00Z","timestamp":1627862400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Austrian Science Fund","award":["SFB BeyondC F71"],"award-info":[{"award-number":["SFB BeyondC F71"]}]},{"name":"Austrian Science Fund","award":["P 30937-N27"],"award-info":[{"award-number":["P 30937-N27"]}]},{"DOI":"10.13039\/501100003246","name":"Dutch Research Council","doi-asserted-by":"crossref","award":["024.003.037"],"award-info":[{"award-number":["024.003.037"]}],"id":[{"id":"10.13039\/501100003246","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Deutsches Zentrum f\u00fcr Luft- und Raumfahrt e. V. (DLR)"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Quantum Mach. Intell."],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In recent years, quantum-enhanced machine learning has emerged as a particularly fruitful application of quantum algorithms, covering aspects of supervised, unsupervised and reinforcement learning. Reinforcement learning offers numerous options of how quantum theory can be applied, and is arguably the least explored, from a quantum perspective. Here, an agent explores an environment and tries to find a behavior optimizing some figure of merit. 
Some of the first approaches investigated settings where this exploration can be sped-up, by considering quantum analogs of classical environments, which can then be queried in superposition. If the environments have a strict periodic structure in time (i.e. are strictly episodic), such environments can be effectively converted to conventional oracles encountered in quantum information. However, in general environments, we obtain scenarios that generalize standard oracle tasks. In this work, we consider one such generalization, where the environment is not strictly episodic, which is mapped to an oracle identification setting with a changing oracle. We analyze this case and show that standard amplitude-amplification techniques can, with minor modifications, still be applied to achieve quadratic speed-ups. In addition, we prove that an algorithm based on Grover iterations is optimal for oracle identification even if the oracle changes over time in a way that the \u201crewarded space\u201d is monotonically increasing. 
This result constitutes one of the first generalizations of quantum-accessible reinforcement learning.<\/jats:p>","DOI":"10.1007\/s42484-021-00049-7","type":"journal-article","created":{"date-parts":[[2021,8,2]],"date-time":"2021-08-02T08:03:56Z","timestamp":1627891436000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Quantum-accessible reinforcement learning beyond strictly epochal environments"],"prefix":"10.1007","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9016-3641","authenticated-orcid":false,"given":"A.","family":"Hamann","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"V.","family":"Dunjko","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9137-4814","authenticated-orcid":false,"given":"S.","family":"W\u00f6lk","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,8,2]]},"reference":[{"key":"49_CR1","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1007\/s10994-012-5316-5","volume":"90","author":"E Aimeur","year":"2013","unstructured":"Aimeur E, Brassard G, Gambs S (2013) Quantum speed-up for unsupervised learning. Mach Learn 90:261. https:\/\/doi.org\/10.1007\/s10994-012-5316-5","journal-title":"Mach Learn"},{"key":"49_CR2","doi-asserted-by":"publisher","first-page":"750","DOI":"10.1006\/jcss.2002.1826","volume":"64","author":"A Ambainis","year":"2002","unstructured":"Ambainis A (2002) Quantum lower bounds by quantum arguments. J Comp Syst Sci 64:750","journal-title":"J Comp Syst Sci"},{"key":"49_CR3","doi-asserted-by":"publisher","first-page":"220","DOI":"10.1016\/j.jcss.2005.06.006","volume":"72","author":"A Ambainis","year":"2006","unstructured":"Ambainis A (2006) Polynomial degree vs. quantum query complexity. 
J Comp Syst Sci 72:220","journal-title":"J Comp Syst Sci"},{"key":"49_CR4","doi-asserted-by":"publisher","first-page":"903","DOI":"10.1137\/18M117563X","volume":"48","author":"S Arunachalam","year":"2019","unstructured":"Arunachalam S, Bri\u00ebt J, Palazuelos C (2019) Quantum query algorithms are completely bounded forms. SIAM J Comp 48:903","journal-title":"SIAM J Comp"},{"key":"49_CR5","unstructured":"Arunachalam S, de Wolf R (2015) Optimizing the number of gates in quantum search. Quantum Inform Comput 17"},{"key":"49_CR6","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1038\/nature23474","volume":"549","author":"J Biamonte","year":"2017","unstructured":"Biamonte J, Wittek P, Pancotti N, Rebentrost P, Wiebe N, Lloyd S (2017) Quantum machine learning. Nature 549:195","journal-title":"Nature"},{"key":"49_CR7","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1002\/(SICI)1521-3978(199806)46:4\/5<493::AID-PROP493>3.0.CO;2-P","volume":"46","author":"M Boyer","year":"1998","unstructured":"Boyer M, Brassard G, Hoyer P, Tapp A (1998) Tight bounds on quantum searching. Fortschr Phys 46:493. https:\/\/doi.org\/10.1002\/3527603093.ch10","journal-title":"Fortschr Phys"},{"key":"49_CR8","unstructured":"Brassard G, Hoyer P, Mosca M, Tapp A (2000) Quantum amplitude amplification and estimation. arXiv:quant-ph\/0005055"},{"key":"49_CR9","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1038\/srep00400","volume":"2","author":"HJ Briegel","year":"2012","unstructured":"Briegel HJ, De las Cuevas G (2012) Projective simulation for artificial intelligence. Sci Rep 2:400. 
https:\/\/doi.org\/10.1038\/srep00400","journal-title":"Sci Rep"},{"issue":"1","key":"49_CR10","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1109\/MNET.001.1900092","volume":"34","author":"AS Cacciapuoti","year":"2020","unstructured":"Cacciapuoti AS, Caleffi M, Tafuri F, Cataliotti FS, Gherardini S, Bianchi G (2020) Quantum internet: networking challenges in distributed quantum computing. IEEE Netw 34(1):137. https:\/\/doi.org\/10.1109\/MNET.001.1900092","journal-title":"IEEE Netw"},{"key":"49_CR11","doi-asserted-by":"crossref","unstructured":"Chia NH, Gily\u00e9n A, Li T, Lin HH, Tang E, Wang C (2019) Sampling-based sublinear low-rank matrix arithmetic framework for dequantizing quantum machine learning. arXiv:1910.06151","DOI":"10.1145\/3357713.3384314"},{"key":"49_CR12","unstructured":"Cornelissen A (2018) Quantum gradient estimation and its application to quantum reinforcement learning. Master\u2019s thesis, Delft University of Technology"},{"key":"49_CR13","doi-asserted-by":"crossref","unstructured":"da Silva BC, Basso EW, Bazzan ALC, Engel PM (2006) Dealing with non-stationary environments using context detection. In: Proceedings of the 23rd international conference on machine learning, ICML, vol 2006, p 217","DOI":"10.1145\/1143844.1143872"},{"key":"49_CR14","doi-asserted-by":"publisher","first-page":"130501","DOI":"10.1103\/PhysRevLett.117.130501","volume":"117","author":"V Dunjko","year":"2016","unstructured":"Dunjko V, Taylor JM, Briegel HJ (2016) Quantum-enhanced machine learning. Phys Rev Lett 117:130501. https:\/\/doi.org\/10.1103\/PhysRevLett.117.130501","journal-title":"Phys Rev Lett"},{"key":"49_CR15","doi-asserted-by":"crossref","unstructured":"Dunjko V, Liu YK, Wu X, Taylor JM (2017) Super-polynomial and exponential improvements for quantum-enhanced reinforcement learning. 
arXiv:1710.11160","DOI":"10.1109\/SMC.2017.8122616"},{"key":"49_CR16","doi-asserted-by":"publisher","first-page":"074001","DOI":"10.1088\/1361-6633\/aab406","volume":"81","author":"V Dunjko","year":"2018","unstructured":"Dunjko V, Briegel H (2018) Machine learning & artificial intelligence in the quantum domain: a review of recent progress. Rep Prog Phys 81:074001. https:\/\/doi.org\/10.1088\/1361-6633\/aab406","journal-title":"Rep Prog Phys"},{"key":"49_CR17","unstructured":"Farhi E, Neven H (2018) Classification with quantum neural networks on near term processors. arXiv:1802.06002"},{"key":"49_CR18","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1103\/PhysRevLett.79.325","volume":"79","author":"LK Grover","year":"1997","unstructured":"Grover LK (1997) Quantum mechanics helps in searching for a needle in a haystack. Phys Rev Lett 79:325. https:\/\/doi.org\/10.1103\/PhysRevLett.79.325","journal-title":"Phys Rev Lett"},{"key":"49_CR19","doi-asserted-by":"publisher","first-page":"4329","DOI":"10.1103\/PhysRevLett.80.4329","volume":"80","author":"LK Grover","year":"1998","unstructured":"Grover LK (1998) Quantum computers can search rapidly by using almost any transformation. Phys Rev Lett 80:4329. https:\/\/doi.org\/10.1103\/PhysRevLett.80.4329","journal-title":"Phys Rev Lett"},{"key":"49_CR20","unstructured":"Gyurik C, Cade C, Dunjko V (2020) Towards quantum advantage for topological data analysis. arXiv:2005.02607"},{"key":"49_CR21","unstructured":"Han M (2018) Reinforcement learning approaches in dynamic environments. Databases [cs.DB].T\u00e9l\u00e9com ParisTech. English. tel-01891805"},{"key":"49_CR22","doi-asserted-by":"publisher","first-page":"150502","DOI":"10.1103\/PhysRevLett.103.150502","volume":"103","author":"AW Harrow","year":"2009","unstructured":"Harrow AW, Hassidim A, Lloyd S (2009) Quantum algorithm for linear systems of equations. 
Phys Rev Lett 103:150502. https:\/\/doi.org\/10.1103\/PhysRevLett.103.150502","journal-title":"Phys Rev Lett"},{"issue":"7747","key":"49_CR23","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1038\/s41586-019-0980-2","volume":"567","author":"V Havl\u00edcek","year":"2019","unstructured":"Havl\u00edcek V, C\u00f3rcoles AD, Temme K, Harrow AW, Kandala A, Chow JM, Gambetta JM (2019) Supervised learning with quantum-enhanced feature spaces. Nature 567(7747):209. https:\/\/doi.org\/10.1038\/s41586-019-0980-2","journal-title":"Nature"},{"key":"49_CR24","unstructured":"Jerbi S, Poulsen Nautrup H, Trenkwalder LM, Briegel HJ, Dunjko V (2019) A framework for deep energy-based reinforcement learning with quantum speed-up. arXiv:1910.12760"},{"key":"49_CR25","doi-asserted-by":"publisher","first-page":"1023","DOI":"10.1038\/nature07127","volume":"453","author":"HJ Kimble","year":"2008","unstructured":"Kimble HJ (2008) The quantum internet. Nature 453:1023","journal-title":"Nature"},{"key":"49_CR26","unstructured":"Levit A, Crawford D, Ghadermarzy N, Oberoi JS, Zahedinejad E, Ronagh P (2017) Free energy-based reinforcement learning using a quantum processor. arXiv:1706.00074"},{"key":"49_CR27","doi-asserted-by":"publisher","first-page":"64639","DOI":"10.1109\/ACCESS.2018.2876494","volume":"6","author":"AA Melnikov","year":"2018","unstructured":"Melnikov AA, Makmal A, Briegel HJ (2018) Benchmarking projective simulation in navigation problems. IEEE Access 6:64639. https:\/\/doi.org\/10.1109\/ACCESS.2018.2876494","journal-title":"IEEE Access"},{"issue":"6","key":"49_CR28","doi-asserted-by":"publisher","first-page":"1221","DOI":"10.1073\/pnas.1714936115","volume":"115","author":"AA Melnikov","year":"2018","unstructured":"Melnikov AA, Poulsen Nautrup H, Krenn M, Dunjko V, Tiersch M, Zeilinger A, Briegel HJ (2018) Active learning machine learns to create new quantum experiments. Proc Nat Ac Sci 115(6):1221. 
https:\/\/www.pnas.org\/content\/115\/6\/1221","journal-title":"Proc Nat Ac Sci"},{"key":"49_CR29","doi-asserted-by":"publisher","first-page":"71","DOI":"10.3389\/fphy.2017.00071","volume":"5","author":"F Neukart","year":"2018","unstructured":"Neukart F, Von Dollen D, Seidel C, Compostella G (2018) Quantum-enhanced reinforcement learning for finite-episode games with discrete state spaces. Front Phys 5:71. https:\/\/doi.org\/10.3389\/fphy.2017.00071","journal-title":"Front Phys"},{"key":"49_CR30","doi-asserted-by":"publisher","first-page":"031002","DOI":"10.1103\/PhysRevX.4.031002","volume":"4","author":"GD Paparo","year":"2014","unstructured":"Paparo GD, Dunjko V, Makmal A, Martin-Delgado MA, Briegel HJ (2014) Quantum speedup for active learning agents. Phys Rev X 4:031002. https:\/\/doi.org\/10.1103\/PhysRevX.4.031002","journal-title":"Phys Rev X"},{"key":"49_CR31","unstructured":"Ronagh P (2019) Quantum algorithms for solving dynamic programming problems. arXiv:1906.02229"},{"key":"49_CR32","volume-title":"Artificial intelligence: a modern approach","author":"SJ Russell","year":"2003","unstructured":"Russell SJ, Norvig P (2003) Artificial intelligence: a modern approach, 2nd edn. Pearson Education, London","edition":"2nd edn."},{"key":"49_CR33","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1038\/s41586-021-03242-7","volume":"591","author":"V Saggio","year":"2021","unstructured":"Saggio V, Asenbeck B, Hamann A, Str\u00f6mberg T, Schiansky P, Dunjko V, Friis N, Harris NC, Hochberg M, Englund D, W\u00f6lk S, Briegel HJ, Walther P (2021) Experimental quantum speed-up in reinforcement learning agents. Nature 591:229. https:\/\/doi.org\/10.1038\/s41586-021-03242-7","journal-title":"Nature"},{"key":"49_CR34","unstructured":"Singh S, Bertsekas D (1996) Reinforcement learning for dynamic channel allocation in cellular telephone systems. 
In: Proceedings of the 9th International Conference on Neural Information Processing Systems, NIPS 1996, p 974"},{"key":"49_CR35","doi-asserted-by":"publisher","first-page":"015014","DOI":"10.1088\/2058-9565\/aaef5e","volume":"4","author":"T Sriarunothai","year":"2019","unstructured":"Sriarunothai T, W\u00f6lk S, Giri GS, Friis N, Dunjko V, Briegel HJ, Wunderlich C (2019) Speeding-up the decision making of a learning agent using an ion trap quantum processor. Quantum Sci Technol 4:015014","journal-title":"Quantum Sci Technol"},{"key":"49_CR36","volume-title":"Reinforcement learning","author":"R Sutton","year":"1998","unstructured":"Sutton R, Barto A (1998) Reinforcement learning. The MIT Press, Cambridge"},{"key":"49_CR37","unstructured":"Tesauro G, Das R, Chan H, Kephart J, Levine D, Rawson F, Lefurgy C (2008) Managing power consumption and performance of computing systems using reinforcement learning. In: Advances in neural information processing systems, vol 20, p 1497"},{"key":"49_CR38","doi-asserted-by":"publisher","first-page":"210501","DOI":"10.1103\/PhysRevLett.113.210501","volume":"113","author":"TJ Yoder","year":"2014","unstructured":"Yoder TJ, Low GH, Chuang IL (2014) Fixed-point quantum search with an optimal number of queries. Phys Rev Lett 113:210501. https:\/\/doi.org\/10.1103\/PhysRevLett.113.210501","journal-title":"Phys Rev Lett"},{"key":"49_CR39","doi-asserted-by":"publisher","first-page":"2746","DOI":"10.1103\/PhysRevA.60.2746","volume":"60","author":"C Zalka","year":"1999","unstructured":"Zalka C (1999) Grover\u2019s quantum searching algorithm is optimal. Phys. Rev. A 60:2746. https:\/\/doi.org\/10.1103\/PhysRevA.60.2746","journal-title":"Phys. Rev. 
A"}],"container-title":["Quantum Machine Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42484-021-00049-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s42484-021-00049-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42484-021-00049-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,12,16]],"date-time":"2021-12-16T09:17:00Z","timestamp":1639646220000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s42484-021-00049-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,2]]},"references-count":39,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["49"],"URL":"https:\/\/doi.org\/10.1007\/s42484-021-00049-7","relation":{},"ISSN":["2524-4906","2524-4914"],"issn-type":[{"value":"2524-4906","type":"print"},{"value":"2524-4914","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,8,2]]},"assertion":[{"value":"27 August 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 May 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 August 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"22"}}