{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,27]],"date-time":"2026-01-27T10:02:19Z","timestamp":1769508139246,"version":"3.49.0"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2020,11,5]],"date-time":"2020-11-05T00:00:00Z","timestamp":1604534400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,11,5]],"date-time":"2020-11-05T00:00:00Z","timestamp":1604534400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Austrian Science Fund","award":["SFB BeyondC F7102"],"award-info":[{"award-number":["SFB BeyondC F7102"]}]},{"DOI":"10.13039\/501100002428","name":"Austrian Science Fund","doi-asserted-by":"publisher","award":["SFB BeyondC F7102"],"award-info":[{"award-number":["SFB BeyondC F7102"]}],"id":[{"id":"10.13039\/501100002428","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002428","name":"Austrian Science Fund","doi-asserted-by":"publisher","award":["DK-ALM:W1259-N27"],"award-info":[{"award-number":["DK-ALM:W1259-N27"]}],"id":[{"id":"10.13039\/501100002428","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002428","name":"Austrian Science Fund","doi-asserted-by":"publisher","award":["DK-ALM:W1259-N27"],"award-info":[{"award-number":["DK-ALM:W1259-N27"]}],"id":[{"id":"10.13039\/501100002428","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Quantum Software Consortium","award":["none"],"award-info":[{"award-number":["none"]}]},{"DOI":"10.13039\/501100002428","name":"Austrian Science Fund","doi-asserted-by":"publisher","award":["SFB FoQus F4212"],"award-info":[{"award-number":["SFB FoQus F4212"]}],"id":[{"id":"10.13039\/501100002428","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002428","name":"Austrian Science 
Fund","doi-asserted-by":"publisher","award":["SFB FoQus F4212"],"award-info":[{"award-number":["SFB FoQus F4212"]}],"id":[{"id":"10.13039\/501100002428","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002428","name":"Austrian Science Fund","doi-asserted-by":"publisher","award":["SFB FoQus F4212"],"award-info":[{"award-number":["SFB FoQus F4212"]}],"id":[{"id":"10.13039\/501100002428","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Quantum Mach. Intell."],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In recent years, the interest in leveraging quantum effects for enhancing machine learning tasks has significantly increased. Many algorithms speeding up supervised and unsupervised learning were established. The first framework in which ways to exploit quantum resources specifically for the broader context of reinforcement learning were found is projective simulation. Projective simulation presents an agent-based reinforcement learning approach designed in a manner which may support quantum walk-based speedups. Although classical variants of projective simulation have been benchmarked against common reinforcement learning algorithms, very few formal theoretical analyses have been provided for its performance in standard learning scenarios. In this paper, we provide a detailed formal discussion of the properties of this model. Specifically, we prove that one version of the projective simulation model, understood as a reinforcement learning approach, converges to optimal behavior in a large class of Markov decision processes. 
This proof shows that a physically inspired approach to reinforcement learning can guarantee to converge.<\/jats:p>","DOI":"10.1007\/s42484-020-00023-9","type":"journal-article","created":{"date-parts":[[2020,11,5]],"date-time":"2020-11-05T10:03:09Z","timestamp":1604570589000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["On the convergence of projective-simulation\u2013based reinforcement learning in Markov decision processes"],"prefix":"10.1007","volume":"2","author":[{"given":"W. L.","family":"Boyajian","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"J.","family":"Clausen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5690-707X","authenticated-orcid":false,"given":"L. M.","family":"Trenkwalder","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"V.","family":"Dunjko","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"H. J.","family":"Briegel","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,11,5]]},"reference":[{"key":"23_CR1","doi-asserted-by":"publisher","first-page":"389","DOI":"10.1038\/377389a0","volume":"377","author":"CH Bennett","year":"1995","unstructured":"Bennett CH, DiVincenzo DP (1995) Towards an engineering era? 
Nature 377:389\u2013390","journal-title":"Nature"},{"key":"23_CR2","doi-asserted-by":"crossref","unstructured":"Biamonte J, Wittek P, Pancotti N, Rebentrost P, Wiebe N, Lloyd S (2016) Quantum machine learning 549:11","DOI":"10.1038\/nature23474"},{"key":"23_CR3","doi-asserted-by":"publisher","first-page":"522","DOI":"10.1038\/srep00522","volume":"2","author":"HJ Briegel","year":"2012","unstructured":"Briegel HJ (2012) On creative machines and the physical origins of freedom. Sci Rep 2:522","journal-title":"Sci Rep"},{"key":"23_CR4","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1038\/srep00400","volume":"2","author":"HJ Briegel","year":"2012","unstructured":"Briegel HJ, las Cuevas GD (2012) Projective simulation for artificial intelligence. Sci Rep 2:400","journal-title":"Sci Rep"},{"key":"23_CR5","doi-asserted-by":"publisher","first-page":"022303","DOI":"10.1103\/PhysRevA.97.022303","volume":"97","author":"J Clausen","year":"2018","unstructured":"Clausen J, Briegel HJ (2018) Quantum machine learning with glow for episodic tasks and decision games. Phys Rev A 97:022303","journal-title":"Phys Rev A"},{"issue":"3","key":"23_CR6","first-page":"295","volume":"14","author":"P Dayan","year":"1994","unstructured":"Dayan P, Sejnowski TJ (1994) TD (\u03bb) converges with probability 1. Mach Learn 14(3):295\u2013301","journal-title":"Mach Learn"},{"key":"23_CR7","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1088\/1361-6633\/aab406","volume":"81","author":"V Dunjko","year":"2018","unstructured":"Dunjko V, Briegel H (2018) Machine learning & artificial intelligence in the quantum domain: a review of recent progress. Rep Prog Phys 81:7","journal-title":"Rep Prog Phys"},{"key":"23_CR8","doi-asserted-by":"publisher","first-page":"130501","DOI":"10.1103\/PhysRevLett.117.130501","volume":"117","author":"V Dunjko","year":"2016","unstructured":"Dunjko V, Taylor JM, Briegel HJ (2016) Quantum-enhanced machine learning. 
Phys Rev Lett 117:130501","journal-title":"Phys Rev Lett"},{"key":"23_CR9","doi-asserted-by":"crossref","unstructured":"Dvoretzky A, et al. (1956) On stochastic approximation","DOI":"10.1525\/9780520313880-007"},{"key":"23_CR10","doi-asserted-by":"publisher","unstructured":"Hangl S, Ugur E, Szedmak S, Piater J (2016) Robotic playing for hierarchical complex skill learning. In: Proc. IEEE\/RSJ Int. Conf. Intell. Robots Syst. https:\/\/doi.org\/10.1109\/IROS.2016.7759434, pp 2799\u20132804","DOI":"10.1109\/IROS.2016.7759434"},{"key":"23_CR11","doi-asserted-by":"publisher","first-page":"42","DOI":"10.3389\/frobt.2020.00042","volume":"7","author":"S Hangl","year":"2020","unstructured":"Hangl S, Dunjko V, Briegel HJ, Piater J (2020) Skill learning by autonomous robotic playing using active learning and exploratory behavior composition. Frontiers in Robotics and AI 7:42. https:\/\/doi.org\/10.3389\/frobt.2020.00042. https:\/\/www.frontiersin.org\/article\/10.3389\/frobt.2020.00042","journal-title":"Frontiers in Robotics and AI"},{"key":"23_CR12","doi-asserted-by":"crossref","unstructured":"Jaakkola T, Jordan MI, Singh SP (1994) Convergence of stochastic iterative dynamic programming algorithms. In: Advances in neural information processing systems, pp 703\u2013710","DOI":"10.1162\/neco.1994.6.6.1185"},{"key":"23_CR13","doi-asserted-by":"publisher","first-page":"2110","DOI":"10.1109\/ACCESS.2016.2556579","volume":"4","author":"A Makmal","year":"2016","unstructured":"Makmal A, Melnikov AA, Dunjko V, Briegel HJ (2016) Meta-learning within projective simulation. 
IEEE Access 4:2110","journal-title":"IEEE Access"},{"key":"23_CR14","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1007\/s00354-015-0102-0","volume":"33","author":"J Mautner","year":"2015","unstructured":"Mautner J, Makmal A, Manzano D, Tiersch M, Briegel HJ (2015) Projective simulation for classical learning agents: A comprehensive investigation. New Gener Comput 33:69","journal-title":"New Gener Comput"},{"key":"23_CR15","doi-asserted-by":"publisher","first-page":"64639","DOI":"10.1109\/ACCESS.2018.2876494","volume":"6","author":"AA Melnikov","year":"2018","unstructured":"Melnikov AA, Makmal A, Briegel HJ (2018) Benchmarking projective simulation in navigation problems. IEEE Access 6:64639\u201364648","journal-title":"IEEE Access"},{"key":"23_CR16","doi-asserted-by":"publisher","first-page":"14430","DOI":"10.1038\/s41598-017-14740-y","volume":"7","author":"AA Melnikov","year":"2017","unstructured":"Melnikov AA, Makmal A, Dunjko V, Briegel HJ (2017) Projective simulation with generalization. Sci Rep 7:14430","journal-title":"Sci Rep"},{"key":"23_CR17","doi-asserted-by":"publisher","first-page":"1221","DOI":"10.1073\/pnas.1714936115","volume":"115","author":"AA Melnikov","year":"2018","unstructured":"Melnikov AA, Poulsen Nautrup H, Krenn M, Dunjko V, Tiersch M, Zeilinger A, Briegel HJ (2018) Active learning machine learns to create new quantum experiments. Proc Natl Acad Sci U.S.A 115:1221","journal-title":"Proc Natl Acad Sci U.S.A"},{"key":"23_CR18","volume-title":"Quantum computation and quantum information","author":"MA Nielsen","year":"2000","unstructured":"Nielsen MA, Chuang IL (2000) Quantum computation and quantum information. 
Cambridge University Press, Cambridge"},{"key":"23_CR19","doi-asserted-by":"publisher","first-page":"215","DOI":"10.22331\/q-2019-12-16-215","volume":"3","author":"HP Nautrup","year":"2019","unstructured":"Nautrup HP, Delfosse N, Dunjko V, Briegel HJ, Friis N (2019) Optimizing quantum error correction codes with reinforcement learning. Quantum 3:215. https:\/\/doi.org\/10.22331\/q-2019-12-16-215","journal-title":"Quantum"},{"key":"23_CR20","first-page":"031002","volume":"4","author":"G Paparo","year":"2014","unstructured":"Paparo G, Dunjko V, Makmal A, Martin-Delgado MA, Briegel HJ (2014) Quantum speed-up for active learning agents. Phys Rev X 4:031002","journal-title":"Phys Rev X"},{"key":"23_CR21","doi-asserted-by":"publisher","first-page":"2567","DOI":"10.1007\/s11128-014-0809-8","volume":"13","author":"M Schuld","year":"2014","unstructured":"Schuld M, Sinayskiy I, Petruccione F (2014) The quest for a quantum neural network. Quantum Inf Process 13:2567\u20132586","journal-title":"Quantum Inf Process"},{"issue":"3","key":"23_CR22","doi-asserted-by":"publisher","first-page":"287","DOI":"10.1023\/A:1007678930559","volume":"38","author":"S Singh","year":"2000","unstructured":"Singh S, Jaakkola T, Littman ML, Szepesv\u00e1ri C (2000) Convergence results for single-step on-policy reinforcement-learning algorithms. Mach Learn 38(3):287\u2013308","journal-title":"Mach Learn"},{"key":"23_CR23","unstructured":"Sriarunothai T, W\u00f6lk S, Giri GS, Friis N, Dunjko V, Briegel HJ, Wunderlich C (2017) Speeding-up the decision making of a learning agent using an ion trap quantum processor. arXiv:https:\/\/arxiv.org\/abs\/1709.01366"},{"key":"23_CR24","volume-title":"Reinforcement Learning: An Introduction, 2nd edn","author":"RS Sutton","year":"2018","unstructured":"Sutton RS, Barto AG (2018) Reinforcement Learning: An Introduction, 2nd edn. 
MIT Press, Cambridge, MA"},{"issue":"3-4","key":"23_CR25","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1007\/BF00992698","volume":"8","author":"CJCH Watkins","year":"1992","unstructured":"Watkins CJCH, Dayan P (1992) Q-learning. Machine learning 8(3-4):279\u2013292","journal-title":"Machine learning"}],"container-title":["Quantum Machine Intelligence"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s42484-020-00023-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s42484-020-00023-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s42484-020-00023-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,4,12]],"date-time":"2021-04-12T16:28:36Z","timestamp":1618244916000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s42484-020-00023-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,5]]},"references-count":25,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["23"],"URL":"https:\/\/doi.org\/10.1007\/s42484-020-00023-9","relation":{},"ISSN":["2524-4906","2524-4914"],"issn-type":[{"value":"2524-4906","type":"print"},{"value":"2524-4914","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,11,5]]},"assertion":[{"value":"7 November 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 July 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 November 2020","order":3,"name":"first_online","label":"First 
Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"13"}}