{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,7]],"date-time":"2024-08-07T07:37:23Z","timestamp":1723016243842},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,8]]},"abstract":"<jats:p>We discuss some recent results on Thompson sampling for nonparametric reinforcement learning in countable classes of general stochastic environments. These environments can be non-Markovian, non-ergodic, and partially observable. We show that Thompson sampling learns the environment class in the sense that\n\n(1) asymptotically its value converges in mean to the optimal value and\n\n(2) given a recoverability assumption regret is sublinear.\n\nWe conclude with a discussion about optimality in reinforcement learning.<\/jats:p>","DOI":"10.24963\/ijcai.2017\/688","type":"proceedings-article","created":{"date-parts":[[2017,7,28]],"date-time":"2017-07-28T09:14:07Z","timestamp":1501233247000},"page":"4889-4893","source":"Crossref","is-referenced-by-count":10,"title":["On Thompson Sampling and Asymptotic Optimality"],"prefix":"10.24963","author":[{"given":"Jan","family":"Leike","sequence":"first","affiliation":[{"name":"DeepMind"},{"name":"FHI, University of Oxford"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tor","family":"Lattimore","sequence":"additional","affiliation":[{"name":"DeepMind"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Laurent","family":"Orseau","sequence":"additional","affiliation":[{"name":"DeepMind"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marcus","family":"Hutter","sequence":"additional","affiliation":[{"name":"Australian National University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"number":"26","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)","University of Technology Sydney (UTS)","Australian Computer Society (ACS)"],"acronym":"IJCAI-2017","name":"Twenty-Sixth International Joint Conference on Artificial Intelligence","start":{"date-parts":[[2017,8,19]]},"theme":"Artificial Intelligence","location":"Melbourne, Australia","end":{"date-parts":[[2017,8,26]]}},"container-title":["Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2017,7,28]],"date-time":"2017-07-28T11:55:07Z","timestamp":1501242907000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2017\/688"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2017,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2017\/688","relation":{},"subject":[],"published":{"date-parts":[[2017,8]]}}}