{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:22:58Z","timestamp":1750220578628,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":32,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,8,20]],"date-time":"2020-08-20T00:00:00Z","timestamp":1597881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Science Foundation","award":["IIS-1618948, IIS-1553568"],"award-info":[{"award-number":["IIS-1618948, IIS-1553568"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,8,23]]},"DOI":"10.1145\/3394486.3406484","type":"proceedings-article","created":{"date-parts":[[2020,8,20]],"date-time":"2020-08-20T23:18:56Z","timestamp":1597965536000},"page":"3575-3576","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Learning by Exploration"],"prefix":"10.1145","author":[{"given":"Qingyun","family":"Wu","sequence":"first","affiliation":[{"name":"University of Virginia, Charlottesville, VA, USA"}]},{"given":"Huazheng","family":"Wang","sequence":"additional","affiliation":[{"name":"University of Virginia, Charlottesville, VA, USA"}]},{"given":"Hongning","family":"Wang","sequence":"additional","affiliation":[{"name":"University of Virginia, Charlottesville, VA, USA"}]}],"member":"320","published-online":{"date-parts":[[2020,8,20]]},"reference":[{"key":"e_1_3_2_1_1_1","first-page":"2312","volume-title":"NIPS","author":"Y.","year":"2011","unstructured":"Y. Abbasi-yadkori, D. P\u00e1l , and C. Szepesv\u00e1ri . Improved algorithms for linear stochastic bandits . In NIPS , pages 2312 -- 2320 . 2011 . Y. Abbasi-yadkori, D. P\u00e1l, and C. Szepesv\u00e1ri. 
Improved algorithms for linear stochastic bandits. In NIPS, pages 2312--2320. 2011."},{"key":"e_1_3_2_1_2_1","first-page":"127","volume-title":"International Conference on Machine Learning","author":"Agrawal S.","year":"2013","unstructured":"S. Agrawal and N. Goyal . Thompson sampling for contextual bandits with linear payoffs . In International Conference on Machine Learning , pages 127 -- 135 , 2013 . S. Agrawal and N. Goyal. Thompson sampling for contextual bandits with linear payoffs. In International Conference on Machine Learning, pages 127--135, 2013."},{"key":"e_1_3_2_1_3_1","volume-title":"Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3(Nov):397--422","author":"Auer P.","year":"2002","unstructured":"P. Auer . Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3(Nov):397--422 , 2002 . P. Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 3(Nov):397--422, 2002."},{"key":"e_1_3_2_1_4_1","volume-title":"May","author":"Auer P.","year":"2002","unstructured":"P. Auer , N. Cesa-Bianchi , and P. Fischer . Finite-time analysis of the multiarmed bandit problem. Mach. Learn., 47(2--3):235--256 , May 2002 . P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Mach. Learn., 47(2--3):235--256, May 2002."},{"key":"e_1_3_2_1_5_1","volume-title":"Nearly optimal adaptive procedure with change detection for piecewise-stationary bandit. arXiv preprint arXiv:1802.03692","author":"Cao Y.","year":"2018","unstructured":"Y. Cao , Z. Wen , B. Kveton , and Y. Xie . Nearly optimal adaptive procedure with change detection for piecewise-stationary bandit. arXiv preprint arXiv:1802.03692 , 2018 . Y. Cao, Z. Wen, B. Kveton, and Y. Xie. Nearly optimal adaptive procedure with change detection for piecewise-stationary bandit. 
arXiv preprint arXiv:1802.03692, 2018."},{"key":"e_1_3_2_1_6_1","volume-title":"Pro. NIPS","author":"Cesa-Bianchi N.","year":"2013","unstructured":"N. Cesa-Bianchi , C. Gentile , and G. Zappella . A gang of bandits . In Pro. NIPS , 2013 . N. Cesa-Bianchi, C. Gentile, and G. Zappella. A gang of bandits. In Pro. NIPS, 2013."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.5555\/3398761.3398990"},{"key":"e_1_3_2_1_8_1","volume-title":"Conference on Learning Theory","author":"Chen Y.","year":"2019","unstructured":"Y. Chen , C.-W. Lee , H. Luo , and C.-Y. Wei . A new algorithm for non-stationary contextual bandits: Efficient, optimal, and parameter-free . In Conference on Learning Theory , 2019 . Y. Chen, C.-W. Lee, H. Luo, and C.-Y. Wei. A new algorithm for non-stationary contextual bandits: Efficient, optimal, and parameter-free. In Conference on Learning Theory, 2019."},{"key":"e_1_3_2_1_9_1","volume-title":"On upper-confidence bound policies for non-stationary bandit problems. In arXiv preprint arXiv:0805.3415","author":"Garivier A.","year":"2008","unstructured":"A. Garivier and E. Moulines . On upper-confidence bound policies for non-stationary bandit problems. In arXiv preprint arXiv:0805.3415 ( 2008 ). A. Garivier and E. Moulines. On upper-confidence bound policies for non-stationary bandit problems. In arXiv preprint arXiv:0805.3415 (2008)."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/3305381.3305511"},{"key":"e_1_3_2_1_11_1","first-page":"757","volume-title":"Pro. of the 31st International Conference on Machine Learning (ICML-14)","author":"Gentile C.","year":"2014","unstructured":"C. Gentile , S. Li , and G. Zappella . Online clustering of bandits . In Pro. of the 31st International Conference on Machine Learning (ICML-14) , pages 757 -- 765 , 2014 . C. Gentile, S. Li, and G. Zappella. Online clustering of bandits. In Pro. 
of the 31st International Conference on Machine Learning (ICML-14), pages 757--765, 2014."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/2832747.2832852"},{"key":"e_1_3_2_1_13_1","first-page":"325","volume-title":"Advances in Neural Information Processing Systems","author":"Joseph M.","year":"2016","unstructured":"M. Joseph , M. Kearns , J. H. Morgenstern , and A. Roth . Fairness in learning: Classic and contextual bandits . In Advances in Neural Information Processing Systems , pages 325 -- 333 , 2016 . M. Joseph, M. Kearns, J. H. Morgenstern, and A. Roth. Fairness in learning: Classic and contextual bandits. In Advances in Neural Information Processing Systems, pages 325--333, 2016."},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-34106-9_18"},{"key":"e_1_3_2_1_15_1","volume-title":"Perturbed-history exploration in stochastic linear bandits. arXiv preprint arXiv:1903.09132","author":"Kveton B.","year":"2019","unstructured":"B. Kveton , C. Szepesvari , M. Ghavamzadeh , and C. Boutilier . Perturbed-history exploration in stochastic linear bandits. arXiv preprint arXiv:1903.09132 , 2019 . B. Kveton, C. Szepesvari, M. Ghavamzadeh, and C. Boutilier. Perturbed-history exploration in stochastic linear bandits. arXiv preprint arXiv:1903.09132, 2019."},{"key":"e_1_3_2_1_16_1","first-page":"3601","volume-title":"International Conference on Machine Learning","author":"Kveton B.","year":"2019","unstructured":"B. Kveton , C. Szepesvari , S. Vaswani , Z. Wen , T. Lattimore , and M. Ghavamzadeh . Garbage in, reward out: Bootstrapping exploration in multi-armed bandits . In International Conference on Machine Learning , pages 3601 -- 3610 , 2019 . B. Kveton, C. Szepesvari, S. Vaswani, Z. Wen, T. Lattimore, and M. Ghavamzadeh. Garbage in, reward out: Bootstrapping exploration in multi-armed bandits. 
In International Conference on Machine Learning, pages 3601--3610, 2019."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/2981562.2981665"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/1772690.1772758"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2911548"},{"key":"e_1_3_2_1_20_1","first-page":"1739","volume-title":"Conference On Learning Theory","author":"Luo H.","year":"2018","unstructured":"H. Luo , C.-Y. Wei , A. Agarwal , and J. Langford . Efficient contextual bandits in non-stationary worlds . In Conference On Learning Theory , pages 1739 -- 1776 , 2018 . H. Luo, C.-Y. Wei, A. Agarwal, and J. Langford. Efficient contextual bandits in non-stationary worlds. In Conference On Learning Theory, pages 1739--1776, 2018."},{"key":"e_1_3_2_1_21_1","first-page":"12040","volume-title":"Advances in Neural Information Processing Systems","author":"Russac Y.","year":"2019","unstructured":"Y. Russac , C. Vernade , and O. Capp\u00e9 . Weighted linear bandits for non-stationary environments . In Advances in Neural Information Processing Systems , pages 12040 -- 12049 , 2019 . Y. Russac, C. Vernade, and O. Capp\u00e9. Weighted linear bandits for non-stationary environments. In Advances in Neural Information Processing Systems, pages 12040--12049, 2019."},{"key":"e_1_3_2_1_22_1","first-page":"4296","volume-title":"Advances in Neural Information Processing Systems","author":"Shariff R.","year":"2018","unstructured":"R. Shariff and O. Sheffet . Differentially private contextual linear bandits . In Advances in Neural Information Processing Systems , pages 4296 -- 4306 , 2018 . R. Shariff and O. Sheffet. Differentially private contextual linear bandits. 
In Advances in Neural Information Processing Systems, pages 4296--4306, 2018."},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.1998.712192"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331264"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983847"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/3298483.3298627"},{"key":"e_1_3_2_1_27_1","volume-title":"SIGIR","author":"Wu Q.","year":"2018","unstructured":"Q. Wu , N. Iyer , and H. Wang . Learning contextual bandits in a collaborative environment . In SIGIR 2018 . Q. Wu, N. Iyer, and H. Wang. Learning contextual bandits in a collaborative environment. In SIGIR 2018."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330874"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2911528"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3308558.3313727"},{"key":"e_1_3_2_1_31_1","series-title":"Proceedings of Machine Learning Research","first-page":"7335","volume-title":"Proceedings of the 36th International Conference on Machine Learning","author":"Zhang C.","year":"2019","unstructured":"C. Zhang , A. Agarwal , H. D. Iii , J. Langford , and S. Negahban . Warm-starting contextual bandits: Robustly combining supervised and bandit feedback . In K. Chaudhuri and R. Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning , volume 97 of Proceedings of Machine Learning Research , pages 7335 -- 7344 , Long Beach, California, USA , 09-15 Jun 2019 . PMLR. C. Zhang, A. Agarwal, H. D. Iii, J. Langford, and S. Negahban. Warm-starting contextual bandits: Robustly combining supervised and bandit feedback. In K. Chaudhuri and R. 
Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 7335--7344, Long Beach, California, USA, 09-15 Jun 2019. PMLR."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/186"}],"event":{"name":"KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining","sponsor":["SIGMOD ACM Special Interest Group on Management of Data","SIGKDD ACM Special Interest Group on Knowledge Discovery in Data"],"location":"Virtual Event CA USA","acronym":"KDD '20"},"container-title":["Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394486.3406484","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3394486.3406484","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3394486.3406484","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:31:30Z","timestamp":1750195890000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3394486.3406484"}},"subtitle":["New Challenges in Real-World Environments"],"short-title":[],"issued":{"date-parts":[[2020,8,20]]},"references-count":32,"alternative-id":["10.1145\/3394486.3406484","10.1145\/3394486"],"URL":"https:\/\/doi.org\/10.1145\/3394486.3406484","relation":{},"subject":[],"published":{"date-parts":[[2020,8,20]]},"assertion":[{"value":"2020-08-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}