{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,11]],"date-time":"2026-06-11T06:55:54Z","timestamp":1781160954697,"version":"3.54.1"},"reference-count":313,"publisher":"Emerald","issue":"1-2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,11,8]]},"abstract":"<jats:p>Multi-armed bandits are a simple but very powerful framework for algorithms that make decisions over time under uncertainty. An enormous body of work has accumulated over the years, covered in several books and surveys. This book provides a more introductory, textbook-like treatment of the subject. Each chapter tackles a particular line of work, providing a self-contained, teachable technical introduction and a brief review of the further developments.<\/jats:p>","DOI":"10.1561\/2200000068","type":"journal-article","created":{"date-parts":[[2019,11,8]],"date-time":"2019-11-08T10:00:08Z","timestamp":1573207208000},"page":"1-286","source":"Crossref","is-referenced-by-count":457,"title":["Introduction to Multi-Armed Bandits"],"prefix":"10.1108","volume":"12","author":[{"given":"Aleksandrs","family":"Slivkins","sequence":"first","affiliation":[{"name":"Microsoft Research NYC"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"140","published-online":{"date-parts":[[2019,11,8]]},"reference":[{"key":"2026033012245038300_ref001","first-page":"2312","volume-title":"25th Advances in Neural Information Processing Systems (NIPS)","author":"Abbasi-Yadkori","year":"2011"},{"key":"2026033012245038300_ref002","first-page":"6584","volume-title":"Advances in Neural Information Processing Systems (NIPS)","author":"Abernethy","year":"2017"},{"key":"2026033012245038300_ref003","first-page":"49","volume-title":"17th ACM Symp. 
on Parallel Algorithms and Architectures (SPAA)","author":"Abraham","year":"2005"},{"key":"2026033012245038300_ref004","volume-title":"Fairness, Accountability, and Transparency in Machine Learning (FATML)","author":"Agarwal","year":"2017"},{"key":"2026033012245038300_ref005","unstructured":"Agarwal, A., S. Bird, M. Cozowicz, M. Dudik, L. Hoang, J. Langford, L. Li, D. Melamed, G. Oshri, S. Sen, and A. Slivkins. 2016. \u201cMultiworld Testing: A System for Experimentation, Learning, And Decision\u2013Making\u201d. A white paper, available at https:\/\/github.com\/Microsoft\/mwt-ds\/raw\/master\/images\/MWT-WhitePaper.pdf."},{"key":"2026033012245038300_ref006","unstructured":"Agarwal, A., S. Bird, M. Cozowicz, L. Hoang, J. Langford, S. Lee, J. Li, D. Melamed, G. Oshri, O. Ribas, S. Sen, and A. Slivkins. 2017b. \u201cMaking Contextual Decisions with Low Technical Debt\u201d. Technical report at arxiv.org\/abs\/1606.03966."},{"key":"2026033012245038300_ref007","first-page":"1926","volume-title":"15th Intl. Conf. on Artificial Intelligence and Statistics (AISTATS)","author":"Agarwal","year":"2012"},{"key":"2026033012245038300_ref008","volume-title":"31st Intl. Conf. on Machine Learning (ICML)","author":"Agarwal","year":"2014"},{"key":"2026033012245038300_ref009","first-page":"12","volume-title":"30th Conf. on Learning Theory (COLT)","author":"Agarwal","year":"2017"},{"issue":"2","key":"2026033012245038300_ref010","first-page":"701","article-title":"Competition and Innovation: An Inverted U Relationship","volume":"120","author":"Aghion","year":"2005","journal-title":"Quarterly J. of Economics"},{"issue":"6","key":"2026033012245038300_ref011","doi-asserted-by":"crossref","first-page":"1926","DOI":"10.1137\/S0363012992237273","article-title":"The continuum-armed bandit problem","volume":"33","author":"Agrawal","year":"1995","journal-title":"SIAM J. Control and Optimization"},{"key":"2026033012245038300_ref012","first-page":"599","volume-title":"17th ACM Conf. 
on Economics and Computation (ACM EC)","author":"Agrawal","year":"2016"},{"key":"2026033012245038300_ref013","volume-title":"15th ACM Conf. on Economics and Computation (ACM EC)","author":"Agrawal","year":"2014"},{"key":"2026033012245038300_ref014","author":"Agrawal","year":"2016","journal-title":"29th Advances in Neural Information Processing Systems (NIPS)"},{"key":"2026033012245038300_ref015","volume-title":"29th Conf. on Learning Theory (COLT)","author":"Agrawal","year":"2016"},{"key":"2026033012245038300_ref016","volume-title":"25th Conf. on Learning Theory (COLT)","author":"Agrawal","year":"2012"},{"key":"2026033012245038300_ref017","first-page":"99","volume-title":"16th Intl. Conf. on Artificial Intelligence and Statistics (AISTATS)","author":"Agrawal","year":"2013"},{"issue":"4","key":"2026033012245038300_ref018","doi-asserted-by":"crossref","first-page":"876","DOI":"10.1287\/opre.2014.1289","article-title":"A Dynamic Near-Optimal Algorithm for Online Linear Programming","volume":"62","author":"Agrawal","year":"2014","journal-title":"Operations Research"},{"key":"2026033012245038300_ref019","first-page":"856","volume-title":"Intl. Conf. on Machine Learning (ICML)","author":"Ailon","year":"2014"},{"key":"2026033012245038300_ref020","first-page":"23","volume-title":"28th Conf. on Learning Theory (COLT)","author":"Alon","year":"2015"},{"key":"2026033012245038300_ref021","first-page":"1610","volume-title":"27th Advances in Neural Information Processing Systems (NIPS)","author":"Alon","year":"2013"},{"key":"2026033012245038300_ref022","volume-title":"24th Conf. 
on Learning Theory (COLT)","author":"Amin","year":"2011"},{"key":"2026033012245038300_ref023","first-page":"1169","volume-title":"26th Advances in Neural Information Processing Systems (NIPS)","author":"Amin","year":"2013"},{"key":"2026033012245038300_ref024","first-page":"622","volume-title":"27th Advances in Neural Information Processing Systems (NIPS)","author":"Amin","year":"2014"},{"issue":"4","key":"2026033012245038300_ref025","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1109\/JSAC.2011.110406","article-title":"Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret","volume":"29","author":"Anandkumar","year":"2011","journal-title":"IEEE Journal on Selected Areas in Communications"},{"key":"2026033012245038300_ref026","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1016\/j.tcs.2012.10.008","article-title":"Toward a classification of finite partial-monitoring games","volume":"473","author":"Antos","year":"2013","journal-title":"Theor. Comput. Sci"},{"key":"2026033012245038300_ref027","volume-title":"20th ACM Conf. 
on Economics and Computation (ACM EC)","author":"Aridor","year":"2019"},{"issue":"1","key":"2026033012245038300_ref028","doi-asserted-by":"crossref","first-page":"121","DOI":"10.4086\/toc.2012.v008a006","article-title":"The Multiplicative Weights Update Method: a Meta\u2013Algorithm and Applications","volume":"8","author":"Arora","year":"2012","journal-title":"Theory of Computing"},{"issue":"6","key":"2026033012245038300_ref029","doi-asserted-by":"crossref","first-page":"2463","DOI":"10.3982\/ECTA6995","article-title":"An Efficient Dynamic Mechanism","volume":"81","author":"Athey","year":"2013","journal-title":"Econometrica"},{"key":"2026033012245038300_ref030","doi-asserted-by":"crossref","first-page":"1876","DOI":"10.1016\/j.tcs.2009.01.016","article-title":"Exploration\u2013exploitation Trade\u2013off using Variance Estimates in Multi\u2013Armed Bandits","volume":"410","author":"Audibert","year":"2009","journal-title":"Theoretical Computer Science"},{"key":"2026033012245038300_ref031","first-page":"2785","article-title":"Regret Bounds and Minimax Policies under Partial Monitoring","volume":"11","author":"Audibert","year":"2010","journal-title":"J. of Machine Learning Research (JMLR)"},{"key":"2026033012245038300_ref032","first-page":"41","volume-title":"23rd Conf. on Learning Theory (COLT)","author":"Audibert","year":"2010"},{"issue":"2\u20133","key":"2026033012245038300_ref033","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1023\/A:1013689704352","article-title":"Finite\u2013time Analysis of the Multiarmed Bandit Problem","volume":"47","author":"Auer","year":"2002","journal-title":"Machine Learning"},{"issue":"1","key":"2026033012245038300_ref034","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1137\/S0097539701398375","article-title":"The Nonstochastic Multiarmed Bandit Problem","volume":"32","author":"Auer","year":"2002","journal-title":"SIAM J. Comput"},{"key":"2026033012245038300_ref035","volume-title":"29th Conf. 
on Learning Theory (COLT)","author":"Auer","year":"2016"},{"key":"2026033012245038300_ref036","volume-title":"Conf. on Learning Theory (COLT)","author":"Auer","year":"2019"},{"key":"2026033012245038300_ref037","doi-asserted-by":"crossref","first-page":"454","DOI":"10.1007\/978-3-540-72927-3_33","volume-title":"20th Conf. on Learning Theory (COLT)","author":"Auer","year":"2007"},{"key":"2026033012245038300_ref038","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1016\/0304-4068(74)90037-8","article-title":"Subjectivity and correlation in randomized strategies","volume":"1","author":"Aumann","year":"1974","journal-title":"J. of Mathematical Economics"},{"key":"2026033012245038300_ref039","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1007\/978-3-662-44848-9_5","volume-title":"European Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)","author":"Avner","year":"2014"},{"key":"2026033012245038300_ref040","first-page":"631","volume-title":"24th Conf. of the IEEE Communications Society (INFOCOM)","author":"Awerbuch","year":"2005"},{"issue":"1","key":"2026033012245038300_ref041","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1016\/j.jcss.2007.04.016","article-title":"Online linear optimization and adaptive routing","volume":"74","author":"Awerbuch","year":"2008","journal-title":"J. of Computer and System Sciences"},{"key":"2026033012245038300_ref042","first-page":"1557","volume-title":"31st Intl. Conf. on Machine Learning (ICML)","author":"Azar","year":"2014"},{"issue":"1","key":"2026033012245038300_ref043","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1145\/2559152","article-title":"Dynamic Pricing with Limited Supply","volume":"3","author":"Babaioff","year":"2015","journal-title":"ACM Trans. on Economics and Computation"},{"key":"2026033012245038300_ref044","first-page":"43","volume-title":"11th ACM Conf. 
on Electronic Commerce (EC)","author":"Babaioff","year":"2010"},{"key":"2026033012245038300_ref045","first-page":"35","volume-title":"13th ACM Conf. on Electronic Commerce (EC)","author":"Babaioff","year":"2013"},{"issue":"2","key":"2026033012245038300_ref046","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1145\/2724705","article-title":"Truthful Mechanisms with Implicit Payment Computation","volume":"62","author":"Babaioff","year":"2015","journal-title":"J. of the ACM"},{"issue":"1","key":"2026033012245038300_ref047","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1137\/120878768","article-title":"Characterizing Truthful Multi\u2013armed Bandit Mechanisms","volume":"43","author":"Babaioff","year":"2014","journal-title":"SIAM J. on Computing (SICOMP)"},{"key":"2026033012245038300_ref048","first-page":"128","volume-title":"13th ACM Conf. on Electronic Commerce (EC)","author":"Badanidiyuru","year":"2012"},{"key":"2026033012245038300_ref049","volume-title":"54th IEEE Symp. on Foundations of Computer Science (FOCS)","author":"Badanidiyuru","year":"2013"},{"issue":"3","key":"2026033012245038300_ref050","doi-asserted-by":"crossref","DOI":"10.1145\/3164539","article-title":"Bandits with Knapsacks","volume":"65","author":"Badanidiyuru","year":"2018","journal-title":"J. of the ACM"},{"key":"2026033012245038300_ref051","author":"Badanidiyuru","year":"2014","journal-title":"27th Conf. on Learning Theory (COLT)"},{"key":"2026033012245038300_ref052","unstructured":"Bahar, G., O. Ben\u2013Porat, K. Leyton\u2013Brown, and M. Tennenholtz. 2019a. \u201cFiduciary Bandits\u201d. CoRR. abs\/1905.07043. arXiv: 1905.07043. URL: http:\/\/arxiv.org\/abs\/1905.07043."},{"key":"2026033012245038300_ref053","volume-title":"16th ACM Conf. 
on Electronic Commerce (EC)","author":"Bahar","year":"2016"},{"key":"2026033012245038300_ref054","first-page":"153","volume-title":"ACM Conf. on Economics and Computation (ACM EC)","author":"Bahar","year":"2019"},{"key":"2026033012245038300_ref055","first-page":"321","volume-title":"ACM Conf. on Economics and Computation (ACM EC)","author":"Bailey","year":"2018"},{"issue":"4","key":"2026033012245038300_ref056","doi-asserted-by":"crossref","first-page":"967","DOI":"10.1287\/moor.2014.0663","article-title":"Partial Monitoring \u2013 Classification, Regret Bounds, and Algorithms","volume":"39","author":"Bart\u00f3k","year":"2014","journal-title":"Math. Oper. Res"},{"key":"2026033012245038300_ref057","volume-title":"CoRR","author":"Bastani","year":"2018"},{"issue":"4","key":"2026033012245038300_ref058","doi-asserted-by":"crossref","first-page":"1251","DOI":"10.3982\/ECTA11105","article-title":"Robust Predictions in Games With Incomplete Information","volume":"81","author":"Bergemann","year":"2013","journal-title":"Econometrica"},{"key":"2026033012245038300_ref059","author":"Bergemann","year":"2016"},{"key":"2026033012245038300_ref060","volume-title":"Wiley Encyclopedia of Operations Research and Management Science","author":"Bergemann","year":"2011"},{"key":"2026033012245038300_ref061","volume-title":"The New Palgrave Dictionary of Economics, 2nd ed","author":"Bergemann","year":"2006"},{"issue":"2","key":"2026033012245038300_ref062","doi-asserted-by":"crossref","first-page":"771","DOI":"10.3982\/ECTA7260","article-title":"The Dynamic Pivot Mechanism","volume":"78","author":"Bergemann","year":"2010","journal-title":"Econometrica"},{"key":"2026033012245038300_ref063","doi-asserted-by":"crossref","DOI":"10.1007\/978-94-015-3711-7","volume-title":"Bandit problems: sequential allocation of 
experiments.","author":"Berry","year":"1985"},{"issue":"6","key":"2026033012245038300_ref064","doi-asserted-by":"crossref","first-page":"1407","DOI":"10.1287\/opre.1080.0640","article-title":"Dynamic Pricing Without Knowing the Demand Function: Risk Bounds and Near\u2013Optimal Algorithms","volume":"57","author":"Besbes","year":"2009","journal-title":"Operations Research"},{"issue":"6","key":"2026033012245038300_ref065","doi-asserted-by":"crossref","first-page":"1537","DOI":"10.1287\/opre.1120.1103","article-title":"Blind Network Revenue Management","volume":"60","author":"Besbes","year":"2012","journal-title":"Operations Research"},{"key":"2026033012245038300_ref066","volume-title":"14th Intl. Conf. on Artificial Intelligence and Statistics (AISTATS)","author":"Beygelzimer","year":"2011"},{"key":"2026033012245038300_ref067","volume-title":"CoRR","author":"Bietti","year":"2018"},{"issue":"4","key":"2026033012245038300_ref068","first-page":"1477","article-title":"Crowdsourcing Exploration","volume":"64","author":"Bimpikis","year":"2018","journal-title":"Management Science"},{"key":"2026033012245038300_ref069","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1007335615132","article-title":"Empirical support for winnow and weighted\u2013majority based algorithms: Results on a calendar scheduling domain","volume":"26","author":"Blum","year":"1997","journal-title":"Machine Learning"},{"key":"2026033012245038300_ref070","first-page":"373","volume-title":"40th ACM Symp. on Theory of Computing (STOC)","author":"Blum","year":"2008"},{"key":"2026033012245038300_ref071","first-page":"202","volume-title":"14th ACM\u2013SIAM Symp. on Discrete Algorithms (SODA)","author":"Blum","year":"2003"},{"issue":"13","key":"2026033012245038300_ref072","first-page":"1307","article-title":"From external to internal regret","volume":"8","author":"Blum","year":"2007","journal-title":"J. 
of Machine Learning Research (JMLR)"},{"issue":"1","key":"2026033012245038300_ref073","article-title":"Dynamic pricing and learning: Historical origins, current research, and new directions","volume":"20","author":"Boer","year":"2015","journal-title":"Surveys in Operations Research and Management Science"},{"issue":"2","key":"2026033012245038300_ref074","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1111\/1468-0262.00022","article-title":"Strategic Experimentation","volume":"67","author":"Bolton","year":"1999","journal-title":"Econometrica"},{"key":"2026033012245038300_ref075","unstructured":"Boursier, E. and V. Perchet. 2018. \u201cSIC\u2013MMAB: Synchronisation Involves Communication in Multiplayer Multi\u2013Armed Bandits\u201d. CoRR. abs\/1809.08151. URL: http:\/\/arxiv.org\/abs\/1809.08151."},{"key":"2026033012245038300_ref076","first-page":"523","volume-title":"ACM Conf. on Economics and Computation (ACM EC)","author":"Braverman","year":"2018"},{"key":"2026033012245038300_ref077","first-page":"383","volume-title":"Conf. on Learning Theory (COLT)","author":"Braverman","year":"2019"},{"key":"2026033012245038300_ref078","first-page":"3347","volume-title":"27th Advances in Neural Information Processing Systems (NIPS)","author":"Bresler","year":"2014"},{"key":"2026033012245038300_ref079","first-page":"207","volume-title":"The Intl. Conf. on Measurement and Modeling of Computer Systems (SIGMETRICS)","author":"Bresler","year":"2016"},{"key":"2026033012245038300_ref080","first-page":"P\u201378","volume-title":"Tech. 
rep","author":"Brown","year":"1949"},{"key":"2026033012245038300_ref081","volume-title":"PhD thesis","author":"Bubeck","year":"2010"},{"issue":"1","key":"2026033012245038300_ref082","doi-asserted-by":"crossref","DOI":"10.1561\/2200000024","article-title":"Regret Analysis of Stochastic and Nonstochastic Multi\u2013armed Bandit Problems","volume":"5","author":"Bubeck","year":"2012","journal-title":"Foundations and Trends in Machine Learning"},{"key":"2026033012245038300_ref083","volume-title":"29th Intl. Conf. on Algorithmic Learning Theory (ALT)","author":"Bubeck","year":"2018"},{"key":"2026033012245038300_ref084","first-page":"266","volume-title":"28th Conf. on Learning Theory (COLT)","author":"Bubeck","year":"2015"},{"key":"2026033012245038300_ref085","first-page":"72","volume-title":"49th ACM Symp. on Theory of Computing (STOC)","author":"Bubeck","year":"2017"},{"key":"2026033012245038300_ref086","volume-title":"Conf. on Learning Theory (COLT)","author":"Bubeck","year":"2019"},{"key":"2026033012245038300_ref087","unstructured":"Bubeck, S., Y. Li, Y. Peres, and M. Sellke. 2019b. \u201cNon\u2013Stochastic Multi\u2013Player Multi\u2013Armed Bandits: Optimal Rate With Collision Information, Sublinear Without\u201d. CoRR. abs\/1904.12233. arXiv: 1904.12233. URL: http:\/\/arxiv.org\/abs\/1904.12233."},{"issue":"19","key":"2026033012245038300_ref088","doi-asserted-by":"crossref","first-page":"1832","DOI":"10.1016\/j.tcs.2010.12.059","article-title":"Pure Exploration in Multi\u2013Armed Bandit Problems","volume":"412","author":"Bubeck","year":"2011","journal-title":"Theoretical Computer Science"},{"key":"2026033012245038300_ref089","first-page":"1587","article-title":"Online Optimization in X\u2013Armed Bandits","volume":"12","author":"Bubeck","year":"2011","journal-title":"J. of Machine Learning Research (JMLR)"},{"key":"2026033012245038300_ref090","volume-title":"25th Conf. 
on Learning Theory (COLT)","author":"Bubeck","year":"2012"},{"key":"2026033012245038300_ref091","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1007\/978-3-642-24412-4_14","volume-title":"22nd Intl. Conf. on Algorithmic Learning Theory (ALT)","author":"Bubeck","year":"2011"},{"issue":"4","key":"2026033012245038300_ref092","first-page":"2289","article-title":"Adaptive\u2013treed bandits","volume":"21","author":"Bull","year":"2015","journal-title":"Bernoulli J. of Statistics"},{"key":"2026033012245038300_ref093","first-page":"590","volume-title":"29th Conf. on Learning Theory (COLT)","author":"Carpentier","year":"2016"},{"key":"2026033012245038300_ref094","first-page":"465","volume-title":"30th Conf. on Learning Theory (COLT)","author":"Cesa\u2013Bianchi","year":"2017"},{"key":"2026033012245038300_ref095","volume-title":"ACM\u2013SIAM Symp. on Discrete Algorithms (SODA)","author":"Cesa\u2013Bianchi","year":"2013"},{"issue":"3","key":"2026033012245038300_ref096","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1023\/A:1022901500417","article-title":"Potential\u2013Based Algorithms in On\u2013Line Prediction and Game Theory","volume":"51","author":"Cesa\u2013Bianchi","year":"2003","journal-title":"Machine Learning"},{"key":"2026033012245038300_ref097","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511546921","volume-title":"Prediction, learning, and games","author":"Cesa\u2013Bianchi","year":"2006"},{"issue":"5","key":"2026033012245038300_ref098","doi-asserted-by":"crossref","first-page":"1404","DOI":"10.1016\/j.jcss.2012.01.001","article-title":"Combinatorial bandits","volume":"78","author":"Cesa\u2013Bianchi","year":"2012","journal-title":"J. Comput. Syst. 
Sci"},{"key":"2026033012245038300_ref099","first-page":"273","volume-title":"22nd Advances in Neural Information Processing Systems (NIPS)","author":"Chakrabarti","year":"2008"},{"key":"2026033012245038300_ref100","volume-title":"Quarterly Journal of Economics","author":"Che","year":"2018"},{"key":"2026033012245038300_ref101","first-page":"798","volume-title":"Conf. on Learning Theory (COLT)","author":"Chen","year":"2018"},{"key":"2026033012245038300_ref102","volume-title":"IEEE Internet of Things Journal","author":"Chen","year":"2018"},{"issue":"24","key":"2026033012245038300_ref103","doi-asserted-by":"crossref","first-page":"6350","DOI":"10.1109\/TSP.2017.2750109","article-title":"An online convex optimization approach to proactive network resource allocation","volume":"65","author":"Chen","year":"2017","journal-title":"IEEE Transactions on Signal Processing"},{"key":"2026033012245038300_ref104","first-page":"151","volume-title":"20th Intl. Conf. on Machine Learning (ICML)","author":"Chen","year":"2013"},{"key":"2026033012245038300_ref105","volume-title":"Conf. on Learning Theory (COLT)","author":"Chen","year":"2019"},{"key":"2026033012245038300_ref106","first-page":"807","volume-title":"Conf. on Learning Theory (COLT)","author":"Cheung","year":"2019"},{"issue":"11","key":"2026033012245038300_ref107","first-page":"1750","article-title":"Adaptive design methods in clinical trials \u2013 a review","volume":"3","author":"Chow","year":"2008","journal-title":"Orphanet Journal of Rare Diseases"},{"key":"2026033012245038300_ref108","first-page":"273","volume-title":"43rd ACM Symp. on Theory of Computing (STOC)","author":"Christiano","year":"2011"},{"key":"2026033012245038300_ref109","volume-title":"14th Intl. Conf. 
on Artificial Intelligence and Statistics (AISTATS)","author":"Chu","year":"2011"},{"key":"2026033012245038300_ref110","first-page":"135","volume-title":"ACM Conf. on Economics and Computation (ACM EC)","author":"Cohen","year":"2019"},{"issue":"1","key":"2026033012245038300_ref111","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1145\/2796314.2745847","article-title":"Bandits with budgets: Regret lower bounds and optimal algorithms","volume":"43","author":"Combes","year":"2015","journal-title":"ACM SIGMETRICS Performance Evaluation Review"},{"key":"2026033012245038300_ref112","volume-title":"Elements of Information Theory","author":"Cover","year":"1991"},{"key":"2026033012245038300_ref113","first-page":"355","volume-title":"21st Conf. on Learning Theory (COLT)","author":"Dani","year":"2008"},{"key":"2026033012245038300_ref114","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1016\/j.geb.2014.01.003","article-title":"Near\u2013optimal no\u2013regret algorithms for zero\u2013sum games","volume":"92","author":"Daskalakis","year":"2015","journal-title":"Games and Economic Behavior"},{"key":"2026033012245038300_ref115","volume-title":"6th International Conference on Learning Representations (ICLR)","author":"Daskalakis","year":"2018"},{"key":"2026033012245038300_ref116","first-page":"11","volume-title":"55th IEEE Symp. on Foundations of Computer Science (FOCS)","author":"Daskalakis","year":"2014"},{"key":"2026033012245038300_ref117","volume-title":"29th Intl. Conf. on Machine Learning (ICML)","author":"Dekel","year":"2012"},{"key":"2026033012245038300_ref118","volume-title":"29th Intl. Conf. on Machine Learning (ICML)","author":"Desautels","year":"2012"},{"key":"2026033012245038300_ref119","first-page":"71","volume-title":"10th ACM Conf. on Electronic Commerce (EC)","author":"Devanur","year":"2009"},{"key":"2026033012245038300_ref120","first-page":"29","volume-title":"12th ACM Conf. 
on Electronic Commerce (EC)","author":"Devanur","year":"2011"},{"issue":"1","key":"2026033012245038300_ref121","doi-asserted-by":"crossref","first-page":"7:1","DOI":"10.1145\/3284177","article-title":"Near Optimal Online Algorithms and Fast Approximation Algorithms for Resource Allocation Problems","volume":"66","author":"Devanur","year":"2019","journal-title":"J. ACM"},{"key":"2026033012245038300_ref122","first-page":"99","volume-title":"10th ACM Conf. on Electronic Commerce (EC)","author":"Devanur","year":"2009"},{"key":"2026033012245038300_ref123","volume-title":"27th AAAI Conference on Artificial Intelligence (AAAI)","author":"Ding","year":"2013"},{"key":"2026033012245038300_ref124","first-page":"395","volume-title":"12th USENIX Symp. on Networked Systems Design and Implementation (NSDI)","author":"Dong","year":"2015"},{"key":"2026033012245038300_ref125","first-page":"343","volume-title":"15th USENIX Symp. on Networked Systems Design and Implementation (NSDI)","author":"Dong","year":"2018"},{"key":"2026033012245038300_ref126","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511581274","volume-title":"Concentration of Measure for the Analysis of Randomized Algorithms","author":"Dubhashi","year":"2009"},{"key":"2026033012245038300_ref127","first-page":"247","volume-title":"28th Conf. on Uncertainty in Artificial Intelligence (UAI)","author":"Dudik","year":"2012"},{"issue":"4","key":"2026033012245038300_ref128","doi-asserted-by":"crossref","first-page":"1097","DOI":"10.1214\/14-STS500","article-title":"Doubly Robust Policy Evaluation and Optimization","volume":"29","author":"Dudik","year":"2014","journal-title":"Statistical Science"},{"key":"2026033012245038300_ref129","first-page":"528","volume-title":"58th IEEE Symp. on Foundations of Computer Science (FOCS)","author":"Dudik","year":"2017"},{"key":"2026033012245038300_ref130","volume-title":"28th Conf. 
on Learning Theory (COLT)","author":"Dudik","year":"2015"},{"key":"2026033012245038300_ref131","volume-title":"27th Conf. on Uncertainty in Artificial Intelligence (UAI)","author":"Dudik","year":"2011"},{"key":"2026033012245038300_ref132","first-page":"255","volume-title":"15th Conf. on Learning Theory (COLT)","author":"Even\u2013Dar","year":"2002"},{"key":"2026033012245038300_ref133","first-page":"1079","article-title":"Action Elimination and Stopping Conditions for the Multi\u2013Armed Bandit and Reinforcement Learning Problems","volume":"7","author":"Even\u2013Dar","year":"2006","journal-title":"J. of Machine Learning Research (JMLR)"},{"issue":"1","key":"2026033012245038300_ref134","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1137\/14100227X","article-title":"Chasing Ghosts: Competing with Stateful Policies","volume":"46","author":"Feige","year":"2017","journal-title":"SIAM J. on Computing (SICOMP)"},{"key":"2026033012245038300_ref135","first-page":"182","volume-title":"18th Annual European Symp. on Algorithms (ESA)","author":"Feldman","year":"2010"},{"key":"2026033012245038300_ref136","first-page":"385","volume-title":"16th ACM\u2013SIAM Symp. 
on Discrete Algorithms (SODA)","author":"Flaxman","year":"2005"},{"key":"2026033012245038300_ref137","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1006\/game.1997.0595","article-title":"Calibrated learning and correlated equilibrium","volume":"21","author":"Foster","year":"1997","journal-title":"Games and Economic Behavior"},{"key":"2026033012245038300_ref138","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1093\/biomet\/85.2.379","article-title":"Asymptotic calibration","volume":"85","author":"Foster","year":"1998","journal-title":"Biometrika"},{"key":"2026033012245038300_ref139","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1006\/game.1999.0740","article-title":"Regret in the on\u2013line decision problem","volume":"29","author":"Foster","year":"1999","journal-title":"Games and Economic Behavior"},{"key":"2026033012245038300_ref140","first-page":"1534","volume-title":"35th Intl. Conf. on Machine Learning (ICML)","author":"Foster","year":"2018"},{"key":"2026033012245038300_ref141","first-page":"4727","volume-title":"29th Advances in Neural Information Processing Systems (NIPS)","author":"Foster","year":"2016"},{"key":"2026033012245038300_ref142","first-page":"5","volume-title":"ACM Conf. on Economics and Computation (ACM EC)","author":"Frazier","year":"2014"},{"issue":"1","key":"2026033012245038300_ref143","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1006\/jcss.1997.1504","article-title":"A decision\u2013theoretic generalization of on\u2013line learning and an application to boosting","volume":"55","author":"Freund","year":"1997","journal-title":"Journal of Computer and System Sciences"},{"key":"2026033012245038300_ref144","first-page":"325","volume-title":"9th Conf. 
on Learning Theory (COLT)","author":"Freund","year":"1996"},{"issue":"1\u20132","key":"2026033012245038300_ref145","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1006\/game.1999.0738","volume":"29","author":"Freund","year":"1999","journal-title":"Games and Economic Behavior"},{"key":"2026033012245038300_ref146","first-page":"334","volume-title":"29th ACM Symp. on Theory of Computing (STOC)","author":"Freund","year":"1997"},{"key":"2026033012245038300_ref147","volume-title":"24th Conf. on Learning Theory (COLT)","author":"Garivier","year":"2011"},{"key":"2026033012245038300_ref148","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1007\/978-3-642-24412-4_16","volume-title":"22nd Intl. Conf. on Algorithmic Learning Theory (ALT)","author":"Garivier","year":"2011"},{"key":"2026033012245038300_ref149","volume-title":"13th ACM Conf. on Electronic Commerce (EC)","author":"Gatti","year":"2012"},{"key":"2026033012245038300_ref150","first-page":"233","volume-title":"Innovations in Theoretical Computer Science Conf. (ITCS)","author":"Ghosh","year":"2013"},{"key":"2026033012245038300_ref151","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1111\/j.2517-6161.1979.tb01068.x","article-title":"Bandit processes and dynamic allocation indices (with discussion)","volume":"41","author":"Gittins","year":"1979","journal-title":"J. Roy. Statist. Soc. Ser. B"},{"key":"2026033012245038300_ref152","doi-asserted-by":"crossref","DOI":"10.1002\/9780470980033","volume-title":"Multi\u2013Armed Bandit Allocation Indices","author":"Gittins","year":"2011"},{"key":"2026033012245038300_ref153","volume-title":"Advances in Neural Information Processing Systems (NIPS)","author":"Golovin","year":"2009"},{"key":"2026033012245038300_ref154","volume-title":"28th Advances in Neural Information Processing Systems (NIPS)","author":"Grill","year":"2015"},{"key":"2026033012245038300_ref155","first-page":"496","volume-title":"36th Intl. 
Colloquium on Automata, Languages and Programming (ICALP)","author":"Guha","year":"2007"},{"key":"2026033012245038300_ref156","first-page":"1562","volume-title":"Conf. on Learning Theory (COLT)","author":"Gupta","year":"2019"},{"key":"2026033012245038300_ref157","first-page":"534","volume-title":"44th IEEE Symp. on Foundations of Computer Science (FOCS)","author":"Gupta","year":"2003"},{"key":"2026033012245038300_ref158","first-page":"827","volume-title":"52nd IEEE Symp. on Foundations of Computer Science (FOCS)","author":"Gupta","year":"2011"},{"key":"2026033012245038300_ref159","first-page":"830","volume-title":"20th Intl. Joint Conf. on Artificial Intelligence (IJCAI)","author":"Gy\u00f6rgy","year":"2007"},{"key":"2026033012245038300_ref160","first-page":"2369","article-title":"The On\u2013Line Shortest Path Problem Under Partial Monitoring","volume":"8","author":"Gy\u00f6rgy","year":"2007","journal-title":"J. of Machine Learning Research (JMLR)"},{"key":"2026033012245038300_ref161","first-page":"97","article-title":"Approximation to Bayes risk in repeated play","volume":"3","author":"Hannan","year":"1957","journal-title":"Contributions to the Theory of Games"},{"key":"2026033012245038300_ref162","doi-asserted-by":"crossref","first-page":"1127","DOI":"10.1111\/1468-0262.00153","article-title":"A simple adaptive procedure leading to correlated equilibrium","volume":"68","author":"Hart","year":"2000","journal-title":"Econometrica"},{"issue":"3\u20134","key":"2026033012245038300_ref163","first-page":"157","article-title":"Introduction to Online Convex Optimization","volume":"2","author":"Hazan","year":"2015","journal-title":"Foundations and Trends\u00ae in Optimization"},{"key":"2026033012245038300_ref164","first-page":"1287","article-title":"Better algorithms for benign bandits","volume":"12","author":"Hazan","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"2026033012245038300_ref165","first-page":"784","volume-title":"27th Advances in 
Neural Information Processing Systems (NIPS)","author":"Hazan","year":"2014"},{"key":"2026033012245038300_ref166","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1007\/978-3-540-72927-3_36","volume-title":"20th Conf. on Learning Theory (COLT)","author":"Hazan","year":"2007"},{"key":"2026033012245038300_ref167","first-page":"2559","volume-title":"33rd Intl. Conf. on Machine Learning (ICML)","author":"Heidari","year":"2016"},{"key":"2026033012245038300_ref168","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1613\/jair.4940","article-title":"Adaptive Contract Design for Crowdsourcing Markets: Bandit Algorithms for Repeated Principal\u2013Agent Problems","volume":"55","author":"Ho","year":"2016","journal-title":"J. of Artificial Intelligence Research"},{"issue":"1","key":"2026033012245038300_ref169","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/1500000051","article-title":"Online Evaluation for Information Retrieval","volume":"10","author":"Hofmann","year":"2016","journal-title":"Foundations and Trends\u00ae in Information Retrieval"},{"key":"2026033012245038300_ref170","volume-title":"23rd Conf. on Learning Theory (COLT)","author":"Honda","year":"2010"},{"key":"2026033012245038300_ref171","first-page":"580","volume-title":"27th ACM\u2013SIAM Symp. on Discrete Algorithms (SODA)","author":"Hsu","year":"2016"},{"key":"2026033012245038300_ref172","unstructured":"Immorlica, N., J.Mao, A.Slivkins, and S.Wu. 2018. \u201cIncentivizing Exploration with Unbiased History\u201d. Working paper. URL:https:\/\/arxiv.org\/abs\/1811.06026."},{"key":"2026033012245038300_ref173","volume-title":"The Web Conference (formerly known as WWW)","author":"Immorlica","year":"2019"},{"key":"2026033012245038300_ref174","volume-title":"60th IEEE Symp. on Foundations of Computer Science (FOCS)","author":"Immorlica","year":"2019"},{"key":"2026033012245038300_ref175","first-page":"286","volume-title":"ACM SIGCOMM (ACM SIGCOMM Conf. 
on Applications, Technologies, Architectures, and Protocols for Computer Communications)","author":"Jiang","year":"2016"},{"key":"2026033012245038300_ref176","first-page":"393","volume-title":"14th USENIX Symp. on Networked Systems Design and Implementation (NSDI)","author":"Jiang","year":"2017"},{"key":"2026033012245038300_ref177","author":"Kakade","year":"2011"},{"issue":"3","key":"2026033012245038300_ref178","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1016\/j.jcss.2004.10.016","article-title":"Efficient algorithms for online decision problems","volume":"71","author":"Kalai","year":"2005","journal-title":"J. of Computer and Systems Sciences"},{"key":"2026033012245038300_ref179","first-page":"1054","volume-title":"24th Advances in Neural Information Processing Systems (NIPS)","author":"Kale","year":"2010"},{"issue":"6","key":"2026033012245038300_ref180","doi-asserted-by":"crossref","first-page":"2590","DOI":"10.1257\/aer.101.6.2590","article-title":"Bayesian Persuasion","volume":"101","author":"Kamenica","year":"2011","journal-title":"American Economic Review"},{"key":"2026033012245038300_ref181","volume-title":"Advances in Neural Information Processing Systems (NIPS)","author":"Kannan","year":"2018"},{"key":"2026033012245038300_ref182","first-page":"63","volume-title":"34th ACM Symp. on Theory of Computing (STOC)","author":"Karger","year":"2002"},{"key":"2026033012245038300_ref183","first-page":"1:1","article-title":"On the Complexity of Best\u2013Arm Identification in Multi\u2013Armed Bandit Models","volume":"17","author":"Kaufmann","year":"2016","journal-title":"J. of Machine Learning Research (JMLR)"},{"key":"2026033012245038300_ref184","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1007\/978-3-642-34106-9_18","volume-title":"23rd Intl. Conf. on Algorithmic Learning Theory (ALT)","author":"Kaufmann","year":"2012"},{"key":"2026033012245038300_ref185","first-page":"2564","volume-title":"35th Intl. Conf. 
on Machine Learning (ICML)","author":"Kearns","year":"2018"},{"issue":"1","key":"2026033012245038300_ref186","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1111\/j.1468-0262.2005.00564.x","article-title":"Strategic Experimentation with Exponential Bandits","volume":"73","author":"Keller","year":"2005","journal-title":"Econometrica"},{"issue":"6","key":"2026033012245038300_ref187","doi-asserted-by":"crossref","DOI":"10.1145\/1568318.1568322","article-title":"Triangulation and Embedding Using Small Sets of Beacons","volume":"56","author":"Kleinberg","year":"2009","journal-title":"J. of the ACM"},{"key":"2026033012245038300_ref188","volume-title":"Algorithm Design","author":"Kleinberg","year":"2005"},{"key":"2026033012245038300_ref189","volume-title":"18th Advances in Neural Information Processing Systems (NIPS)","author":"Kleinberg","year":"2004"},{"key":"2026033012245038300_ref190","first-page":"928","volume-title":"17th ACM\u2013SIAM Symp. on Discrete Algorithms (SODA)","author":"Kleinberg","year":"2006"},{"key":"2026033012245038300_ref191","unstructured":"Kleinberg, R. 2007. \u201cCS683: Learning, Games, and Electronic Markets, a class at Cornell University\u201d. Lecture notes, available at http:\/\/www.cs.cornell.edu\/courses\/cs683\/2007sp\/."},{"key":"2026033012245038300_ref192","volume-title":"IEEE Symp. on Foundations of Computer Science (FOCS)","author":"Kleinberg","year":"2003"},{"key":"2026033012245038300_ref193","first-page":"425","volume-title":"21st Conf. on Learning Theory (COLT)","author":"Kleinberg","year":"2008"},{"key":"2026033012245038300_ref194","first-page":"533","volume-title":"41st ACM Symp. on Theory of Computing (STOC)","author":"Kleinberg","year":"2009"},{"key":"2026033012245038300_ref195","volume-title":"21st ACM\u2013SIAM Symp. on Discrete Algorithms (SODA)","author":"Kleinberg","year":"2010"},{"key":"2026033012245038300_ref196","first-page":"681","volume-title":"40th ACM Symp. 
on Theory of Computing (STOC)","author":"Kleinberg","year":"2008"},{"key":"2026033012245038300_ref197","doi-asserted-by":"crossref","unstructured":"Kleinberg, R., A.Slivkins, and E.Upfal. 2019. \u201cBandits and Experts in Metric Spaces\u201d. J. of the ACM. 66(4). Merged and revised version of conference papers in ACM STOC 2008 and ACM\u2013SIAM SODA 2010. Also available athttp:\/\/arxiv.org\/abs\/1312.1277.","DOI":"10.1145\/3299873"},{"key":"2026033012245038300_ref198","first-page":"282","volume-title":"17th European Conf. on Machine Learning (ECML)","author":"Kocsis","year":"2006"},{"key":"2026033012245038300_ref199","volume-title":"23rd Conf. on Learning Theory (COLT)","author":"Koolen","year":"2010"},{"key":"2026033012245038300_ref200","first-page":"2447","volume-title":"25th Advances in Neural Information Processing Systems (NIPS)","author":"Krause","year":"2011"},{"issue":"5","key":"2026033012245038300_ref201","doi-asserted-by":"crossref","first-page":"988","DOI":"10.1086\/676597","article-title":"Implementing the \u201cWisdom of the Crowd\u201d","volume":"122","author":"Kremer","year":"2014","journal-title":"J. of Political Economy"},{"key":"2026033012245038300_ref202","volume-title":"29th Advances in Neural Information Processing Systems (NIPS)","author":"Krishnamurthy","year":"2016"},{"key":"2026033012245038300_ref203","unstructured":"Krishnamurthy, A., J.Langford, A.Slivkins, and C.Zhang. 2019. \u201cContextual bandits with continuous actions: Smoothing, zooming, and adapting\u201d. In: Conf. on Learning Theory (COLT). Working paper, under journal submission. URL:https:\/\/arxiv.org\/abs\/1902.01520."},{"key":"2026033012245038300_ref204","first-page":"767","volume-title":"32nd Intl. Conf. on Machine Learning (ICML)","author":"Kveton","year":"2015"},{"key":"2026033012245038300_ref205","first-page":"420","volume-title":"13th Conf. 
on Uncertainty in Artificial Intelligence (UAI)","author":"Kveton","year":"2014"},{"key":"2026033012245038300_ref206","first-page":"420","volume-title":"Conf. on Uncertainty in Artificial Intelligence (UAI)","author":"Kveton","year":"2014"},{"key":"2026033012245038300_ref207","first-page":"1450","volume-title":"28th Advances in Neural Information Processing Systems (NIPS)","author":"Kveton","year":"2015"},{"key":"2026033012245038300_ref208","volume-title":"18th Intl. Conf. on Artificial Intelligence and Statistics (AISTATS)","author":"Kveton","year":"2015"},{"key":"2026033012245038300_ref209","doi-asserted-by":"crossref","DOI":"10.1515\/9781400829453","volume-title":"The Theory of Incentives: The Principal\u2013Agent Model","author":"Laffont","year":"2002"},{"key":"2026033012245038300_ref210","volume-title":"42nd Asilomar Conference on Signals, Systems and Computers","author":"Lai","year":"2008"},{"key":"2026033012245038300_ref211","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/0196-8858(85)90002-8","article-title":"Asymptotically efficient Adaptive Allocation Rules","volume":"6","author":"Lai","year":"1985","journal-title":"Advances in Applied Mathematics"},{"key":"2026033012245038300_ref212","volume-title":"21st Advances in Neural Information Processing Systems (NIPS)","author":"Langford","year":"2007"},{"key":"2026033012245038300_ref213","volume-title":"Bandit Algorithms","author":"Lattimore","year":"2019"},{"key":"2026033012245038300_ref214","first-page":"929","volume-title":"24th Intl. World Wide Web Conf. (WWW)","author":"Li","year":"2015"},{"key":"2026033012245038300_ref215","volume-title":"19th Intl. World Wide Web Conf. (WWW)","author":"Li","year":"2010"},{"key":"2026033012245038300_ref216","volume-title":"4th ACM Intl. Conf. on Web Search and Data Mining (WSDM)","author":"LiChu","year":"2011"},{"key":"2026033012245038300_ref217","first-page":"539","volume-title":"16th ACM Intl. Conf. 
on Research and Development in Information Retrieval (SIGIR)","author":"Li","year":"2016"},{"issue":"2","key":"2026033012245038300_ref218","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1006\/inco.1994.1009","article-title":"The Weighted Majority Algorithm","volume":"108","author":"Littlestone","year":"1994","journal-title":"Information and Computation"},{"issue":"11","key":"2026033012245038300_ref219","doi-asserted-by":"crossref","first-page":"5667","DOI":"10.1109\/TSP.2010.2062509","article-title":"Distributed learning in multi\u2013armed bandit with multiple players","volume":"58","author":"Liu","year":"2010","journal-title":"IEEE Trans. Signal Processing"},{"key":"2026033012245038300_ref220","volume-title":"14th Intl. Conf. on Artificial Intelligence and Statistics (AISTATS)","author":"Lu","year":"2010"},{"key":"2026033012245038300_ref221","first-page":"1739","volume-title":"Conf. on Learning Theory (COLT)","author":"Luo","year":"2018"},{"key":"2026033012245038300_ref222","volume-title":"50th ACM Symp. on Theory of Computing (STOC)","author":"Lykouris","year":"2018"},{"key":"2026033012245038300_ref223","first-page":"120","volume-title":"27th ACM\u2013SIAM Symp. on Discrete Algorithms (SODA)","author":"Lykouris","year":"2016"},{"key":"2026033012245038300_ref224","first-page":"975","volume-title":"27th Conf. on Learning Theory (COLT)","author":"Magureanu","year":"2014"},{"key":"2026033012245038300_ref225","first-page":"2503","article-title":"Trading regret for efficiency: online convex optimization with long term constraints","volume":"13","author":"Mahdavi","year":"2012","journal-title":"J. of Machine Learning Research (JMLR)"},{"key":"2026033012245038300_ref226","first-page":"1115","volume-title":"Advances in Neural Information Processing Systems (NIPS)","author":"Mahdavi","year":"2013"},{"key":"2026033012245038300_ref227","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1007\/978-3-642-15883-4_20","volume-title":"European Conf. 
on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD)","author":"Maillard","year":"2010"},{"key":"2026033012245038300_ref228","volume-title":"24th Conf. on Learning Theory (COLT)","author":"Maillard","year":"2011"},{"key":"2026033012245038300_ref229","first-page":"623","article-title":"The sample complexity of exploration in the multi-armed bandit problem","volume":"5","author":"Mannor","year":"2004","journal-title":"J. of Machine Learning Research (JMLR)"},{"key":"2026033012245038300_ref230","volume-title":"Operations Research","author":"Mansour","year":"2019"},{"key":"2026033012245038300_ref231","volume-title":"16th ACM Conf. on Economics and Computation (ACM EC)","author":"Mansour","year":"2016"},{"key":"2026033012245038300_ref232","volume-title":"9th Innovations in Theoretical Computer Science Conf. (ITCS)","author":"Mansour","year":"2018"},{"key":"2026033012245038300_ref233","first-page":"195","volume-title":"Probabilistic Methods for Discrete Mathematics","author":"McDiarmid","year":"1998"},{"key":"2026033012245038300_ref234","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1007\/978-3-540-27819-1_8","author":"McMahan","year":"2004","journal-title":"17th Conf. on Learning Theory (COLT)"},{"issue":"7","key":"2026033012245038300_ref235","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1109\/TIT.2002.1013135","article-title":"On Sequential strategies for loss functions with memory","volume":"48","author":"Merhav","year":"2002","journal-title":"IEEE Trans. on Information Theory"},{"key":"2026033012245038300_ref236","first-page":"2703","volume-title":"29th ACM-SIAM Symp. on Discrete Algorithms (SODA)","author":"Mertikopoulos","year":"2018"},{"key":"2026033012245038300_ref237","first-page":"105","volume-title":"26th Conf. 
on Learning Theory (COLT)","author":"Minsker","year":"2013"},{"key":"2026033012245038300_ref238","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1007\/978-3-642-31594-7_59","volume-title":"39th Intl. Colloquium on Automata, Languages and Programming (ICALP)","author":"Molinaro","year":"2012"},{"issue":"3","key":"2026033012245038300_ref239","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1007\/BF01769190","article-title":"Strategically zero-sum games: the class of games whose completely mixed equilibria cannot be improved upon","volume":"7","author":"Moulin","year":"1978","journal-title":"Intl. J. of Game Theory"},{"key":"2026033012245038300_ref240","first-page":"783","volume-title":"25th Advances in Neural Information Processing Systems (NIPS)","author":"Munos","year":"2011"},{"issue":"1","key":"2026033012245038300_ref241","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/2200000038","article-title":"From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning","volume":"7","author":"Munos","year":"2014","journal-title":"Foundations and Trends in Machine Learning"},{"key":"2026033012245038300_ref242","volume-title":"23rd Conf. on Uncertainty in Artificial Intelligence (UAI)","author":"Munos","year":"2007"},{"issue":"1","key":"2026033012245038300_ref243","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1287\/opre.1120.1124","article-title":"Dynamic Pay-Per-Action Mechanisms and Applications to Online Advertising","volume":"61","author":"Nazerzadeh","year":"2013","journal-title":"Operations Research"},{"key":"2026033012245038300_ref244","volume-title":"arXiv preprint","author":"Neely","year":"2017"},{"key":"2026033012245038300_ref245","first-page":"1","volume-title":"16th ACM Conf. on Electronic Commerce (EC)","author":"Nekipelov","year":"2015"},{"key":"2026033012245038300_ref246","first-page":"73","volume-title":"26th Intl. World Wide Web Conf. 
(WWW)","author":"Nisan","year":"2017"},{"key":"2026033012245038300_ref247","volume-title":"SIAM Intl. Conf. on Data Mining (SDM)","author":"Pandey","year":"2007"},{"key":"2026033012245038300_ref248","volume-title":"24th Intl. Conf. on Machine Learning (ICML)","author":"Pandey","year":"2007"},{"key":"2026033012245038300_ref249","author":"Pavan","year":"2011"},{"key":"2026033012245038300_ref250","first-page":"784","volume-title":"25th Intl. Conf. on Machine Learning (ICML)","author":"Radlinski","year":"2008"},{"key":"2026033012245038300_ref251","first-page":"1724","volume-title":"Conf. on Learning Theory (COLT)","author":"Raghavan","year":"2018"},{"key":"2026033012245038300_ref252","first-page":"3066","volume-title":"27th Advances in Neural Information Processing Systems (NIPS)","author":"Rakhlin","year":"2013"},{"key":"2026033012245038300_ref253","volume-title":"33rd Intl. Conf. on Machine Learning (ICML)","author":"Rakhlin","year":"2016"},{"key":"2026033012245038300_ref254","first-page":"155","article-title":"Online learning via sequential complexities","volume":"16","author":"Rakhlin","year":"2015","journal-title":"J. of Machine Learning Research (JMLR)"},{"key":"2026033012245038300_ref255","first-page":"3311","volume-title":"28th Intl. Joint Conf. on Artificial Intelligence (IJCAI)","author":"Rangi","year":"2019"},{"key":"2026033012245038300_ref256","volume-title":"arXiv preprint","author":"Rivera","year":"2018"},{"issue":"2","key":"2026033012245038300_ref257","doi-asserted-by":"crossref","first-page":"296","DOI":"10.2307\/1969530","article-title":"An iterative method of solving a game","volume":"54","author":"Robinson","year":"1951","journal-title":"Annals of Mathematics, Second Series"},{"key":"2026033012245038300_ref258","first-page":"471","author":"Rogers","year":"2015","journal-title":"16th ACM Conf. on Electronic Commerce (EC)"},{"key":"2026033012245038300_ref259","first-page":"155","volume-title":"33rd Intl. Conf. 
on Machine Learning (ICML)","author":"Rosenski","year":"2016"},{"key":"2026033012245038300_ref260","first-page":"519","volume-title":"18th ACM Conf. on Electronic Commerce (EC)","author":"Roth","year":"2017"},{"key":"2026033012245038300_ref261","first-page":"949","volume-title":"48th ACM Symp. on Theory of Computing (STOC)","author":"Roth","year":"2016"},{"key":"2026033012245038300_ref262","first-page":"513","volume-title":"41st ACM Symp. on Theory of Computing (STOC)","author":"Roughgarden","year":"2009"},{"key":"2026033012245038300_ref263","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781316779309","volume-title":"Twenty Lectures on Algorithmic Game Theory","author":"Roughgarden","year":"2016"},{"issue":"2","key":"2026033012245038300_ref264","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1287\/moor.1100.0446","volume":"35","author":"Rusmevichientong","year":"2010","journal-title":"Mathematics of Operations Research"},{"issue":"4","key":"2026033012245038300_ref265","doi-asserted-by":"crossref","first-page":"1221","DOI":"10.1287\/moor.2014.0650","article-title":"Learning to Optimize via Posterior Sampling","volume":"39","author":"Russo","year":"2014","journal-title":"Mathematics of Operations Research"},{"key":"2026033012245038300_ref266","first-page":"68:1","article-title":"An Information-Theoretic Analysis of Thompson Sampling","volume":"17","author":"Russo","year":"2016","journal-title":"J. of Machine Learning Research (JMLR)"},{"issue":"1","key":"2026033012245038300_ref267","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/2200000070","article-title":"A Tutorial on Thompson Sampling","volume":"11","author":"Russo","year":"2018","journal-title":"Foundations and Trends in Machine Learning"},{"key":"2026033012245038300_ref268","first-page":"1760","volume-title":"Intl. Conf. 
on Artificial Intelligence and Statistics (AISTATS)","author":"Sankararaman","year":"2018"},{"issue":"3","key":"2026033012245038300_ref269","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1287\/msom.2013.0429","article-title":"Optimal Dynamic Assortment Planning with Demand Learning","volume":"15","author":"Saur\u00e9","year":"2013","journal-title":"Manufacturing & Service Operations Management"},{"key":"2026033012245038300_ref270","first-page":"862","volume-title":"Intl. Conf. on Artificial Intelligence and Statistics (AISTATS)","author":"Schmit","year":"2018"},{"key":"2026033012245038300_ref271","volume-title":"Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise","author":"Schroeder","year":"1991"},{"key":"2026033012245038300_ref272","volume-title":"Capitalism, Socialism and Democracy","author":"Schumpeter","year":"1942"},{"key":"2026033012245038300_ref273","volume-title":"13th European Workshop on Reinforcement Learning (EWRL)","author":"Seldin","year":"2016"},{"key":"2026033012245038300_ref274","volume-title":"30th Conf. on Learning Theory (COLT)","author":"Seldin","year":"2017"},{"key":"2026033012245038300_ref275","volume-title":"31st Intl. Conf. on Machine Learning (ICML)","author":"Seldin","year":"2014"},{"key":"2026033012245038300_ref276","author":"Sellke","year":"2019"},{"key":"2026033012245038300_ref277","author":"Sellke","year":"2019"},{"key":"2026033012245038300_ref278","first-page":"1523","volume-title":"28th Conf. on Learning Theory (COLT)","author":"Shamir","year":"2015"},{"key":"2026033012245038300_ref279","first-page":"1167","volume-title":"22nd Intl. World Wide Web Conf. (WWW)","author":"Singla","year":"2013"},{"key":"2026033012245038300_ref280","first-page":"89","volume-title":"26th Annual ACM Symp. 
on Principles of Distributed Computing (PODC)","author":"Slivkins","year":"2007"},{"key":"2026033012245038300_ref281","volume-title":"25th Advances in Neural Information Processing Systems (NIPS)","author":"Slivkins","year":"2011"},{"key":"2026033012245038300_ref282","unstructured":"Slivkins, A. 2013. \u201cDynamic Ad Allocation: Bandits with Budgets\u201d. A technical report on arxiv.org\/abs\/1306.0155."},{"issue":"1","key":"2026033012245038300_ref283","first-page":"2533","article-title":"Contextual bandits with similarity information","volume":"15","author":"Slivkins","year":"2014","journal-title":"J. of Machine Learning Research (JMLR)"},{"issue":"1","key":"2026033012245038300_ref284","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1145\/3123744","article-title":"Incentivizing exploration via information asymmetry","volume":"24","author":"Slivkins","year":"2017","journal-title":"ACM Crossroads"},{"key":"2026033012245038300_ref285","first-page":"343","volume-title":"21st Conf. on Learning Theory (COLT)","author":"Slivkins","year":"2008"},{"issue":"2","key":"2026033012245038300_ref286","doi-asserted-by":"crossref","DOI":"10.1145\/2692359.2692364","article-title":"Online Decision Making in Crowdsourcing Markets: Theoretical Challenges","volume":"12","author":"Slivkins","year":"2013","journal-title":"SIGecom Exchanges"},{"key":"2026033012245038300_ref287","first-page":"1015","volume-title":"27th Intl. Conf. 
on Machine Learning (ICML)","author":"Srinivas","year":"2010"},{"key":"2026033012245038300_ref288","volume-title":"PhD thesis","author":"Stoltz","year":"2005"},{"key":"2026033012245038300_ref289","first-page":"1577","volume-title":"Advances in Neural Information Processing Systems (NIPS)","author":"Streeter","year":"2008"},{"key":"2026033012245038300_ref290","volume-title":"Reinforcement Learning: An Introduction","author":"Sutton","year":"1998"},{"key":"2026033012245038300_ref291","first-page":"1731","article-title":"Batch learning from logged bandit feedback through counterfactual risk minimization","volume":"16","author":"Swaminathan","year":"2015","journal-title":"J. of Machine Learning Research (JMLR)"},{"key":"2026033012245038300_ref292","first-page":"3635","volume-title":"30th Advances in Neural Information Processing Systems (NIPS)","author":"Swaminathan","year":"2017"},{"key":"2026033012245038300_ref293","first-page":"2989","volume-title":"28th Advances in Neural Information Processing Systems (NIPS)","author":"Syrgkanis","year":"2015"},{"key":"2026033012245038300_ref294","volume-title":"33rd Intl. Conf. on Machine Learning (ICML)","author":"Syrgkanis","year":"2016"},{"key":"2026033012245038300_ref295","volume-title":"29th Advances in Neural Information Processing Systems (NIPS)","author":"Syrgkanis","year":"2016"},{"key":"2026033012245038300_ref296","first-page":"211","volume-title":"45th ACM Symp. on Theory of Computing (STOC)","author":"Syrgkanis","year":"2013"},{"key":"2026033012245038300_ref297","volume-title":"Algorithms for Reinforcement Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning","author":"Szepesv\u00e1ri","year":"2010"},{"key":"2026033012245038300_ref298","first-page":"281","volume-title":"36th ACM Symp. 
on Theory of Computing (STOC)","author":"Talwar","year":"2004"},{"issue":"3\u20134","key":"2026033012245038300_ref299","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1093\/biomet\/25.3-4.285","article-title":"On the likelihood that one unknown probability exceeds another in view of the evidence of two samples","volume":"25","author":"Thompson","year":"1933","journal-title":"Biometrika"},{"key":"2026033012245038300_ref300","first-page":"1211","volume-title":"24th AAAI Conference on Artificial Intelligence (AAAI)","author":"Tran-Thanh","year":"2010"},{"key":"2026033012245038300_ref301","first-page":"1134","volume-title":"26th AAAI Conference on Artificial Intelligence (AAAI)","author":"Tran-Thanh","year":"2012"},{"key":"2026033012245038300_ref302","first-page":"19","volume-title":"30th Intl. Conf. on Machine Learning (ICML)","author":"Valko","year":"2013"},{"key":"2026033012245038300_ref303","first-page":"3828","volume-title":"31st Advances in Neural Information Processing Systems (NIPS)","author":"Wang","year":"2018"},{"issue":"2","key":"2026033012245038300_ref304","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1287\/opre.2013.1245","article-title":"Close the Gaps: A Learning-While-Doing Algorithm for Single-Product Revenue Management Problems","volume":"62","author":"Wang","year":"2014","journal-title":"Operations Research"},{"key":"2026033012245038300_ref305","volume-title":"31st Conf. on Learning Theory (COLT)","author":"Wei","year":"2018"},{"key":"2026033012245038300_ref306","volume-title":"13th ACM Conf. on Electronic Commerce (EC)","author":"Wilkens","year":"2012"},{"issue":"5","key":"2026033012245038300_ref307","doi-asserted-by":"crossref","first-page":"1538","DOI":"10.1016\/j.jcss.2011.12.028","article-title":"The K-armed dueling bandits problem","volume":"78","author":"Yue","year":"2012","journal-title":"J. Comput. Syst. Sci"},{"key":"2026033012245038300_ref308","first-page":"1201","volume-title":"26th Intl. Conf. 
on Machine Learning (ICML)","author":"Yue","year":"2009"},{"key":"2026033012245038300_ref309","volume-title":"33rd Advances in Neural Information Processing Systems (NeurIPS)","author":"Zimmert","year":"2019"},{"key":"2026033012245038300_ref310","first-page":"7683","volume-title":"36th Intl. Conf. on Machine Learning (ICML)","author":"Zimmert","year":"2019"},{"key":"2026033012245038300_ref311","volume-title":"Intl. Conf. on Artificial Intelligence and Statistics (AISTATS)","author":"Zimmert","year":"2019"},{"key":"2026033012245038300_ref312","first-page":"10","volume-title":"Intl. Conf. on Machine Learning (ICML)","author":"Zoghi","year":"2014"},{"key":"2026033012245038300_ref313","volume-title":"32nd Conf. on Uncertainty in Artificial Intelligence (UAI)","author":"Zong","year":"2016"}],"container-title":["Foundations and Trends\u00ae in Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/ftmal\/article-pdf\/12\/1-2\/1\/11160748\/2200000068en.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/www.emerald.com\/ftmal\/article-pdf\/12\/1-2\/1\/11160748\/2200000068en.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T18:10:46Z","timestamp":1777486246000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.emerald.com\/ftmal\/article\/12\/1-2\/1\/1332831\/Introduction-to-Multi-Armed-Bandits"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,8]]},"references-count":313,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2019,11,8]]}},"URL":"https:\/\/doi.org\/10.1561\/2200000068","relation":{},"ISSN":["1935-8237","1935-8245"],"issn-type":[{"value":"1935-8237","type":"print"},{"value":"1935-8245","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,11,8]]}}}