{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,20]],"date-time":"2025-07-20T22:47:16Z","timestamp":1753051636282,"version":"3.41.0"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2020,9,10]],"date-time":"2020-09-10T00:00:00Z","timestamp":1599696000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100003246","name":"Netherlands Organisation for Scientific Research","doi-asserted-by":"crossref","award":["612.001.551"],"award-info":[{"award-number":["612.001.551"]}],"id":[{"id":"10.13039\/501100003246","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Innovation Center for Artificial Intelligence"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2020,10,31]]},"abstract":"<jats:p>\n            Online ranker evaluation is one of the key challenges in information retrieval. Although the preferences of rankers can be inferred by interleaving methods, the problem of how to effectively choose the ranker pair that generates the interleaved list without degrading the user experience too much is still challenging. On the one hand, if two rankers have not been compared enough, the inferred preference can be noisy and inaccurate. On the other hand, if two rankers are compared too many times, the interleaving process inevitably hurts the user experience too much. This dilemma is known as the\n            <jats:italic>exploration versus exploitation<\/jats:italic>\n            tradeoff. It is captured by the\n            <jats:italic>K<\/jats:italic>\n            -armed dueling bandit problem, which is a variant of the\n            <jats:italic>K<\/jats:italic>\n            -armed bandit problem, where the feedback comes in the form of pairwise preferences. Today\u2019s deployed search systems can evaluate a large number of rankers concurrently, and scaling effectively in the presence of numerous rankers is a critical aspect of\n            <jats:italic>K<\/jats:italic>\n            -armed dueling bandit problems.\n          <\/jats:p>\n          <jats:p>In this article, we focus on solving the large-scale online ranker evaluation problem under the so-called Condorcet assumption, where there exists an optimal ranker that is preferred to all other rankers. We propose Merge Double Thompson Sampling (MergeDTS), which first utilizes a divide-and-conquer strategy that localizes the comparisons carried out by the algorithm to small batches of rankers, and then employs Thompson Sampling to reduce the comparisons between suboptimal rankers inside these small batches. The effectiveness (regret) and efficiency (time complexity) of MergeDTS are extensively evaluated using examples from the domain of online evaluation for web search. Our main finding is that for large-scale Condorcet ranker evaluation problems, MergeDTS outperforms the state-of-the-art dueling bandit algorithms.<\/jats:p>","DOI":"10.1145\/3411753","type":"journal-article","created":{"date-parts":[[2020,9,11]],"date-time":"2020-09-11T03:38:41Z","timestamp":1599795521000},"page":"1-28","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["MergeDTS"],"prefix":"10.1145","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6978-2034","authenticated-orcid":false,"given":"Chang","family":"Li","sequence":"first","affiliation":[{"name":"University of Amsterdam, Amsterdam, The Netherlands"}]},{"given":"Ilya","family":"Markov","sequence":"additional","affiliation":[{"name":"University of Amsterdam, Amsterdam, The Netherlands"}]},{"given":"Maarten De","family":"Rijke","sequence":"additional","affiliation":[{"name":"University of Amsterdam and Ahold Delhaize, Zaandam, The Netherlands"}]},{"given":"Masrour","family":"Zoghi","sequence":"additional","affiliation":[{"name":"Microsoft, Redmond, WA"}]}],"member":"320","published-online":{"date-parts":[[2020,9,10]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/3044805.3044988"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1013689704352"},{"key":"e_1_2_1_3_1","first-page":"1","article-title":"The nonstochastic multiarmed bandit problem","volume":"32","author":"Auer Peter","year":"2003","unstructured":"Peter Auer , Nicol\u00f2 Cesa-Bianchi , Yoav Freund , and Robert E. Schapire . 2003 . The nonstochastic multiarmed bandit problem . SIAM Journal on Computing 32 , 1 (Jan. 2003), 48--77. DOI:https:\/\/doi.org\/10.1137\/S0097539701398375 10.1137\/S0097539701398375 Peter Auer, Nicol\u00f2 Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. 2003. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing 32, 1 (Jan. 2003), 48--77. DOI:https:\/\/doi.org\/10.1137\/S0097539701398375","journal-title":"SIAM Journal on Computing"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2914706"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2983323.2983659"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-11662-4_3"},{"key":"e_1_2_1_8_1","unstructured":"Robert Busa-Fekete Eyke H\u00fcllermeier and Adil El Mesaoudi-Paul. 2018. Preference-based online learning with dueling bandits: A survey. arXiv:1807.11398  Robert Busa-Fekete Eyke H\u00fcllermeier and Adil El Mesaoudi-Paul. 2018. Preference-based online learning with dueling bandits: A survey. arXiv:1807.11398"},{"key":"e_1_2_1_9_1","volume-title":"Berger","author":"Casella George","year":"2002","unstructured":"George Casella and Roger L . Berger . 2002 . Statistical Inference. Duxbury, Pacific Grove, CA. George Casella and Roger L. Berger. 2002. Statistical Inference. Duxbury, Pacific Grove, CA."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/3045754.3045756"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2094072.2094078"},{"key":"#cr-split#-e_1_2_1_12_1.1","doi-asserted-by":"crossref","unstructured":"Aleksandr Chuklin Ilya Markov and Maarten de Rijke. 2015. Click Models for Web Search. Morgan 8 Claypool.DOI:https:\/\/doi.org\/10.2200\/S00654ED1V01Y201507ICR043 10.2200\/S00654ED1V01Y201507ICR043","DOI":"10.2200\/S00654ED1V01Y201507ICR043"},{"key":"#cr-split#-e_1_2_1_12_1.2","doi-asserted-by":"crossref","unstructured":"Aleksandr Chuklin Ilya Markov and Maarten de Rijke. 2015. Click Models for Web Search. Morgan 8 Claypool.DOI:https:\/\/doi.org\/10.2200\/S00654ED1V01Y201507ICR043","DOI":"10.1007\/978-3-031-02294-4"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2668120"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3231937"},{"key":"e_1_2_1_15_1","volume-title":"Proceedings of the 28th Conference on Learning Theory\u2014Volume 40 (COLT\u201915)","author":"Dud\u00edk Miroslav","year":"2015","unstructured":"Miroslav Dud\u00edk , Katja Hofmann , Robert E. Schapire , Aleksandrs Slivkins , and Masrour Zoghi . 2015 . Contextual dueling bandits . In Proceedings of the 28th Conference on Learning Theory\u2014Volume 40 (COLT\u201915) . 563--587. http:\/\/jmlr.org\/proceedings\/papers\/v40\/Dudik15.html. Miroslav Dud\u00edk, Katja Hofmann, Robert E. Schapire, Aleksandrs Slivkins, and Masrour Zoghi. 2015. Contextual dueling bandits. In Proceedings of the 28th Conference on Learning Theory\u2014Volume 40 (COLT\u201915). 563--587. http:\/\/jmlr.org\/proceedings\/papers\/v40\/Dudik15.html."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/3045118.3045143"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3186195"},{"volume-title":"Behind the Scenes. Retrieved","year":"2010","key":"e_1_2_1_18_1","unstructured":"Google. 2010 . Google Instant , Behind the Scenes. Retrieved August 17, 2020 from https:\/\/googleblog.blogspot.no\/2010\/09\/google-instant-behind-scenes.html. Google. 2010. Google Instant, Behind the Scenes. Retrieved August 17, 2020 from https:\/\/googleblog.blogspot.no\/2010\/09\/google-instant-behind-scenes.html."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1963.10500830"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1561\/1500000051"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/2433396.2433419"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063576.2063618"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2398516"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2536736.2536737"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-011-5257-4"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/2866571"},{"key":"e_1_2_1_27_1","volume-title":"Proceedings of the 18th International Conference on Artificial Intelligence and Statistics\u2014Volume 38 (AISTATS\u201915)","author":"Jamieson Kevin","year":"2015","unstructured":"Kevin Jamieson , Sumeet Katariya , Atul Deshpande , and Robert Nowak . 2015 . Sparse dueling bandits . In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics\u2014Volume 38 (AISTATS\u201915) . 416--424. http:\/\/jmlr.org\/proceedings\/papers\/v38\/jamieson15.html. Kevin Jamieson, Sumeet Katariya, Atul Deshpande, and Robert Nowak. 2015. Sparse dueling bandits. In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics\u2014Volume 38 (AISTATS\u201915). 416--424. http:\/\/jmlr.org\/proceedings\/papers\/v38\/jamieson15.html."},{"volume-title":"Text Mining","author":"Joachims Thorsten","key":"e_1_2_1_28_1","unstructured":"Thorsten Joachims . 2003. Evaluating retrieval performance using clickthrough data . In Text Mining . Cornell University , Ithaca, NY , 79--96. Available at https:\/\/www.cs.cornell.edu\/people\/tj\/publications\/joachims_02b.pdf. Thorsten Joachims. 2003. Evaluating retrieval performance using clickthrough data. In Text Mining. Cornell University, Ithaca, NY, 79--96. Available at https:\/\/www.cs.cornell.edu\/people\/tj\/publications\/joachims_02b.pdf."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1229179.1229181"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487575.2488217"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 28th Conference on Learning Theory\u2014Volume 40 (COLT\u201915)","author":"Komiyama Junpei","year":"2015","unstructured":"Junpei Komiyama , Junya Honda , Hisashi Kashima , and Hiroshi Nakagawa . 2015 . Regret lower bound and optimal algorithm in dueling bandit problem . In Proceedings of the 28th Conference on Learning Theory\u2014Volume 40 (COLT\u201915) . 1141--1154. http:\/\/jmlr.org\/proceedings\/papers\/v40\/Komiyama15.html. Junpei Komiyama, Junya Honda, Hisashi Kashima, and Hiroshi Nakagawa. 2015. Regret lower bound and optimal algorithm in dueling bandit problem. In Proceedings of the 28th Conference on Learning Theory\u2014Volume 40 (COLT\u201915). 1141--1154. http:\/\/jmlr.org\/proceedings\/papers\/v40\/Komiyama15.html."},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the 33rd International Conference on Machine Learning\u2014Volume 48 (ICML\u201916)","author":"Komiyama Junpei","year":"2016","unstructured":"Junpei Komiyama , Junya Honda , and Hiroshi Nakagawa . 2016 . Copeland dueling bandit problem: Regret lower bound, optimal algorithm, and computationally efficient algorithm . In Proceedings of the 33rd International Conference on Machine Learning\u2014Volume 48 (ICML\u201916) . 1235--1244. http:\/\/dl.acm.org\/citation.cfm?id=3045390.3045521 Junpei Komiyama, Junya Honda, and Hiroshi Nakagawa. 2016. Copeland dueling bandit problem: Regret lower bound, optimal algorithm, and computationally efficient algorithm. In Proceedings of the 33rd International Conference on Machine Learning\u2014Volume 48 (ICML\u201916). 1235--1244. http:\/\/dl.acm.org\/citation.cfm?id=3045390.3045521"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI\u20192019)","author":"Li Chang","year":"2019","unstructured":"Chang Li , Branislav Kveton , Tor Lattimore , Ilya Markov , Maarten de Rijke , Csaba Szepesvari , and Masrour Zoghi . 2019 . BubbleRank: Safe online learning to re-rank via implicit click feedback . In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI\u20192019) . http:\/\/auai.org\/uai2019\/proceedings\/papers\/47.pdf. Chang Li, Branislav Kveton, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesvari, and Masrour Zoghi. 2019. BubbleRank: Safe online learning to re-rank via implicit click feedback. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI\u20192019). http:\/\/auai.org\/uai2019\/proceedings\/papers\/47.pdf."},{"volume-title":"Proceedings of the Conference on World Wide Web(WWW\u201910)","author":"Li Lihong","key":"e_1_2_1_34_1","unstructured":"Lihong Li , Wei Chu , John Langford , and Robert E. Schapire . 2010. A contextual-bandit approach to personalized news article recommendation . In Proceedings of the Conference on World Wide Web(WWW\u201910) . 661--670. Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. 2010. A contextual-bandit approach to personalized news article recommendation. In Proceedings of the Conference on World Wide Web(WWW\u201910). 661--670."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2914763"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3308774.3308780"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3052768"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3185153"},{"key":"e_1_2_1_39_1","unstructured":"Tao Qin and Tie-Yan Liu. 2013. Introducing LETOR 4.0 datasets. arXiv:1306.2597  Tao Qin and Tie-Yan Liu. 2013. Introducing LETOR 4.0 datasets. arXiv:1306.2597"},{"key":"e_1_2_1_40_1","volume-title":"Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI\u201918)","author":"Saha Aadirupa","year":"2018","unstructured":"Aadirupa Saha and Aditya Gopalan . 2018 . Battle of bandits . In Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI\u201918) . http:\/\/auai.org\/uai2018\/proceedings\/papers\/290.pdf. Aadirupa Saha and Aditya Gopalan. 2018. Battle of bandits. In Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI\u201918). http:\/\/auai.org\/uai2018\/proceedings\/papers\/290.pdf."},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766462.2767838"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2513150.2513162"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI\u201917)","author":"Sui Yanan","year":"2017","unstructured":"Yanan Sui , Vincent Zhuang , Joel W. Burdick , and Yisong Yue . 2017 . Multi-dueling bandits with dependent arms . In Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI\u201917) . http:\/\/auai.org\/uai2017\/proceedings\/papers\/155.pdf. Yanan Sui, Vincent Zhuang, Joel W. Burdick, and Yisong Yue. 2017. Multi-dueling bandits with dependent arms. In Proceedings of the 33rd Conference on Uncertainty in Artificial Intelligence (UAI\u201917). http:\/\/auai.org\/uai2017\/proceedings\/papers\/155.pdf."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.5555\/3304652.3304790"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/25.3-4.285"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.5555\/3042817.3042904"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.5555\/3157096.3157169"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2011.12.028"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553527"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the 28th International Conference on Machine Learning (ICML\u201911)","author":"Yue Yisong","year":"2011","unstructured":"Yisong Yue and Thorsten Joachims . 2011 . Beat the mean bandit . In Proceedings of the 28th International Conference on Machine Learning (ICML\u201911) . 241--248. http:\/\/dl.acm.org\/citation.cfm?id=3104482.3104513 Yisong Yue and Thorsten Joachims. 2011. Beat the mean bandit. In Proceedings of the 28th International Conference on Machine Learning (ICML\u201911). 241--248. http:\/\/dl.acm.org\/citation.cfm?id=3104482.3104513"},{"key":"e_1_2_1_51_1","volume-title":"Proceedings of Machine Learning Research, K. Chaudhuri and M. Sugiyama (Eds.)","volume":"89","author":"Zimmert Julian","year":"2019","unstructured":"Julian Zimmert and Yevgeny Seldin . 2019 . An optimal algorithm for stochastic and adversarial bandits . In Proceedings of Machine Learning Research, K. Chaudhuri and M. Sugiyama (Eds.) , Vol. 89 . PMLR, 467--475. Julian Zimmert and Yevgeny Seldin. 2019. An optimal algorithm for stochastic and adversarial bandits. In Proceedings of Machine Learning Research, K. Chaudhuri and M. Sugiyama (Eds.), Vol. 89. PMLR, 467--475."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.5555\/2969239.2969274"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/2684822.2685290"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.5555\/3044805.3044894"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2556195.2556256"}],"container-title":["ACM Transactions on Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3411753","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3411753","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:39Z","timestamp":1750197759000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3411753"}},"subtitle":["A Method for Effective Large-Scale Online Ranker Evaluation"],"short-title":[],"issued":{"date-parts":[[2020,9,10]]},"references-count":55,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,10,31]]}},"alternative-id":["10.1145\/3411753"],"URL":"https:\/\/doi.org\/10.1145\/3411753","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"type":"print","value":"1046-8188"},{"type":"electronic","value":"1558-2868"}],"subject":[],"published":{"date-parts":[[2020,9,10]]},"assertion":[{"value":"2018-11-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-07-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-09-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}