{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:20:02Z","timestamp":1750220402831,"version":"3.41.0"},"reference-count":5,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,12,1]],"date-time":"2021-12-01T00:00:00Z","timestamp":1638316800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["SIGIR Forum"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:p>Ranking system is the core part of modern retrieval and recommender systems, where the goal is to rank candidate items given user contexts. Optimizing ranking systems online means that the deployed system can serve users' requests, e.g., queries in the web search, and optimize the ranking policy by learning from user interactions, e.g., clicks. Bandit is a general online learning framework and can be used in our optimization task. However, due to the unique features of ranking, there are several challenges in designing bandit algorithms for ranking system optimization. In this thesis, we study and propose bandit algorithms for four challenges in optimizing ranking systems online: effectiveness, safety, nonstationarity, and diversification.<\/jats:p>\n          <jats:p>We first focus on the large-scale online ranker evaluation problem. The challenge is that the number of pair-wise ranker comparisons grows quadratically with respect to the number of rankers. We proposed the merge double Thompson sampling (MergeDTS) method to solve the problem. MergeDTS takes the divide-and-conquer idea in Merge Sort to decrease the complexity and uses the Thompson sampling to increase the effectiveness of pair-wise comparisons.<\/jats:p>\n          <jats:p>We then address the safety Online Learning to Rank (OLTR) by introducing the BubbleRank algorithm. BubbleRank uses the offline trained ranker, e.g., the production ranker, to obtain the initial ranked list, and then conducts safe online pairwise exploration to improve this list. The safety comes from the fact that BubbleRank explores the ranked lists by randomly exchanging items with their neighbors. Thus, during exploration, low-quality items are hardly shifted at top positions.<\/jats:p>\n          <jats:p>Non-stationarity widely appears in interactive systems since user preferences are affected by different factors and change over time. It is critical to design algorithms that capture the non-stationarity in OLTR. This thesis provides CascadeDUCB and CascadeSWUCB algorithms to solve the non-stationary OLTR. We derive gap-dependent bounds on their regret and show the theoretical soundness of the proposed algorithms, and then we conduct simulated experiments to validate the empirical effectiveness.<\/jats:p>\n          <jats:p>Result diversification and relevance ranking are two important aspects in modern recommender systems. Ideal learning algorithms should be able to display a ranked list whose items are relevant and the topics of items are diverse. The last research chapter of the thesis focuses on this challenge and provides the CascadeHybrid algorithm. CascadeHybrid learns from interactive feedback online and trained a ranker, which is a hybrid of a linear function capturing the relevance part and a submodular function responding to the results diversification.<\/jats:p>\n          <jats:p>\n            <jats:bold>Awarded by<\/jats:bold>\n            : University of Amsterdam, Amsterdam, the Netherlands on 4 March 2021.\n          <\/jats:p>\n          <jats:p>\n            <jats:bold>Supervised by<\/jats:bold>\n            : Maarten de Rijke.\n          <\/jats:p>\n          <jats:p>\n            <jats:bold>Available at<\/jats:bold>\n            : https:\/\/dare.uva.nl\/search?identifier=f043b9b4-e666-48e0-8a6c-7c5431660e17.\n          <\/jats:p>","DOI":"10.1145\/3527546.3527575","type":"journal-article","created":{"date-parts":[[2022,3,17]],"date-time":"2022-03-17T23:24:36Z","timestamp":1647559476000},"page":"1-2","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Optimizing ranking systems online as bandits"],"prefix":"10.1145","volume":"55","author":[{"given":"Chang","family":"Li","sequence":"first","affiliation":[{"name":"University of Amsterdam"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,3,17]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/396"},{"key":"e_1_2_1_2_1","volume-title":"Federated unbiased learning to rank. ArXiv, abs\/2105.04761","author":"Li Chang","year":"2021","unstructured":"Chang Li and Hua Ouyang . Federated unbiased learning to rank. ArXiv, abs\/2105.04761 , 2021 . Chang Li and Hua Ouyang. Federated unbiased learning to rank. ArXiv, abs\/2105.04761, 2021."},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the Conference on Uncertainty in Artificial Intelligence, UAI","author":"Li Chang","year":"2019","unstructured":"Chang Li , Branislav Kveton , Tor Lattimore , Ilya Markov , Maarten de Rijke , Csaba Szepesv\u00e1ri , and Masrour Zoghi . BubbleRank : Safe online learning to re-rank via implicit click feedback . In Proceedings of the Conference on Uncertainty in Artificial Intelligence, UAI , July 2019 . Chang Li, Branislav Kveton, Tor Lattimore, Ilya Markov, Maarten de Rijke, Csaba Szepesv\u00e1ri, and Masrour Zoghi. BubbleRank: Safe online learning to re-rank via implicit click feedback. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, UAI, July 2019."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3383313.3412245"},{"key":"e_1_2_1_5_1","volume-title":"TOIS, 38(4): Article","author":"Li Chang","year":"2020","unstructured":"Chang Li , Ilya Markov , Maarten de Rijke , and Masrour Zoghi . MergeDTS: A method for effective large-scale online ranker evaluation. ACM Transactions on Information Systems , TOIS, 38(4): Article 40, August 2020 b. Chang Li, Ilya Markov, Maarten de Rijke, and Masrour Zoghi. MergeDTS: A method for effective large-scale online ranker evaluation. ACM Transactions on Information Systems, TOIS, 38(4): Article 40, August 2020b."}],"container-title":["ACM SIGIR Forum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3527546.3527575","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3527546.3527575","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:18:53Z","timestamp":1750191533000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3527546.3527575"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12]]},"references-count":5,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["10.1145\/3527546.3527575"],"URL":"https:\/\/doi.org\/10.1145\/3527546.3527575","relation":{},"ISSN":["0163-5840"],"issn-type":[{"type":"print","value":"0163-5840"}],"subject":[],"published":{"date-parts":[[2021,12]]},"assertion":[{"value":"2022-03-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}