{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T15:31:52Z","timestamp":1772119912227,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":40,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T00:00:00Z","timestamp":1603065600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100006435","name":"National Science Foundation","doi-asserted-by":"publisher","award":["IIS1907704, IIS1928278, IIS1714741, IIS1715940, IIS1845081, CNS1815636"],"award-info":[{"award-number":["IIS1907704, IIS1928278, IIS1714741, IIS1715940, IIS1845081, CNS1815636"]}],"id":[{"id":"10.13039\/100006435","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,19]]},"DOI":"10.1145\/3340531.3412044","type":"proceedings-article","created":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T05:31:03Z","timestamp":1603085463000},"page":"1883-1891","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":46,"title":["Whole-Chain Recommendations"],"prefix":"10.1145","author":[{"given":"Xiangyu","family":"Zhao","sequence":"first","affiliation":[{"name":"Michigan State University, East Lansing, MI, USA"}]},{"given":"Long","family":"Xia","sequence":"additional","affiliation":[{"name":"York University, Toronto, ON, Canada"}]},{"given":"Lixin","family":"Zou","sequence":"additional","affiliation":[{"name":"Baidu Inc., Beijing, China"}]},{"given":"Hui","family":"Liu","sequence":"additional","affiliation":[{"name":"Michigan State University, East Lansing, MI, USA"}]},{"given":"Dawei","family":"Yin","sequence":"additional","affiliation":[{"name":"Baidu, Beijing, China"}]},{"given":"Jiliang","family":"Tang","sequence":"additional","affiliation":[{"name":"Michigan State University, East Lansing, MI, USA"}]}],"member":"320","published-online":{"date-parts":[[2020,10,19]]},"reference":[{"key":"e_1_3_2_2_1_1","first-page":"213","article-title":"R-max-a general polynomial time algorithm for near-optimal reinforcement learning","volume":"3","author":"Brafman Ronen I","year":"2002","unstructured":"Ronen I Brafman and Moshe Tennenholtz . 2002 . R-max-a general polynomial time algorithm for near-optimal reinforcement learning . Journal of Machine Learning Research , Vol. 3 , Oct (2002), 213 -- 231 . Ronen I Brafman and Moshe Tennenholtz. 2002. R-max-a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, Vol. 3, Oct (2002), 213--231.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186039"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11452"},{"key":"e_1_3_2_2_4_1","volume-title":"Large-scale Interactive Recommendation with Tree-structured Policy Gradient. arXiv preprint arXiv:1811.05869","author":"Chen Haokun","year":"2018","unstructured":"Haokun Chen , Xinyi Dai , Han Cai , Weinan Zhang , Xuejian Wang , Ruiming Tang , Yuzhou Zhang , and Yong Yu. 2018b. Large-scale Interactive Recommendation with Tree-structured Policy Gradient. arXiv preprint arXiv:1811.05869 ( 2018 ). Haokun Chen, Xinyi Dai, Han Cai, Weinan Zhang, Xuejian Wang, Ruiming Tang, Yuzhou Zhang, and Yong Yu. 2018b. Large-scale Interactive Recommendation with Tree-structured Policy Gradient. arXiv preprint arXiv:1811.05869 (2018)."},{"key":"e_1_3_2_2_5_1","volume-title":"Chi","author":"Chen Minmin","year":"2018","unstructured":"Minmin Chen , Alex Beutel , Paul Covington , Sagar Jain , Francois Belletti , and Ed Chi . 2018 a. Top-K Off-Policy Correction for a REINFORCE Recommender System . arXiv preprint arXiv:1812.02353 (2018). Minmin Chen, Alex Beutel, Paul Covington, Sagar Jain, Francois Belletti, and Ed Chi. 2018a. Top-K Off-Policy Correction for a REINFORCE Recommender System. arXiv preprint arXiv:1812.02353 (2018)."},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220122"},{"key":"e_1_3_2_2_7_1","volume-title":"Neural Model-Based Reinforcement Learning for Recommendation. arXiv preprint arXiv:1812.10613","author":"Chen Xinshi","year":"2018","unstructured":"Xinshi Chen , Shuang Li , Hui Li , Shaohua Jiang , Yuan Qi , and Le Song . 2018c. Neural Model-Based Reinforcement Learning for Recommendation. arXiv preprint arXiv:1812.10613 ( 2018 ). Xinshi Chen, Shuang Li, Hui Li, Shaohua Jiang, Yuan Qi, and Le Song. 2018c. Neural Model-Based Reinforcement Learning for Recommendation. arXiv preprint arXiv:1812.10613 (2018)."},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2988450.2988454"},{"key":"e_1_3_2_2_9_1","volume-title":"Reinforcement Learning based Recommender System using Biclustering Technique. arXiv preprint arXiv:1801.05532","author":"Choi Sungwoon","year":"2018","unstructured":"Sungwoon Choi , Heonseok Ha , Uiwon Hwang , Chanju Kim , Jung-Woo Ha , and Sungroh Yoon . 2018. Reinforcement Learning based Recommender System using Biclustering Technique. arXiv preprint arXiv:1801.05532 ( 2018 ). Sungwoon Choi, Heonseok Ha, Uiwon Hwang, Chanju Kim, Jung-Woo Ha, and Sungroh Yoon. 2018. Reinforcement Learning based Recommender System using Biclustering Technique. arXiv preprint arXiv:1801.05532 (2018)."},{"key":"e_1_3_2_2_10_1","volume-title":"Deep reinforcement learning in large discrete action spaces. arXiv preprint arXiv:1512.07679","author":"Dulac-Arnold Gabriel","year":"2015","unstructured":"Gabriel Dulac-Arnold , Richard Evans , Hado van Hasselt , Peter Sunehag , Timothy Lillicrap , Jonathan Hunt , Timothy Mann , Theophane Weber , Thomas Degris , and Ben Coppin . 2015. Deep reinforcement learning in large discrete action spaces. arXiv preprint arXiv:1512.07679 ( 2015 ). Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, and Ben Coppin. 2015. Deep reinforcement learning in large discrete action spaces. arXiv preprint arXiv:1512.07679 (2015)."},{"key":"e_1_3_2_2_11_1","volume-title":"Attacking Black-box Recommendations via Copying Cross-domain User Profiles. arXiv preprint arXiv:2005.08147","author":"Fan Wenqi","year":"2020","unstructured":"Wenqi Fan , Tyler Derr , Xiangyu Zhao , Yao Ma , Hui Liu , Jianping Wang , Jiliang Tang , and Qing Li. 2020. Attacking Black-box Recommendations via Copying Cross-domain User Profiles. arXiv preprint arXiv:2005.08147 ( 2020 ). Wenqi Fan, Tyler Derr, Xiangyu Zhao, Yao Ma, Hui Liu, Jianping Wang, Jiliang Tang, and Qing Li. 2020. Attacking Black-box Recommendations via Copying Cross-domain User Profiles. arXiv preprint arXiv:2005.08147 (2020)."},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186165"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/239"},{"key":"e_1_3_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2984287"},{"key":"e_1_3_2_2_15_1","volume-title":"Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939","author":"Hidasi Bal\u00e1zs","year":"2015","unstructured":"Bal\u00e1zs Hidasi , Alexandros Karatzoglou , Linas Baltrunas , and Domonkos Tikk . 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 ( 2015 ). Bal\u00e1zs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, and Domonkos Tikk. 2015. Session-based recommendations with recurrent neural networks. arXiv preprint arXiv:1511.06939 (2015)."},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219846"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/582415.582418"},{"key":"e_1_3_2_2_18_1","volume-title":"Near-optimal reinforcement learning in polynomial time. Machine learning","author":"Kearns Michael","year":"2002","unstructured":"Michael Kearns and Satinder Singh . 2002. Near-optimal reinforcement learning in polynomial time. Machine learning , Vol. 49 , 2--3 ( 2002 ), 209--232. Michael Kearns and Satinder Singh. 2002. Near-optimal reinforcement learning in polynomial time. Machine learning, Vol. 49, 2--3 (2002), 209--232."},{"key":"e_1_3_2_2_19_1","unstructured":"Omer Levy and Yoav Goldberg. 2014. Neural word embedding as implicit matrix factorization. In Advances in neural information processing systems. 2177--2185.  Omer Levy and Yoav Goldberg. 2014. Neural word embedding as implicit matrix factorization. In Advances in neural information processing systems. 2177--2185."},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3132847.3132926"},{"key":"e_1_3_2_2_21_1","volume-title":"Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971","author":"Lillicrap Timothy P","year":"2015","unstructured":"Timothy P Lillicrap , Jonathan J Hunt , Alexander Pritzel , Nicolas Heess , Tom Erez , Yuval Tassa , David Silver , and Daan Wierstra . 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 ( 2015 ). Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. 2015. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)."},{"key":"e_1_3_2_2_22_1","volume-title":"OpenAI Pieter Abbeel, and Igor Mordatch","author":"Lowe Ryan","year":"2017","unstructured":"Ryan Lowe , Yi I Wu , Aviv Tamar , Jean Harb , OpenAI Pieter Abbeel, and Igor Mordatch . 2017 . Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in neural information processing systems. 6379--6390. Ryan Lowe, Yi I Wu, Aviv Tamar, Jean Harb, OpenAI Pieter Abbeel, and Igor Mordatch. 2017. Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in neural information processing systems. 6379--6390."},{"key":"e_1_3_2_2_23_1","volume-title":"Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602","author":"Mnih Volodymyr","year":"2013","unstructured":"Volodymyr Mnih , Koray Kavukcuoglu , David Silver , Alex Graves , Ioannis Antonoglou , Daan Wierstra , and Martin Riedmiller . 2013. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 ( 2013 ). Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013)."},{"key":"e_1_3_2_2_24_1","volume-title":"et almbox","author":"Mnih Volodymyr","year":"2015","unstructured":"Volodymyr Mnih , Koray Kavukcuoglu , David Silver , Andrei A Rusu , Joel Veness , Marc G Bellemare , Alex Graves , Martin Riedmiller , Andreas K Fidjeland , Georg Ostrovski , et almbox . 2015 . Human-level control through deep reinforcement learning. Nature , Vol. 518 , 7540 (2015), 529. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et almbox. 2015. Human-level control through deep reinforcement learning. Nature, Vol. 518, 7540 (2015), 529."},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148176"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2018.00074"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401467"},{"key":"e_1_3_2_2_28_1","volume-title":"2019 a. Deep Reinforcement Learning for Online Advertising in Recommender Systems. arXiv preprint arXiv:1909.03602","author":"Zhao Xiangyu","year":"2019","unstructured":"Xiangyu Zhao , Changsheng Gu , Haoshenglun Zhang , Xiaobing Liu , Xiwang Yang , and Jiliang Tang . 2019 a. Deep Reinforcement Learning for Online Advertising in Recommender Systems. arXiv preprint arXiv:1909.03602 ( 2019 ). Xiangyu Zhao, Changsheng Gu, Haoshenglun Zhang, Xiaobing Liu, Xiwang Yang, and Jiliang Tang. 2019 a. Deep Reinforcement Learning for Online Advertising in Recommender Systems. arXiv preprint arXiv:1909.03602 (2019)."},{"key":"e_1_3_2_2_29_1","volume-title":"2020 a. Memory-efficient Embedding for Recommendations. arXiv preprint arXiv:2006.14827","author":"Zhao Xiangyu","year":"2020","unstructured":"Xiangyu Zhao , Haochen Liu , Hui Liu , Jiliang Tang , Weiwei Guo , Jun Shi , Sida Wang , Huiji Gao , and Bo Long . 2020 a. Memory-efficient Embedding for Recommendations. arXiv preprint arXiv:2006.14827 ( 2020 ). Xiangyu Zhao, Haochen Liu, Hui Liu, Jiliang Tang, Weiwei Guo, Jun Shi, Sida Wang, Huiji Gao, and Bo Long. 2020 a. Memory-efficient Embedding for Recommendations. arXiv preprint arXiv:2006.14827 (2020)."},{"key":"e_1_3_2_2_30_1","volume-title":"2020 b. AutoEmb: Automated Embedding Dimensionality Search in Streaming Recommendations. arXiv preprint arXiv:2002.11252","author":"Zhao Xiangyu","year":"2020","unstructured":"Xiangyu Zhao , Chong Wang , Ming Chen , Xudong Zheng , Xiaobing Liu , and Jiliang Tang . 2020 b. AutoEmb: Automated Embedding Dimensionality Search in Streaming Recommendations. arXiv preprint arXiv:2002.11252 ( 2020 ). Xiangyu Zhao, Chong Wang, Ming Chen, Xudong Zheng, Xiaobing Liu, and Jiliang Tang. 2020 b. AutoEmb: Automated Embedding Dimensionality Search in Streaming Recommendations. arXiv preprint arXiv:2002.11252 (2020)."},{"key":"e_1_3_2_2_31_1","volume-title":"2019 b. Toward Simulating Environments in Reinforcement Learning Based Recommendations. arXiv preprint arXiv:1906.11462","author":"Zhao Xiangyu","year":"2019","unstructured":"Xiangyu Zhao , Long Xia , Zhuoye Ding , Dawei Yin , and Jiliang Tang . 2019 b. Toward Simulating Environments in Reinforcement Learning Based Recommendations. arXiv preprint arXiv:1906.11462 ( 2019 ). Xiangyu Zhao, Long Xia, Zhuoye Ding, Dawei Yin, and Jiliang Tang. 2019 b. Toward Simulating Environments in Reinforcement Learning Based Recommendations. arXiv preprint arXiv:1906.11462 (2019)."},{"key":"e_1_3_2_2_32_1","volume-title":"Long Xia, Jiliang Tang, and Dawei Yin with Martin Vesely as coordinator. ACM SIGWEB Newsletter Spring","author":"Zhao Xiangyu","year":"2019","unstructured":"Xiangyu Zhao , Long Xia , Jiliang Tang , and Dawei Yin . 2019 c. Deep reinforcement learning for search, recommendation, and online advertising: a survey by Xiangyu Zhao , Long Xia, Jiliang Tang, and Dawei Yin with Martin Vesely as coordinator. ACM SIGWEB Newsletter Spring ( 2019 ), 4. Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin. 2019 c. Deep reinforcement learning for search, recommendation, and online advertising: a survey by Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin with Martin Vesely as coordinator. ACM SIGWEB Newsletter Spring (2019), 4."},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3240323.3240374"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219886"},{"key":"e_1_3_2_2_35_1","volume-title":"Deep Reinforcement Learning for List-wise Recommendations. arXiv preprint arXiv:1801.00209","author":"Zhao Xiangyu","year":"2017","unstructured":"Xiangyu Zhao , Liang Zhang , Zhuoye Ding , Dawei Yin , Yihong Zhao , and Jiliang Tang . 2017. Deep Reinforcement Learning for List-wise Recommendations. arXiv preprint arXiv:1801.00209 ( 2017 ). Xiangyu Zhao, Liang Zhang, Zhuoye Ding, Dawei Yin, Yihong Zhao, and Jiliang Tang. 2017. Deep Reinforcement Learning for List-wise Recommendations. arXiv preprint arXiv:1801.00209 (2017)."},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403384"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3185994"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330668"},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3336191.3371801"},{"key":"e_1_3_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401181"}],"event":{"name":"CIKM '20: The 29th ACM International Conference on Information and Knowledge Management","location":"Virtual Event Ireland","acronym":"CIKM '20","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 29th ACM International Conference on Information &amp; Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3340531.3412044","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3340531.3412044","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:29Z","timestamp":1750197749000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3340531.3412044"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,19]]},"references-count":40,"alternative-id":["10.1145\/3340531.3412044","10.1145\/3340531"],"URL":"https:\/\/doi.org\/10.1145\/3340531.3412044","relation":{},"subject":[],"published":{"date-parts":[[2020,10,19]]},"assertion":[{"value":"2020-10-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}