{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:24:28Z","timestamp":1750220668259,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":33,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T00:00:00Z","timestamp":1603065600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,10,19]]},"DOI":"10.1145\/3340531.3412721","type":"proceedings-article","created":{"date-parts":[[2020,10,19]],"date-time":"2020-10-19T06:18:51Z","timestamp":1603088331000},"page":"2677-2684","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Learning to Infer User Hidden States for Online Sequential Advertising"],"prefix":"10.1145","author":[{"given":"Zhaoqing","family":"Peng","sequence":"first","affiliation":[{"name":"Alibaba Group, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Junqi","family":"Jin","sequence":"additional","affiliation":[{"name":"Alibaba Group, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lan","family":"Luo","sequence":"additional","affiliation":[{"name":"University of Southern California, Los Angeles, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yaodong","family":"Yang","sequence":"additional","affiliation":[{"name":"University College London, London, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rui","family":"Luo","sequence":"additional","affiliation":[{"name":"University College London, London, United 
Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jun","family":"Wang","sequence":"additional","affiliation":[{"name":"University College London, London, United Kingdom"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weinan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Shanghai Jiao Tong University, Shanghai, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haiyang","family":"Xu","sequence":"additional","affiliation":[{"name":"Alibaba Group, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Miao","family":"Xu","sequence":"additional","affiliation":[{"name":"Alibaba Group, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chuan","family":"Yu","sequence":"additional","affiliation":[{"name":"Alibaba Group, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tiejian","family":"Luo","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Han","family":"Li","sequence":"additional","affiliation":[{"name":"Alibaba Group, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jian","family":"Xu","sequence":"additional","affiliation":[{"name":"Alibaba Group, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kun","family":"Gai","sequence":"additional","affiliation":[{"name":"Alibaba Group, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2020,10,19]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Media exposure through the funnel: A model of multi-stage attribution. Available at SSRN 2158421","author":"Abhishek Vibhanshu","year":"2012","unstructured":"Vibhanshu Abhishek , Peter Fader , and Kartik Hosanagar . 2012. 
Media exposure through the funnel: A model of multi-stage attribution. Available at SSRN 2158421 ( 2012 ). Vibhanshu Abhishek, Peter Fader, and Kartik Hosanagar. 2012. Media exposure through the funnel: A model of multi-stage attribution. Available at SSRN 2158421 (2012)."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-12637-1_47"},{"key":"e_1_3_2_2_3_1","volume-title":"Proceedings of the National Conference on Artificial Intelligence. Citeseer, 1168--1175","author":"Boutilier Craig","year":"1996","unstructured":"Craig Boutilier and David Poole . 1996 . Computing optimal policies for partially observable decision processes using compact representations . In Proceedings of the National Conference on Artificial Intelligence. Citeseer, 1168--1175 . Craig Boutilier and David Poole. 1996. Computing optimal policies for partially observable decision processes using compact representations. In Proceedings of the National Conference on Artificial Intelligence. Citeseer, 1168--1175."},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018661.3018702"},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3220122"},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10479-005-5724-z"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3178876.3186165"},{"key":"e_1_3_2_2_8_1","volume-title":"Towards a digital attribution model: Measuring the impact of display advertising on online consumer behavior. Available at SSRN 2672090","author":"Ghose Anindya","year":"2015","unstructured":"Anindya Ghose and Vilma Todri . 2015. Towards a digital attribution model: Measuring the impact of display advertising on online consumer behavior. Available at SSRN 2672090 ( 2015 ). Anindya Ghose and Vilma Todri. 2015. Towards a digital attribution model: Measuring the impact of display advertising on online consumer behavior. 
Available at SSRN 2672090 (2015)."},{"key":"e_1_3_2_2_9_1","volume-title":"Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application. arXiv preprint arXiv:1803.00710","author":"Hu Yujing","year":"2018","unstructured":"Yujing Hu , Qing Da , Anxiang Zeng , Yang Yu , and Yinghui Xu. 2018. Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application. arXiv preprint arXiv:1803.00710 ( 2018 ). Yujing Hu, Qing Da, Anxiang Zeng, Yang Yu, and Yinghui Xu. 2018. Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application. arXiv preprint arXiv:1803.00710 (2018)."},{"key":"e_1_3_2_2_10_1","volume-title":"Reinforcement learning for slate-based recommender systems: A tractable decomposition and practical methodology. arXiv preprint arXiv:1905.12767","author":"Ie Eugene","year":"2019","unstructured":"Eugene Ie , Vihan Jain , Jing Wang , Sanmit Navrekar , Ritesh Agarwal , Rui Wu , Heng-Tze Cheng , Morgane Lustman , Vince Gatto , Paul Covington , et al . 2019 . Reinforcement learning for slate-based recommender systems: A tractable decomposition and practical methodology. arXiv preprint arXiv:1905.12767 (2019). Eugene Ie, Vihan Jain, Jing Wang, Sanmit Navrekar, Ritesh Agarwal, Rui Wu, Heng-Tze Cheng, Morgane Lustman, Vince Gatto, Paul Covington, et al. 2019. Reinforcement learning for slate-based recommender systems: A tractable decomposition and practical methodology. arXiv preprint arXiv:1905.12767 (2019)."},{"key":"e_1_3_2_2_11_1","first-page":"1","article-title":"Bidding on the buying funnel for sponsored search and keyword advertising","volume":"12","author":"Jansen Bernard J","year":"2011","unstructured":"Bernard J Jansen and Simone Schuster . 2011 . Bidding on the buying funnel for sponsored search and keyword advertising . Journal of Electronic Commerce Research , Vol. 12 , 1 (2011), 1 . Bernard J Jansen and Simone Schuster. 2011. Bidding on the buying funnel for sponsored search and keyword advertising. Journal of Electronic Commerce Research, Vol. 
12, 1 (2011), 1.","journal-title":"Journal of Electronic Commerce Research"},{"key":"e_1_3_2_2_12_1","unstructured":"Wendi Ji and Xiaoling Wang. 2017. Additional Multi-Touch Attribution for Online Advertising. In AAAI. 1360--1366.  Wendi Ji and Xiaoling Wang. 2017. Additional Multi-Touch Attribution for Online Advertising. In AAAI. 1360--1366."},{"key":"e_1_3_2_2_13_1","volume-title":"Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising. arXiv preprint arXiv:1802.09756","author":"Jin Junqi","year":"2018","unstructured":"Junqi Jin , Chengru Song , Han Li , Kun Gai , Jun Wang , and Weinan Zhang . 2018. Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising. arXiv preprint arXiv:1802.09756 ( 2018 ). Junqi Jin, Chengru Song, Han Li, Kun Gai, Jun Wang, and Weinan Zhang. 2018. Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising. arXiv preprint arXiv:1802.09756 (2018)."},{"key":"e_1_3_2_2_14_1","volume-title":"Qmdp-net: Deep learning for planning under partial observability. In Advances in Neural Information Processing Systems. 4694--4704.","author":"Karkus Peter","year":"2017","unstructured":"Peter Karkus , David Hsu , and Wee Sun Lee . 2017 . Qmdp-net: Deep learning for planning under partial observability. In Advances in Neural Information Processing Systems. 4694--4704. Peter Karkus, David Hsu, and Wee Sun Lee. 2017. Qmdp-net: Deep learning for planning under partial observability. In Advances in Neural Information Processing Systems. 4694--4704."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBOT.1996.506507"},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.5555\/3104322.3104415"},{"key":"e_1_3_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-1-55860-307-3.50031-9"},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CDC.2016.7799381"},{"volume-title":"Human-level control through deep reinforcement learning. 
Nature 518, no. 7540 (2015): 529","year":"2015","key":"e_1_3_2_2_20_1","unstructured":"Human-level control through deep reinforcement learning. Nature 518, no. 7540 (2015): 529 ( 2015 ). et al. Mnih, Volodymyr. 2015. Human-level control through deep reinforcement learning. Nature 518, no. 7540 (2015): 529 (2015)."},{"key":"e_1_3_2_2_21_1","volume-title":"A survey of POMDP solution techniques. environment","author":"Murphy Kevin P","year":"2000","unstructured":"Kevin P Murphy . 2000. A survey of POMDP solution techniques. environment , Vol. 2 ( 2000 ), X3. Kevin P Murphy. 2000. A survey of POMDP solution techniques. environment, Vol. 2 (2000), X3."},{"key":"e_1_3_2_2_22_1","volume-title":"It's time to bury the marketing funnel. URL: http:\/\/www. forrester. com\/rb\/Research\/time_to_bury_marketing_funnel\/q\/id\/57495","author":"Noble Steven","year":"2010","unstructured":"Steven Noble . 2010. It's time to bury the marketing funnel. URL: http:\/\/www. forrester. com\/rb\/Research\/time_to_bury_marketing_funnel\/q\/id\/57495 , Vol. 2 ( 2010 ). Steven Noble. 2010. It's time to bury the marketing funnel. URL: http:\/\/www. forrester. com\/rb\/Research\/time_to_bury_marketing_funnel\/q\/id\/57495, Vol. 2 (2010)."},{"key":"e_1_3_2_2_23_1","first-page":"1088","article-title":"Approximating optimal policies for partially observable stochastic domains","volume":"95","author":"Parr Ronald","year":"1995","unstructured":"Ronald Parr and Stuart Russell . 1995 . Approximating optimal policies for partially observable stochastic domains . In IJCAI , Vol. 95. 1088 -- 1094 . Ronald Parr and Stuart Russell. 1995. Approximating optimal policies for partially observable stochastic domains. In IJCAI, Vol. 95. 1088--1094.","journal-title":"IJCAI"},{"key":"e_1_3_2_2_24_1","unstructured":"Andres C Rodriguez Ronald Parr and Daphne Koller. 2000. Reinforcement learning using approximate belief states. In Advances in Neural Information Processing Systems. 1036--1042.  
Andres C Rodriguez Ronald Parr and Daphne Koller. 2000. Reinforcement learning using approximate belief states. In Advances in Neural Information Processing Systems. 1036--1042."},{"key":"e_1_3_2_2_25_1","unstructured":"Stephane Ross Brahim Chaib-draa and Joelle Pineau. 2008. Bayes-adaptive pomdps. In Advances in neural information processing systems. 1225--1232.  Stephane Ross Brahim Chaib-draa and Joelle Pineau. 2008. Bayes-adaptive pomdps. In Advances in neural information processing systems. 1225--1232."},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622503.1622504"},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020408.2020453"},{"key":"e_1_3_2_2_28_1","volume-title":"Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning. arXiv preprint arXiv:1805.10000","author":"Shi Jing-Cheng","year":"2018","unstructured":"Jing-Cheng Shi , Yang Yu , Qing Da , Shi-Yong Chen , and An-Xiang Zeng . 2018. Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning. arXiv preprint arXiv:1805.10000 ( 2018 ). Jing-Cheng Shi, Yang Yu, Qing Da, Shi-Yong Chen, and An-Xiang Zeng. 2018. Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning. arXiv preprint arXiv:1805.10000 (2018)."},{"key":"e_1_3_2_2_29_1","doi-asserted-by":"crossref","unstructured":"Martin Sundermeyer Ralf Schl\u00fcter and Hermann Ney. 2012. LSTM neural networks for language modeling. In Thirteenth annual conference of the international speech communication association.  Martin Sundermeyer Ralf Schl\u00fcter and Hermann Ney. 2012. LSTM neural networks for language modeling. 
In Thirteenth annual conference of the international speech communication association.","DOI":"10.21437\/Interspeech.2012-65"},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3269206.3271748"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2396828"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939759"},{"key":"e_1_3_2_2_33_1","volume-title":"Learning Tree-based Deep Model for Recommender Systems. arXiv preprint arXiv:1801.02294","author":"Zhu Han","year":"2018","unstructured":"Han Zhu , Xiang Li , Pengye Zhang , Guozheng Li , Jie He , Han Li , and Kun Gai . 2018b. Learning Tree-based Deep Model for Recommender Systems. arXiv preprint arXiv:1801.02294 ( 2018 ). Han Zhu, Xiang Li, Pengye Zhang, Guozheng Li, Jie He, Han Li, and Kun Gai. 2018b. Learning Tree-based Deep Model for Recommender Systems. arXiv preprint arXiv:1801.02294 (2018)."},{"key":"e_1_3_2_2_34_1","unstructured":"Pengfei Zhu Xin Li Pascal Poupart and Guanghui Miao. 2018a. On improving deep reinforcement learning for pomdps. arXiv preprint arXiv:1804.06309.  Pengfei Zhu Xin Li Pascal Poupart and Guanghui Miao. 2018a. On improving deep reinforcement learning for pomdps. 
arXiv preprint arXiv:1804.06309."}],"event":{"name":"CIKM '20: The 29th ACM International Conference on Information and Knowledge Management","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"],"location":"Virtual Event Ireland","acronym":"CIKM '20"},"container-title":["Proceedings of the 29th ACM International Conference on Information & Knowledge Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3340531.3412721","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3340531.3412721","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:02:55Z","timestamp":1750197775000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3340531.3412721"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,19]]},"references-count":33,"alternative-id":["10.1145\/3340531.3412721","10.1145\/3340531"],"URL":"https:\/\/doi.org\/10.1145\/3340531.3412721","relation":{},"subject":[],"published":{"date-parts":[[2020,10,19]]},"assertion":[{"value":"2020-10-19","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}