{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T19:08:13Z","timestamp":1772910493856,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":37,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,17]],"date-time":"2022-10-17T00:00:00Z","timestamp":1665964800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,17]]},"DOI":"10.1145\/3511808.3557064","type":"proceedings-article","created":{"date-parts":[[2022,10,16]],"date-time":"2022-10-16T01:29:57Z","timestamp":1665883797000},"page":"3604-3613","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["An Actor-critic Reinforcement Learning Model for Optimal Bidding in Online Display Advertising"],"prefix":"10.1145","author":[{"given":"Congde","family":"Yuan","sequence":"first","affiliation":[{"name":"Tencent, Shenzhen, China"}]},{"given":"Mengzhuo","family":"Guo","sequence":"additional","affiliation":[{"name":"Tencent, Shenzhen, China"}]},{"given":"Chaoneng","family":"Xiang","sequence":"additional","affiliation":[{"name":"Tencent, Shenzhen, China"}]},{"given":"Shuangyang","family":"Wang","sequence":"additional","affiliation":[{"name":"Tencent, Shenzhen, China"}]},{"given":"Guoqing","family":"Song","sequence":"additional","affiliation":[{"name":"Tencent, China, China"}]},{"given":"Qingpeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Tencent, Shenzhen, China"}]}],"member":"320","published-online":{"date-parts":[[2022,10,17]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Second Workshop on Sponsored Search Auctions. EC. 
Citeseer.","author":"Animesh Animesh","year":"2005","unstructured":"Animesh Animesh, Vandana Ramachandran, and Siva Viswanathan. 2005. Online advertisers bidding strategies for search, experience, and credence goods: An empirical investigation. In Second Workshop on Sponsored Search Auctions. EC. Citeseer."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3018661.3018702"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1287\/isre.2019.0902"},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/1566374.1566384"},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/1250910.1250917"},{"key":"e_1_3_2_2_6_1","volume-title":"International conference on machine learning. PMLR, 1587--1596","author":"Fujimoto Scott","year":"2018","unstructured":"Scott Fujimoto, Herke Hoof, and David Meger. 2018. Addressing function approximation error in actor-critic methods. In International conference on machine learning. 
PMLR, 1587--1596."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1526709.1526744"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467199"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505665"},{"key":"e_1_3_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1504\/IJEB.2008.018068"},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-85481-4_7"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622737.1622748"},{"key":"e_1_3_2_2_13_1","volume-title":"Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980","author":"Kingma Diederik P","year":"2014","unstructured":"Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)."},{"key":"e_1_3_2_2_14_1","volume-title":"Optimal bidding on keyword auctions. Electronic markets","author":"Kitts Brendan","year":"2004","unstructured":"Brendan Kitts and Benjamin Leblanc. 2004. Optimal bidding on keyword auctions. Electronic markets, Vol. 14, 3 (2004), 186--201."},{"key":"e_1_3_2_2_15_1","volume-title":"Actor-critic algorithms. Advances in neural information processing systems","author":"Konda Vijay","year":"1999","unstructured":"Vijay Konda and John Tsitsiklis. 1999. Actor-critic algorithms. Advances in neural information processing systems, Vol. 
12 (1999)."},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358027"},{"key":"e_1_3_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/3209978.3210104"},{"key":"e_1_3_2_2_18_1","volume-title":"Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602","author":"Mnih Volodymyr","year":"2013","unstructured":"Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013)."},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"crossref","unstructured":"Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. 2015. Human-level control through deep reinforcement learning. Nature, Vol. 518, 7540 (2015), 529--533.","DOI":"10.1038\/nature14236"},{"key":"e_1_3_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330870"},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2017.2775228"},{"key":"e_1_3_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1287\/mksc.2017.1083"},{"key":"e_1_3_2_2_23_1","volume-title":"International conference on machine learning. PMLR","author":"Schulman John","year":"2015","unstructured":"John Schulman, Sergey Levine, Pieter Abbeel, Michael Jordan, and Philipp Moritz. 2015. Trust region policy optimization. 
In International conference on machine learning. PMLR, 1889--1897."},{"key":"e_1_3_2_2_24_1","volume-title":"Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347","author":"Schulman John","year":"2017","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)."},{"key":"e_1_3_2_2_25_1","volume-title":"Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems. 1359--1367","author":"Tang Pingzhong","year":"2020","unstructured":"Pingzhong Tang, Xun Wang, Zihe Wang, Yadong Xu, and Xiwang Yang. 2020. Optimized Cost per Mille in Feeds Advertising. In Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems. 1359--1367."},{"key":"e_1_3_2_2_26_1","volume-title":"Convexification and global optimization in continuous and mixed-integer nonlinear programming: theory, algorithms, software, and applications","author":"Tawarmalani Mohit","unstructured":"Mohit Tawarmalani and Nikolaos V Sahinidis. 2013. 
Convexification and global optimization in continuous and mixed-integer nonlinear programming: theory, algorithms, software, and applications. Vol. 65. Springer Science & Business Media."},{"key":"e_1_3_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"e_1_3_2_2_28_1","volume-title":"Display advertising with real-time bidding (RTB) and behavioural targeting. arXiv preprint arXiv:1610.03013","author":"Wang Jun","year":"2016","unstructured":"Jun Wang, Weinan Zhang, and Shuai Yuan. 2016. Display advertising with real-time bidding (RTB) and behavioural targeting. arXiv preprint arXiv:1610.03013 (2016)."},{"key":"e_1_3_2_2_29_1","volume-title":"A deep probabilistic model for customer lifetime value prediction. arXiv preprint arXiv:1912.07753","author":"Wang Xiaojing","year":"2019","unstructured":"Xiaojing Wang, Tianqi Liu, and Jingang Miao. 2019. A deep probabilistic model for customer lifetime value prediction. 
arXiv preprint arXiv:1912.07753 (2019)."},{"key":"e_1_3_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3269206.3271748"},{"key":"e_1_3_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330681"},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2501040.2501980"},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/SOLI.2014.6960761"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623633"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219918"},{"key":"e_1_3_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219823"},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098134"}],"event":{"name":"CIKM '22: The 31st ACM International Conference on Information and Knowledge Management","location":"Atlanta GA USA","acronym":"CIKM '22","sponsor":["SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 31st ACM International Conference on Information & Knowledge 
Management"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557064","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3511808.3557064","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:30:55Z","timestamp":1750188655000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3511808.3557064"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,17]]},"references-count":37,"alternative-id":["10.1145\/3511808.3557064","10.1145\/3511808"],"URL":"https:\/\/doi.org\/10.1145\/3511808.3557064","relation":{},"subject":[],"published":{"date-parts":[[2022,10,17]]},"assertion":[{"value":"2022-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}