{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T15:51:37Z","timestamp":1776181897134,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":66,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,12,5]],"date-time":"2022-12-05T00:00:00Z","timestamp":1670198400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Foundation Fund of China","award":["Y2009-1B-02-353"],"award-info":[{"award-number":["Y2009-1B-02-353"]}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2021YFC2800501"],"award-info":[{"award-number":["2021YFC2800501"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Research Foundation, Singapore and DSO National Laboratories under the AI Singapore Programme","award":["AISG2-RP-2020-017"],"award-info":[{"award-number":["AISG2-RP-2020-017"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,12,5]]},"DOI":"10.1145\/3564625.3564636","type":"proceedings-article","created":{"date-parts":[[2022,12,3]],"date-time":"2022-12-03T01:01:29Z","timestamp":1670029289000},"page":"186-200","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["Curiosity-Driven and Victim-Aware Adversarial Policies"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6178-4118","authenticated-orcid":false,"given":"Chen","family":"Gong","sequence":"first","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, China and School of Artificial Intelligence, University of Chinese Academy of Sciences, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5938-1918","authenticated-orcid":false,"given":"Zhou","family":"Yang","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5670-7230","authenticated-orcid":false,"given":"Yunpeng","family":"Bai","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0799-5018","authenticated-orcid":false,"given":"Jieke","family":"Shi","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3594-3848","authenticated-orcid":false,"given":"Arunesh","family":"Sinha","sequence":"additional","affiliation":[{"name":"Rutgers University, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1006-8493","authenticated-orcid":false,"given":"Bowen","family":"Xu","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4367-7201","authenticated-orcid":false,"given":"David","family":"Lo","sequence":"additional","affiliation":[{"name":"Singapore Management University, Singapore"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8468-001X","authenticated-orcid":false,"given":"Xinwen","family":"Hou","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2724-2432","authenticated-orcid":false,"given":"Guoliang","family":"Fan","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,12,5]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Marcin Andrychowicz Anton Raichuk Piotr Sta\u0144czyk Manu Orsini Sertan Girgin Rapha\u00ebl Marinier Leonard Hussenot Matthieu Geist Olivier Pietquin Marcin Michalski Sylvain Gelly and Olivier Bachem. 2021. What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study. In ICLR. Marcin Andrychowicz Anton Raichuk Piotr Sta\u0144czyk Manu Orsini Sertan Girgin Rapha\u00ebl Marinier Leonard Hussenot Matthieu Geist Olivier Pietquin Marcin Michalski Sylvain Gelly and Olivier Bachem. 2021. What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study. In ICLR."},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3468264.3473124"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSME52107.2021.00079"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2021.3136169"},{"key":"e_1_3_2_1_5_1","unstructured":"Trapit Bansal Jakub Pachocki Szymon Sidor Ilya Sutskever and Igor Mordatch. 2018. Emergent Complexity via Multi-Agent Competition. In ICLR. Trapit Bansal Jakub Pachocki Szymon Sidor Ilya Sutskever and Igor Mordatch. 2018. Emergent Complexity via Multi-Agent Competition. In ICLR."},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-62416-7_19"},{"key":"e_1_3_2_1_7_1","unstructured":"Yuri Burda Harri Edwards Deepak Pathak 2018. Large-scale study of curiosity-driven learning. arXiv preprint arXiv:1808.04355(2018). Yuri Burda Harri Edwards Deepak Pathak 2018. Large-scale study of curiosity-driven learning. 
arXiv preprint arXiv:1808.04355(2018)."},{"key":"e_1_3_2_1_8_1","unstructured":"Yuri Burda Harrison Edwards Amos Storkey 2018. Exploration by random network distillation. arXiv preprint arXiv:1810.12894(2018). Yuri Burda Harrison Edwards Amos Storkey 2018. Exploration by random network distillation. arXiv preprint arXiv:1810.12894(2018)."},{"key":"e_1_3_2_1_9_1","volume-title":"Towards Evaluating the Robustness of Neural Networks. In 2017 IEEE Symposium on Security and Privacy (SP). 39\u201357","author":"Carlini Nicholas","year":"2017","unstructured":"Nicholas Carlini and David Wagner . 2017 . Towards Evaluating the Robustness of Neural Networks. In 2017 IEEE Symposium on Security and Privacy (SP). 39\u201357 . Nicholas Carlini and David Wagner. 2017. Towards Evaluating the Robustness of Neural Networks. In 2017 IEEE Symposium on Security and Privacy (SP). 39\u201357."},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/3433210.3453090"},{"key":"e_1_3_2_1_11_1","volume-title":"Magnetic control of tokamak plasmas through deep reinforcement learning. Nature 602, 7897","author":"Degrave Jonas","year":"2022","unstructured":"Jonas Degrave , Federico Felici , Jonas Buchli , Michael Neunert , Brendan Tracey , Francesco Carpanese , Timo Ewalds , Roland Hafner , Abbas Abdolmaleki , Diego de Las\u00a0Casas , 2022. Magnetic control of tokamak plasmas through deep reinforcement learning. Nature 602, 7897 ( 2022 ), 414\u2013419. Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de Las\u00a0Casas, 2022. Magnetic control of tokamak plasmas through deep reinforcement learning. Nature 602, 7897 (2022), 414\u2013419."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.5220\/0006197105590566"},{"key":"e_1_3_2_1_13_1","volume-title":"Driverless Car: Autonomous Driving Using Deep Reinforcement Learning in Urban Environment. 
In 15th International Conference on Ubiquitous Robots (UR). 896\u2013901","author":"Fayjie R.","year":"2018","unstructured":"Abdur\u00a0 R. Fayjie , Sabir Hossain , Doukhi Oualid , and Deok-Jin Lee . 2018 . Driverless Car: Autonomous Driving Using Deep Reinforcement Learning in Urban Environment. In 15th International Conference on Ubiquitous Robots (UR). 896\u2013901 . Abdur\u00a0R. Fayjie, Sabir Hossain, Doukhi Oualid, and Deok-Jin Lee. 2018. Driverless Car: Autonomous Driving Using Deep Reinforcement Learning in Urban Environment. In 15th International Conference on Ubiquitous Robots (UR). 896\u2013901."},{"key":"e_1_3_2_1_14_1","unstructured":"Lior Fox Leshem Choshen and Yonatan Loewenstein. 2018. DORA The Explorer: Directed Outreaching Reinforcement Action-Selection. In ICLR. Lior Fox Leshem Choshen and Yonatan Loewenstein. 2018. DORA The Explorer: Directed Outreaching Reinforcement Action-Selection. In ICLR."},{"key":"e_1_3_2_1_15_1","volume-title":"Adversarial Policies: Attacking Deep Reinforcement Learning. In ICLR.","author":"Gleave Adam","year":"2020","unstructured":"Adam Gleave , Michael Dennis , Cody Wild , 2020 . Adversarial Policies: Attacking Deep Reinforcement Learning. In ICLR. Adam Gleave, Michael Dennis, Cody Wild, 2020. Adversarial Policies: Attacking Deep Reinforcement Learning. In ICLR."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"crossref","unstructured":"Chen Gong Yunpeng Bai Xinwen Hou 2020. Stable Training of Bellman Error in Reinforcement Learning. In Neural Information Processing. 439\u2013448. Chen Gong Yunpeng Bai Xinwen Hou 2020. Stable Training of Bellman Error in Reinforcement Learning. In Neural Information Processing. 439\u2013448.","DOI":"10.1007\/978-3-030-63823-8_51"},{"key":"e_1_3_2_1_17_1","volume-title":"Wide-Sense Stationary Policy Optimization with Bellman Residual on Video Games. In 2021 IEEE International Conference on Multimedia and Expo (ICME). 
1\u20136.","author":"Gong Chen","year":"2021","unstructured":"Chen Gong , Qiang He , Yunpeng Bai , 2021 . Wide-Sense Stationary Policy Optimization with Bellman Residual on Video Games. In 2021 IEEE International Conference on Multimedia and Expo (ICME). 1\u20136. Chen Gong, Qiang He, Yunpeng Bai, 2021. Wide-Sense Stationary Policy Optimization with Bellman Residual on Video Games. In 2021 IEEE International Conference on Multimedia and Expo (ICME). 1\u20136."},{"key":"e_1_3_2_1_18_1","volume-title":"ICLR","author":"Goodfellow Ian","unstructured":"Ian Goodfellow , Jonathon Shlens , and Christian Szegedy . 2015. Explaining and Harnessing Adversarial Examples . In ICLR . http:\/\/arxiv.org\/abs\/1412.6572 Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and Harnessing Adversarial Examples. In ICLR. http:\/\/arxiv.org\/abs\/1412.6572"},{"key":"e_1_3_2_1_19_1","volume-title":"Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates","author":"Gu Shixiang","unstructured":"Shixiang Gu , Ethan Holly , Timothy Lillicrap , 2017. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates . In ICRA. IEEE , 3389\u20133396. Shixiang Gu, Ethan Holly, Timothy Lillicrap, 2017. Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In ICRA. IEEE, 3389\u20133396."},{"key":"e_1_3_2_1_20_1","volume-title":"Proceedings of the 38th International Conference on Machine Learning, Vol.\u00a0139","author":"Guo Wenbo","year":"2021","unstructured":"Wenbo Guo , Xian Wu , Sui Huang , and Xinyu Xing . 2021 . Adversarial Policy Learning in Two-player Competitive Games . In Proceedings of the 38th International Conference on Machine Learning, Vol.\u00a0139 . PMLR, 3910\u20133919. Wenbo Guo, Xian Wu, Sui Huang, and Xinyu Xing. 2021. Adversarial Policy Learning in Two-player Competitive Games. 
In Proceedings of the 38th International Conference on Machine Learning, Vol.\u00a0139. PMLR, 3910\u20133919."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.23919\/CNSM50824.2020.9269092"},{"key":"e_1_3_2_1_22_1","unstructured":"Kim Hammar and Rolf Stadler. 2022. Learning Security Strategies through Game Play and Optimal Stopping. CoRR abs\/2205.14694(2022). arXiv:2205.14694 Kim Hammar and Rolf Stadler. 2022. Learning Security Strategies through Game Play and Optimal Stopping. CoRR abs\/2205.14694(2022). arXiv:2205.14694"},{"key":"e_1_3_2_1_23_1","volume-title":"Mepg: A minimalist ensemble policy gradient framework for deep reinforcement learning. arXiv preprint arXiv:2109.10552(2021).","author":"He Qiang","year":"2021","unstructured":"Qiang He , Chen Gong , Yuxun Qu , Xiaoyu Chen , Xinwen Hou , and Yu Liu . 2021 . Mepg: A minimalist ensemble policy gradient framework for deep reinforcement learning. arXiv preprint arXiv:2109.10552(2021). Qiang He, Chen Gong, Yuxun Qu, Xiaoyu Chen, Xinwen Hou, and Yu Liu. 2021. Mepg: A minimalist ensemble policy gradient framework for deep reinforcement learning. arXiv preprint arXiv:2109.10552(2021)."},{"key":"e_1_3_2_1_24_1","volume-title":"Rainbow: Combining Improvements in Deep Reinforcement Learning. In AAAI. 3215\u20133222.","author":"Hessel Matteo","year":"2018","unstructured":"Matteo Hessel , Joseph Modayil , Hado van Hasselt , Tom Schaul , Georg Ostrovski , Will Dabney , 2018 . Rainbow: Combining Improvements in Deep Reinforcement Learning. In AAAI. 3215\u20133222. Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, 2018. Rainbow: Combining Improvements in Deep Reinforcement Learning. In AAAI. 3215\u20133222."},{"key":"e_1_3_2_1_25_1","volume-title":"Adversarial Attacks on Neural Network Policies. In 5th International Conference on Learning Representations, ICLR 2017, Workshop Track Proceedings.","author":"Huang H.","year":"2017","unstructured":"Sandy\u00a0 H. 
Huang , Nicolas Papernot , Ian\u00a0 J. Goodfellow , Yan Duan , and Pieter Abbeel . 2017 . Adversarial Attacks on Neural Network Policies. In 5th International Conference on Learning Representations, ICLR 2017, Workshop Track Proceedings. Sandy\u00a0H. Huang, Nicolas Papernot, Ian\u00a0J. Goodfellow, Yan Duan, and Pieter Abbeel. 2017. Adversarial Attacks on Neural Network Policies. In 5th International Conference on Learning Representations, ICLR 2017, Workshop Track Proceedings."},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/DAC18072.2020.9218663"},{"key":"e_1_3_2_1_27_1","volume-title":"Deep Reinforcement Learning for Autonomous Driving: A Survey. abs\/2002.00444","author":"Kiran Bangalore\u00a0Ravi","year":"2020","unstructured":"Bangalore\u00a0Ravi Kiran , Ibrahim Sobh , Victor Talpaert , Patrick Mannion , Ahmad A.\u00a0 Al Sallab , Senthil\u00a0Kumar Yogamani , and Patrick P\u00e9rez . 2020. Deep Reinforcement Learning for Autonomous Driving: A Survey. abs\/2002.00444 ( 2020 ). Bangalore\u00a0Ravi Kiran, Ibrahim Sobh, Victor Talpaert, Patrick Mannion, Ahmad A.\u00a0Al Sallab, Senthil\u00a0Kumar Yogamani, and Patrick P\u00e9rez. 2020. Deep Reinforcement Learning for Autonomous Driving: A Survey. abs\/2002.00444 (2020)."},{"key":"e_1_3_2_1_28_1","volume-title":"5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Workshop Track Proceedings.","author":"Kos Jernej","year":"2017","unstructured":"Jernej Kos and Dawn Song . 2017 . Delving into adversarial attacks on deep policies . In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Workshop Track Proceedings. Jernej Kos and Dawn Song. 2017. Delving into adversarial attacks on deep policies. 
In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Workshop Track Proceedings."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Yenchen Lin Zhangwei Hong Yuanhong Liao 2017. Tactics of Adversarial Attack on Deep Reinforcement Learning Agents. In IJCAI. 3756\u20133762. Yenchen Lin Zhangwei Hong Yuanhong Liao 2017. Tactics of Adversarial Attack on Deep Reinforcement Learning Agents. In IJCAI. 3756\u20133762.","DOI":"10.24963\/ijcai.2017\/525"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5955"},{"key":"e_1_3_2_1_31_1","unstructured":"Aleksander Madry Aleksandar Makelov Ludwig Schmidt 2018. Towards Deep Learning Models Resistant to Adversarial Attacks. In ICLR. Aleksander Madry Aleksandar Makelov Ludwig Schmidt 2018. Towards Deep Learning Models Resistant to Adversarial Attacks. In ICLR."},{"key":"e_1_3_2_1_32_1","volume-title":"Human-level control through deep reinforcement learning. Nature 518, 7540","author":"Mnih Volodymyr","year":"2015","unstructured":"Volodymyr Mnih , Koray Kavukcuoglu , David Silver , 2015. Human-level control through deep reinforcement learning. Nature 518, 7540 ( 2015 ), 529\u2013533. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, 2015. Human-level control through deep reinforcement learning. Nature 518, 7540 (2015), 529\u2013533."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"OpenAI Ilge Akkaya Marcin Andrychowicz Maciek Chociej 2019. Solving Rubik\u2019s Cube with a Robot Hand. CoRR abs\/1910.07113(2019). arXiv:1910.07113 OpenAI Ilge Akkaya Marcin Andrychowicz Maciek Chociej 2019. Solving Rubik\u2019s Cube with a Robot Hand. CoRR abs\/1910.07113(2019). arXiv:1910.07113","DOI":"10.1149\/MA2019-02\/41\/1910"},{"key":"e_1_3_2_1_34_1","volume-title":"the International Conference on Autonomous Agents and MultiAgent Systems. 
368\u2013376","author":"Pan Xinlei","year":"2019","unstructured":"Xinlei Pan , Weiyao Wang , Xiaoshuai Zhang , 2019 . How You Act Tells a Lot: Privacy-Leaking Attack on Deep Reinforcement Learning . In the International Conference on Autonomous Agents and MultiAgent Systems. 368\u2013376 . Xinlei Pan, Weiyao Wang, Xiaoshuai Zhang, 2019. How You Act Tells a Lot: Privacy-Leaking Attack on Deep Reinforcement Learning. In the International Conference on Autonomous Agents and MultiAgent Systems. 368\u2013376."},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3052973.3053009"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2017.70"},{"key":"e_1_3_2_1_37_1","volume-title":"RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments. In ICLR.","author":"Raileanu Roberta","year":"2020","unstructured":"Roberta Raileanu and Tim Rockt\u00e4schel . 2020 . RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments. In ICLR. Roberta Raileanu and Tim Rockt\u00e4schel. 2020. RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments. In ICLR."},{"key":"e_1_3_2_1_38_1","unstructured":"Alessio Russo and Alexandre Prouti\u00e8re. 2019. Optimal Attacks on Reinforcement Learning Policies. CoRR abs\/1907.13548(2019). http:\/\/arxiv.org\/abs\/1907.13548 Alessio Russo and Alexandre Prouti\u00e8re. 2019. Optimal Attacks on Reinforcement Learning Policies. CoRR abs\/1907.13548(2019). http:\/\/arxiv.org\/abs\/1907.13548"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.2352\/ISSN.2470-1173.2017.19.AVM-023"},{"key":"e_1_3_2_1_40_1","unstructured":"John Schulman Filip Wolski Prafulla Dhariwal 2017. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347(2017). John Schulman Filip Wolski Prafulla Dhariwal 2017. Proximal policy optimization algorithms. 
arXiv preprint arXiv:1707.06347(2017)."},{"key":"e_1_3_2_1_41_1","volume-title":"A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 6419","author":"Silver David","year":"2018","unstructured":"David Silver , Thomas Hubert , Julian Schrittwieser , 2018. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 6419 ( 2018 ), 1140\u20131144. David Silver, Thomas Hubert, Julian Schrittwieser, 2018. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 6419 (2018), 1140\u20131144."},{"key":"e_1_3_2_1_42_1","volume-title":"CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning. arXiv preprint arXiv:2205.00943(2022).","author":"Sun Chenyu","year":"2022","unstructured":"Chenyu Sun , Hangwei Qian , and Chunyan Miao . 2022 . CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning. arXiv preprint arXiv:2205.00943(2022). Chenyu Sun, Hangwei Qian, and Chunyan Miao. 2022. CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning. arXiv preprint arXiv:2205.00943(2022)."},{"key":"e_1_3_2_1_43_1","unstructured":"Peng Sun Xinghai Sun Lei Han Jiechao Xiong Qing Wang Bo Li Yang Zheng Ji Liu Yongsheng Liu Han Liu and Tong Zhang. 2018. TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game. arxiv:1809.07193\u00a0[cs.AI] Peng Sun Xinghai Sun Lei Han Jiechao Xiong Qing Wang Bo Li Yang Zheng Ji Liu Yongsheng Liu Han Liu and Tong Zhang. 2018. TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game. arxiv:1809.07193\u00a0[cs.AI]"},{"key":"e_1_3_2_1_44_1","volume-title":"Reinforcement learning: An introduction","author":"Sutton S.","unstructured":"Richard\u00a0 S. Sutton and Andrew\u00a0 G. Barto . 2018. Reinforcement learning: An introduction . 
MIT press . Richard\u00a0S. Sutton and Andrew\u00a0G. Barto. 2018. Reinforcement learning: An introduction. MIT press."},{"key":"e_1_3_2_1_45_1","unstructured":"Haoran Tang Rein Houthooft Davis Foote Adam Stooke OpenAI Xi\u00a0Chen Yan Duan John Schulman Filip DeTurck and Pieter Abbeel. 2017. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning. In Advances in Neural Information Processing Systems Vol.\u00a030. Haoran Tang Rein Houthooft Davis Foote Adam Stooke OpenAI Xi\u00a0Chen Yan Duan John Schulman Filip DeTurck and Pieter Abbeel. 2017. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning. In Advances in Neural Information Processing Systems Vol.\u00a030."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/203330.203343"},{"key":"e_1_3_2_1_47_1","volume-title":"Visualizing data using t-SNE.Journal of machine learning research 9, 11","author":"Maaten Laurens Van\u00a0der","year":"2008","unstructured":"Laurens Van\u00a0der Maaten and Geoffrey Hinton . 2008. Visualizing data using t-SNE.Journal of machine learning research 9, 11 ( 2008 ). Laurens Van\u00a0der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE.Journal of machine learning research 9, 11 (2008)."},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2021\/509"},{"key":"e_1_3_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2021.3114024"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"crossref","unstructured":"Yufei Wang Zheyuan\u00a0Ryan Shi Lantao Yu 2019. Deep reinforcement learning for green security games with real-time information. In the Association for the Advancement of Artificial Intelligence AAAI Vol.\u00a033. 1401\u20131408. Yufei Wang Zheyuan\u00a0Ryan Shi Lantao Yu 2019. Deep reinforcement learning for green security games with real-time information. In the Association for the Advancement of Artificial Intelligence AAAI Vol.\u00a033. 
1401\u20131408.","DOI":"10.1609\/aaai.v33i01.33011401"},{"key":"e_1_3_2_1_51_1","volume-title":"30th USENIX Security Symposium (USENIX Security 21)","author":"Wu Xian","year":"2021","unstructured":"Xian Wu , Wenbo Guo , Hua Wei , and Xinyu Xing . 2021 . Adversarial Policy Training against Deep Reinforcement Learning . In 30th USENIX Security Symposium (USENIX Security 21) . USENIX Association , 1883\u20131900. Xian Wu, Wenbo Guo, Hua Wei, and Xinyu Xing. 2021. Adversarial Policy Training against Deep Reinforcement Learning. In 30th USENIX Security Symposium (USENIX Security 21). USENIX Association, 1883\u20131900."},{"key":"e_1_3_2_1_52_1","unstructured":"Chaowei Xiao Xinlei Pan Warren He 2019. Characterizing attacks on deep reinforcement learning. arXiv preprint arXiv:1907.09470(2019). Chaowei Xiao Xinlei Pan Warren He 2019. Characterizing attacks on deep reinforcement learning. arXiv preprint arXiv:1907.09470(2019)."},{"key":"e_1_3_2_1_53_1","doi-asserted-by":"crossref","unstructured":"Zhiwei Xu Bin Zhang Yunpeng Bai 2021. Learning to Coordinate via Multiple Graph Neural Networks. In Neural Information Processing. 52\u201363. Zhiwei Xu Bin Zhang Yunpeng Bai 2021. Learning to Coordinate via Multiple Graph Neural Networks. In Neural Information Processing. 52\u201363.","DOI":"10.1007\/978-3-030-92238-2_5"},{"key":"e_1_3_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3468264.3473117"},{"key":"e_1_3_2_1_55_1","volume-title":"Design of intentional backdoors in sequential models. arXiv:1902.09972","author":"Yang Zhaoyuan","year":"2019","unstructured":"Zhaoyuan Yang , Naresh Iyer , Johan Reimann , and Nurali Virani . 2019. Design of intentional backdoors in sequential models. arXiv:1902.09972 ( 2019 ). Zhaoyuan Yang, Naresh Iyer, Johan Reimann, and Nurali Virani. 2019. Design of intentional backdoors in sequential models. 
arXiv:1902.09972 (2019)."},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSME52107.2021.00073"},{"key":"e_1_3_2_1_57_1","volume-title":"Revisiting Neuron Coverage Metrics and Quality of Deep Neural Networks. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE Computer Society.","author":"Yang Zhou","year":"2022","unstructured":"Zhou Yang , Jieke Shi , Muhammad\u00a0Hilmi Asyrofi , and David Lo . 2022 . Revisiting Neuron Coverage Metrics and Quality of Deep Neural Networks. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE Computer Society. Zhou Yang, Jieke Shi, Muhammad\u00a0Hilmi Asyrofi, and David Lo. 2022. Revisiting Neuron Coverage Metrics and Quality of Deep Neural Networks. In 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE Computer Society."},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3510003.3510146"},{"key":"e_1_3_2_1_59_1","doi-asserted-by":"crossref","unstructured":"Changkun Ye Huimin Ma Xiaoqin Zhang Kai Zhang and Shaodi You. 2017. Survival-Oriented Reinforcement Learning Model: An Effcient and Robust Deep Reinforcement Learning Algorithm for Autonomous Driving Problem Yao Zhao Xiangwei Kong and David Taubman (Eds.). 417\u2013429. Changkun Ye Huimin Ma Xiaoqin Zhang Kai Zhang and Shaodi You. 2017. Survival-Oriented Reinforcement Learning Model: An Effcient and Robust Deep Reinforcement Learning Algorithm for Autonomous Driving Problem Yao Zhao Xiangwei Kong and David Taubman (Eds.). 417\u2013429.","DOI":"10.1007\/978-3-319-71589-6_36"},{"key":"e_1_3_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.trc.2019.08.011"},{"key":"e_1_3_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i12.17297"},{"key":"e_1_3_2_1_62_1","unstructured":"Huan Zhang Hongge Chen Duane Boning 2021. 
Robust Reinforcement Learning on State Observations with Learned Optimal Adversary. Huan Zhang Hongge Chen Duane Boning 2021. Robust Reinforcement Learning on State Observations with Learned Optimal Adversary."},{"key":"e_1_3_2_1_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSE.2019.2962027"},{"key":"e_1_3_2_1_64_1","doi-asserted-by":"crossref","unstructured":"Shaohua Zhang Shuang Liu Jun Sun 2021. FIGCPS: Effective Failure-inducing Input Generation for Cyber-Physical Systems with Deep Reinforcement Learning. In 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE). 555\u2013567. Shaohua Zhang Shuang Liu Jun Sun 2021. FIGCPS: Effective Failure-inducing Input Generation for Cyber-Physical Systems with Deep Reinforcement Learning. In 36th IEEE\/ACM International Conference on Automated Software Engineering (ASE). 555\u2013567.","DOI":"10.1109\/ASE51524.2021.9678832"},{"key":"e_1_3_2_1_65_1","volume-title":"Blackbox Attacks on Reinforcement Learning Agents Using Approximated Temporal Information. In International Conference on Dependable Systems and Networks Workshops. 16\u201324","author":"Zhao Yiren","year":"2020","unstructured":"Yiren Zhao , Ilia Shumailov , and Han Cui . 2020 . Blackbox Attacks on Reinforcement Learning Agents Using Approximated Temporal Information. In International Conference on Dependable Systems and Networks Workshops. 16\u201324 . Yiren Zhao, Ilia Shumailov, and Han Cui. 2020. Blackbox Attacks on Reinforcement Learning Agents Using Approximated Temporal Information. In International Conference on Dependable Systems and Networks Workshops. 16\u201324."},{"key":"e_1_3_2_1_66_1","unstructured":"Lulu Zheng Jiarui Chen Jianhao Wang Jiamin He Yujing Hu Yingfeng Chen Changjie Fan Yang Gao and Chongjie Zhang. 2021. Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration. In Advances in Neural Information Processing Systems Vol.\u00a034. 3757\u20133769. 
Lulu Zheng Jiarui Chen Jianhao Wang Jiamin He Yujing Hu Yingfeng Chen Changjie Fan Yang Gao and Chongjie Zhang. 2021. Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration. In Advances in Neural Information Processing Systems Vol.\u00a034. 3757\u20133769."}],"event":{"name":"ACSAC: Annual Computer Security Applications Conference","location":"Austin TX USA","acronym":"ACSAC"},"container-title":["Proceedings of the 38th Annual Computer Security Applications Conference"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3564625.3564636","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3564625.3564636","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:11Z","timestamp":1750183751000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3564625.3564636"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,5]]},"references-count":66,"alternative-id":["10.1145\/3564625.3564636","10.1145\/3564625"],"URL":"https:\/\/doi.org\/10.1145\/3564625.3564636","relation":{},"subject":[],"published":{"date-parts":[[2022,12,5]]},"assertion":[{"value":"2022-12-05","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}