{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,19]],"date-time":"2026-02-19T04:42:45Z","timestamp":1771476165532,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,7,8]],"date-time":"2022-07-08T00:00:00Z","timestamp":1657238400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"DARPA Advanced Research Project Agency (DARPA) and Space and Naval Warfare Systems Center, Pacific (SSC Pacific)","award":["N66001-18-C-4036"],"award-info":[{"award-number":["N66001-18-C-4036"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,7,8]]},"DOI":"10.1145\/3512290.3528844","type":"proceedings-article","created":{"date-parts":[[2022,7,26]],"date-time":"2022-07-26T13:08:13Z","timestamp":1658840893000},"page":"1290-1298","source":"Crossref","is-referenced-by-count":3,"title":["Analyzing multi-agent reinforcement learning and coevolution in cybersecurity"],"prefix":"10.1145","author":[{"given":"Matthew J.","family":"Turner","sequence":"first","affiliation":[{"name":"MIT"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Erik","family":"Hemberg","sequence":"additional","affiliation":[{"name":"MIT"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Una-May","family":"O'Reilly","sequence":"additional","affiliation":[{"name":"MIT"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,7,8]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"[n. d.]. 2022 Frost & Sullivan White Paper on Breach and Attack Simulation. https:\/\/info.xmcyber.com\/frost1"},{"key":"e_1_3_2_1_2_1","unstructured":"[n. 
d.]. NVD - CPE. https:\/\/nvd.nist.gov\/products\/cpe"},{"key":"e_1_3_2_1_3_1","unstructured":"[n. d.]. NVD - Vulnerability Metrics. https:\/\/nvd.nist.gov\/vuln-metrics\/cvss"},{"key":"e_1_3_2_1_4_1","unstructured":"2018. A History of Ransomware Attacks: The Biggest and Worst Ransomware Attacks of All Time. https:\/\/digitalguardian.com\/blog\/history-ransomware-attacks-biggest-and-worst-ransomware-attacks-all-time"},{"key":"e_1_3_2_1_5_1","unstructured":"2021. https:\/\/cve.mitre.org\/"},{"key":"e_1_3_2_1_6_1","unstructured":"2022. CAPEC - Common Attack Pattern Enumeration and Classification (CAPEC\u2122). https:\/\/capec.mitre.org\/"},{"key":"e_1_3_2_1_7_1","unstructured":"Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. OpenAI Gym. arXiv:1606.01540"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.23919\/TMA.2018.8506545"},{"key":"e_1_3_2_1_9_1","volume-title":"Reinforcement Learning in Large Discrete Action Spaces. CoRR abs\/1512.07679","author":"Dulac-Arnold Gabriel","year":"2015","unstructured":"Gabriel Dulac-Arnold, Richard Evans, Peter Sunehag, and Ben Coppin. 2015. Reinforcement Learning in Large Discrete Action Spaces. CoRR abs\/1512.07679 (2015). 
arXiv:1512.07679 http:\/\/arxiv.org\/abs\/1512.07679"},{"key":"e_1_3_2_1_10_1","unstructured":"Erik Hemberg. 2022. donkey_ge. https:\/\/github.com\/flexgp\/donkey_ge"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2010.00533"},{"key":"e_1_3_2_1_13_1","volume-title":"Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research","author":"Jin Chi","year":"2020","unstructured":"Chi Jin, Tiancheng Jin, Haipeng Luo, Suvrit Sra, and Tiancheng Yu. 2020. Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition. In Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 119), Hal Daum\u00e9 III and Aarti Singh (Eds.). PMLR, 4860--4869. https:\/\/proceedings.mlr.press\/v119\/jin20c.html"},{"key":"e_1_3_2_1_14_1","unstructured":"Xiaofen Lu, Ke Tang, Stefan Menzel, and Xin Yao. 2019. Competitive Coevolution as an Adversarial Approach to Dynamic Optimization. 
arXiv:1907.13529 [cs.NE]"},{"key":"e_1_3_2_1_15_1","volume-title":"Wortman Vaughan (Eds.)","volume":"34","author":"Luo Haipeng","year":"2021","unstructured":"Haipeng Luo, Chen-Yu Wei, and Chung-Wei Lee. 2021. Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses. In Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (Eds.), Vol. 34. Curran Associates, Inc., 22931--22942. https:\/\/proceedings.neurips.cc\/paper\/2021\/file\/c1b8bf9e071c0dabb899e7a27f353762-Paper.pdf"},{"key":"e_1_3_2_1_16_1","volume-title":"Suspected Russian Cyberattack Began With Ubiquitous Software Company. Wall Street Journal (Dec","author":"McMillan Robert","year":"2020","unstructured":"Robert McMillan. 2020. Suspected Russian Cyberattack Began With Ubiquitous Software Company. Wall Street Journal (Dec. 2020). https:\/\/www.wsj.com\/articles\/suspected-russian-cyberattack-began-with-a-little-known-but-ubiquitous-software-company-11608036495"},{"key":"e_1_3_2_1_17_1","volume-title":"5th International Conference on Learning Representations, ICLR","author":"Metz Luke","year":"2017","unstructured":"Luke Metz, Ben Poole, David Pfau, and Jascha Sohl-Dickstein. 2017. Unrolled Generative Adversarial Networks. 
In 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24--26, 2017, Conference Track Proceedings. OpenReview.net. https:\/\/openreview.net\/forum?id=BydrOIcle"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3319535.3363217"},{"key":"e_1_3_2_1_19_1","volume-title":"Proceedings of The 33rd International Conference on Machine Learning (Proceedings of Machine Learning Research","volume":"1937","author":"Mnih Volodymyr","year":"2016","unstructured":"Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. Asynchronous Methods for Deep Reinforcement Learning. In Proceedings of The 33rd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 48), Maria Florina Balcan and Kilian Q. Weinberger (Eds.). PMLR, New York, New York, USA, 1928--1937. https:\/\/proceedings.mlr.press\/v48\/mniha16.html"},{"key":"e_1_3_2_1_20_1","volume-title":"Wortman Vaughan (Eds.)","volume":"34","author":"Oikarinen Tuomas","year":"2021","unstructured":"Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, and TsuiWei Weng. 
2021. Robust Deep Reinforcement Learning through Adversarial Loss. In Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (Eds.), Vol. 34. Curran Associates, Inc., 26156--26167. https:\/\/proceedings.neurips.cc\/paper\/2021\/file\/dbb422937d7ff56e049d61da730b3e11-Paper.pdf"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2013.137"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/4235.942529"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10710-020-09389-y"},{"key":"e_1_3_2_1_24_1","first-page":"1","article-title":"Stable-Baselines3: Reliable Reinforcement Learning Implementations","volume":"22","author":"Raffin Antonin","year":"2021","unstructured":"Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, and Noah Dormann. 2021. Stable-Baselines3: Reliable Reinforcement Learning Implementations. Journal of Machine Learning Research 22, 268 (2021), 1--8. 
http:\/\/jmlr.org\/papers\/v22\/20-1364.html","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.1703.03864"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2018.2824250"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3449639.3459351"},{"key":"e_1_3_2_1_28_1","volume-title":"Investigating the parameter space of evolutionary algorithms. BioData mining 11, 1","author":"Sipper Moshe","year":"2018","unstructured":"Moshe Sipper, Weixuan Fu, Karuna Ahuja, and Jason H Moore. 2018. Investigating the parameter space of evolutionary algorithms. BioData mining 11, 1 (2018), 1--14."},{"key":"e_1_3_2_1_30_1","volume-title":"Handbook of Reinforcement Learning and Control","author":"Zhang Kaiqing","unstructured":"Kaiqing Zhang, Zhuoran Yang, and Tamer Ba\u015far. 2021. Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms. In Handbook of Reinforcement Learning and Control. 
Springer International Publishing, Cham, 321--384."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3319619.3326851"}],"event":{"name":"GECCO '22: Genetic and Evolutionary Computation Conference","location":"Boston Massachusetts","acronym":"GECCO '22","sponsor":["SIGEVO ACM Special Interest Group on Genetic and Evolutionary Computation"]},"container-title":["Proceedings of the Genetic and Evolutionary Computation Conference"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3512290.3528844","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3512290.3528844","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:57Z","timestamp":1750183797000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3512290.3528844"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,8]]},"references-count":29,"alternative-id":["10.1145\/3512290.3528844","10.1145\/3512290"],"URL":"https:\/\/doi.org\/10.1145\/3512290.3528844","relation":{},"subject":[],"published":{"date-parts":[[2022,7,8]]}}}