{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T16:56:56Z","timestamp":1771952216148,"version":"3.50.1"},"reference-count":34,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2025,6,11]],"date-time":"2025-06-11T00:00:00Z","timestamp":1749600000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Faced with the challenges posed by sophisticated cyber attacks and the dynamic characteristics of cyberspace, autonomous cyber defense (ACD) technology has proven effective. However, traditional decision-making methods for ACD cannot effectively characterize network topology and internode dependencies, which makes it difficult for defenders to identify key nodes and critical attack paths. Therefore, this paper proposes an enhanced decision-making method that combines graph embedding with reinforcement learning algorithms. By constructing a game model for cyber confrontations, this paper models the elements of the network topology that are important for decision-making, guiding the defender to dynamically optimize its strategy based on topology awareness. We improve reinforcement learning with the Node2vec algorithm to characterize network information for the defender: node attributes and network structural features are embedded into low-dimensional vectors instead of traditional one-hot encodings, which addresses the perceptual bottleneck in high-dimensional sparse environments. Meanwhile, the algorithm training environment Cyberwheel is extended with new fine-grained defense mechanisms to enhance the utility and portability of ACD. In experiments, our graph-embedding-based decision-making method is compared with traditional perception methods. 
The results verify the superior performance of our approach in defensive strategy selection. In addition, various parameters of the graph representation model Node2vec are analyzed and compared to determine their impact on embedding effectiveness for ACD decision-making.<\/jats:p>","DOI":"10.3390\/e27060622","type":"journal-article","created":{"date-parts":[[2025,6,11]],"date-time":"2025-06-11T11:18:40Z","timestamp":1749640720000},"page":"622","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["A Novel Framework for Enhancing Decision-Making in Autonomous Cyber Defense Through Graph Embedding"],"prefix":"10.3390","volume":"27","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-3948-672X","authenticated-orcid":false,"given":"Zhen","family":"Wang","sequence":"first","affiliation":[{"name":"College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China"},{"name":"Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3367-5509","authenticated-orcid":false,"given":"Yongjie","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China"},{"name":"Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9997-3747","authenticated-orcid":false,"given":"Xinli","family":"Xiong","sequence":"additional","affiliation":[{"name":"College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China"},{"name":"Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-1866-6067","authenticated-orcid":false,"given":"Qiankun","family":"Ren","sequence":"additional","affiliation":[{"name":"College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China"},{"name":"Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, China"}]},{"given":"Jun","family":"Huang","sequence":"additional","affiliation":[{"name":"College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China"},{"name":"Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,6,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1007\/s11390-019-1906-z","article-title":"A Survey on the Moving Target Defense Strategies: An Architectural Perspective","volume":"34","author":"Zheng","year":"2019","journal-title":"J. Comput. Sci. Technol."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kott, A., and Linkov, I. (2018). Cyber Resilience of Systems and Networks, Springer Publishing Company, Incorporated. [1st ed.].","DOI":"10.1007\/978-3-319-77492-3"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1109\/MSP.2018.1870866","article-title":"Cyber Deception: Overview and the Road Ahead","volume":"16","author":"Wang","year":"2018","journal-title":"IEEE Secur. Priv."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Kott, A. (2023). Autonomous Intelligent Cyber-defense Agent: Introduction and Overview. Autonomous Intelligent Cyber Defense Agent (AICA): A Comprehensive Guide, Springer International Publishing.","DOI":"10.1007\/978-3-031-29269-9"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Sommer, R., and Paxson, V. (2010, January 16\u201319). 
Outside the Closed World: On Using Machine Learning for Network Intrusion Detection. Proceedings of the 2010 IEEE Symposium on Security and Privacy, Oakland, CA, USA.","DOI":"10.1109\/SP.2010.25"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1016\/j.cose.2011.12.012","article-title":"Toward developing a systematic approach to generate benchmark datasets for intrusion detection","volume":"31","author":"Shiravi","year":"2012","journal-title":"Comput. Secur."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3779","DOI":"10.1109\/TNNLS.2021.3121870","article-title":"Deep Reinforcement Learning for Cyber Security","volume":"34","author":"Nguyen","year":"2023","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_8","unstructured":"Sengupta, S., and Kambhampati, S. (2020). Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"21954","DOI":"10.1109\/ACCESS.2017.2762418","article-title":"A Deep Learning Approach for Intrusion Detection Using Recurrent Neural Networks","volume":"5","author":"Yin","year":"2017","journal-title":"IEEE Access"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1007\/s10207-004-0060-x","article-title":"Game strategies in network security","volume":"4","author":"Lye","year":"2005","journal-title":"Int. J. Inf. Secur."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Tambe, M. (2011). Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned, Cambridge University Press. [1st ed.].","DOI":"10.1017\/CBO9780511973031"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Khouzani, M., Panaousis, E., and Theodorakopoulos, G. (2015, January 4\u20135). Approximate Solutions for Attack Graph Games with Imperfect Information. 
Proceedings of the Decision and Game Theory for Security, London, UK.","DOI":"10.1007\/978-3-319-25594-1"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Yao, Q., Wang, Y., Xiong, X., Wang, P., and Li, Y. (2023). Adversarial Decision-Making for Moving Target Defense: A Multi-Agent Markov Game and Reinforcement Learning Approach. Entropy, 25.","DOI":"10.3390\/e25040605"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"103871","DOI":"10.1016\/j.cose.2024.103871","article-title":"A method of network attack-defense game and collaborative defense decision-making based on hierarchical multi-agent reinforcement learning","volume":"142","author":"Tang","year":"2024","journal-title":"Comput. Secur."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"100145","DOI":"10.1016\/j.hcc.2023.100145","article-title":"Research on active defense decision-making method for cloud boundary networks based on reinforcement learning of intelligent agent","volume":"4","author":"Wang","year":"2024","journal-title":"High-Confid. Comput."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Nyberg, J., and Johnson, P. (2023, January 27). Learning automated defense strategies using graph-based cyber attack simulations. Proceedings of the 2023 Workshop on Security Operation Center Operations and Construction, San Diego, CA, USA.","DOI":"10.14722\/wosoc.2023.23006"},{"key":"ref_17","unstructured":"Dutta, A., Chatterjee, S., Bhattacharya, A., and Halappanavar, M.M. (2023). Deep Reinforcement Learning for Cyber System Defense under Dynamic Adversarial Uncertainties. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Oh, S.H., Jeong, M.K., Kim, H.C., and Park, J. (2023). Applying Reinforcement Learning for Enhanced Cybersecurity against Adversarial Simulation. Sensors, 23.","DOI":"10.3390\/s23063000"},{"key":"ref_19","unstructured":"Kiely, M., Bowman, D., Standen, M., and Moir, C. (2023). 
On Autonomous Agents in a Cyber Defence Environment. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1109\/MCOM.009.2100389","article-title":"ATMoS+: Generalizable Threat Mitigation in SDN Using Permutation Equivariant and Invariant Deep Reinforcement Learning","volume":"59","author":"Tsang","year":"2021","journal-title":"IEEE Commun. Mag."},{"key":"ref_21","unstructured":"DARPA (2025, June 08). Cyber Grand Challenge. Available online: https:\/\/www.darpa.mil\/research\/programs\/cyber-grand-challenge."},{"key":"ref_22","unstructured":"Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., and Zaremba, W. (2016). OpenAI Gym. arXiv."},{"key":"ref_23","first-page":"15032","article-title":"Pettingzoo: Gym for multi-agent reinforcement learning","volume":"34","author":"Terry","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_24","unstructured":"Katsikas, S., Abie, H., Ranise, S., Verderame, L., Cambiaso, E., Ugarelli, R., Pra\u00e7a, I., Li, W., Meng, W., and Furnell, S. (2023, January 25\u201329). NASimEmu: Network Attack Simulator & Emulator for Training Agents Generalizing to Novel Scenarios. Proceedings of the Computer Security. ESORICS 2023 International Workshops, Hague, The Netherlands."},{"key":"ref_25","unstructured":"Microsoft Defender Research Team (2025, June 08). CyberBattleSim. Available online: https:\/\/github.com\/microsoft\/cyberbattlesim."},{"key":"ref_26","unstructured":"Standen, M., Lucas, M., Bowman, D., Richer, T.J., Kim, J., and Marriott, D.A. (2021). CybORG: A Gym for the Development of Autonomous Cyber Agents. arXiv."},{"key":"ref_27","unstructured":"Li, L., Fayad, R., and Taylor, A. (2021). CyGIL: A Cyber Gym for Training Autonomous Agents over Emulated Network Systems. 
arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"104140","DOI":"10.1016\/j.cose.2024.104140","article-title":"PenGym: Realistic training environment for reinforcement learning pentesting agents","volume":"148","author":"Nguyen","year":"2025","journal-title":"Comput. Secur."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Oesch, S., Chaulagain, A., Weber, B., Dixson, M., Sadovnik, A., Roberson, B., Watson, C., and Austria, P. (2024, January 13). Towards a High Fidelity Training Environment for Autonomous Cyber Defense Agents. Proceedings of the 17th Cyber Security Experimentation and Test Workshop, New York, NY, USA. CSET \u201924.","DOI":"10.1145\/3675741.3675752"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Oesch, S., Austria, P., Chaulagain, A., Weber, B., Watson, C., Dixson, M., and Sadovnik, A. (2024). The Path To Autonomous Cyber Defense. arXiv.","DOI":"10.1109\/MSEC.2024.3427640"},{"key":"ref_31","unstructured":"Vyas, S., Hannay, J., Bolton, A., and Burnap, P.P. (2023). Automated Cyber Defence: A Review. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hammar, K., and Stadler, R. (2020, January 2\u20136). Finding Effective Security Strategies through Reinforcement Learning and Self-Play. Proceedings of the 2020 16th International Conference on Network and Service Management (CNSM), Izmir, Turkey.","DOI":"10.23919\/CNSM50824.2020.9269092"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Grover, A., and Leskovec, J. (2016, January 13\u201317). node2vec: Scalable Feature Learning for Networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA. KDD \u201916.","DOI":"10.1145\/2939672.2939754"},{"key":"ref_34","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. 
arXiv."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/6\/622\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:50:19Z","timestamp":1760032219000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/27\/6\/622"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,11]]},"references-count":34,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2025,6]]}},"alternative-id":["e27060622"],"URL":"https:\/\/doi.org\/10.3390\/e27060622","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,11]]}}}