{
  "status": "ok",
  "message-type": "work",
  "message-version": "1.0.0",
  "message": {
    "indexed": {"date-parts": [[2025, 4, 16]], "date-time": "2025-04-16T05:39:20Z", "timestamp": 1744781960041, "version": "3.37.3"},
    "reference-count": 32,
    "publisher": "Wiley",
    "license": [{"start": {"date-parts": [[2022, 1, 28]], "date-time": "2022-01-28T00:00:00Z", "timestamp": 1643328000000}, "content-version": "unspecified", "delay-in-days": 0, "URL": "https://creativecommons.org/licenses/by/4.0/"}],
    "funder": [
      {"DOI": "10.13039/501100001809", "name": "National Natural Science Foundation of China", "doi-asserted-by": "publisher", "award": ["62076251", "2017YFB0802800"], "award-info": [{"award-number": ["62076251", "2017YFB0802800"]}], "id": [{"id": "10.13039/501100001809", "id-type": "DOI", "asserted-by": "publisher"}]},
      {"DOI": "10.13039/501100012166", "name": "National Basic Research Program of China", "doi-asserted-by": "publisher", "award": ["62076251", "2017YFB0802800"], "award-info": [{"award-number": ["62076251", "2017YFB0802800"]}], "id": [{"id": "10.13039/501100012166", "id-type": "DOI", "asserted-by": "publisher"}]}
    ],
    "content-domain": {"domain": [], "crossmark-restriction": false},
    "short-container-title": ["Security and Communication Networks"],
    "published-print": {"date-parts": [[2022, 1, 28]]},
    "abstract": "<jats:p>Attacker identification from network traffic is a common practice in cyberspace security management. However, network administrators cannot cover all security equipment due to cyberspace management cost constraints, giving attackers the chance to evade the surveillance of network security administrators through legitimate actions and to perform attacks in both the physical and digital domains. Therefore, we propose a hidden attack sequence detection method based on reinforcement learning to address this challenge, modeling the network administrator as an intelligent agent that learns its action policy from interaction with the cyberspace environment. Following the Deep Deterministic Policy Gradient (DDPG) algorithm, the intelligent agent can not only discover hidden attackers concealed in legitimate action sequences but also reduce cyberspace management cost. Furthermore, a dynamic reward DDPG method is proposed to improve defense performance; it sets a dynamic reward depending on the hidden attack sequence steps and the agent's check steps, in contrast to the fixed reward of common methods. The method was verified in a simulated experimental cyberspace environment. The experimental results demonstrate that hidden attack sequences exist in cyberspace and that the proposed method can discover them. The dynamic reward DDPG shows superior performance in detecting hidden attackers, with a detection rate of 97.46%, improving the ability to discover hidden attackers while reducing cyberspace management cost by 6% compared to DDPG.</jats:p>",
    "DOI": "10.1155/2022/1488344",
    "type": "journal-article",
    "created": {"date-parts": [[2022, 1, 28]], "date-time": "2022-01-28T22:50:12Z", "timestamp": 1643410212000},
    "page": "1-13",
    "source": "Crossref",
    "is-referenced-by-count": 4,
    "title": ["A Hidden Attack Sequences Detection Method Based on Dynamic Reward Deep Deterministic Policy Gradient"],
    "prefix": "10.1155",
    "volume": "2022",
    "author": [
      {"ORCID": "https://orcid.org/0000-0001-8746-1106", "authenticated-orcid": true, "given": "Lei", "family": "Zhang", "sequence": "first", "affiliation": [{"name": "Command and Control Engineering College, Army Engineering University of PLA, Nanjing, China"}]},
      {"ORCID": "https://orcid.org/0000-0001-8615-7313", "authenticated-orcid": true, "given": "Zhisong", "family": "Pan", "sequence": "additional", "affiliation": [{"name": "Command and Control Engineering College, Army Engineering University of PLA, Nanjing, China"}]},
      {"ORCID": "https://orcid.org/0000-0003-4657-4117", "authenticated-orcid": true, "given": "Yu", "family": "Pan", "sequence": "additional", "affiliation": [{"name": "Command and Control Engineering College, Army Engineering University of PLA, Nanjing, China"}]},
      {"given": "Shize", "family": "Guo", "sequence": "additional", "affiliation": [{"name": "Command and Control Engineering College, Army Engineering University of PLA, Nanjing, China"}]},
      {"given": "Yi", "family": "Liu", "sequence": "additional", "affiliation": [{"name": "Defense Innovation Institute, Beijing, China"}]},
      {"ORCID": "https://orcid.org/0000-0002-8162-9091", "authenticated-orcid": true, "given": "Shiming", "family": "Xia", "sequence": "additional", "affiliation": [{"name": "Command and Control Engineering College, Army Engineering University of PLA, Nanjing, China"}]},
      {"ORCID": "https://orcid.org/0000-0002-3989-377X", "authenticated-orcid": true, "given": "Qibin", "family": "Zheng", "sequence": "additional", "affiliation": [{"name": "Academy of Military Science, Beijing, China"}]},
      {"ORCID": "https://orcid.org/0000-0002-0699-9455", "authenticated-orcid": true, "given": "Hongmei", "family": "Li", "sequence": "additional", "affiliation": [{"name": "Academy of Military Science, Beijing, China"}]},
      {"ORCID": "https://orcid.org/0000-0002-9738-0112", "authenticated-orcid": true, "given": "Wei", "family": "Bai", "sequence": "additional", "affiliation": [{"name": "Command and Control Engineering College, Army Engineering University of PLA, Nanjing, China"}]}
    ],
    "member": "311",
    "reference": [
      {"key": "1", "issue": "6", "article-title": "Comparative analysis on TCP and UDP network traffic", "volume": "27", "author": "Y. B. Zhang", "year": "2010", "journal-title": "Application Research of Computers"},
      {"key": "2", "doi-asserted-by": "publisher", "DOI": "10.1016/j.jnca.2012.09.004"},
      {"key": "3", "article-title": "Research progress and prospect of network intrusion detection technology", "volume": "32", "author": "Y. P. Jiang", "year": "2017", "journal-title": "Journal of Light Industry"},
      {"key": "4", "doi-asserted-by": "publisher", "DOI": "10.1109/CIT.2004.1357226"},
      {"key": "5", "doi-asserted-by": "publisher", "DOI": "10.1016/j.jnca.2012.08.007"},
      {"key": "6", "article-title": "Survey of network security situation awareness", "volume": "28", "author": "J. Gong", "year": "2017", "journal-title": "Journal of Software"},
      {"key": "7", "article-title": "Weakness analysis of cyberspace configuration based on reinforcement learning", "author": "L. Zhang", "year": "2020"},
      {"key": "8", "doi-asserted-by": "publisher", "DOI": "10.1109/iceeccot.2017.8284655"},
      {"key": "9", "doi-asserted-by": "publisher", "DOI": "10.1109/ACCESS.2019.2895898"},
      {"key": "10", "doi-asserted-by": "publisher", "DOI": "10.1109/icos.2014.7042412"},
      {"key": "11", "doi-asserted-by": "publisher", "DOI": "10.3969/j.issn.0372-2112.2017.03.033"},
      {"key": "12", "doi-asserted-by": "publisher", "DOI": "10.1109/icsess.2017.8343013"},
      {"key": "13", "doi-asserted-by": "publisher", "DOI": "10.1016/j.cose.2018.11.005"},
      {"key": "14", "doi-asserted-by": "publisher", "DOI": "10.1016/j.eswa.2017.07.005"},
      {"key": "15", "doi-asserted-by": "publisher", "DOI": "10.1109/CBD.2014.41"},
      {"key": "16", "doi-asserted-by": "publisher", "DOI": "10.1109/tetci.2017.2772792"},
      {"key": "17", "doi-asserted-by": "publisher", "DOI": "10.1109/access.2017.2762418"},
      {"key": "18", "doi-asserted-by": "publisher", "DOI": "10.1109/platcon.2017.7883684"},
      {"key": "19", "doi-asserted-by": "publisher", "DOI": "10.1109/ACCESS.2018.2863036"},
      {"key": "20", "doi-asserted-by": "publisher", "DOI": "10.1109/tii.2020.3022432"},
      {"key": "21", "article-title": "Reinforcement learning as classification: leveraging large margin classifiers", "author": "M. G. Lagoudakis", "year": "2003", "journal-title": "Icml Submitted"},
      {"key": "22", "article-title": "Towards traffic anomaly detection via reinforcement learning and data flow", "volume": "6", "author": "A. Servin", "year": "2018", "journal-title": "IEEE Access"},
      {"key": "23", "doi-asserted-by": "publisher", "DOI": "10.1109/tnet.2004.842221"},
      {"key": "24", "doi-asserted-by": "publisher", "DOI": "10.1145/2663474.2663481"},
      {"key": "25", "doi-asserted-by": "publisher", "DOI": "10.5220/0006197105590566"},
      {"key": "26", "doi-asserted-by": "publisher", "DOI": "10.1016/j.comnet.2019.05.013"},
      {"key": "27", "doi-asserted-by": "publisher", "DOI": "10.1109/tnnls.2021.3121870"},
      {"key": "28", "volume-title": "Reinforcement Learning: An Introduction", "author": "R. Sutton", "year": "1998"},
      {"key": "29", "doi-asserted-by": "publisher", "DOI": "10.1023/a:1022676722315"},
      {"key": "30", "article-title": "Playing atari with deep reinforcement learning", "author": "V. Mnih", "year": "2013", "journal-title": "Computer Science"},
      {"key": "31", "first-page": "387", "article-title": "Deterministic policy gradient algorithms", "author": "D. Silver"},
      {"key": "32", "article-title": "Continuous control with deep reinforcement learning", "author": "T. P. Lillicrap", "year": "2015"}
    ],
    "container-title": ["Security and Communication Networks"],
    "original-title": [],
    "language": "en",
    "link": [
      {"URL": "http://downloads.hindawi.com/journals/scn/2022/1488344.pdf", "content-type": "application/pdf", "content-version": "vor", "intended-application": "text-mining"},
      {"URL": "http://downloads.hindawi.com/journals/scn/2022/1488344.xml", "content-type": "application/xml", "content-version": "vor", "intended-application": "text-mining"},
      {"URL": "http://downloads.hindawi.com/journals/scn/2022/1488344.pdf", "content-type": "unspecified", "content-version": "vor", "intended-application": "similarity-checking"}
    ],
    "deposited": {"date-parts": [[2022, 1, 28]], "date-time": "2022-01-28T22:50:23Z", "timestamp": 1643410223000},
    "score": 1,
    "resource": {"primary": {"URL": "https://www.hindawi.com/journals/scn/2022/1488344/"}},
    "subtitle": [],
    "editor": [{"given": "Marimuthu", "family": "Karuppiah", "sequence": "additional", "affiliation": []}],
    "short-title": [],
    "issued": {"date-parts": [[2022, 1, 28]]},
    "references-count": 32,
    "alternative-id": ["1488344", "1488344"],
    "URL": "https://doi.org/10.1155/2022/1488344",
    "relation": {},
    "ISSN": ["1939-0122", "1939-0114"],
    "issn-type": [{"type": "electronic", "value": "1939-0122"}, {"type": "print", "value": "1939-0114"}],
    "subject": [],
    "published": {"date-parts": [[2022, 1, 28]]}
  }
}