{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T19:19:15Z","timestamp":1776107955991,"version":"3.50.1"},"reference-count":42,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2023,11,25]],"date-time":"2023-11-25T00:00:00Z","timestamp":1700870400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61971109"],"award-info":[{"award-number":["61971109"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["ZYGX2020ZB031"],"award-info":[{"award-number":["ZYGX2020ZB031"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"GF Science and Technology Special Innovation Zone Project","award":["61971109"],"award-info":[{"award-number":["61971109"]}]},{"name":"GF Science and Technology Special Innovation Zone Project","award":["ZYGX2020ZB031"],"award-info":[{"award-number":["ZYGX2020ZB031"]}]},{"name":"Fundamental Research Funds of Central Universities","award":["61971109"],"award-info":[{"award-number":["61971109"]}]},{"name":"Fundamental Research Funds of Central Universities","award":["ZYGX2020ZB031"],"award-info":[{"award-number":["ZYGX2020ZB031"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars exacerbate the complexity of this problem. In this paper, we propose a novel UCAV trajectory planning method based on deep reinforcement learning (DRL) technology to overcome the adverse impacts caused by the dynamics and randomness of environments. A predictive control model is constructed to describe the dynamic characteristics of the UCAV trajectory planning problem in detail. To improve the UCAV\u2019s predictive ability, we propose a memory-enhanced twin delayed deep deterministic policy gradient (ME-TD3) algorithm that uses an attention mechanism to effectively extract environmental patterns from historical information. The simulation results show that the proposed method can successfully train UCAVs to carry out trajectory planning tasks in dynamic and unknown environments. Furthermore, the ME-TD3 algorithm outperforms other classical DRL algorithms in UCAV trajectory planning, exhibiting superior performance and adaptability.<\/jats:p>","DOI":"10.3390\/rs15235494","type":"journal-article","created":{"date-parts":[[2023,11,27]],"date-time":"2023-11-27T03:35:06Z","timestamp":1701056106000},"page":"5494","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-6402-3773","authenticated-orcid":false,"given":"Jiantao","family":"Li","sequence":"first","affiliation":[{"name":"School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4845-9796","authenticated-orcid":false,"given":"Tianxian","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1460-5097","authenticated-orcid":false,"given":"Kai","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,11,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Xu, H., Fang, G., Fan, Y., Xu, B., and Yan, J. (2020). Universal adaptive neural network predictive algorithm for remotely piloted unmanned combat aerial vehicle in wireless sensor network. Sensors, 20.","DOI":"10.3390\/s20082213"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Zhang, T.X., Wang, Y.H., Ma, Z.J., and Kong, L.J. (IEEE Trans. Aerosp. Electron. Syst., 2023). Task assignment in UAV-enabled front jammer swarm: A coalition formation game approach, IEEE Trans. Aerosp. Electron. Syst., early access.","DOI":"10.1109\/TAES.2023.3323441"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1016\/j.advengsoft.2016.05.015","article-title":"Grey wolf optimizer for unmanned combat aerial vehicle path planning","volume":"99","author":"Zhang","year":"2016","journal-title":"Adv. Eng. Softw."},{"key":"ref_4","unstructured":"Kabamba, P.T., Meerkov, S.M., and Zeitz, F.H. (2005, January 3\u20138). Optimal UCAV path planning under missile threats. Proceedings of the 16th International Federation of Automatic Control World Congress (IFAC), Prague, Czech Republic."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1109\/TITS.2019.2954952","article-title":"Memory-based deep reinforcement learning for obstacle avoidance in UAV with limited environment knowledge","volume":"22","author":"Singla","year":"2021","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"6587","DOI":"10.1109\/JSEN.2020.3042079","article-title":"A graph-based track-before-detect algorithm for automotive radar target detection","volume":"21","author":"Chen","year":"2021","journal-title":"IEEE Sens. J."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Lu, S.Z., Meng, Z.J., Huang, Z., and Wang, Z. (2022). Study on quantum radar detection probability based on flying-wing stealth aircraft. Sensors, 22.","DOI":"10.3390\/s22165944"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"696","DOI":"10.1109\/TCST.2002.801879","article-title":"Radar cross-section reduction via route planning and intelligent control","volume":"10","author":"Moore","year":"2002","journal-title":"IEEE Trans. Control Syst. Technol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1016\/j.compfluid.2007.07.008","article-title":"Robust evolutionary algorithms for UAV\/UCAV aerodynamic and RCS design optimization","volume":"37","author":"Lee","year":"2008","journal-title":"Comput. Fluids"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"279","DOI":"10.2514\/1.14303","article-title":"Optimal path planning for unmanned combat aerial vehicles to defeat radar tracking","volume":"29","author":"Kabamba","year":"2006","journal-title":"J. Guid. Control Dyn."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1016\/j.ast.2009.07.002","article-title":"Novel intelligent water drops optimization approach to single UCAV smooth trajectory planning","volume":"13","author":"Duan","year":"2009","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"374","DOI":"10.14429\/dsj.70.15040","article-title":"A case-based online trajectory planning method of autonomous unmanned combat aerial vehicles with weapon release constraints","volume":"70","author":"Tang","year":"2020","journal-title":"Def. Sci. J."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3719762","DOI":"10.1155\/2018\/3719762","article-title":"UCAV formation online collaborative trajectory planning using hp adaptive pseudospectral method","volume":"2018","author":"Wei","year":"2018","journal-title":"Math. Probl. Eng."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1109\/JSEE.2012.00068","article-title":"Hybrid hierarchical trajectory planning for a fixed-wing UCAV performing air-to-surface multi-target attack","volume":"23","author":"Zhang","year":"2012","journal-title":"J. Syst. Eng. Electron."},{"key":"ref_15","unstructured":"Sutton, R., and Barto, A. (2017). Reinforcement Learning: An Introduction, MIT Press. [2nd ed.]."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1007\/s40815-021-01158-y","article-title":"Visual range maneuver decision of unmanned combat aerial vehicle based on fuzzy reasoning","volume":"24","author":"Wu","year":"2022","journal-title":"Int. J. Fuzzy Syst."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Yang, K., Dong, W., Cai, M., Jia, S., and Liu, R. (2022). UCAV air combat maneuver decisions based on a proximal policy optimization algorithm with situation reward shaping. Electronics, 11.","DOI":"10.3390\/electronics11162602"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"104767","DOI":"10.1016\/j.engappai.2022.104767","article-title":"Aerial combat maneuvering policy learning based on confrontation demonstrations and dynamic quality replay","volume":"111","author":"Hu","year":"2022","journal-title":"Eng. Appl. Artif. Intel."},{"key":"ref_19","first-page":"1477078","article-title":"Research on UCAV maneuvering decision method based on heuristic reinforcement learning","volume":"2022","author":"Yuan","year":"2022","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"3657814","DOI":"10.1155\/2023\/3657814","article-title":"Autonomous maneuver decision of UCAV air combat based on double deep Q network algorithm and stochastic game theory","volume":"2023","author":"Cao","year":"2023","journal-title":"Int. J. Aerosp. Eng."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Wang, Y., Li, K., Zhuang, X., Liu, X., and Li, H. (2023). A reinforcement learning method based on an improved sampling mechanism for unmanned aerial vehicle penetration. Aerospace, 10.","DOI":"10.3390\/aerospace10070642"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wan, K., Gao, X., Hu, Z., and Wu, G. (2020). Robust motion control for UAV in dynamic uncertain environments using deep reinforcement learning. Remote Sens., 12.","DOI":"10.3390\/rs12040640"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Li, B., Gan, Z.G., Chen, D.Q., and Aleksandrovich, D.S. (2020). UAV maneuvering target tracking in uncertain environments based on deep reinforcement learning and meta-learning. Remote Sens., 12.","DOI":"10.3390\/rs12223789"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Liu, X.X., Yin, Y., Su, Y.Z., and Ming, R.C. (2022). A multi-UCAV cooperative decision-making method based on an MAPPO algorithm for beyond-visual-range air combat. Aerospace, 9.","DOI":"10.3390\/aerospace9100563"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Kong, W., Zhou, D., Yang, Z., Zhang, K., and Zeng, L. (2020). Maneuver strategy generation of UCAV for within visual range air combat based on multi-agent reinforcement learning and target position prediction. Appl. Sci., 10.","DOI":"10.3390\/app10155198"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"5649","DOI":"10.1007\/s00521-021-06702-3","article-title":"Tactical UAV path optimization under radar threat using deep reinforcement learning","volume":"34","author":"Alpdemir","year":"2022","journal-title":"Neural Comput. Appl."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1017\/aer.2021.85","article-title":"Reinforcement learning-based radar-evasive path planning: A comparative analysis","volume":"126","author":"Hameed","year":"2022","journal-title":"Aeronaut. J."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1007\/s10846-019-01073-3","article-title":"Towards real-time path planning through deep reinforcement learning for a UAV in dynamic environments","volume":"98","author":"Yan","year":"2020","journal-title":"J. Intell. Robot. Syst."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Zipfel, P. (2014). Modeling and Simulation of Aerospace Vehicle Dynamics, AIAA Press. [3rd ed.].","DOI":"10.2514\/4.102509"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2286","DOI":"10.1109\/TAES.2022.3213793","article-title":"Sensitivity of single-pulse radar detection to aircraft pose uncertainties","volume":"59","author":"Costley","year":"2023","journal-title":"IEEE Trans. Aerosp. Electron. Syst."},{"key":"ref_31","unstructured":"Mahafza, B.R. (2013). Radar Systems Analysis and Design Using Matlab, CRC Press."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1108\/00022661311294067","article-title":"Penetration trajectory planning based on radar tracking features for UAV","volume":"85","author":"Chen","year":"2013","journal-title":"Aircr. Eng. Aerosp. Technol."},{"key":"ref_33","unstructured":"Skolink, M.I. (1990). Radar Handbook, McGraw-Hill Press. [2nd ed.]."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/5.554205","article-title":"An introduction to multisensor data fusion","volume":"85","author":"Hall","year":"1997","journal-title":"Proc. IEEE"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"49089","DOI":"10.1109\/ACCESS.2018.2854283","article-title":"A deep hierarchical reinforcement learning algorithm in partially observable markov decision processes","volume":"6","author":"Le","year":"2018","journal-title":"IEEE Access"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1023\/A:1007678930559","article-title":"Convergence results for single-step on-policy reinforcement-learning algorithms","volume":"38","author":"Singh","year":"2000","journal-title":"Mach. Learn."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1180","DOI":"10.1016\/j.neucom.2007.11.026","article-title":"Natural actor-critic","volume":"71","author":"Peters","year":"2008","journal-title":"Neurocomputing"},{"key":"ref_38","first-page":"180","article-title":"Continuous control with deep reinforcement learning","volume":"8","author":"Lillicrap","year":"2015","journal-title":"Comput. Sci."},{"key":"ref_39","unstructured":"Fujimoto, S., Van Hoof, H., and Meger, D. (2018). Addressing function approximation error in actor-critic methods. arXiv."},{"key":"ref_40","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"4262","DOI":"10.1109\/TAES.2023.3241141","article-title":"Strategy optimization for Range Gate Pull-Off track-deception jamming under black-box circumstance","volume":"59","author":"Wang","year":"2023","journal-title":"IEEE Trans. Aerosp. Electron. Syst."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"580","DOI":"10.1109\/TEVC.2022.3175517","article-title":"A stochastic simulation optimization-based Range Gate Pull-Off jamming method","volume":"27","author":"Wang","year":"2023","journal-title":"IEEE Trans. Evol. Comput."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/23\/5494\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:29:55Z","timestamp":1760131795000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/23\/5494"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,25]]},"references-count":42,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["rs15235494"],"URL":"https:\/\/doi.org\/10.3390\/rs15235494","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,25]]}}}