{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T00:28:24Z","timestamp":1769128104890,"version":"3.49.0"},"reference-count":45,"publisher":"MDPI AG","issue":"15","license":[{"start":{"date-parts":[[2021,7,26]],"date-time":"2021-07-26T00:00:00Z","timestamp":1627257600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61803111"],"award-info":[{"award-number":["61803111"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Guangzhou Science and Technology Project","award":["202102010403"],"award-info":[{"award-number":["202102010403"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This paper focuses on the trajectory tracking guidance problem for the Terminal Area Energy Management (TAEM) phase of the Reusable Launch Vehicle (RLV). Considering the continuous state and action space of this guidance problem, the Continuous Actor\u2013Critic Learning Automata (CACLA) is applied to construct the guidance strategy of RLV. Two three-layer neuron networks are used to model the critic and actor of CACLA, respectively. The weight vectors of the critic are updated by the model-free Temporal Difference (TD) learning algorithm, which is improved by eligibility trace and momentum factor. The weight vectors of the actor are updated based on the sign of TD error, and a Gauss exploration is carried out in the actor. 
Finally, a Monte Carlo simulation and a comparison simulation are performed to show the effectiveness of the CACLA-based guidance strategy.<\/jats:p>","DOI":"10.3390\/s21155062","type":"journal-article","created":{"date-parts":[[2021,7,26]],"date-time":"2021-07-26T22:22:46Z","timestamp":1627338166000},"page":"5062","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["CACLA-Based Trajectory Tracking Guidance for RLV in Terminal Area Energy Management Phase"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0938-9879","authenticated-orcid":false,"given":"Xuejing","family":"Lan","sequence":"first","affiliation":[{"name":"School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou 510006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhifeng","family":"Tan","sequence":"additional","affiliation":[{"name":"School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou 510006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tao","family":"Zou","sequence":"additional","affiliation":[{"name":"School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou 510006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenbiao","family":"Xu","sequence":"additional","affiliation":[{"name":"Guangdong Province Institute of Metrology, Guangzhou 510450, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,7,26]]},"reference":[{"key":"ref_1","unstructured":"Joshi, A., and Sivan, K. (2012). Reentry Guidance for Generic RLV Using Optimal Perturbations and Error Weights. 
Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, Honolulu, HI, USA, 18\u201321 August 2012, American Institute of Aeronautics and Astronautics."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1016\/j.ast.2017.12.009","article-title":"A three-dimensional predictor\u2013corrector entry guidance based on reduced-order motion equations","volume":"73","author":"Liang","year":"2018","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1016\/j.paerosci.2018.10.006","article-title":"Space traffic management: Towards safe and unsegregated space transport operations","volume":"105","author":"Hilton","year":"2019","journal-title":"Prog. Aerosp. Sci."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1016\/j.actaastro.2009.01.057","article-title":"RLV candidates for European Future Launchers Preparatory Programme","volume":"65","author":"Tomatis","year":"2009","journal-title":"Acta Astronaut."},{"key":"ref_5","unstructured":"Hanson, J. (2012). A Plan for Advanced Guidance and Control Technology for 2nd Generation Reusable Launch Vehicles. Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, Honolulu, HI, USA, 18\u201321 August 2012, American Institute of Aeronautics and Astronautics."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1016\/j.ast.2017.10.019","article-title":"Entry trajectory generation without reversal of bank angle","volume":"71","author":"He","year":"2017","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.ast.2017.10.012","article-title":"Attitude controller design for reusable launch vehicles during reentry phase via compound adaptive fuzzy H-infinity control","volume":"72","author":"Mao","year":"2017","journal-title":"Aerosp. Sci. 
Technol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"150","DOI":"10.1016\/j.ast.2019.03.052","article-title":"An on-line guidance algorithm for high L\/D hypersonic reentry vehicles","volume":"89","author":"Zang","year":"2019","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1177\/1729881418817971","article-title":"Rapid trajectory planning of a reusable launch vehicle for airdrop with geographic constraints","volume":"16","author":"Wei","year":"2019","journal-title":"Int. J. Adv. Robot. Syst."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"3212","DOI":"10.1109\/TVT.2019.2899917","article-title":"A Novel Reentry Trajectory Generation Method Using Improved Particle Swarm Optimization","volume":"68","author":"Zhou","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"620","DOI":"10.1016\/j.ast.2018.10.035","article-title":"Entry trajectory planning with terminal full states constraints and multiple geographic constraints","volume":"84","author":"Wang","year":"2019","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1016\/j.ast.2017.12.037","article-title":"An approach and landing guidance design for reusable launch vehicle based on adaptive predictor\u2013corrector technique","volume":"75","author":"Li","year":"2018","journal-title":"Aerosp. Sci. Technol."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Hameed, A.S., and Bindu, D.G.R. (2019). A Novel Flare Maneuver Guidance for Approach and Landing Phase of a Reusable Launch Vehicle. 
Advances in Science and Engineering Technology, Proceedings of the International Conferences (ASET), Dubai, United Arab Emirates, 26 March\u201311 April 2019, IEEE.","DOI":"10.1109\/ICASET.2019.8714376"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"642","DOI":"10.2514\/1.51083","article-title":"Optimal Longitudinal Trajectories for Reusable Space Vehicles in the Terminal Area","volume":"48","author":"Ridder","year":"2011","journal-title":"J. Spacecr. Rocket."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Corraro, F., Morani, G., Nebula, F., Cuciniello, G., and Palumbo, R. (2012, January 11\u201314). GN&C Technology Innovations for TAEM: USV DTFT2 Mission Results. Proceedings of the 17th AIAA International Space Planes and Hypersonic Systems and Technologies Conference, San Francisco, CA, USA.","DOI":"10.2514\/6.2011-2262"},{"key":"ref_16","unstructured":"Horneman, K., and Kluever, C. (2012, January 13\u201316). Terminal Area Energy Management Trajectory Planning for an Unpowered Reusable Launch Vehicle. Proceedings of the AIAA Atmospheric Flight Mechanics Conference and Exhibit, Minneapolis, MN, USA."},{"key":"ref_17","unstructured":"Mayanna, A., Grimm, W., and Well, K. (2012, January 18\u201321). Adaptive Guidance for Terminal Area Energy Management (TAEM) of Reentry Vehicles. Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, Honolulu, HI, USA."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"162","DOI":"10.2514\/1.24864","article-title":"Terminal Guidance for an Unpowered Reusable Launch Vehicle with Bank Constraints","volume":"30","author":"Kluever","year":"2007","journal-title":"J. Guid. Control. 
Dyn."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/s12567-018-0219-3","article-title":"Dynamic guidance of orbiter gliders: Alignment, final approach, and landing","volume":"11","author":"Fonseca","year":"2019","journal-title":"CEAS Space J."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Pengfei, F., Fan, W., Yonghua, F., and Jie, Y. (2019, January 24\u201326). In-flight Longitudinal Guidance for RLV in TAEM Phase. Proceedings of the 2018 IEEE 4th International Conference on Control Science and Systems Engineering (ICCSSE), Wuhan, China.","DOI":"10.1109\/CCSSE.2018.8724848"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1099","DOI":"10.1007\/s12206-008-0501-y","article-title":"Trajectory optimization and the control of a re-entry vehicle in TAEM phase","volume":"22","author":"Baek","year":"2008","journal-title":"J. Mech. Sci. Technol."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1049\/iet-cta.2008.0463","article-title":"Robust terminal area energy management guidance using flatness approach","volume":"4","author":"Cazaurang","year":"2010","journal-title":"IET Control Theory Appl."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zheng, B., Liang, Z., Li, Q., and Ren, Z. (2015, January 8\u201310). Trajectory tracking for RLV terminal area energy management phase based on LQR. Proceedings of the 2014 IEEE Chinese Guidance, Navigation and Control Conference, Yantai, China.","DOI":"10.1109\/CGNCC.2014.7007564"},{"key":"ref_24","unstructured":"Grantham, K. (2012, January 6\u20139). Adaptive Critic Neural Network Based Terminal Area Energy Management\/Entry Guidance. Proceedings of the 41st Aerospace Sciences Meeting and Exhibit, Reno, NV, USA."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Mu, L., Yu, X., Wang, B., Zhang, Y., Wang, X., and Li, P. (2018, January 18\u201320). 3D gliding guidance for an unpowered RLV in the TAEM phase. 
Proceedings of the 2018 33rd Youth Academic Annual Conference of Chinese Association of Automation (YAC), Nanjing, China.","DOI":"10.1109\/YAC.2018.8406409"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Kluever, C., Horneman, K., and Schierman, J. (2009, January 10\u201313). Rapid Terminal-Trajectory Planner for an Unpowered Reusable Launch Vehicle. Proceedings of the AIAA Guidance, Navigation, and Control Conference, Chicago, IL, USA.","DOI":"10.2514\/6.2009-5766"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1016\/j.actaastro.2015.10.019","article-title":"Online trajectory planning and guidance for reusable launch vehicles in the terminal area","volume":"118","author":"Lan","year":"2016","journal-title":"Acta Astronaut."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"04020003","DOI":"10.1061\/(ASCE)AS.1943-5525.0001112","article-title":"3D Profile Reconstruction and Guidance for the Terminal Area Energy Management Phase of an Unpowered RLV with Aerosurface Failure","volume":"33","author":"Lan","year":"2020","journal-title":"J. Aerosp. Eng."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Busoniu, L., Ernst, D., De Schutter, B., and Babuska, R. (2011, January 11\u201315). Approximate reinforcement learning\u2014An overview. Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), Paris, France.","DOI":"10.1109\/ADPRL.2011.5967353"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2509","DOI":"10.1109\/TIE.2014.2361485","article-title":"A Novel Dual Iterative Q-Learning Method for Optimal Battery Management in Smart Residential Environments","volume":"62","author":"Wei","year":"2014","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_31","unstructured":"Tampuu, A., Matiisen, T., Kodelja, D., Kuzovkin, I., Korjus, K., Aru, J., Aru, J., and Vicente, R. (2015). Multiagent Cooperation and Competition with Deep Reinforcement Learning. 
arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1109\/TCDS.2018.2866477","article-title":"Biologically Inspired Motion Modeling and Neural Control for Robot Learning From Demonstrations","volume":"11","author":"Yang","year":"2018","journal-title":"IEEE Trans. Cogn. Dev. Syst."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1109\/TNNLS.2018.2852711","article-title":"Robot Learning System Based on Adaptive Neural Control and Dynamic Movement Primitives","volume":"30","author":"Yang","year":"2019","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, N., Chen, C., and Yang, C. (2019). A Robot Learning Framework based on Adaptive Admittance Control and Generalizable Motion Modeling with Neural Network Controller. Neurocomputing, 390.","DOI":"10.1016\/j.neucom.2019.04.100"},{"key":"ref_35","first-page":"161","article-title":"Finite-Time Convergence Disturbance Rejection Control for a Flexible Timoshenko Manipulator","volume":"8","author":"Zhao","year":"2021","journal-title":"IEEE\/CAA J. Autom. Sin."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Al-Talabi, A.A., and Schwartz, H.M. (2016, January 24\u201326). Kalman fuzzy actor-critic learning automaton algorithm for the pursuit-evasion differential game. Proceedings of the IEEE International Conference on Fuzzy Systems, Vancouver, BC, Canada.","DOI":"10.1109\/FUZZ-IEEE.2016.7737799"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Gerken, A., and Spranger, M. (2019). Continuous Value Iteration (CVI) Reinforcement Learning and Imaginary Experience Replay (IER) for learning multi-goal, continuous action and state space controllers. arXiv.","DOI":"10.1109\/ICRA.2019.8794347"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zimmer, M., and Weng, P. (2019). 
Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains. arXiv.","DOI":"10.24963\/ijcai.2019\/625"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Lan, X., Liu, Y., and Zhao, Z. (2020). Cooperative control for swarming systems based on reinforcement learning in unknown dynamic environment. Neurocomputing, 410.","DOI":"10.1016\/j.neucom.2020.06.038"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Leuenberger, G., and Wiering, M.A. (2018, January 16\u201318). Actor-Critic Reinforcement Learning with Neural Networks in Continuous Games. Proceedings of the International Conference on Agents and Artificial Intelligence (ICAART), Funchal, Portugal.","DOI":"10.5220\/0006556500530060"},{"key":"ref_41","unstructured":"Jiang, X., Yang, J., Tan, X., and Xi, H. (2018). Observation-based Optimization for POMDPs with Continuous State, Observation, and Action Spaces. IEEE Trans. Autom. Control, 1\u20138."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"59624","DOI":"10.1109\/ACCESS.2019.2914669","article-title":"ADP-Based Intelligent Decentralized Control for Multi-Agent Systems Moving in Obstacle Environment","volume":"7","author":"Lan","year":"2019","journal-title":"IEEE Access"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Van Hasselt, H. (2012). Reinforcement Learning in Continuous State and Action Spaces. Reinforcement Learning, Springer.","DOI":"10.1007\/978-3-642-27645-3_7"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Comsa, I.S., Aydin, M., Zhang, S., Kuonen, P., Wagen, J.F., and Yao, L. (2014, January 8\u201311). Scheduling policies based on dynamic throughput and fairness tradeoff control in LTE-A networks. Proceedings of the 39th Annual IEEE Conference on Local Computer Networks, Edmonton, AB, Canada.","DOI":"10.1109\/LCN.2014.6925806"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Hafez, M.B., Weber, C., and Wermter, S. 
(2017, January 18\u201321). Curiosity-driven exploration enhances motor skills of continuous actor-critic learner. Proceedings of the 7th Joint IEEE International Conferences on Development and Learning and Epigenetic Robotics (ICDL-Epirob), Lisbon, Portugal.","DOI":"10.1109\/DEVLRN.2017.8329785"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/15\/5062\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:35:15Z","timestamp":1760164515000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/15\/5062"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,26]]},"references-count":45,"journal-issue":{"issue":"15","published-online":{"date-parts":[[2021,8]]}},"alternative-id":["s21155062"],"URL":"https:\/\/doi.org\/10.3390\/s21155062","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,26]]}}}