{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T04:20:19Z","timestamp":1773894019316,"version":"3.50.1"},"reference-count":30,"publisher":"Wiley","issue":"4","license":[{"start":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T00:00:00Z","timestamp":1773273600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/onlinelibrary.wiley.com\/termsAndConditions#vor"},{"start":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T00:00:00Z","timestamp":1773273600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/doi.wiley.com\/10.1002\/tdm_license_1.1"}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Expert Systems"],"published-print":{"date-parts":[[2026,4]]},"abstract":"<jats:title>ABSTRACT<\/jats:title>\n                  <jats:p>With advances in autonomous driving technology, autonomous parking\u2014an indispensable capability of intelligent vehicles\u2014has emerged as a focal point for both academia and industry. To mitigate the slow convergence caused by sparse reward signals in parking tasks, this study introduces Evolutionary\u2010Guided Self\u2010Imitation Proximal Policy Optimisation (EGSI\u2010PPO), a novel algorithm that fuses the exploratory diversity of evolutionary strategies with the trajectory\u2010guided supervision of self\u2010imitation learning. The evolutionary component maintains policy diversity and enlarges the search space through population\u2010based parallel evolution, thereby enhancing global exploration, while the self\u2010imitation component transforms sparse rewards into dense supervisory signals using high\u2010return trajectories, simultaneously accelerating convergence and guiding the policy out of suboptimal traps. To balance parking accuracy, efficiency, and smoothness, a composite multi\u2010objective reward function is formulated, and a meta\u2010gradient weight\u2010balancing mechanism automatically adjusts the relative importance of each sub\u2010objective. In addition, action\u2010level smoothing and physical constraints are imposed at the policy output to ensure practical deployability. Experiments on the Webots simulation platform show that, compared with SAC, DDPG, and TD3, EGSI\u2010PPO delivers significant improvements in success rate, parking accuracy, and convergence speed. Ablation studies further confirm the individual contributions of the evolutionary component and the self\u2010imitation learning module. Overall, this work provides an efficient and robust deep reinforcement learning solution for autonomous parking and demonstrates the algorithm's potential in continuous control tasks characterised by sparse rewards.<\/jats:p>","DOI":"10.1111\/exsy.70240","type":"journal-article","created":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T11:26:20Z","timestamp":1773314780000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["<scp>EGSI<\/scp>\n                    \u2010\n                    <scp>PPO<\/scp>\n                    : An Evolutionary\u2010Guided Self\u2010Imitation Reinforcement Learning Framework for Autonomous Parking"],"prefix":"10.1111","volume":"43","author":[{"given":"Liang","family":"Hou","sequence":"first","affiliation":[{"name":"Gansu Engineering Research Center of Manufacturing Information Lanzhou University of Technology  Lanzhou China"},{"name":"School of Information Engineering Lanzhou City University  Lanzhou China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-3173-6382","authenticated-orcid":false,"given":"Fan","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer and Artificial Intelligence Lanzhou University of Technology  Lanzhou China"}]},{"given":"Jie","family":"Cao","sequence":"additional","affiliation":[{"name":"Gansu Engineering Research Center of Manufacturing Information Lanzhou University of Technology  Lanzhou China"},{"name":"School of Information Engineering Lanzhou City University  Lanzhou China"}]},{"given":"Nana","family":"Lian","sequence":"additional","affiliation":[{"name":"School of Computer and Artificial Intelligence Lanzhou University of Technology  Lanzhou China"}]},{"given":"Xudong","family":"Liu","sequence":"additional","affiliation":[{"name":"Medical Department Gansu Provincial Maternal and Child Health Hospital  Lanzhou China"}]}],"member":"311","published-online":{"date-parts":[[2026,3,12]]},"reference":[{"key":"e_1_2_9_2_1","doi-asserted-by":"publisher","DOI":"10.1049\/itr2.12614"},{"key":"e_1_2_9_3_1","doi-asserted-by":"publisher","DOI":"10.3390\/s25061941"},{"key":"e_1_2_9_4_1","doi-asserted-by":"publisher","DOI":"10.3390\/app13116847"},{"key":"e_1_2_9_5_1","doi-asserted-by":"publisher","DOI":"10.30880\/jscdm.2024.05.01.001"},{"key":"e_1_2_9_6_1","doi-asserted-by":"publisher","DOI":"10.3390\/app112210659"},{"key":"e_1_2_9_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2020.3025671"},{"key":"e_1_2_9_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2025.3550417"},{"key":"e_1_2_9_9_1","doi-asserted-by":"publisher","DOI":"10.3390\/app10249100"},{"key":"e_1_2_9_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2023.104545"},{"key":"e_1_2_9_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10462\u2010024\u201011062\u20100"},{"key":"e_1_2_9_12_1","first-page":"3878","article-title":"Self\u2010Imitation Learning","volume":"80","author":"Oh J.","year":"2018","journal-title":"Proceedings of the 35th International Conference on Machine Learning"},{"key":"e_1_2_9_13_1","doi-asserted-by":"publisher","DOI":"10.3390\/s25144328"},{"key":"e_1_2_9_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3330431"},{"key":"e_1_2_9_15_1","doi-asserted-by":"publisher","DOI":"10.3389\/frai.2025.1688764"},{"key":"e_1_2_9_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12293\u2010024\u201000419\u20101"},{"key":"e_1_2_9_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.2024.3454011"},{"key":"e_1_2_9_18_1","doi-asserted-by":"publisher","DOI":"10.1049\/itr2.12067"},{"key":"e_1_2_9_19_1","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.14819"},{"key":"e_1_2_9_20_1","doi-asserted-by":"publisher","DOI":"10.3390\/s23167124"},{"key":"e_1_2_9_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.conengprac.2025.106423"},{"key":"e_1_2_9_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/JAS.2023.123975"},{"key":"e_1_2_9_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCST.2024.3367468"},{"key":"e_1_2_9_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TRO.2023.3249564"},{"key":"e_1_2_9_25_1","doi-asserted-by":"publisher","DOI":"10.3390\/s24154962"},{"key":"e_1_2_9_26_1","first-page":"2402","article-title":"Meta\u2010Gradient Reinforcement Learning","volume":"31","author":"Xu Z.","year":"2018","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_9_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2024.124929"},{"key":"e_1_2_9_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2022.07.111"},{"key":"e_1_2_9_29_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41598\u2010025\u201085541\u2010x"},{"key":"e_1_2_9_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2023.126628"},{"key":"e_1_2_9_31_1","doi-asserted-by":"publisher","DOI":"10.3390\/systems13060453"}],"container-title":["Expert Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1111\/exsy.70240","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1111\/exsy.70240","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1111\/exsy.70240","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T02:01:42Z","timestamp":1773885702000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1111\/exsy.70240"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,12]]},"references-count":30,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2026,4]]}},"alternative-id":["10.1111\/exsy.70240"],"URL":"https:\/\/doi.org\/10.1111\/exsy.70240","archive":["Portico"],"relation":{},"ISSN":["0266-4720","1468-0394"],"issn-type":[{"value":"0266-4720","type":"print"},{"value":"1468-0394","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,12]]},"assertion":[{"value":"2025-09-18","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-02-25","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-03-12","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"e70240"}}