{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,2]],"date-time":"2026-03-02T18:26:11Z","timestamp":1772475971267,"version":"3.50.1"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T00:00:00Z","timestamp":1712016000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T00:00:00Z","timestamp":1712016000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"the National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62173251"],"award-info":[{"award-number":["62173251"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["2242023K30034"],"award-info":[{"award-number":["2242023K30034"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]},{"name":"the \u201cZhishan\u201d Scholars Programs of Southeast University"},{"name":"Engineering Research Center of Blockchain Application, Supervision And Management (Southeast University), Ministry of Education"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Artif Intell Rev"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Safe and efficient cooperative planning of multiple robots in pedestrian participation environments is promising for applications. In this paper, a novel multi-robot social-aware efficient cooperative planner on the basis of off-policy multi-agent reinforcement learning (MARL) under partial dimension-varying observation and imperfect perception conditions is proposed. We adopt a temporal-spatial graph (TSG)-based social encoder to better extract the importance of social relations between each robot and the pedestrians in its field of view (FOV). Also, we introduce a K-step lookahead reward setting in the multi-robot RL framework to avoid aggressive, intrusive, short-sighted, and unnatural motion decisions generated by robots. Moreover, we improve the traditional centralized critic network with a multi-head global attention module to better aggregate local observation information among different robots to guide the process of the individual policy update. Finally, multi-group experimental results verify the effectiveness of the proposed cooperative motion planner.<\/jats:p>","DOI":"10.1007\/s10462-024-10739-w","type":"journal-article","created":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T10:01:53Z","timestamp":1712052113000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Multi-robot social-aware cooperative planning in pedestrian environments using attention-based actor-critic"],"prefix":"10.1007","volume":"57","author":[{"given":"Lu","family":"Dong","sequence":"first","affiliation":[]},{"given":"Zichen","family":"He","sequence":"additional","affiliation":[]},{"given":"Chunwei","family":"Song","sequence":"additional","affiliation":[]},{"given":"Xin","family":"Yuan","sequence":"additional","affiliation":[]},{"given":"Haichao","family":"Zhang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,4,2]]},"reference":[{"key":"10739_CR1","doi-asserted-by":"publisher","unstructured":"Berg JVD, Guy SJ, Lin M, Manocha D (2011) Reciprocal n-body collision avoidance. In: Robotics research. Springer, Berlin, pp 3\u201319. https:\/\/doi.org\/10.1007\/978-3-642-19457-3_1","DOI":"10.1007\/978-3-642-19457-3_1"},{"key":"10739_CR2","doi-asserted-by":"publisher","unstructured":"Chen C, Liu Y, Kreiss S, Alahi A (2019) Crowd-robot interaction: crowd-aware robot navigation with attention-based deep reinforcement learning. In: 2019 International conference on robotics and automation (ICRA), pp 6015\u20136022. https:\/\/doi.org\/10.1109\/ICRA.2019.8794134","DOI":"10.1109\/ICRA.2019.8794134"},{"key":"10739_CR3","doi-asserted-by":"publisher","unstructured":"Desaraju VR, How JP (2011) Decentralized path planning for multi-agent teams in complex environments using rapidly-exploring random trees. In: 2011 IEEE international conference on robotics and automation, pp 4956\u20134961. https:\/\/doi.org\/10.1109\/ICRA.2011.5980392","DOI":"10.1109\/ICRA.2011.5980392"},{"issue":"2","key":"10739_CR4","doi-asserted-by":"publisher","first-page":"439","DOI":"10.23919\/JSEE.2023.000051","volume":"34","author":"L Dong","year":"2023","unstructured":"Dong L, He Z, Song C, Sun C (2023) A review of mobile robot motion planning methods: from classical motion planning workflows to reinforcement learning-based architectures. J Syst Eng Electron 34(2):439\u2013459","journal-title":"J Syst Eng Electron"},{"key":"10739_CR5","doi-asserted-by":"publisher","unstructured":"Douthwaite JA, Zhao S, Mihaylova LS (2018) A comparative study of velocity obstacle approaches for multi-agent systems. In: 2018 UKACC 12th international conference on control (CONTROL), pp 289\u2013294. https:\/\/doi.org\/10.1109\/CONTROL.2018.8516848","DOI":"10.1109\/CONTROL.2018.8516848"},{"issue":"11","key":"10739_CR6","doi-asserted-by":"publisher","first-page":"6584","DOI":"10.1109\/TNNLS.2021.3082568","volume":"33","author":"J Duan","year":"2021","unstructured":"Duan J, Guan Y, Li SE, Ren Y, Sun Q, Cheng B (2021) Distributional soft actor-critic: off-policy reinforcement learning for addressing value estimation errors. IEEE Trans Neural Netw Learn Syst 33(11):6584\u20136598","journal-title":"IEEE Trans Neural Netw Learn Syst"},{"key":"10739_CR7","doi-asserted-by":"publisher","unstructured":"Everett M, Chen YF, How JP (2018) Motion planning among dynamic, decision-making agents with deep reinforcement learning. In: 2018 IEEE\/RSJ international conference on intelligent robots and systems (IROS), pp 3052\u20133059. https:\/\/doi.org\/10.1109\/IROS.2018.8593871","DOI":"10.1109\/IROS.2018.8593871"},{"key":"10739_CR8","doi-asserted-by":"publisher","first-page":"10357","DOI":"10.1109\/ACCESS.2021.3050338","volume":"9","author":"M Everett","year":"2021","unstructured":"Everett M, Chen YF, How JP (2021) Collision avoidance in pedestrian-rich environments with deep reinforcement learning. IEEE Access 9:10357\u201310377. https:\/\/doi.org\/10.1109\/ACCESS.2021.3050338","journal-title":"IEEE Access"},{"issue":"7","key":"10739_CR9","doi-asserted-by":"publisher","first-page":"856","DOI":"10.1177\/0278364920916531","volume":"39","author":"T Fan","year":"2020","unstructured":"Fan T, Long P, Liu W, Pan J (2020) Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios. Int J Robot Res 39(7):856\u2013892","journal-title":"Int J Robot Res"},{"key":"10739_CR10","doi-asserted-by":"crossref","unstructured":"Gu T, Chen G, Li J, Lin C, Rao Y, Zhou J, Lu J (2022) Stochastic trajectory prediction via motion indeterminacy diffusion. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (CVPR), pp 17113\u201317122","DOI":"10.1109\/CVPR52688.2022.01660"},{"key":"10739_CR11","doi-asserted-by":"crossref","unstructured":"Gupta A, Johnson J, Fei-Fei L, Savarese S, Alahi A (2018) Social GAN: Socially acceptable trajectories with generative adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR)","DOI":"10.1109\/CVPR.2018.00240"},{"issue":"5","key":"10739_CR12","doi-asserted-by":"publisher","first-page":"2757","DOI":"10.1109\/TSMC.2021.3050960","volume":"52","author":"Z He","year":"2022","unstructured":"He Z, Dong L, Sun C, Wang J (2022) Asynchronous multithreading reinforcement-learning-based path planning and tracking for unmanned underwater vehicle. IEEE Trans Syst Man Cybern Syst 52(5):2757\u20132769. https:\/\/doi.org\/10.1109\/TSMC.2021.3050960","journal-title":"IEEE Trans Syst Man Cybern Syst"},{"key":"10739_CR13","doi-asserted-by":"publisher","unstructured":"He Z, Dong L, Song C, Sun C (2022) Multiagent soft actor-critic based hybrid motion planner for mobile robots. In: IEEE transactions on neural networks and learning systems (to be published). https:\/\/doi.org\/10.1109\/TNNLS.2022.3172168","DOI":"10.1109\/TNNLS.2022.3172168"},{"key":"10739_CR14","doi-asserted-by":"crossref","unstructured":"Huang Y, Bi H, Li Z, Mao T, Wang Z (2019) Stgat: modeling spatial-temporal interactions for human trajectory prediction. In: Proceedings of the IEEE\/CVF international conference on computer vision (ICCV)","DOI":"10.1109\/ICCV.2019.00637"},{"key":"10739_CR15","doi-asserted-by":"publisher","unstructured":"Huang X, Zhou L, Guan Z, Li Z, Wen C, He R (2019) Generalized reciprocal collision avoidance for non-holonomic robots. In: 2019 14th IEEE conference on industrial electronics and applications (ICIEA), pp 1623\u20131628. https:\/\/doi.org\/10.1109\/ICIEA.2019.8834353","DOI":"10.1109\/ICIEA.2019.8834353"},{"key":"10739_CR16","doi-asserted-by":"publisher","unstructured":"Liang Z, Cao J, Lin W, Chen J, Xu H (2021) Hierarchical deep reinforcement learning for multi-robot cooperation in partially observable environment. In: 2021 IEEE third international conference on cognitive machine intelligence (CogMI), pp 272\u2013281. https:\/\/doi.org\/10.1109\/CogMI52975.2021.00042","DOI":"10.1109\/CogMI52975.2021.00042"},{"key":"10739_CR17","unstructured":"Liu S, Chang P, Huang Z, Chakraborty N, Liang W, Geng J, Driggs-Campbell K (2022) Socially aware robot crowd navigation with interaction graphs and human trajectory prediction. arXiv preprint arXiv:2203.01821"},{"key":"10739_CR18","doi-asserted-by":"publisher","unstructured":"Liu S, Chang P, Liang W, Chakraborty N, Driggs-Campbell K (2021) Decentralized structural-RNN for robot crowd navigation with deep reinforcement learning. In: 2021 IEEE international conference on robotics and automation (ICRA), pp 3517\u20133524. https:\/\/doi.org\/10.1109\/ICRA48506.2021.9561595","DOI":"10.1109\/ICRA48506.2021.9561595"},{"key":"10739_CR19","doi-asserted-by":"crossref","unstructured":"Matsuzaki S, Hasegawa Y (2022) Learning crowd-aware robot navigation from challenging environments via distributed deep reinforcement learning. In: 2022 International conference on robotics and automation (ICRA), pp 4730\u20134736. IEEE","DOI":"10.1109\/ICRA46639.2022.9812011"},{"key":"10739_CR20","doi-asserted-by":"crossref","unstructured":"Mehran R, Oyama A, Shah M (2009) Abnormal crowd behavior detection using social force model. In: 2009 IEEE conference on computer vision and pattern recognition, pp 935\u2013942. IEEE","DOI":"10.1109\/CVPRW.2009.5206641"},{"key":"10739_CR21","doi-asserted-by":"publisher","unstructured":"Mellinger D, Kushleyev A, Kumar V (2012) Mixed-integer quadratic program trajectory generation for heterogeneous quadrotor teams. In: 2012 IEEE international conference on robotics and automation, pp 477\u2013483. https:\/\/doi.org\/10.1109\/ICRA.2012.6225009","DOI":"10.1109\/ICRA.2012.6225009"},{"key":"10739_CR22","doi-asserted-by":"publisher","unstructured":"Nishimura M, Yonetani R (2020) L2B: learning to balance the safety-efficiency trade-off in interactive crowd-aware robot navigation. In: 2020 IEEE\/RSJ international conference on intelligent robots and systems (IROS), pp 11004\u201311010. https:\/\/doi.org\/10.1109\/IROS45743.2020.9341519","DOI":"10.1109\/IROS45743.2020.9341519"},{"key":"10739_CR23","doi-asserted-by":"crossref","unstructured":"Phillips M, Likhachev M (2011) SIPP: safe interval path planning for dynamic environments. In: 2011 IEEE international conference on robotics and automation, pp 5628\u20135635. IEEE","DOI":"10.1109\/ICRA.2011.5980306"},{"key":"10739_CR24","doi-asserted-by":"crossref","unstructured":"Qiu Q, Yao S, Wang J, Ma J, Chen G, Ji J (2022) Learning to socially navigate in pedestrian-rich environments with interaction capacity. arXiv preprint arXiv:2203.16154","DOI":"10.1109\/ICRA46639.2022.9811662"},{"issue":"1","key":"10739_CR25","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1109\/TRO.2020.3006716","volume":"37","author":"AH Qureshi","year":"2021","unstructured":"Qureshi AH, Miao Y, Simeonov A, Yip MC (2021) Motion planning networks: bridging the gap between learning-based and classical motion planners. IEEE Trans Robot 37(1):48\u201366. https:\/\/doi.org\/10.1109\/TRO.2020.3006716","journal-title":"IEEE Trans Robot"},{"issue":"3","key":"10739_CR26","doi-asserted-by":"publisher","first-page":"4249","DOI":"10.1109\/LRA.2020.2994035","volume":"5","author":"B Rivi\u00e8re","year":"2020","unstructured":"Rivi\u00e8re B, H\u00f6nig W, Yue Y, Chung S-J (2020) Glas: global-to-local safe autonomy synthesis for multi-robot motion planning with end-to-end learning. IEEE Robot Autom Lett 5(3):4249\u20134256. https:\/\/doi.org\/10.1109\/LRA.2020.2994035","journal-title":"IEEE Robot Autom Lett"},{"issue":"3","key":"10739_CR27","doi-asserted-by":"publisher","first-page":"2378","DOI":"10.1109\/LRA.2019.2903261","volume":"4","author":"G Sartoretti","year":"2019","unstructured":"Sartoretti G, Kerr J, Shi Y, Wagner G, Kumar TKS, Koenig S, Choset H (2019) Primal: pathfinding via reinforcement and imitation multi-agent learning. IEEE Robot Autom Lett 4(3):2378\u20132385. https:\/\/doi.org\/10.1109\/LRA.2019.2903261","journal-title":"IEEE Robot Autom Lett"},{"issue":"2","key":"10739_CR28","doi-asserted-by":"publisher","first-page":"3221","DOI":"10.1109\/LRA.2020.2974695","volume":"5","author":"SH Semnani","year":"2020","unstructured":"Semnani SH, Liu H, Everett M, de Ruiter A, How JP (2020) Multi-agent motion planning for dense and dynamic environments via deep reinforcement learning. IEEE Robot Autom Lett 5(2):3221\u20133226. https:\/\/doi.org\/10.1109\/LRA.2020.2974695","journal-title":"IEEE Robot Autom Lett"},{"issue":"4","key":"10739_CR29","doi-asserted-by":"publisher","first-page":"696","DOI":"10.1109\/TRO.2011.2120810","volume":"27","author":"J Snape","year":"2011","unstructured":"Snape J, Berg JVD, Guy SJ, Manocha D (2011) The hybrid reciprocal velocity obstacle. IEEE Trans Robot 27(4):696\u2013706. https:\/\/doi.org\/10.1109\/TRO.2011.2120810","journal-title":"IEEE Trans Robot"},{"key":"10739_CR30","doi-asserted-by":"publisher","unstructured":"Song C, He Z, Dong L (2022) A local-and-global attention reinforcement learning algorithm for multiagent cooperative navigation. In: IEEE transactions on neural networks and learning systems (to be published). https:\/\/doi.org\/10.1109\/TNNLS.2022.3220798","DOI":"10.1109\/TNNLS.2022.3220798"},{"issue":"9","key":"10739_CR31","doi-asserted-by":"publisher","first-page":"1062","DOI":"10.1177\/0278364917741532","volume":"37","author":"S Tang","year":"2018","unstructured":"Tang S, Thomas J, Kumar V (2018) Hold or take optimal plan (hoop): a quadratic programming approach to multi-robot trajectory generation. Int J Robot Res 37(9):1062\u20131084","journal-title":"Int J Robot Res"},{"key":"10739_CR32","doi-asserted-by":"publisher","unstructured":"Vemula A, Muelling K, Oh J (2018) Social attention: modeling attention in human crowds. In: 2018 IEEE international conference on robotics and automation (ICRA), pp 4601\u20134607. https:\/\/doi.org\/10.1109\/ICRA.2018.8460504","DOI":"10.1109\/ICRA.2018.8460504"},{"key":"10739_CR33","doi-asserted-by":"publisher","unstructured":"Wang L, Li Z, Wen C, He R, Guo F (2018) Reciprocal collision avoidance for nonholonomic mobile robots. In: 2018 15th International conference on control, automation, robotics and vision (ICARCV), pp 371\u2013376. https:\/\/doi.org\/10.1109\/ICARCV.2018.8581239","DOI":"10.1109\/ICARCV.2018.8581239"},{"key":"10739_CR34","unstructured":"Wang RE, Everett M, How JP (2020) R-MADDPG for partially observable environments and limited communication. arXiv preprint arXiv:2002.06684"},{"issue":"4","key":"10739_CR35","doi-asserted-by":"publisher","first-page":"6932","DOI":"10.1109\/LRA.2020.3026638","volume":"5","author":"B Wang","year":"2020","unstructured":"Wang B, Liu Z, Li Q, Prorok A (2020) Mobile robot path planning in dynamic environments through globally guided reinforcement learning. IEEE Robot Autom Lett 5(4):6932\u20136939. https:\/\/doi.org\/10.1109\/LRA.2020.3026638","journal-title":"IEEE Robot Autom Lett"},{"key":"10739_CR36","doi-asserted-by":"publisher","DOI":"10.3390\/machines9040077","author":"M Wang","year":"2021","unstructured":"Wang M, Zeng B, Wang Q (2021) Research on motion planning based on flocking control and reinforcement learning for multi-robot systems. Machines. https:\/\/doi.org\/10.3390\/machines9040077","journal-title":"Machines"},{"issue":"5","key":"10739_CR37","doi-asserted-by":"publisher","first-page":"1163","DOI":"10.1109\/TRO.2016.2593448","volume":"32","author":"J Yu","year":"2016","unstructured":"Yu J, LaValle SM (2016) Optimal multirobot path planning on graphs: complete algorithms and effective heuristics. IEEE Trans Robot 32(5):1163\u20131177. https:\/\/doi.org\/10.1109\/TRO.2016.2593448","journal-title":"IEEE Trans Robot"},{"key":"10739_CR38","unstructured":"Yu C, Velu A, Vinitsky E, Wang Y, Bayen A, Wu Y (2021) The surprising effectiveness of PPO in cooperative, multi-agent games. arXiv preprint arXiv:2103.01955"},{"key":"10739_CR39","unstructured":"Zhou Y, Li S, Garcke J (2021) R-SARL: crowd-aware navigation based deep reinforcement learning for nonholonomic robot in complex environments. arXiv preprint arXiv:2105.13409"}],"container-title":["Artificial Intelligence Review"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10462-024-10739-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10462-024-10739-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10462-024-10739-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,13]],"date-time":"2024-04-13T07:21:16Z","timestamp":1712992876000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10462-024-10739-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,2]]},"references-count":39,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,4]]}},"alternative-id":["10739"],"URL":"https:\/\/doi.org\/10.1007\/s10462-024-10739-w","relation":{},"ISSN":["1573-7462"],"issn-type":[{"value":"1573-7462","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,2]]},"assertion":[{"value":"24 February 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 April 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"108"}}