{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T23:07:31Z","timestamp":1761174451508,"version":"build-2065373602"},"reference-count":48,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T00:00:00Z","timestamp":1761091200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001691","name":"JSPS KAKENHI","doi-asserted-by":"publisher","award":["23K24903"],"award-info":[{"award-number":["23K24903"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Automated collision avoidance is a central topic in multi-agent systems that consist of mobile agents. One simple approach to pursue system-wide performance is a centralized algorithm, which, however, becomes computationally expensive when involving a large number of agents. There have thus been proposed fully distributed collision avoidance algorithms that can naturally handle many-to-many encounter situations. The DSSA+ is one of those algorithms, which is heuristic and incomplete but has lower communication and computation overheads than other counterparts. However, the DSSA+ and some other distributed collision avoidance algorithms basically optimize the agents\u2019 behavior only in the short term, not caring about the total efficiency in their paths. This may result in some agents\u2019 paths with over-deviation or over-stagnation. In this paper, we present Distributed Stochastic Search algorithm with a deep Q-network (DSSQ), in which the agents can generate time-efficient collision-free paths while they learn independently whether to detour or change speeds by Deep Reinforcement Learning. 
A key idea in the learning principle of the DSSQ is to let the agents pursue their individual optimality. We have experimentally confirmed that a sequence of short-term system-optimal solutions found by the DSSA+ gradually becomes long-term individually optimal for every agent.<\/jats:p>","DOI":"10.3390\/a18110671","type":"journal-article","created":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T07:03:51Z","timestamp":1761116631000},"page":"671","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Learning System-Optimal and Individual-Optimal Collision Avoidance Behaviors by Autonomous Mobile Agents"],"prefix":"10.3390","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6656-5819","authenticated-orcid":false,"given":"Katsutoshi","family":"Hirayama","sequence":"first","affiliation":[{"name":"Graduate School of Maritime Sciences, Kobe University, Kobe 658-0022, Hyogo, Japan"}]},{"given":"Kazuma","family":"Gohara","sequence":"additional","affiliation":[{"name":"Graduate School of Maritime Sciences, Kobe University, Kobe 658-0022, Hyogo, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2542-614X","authenticated-orcid":false,"given":"Jinichi","family":"Koue","sequence":"additional","affiliation":[{"name":"Graduate School of Maritime Sciences, Kobe University, Kobe 658-0022, Hyogo, Japan"}]},{"given":"Tenda","family":"Okimoto","sequence":"additional","affiliation":[{"name":"Graduate School of Maritime Sciences, Kobe University, Kobe 658-0022, Hyogo, Japan"}]},{"given":"Donggyun","family":"Kim","sequence":"additional","affiliation":[{"name":"Division of Navigation Science, Mokpo National Maritime University, Mokpo-si 58628, Jeollanam-do, Republic of Korea"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,22]]},"reference":[{"key":"ref_1","unstructured":"Hennes, D., Claes, D., Meeussen, W., and Tuyls, K. (2012, January 4\u20138). 
Multi-Robot Collision Avoidance with Localization Uncertainty. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2012), Valencia, Spain."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1016\/j.ssci.2019.09.018","article-title":"Ship collision avoidance methods: State-of-the-art","volume":"121","author":"Huang","year":"2020","journal-title":"Saf. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Wang, Q., and Phillips, C. (2013, January 26\u201329). Cooperative collision avoidance for multi-vehicle systems using reinforcement learning. Proceedings of the 2013 18th International Conference on Methods Models in Automation Robotics (MMAR), Miedzyzdroje, Poland.","DOI":"10.1109\/MMAR.2013.6669888"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"837","DOI":"10.1109\/TRO.2018.2857475","article-title":"A Survey on Aerial Swarm Robotics","volume":"34","author":"Chung","year":"2018","journal-title":"IEEE Trans. Robot."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Alonso-Mora, J., Breitenmoser, A., Rufli, M., Beardsley, P., and Siegwart, R. (2013). Optimal reciprocal collision avoidance for multiple non-holonomic robots. Distributed Autonomous Robotic Systems: The 10th International Symposium, Springer.","DOI":"10.1007\/978-3-642-32723-0_15"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Kuderer, M., Kretzschmar, H., Sprunk, C., and Burgard, W. (2013). Feature-Based Prediction of Trajectories for Socially Compliant Navigation. Robotics: Science and Systems VIII, The MIT Press.","DOI":"10.15607\/RSS.2012.VIII.025"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Phillips, M., and Likhachev, M. (2011, January 9\u201313). SIPP: Safe interval path planning for dynamic environments. 
Proceedings of the 2011 IEEE International Conference on Robotics and Automation (ICRA-2011), Shanghai, China.","DOI":"10.1109\/ICRA.2011.5980306"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Pradalier, C., Siegwart, R., and Hirzinger, G. (2011). Reciprocal n-Body Collision Avoidance. Robotics Research, Springer.","DOI":"10.1007\/978-3-642-19457-3"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Chen, Y.F., Liu, M., Everett, M., and How, J.P. (June, January 29). Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA-2017), Singapore.","DOI":"10.1109\/ICRA.2017.7989037"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Everett, M., Chen, Y.F., and How, J.P. (2018, January 1\u20135). Motion planning among dynamic, decision-making agents with deep reinforcement learning. Proceedings of the 2018 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS-2018), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8593871"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"856","DOI":"10.1177\/0278364920916531","article-title":"Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios","volume":"39","author":"Fan","year":"2020","journal-title":"Int. J. Robot. Res."},{"key":"ref_12","unstructured":"Felner, A., Stern, R., Shimony, S.E., Boyarski, E., Goldenberg, M., Sharon, G., Sturtevant, N., Wagner, G., and Surynek, P. (2017, January 16\u201317). Search-Based Optimal Solvers for the Multi-Agent Pathfinding Problem: Summary and Challenges. 
Proceedings of the Tenth International Symposium on Combinatorial Search (SoCS-2017), Pittsburgh, PA, USA."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1062","DOI":"10.1177\/0278364917741532","article-title":"Hold or take Optimal Plan (HOOP): A quadratic programming approach to multi-robot trajectory generation","volume":"37","author":"Tang","year":"2018","journal-title":"Int. J. Robot. Res."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"839","DOI":"10.20965\/jaciii.2014.p0839","article-title":"Collision Avoidance in Multiple-Ship Situations by Distributed Local Search","volume":"18","author":"Kim","year":"2014","journal-title":"J. Adv. Comput. Intell. Intell. Informatics"},{"key":"ref_15","first-page":"23","article-title":"Ship Collision Avoidance by Distributed Tabu Search","volume":"9","author":"Kim","year":"2015","journal-title":"TransNav Int. J. Mar. Navig. Saf. Sea Transp."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1017\/S037346331700008X","article-title":"Distributed Stochastic Search Algorithm for Multi-ship Encounter Situations","volume":"70","author":"Kim","year":"2017","journal-title":"J. Navig."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1406","DOI":"10.1109\/TCST.2016.2599485","article-title":"Fast ADMM for Distributed Model Predictive Control of Cooperative Waterborne AGVs","volume":"25","author":"Zheng","year":"2017","journal-title":"IEEE Trans. Control Syst. Technol."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1016\/j.trc.2018.04.013","article-title":"Distributed model predictive control for vessel train formations of cooperative multi-vessel systems","volume":"92","author":"Chen","year":"2018","journal-title":"Transp. Res. Part C Emerg. 
Technol."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1016\/j.ifacol.2018.07.062","article-title":"Intersection Crossing of Cooperative Multi-vessel Systems","volume":"51","author":"Chen","year":"2018","journal-title":"IFAC-PapersOnLine"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ferranti, L., Negenborn, R.R., Keviczky, T., and Alonso-Mora, J. (2018, January 12\u201315). Coordination of Multiple Vessels Via Distributed Nonlinear Model Predictive Control. Proceedings of the 2018 European Control Conference (ECC), Limassol, Cyprus.","DOI":"10.23919\/ECC.2018.8550178"},{"key":"ref_21","first-page":"117","article-title":"DSSA+: Distributed Collision Avoidance Algorithm in an Environment where Both Course and Speed Changes are Allowed","volume":"13","author":"Hirayama","year":"2019","journal-title":"TransNav Int. J. Mar. Navig. Saf. Sea Transp."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1016\/j.oceaneng.2019.03.054","article-title":"Distributed coordination for collision avoidance of multiple ships considering ship maneuverability","volume":"181","author":"Li","year":"2019","journal-title":"Ocean Eng."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1016\/j.ifacol.2022.10.439","article-title":"Collaborative Collision Avoidance for Autonomous Ships Using Informed Scenario-Based Model Predictive Control","volume":"55","author":"Fossen","year":"2022","journal-title":"IFAC-PapersOnLine"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1016\/j.ifacol.2024.07.357","article-title":"Parallel distributed collision avoidance with intention consensus based on ADMM","volume":"58","author":"Tran","year":"2024","journal-title":"IFAC-PapersOnLine"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1016\/j.artint.2004.10.004","article-title":"Distributed stochastic search and distributed breakout: Properties, comparison and 
applications to constraint optimization problems in sensor networks","volume":"161","author":"Zhang","year":"2005","journal-title":"Artif. Intell."},{"key":"ref_26","first-page":"1","article-title":"Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers","volume":"3","author":"Boyd","year":"2011","journal-title":"Found. Trends\u00aeMach. Learn."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_28","unstructured":"CBSMornings (2025, August 22). Synchronized Walking Becomes Staple at Japanese University. Available online: https:\/\/www.youtube.com\/watch?v=uDgEQGsh7Qs."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1016\/j.apor.2019.02.020","article-title":"Automatic collision avoidance of multiple ships based on deep Q-learning","volume":"86","author":"Shen","year":"2019","journal-title":"Appl. Ocean Res."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Godoy, J.E., Karamouzas, I., Guy, S.J., and Gini, M. (2016, January 12\u201317). Implicit coordination in crowded multi-agent navigation. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-2016), Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10131"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1016\/j.robot.2007.09.020","article-title":"Theory and implementation of path planning by negotiation for decentralized agents","volume":"56","author":"Purwin","year":"2008","journal-title":"Robot. Auton. 
Syst."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"899","DOI":"10.1109\/TCST.2016.2594588","article-title":"Distributed Model Predictive Control for Heterogeneous Vehicle Platoons Under Unidirectional Topologies","volume":"25","author":"Zheng","year":"2017","journal-title":"IEEE Trans. Control Syst. Technol."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"7817","DOI":"10.1109\/TITS.2021.3073012","article-title":"Model Predictive Control for Connected Vehicle Platoon Under Switching Communication Topology","volume":"23","author":"Wang","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"3339","DOI":"10.1109\/TITS.2022.3227465","article-title":"Distributed Model Predictive Control for Heterogeneous Vehicle Platoon With Inter-Vehicular Spacing Constraints","volume":"24","author":"Qiang","year":"2023","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"38273","DOI":"10.1109\/JIOT.2024.3445460","article-title":"Efficient Motion Control for Heterogeneous Autonomous Vehicle Platoon Using Multilayer Predictive Control Framework","volume":"11","author":"Du","year":"2024","journal-title":"IEEE Internet Things J."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Hirayama, K., and Yokoo, M. (November, January 29). Distributed Partial Constraint Satisfaction Problem. Proceedings of the Third International Conference on Principles and Practice of Constraint Programming (CP-1997), Linz, Austria.","DOI":"10.1007\/BFb0017442"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Petcu, A., and Faltings, B. (August, January 30). A Scalable Method for Multiagent Constraint Optimization. 
Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-2005), Edinburgh, UK.","DOI":"10.1007\/11600930_71"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1613\/jair.2591","article-title":"Asynchronous Forward Bounding for Distributed COPs","volume":"34","author":"Gershman","year":"2009","journal-title":"J. Artif. Intell. Res."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1281","DOI":"10.1613\/jair.1.15748","article-title":"Collision Avoiding Max-Sum for Mobile Sensor Teams","volume":"79","author":"Pertzovskiy","year":"2024","journal-title":"J. Artif. Intell. Res."},{"key":"ref_40","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, The MIT Press. [2nd ed.]."},{"key":"ref_41","first-page":"279","article-title":"Technical Note: Q-Learning","volume":"8","author":"Watkins","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Van Hasselt, H., Guez, A., and Silver, D. (2016, January 12\u201317). Deep reinforcement learning with double q-learning. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-2016), Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10295"},{"key":"ref_43","unstructured":"Schaul, T., Quan, J., Antonoglou, I., and Silver, D. (2015). Prioritized experience replay. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"De Asis, K., Hernandez-Garcia, J.F., Holland, G.Z., and Sutton, R.S. (2018, January 2\u20137). Multi-step reinforcement learning: A unifying algorithm. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-2018), New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.11631"},{"key":"ref_45","unstructured":"Wang, Z., Schaul, T., Hessel, M., Hasselt, H., Lanctot, M., and Freitas, N. (2016, January 19\u201324). Dueling network architectures for deep reinforcement learning. 
Proceedings of the 33rd International Conference on Machine Learning (ICML-2016), New York, NY, USA."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hessel, M., Modayil, J., Van Hasselt, H., Schaul, T., Ostrovski, G., Dabney, W., Horgan, D., Piot, B., Azar, M., and Silver, D. (2018, January 2\u20137). Rainbow: Combining improvements in deep reinforcement learning. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-2018), New Orleans, LA, USA.","DOI":"10.1609\/aaai.v32i1.11796"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"106821","DOI":"10.1016\/j.jfranklin.2024.106821","article-title":"Hierarchical path planner combining probabilistic roadmap and deep deterministic policy gradient for unmanned ground vehicles with non-holonomic constraints","volume":"361","author":"Fan","year":"2024","journal-title":"J. Frankl. Inst."},{"key":"ref_48","first-page":"16509","article-title":"Multi-agent reinforcement learning is a sequence modeling problem","volume":"35","author":"Wen","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/11\/671\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T07:07:52Z","timestamp":1761116872000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/11\/671"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,22]]},"references-count":48,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["a18110671"],"URL":"https:\/\/doi.org\/10.3390\/a18110671","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,22]]}}}