{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T23:28:25Z","timestamp":1777505305169,"version":"3.51.4"},"reference-count":52,"publisher":"Springer Science and Business Media LLC","issue":"8","license":[{"start":{"date-parts":[[2023,8,19]],"date-time":"2023-08-19T00:00:00Z","timestamp":1692403200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,8,19]],"date-time":"2023-08-19T00:00:00Z","timestamp":1692403200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100007297","name":"Office of Naval Research Global","doi-asserted-by":"publisher","award":["NICOP-grant N62909-19-1-2027"],"award-info":[{"award-number":["NICOP-grant N62909-19-1-2027"]}],"id":[{"id":"10.13039\/100007297","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Auton Robot"],"published-print":{"date-parts":[[2023,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Decentralized multi-robot systems typically perform coordinated motion planning by constantly broadcasting their intentions to avoid collisions. However, the risk of collision between robots varies as they move and communication may not always be needed. This paper presents an efficient communication method that addresses the problem of \u201cwhen\u201d and \u201cwith whom\u201d to communicate in multi-robot collision avoidance scenarios. In this approach, each robot learns to reason about other robots\u2019 states and considers the risk of future collisions before asking for the trajectory plans of other robots. We introduce a new neural architecture for the learned communication policy which allows our method to be scalable. 
We evaluate and verify the proposed communication strategy in simulation with up to twelve quadrotors, and present results on the zero-shot generalization\/robustness capabilities of the policy in different scenarios. We demonstrate that our policy (learned in a simulated environment) can be successfully transferred to real robots.\n<\/jats:p>","DOI":"10.1007\/s10514-023-10127-3","type":"journal-article","created":{"date-parts":[[2023,8,19]],"date-time":"2023-08-19T10:06:59Z","timestamp":1692439619000},"page":"1275-1297","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Learning scalable and efficient communication policies for multi-robot collision avoidance"],"prefix":"10.1007","volume":"47","author":[{"given":"\u00c1lvaro","family":"Serra-G\u00f3mez","sequence":"first","affiliation":[]},{"given":"Hai","family":"Zhu","sequence":"additional","affiliation":[]},{"given":"Bruno","family":"Brito","sequence":"additional","affiliation":[]},{"given":"Wendelin","family":"B\u00f6hmer","sequence":"additional","affiliation":[]},{"given":"Javier","family":"Alonso-Mora","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,8,19]]},"reference":[{"key":"10127_CR1","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1111\/j.1467-8640.2008.01329.x","volume":"25","author":"R Becker","year":"2009","unstructured":"Becker, R., Carlin, A., Lesser, V., & Zilberstein, S. (2009). Analyzing myopic approaches for multi-agent communication. Computational Intelligence, 25, 31\u201350. https:\/\/doi.org\/10.1111\/j.1467-8640.2008.01329.x","journal-title":"Computational Intelligence"},{"key":"10127_CR2","doi-asserted-by":"crossref","unstructured":"Bengio, Y., Louradour, J., Collobert, R., & Weston, J. (2009). Curriculum learning. In Proceedings of the 26th annual international conference on machine learning (pp. 
41\u201348).","DOI":"10.1145\/1553374.1553380"},{"key":"10127_CR3","doi-asserted-by":"publisher","DOI":"10.1287\/moor.27.4.819.297","author":"D Bernstein","year":"2002","unstructured":"Bernstein, D., Givan, R., Immerman, N., & Zilberstein, S. (2002). The complexity of decentralized control of Markov decision processes. Mathematics of Operations Research. https:\/\/doi.org\/10.1287\/moor.27.4.819.297","journal-title":"Mathematics of Operations Research"},{"key":"10127_CR4","doi-asserted-by":"publisher","unstructured":"Best, G., Forrai, M., Mettu, R. R., & Fitch, R. (2018). Planning-aware communication for decentralised multi-robot coordination. In 2018 IEEE international conference on robotics and automation (ICRA) (pp. 1050\u20131057). https:\/\/doi.org\/10.1109\/ICRA.2018.8460617","DOI":"10.1109\/ICRA.2018.8460617"},{"key":"10127_CR5","doi-asserted-by":"crossref","unstructured":"Brito, B., Everett, M., How, J. P., & Alonso-Mora, J. (2021). Where to go next: Learning a subgoal recommendation policy for navigation among pedestrians. arXiv:2102.13073","DOI":"10.1109\/LRA.2021.3068662"},{"key":"10127_CR6","unstructured":"Das, A., Gervet, T., Romoff, J., Batra, D., Parikh, D., Rabbat, M., & Pineau, J. (2019). TarMAC: Targeted multi-agent communication. In 36th international conference on machine learning, ICML 2019."},{"key":"10127_CR7","unstructured":"Ding, Z., Huang, T., & Lu, Z. (2020). Learning individually inferred communication for multi-agent cooperation. arXiv:2006.06455"},{"key":"10127_CR8","unstructured":"Domahidi, A., & Jerez, J. (2014). Forces professional. embotech gmbh http:\/\/embotech.com\/forces-pro"},{"key":"10127_CR9","doi-asserted-by":"publisher","unstructured":"Everett, M., Chen, Y. F., & How, J. P. (2018). Motion planning among dynamic, decision-making agents with deep reinforcement learning. In IEEE international conference on intelligent robots and systems (pp. 3052\u20133059). 
https:\/\/github.com\/mfe7\/cadrl_ros https:\/\/doi.org\/10.1109\/IROS.2018.8593871","DOI":"10.1109\/IROS.2018.8593871"},{"key":"10127_CR10","unstructured":"Everett, M., Chen, Y. F., & How, J. P. (2019). Collision avoidance in pedestrian-rich environments with deep reinforcement learning. arXiv:1910.11689."},{"issue":"7","key":"10127_CR11","doi-asserted-by":"publisher","first-page":"856","DOI":"10.1177\/0278364920916531","volume":"39","author":"T Fan","year":"2020","unstructured":"Fan, T., Long, P., Liu, W., & Pan, J. (2020). Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios. The International Journal of Robotics Research, 39(7), 856\u2013892. https:\/\/doi.org\/10.1177\/0278364920916531","journal-title":"The International Journal of Robotics Research"},{"key":"10127_CR12","unstructured":"Foerster, J. N., Assael, Y. M., de Freitas, N., & Whiteson, S. (2016). Learning to communicate with deep multi-agent reinforcement learning. In Proceedings of the 30th international conference on neural information processing systems (pp.\u00a02145\u20132153). Curran Associates Inc."},{"key":"10127_CR13","doi-asserted-by":"crossref","unstructured":"Foerster, J. N., Farquhar, G., Afouras, T., Nardelli, N., & Whiteson, S. (2018). Counterfactual multi-agent policy gradients. In Proceedings of the thirty-second AAAI conference on artificial intelligence and thirtieth innovative applications of artificial intelligence conference and eighth AAAI symposium on educational advances in artificial intelligence. AAAI Press.","DOI":"10.1609\/aaai.v32i1.11794"},{"key":"10127_CR14","doi-asserted-by":"publisher","first-page":"1034","DOI":"10.1109\/TSP.2018.2887403","volume":"67","author":"F Gama","year":"2019","unstructured":"Gama, F., Marques, A., Leus, G., & Ribeiro, A. (2019). Convolutional neural network architectures for signals supported on graphs. 
IEEE Transactions on Signal Processing, 67, 1034\u20131049.","journal-title":"IEEE Transactions on Signal Processing"},{"key":"10127_CR15","doi-asserted-by":"publisher","unstructured":"Gupta, J. K., Egorov, M., & Kochenderfer, M. (2017). Cooperative multi-agent control using deep reinforcement learning. LNAI; Technical report (Vol. 10642). https:\/\/doi.org\/10.1007\/978-3-319-71682-4_5","DOI":"10.1007\/978-3-319-71682-4_5"},{"key":"10127_CR16","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9, 1735\u201380. https:\/\/doi.org\/10.1162\/neco.1997.9.8.1735","journal-title":"Neural Computation"},{"key":"10127_CR17","unstructured":"Iqbal, S., & Sha, F. (2019). Actor-attention-critic for multi-agent reinforcement learning. In ICML."},{"key":"10127_CR18","unstructured":"Jiang, J., & Lu, Z. (2018). Learning attentional communication for multi-agent cooperation. In Advances in neural information processing systems."},{"key":"10127_CR19","doi-asserted-by":"publisher","unstructured":"Kamel, M., Alonso-Mora, J., Siegwart, R., & Nieto, J. (2017). Robust collision avoidance for multiple micro aerial vehicles using nonlinear model predictive control. In 2017 IEEE\/RSJ international conference on intelligent robots and systems (IROS) (pp. 236\u2013243). IEEE. https:\/\/doi.org\/10.1109\/IROS.2017.8202163","DOI":"10.1109\/IROS.2017.8202163"},{"key":"10127_CR20","doi-asserted-by":"publisher","unstructured":"Kassir, A., Fitch, R., & Sukkarieh, S. (2016). Communication-efficient motion coordination and data fusion in information gathering teams. In IEEE international conference on intelligent robots and systems (Vol. 2016-November, pp. 5258\u20135265). Institute of Electrical and Electronics Engineers Inc. 
https:\/\/doi.org\/10.1109\/IROS.2016.7759773","DOI":"10.1109\/IROS.2016.7759773"},{"key":"10127_CR21","unstructured":"Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. arXiv:1412.6980"},{"key":"10127_CR22","unstructured":"Kurin, V., Igl, M., Rockt\u00e4schel, T., Boehmer, W., & Whiteson, S. (2020). My body is a cage: The role of morphology in graph-based incompatible control. arXiv:2010.01856."},{"key":"10127_CR23","doi-asserted-by":"crossref","unstructured":"Li, Q., Gama, F., Ribeiro, A., & Prorok, A. (2020a). Graph neural networks for decentralized multi-robot path planning. In 2020 IEEE\/RSJ international conference on intelligent robots and systems (IROS) (pp. 11785\u201311792).","DOI":"10.1109\/IROS45743.2020.9341668"},{"key":"10127_CR24","doi-asserted-by":"crossref","unstructured":"Li, Q., Lin, W., Liu, Z., & Prorok, A. (2020b). Message-aware graph attention networks for large-scale multi-robot path planning. arXiv:2011.13219.","DOI":"10.1109\/LRA.2021.3077863"},{"key":"10127_CR25","unstructured":"Liang, E., Liaw, R., Nishihara, R., Moritz, P., Fox, R., Gonzalez, J., Goldberg, K., & Stoica, I. (2017). Ray rllib: A composable and scalable reinforcement learning library. arXiv:1712.09381."},{"key":"10127_CR26","unstructured":"Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., & Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in neural information processing systems (Vol. 2017-Decem, pp. 6380\u20136391)."},{"issue":"2","key":"10127_CR27","doi-asserted-by":"publisher","first-page":"604","DOI":"10.1109\/LRA.2020.2964159","volume":"5","author":"CE Luis","year":"2020","unstructured":"Luis, C. E., Vukosavljev, M., & Schoellig, A. P. (2020). Online trajectory generation with distributed model predictive control for multi-robot motion planning. IEEE Robotics and Automation Letters, 5(2), 604\u2013611. 
https:\/\/doi.org\/10.1109\/LRA.2020.2964159","journal-title":"IEEE Robotics and Automation Letters"},{"key":"10127_CR28","doi-asserted-by":"crossref","unstructured":"Mordatch, I., & Abbeel, P. (2018). Emergence of grounded compositional language in multi-agent populations. In 32nd AAAI conference on artificial intelligence, AAAI 2018 (pp. 1495\u20131502).","DOI":"10.1609\/aaai.v32i1.11492"},{"key":"10127_CR29","unstructured":"Moritz, P., Nishihara, R., Wang, S., Tumanov, A., Liaw, R., Liang, E., Elibol, M., Yang, Z., Paul, W., Jordan, M. I., & Stoica, I. (2018). Ray: A distributed framework for emerging ai applications. In Proceedings of the 13th usenix conference on operating systems design and implementation (pp.\u00a0561\u2013577). USA: USENIX Association."},{"key":"10127_CR30","doi-asserted-by":"publisher","unstructured":"Rahmattalabi, A., Chung, J. J., Colby, M., & Tumer, K. (2016). D++: Structural credit assignment in tightly coupled multiagent domains. In 2016 IEEE\/RSJ international conference on intelligent robots and systems (IROS) (Vol. 2016- November, pp. 5258\u20135265). https:\/\/doi.org\/10.1109\/IROS.2016.7759651","DOI":"10.1109\/IROS.2016.7759651"},{"key":"10127_CR31","unstructured":"Rashid, T., Samvelyan, M., Witt, C. S. D., Farquhar, G., Foerster, J. N., & Whiteson, S. (2018). Qmix: Monotonic value function factorisation for deep multi-agent reinforcement learning. arXiv:1803.11485."},{"key":"10127_CR32","doi-asserted-by":"crossref","unstructured":"Roth, M., Simmons, R., & Veloso, M. (2005). Reasoning about joint beliefs for execution-time communication decisions. Technical report.","DOI":"10.1145\/1082473.1082593"},{"key":"10127_CR33","unstructured":"Schulman, J., Moritz, P., Levine, S., Jordan, M. I., & Abbeel, P. (2016). High-dimensional continuous control using generalized advantage estimation. arXiv:1506.02438"},{"key":"10127_CR34","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). 
Proximal policy optimization algorithms. arXiv:1707.06347."},{"key":"10127_CR35","doi-asserted-by":"crossref","unstructured":"Serra-G\u00f3mez, A., Brito, B., Zhu, H., Chung, J. J., & Alonso-Mora, J. (2020). With whom to communicate: Learning efficient communication for multi-robot collision avoidance. In 2020 IEEE\/RSJ international conference on intelligent robots and systems (IROS) (Vol. 2016- November, pp. 5258\u20135265). IEEE.","DOI":"10.1109\/IROS45743.2020.9341762"},{"key":"10127_CR36","unstructured":"Son, K., Kim, D., Kang, W. J., Hostallero, D. E., & Yi, Y. (2019). Qtran: Learning to factorize with transformation for cooperative multi-agent reinforcement learning. arXiv:1905.05408"},{"key":"10127_CR37","unstructured":"Sukhbaatar, S., Szlam, A., & Fergus, R. (2016). Learning multiagent communication with backpropagation. In Advances in neural information processing systems, NIPS (pp. 2252\u20132260)."},{"key":"10127_CR38","doi-asserted-by":"crossref","unstructured":"Sun, C., Shen, M., & How, J. P. (2020). Scaling up multiagent reinforcement learning for robotic systems: Learn an adaptive sparse communication graph. In 2020 IEEE\/RSJ international conference on intelligent robots and systems (IROS) (pp. 11755\u201311762).","DOI":"10.1109\/IROS45743.2020.9341303"},{"key":"10127_CR39","unstructured":"Sunehag, P., Lever, G., Gruslys, A., Czarnecki, W., Zambaldi, V., Jaderberg, M., Lanctot, M., Sonnerat, N., Leibo, J. Z., Tuyls, K., & Graepel, T. (2018). Value-decomposition networks for cooperative multi-agent learning. arXiv:1706.05296"},{"key":"10127_CR40","doi-asserted-by":"publisher","first-page":"eabf1416","DOI":"10.1126\/scirobotics.abf1416","volume":"56","author":"MS Talamali","year":"2021","unstructured":"Talamali, M. S., Saha, A., Marshall, J. A. R., & Reina, A. (2021). When less is more: Robot swarms adapt better to changes with constrained communication. 
Science Robotics, 56, eabf1416.","journal-title":"Science Robotics"},{"key":"10127_CR41","doi-asserted-by":"publisher","unstructured":"Van Den Berg, J., Guy, S. J., Lin, M., & Manocha, D. (2011). Reciprocal n-body collision avoidance. In Springer tracts in advanced robotics (Vol. 70, pp. 3\u201319). https:\/\/doi.org\/10.1007\/978-3-642-19457-3_1","DOI":"10.1007\/978-3-642-19457-3_1"},{"key":"10127_CR42","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. arXiv:1706.03762"},{"issue":"3","key":"10127_CR43","doi-asserted-by":"publisher","first-page":"661","DOI":"10.1109\/TRO.2017.2659727","volume":"33","author":"L Wang","year":"2017","unstructured":"Wang, L., Ames, A. D., & Egerstedt, M. (2017). Safety barrier certificates for collisions-free multirobot systems. IEEE Transactions on Robotics, 33(3), 661\u2013674. https:\/\/doi.org\/10.1109\/TRO.2017.2659727","journal-title":"IEEE Transactions on Robotics"},{"key":"10127_CR44","unstructured":"Wang, R. E., Kew, J., Lee, D., Lee, T., Zhang, T., Ichter, B., Tan, J., & Faust, A. (2020). Model-based reinforcement learning for decentralized multiagent rendezvous. Multiagent Systems."},{"key":"10127_CR45","doi-asserted-by":"publisher","unstructured":"Wheeler, T., Bharathi, E., & Gil, S. (2019). Switching topology for resilient consensus using wi-fi signals. In 2019 international conference on robotics and automation (ICRA) (pp. 2018\u20132024). https:\/\/doi.org\/10.1109\/ICRA.2019.8793788","DOI":"10.1109\/ICRA.2019.8793788"},{"key":"10127_CR46","doi-asserted-by":"crossref","unstructured":"Yongjie, Y., & Yan, Z. (2009). Collision avoidance planning in multi-robot based on improved artificial potential field and rules. 2009 IEEE international conference on robotics and biomimetics (robio) (pp. 1026\u20131031). 
IEEE.","DOI":"10.1109\/ROBIO.2009.4913141"},{"issue":"4","key":"10127_CR47","doi-asserted-by":"publisher","first-page":"8379","DOI":"10.1109\/LRA.2021.3102636","volume":"6","author":"Y Zhai","year":"2021","unstructured":"Zhai, Y., Ding, B., Liu, X., Jia, H., Zhao, Y., & Luo, J. (2021). Decentralized multi-robot collision avoidance in complex scenarios with selective communication. IEEE Robotics and Automation Letters, 6(4), 8379\u20138386. https:\/\/doi.org\/10.1109\/LRA.2021.3102636","journal-title":"IEEE Robotics and Automation Letters"},{"issue":"2","key":"10127_CR48","doi-asserted-by":"publisher","first-page":"1047","DOI":"10.1109\/LRA.2017.2656241","volume":"2","author":"D Zhou","year":"2017","unstructured":"Zhou, D., Wang, Z., Bandyopadhyay, S., & Schwager, M. (2017). Fast, on-line collision avoidance for dynamic vehicles using buffered voronoi cells. IEEE Robotics and Automation Letters, 2(2), 1047\u20131054. https:\/\/doi.org\/10.1109\/LRA.2017.2656241","journal-title":"IEEE Robotics and Automation Letters"},{"key":"10127_CR49","doi-asserted-by":"crossref","unstructured":"Zhu, H., & Alonso-Mora, J. (2019a). B-uavc: Buffered uncertainty-aware voronoi cells for probabilistic multi-robot collision avoidance. In 2019 international symposium on multi-robot and multi-agent systems (MRS) (pp. 162\u2013168).","DOI":"10.1109\/MRS.2019.8901092"},{"issue":"2","key":"10127_CR50","doi-asserted-by":"publisher","first-page":"776","DOI":"10.1109\/LRA.2019.2893494","volume":"4","author":"H Zhu","year":"2019","unstructured":"Zhu, H., & Alonso-Mora, J. (2019b). Chance-constrained collision avoidance for mavs in dynamic environments. IEEE Robotics and Automation Letters, 4(2), 776\u2013783. 
https:\/\/doi.org\/10.1109\/LRA.2019.2893494","journal-title":"IEEE Robotics and Automation Letters"},{"issue":"2","key":"10127_CR51","doi-asserted-by":"publisher","first-page":"2256","DOI":"10.1109\/LRA.2021.3061073","volume":"6","author":"H Zhu","year":"2021","unstructured":"Zhu, H., Claramunt, F. M., Brito, B., & Alonso-Mora, J. (2021). Learning interaction-aware trajectory predictions for decentralized multi-robot motion planning in dynamic environments. IEEE Robotics and Automation Letters, 6(2), 2256\u20132263. https:\/\/doi.org\/10.1109\/LRA.2021.3061073","journal-title":"IEEE Robotics and Automation Letters"},{"key":"10127_CR52","doi-asserted-by":"publisher","unstructured":"Zhu, H., Juhl, J., Ferranti, L., & Alonso-Mora, J. (2019). Distributed multi-robot formation splitting and merging in dynamic environments. In 2019 IEEE international conference on robotics and automation (ICRA) (pp. 9080\u20139086). IEEE. https:\/\/doi.org\/10.1109\/ICRA.2019.8793765","DOI":"10.1109\/ICRA.2019.8793765"}],"container-title":["Autonomous 
Robots"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10514-023-10127-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10514-023-10127-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10514-023-10127-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,28]],"date-time":"2023-11-28T18:08:51Z","timestamp":1701194931000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10514-023-10127-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,19]]},"references-count":52,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2023,12]]}},"alternative-id":["10127"],"URL":"https:\/\/doi.org\/10.1007\/s10514-023-10127-3","relation":{},"ISSN":["0929-5593","1573-7527"],"issn-type":[{"value":"0929-5593","type":"print"},{"value":"1573-7527","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,19]]},"assertion":[{"value":"28 July 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 July 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 August 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}