{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T00:41:06Z","timestamp":1775090466065,"version":"3.50.1"},"reference-count":109,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T00:00:00Z","timestamp":1672272000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T00:00:00Z","timestamp":1672272000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000741","name":"University of Warwick","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100000741","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2023,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In artificial multi-agent systems, the ability to learn collaborative policies is predicated upon the agents\u2019 communication skills: they must be able to encode the information received from the environment and learn how to share it with other agents as required by the task at hand. We present a deep reinforcement learning approach, Connectivity Driven Communication (CDC), that facilitates the emergence of multi-agent collaborative behaviour only through experience. The agents are modelled as nodes of a weighted graph whose state-dependent edges encode pair-wise messages that can be exchanged. We introduce a graph-dependent attention mechanism that controls how the agents\u2019 incoming messages are weighted. 
This mechanism takes full account of the current state of the system as represented by the graph, and builds upon a diffusion process that captures how the information flows on the graph. The graph topology is not assumed to be known a priori, but depends dynamically on the agents\u2019 observations, and is learnt concurrently with the attention mechanism and policy in an end-to-end fashion. Our empirical results show that CDC is able to learn effective collaborative policies and can outperform competing learning algorithms on cooperative navigation tasks.<\/jats:p>","DOI":"10.1007\/s10994-022-06286-6","type":"journal-article","created":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T18:02:38Z","timestamp":1672336958000},"page":"483-514","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Learning multi-agent coordination through connectivity-driven communication"],"prefix":"10.1007","volume":"112","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0314-8057","authenticated-orcid":false,"given":"Emanuele","family":"Pesce","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Giovanni","family":"Montana","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,12,29]]},"reference":[{"key":"6286_CR1","unstructured":"Agarwal, A., Kumar, S., & Sycara, K. (2019). Learning transferable cooperative behavior in multi-agent teams. arXiv preprint arXiv:1906.01202."},{"key":"6286_CR2","unstructured":"Agogino, A. K., & Tumer, K. (2004). Unifying temporal and structural credit assignment problems. In AAMAS (Vol. 4, pp. 980\u2013987)."},{"key":"6286_CR3","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1016\/j.artint.2018.01.002","volume":"258","author":"SV Albrecht","year":"2018","unstructured":"Albrecht, S. V., & Stone, P. (2018). 
Autonomous agents modelling other agents: A comprehensive survey and open problems. Artificial Intelligence, 258, 66\u201395.","journal-title":"Artificial Intelligence"},{"issue":"3","key":"6286_CR4","doi-asserted-by":"publisher","first-page":"970","DOI":"10.1137\/09074721X","volume":"31","author":"AH Al-Mohy","year":"2009","unstructured":"Al-Mohy, A. H., & Higham, N. J. (2009). A new scaling and squaring algorithm for the matrix exponential. SIAM Journal on Matrix Analysis and Applications, 31(3), 970\u2013989.","journal-title":"SIAM Journal on Matrix Analysis and Applications"},{"issue":"6","key":"6286_CR5","doi-asserted-by":"publisher","first-page":"926","DOI":"10.1109\/70.736776","volume":"14","author":"T Balch","year":"1998","unstructured":"Balch, T., & Arkin, R. C. (1998). Behavior-based formation control for multirobot teams. IEEE Transactions on Robotics and Automation, 14(6), 926\u2013939.","journal-title":"IEEE Transactions on Robotics and Automation"},{"issue":"4","key":"6286_CR6","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1016\/j.socnet.2007.04.002","volume":"29","author":"P Bonacich","year":"2007","unstructured":"Bonacich, P. (2007). Some unique properties of eigenvector centrality. Social Networks, 29(4), 555\u2013564.","journal-title":"Social Networks"},{"key":"6286_CR7","doi-asserted-by":"crossref","unstructured":"Breazeal, C., Kidd, C. D., Thomaz, A. L., Hoffman, G., & Berlin, M. (2005). Effects of nonverbal communication on efficiency and robustness in human-robot teamwork. In 2005 IEEE\/RSJ international conference on intelligent robots and systems (pp. 708\u2013713). IEEE.","DOI":"10.1109\/IROS.2005.1545011"},{"key":"6286_CR8","volume-title":"Spectra of graphs","author":"AE Brouwer","year":"2011","unstructured":"Brouwer, A. E., & Haemers, W. H. (2011). Spectra of graphs. Springer."},{"key":"6286_CR9","doi-asserted-by":"crossref","unstructured":"Brunet, C.-A., Gonzalez-Rubio, R., & Tetreault, M. (1995). 
A multi-agent architecture for a driver model for autonomous road vehicles. In Proceedings 1995 Canadian conference on electrical and computer engineering (Vol. 2, pp. 772\u2013775). IEEE.","DOI":"10.1109\/CCECE.1995.526409"},{"issue":"2","key":"6286_CR10","doi-asserted-by":"publisher","first-page":"156","DOI":"10.1109\/TSMCC.2007.913919","volume":"38","author":"L Busoniu","year":"2008","unstructured":"Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38(2), 156\u2013172.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)"},{"issue":"3","key":"6286_CR11","doi-asserted-by":"publisher","first-page":"268","DOI":"10.1016\/j.enganabound.2004.12.001","volume":"29","author":"AH-D Cheng","year":"2005","unstructured":"Cheng, A.H.-D., & Cheng, D. T. (2005). Heritage and early history of the boundary element method. Engineering Analysis with Boundary Elements, 29(3), 268\u2013302.","journal-title":"Engineering Analysis with Boundary Elements"},{"issue":"12","key":"6286_CR12","doi-asserted-by":"publisher","first-page":"4195","DOI":"10.1007\/s10489-020-01755-8","volume":"50","author":"H Chen","year":"2020","unstructured":"Chen, H., Liu, Y., Zhou, Z., Hu, D., & Zhang, M. (2020). Gama: Graph attention multi-agent reinforcement learning algorithm for cooperation. Applied Intelligence, 50(12), 4195\u20134205.","journal-title":"Applied Intelligence"},{"key":"6286_CR13","unstructured":"Chung, F. R., & Graham, F. C. (1997). Spectral graph theory. American Mathematical Society."},{"key":"6286_CR14","doi-asserted-by":"crossref","unstructured":"Chung, A. W., Pesce, E., Monti, R. P., & Montana, G. (2016a). Classifying hcp task-fmri networks using heat kernels. In 2016 International workshop on pattern recognition in neuroimaging (PRNI) (pp. 1\u20134). 
IEEE.","DOI":"10.1109\/PRNI.2016.7552339"},{"key":"6286_CR15","doi-asserted-by":"publisher","first-page":"490","DOI":"10.1016\/j.neuroimage.2016.07.006","volume":"141","author":"AW Chung","year":"2016","unstructured":"Chung, A. W., Schirmer, M., Krishnan, M. L., Ball, G., Aljabar, P., Edwards, A. D., & Montana, G. (2016b). Characterising brain network topologies: A dynamic analysis approach using heat kernels. Neuroimage, 141, 490\u2013501.","journal-title":"Neuroimage"},{"key":"6286_CR16","unstructured":"Cvetkovic, D. M. (1980). Spectra of graphs. Theory and Application."},{"key":"6286_CR17","unstructured":"Das, A., Gervet, T., Romoff, J., Batra, D., Parikh, D., Rabbat, M., & Pineau, J. (2018). Tarmac: Targeted multi-agent communication. arXiv preprint arXiv:1810.11187."},{"key":"6286_CR18","unstructured":"Degris, T., White, M., & Sutton, R. S. (2012). Off-policy actor-critic. arXiv preprint arXiv:1205.4839."},{"issue":"4","key":"6286_CR19","doi-asserted-by":"publisher","first-page":"1292","DOI":"10.1257\/aer.98.4.1292","volume":"98","author":"S Demichelis","year":"2008","unstructured":"Demichelis, S., & Weibull, J. W. (2008). Language, meaning, and games: A model of communication, coordination, and evolution. American Economic Review, 98(4), 1292\u20131311.","journal-title":"American Economic Review"},{"key":"6286_CR20","doi-asserted-by":"crossref","unstructured":"Dresner, K., & Stone, P. (2004). Multiagent traffic management: A reservation-based intersection control mechanism. In: Proceedings of the third international joint conference on autonomous agents and multiagent systems (Vol. 2, pp. 530\u2013537). IEEE Computer Society.","DOI":"10.1145\/1082473.1082545"},{"issue":"1","key":"6286_CR21","doi-asserted-by":"publisher","first-page":"57","DOI":"10.4064\/-25-1-57-70","volume":"25","author":"M Fiedler","year":"1989","unstructured":"Fiedler, M. (1989). Laplacian of graphs and algebraic connectivity. 
Banach Center Publications, 25(1), 57\u201370.","journal-title":"Banach Center Publications"},{"key":"6286_CR22","unstructured":"Foerster, J., Assael, I. A., de Freitas, N., & Whiteson, S. (2016). Learning to communicate with deep multi-agent reinforcement learning. In: Advances in neural information processing systems (pp. 2137\u20132145)."},{"key":"6286_CR23","doi-asserted-by":"crossref","unstructured":"Foerster, J., Farquhar, G., Afouras, T., Nardelli, N., & Whiteson, S. (2017). Counterfactual multi-agent policy gradients. arXiv preprint arXiv:1705.08926.","DOI":"10.1609\/aaai.v32i1.11794"},{"issue":"3","key":"6286_CR24","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1023\/A:1008937911390","volume":"8","author":"D Fox","year":"2000","unstructured":"Fox, D., Burgard, W., Kruppa, H., & Thrun, S. (2000). A probabilistic approach to collaborative multi-robot localization. Autonomous Robots, 8(3), 325\u2013344.","journal-title":"Autonomous Robots"},{"key":"6286_CR25","doi-asserted-by":"publisher","first-page":"65","DOI":"10.3389\/frobt.2018.00065","volume":"5","author":"N Gildert","year":"2018","unstructured":"Gildert, N., Millard, A. G., Pomfret, A., & Timmis, J. (2018). The need for combining implicit and explicit communication in cooperative robotic systems. Frontiers in Robotics and AI, 5, 65.","journal-title":"Frontiers in Robotics and AI"},{"key":"6286_CR26","unstructured":"Grupen, N. A., Lee, D. D., & Selman, B. (2022). Multi-agent curricula and emergent implicit signaling. In Proceedings of the 21st international conference on autonomous agents and multiagent systems (pp. 553\u2013561)."},{"key":"6286_CR27","unstructured":"Guestrin, C., Koller, D., & Parr, R. (2002). Multiagent planning with factored mdps. In Advances in neural information processing systems (pp. 1523\u20131530)."},{"key":"6286_CR28","doi-asserted-by":"crossref","unstructured":"Hagberg, A., Swart, P., & Chult, D.S. (2008). 
Exploring network structure, dynamics, and function using network. Technical report, Los Alamos National Lab.(LANL), Los Alamos, NM (United States).","DOI":"10.25080\/TCWV9851"},{"key":"6286_CR29","doi-asserted-by":"publisher","DOI":"10.1075\/ais.4","volume-title":"Communication in humans and other animals","author":"G H\u00e5kansson","year":"2013","unstructured":"H\u00e5kansson, G., & Westander, J. (2013). Communication in humans and other animals. John Benjamins."},{"issue":"1","key":"6286_CR30","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1109\/JSYST.2007.901641","volume":"1","author":"A Harati","year":"2007","unstructured":"Harati, A., Ahmadabadi, M. N., & Araabi, B. N. (2007). Knowledge-based multiagent credit assignment: a study on task type and critic information. IEEE Systems Journal, 1(1), 55\u201367.","journal-title":"IEEE Systems Journal"},{"key":"6286_CR31","unstructured":"Hernandez-Leal, P., Kaisers, M., Baarslag, T., & de Cote, E. M. (2017). A survey of learning in multiagent environments: Dealing with non-stationarity. arXiv preprint arXiv:1707.09183."},{"issue":"6","key":"6286_CR32","doi-asserted-by":"publisher","first-page":"750","DOI":"10.1007\/s10458-019-09421-1","volume":"33","author":"P Hernandez-Leal","year":"2019","unstructured":"Hernandez-Leal, P., Kartal, B., & Taylor, M. E. (2019). A survey and critique of multiagent deep reinforcement learning. Autonomous Agents and Multi-Agent Systems, 33(6), 750\u2013797.","journal-title":"Autonomous Agents and Multi-Agent Systems"},{"issue":"8","key":"6286_CR33","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735\u20131780.","journal-title":"Neural Computation"},{"key":"6286_CR34","unstructured":"Hoshen, Y. (2017). Vain: Attentional multi-agent predictive modeling. 
In Advances in neural information processing systems (pp. 2701\u20132711)."},{"key":"6286_CR35","doi-asserted-by":"crossref","unstructured":"Huang, Y., Bi, H., Li, Z., Mao, T., & Wang, Z. (2019). Stgat: Modeling spatial-temporal interactions for human trajectory prediction. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 6272\u20136281).","DOI":"10.1109\/ICCV.2019.00637"},{"key":"6286_CR36","unstructured":"Iqbal, S., & Sha, F. (2019). Actor-attention-critic for multi-agent reinforcement learning. ICML."},{"key":"6286_CR37","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15612-0","volume-title":"Innovations in agent-based complex automated negotiations","author":"T It\u014d","year":"2011","unstructured":"It\u014d, T., Zhang, M., Robu, V., Fatima, S., Matsuo, T., & Yamaki, H. (2011). Innovations in agent-based complex automated negotiations. Springer."},{"key":"6286_CR38","doi-asserted-by":"crossref","unstructured":"Jia, J., Schaub, M. T., Segarra, S., & Benson, A. R. (2019). Graph-based semi-supervised & active learning for edge flows. In Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 761\u2013771).","DOI":"10.1145\/3292500.3330872"},{"key":"6286_CR39","unstructured":"Jiang, J., & Lu, Z. (2018). Learning attentional communication for multi-agent cooperation. arXiv preprint arXiv:1805.07733."},{"key":"6286_CR40","unstructured":"Jiang, J., Dun, C., Huang, T., & Lu, Z. (2018). Graph convolutional reinforcement learning. arXiv preprint arXiv:1810.09202."},{"issue":"10","key":"6286_CR41","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1145\/2347736.2347753","volume":"55","author":"M Kearns","year":"2012","unstructured":"Kearns, M. (2012). Experiments in social computation. Communications of the ACM, 55(10), 56\u201367.","journal-title":"Communications of the ACM"},{"key":"6286_CR42","unstructured":"Kim, W., Park, J., & Sung, Y. (2020). 
Communication in multi-agent reinforcement learning: Intention sharing. In International Conference on Learning Representations."},{"key":"6286_CR43","unstructured":"Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980."},{"key":"6286_CR44","unstructured":"Klicpera, J., Wei\u00dfenberger, S., & G\u00fcnnemann, S. (2019). Diffusion improves graph learning. In Advances in neural information processing systems (pp. 13354\u201313366)."},{"key":"6286_CR45","doi-asserted-by":"crossref","unstructured":"Kloster, K., & Gleich, D. F. (2014). Heat kernel based community detection. In Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 1386\u20131395). ACM.","DOI":"10.1145\/2623330.2623706"},{"key":"6286_CR46","unstructured":"Kondor, R., & Lafferty, J. (2002). Diffusion kernels on graphs and other discrete input spaces. icml 2002. In Proc (pp. 315\u2013322)."},{"key":"6286_CR47","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1016\/j.neucom.2016.01.031","volume":"190","author":"L Kraemer","year":"2016","unstructured":"Kraemer, L., & Banerjee, B. (2016). Multi-agent reinforcement learning as a rehearsal for decentralized planning. Neurocomputing, 190, 82\u201394.","journal-title":"Neurocomputing"},{"issue":"2","key":"6286_CR48","doi-asserted-by":"publisher","first-page":"498","DOI":"10.1109\/18.910572","volume":"47","author":"FR Kschischang","year":"2001","unstructured":"Kschischang, F. R., Frey, B. J., Loeliger, H.-A., et al. (2001). Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory, 47(2), 498\u2013519.","journal-title":"IEEE Transactions on Information Theory"},{"key":"6286_CR49","doi-asserted-by":"crossref","unstructured":"Kuyer, L., Whiteson, S., Bakker, B., & Vlassis, N. (2008). Multiagent reinforcement learning for urban traffic control using coordination graphs. 
In Joint European conference on machine learning and knowledge discovery in databases (pp. 656\u2013671). Springer.","DOI":"10.1007\/978-3-540-87479-9_61"},{"key":"6286_CR50","first-page":"129","volume":"6","author":"J Lafferty","year":"2005","unstructured":"Lafferty, J., & Lebanon, G. (2005). Diffusion kernels on statistical manifolds. Journal of Machine Learning Research, 6, 129\u2013163.","journal-title":"Journal of Machine Learning Research"},{"issue":"1","key":"6286_CR51","doi-asserted-by":"publisher","first-page":"55","DOI":"10.3233\/KES-2010-0206","volume":"15","author":"GJ Laurent","year":"2011","unstructured":"Laurent, G. J., Matignon, L., Fort-Piat, L., et al. (2011). The world of independent learners is not Markovian. International Journal of Knowledge-based and Intelligent Engineering Systems, 15(1), 55\u201364.","journal-title":"International Journal of Knowledge-based and Intelligent Engineering Systems"},{"issue":"7553","key":"6286_CR52","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y LeCun","year":"2015","unstructured":"LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436\u2013444.","journal-title":"Nature"},{"issue":"1","key":"6286_CR53","doi-asserted-by":"publisher","first-page":"233","DOI":"10.1080\/00207540701441921","volume":"46","author":"J-H Lee","year":"2008","unstructured":"Lee, J.-H., & Kim, C.-O. (2008). Multi-agent systems applications in manufacturing systems and supply chain management: A review paper. International Journal of Production Research, 46(1), 233\u2013265.","journal-title":"International Journal of Production Research"},{"key":"6286_CR54","unstructured":"Li, S., Gupta, J. K., Morales, P., Allen, R., & Kochenderfer, M. J. (2020). Deep implicit coordination graphs for multi-agent reinforcement learning. arXiv preprint arXiv:2006.11438."},{"key":"6286_CR55","unstructured":"Liao, W., Bak-Jensen, B., Pillai, J. R., Wang, Y., & Wang, Y. (2021). 
A review of graph neural networks and their applications in power systems. arXiv preprint arXiv:2101.10025."},{"key":"6286_CR56","unstructured":"Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., & Wierstra, D. (2015). Continuous control with deep reinforcement learning. CoRR arXiv:abs\/1509.02971."},{"key":"6286_CR57","doi-asserted-by":"crossref","unstructured":"Lin, K., Zhao, R., Xu, Z., & Zhou, J. (2018). Efficient large-scale fleet management via multi-agent deep reinforcement learning. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining (pp. 1774\u20131783).","DOI":"10.1145\/3219819.3219993"},{"key":"6286_CR58","doi-asserted-by":"crossref","unstructured":"Littman, M. L. (1994). Markov games as a framework for multi-agent reinforcement learning. In Machine learning proceedings 1994 (pp. 157\u2013163). Elsevier.","DOI":"10.1016\/B978-1-55860-335-6.50027-1"},{"key":"6286_CR59","doi-asserted-by":"crossref","unstructured":"Liu, Y.-C., Tian, J., Glaser, N., & Kira, Z. (2020). When2com: Multi-agent perception via communication graph grouping. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 4106\u20134115).","DOI":"10.1109\/CVPR42600.2020.00416"},{"key":"6286_CR60","doi-asserted-by":"crossref","unstructured":"Liu, Y., Wang, W., Hu, Y., Hao, J., Chen, X., & Gao, Y. (2020). Multi-agent game abstraction via graph attention neural network. In AAAI (pp. 7211\u20137218).","DOI":"10.1609\/aaai.v34i05.6211"},{"key":"6286_CR61","unstructured":"Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, O. P., & Mordatch, I. (2017). Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in neural information processing systems (pp. 6379\u20136390)."},{"key":"6286_CR62","unstructured":"Mao, H., Zhang, Z., Xiao, Z., & Gong, Z. (2018). Modelling the dynamic joint policy of teammates with attention multi-agent ddpg. 
arXiv preprint arXiv:1811.07029."},{"key":"6286_CR63","volume-title":"Wolves: Behavior, ecology, and conservation","author":"LD Mech","year":"2007","unstructured":"Mech, L. D., & Boitani, L. (2007). Wolves: Behavior, ecology, and conservation. University of Chicago Press."},{"key":"6286_CR64","doi-asserted-by":"publisher","DOI":"10.1515\/9781400835355","volume-title":"Graph theoretic methods in multiagent networks","author":"M Mesbahi","year":"2010","unstructured":"Mesbahi, M., & Egerstedt, M. (2010). Graph theoretic methods in multiagent networks. Princeton University Press."},{"issue":"5","key":"6286_CR65","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1002\/cplx.20034","volume":"9","author":"JH Miller","year":"2004","unstructured":"Miller, J. H., & Moser, S. (2004). Communication and coordination. Complexity, 9(5), 31\u201340.","journal-title":"Complexity"},{"issue":"7540","key":"6286_CR66","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","volume":"518","author":"V Mnih","year":"2015","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529.","journal-title":"Nature"},{"key":"6286_CR67","doi-asserted-by":"crossref","unstructured":"Mohamed, A., Qian, K., Elhoseiny, M., & Claudel, C. (2020). Social-stgcnn: A social spatio-temporal graph convolutional neural network for human trajectory prediction. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 14424\u201314432).","DOI":"10.1109\/CVPR42600.2020.01443"},{"key":"6286_CR68","doi-asserted-by":"crossref","unstructured":"Montesello, F., D\u2019Angelo, A., Ferrari, C., & Pagello, E. (1998). Implicit coordination in a multi-agent system using a behavior-based approach. In Distributed autonomous robotic systems (Vol. 3, pp. 351\u2013360). 
Springer.","DOI":"10.1007\/978-3-642-72198-4_34"},{"key":"6286_CR69","doi-asserted-by":"crossref","unstructured":"Mordatch, I., & Abbeel, P. (2017). Emergence of grounded compositional language in multi-agent populations. arXiv preprint arXiv:1703.04908.","DOI":"10.1609\/aaai.v32i1.11492"},{"key":"6286_CR70","doi-asserted-by":"crossref","unstructured":"Nguyen, T. T., Nguyen, N. D., & Nahavandi, S. (2020). Deep reinforcement learning for multiagent systems: A review of challenges, solutions, and applications. IEEE Transactions on Cybernetics, 50(9), 3826\u20133839.","DOI":"10.1109\/TCYB.2020.2977374"},{"key":"6286_CR71","unstructured":"Niu, Y., Paleja, R., & Gombolay, M. (2021). Multi-agent graph-attention communication and teaming. In Proceedings of the 20th international conference on autonomous agents and MultiAgent systems (pp. 964\u2013973)."},{"issue":"3","key":"6286_CR72","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1023\/A:1015575522401","volume":"5","author":"S Parsons","year":"2002","unstructured":"Parsons, S., & Wooldridge, M. (2002). Game theory and decision theory in multi-agent systems. Autonomous Agents and Multi-Agent Systems, 5(3), 243\u2013254.","journal-title":"Autonomous Agents and Multi-Agent Systems"},{"key":"6286_CR73","unstructured":"Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., & Lerer, A. (2017). Automatic differentiation in PyTorch."},{"key":"6286_CR74","unstructured":"Peng, P., Yuan, Q., Wen, Y., Yang, Y., Tang, Z., Long, H., & Wang, J. (2017). Multiagent bidirectionally-coordinated nets for learning to play starcraft combat games. arXiv preprint arXiv:1703.10069."},{"key":"6286_CR75","unstructured":"Pesce, E., & Montana, G. (2019). Improving coordination in multi-agent deep reinforcement learning through memory-driven communication. 
Deep Reinforcement Learning Workshop, (NeurIPS 2018), Montreal, Canada."},{"issue":"1738","key":"6286_CR76","doi-asserted-by":"publisher","first-page":"2539","DOI":"10.1098\/rspb.2011.2537","volume":"279","author":"NJ Quick","year":"2012","unstructured":"Quick, N. J., & Janik, V. M. (2012). Bottlenose dolphins exchange signature whistles when meeting at sea. Proceedings of the Royal Society B: Biological Sciences, 279(1738), 2539\u20132545.","journal-title":"Proceedings of the Royal Society B: Biological Sciences"},{"key":"6286_CR77","doi-asserted-by":"crossref","unstructured":"Rahaie, Z., & Beigy, H. (2009). Toward a solution to multi-agent credit assignment problem. In 2009 International conference of soft computing and pattern recognition (pp. 563\u2013568). IEEE.","DOI":"10.1109\/SoCPaR.2009.112"},{"key":"6286_CR78","doi-asserted-by":"crossref","unstructured":"Scardovi, L., & Sepulchre, R. (2008). Synchronization in networks of identical linear systems. In 47th IEEE conference on decision and control, 2008. CDC 2008 (pp. 546\u2013551). IEEE","DOI":"10.1109\/CDC.2008.4738875"},{"key":"6286_CR79","volume-title":"The Serengeti lion: A study of predator-prey relations","author":"GB Schaller","year":"2009","unstructured":"Schaller, G. B. (2009). The Serengeti lion: A study of predator-prey relations. University of Chicago press."},{"key":"6286_CR80","unstructured":"Schmidhuber, J. (1996). A general method for multi-agent reinforcement learning in unrestricted environments. In Adaptation, coevolution and learning in multiagent systems: Papers from the 1996 AAAI spring symposium (pp. 84\u201387)."},{"key":"6286_CR81","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1016\/j.neunet.2014.09.003","volume":"61","author":"J Schmidhuber","year":"2015","unstructured":"Schmidhuber, J. (2015). Deep learning in neural networks: An overview. 
Neural Networks, 61, 85\u2013117.","journal-title":"Neural Networks"},{"key":"6286_CR82","volume-title":"Lectures on differential geometry","author":"R Schoen","year":"1994","unstructured":"Schoen, R., & Shing-Tung Yau Mack, C. A. (1994). Lectures on differential geometry. International Press."},{"key":"6286_CR83","unstructured":"Seraj, E., Wang, Z., Paleja, R., Sklar, M., Patel, A., & Gombolay, M. (2021). Heterogeneous graph attention networks for learning diverse communication. arXiv preprint arXiv:2108.09568."},{"key":"6286_CR84","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511811654","volume-title":"Multiagent systems: Algorithmic, game-theoretic, and logical foundations","author":"Y Shoham","year":"2008","unstructured":"Shoham, Y., & Leyton-Brown, K. (2008). Multiagent systems: Algorithmic, game-theoretic, and logical foundations. Cambridge University Press."},{"key":"6286_CR85","unstructured":"Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., & Riedmiller, M. (2014). Deterministic policy gradient algorithms. In ICML."},{"issue":"7587","key":"6286_CR86","doi-asserted-by":"publisher","first-page":"484","DOI":"10.1038\/nature16961","volume":"529","author":"D Silver","year":"2016","unstructured":"Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., et al. (2016). Mastering the game of go with deep neural networks and tree search. Nature, 529(7587), 484.","journal-title":"Nature"},{"key":"6286_CR87","unstructured":"Singh, A., Jain, T., & Sukhbaatar, S. (2019). Learning when to communicate at scale in multiagent cooperative and competitive tasks. In ICLR."},{"issue":"3","key":"6286_CR88","doi-asserted-by":"publisher","first-page":"345","DOI":"10.1023\/A:1008942012299","volume":"8","author":"P Stone","year":"2000","unstructured":"Stone, P., & Veloso, M. (2000). Multiagent systems: A survey from a machine learning perspective. 
Autonomous Robots, 8(3), 345\u2013383.","journal-title":"Autonomous Robots"},{"key":"6286_CR89","unstructured":"Su, J., Adams, S., & Beling, P. A. (2020). Counterfactual multi-agent reinforcement learning with graph convolution communication. arXiv preprint arXiv:2004.00470."},{"key":"6286_CR90","unstructured":"Sukhbaatar, S., & Fergus, R., et al. (2016). Learning multiagent communication with backpropagation. In Advances in neural information processing systems (pp. 2244\u20132252)."},{"key":"6286_CR91","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.1998.712192","volume-title":"Introduction to reinforcement learning","author":"RS Sutton","year":"1998","unstructured":"Sutton, R. S., & Barto, A. G. (1998). Introduction to reinforcement learning. MIT Press."},{"key":"6286_CR92","doi-asserted-by":"crossref","unstructured":"Tanner, H. G., & Kumar, A. (2005). Towards decentralization of multi-robot navigation functions. In Proceedings of the 2005 IEEE international conference on robotics and automation (pp. 4132\u20134137) IEEE.","DOI":"10.1109\/ROBOT.2005.1570754"},{"issue":"3","key":"6286_CR93","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1609\/aimag.v33i3.2426","volume":"33","author":"K Tuyls","year":"2012","unstructured":"Tuyls, K., & Weiss, G. (2012). Multiagent learning: Basics, challenges, and prospects. AI Magazine, 33(3), 41.","journal-title":"AI Magazine"},{"key":"6286_CR94","volume-title":"Python tutorial","author":"G Van Rossum","year":"1995","unstructured":"Van Rossum, G., & Drake, F. L., Jr. (1995). Python tutorial. Amsterdam: Centrum voor Wiskunde en Informatica."},{"key":"6286_CR95","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, \u0141., & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998\u20136008)."},{"key":"6286_CR96","doi-asserted-by":"crossref","unstructured":"Vinyals, O., Babuschkin, I., Czarnecki, W. 
M., Mathieu, M., Dudzik, A., Chung, J., et al. (2019). Grandmaster level in starcraft ii using multi-agent reinforcement learning. Nature, 1\u20135.","DOI":"10.1038\/s41586-019-1724-z"},{"issue":"2","key":"6286_CR97","doi-asserted-by":"publisher","first-page":"0170780","DOI":"10.1371\/journal.pone.0170780","volume":"12","author":"Y Vorobeychik","year":"2017","unstructured":"Vorobeychik, Y., Joveski, Z., & Yu, S. (2017). Does communication help people coordinate? PLoS ONE, 12(2), 0170780.","journal-title":"PLoS ONE"},{"key":"6286_CR98","unstructured":"Wang, R. E., Everett, M., & How, J. P. (2020). R-maddpg for partially observable environments and limited communication. arXiv preprint arXiv:2002.06684."},{"key":"6286_CR99","unstructured":"Wang, T., Wang, J., Zheng, C., & Zhang, C. (2019). Learning nearly decomposable value functions via communication minimization. arXiv preprint arXiv:1910.05366."},{"key":"6286_CR100","unstructured":"Wang, Y., Xu, T., Niu, X., Tan, C., Chen, E., & Xiong, H. (2019). Stmarl: A spatio-temporal multi-agent reinforcement learning approach for traffic light control. arXiv preprint arXiv:1908.10577."},{"issue":"2","key":"6286_CR101","doi-asserted-by":"publisher","first-page":"170","DOI":"10.1002\/rnc.1687","volume":"22","author":"G Wen","year":"2012","unstructured":"Wen, G., Duan, Z., Yu, W., & Chen, G. (2012). Consensus in multi-agent systems with communication constraints. International Journal of Robust and Nonlinear Control, 22(2), 170\u2013182.","journal-title":"International Journal of Robust and Nonlinear Control"},{"key":"6286_CR102","unstructured":"Wunder, M., Littman, M., & Stone, M. (2009). Communication, credibility and negotiation using a cognitive hierarchy model. In Workshop# 19: MSDM 2009 (p. 73)."},{"key":"6286_CR103","unstructured":"Xiao, B., Wilson, R. C., & Hancock, E. R. (2005). 
Characterising graphs using the heat kernel."},{"key":"6286_CR104","doi-asserted-by":"crossref","unstructured":"Xu, B., Shen, H., Cao, Q., Cen, K., & Cheng, X. (2020). Graph convolutional networks using heat kernel for semi-supervised learning. arXiv preprint arXiv:2007.16002.","DOI":"10.24963\/ijcai.2019\/267"},{"key":"6286_CR105","doi-asserted-by":"crossref","unstructured":"Xu, Z., Zhang, B., Bai, Y., Li, D., & Fan, G. (2021). Learning to coordinate via multiple graph neural networks. arXiv preprint arXiv:2104.03503.","DOI":"10.1007\/978-3-030-92238-2_5"},{"key":"6286_CR106","doi-asserted-by":"crossref","unstructured":"Yliniemi, L., & Tumer, K. (2014). Multi-objective multiagent credit assignment through difference rewards in reinforcement learning. In Asia-Pacific conference on simulated evolution and learning (pp. 407\u2013418). Springer.","DOI":"10.1007\/978-3-319-13563-2_35"},{"key":"6286_CR107","doi-asserted-by":"crossref","unstructured":"Yuan, Q., Fu, X., Li, Z., Luo, G., Li, J., & Yang, F. (2021). Graphcomm: Efficient graph convolutional communication for multi-agent cooperation. IEEE Internet of Things Journal.","DOI":"10.1109\/JIOT.2021.3097947"},{"issue":"11","key":"6286_CR108","doi-asserted-by":"publisher","first-page":"3328","DOI":"10.1016\/j.patcog.2008.05.007","volume":"41","author":"F Zhang","year":"2008","unstructured":"Zhang, F., & Hancock, E. R. (2008). Graph spectral image smoothing using the heat kernel. Pattern Recognition, 41(11), 3328\u20133342.","journal-title":"Pattern Recognition"},{"key":"6286_CR109","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1016\/j.neucom.2021.03.024","volume":"445","author":"H Zhou","year":"2021","unstructured":"Zhou, H., Ren, D., Xia, H., Fan, M., Yang, X., & Huang, H. (2021). Ast-gnn: An attention-based spatio-temporal graph neural network for interaction-aware pedestrian trajectory prediction. 
Neurocomputing, 445, 298\u2013308.","journal-title":"Neurocomputing"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06286-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-022-06286-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06286-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,11]],"date-time":"2024-10-11T05:30:33Z","timestamp":1728624633000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-022-06286-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,29]]},"references-count":109,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,2]]}},"alternative-id":["6286"],"URL":"https:\/\/doi.org\/10.1007\/s10994-022-06286-6","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,29]]},"assertion":[{"value":"10 December 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 October 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 November 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 December 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}},{"value":"The authors give their consent to participate.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"The authors give their consent for publication.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"No competing and financial interests to disclose.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}