{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T19:31:29Z","timestamp":1774121489952,"version":"3.50.1"},"reference-count":39,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2024,5,14]],"date-time":"2024-05-14T00:00:00Z","timestamp":1715644800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000646","name":"JST KAKENHI","doi-asserted-by":"publisher","award":["20H04245"],"award-info":[{"award-number":["20H04245"]}],"id":[{"id":"10.13039\/501100000646","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000646","name":"JST KAKENHI","doi-asserted-by":"publisher","award":["2024C-117"],"award-info":[{"award-number":["2024C-117"]}],"id":[{"id":"10.13039\/501100000646","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000646","name":"JST KAKENHI","doi-asserted-by":"publisher","award":["JPMJSP2128"],"award-info":[{"award-number":["JPMJSP2128"]}],"id":[{"id":"10.13039\/501100000646","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Waseda University","award":["20H04245"],"award-info":[{"award-number":["20H04245"]}]},{"name":"Waseda University","award":["2024C-117"],"award-info":[{"award-number":["2024C-117"]}]},{"name":"Waseda University","award":["JPMJSP2128"],"award-info":[{"award-number":["JPMJSP2128"]}]},{"name":"SPRING","award":["20H04245"],"award-info":[{"award-number":["20H04245"]}]},{"name":"SPRING","award":["2024C-117"],"award-info":[{"award-number":["2024C-117"]}]},{"name":"SPRING","award":["JPMJSP2128"],"award-info":[{"award-number":["JPMJSP2128"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Decentralized execution is a widely used framework in multi-agent reinforcement learning. 
However, it has a well-known but often neglected shortcoming: redundant computation, i.e., the same or similar computations are performed redundantly by different agents owing to their overlapping observations. This study proposes a novel method, the locally centralized team transformer (LCTT), to address this problem. The LCTT first establishes a locally centralized execution framework that autonomously designates some agents as leaders, which generate instructions, and the other agents as workers, which act according to the received instructions without running their own policy networks. For the LCTT, we then propose the team-transformer (T-Trans) structure, which enables leaders to generate targeted instructions for each worker, and leadership shift, which enables agents to determine which of them should instruct or be instructed by others. The experimental results demonstrate that the proposed method significantly reduces redundant computation without decreasing rewards and achieves faster learning convergence.<\/jats:p>","DOI":"10.3390\/info15050279","type":"journal-article","created":{"date-parts":[[2024,5,14]],"date-time":"2024-05-14T06:28:12Z","timestamp":1715668092000},"page":"279","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Locally Centralized Execution for Less Redundant Computation in Multi-Agent Cooperation"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8653-100X","authenticated-orcid":false,"given":"Yidong","family":"Bai","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Waseda University, Tokyo 169-8555, Japan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9271-4507","authenticated-orcid":false,"given":"Toshiharu","family":"Sugawara","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Waseda University, Tokyo 169-8555, 
Japan"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"735","DOI":"10.1109\/TITS.2019.2893683","article-title":"Distributed multiagent coordinated learning for autonomous driving in highways based on dynamic coordination graphs","volume":"21","author":"Yu","year":"2019","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Wachi, A. (2019). Failure-scenario maker for rule-based agent using multi-agent adversarial reinforcement learning and its application to autonomous driving. arXiv.","DOI":"10.24963\/ijcai.2019\/832"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Bhalla, S., Ganapathi Subramanian, S., and Crowley, M. (2020, January 13\u201315). Deep multi agent reinforcement learning for autonomous driving. Proceedings of the Canadian Conference on Artificial Intelligence, Ottawa, ON, Canada.","DOI":"10.1007\/978-3-030-47358-7_7"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Palanisamy, P. (2020, January 19\u201324). Multi-agent connected autonomous driving using deep reinforcement learning. Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9207663"},{"key":"ref_5","unstructured":"Shalev-Shwartz, S., Shammah, S., and Shashua, A. (2016). Safe, multi-agent, reinforcement learning for autonomous driving. arXiv."},{"key":"ref_6","unstructured":"Wang, Y., Zhong, F., Xu, J., and Wang, Y. (2022, January 25\u201329). ToM2C: Target-oriented Multi-agent Communication and Cooperation with Theory of Mind. Proceedings of the International Conference on Learning Representations, Virtual Event."},{"key":"ref_7","unstructured":"Yuan, L., Wang, J., Zhang, F., Wang, C., Zhang, Z., Yu, Y., and Zhang, C. (March, January 22). Multi-agent incentive communication via decentralized teammate modeling. 
Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Event."},{"key":"ref_8","unstructured":"Berner, C., Brockman, G., Chan, B., Cheung, V., D\u0119biak, P., Dennison, C., Farhi, D., Fischer, Q., Hashme, S., and Hesse, C. (2019). Dota 2 with large scale deep reinforcement learning. arXiv."},{"key":"ref_9","unstructured":"Gupta, J.K., Egorov, M., and Kochenderfer, M. (2017, January 8\u201312). Cooperative multi-agent control using deep reinforcement learning. Proceedings of the Autonomous Agents and Multiagent Systems: AAMAS 2017 Workshops, Best Papers, S\u00e3o Paulo, Brazil. Revised Selected Papers 16."},{"key":"ref_10","unstructured":"Han, L., Sun, P., Du, Y., Xiong, J., Wang, Q., Sun, X., Liu, H., and Zhang, T. (2019, January 9\u201315). Grid-wise control for multi-agent reinforcement learning in video game AI. Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.neucom.2016.01.031","article-title":"Multi-agent reinforcement learning as a rehearsal for decentralized planning","volume":"190","author":"Kraemer","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_12","unstructured":"Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., and Mordatch, I. (2017). Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. arXiv."},{"key":"ref_13","unstructured":"Wooldridge, M. (2009). An Introduction to Multiagent Systems, John Wiley & Sons."},{"key":"ref_14","unstructured":"Weiss, G. (1999). Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence, MIT Press."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Shoham, Y., and Leyton-Brown, K. (2008). Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations, Cambridge University Press.","DOI":"10.1017\/CBO9780511811654"},{"key":"ref_16","unstructured":"Ferber, J., and Weiss, G. (1999). 
Multi-Agent Systems: An Introduction to Distributed Artificial Intelligence, Addison-Wesley Reading."},{"key":"ref_17","unstructured":"Yang, Y., and Wang, J. (2020). An overview of multi-agent reinforcement learning from game theoretical perspective. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Sugawara, T. (1990, January 21\u201323). A cooperative LAN diagnostic and observation expert system. Proceedings of the 1990 Ninth Annual International Phoenix Conference on Computers and Communications, Scottsdale, AZ, USA.","DOI":"10.1109\/PCCC.1990.101684"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1275","DOI":"10.1109\/TC.1987.5009468","article-title":"Coherent cooperation among communicating problem solvers","volume":"100","author":"Durfee","year":"1987","journal-title":"IEEE Trans. Comput."},{"key":"ref_20","unstructured":"Krnjaic, A., Steleac, R.D., Thomas, J.D., Papoudakis, G., Sch\u00e4fer, L., To AW, K., Lao, K.-H., Cubuktepe, M., Haley, M., and B\u00f6rsting, P. (2022). Scalable multi-agent reinforcement learning for warehouse logistics with robotic and human co-workers. arXiv."},{"key":"ref_21","first-page":"10053","article-title":"Learning multi-agent coordination for enhancing target coverage in directional sensor networks","volume":"33","author":"Xu","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_22","unstructured":"Cammarata, S., McArthur, D., and Steeb, R. (1988). Strategies of cooperation in distributed problem solving. Readings in Distributed Artificial Intelligence, Elsevier."},{"key":"ref_23","unstructured":"Yu, C., Velu, A., Vinitsky, E., Wang, Y., Bayen, A., and Wu, Y. (2021). The surprising effectiveness of PPO in cooperative, multi-agent games. arXiv."},{"key":"ref_24","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. 
arXiv."},{"key":"ref_25","first-page":"7234","article-title":"Monotonic value function factorisation for deep multi-agent reinforcement learning","volume":"21","author":"Rashid","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"ref_26","unstructured":"Papoudakis, G., Christianos, F., Sch\u00e4fer, L., and Albrecht, S.V. (2020). Benchmarking multi-agent deep reinforcement learning algorithms in cooperative tasks. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"895","DOI":"10.1007\/s10462-021-09996-w","article-title":"Multi-agent deep reinforcement learning: A survey","volume":"55","author":"Gronauer","year":"2022","journal-title":"Artif. Intell. Rev."},{"key":"ref_28","unstructured":"Peng, P., Wen, Y., Yang, Y., Yuan, Q., Tang, Z., Long, H., and Wang, J. (2017). Multiagent bidirectionally-coordinated nets: Emergence of human-level coordination in learning to play starcraft combat games. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Tan, M. (1993, January 27\u201329). Multi-agent reinforcement learning: Independent vs. cooperative agents. Proceedings of the Tenth International Conference On Machine Learning, Amherst, MA, USA.","DOI":"10.1016\/B978-1-55860-307-3.50049-6"},{"key":"ref_30","first-page":"2252","article-title":"Learning multiagent communication with backpropagation","volume":"29","author":"Sukhbaatar","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_31","unstructured":"Jaques, N., Lazaridou, A., Hughes, E., Gulcehre, C., Ortega, P., Strouse, D., Leibo, J.Z., and De Freitas, N. (2019, January 9\u201315). Social influence as intrinsic motivation for multi-agent deep reinforcement learning. Proceedings of the International Conference on Machine Learning (PMLR 2019), Beach, CA, USA."},{"key":"ref_32","first-page":"22069","article-title":"Learning individually inferred communication for multi-agent cooperation","volume":"33","author":"Ding","year":"2020","journal-title":"Adv. 
Neural Inf. Process. Syst."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Bai, Y., and Sugawara, T. (2024). Reducing Redundant Computation in Multi-Agent Coordination through Locally Centralized Execution. arXiv.","DOI":"10.1109\/IIAI-AAI63651.2024.00073"},{"key":"ref_34","first-page":"7265","article-title":"Learning attentional communication for multi-agent cooperation","volume":"31","author":"Jiang","year":"2018","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_35","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_36","unstructured":"Watkins, C.J.C.H. (1989). Learning from Delayed Rewards. [Ph.D. Thesis, King\u2019s College]."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_38","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Cho, K., Van Merri\u00ebnboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. 
arXiv.","DOI":"10.3115\/v1\/D14-1179"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/5\/279\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:41:58Z","timestamp":1760107318000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/5\/279"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,14]]},"references-count":39,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,5]]}},"alternative-id":["info15050279"],"URL":"https:\/\/doi.org\/10.3390\/info15050279","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,14]]}}}