{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T09:33:24Z","timestamp":1763112804702,"version":"3.45.0"},"reference-count":43,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2025,11,13]],"date-time":"2025-11-13T00:00:00Z","timestamp":1762992000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Postdoctoral Research Startup Fund of the Big Picture Center of Hangzhou City University","award":["201000-584105\/002"],"award-info":[{"award-number":["201000-584105\/002"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Multi-agent inverse reinforcement learning (MA-IRL) infers the underlying reward functions or objectives of multiple agents by observing their behavioral data, thereby providing insights into collaboration, competition, or mixed interaction strategies among agents, and addressing the symmetrical ambiguity problem where multiple rewards may correspond to the same strategy. However, most existing algorithms focus on cooperative and non-cooperative tasks among homogeneous multi-agent systems and therefore struggle to adapt to the dynamic topologies and heterogeneous behavioral strategies of real-world multi-agent systems, particularly in scenarios with locally sparse interactions and dynamic heterogeneity such as autonomous driving, drone swarms, and robot clusters. To address this problem, this study proposes a dynamic heterogeneous multi-agent inverse reinforcement learning framework (GAMF-DHIRL) based on a graph attention mean field (GAMF) to infer the potential reward functions of agents. 
In GAMF-DHIRL, we introduce a graph attention mean field theory based on adversarial maximum entropy inverse reinforcement learning to dynamically model dependencies between agents and adaptively adjust the influence weights of neighboring nodes through attention mechanisms. Specifically, the GAMF module uses a dynamic adjacency matrix to capture the time-varying characteristics of the interactions among agents. Meanwhile, the typed mean-field approximation reduces computational complexity. Experiments demonstrate that the proposed method can efficiently recover reward functions of heterogeneous agents in collaborative tasks and adversarial environments, and it outperforms traditional MA-IRL methods.<\/jats:p>","DOI":"10.3390\/sym17111951","type":"journal-article","created":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T09:06:36Z","timestamp":1763111196000},"page":"1951","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Dynamic Heterogeneous Multi-Agent Inverse Reinforcement Learning Based on Graph Attention Mean Field"],"prefix":"10.3390","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2616-5156","authenticated-orcid":false,"given":"Li","family":"Song","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China"},{"name":"School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China"}]},{"given":"Irfan Ali","family":"Channa","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence, Aror University of Art, Architecture, Design and Heritage, Sukkur 65170, Pakistan"}]},{"given":"Zeyu","family":"Wang","sequence":"additional","affiliation":[{"name":"Tongzhou Operation Area of the Beijing Oil and Gas Branch of Beijing Pipeline Limited Company, Beijing 100101, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6761-6019","authenticated-orcid":false,"given":"Guangyu","family":"Sun","sequence":"additional","affiliation":[{"name":"Swiss Federal Institute of Technology in Lausanne, Lausanne 1015, Switzerland"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,13]]},"reference":[{"key":"ref_1","unstructured":"Ng, A.Y., and Russell, S. (2000, June 29\u2013July 2). Algorithms for Inverse Reinforcement Learning. Proceedings of the Seventeenth International Conference on Machine Learning, San Francisco, CA, USA."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Abbeel, P., and Ng, A.Y. (2004, July 4\u20138). Apprenticeship Learning Via Inverse Reinforcement Learning. Proceedings of the 21st International Conference on Machine Learning, New York, NY, USA.","DOI":"10.1145\/1015330.1015430"},{"key":"ref_3","unstructured":"Mandyam, A., Li, D., Yao, J.Y., Cai, D.N., Jones, A., and Engelhardt, B.E. (2023). Kernel Density Bayesian Inverse Reinforcement Learning. arXiv."},{"key":"ref_4","unstructured":"Fu, J., Luo, K., and Levine, S. (2018, April 30\u2013May 3). Learning Robust Rewards with Adversarial Inverse Reinforcement Learning. Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada."},{"key":"ref_5","unstructured":"Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014, December 8\u201313). Generative Adversarial Nets. Proceedings of the Advances in Neural Information Processing Systems 27 (NIPS 2014), Montreal, QC, Canada."},{"key":"ref_6","unstructured":"Ho, J., and Ermon, S. (2017, December 4\u20139). Generative adversarial imitation learning. 
Proceedings of the 31st International Conference on Neural Information Processing Systems, Red Hook, NY, USA."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1167","DOI":"10.1287\/trsc.2024.0532","article-title":"A Game-Theoretic Framework for Generic Second-Order Traffic Flow Models Using Mean Field Games and Adversarial Inverse Reinforcement Learning","volume":"58","author":"Mo","year":"2024","journal-title":"Transp. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"17549","DOI":"10.1109\/TNNLS.2023.3305983","article-title":"Hierarchical Adversarial Inverse Reinforcement Learning","volume":"35","author":"Chen","year":"2024","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"7047","DOI":"10.1109\/LRA.2025.3572771","article-title":"Better Than Diverse Demonstrators: Reward Decomposition from Suboptimal and Heterogeneous Demonstrations","volume":"10","author":"Xue","year":"2025","journal-title":"IEEE Rob. Autom. Lett."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Rucker, M., Adams, S., Hayes, R., and Beling, P.A. (2021, October 17\u201320). Inverse Reinforcement Learning for Strategy Identification. Proceedings of the 2021 IEEE International Conference on Systems, Man, and Cybernetics, Melbourne, Australia.","DOI":"10.1109\/SMC52423.2021.9658704"},{"key":"ref_11","unstructured":"Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., and Mordatch, I. (2017, December 4\u20139). Multi-agent Actor-critic for Mixed Cooperative-competitive Environments. Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_12","unstructured":"Freihaut, T., and Ramponi, G. (2024). On Feasible Rewards in Multi-Agent Inverse Reinforcement Learning. 
arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"4867","DOI":"10.1109\/TASE.2024.3412188","article-title":"Optimal Robust Formation of Multi-Agent Systems as Adversarial Graphical Apprentice Games with Inverse Reinforcement Learning","volume":"22","author":"Golmisheh","year":"2025","journal-title":"IEEE Trans. Autom. Sci. Eng."},{"key":"ref_14","first-page":"2061081","article-title":"Multiagent modeling of pedestrian-vehicle conflicts using Adversarial Inverse Reinforcement Learning","volume":"19","author":"Nasernejad","year":"2023","journal-title":"Transp. A"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1880","DOI":"10.1109\/LRA.2021.3061397","article-title":"Adversarial Inverse Reinforcement Learning with Self-Attention Dynamics Model","volume":"6","author":"Sun","year":"2021","journal-title":"IEEE Rob. Autom. Lett."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1309","DOI":"10.23919\/cje.2024.00.202","article-title":"Learning Robust Adaptive Bitrate Algorithms with Adversarial Inverse Reinforcement Learning","volume":"34","author":"Yi","year":"2024","journal-title":"Chin. J. Electron."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"3179","DOI":"10.1109\/LRA.2024.3366023","article-title":"SC-AIRL: Share-Critic in Adversarial Inverse Reinforcement Learning for Long-Horizon Task","volume":"9","author":"Xiang","year":"2024","journal-title":"IEEE Rob. Autom. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"103839","DOI":"10.1016\/j.tre.2024.103839","article-title":"Personalized Origin-destination Travel Time Estimation with Active Adversarial Inverse Reinforcement Learning and Transformer","volume":"193","author":"Liu","year":"2024","journal-title":"Transp. Res. Part E Logist. Trans."},{"key":"ref_19","unstructured":"Zhang, X., Li, Y.H., Zhang, Z.M., and Zhang, Z.L. (2020, December 6\u201312). f-GAIL: Learning f-Divergence for Generative Adversarial Imitation Learning. 
Proceedings of the 34th Conference on Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"121712","DOI":"10.1016\/j.ins.2024.121712","article-title":"Adaptive Generative Adversarial Maximum Entropy Inverse Reinforcement Learning","volume":"695","author":"Song","year":"2025","journal-title":"Inform. Sci."},{"key":"ref_21","unstructured":"Yu, L.T., Song, J.M., and Ermon, S. (2019, June 9\u201315). Multi-Agent Adversarial Inverse Reinforcement Learning. Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1109\/TCNS.2022.3210856","article-title":"Multiagent Graphical Games with Inverse Reinforcement Learning","volume":"10","author":"Donge","year":"2023","journal-title":"IEEE Trans. Control Netw. Syst."},{"key":"ref_23","unstructured":"Gruver, N., Song, J.M., Kochenderfer, M.J., and Ermon, S. (2020, May 9\u201313). Multi-agent Adversarial Inverse Reinforcement Learning with Latent Variables. Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, Auckland, New Zealand."},{"key":"ref_24","unstructured":"Seraj, E., Xiong, J., Schrum, M., and Gombolay, M. (2023, December 10\u201316). Mixed-Initiative Multiagent Apprenticeship Learning for Human Training of Robot Teams. Proceedings of the 37th Conference on Neural Information Processing Systems, New Orleans, LA, USA."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"103191","DOI":"10.1016\/j.trc.2021.103191","article-title":"Markov-game modeling of cyclist-pedestrian interactions in shared spaces: A multi-agent adversarial inverse reinforcement learning approach","volume":"128","author":"Alsaleh","year":"2021","journal-title":"Transp. Res. Part C Emerg. 
Technol."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1109\/TCIAIG.2017.2679115","article-title":"Multiagent Inverse Reinforcement Learning for Two-Person Zero-Sum Games","volume":"10","author":"Lin","year":"2018","journal-title":"IEEE Trans. Games"},{"key":"ref_27","unstructured":"Chen, Y., Zhang, L.B., Liu, J.M., and Hu, S. (2022, May 9\u201313). Individual-Level Inverse Reinforcement Learning for Mean Field Games. Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, Virtual Event."},{"key":"ref_28","unstructured":"Chen, Y., Zhang, L.B., Liu, J.M., and Witbrock, M. (2023, May 29\u2013June 2). Adversarial Inverse Reinforcement Learning for Mean Field Games. Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, London, UK."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chen, Y., Lin, X., Yan, B., Zhang, L.B., Liu, J.M., Tan, N., and Witbrock, M. (2024, February 20\u201327). Meta-Inverse Reinforcement Learning for Mean Field Games via Probabilistic Context Variables. Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada.","DOI":"10.1609\/aaai.v38i10.29021"},{"key":"ref_30","unstructured":"Anahtarci, B., Kariksiz, C.D., and Saldi, N. (2024). Maximum Causal Entropy IRL in Mean-Field Games and GNEP Framework for Forward RL. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1016\/j.cobeha.2019.04.010","article-title":"Theory of Mind as Inverse Reinforcement Learning","volume":"29","year":"2019","journal-title":"Curr. Opin. Behav. Sci."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1007\/s12369-023-01007-y","article-title":"Spatially Small-scale Approach-avoidance Behaviors Allow Learning-free Machine Inference of Object Preferences in Human Minds","volume":"15","author":"Huang","year":"2023","journal-title":"Int. J. 
Social Rob."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Tian, R., Tomizuka, M., and Sun, L. (2021, September 27\u2013October 1). Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data. Proceedings of the 2021 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Prague, Czech Republic.","DOI":"10.1109\/IROS51168.2021.9636653"},{"key":"ref_34","unstructured":"Wu, H.C., Sequeira, P., and Pynadath, D.V. (2023). Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning. arXiv."},{"key":"ref_35","unstructured":"Oguntola, I., Campbell, J., Stepputtis, S., and Sycara, K. (2023, July 23\u201329). Theory of Mind as Intrinsic Motivation for Multi-Agent Reinforcement Learning. Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA."},{"key":"ref_36","unstructured":"Wei, R., Zeng, S., Li, C.L., Garcia, A., McDonald, A., and Hong, M.Y. (2023, July 23\u201329). Robust Inverse Reinforcement Learning Through Bayesian Theory of Mind. Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA."},{"key":"ref_37","first-page":"4772","article-title":"Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning","volume":"34","author":"Zhang","year":"2023","journal-title":"J. Softw."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Kang, S., Dong, Q., Xue, Y., and Yanjun, W. (2024, May 27\u201331). MACS: Multi-Agent Adversarial Reinforcement Learning for Finding Diverse Critical Driving Scenarios. Proceedings of the 2024 IEEE Conference on Software Testing, Verification and Validation, Toronto, ON, Canada.","DOI":"10.1109\/ICST60714.2024.00010"},{"key":"ref_39","unstructured":"Wang, X.Y., and Klabjan, D. (2018, July 10\u201315). Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations. 
Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_40","unstructured":"Ziebart, B.D., Maas, A.L., Bagnell, J.A., and Dey, A.K. (2008, July 13\u201317). Maximum entropy inverse reinforcement learning. Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, Chicago, IL, USA."},{"key":"ref_41","unstructured":"Wei, E., Wicke, D., and Luke, S. (2019, May 13\u201317). Multiagent adversarial inverse reinforcement learning. Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems, Montreal, QC, Canada."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"4356","DOI":"10.1109\/LRA.2025.3550743","article-title":"Multi-Agent Generative Adversarial Interactive Self-Imitation Learning for AUV Formation Control and Obstacle Avoidance","volume":"10","author":"Fang","year":"2025","journal-title":"IEEE Rob. Autom. Lett."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhang, H., Yang, Y., Zhao, W., and Yue, D. (2025). Distributed Data-Driven Inverse Reinforcement Learning for Multi-Agent Systems. IEEE Trans. Circuits Syst. I Regul. 
Pap., in press.","DOI":"10.1109\/TCSI.2025.3598198"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/11\/1951\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,14]],"date-time":"2025-11-14T09:29:55Z","timestamp":1763112595000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/11\/1951"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,13]]},"references-count":43,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2025,11]]}},"alternative-id":["sym17111951"],"URL":"https:\/\/doi.org\/10.3390\/sym17111951","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,13]]}}}