{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,8]],"date-time":"2026-03-08T20:35:53Z","timestamp":1773002153735,"version":"3.50.1"},"reference-count":58,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2025,5,19]],"date-time":"2025-05-19T00:00:00Z","timestamp":1747612800000},"content-version":"vor","delay-in-days":138,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/doi.wiley.com\/10.1002\/tdm_license_1.1"}],"funder":[{"DOI":"10.13039\/501100015401","name":"Key Research and Development Projects of Shaanxi Province","doi-asserted-by":"publisher","award":["2022ZDLGY03-02"],"award-info":[{"award-number":["2022ZDLGY03-02"]}],"id":[{"id":"10.13039\/501100015401","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62106134"],"award-info":[{"award-number":["62106134"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62476159"],"award-info":[{"award-number":["62476159"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["International Journal of Intelligent Systems"],"published-print":{"date-parts":[[2025,1]]},"abstract":"<jats:p>Prominent achievements of multiagent reinforcement learning (MARL) have been recognized in the last few years, but effective cooperation among agents remains a challenge. 
Traditional methods neglect the modeling of action semantic relations when learning joint action latent representations. As a result, uncertain semantic relations can hinder the learning of sophisticated cooperative relationships among actions, which may lead to homogeneous behaviors across all agents and limit their exploration efficiency. Our aim is to learn the structure of the action semantic space to improve the cooperation\u2010aware representation used for policy optimization in MARL. To this end, a scheme called graph learning of semantic relations (GLSR) is proposed, in which action semantic embeddings and joint action representations are learned collaboratively. GLSR incorporates an action semantic encoder that captures semantic relations in the action semantic space. By applying a cross\u2010attention mechanism to the action semantic embeddings, GLSR uses the action semantic relations to guide the mining of cooperation\u2010aware joint action representations, implicitly facilitating agent cooperation in the joint policy space and yielding more diverse behaviors among cooperative agents. 
The experimental results on challenging tasks demonstrate that GLSR attains state\u2010of\u2010the\u2010art outcomes and shows robust performance in multiagent cooperative tasks.<\/jats:p>","DOI":"10.1155\/int\/4810561","type":"journal-article","created":{"date-parts":[[2025,5,19]],"date-time":"2025-05-19T05:05:27Z","timestamp":1747631127000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Graph Learning of Semantic Relations (GLSR) for Cooperative Multiagent Reinforcement Learning"],"prefix":"10.1155","volume":"2025","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-4624-750X","authenticated-orcid":false,"given":"Pengting","family":"Duan","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6894-8207","authenticated-orcid":false,"given":"Chao","family":"Wen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6404-9354","authenticated-orcid":false,"given":"Baoping","family":"Wang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0009-0005-6187-1760","authenticated-orcid":false,"given":"Zhenni","family":"Wang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8065-1038","authenticated-orcid":false,"given":"Zhifang","family":"Wei","sequence":"additional","affiliation":[]}],"member":"311","published-online":{"date-parts":[[2025,5,19]]},"reference":[{"key":"e_1_2_11_1_2","doi-asserted-by":"crossref","unstructured":"TroullinosD. ChalkiadakisG. PapamichailI. andPapageorgiouM. 
Collaborative Multiagent Decision Making for Lane-Free Autonomous Driving Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems 2021 1335\u20131343.","DOI":"10.65109\/ZHIF5176"},{"key":"e_1_2_11_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/comst.2021.3063822"},{"key":"e_1_2_11_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/tits.2022.3173490"},{"key":"e_1_2_11_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2024.112000"},{"key":"e_1_2_11_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/tetci.2024.3360282"},{"key":"e_1_2_11_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/tcds.2023.3250819"},{"key":"e_1_2_11_7_2","doi-asserted-by":"crossref","unstructured":"JiangT. ZhuangD. andXieH. Anti-Drone Policy Learning Based on Self-Attention Multi-Agent Deterministic Policy Gradient International Conference on Autonomous Unmanned Systems 2021 Springer 2277\u20132289.","DOI":"10.1007\/978-981-16-9492-9_225"},{"key":"e_1_2_11_8_2","doi-asserted-by":"publisher","DOI":"10.1002\/int.22648"},{"key":"e_1_2_11_9_2","first-page":"1698","article-title":"LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning","volume":"35","author":"Yang M.","year":"2022","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_11_10_2","unstructured":"De WittC. S. GuptaT. MakoviichukD.et al. Is Independent Learning All You Need in the Starcraft Multi-Agent Challenge? 2020 arXiv preprint arXiv:2011.09533."},{"key":"e_1_2_11_11_2","first-page":"24611","article-title":"The Surprising Effectiveness of Ppo in Cooperative Multi-Agent Games","volume":"35","author":"Yu C.","year":"2022","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_11_12_2","unstructured":"KubaJ. ChenR. WenM.et al. 
Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning ICLR 2022-10th International Conference on Learning Representations the International Conference on Learning Representations 2022 ICLR."},{"key":"e_1_2_11_13_2","doi-asserted-by":"crossref","unstructured":"SunehagP. LeverG. GruslysA.et al. Value-Decomposition Networks for Cooperative Multi-Agent Learning Based on Team Reward Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems 2018 2085\u20132087.","DOI":"10.65109\/JSRC7365"},{"key":"e_1_2_11_14_2","unstructured":"RashidT. SamvelyanM. SchroederC. FarquharG. FoersterJ. andWhitesonS. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning International Conference on Machine Learning 2018 PMLR 4295\u20134304."},{"key":"e_1_2_11_15_2","unstructured":"SonK. KimD. KangW. J. HostalleroD. E. andYiY. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning International Conference on Machine Learning 2019 PMLR 5887\u20135896."},{"key":"e_1_2_11_16_2","article-title":"Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments","volume":"30","author":"Lowe R.","year":"2017","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_11_17_2","doi-asserted-by":"crossref","unstructured":"FoersterJ. FarquharG. AfourasT. NardelliN. andWhitesonS. Counterfactual Multi-Agent Policy Gradients Proceedings of the AAAI Conference on Artificial Intelligence 2018.","DOI":"10.1609\/aaai.v32i1.11794"},{"key":"e_1_2_11_18_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v37i10.26370"},{"key":"e_1_2_11_19_2","unstructured":"LiC. WangT. WuC. ZhaoQ. YangJ. andZhangC. Celebrating Diversity in Shared Multi-Agent Reinforcement Learning 2021 3991\u20134002."},{"key":"e_1_2_11_20_2","doi-asserted-by":"crossref","unstructured":"ShaoJ. LouZ. ZhangH. JiangY. HeS. andJiX. 
Self-Organized Group for Cooperative Multi-Agent Reinforcement Learning 2022 5711\u20135723.","DOI":"10.52202\/068431-0413"},{"key":"e_1_2_11_21_2","unstructured":"WangT. GuptaT. MahajanA. PengB. WhitesonS. andZhangC. Rode: Learning Roles to Decompose Multi-Agent Tasks International Conference on Learning Representations 2021."},{"key":"e_1_2_11_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/tnnls.2023.3283523"},{"key":"e_1_2_11_23_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2023\/40"},{"key":"e_1_2_11_24_2","first-page":"6860","article-title":"Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition","volume":"139","author":"Liu B.","year":"2021","journal-title":"Proceedings of the 38th International Conference on Machine Learning"},{"key":"e_1_2_11_25_2","unstructured":"VaswaniA. Attention Is All You Need 2017 arXiv preprint arXiv:1706.03762."},{"key":"e_1_2_11_26_2","first-page":"16509","article-title":"Multi-Agent Reinforcement Learning Is a Sequence Modeling Problem","volume":"35","author":"Wen M.","year":"2022","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_11_27_2","article-title":"An Overview of Multi-Agent Reinforcement Learning From Game Theoretical Perspective","author":"Yang Y.","year":"2020","journal-title":"3"},{"key":"e_1_2_11_28_2","unstructured":"ClausC.andBoutilierC. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems AAAI\/IAAI 1998 1998."},{"key":"e_1_2_11_29_2","doi-asserted-by":"publisher","DOI":"10.1016\/B978-1-55860-335-6.50027-1"},{"key":"e_1_2_11_30_2","doi-asserted-by":"crossref","unstructured":"FoersterJ. ChenR. Y. Al-ShedivatM. WhitesonS. AbbeelP. andMordatchI. Learning With Opponent-Learning Awareness Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems 2018 122\u2013130.","DOI":"10.65109\/HGWA8807"},{"key":"e_1_2_11_31_2","unstructured":"IqbalS.andShaF. 
Actor-Attention-Critic for Multi-Agent Reinforcement Learning International Conference on Machine Learning 2019 PMLR 2961\u20132970."},{"key":"e_1_2_11_32_2","first-page":"12208","article-title":"Facmac: Factored Multi-Agent Centralised Policy Gradients","volume":"34","author":"Peng B.","year":"2021","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_11_33_2","first-page":"22069","article-title":"Learning Individually Inferred Communication for Multi-Agent Cooperation","volume":"33","author":"Ding Z.","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_11_34_2","doi-asserted-by":"crossref","unstructured":"LiS. GuptaJ. K. MoralesP. AllenR. andKochenderferM. J. Deep Implicit Coordination Graphs for Multi-Agent Reinforcement Learning Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems 2021 764\u2013772.","DOI":"10.65109\/CZOY2835"},{"key":"e_1_2_11_35_2","doi-asserted-by":"crossref","unstructured":"NiuY. PalejaR. andGombolayM. Multi-Agent Graph-Attention Communication and Teaming Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems 2021 964\u2013973.","DOI":"10.65109\/YHUO7761"},{"key":"e_1_2_11_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2024.125116"},{"key":"e_1_2_11_37_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2023.103905"},{"key":"e_1_2_11_38_2","unstructured":"WangX. TianZ. WanZ. WenY. WangJ. andZhangW. Order Matters: Agent-By-Agent Policy Optimization The Eleventh International Conference on Learning Representations 2023."},{"key":"e_1_2_11_39_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i5.16598"},{"key":"e_1_2_11_40_2","unstructured":"NaH. SeoY. andChul MoonI. 
Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning The Twelfth International Conference on Learning Representations 2024."},{"key":"e_1_2_11_41_2","unstructured":"van der HeidenT. SalgeC. GavvesE. andvan HoofH. Robust Multi-Agent Reinforcement Learning With Social Empowerment for Coordination and Communication 2020 arXiv preprint arXiv:2012."},{"key":"e_1_2_11_42_2","unstructured":"XieA. LoseyD. TolsmaR. FinnC. andSadighD. Learning Latent Representations to Influence Multi-Agent Interaction Conference on Robot Learning 2021 PMLR 575\u2013588."},{"key":"e_1_2_11_43_2","unstructured":"WangW. Z. ShihA. XieA. andSadighD. Influencing Towards Stable Multi-Agent Interactions Conference on Robot Learning 2022 PMLR 1132\u20131143."},{"key":"e_1_2_11_44_2","unstructured":"B\u00f6hmerW. KurinV. andWhitesonS. Deep Coordination Graphs International Conference on Machine Learning 2020 PMLR 980\u2013991."},{"key":"e_1_2_11_45_2","unstructured":"MajumdarS. KhadkaS. MiretS. McAleerS. andTumerK. Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination International Conference on Machine Learning 2020 PMLR 6651\u20136660."},{"key":"e_1_2_11_46_2","unstructured":"LiP. HaoJ. TangH. ZhengY. andFuX. Race: Improve Multi-Agent Reinforcement Learning With Representation Asymmetry and Collaborative Evolution International Conference on Machine Learning 2023 PMLR 19490\u201319503."},{"key":"e_1_2_11_47_2","doi-asserted-by":"publisher","DOI":"10.1002\/int.22550"},{"key":"e_1_2_11_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/tpami.2021.3136592"},{"key":"e_1_2_11_49_2","doi-asserted-by":"crossref","unstructured":"KharbandaS. GuptaD. SchultheisE. BanerjeeA. HsiehC.-J. andBabbarR. 
Gandalf: Learning Label-Label Correlations in Extreme Multi-Label Classification via Label Features Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2024 1360\u20131371 https:\/\/doi.org\/10.1145\/3637528.3672063.","DOI":"10.1145\/3637528.3672063"},{"key":"e_1_2_11_50_2","doi-asserted-by":"crossref","unstructured":"ChenZ.-M. WeiX.-S. WangP. andGuoY. Multi-label Image Recognition With Graph Convolutional Networks Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 2019 5177\u20135186.","DOI":"10.1109\/CVPR.2019.00532"},{"key":"e_1_2_11_51_2","doi-asserted-by":"crossref","unstructured":"LittmanM. L. Markov Games as a Framework for Multi-Agent Reinforcement Learning Proceedings of the Eleventh International Conference on International Conference on Machine Learning 1994 157\u2013163 https:\/\/doi.org\/10.1016\/b978-1-55860-335-6.50027-1.","DOI":"10.1016\/B978-1-55860-335-6.50027-1"},{"key":"e_1_2_11_52_2","unstructured":"EspeholtL. SoyerH. MunosR.et al. Impala: Scalable Distributed Deep-Rl With Importance Weighted Actor-Learner Architectures International Conference on Machine Learning 2018 PMLR 1407\u20131416."},{"key":"e_1_2_11_53_2","unstructured":"XuK. HuW. LeskovecJ. andJegelkaS. How Powerful Are Graph Neural Networks? 2018 arXiv preprint arXiv:1810.00826."},{"key":"e_1_2_11_54_2","unstructured":"SchulmanJ. MoritzP. LevineS. JordanM. andAbbeelP. High-Dimensional Continuous Control Using Generalized Advantage Estimation 2015 arXiv preprint arXiv:1506.02438."},{"key":"e_1_2_11_55_2","unstructured":"HuS. ShenL. ZhangY. andTaoD. Graph Decision Transformer 2023 arXiv preprint arXiv:2303.03747."},{"key":"e_1_2_11_56_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6243"},{"key":"e_1_2_11_57_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_2_11_58_2","doi-asserted-by":"crossref","unstructured":"SamvelyanM. RashidT. Schroeder de WittC.et al. 
The Starcraft Multi-Agent Challenge Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems 2019 2186\u20132188.","DOI":"10.65109\/LVZZ5205"}],"container-title":["International Journal of Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/int\/4810561","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/full-xml\/10.1155\/int\/4810561","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/int\/4810561","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,8]],"date-time":"2026-03-08T18:12:35Z","timestamp":1772993555000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/int\/4810561"}},"subtitle":[],"editor":[{"given":"Mohamadreza (Mohammad)","family":"Khosravi","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,1]]},"references-count":58,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,1]]}},"alternative-id":["10.1155\/int\/4810561"],"URL":"https:\/\/doi.org\/10.1155\/int\/4810561","archive":["Portico"],"relation":{},"ISSN":["0884-8173","1098-111X"],"issn-type":[{"value":"0884-8173","type":"print"},{"value":"1098-111X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1]]},"assertion":[{"value":"2025-03-17","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-24","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2025-05-19","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"4810561"}}