{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T07:00:39Z","timestamp":1770361239189,"version":"3.49.0"},"reference-count":29,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T00:00:00Z","timestamp":1770249600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Although multi-agent reinforcement learning (MARL) has achieved significant success in various domains, its deployment in real-world scenarios remains challenging, particularly in communication-constrained environments involving multi-task coupling. Existing methods suffer from two limitations: (1) the inability to effectively integrate and process incomplete state from disparate agents, and (2) a lack of robust mechanisms for handling complex multi-task coupling. To address these challenges, we propose the Coupled Communication-Task Decoupling (CCTD) framework. CCTD introduces two critical innovations: first, a distributed state compensation mechanism to process historical data, thereby reconstructing accurate global states from partial observations; second, a hierarchical architecture that systematically decomposes complex tasks into manageable subtasks while preserving their interdependencies. Thanks to its modular design, CCTD can integrate with existing MARL algorithms and allow for flexible combination of various subtasks. Extensive experiments demonstrate that CCTD outperforms baseline methods, achieving a 10% improvement in communication reception rate and superior performance across all subtasks in multi-task environments.<\/jats:p>","DOI":"10.3390\/bdcc10020052","type":"journal-article","created":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T15:24:19Z","timestamp":1770305059000},"page":"52","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["CCTD-MARL: Coupled Communication-Task Decoupling Framework for Multi-Agent Systems Under Partial Observability"],"prefix":"10.3390","volume":"10","author":[{"given":"Kehan","family":"Li","sequence":"first","affiliation":[{"name":"School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074, China"}]},{"given":"Zhenya","family":"Wang","sequence":"additional","affiliation":[{"name":"China Aerospace Science and Technology Corporation (CASC) Academy of Aerospace System and Innovation, Beijing 100048, China"}]},{"given":"Xin","family":"Tang","sequence":"additional","affiliation":[{"name":"School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023, China"}]},{"given":"Heng","family":"You","sequence":"additional","affiliation":[{"name":"China Aerospace Science and Technology Corporation (CASC) Academy of Aerospace System and Innovation, Beijing 100048, China"}]},{"given":"Long","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074, China"}]},{"given":"Haidong","family":"Xie","sequence":"additional","affiliation":[{"name":"China Aerospace Science and Technology Corporation (CASC) Academy of Aerospace System and Innovation, Beijing 100048, China"}]},{"given":"Min","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, South China University of Technology, Guangzhou 510640, China"},{"name":"Pazhou Laboratory, Guangzhou 510640, China"}]}],"member":"1968","published-online":{"date-parts":[[2026,2,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Canese, L., Cardarilli, G.C., Di Nunzio, L., Fazzolari, R., Giardino, D., Re, M., and Span\u00f2, S. (2021). Multi-agent reinforcement learning: A review of challenges and applications. Appl. Sci., 11.","DOI":"10.1038\/s41598-021-94691-7"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Zhang, K., Yang, Z., and Ba\u015far, T. (2021). Multi-agent reinforcement learning: A selective overview of theories and algorithms. Handbook of Reinforcement Learning and Control, Springer.","DOI":"10.1007\/978-3-030-60990-0_12"},{"key":"ref_3","first-page":"16509","article-title":"Multi-agent reinforcement learning is a sequence modeling problem","volume":"35","author":"Wen","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1017\/pds.2021.17","article-title":"A multi-agent reinforcement learning framework for intelligent manufacturing with autonomous mobile robots","volume":"1","author":"Agrawal","year":"2021","journal-title":"Proc. Des. Soc."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"829","DOI":"10.1109\/TAI.2024.3497919","article-title":"Safe multi-agent reinforcement learning with bilevel optimization in autonomous driving","volume":"6","author":"Zheng","year":"2024","journal-title":"IEEE Trans. Artif. Intell."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3215","DOI":"10.1007\/s10462-020-09938-y","article-title":"A survey on multi-agent deep reinforcement learning: From the perspective of challenges and applications","volume":"54","author":"Du","year":"2021","journal-title":"Artif. Intell. Rev."},{"key":"ref_7","unstructured":"Nekoei, H., Badrinaaraayanan, A., Sinha, A., Amini, M., Rajendran, J., Mahajan, A., and Chandar, S. (2023, January 22\u201325). Dealing with non-stationarity in decentralized cooperative multi-agent deep reinforcement learning via multi-timescale learning. Proceedings of the Conference on Lifelong Learning Agents, Montreal, QC, Canada."},{"key":"ref_8","unstructured":"Amato, C. (2024). An introduction to centralized training for decentralized execution in cooperative multi-agent reinforcement learning. arXiv."},{"key":"ref_9","unstructured":"Phan, T., Ritz, F., Altmann, P., Zorn, M., N\u00fc\u00dflein, J., K\u00f6lle, M., Gabor, T., and Linnhoff-Popien, C. (2023, January 23\u201329). Attention-based recurrence for multi-agent reinforcement learning under stochastic partial observability. Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Liu, X., and Jin, B. (2024). Information-Theoretic Multi-Agent Algorithm Based on the CTDE Framework. 2024 9th International Conference on Electronic Technology and Information Science (ICETIS), IEEE.","DOI":"10.1109\/ICETIS61828.2024.10593780"},{"key":"ref_11","unstructured":"Hu, S., Shen, L., Zhang, Y., and Tao, D. (2024). Learning multi-agent communication from graph modeling perspective. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"13431","DOI":"10.1109\/TNNLS.2024.3475216","article-title":"Multi-task multi-agent reinforcement learning with interaction and task representations","volume":"36","author":"Li","year":"2024","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"6949","DOI":"10.1109\/TWC.2022.3153316","article-title":"Multi-agent deep reinforcement learning for task offloading in UAV-assisted mobile edge computing","volume":"21","author":"Zhao","year":"2022","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhu, X., Xu, J., Ge, J., Wang, Y., and Xie, Z. (2023). Multi-task multi-agent reinforcement learning for real-time scheduling of a dual-resource flexible job shop with robots. Processes, 11.","DOI":"10.3390\/pr11010267"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3453160","article-title":"Hierarchical reinforcement learning: A comprehensive survey","volume":"54","author":"Pateria","year":"2021","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"172","DOI":"10.3390\/make4010009","article-title":"Hierarchical reinforcement learning: A survey and open research challenges","volume":"4","author":"Mets","year":"2022","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"137","DOI":"10.12785\/ijcds\/040207","article-title":"Hierarchical reinforcement learning: A survey","volume":"4","year":"2015","journal-title":"Int. J. Comput. Digit. Syst."},{"key":"ref_18","unstructured":"Williams, R.J. (2007). Reinforcement Learning and Markov Decision Processes, Spring. Available online: https:\/\/ccs.neu.edu\/home\/rjw\/com3480\/lectures\/reinforcement.pdf."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"3401","DOI":"10.1109\/JSAC.2023.3310080","article-title":"Cooperative task offloading and service caching for digital twin edge networks: A graph attention multi-agent reinforcement learning approach","volume":"41","author":"Yao","year":"2023","journal-title":"IEEE J. Sel. Areas Commun."},{"key":"ref_20","first-page":"1","article-title":"Monotonic value function factorisation for deep multi-agent reinforcement learning","volume":"21","author":"Rashid","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"ref_21","unstructured":"Hao, X., Wang, W., Mao, H., Yang, Y., Li, D., Zheng, Y., Wang, Z., and Hao, J. (2022). API: Boosting multi-agent reinforcement learning via agent-permutation-invariant networks. arXiv."},{"key":"ref_22","first-page":"3394","article-title":"Deep sets","volume":"30","author":"Zaheer","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1016\/j.cogsys.2020.08.012","article-title":"Hierarchical deep q-network from imperfect demonstrations in minecraft","volume":"65","author":"Skrynnik","year":"2021","journal-title":"Cogn. Syst. Res."},{"key":"ref_24","first-page":"5055","article-title":"Hindsight experience replay","volume":"30","author":"Andrychowicz","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"R\u00f6der, F., Eppe, M., Nguyen, P.D., and Wermter, S. (2020). Curious hierarchical actor-critic reinforcement learning. International Conference on Artificial Neural Networks, Springer.","DOI":"10.1007\/978-3-030-61616-8_33"},{"key":"ref_26","first-page":"2145","article-title":"Learning to communicate with deep multi-agent reinforcement learning","volume":"29","author":"Foerster","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_27","first-page":"2252","article-title":"Learning multiagent communication with backpropagation","volume":"29","author":"Sukhbaatar","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_28","first-page":"24611","article-title":"The surprising effectiveness of ppo in cooperative multi-agent games","volume":"35","author":"Yu","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1481","DOI":"10.1038\/s41597-025-05825-9","article-title":"Real operational labeled data of air handling units from office, auditorium, and hospital buildings","volume":"12","author":"Wang","year":"2025","journal-title":"Sci. Data"}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/10\/2\/52\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T15:53:28Z","timestamp":1770306808000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/10\/2\/52"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,5]]},"references-count":29,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,2]]}},"alternative-id":["bdcc10020052"],"URL":"https:\/\/doi.org\/10.3390\/bdcc10020052","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,5]]}}}