{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T21:05:59Z","timestamp":1774731959696,"version":"3.50.1"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"3","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62372242, 92267104, and 62177014"],"award-info":[{"award-number":["62372242, 92267104, and 62177014"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004608","name":"Natural Science Foundation of Jiangsu Province","doi-asserted-by":"crossref","award":["BK20211284"],"award-info":[{"award-number":["BK20211284"]}],"id":[{"id":"10.13039\/501100004608","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Auton. Adapt. Syst."],"published-print":{"date-parts":[[2025,9,30]]},"abstract":"<jats:p>Connected Autonomous Vehicle (CAV) Driving, as a data-driven intelligent driving technology within the Internet of Vehicles (IoV), presents significant challenges to the efficiency and security of real-time data management. The combination of Web3.0 and edge content caching holds promise in providing low-latency data access for CAVs\u2019 real-time applications. Web3.0 enables the reliable pre-migration of frequently requested content from content providers to edge nodes. However, identifying optimal edge node peers for joint content caching and replacement remains challenging due to the dynamic nature of traffic flow in IoV. Addressing these challenges, this article introduces GAMA-Cache, an innovative edge content caching methodology leveraging Graph Attention Networks (GAT) and Multi-Agent Reinforcement Learning (MARL). 
GAMA-Cache conceptualizes the cooperative edge content caching issue as a constrained Markov decision process. It employs a MARL technique predicated on cooperation effectiveness to discern optimal caching decisions, with GAT augmenting information extracted from adjacent nodes. A distinct collaborator selection mechanism is also developed to streamline communication between agents, filtering out those with minimal correlations in the vector input to the policy network. Experimental results demonstrate that, in terms of service latency and delivery failure, the GAMA-Cache outperforms other state-of-the-art MARL solutions for edge content caching in IoV.<\/jats:p>","DOI":"10.1145\/3699431","type":"journal-article","created":{"date-parts":[[2024,12,18]],"date-time":"2024-12-18T04:18:39Z","timestamp":1734495519000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Multi-Agent Reinforcement Learning based Edge Content Caching for Connected Autonomous Vehicles in IoV"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4879-9803","authenticated-orcid":false,"given":"Xiaolong","family":"Xu","sequence":"first","affiliation":[{"name":"School of Software, Jiangsu Province Engineering Research Center of Advanced Computing and Intelligent Services, Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science and Technology, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6799-1190","authenticated-orcid":false,"given":"Linjie","family":"Gu","sequence":"additional","affiliation":[{"name":"Changwang School of Honors, Nanjing University of Information Science and Technology, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4221-0877","authenticated-orcid":false,"given":"Muhammad","family":"Bilal","sequence":"additional","affiliation":[{"name":"School of 
Computing and Communications, Lancaster University, Lancaster, United Kingdom"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7656-0184","authenticated-orcid":false,"given":"Maqbool","family":"Khan","sequence":"additional","affiliation":[{"name":"Department of IT and Computer Science, Pak-Austria Fachhochschule-Institute of Applied Sciences and Technology, Haripur, Pakistan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0381-8460","authenticated-orcid":false,"given":"Yiping","family":"Wen","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5795-5516","authenticated-orcid":false,"given":"Guoqiang","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Software, Nanjing University of Information Science and Technology, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4233-4407","authenticated-orcid":false,"given":"Yuan","family":"Yuan","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Beihang University, Beijing, China, State Key Laboratory of Software Development Environment, Beijing, China and Zhongguancun Laboratory, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2025,9,15]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","unstructured":"Andrea Banino Adri\u00e0 Puidomenech Badia Jacob Walker Tim Scholtes Jovana Mitrovic and Charles Blundell. 2022. CoBERL: Contrastive BERT for reinforcement learning. arXiv:2107.05431. Retrieved from 10.48550\/arXiv.2107.05431","DOI":"10.48550\/arXiv.2107.05431"},{"key":"e_1_3_1_3_2","doi-asserted-by":"crossref","first-page":"528","DOI":"10.1109\/ICACT.2014.6779016","volume-title":"16th International Conference on Advanced Communication Technology","author":"Bilal Muhammad","year":"2014","unstructured":"Muhammad Bilal and Shin-Gak Kang. 2014. 
Time aware least recent used (TLRU) cache management policy in ICN. In 16th International Conference on Advanced Communication Technology, 528\u2013532. DOI: 10.1109\/ICACT.2014.6779016"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2017.2669344"},{"key":"e_1_3_1_5_2","first-page":"3358","volume-title":"2015 IEEE International Conference on Communications (ICC \u201915)","author":"Blaszczyszyn Bartlomiej","year":"2015","unstructured":"Bartlomiej Blaszczyszyn and Anastasios Giovanidis. 2015. Optimal geographic caching in cellular networks. In 2015 IEEE International Conference on Communications (ICC \u201915). IEEE, London, 3358\u20133363. DOI: 10.1109\/ICC.2015.7248843"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2020.3005361"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.26599\/TST.2020.9010050"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3065404"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3251303"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2023.3317242"},{"key":"e_1_3_1_11_2","first-page":"1266","volume-title":"Companion Proceedings of the ACM Web Conference 2023","author":"Gan Wensheng","year":"2023","unstructured":"Wensheng Gan, Zhenqiang Ye, Shicheng Wan, and Philip S. Yu. 2023. Web 3.0: The future of Internet. In Companion Proceedings of the ACM Web Conference 2023. ACM, 1266\u20131275. DOI: 10.1145\/3543873.3587583"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10458-019-09421-1"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","unstructured":"Eric Jang Shixiang Gu and Ben Poole. 2017. Categorical reparameterization with Gumbel-Softmax. arXiv:1611.01144. 
Retrieved from 10.48550\/arXiv.1611.01144","DOI":"10.48550\/arXiv.1611.01144"},{"key":"e_1_3_1_14_2","article-title":"Learning attentional communication for multi-agent cooperation","volume":"31","author":"Jiang Jiechuan","year":"2018","unstructured":"Jiechuan Jiang and Zongqing Lu. 2018. Learning attentional communication for multi-agent cooperation. In Advances in Neural Information Processing Systems, Vol. 31.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_15_2","first-page":"455","volume-title":"2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS \u201920)","author":"Jiang Kai","year":"2020","unstructured":"Kai Jiang, Huan Zhou, Deze Zeng, and Jie Wu. 2020. Multi-agent reinforcement learning for cooperative edge caching in Internet of vehicles. In 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS \u201920). IEEE, 455\u2013463. DOI: 10.1109\/MASS50613.2020.00062"},{"key":"e_1_3_1_16_2","volume-title":"International Conference on Learning Representations","author":"Kapturowski Steven","year":"2019","unstructured":"Steven Kapturowski, Georg Ostrovski, Will Dabney, John Quan, and Remi Munos. 2019. Recurrent experience replay in distributed reinforcement learning. In International Conference on Learning Representations. Retrieved from https:\/\/openreview.net\/forum?id=r1lyTjAqYX"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","unstructured":"Guanzhou Li Jianping Wu and Yujing He. 2023. D-HAL: Distributed hierarchical adversarial learning for multi-agent interaction in autonomous intersection management. arXiv:2303.02630. 
Retrieved from 10.48550\/arXiv.2303.02630","DOI":"10.48550\/arXiv.2303.02630"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3234463"},{"issue":"8","key":"e_1_3_1_19_2","doi-asserted-by":"crossref","first-page":"1768","DOI":"10.1109\/JSAC.2018.2844658","article-title":"Hierarchical edge caching in device-to-device aided mobile networks: Modeling, optimization, and design","volume":"36","author":"Li Xiuhua","year":"2018","unstructured":"Xiuhua Li, Xiaofei Wang, Peng-Jun Wan, Zhu Han, and Victor C. M. Leung. 2018. Hierarchical edge caching in device-to-device aided mobile networks: Modeling, optimization, and design. IEEE Journal on Selected Areas in Communications 36, 8 (Aug. 2018), 1768\u20131785.","journal-title":"IEEE Journal on Selected Areas in Communications"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","unstructured":"Timothy P. Lillicrap Jonathan J. Hunt Alexander Pritzel Nicolas Heess Tom Erez Yuval Tassa David Silver and Daan Wierstra. 2019. Continuous control with deep reinforcement learning. arXiv:1509.02971. Retrieved from 10.48550\/arXiv.1509.02971","DOI":"10.48550\/arXiv.1509.02971"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIV.2022.3214119"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.23919\/JCC.2020.09.017"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","unstructured":"Yong Liu Weixun Wang Yujing Hu Jianye Hao Xingguo Chen and Yang Gao. 2019. Multi-agent game abstraction via graph attention neural network. arXiv:1911.10715. Retrieved from 10.48550\/arXiv.1911.10715","DOI":"10.48550\/arXiv.1911.10715"},{"key":"e_1_3_1_24_2","volume-title":"Advances in Neural Information Processing Systems","author":"Lowe Ryan","year":"2017","unstructured":"Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, and Igor Mordatch. 2017. Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in Neural Information Processing Systems, Vol. 
30, Curran Associates, Inc."},{"key":"e_1_3_1_25_2","first-page":"1","volume-title":"2020 IEEE Global Communications Conference (GLOBECOM \u201920)","author":"Mai Tianle","year":"2020","unstructured":"Tianle Mai, Haipeng Yao, Zehui Xiong, Song Guo, and Dusit Tao Niyato. 2020. Multi-agent actor-critic reinforcement learning based in-network load balance. In 2020 IEEE Global Communications Conference (GLOBECOM \u201920). IEEE, 1\u20136. DOI: 10.1109\/GLOBECOM42002.2020.9322277"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/APNOMS.2015.7275393"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","unstructured":"Tabish Rashid Mikayel Samvelyan Christian Schroeder de Witt Gregory Farquhar Jakob Foerster and Shimon Whiteson. 2018. QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning. arXiv:1803.11485. Retrieved from 10.48550\/arXiv.1803.11485","DOI":"10.48550\/arXiv.1803.11485"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2006.62"},{"key":"e_1_3_1_29_2","first-page":"1","article-title":"Learning structured communication for multi-agent reinforcement learning","volume":"36","author":"Sheng Junjie","year":"2020","unstructured":"Junjie Sheng, Xiangfeng Wang, Bo Jin, Junchi Yan, Wenhao Li, Tsung-Hui Chang, Jun Wang, and Hongyuan Zha. 2020. Learning structured communication for multi-agent reinforcement learning. Autonomous Agents and Multi-Agent Systems 36 (2020), 1\u201331.","journal-title":"Autonomous Agents and Multi-Agent Systems"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2017.2749459"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","unstructured":"Amanpreet Singh Tushar Jain and Sainbayar Sukhbaatar. 2018. Learning when to communicate at scale in multiagent cooperative and competitive tasks. 
arXiv:1812.09755. Retrieved from 10.48550\/arXiv.1812.09755","DOI":"10.48550\/arXiv.1812.09755"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160367"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVT.2020.3018817"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","unstructured":"Chen Tessler Daniel J. Mankowitz and Shie Mannor. 2018. Reward constrained policy optimization. arXiv:1805.11074. DOI: 10.48550\/arXiv.1805.11074","DOI":"10.48550\/arXiv.1805.11074"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11280-021-00939-7"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCCN.2020.3027695"},{"key":"e_1_3_1_37_2","first-page":"1913","volume-title":"28th ACM International Conference on Information and Knowledge Management","author":"Wei Hua","year":"2019","unstructured":"Hua Wei, Nan Xu, Huichu Zhang, Guanjie Zheng, Xinshi Zang, Chacha Chen, Weinan Zhang, Yanmin Zhu, Kai Xu, and Zhenhui Li. 2019. CoLight: Learning network-level cooperation for traffic signal control. In 28th ACM International Conference on Information and Knowledge Management. ACM, 1913\u20131922. DOI: 10.1145\/3357384.3357902"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/TWC.2020.3003339"},{"key":"e_1_3_1_39_2","first-page":"6","volume-title":"26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","author":"Wu Ning","year":"2020","unstructured":"Ning Wu, Xin Wayne Zhao, Jingyuan Wang, and Dayan Pan. 2020. Learning effective road network representation with hierarchical graph neural networks. In 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 6\u201314. DOI: 10.1145\/3394486.3403043"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","unstructured":"Kelvin Xu Jimmy Lei Ryan Kiros Kyunghyun Cho Aaron Courville Ruslan Salakhutdinov Richard S. Zemel and Yoshua Bengio. 2016. 
Show attend and tell: Neural image caption generation with visual attention. arXiv:1502.03044. DOI: 10.48550\/arXiv.1502.03044","DOI":"10.48550\/arXiv.1502.03044"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2023.3293650"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3447032"},{"issue":"4","key":"e_1_3_1_43_2","doi-asserted-by":"crossref","first-page":"2457","DOI":"10.1109\/TCOMM.2020.3045050","article-title":"Decentralized multi-agent multi-armed bandit learning with calibration for multi-cell caching","volume":"69","author":"Xu Xianzhe","year":"2021","unstructured":"Xianzhe Xu and Meixia Tao. 2021. Decentralized multi-agent multi-armed bandit learning with calibration for multi-cell caching. IEEE Transactions on Communications 69, 4 (Apr. 2021), 2457\u20132472.","journal-title":"IEEE Transactions on Communications"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2020.3043250"},{"key":"e_1_3_1_45_2","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1109\/VNC57357.2023.10136285","volume-title":"2023 IEEE Vehicular Networking Conference (VNC \u201923),","author":"Zenden Ivo","year":"2023","unstructured":"Ivo Zenden, Han Wang, Alfonso Iacovazzi, Arash Vahidi, Rolf Blom, and Shahid Raza. 2023. On the resilience of machine learning-based IDS for automotive networks. In 2023 IEEE Vehicular Networking Conference (VNC \u201923), 239\u2013246. 
DOI: 10.1109\/VNC57357.2023.10136285"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2016.7565185"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1186\/s13677-020-00182-x"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3072118"},{"key":"e_1_3_1_49_2","first-page":"20410","article-title":"BCORLE(\u03bb): An offline reinforcement learning and evaluation framework for coupons allocation in E-commerce market","volume":"34","author":"Zhang Yang","year":"2021","unstructured":"Yang Zhang, Bo Tang, Qingyu Yang, Dou An, Hongyin Tang, Chenyang Xi, Xueying LI, and Feiyu Xiong. 2021. BCORLE(\u03bb): An offline reinforcement learning and evaluation framework for coupons allocation in E-commerce market. In Advances in Neural Information Processing Systems, Vol. 34, 20410\u201320422.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"10","key":"e_1_3_1_50_2","doi-asserted-by":"crossref","first-page":"10216","DOI":"10.1109\/TVT.2019.2936792","article-title":"Heterogeneous information network-based content caching in the Internet of vehicles","volume":"68","author":"Zhang Yin","year":"2019","unstructured":"Yin Zhang, Ranran Wang, M. Shamim Hossain, Mohammed F. Alhamid, and Mohsen Guizani. 2019. Heterogeneous information network-based content caching in the Internet of vehicles. IEEE Transactions on Vehicular Technology 68, 10 (Oct. 2019), 10216\u201310226.","journal-title":"IEEE Transactions on Vehicular Technology"},{"key":"e_1_3_1_51_2","first-page":"1","volume-title":"IEEE International Conference on Communications (ICC \u201921)","author":"Zhao Yiwei","year":"2021","unstructured":"Yiwei Zhao, Ruibin Li, Chenyang Wang, Xiaofei Wang, and Victor C. M. Leung. 2021. Neighboring-aware caching in heterogeneous edge networks by actor-attention-critic learning. In IEEE International Conference on Communications (ICC \u201921). IEEE, 1\u20136. 
DOI: 10.1109\/ICC42927.2021.9500929"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2021.3078514"}],"container-title":["ACM Transactions on Autonomous and Adaptive Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3699431","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,15]],"date-time":"2025-09-15T17:10:44Z","timestamp":1757956244000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3699431"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,15]]},"references-count":51,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,9,30]]}},"alternative-id":["10.1145\/3699431"],"URL":"https:\/\/doi.org\/10.1145\/3699431","relation":{},"ISSN":["1556-4665","1556-4703"],"issn-type":[{"value":"1556-4665","type":"print"},{"value":"1556-4703","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,15]]},"assertion":[{"value":"2023-07-03","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-04-16","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-15","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}