{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,29]],"date-time":"2026-06-29T15:09:46Z","timestamp":1782745786995,"version":"3.54.5"},"reference-count":129,"publisher":"PeerJ","license":[{"start":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T00:00:00Z","timestamp":1774224000000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Universiti Kebangsaan Malaysia Geran Universiti Penyelidikan","award":["GUP-2024-009"],"award-info":[{"award-number":["GUP-2024-009"]}]},{"name":"Universiti Kebangsaan Malaysia Fundamental Research Grant Scheme (FRGS) from the Ministry of Higher Education","award":["FRGS\/1\/2023\/ICT07\/UKM\/02\/1"],"award-info":[{"award-number":["FRGS\/1\/2023\/ICT07\/UKM\/02\/1"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p>The rapid development of emerging technologies, such as the massive Internet of Things (IoT) and immersive applications, is driving the resource requirements of Beyond Fifth Generation (B5G) mobile networks to evolve in a more complex and dynamic direction. Network Slicing (NS) technology enables the personalized needs of different services by logically dividing the physical network. However, the resource competition between slices, dynamic traffic changes, and global optimization requirements make it difficult for traditional Resource Allocation (RA) methods to satisfy the network requirements of B5G. Deep Reinforcement Learning (DRL) offers an intelligent approach to RA of NS, leveraging its autonomous learning and adaptive capabilities. This study focused on the multi-agent approach of DRL for RA of NS optimization in B5G. It introduced the process of RA in a multi-slice environment, then summarized the key challenges of RA in B5G scenarios, including multi-domain resource coordination, adaptive resource orchestration, and joint optimization of computation and communication resources. At the same time, this study summarized the training process of Multi-Agent DRL (MADRL), then classified the recent RA methods based on DRL into value-based, policy-based and hybrid methods. Additionally, the challenges faced in deploying B5G environments by current optimization methods are highlighted, and future research directions are discussed. By analyzing the practical challenges between advanced DRL algorithms and RA optimization of NS in B5G, this study lays a theoretical foundation for designing scalable and adaptive multi-agent resource allocation optimization schemes in future communication systems.<\/jats:p>","DOI":"10.7717\/peerj-cs.3728","type":"journal-article","created":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T08:39:03Z","timestamp":1774255143000},"page":"e3728","source":"Crossref","is-referenced-by-count":1,"title":["A review of multi-agent deep reinforcement learning for resource allocation in beyond 5G network slicing: solutions, challenges and future research directions"],"prefix":"10.7717","volume":"12","author":[{"given":"Zhiyi","family":"Cui","sequence":"first","affiliation":[{"name":"Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Faizan","family":"Qamar","sequence":"additional","affiliation":[{"name":"Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Syed Hussain Ali","family":"Kazmi","sequence":"additional","affiliation":[{"name":"Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3627-556X","authenticated-orcid":true,"given":"Khairul Akram","family":"Zainol Ariffin","sequence":"additional","affiliation":[{"name":"Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ghazanfar Ali","family":"Safdar","sequence":"additional","affiliation":[{"name":"School of Computer Science & Technology, Faculty of Creative Arts, Technologies, & Science, University of Bedfordshire, Luton, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Muhammad Habib","family":"ur Rehman","sequence":"additional","affiliation":[{"name":"School of Computer Science & Technology, Faculty of Creative Arts, Technologies, & Science, University of Bedfordshire, Luton, United Kingdom"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"4443","published-online":{"date-parts":[[2026,3,23]]},"reference":[{"issue":"10","key":"10.7717\/peerj-cs.3728\/ref-1","doi-asserted-by":"publisher","first-page":"4710","DOI":"10.1109\/tiv.2024.3492015","article-title":"Blockchain-empowered resource allocation in HAPS-assisted IoV digital twin networks: a federated DRL approach","volume":"10","author":"Abishu","year":"2024","journal-title":"IEEE Transactions on Intelligent Vehicles"},{"key":"10.7717\/peerj-cs.3728\/ref-2","first-page":"1","article-title":"Unlocking efficiency in B5G networks: the need for adaptive service function chains","author":"Abreu","year":"2024"},{"issue":"11","key":"10.7717\/peerj-cs.3728\/ref-3","doi-asserted-by":"publisher","first-page":"19616","DOI":"10.1109\/jiot.2024.3370192","article-title":"Long-term throughput maximization in wireless powered communication networks: a multi-task DRL approach","volume":"11","author":"Ahmadian","year":"2024","journal-title":"IEEE Internet of Things Journal"},{"issue":"3","key":"10.7717\/peerj-cs.3728\/ref-4","doi-asserted-by":"publisher","first-page":"647","DOI":"10.3390\/electronics12030647","article-title":"A survey on resource management for 6G heterogeneous networks: current research, future trends, and challenges","volume":"12","author":"Alhashimi","year":"2023","journal-title":"Electronics"},{"key":"10.7717\/peerj-cs.3728\/ref-5","first-page":"1","article-title":"The 5G wireless technology and a significant economic growth and sustainable development","author":"Alkholidi","year":"2023"},{"key":"10.7717\/peerj-cs.3728\/ref-6","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2409.03052","article-title":"An introduction to centralized training for decentralized execution in cooperative multi-agent reinforcement learning","author":"Amato","year":"2024"},{"key":"10.7717\/peerj-cs.3728\/ref-7","doi-asserted-by":"publisher","first-page":"3690","DOI":"10.1109\/ojcoms.2024.3414622","article-title":"An in-depth survey on virtualization technologies in 6G integrated terrestrial and non-terrestrial networks","volume":"5","author":"Ammar","year":"2024","journal-title":"IEEE Open Journal of the Communications Society"},{"issue":"4","key":"10.7717\/peerj-cs.3728\/ref-8","doi-asserted-by":"publisher","first-page":"1752","DOI":"10.1109\/tgcn.2024.3404500","article-title":"6G+ networks through enhanced efficiency and sustainability with MADDPG-Driven network slicing in SoS environments","volume":"8","author":"Andreou","year":"2024","journal-title":"IEEE Transactions on Green Communications and Networking"},{"key":"10.7717\/peerj-cs.3728\/ref-9","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2407.10987","article-title":"Adaptive digital twin and communication-efficient federated learning network slicing for 5G-enabled internet of things","author":"Ayepah-Mensah","year":"2024"},{"key":"10.7717\/peerj-cs.3728\/ref-10","doi-asserted-by":"publisher","first-page":"188572\u2013188589","DOI":"10.1109\/ACCESS.2024.3515077","article-title":"Multi-access edge computing resource slice allocation: a review","volume":"12","author":"Bahramisirat","year":"2024","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-11","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1016\/j.aej.2022.08.017","article-title":"6G mobile communication technology: requirements, targets, applications, challenges, advantages, and opportunities","volume":"64","author":"Banafaa","year":"2023","journal-title":"Alexandria Engineering Journal"},{"key":"10.7717\/peerj-cs.3728\/ref-12","doi-asserted-by":"publisher","first-page":"3191","DOI":"10.1109\/tnsm.2024.3486288","article-title":"Sustainable task offloading in secure UAV-assisted smart farm networks: a multi-agent DRL with action mask approach","volume":"22","author":"Bao","year":"2024","journal-title":"IEEE Transactions on Network and Service Management"},{"key":"10.7717\/peerj-cs.3728\/ref-13","doi-asserted-by":"publisher","first-page":"31","DOI":"10.4236\/ijcns.2024.173003","article-title":"Satellite communications with 5G, B5G, and 6G: challenges and prospects","volume":"17","author":"Beyaz","year":"2024","journal-title":"International Journal of Communications, Network and System Sciences"},{"key":"10.7717\/peerj-cs.3728\/ref-14","doi-asserted-by":"publisher","first-page":"62788","DOI":"10.1109\/access.2021.3074802","article-title":"Network slicing for TSN-based transport networks","volume":"9","author":"Bhattacharjee","year":"2021","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4018\/979-8-3693-3739-4.ch001","article-title":"Edge intelligence paradigm shift on optimizing the edge intelligence using artificial intelligence state-of-the-art models","volume-title":"Advancing Intelligent Networks Through Distributed Optimization","author":"Chandrasekaran","year":"2024"},{"key":"10.7717\/peerj-cs.3728\/ref-16","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2110.03239","article-title":"Understanding domain randomization for sim-to-real transfer","author":"Chen","year":"2021"},{"issue":"4","key":"10.7717\/peerj-cs.3728\/ref-17","doi-asserted-by":"publisher","first-page":"2878","DOI":"10.1109\/tnse.2025.3554991","article-title":"MADDPG-M&L: UAV-assisted joint user association and slicing resource allocation in HetNets","volume":"12","author":"Chen","year":"2025","journal-title":"IEEE Transactions on Network Science and Engineering"},{"issue":"3","key":"10.7717\/peerj-cs.3728\/ref-18","doi-asserted-by":"publisher","first-page":"2763","DOI":"10.1109\/jiot.2024.3477494","article-title":"Deep customized network slicing and efficient routing for IoT applications in B5G-enabled edge computing networks","volume":"12","author":"Chen","year":"2024","journal-title":"IEEE Internet of Things Journal"},{"key":"10.7717\/peerj-cs.3728\/ref-19","doi-asserted-by":"publisher","first-page":"110127","DOI":"10.1016\/j.ress.2024.110127","article-title":"Knowledge transfer for adaptive maintenance policy optimization in engineering fleets based on meta-reinforcement learning","volume":"247","author":"Cheng","year":"2024","journal-title":"Reliability Engineering & System Safety"},{"key":"10.7717\/peerj-cs.3728\/ref-20","doi-asserted-by":"publisher","first-page":"1634","DOI":"10.1109\/comst.2022.3184049","article-title":"Channel nonstationarity and consistency for beyond 5G and 6G: a survey","volume":"24","author":"Cheng","year":"2022","journal-title":"IEEE Communications Surveys & Tutorials"},{"key":"10.7717\/peerj-cs.3728\/ref-21","first-page":"739","article-title":"An innovative heterogeneous transfer learning framework to enhance the scalability of deep reinforcement learning controllers in buildings with integrated energy systems","volume-title":"Building Simulation","author":"Coraci","year":"2024"},{"key":"10.7717\/peerj-cs.3728\/ref-22","doi-asserted-by":"publisher","first-page":"2005","DOI":"10.1109\/icct59356.2023.10419509","article-title":"Multi-agent reinforcement learning for slicing resource allocation in vehicular networks","volume":"25","author":"Cui","year":"2023","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"issue":"2","key":"10.7717\/peerj-cs.3728\/ref-23","doi-asserted-by":"publisher","first-page":"890","DOI":"10.1109\/tmc.2024.3476338","article-title":"O-RAN-enabled intelligent network slicing to meet service-level agreement (SLA)","volume":"24","author":"Dai","year":"2024","journal-title":"IEEE Transactions on Mobile Computing"},{"key":"10.7717\/peerj-cs.3728\/ref-24","doi-asserted-by":"publisher","DOI":"10.1201\/9781003540212","article-title":"Empowering edge-enabled resource efficient collaborative deep learning over B5G\/6G networks","volume-title":"The Intersection of 6G, AI\/Machine Learning, and Embedded Systems: Pioneering Intelligent Wireless Technologies","volume":"125","author":"Das","year":"2025"},{"issue":"1","key":"10.7717\/peerj-cs.3728\/ref-25","doi-asserted-by":"publisher","first-page":"13","DOI":"10.17576\/apjitm-2023-1201-02","article-title":"Internet of things (IoT) intrusion detection by machine learning (ML): a review","volume":"12","author":"Dehkordi","year":"2023","journal-title":"Asia-Pacific Journal of Information Technology & Multimedia"},{"key":"10.7717\/peerj-cs.3728\/ref-26","doi-asserted-by":"crossref","DOI":"10.1109\/ICOCT64433.2025.11118425","article-title":"Secure and adaptive federated learning pipelines: a framework for multi-tenant enterprise data systems","author":"Devaraju","year":"2025"},{"key":"10.7717\/peerj-cs.3728\/ref-27","doi-asserted-by":"publisher","first-page":"2449","DOI":"10.32604\/cmes.2024.050986","article-title":"CoopAI-route: DRL empowered multi-agent cooperative system for efficient QoS-aware routing for network slicing in multi-domain SDN","volume":"140","author":"Dhandapani","year":"2024","journal-title":"CMES-Computer Modeling in Engineering & Sciences"},{"key":"10.7717\/peerj-cs.3728\/ref-28","doi-asserted-by":"publisher","first-page":"101651","DOI":"10.1016\/j.jestch.2024.101651","article-title":"Designing an optimal microgrid control system using deep reinforcement learning: a systematic review","volume":"51","author":"Dinata","year":"2024","journal-title":"Engineering Science and Technology, an International Journal"},{"key":"10.7717\/peerj-cs.3728\/ref-29","doi-asserted-by":"publisher","first-page":"956","DOI":"10.1109\/tmlcn.2024.3420268","article-title":"Sample-efficient multi-agent DQNs for scalable multi-domain 5G+ inter-slice orchestration","volume":"2","author":"Doanis","year":"2024","journal-title":"IEEE Transactions on Machine Learning in Communications and Networking"},{"key":"10.7717\/peerj-cs.3728\/ref-30","doi-asserted-by":"publisher","first-page":"e8327","DOI":"10.1002\/cpe.8327","article-title":"AI based resource management for 5G network slicing: history, use cases, and research directions","volume":"37","author":"Dubey","year":"2025","journal-title":"Concurrency and Computation: Practice and Experience"},{"issue":"4","key":"10.7717\/peerj-cs.3728\/ref-31","doi-asserted-by":"publisher","first-page":"2836","DOI":"10.1109\/comst.2024.3390613","article-title":"Resource management from single-domain 5G to end-to-end 6G network slicing: a survey","volume":"26","author":"Ebrahimi","year":"2024","journal-title":"IEEE Communications Surveys & Tutorials"},{"issue":"17","key":"10.7717\/peerj-cs.3728\/ref-32","doi-asserted-by":"publisher","first-page":"5558","DOI":"10.3390\/s24175558","article-title":"Utility-driven end-to-end network slicing for diverse IoT users in MEC: a multi-agent deep reinforcement learning approach","volume":"24","author":"Ejaz","year":"2024a","journal-title":"Sensors"},{"key":"10.7717\/peerj-cs.3728\/ref-33","first-page":"2876","article-title":"Deep reinforcement learning approach for enhancing profitability in mobile edge computing","author":"Ejaz","year":"2024b"},{"issue":"12","key":"10.7717\/peerj-cs.3728\/ref-34","doi-asserted-by":"publisher","first-page":"4586","DOI":"10.3390\/s22124586","article-title":"Multi-agent decision-making modes in uncertain interactive traffic scenarios via graph convolution-based deep reinforcement learning","volume":"22","author":"Gao","year":"2022","journal-title":"Sensors"},{"key":"10.7717\/peerj-cs.3728\/ref-35","doi-asserted-by":"publisher","first-page":"8508","DOI":"10.1109\/TVT.2025.3539090","article-title":"High-level service type analysis and MORL-based network slice configuration for cell-free-based 6G networks","volume":"74","author":"Ghafouri","year":"2025","journal-title":"IEEE Transactions on Vehicular Technology"},{"key":"10.7717\/peerj-cs.3728\/ref-36","doi-asserted-by":"publisher","first-page":"895","DOI":"10.1007\/s10462-021-09996-w","article-title":"Multi-agent deep reinforcement learning: a survey","volume":"55","author":"Gronauer","year":"2022","journal-title":"Artificial Intelligence Review"},{"key":"10.7717\/peerj-cs.3728\/ref-37","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2501.01007","article-title":"Deep reinforcement learning for job scheduling and resource management in cloud computing: an algorithm-level review","author":"Gu","year":"2025"},{"key":"10.7717\/peerj-cs.3728\/ref-38","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1007\/s10462-025-11340-5","article-title":"Multi-agent reinforcement learning for resources allocation optimization: a survey","volume":"58","author":"Hady","year":"2025","journal-title":"Artificial Intelligence Review"},{"issue":"6","key":"10.7717\/peerj-cs.3728\/ref-39","doi-asserted-by":"publisher","first-page":"5239","DOI":"10.1109\/tnsm.2025.3603391","article-title":"Slicing for AI: an online learning framework for network slicing supporting AI services","volume":"22","author":"Helmy","year":"2025","journal-title":"IEEE Transactions on Network and Service Management"},{"key":"10.7717\/peerj-cs.3728\/ref-40","doi-asserted-by":"publisher","first-page":"2675","DOI":"10.1007\/s10994-023-06422-w","article-title":"Efficient learning of power grid voltage control strategies via model-based deep reinforcement learning","volume":"113","author":"Hossain","year":"2024","journal-title":"Machine Learning"},{"issue":"3","key":"10.7717\/peerj-cs.3728\/ref-41","doi-asserted-by":"publisher","first-page":"743","DOI":"10.35833\/mpce.2021.000394","article-title":"Mixed deep reinforcement learning considering discrete-continuous hybrid action space for smart home energy management","volume":"10","author":"Huang","year":"2022","journal-title":"Journal of Modern Power Systems and Clean Energy"},{"issue":"4","key":"10.7717\/peerj-cs.3728\/ref-42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3524106","article-title":"A survey on requirements of future intelligent networks: solutions and future research directions","volume":"55","author":"Husen","year":"2022","journal-title":"ACM Computing Surveys"},{"issue":"3","key":"10.7717\/peerj-cs.3728\/ref-43","doi-asserted-by":"publisher","first-page":"184","DOI":"10.1007\/s10586-024-04893-7","article-title":"A survey on resource scheduling approaches in multi-access edge computing environment: a deep reinforcement learning study","volume":"28","author":"Ismail","year":"2025","journal-title":"Cluster Computing"},{"key":"10.7717\/peerj-cs.3728\/ref-44","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3513002","article-title":"Resource allocation and task scheduling in fog computing and internet of everything environments: a taxonomy, review, and future directions","volume":"54","author":"Jamil","year":"2022","journal-title":"ACM Computing Surveys (CSUR)"},{"issue":"6","key":"10.7717\/peerj-cs.3728\/ref-45","doi-asserted-by":"publisher","first-page":"4297","DOI":"10.1109\/tii.2021.3131355","article-title":"Dynamic network slicing orchestration for remote adaptation and configuration in industrial IoT","volume":"18","author":"Ji","year":"2021","journal-title":"IEEE Transactions on Industrial Informatics"},{"issue":"13","key":"10.7717\/peerj-cs.3728\/ref-46","doi-asserted-by":"publisher","first-page":"23118","DOI":"10.1109\/jiot.2025.3550592","article-title":"Efficient resource allocation in computing power networks considering similar task merging: a Lyapunov optimization-based DRL approach","volume":"12","author":"Jia","year":"2025","journal-title":"IEEE Internet of Things Journal"},{"issue":"8","key":"10.7717\/peerj-cs.3728\/ref-47","doi-asserted-by":"publisher","first-page":"6081","DOI":"10.1109\/globecom54140.2023.10436734","article-title":"Hierarchical intelligence enabled joint RAN slicing and MAC scheduling for SLA guarantee","volume":"73","author":"Jia","year":"2024","journal-title":"IEEE Transactions on Communications"},{"issue":"3","key":"10.7717\/peerj-cs.3728\/ref-48","doi-asserted-by":"publisher","first-page":"1132","DOI":"10.1109\/tccn.2023.3342441","article-title":"Reinforcement-learning-based network slicing and resource allocation for multi-access edge computing networks","volume":"10","author":"Jiang","year":"2023","journal-title":"IEEE Transactions on Cognitive Communications and Networking"},{"key":"10.7717\/peerj-cs.3728\/ref-49","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2503.13415","article-title":"A comprehensive survey on multi-agent cooperative decision-making: scenarios. approaches challenges and perspectives","author":"Jin","year":"2025"},{"key":"10.7717\/peerj-cs.3728\/ref-50","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2407.18066","article-title":"Multi-agent deep reinforcement learning for resilience optimization in 5G RAN","author":"Kaada","year":"2024"},{"issue":"13","key":"10.7717\/peerj-cs.3728\/ref-51","doi-asserted-by":"publisher","first-page":"e5857","DOI":"10.1002\/dac.5857","article-title":"Slice admission control in 5G cloud radio access network using deep reinforcement learning: a survey","volume":"37","author":"Khani","year":"2024","journal-title":"International Journal of Communication Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-52","doi-asserted-by":"publisher","first-page":"56178","DOI":"10.1109\/access.2021.3072435","article-title":"Multi-agent reinforcement learning-based resource management for end-to-end network slicing","volume":"9","author":"Kim","year":"2021","journal-title":"IEEE Access"},{"issue":"2","key":"10.7717\/peerj-cs.3728\/ref-53","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1080\/00051144.2025.2460879","article-title":"Analysis of 6G and B5G waveforms using hybrid MF-ED and ECG-ED spectrum sensing techniques","volume":"66","author":"Kumar","year":"2025","journal-title":"Automatika"},{"issue":"3","key":"10.7717\/peerj-cs.3728\/ref-54","doi-asserted-by":"publisher","first-page":"2912","DOI":"10.1109\/tnsm.2023.3240301","article-title":"Blockchain-based computing resource trading in autonomous multi-access edge network slicing: a dueling double deep q-learning approach","volume":"20","author":"Kwantwi","year":"2023","journal-title":"IEEE Transactions on Network and Service Management"},{"issue":"4","key":"10.7717\/peerj-cs.3728\/ref-55","doi-asserted-by":"publisher","first-page":"2554","DOI":"10.1109\/tccn.2024.3510562","article-title":"Joint computation offloading and resource allocation for LEO satellite networks using hierarchical multi-agent reinforcement learning","volume":"11","author":"Lai","year":"2024","journal-title":"IEEE Transactions on Cognitive Communications and Networking"},{"key":"10.7717\/peerj-cs.3728\/ref-56","doi-asserted-by":"publisher","first-page":"360","DOI":"10.3390\/electronics12020360","article-title":"Research on multi-agent D2D communication resource allocation algorithm based on A2C","volume":"12","author":"Li","year":"2023","journal-title":"Electronics"},{"issue":"8","key":"10.7717\/peerj-cs.3728\/ref-57","doi-asserted-by":"publisher","first-page":"7360","DOI":"10.1109\/tmc.2025.3548767\/mm1","article-title":"OACR2: online admission control and resource reservation for 5G slice networks with deep reinforcement learning","volume":"24","author":"Li","year":"2025a","journal-title":"IEEE Transactions on Mobile Computing"},{"key":"10.7717\/peerj-cs.3728\/ref-58","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1109\/TNSM.2025.3633927","article-title":"Incremental DRL-based resource management for dynamic network slicing in an urban-wide testbed","volume":"23","author":"Li","year":"2025c","journal-title":"IEEE Transactions on Network and Service Management"},{"key":"10.7717\/peerj-cs.3728\/ref-59","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1145\/3643735","article-title":"IRCoCo: immediate rewards-guided deep reinforcement learning for code completion","volume":"1","author":"Li","year":"2024","journal-title":"Proceedings of the ACM on Software Engineering"},{"key":"10.7717\/peerj-cs.3728\/ref-60","doi-asserted-by":"publisher","first-page":"126493","DOI":"10.1016\/j.eswa.2025.126493","article-title":"Serial distributed reinforcement learning for enhanced multi-objective platoon control in curved road coordinates","volume":"269","author":"Li","year":"2025b","journal-title":"Expert Systems with Applications"},{"key":"10.7717\/peerj-cs.3728\/ref-61","doi-asserted-by":"publisher","first-page":"5031","DOI":"10.1109\/twc.2022.3231379","article-title":"Multi-agent DRL for resource allocation and cache design in terrestrial-satellite networks","volume":"22","author":"Li","year":"2022b","journal-title":"IEEE Transactions on Wireless Communications"},{"key":"10.7717\/peerj-cs.3728\/ref-62","doi-asserted-by":"publisher","first-page":"1240","DOI":"10.1109\/comst.2022.3160697","article-title":"Applications of multi-agent reinforcement learning in future internet: a comprehensive survey","volume":"24","author":"Li","year":"2022a","journal-title":"IEEE Communications Surveys & Tutorials"},{"key":"10.7717\/peerj-cs.3728\/ref-63","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/tpds.2024.3457153","article-title":"Hierarchical reinforcement learning with partner modeling for distributed multi-agent cooperation","author":"Liang","year":"2024","journal-title":"IEEE Transactions on Parallel and Distributed Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-64","doi-asserted-by":"publisher","first-page":"7760","DOI":"10.3390\/s24237760","article-title":"MAARS: multiagent actor-critic approach for resource allocation and network slicing in multiaccess edge computing","volume":"24","author":"Lim","year":"2024","journal-title":"Sensors"},{"key":"10.7717\/peerj-cs.3728\/ref-65","doi-asserted-by":"publisher","first-page":"108198","DOI":"10.1016\/j.comcom.2025.108198","article-title":"Empowering disaster response: advanced network slicing solutions for reliable Wi-Fi and 5G communications","volume":"240","author":"Limani","year":"2025","journal-title":"Computer Communications"},{"key":"10.7717\/peerj-cs.3728\/ref-66","doi-asserted-by":"publisher","first-page":"94610","DOI":"10.1109\/ACCESS.2024.3410318","article-title":"Scaling up multi-agent reinforcement learning: an extensive survey on scalability issues","volume":"12","author":"Liu","year":"2024","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-67","doi-asserted-by":"publisher","first-page":"76606","DOI":"10.1109\/ACCESS.2024.3405487","article-title":"Towards efficient 6G IoT networks: a perspective on resource optimization strategies, challenges, and future directions","volume":"12","author":"Liwen","year":"2024","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-68","doi-asserted-by":"publisher","first-page":"21","DOI":"10.14209\/jcis.2023.4","article-title":"Deep reinforcement learning based resource allocation approach for wireless networks considering network slicing paradigm","volume":"38","author":"Lopes","year":"2023","journal-title":"Journal of Communication and Information Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-69","doi-asserted-by":"publisher","first-page":"3242","DOI":"10.3390\/s24103242","article-title":"A comprehensive overview of network slicing for improving the energy efficiency of fifth-generation networks","volume":"24","author":"Lorincz","year":"2024","journal-title":"Sensors"},{"issue":"6","key":"10.7717\/peerj-cs.3728\/ref-70","doi-asserted-by":"publisher","first-page":"6744","DOI":"10.1109\/tnsm.2024.3454758","article-title":"Multi-agent DRL-based two-timescale resource allocation for network slicing in V2X communications","volume":"21","author":"Lu","year":"2024a","journal-title":"IEEE Transactions on Network and Service Management"},{"key":"10.7717\/peerj-cs.3728\/ref-71","first-page":"1137","article-title":"Deep reinforcement learning-based resilient resource allocation for smart grid network slicing","author":"Lu","year":"2024c"},{"key":"10.7717\/peerj-cs.3728\/ref-72","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1109\/TPDS.2024.3522085","article-title":"Online elastic resource provisioning with QoS guarantee in container-based cloud computing","volume":"36","author":"Lu","year":"2024b","journal-title":"IEEE Transactions on Parallel and Distributed Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-73","doi-asserted-by":"publisher","first-page":"2797","DOI":"10.1109\/tiv.2022.3225147","article-title":"DRL-based computation offloading with queue stability for vehicular-cloud-assisted mobile edge computing systems","volume":"8","author":"Ma","year":"2022","journal-title":"IEEE Transactions on Intelligent Vehicles"},{"key":"10.7717\/peerj-cs.3728\/ref-74","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s42979-024-03314-1","article-title":"QoS-Aware cross-domain routing in SDN: a comparative study between competitive and cooperative MARL approaches","volume":"5","author":"Majdoub","year":"2024","journal-title":"SN Computer Science"},{"key":"10.7717\/peerj-cs.3728\/ref-75","doi-asserted-by":"publisher","first-page":"11937","DOI":"10.1109\/tmc.2024.3404125","article-title":"Federated deep reinforcement learning for prediction-based network slice mobility in 6G mobile networks","volume":"23","author":"Ming","year":"2024","journal-title":"IEEE Transactions on Mobile Computing"},{"key":"10.7717\/peerj-cs.3728\/ref-76","doi-asserted-by":"publisher","first-page":"665","DOI":"10.1007\/s12243-021-00872-w","article-title":"Network slicing for vehicular communications: a multi-agent deep reinforcement learning approach","volume":"76","author":"Mlika","year":"2021","journal-title":"Annals of Telecommunications"},{"key":"10.7717\/peerj-cs.3728\/ref-77","first-page":"1093","article-title":"A unified approach to autonomous driving in a high-fidelity simulator using vision-based reinforcement learning","author":"Mohammed","year":"2024"},{"key":"10.7717\/peerj-cs.3728\/ref-78","doi-asserted-by":"publisher","first-page":"6909","DOI":"10.1109\/TCE.2024.3440178","article-title":"Advancing security and trust in WSNs: a federated multi-agent deep reinforcement learning approach","volume":"70","author":"Moudoud","year":"2024","journal-title":"IEEE Transactions on Consumer Electronics"},{"key":"10.7717\/peerj-cs.3728\/ref-79","doi-asserted-by":"publisher","first-page":"236","DOI":"10.1109\/ticps.2023.3311394","article-title":"5G deployment models and configuration choices for industrial cyber-physical systems-a state of art overview","volume":"1","author":"Muzaffar","year":"2023","journal-title":"IEEE Transactions on Industrial Cyber-Physical Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-80","doi-asserted-by":"publisher","first-page":"83017","DOI":"10.1109\/access.2023.3302250","article-title":"Machine learning empowered emerging wireless networks in 6G: recent advancements, challenges and future trends","volume":"11","author":"Noman","year":"2023","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-81","doi-asserted-by":"publisher","first-page":"170780\u2013170802","DOI":"10.1109\/ACCESS.2024.3501319","article-title":"Performance of 5G slicing with access technologies, and diversity: a review and challenges","volume":"12","author":"Novanana","year":"2024a","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-82","first-page":"634","article-title":"Provisioning of coexisting eMBB and URLLC services in 5G network slicing with kubernetes-based MANO","author":"Novanana","year":"2024b"},{"key":"10.7717\/peerj-cs.3728\/ref-83","doi-asserted-by":"publisher","first-page":"320","DOI":"10.1016\/j.comcom.2023.11.015","article-title":"A multi-agent federated reinforcement learning-based optimization of quality of service in various LoRa network slices","volume":"213","author":"Ossongo","year":"2024","journal-title":"Computer Communications"},{"issue":"22","key":"10.7717\/peerj-cs.3728\/ref-84","doi-asserted-by":"publisher","first-page":"20174","DOI":"10.1109\/jiot.2023.3283553","article-title":"Two-Tier resource allocation for multitenant network slicing: a federated deep reinforcement learning approach","volume":"10","author":"Ou","year":"2023","journal-title":"IEEE Internet of Things Journal"},{"key":"10.7717\/peerj-cs.3728\/ref-85","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2308.13793","article-title":"Cooperative resource trading for network slicing in industrial IoT: a multi-agent DRL approach","author":"Owusu Boateng","year":"2023"},{"key":"10.7717\/peerj-cs.3728\/ref-86","doi-asserted-by":"publisher","first-page":"103091","DOI":"10.1016\/j.simpat.2025.103091","article-title":"BRAVE: benefit-aware data offloading in UAV edge computing using multi-agent reinforcement learning","volume":"140","author":"Pantaleon","year":"2025","journal-title":"Simulation Modelling Practice and Theory"},{"key":"10.7717\/peerj-cs.3728\/ref-87","doi-asserted-by":"publisher","first-page":"1416","DOI":"10.3390\/electronics9091416","article-title":"Issues, challenges, and research trends in spectrum management: a comprehensive overview and new vision for designing 6G networks","volume":"9","author":"Qamar","year":"2020","journal-title":"Electronics"},{"issue":"4","key":"10.7717\/peerj-cs.3728\/ref-88","doi-asserted-by":"publisher","first-page":"2584","DOI":"10.1109\/tccn.2024.3524641","article-title":"Resource allocation for network slicing in open RAN: a hierarchical learning approach","volume":"11","author":"Qiao","year":"2025","journal-title":"IEEE Transactions on Cognitive Communications and Networking"},{"issue":"1","key":"10.7717\/peerj-cs.3728\/ref-89","doi-asserted-by":"publisher","first-page":"595","DOI":"10.1109\/comst.2024.3410295","article-title":"A survey on beyond 5G network slicing for smart cities applications","volume":"27","author":"Rafique","year":"2024","journal-title":"IEEE Communications Surveys & Tutorials"},{"key":"10.7717\/peerj-cs.3728\/ref-90","article-title":"Joint resource allocation for multiplexing eMBB","author":"Ren","year":"2024"},{"key":"10.7717\/peerj-cs.3728\/ref-91","doi-asserted-by":"publisher","first-page":"8299","DOI":"10.3390\/app14188299","article-title":"Uncertainty-aware federated reinforcement learning for optimizing accuracy and energy in heterogeneous industrial IoT","volume":"14","author":"Sagar","year":"2024","journal-title":"Applied Sciences"},{"key":"10.7717\/peerj-cs.3728\/ref-92","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1109\/mcom.006.2200534","article-title":"Deep reinforcement learning approaches to network slice scaling and placement: a survey","volume":"61","author":"Saha","year":"2023","journal-title":"IEEE Communications Magazine"},{"key":"10.7717\/peerj-cs.3728\/ref-93","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2203.12775","article-title":"ZSM-based management and orchestration of 3GPP network slicing: an architectural framework and deployment options","author":"Sajjad","year":"2022"},{"key":"10.7717\/peerj-cs.3728\/ref-94","doi-asserted-by":"publisher","first-page":"109720","DOI":"10.1016\/j.comnet.2023.109720","article-title":"Resource allocation in multi-access edge computing for 5G-and-beyond networks","volume":"227","author":"Sarah","year":"2023","journal-title":"Computer Networks"},{"key":"10.7717\/peerj-cs.3728\/ref-95","doi-asserted-by":"publisher","first-page":"113741\u2013113784","DOI":"10.1109\/ACCESS.2024.3444313","article-title":"A comprehensive survey on resource management in 6G network based on internet of things","volume":"12","author":"Sefati","year":"2024","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-96","first-page":"3685","article-title":"Dynamic resource reconfiguration for network slicing: an incremental multi-agent reinforcement learning based approach","author":"Shen","year":"2024"},{"issue":"3","key":"10.7717\/peerj-cs.3728\/ref-97","doi-asserted-by":"publisher","first-page":"132","DOI":"10.2478\/cait-2024-0029","article-title":"Energy-Efficient and accelerated resource allocation in O-RAN slicing using deep reinforcement learning and transfer learning","volume":"24","author":"Sherif","year":"2024","journal-title":"Cybernetics and Information Technologies"},{"key":"10.7717\/peerj-cs.3728\/ref-98","doi-asserted-by":"publisher","first-page":"54639","DOI":"10.1109\/access.2023.3282363","article-title":"URLLC in beyond 5G and 6G networks: an interference management perspective","volume":"11","author":"Siddiqui","year":"2023","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-99","doi-asserted-by":"crossref","DOI":"10.1109\/NOMS54207.2022.9789903","article-title":"Multi-agent deep reinforcement learning for slicing and admission control in 5G C-RAN","author":"Sulaiman","year":"2022"},{"key":"10.7717\/peerj-cs.3728\/ref-100","doi-asserted-by":"publisher","first-page":"7969","DOI":"10.1109\/TVT.2024.3523331","article-title":"A dynamic and collaborative spectrum sharing strategy based on multi-agent DRL in satellite-terrestrial converged networks","volume":"74","author":"Tang","year":"2024a","journal-title":"IEEE Transactions on Vehicular Technology"},{"issue":"19","key":"10.7717\/peerj-cs.3728\/ref-101","doi-asserted-by":"publisher","first-page":"16989","DOI":"10.1109\/jiot.2023.3274163","article-title":"Digital-twin-assisted resource allocation for network slicing in industry 4.0 and beyond using distributed deep reinforcement learning","volume":"10","author":"Tang","year":"2023","journal-title":"IEEE Internet of Things Journal"},{"issue":"1","key":"10.7717\/peerj-cs.3728\/ref-102","doi-asserted-by":"publisher","first-page":"1064","DOI":"10.1109\/jiot.2024.3476112","article-title":"Deterministic delay of digital twin-assisted end-to-end network slicing in industrial IoT via multiagent deep reinforcement learning","volume":"12","author":"Tang","year":"2024b","journal-title":"IEEE Internet of Things Journal"},{"issue":"5","key":"10.7717\/peerj-cs.3728\/ref-103","doi-asserted-by":"publisher","first-page":"6635","DOI":"10.1109\/tits.2024.3524595","article-title":"Dynamic slice resource management and information synchronization strategy in IoV based on digital twin","volume":"26","author":"Tang","year":"2025","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-104","doi-asserted-by":"publisher","first-page":"120864\u2013120876","DOI":"10.1109\/ACCESS.2024.3452797","article-title":"DRL-based dynamic resource configuration and optimization for B5G network slicing","volume":"12","author":"Tian","year":"2024","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3728\/ref-105","doi-asserted-by":"publisher","first-page":"110820","DOI":"10.1016\/j.engappai.2025.110820","article-title":"Using meta-reinforcement learning for solving the virtual network embedding problem","volume":"153","author":"Torkamani-Azar","year":"2025","journal-title":"Engineering Applications of Artificial Intelligence"},{"issue":"19","key":"10.7717\/peerj-cs.3728\/ref-106","doi-asserted-by":"publisher","first-page":"30690","DOI":"10.1109\/jiot.2024.3416157","article-title":"Priority-based load balancing with multi-agent deep reinforcement learning for space-air-ground integrated network slicing","volume":"11","author":"Tu","year":"2024","journal-title":"IEEE Internet of Things Journal"},{"key":"10.7717\/peerj-cs.3728\/ref-107","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1201\/b12298-5","article-title":"Edge computing: architecture, applications, and future challenges in a decentralized era","volume":"7","author":"Veeramachaneni","year":"2025","journal-title":"Recent Trends in Computer Graphics and Multimedia Technology"},{"key":"10.7717\/peerj-cs.3728\/ref-108","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2412.17301","article-title":"Dynamic scheduling strategies for resource optimization in computing environments","author":"Wang","year":"2024"},{"issue":"4","key":"10.7717\/peerj-cs.3728\/ref-109","doi-asserted-by":"publisher","first-page":"4780","DOI":"10.1109\/tits.2025.3547775","article-title":"Collaborative collision avoidance approach for USVs based on multi-agent deep reinforcement learning","volume":"26","author":"Wang","year":"2025b","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-110","doi-asserted-by":"publisher","first-page":"112665","DOI":"10.1016\/j.knosys.2024.112665","article-title":"Enhancing collaboration in multi-agent reinforcement learning with correlated trajectories","volume":"305","author":"Wang","year":"2024b","journal-title":"Knowledge-Based Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-111","doi-asserted-by":"publisher","first-page":"1878","DOI":"10.1109\/tase.2024.3371250","article-title":"Hybrid task scheduling in cloud manufacturing with sparse-reward deep reinforcement learning","volume":"22","author":"Wang","year":"2024c","journal-title":"IEEE Transactions on Automation Science and Engineering"},{"key":"10.7717\/peerj-cs.3728\/ref-112","doi-asserted-by":"publisher","first-page":"39158","DOI":"10.1109\/jiot.2024.3479779","article-title":"Low-carbon federated multiagent-DRL enhanced network slicing for satellite direct-to-device communications","volume":"11","author":"Wang","year":"2024a","journal-title":"IEEE Internet of Things Journal"},{"issue":"4","key":"10.7717\/peerj-cs.3728\/ref-113","doi-asserted-by":"publisher","first-page":"2647","DOI":"10.1109\/comst.2024.3393230","article-title":"End-edge-cloud collaborative computing for deep learning: a comprehensive survey","volume":"26","author":"Wang","year":"2024d","journal-title":"IEEE Communications Surveys & Tutorials"},{"key":"10.7717\/peerj-cs.3728\/ref-114","doi-asserted-by":"publisher","first-page":"882","DOI":"10.1109\/comst.2025.3575041","article-title":"A survey on intent-driven end-to-end 6G mobile communication system","volume":"28","author":"Wang","year":"2025a","journal-title":"IEEE Communications Surveys & Tutorials"},{"key":"10.7717\/peerj-cs.3728\/ref-115","doi-asserted-by":"crossref","DOI":"10.1109\/CloudNet62863.2024.10815780","article-title":"Multi-agent distributed decentralized dynamic resource orchestration in 5G edge-cloud networks","author":"We","year":"2024"},{"key":"10.7717\/peerj-cs.3728\/ref-116","doi-asserted-by":"publisher","first-page":"5023","DOI":"10.1007\/s10462-022-10299-x","article-title":"Deep multiagent reinforcement learning: challenges and directions","volume":"56","author":"Wong","year":"2023","journal-title":"Artificial Intelligence Review"},{"key":"10.7717\/peerj-cs.3728\/ref-117","doi-asserted-by":"publisher","first-page":"1659","DOI":"10.1109\/lwc.2022.3170998","article-title":"Multi-agent deep reinforcement learning-based power control and resource allocation for D2D communications","volume":"11","author":"Xiang","year":"2022","journal-title":"IEEE Wireless Communications Letters"},{"issue":"6","key":"10.7717\/peerj-cs.3728\/ref-118","doi-asserted-by":"publisher","first-page":"5910","DOI":"10.1109\/tsg.2024.3419122","article-title":"Peer-to-peer energy transactions for prosumers based on improved deep deterministic policy gradient algorithm","volume":"15","author":"Xiao","year":"2024","journal-title":"IEEE Transactions on Smart Grid"},{"issue":"6","key":"10.7717\/peerj-cs.3728\/ref-119","doi-asserted-by":"publisher","first-page":"4864","DOI":"10.1109\/twc.2025.3544478","article-title":"Dynamic blockchain-empowered trustworthy end-edge collaborative computing via rotating multi-agent DRL","volume":"24","author":"Xu","year":"2025","journal-title":"IEEE Transactions on Wireless Communications"},{"issue":"5","key":"10.7717\/peerj-cs.3728\/ref-120","doi-asserted-by":"publisher","first-page":"1220","DOI":"10.1109\/lwc.2024.3365161","article-title":"Multi-agent deep reinforcement learning joint beamforming for slicing resource allocation","volume":"13","author":"Yan","year":"2024","journal-title":"IEEE Wireless Communications Letters"},{"key":"10.7717\/peerj-cs.3728\/ref-121","doi-asserted-by":"publisher","first-page":"3133","DOI":"10.3390\/math13193133","article-title":"Joint power allocation algorithm based on multi-agent DQN in cognitive satellite-terrestrial mixed 6G networks","volume":"13","author":"Zhai","year":"2025","journal-title":"Mathematics"},{"key":"10.7717\/peerj-cs.3728\/ref-122","doi-asserted-by":"publisher","first-page":"1457","DOI":"10.1109\/tgcn.2023.3262516","article-title":"Distributed joint resource optimization for federated learning task distribution","volume":"7","author":"Zhang","year":"2023a","journal-title":"IEEE Transactions on Green Communications and Networking"},{"key":"10.7717\/peerj-cs.3728\/ref-123","doi-asserted-by":"publisher","first-page":"3265","DOI":"10.1109\/lcomm.2023.3326509","article-title":"Temporal feature-enhanced deep reinforcement learning for RAN slicing with user mobility","volume":"27","author":"Zhang","year":"2023b","journal-title":"IEEE Communications Letters"},{"key":"10.7717\/peerj-cs.3728\/ref-124","doi-asserted-by":"publisher","first-page":"13195","DOI":"10.1109\/jiot.2022.3140811","article-title":"Joint communication and computation resource allocation in fog-based vehicular networks","volume":"9","author":"Zhang","year":"2022","journal-title":"IEEE Internet of Things Journal"},{"key":"10.7717\/peerj-cs.3728\/ref-125","doi-asserted-by":"publisher","first-page":"103020","DOI":"10.1016\/j.seta.2023.103020","article-title":"Energy efficient resource allocation method for 5G access network based on reinforcement learning algorithm","volume":"56","author":"Zhao","year":"2023","journal-title":"Sustainable Energy Technologies and Assessments"},{"key":"10.7717\/peerj-cs.3728\/ref-126","doi-asserted-by":"publisher","first-page":"191","DOI":"10.3390\/s25010191","article-title":"Knowledge distillation-enhanced behavior transformer for decision-making of autonomous driving","volume":"25","author":"Zhao","year":"2025a","journal-title":"Sensors"},{"issue":"12","key":"10.7717\/peerj-cs.3728\/ref-127","doi-asserted-by":"publisher","first-page":"19365","DOI":"10.1109\/tits.2024.3452480","article-title":"A survey on recent advancements in autonomous driving using deep reinforcement learning: applications, challenges, and solutions","volume":"25","author":"Zhao","year":"2024","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"10.7717\/peerj-cs.3728\/ref-128","doi-asserted-by":"publisher","first-page":"122514","DOI":"10.1016\/j.ins.2025.122514","article-title":"Sequence value decomposition transformer for cooperative multi-agent reinforcement learning","volume":"720","author":"Zhao","year":"2025b","journal-title":"Information Sciences"},{"key":"10.7717\/peerj-cs.3728\/ref-129","doi-asserted-by":"publisher","first-page":"53","DOI":"10.23919\/JCC.ea.2021-0772.202401","article-title":"Resource allocation for cognitive network slicing in PD-SCMA system based on two-way deep reinforcement learning","volume":"21","author":"Zhenyu","year":"2024","journal-title":"China Communications"}],"container-title":["PeerJ Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/peerj.com\/articles\/cs-3728.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-3728.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-3728.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-3728.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,23]],"date-time":"2026-03-23T08:39:07Z","timestamp":1774255147000},"score":1,"resource":{"primary":{"URL":"https:\/\/peerj.com\/articles\/cs-3728"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,23]]},"references-count":129,"alternative-id":["10.7717\/peerj-cs.3728"],"URL":"https:\/\/doi.org\/10.7717\/peerj-cs.3728","archive":["CLOCKSS","LOCKSS","Portico"],"relation":{},"ISSN":["2376-5992"],"issn-type":[{"value":"2376-5992","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,23]]},"article-number":"e3728"}}