{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T23:54:04Z","timestamp":1781740444143,"version":"3.54.5"},"reference-count":39,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2024,1,2]],"date-time":"2024-01-02T00:00:00Z","timestamp":1704153600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"MIC\/SCOPE","award":["#JP235006102"],"award-info":[{"award-number":["#JP235006102"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>In the advanced 5G and beyond networks, multi-access edge computing (MEC) is increasingly recognized as a promising technology, offering the dual advantages of reducing energy utilization in cloud data centers while catering to the demands for reliability and real-time responsiveness in end devices. However, the inherent complexity and variability of MEC networks pose significant challenges in computational offloading decisions. To tackle this problem, we propose a proximal policy optimization (PPO)-based Device-to-Device (D2D)-assisted computation offloading and resource allocation scheme. We construct a realistic MEC network environment and develop a Markov decision process (MDP) model that minimizes time loss and energy consumption. The integration of a D2D communication-based offloading framework allows for collaborative task offloading between end devices and MEC servers, enhancing both resource utilization and computational efficiency. The MDP model is solved using the PPO algorithm in deep reinforcement learning to derive an optimal policy for offloading and resource allocation. 
Extensive comparative analysis with three benchmarked approaches has confirmed our scheme\u2019s superior performance in latency, energy consumption, and algorithmic convergence, demonstrating its potential to improve MEC network operations in the context of emerging 5G and beyond technologies.<\/jats:p>","DOI":"10.3390\/fi16010019","type":"journal-article","created":{"date-parts":[[2024,1,2]],"date-time":"2024-01-02T03:33:33Z","timestamp":1704166413000},"page":"19","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":31,"title":["Proximal Policy Optimization for Efficient D2D-Assisted Computation Offloading and Resource Allocation in Multi-Access Edge Computing"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-2006-9492","authenticated-orcid":false,"given":"Chen","family":"Zhang","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Inner Mongolia Normal University, Saihan District, Hohhot 010096, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6853-5878","authenticated-orcid":false,"given":"Celimuge","family":"Wu","sequence":"additional","affiliation":[{"name":"Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo 1828585, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Min","family":"Lin","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Inner Mongolia Normal University, Saihan District, Hohhot 010096, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yangfei","family":"Lin","sequence":"additional","affiliation":[{"name":"Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo 1828585, 
Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"William","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computing, Electrical and Applied Technologies, Unitec Institute of Technology, Auckland 1025, New Zealand"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2024,1,2]]},"reference":[{"key":"ref_1","unstructured":"(2023, December 10). How Many Smartphones Are in the World?. Available online: https:\/\/www.bankmycell.com\/blog\/how-many-phones-are-in-the-world#part-3)."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1840","DOI":"10.1109\/TITS.2020.3025687","article-title":"An Edge Traffic Flow Detection Scheme Based on Deep Learning in an Intelligent Transportation System","volume":"22","author":"Chen","year":"2020","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1657","DOI":"10.1109\/COMST.2017.2705720","article-title":"On multi-access edge computing: A survey of the emerging 5G network edge cloud architecture and orchestration","volume":"19","author":"Taleb","year":"2017","journal-title":"IEEE Commun. Surv. Tutor."},{"key":"ref_4","unstructured":"(2023, December 10). Anon. The Standard, News from ETSI\u2014Issue 2. 
Available online: https:\/\/www.etsi.org\/images\/files\/ETSInewsletter\/etsinewsletter-issue2-2017.pdf."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"8005","DOI":"10.1109\/JIOT.2020.3041673","article-title":"Resource Management for Computation Offloading in D2D-Aided Wireless Powered Mobile-Edge Computing Networks","volume":"8","author":"Sun","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"24065","DOI":"10.1109\/JIOT.2022.3188928","article-title":"A Novel Hybrid-ARPPO Algorithm for Dynamic Computation Offloading in Edge Computing","volume":"9","author":"Yang","year":"2022","journal-title":"IEEE Internet Things J."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Guo, H., Wang, Y., Liu, J., and Liu, C. (2023). Multi-UAV Cooperative Task Offloading and Resource Allocation in 5G Advanced and Beyond. IEEE Trans. Wirel. Commun.","DOI":"10.1109\/TWC.2023.3277801"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2839","DOI":"10.1007\/s00607-021-00931-z","article-title":"Jointly Optimizing Offloading Decision and Bandwidth Allocation with Energy Constraint in Mobile Edge Computing Environment","volume":"103","author":"Zhou","year":"2021","journal-title":"Computing"},{"key":"ref_9","first-page":"6599","article-title":"Multi-Objective Parallel Task Offloading and Content Caching in D2D-aided MEC Networks","volume":"22","author":"Xiao","year":"2022","journal-title":"IEEE Trans. Mob. Comput."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"108900","DOI":"10.1016\/j.comnet.2022.108900","article-title":"Joint computing, communication and cost-aware task offloading in D2D-enabled Het-MEC","volume":"209","author":"Abbas","year":"2022","journal-title":"Comput. 
Netw."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"358","DOI":"10.1016\/j.future.2021.01.021","article-title":"Security and energy-aware collaborative task offloading in D2D communication","volume":"118","author":"Li","year":"2021","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1825","DOI":"10.1007\/s10586-020-03230-y","article-title":"Delay-aware optimization of energy consumption for task offloading in fog environments using metaheuristic algorithms","volume":"24","author":"Keshavarznejad","year":"2021","journal-title":"Clust. Comput."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1393","DOI":"10.1007\/s10586-022-03542-1","article-title":"Partial offloading with stable equilibrium in fog-cloud environments using replicator dynamics of evolutionary game theory","volume":"25","author":"Khoobkar","year":"2022","journal-title":"Clust. Comput."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"102225","DOI":"10.1016\/j.sysarc.2021.102225","article-title":"A Survey on Task Offloading in Multi-access Edge Computing","volume":"118","author":"Islam","year":"2021","journal-title":"J. Syst. Arch."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"166079","DOI":"10.1109\/ACCESS.2019.2953172","article-title":"Device-enhanced MEC: Multi-access edge computing (MEC) aided by end device computation and caching: A survey","volume":"7","author":"Mehrabi","year":"2019","journal-title":"IEEE Access"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"15501477211023021","DOI":"10.1177\/15501477211023021","article-title":"Meta-heuristic-based offloading task optimization in mobile edge computing","volume":"17","author":"Abbas","year":"2021","journal-title":"Int. J. Distrib. Sens. 
Netw."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"108710","DOI":"10.1016\/j.comnet.2021.108710","article-title":"A Novel Lyapunov based Dynamic Resource Allocation for UAVs-assisted Edge Computing","volume":"205","author":"Lin","year":"2022","journal-title":"Comput. Netw."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chang, S., Li, C., Deng, C., and Luo, Y. (2023). Low-latency controller load balancing strategy and offloading decision generation algorithm based on lyapunov optimization in SDN mobile edge computing environment. Clust. Comput., 1\u201321.","DOI":"10.1007\/s10586-023-04012-y"},{"key":"ref_19","first-page":"17","article-title":"Reinforcement Learning Methods for Computation Offloading: A Systematic Review","volume":"56","author":"Zabihi","year":"2023","journal-title":"ACM Comput. Surv."},{"key":"ref_20","unstructured":"Gao, H., Wang, X., Ma, X., Wei, W., and Mumatz, S. (2020). Com-DDPG: A multiagent reinforcement learning-based offloading strategy for mobile edge computing. arXiv."},{"key":"ref_21","unstructured":"Silva, C., Magaia, N., and Grilo, A. (November, January 30). Task Offloading Optimization in Mobile Edge Computing based on Deep Reinforcement Learning. Proceedings of the Int\u2019l ACM Conference on Modeling Analysis and Simulation of Wireless and Mobile Systems, Montreal, QC, Canada."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"5349571","DOI":"10.1155\/2022\/5349571","article-title":"Joint Computation Offloading and Resource Allocation for NOMA-Enabled Multitask D2D System","volume":"2022","author":"Han","year":"2022","journal-title":"Wirel. Commun. Mob. Comput."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Li, G., Chen, M., Wei, X., Qi, T., and Zhuang, Q. (2020, January 15\u201319). Computation offloading with reinforcement learning in d2d-mec network. 
Proceedings of the 2020 International Wireless Communications and Mobile Computing (IWCMC), Limassol, Cyprus.","DOI":"10.1109\/IWCMC48107.2020.9148285"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1816","DOI":"10.1109\/LCOMM.2019.2931719","article-title":"Optimal Task Offloading Scheduling for Energy Efficient D2D Cooperative Computing","volume":"23","author":"Lin","year":"2019","journal-title":"IEEE Commun. Lett."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Guan, X., Lv, T., Lin, Z., Huang, P., and Zeng, J. (2022). D2D-Assisted Multi-User Cooperative Partial Offloading in MEC Based on Deep Reinforcement Learning. Sensors, 22.","DOI":"10.3390\/s22187004"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Liu, W., Xu, Y., Qi, N., Yao, J., Zhang, Y., and He, W. (2020, January 21\u201323). Joint computation offloading and resource allocation in UAV swarms with multi-access edge computing. Proceedings of the 2020 International Conference on Wireless Communications and Signal Processing (WCSP), Nanjing, China.","DOI":"10.1109\/WCSP49889.2020.9299713"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Fan, N., Wang, X., Wang, D., Lan, Y., and Hou, J. (2020, January 25\u201328). A collaborative task offloading scheme in d2d-assisted fog computing networks. Proceedings of the 2020 IEEE Wireless Communications and Networking Conference (WCNC), Seoul, Republic of Korea.","DOI":"10.1109\/WCNC45663.2020.9120662"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3243929","article-title":"Towards the decentralised cloud: Survey on approaches and challenges for mobile, ad hoc, and edge computing","volume":"51","author":"Ferrer","year":"2019","journal-title":"ACM Comput. Surv. (CSUR)"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Al-Absi, M.A., Al-Absi, A.A., Sain, M., and Lee, H. (2021). Moving Ad Hoc Networks\u2014A Comparative Study. 
Sustainability, 13.","DOI":"10.3390\/su13116187"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"484","DOI":"10.1038\/nature16961","article-title":"Mastering the game of Go with deep neural networks and tree search","volume":"529","author":"Silver","year":"2016","journal-title":"Nature"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1109\/TVT.2017.2739203","article-title":"Small Cell Cluster-Based Resource Allocation for Wireless Backhaul in Two-Tier Heterogeneous Networks with Massive MIMO","volume":"67","author":"Hao","year":"2017","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1109\/JSAC.2022.3228558","article-title":"Deep Reinforcement Learning Based Computation Offloading and Trajectory Planning for Multi-UAV Cooperative Target Search","volume":"41","author":"Luo","year":"2022","journal-title":"IEEE J. Sel. Areas Commun."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"119191","DOI":"10.1016\/j.eswa.2022.119191","article-title":"Proximal policy optimization algorithm for dynamic pricing with online reviews","volume":"213","author":"Wu","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_34","unstructured":"Zhu, W., and Rosendo, A. (2020). Proximal policy optimization smoothed algorithm. arXiv."},{"key":"ref_35","unstructured":"(2023, December 10). Power Consumption Benchmarks of Raspberry Pi 4B. Available online: https:\/\/www.pidramble.com\/wiki\/benchmarks\/power-consumption."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1066","DOI":"10.1016\/j.procs.2022.01.135","article-title":"A Review of Yolo Algorithm Developments","volume":"199","author":"Jiang","year":"2022","journal-title":"Procedia Comput. 
Sci."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1170","DOI":"10.1016\/j.ins.2019.10.035","article-title":"A scheduling scheme in the cloud computing environment using deep Q-learning","volume":"512","author":"Tong","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1561\/2200000071","article-title":"An introduction to deep reinforcement learning","volume":"11","author":"Henderson","year":"2018","journal-title":"Found. Trends Mach. Learn."},{"key":"ref_39","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal policy optimization algorithms. arXiv."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/16\/1\/19\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:38:17Z","timestamp":1760103497000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/16\/1\/19"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,2]]},"references-count":39,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,1]]}},"alternative-id":["fi16010019"],"URL":"https:\/\/doi.org\/10.3390\/fi16010019","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,2]]}}}