{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T16:51:12Z","timestamp":1768582272001,"version":"3.49.0"},"reference-count":31,"publisher":"Association for Computing Machinery (ACM)","issue":"1","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Internet Technol."],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:p>\n                    Reinforcement Learning (RL) has emerged as a promising solution for task offloading due to its adaptability to dynamic environments and ability to reduce online computational overhead. Thereby, this article explores RL for optimizing periodic Directed Acyclic Graph (DAG) task offloading in multi-user Mobile Edge Computing (MEC) systems, aiming to minimize overall costs, including user device energy consumption and server computational charges. A key contribution of this work is the explicit modeling of user competition for limited edge resources, where concurrent access leads to dynamic contention, significantly affecting offloading latency and energy usage. However, this optimization task faces two main challenges: the high dimensionality of task states and the large action space, both of which increase learning complexity. To address this, we propose a dynamic and distributed Proximal Policy Optimization (PPO)-based offloading framework. An encoder is employed to map DAG node features and structural information into a lower-dimensional representation, reducing computational overhead and improving learning efficiency. Additionally, we incorporate behavioral cloning to imitate greedy policies as the PPO agent\u2019s initial behavior, effectively narrowing the action space and accelerating convergence. 
By combining representation learning and imitation-based initialization, our method enables the PPO agent to quickly adapt to environmental dynamics, leveraging both prior knowledge and real-time feedback to make informed offloading decisions. Simulation results confirm that our approach achieves rapid convergence and outperforms existing baselines in cost reduction, demonstrating its effectiveness for periodic task offloading in MEC scenarios. The source code and implementation details are available at:\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/xiaolutihua\/GAT\/tree\/master\">https:\/\/github.com\/xiaolutihua\/GAT\/tree\/master<\/jats:ext-link>\n                    .\n                  <\/jats:p>","DOI":"10.1145\/3762993","type":"journal-article","created":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T11:25:58Z","timestamp":1756207558000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Cost-Optimized Periodic DAG-Structured Task Offloading in Multi-User MEC Systems Using Reinforcement Learning"],"prefix":"10.1145","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9462-3983","authenticated-orcid":false,"given":"Yan","family":"Wang","sequence":"first","affiliation":[{"name":"The school of Computer science, Guangzhou University","place":["Guangzhou, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-3508-4965","authenticated-orcid":false,"given":"Yubin","family":"He","sequence":"additional","affiliation":[{"name":"Guangzhou University","place":["Guangzhou, China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1330-0945","authenticated-orcid":false,"given":"Gang","family":"Liu","sequence":"additional","affiliation":[{"name":"University of Electronic Science and Technology of China Shenzhen Institute for Advanced Study","place":["Shenzhen, 
China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5224-4048","authenticated-orcid":false,"given":"Keqin","family":"Li","sequence":"additional","affiliation":[{"name":"State University of New York","place":["New Paltz, United States"]}]}],"member":"320","published-online":{"date-parts":[[2026,1,14]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2022.3166110"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCCN.2021.3066619"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCCWorkshops57813.2023.10233785"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSC.2018.2826544"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCC.2024.3381646"},{"key":"e_1_3_1_7_2","article-title":"Price competition in multi-server edge computing networks under SAA and SIQ models","author":"Chen Ziya","year":"2024","unstructured":"Ziya Chen, Qian Ma, Lin Gao, and Xu Chen. 2024. Price competition in multi-server edge computing networks under SAA and SIQ models. IEEE Transactions on Mobile Computing 23, 1 (2024), 754\u2013768.","journal-title":"IEEE Transactions on Mobile Computing"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/WCNC49053.2021.9417286"},{"key":"e_1_3_1_9_2","article-title":"Intelligent delay-aware partial computing task offloading for multiuser industrial internet of things through edge computing","author":"Deng Xiaoheng","year":"2023","unstructured":"Xiaoheng Deng, Jian Yin, Peiyuan Guan, Neal N. Xiong, Lan Zhang, and Shahid Mumtaz. 2023. Intelligent delay-aware partial computing task offloading for multiuser industrial internet of things through edge computing. 
IEEE Internet of Things Journal 10, 4 (2023), 2954\u20132966.","journal-title":"IEEE Internet of Things Journal"},{"issue":"8","key":"e_1_3_1_10_2","first-page":"3571","article-title":"Offloading in mobile edge computing: Task allocation and computational frequency scaling","volume":"65","author":"Dinh Thinh Quang","year":"2017","unstructured":"Thinh Quang Dinh, Jianhua Tang, Quang Duy La, and Tony Q. S. Quek. 2017. Offloading in mobile edge computing: Task allocation and computational frequency scaling. IEEE Transactions on Communications 65, 8 (2017), 3571\u20133584.","journal-title":"IEEE Transactions on Communications"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2021.3123165"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2023.10.024"},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"Qinting Jiang Xiaolong Xu Qiang He Xuyun Zhang Fei Dai Lianyong Qi and Wanchun Dou. 2021. Game theory-based task offloading and resource allocation for vehicular networks in edge-cloud computing. 2021 IEEE International Conference on Web Services (ICWS). Chicago IL USA 2021 341\u2013346.","DOI":"10.1109\/ICWS53863.2021.00052"},{"key":"e_1_3_1_14_2","article-title":"Computation offloading scheduling for periodic tasks in mobile edge computing","year":"2020","unstructured":"Sla\u0111ana Jo\u0161ilo and Gy\u00f6rgy D\u00e1n. 2020. Computation offloading scheduling for periodic tasks in mobile edge computing. 
IEEE\/ACM Transactions on Networking 28, 2 (2020), 667\u2013680.","journal-title":"IEEE\/ACM Transactions on Networking"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.comnet.2023.109572"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2022.3200431"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2021.09.003"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2020.3004225"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3466772.3467034"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCC58397.2023.10218299"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2021.3051427"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCOMM.2020.3034668"},{"key":"e_1_3_1_23_2","unstructured":"John Schulman Filip Wolski Prafulla Dhariwal Alec Radford and Oleg Klimov. 2017. Proximal policy optimization algorithms. CoRR abs\/1707.06347 (2017)."},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.pmcj.2021.101395"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCOMM.2020.3044085"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVT.2018.2881191"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCOMM.2023.3266931"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2021.03.041"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCC51575.2020.9345003"},{"key":"e_1_3_1_30_2","article-title":"Share-aware joint model deployment and task offloading for multi-task inference","author":"Wu Yalan","year":"2024","unstructured":"Yalan Wu, Jigang Wu, Long Chen, Bosheng Liu, Mianyang Yao, and Siew Kei Lam. 2024. Share-aware joint model deployment and task offloading for multi-task inference. 
IEEE Transactions on Intelligent Transportation Systems 25, 6 (2024), 5674\u20135687.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_1_31_2","doi-asserted-by":"crossref","unstructured":"Zeinab Zabihi Amir Masoud Eftekhari Moghadam and Mohammad Hossein Rezvani. 2023. Reinforcement learning methods for computation offloading: A systematic review. 56 1 Article No. 17 (2023) 1\u201341.","DOI":"10.1145\/3603703"},{"key":"e_1_3_1_32_2","article-title":"A dynamic task offloading scheme based on location forecasting for mobile intelligent vehicles","author":"Zhang Zhiwei","year":"2024","unstructured":"Zhiwei Zhang, Zehan Chen, Yulong Shen, Xuewen Dong, and Ning Xi. 2024. A dynamic task offloading scheme based on location forecasting for mobile intelligent vehicles. IEEE Transactions on Vehicular Technology 73, 6 (2024), 7532\u20137546.","journal-title":"IEEE Transactions on Vehicular Technology"}],"container-title":["ACM Transactions on Internet Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3762993","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T05:08:31Z","timestamp":1768540111000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3762993"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,14]]},"references-count":31,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,2,28]]}},"alternative-id":["10.1145\/3762993"],"URL":"https:\/\/doi.org\/10.1145\/3762993","relation":{},"ISSN":["1533-5399","1557-6051"],"issn-type":[{"value":"1533-5399","type":"print"},{"value":"1557-6051","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,14]]},"assertion":[{"value":"2024-11-02","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2025-08-07","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-01-14","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}