{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T10:50:51Z","timestamp":1781693451719,"version":"3.54.5"},"reference-count":38,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2021,2,28]],"date-time":"2021-02-28T00:00:00Z","timestamp":1614470400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Edge computing (EC) has recently emerged as a promising paradigm that supports resource-hungry Internet of Things (IoT) applications with low latency services at the network edge. However, the limited capacity of computing resources at the edge server poses great challenges for scheduling application tasks. In this paper, a task scheduling problem is studied in the EC scenario, and multiple tasks are scheduled to virtual machines (VMs) configured at the edge server by maximizing the long-term task satisfaction degree (LTSD). The problem is formulated as a Markov decision process (MDP) for which the state, action, state transition, and reward are designed. We leverage deep reinforcement learning (DRL) to solve both time scheduling (i.e., the task execution order) and resource allocation (i.e., which VM the task is assigned to), considering the diversity of the tasks and the heterogeneity of available resources. A policy-based REINFORCE algorithm is proposed for the task scheduling problem, and a fully-connected neural network (FCN) is utilized to extract the features. 
Simulation results show that the proposed DRL-based task scheduling algorithm outperforms the existing methods in the literature in terms of the average task satisfaction degree and success ratio.<\/jats:p>","DOI":"10.3390\/s21051666","type":"journal-article","created":{"date-parts":[[2021,2,28]],"date-time":"2021-02-28T20:43:32Z","timestamp":1614545012000},"page":"1666","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":160,"title":["Deep Reinforcement Learning-Based Task Scheduling in IoT Edge Computing"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0309-5482","authenticated-orcid":false,"given":"Shuran","family":"Sheng","sequence":"first","affiliation":[{"name":"School of Information Science and Engineering, Southeast University, Nanjing 210096, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7120-1577","authenticated-orcid":false,"given":"Peng","family":"Chen","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Millimeter Waves, Southeast University, Nanjing 210096, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8812-8970","authenticated-orcid":false,"given":"Zhimin","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Electronic and Information, Shanghai Dianji University, Shanghai 201306, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Lenan","family":"Wu","sequence":"additional","affiliation":[{"name":"School of Information Science and Engineering, Southeast University, Nanjing 210096, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuxuan","family":"Yao","sequence":"additional","affiliation":[{"name":"Shaanxi Key Laboratory of Integrated and Intelligent Navigation, Xi\u2019an 710068, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2021,2,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1125","DOI":"10.1109\/JIOT.2017.2683200","article-title":"A Survey on Internet of Things: Architecture, Enabling Technologies, Security and Privacy, and Applications","volume":"4","author":"Lin","year":"2017","journal-title":"IEEE Internet Things J."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1109\/MS.2016.20","article-title":"Reference Architectures for the Internet of Things","volume":"33","author":"Weyrich","year":"2016","journal-title":"IEEE Softw."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3317","DOI":"10.1109\/TPDS.2014.2381640","article-title":"Computation Offloading for Service Workflow in Mobile Cloud Computing","volume":"26","author":"Deng","year":"2015","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"450","DOI":"10.1109\/JIOT.2017.2750180","article-title":"Mobile Edge Computing: A Survey","volume":"5","author":"Abbas","year":"2018","journal-title":"IEEE Internet Things J."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1109\/JIOT.2016.2579198","article-title":"Edge Computing: Vision and Challenges","volume":"3","author":"Shi","year":"2016","journal-title":"IEEE Internet Things J."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"5404","DOI":"10.1109\/TWC.2020.2993071","article-title":"Offloading and Resource Allocation With General Task Graph in Mobile Edge Computing: A Deep Reinforcement Learning Approach","volume":"19","author":"Yan","year":"2020","journal-title":"IEEE Trans. Wirel. 
Commun."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"5031","DOI":"10.1109\/TVT.2019.2904244","article-title":"Collaborative Cloud and Edge Computing for Latency Minimization","volume":"68","author":"Ren","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2146","DOI":"10.1109\/JIOT.2018.2826006","article-title":"Application Aware Workload Allocation for Edge Computing-Based IoT","volume":"5","author":"Fan","year":"2018","journal-title":"IEEE Internet Things J."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"4921","DOI":"10.1109\/JIOT.2019.2893866","article-title":"From Cloud Down to Things: An Overview of Machine Learning in Internet of Things","volume":"6","author":"Samie","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1584","DOI":"10.1109\/JPROC.2019.2922285","article-title":"Computation Offloading Toward Edge Computing","volume":"107","author":"Lin","year":"2019","journal-title":"Proc. 
IEEE"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"6606","DOI":"10.1109\/JIOT.2019.2908861","article-title":"Credit-Based Payments for Fast Computing Resource Trading in Edge-Assisted Internet of Things","volume":"6","author":"Li","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2872","DOI":"10.1109\/JIOT.2018.2876198","article-title":"Joint Task Assignment, Transmission, and Computing Resource Allocation in Multilayer Mobile Edge Computing Systems","volume":"6","author":"Wang","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"4283","DOI":"10.1109\/JIOT.2018.2875917","article-title":"Joint Resource Allocation for Latency-Sensitive Services Over Mobile Edge Computing Networks With Caching","volume":"6","author":"Zhang","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"113345","DOI":"10.1109\/ACCESS.2019.2935217","article-title":"Resource Allocation for a UAV-Enabled Mobile-Edge Computing System: Computation Efficiency Maximization","volume":"7","author":"Zhang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"3074","DOI":"10.1109\/TNSE.2020.3015689","article-title":"Enhanced Online Q-Learning Scheme for Resource Allocation with Maximum Utility and Fairness in Edge-IoT Networks","volume":"7","author":"AlQerm","year":"2020","journal-title":"IEEE Trans. Netw. Sci. Eng."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"856","DOI":"10.1109\/TVT.2018.2881191","article-title":"Joint Task Offloading and Resource Allocation for Multi-Server Mobile-Edge Computing Networks","volume":"68","author":"Tran","year":"2019","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Tan, H., Han, Z., Li, X., and Lau, F.C.M. (2017, January 1\u20134). 
Online job dispatching and scheduling in edge-clouds. Proceedings of the IEEE INFOCOM 2017\u2014IEEE Conference on Computer Communications, Atlanta, GA, USA.","DOI":"10.1109\/INFOCOM.2017.8057116"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"86265","DOI":"10.1109\/ACCESS.2019.2924032","article-title":"Resource Scheduling for Delay Minimization in Multi-Server Cellular Edge Computing Systems","volume":"7","author":"Zhang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"117088","DOI":"10.1109\/ACCESS.2019.2934890","article-title":"A Hybrid Task Scheduling Scheme for Heterogeneous Vehicular Edge Systems","volume":"7","author":"Chen","year":"2019","journal-title":"IEEE Access"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"105008","DOI":"10.1109\/ACCESS.2019.2931336","article-title":"Joint Cotask-Aware Offloading and Scheduling in Mobile Edge Computing Systems","volume":"7","author":"Chiang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1016\/j.future.2019.01.007","article-title":"Collaborative cache allocation and task scheduling for data-intensive applications in edge computing environment","volume":"95","author":"Li","year":"2019","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1109\/TWC.2020.3024538","article-title":"Mobility-Aware Joint Task Scheduling and Resource Allocation for Cooperative Mobile Edge Computing","volume":"20","author":"Saleem","year":"2021","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"668","DOI":"10.1109\/JSAC.2019.2894306","article-title":"Dynamic Task Offloading and Scheduling for Low-Latency IoT Services in Multi-Access Edge Computing","volume":"37","author":"Alameddine","year":"2019","journal-title":"IEEE J. Sel. 
Areas Commun."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Abdel-Basset, M., Mohamed, R., Elhoseny, M., Bashir, A.K., Jolfaei, A., and Kumar, N. (2020). Energy-Aware Marine Predators Algorithm for Task Scheduling in IoT-based Fog Computing Applications. IEEE Trans. Ind. Inform., early access.","DOI":"10.1109\/TII.2020.3001067"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1109\/TPDS.2019.2961905","article-title":"Online Deadline-Aware Task Dispatching and Scheduling in Edge Computing","volume":"31","author":"Meng","year":"2020","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"4512","DOI":"10.1109\/JIOT.2018.2883762","article-title":"A Q-Learning-Based Proactive Caching Strategy for Non-Safety Related Services in Vehicular Networks","volume":"6","author":"Hou","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"4271","DOI":"10.1109\/TVT.2020.2972999","article-title":"Mode Selection and Resource Allocation in Sliced Fog Radio Access Networks: A Reinforcement Learning Approach","volume":"69","author":"Xiang","year":"2020","journal-title":"IEEE Trans. Veh. Technol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"6255","DOI":"10.1109\/TWC.2020.3001736","article-title":"Power Allocation in Multi-User Cellular Networks: Deep Reinforcement Learning Approaches","volume":"19","author":"Meng","year":"2020","journal-title":"IEEE Trans. Wirel. Commun."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"7011","DOI":"10.1109\/JIOT.2019.2913162","article-title":"iRAF: A Deep Reinforcement Learning Approach for Collaborative Mobile Edge Computing IoT Networks","volume":"6","author":"Chen","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Xu, Z., Tang, J., Yin, C., Wang, Y., Xue, G., Wang, J., and Gursoy, M.C. (2020). 
ReCARL: Resource Allocation in Cloud RANs with Deep Reinforcement Learning. IEEE Trans. Mob. Comput., early access.","DOI":"10.1109\/TMC.2020.3044282"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1133","DOI":"10.1109\/JSAC.2020.2986615","article-title":"Resource Allocation Based on Deep Reinforcement Learning in IoT Edge Computing","volume":"38","author":"Xiong","year":"2020","journal-title":"IEEE J. Sel. Areas Commun."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"55112","DOI":"10.1109\/ACCESS.2018.2872674","article-title":"DRL-Scheduling: An Intelligent QoS-Aware Job Scheduling Framework for Applications in Clouds","volume":"6","author":"Wei","year":"2018","journal-title":"IEEE Access"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1041","DOI":"10.1109\/JIOT.2020.3009540","article-title":"Heterogeneous Task Offloading and Resource Allocations via Deep Recurrent Reinforcement Learning in Partial Observable Multifog Networks","volume":"8","author":"Baek","year":"2021","journal-title":"IEEE Internet Things J."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"15178","DOI":"10.1109\/ACCESS.2018.2801319","article-title":"Optimal Scheduling of VMs in Queueing Cloud Computing Systems With a Heterogeneous Workload","volume":"6","author":"Guo","year":"2018","journal-title":"IEEE Access"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"5449","DOI":"10.1109\/JIOT.2020.2978830","article-title":"Deep-Reinforcement-Learning-Based Offloading Scheduling for Vehicular Edge Computing","volume":"7","author":"Zhan","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"3576","DOI":"10.1109\/JIOT.2020.3025015","article-title":"Cloud Resource Scheduling with Deep Reinforcement Learning and Imitation Learning","volume":"8","author":"Guo","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_37","unstructured":"Sutton, R.S., and Barto, A.G. (2018). 
Reinforcement Learning: An Introduction, MIT Press."},{"key":"ref_38","first-page":"1057","article-title":"Policy gradient methods for reinforcement learning with function approximation","volume":"99","author":"Sutton","year":"1999","journal-title":"Adv. Neural Inf. Process. Syst."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/5\/1666\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:30:38Z","timestamp":1760160638000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/5\/1666"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,28]]},"references-count":38,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2021,3]]}},"alternative-id":["s21051666"],"URL":"https:\/\/doi.org\/10.3390\/s21051666","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,28]]}}}