{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T00:56:31Z","timestamp":1759971391644,"version":"build-2065373602"},"reference-count":44,"publisher":"Association for Computing Machinery (ACM)","issue":"6","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Embed. Comput. Syst."],"published-print":{"date-parts":[[2025,11,30]]},"abstract":"<jats:p>Wireless Powered Communication Network (WPCN) is a new paradigm to allow low-power wireless devices to exchange data packets and receive stable energy transfer from a power source and thus support autonomous and sustainable network operations without battery replacements. In recent years, we have witnessed the growing deployment of WPCNs in both industrial and consumer IoT systems to support time-triggered and event-triggered monitoring applications. In this article, we present a novel reinforcement learning (RL)-based on-demand path planning framework to plan the trajectory of a Mobile Charger (MC) and schedule the charging sequence of wireless devices to sustain the network operations. A modified Deep Q-learning approach is designed to charge the wireless devices by balancing between their residual energy level and the distance from the MC to the device. This approach minimizes the total distance that the MC travels while ensuring that individual residual energy of a given set of devices is above a designated threshold. Extensive experimental results from both the Gazebo-based high-fidelity simulation and Turtlebot-based physical testbed demonstrate that our approach outperforms the classic scheduling methods (e.g., Nearest Job Next and Earliest Deadline First), state-of-the-art scheduling methods(Extended Particle Swarm Optimization, Enhanced Teaching\u2013Learning-Based Optimization Algorithm\u00a0and Spatiotemporal Optimization for Charging Scheduling), learning-based methods (e.g., Proximal Policy Optimization and Advantage Actor-Critic) with similar sample sizes for training.<\/jats:p>","DOI":"10.1145\/3763235","type":"journal-article","created":{"date-parts":[[2025,8,25]],"date-time":"2025-08-25T11:25:51Z","timestamp":1756121151000},"page":"1-32","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Deep Q-Learning-Based Mobile Charger Path Planning in Wireless Powered Communication Networks"],"prefix":"10.1145","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4671-0070","authenticated-orcid":false,"given":"Mainak","family":"Mondal","sequence":"first","affiliation":[{"name":"Computer Science and Engineering, University of Connecticut","place":["Storrs, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4246-8616","authenticated-orcid":false,"given":"Fei","family":"Dou","sequence":"additional","affiliation":[{"name":"School of Computing, University of Georgia","place":["Athens, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6996-4092","authenticated-orcid":false,"given":"Jinbo","family":"Bi","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, University of Connecticut","place":["Storrs, United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1491-7675","authenticated-orcid":false,"given":"Song","family":"Han","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering, University of Connecticut","place":["Storrs, United 
States"]}]}],"member":"320","published-online":{"date-parts":[[2025,10,8]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","unstructured":"Sanjai Prasada Rao Banoth Praveen Kumar Donta and Tarachand Amgoth. 2021. Dynamic mobile charger scheduling with partial charging strategy for WSNs using deep-Q-networks. Neural Computing and Applications 33 22 (2021) 15267\u201315279.","DOI":"10.1007\/s00521-021-06146-9"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.3390\/vehicles5020028"},{"key":"e_1_3_2_4_2","unstructured":"Various Contributers. 2023. ROS Navigation Stack Source Code. Retrieved January 15 2025 from https:\/\/github.com\/ROBOTISGIT\/turtlebot3"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2015.7139807"},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","unstructured":"Dieter Fox Wolfram Burgard and Sebastian Thrun. 1997. The dynamic window approach to collision avoidance. Robotics & Automation Magazine IEEE 4 1 (1997) 23\u201333.","DOI":"10.1109\/100.580977"},{"key":"e_1_3_2_7_2","doi-asserted-by":"crossref","unstructured":"Zheng Gong Hao Wu Yong Feng and Nianbo Liu. 2023. Deep reinforcement learning-based online one-to-multiple charging scheme in wireless rechargeable sensor network. Sensors 23 8 (2023) 3903.","DOI":"10.3390\/s23083903"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/CICT56698.2022.9997963"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/MASS.2013.51"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2014.2368557"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2023.3294434"},{"key":"e_1_3_2_12_2","doi-asserted-by":"crossref","unstructured":"Chengpeng Jiang Wencong Chen Jing Wang Ziyang Wang and Wendong Xiao. 2024. An improved deep Q-network approach for charging sequence scheduling with optimal mobile charging cost and charging efficiency in wireless rechargeable sensor networks. Ad Hoc Networks 157 (2024) 103458.","DOI":"10.1016\/j.adhoc.2024.103458"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICC.2017.7997315"},{"key":"e_1_3_2_14_2","unstructured":"Timothy P. Lillicrap Jonathan J. Hunt Alexander Pritzel Nicolas Heess Tom Erez Yuval Tassa David Silver and Daan Wierstra. 2019. Continuous control with deep reinforcement learning. arXiv:1509.02971."},{"key":"e_1_3_2_15_2","doi-asserted-by":"crossref","unstructured":"Chi Lin Guowei Wu Mohammad S. Obaidat and Chang Wu Yu. 2016. Clustering and splitting charging algorithms for large scaled wireless rechargeable sensor networks. Journal of Systems and Software 113 (2016) 381\u2013394.","DOI":"10.1016\/j.jss.2015.12.017"},{"key":"e_1_3_2_16_2","doi-asserted-by":"crossref","unstructured":"Tang Liu Baijun Wu Hongyi Wu and Jian Peng. 2017. Low-cost collaborative mobile charging for large-scale wireless sensor networks. IEEE Transactions on Mobile Computing 16 8 (2017) 2213\u20132227.","DOI":"10.1109\/TMC.2016.2616309"},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"Tang Liu Baijun Wu Wenzheng Xu Xianbo Cao Jiangen Peng and Hongyi Wu. 2021. RLC: A reinforcement learning-based charging algorithm for mobile devices. ACM Transactions on Sensor Networks 17 4 Article 36 (2021).","DOI":"10.1145\/3453682"},{"key":"e_1_3_2_18_2","doi-asserted-by":"crossref","unstructured":"Yong Liu Kam-Yiu Lam Song Han and Qingchun Chen. 2019. Mobile data gathering and energy harvesting in rechargeable wireless sensor networks. 
Information Sciences 482 (2019) 189\u2013209.","DOI":"10.1016\/j.ins.2019.01.014"},{"key":"e_1_3_2_19_2","doi-asserted-by":"crossref","unstructured":"Yu Ma Weifa Liang and Wenzheng Xu. 2018. Charging utility maximization in wireless rechargeable sensor networks by charging multiple sensors simultaneously. IEEE\/ACM Transactions on Networking 26 4 (2018) 1591\u20131604.","DOI":"10.1109\/TNET.2018.2841420"},{"key":"e_1_3_2_20_2","unstructured":"Volodymyr Mnih Adria Puigdomenech Badia Mehdi Mirza Alex Graves Timothy Lillicrap Tim Harley David Silver and Koray Kavukcuoglu. 2016. Asynchronous methods for deep reinforcement learning. In Proceedings of The 33rd International Conference on Machine Learning (ICML). 1928\u20131937."},{"key":"e_1_3_2_21_2","doi-asserted-by":"crossref","unstructured":"Volodymyr Mnih Koray Kavukcuoglu David Silver Andrei A. Rusu Joel Veness Marc G. Bellemare Alex Graves Martin Riedmiller Andreas K. Fidjeland Georg Ostrovski et\u00a0al. 2015. Human-level control through deep reinforcement learning. Nature 518 7540 (2015) 529\u2013533.","DOI":"10.1038\/nature14236"},{"key":"e_1_3_2_22_2","unstructured":"Mainak Mondal. 2024. DQN+ Source Files. Retrieved February 4 2025 from https:\/\/github.com\/CrashxZ\/WPCN_Power_Scheduling"},{"key":"e_1_3_2_23_2","unstructured":"Emilio Parisotto H. Francis Song Jack W. Rae Razvan Pascanu Caglar Gulcehre Siddhant M. Jayakumar Max Jaderberg Rapha\u00ebl Lopez Kaufman Aidan Clark Seb Noury Matthew M. Botvinick Nicolas Heess and Raia Hadsell. 2020. Stabilizing transformers for reinforcement learning. In Proceedings of the 37th International Conference on Machine Learning (ICML). Article 694."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCE.2024.3419128"},{"key":"e_1_3_2_25_2","volume-title":"ICRA Workshop on Open Source Software","author":"Quigley Morgan","year":"2009","unstructured":"Morgan Quigley, Ken Conley, Brian Gerkey, Josh Faust, Tully Foote, Jeremy Leibs, Rob Wheeler, and Andrew Y. Ng. 2009. ROS: An open-source robot operating system. In ICRA Workshop on Open Source Software."},{"key":"e_1_3_2_26_2","unstructured":"Robotis. 2022. Turtlebot 3 ROS Packages Source Code. Retrieved January 15 2025 from https:\/\/github.com\/ROBOTIS-GIT\/turtlebot3"},{"key":"e_1_3_2_27_2","unstructured":"Robotis. 2022. Turtlebot E-manual Autonomous Navigation. Retrieved January 15 2025 from https:\/\/emanual.robotis.com\/docs\/en\/platform\/turtlebot3\/navigation\/#navigation"},{"key":"e_1_3_2_28_2","unstructured":"Robotis. 2022. Turtlebot E-manual Basic Operation. Retrieved January 15 2025 from https:\/\/emanual.robotis.com\/docs\/en\/platform\/turtlebot3\/basic_operation\/#basic-operation"},{"key":"e_1_3_2_29_2","unstructured":"Robotis. 2022. Turtlebot E-manual Bringup. Retrieved January 15 2025 from https:\/\/emanual.robotis.com\/docs\/en\/platform\/turtlebot3\/bringup\/#bringup"},{"key":"e_1_3_2_30_2","unstructured":"Robotis. 2022. Turtlebot E-manual SLAM. Retrieved January 15 2025 from https:\/\/emanual.robotis.com\/docs\/en\/platform\/turtlebot3\/slam\/#run-slam-node"},{"key":"e_1_3_2_31_2","unstructured":"ROS. 2020. ROS Wiki Map Saver. Retrieved January 15 2025 from https:\/\/wiki.ros.org\/map_server"},{"key":"e_1_3_2_32_2","doi-asserted-by":"crossref","unstructured":"Fahira Sangare Yong Xiao Dusit Niyato and Zhu Han. 2017. Mobile charging in wireless-powered sensor networks: Optimal scheduling and experimental implementation. 
IEEE Transactions on Vehicular Technology 66 8 (2017) 7400\u20137410.","DOI":"10.1109\/TVT.2017.2668990"},{"key":"e_1_3_2_33_2","unstructured":"John Schulman Filip Wolski Prafulla Dhariwal Alec Radford and Oleg Klimov. 2017. Proximal policy optimization algorithms. arXiv:1707.06347."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.5555\/3016100.3016191"},{"key":"e_1_3_2_35_2","unstructured":"Ziyu Wang Tom Schaul Matteo Hessel Hado van Hasselt Marc Lanctot and Nando de Freitas. 2016. Dueling network architectures for deep reinforcement learning. In Proceedings of the 33rd International Conference on Machine Learning (ICML). 1995\u20132003."},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-94268-1_40"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/2491288.2491291"},{"key":"e_1_3_2_38_2","doi-asserted-by":"crossref","unstructured":"Yuan Xing Riley Young Giaolong Nguyen Maxwell Lefebvre Tianchi Zhao Haowen Pan and Liang Dong. 2022. Optimal path planning for wireless power transfer robot using area division deep reinforcement learning. Wireless Power Transfer 2022 Article 9921885 (2022).","DOI":"10.1155\/2022\/9921885"},{"key":"e_1_3_2_39_2","doi-asserted-by":"crossref","unstructured":"Meiyi Yang Nianbo Liu Lin Zuo Yong Feng Minghui Liu Haigang Gong and Ming Liu. 2021. Dynamic charging scheme problem with actor-critic reinforcement learning. IEEE Internet of Things Journal 8 1 (2021) 370\u2013380.","DOI":"10.1109\/JIOT.2020.3005598"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/RTAS52030.2021.00035"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/MASS.2012.6502505"},{"key":"e_1_3_2_42_2","doi-asserted-by":"crossref","unstructured":"Chuanxin Zhao Hengjing Zhang Fulong Chen Siguang Chen Changzhi Wu and Taochun Wang. 2020. Spatiotemporal charging scheduling in wireless rechargeable sensor networks. 
Computer Communications 152 (2020) 155\u2013170.","DOI":"10.1016\/j.comcom.2020.01.037"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3316781.3317926"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCAD.2023.3235799"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/RTSS49844.2020.00025"}],"container-title":["ACM Transactions on Embedded Computing Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3763235","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,8]],"date-time":"2025-10-08T13:51:38Z","timestamp":1759931498000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3763235"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,8]]},"references-count":44,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,11,30]]}},"alternative-id":["10.1145\/3763235"],"URL":"https:\/\/doi.org\/10.1145\/3763235","relation":{},"ISSN":["1539-9087","1558-3465"],"issn-type":[{"type":"print","value":"1539-9087"},{"type":"electronic","value":"1558-3465"}],"subject":[],"published":{"date-parts":[[2025,10,8]]},"assertion":[{"value":"2024-10-08","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-11","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
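The abstract above describes a modified Deep Q-learning scheduler that picks the next device for the Mobile Charger (MC) to visit by weighing each device's residual energy against its distance from the MC, with the goal of minimizing total travel distance while keeping every device above a designated energy threshold. The sketch below is a minimal, illustrative Python/PyTorch rendering of that idea only: the state layout (residual energies concatenated with distances), the reward weights, the network sizes, and all identifiers (QNet, select_device, ENERGY_THRESHOLD, and so on) are assumptions made here for illustration and are not taken from the paper or from the DQN+ source files cited in reference e_1_3_2_22_2.

```python
# Minimal sketch, assuming a normalized state of the form
# [e_1, ..., e_N, d_1, ..., d_N] (residual energies, distances from the MC).
# All constants, names, and shapes are illustrative assumptions, not the
# paper's actual formulation.
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

N_DEVICES = 6            # assumed number of wireless devices
ENERGY_THRESHOLD = 0.2   # assumed normalized residual-energy threshold


class QNet(nn.Module):
    """Maps the state to one Q-value per candidate device to charge next."""

    def __init__(self, n_devices: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_devices, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_devices),
        )

    def forward(self, x):
        return self.net(x)


def reward(distance, residual_energy):
    """Assumed reward: penalize travel distance and any device whose
    residual energy has fallen below the designated threshold."""
    shortfall = np.clip(ENERGY_THRESHOLD - residual_energy, 0.0, None).sum()
    return float(-distance - 10.0 * shortfall)


def select_device(qnet, state, epsilon):
    """Epsilon-greedy choice of the next device for the MC to charge."""
    if random.random() < epsilon:
        return random.randrange(N_DEVICES)
    with torch.no_grad():
        q = qnet(torch.as_tensor(state, dtype=torch.float32))
    return int(torch.argmax(q).item())


def train_step(qnet, target_net, optimizer, batch, gamma=0.99):
    """One gradient step on a minibatch of (s, a, r, s') transitions."""
    s, a, r, s2 = zip(*batch)
    s = torch.as_tensor(np.array(s), dtype=torch.float32)
    a = torch.as_tensor(a, dtype=torch.int64).unsqueeze(1)
    r = torch.as_tensor(r, dtype=torch.float32)
    s2 = torch.as_tensor(np.array(s2), dtype=torch.float32)
    q_sa = qnet(s).gather(1, a).squeeze(1)
    with torch.no_grad():
        target = r + gamma * target_net(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)


if __name__ == "__main__":
    qnet, target_net = QNet(N_DEVICES), QNet(N_DEVICES)
    target_net.load_state_dict(qnet.state_dict())
    optimizer = torch.optim.Adam(qnet.parameters(), lr=1e-3)
    replay = deque(maxlen=10_000)

    state = np.random.rand(2 * N_DEVICES)           # placeholder initial state
    for _ in range(64):
        action = select_device(qnet, state, epsilon=0.1)
        next_state = np.random.rand(2 * N_DEVICES)  # placeholder transition
        r = reward(distance=state[N_DEVICES + action],
                   residual_energy=next_state[:N_DEVICES])
        replay.append((state, action, r, next_state))
        state = next_state

    train_step(qnet, target_net, optimizer, random.sample(list(replay), 32))
```

In this sketch the action is simply the index of the next device to visit; point-to-point motion between devices would be handled separately (for example by the ROS navigation stack cited in the references), and refinements such as a target-network update schedule or a dueling architecture could be layered on without changing this interface.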