{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T14:23:32Z","timestamp":1777040612642,"version":"3.51.4"},"reference-count":57,"publisher":"MDPI AG","issue":"16","license":[{"start":{"date-parts":[[2023,8,17]],"date-time":"2023-08-17T00:00:00Z","timestamp":1692230400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Qatar National Research Fund","award":["NPRP11S-1202-170052"],"award-info":[{"award-number":["NPRP11S-1202-170052"]}]},{"name":"Qatar National Library","award":["NPRP11S-1202-170052"],"award-info":[{"award-number":["NPRP11S-1202-170052"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Modern active distribution networks (ADNs) witness increasing complexities that require efforts in control practices, including optimal reactive power dispatch (ORPD). Deep reinforcement learning (DRL) is proposed to manage the network\u2019s reactive power by coordinating different resources, including distributed energy resources, to enhance performance. However, there is a lack of studies examining DRL elements\u2019 performance sensitivity. To this end, in this paper we examine the impact of various DRL reward representations and hyperparameters on the agent\u2019s learning performance when solving the ORPD problem for ADNs. We assess the agent\u2019s performance regarding accuracy and training time metrics, as well as critic estimate measures. Furthermore, different environmental changes are examined to study the DRL model\u2019s scalability by including other resources. Results show that compared to other representations, the complementary reward function exhibits improved performance in terms of power loss minimization and convergence time by 10\u201315% and 14\u201318%, respectively. Also, adequate agent performance is observed to be neighboring the best-suited value of each hyperparameter for the studied problem. In addition, scalability analysis depicts that increasing the number of possible action combinations in the action space by approximately nine times results in 1.7 times increase in the training time.<\/jats:p>","DOI":"10.3390\/s23167216","type":"journal-article","created":{"date-parts":[[2023,8,17]],"date-time":"2023-08-17T10:47:02Z","timestamp":1692269222000},"page":"7216","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Optimal Reactive Power Dispatch in ADNs using DRL and the Impact of Its Various Settings and Environmental Changes"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7862-4820","authenticated-orcid":false,"given":"Tassneem","family":"Zamzam","sequence":"first","affiliation":[{"name":"Electrical Engineering Department, Qatar University, Doha 2713, Qatar"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5688-7515","authenticated-orcid":false,"given":"Khaled","family":"Shaban","sequence":"additional","affiliation":[{"name":"Computer Science and Engineering Department, Qatar University, Doha 2713, Qatar"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9343-469X","authenticated-orcid":false,"given":"Ahmed","family":"Massoud","sequence":"additional","affiliation":[{"name":"Electrical Engineering Department, Qatar University, Doha 2713, Qatar"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,8,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1016\/j.eswa.2017.06.009","article-title":"Optimal reactive power dispatch problem using a two-archive multi-objective grey wolf optimizer","volume":"87","author":"Nuaekaew","year":"2017","journal-title":"Expert Syst. Appl."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1049\/iet-gtd.2011.0681","article-title":"Optimal reactive power dispatch using a gravitational search algorithm","volume":"6","author":"Duman","year":"2012","journal-title":"IET Gener. Transm. Distrib."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1549","DOI":"10.1016\/j.asoc.2007.12.002","article-title":"Differential evolution approach for optimal reactive power dispatch","volume":"8","author":"Varadarajan","year":"2008","journal-title":"Appl. Soft Comput. J."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1447","DOI":"10.1109\/59.99399","article-title":"Application of Newton\u2019s optimal power flow in voltage\/reactive power control","volume":"5","author":"Bjelogrlic","year":"1990","journal-title":"IEEE Trans. Power Syst."},{"key":"ref_5","unstructured":"Lai, L.L., Nieh, T.Y., Vujatovic, D., Ma, Y.N., Lu, Y.P., Yang, Y.W., and Braun, H. (2005, January 18). Swarm intelligence for optimal reactive power dispatch. Proceedings of the 2005 IEEE\/PES transmission, Distribution Conference Asia Pacific, Dalian, China."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1109\/59.317548","article-title":"Optimal reactive dispatch through interior point methods\u2014Power Systems","volume":"9","author":"Granville","year":"1994","journal-title":"IEEE Trans. Power Syst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1016\/j.epsr.2010.10.005","article-title":"Differential evolution algorithm for optimal reactive power dispatch","volume":"81","author":"Abido","year":"2011","journal-title":"Electr. Power Syst. Res."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.ijepes.2010.08.019","article-title":"Electrical power and energy systems an investigation about the impact of the optimal reactive power dispatch solved by DE","volume":"33","author":"Ramirez","year":"2011","journal-title":"Int. J. Electr. Power Energy Syst."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"684","DOI":"10.1016\/j.ijepes.2010.11.018","article-title":"Optimal reactive power dispatch based on harmony search algorithm","volume":"33","author":"Khazali","year":"2011","journal-title":"Int. J. Electr. Power Energy Syst."},{"key":"ref_10","unstructured":"Ma, J., and Lai, L. (December, January 29). Application of genetic algorithm to optimal reactive power dispatch including voltage-dependent load models. Proceedings of the 1995 IEEE International Conference on Evolutionary Computation, Perth, WA, Australia."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1049\/ip-gtd:19951958","article-title":"Hybrid expert system and simulated annealing approach to optimal reactive power planning","volume":"142","author":"Jwo","year":"1995","journal-title":"IEEE Proc. Gener. Transm. Distrib."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1070","DOI":"10.1109\/TPWRS.2005.846064","article-title":"A multiagent-based particle swarm optimization approach for optimal reactive power dispatch","volume":"20","author":"Zhao","year":"2005","journal-title":"IEEE Trans. Power Syst."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Mouassa, S., and Bouktir, T. (2015, January 21\u201323). Artificial bee colony algorithm for discrete optimal reactive power dispatch. Proceedings of the 2015 International Conference on Industrial Engineering and Systems Management, Seville, Spain.","DOI":"10.1109\/IESM.2015.7380228"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"116830","DOI":"10.1016\/j.eswa.2022.116830","article-title":"Reinforcement learning in urban network traffic signal control: A systematic literature review","volume":"199","author":"Noaeen","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"116995","DOI":"10.1016\/j.eswa.2022.116995","article-title":"Power output optimization of electric vehicles smart charging hubs using deep reinforcement learning","volume":"201","author":"Bertolini","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1928","DOI":"10.1002\/er.4333","article-title":"A new generation of AI: A review and perspective on machine learning technologies applied to smart energy and electric power systems","volume":"43","author":"Cheng","year":"2019","journal-title":"Int. J. Energy Res."},{"key":"ref_17","unstructured":"Sutton, S.R., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press. [2nd ed.]."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1561\/2200000071","article-title":"An Introduction to Deep Reinforcement Learning","volume":"11","author":"Henderson","year":"2018","journal-title":"Found. Trends Mach. Learn."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Mbuwir, B.V., Kaffash, M., and Deconinck, G. (2018, January 29\u201331). Battery scheduling in a residential multi-carrier energy system using reinforcement learning. Proceedings of the 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids, Aalborg, Denmark.","DOI":"10.1109\/SmartGridComm.2018.8587412"},{"key":"ref_20","unstructured":"Wan, Z., Li, H., and He, H. (2018, January 8\u201313). Residential Energy Management with Deep Reinforcement Learning. Proceedings of the 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids, Rio de Janeiro, Brazil."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"5246","DOI":"10.1109\/TSG.2018.2879572","article-title":"Model-Free Real-Time EV Charging Scheduling Based on Deep Reinforcement Learning","volume":"10","author":"Wan","year":"2018","journal-title":"IEEE Trans. Smart Grid"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1016\/j.apenergy.2018.03.104","article-title":"Continuous reinforcement learning of energy management with deep Q network for a power split hybrid electric bus","volume":"222","author":"Wu","year":"2018","journal-title":"Appl. Energy"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"937","DOI":"10.1016\/j.apenergy.2018.12.061","article-title":"Incentive-based demand response for smart grid with reinforcement learning and deep neural network","volume":"236","author":"Lu","year":"2019","journal-title":"Appl. Energy"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"116564","DOI":"10.1016\/j.eswa.2022.116564","article-title":"Deep reinforcement learning approach for solving joint pricing and inventory problem with reference price effects","volume":"195","author":"Zhou","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_25","unstructured":"Hao, J. (2020). Deep Reinforcement Learning for the Optimization of Building Energy Control and Management. [Ph.D. Thesis, University of Denver]."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"3283","DOI":"10.1109\/JSYST.2018.2855689","article-title":"Residential Load Scheduling with Renewable Generation in the Smart Grid: A Reinforcement Learning Approach","volume":"13","author":"Remani","year":"2018","journal-title":"IEEE Syst. J."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"4338","DOI":"10.1109\/TSG.2018.2857449","article-title":"Indirect Customer-to-Customer Energy Trading With Reinforcement Learning","volume":"10","author":"Chen","year":"2018","journal-title":"IEEE Trans. Smart Grid"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"114632","DOI":"10.1016\/j.eswa.2021.114632","article-title":"An application of deep reinforcement learning to algorithmic trading","volume":"173","author":"Ernst","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"4513","DOI":"10.1109\/TSG.2020.2986333","article-title":"Deep Reinforcement Learning-Based Energy Storage Arbitrage with Accurate Lithium-Ion Battery Degradation Model","volume":"11","author":"Cao","year":"2020","journal-title":"IEEE Trans. Smart Grid"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"10728","DOI":"10.1109\/JIOT.2019.2941498","article-title":"Reinforcement Learning-Based Microgrid Energy Trading with a Reduced Power Plant Schedule","volume":"6","author":"Lu","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3259","DOI":"10.1109\/TSG.2016.2629450","article-title":"Convolutional Neural Networks for Automatic State-Time Feature Extraction in Reinforcement Learning Applied to Residential Load Control","volume":"9","author":"Claessens","year":"2018","journal-title":"IEEE Trans. Smart Grid"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1016\/j.apenergy.2019.03.027","article-title":"A reinforcement learning framework for optimal operation and maintenance of power grids","volume":"241","author":"Rocchetta","year":"2019","journal-title":"Appl. Energy"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yang, Q., Wang, G., Sadeghi, A., Giannakis, G.B., and Sun, J. (2019). Two-Timescale Voltage Control in Distribution Grids Using Deep Reinforcement Learning. arXiv.","DOI":"10.1109\/SmartGridComm.2019.8909764"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Diao, R., Wang, Z., Shi, D., Chang, Q., Duan, J., and Zhang, X. (2019). Autonomous Voltage Control for Grid Operation Using Deep Reinforcement Learning. arXiv.","DOI":"10.1109\/PESGM40551.2019.8973924"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2684","DOI":"10.1109\/TNNLS.2018.2885530","article-title":"A Multistage Game in Smart Grid Security: A Reinforcement Learning Solution","volume":"30","author":"Ni","year":"2019","journal-title":"IEEE Trans. Neural Networks Learn. Syst."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Paul, S., and Ni, Z. (2018, January 8\u201313). A Study of Linear Programming and Reinforcement Learning for One-Shot Game in Smart Grid Security. Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil.","DOI":"10.1109\/IJCNN.2018.8489202"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2158","DOI":"10.1109\/TSG.2018.2790704","article-title":"Evaluation of Reinforcement Learning-Based False Data Injection Attack to Automatic Voltage Control","volume":"10","author":"Chen","year":"2018","journal-title":"IEEE Trans. Smart Grid"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2192","DOI":"10.1109\/TNNLS.2018.2801880","article-title":"Distributed economic dispatch in Microgrids based on cooperative reinforcement learning","volume":"29","author":"Liu","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Han, C., Yang, B., Bao, T., Yu, T., and Zhang, X. (2017). Bacteria Foraging Reinforcement Learning for Risk-Based Economic Dispatch via Knowledge Transfer. Energies, 10.","DOI":"10.3390\/en10050638"},{"key":"ref_40","first-page":"19","article-title":"Smart grid optimization by deep reinforcement learning over discrete and continuous action space","volume":"8","author":"Sogabe","year":"2019","journal-title":"Bull. Netw. Comput. Syst. Softw."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"7360","DOI":"10.1109\/JIOT.2019.2899673","article-title":"When Edge Computing Meets Microgrid: A Deep Reinforcement Learning Approach","volume":"6","author":"Munir","year":"2019","journal-title":"IEEE Internet Things J."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Wang, D.-L., Sun, Q.-Y., Li, Y.-Y., and Liu, X.-R. (2019). Optimal Energy Routing Design in Energy Internet with Multiple Energy Routing Centers Using Artificial Neural Network-Based Reinforcement Learning Method. Appl. Sci., 9.","DOI":"10.3390\/app9030520"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1742","DOI":"10.1109\/TSMCC.2012.2218596","article-title":"Multiagent-Based Reinforcement Learning for Optimal Reactive Power Dispatch","volume":"42","author":"Xu","year":"2012","journal-title":"IEEE Trans. Syst. Man Cybern. Part C Appl. Rev."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"65","DOI":"10.17775\/CSEEJPES.2016.00037","article-title":"Hierarchically correlated equilibrium Q-learning for multi-area decentralized collaborative reactive power optimization","volume":"2","author":"Tan","year":"2016","journal-title":"CSEE J. Power Energy Syst."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"4137","DOI":"10.1109\/TSG.2021.3072251","article-title":"Data-Driven Multi-Agent Deep Reinforcement Learning for Distribution System Decentralized Voltage Control with High Penetration of PVs","volume":"12","author":"Cao","year":"2021","journal-title":"IEEE Trans. Smart Grid"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Ali, M., Mujeeb, A., Ullah, H., and Zeb, S. (2020, January 29\u201331). Reactive Power Optimization Using Feed Forward Neural Deep Reinforcement Learning Method: (Deep Reinforcement Learning DQN algorithm). Proceedings of the 2020 Asia Energy and Electrical Engineering Symposium (AEEES), Chengdu, China.","DOI":"10.1109\/AEEES48850.2020.9121492"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Tousi, M.R., Hosseinian, S.H., Jadidinejad, A.H., and Menhaj, M.B. (2008, January 1\u20133). Application of SARSA learning algorithm for reactive power control in power system. Proceedings of the 2008 IEEE 2nd International Power and Energy Conference, Johor Bahru, Malaysia.","DOI":"10.1109\/PECON.2008.4762658"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"109241","DOI":"10.1016\/j.asoc.2022.109241","article-title":"Reward criteria impact on the performance of reinforcement learning agent for autonomous navigation","volume":"126","author":"Dayal","year":"2022","journal-title":"Appl. Soft Comput."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"109150","DOI":"10.1016\/j.asoc.2022.109150","article-title":"Embedded draw-down constraint reward function for deep reinforcement learning","volume":"125","author":"Wu","year":"2022","journal-title":"Appl. Soft Comput."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"97557","DOI":"10.1109\/ACCESS.2021.3090364","article-title":"Subgoal-Based Reward Shaping to Improve Efficiency in Reinforcement Learning","volume":"9","author":"Okudo","year":"2021","journal-title":"IEEE Access"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"695","DOI":"10.1109\/TCDS.2019.2924724","article-title":"A Multiple-Attribute Decision-Making Approach to Reinforcement Learning","volume":"12","author":"Shi","year":"2020","journal-title":"IEEE Trans. Cogn. Dev. Syst."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"1274","DOI":"10.1109\/TMC.2019.2908171","article-title":"Distributed Energy-Efficient Multi-UAV Navigation for Long-Term Communication Coverage by Deep Reinforcement Learning","volume":"19","author":"Liu","year":"2020","journal-title":"IEEE Trans. Mob. Comput."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1038\/s41560-017-0074-z","article-title":"Impact of uncoordinated plug-in electric vehicle charging on residential power demand","volume":"3","author":"Muratori","year":"2018","journal-title":"Nat. Energy"},{"key":"ref_54","unstructured":"MathWorks (2022, September 27). Reinforcement Learning Toolbox. Available online: https:\/\/mathworks.com\/products\/reinforcement-learning.html."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1109\/TPWRS.2010.2051168","article-title":"MATPOWER: Steady-State Operations, Planning, and Analysis Tools for Power Systems Research and Education","volume":"26","author":"Zimmerman","year":"2010","journal-title":"IEEE Trans. Power Syst."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Garrido, V.M., Montoya, O.D., Medina-Quesada, \u00c1., and Hern\u00e1ndez, J.C. (2022). Optimal Reactive Power Compensation in Distribution Networks with Radial and Meshed Structures Using D-STATCOMs: A Mixed-Integer Convex Approach. Sensors, 22.","DOI":"10.3390\/s22228676"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Mora-Burbano, J.A., Fonseca-D\u00edaz, C.D., and Montoya, O.D. (2022). Application of the SSA for Optimal Reactive Power Compensation in Radial and Meshed Distribution Using D-STATCOMs. Algorithms, 15.","DOI":"10.3390\/a15100345"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/16\/7216\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:35:38Z","timestamp":1760128538000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/16\/7216"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,17]]},"references-count":57,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2023,8]]}},"alternative-id":["s23167216"],"URL":"https:\/\/doi.org\/10.3390\/s23167216","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,17]]}}}