{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,1]],"date-time":"2026-06-01T12:57:07Z","timestamp":1780318627798,"version":"3.54.1"},"reference-count":23,"publisher":"MDPI AG","issue":"22","license":[{"start":{"date-parts":[[2019,11,19]],"date-time":"2019-11-19T00:00:00Z","timestamp":1574121600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004663","name":"Ministry of Science and Technology, Taiwan","doi-asserted-by":"publisher","award":["MOST 105-2511-S-009-016-MY3"],"award-info":[{"award-number":["MOST 105-2511-S-009-016-MY3"]}],"id":[{"id":"10.13039\/501100004663","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004663","name":"Ministry of Science and Technology, Taiwan","doi-asserted-by":"publisher","award":["MOST 108-2221-E-009 -119 -"],"award-info":[{"award-number":["MOST 108-2221-E-009 -119 -"]}],"id":[{"id":"10.13039\/501100004663","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The maximum power point tracking (MPPT) technique is often used in photovoltaic (PV) systems to extract the maximum power in various environmental conditions. The perturbation and observation (P&amp;O) method is one of the most well-known MPPT methods; however, it may face problems of large oscillations around maximum power point (MPP) or low-tracking efficiency. In this paper, two reinforcement learning-based maximum power point tracking (RL MPPT) methods are proposed by the use of the Q-learning algorithm. One constructs the Q-table and the other adopts the Q-network. These two proposed methods do not require the information of an actual PV module in advance and can track the MPP through offline training in two phases, the learning phase and the tracking phase. 
From the experimental results, both the reinforcement learning-based Q-table maximum power point tracking (RL-QT MPPT) and the reinforcement learning-based Q-network maximum power point tracking (RL-QN MPPT) methods have smaller ripples and faster tracking speeds when compared with the P&amp;O method. In addition, for these two proposed methods, the RL-QT MPPT method performs with smaller oscillation and the RL-QN MPPT method achieves higher average power.<\/jats:p>","DOI":"10.3390\/s19225054","type":"journal-article","created":{"date-parts":[[2019,11,19]],"date-time":"2019-11-19T11:30:17Z","timestamp":1574163017000},"page":"5054","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":46,"title":["Maximum Power Point Tracking of Photovoltaic System Based on Reinforcement Learning"],"prefix":"10.3390","volume":"19","author":[{"given":"Kuan-Yu","family":"Chou","sequence":"first","affiliation":[{"name":"Institute of Electrical and Control Engineering, National Chiao Tung University, Hsinchu 30010, Taiwan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Shu-Ting","family":"Yang","sequence":"additional","affiliation":[{"name":"Institute of Electrical and Control Engineering, National Chiao Tung University, Hsinchu 30010, Taiwan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yon-Ping","family":"Chen","sequence":"additional","affiliation":[{"name":"Institute of Electrical and Control Engineering, National Chiao Tung University, Hsinchu 30010, Taiwan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2019,11,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1016\/j.rser.2015.02.009","article-title":"A survey of the most used MPPT methods: Conventional and advanced algorithms applied for photovoltaic systems","volume":"45","author":"Bendib","year":"2015","journal-title":"Renew. Sustain. 
Energy Rev."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1555","DOI":"10.1016\/j.solmat.2005.10.023","article-title":"Review of the maximum power point tracking algorithms for stand-alone photovoltaic systems","volume":"90","author":"Salas","year":"2006","journal-title":"Sol. Energy Mat. Sol. Cells"},{"key":"ref_3","unstructured":"Xiao, W., and Dunford, W.G. (2004, January 20\u201325). A modified adaptive hill climbing MPPT method for photovoltaic power systems. Proceedings of the 2004 IEEE 35th Annual Power Electronics Specialists Conference (IEEE Cat. No. 04CH37551), Aachen, Germany."},{"key":"ref_4","unstructured":"Jung, Y., So, J., Yu, G., and Choi, J. (2005, January 3\u20137). Improved perturbation and observation method (IP&O) of MPPT control for photovoltaic power systems. Proceedings of the Conference Record of the Thirty-first IEEE Photovoltaic Specialists Conference, Lake Buena Vista, FL, USA."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1016\/j.apenergy.2015.04.006","article-title":"An improved perturb and observe (P&O) maximum power point tracking (MPPT) algorithm for higher efficiency","volume":"150","author":"Ahmed","year":"2015","journal-title":"Appl. Energy"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Rezoug, M., Chenni, R., and Taibi, D. (2018). Fuzzy logic-based perturb and observe algorithm with variable step of a reference voltage for solar permanent magnet synchronous motor drive system fed by direct-connected photovoltaic array. Energies, 11.","DOI":"10.3390\/en11020462"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Mohd Zainuri, M.A.A., Mohd Radzi, M.A., Soh, A.C., and Rahim, N.A. (November, January 30). Adaptive P&O-fuzzy control MPPT for PV boost dc-dc converter. 
Proceedings of the 2012 IEEE International Conference on Power and Energy (PECon), Kota Kinabalu, Malaysia.","DOI":"10.1109\/PECon.2012.6450270"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Sutton, R., and Barto, A. (1998). Introduction to Reinforcement Learning, MIT Press.","DOI":"10.1109\/TNN.1998.712192"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"245","DOI":"10.7763\/JOCET.2016.V4.290","article-title":"Reinforcement Learning for Online Maximum Power Point Tracking Control","volume":"4","author":"Youssef","year":"2016","journal-title":"J. Clean Energy Technol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2015\/496401","article-title":"A Reinforcement Learning-Based Maximum Power Point Tracking Method for Photovoltaic Array","volume":"2015","author":"Hsu","year":"2015","journal-title":"Int. J. Photoenergy"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1016\/j.renene.2017.03.008","article-title":"A reinforcement learning approach for MPPT control method of photovoltaic sources","volume":"108","author":"Kofinas","year":"2017","journal-title":"Renew. Energy"},{"key":"ref_12","unstructured":"Lin, L.-J. (1992). Reinforcement Learning for Robots using Neural Networks. [Ph.D. Dissertation, Carnegie Mellon University]."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nature14236","article-title":"Human-level control through deep reinforcement learning","volume":"518","author":"Mnih","year":"2015","journal-title":"Nature"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Tiwari, G.N., and Tiwari, A. (2017). 
Handbook of Solar Energy\u2014Theory, Analysis and Applications, Springer.","DOI":"10.1007\/978-981-10-0807-8"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"494","DOI":"10.1016\/j.rser.2015.11.051","article-title":"Solar cell parameters extraction based on single and double-diode models: A review","volume":"56","author":"Humada","year":"2016","journal-title":"Renew. Sustain. Energy Rev."},{"key":"ref_16","unstructured":"Sedra, A., and Smith, K. (2011). Microelectronic Circuits, Oxford University Press. [6th ed.]."},{"key":"ref_17","first-page":"679","article-title":"A Markovian Decision Process","volume":"6","author":"Bellman","year":"1957","journal-title":"J. Math. Mech."},{"key":"ref_18","unstructured":"Howard, R.A. (1960). Dynamic Programming and Markov Processes, MIT Press."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1007\/BF00992698","article-title":"Q-learning","volume":"8","author":"Watkins","year":"1992","journal-title":"Mach. Learn."},{"key":"ref_20","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press."},{"key":"ref_21","unstructured":"Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv."},{"key":"ref_22","unstructured":"(2019, November 16). Raspberry Pi Foundation. Available online: http:\/\/www.raspberrypi.org."},{"key":"ref_23","unstructured":"Hossain, M.A., Islam, M.S., Chowdhury, M.M.H., Sabuj, M.N.H., and Bari, M.S. (2011, January 18\u201320). Performance Evaluation of 1.68 kwp DC Operated Solar Pump with Auto Tracker Using Microcontroller Based Data Acquisition System. 
Proceedings of the International Conference on Mechanical Engineering 2011, Dhaka, Bangladesh."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/19\/22\/5054\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:35:48Z","timestamp":1760189748000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/19\/22\/5054"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,19]]},"references-count":23,"journal-issue":{"issue":"22","published-online":{"date-parts":[[2019,11]]}},"alternative-id":["s19225054"],"URL":"https:\/\/doi.org\/10.3390\/s19225054","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,11,19]]}}}