{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T02:08:25Z","timestamp":1774663705695,"version":"3.50.1"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"29","license":[{"start":{"date-parts":[[2025,9,7]],"date-time":"2025-09-07T00:00:00Z","timestamp":1757203200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,9,7]],"date-time":"2025-09-07T00:00:00Z","timestamp":1757203200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"CRUI-CARE Agreement Politecnico di Milano"},{"DOI":"10.13039\/501100006690","name":"Politecnico di Milano","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006690","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput &amp; Applic"],"published-print":{"date-parts":[[2025,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Accurate short-term photovoltaic (PV) power forecasts are critical for efficient grid balancing, yet training optimizers, often overlooked compared to neural network architectures, significantly influence prediction accuracy and convergence speed. Prior research primarily focuses on adjusting network architectures, typically employing a single optimizer (commonly Adam), thus leaving optimizer selection underexplored, especially under noisy and incomplete real-world PV data. This study systematically benchmarks four optimizers\u2014Adam, Adaptive Gradient (Adagrad), Rectified Adam (RAdam), and Lookahead\u2014across three deep-learning architectures (Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN)-LSTM, and LSTM-Autoencoder) using data from two distinct PV sites. Unlike prior works, we assess optimizer effectiveness across a wide range of conditions, including varying training data lengths, sampling intervals, and missing data patterns (both random and block-wise). Using two real-world PV datasets representing semi-arid and desert climates, we analyze forecasting accuracy, convergence time, and robustness. Our empirical results demonstrate that RAdam consistently outperforms Adam by achieving up to 36% lower forecasting error under noisy and incomplete data conditions, while Lookahead offers up to 40% faster convergence in deep hybrid models. These gains translate into tighter reserve-margin planning and smoother inverter set-points, advancing state-of-the-art PV forecast pipelines. The paper concludes with optimizer-architecture recommendations for practitioners facing latency or compute constraints.<\/jats:p>","DOI":"10.1007\/s00521-025-11546-2","type":"journal-article","created":{"date-parts":[[2025,9,7]],"date-time":"2025-09-07T17:38:57Z","timestamp":1757266737000},"page":"23909-23939","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A benchmark study of optimizers for short-term solar PV power forecasting using neural networks under real-world constraints"],"prefix":"10.1007","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-9636-1649","authenticated-orcid":false,"given":"Saloni","family":"Dhingra","sequence":"first","affiliation":[]},{"given":"Giambattista","family":"Gruosso","sequence":"additional","affiliation":[]},{"given":"Giancarlo","family":"Storti Gajani","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,9,7]]},"reference":[{"key":"11546_CR1","doi-asserted-by":"publisher","first-page":"485","DOI":"10.35833\/MPCE.2019.000112","volume":"9","author":"M Alraddadi","year":"2021","unstructured":"Alraddadi M, Conejo AJ, Lima RM (2021) Expansion planning for renewable integration in power system of regions with very high solar irradiation. J Mod Power Syst Clean Energy 9:485\u2013494. https:\/\/doi.org\/10.35833\/MPCE.2019.000112","journal-title":"J Mod Power Syst Clean Energy"},{"key":"11546_CR2","doi-asserted-by":"publisher","first-page":"52233","DOI":"10.1109\/ACCESS.2022.3174555","volume":"10","author":"M Shafiullah","year":"2022","unstructured":"Shafiullah M, Ahmed SD, Al-Sulaiman FA (2022) Grid integration challenges and solution strategies for solar PV systems: a review. IEEE Access 10:52233\u201352257. https:\/\/doi.org\/10.1109\/ACCESS.2022.3174555","journal-title":"IEEE Access"},{"key":"11546_CR3","doi-asserted-by":"publisher","first-page":"376","DOI":"10.1109\/OAJPE.2020.3029979","volume":"7","author":"T Hong","year":"2020","unstructured":"Hong T, Pinson P, Wang Y, Weron R, Yang D, Zareipour H (2020) Energy forecasting: a review and outlook. IEEE Open Access J Power Energy 7:376\u2013388. https:\/\/doi.org\/10.1109\/OAJPE.2020.3029979","journal-title":"IEEE Open Access J Power Energy"},{"key":"11546_CR4","doi-asserted-by":"publisher","unstructured":"Dhingra S, Gruosso G, Gajani GS (2023) Solar PV power forecasting and ageing evaluation using machine learning techniques. In: IECON 2023-49th annual conference of the IEEE industrial electronics Society, pp 1\u20136. https:\/\/doi.org\/10.1109\/IECON51785.2023.10312446","DOI":"10.1109\/IECON51785.2023.10312446"},{"key":"11546_CR5","doi-asserted-by":"publisher","DOI":"10.3390\/en15072457","author":"VH Wentz","year":"2022","unstructured":"Wentz VH, Maciel JN, Gimenez Ledesma JJ, Ando Junior OH (2022) Solar irradiance forecasting to short-term PV power: accuracy comparison of ANN and LSTM models. Energies. https:\/\/doi.org\/10.3390\/en15072457","journal-title":"Energies"},{"key":"11546_CR6","doi-asserted-by":"publisher","first-page":"9095","DOI":"10.1007\/s00521-024-09558-5","volume":"36","author":"D Salman","year":"2024","unstructured":"Salman D, Direkoglu C, Kusaf MEA (2024) Hybrid deep learning models for time series forecasting of solar power. Neural Comput Appl 36:9095\u20139112. https:\/\/doi.org\/10.1007\/s00521-024-09558-5","journal-title":"Neural Comput Appl"},{"key":"11546_CR7","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-024-10607-2","author":"C Symeonidis","year":"2025","unstructured":"Symeonidis C, Nikolaidis N (2025) Efficient deterministic renewable energy forecasting guided by multiple-location weather data. Neural Comput Appl. https:\/\/doi.org\/10.1007\/s00521-024-10607-2","journal-title":"Neural Comput Appl"},{"key":"11546_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/j.rser.2020.109792","author":"R Ahmed","year":"2020","unstructured":"Ahmed R, Sreeram V, Mishra Y, Arif MD (2020) A review and evaluation of the state-of-the-art in PV solar power forecasting: techniques and optimization. Renew Sustain Energy Rev. https:\/\/doi.org\/10.1016\/j.rser.2020.109792","journal-title":"Renew Sustain Energy Rev"},{"key":"11546_CR9","doi-asserted-by":"publisher","unstructured":"Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. International conference on learning representations (ICLR). https:\/\/doi.org\/10.48550\/arXiv.1412.6980","DOI":"10.48550\/arXiv.1412.6980"},{"key":"11546_CR10","first-page":"2121","volume":"12","author":"J Duchi","year":"2011","unstructured":"Duchi J, Hazan E, Singer Y (2011) Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res 12:2121\u20132159","journal-title":"J Mach Learn Res"},{"key":"11546_CR11","doi-asserted-by":"publisher","unstructured":"Ruder S (2017) An overview of gradient descent optimization algorithms. arXiv preprint https:\/\/doi.org\/10.48550\/arXiv.1609.04747","DOI":"10.48550\/arXiv.1609.04747"},{"key":"11546_CR12","doi-asserted-by":"publisher","unstructured":"Zeiler M (2012) Adadelta: an adaptive learning rate method. arXiv preprint https:\/\/doi.org\/10.48550\/arXiv.1212.5701","DOI":"10.48550\/arXiv.1212.5701"},{"key":"11546_CR13","doi-asserted-by":"publisher","unstructured":"Zhang MR, Lucas J, Hinton G, Ba J (2019) Lookahead optimizer: k steps forward, 1 step back. In: Proceedings of the 33rd international conference on neural information processing systems 861:12. https:\/\/doi.org\/10.48550\/arXiv.1907.08610","DOI":"10.48550\/arXiv.1907.08610"},{"key":"11546_CR14","doi-asserted-by":"publisher","unstructured":"Liu L, Jiang H, He P, Chen W, Liu X, Gao J, Han J (2019) On the variance of the adaptive learning rate and beyond. arXiv preprint https:\/\/doi.org\/10.48550\/arXiv.1908.03265","DOI":"10.48550\/arXiv.1908.03265"},{"key":"11546_CR15","doi-asserted-by":"publisher","unstructured":"Zhuang J, Tang T, Ding Y, Tatikonda S, Dvornek N, Papademetris X, Duncan JS (2020) Adabelief optimizer: adapting stepsizes by the belief in observed gradients. arXiv preprint https:\/\/doi.org\/10.48550\/arXiv.2010.07468","DOI":"10.48550\/arXiv.2010.07468"},{"key":"11546_CR16","doi-asserted-by":"publisher","unstructured":"Park J, Lee S, Song B (2023) Stable quantization-aware training with adaptive gradient clipping. In: 2023 International conference on electronics, information, and communication (ICEIC), pp 1\u20133. https:\/\/doi.org\/10.1109\/ICEIC57457.2023.10049939","DOI":"10.1109\/ICEIC57457.2023.10049939"},{"key":"11546_CR17","doi-asserted-by":"publisher","unstructured":"Zaeed M, Islam TZ, Indic V (2024) Characterize and compare the performance of deep learning optimizers in recurrent neural network architectures. In: 2024 IEEE 48th annual computers, software, and applications conference (COMPSAC), pp 39\u201344. https:\/\/doi.org\/10.1109\/COMPSAC61105.2024.00016","DOI":"10.1109\/COMPSAC61105.2024.00016"},{"key":"11546_CR18","doi-asserted-by":"publisher","unstructured":"Suresha HS, Parthasarathy SS (2020) Alzheimer disease detection based on deep neural network with rectified Adam optimization technique using MRI analysis. In: 2020 Third international conference on advances in electronics, computers and communications (ICAECC), pp 1\u20136. https:\/\/doi.org\/10.1109\/ICAECC50550.2020.9339504","DOI":"10.1109\/ICAECC50550.2020.9339504"},{"key":"11546_CR19","doi-asserted-by":"publisher","first-page":"66965","DOI":"10.1109\/ACCESS.2021.3076313","volume":"9","author":"H Shi","year":"2021","unstructured":"Shi H, Wang L, Scherer R, Wozniak M, Zhang P, Wei W (2021) Short-term load forecasting based on adabelief optimized temporal convolutional network and gated recurrent unit hybrid neural network. IEEE Access 9:66965\u201366981. https:\/\/doi.org\/10.1109\/ACCESS.2021.3076313","journal-title":"IEEE Access"},{"key":"11546_CR20","doi-asserted-by":"publisher","unstructured":"Dhingra S, Gruosso G, Gajani GS (2023) Modelling ageing and power production of solar PV using machine learning techniques. In: 2023 International conference on electrical, computer and energy technologies (ICECET), pp 1\u20136. https:\/\/doi.org\/10.1109\/ICECET58911.2023.10389351","DOI":"10.1109\/ICECET58911.2023.10389351"},{"key":"11546_CR21","doi-asserted-by":"publisher","first-page":"2909","DOI":"10.1002\/ese3.1178","volume":"10","author":"J Sharma","year":"2022","unstructured":"Sharma J, Soni S, Paliwal P, Saboor S, Chaurasiya PK, Sharifpur M, Khalilpoor N, Afzal A (2022) A novel long term solar photovoltaic power forecasting approach using LSTM with Nadam optimizer: a case study of india. Energy Sci Eng 10:2909\u20132929. https:\/\/doi.org\/10.1002\/ese3.1178","journal-title":"Energy Sci Eng"},{"key":"11546_CR22","doi-asserted-by":"publisher","unstructured":"Bhatnagar M, Dwivedi V, Rozinaj G (2023) Short-term electric load forecast model using the combination of ant lion optimization with bi-LSTM network. In: 2023 30th International conference on systems, signals and image processing (IWSSIP), pp 1\u20135. https:\/\/doi.org\/10.1109\/IWSSIP58668.2023.10180242","DOI":"10.1109\/IWSSIP58668.2023.10180242"},{"key":"11546_CR23","doi-asserted-by":"publisher","unstructured":"Cai H, Chen X, Ling J, Xu Q (2022) Short-term load forecasting based on Radam optimized CNN-BiLSTM-AE hybrid model. In: 2022 Power system and green energy conference (PSGEC), pp 626\u2013631. https:\/\/doi.org\/10.1109\/PSGEC54663.2022.9881115","DOI":"10.1109\/PSGEC54663.2022.9881115"},{"key":"11546_CR24","doi-asserted-by":"publisher","unstructured":"Jiang F, He J, Peng Z (2018) Short-term wind power forecasting based on BP neural network with improved ant lion optimizer. In: 2018 37th Chinese control conference (CCC), pp 8543\u20138547. https:\/\/doi.org\/10.23919\/ChiCC.2018.8482950","DOI":"10.23919\/ChiCC.2018.8482950"},{"key":"11546_CR25","doi-asserted-by":"publisher","unstructured":"Sulochana BC, Pragada B, Kiran BC, Reddy G, KM (2024) Machine learning in weather forecasting: a comparative approach with emphasis on neural networks and optimizer. In: 2024 International conference on advances in modern age technologies for health and engineering science (AMATHE), pp 1\u20137. https:\/\/doi.org\/10.1109\/AMATHE61652.2024.10582220","DOI":"10.1109\/AMATHE61652.2024.10582220"},{"key":"11546_CR26","doi-asserted-by":"publisher","unstructured":"BM, PM, K A, SD (2022) Early prediction of health disease for long\u2013short term memory using AdaGrad model. In: 2022 IEEE 2nd Mysore sub section international conference (MysuruCon), pp 1\u20136. https:\/\/doi.org\/10.1109\/MysuruCon55714.2022.9972556","DOI":"10.1109\/MysuruCon55714.2022.9972556"},{"key":"11546_CR27","unstructured":"Deline C, Perry K, Muller M, Sekulic W, Jordan D Photovoltaic Data Acquisition (PVDAQ) Public Datasets. https:\/\/data.openei.org\/submissions\/4568 (accessed on 3 Jan 2024)"},{"issue":"1","key":"11546_CR28","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1109\/TSMC.2021.3093519","volume":"52","author":"SMJ Jalali","year":"2022","unstructured":"Jalali SMJ, Ahmadian S, Kavousi-Fard A, Khosravi A, Nahavandi S (2022) Automated deep CNN-LSTM architecture design for solar irradiance forecasting. IEEE Trans Syst Man Cybern Syst 52(1):54\u201365. https:\/\/doi.org\/10.1109\/TSMC.2021.3093519","journal-title":"IEEE Trans Syst Man Cybern Syst"},{"key":"11546_CR29","doi-asserted-by":"publisher","unstructured":"Zivkovic M, Jovanovic L, Pavlov M, Bacanin N, Dobrojevic M, Salb M (2023) Optimized recurrent neural networks with attention for wind farm energy generation forecasting. In: 2023 16th International conference on advanced technologies, systems and services in telecommunications (TELSIKS), pp 187\u2013190. https:\/\/doi.org\/10.1109\/TELSIKS57806.2023.10316047","DOI":"10.1109\/TELSIKS57806.2023.10316047"},{"key":"11546_CR30","doi-asserted-by":"publisher","first-page":"3921","DOI":"10.1007\/s00521-023-09347-6","volume":"36","author":"Z Zhang","year":"2024","unstructured":"Zhang Z, Wang Y, Peng JEA (2024) An improved self-attention for long-sequence time-series data forecasting with missing values. Neural Comput Appl 36:3921\u20133940. https:\/\/doi.org\/10.1007\/s00521-023-09347-6","journal-title":"Neural Comput Appl"},{"key":"11546_CR31","doi-asserted-by":"publisher","unstructured":"Srivastava A, Rawat BS, Singh G, Bhatnagar V, Saini PK, Dhondiyal SA (2023) A review of optimization algorithms for training neural networks. In: 2023 International conference on sustainable emerging innovations in engineering and technology (ICSEIET), pp 886\u2013890. https:\/\/doi.org\/10.1109\/ICSEIET58677.2023.10303287","DOI":"10.1109\/ICSEIET58677.2023.10303287"},{"key":"11546_CR32","doi-asserted-by":"publisher","unstructured":"Llugsi R, Yacoubi S, Fontaine A, Lupera P (2021) Comparison between Adam, AdaMax and Adam W optimizers to implement a weather forecast based on neural networks for the Andean city of Quito. In: 2021 IEEE fifth ecuador technical chapters meeting (ETCM), pp 1\u20136. https:\/\/doi.org\/10.1109\/ETCM53643.2021.9590681","DOI":"10.1109\/ETCM53643.2021.9590681"},{"key":"11546_CR33","doi-asserted-by":"publisher","unstructured":"Ward R, Wu X, Bottou L (2021) Adagrad stepsizes: sharp convergence over nonconvex landscapes. arXiv preprint https:\/\/doi.org\/10.48550\/arXiv.1806.01811","DOI":"10.48550\/arXiv.1806.01811"},{"key":"11546_CR34","doi-asserted-by":"publisher","unstructured":"Fang J, Fong C, Yang P, Hung C, Lu W, Chang C (2020) Adagrad gradient descent method for AI image management. In: 2020 IEEE international conference on consumer electronics-Taiwan (ICCE-Taiwan), pp 1\u20132. https:\/\/doi.org\/10.1109\/ICCE-Taiwan49838.2020.9258085","DOI":"10.1109\/ICCE-Taiwan49838.2020.9258085"},{"key":"11546_CR35","unstructured":"Zhou P, Yan H, Yuan X, Feng J, Yan S (2024) Towards understanding why lookahead generalizes better than SGD and beyond. In: Proceedings of the 35th international conference on neural information processing systems. NIPS \u201921, pp 27290\u201327304"},{"key":"11546_CR36","doi-asserted-by":"publisher","DOI":"10.1016\/j.energy.2021.120996","author":"J Qu","year":"2021","unstructured":"Qu J, Qian Z, Pei Y (2021) Day-ahead hourly photovoltaic power forecasting using attention-based CNN-LSTM neural network embedded with multiple relevant and target variables prediction pattern. Energy. https:\/\/doi.org\/10.1016\/j.energy.2021.120996","journal-title":"Energy"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-025-11546-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-025-11546-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-025-11546-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,30]],"date-time":"2025-09-30T05:23:23Z","timestamp":1759209803000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-025-11546-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,7]]},"references-count":36,"journal-issue":{"issue":"29","published-print":{"date-parts":[[2025,10]]}},"alternative-id":["11546"],"URL":"https:\/\/doi.org\/10.1007\/s00521-025-11546-2","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,7]]},"assertion":[{"value":"10 April 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 July 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 September 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"All authors consent to publication of the manuscript if accepted.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"All authors declare to comply with the ethical standards required by this journal.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}}]}}