{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T09:04:27Z","timestamp":1774602267638,"version":"3.50.1"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"S1","license":[{"start":{"date-parts":[[2023,10,19]],"date-time":"2023-10-19T00:00:00Z","timestamp":1697673600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,10,19]],"date-time":"2023-10-19T00:00:00Z","timestamp":1697673600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Energy Inform"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In the smart grid of the future, accurate load forecasts on the level of individual clients can help to balance supply and demand locally and to prevent grid outages. While the number of monitored clients will increase with the ongoing smart meter rollout, the amount of data per client will always be limited. We evaluate whether a Transformer load forecasting model benefits from a transfer learning strategy, where a global univariate model is trained on the load time series from multiple clients. In experiments with two datasets containing load time series from several hundred clients, we find that the global training strategy is superior to the multivariate and local training strategies used in related work. On average, the global training strategy results in 21.8% and 12.8% lower forecasting errors than the two other strategies, measured across forecasting horizons from one day to one month into the future. A comparison to linear models, multi-layer perceptrons and LSTMs shows that Transformers are effective for load forecasting when they are trained with the global training strategy.<\/jats:p>","DOI":"10.1186\/s42162-023-00278-z","type":"journal-article","created":{"date-parts":[[2023,10,19]],"date-time":"2023-10-19T14:02:13Z","timestamp":1697724133000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":23,"title":["Transformer training strategies for forecasting multiple load time series"],"prefix":"10.1186","volume":"6","author":[{"given":"Matthias","family":"Hertel","sequence":"first","affiliation":[]},{"given":"Maximilian","family":"Beichter","sequence":"additional","affiliation":[]},{"given":"Benedikt","family":"Heidrich","sequence":"additional","affiliation":[]},{"given":"Oliver","family":"Neumann","sequence":"additional","affiliation":[]},{"given":"Benjamin","family":"Sch\u00e4fer","sequence":"additional","affiliation":[]},{"given":"Ralf","family":"Mikut","sequence":"additional","affiliation":[]},{"given":"Veit","family":"Hagenmeyer","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,10,19]]},"reference":[{"key":"278_CR1","unstructured":"A gentle introduction to the rectified linear unit (ReLU). https:\/\/machinelearningmastery.com\/rectified-linear-activation-function-for-deep-learning-neural-networks\/. Accessed 28 Apr 2023"},{"key":"278_CR2","doi-asserted-by":"crossref","unstructured":"An NH, Anh DT (2015) Comparison of strategies for multi-step-ahead prediction of time series using neural network. In: 2015 International Conference on Advanced Computing and Applications (ACOMP), pp. 142\u2013149","DOI":"10.1109\/ACOMP.2015.24"},{"key":"278_CR3","doi-asserted-by":"crossref","unstructured":"\u00c7akmak HK, Hagenmeyer V (2022) Using open data for modeling and simulation of the all electrical society in eASiMOV. In: 2022 Open Source Modelling and Simulation of Energy Systems (OSMSES)","DOI":"10.1109\/OSMSES54027.2022.9769145"},{"key":"278_CR4","doi-asserted-by":"crossref","unstructured":"Cao Y, Dang Z, Wu F, Xu X, Zhou F (2022) Probabilistic electricity demand forecasting with transformer-guided state space model. In: 2022 IEEE 5th International Conference on Automation, Electronics and Electrical Engineering (AUTEEE), pp. 964\u2013969. IEEE","DOI":"10.1109\/AUTEEE56487.2022.9994294"},{"key":"278_CR5","unstructured":"Gao J, Hu W, Zhang D, Chen Y (2022) TgDLF2.0: Theory-guided deep-learning for electrical load forecasting via transformer and transfer learning. arXiv:2210.02448"},{"key":"278_CR6","doi-asserted-by":"crossref","unstructured":"Giacomazzi E, Haag F, Hopf K (2023) Short-term electricity load forecasting using the temporal fusion transformer: Effect of grid hierarchies and data sources. arXiv preprint arXiv:2305.10559","DOI":"10.1145\/3575813.3597345"},{"key":"278_CR7","doi-asserted-by":"crossref","unstructured":"Grabner M, Wang Y, Wen Q, Bla\u017ei\u010d B, \u0160truc V (2023) A global modeling framework for load forecasting in distribution networks. IEEE Trans Smart Grid (Early Access)","DOI":"10.1109\/TSG.2023.3264525"},{"key":"278_CR8","doi-asserted-by":"publisher","DOI":"10.1016\/j.apenergy.2021.117798","volume":"304","author":"S Haben","year":"2021","unstructured":"Haben S, Arora S, Giasemidis G, Voss M, Greetham DV (2021) Review of low voltage load forecasting: methods, applications, and recommendations. Appl Energy 304:117798","journal-title":"Appl Energy"},{"issue":"2","key":"278_CR9","first-page":"261","volume":"7","author":"F Han","year":"2020","unstructured":"Han F, Pu T, Li M, Taylor G (2020) Short-term forecasting of individual residential load based on deep learning and k-means clustering. CSEE J Power Energy Syst 7(2):261\u2013269","journal-title":"CSEE J Power Energy Syst"},{"key":"278_CR10","doi-asserted-by":"crossref","unstructured":"Hertel M, Ott S, Sch\u00e4fer B, Mikut R, Hagenmeyer V, Neumann O (2022) Evaluation of transformer architectures for electrical load time-series forecasting. In: Proceedings 32. Workshop Computational Intelligence","DOI":"10.58895\/ksp\/1000151141-6"},{"key":"278_CR11","unstructured":"Hertel M, Ott S, Sch\u00e4fer B, Mikut R, Hagenmeyer V, Neumann O (2022) Transformer neural networks for building load forecasting. In: Tackling Climate Change with Machine Learning: Workshop at NeurIPS 2022"},{"key":"278_CR12","doi-asserted-by":"publisher","DOI":"10.1016\/j.scs.2022.104059","volume":"85","author":"Y Himeur","year":"2022","unstructured":"Himeur Y, Elnour M, Fadli F, Meskin N, Petri I, Rezgui Y, Bensaali F, Amira A (2022) Next-generation energy systems for sustainable smart cities: roles of transfer learning. Sustain Cities Soc 85:104059","journal-title":"Sustain Cities Soc"},{"issue":"8","key":"278_CR13","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735\u20131780","journal-title":"Neural Comput"},{"key":"278_CR14","doi-asserted-by":"publisher","first-page":"376","DOI":"10.1109\/OAJPE.2020.3029979","volume":"7","author":"T Hong","year":"2020","unstructured":"Hong T, Pinson P, Wang Y, Weron R, Yang D, Zareipour H (2020) Energy forecasting: a review and outlook. IEEE Open Access J Power Energy 7:376\u2013388","journal-title":"IEEE Open Access J Power Energy"},{"key":"278_CR15","doi-asserted-by":"publisher","first-page":"106296","DOI":"10.1109\/ACCESS.2022.3211941","volume":"10","author":"PC Huy","year":"2022","unstructured":"Huy PC, Minh NQ, Tien ND, Anh TTQ (2022) Short-term electricity load forecasting based on temporal fusion transformer model. IEEE Access 10:106296\u2013106304","journal-title":"IEEE Access"},{"issue":"1","key":"278_CR16","doi-asserted-by":"publisher","first-page":"841","DOI":"10.1109\/TSG.2017.2753802","volume":"10","author":"W Kong","year":"2017","unstructured":"Kong W, Dong ZY, Jia Y, Hill DJ, Xu Y, Zhang Y (2017) Short-term residential load forecasting based on LSTM recurrent neural network. IEEE Trans Smart Grid 10(1):841\u2013851","journal-title":"IEEE Trans Smart Grid"},{"key":"278_CR17","doi-asserted-by":"crossref","unstructured":"Lai G, Chang W-C, Yang Y, Liu H (2018) Modeling long-and short-term temporal patterns with deep neural networks. In: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 95\u2013104","DOI":"10.1145\/3209978.3210006"},{"key":"278_CR18","unstructured":"Loshchilov I, Hutter F (2019) Decoupled weight decay regularization. In: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6\u20139, 2019"},{"key":"278_CR19","volume-title":"Power system dynamics and stability","author":"J Machowski","year":"1997","unstructured":"Machowski J, Bialek J, Bumby JR, Bumby J (1997) Power system dynamics and stability. Wiley, USA"},{"key":"278_CR20","unstructured":"Murphy WMJ, Chen K (2023) Univariate vs multivariate time series forecasting with transformers. https:\/\/openreview.net\/forum?id=GpW327gxLTF"},{"key":"278_CR21","doi-asserted-by":"crossref","unstructured":"Nawar M, Shomer M, Faddel S, Gong H (2023) Transfer learning in deep learning models for building load forecasting: Case of limited data. arXiv:2301.10663","DOI":"10.1109\/SoutheastCon51012.2023.10115128"},{"key":"278_CR22","unstructured":"Nie Y, Nguyen NH, Sinthong P, Kalagnanam J (2022) A time series is worth 64 words: long-term forecasting with transformers. arXiv:2211.14730"},{"key":"278_CR23","doi-asserted-by":"crossref","unstructured":"Ordiano J\u00c1G, Waczowicz S, Hagenmeyer V, Mikut R (2018) Energy forecasting tools and services. WIREs Data Mining Knowl Discov 8(2)","DOI":"10.1002\/widm.1235"},{"key":"278_CR24","doi-asserted-by":"crossref","unstructured":"Pinto G, Wang Z, Roy A, Hong T, Capozzoli A (2022) Transfer learning for smart buildings: a critical review of algorithms, applications, and future perspectives. Adv Appl Energy 100084","DOI":"10.1016\/j.adapen.2022.100084"},{"key":"278_CR25","volume-title":"Climate change 2022: impacts, adaptation and vulnerability","author":"H-O P\u00f6rtner","year":"2022","unstructured":"P\u00f6rtner H-O, Roberts DC, Adams H, Adler C, Aldunce P, Ali E, Begum RA, Betts R, Kerr RB, Biesbroek R et al (2022) Climate change 2022: impacts, adaptation and vulnerability. IPCC Geneva, Switzerland"},{"issue":"4","key":"278_CR26","doi-asserted-by":"publisher","first-page":"86","DOI":"10.1145\/2133806.2133825","volume":"55","author":"SD Ramchurn","year":"2012","unstructured":"Ramchurn SD, Vytelingum P, Rogers A, Jennings NR (2012) Putting the \u201csmarts\u201d into the smart grid: a grand challenge for artificial intelligence. Commun ACM 55(4):86\u201397","journal-title":"Commun ACM"},{"key":"278_CR27","doi-asserted-by":"publisher","DOI":"10.1016\/j.epsr.2022.108885","volume":"214","author":"P Ran","year":"2023","unstructured":"Ran P, Dong K, Liu X, Wang J (2023) Short-term load forecasting based on CEEMDAN and transformer. Electric Power Syst Res 214:108885","journal-title":"Electric Power Syst Res"},{"issue":"1","key":"278_CR28","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1007\/s10115-018-1169-y","volume":"57","author":"F Rodrigues","year":"2018","unstructured":"Rodrigues F, Trindade A (2018) Load forecasting through functional clustering and ensemble learning. Knowl Informat Syst 57(1):229\u2013244","journal-title":"Knowl Informat Syst"},{"key":"278_CR29","doi-asserted-by":"crossref","unstructured":"Sahoo D, Sood N, Rani U, Abraham G, Dutt V, Dileep A (2020) Comparative analysis of multi-step time-series forecasting for network load dataset. In: 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), pp. 1\u20137","DOI":"10.1109\/ICCCNT49239.2020.9225449"},{"issue":"5","key":"278_CR30","doi-asserted-by":"publisher","first-page":"5271","DOI":"10.1109\/TSG.2017.2686012","volume":"9","author":"H Shi","year":"2017","unstructured":"Shi H, Xu M, Li R (2017) Deep learning for household load forecasting\u2014a novel pooling deep RNN. IEEE Trans Smart Grid 9(5):5271\u20135280","journal-title":"IEEE Trans Smart Grid"},{"key":"278_CR31","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: NIPS, pp. 5998\u20136008"},{"key":"278_CR32","doi-asserted-by":"publisher","DOI":"10.1016\/j.egyai.2020.100009","volume":"1","author":"F vom Scheidt","year":"2020","unstructured":"vom Scheidt F, Medinov\u00e1 H, Ludwig N, Richter B, Staudt P, Weinhardt C (2020) Data analytics in the electricity sector\u2014a quantitative and qualitative literature review. Energy AI 1:100009","journal-title":"Energy AI"},{"key":"278_CR33","doi-asserted-by":"crossref","unstructured":"Vo\u00df M, Bender-Saebelkampf C, Albayrak S (2018) Residential short-term load forecasting using convolutional neural networks. In: 2018 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), pp. 1\u20136","DOI":"10.1109\/SmartGridComm.2018.8587494"},{"issue":"4","key":"278_CR34","doi-asserted-by":"publisher","first-page":"2703","DOI":"10.1109\/TSG.2022.3166600","volume":"13","author":"C Wang","year":"2022","unstructured":"Wang C, Wang Y, Ding Z, Zheng T, Hu J, Zhang K (2022) A transformer-based method of multienergy load forecasting in integrated energy system. IEEE Trans Smart Grid 13(4):2703\u20132714","journal-title":"IEEE Trans Smart Grid"},{"key":"278_CR35","doi-asserted-by":"crossref","unstructured":"Werling D, Heidrich B, \u00c7akmak HK, Hagenmeyer V (2022) Towards line-restricted dispatchable feeders using probabilistic forecasts for PV-dominated low-voltage distribution grids. In: Proceedings of the Thirteenth ACM International Conference on Future Energy Systems, pp. 395\u2013400","DOI":"10.1145\/3538637.3538868"},{"key":"278_CR36","unstructured":"Wu N, Green B, Ben X, O\u2019Banion S (2020) Deep transformer models for time series forecasting: The influenza prevalence case. arXiv preprint arXiv:2001.08317"},{"key":"278_CR37","unstructured":"Wu H, Xu J, Wang J, Long M (2021) Autoformer: decomposition transformers with auto-correlation for long-term series forecasting. In: NeurIPS, pp. 22419\u201322430"},{"key":"278_CR38","doi-asserted-by":"crossref","unstructured":"Yang E, Youn C-H (2021) Individual load forecasting for multi-customers with distribution-aware temporal pooling. In: IEEE INFOCOM 2021-IEEE Conference on Computer Communications, pp. 1\u201310","DOI":"10.1109\/INFOCOM42981.2021.9488816"},{"key":"278_CR39","doi-asserted-by":"publisher","first-page":"402","DOI":"10.1016\/j.apenergy.2017.10.014","volume":"208","author":"B Yildiz","year":"2017","unstructured":"Yildiz B, Bilbao JI, Dore J, Sproul AB (2017) Recent advances in the analysis of residential electricity consumption and applications of smart meter data. Appl Energy 208:402\u2013427","journal-title":"Appl Energy"},{"key":"278_CR40","unstructured":"Zeng A, Chen M, Zhang L, Xu Q (2022) Are transformers effective for time series forecasting? arXiv:2205.13504"},{"issue":"1","key":"278_CR41","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1007\/s44196-022-00128-y","volume":"15","author":"G Zhang","year":"2022","unstructured":"Zhang G, Wei C, Jing C, Wang Y (2022) Short-term electrical load forecasting based on time augmented transformer. Int J Comput Intell Syst 15(1):67","journal-title":"Int J Comput Intell Syst"},{"key":"278_CR42","unstructured":"Zhou T, Ma Z, Wen Q, Wang X, Sun L, Jin R (2022) FEDformer: frequency enhanced decomposed transformer for long-term series forecasting. In: International Conference on Machine Learning, pp. 27268\u201327286"},{"key":"278_CR43","doi-asserted-by":"crossref","unstructured":"Zhou H, Zhang S, Peng J, Zhang S, Li J, Xiong H, Zhang W (2021) Informer: Beyond efficient transformer for long sequence time-series forecasting. In: AAAI, pp. 11106\u201311115","DOI":"10.1609\/aaai.v35i12.17325"}],"container-title":["Energy Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42162-023-00278-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42162-023-00278-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42162-023-00278-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,31]],"date-time":"2024-10-31T05:06:42Z","timestamp":1730351202000},"score":1,"resource":{"primary":{"URL":"https:\/\/energyinformatics.springeropen.com\/articles\/10.1186\/s42162-023-00278-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,19]]},"references-count":43,"journal-issue":{"issue":"S1","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["278"],"URL":"https:\/\/doi.org\/10.1186\/s42162-023-00278-z","relation":{},"ISSN":["2520-8942"],"issn-type":[{"value":"2520-8942","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,19]]},"assertion":[{"value":"19 October 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"20"}}