{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T16:48:48Z","timestamp":1776444528529,"version":"3.51.2"},"reference-count":48,"publisher":"MDPI AG","issue":"17","license":[{"start":{"date-parts":[[2024,8,31]],"date-time":"2024-08-31T00:00:00Z","timestamp":1725062400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"FCT\u2014Funda\u00e7\u00e3o para a Ci\u00eancia e Tecnologia","award":["UIDP\/05422\/2020"],"award-info":[{"award-number":["UIDP\/05422\/2020"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Mathematics"],"abstract":"<jats:p>This study investigates the effectiveness of Transformer-based models for retail demand forecasting. We evaluated vanilla Transformer, Informer, Autoformer, PatchTST, and temporal fusion Transformer (TFT) against traditional baselines like AutoARIMA and AutoETS. Model performance was assessed using mean absolute scaled error (MASE) and weighted quantile loss (WQL). The M5 competition dataset, comprising 30,490 time series from 10 stores, served as the evaluation benchmark. The results demonstrate that Transformer-based models significantly outperform traditional baselines, with Transformer, Informer, and TFT leading the performance metrics. These models achieved MASE improvements of 26% to 29% and WQL reductions of up to 34% compared to the seasonal Na\u00efve method, particularly excelling in short-term forecasts. While Autoformer and PatchTST also surpassed traditional methods, their performance was slightly lower, indicating the potential for further tuning. Additionally, this study highlights a trade-off between model complexity and computational efficiency, with Transformer models, though computationally intensive, offering superior forecasting accuracy compared to the significantly slower traditional models like AutoARIMA. These findings underscore the potential of Transformer-based approaches for enhancing retail demand forecasting, provided the computational demands are managed effectively.<\/jats:p>","DOI":"10.3390\/math12172728","type":"journal-article","created":{"date-parts":[[2024,9,2]],"date-time":"2024-09-02T07:59:40Z","timestamp":1725263980000},"page":"2728","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":48,"title":["Evaluating the Effectiveness of Time Series Transformers for Demand Forecasting in Retail"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8516-6418","authenticated-orcid":false,"given":"Jos\u00e9 Manuel","family":"Oliveira","sequence":"first","affiliation":[{"name":"Faculty of Economics, University of Porto, Rua Dr. Roberto Frias, 4200-464 Porto, Portugal"},{"name":"Institute for Systems and Computer Engineering, Technology and Science, Campus da FEUP, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0959-8446","authenticated-orcid":false,"given":"Patr\u00edcia","family":"Ramos","sequence":"additional","affiliation":[{"name":"Institute for Systems and Computer Engineering, Technology and Science, Campus da FEUP, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal"},{"name":"CEOS.PP, ISCAP, Polytechnic of Porto, Rua Jaime Lopes Amorim s\/n, 4465-004 S\u00e3o Mamede de Infesta, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2024,8,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1016\/j.ejor.2023.10.039","article-title":"Simplifying tree-based methods for retail sales forecasting with explanatory variables","volume":"314","author":"Wellens","year":"2024","journal-title":"Eur. J. Oper. Res."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Ramos, P., Oliveira, J.M., Kourentzes, N., and Fildes, R. (2023). Forecasting Seasonal Sales with Many Drivers: Shrinkage or Dimensionality Reduction?. Appl. Syst. Innov., 6.","DOI":"10.3390\/asi6010003"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"705","DOI":"10.1016\/j.ijforecast.2021.11.001","article-title":"Forecasting: Theory and practice","volume":"38","author":"Petropoulos","year":"2022","journal-title":"Int. J. Forecast."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Oliveira, J.M., and Ramos, P. (2023). Investigating the Accuracy of Autoregressive Recurrent Networks Using Hierarchical Aggregation Structure-Based Data Partitioning. Big Data Cogn. Comput., 7.","DOI":"10.20944\/preprints202304.0222.v1"},{"key":"ref_5","unstructured":"Elkind, E. (2023, January 19\u201325). Transformers in Time Series: A Survey. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, Macao, China. Survey Track."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1531","DOI":"10.1007\/s00034-023-02454-8","article-title":"Transformers in Time-Series Analysis: A Tutorial","volume":"42","author":"Ahmed","year":"2023","journal-title":"Circuits Syst. Signal Process."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"650","DOI":"10.1016\/j.procir.2021.03.088","article-title":"A survey on long short-term memory networks for time series prediction","volume":"99","author":"Lindemann","year":"2021","journal-title":"Procedia CIRP"},{"key":"ref_8","first-page":"11121","article-title":"Are Transformers Effective for Time Series Forecasting?","volume":"37","author":"Zeng","year":"2023","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_9","unstructured":"Li, S., Jin, X., Xuan, Y., Zhou, X., Chen, W., Wang, Y.X., and Yan, X. (2019). Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting. arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Hassanien, A.E., and Darwish, A. (2021). A Survey on Deep Learning for Time-Series Forecasting. Machine Learning and Big Data Analytics Paradigms: Analysis, Applications and Challenges, Springer International Publishing.","DOI":"10.1007\/978-3-030-59338-4"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3533382","article-title":"Deep Learning for Time Series Forecasting: Tutorial and Literature Survey","volume":"55","author":"Benidis","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_12","unstructured":"Iliadis, L., Maglogiannis, I., Alonso, S., Jayne, C., and Pimenidis, E. Cross-Learning-Based Sales Forecasting Using Deep Learning via Partial Pooling from Multi-level Data. Proceedings of the Engineering Applications of Neural Networks."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Casolaro, A., Capone, V., Iannuzzo, G., and Camastra, F. (2023). Deep Learning for Time Series Forecasting: Advances and Open Problems. Information, 14.","DOI":"10.3390\/info14110598"},{"key":"ref_14","unstructured":"Miller, J.A., Aldosari, M., Saeed, F., Barna, N.H., Rana, S., Arpinar, I., and Liu, N. (2024). A Survey of Deep Learning and Foundation Models for Time Series Forecasting. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Ramos, P., and Oliveira, J.M. (2023). Robust Sales forecasting Using Deep Learning with Static and Dynamic Covariates. Appl. Syst. Innov., 6.","DOI":"10.20944\/preprints202308.0427.v1"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"20200209","DOI":"10.1098\/rsta.2020.0209","article-title":"Time-series forecasting with deep learning: A survey","volume":"379","author":"Lim","year":"2021","journal-title":"Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. R. Soc."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/j.aiopen.2022.10.001","article-title":"A survey of transformers","volume":"3","author":"Lin","year":"2022","journal-title":"AI Open"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Padhi, I., Schiff, Y., Melnyk, I., Rigotti, M., Mroueh, Y., Dognin, P., Ross, J., Nair, R., and Altman, E. (2021, January 6\u201311). Tabular Transformers for Modeling Multivariate Time Series. Proceedings of the ICASSP 2021\u20142021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada.","DOI":"10.1109\/ICASSP39728.2021.9414142"},{"key":"ref_19","first-page":"11106","article-title":"Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting","volume":"35","author":"Zhou","year":"2021","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_20","first-page":"22419","article-title":"Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting","volume":"34","author":"Wu","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1748","DOI":"10.1016\/j.ijforecast.2021.03.012","article-title":"Temporal Fusion Transformers for interpretable multi-horizon time series forecasting","volume":"37","author":"Lim","year":"2021","journal-title":"Int. J. Forecast."},{"key":"ref_22","unstructured":"Nie, Y., Nguyen, N.H., Sinthong, P., and Kalagnanam, J. (2023, January 1\u20135). A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., and Funtowicz, M. (2020). HuggingFace\u2019s Transformers: State-of-the-art Natural Language Processing. arXiv.","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"ref_24","unstructured":"Erickson, N., Mueller, J., Shirkov, A., Zhang, H., Larroy, P., Li, M., and Smola, A. (2020). AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data. arXiv."},{"key":"ref_25","unstructured":"Olivares, K.G., Chall\u00fa, C., Garza, F., Canseco, M.M., and Dubrawski, A. (2022). NeuralForecast: User Friendly State-of-the-Art Neural Forecasting Models, PyCon."},{"key":"ref_26","unstructured":"Liu, S., Yu, H., Liao, C., Li, J., Lin, W., Liu, A.X., and Dustdar, S. (2022, January 25). Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting. Proceedings of the International Conference on Learning Representations, Virtual."},{"key":"ref_27","first-page":"27268","article-title":"FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting","volume":"Volume 162","author":"Chaudhuri","year":"2022","journal-title":"Proceedings of the 39th International Conference on Machine Learning"},{"key":"ref_28","first-page":"9881","article-title":"Non-stationary transformers: Exploring the stationarity in time series forecasting","volume":"35","author":"Liu","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_29","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L.u., and Polosukhin, I. (2017, January 4\u20139). Attention is All you Need. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_30","unstructured":"Xiong, R., Yang, Y., He, D., Zheng, K., Zheng, S., Xing, C., Zhang, H., Lan, Y., Wang, L., and Liu, T.Y. (2020, January 13\u201318). On layer normalization in the transformer architecture. Proceedings of the 37th International Conference on Machine Learning, Online."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Oliveira, J.M., and Ramos, P. (2019). Assessing the Performance of Hierarchical Forecasting Methods on the Retail Sector. Entropy, 21.","DOI":"10.3390\/e21040436"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1016\/j.rcim.2014.12.015","article-title":"Performance of state space and ARIMA models for consumer retail sales forecasting","volume":"34","author":"Ramos","year":"2015","journal-title":"Robot. Comput.-Integr. Manuf."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Ramos, P., and Oliveira, J.M. (2016). A procedure for identification of appropriate state space and ARIMA models based on time-series cross-validation. Algorithms, 9.","DOI":"10.3390\/a9040076"},{"key":"ref_34","unstructured":"Garza, F., Canseco, M.M., Chall\u00fa, C., and Olivares, K.G. (2022). StatsForecast: Lightning Fast Forecasting with Statistical and Econometric Models, PyCon."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v027.i03","article-title":"Automatic time series forecasting: The forecast package for R","volume":"27","author":"Hyndman","year":"2008","journal-title":"J. Stat. Softw."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1016\/S0169-2070(01)00110-8","article-title":"A state space framework for automatic forecasting using exponential smoothing methods","volume":"18","author":"Hyndman","year":"2002","journal-title":"Int. J. Forecast."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D. (2008). Forecasting with Exponential Smoothing: The State Space Approach, Springer.","DOI":"10.1007\/978-3-540-71918-2"},{"key":"ref_38","unstructured":"Hyndman, R.J., and Athanasopoulos, G. (2021). Forecasting: Principles and Practice, OTexts. [3rd ed.]."},{"key":"ref_39","unstructured":"Ord, J.K., Fildes, R., and Kourentzes, N. (2017). Principles of Business Forecasting, Wessex Press Publishing Co.. [2nd ed.]."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1325","DOI":"10.1016\/j.ijforecast.2021.07.007","article-title":"The M5 competition: Background, organization, and implementation","volume":"38","author":"Makridakis","year":"2022","journal-title":"Int. J. Forecast."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Koenker, R. (2005). Quantile Regression, Cambridge University Press. Econometric Society Monographs.","DOI":"10.1017\/CBO9780511754098"},{"key":"ref_42","unstructured":"Eisenach, C., Patel, Y., and Madeka, D. (2022). MQTransformer: Multi-Horizon Forecasts with Context Dependent and Feedback-Aware Attention. arXiv."},{"key":"ref_43","unstructured":"Wen, R., Torkkola, K., Narayanaswamy, B., and Madeka, D. (2018). A Multi-Horizon Quantile Recurrent Forecaster. arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019, January 4\u20138). Optuna: A Next-generation Hyperparameter Optimization Framework. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA.","DOI":"10.1145\/3292500.3330701"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1016\/j.ijforecast.2006.03.001","article-title":"Another look at measures of forecast accuracy","volume":"22","author":"Hyndman","year":"2006","journal-title":"Int. J. Forecast."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1198\/016214506000001437","article-title":"Strictly Proper Scoring Rules, Prediction, and Estimation","volume":"102","author":"Gneiting","year":"2007","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_47","first-page":"1901","article-title":"Probabilistic Forecasting with Spline Quantile Function RNNs","volume":"Volume 89","author":"Chaudhuri","year":"2019","journal-title":"Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics"},{"key":"ref_48","unstructured":"Shchur, O., Turkmen, C., Erickson, N., Shen, H., Shirkov, A., Hu, T., and Wang, Y. (2023, January 12\u201315). AutoGluon-TimeSeries: AutoML for Probabilistic Time Series Forecasting. Proceedings of the International Conference on Automated Machine Learning, Potsdam\/Berlin, Germany."}],"container-title":["Mathematics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-7390\/12\/17\/2728\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:46:21Z","timestamp":1760111181000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-7390\/12\/17\/2728"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,31]]},"references-count":48,"journal-issue":{"issue":"17","published-online":{"date-parts":[[2024,9]]}},"alternative-id":["math12172728"],"URL":"https:\/\/doi.org\/10.3390\/math12172728","relation":{},"ISSN":["2227-7390"],"issn-type":[{"value":"2227-7390","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,31]]}}}