{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T00:59:07Z","timestamp":1780448347045,"version":"3.54.1"},"reference-count":27,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,10,14]],"date-time":"2025-10-14T00:00:00Z","timestamp":1760400000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>This paper provides a detailed breakdown of a minimalist, fundamental Transformer-based architecture for forecasting univariate time series. It describes each processing step in detail, from input embedding and positional encoding to self-attention mechanisms and output projection. All of these steps are specifically tailored to sequential temporal data. By isolating and analyzing the role of each component, this paper demonstrates how Transformers capture long-term dependencies in time series. A simplified, interpretable Transformer model named \u2018minimalist Transformer\u2019 is implemented and showcased using a simple example. It is then validated using the M3 forecasting competition benchmark, which is based on real-world data, and a number of data series generated by IoT sensors. 
The aim of this work is to serve as a practical guide and foundation for future Transformer-based forecasting innovations, providing a solid baseline that is simple to achieve but exhibits a stable forecasting ability not far behind that of state-of-the-art specialized designs.<\/jats:p>","DOI":"10.3390\/a18100645","type":"journal-article","created":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T08:34:26Z","timestamp":1760517266000},"page":"645","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Deconstructing a Minimalist Transformer Architecture for Univariate Time Series Forecasting"],"prefix":"10.3390","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-3012-6042","authenticated-orcid":false,"given":"Filippo","family":"Garagnani","sequence":"first","affiliation":[{"name":"\u201cEnzo Ferrari\u201d Department of Engineering, University of Modena and Reggio Emilia, 41121 Modena, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1220-1235","authenticated-orcid":false,"given":"Vittorio","family":"Maniezzo","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Bologna, 40126 Bologna, Italy"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,14]]},"reference":[{"key":"ref_1","unstructured":"Cun, Y.L., Bottou, L., and Bengio, Y. (1997, January 21\u201324). Reading checks with multilayer graph transformer networks. Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, Munich, Germany."},{"key":"ref_2","unstructured":"Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (2017). Attention is All you Need. 
Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-Term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_4","unstructured":"Chung, J., Gulcehre, C., Cho, K., and Bengio, Y. (2014). Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv."},{"key":"ref_5","unstructured":"Box, G., and Jenkins, G. (1970). Time Series Analysis: Forecasting and Control, Holden-Day."},{"key":"ref_6","first-page":"5","article-title":"Forecasting trends and seasonals by exponentially weighted moving averages","volume":"52","author":"Holt","year":"1957","journal-title":"ONR Memo."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1287\/mnsc.6.3.324","article-title":"Forecasting Sales by Exponentially Weighted Moving Averages","volume":"6","author":"Winters","year":"1960","journal-title":"Manag. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., and Zhang, W. (2021). Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. arXiv.","DOI":"10.1609\/aaai.v35i12.17325"},{"key":"ref_9","unstructured":"Wu, H., Xu, J., Wang, J., and Long, M. (2022). Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting. arXiv."},{"key":"ref_10","first-page":"27268","article-title":"FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting","volume":"Volume 162","author":"Chaudhuri","year":"2022","journal-title":"International Conference on Machine Learning"},{"key":"ref_11","unstructured":"Nie, Y., Nguyen, N.H., Sinthong, P., and Kalagnanam, J. (2023). A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. 
arXiv."},{"key":"ref_12","unstructured":"Ansari, A.F., Stella, L., Turkmen, C., Zhang, X., Mercado, P., Shen, H., Shchur, O., Rangapuram, S.S., Pineda Arango, S., and Kapoor, S. (2024). Chronos: Learning the Language of Time Series. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Lin, Y., Koprinska, I., and Rana, M. (2021, January 18\u201322). Temporal Convolutional Attention Neural Networks for Time Series Forecasting. Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), Shenzhen, China.","DOI":"10.1109\/IJCNN52387.2021.9534351"},{"key":"ref_14","unstructured":"Wen, Q., Zhou, T., Zhang, C., Chen, W., Ma, Z., Yan, J., and Sun, L. (2023, January 19\u201325). Transformers in time series: A survey. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023, IJCAI \u201923, Macao, China."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"7433","DOI":"10.1007\/s00034-023-02454-8","article-title":"Transformers in Time-Series Analysis: A Tutorial","volume":"42","author":"Ahmed","year":"2023","journal-title":"Circuits Syst. Signal Process."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1573","DOI":"10.1007\/s10462-024-11044-2","article-title":"A systematic review for transformer-based long-term series forecasting","volume":"58","author":"Su","year":"2025","journal-title":"Artif. Intell. Rev."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1748","DOI":"10.1016\/j.ijforecast.2021.03.012","article-title":"Temporal Fusion Transformers for interpretable multi-horizon time series forecasting","volume":"37","author":"Lim","year":"2021","journal-title":"Int. J. Forecast."},{"key":"ref_18","unstructured":"Liu, Y., Hu, T., Zhang, H., Wu, H., Wang, S., Ma, L., and Long, M. (2024, January 7\u201311). iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. 
Proceedings of the Twelfth International Conference on Learning Representations, Vienna, Austria."},{"key":"ref_19","unstructured":"Garagnani, F. (2025, August 08). Deconstructing-Transformers. Available online: https:\/\/github.com\/FGaragnani\/deconstructing-transformers."},{"key":"ref_20","unstructured":"Google (2025, August 27). Google Trends. Available online: https:\/\/trends.google.com."},{"key":"ref_21","unstructured":"(2025, August 08). PyTorch Transformer. Available online: https:\/\/docs.pytorch.org\/docs\/stable\/generated\/torch.nn.Transformer.html."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019). Optuna: A Next-generation Hyperparameter Optimization Framework. arXiv.","DOI":"10.1145\/3292500.3330701"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","article-title":"Extremely randomized trees","volume":"63","author":"Geurts","year":"2006","journal-title":"Mach. Learn."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1016\/S0169-2070(00)00057-1","article-title":"The M3-Competition: Results, conclusions and implications","volume":"16","author":"Makridakis","year":"2000","journal-title":"Int. J. Forecast."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Cecilia, J.M., Manzoni, P., Trolle, D., Nielsen, A., Blanco, P., Prandi, C., Pe\u00f1a Haro, S., Barkved, L., Pierson, D., and Senent, J. (2021, January 9\u201311). SMARTLAGOON: Innovative modelling approaches for predicting socio-environmental evolution in highly anthropized coastal lagoons. Proceedings of the Conference on Information Technology for Social Good, GoodIT \u201921, New York, NY, USA.","DOI":"10.1145\/3462203.3475925"},{"key":"ref_26","unstructured":"Seabold, S., and Perktold, J. (July, January 28). statsmodels: Econometric and statistical modeling with python. 
Proceedings of the 9th Python in Science Conference, Austin, TX, USA."},{"key":"ref_27","unstructured":"Smith, T.G. (2025, September 29). pmdarima: ARIMA Estimators for Python. Available online: https:\/\/alkaline-ml.com\/pmdarima\/index.html."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/10\/645\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T09:00:22Z","timestamp":1760518822000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/10\/645"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,14]]},"references-count":27,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["a18100645"],"URL":"https:\/\/doi.org\/10.3390\/a18100645","relation":{},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,14]]}}}