{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T23:53:48Z","timestamp":1770335628141,"version":"3.49.0"},"reference-count":25,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T00:00:00Z","timestamp":1715817600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Time series forecasting has been a challenging area in the field of Artificial Intelligence. Various approaches such as linear neural networks, recurrent linear neural networks, Convolutional Neural Networks, and recently transformers have been attempted for the time series forecasting domain. Although transformer-based architectures have been outstanding in the Natural Language Processing domain, especially in autoregressive language modeling, the initial attempts to use transformers in the time series arena have met mixed success. A recent important work indicating simple linear networks outperform transformer-based designs. We investigate this paradox in detail comparing the linear neural network- and transformer-based designs, providing insights into why a certain approach may be better for a particular type of problem. We also improve upon the recently proposed simple linear neural network-based architecture by using dual pipelines with batch normalization and reversible instance normalization. Our enhanced architecture outperforms all existing architectures for time series forecasting on a majority of the popular benchmarks.<\/jats:p>","DOI":"10.3390\/bdcc8050048","type":"journal-article","created":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T06:44:31Z","timestamp":1715841871000},"page":"48","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Enhanced Linear and Vision Transformer-Based Architectures for Time Series Forecasting"],"prefix":"10.3390","volume":"8","author":[{"given":"Musleh","family":"Alharthi","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, University of Bridgeport, Bridgeport, CT 06604, USA"}]},{"given":"Ausif","family":"Mahmood","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, University of Bridgeport, Bridgeport, CT 06604, USA"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,16]]},"reference":[{"key":"ref_1","unstructured":"Box, G.E., Jenkins, G.M., Reinsel, G.C., and Ljung, G.M. (2015). Time Series Analysis: Forecasting and Control, John Wiley & Sons."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1513","DOI":"10.1198\/jasa.2011.tm09771","article-title":"Forecasting time series with complex seasonal patterns using exponential smoothing","volume":"106","author":"Hyndman","year":"2011","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_3","unstructured":"Cerqueira, V., Torgo, L., and Soares, C. (2019). Machine Learning vs. Statistical Methods for Time Series Forecasting: Size Matters. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"20200209","DOI":"10.1098\/rsta.2020.0209","article-title":"Time-series forecasting with deep learning: A survey","volume":"379","author":"Lim","year":"2021","journal-title":"Philos. Trans. R. Soc. 
A"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1016\/j.ijforecast.2019.07.001","article-title":"DeepAR: Probabilistic forecasting with autoregressive recurrent networks","volume":"36","author":"Salinas","year":"2020","journal-title":"Int. J. Forecast."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1016\/j.ijforecast.2020.06.008","article-title":"Recurrent Neural Networks for Time Series Forecasting: Current Status and Future Directions","volume":"37","author":"Hewamalage","year":"2021","journal-title":"Int. J. Forecast."},{"key":"ref_7","unstructured":"Sen, R., Yu, H.F., and Dhillon, I.S. (2019). Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Neural Information Processing Systems Foundation, Inc. (NeurIPS)."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lai, G., Chang, W.-C., Yang, Y., and Liu, H. (2018, January 8\u201312). Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks. Proceedings of the SIGIR \u201818: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Ann Arbor, MI, USA.","DOI":"10.1145\/3209978.3210006"},{"key":"ref_9","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Advances in Neural Information Processing Systems 30 (NIPS 2017), Neural Information Processing Systems Foundation, Inc. (NeurIPS)."},{"key":"ref_10","first-page":"11106","article-title":"Informer: Beyond efficient transformer for long sequence time-series forecasting","volume":"35","author":"Zhou","year":"2021","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_11","unstructured":"Wu, H., Xu, J., Wang, J., and Long, M. (2021). Advances in Neural Information Processing Systems 34 (NeurIPS 2021), Neural Information Processing Systems Foundation, Inc. (NeurIPS)."},{"key":"ref_12","unstructured":"Zhou, T., Ma, Z., Wen, Q., Wang, X., Sun, L., and Jin, R. (2022, January 17\u201323). Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting. Proceedings of the 39th International Conference on Machine Learning PMLR 2022, Baltimore, MD, USA."},{"key":"ref_13","unstructured":"Li, S., Jin, X., Xuan, Y., Zhou, X., Chen, W., Wang, Y.X., and Yan, X. (2019). Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Neural Information Processing Systems Foundation, Inc. (NeurIPS)."},{"key":"ref_14","unstructured":"Nie, Y., Nguyen, N.H., Sinthong, P., and Kalagnanam, J. (2022). A Time Series is worth 64 words: Long-term forecasting with Transformers. arXiv."},{"key":"ref_15","first-page":"11121","article-title":"Are Transformers effective for Time Series Forecasting?","volume":"37","author":"Zeng","year":"2023","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_16","unstructured":"Liu, S., Yu, H., Liao, C., Li, J., Lin, W., Liu, A.X., and Dustdar, S. (2022, January 25\u201329). Pyraformer: Low-complexity pyramidal attention for long-range time series modeling and forecasting. Proceedings of the International Conference on Learning Representations 2022, Online."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"746","DOI":"10.1111\/j.1467-6419.2007.00518.x","article-title":"Direct multi-step estimation and forecasting","volume":"21","author":"Guillaume","year":"2007","journal-title":"J. Econ. Surv."},{"key":"ref_18","unstructured":"Albert, G., and Dao, T. (2023). 
Mamba: Linear-time sequence modeling with selective state spaces. arXiv."},{"key":"ref_19","unstructured":"Liu, Y., Tian, Y., Zhao, Y., Yu, H., Xie, L., Wang, Y., Ye, Q., and Liu, Y. (2024). Vmamba: Visual state space model. arXiv."},{"key":"ref_20","unstructured":"Zhang, M., Saab, K.K., Poli, M., Dao, T., Goel, K., and R\u00e9, C. (2023). Effectively Modeling Time Series with Simple Discrete State Spaces. arXiv."},{"key":"ref_21","unstructured":"Kim, T., Kim, J., Tae, Y., Park, C., Choi, J.H., and Choo, J. (2021, January 3\u20137). Reversible instance normalization for accurate time-series forecasting against distribution shift. Proceedings of the International Conference on Learning Representations 2021, Online."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_23","unstructured":"Zhu, L., Liao, B., Zhang, Q., Wang, X., Liu, W., and Wang, X. (2024). Vision mamba: Efficient visual representation learning with bidirectional state space model. arXiv."},{"key":"ref_24","unstructured":"Wang, Z., Kong, F., Feng, S., Wang, M., Zhao, H., Wang, D., and Zhang, Y. (2024). Is Mamba Effective for Time Series Forecasting?. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Ahamed, M.A., and Cheng, Q. (2024). TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting. arXiv.","DOI":"10.3233\/FAIA240677"}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/8\/5\/48\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:43:23Z","timestamp":1760107403000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/8\/5\/48"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,16]]},"references-count":25,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,5]]}},"alternative-id":["bdcc8050048"],"URL":"https:\/\/doi.org\/10.3390\/bdcc8050048","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,16]]}}}
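The abstract above describes the paper's enhancement only at a high level: a simple linear forecaster wrapped in reversible instance normalization (RevIN, ref_21) with dual pipelines, one of which uses batch normalization. The following is a minimal PyTorch sketch of that general idea; it is not the authors' implementation, and the class names, wiring (summing the two pipelines), and hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch of a RevIN-wrapped dual-pipeline linear forecaster.
# All names and the exact combination of the two branches are assumptions.
import torch
import torch.nn as nn

class RevIN(nn.Module):
    """Normalize each series instance; the per-instance statistics are kept
    so the transform can be inverted on the forecast (Kim et al., 2021)."""
    def __init__(self, num_channels: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.affine_weight = nn.Parameter(torch.ones(num_channels))
        self.affine_bias = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, L, C)
        self.mean = x.mean(dim=1, keepdim=True).detach()
        self.std = torch.sqrt(
            x.var(dim=1, keepdim=True, unbiased=False) + self.eps
        ).detach()
        return (x - self.mean) / self.std * self.affine_weight + self.affine_bias

    def invert(self, y: torch.Tensor) -> torch.Tensor:   # y: (B, H, C)
        return (y - self.affine_bias) / self.affine_weight * self.std + self.mean

class DualLinearForecaster(nn.Module):
    """Two parallel linear maps over the look-back window, one preceded by
    batch normalization; their outputs are summed to form the forecast."""
    def __init__(self, seq_len: int, horizon: int, num_channels: int):
        super().__init__()
        self.revin = RevIN(num_channels)
        self.pipe_a = nn.Linear(seq_len, horizon)   # plain linear pipeline
        self.bn = nn.BatchNorm1d(num_channels)      # batch-normalized pipeline
        self.pipe_b = nn.Linear(seq_len, horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, L, C)
        x = self.revin(x)
        x = x.transpose(1, 2)                        # (B, C, L): map along time axis
        out = self.pipe_a(x) + self.pipe_b(self.bn(x))
        out = out.transpose(1, 2)                    # (B, H, C)
        return self.revin.invert(out)                # undo instance normalization

# Usage: 96-step look-back, 24-step horizon, 7 variables (shapes chosen
# arbitrarily here; common ETT-style benchmarks use similar settings).
model = DualLinearForecaster(seq_len=96, horizon=24, num_channels=7)
y_hat = model(torch.randn(32, 96, 7))                # -> (32, 24, 7)
```

The RevIN wrapper addresses the distribution shift between training and test windows that plagues linear models, while the batch-normalized branch gives the network a second, differently conditioned view of the same window; how the paper actually fuses the two pipelines should be checked against the full text at https:\/\/www.mdpi.com\/2504-2289\/8\/5\/48.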