{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,6]],"date-time":"2026-06-06T02:49:53Z","timestamp":1780714193706,"version":"3.54.1"},"reference-count":31,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2021,1,25]],"date-time":"2021-01-25T00:00:00Z","timestamp":1611532800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Centre of Excellence project \u201cDATACROSS\u201d, co-financed by the Croatian Government and the European Union through the European Regional Development Fund\u2014the Competitiveness and Cohesion Operational Programme","award":["KK.01.1.1.01.0009"],"award-info":[{"award-number":["KK.01.1.1.01.0009"]}]},{"name":"QuantiX\u2014 Lie Center of Excellence, a project co-financed by the Croatian Government and European Union through the European Regional Development Fund\u2014 The Competitiveness and Cohesion Operational Programme","award":["KK.01.1.1.01.0004"],"award-info":[{"award-number":["KK.01.1.1.01.0004"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>The adaptation of deep learning models within safety-critical systems cannot rely only on good prediction performance but needs to provide interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to support deep neural networks with intrinsic interpretability. This paper focuses on the emerging trend of specifically designing diagnostic datasets for understanding the inner workings of attention mechanism based deep learning models for multivariate forecasting tasks. We design a novel benchmark of synthetically designed datasets with the transparent underlying generating process of multiple time series interactions with increasing complexity. 
The benchmark enables empirical evaluation of the performance of attention based deep neural networks in three different aspects: (i) prediction performance score, (ii) interpretability correctness, (iii) sensitivity analysis. Our analysis shows that although most models have satisfying and stable prediction performance results, they often fail to give correct interpretability. The only model with both a satisfying performance score and correct interpretability is IMV-LSTM, capturing both autocorrelations and crosscorrelations between multiple time series. Interestingly, while evaluating IMV-LSTM on simulated data from statistical and mechanistic models, the correctness of interpretability increases with more complex datasets.<\/jats:p>","DOI":"10.3390\/e23020143","type":"journal-article","created":{"date-parts":[[2021,1,25]],"date-time":"2021-01-25T02:07:00Z","timestamp":1611540420000},"page":"143","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions"],"prefix":"10.3390","volume":"23","author":[{"given":"Domjan","family":"Bari\u0107","sequence":"first","affiliation":[{"name":"Department of Physics, Faculty of Science, University of Zagreb, Bijeni\u010dka cesta 32, 10000 Zagreb, Croatia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Petar","family":"Fumi\u0107","sequence":"additional","affiliation":[{"name":"Department of Physics, Faculty of Science, University of Zagreb, Bijeni\u010dka cesta 32, 10000 Zagreb, Croatia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0411-8474","authenticated-orcid":false,"given":"Davor","family":"Horvati\u0107","sequence":"additional","affiliation":[{"name":"Department of Physics, Faculty of Science, University of Zagreb, Bijeni\u010dka cesta 32, 10000 Zagreb, 
Croatia"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8037-8198","authenticated-orcid":false,"given":"Tomislav","family":"Lipic","sequence":"additional","affiliation":[{"name":"Division of Electronics, Ru\u0111er Bo\u0161kovi\u0107 Institute, Bijeni\u010dka cesta 54, 10000 Zagreb, Croatia"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2021,1,25]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Lim, B., and Zohren, S. (2020). Time Series Forecasting With Deep Learning: A Survey. arXiv.","DOI":"10.1098\/rsta.2020.0209"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"159915","DOI":"10.1109\/ACCESS.2020.3019989","article-title":"Deepcovidnet: An interpretable deep learning model for predictive surveillance of covid-19 using heterogeneous features and their interactions","volume":"8","author":"Ramchandani","year":"2020","journal-title":"IEEE Access"},{"key":"ref_3","unstructured":"Shi, Z.R., Wang, C., and Fang, F. (2020). Artificial intelligence for social good: A survey. arXiv."},{"key":"ref_4","first-page":"846","article-title":"Short-Term Electricity Consumption Forecasting Based on the Attentive Encoder-Decoder Model","volume":"140","author":"Song","year":"2020","journal-title":"IEEJ Trans. Electron. Inf. Syst."},{"key":"ref_5","unstructured":"Arya, V., Bellamy, R.K., Chen, P.Y., Dhurandhar, A., Hind, M., Hoffman, S.C., Houde, S., Liao, Q.V., Luss, R., and Mojsilovi\u0107, A. (2019). One explanation does not fit all: A toolkit and taxonomy of ai explainability techniques. arXiv."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"206","DOI":"10.1038\/s42256-019-0048-x","article-title":"Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead","volume":"1","author":"Rudin","year":"2019","journal-title":"Nat. Mach. 
Intell."},{"key":"ref_7","unstructured":"Lundberg, S.M., and Lee, S.I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"73694","DOI":"10.1109\/ACCESS.2019.2921101","article-title":"A deep learning perspective on beauty, sentiment, and remembrance of art","volume":"7","author":"Cetinic","year":"2019","journal-title":"IEEE Access"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1016\/j.cobeha.2019.04.007","article-title":"The Omniglot challenge: A 3-year progress report","volume":"29","author":"Lake","year":"2019","journal-title":"Curr. Opin. Behav. Sci."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Johnson, J., Hariharan, B., van der Maaten, L., Fei-Fei, L., Lawrence Zitnick, C., and Girshick, R. (2017, January 21\u201326). Clevr: A diagnostic dataset for compositional language and elementary visual reasoning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.215"},{"key":"ref_11","unstructured":"Santoro, A., Hill, F., Barrett, D., Morcos, A., and Lillicrap, T. (2018, January 10\u201315). Measuring abstract reasoning in neural networks. Proceedings of the International Conference on Machine Learning, Alvsjo, Sweden."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Springer, J.M., and Kenyon, G.T. (2020). It is Hard for Neural Networks To Learn the Game of Life. arXiv.","DOI":"10.1109\/IJCNN52387.2021.9534060"},{"key":"ref_13","unstructured":"Chollet, F. (2019). On the measure of intelligence. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Assaf, R., and Schumann, A. (2019, January 10\u201316). Explainable Deep Neural Networks for Multivariate Time Series Predictions. 
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), Macao, China.","DOI":"10.24963\/ijcai.2019\/932"},{"key":"ref_15","unstructured":"Arnout, H., El-Assady, M., Oelke, D., and Keim, D.A. (2019, January 27\u201328). Towards A Rigorous Evaluation Of XAI Methods On Time Series. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Korea."},{"key":"ref_16","unstructured":"Ismail, A.A., Gunady, M., Corrada Bravo, H., and Feizi, S. (2020, January 6\u201312). Benchmarking Deep Learning Interpretability in Time Series Predictions. Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, BC, Canada."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Pantiskas, L., Verstoep, C., and Bal, H. (2020, January 1\u20134). Interpretable Multivariate Time Series Forecasting with Temporal Attention Convolutional Neural Networks. Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence, Canberra, Australia.","DOI":"10.1109\/SSCI47803.2020.9308570"},{"key":"ref_18","unstructured":"Fauvel, K., Masson, V., and Fromont, \u00c9. (2020). A Performance-Explainability Framework to Benchmark Machine Learning Methods: Application to Multivariate Time Series Classifiers. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Mohankumar, A.K., Nema, P., Narasimhan, S., Khapra, M.M., Srinivasan, B.V., and Ravindran, B. (2020). Towards Transparent and Explainable Attention Models. arXiv.","DOI":"10.18653\/v1\/2020.acl-main.387"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"075310","DOI":"10.1063\/1.5025050","article-title":"Causal network reconstruction from time series: From theoretical assumptions to practical estimation","volume":"28","author":"Runge","year":"2018","journal-title":"Chaos Interdiscip. J. 
Nonlinear Sci."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"eaau4996","DOI":"10.1126\/sciadv.aau4996","article-title":"Detecting and quantifying causal associations in large nonlinear time series datasets","volume":"5","author":"Runge","year":"2019","journal-title":"Sci. Adv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Runge, J., Bathiany, S., Bollt, E., Camps-Valls, G., Coumou, D., Deyle, E., Glymour, C., Kretschmer, M., Mahecha, M., and Munoz-Mari, J. (2019). Inferring causation from time series with perspectives in Earth system sciences. Nat. Commun., 10.","DOI":"10.1038\/s41467-019-10105-3"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/j.ijforecast.2019.04.014","article-title":"The M4 Competition: 100,000 time series and 61 forecasting methods","volume":"36","author":"Makridakis","year":"2020","journal-title":"Int. J. Forecast."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Dang, X.H., Shah, S.Y., and Zerfos, P. (2018). seq2graph: Discovering Dynamic Dependencies from Multivariate Time Series with Multi-level Attention. arXiv.","DOI":"10.1109\/BigData47090.2019.9006103"},{"key":"ref_25","unstructured":"Guo, T., Lin, T., and Antulov-Fantulin, N. (2019). Exploring Interpretable LSTM Neural Networks over Multi-Variable Data. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"312","DOI":"10.3390\/make1010019","article-title":"Causal Discovery with Attention-Based Convolutional Neural Networks","volume":"1","author":"Nauta","year":"2019","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Qin, Y., Song, D., Chen, H., Cheng, W., Jiang, G., and Cottrell, G. (2017). A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction. 
arXiv.","DOI":"10.24963\/ijcai.2017\/366"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1103\/PhysRev.65.117","article-title":"Crystal statistics. I. A two-dimensional model with an order-disorder transition","volume":"65","author":"Onsager","year":"1944","journal-title":"Phys. Rev."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Landau, D.P., and Binder, K. (2009). A Guide to Monte Carlo Simulations in Statistical Physics, Cambridge University Press.","DOI":"10.1017\/CBO9780511994944"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.ijforecast.2019.03.017","article-title":"A hybrid method of exponential smoothing and recurrent neural networks for time series forecasting","volume":"36","author":"Smyl","year":"2020","journal-title":"Int. J. Forecast."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"053304","DOI":"10.1103\/PhysRevE.97.053304","article-title":"Scale-invariant feature extraction of neural network and renormalization group flow","volume":"97","author":"Iso","year":"2018","journal-title":"Phys. Rev. E"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/2\/143\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:14:52Z","timestamp":1760159692000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/23\/2\/143"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,25]]},"references-count":31,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2021,2]]}},"alternative-id":["e23020143"],"URL":"https:\/\/doi.org\/10.3390\/e23020143","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1,25]]}}}