{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T17:26:21Z","timestamp":1768584381200,"version":"3.49.0"},"reference-count":54,"publisher":"Association for Computing Machinery (ACM)","issue":"7","license":[{"start":{"date-parts":[[2023,5,4]],"date-time":"2023-05-04T00:00:00Z","timestamp":1683158400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62032025"],"award-info":[{"award-number":["62032025"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Key-Area Research and Development Program of Shandong Province","award":["2021CXGC010108"],"award-info":[{"award-number":["2021CXGC010108"]}]},{"name":"Jiangsu Science and Technology Program","award":["BE2020006-4"],"award-info":[{"award-number":["BE2020006-4"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2023,8,31]]},"abstract":"<jats:p>\n            We focus on multi-step ahead time series forecasting with the multi-output strategy. From the perspective of multi-task learning (MTL), we recognize imbalanced uncertainties between prediction tasks of different future time steps. Unexpectedly, trained by the standard summed Mean Squared Error (MSE) loss, existing multi-output forecasting models may suffer from performance drops due to the inconsistency between the loss function and the imbalance structure. To address this problem, we reformulate each prediction task as a distinct Gaussian Mixture Model (GMM) and derive a multi-level Gaussian mixture loss function to better fit imbalanced uncertainties in multi-output time series forecasting. 
Instead of using the two-step Expectation-Maximization (EM) algorithm, we apply the self-attention mechanism on the task-specific parameters to learn the correlations between different prediction tasks and generate the weight distribution for each GMM component. In this way, our method jointly optimizes the parameters of the forecasting model and the mixture model in an end-to-end fashion, avoiding the need for two-step optimization. Experiments on three real-world datasets demonstrate the effectiveness of our multi-level Gaussian mixture loss compared to models trained with the standard summed MSE loss function. All the experimental data and source code are available at\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"web\" xlink:href=\"https:\/\/github.com\/smallGum\/GMM-FNN\">https:\/\/github.com\/smallGum\/GMM-FNN<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3584704","type":"journal-article","created":{"date-parts":[[2023,2,16]],"date-time":"2023-02-16T11:34:40Z","timestamp":1676547280000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Fitting Imbalanced Uncertainties in Multi-output Time Series Forecasting"],"prefix":"10.1145","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1755-6828","authenticated-orcid":false,"given":"Jiezhu","family":"Cheng","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, Guangdong Province, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3034-9639","authenticated-orcid":false,"given":"Kaizhu","family":"Huang","sequence":"additional","affiliation":[{"name":"Data Science Research Center, Duke Kunshan University, Suzhou, Jiangsu Province, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7878-4330","authenticated-orcid":false,"given":"Zibin","family":"Zheng","sequence":"additional","affiliation":[{"name":"School of Software Engineering, Sun Yat-sen University, Zhuhai, Guangdong Province, China"}]}],"member":"320","published-online":{"date-parts":[[2023,5,4]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330917"},{"key":"e_1_3_2_3_2","first-page":"41","volume-title":"Proceedings of the 20th Annual Conference on Neural Information Processing Systems.","author":"Argyriou Andreas","year":"2006","unstructured":"Andreas Argyriou, Theodoros Evgeniou, and Massimiliano Pontil. 2006. Multi-task feature learning. In Proceedings of the 20th Annual Conference on Neural Information Processing Systems. Bernhard Sch\u00f6lkopf, John C. Platt, and Thomas Hofmann (Eds.), MIT, 41\u201348. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2006\/hash\/0afa92fc0f8a9cf051bf2961b06ac56b-Abstract.html."},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/UKSim.2014.67"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2012.01.039"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.5555\/574978"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1111\/1467-9868.00054"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007379606734"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/d16-1053"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5766"},{"key":"e_1_3_2_11_2","volume-title":"Proceedings of the 9th International Conference on Learning Representations.","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. 
An image is worth 16x16 words: Transformers for image recognition at scale. In Proceedings of the 9th International Conference on Learning Representations. Retrieved from https:\/\/openreview.net\/forum?id=YicbFdNTTy."},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-41398-8_15"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330662"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3097983.3098037"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-04898-2_594"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00781"},{"key":"e_1_3_2_18_2","volume-title":"Proceedings of the 3rd International Conference on Learning Representations.","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations. Yoshua Bengio and Yann LeCun (Eds.). Retrieved from http:\/\/arxiv.org\/abs\/1412.6980."},{"key":"e_1_3_2_19_2","volume-title":"Proceedings of the 29th International Conference on Machine Learning","author":"Kumar Abhishek","year":"2012","unstructured":"Abhishek Kumar and Hal Daum\u00e9 III. 2012. Learning task grouping and overlap in multi-task learning. In Proceedings of the 29th International Conference on Machine Learning. Retrieved from http:\/\/icml.cc\/2012\/papers\/690.pdf."},{"key":"e_1_3_2_20_2","series-title":"Proceedings of the 35th International Conference on Machine Learning.","first-page":"2962","volume":"80","author":"Lee Haebeom","year":"2018","unstructured":"Haebeom Lee, Eunho Yang, and Sung Ju Hwang. 2018. Deep asymmetric multi-task feature learning. In Proceedings of the 35th International Conference on Machine Learning. Jennifer G. 
Dy and Andreas Krause (Eds.), Proceedings of Machine Learning Research, Vol. 80, PMLR, 2962\u20132970. Retrieved from http:\/\/proceedings.mlr.press\/v80\/lee18d.html."},{"key":"e_1_3_2_21_2","first-page":"5244","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems.","author":"Li Shiyang","year":"2019","unstructured":"Shiyang Li, Xiaoyong Jin, Yao Xuan, Xiyou Zhou, Wenhu Chen, Yu-Xiang Wang, and Xifeng Yan. 2019. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. In Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems. Hanna M. Wallach, Hugo Larochelle, Alina Beygelzimer, Florence d\u2019Alch\u00e9-Buc, Emily B. Fox, and Roman Garnett (Eds.), 5244\u20135254. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2019\/hash\/6775a0635c302542da2c32aa19d86be0-Abstract.html."},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3453724"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1098\/rsta.2020.0209"},{"key":"e_1_3_2_24_2","volume-title":"Proceedings of the 5th International Conference on Learning Representations","author":"Lin Zhouhan","year":"2017","unstructured":"Zhouhan Lin, Minwei Feng, C\u00edcero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A structured self-attentive sentence embedding. In Proceedings of the 5th International Conference on Learning Representations. 
Retrieved from https:\/\/openreview.net\/forum?id=BJC_jUqxe."},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijforecast.2018.06.001"},{"key":"e_1_3_2_26_2","first-page":"15434","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021.","author":"Marfoq Othmane","year":"2021","unstructured":"Othmane Marfoq, Giovanni Neglia, Aur\u00e9lien Bellet, Laetitia Kameni, and Richard Vidal. 2021. Federated multi-task learning under a mixture of distributions. In Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021. Marc\u2019Aurelio Ranzato, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, and Jennifer Wortman Vaughan (Eds.), 15434\u201315447. Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2021\/hash\/82599a4ec94aca066873c99b4c741ed8-Abstract.html."},{"key":"e_1_3_2_27_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations","author":"Oreshkin Boris N.","year":"2020","unstructured":"Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, and Yoshua Bengio. 2020. N-BEATS: Neural basis expansion analysis for interpretable time series forecasting. In Proceedings of the 8th International Conference on Learning Representations. Retrieved from https:\/\/openreview.net\/forum?id=r1ecqn4YwB."},{"key":"e_1_3_2_28_2","volume-title":"Proceedings of the 9th International Conference on Learning Representations","author":"Pilault Jonathan","year":"2021","unstructured":"Jonathan Pilault, Amine Elhattami, and Christopher J. Pal. 2021. Conditionally adaptive multi-task learning: Improving transfer learning in NLP using fewer parameters and less data. In Proceedings of the 9th International Conference on Learning Representations. 
Retrieved from https:\/\/openreview.net\/forum?id=de11dbHzAMF."},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/366"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.protcy.2013.12.228"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2018.2840129"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijforecast.2019.07.001"},{"key":"e_1_3_2_33_2","series-title":"Proceedings of the 35th International Conference on Machine Learning.","first-page":"4555","volume":"80","author":"Serr\u00e0 Joan","year":"2018","unstructured":"Joan Serr\u00e0, Didac Suris, Marius Miron, and Alexandros Karatzoglou. 2018. Overcoming catastrophic forgetting with hard attention to the task. In Proceedings of the 35th International Conference on Machine Learning. Jennifer G. Dy and Andreas Krause (Eds.), Proceedings of Machine Learning Research, Vol. 80, PMLR, 4555\u20134564. Retrieved from http:\/\/proceedings.mlr.press\/v80\/serra18a.html."},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.5555\/2503308.2503331"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2006.06.015"},{"key":"e_1_3_2_36_2","first-page":"3104","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014.","author":"Sutskever Ilya","year":"2014","unstructured":"Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014. Zoubin Ghahramani, Max Welling, Corinna Cortes, Neil D. Lawrence, and Kilian Q. Weinberger (Eds.), 3104\u20133112. 
Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2014\/hash\/a14ac55a4f27472c5d894ec1c3c743d2-Abstract.html."},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2015.2411629"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2009.5178802"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2869225"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178776"},{"key":"e_1_3_2_41_2","first-page":"5998","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017.","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017. Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.), 5998\u20136008. 
Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html."},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/59.651623"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1155\/2020\/6458576"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13571-018-0159-0"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1080\/03610918.2019.1588305"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.1711.11053"},{"key":"e_1_3_2_47_2","volume-title":"Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020.","author":"Wu Sifan","year":"2020","unstructured":"Sifan Wu, Xi Xiao, Qianggang Ding, Peilin Zhao, Ying Wei, and Junzhou Huang. 2020. Adversarial sparse transformer for time series forecasting. In Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020. Hugo Larochelle, Marc\u2019Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.). Retrieved from https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/c6b8c8d762da15fa8dbbdfb6baf9e260-Abstract.html."},{"key":"e_1_3_2_48_2","volume-title":"Proceedings of the 8th International Conference on Learning Representations.","author":"Wu Sen","year":"2020","unstructured":"Sen Wu, Hongyang R. Zhang, and Christopher R\u00e9. 2020. Understanding and improving information transfer in multi-task learning. In Proceedings of the 8th International Conference on Learning Representations. 
Retrieved from https:\/\/openreview.net\/forum?id=SylzhkBtDB."},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i1.16145"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSC.2018.2876532"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3464308"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSC.2020.2995571"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i12.17325"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.aiopen.2021.01.001"},{"key":"e_1_3_2_55_2","volume-title":"Proceedings of the 6th International Conference on Learning Representations.","author":"Zong Bo","year":"2018","unstructured":"Bo Zong, Qi Song, Martin Renqiang Min, Wei Cheng, Cristian Lumezanu, Dae-ki Cho, and Haifeng Chen. 2018. Deep autoencoding gaussian mixture model for unsupervised anomaly detection. In Proceedings of the 6th International Conference on Learning Representations. 
Retrieved from https:\/\/openreview.net\/forum?id=BJJLHbb0-."}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3584704","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3584704","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:51:37Z","timestamp":1750182697000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3584704"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,4]]},"references-count":54,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2023,8,31]]}},"alternative-id":["10.1145\/3584704"],"URL":"https:\/\/doi.org\/10.1145\/3584704","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,4]]},"assertion":[{"value":"2022-01-31","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-06","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-05-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}