{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T18:45:18Z","timestamp":1774982718341,"version":"3.50.1"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,12,18]],"date-time":"2024-12-18T00:00:00Z","timestamp":1734480000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2022YFB2702100"],"award-info":[{"award-number":["2022YFB2702100"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["61932004, 62225203, U21A20516, U2001211, 62102023, U21B2007"],"award-info":[{"award-number":["61932004, 62225203, U21A20516, U2001211, 62102023, U21B2007"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100006374","name":"Liaoning Revitalization Talents Program","doi-asserted-by":"publisher","award":["XLYC2204005"],"award-info":[{"award-number":["XLYC2204005"]}],"id":[{"id":"10.13039\/501100006374","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2024,12,18]]},"abstract":"<jats:p>Errors are common in time series due to unreliable sensor measurements. Existing methods focus on univariate data but do not utilize the correlation between dimensions. Cleaning each dimension separately may lead to a less accurate result, as some errors can only be identified in the multivariate case. We also point out that the widely used minimum change principle is not always the best choice. 
Instead, we try to change the smallest number of data to avoid a significant change in the data distribution. In this paper, we propose MTCSC, the constraint-based method for cleaning multivariate time series. We formalize the repair problem, propose a linear-time method to employ online computing, and improve it by exploiting data trends. We also support adaptive speed constraint capturing. We analyze the properties of our proposals and compare them with SOTA methods in terms of effectiveness, efficiency versus error rates, data sizes, and applications such as classification. Experiments on real datasets show that MTCSC can have higher repair accuracy with less time consumption. Interestingly, it can be effective even when there are only weak or no correlations between the dimensions.<\/jats:p>","DOI":"10.1145\/3698821","type":"journal-article","created":{"date-parts":[[2024,12,20]],"date-time":"2024-12-20T16:40:35Z","timestamp":1734712835000},"page":"1-26","source":"Crossref","is-referenced-by-count":1,"title":["Multivariate Time Series Cleaning under Speed Constraints"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4059-6913","authenticated-orcid":false,"given":"Aoqian","family":"Zhang","sequence":"first","affiliation":[{"name":"Beijing Institute of Technology, Beijing, CN"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-6953-4911","authenticated-orcid":false,"given":"Zexue","family":"Wu","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology, Beijing, CN"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-6495-765X","authenticated-orcid":false,"given":"Yifeng","family":"Gong","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology, Beijing, CN"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0247-9866","authenticated-orcid":false,"given":"Ye","family":"Yuan","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology, Beijing, 
CN"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0181-8379","authenticated-orcid":false,"given":"Guoren","family":"Wang","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology, Beijing, CN"}]}],"member":"320","published-online":{"date-parts":[[2024,12,20]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"ICDT (ACM International Conference Proceeding Series","volume":"41","author":"Foto","unstructured":"Foto N. Afrati and Phokion G. Kolaitis. 2009. Repair checking in inconsistent databases: algorithms and complexity. In ICDT (ACM International Conference Proceeding Series, Vol. 361). ACM, 31--41."},{"key":"e_1_2_1_2_1","volume-title":"Outlier analysis","author":"Aggarwal Charu C","unstructured":"Charu C Aggarwal. 2016. Outlier analysis second edition."},{"key":"e_1_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Fabrizio Angiulli and Fabio Fassetti. 2007. Detecting distance-based outliers in streams of data. In CIKM. ACM 811--820.","DOI":"10.1145\/1321440.1321552"},{"key":"e_1_2_1_4_1","volume-title":"Jason Lines, Michael Flynn, James Large, Aaron Bostrom, Paul Southam, and Eamonn J. Keogh.","author":"Bagnall Anthony J.","year":"2018","unstructured":"Anthony J. Bagnall, Hoang Anh Dau, Jason Lines, Michael Flynn, James Large, Aaron Bostrom, Paul Southam, and Eamonn J. Keogh. 2018. The UEA multivariate time series classification archive, 2018. CoRR, Vol. abs\/1811.00075 (2018). [arXiv]1811.00075 http:\/\/arxiv.org\/abs\/1811.00075"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3444690"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066175"},{"key":"e_1_2_1_7_1","volume-title":"Time series - data analysis and theory. Classics in applied mathematics","author":"Brillinger David R.","unstructured":"David R. Brillinger. 2001. Time series - data analysis and theory. Classics in applied mathematics, Vol. 36. SIAM."},{"key":"e_1_2_1_8_1","volume-title":"The MILP road to MIQCP. 
Mixed integer nonlinear programming","author":"Burer Samuel","year":"2011","unstructured":"Samuel Burer and Anureet Saxena. 2011. The MILP road to MIQCP. Mixed integer nonlinear programming (2011), 373--405."},{"key":"e_1_2_1_9_1","unstructured":"Yanping Chen Eamonn Keogh Bing Hu Nurjahan Begum Anthony Bagnall Abdullah Mueen and Gustavo Batista. 2015. The UCR Time Series Classification Archive. www.cs.ucr.edu\/ eamonn\/time_series_data\/."},{"key":"e_1_2_1_10_1","volume-title":"Holistic data cleaning: Putting violations into context","author":"Chu Xu","unstructured":"Xu Chu, Ihab F. Ilyas, and Paolo Papotti. 2013. Holistic data cleaning: Putting violations into context. In ICDE. IEEE Computer Society, 458--469."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1053964"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/2350229.2350279"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.5555\/1248547.1248548"},{"key":"e_1_2_1_14_1","first-page":"149","article-title":"Time Series Data Cleaning Method Based on Optimized ELM Prediction Constraints","volume":"19","author":"Ding Guohui","year":"2023","unstructured":"Guohui Ding, Yueyi Zhu, Chenyang Li, Jinwei Wang, Ru Wei, and Zhaoyu Liu. 2023. Time Series Data Cleaning Method Based on Optimized ELM Prediction Constraints. J. Inf. Process. Syst., Vol. 19, 2 (2023), 149--163.","journal-title":"J. Inf. Process. Syst."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/3469088"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijforecast.2006.03.005"},{"key":"e_1_2_1_17_1","volume-title":"Johnson","author":"Garey M. R.","year":"1979","unstructured":"M. R. Garey and David S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687693"},{"key":"e_1_2_1_19_1","unstructured":"LLC Gurobi Optimization. 
2024. Gurobi Optimization. https:\/\/www.gurobi.com\/. Accessed: 2024-03-30."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.envsoft.2009.08.010"},{"key":"e_1_2_1_21_1","unstructured":"Rob J Hyndman and George Athanasopoulos. 2018. Forecasting: principles and practice. OTexts."},{"key":"e_1_2_1_22_1","volume-title":"Franklin","author":"Jeffery Shawn R.","year":"2006","unstructured":"Shawn R. Jeffery, Minos N. Garofalakis, and Michael J. Franklin. 2006. Adaptive Cleaning for RFID Data Streams. In VLDB. ACM, 163--174."},{"key":"e_1_2_1_23_1","volume-title":"Biostatistical analysis. Biostatistical analysis","author":"Jerrold H Zar","year":"1999","unstructured":"H Zar Jerrold. 1999. Biostatistical analysis. Biostatistical analysis (1999)."},{"key":"e_1_2_1_24_1","volume-title":"Information theory and statistics","author":"Kullback Solomon","unstructured":"Solomon Kullback. 1997. Information theory and statistics. Courier Corporation."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.14778\/2535568.2448943"},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the fifth Berkeley symposium on mathematical statistics and probability","volume":"1","author":"James","unstructured":"James MacQueen et al. 1967. Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, Vol. 1. Oakland, CA, USA, 281--297."},{"key":"e_1_2_1_27_1","unstructured":"Samuel Madden. 2003. Intel Berkeley research lab data. https:\/\/db.csail.mit.edu\/labdata\/labdata.html. Accessed: 2024-04-10."},{"key":"e_1_2_1_28_1","volume-title":"CurrentClean: Spatio-Temporal Cleaning of Stale Data","author":"Milani Mostafa","unstructured":"Mostafa Milani, Zheng Zheng, and Fei Chiang. 2019. CurrentClean: Spatio-Temporal Cleaning of Stale Data. In ICDE. 
IEEE, 172--183."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.1971.10482356"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.14778\/3137628.3137631"},{"key":"e_1_2_1_31_1","volume-title":"Longest increasing and decreasing subsequences. Canadian Journal of mathematics","author":"Schensted Craige","year":"1961","unstructured":"Craige Schensted. 1961. Longest increasing and decreasing subsequences. Canadian Journal of mathematics, Vol. 13 (1961), 179--191."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3465740"},{"key":"e_1_2_1_33_1","volume-title":"SIGMOD Conference. ACM, 827--841","author":"Song Shaoxu","unstructured":"Shaoxu Song, Aoqian Zhang, Jianmin Wang, and Philip S. Yu. 2015. SCREEN: Stream Data Cleaning under Speed Constraints. In SIGMOD Conference. ACM, 827--841."},{"key":"e_1_2_1_34_1","volume-title":"Applications of dynamic programming to agricultural decision problems","author":"Taylor C Robert","unstructured":"C Robert Taylor. 2019. Dynamic programming and the curses of dimensionality. In Applications of dynamic programming to agricultural decision problems. CRC Press, 1--10."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.14778\/3514061.3514067"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-023-00796-y"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403171"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915233"},{"key":"e_1_2_1_39_1","volume-title":"Unsupervised Deep Anomaly Detection for Multi-Sensor Time-Series Signals. CoRR","author":"Zhang Yuxin","year":"2021","unstructured":"Yuxin Zhang, Yiqiang Chen, Jindong Wang, and Zhiwen Pan. 2021. Unsupervised Deep Anomaly Detection for Multi-Sensor Time-Series Signals. CoRR, Vol. abs\/2107.12626 (2021)."},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","unstructured":"Yu Zheng Like Liu Longhao Wang and Xing Xie. 2008. 
Learning transportation mode from raw gps data for geographic applications on the web. In WWW. ACM 247--256.","DOI":"10.1145\/1367497.1367532"},{"key":"e_1_2_1_41_1","first-page":"1","article-title":"A High-Dimensional Timing Data Cleaning Algorithm for Wireless Sensor Networks","volume":"53","author":"Zhou Jingjing","year":"2022","unstructured":"Jingjing Zhou, Xiaokang Yu, Jilin Zhang, Hanxiao Shi, Yuxin Mao, and Junfeng Yuan. 2022. A High-Dimensional Timing Data Cleaning Algorithm for Wireless Sensor Networks. Ad Hoc Sens. Wirel. Networks, Vol. 53, 1--2 (2022), 141--164.","journal-title":"Ad Hoc Sens. Wirel. Networks"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3698821","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3698821","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T17:46:16Z","timestamp":1774979176000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3698821"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,18]]},"references-count":41,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,12,18]]}},"alternative-id":["10.1145\/3698821"],"URL":"https:\/\/doi.org\/10.1145\/3698821","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,18]]}}}