{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T05:10:16Z","timestamp":1775538616464,"version":"3.50.1"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"6","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62302241, 62372252, 72342017, 92267203, 62232005, 62021002"],"award-info":[{"award-number":["62302241, 62372252, 72342017, 92267203, 62232005, 62021002"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"the National Key Research and Development Plan","award":["2024YFB3311901"],"award-info":[{"award-number":["2024YFB3311901"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Manag. Data"],"published-print":{"date-parts":[[2025,12,4]]},"abstract":"<jats:p>Spatio-temporal data collected from geographically distributed sources often contain dirty values that affect downstream applications. Temporal data repairing methods, e.g., based on speed constraints, may mistakenly treat sudden changes as errors, although they represent real events and occur simultaneously at multiple locations. Spatial data repairing approaches emphasize value consistency across different locations but ignore temporal pattern similarity. Meanwhile, existing spatio-temporal repairing methods focus more on spatial error correction rather than temporal value repairing across locations. Therefore, we use both temporal and spatial dependencies to identify and repair spatio-temporal errors. 
Our main contributions are: (1) formalizing the optimal spatio-temporal data repairing problem under constraints and proving its NP-hardness; (2) designing an exact algorithm that decomposes global repair into local decisions with pruning methods; (3) developing two approximate algorithms with theoretical guarantees and probabilities of hitting the optimal solution, where the first explores a wider search space for higher accuracy, and the second uses a greedy sliding-window strategy to improve efficiency; and (4) conducting experiments on nine real-world datasets and downstream applications against eleven baselines, which demonstrate the superiority and practicability of our methods.<\/jats:p>","DOI":"10.1145\/3769794","type":"journal-article","created":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T04:32:13Z","timestamp":1764995533000},"page":"1-27","source":"Crossref","is-referenced-by-count":0,"title":["From Suspicious Errors to Valid Data: On Repairing Spatio-Temporal Data via Spatial and Temporal Dependencies"],"prefix":"10.1145","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-0839-6455","authenticated-orcid":false,"given":"Weiwei","family":"Deng","sequence":"first","affiliation":[{"name":"College of Computer Science, Nankai University, Tianjin, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-7398-2972","authenticated-orcid":false,"given":"Yu","family":"Sun","sequence":"additional","affiliation":[{"name":"College of Computer Science, Nankai University, Tianjin, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9503-2755","authenticated-orcid":false,"given":"Shaoxu","family":"Song","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5876-6856","authenticated-orcid":false,"given":"Xiaojie","family":"Yuan","sequence":"additional","affiliation":[{"name":"College of Computer Science, Nankai University, Tianjin, 
China"}]}],"member":"320","published-online":{"date-parts":[[2025,12,5]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2002.985697"},{"key":"e_1_2_1_2_1","unstructured":"2025. athens. https:\/\/www.kaggle.com\/datasets\/yekenot\/air-quality-monitoring-in-european-cities."},{"key":"e_1_2_1_3_1","unstructured":"2025. pems. https:\/\/pems.dot.ca.gov."},{"key":"e_1_2_1_4_1","unstructured":"2025. STRhub. https:\/\/github.com\/cookiesvivi\/STRhub."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2882907"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1177\/2043820613513390"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066175"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1979.4766909"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2525314.2525468"},{"key":"e_1_2_1_10_1","volume-title":"Jensen","author":"Fang Ziquan","year":"2021","unstructured":"Ziquan Fang, Yuntao Du, Xinjun Zhu, Lu Chen, Yunjun Gao, and Christian S. Jensen. 2021. ST2Vec: Spatio-Temporal Trajectory Similarity Learning in Road Networks. CoRR abs\/2112.09339 (2021). arXiv:2112.09339 https:\/\/arxiv.org\/abs\/2112.09339"},{"key":"e_1_2_1_11_1","first-page":"637","article-title":"EXPONENTIAL SMOOTHING: THE STATE OF THE ART, PART II","volume":"22","author":"Gardner Everette S.","year":"2006","unstructured":"Everette S. Gardner. 2006. EXPONENTIAL SMOOTHING: THE STATE OF THE ART, PART II. International Journal of Forecasting 22 (2006), 637-666. https:\/\/api.semanticscholar.org\/CorpusID:2749748","journal-title":"International Journal of Forecasting"},
{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687693"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654993"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/BSN.2012.10"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2020.3027736"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.envsoft.2009.08.010"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5194\/gmd-15-5481-2022"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.14778\/3665844.3665862"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1179\/1752270613Y.0000000053"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11356-016-7812-9"},{"key":"e_1_2_1_22_1","unstructured":"Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. 2018. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. arXiv:1707.01926 [cs.LG] https:\/\/arxiv.org\/abs\/1707.01926"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2018\/476"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2023.102078"},{"key":"e_1_2_1_25_1","volume-title":"Spatiotemporal Pattern of PM2.5 Concentrations in Mainland China and Analysis of Its Influencing Factors using Geographically Weighted Regression. Scientific Reports 7","author":"Luo Jieqiong","year":"2017","unstructured":"Jieqiong Luo, Peijun Du, Alim Samat, Junshi Xia, Meiqin Che, and Zhaohui Xue. 2017. Spatiotemporal Pattern of PM2.5 Concentrations in Mainland China and Analysis of Its Influencing Factors using Geographically Weighted Regression. Scientific Reports 7 (2017). 
https:\/\/api.semanticscholar.org\/CorpusID:31005343"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330787"},{"key":"e_1_2_1_27_1","unstructured":"Samuel Madden. 2003. Intel Berkeley Research Lab Data. https:\/\/db.csail.mit.edu\/labdata\/labdata.html. MIT CSAIL."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/MASS.2013.13"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2019.00024"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.3390\/s23146431"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1525856.1525863"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-11508-5_22"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Karl Pearson. [n.d.]. Mathematical Contributions to the Theory of Evolution. III. Regression Heredity and Panmixia. Philosophical Transactions of the Royal Society A 187 ([n.d.]) 253-318. https:\/\/api.semanticscholar.org\/CorpusID:119875807","DOI":"10.1098\/rsta.1896.0007"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.14778\/3574245.3574254"},{"key":"e_1_2_1_35_1","volume-title":"B. Rajanarayan Prusty, and Debashisha Jena.","author":"Ranjan Kumar Gaurav","year":"2020","unstructured":"Kumar Gaurav Ranjan, Debesh Shankar Tripathy, B. Rajanarayan Prusty, and Debashisha Jena. 2020. An improved sliding window prediction-based outlier detection and correction for volatile time-series. International Journal of Numerical Modelling: Electronic Networks 34 (2020). 
https:\/\/api.semanticscholar.org\/CorpusID:225339439"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/0377-0427(87)90125-7"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.2022.2078330"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1213\/ANE.0000000000002864"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3465740"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2723730"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588939"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE48307.2020.00068"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/B978-0-12-817026-7.00006-0"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.14778\/3514061.3514067"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2008.2009971"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3437963.3441731"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3437963.3441731"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1109\/CSE.2013.116"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219822"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219822"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2018\/505"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915233"},{"key":"e_1_2_1_53_1","unstructured":"Aoqian Zhang, Shaoxu Song, Jianmin Wang, and Philip S. Yu. 2020. Time Series Data Cleaning: From Anomaly Detection to Anomaly Repairing (Technical Report). 
arXiv:2003.12396 [cs.DB] https:\/\/arxiv.org\/abs\/2003.12396"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2411.01214"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12145-024-01598-8"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2021.3102110"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629592"},{"key":"e_1_2_1_58_1","first-page":"141","article-title":"A High-Dimensional Timing Data Cleaning Algorithm for Wireless Sensor Networks","volume":"53","author":"Zhou Jingjing","year":"2022","unstructured":"Jingjing Zhou, Xiaokang Yu, Jilin Zhang, Hanxiao Shi, Yu xing Mao, and Junfeng Yuan. 2022. A High-Dimensional Timing Data Cleaning Algorithm for Wireless Sensor Networks. Ad Hoc Sens. Wirel. Networks 53 (2022), 141-164. https:\/\/api.semanticscholar.org\/CorpusID:252623861","journal-title":"Ad Hoc Sens. Wirel. Networks"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654946"}],"container-title":["Proceedings of the ACM on Management of Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3769794","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T04:28:22Z","timestamp":1775536102000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3769794"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,4]]},"references-count":59,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,12,4]]}},"alternative-id":["10.1145\/3769794"],"URL":"https:\/\/doi.org\/10.1145\/3769794","relation":{},"ISSN":["2836-6573"],"issn-type":[{"value":"2836-6573","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,4]]}}}