{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,19]],"date-time":"2025-10-19T06:09:53Z","timestamp":1760854193466,"version":"3.41.0"},"reference-count":31,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2021,4,27]],"date-time":"2021-04-27T00:00:00Z","timestamp":1619481600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Plan","award":["2019YFB1705301"],"award-info":[{"award-number":["2019YFB1705301"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62072265, 61572272, and 71690231"],"award-info":[{"award-number":["62072265, 61572272, and 71690231"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Data and Information Quality"],"published-print":{"date-parts":[[2021,9,30]]},"abstract":"<jats:p>IoT data with timestamps are often found with outliers, such as GPS trajectories or sensor readings. While existing systems mostly focus on detecting temporal outliers without explanations and repairs, a decision maker may be more interested in the cause of the outlier appearance such that subsequent actions would be taken, e.g., cleaning unreliable readings or repairing broken devices or adopting a strategy for data repairs. Such outlier detection, explanation, and repairs are expected to be performed in either offline (batch) or online modes (over streaming IoT data with timestamps). In this work, we present TsClean, a new prototype system for detecting and repairing outliers with explanations over IoT data. 
The framework defines uniform profiles to explain the outliers detected by various algorithms, including the outliers with variant time intervals, and takes approaches to repair outliers. Both batch and streaming processing are supported in a uniform framework. In particular, by varying the block size, it provides a tradeoff between computing the accurate results and approximating with efficient incremental computation. In this article, we present several case studies of applying TsClean in industry, e.g., how this framework works in detecting and repairing outliers over excavator water temperature data, and how to get reasonable explanations and repairs for the detected outliers in tracking excavators.<\/jats:p>","DOI":"10.1145\/3436239","type":"journal-article","created":{"date-parts":[[2021,4,27]],"date-time":"2021-04-27T14:14:25Z","timestamp":1619532865000},"page":"1-17","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["EXPERIENCE: Algorithms and Case Study for Explaining Repairs with Uniform Profiles over IoT Data"],"prefix":"10.1145","volume":"13","author":[{"given":"Zhicheng","family":"Liu","sequence":"first","affiliation":[{"name":"Tsinghua University"}]},{"given":"Yang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Ruihong","family":"Huang","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Zhiwei","family":"Chen","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Shaoxu","family":"Song","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]},{"given":"Jianmin","family":"Wang","sequence":"additional","affiliation":[{"name":"Tsinghua University"}]}],"member":"320","published-online":{"date-parts":[[2021,4,27]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.04.070"},{"key":"e_1_2_1_2_1","article-title":"Time series 
analysis: forecasting and control","volume":"31","author":"Box G. E. P.","year":"2010","unstructured":"G. E. P. Box and G. M. Jenkins. 2010. Time series analysis: forecasting and control. Journal of Time 31, 3 (2010).","journal-title":"Journal of Time"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/358198.358222"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3183713.3190659"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/GrC.2012.6468672"},{"key":"e_1_2_1_6_1","unstructured":"Yanping Chen, Eamonn Keogh, Bing Hu, Nurjahan Begum, Anthony Bagnall, Abdullah Mueen, and Gustavo Batista. 2015. The UCR Time Series Classification Archive. Retrieved from www.cs.ucr.edu\/\u223ceamonn\/time_series_data\/."},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPWRS.2002.804943"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-40994-3_20"},{"key":"e_1_2_1_9_1","first-page":"336","article-title":"Discovering latent threads in entity histories. Data Sci","volume":"4","author":"Duan Yijun","year":"2019","unstructured":"Yijun Duan, Adam Jatowt, and Katsumi Tanaka. 2019. Discovering latent threads in entity histories. Data Sci. Eng. 4, 4 (2019), 336\u2013351. 
DOI:https:\/\/doi.org\/10.1007\/s41019-019-00108-x","journal-title":"Eng."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.5555\/2621979"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2013.184"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD\u201918)","author":"Gupta Nikhil","year":"2018","unstructured":"Nikhil Gupta, Dhivya Eswaran, Neil Shah, Leman Akoglu, and Christos Faloutsos. 2018. Beyond outlier detection: LookOut for pictorial explanation. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD\u201918). 122\u2013138. 
DOI:https:\/\/doi.org\/10.1007\/978-3-030-10925-7_8"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijinfomgt.2018.08.006"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData47090.2019.9006232"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1080\/00224065.1986.11979014"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1007\/11748625_6"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/1182635.1164143"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jnca.2016.08.002"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.5555\/645496.657889"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2007.190691"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-03070-3_34"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2013.132"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/SCCC.2013.18"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-008-5093-3"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the IEEE International Conference on Big Data and Smart Computing (BigComp\u201917)","author":"Son Siwoon","year":"2017","unstructured":"Siwoon Son, Myeong-Seon Gil, and Yang-Sae Moon. 2017. Anomaly detection for big log data using a Hadoop ecosystem. In Proceedings of the IEEE International Conference on Big Data and Smart Computing (BigComp\u201917). 377\u2013380. 
DOI:https:\/\/doi.org\/10.1109\/BIGCOMP.2017.7881697"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDMW.2018.00204"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2013.84"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2783258.2783317"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2723730"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2882903.2915233"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.14778\/3115404.3115410"}],"container-title":["Journal of Data and Information Quality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3436239","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3436239","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:03:29Z","timestamp":1750197809000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3436239"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,27]]},"references-count":31,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2021,9,30]]}},"alternative-id":["10.1145\/3436239"],"URL":"https:\/\/doi.org\/10.1145\/3436239","relation":{},"ISSN":["1936-1955","1936-1963"],"issn-type":[{"type":"print","value":"1936-1955"},{"type":"electronic","value":"1936-1963"}],"subject":[],"published":{"date-parts":[[2021,4,27]]},"assertion":[{"value":"2020-02-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2021-04-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}