{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T21:48:46Z","timestamp":1771710526407,"version":"3.50.1"},"reference-count":64,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2022,7,30]],"date-time":"2022-07-30T00:00:00Z","timestamp":1659139200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Science Foundation","award":["OAC-1532133 & CMMI-1727785"],"award-info":[{"award-number":["OAC-1532133 & CMMI-1727785"]}]},{"name":"USDOT Eisenhower Fellowship program","award":["693JJ32045011"],"award-info":[{"award-number":["693JJ32045011"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2022,12,31]]},"abstract":"<jats:p>Measuring the built and natural environment at a fine-grained scale is now possible with low-cost urban environmental sensor networks. However, fine-grained city-scale data analysis is complicated by tedious data cleaning including removing outliers and imputing missing data. While many methods exist to automatically correct anomalies and impute missing entries, challenges still exist on data with large spatial-temporal scales and shifting patterns. To address these challenges, we propose an online robust tensor recovery (OLRTR) method to preprocess streaming high-dimensional urban environmental datasets. A small-sized dictionary that captures the underlying patterns of the data is computed and constantly updated with new data. OLRTR enables online recovery for large-scale sensor networks that provide continuous data streams, with a lower computational memory usage compared to offline batch counterparts. 
In addition, we formulate the objective function so that OLRTR can detect structured outliers, such as faulty readings over a long period of time. We validate OLRTR on a synthetically degraded National Oceanic and Atmospheric Administration temperature dataset, and apply it to the Array of Things city-scale sensor network in Chicago, IL, showing superior results compared with several established online and batch-based low-rank decomposition methods.<\/jats:p>","DOI":"10.1145\/3532189","type":"journal-article","created":{"date-parts":[[2022,5,4]],"date-time":"2022-05-04T11:19:26Z","timestamp":1651663166000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Streaming Data Preprocessing via Online Tensor Recovery for Large Environmental Sensor Networks"],"prefix":"10.1145","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6579-0646","authenticated-orcid":false,"given":"Yue","family":"Hu","sequence":"first","affiliation":[{"name":"Vanderbilt University, Nashville, TN, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3996-8521","authenticated-orcid":false,"given":"Ao","family":"Qu","sequence":"additional","affiliation":[{"name":"Vanderbilt University, Nashville, TN, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3988-8356","authenticated-orcid":false,"given":"Yanbing","family":"Wang","sequence":"additional","affiliation":[{"name":"Vanderbilt University, Nashville, TN, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0565-2158","authenticated-orcid":false,"given":"Daniel B.","family":"Work","sequence":"additional","affiliation":[{"name":"Vanderbilt University, Nashville, TN, USA"}]}],"member":"320","published-online":{"date-parts":[[2022,7,30]]},"reference":[{"key":"e_1_3_3_2_2","article-title":"The Sustainable Development Goals Report","author":"Nations United","year":"2016","unstructured":"United Nations. 2016. The Sustainable Development Goals Report. 
UN, New York, NY.","journal-title":"UN, New York, NY"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3063386.3063771"},{"key":"e_1_3_3_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/THS.2008.4534518"},{"key":"e_1_3_3_5_2","unstructured":"A. Lewis W. R. Peltier and E. von Schneidemesser. 2018. Low-cost sensors for the measurement of atmospheric composition: Overview of topic and future applications. World Meteorological Organization. https:\/\/www.wmo.int\/pages\/prog\/arep\/gaw\/documents\/Draft_low_cost_sensors.pdf."},{"key":"e_1_3_3_6_2","doi-asserted-by":"publisher","DOI":"10.3390\/atmos10090506"},{"key":"e_1_3_3_7_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.chemolab.2006.06.016"},{"key":"e_1_3_3_8_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.envsoft.2009.08.010"},{"key":"e_1_3_3_9_2","doi-asserted-by":"publisher","DOI":"10.1111\/j.1365-246X.2012.05569.x"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2860964"},{"key":"e_1_3_3_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2015.125"},{"key":"e_1_3_3_12_2","first-page":"404","volume-title":"Proceedings of the 26th International Conference on Neural Information Processing Systems","author":"Feng Jiashi","year":"2013","unstructured":"Jiashi Feng, Huan Xu, and Shuicheng Yan. 2013. Online robust PCA via stochastic optimization. In Proceedings of the 26th International Conference on Neural Information Processing Systems. 404\u2013412."},{"key":"e_1_3_3_13_2","first-page":"622","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Shen J.","year":"2016","unstructured":"J. Shen, P. Li, and H. Xu. 2016. Online low-rank subspace clustering by basis dictionary pursuit. In Proceedings of the International Conference on Machine Learning. 
622\u2013631."},{"key":"e_1_3_3_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.trc.2019.03.003"},{"key":"e_1_3_3_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3417337"},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSIPN.2021.3105795"},{"key":"e_1_3_3_17_2","volume-title":"Proceedings of the NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning","author":"Hu Yue","year":"2019","unstructured":"Yue Hu, Yanbing Wang, Canwen Jiao, Rajesh Sankaran, Charles Catlett, and Daniel Work. 2019. Automatic data cleaning via tensor factorization for large urban environmental sensor networks. In Proceedings of the NeurIPS 2019 Workshop on Tackling Climate Change with Machine Learning."},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.7289\/JWPF-Y430"},{"key":"e_1_3_3_19_2","unstructured":"University of Chicago. 2019. Array of Things File Browser. Retrieved April 2021 from https:\/\/afb.plenar.io\/data-sets\/chicago-complete."},{"key":"e_1_3_3_20_2","doi-asserted-by":"publisher","DOI":"10.1137\/07070111X"},{"key":"e_1_3_3_21_2","doi-asserted-by":"publisher","DOI":"10.1137\/130905010"},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF02289464"},{"key":"e_1_3_3_23_2","doi-asserted-by":"publisher","DOI":"10.1137\/080738970"},{"key":"e_1_3_3_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/1970392.1970395"},{"key":"e_1_3_3_25_2","doi-asserted-by":"publisher","DOI":"10.5555\/2639267.2639269"},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2008.2001399"},{"key":"e_1_3_3_27_2","first-page":"99126W","volume-title":"Proceedings of SPIE, Advances in Optical and Mechanical Technologies for Telescopes and Instrumentation II","volume":"9912","author":"Nirmal K.","year":"2016","unstructured":"K. Nirmal, A. G. Sreejith, Joice Mathew, Mayuresh Sarpotdar, Ambily Suresh, Ajin Prakash, Margarita Safonova, and Jayant Murthy. 2016. 
Noise modeling and analysis of an IMU-based attitude sensor: Improvement of performance by filtering and sensor fusion. In Proceedings of SPIE, Advances in Optical and Mechanical Technologies for Telescopes and Instrumentation II, Vol. 9912. International Society for Optics and Photonics, 99126W."},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/1236360.1236364"},{"key":"e_1_3_3_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2011.2110657"},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.3390\/rs8080689"},{"key":"e_1_3_3_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP40776.2020.9053863"},{"key":"e_1_3_3_32_2","doi-asserted-by":"publisher","DOI":"10.3390\/s20010317"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2019.06.021"},{"key":"e_1_3_3_34_2","doi-asserted-by":"publisher","DOI":"10.5555\/1543556"},{"key":"e_1_3_3_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.1998.705329"},{"key":"e_1_3_3_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.oceaneng.2020.108261"},{"key":"e_1_3_3_37_2","article-title":"Missing data imputation with Bayesian maximum entropy for Internet of Things applications","author":"Gonz\u00e1lez-Vidal Aurora","year":"2020","unstructured":"Aurora Gonz\u00e1lez-Vidal, Punit Rathore, Aravinda S. Rao, Jos\u00e9 Mendoza-Bernal, Marimuthu Palaniswami, and Antonio F. Skarmeta-G\u00f3mez. 2020. Missing data imputation with Bayesian maximum entropy for Internet of Things applications. 
IEEE Internet of Things Journal 8, 21 (2020), 16108\u201316120.","journal-title":"IEEE Internet of Things Journal"},{"key":"e_1_3_3_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/1921632.1921636"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3168363"},{"key":"e_1_3_3_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2021.3073544"},{"key":"e_1_3_3_41_2","doi-asserted-by":"publisher","DOI":"10.4208\/jcm.1809-m2018-0106"},{"key":"e_1_3_3_42_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2016.10.030"},{"key":"e_1_3_3_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2015.2465178"},{"key":"e_1_3_3_44_2","first-page":"2496","volume-title":"Proceedings of the 23rd International Conference on Neural Information Processing Systems","author":"Xu H.","year":"2010","unstructured":"H. Xu, C. Caramanis, and S. Sanghavi. 2010. Robust PCA via outlier pursuit. In Proceedings of the 23rd International Conference on Neural Information Processing Systems. 2496\u20132504."},{"key":"e_1_3_3_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.419"},{"key":"e_1_3_3_46_2","article-title":"Detecting moving objects from dynamic background combining subspace learning with mixed norm approach","volume":"79","author":"Lu Yuqiu","year":"2020","unstructured":"Yuqiu Lu, Jingjing Liu, Wang Liu, Shiwei Ma, Xianchao Xiu, Wanquan Liu, and Hui Chen. 2020. Detecting moving objects from dynamic background combining subspace learning with mixed norm approach. 
Multimedia Tools & Applications 79, 25\u201326 (2020), 18747\u201318766.","journal-title":"Multimedia Tools & Applications"},{"key":"e_1_3_3_47_2","doi-asserted-by":"publisher","DOI":"10.1002\/sapm192761164"},{"key":"e_1_3_3_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2392756"},{"key":"e_1_3_3_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2018.2851612"},{"key":"e_1_3_3_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.164"},{"key":"e_1_3_3_51_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.sigpro.2021.108370"},{"key":"e_1_3_3_52_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.sigpro.2022.108460"},{"issue":"1","key":"e_1_3_3_53_2","article-title":"Online learning for matrix factorization and sparse coding","volume":"11","author":"Mairal J.","year":"2010","unstructured":"J. Mairal, F. Bach, J. Ponce, and G. Sapiro. 2010. Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research 11, 1 (2010), 19\u201360.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_3_54_2","article-title":"Online robust subspace tracking from partial information","author":"He J.","year":"2011","unstructured":"J. He, L. Balzano, and J. Lui. 2011. Online robust subspace tracking from partial information. arXiv:1109.3827. Retrieved from https:\/\/arxiv.org\/abs\/1109.3827.","journal-title":"arXiv:1109.3827"},{"key":"e_1_3_3_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE51399.2021.00078"},{"key":"e_1_3_3_56_2","doi-asserted-by":"publisher","DOI":"10.3390\/sym14010113"},{"key":"e_1_3_3_57_2","article-title":"Traffic estimation and prediction via online variational Bayesian subspace filtering","author":"Paliwal Charul","year":"2021","unstructured":"Charul Paliwal, Uttkarsha Bhatt, Pravesh Biyani, and Ketan Rajawat. 2021. Traffic estimation and prediction via online variational Bayesian subspace filtering. 
IEEE Transactions on Intelligent Transportation Systems 23, 5 (2021), 4674\u20134684.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_3_58_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5714"},{"key":"e_1_3_3_59_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2012.2204986"},{"key":"e_1_3_3_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/tnsm.2016.2598788"},{"key":"e_1_3_3_61_2","doi-asserted-by":"publisher","DOI":"10.1137\/070697835"},{"key":"e_1_3_3_62_2","doi-asserted-by":"publisher","DOI":"10.1145\/1102351.1102441"},{"key":"e_1_3_3_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2891760"},{"key":"e_1_3_3_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2015.2417491"},{"key":"e_1_3_3_65_2","unstructured":"National Centers for Environmental Information. 2018. What\u2019s a USCRN Station? Retrieved July 17 2019 from https:\/\/www.ncei.noaa.gov\/news\/what-is-a-uscrn-station."}],"container-title":["ACM Transactions on Knowledge Discovery from 
Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3532189","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3532189","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:42Z","timestamp":1750183782000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3532189"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,30]]},"references-count":64,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2022,12,31]]}},"alternative-id":["10.1145\/3532189"],"URL":"https:\/\/doi.org\/10.1145\/3532189","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,30]]},"assertion":[{"value":"2021-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-07-30","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}