{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:12:11Z","timestamp":1760148731210,"version":"build-2065373602"},"reference-count":34,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2023,5,28]],"date-time":"2023-05-28T00:00:00Z","timestamp":1685232000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>With the advancement of IoT technologies, there is a large amount of data available from wireless sensor networks (WSN), particularly for studying climate change. Clustering long and noisy time series has become an important research area for analyzing this data. This paper proposes a feature-based clustering approach using topological data analysis, which is a set of methods for finding topological structure in data. Persistence diagrams and landscapes are popular topological summaries that can be used to cluster time series. This paper presents a framework for selecting an optimal number of persistence landscapes, and using them as features in an unsupervised learning algorithm. This approach reduces computational cost while maintaining accuracy. The clustering approach was demonstrated to be accurate on simulated data, based on only four, three, and three features, respectively, selected in Scenarios 1\u20133. On real data, consisting of multiple long temperature streams from various US locations, our optimal feature selection method achieved approximately a 13 times speed-up in computing.<\/jats:p>","DOI":"10.3390\/fi15060195","type":"journal-article","created":{"date-parts":[[2023,5,28]],"date-time":"2023-05-28T15:29:52Z","timestamp":1685287792000},"page":"195","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Feature Construction Using Persistence Landscapes for Clustering Noisy IoT Time Series"],"prefix":"10.3390","volume":"15","author":[{"given":"Renjie","family":"Chen","sequence":"first","affiliation":[{"name":"Department of Statistics, University of Connecticut, Storrs, CT 06269, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nalini","family":"Ravishanker","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Connecticut, Storrs, CT 06269, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,5,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Soliman, A., Rajasekaran, S., Toman, P., Ravishanker, N., Lally, N., and D\u2019Addeo, H. (July, January 14). A Custom Unsupervised Approach for Pipe-Freeze Online Anomaly Detection. Proceedings of the 2021 IEEE 7th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA.","DOI":"10.1109\/WF-IoT51360.2021.9595720"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Toman, P., Soliman, A., Ravishanker, N., Rajasekaran, S., Lally, N., and D\u2019Addeo, H. (2023, January 24\u201326). Understanding insured behavior through causal analysis of IoT streams. Proceedings of the 2023 6th International Conference on Data Mining and Knowledge Discovery (DMKD 2023), Chongqing, China.","DOI":"10.1109\/ISCSIC60498.2023.00078"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"e513","DOI":"10.1002\/sta4.513","article-title":"Collaborative analysis for energy usage monitoring and management on a large university campus","volume":"11","author":"Chen","year":"2022","journal-title":"Stat"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"4728","DOI":"10.3390\/s90604728","article-title":"A review of wireless sensor technologies and applications in agriculture and food industry: State of the art and current trends","volume":"9","author":"Lunadei","year":"2009","journal-title":"Sensors"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"892","DOI":"10.1016\/j.jhydrol.2019.04.078","article-title":"Application of remote sensing to water environmental processes under a changing climate","volume":"574","author":"Cui","year":"2019","journal-title":"J. Hydrol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1038\/d41586-021-02981-x","article-title":"Embrace open-source sensors for local climate studies","volume":"599","author":"Levintal","year":"2021","journal-title":"Nature"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1007\/BF00868169","article-title":"Cluster analysis of southeastern US climate stations","volume":"44","author":"Stooksbury","year":"1991","journal-title":"Theor. Appl. Climatol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2103","DOI":"10.1175\/1520-0442(1993)006<2103:CZOTCU>2.0.CO;2","article-title":"Climate zones of the conterminous United States defined using cluster analysis","volume":"6","author":"Fovell","year":"1993","journal-title":"J. Clim."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1405","DOI":"10.1175\/1520-0442(1997)010<1405:CCOUST>2.0.CO;2","article-title":"Consensus clustering of US temperature and precipitation data","volume":"10","author":"Fovell","year":"1997","journal-title":"J. Clim."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"791","DOI":"10.1002\/joc.645","article-title":"Spatial grouping of United States climate stations using a hybrid clustering approach","volume":"21","author":"DeGaetano","year":"2001","journal-title":"Int. J. Climatol. J. R. Meteorol. Soc."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1016\/j.is.2015.04.007","article-title":"Time-series Clustering\u2014A Decade Review","volume":"53","author":"Aghabozorgi","year":"2015","journal-title":"Inf. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Edelsbrunner, H., and Harer, J. (2010). Computational Topology an Introduction, American Mathematical Society.","DOI":"10.1090\/mbk\/069"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"e1548","DOI":"10.1002\/wics.1548","article-title":"An introduction to persistent homology for time series","volume":"13","author":"Ravishanker","year":"2021","journal-title":"Wiley Interdiscip. Rev. Comput. Stat."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1007\/BFb0091924","article-title":"Detecting strange attractors in turbulence","volume":"898","author":"Takens","year":"1981","journal-title":"Lect. Notes Math."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1007\/s10208-014-9206-z","article-title":"Sliding Windows and Persistence: An Application of Topological Methods to Signal Analysis","volume":"15","author":"Perea","year":"2015","journal-title":"Found. Comput. Math."},{"key":"ref_16","unstructured":"Fasy, B.T., Kim, J., Lecci, F., and Maria, C. (2014). Introduction to the R package TDA. arXiv."},{"key":"ref_17","first-page":"77","article-title":"Statistical Topological Data Analysis Using Persistence Landscapes","volume":"16","author":"Bubenik","year":"2015","journal-title":"J. Mach. Learn. Res."},{"key":"ref_18","unstructured":"Truong, P. (2017). An Exploration of Topological Properties of High-Frequency One-Dimensional Financial Time Series Data Using TDA. [Master\u2019s Thesis, KTH Royal Institute of Technology, Mathematical Statistics]."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"123843","DOI":"10.1016\/j.physa.2019.123843","article-title":"Topological recognition of critical transitions in time series of cryptocurrencies","volume":"548","author":"Gidea","year":"2020","journal-title":"Phys. Stat. Mech. Appl."},{"key":"ref_20","unstructured":"Kim, K., Kim, J., and Rinaldo, A. (2018). Time Series featurization via topological data analysis. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1506","DOI":"10.1214\/17-AOAS1119","article-title":"Topological data analysis of single-trial electroencephalographic signals","volume":"12","author":"Wang","year":"2018","journal-title":"Ann. Appl. Stat."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1007\/s42421-019-00008-6","article-title":"Clustering activity\u2014Travel behavior time series using topological data analysis","volume":"1","author":"Chen","year":"2019","journal-title":"J. Big Data Anal. Transp."},{"key":"ref_23","unstructured":"Chen, R. (2022, January 01). Topological Data Analysis for Clustering and Classifying Time Series. 2019, University of Connecticut, USA, Doctoral Dissertations. Available online: https:\/\/opencommons.uconn.edu\/dissertations\/2365."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Shumway, R.H., and Stoffer, D.S. (2011). Time Series Analysis and Its Applications (Springer Texts in Statistics), Springer.","DOI":"10.1007\/978-1-4419-7865-3"},{"key":"ref_25","unstructured":"Priestley, M.B. (1981). Spectral Analysis and Time Series, Academic Press."},{"key":"ref_26","unstructured":"Bloomfield, P. (2004). Fourier Analysis of Time Series: An Introduction, John Wiley & Sons."},{"key":"ref_27","unstructured":"Munkres, J.R. (1993). Elements of Algebraic Topology, Addison-Wesley."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1080\/0952813X.2021.1871971","article-title":"Topological machine learning for multivariate time series","volume":"34","author":"Wu","year":"2022","journal-title":"J. Exp. Theor. Artif. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Shi, Q., Zhu, J., Peng, J., and Li, H. (2021). Time Series Clustering with Topological and Geometric Mixed Distance. Mathematics, 9.","DOI":"10.3390\/math9091046"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1002\/(SICI)1097-0266(199606)17:6<441::AID-SMJ819>3.0.CO;2-G","article-title":"The application of cluster analysis in strategic management research: An analysis and critique","volume":"17","author":"Ketchen","year":"1996","journal-title":"Strateg. Manag. J."},{"key":"ref_31","unstructured":"Bradley, P.S., and Fayyad, U.M. (1998, January 24\u201327). Refining Initial Points for K-Means Clustering. Proceedings of the ICML \u201998 Proceedings of the Fifteenth International Conference on Machine Learning, San Francisco, CA, USA."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1016\/j.eswa.2012.07.021","article-title":"A comparative study of efficient initialization methods for the k-means clustering algorithm","volume":"40","author":"Celebi","year":"2013","journal-title":"Expert Syst. Appl."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: A graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. Math."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wang, Z., Mamun, A.A., Cai, X., Ravishanker, N., and Rajasekaran, S. (2019, January 3\u20137). Efficient sequential and parallel algorithms for estimating higher order spectra. Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China.","DOI":"10.1145\/3357384.3358062"}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/15\/6\/195\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:43:48Z","timestamp":1760125428000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/15\/6\/195"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,28]]},"references-count":34,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2023,6]]}},"alternative-id":["fi15060195"],"URL":"https:\/\/doi.org\/10.3390\/fi15060195","relation":{},"ISSN":["1999-5903"],"issn-type":[{"type":"electronic","value":"1999-5903"}],"subject":[],"published":{"date-parts":[[2023,5,28]]}}}