{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:47:37Z","timestamp":1760233657669,"version":"build-2065373602"},"reference-count":60,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2021,2,4]],"date-time":"2021-02-04T00:00:00Z","timestamp":1612396800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Samsung Research","award":["SRFC-IT1801-10"],"award-info":[{"award-number":["SRFC-IT1801-10"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Efficient and accurate estimation of the probability distribution of a data stream is an important problem in many sensor systems. It is especially challenging when the data stream is non-stationary, i.e., its probability distribution changes over time. Statistical models for non-stationary data streams demand agile adaptation for concept drift while tolerating temporal fluctuations. To this end, a statistical model needs to forget old data samples and to detect concept drift swiftly. In this paper, we propose FlexSketch, an online probability density estimation algorithm for data streams. Our algorithm uses an ensemble of histograms, each of which represents a different length of data history. FlexSketch updates each histogram for a new data sample and generates probability distribution by combining the ensemble of histograms while monitoring discrepancy between recent data and existing models periodically. When it detects concept drift, a new histogram is added to the ensemble and the oldest histogram is removed. This allows us to estimate the probability density function with high update speed and high accuracy using only limited memory. 
Experimental results demonstrate that our algorithm shows improved speed and accuracy compared to existing methods for both stationary and non-stationary data streams.<\/jats:p>","DOI":"10.3390\/s21041080","type":"journal-article","created":{"date-parts":[[2021,2,4]],"date-time":"2021-02-04T21:29:27Z","timestamp":1612474167000},"page":"1080","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["FlexSketch: Estimation of Probability Density for Stationary and Non-Stationary Data Streams"],"prefix":"10.3390","volume":"21","author":[{"given":"Namuk","family":"Park","sequence":"first","affiliation":[{"name":"School of Integrated Technology, Yonsei University, Incheon 21983, Korea"}]},{"given":"Songkuk","family":"Kim","sequence":"additional","affiliation":[{"name":"School of Integrated Technology, Yonsei University, Incheon 21983, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2021,2,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Kraska, T., Beutel, A., Chi, E.H., Dean, J., and Polyzotis, N. (2018, January 10\u201315). The case for learned index structures. Proceedings of the International Conference on Management of Data, Houston, TX, USA.","DOI":"10.1145\/3183713.3196909"},{"key":"ref_2","unstructured":"Ustinova, E., and Lempitsky, V. (2016, January 5\u201310). Learning deep embeddings with histogram loss. Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_3","unstructured":"Geng, Y., Liu, S., Yin, Z., Naik, A., Prabhakar, B., Rosenblum, M., and Vahdat, A. (2018, January 9\u201311). Exploiting a natural network effect for scalable, fine-grained clock synchronization. 
Proceedings of the USENIX Symposium on Networked Systems Design and Implementation, Renton, WA, USA."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"964","DOI":"10.1007\/s10618-015-0448-4","article-title":"Characterizing concept drift","volume":"30","author":"Webb","year":"2016","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1016\/j.neucom.2017.04.070","article-title":"Unsupervised real-time anomaly detection for streaming data","volume":"262","author":"Ahmad","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Cheng, K.W., Chen, Y.T., and Fang, W.H. (2015, January 7\u201312). Video anomaly detection and localization using hierarchical feature representation and Gaussian process regression. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298909"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Yang, D., Li, B., Rettig, L., and Cudr\u00e9-Mauroux, P. (2017, January 18\u201321). HistoSketch: Fast similarity-preserving sketching of streaming histograms with concept drift. Proceedings of the IEEE International Conference on Data Mining, New Orleans, LA, USA.","DOI":"10.1109\/ICDM.2017.64"},{"key":"ref_8","first-page":"849","article-title":"A streaming parallel decision tree algorithm","volume":"11","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2630","DOI":"10.1016\/j.patcog.2011.03.019","article-title":"Multivariate online kernel density estimation with Gaussian kernels","volume":"44","author":"Kristan","year":"2011","journal-title":"Pattern Recognit."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Heinz, C., and Seeger, B. (2006, January 27\u201329). Towards kernel density estimation over streaming data. 
Proceedings of the International Conference on Management of Data, Chicago, IL, USA.","DOI":"10.1145\/1183614.1183772"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"642","DOI":"10.1109\/TKDE.2016.2626441","article-title":"KDE-Track: An efficient dynamic density estimator for data streams","volume":"29","author":"Qahtan","year":"2017","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1014","DOI":"10.1016\/j.envsoft.2009.08.010","article-title":"Anomaly detection in streaming environmental sensor data: A data-driven modeling approach","volume":"25","author":"Hill","year":"2010","journal-title":"Environ. Model. Softw."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/j.compind.2018.12.001","article-title":"Intelligent fault diagnosis of rotating machinery based on one-dimensional convolutional neural network","volume":"108","author":"Wu","year":"2019","journal-title":"Comput. Ind."},{"key":"ref_14","unstructured":"Wang, J., Yang, X., and Long, K. (2010, January 22\u201325). A new relative entropy based app-DDoS detection method. Proceedings of the IEEE Symposium on Computers and Communications, Riccione, Italy."},{"key":"ref_15","unstructured":"Wilson, A.G., Gilboa, E., Nehorai, A., and Cunningham, J.P. Fast kernel learning for multidimensional pattern extrapolation. Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS\u201914)\u2014Volume 2."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1007\/s10618-012-0297-3","article-title":"Anomaly detection in large-scale data stream networks","volume":"28","author":"Pham","year":"2014","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2523813","article-title":"A survey on concept drift adaptation","volume":"46","author":"Gama","year":"2014","journal-title":"ACM Comput. 
Surv."},{"key":"ref_18","unstructured":"Bifet, A., Holmes, G., Pfahringer, B., Kirkby, R., and Gavald\u00e0, R. (July, January 28). New ensemble methods for evolving data streams. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Bifet, A., Holmes, G., and Pfahringer, B. (2010, January 19\u201323). Leveraging bagging for evolving data streams. Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Barcelona, Spain.","DOI":"10.1007\/978-3-642-15880-3_15"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1469","DOI":"10.1007\/s10994-017-5642-8","article-title":"Adaptive random forests for evolving data stream classification","volume":"106","author":"Gomes","year":"2017","journal-title":"Mach. Learn."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1007\/s10994-019-05840-z","article-title":"Kappa updated ensemble for drifting data stream mining","volume":"109","author":"Cano","year":"2020","journal-title":"Mach. Learn."},{"key":"ref_22","unstructured":"Klinkenberg, R., and Joachims, T. (July, January 29). Detecting concept drift with support vector machines. Proceedings of the International Conference on Machine Learning, Stanford, CA, USA."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1631\/FITEE.1800038","article-title":"FAAD: An unsupervised fast and accurate anomaly detection method for a multi-dimensional sequence over data stream","volume":"20","author":"Li","year":"2019","journal-title":"Front. Inf. Technol. Electron. Eng."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1108\/IJPCC-03-2017-0027","article-title":"A framework for unsupervised change detection in activity recognition","volume":"13","author":"Bashir","year":"2017","journal-title":"Int. J. Pervasive Comput. 
Commun."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1016\/j.eswa.2017.12.022","article-title":"Handling adversarial concept drift in streaming data","volume":"97","author":"Sethi","year":"2018","journal-title":"Expert Syst. Appl."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Costa, A.F.J., Albuquerque, R.A.S., and dos Santos, E.M. (2018, January 8\u201313). A drift detection method based on active learning. Proceedings of the International Joint Conference on Neural Networks, Rio de Janeiro, Brazil.","DOI":"10.1109\/IJCNN.2018.8489364"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Koh, Y.S. (2016, January 24\u201329). CD-TDS: Change detection in transactional data streams for frequent pattern mining. Proceedings of the International Joint Conference on Neural Networks, Vancouver, BC, Canada.","DOI":"10.1109\/IJCNN.2016.7727383"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1016\/j.eswa.2018.08.054","article-title":"On learning guarantees to unsupervised concept drift detection on data streams","volume":"117","author":"Vaz","year":"2019","journal-title":"Expert Syst. Appl."},{"key":"ref_29","first-page":"50","article-title":"A drift detection method based on dynamic classifier selection","volume":"34","author":"Gama","year":"2019","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1009","DOI":"10.1007\/s00500-010-0657-0","article-title":"Fuzzy classification in dynamic environments","volume":"15","author":"Bouchachia","year":"2011","journal-title":"Soft Comput."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Gomes, J.A.B., Menasalvas, E., and Sousa, P.A.C. (2011, January 21\u201324). Learning recurring concepts from data streams with a context-aware ensemble. 
Proceedings of the 2011 ACM Symposium on Applied Computing (SAC\u201911), Taichung, Taiwan.","DOI":"10.1145\/1982185.1982403"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1007\/s12530-012-9067-0","article-title":"EVE: A framework for event detection","volume":"4","author":"Berthold","year":"2013","journal-title":"Evol. Syst."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Vorburger, P., and Bernstein, A. (2006, January 18\u201322). Entropy-based concept shift detection. Proceedings of the International Conference on Data Mining (ICDM\u201906), Hong Kong, China.","DOI":"10.1109\/ICDM.2006.66"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"G\u00f6z\u00fca\u00e7\u0131k, O., B\u00fcy\u00fck\u00e7ak\u0131r, A., Bonab, H., and Can, F. (2019, January 3\u20137). Unsupervised concept drift detection with a discriminative classifier. Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM\u201919), Beijing, China.","DOI":"10.1145\/3357384.3358144"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"108384","DOI":"10.1109\/ACCESS.2019.2932018","article-title":"Drifted Twitter spam classification using multiscale detection test on K-L divergence","volume":"7","author":"Wang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1080\/00224065.1997.11979720","article-title":"Designing a multivariate EWMA control chart","volume":"29","author":"Prabhu","year":"1997","journal-title":"J. Qual. Technol."},{"key":"ref_37","unstructured":"Koren, Y. (July, January 28). Collaborative filtering with temporal dynamics. 
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1145\/1809400.1809423","article-title":"Online mass flow prediction in CFB boilers with explicit detection of sudden concept drift","volume":"11","author":"Pechenizkiy","year":"2010","journal-title":"ACM SIGKDD Explor. Newsl."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Forman, G. (2006, January 6\u201311). Tackling concept drift by temporal inductive transfer. Proceedings of the 29th ACM Conference on Research and Development in Information Retrieval, Seattle, WA, USA.","DOI":"10.1145\/1148170.1148216"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Gilbert, A.C., Guha, S., Indyk, P., Kotidis, Y., Muthukrishnan, S., and Strauss, M.J. (2002, January 19\u201321). Fast, small-space algorithms for approximate histogram maintenance. Proceedings of the Annual ACM Symposium on Theory of Computing, Montreal, QC, Canada.","DOI":"10.1145\/509907.509966"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"396","DOI":"10.1145\/1132863.1132873","article-title":"Approximation and streaming algorithms for histogram construction problems","volume":"31","author":"Guha","year":"2006","journal-title":"ACM Trans. Database Syst."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1145\/376284.375670","article-title":"Space-efficient online computation of quantile summaries","volume":"30","author":"Greenwald","year":"2001","journal-title":"ACM SIGMOD Rec."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Shrivastava, N., Buragohain, C., Agrawal, D., and Suri, S. (2004, January 3\u20135). Medians and beyond: New aggregation techniques for sensor networks. 
Proceedings of the International Conference on Embedded Network Sensor Systems, Baltimore, MD, USA.","DOI":"10.1145\/1031495.1031524"},{"key":"ref_44","unstructured":"Cormode, G., Korn, F., Muthukrishnan, S., and Srivastava, D. (2005, January 5\u20138). Effective computation of biased quantiles over data streams. Proceedings of the International Conference on Data Engineering, Tokyo, Japan."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Singh, S.A., Srivastava, D., and Tirthapura, S. (2016, January 9). Estimating quantiles from the union of historical and streaming data. Proceedings of the VLDB Endowment, New Delhi, India.","DOI":"10.14778\/3025111.3025124"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"1794","DOI":"10.1137\/S0097539701398363","article-title":"Maintaining stream statistics over sliding windows","volume":"31","author":"Datar","year":"2002","journal-title":"SIAM J. Comput."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"861","DOI":"10.3233\/IDA-2009-0397","article-title":"On the window size for classification in changing environments","volume":"13","author":"Kuncheva","year":"2009","journal-title":"Intell. Data Anal."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.cie.2012.02.008","article-title":"Towards a variable size sliding window model for frequent itemset mining over data streams","volume":"63","author":"Deypir","year":"2012","journal-title":"Comput. Ind. Eng."},{"key":"ref_49","first-page":"2755","article-title":"Dynamic weighted majority: An ensemble method for drifting concepts","volume":"8","author":"Kolter","year":"2007","journal-title":"J. Mach. Learn. Res."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"1517","DOI":"10.1109\/TNN.2011.2160459","article-title":"Incremental learning of concept drift in nonstationary environments","volume":"22","author":"Elwell","year":"2011","journal-title":"IEEE Trans. 
Neural Netw."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3054925","article-title":"A survey on ensemble learning for data stream classification","volume":"50","author":"Gomes","year":"2017","journal-title":"ACM Comput. Surv."},{"key":"ref_52","unstructured":"Oza, N.C. (2005, January 12). Online bagging and boosting. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Waikoloa, HI, USA."},{"key":"ref_53","unstructured":"(2021, February 04). Source Codes of FlexSketch. Available online: https:\/\/xxxnell.github.io\/flex\/docs\/core\/sketch.html."},{"key":"ref_54","unstructured":"(2021, February 04). Source Codes of Online Kernel Density Estimation. Available online: https:\/\/github.com\/joluet\/okde-java."},{"key":"ref_55","unstructured":"(2021, February 04). Source Codes of Streaming Parallel Decision Tree. Available online: https:\/\/github.com\/soundcloud\/spdt."},{"key":"ref_56","first-page":"1601","article-title":"MOA: Massive online analysis","volume":"11","author":"Bifet","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Street, N., and Kim, Y. (2001, January 26\u201329). A streaming ensemble algorithm (SEA) for large-scale classification. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/502512.502568"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Thaper, N., Guha, S., Indyk, P., and Koudas, N. (2002, January 4\u20136). Dynamic multidimensional histograms. Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data (SIGMOD \u201902), Madison, WI, USA.","DOI":"10.1145\/564740.564741"},{"key":"ref_59","unstructured":"Diakonikolas, I., Kane, D.M., and Peebles, J. (2019, January 25\u201328). Testing identity of multidimensional histograms. 
Proceedings of the Conference on Learning Theory (PMLR), Phoenix, AZ, USA."},{"key":"ref_60","unstructured":"Jordaney, R., Sharad, K., Dash, S.K., Wang, Z., Papini, D., Nouretdinov, I., and Cavallaro, L. (2017, January 16\u201318). Transcend: Detecting concept drift in malware classification models. Proceedings of the 26th USENIX Security Symposium (USENIX Security 17), Vancouver, BC, Canada."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/4\/1080\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T05:20:19Z","timestamp":1760160019000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/4\/1080"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,4]]},"references-count":60,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,2]]}},"alternative-id":["s21041080"],"URL":"https:\/\/doi.org\/10.3390\/s21041080","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2021,2,4]]}}}