{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T16:26:42Z","timestamp":1775838402858,"version":"3.50.1"},"reference-count":57,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2020,12,15]],"date-time":"2020-12-15T00:00:00Z","timestamp":1607990400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Data from smart grids are challenging to analyze due to their very large size, high dimensionality, skewness, sparsity, and number of seasonal fluctuations, including daily and weekly effects. With the data arriving in a sequential form the underlying distribution is subject to changes over the time intervals. Time series data streams have their own specifics in terms of the data processing and data analysis because, usually, it is not possible to process the whole data in memory as the large data volumes are generated fast so the processing and the analysis should be done incrementally using sliding windows. Despite the proposal of many clustering techniques applicable for grouping the observations of a single data stream, only a few of them are focused on splitting the whole data streams into the clusters. In this article we aim to explore individual characteristics of electricity usage and recommend the most suitable tariff to the customer so they can benefit from lower prices. 
This work investigates various algorithms (and their improvements) that allow us to form clusters, in real time, based on smart meter data.<\/jats:p>","DOI":"10.3390\/e22121414","type":"journal-article","created":{"date-parts":[[2020,12,15]],"date-time":"2020-12-15T09:12:57Z","timestamp":1608023577000},"page":"1414","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Whole Time Series Data Streams Clustering: Dynamic Profiling of the Electricity Consumption"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6953-8907","authenticated-orcid":false,"given":"Krzysztof","family":"Gajowniczek","sequence":"first","affiliation":[{"name":"Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences-SGGW, 02-776 Warsaw, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Marcin","family":"Bator","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences-SGGW, 02-776 Warsaw, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tomasz","family":"Z\u0105bkowski","sequence":"additional","affiliation":[{"name":"Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences-SGGW, 02-776 Warsaw, Poland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,12,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zabkowski, T., Gajowniczek, K., and Szupiluk, R. (2015, January 24\u201326). Grade analysis for energy usage patterns segmentation based on smart meter data. 
Proceedings of the 2015 IEEE 2nd International Conference on Cybernetics (CYBCONF), Gdynia, Poland.","DOI":"10.1109\/CYBConf.2015.7175938"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Nafkha, R., Gajowniczek, K., and Z\u0105bkowski, T. (2018). Do Customers Choose Proper Tariff? Empirical Analysis Based on Polish Data Using Unsupervised Techniques. Energies, 11.","DOI":"10.3390\/en11030514"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2522968.2522981","article-title":"Data stream clustering","volume":"46","author":"Silva","year":"2013","journal-title":"ACM Comput. Surv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"15883","DOI":"10.1109\/ACCESS.2017.2735378","article-title":"A Novel Online and Non-Parametric Approach for Drift Detection in Big Data","volume":"5","author":"Bhaduri","year":"2017","journal-title":"IEEE Access"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Gajowniczek, K., Z\u0105bkowski, T., and Sodenkamp, M. (2018). Revealing Household Characteristics from Electricity Meter Data with Grade Analysis and Machine Learning Algorithms. Appl. Sci., 8.","DOI":"10.3390\/app8091654"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"27354","DOI":"10.1109\/ACCESS.2017.2771448","article-title":"A Novel Weak Estimator for Dynamic Systems","volume":"5","author":"Bhaduri","year":"2017","journal-title":"IEEE Access"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"30855","DOI":"10.1109\/ACCESS.2018.2837660","article-title":"Using Empirical Recurrence Rates Ratio for Time Series Data Similarity","volume":"6","author":"Bhaduri","year":"2018","journal-title":"IEEE Access"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1007\/s10115-019-01350-5","article-title":"Histogram-based clustering of multiple data streams","volume":"62","author":"Balzanella","year":"2019","journal-title":"Knowl. Inf. 
Syst."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1016\/j.ijepes.2014.11.029","article-title":"Typification of load curves for DSM in Brazil for a smart grid environment","volume":"67","author":"Macedo","year":"2015","journal-title":"Int. J. Electr. Power Energy Syst."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"3683969","DOI":"10.1155\/2018\/3683969","article-title":"Simulation Study on Clustering Approaches for Short-Term Electricity Forecasting","volume":"2018","author":"Gajowniczek","year":"2018","journal-title":"Complexity"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1145\/331499.331504","article-title":"Data clustering: A review","volume":"31","author":"Jain","year":"1999","journal-title":"ACM Comput. Surv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Pitt, B.D., and Kirschen, D.S. (1999, January 21). Application of data mining techniques to load profiling. Proceedings of the 21st 1999 IEEE International Conference on Power Industry Computer Applications\u2013PICA\u201999, Santa Clara, CA, USA.","DOI":"10.1109\/PICA.1999.779395"},{"key":"ref_13","unstructured":"Gerbec, D., Gasperic, S., Simon, I., and Gubina, F. (2002, January 19). Hierarchic clustering methods for consumers load profile determination. Proceedings of the 2nd Balkan Power Conference, Belgrade, SR Yugoslavia."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Nazarko, J., and Styczynski, Z.A. (1999, January 11). Application of statistical and neural approaches to the daily load profiles modelling in power distribution systems. 
Proceedings of the 1999 IEEE Transmission and Distribution Conference, New Orleans, LA, USA.","DOI":"10.1109\/TDC.1999.755372"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1622","DOI":"10.1109\/TPWRS.2005.852123","article-title":"Short-term load forecasting, profile identification, and customer segmentation: A methodology based on periodic time series","volume":"20","author":"Espinoza","year":"2005","journal-title":"IEEE Transact. Power Syst."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1223","DOI":"10.1016\/j.rser.2011.08.014","article-title":"Energy models for demand forecasting\u2014A review","volume":"16","author":"Suganthi","year":"2012","journal-title":"Renew. Sustain. Energy Rev."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1016\/j.apenergy.2014.12.039","article-title":"A clustering approach to domestic electricity load profile characterisation using smart metering data","volume":"141","author":"McLoughlin","year":"2015","journal-title":"Appl. Energy"},{"key":"ref_18","unstructured":"Lamedica, R., Santolamazza, L., Fracassi, G., Martinelli, G., and Prudenzi, A. (2000, January 16\u201320). A novel methodology based on clustering techniques for automatic processing of MV feeder daily load patterns. Proceedings of the IEEE Power Engineering Society Summer Meeting, Seattle, WA, USA."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1109\/TPWRS.2002.807085","article-title":"Customer characterization options for improving the tariff offer","volume":"18","author":"Chicco","year":"2003","journal-title":"IEEE Transact. Power Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1016\/j.ijepes.2013.09.022","article-title":"Dynamic clustering segmentation applied to load profiles of energy consumption from Spanish customers","volume":"55","author":"Quijano","year":"2014","journal-title":"Int. J. Electr. 
Power Energy Syst."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1016\/j.apenergy.2014.08.111","article-title":"Clustering analysis of residential electricity demand profiles","volume":"135","author":"Rhodes","year":"2014","journal-title":"Appl. Energy"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1120","DOI":"10.1109\/TPWRS.2007.901287","article-title":"Two-stage pattern recognition of load curves for classification of electricity customers","volume":"22","author":"Tsekouras","year":"2007","journal-title":"IEEE Transact. Power Syst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"933","DOI":"10.1109\/TPWRS.2006.873122","article-title":"Comparisons among clustering techniques for electricity customer classification","volume":"21","author":"Chicco","year":"2006","journal-title":"IEEE Transact. Power Syst."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1016\/j.ins.2016.01.071","article-title":"A fast density-based data stream clustering algorithm with cluster centers self-determined for mixed data","volume":"345","author":"Chen","year":"2016","journal-title":"Inf. Sci."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1016\/j.jnca.2014.11.007","article-title":"MuDi-Stream: A multi density clustering algorithm for evolving data stream","volume":"59","author":"Amini","year":"2016","journal-title":"J. Netw. Comput. Appl."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Chen, Y., and Tu, L. (2007, January 12\u201315). Density-based clustering for real-time stream data. Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining\u2014KDD \u201907, San Jose, CA, USA.","DOI":"10.1145\/1281192.1281210"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Aggarwal, C.C., Yu, P.S., Han, J., and Wang, J. (2003, January 9\u201312). A Framework for Clustering Evolving Data Streams. 
Proceedings of the 2003 VLDB Conference, Berlin, Germany.","DOI":"10.1016\/B978-012722442-8\/50016-1"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1449","DOI":"10.1109\/TKDE.2016.2522412","article-title":"Clustering Data Streams Based on Shared Density between Micro-Clusters","volume":"28","author":"Hahsler","year":"2016","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1145\/235968.233324","article-title":"BIRCH: An efficient data clustering method for very large databases","volume":"25","author":"Zhang","year":"1996","journal-title":"ACM SIGMOD Rec."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Udommanetanakit, K., Rakthanmanon, T., and Waiyamai, K. (2007). E-Stream: Evolution-Based Technique for Stream Clustering. Lect. Notes Comput. Sci., 605\u2013615.","DOI":"10.1007\/978-3-540-73871-8_58"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1145\/2133803.2184450","article-title":"StreamKM++","volume":"17","author":"Ackermann","year":"2012","journal-title":"J. Exp. Algorithmics"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Beringer, J., and H\u00fcllermeier, E. (2007). Fuzzy Clustering of Parallel Data Streams. Adv. Fuzzy Clust. Appl., 333\u2013352.","DOI":"10.1002\/9780470061190.ch16"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Chen, Y. (2009). Clustering Parallel Data Streams. Data Min. Knowl. Discov. Real Life Appl.","DOI":"10.5772\/6447"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1166","DOI":"10.1109\/TKDE.2006.137","article-title":"Adaptive Clustering for Multiple Evolving Streams","volume":"18","author":"Dai","year":"2006","journal-title":"IEEE Trans. Knowl. 
Data Eng."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1007\/s10618-018-0598-2","article-title":"Interpretable multiple data streams clustering with clipped streams representation for the improvement of electricity consumption forecasting","volume":"33","author":"Laurinec","year":"2018","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1016\/j.neucom.2016.01.009","article-title":"Incremental density-based ensemble clustering over evolving data streams","volume":"191","author":"Khan","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_37","first-page":"531","article-title":"TS-stream: Clustering time series on data streams","volume":"42","author":"Pereira","year":"2014","journal-title":"J. Intell. Inf. Syst."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1109\/TKDE.2007.190727","article-title":"Hierarchical Clustering of Time-Series Data Streams","volume":"20","author":"Rodrigues","year":"2008","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1016\/j.ins.2011.09.004","article-title":"A clustering algorithm for multiple data streams based on spectral component similarity","volume":"183","author":"Chen","year":"2012","journal-title":"Inf. Sci."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Alseghayer, R., Petrov, D., Chrysanthis, P.K., Sharaf, M., and Labrinidis, A. (2017, January 28). Detection of Highly Correlated Live Data Streams. Proceedings of the International Workshop on Real-Time Business Intelligence and Analytics, Munich, Germany.","DOI":"10.1145\/3129292.3129298"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Sakurai, Y., Papadimitriou, S., and Faloutsos, C. (2005, January 14\u201316). BRAID: Stream mining through group lag correlations. 
Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, Baltimore, MD, USA.","DOI":"10.1145\/1066157.1066226"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Shafer, I., Ren, K., Boddeti, V.N., Abe, Y., Ganger, G.R., and Faloutsos, C. (2012, January 12\u201316). RainMon: An integrated approach to mining bursty timeseries monitoring data. Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining-KDD 2012, Beijing, China.","DOI":"10.1145\/2339530.2339711"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhu, Y., and Shasha, D. (2002, January 20\u201323). Statstream: Statistical monitoring of thousands of data streams in real time. Proceedings of the 28th International Conference on Very Large Databases 2002\u2013VLDB\u201902, Hong Kong, China.","DOI":"10.1016\/B978-155860869-6\/50039-1"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"4347","DOI":"10.1109\/JIOT.2019.2946753","article-title":"Dominant Data Set Selection Algorithms for Electricity Consumption Time-Series Data Analysis Based on Affine Transformation","volume":"7","author":"Wu","year":"2020","journal-title":"IEEE Internet Things J."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Gajowniczek, K., Bator, M., Z\u0105bkowski, T., Or\u0142owski, A., and Loo, C.K. (2020). Simulation Study on the Electricity Data Streams Time Series Clustering. Energies, 13.","DOI":"10.3390\/en13040924"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1007\/s11634-014-0176-4","article-title":"Basic statistics for distributional symbolic variables: A new metric-based approach","volume":"9","author":"Irpino","year":"2014","journal-title":"Adv. Data Anal. Classif."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Verde, R., and Irpino, A. (2007). Dynamic Clustering of Histogram Data: Using the Right Metric. Studies in Classification. Data Anal. Knowl. 
Organ., 123\u2013134.","DOI":"10.1007\/978-3-540-73560-1_12"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Diday, E., and Noirhomme-Fraiture, M. (2007). Symbolic Data Analysis and the SODAS Software, John Wiley & Sons.","DOI":"10.1002\/9780470723562"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1109\/PROC.1967.5493","article-title":"Results of a prototype television bandwidth compression scheme","volume":"55","author":"Robinson","year":"1967","journal-title":"Proc. IEEE"},{"key":"ref_50","unstructured":"Kaufman, L., and Rousseeuw, P.J. (2009). Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley & Sons."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1109\/TPAMI.1979.4766909","article-title":"A Cluster Separation Measure","volume":"1","author":"Davies","year":"1979","journal-title":"IEEE Transact. Pattern Anal. Mach. Intell."},{"key":"ref_52","unstructured":"Lyons, R.G. (2004). Understanding Digital Signal Processing, 2\/E, Prentice Hall PTR Upper."},{"key":"ref_53","unstructured":"(2020, March 10). BIRCH-Clustering-R-Package. Available online: https:\/\/github.com\/rohitkata\/BIRCH-Clustering-R-package."},{"key":"ref_54","unstructured":"(2020, March 10). SymbolicDA: Analysis of Symbolic Data. Available online: https:\/\/rdrr.io\/cran\/symbolicDA\/."},{"key":"ref_55","unstructured":"(2020, March 10). ClipStream. Available online: https:\/\/github.com\/PetoLau\/ClipStream."},{"key":"ref_56","unstructured":"Langham, E., Downes, J., Brennan, T., Fyfe, J., Mohr, S., Rickwood, P., and White, S. (2014). Smart Grid, Smart City, Customer Research Report, Institute for Sustainable Futures."},{"key":"ref_57","unstructured":"(2020, December 01). 
UK Power Networks Led Low Carbon London, Available online: https:\/\/data.london.gov.uk\/dataset\/smartmeter-energy-use-data-in-london-households."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/12\/1414\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:45:22Z","timestamp":1760179522000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/22\/12\/1414"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,15]]},"references-count":57,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2020,12]]}},"alternative-id":["e22121414"],"URL":"https:\/\/doi.org\/10.3390\/e22121414","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12,15]]}}}