{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,1]],"date-time":"2025-12-01T11:20:48Z","timestamp":1764588048404,"version":"build-2065373602"},"reference-count":22,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2019,8,12]],"date-time":"2019-08-12T00:00:00Z","timestamp":1565568000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Madeira 14-20","award":["M1420-01-0145-FEDER-000002"],"award-info":[{"award-number":["M1420-01-0145-FEDER-000002"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>Datasets play a vital role in data science and machine learning research as they serve as the basis for the development, evaluation, and benchmarking of new algorithms. Non-Intrusive Load Monitoring is one of the fields that has been benefiting from the recent increase in the number of publicly available datasets. However, there is a lack of consensus concerning how datasets should be made available to the community, thus resulting in considerable structural differences between the publicly available datasets. This technical note presents the DSCleaner, a Python library to clean, preprocess, and convert time series datasets to a standard file format. 
Two application examples using real-world datasets are also presented to show the technical validity of the proposed library.<\/jats:p>","DOI":"10.3390\/data4030123","type":"journal-article","created":{"date-parts":[[2019,8,13]],"date-time":"2019-08-13T04:31:21Z","timestamp":1565670681000},"page":"123","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["dsCleaner: A Python Library to Clean, Preprocess and Convert Non-Intrusive Load Monitoring Datasets"],"prefix":"10.3390","volume":"4","author":[{"given":"Manuel","family":"Pereira","sequence":"first","affiliation":[{"name":"ITI, LARSyS, 9020-105 Funchal, Portugal"},{"name":"Ci\u00eancias Exatas e Engenharia, Universidade da Madeira, 9020-105 Funchal, Portugal"}]},{"given":"Nuno","family":"Velosa","sequence":"additional","affiliation":[{"name":"ITI, LARSyS, 9020-105 Funchal, Portugal"},{"name":"Ci\u00eancias Exatas e Engenharia, Universidade da Madeira, 9020-105 Funchal, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9110-8775","authenticated-orcid":false,"given":"Lucas","family":"Pereira","sequence":"additional","affiliation":[{"name":"ITI, LARSyS, 9020-105 Funchal, Portugal"},{"name":"T\u00e9nico Lisboa, Universidade de Lisboa, 1049-001 Lisbon, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2019,8,12]]},"reference":[{"key":"ref_1","unstructured":"Hart, G. (1985). Prototype Nonintrusive Appliance Load Monitor, MIT Energy Laboratory Technical Report, and Electric Power Research Institute Technical Report. Technical Report."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Pereira, L., and Nunes, N. (2018). Performance evaluation in non-intrusive load monitoring: Datasets, metrics, and tools\u2014A review. 
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, Wiley.","DOI":"10.1002\/widm.1265"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1109\/TCE.2011.5735484","article-title":"Nonintrusive appliance load monitoring: Review and outlook","volume":"57","author":"Zeifman","year":"2011","journal-title":"IEEE Trans. Consum. Electron."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1016\/j.rser.2016.07.009","article-title":"A review disaggregation method in Non-intrusive Appliance Load Monitoring","volume":"66","author":"Esa","year":"2016","journal-title":"Renew. Sustain. Energy Rev."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1007\/s10462-018-9613-7","article-title":"Machine learning approaches for non-intrusive load monitoring: From qualitative to quantitative comparation","volume":"52","author":"Nalmpantis","year":"2019","journal-title":"Artif. Intell. Rev."},{"key":"ref_6","unstructured":"Kolter, Z., and Matthew, J. (2011, January 21\u201324). REDD: A public data set for energy disaggregation research. Proceedings of the Data Mining Applications in Sustainability (SustKDD), San Diego, CA, USA."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Anderson, K., Ocneanu, A., Benitez, D., Carlson, D., Rowe, A., and Berges, M. (2012, January 12\u201316). BLUED: A Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. Proceedings of the 2nd KDD Workshop on Data Mining Applications in Sustainability (SustKDD), Beijing, China.","DOI":"10.1109\/IECON.2012.6389367"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"150007","DOI":"10.1038\/sdata.2015.7","article-title":"The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes","volume":"2","author":"Kelly","year":"2015","journal-title":"Sci. 
Data"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"160037","DOI":"10.1038\/sdata.2016.37","article-title":"Electricity, water, and natural gas consumption of a residential house in Canada from 2012 to 2014","volume":"3","author":"Makonin","year":"2016","journal-title":"Sci. Data"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"160122","DOI":"10.1038\/sdata.2016.122","article-title":"An electrical load measurements dataset of United Kingdom households from a two-year longitudinal study","volume":"4","author":"Murray","year":"2017","journal-title":"Sci. Data"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"809","DOI":"10.1007\/s12053-014-9306-2","article-title":"Nonintrusive load monitoring (NILM) performance evaluation","volume":"8","author":"Makonin","year":"2014","journal-title":"Energy Effic."},{"key":"ref_12","unstructured":"Mayhorn, E.T., Sullivan, G.P., Fu, T., and Petersen, J.M. (2017). Non-Intrusive Load Monitoring Laboratory-Based Test Protocols, Pacific Northwest National Laboratory (PNNL). Technical Report."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Pereira, L., and Nunes, N. (2017, January 23\u201326). A comparison of performance metrics for event classification in Non-Intrusive Load Monitoring. Proceedings of the 2017 IEEE International Conference on Smart Grid Communications (SmartGridComm), Dresden, Germany.","DOI":"10.1109\/SmartGridComm.2017.8340682"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Batra, N., Kelly, J., Parson, O., Dutta, H., Knottenbelt, W., Rogers, A., Singh, A., and Srivastava, M. (2014, January 11\u201313). NILMTK: An Open Source Toolkit for Non-intrusive Load Monitoring. Proceedings of the 5th International Conference on Future Energy Systems, Cambridge, UK.","DOI":"10.1145\/2602044.2602051"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Pereira, L. (2017, January 8\u20139). 
EMD-DF: A Data Model and File Format for Energy Disaggregation Datasets. Proceedings of the 4th ACM International Conference on Systems for Energy-Efficient Built Environments, Delft, The Netherlands.","DOI":"10.1145\/3137133.3141474"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Batra, N., Gulati, M., Singh, A., and Srivastava, M.B. (2013, January 11\u201315). It\u2019s Different: Insights into Home Energy Consumption in India. Proceedings of the 5th ACM Workshop on Embedded Systems For Energy-Efficient Buildings, Roma, Italy.","DOI":"10.1145\/2528282.2528293"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Pereira, L., Ribeiro, M., and Nunes, N. (2017, January 6\u20137). Engineering and deploying a hardware and software platform to collect and label non-intrusive load monitoring datasets. Proceedings of the 2017 Sustainable Internet and ICT for Sustainability (SustainIT), Funchal, Portugal.","DOI":"10.23919\/SustainIT.2017.8379791"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Kelly, J., and Knottenbelt, W. (2014, January 21\u201325). Metadata for Energy Disaggregation. Proceedings of the 2014 IEEE 38th International Computer Software and Applications Conference Workshops (COMPSACW 2014), Vasteras, Sweden.","DOI":"10.1109\/COMPSACW.2014.97"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Kriechbaumer, T., Jorde, D., and Jacobsen, H.A. (2019, January 25\u201328). Waveform Signal Entropy and Compression Study of Whole-Building Energy Datasets. Proceedings of the Tenth ACM International Conference on Future Energy Systems, Phoenix, AZ, USA.","DOI":"10.1145\/3307772.3328285"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ribeiro, M., Pereira, L., Quintal, F., and Nunes, N. (2016, January 14\u201316). SustDataED: A Public Dataset for Electric Energy Disaggregation Research. 
Proceedings of the ICT for Sustainability 2016, Bangkok, Thailand.","DOI":"10.2991\/ict4s-16.2016.36"},{"key":"ref_21","unstructured":"Colpaert, P. (2017). Publishing Transport Data for Maximum Reuse. [Doctor Dissertation, Ghent University]."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Pereira, L. (2016). Hardware and Software Platforms to Deploy and Evaluate Non-Intrusive Load Monitoring Systems. [Ph.D. Thesis, Universidade da Madeira].","DOI":"10.23919\/SustainIT.2017.8379791"}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/4\/3\/123\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:10:27Z","timestamp":1760188227000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/4\/3\/123"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,8,12]]},"references-count":22,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2019,9]]}},"alternative-id":["data4030123"],"URL":"https:\/\/doi.org\/10.3390\/data4030123","relation":{},"ISSN":["2306-5729"],"issn-type":[{"type":"electronic","value":"2306-5729"}],"subject":[],"published":{"date-parts":[[2019,8,12]]}}}