{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T12:33:19Z","timestamp":1772541199638,"version":"3.50.1"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2016,11,21]],"date-time":"2016-11-21T00:00:00Z","timestamp":1479686400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Database Syst."],"published-print":{"date-parts":[[2017,3,31]]},"abstract":"<jats:p>Smart electricity meters have been replacing conventional meters worldwide, enabling automated collection of fine-grained (e.g., every 15 minutes or hourly) consumption data. A variety of smart meter analytics algorithms and applications have been proposed, mainly in the smart grid literature. However, the focus has been on what can be done with the data rather than how to do it efficiently. In this article, we examine smart meter analytics from a software performance perspective. First, we design a performance benchmark that includes common smart meter analytics tasks. These include offline feature extraction and model building as well as a framework for online anomaly detection that we propose. Second, since obtaining real smart meter data is difficult due to privacy issues, we present an algorithm for generating large realistic datasets from a small seed of real data. Third, we implement the proposed benchmark using five representative platforms: a traditional numeric computing platform (Matlab), a relational DBMS with a built-in machine learning toolkit (PostgreSQL\/MADlib), a main-memory column store (\u201cSystem C\u201d), and two distributed data processing platforms (Hive and Spark\/Spark Streaming). 
We compare the five platforms in terms of application development effort and performance on a multicore machine as well as a cluster of 16 commodity servers.<\/jats:p>","DOI":"10.1145\/3004295","type":"journal-article","created":{"date-parts":[[2016,11,21]],"date-time":"2016-11-21T14:01:46Z","timestamp":1479736906000},"page":"1-39","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":46,"title":["Smart Meter Data Analytics"],"prefix":"10.1145","volume":"42","author":[{"given":"Xiufeng","family":"Liu","sequence":"first","affiliation":[{"name":"Technical University of Denmark, Denmark"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lukasz","family":"Golab","sequence":"additional","affiliation":[{"name":"University of Waterloo, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wojciech","family":"Golab","sequence":"additional","affiliation":[{"name":"University of Waterloo, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ihab F.","family":"Ilyas","sequence":"additional","affiliation":[{"name":"University of Waterloo, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shichao","family":"Jin","sequence":"additional","affiliation":[{"name":"University of Waterloo, Ontario, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2016,11,21]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.enbuild.2012.02.044"},{"key":"e_1_2_1_2_1","volume-title":"Conf. on Information Hiding, 118--132","author":"Acs G.","unstructured":"G. Acs and C. Castelluccia. 2011. I have a DREAM (DiffeRentially privatE smArt Metering). In Conf. 
on Information Hiding, 118--132."},{"key":"e_1_2_1_3_1","volume-title":"ECML-PKDD DARE Workshop on Energy Analytics.","author":"Albert A.","unstructured":"A. Albert, T. Gebru, J. Ku, J. Kwac, J. Leskovec, and R. Rajagopal. 2013. Drivers of variability in energy consumption. In ECML-PKDD DARE Workshop on Energy Analytics."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2013.6691644"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPWRS.2013.2266122"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/1740390.1740400"},{"key":"e_1_2_1_7_1","volume-title":"Benchmarking of data mining techniques as applied to power system analysis. Master\u2019s Thesis","author":"Anil C.","unstructured":"C. Anil. 2013. Benchmarking of data mining techniques as applied to power system analysis. Master\u2019s Thesis, Uppsala University."},{"key":"e_1_2_1_8_1","volume-title":"EnDM Workshop on Energy Data Management, 140--147","author":"Ardakanian O.","unstructured":"O. Ardakanian, N. Koochakzadeh, R. P. Singh, L. Golab, and S. Keshav. 2014. Computing electricity consumption profiles from household smart meter data. In EnDM Workshop on Energy Data Management, 140--147."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2668930.2688055"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.enbuild.2012.03.025"},{"key":"e_1_2_1_11_1","volume-title":"Int. Conf. 
on Very Large Data Bases. 1097--1107","author":"Bruno N.","unstructured":"N. Bruno and S. Chaudhuri. 2005. Flexible database generators. In Int. Conf. on Very Large Data Bases. 1097--1107."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00779-012-0513-6"},{"key":"e_1_2_1_13_1","volume-title":"AAAI Workshop on Artificial Intelligence and Smarter Living: The Conquest of Complexity.","author":"Chen C.","unstructured":"C. Chen and D. Cook. 2011. Energy outlier detection in smart environments. In AAAI Workshop on Artificial Intelligence and Smarter Living: The Conquest of Complexity."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPWRS.2006.873122"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-014-0368-8"},{"key":"e_1_2_1_16_1","unstructured":"Electric Power Research Institute (EPRI). 2013. 
Big Data Survey Summary Report"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPWRS.2005.852123"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPWRS.2005.846234"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/NAPS.2011.6025124"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2015.7113276"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.14778\/2367502.2367510"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2487166.2487204"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1024988512476"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/UIC-ATC-ScalCom-CBDCom-IoP.2015.55"},{"key":"e_1_2_1_25_1","volume-title":"Int. Conf. on Extending Database Technology. 285--396","author":"Liu X.","unstructured":"X. Liu, L. Golab, W. Golab, and I. Ilyas. 2015a. Benchmarking smart meter data analytics. In Int. Conf. on Extending Database Technology. 285--396."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDE.2015.7113405"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.14778\/2733004.2733021"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33338-5_11"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1791314.1791316"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2602044.2602046"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.apenergy.2010.05.015"},{"key":"e_1_2_1_32_1","unstructured":"B. A. Smith J. 
Wong and R. Rajagopal. 2012. A simple way to use interval data to segment residential customers for energy efficiency and demand response program targeting. In ACEEE Summer Study on Energy Efficiency in Buildings."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687553.1687609"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPWRS.2007.901287"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/2457317.2457357"},{"key":"e_1_2_1_36_1","volume-title":"USENIX Conf., 10","author":"Zaharia M.","unstructured":"M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. 2010. Spark: Cluster computing with working sets. In USENIX Conf., 10."},{"key":"e_1_2_1_37_1","volume-title":"Proc. USENIX Conf. on Hot Topics in Cloud Computing. 10","author":"Zaharia M.","unstructured":"M. Zaharia, T. Das, H. Li, S. Shenker, and I. Stoica. 2012. Discretized streams: An efficient and fault-tolerant model for stream processing on large clusters. In Proc. USENIX Conf. on Hot Topics in Cloud Computing. 
10."}],"container-title":["ACM Transactions on Database Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3004295","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3004295","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T19:05:12Z","timestamp":1750273512000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3004295"}},"subtitle":["Systems, Algorithms, and Benchmarking"],"short-title":[],"issued":{"date-parts":[[2016,11,21]]},"references-count":37,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2017,3,31]]}},"alternative-id":["10.1145\/3004295"],"URL":"https:\/\/doi.org\/10.1145\/3004295","relation":{},"ISSN":["0362-5915","1557-4644"],"issn-type":[{"value":"0362-5915","type":"print"},{"value":"1557-4644","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,11,21]]},"assertion":[{"value":"2015-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-11-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}