{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T18:34:29Z","timestamp":1772562869087,"version":"3.50.1"},"reference-count":50,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2019,8,29]],"date-time":"2019-08-29T00:00:00Z","timestamp":1567036800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Prediction is a common machine learning (ML) technique used on building energy consumption data. This process is valuable for anomaly detection, load profile-based building control and measurement and verification procedures. Hundreds of building energy prediction techniques have been developed over the last three decades, yet there is still no consensus on which techniques are the most effective for various building types. In addition, many of the techniques developed are not publicly available to the general research community. This paper outlines a library of open-source regression techniques from the Scikit-Learn Python library and describes the process of applying them to open hourly electrical meter data from 482 non-residential buildings from the Building Data Genome Project. The results illustrate that there are several techniques, notably decision tree-based models, that perform well on two-thirds of the total cohort of buildings. However, over one-third of the buildings, specifically primary schools, performed poorly. This example implementation shows that there is no one-size-fits-all modeling solution and that various types of temporal behavior are difficult to capture using machine learning. An analysis of the generalizability of the models tested motivates the need for the application of future techniques to a broad range of building types and behaviors. 
The importance of this type of scalability analysis is discussed in the context of the growth of energy meter and other Internet-of-Things (IoT) data streams in the built environment. This framework is designed to be an example baseline implementation for other building energy data prediction methods as applied to a larger population of buildings. For reproducibility, the entire code base and data sets are available on GitHub.<\/jats:p>","DOI":"10.3390\/make1030056","type":"journal-article","created":{"date-parts":[[2019,8,29]],"date-time":"2019-08-29T11:26:22Z","timestamp":1567077982000},"page":"974-993","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":37,"title":["More Buildings Make More Generalizable Models\u2014Benchmarking Prediction Methods on Open Electrical Meter Data"],"prefix":"10.3390","volume":"1","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1186-4299","authenticated-orcid":false,"given":"Clayton","family":"Miller","sequence":"first","affiliation":[{"name":"Building and Urban Data Science (BUDS) Lab, Department of Building, School of Design and Environment (SDE), National University of Singapore (NUS), Singapore 119077, Singapore"}]}],"member":"1968","published-online":{"date-parts":[[2019,8,29]]},"reference":[{"key":"ref_1","unstructured":"Agrawal, A., Gans, J., and Goldfarb, A. (2018). Prediction Machines: The Simple Economics of Artificial Intelligence, Harvard Business Press."},{"key":"ref_2","unstructured":"Solomon, D.M., Winter, R.L., Boulanger, A.G., Anderson, R.N., and Wu, L.L. (2011). Forecasting Energy Demand in Large Commercial Buildings Using Support Vector Machine Regression, Department of Computer Science, Columbia University. Tech. Rep. 
CUCS-040-11."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.apenergy.2014.04.016","article-title":"Development of prediction models for next-day building energy consumption and peak power demand using data mining techniques","volume":"127","author":"Fan","year":"2014","journal-title":"Appl. Energy"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2110","DOI":"10.3390\/en6042110","article-title":"Assessing tolerance-based robust short-term load forecasting in buildings","volume":"6","author":"Borges","year":"2013","journal-title":"Energies"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3112","DOI":"10.1016\/j.enbuild.2011.08.008","article-title":"New artificial neural network prediction method for electrical consumption forecasting based on building end-uses","volume":"43","year":"2011","journal-title":"Energy Build."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1016\/j.enbuild.2014.08.004","article-title":"Neural network model ensembles for building-level electricity load forecasts","volume":"84","author":"Jetcheva","year":"2014","journal-title":"Energy Build."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1016\/j.apenergy.2014.02.057","article-title":"Forecasting energy consumption of multi-family residential buildings using support vector regression: Investigating the impact of temporal and spatial monitoring granularity on performance accuracy","volume":"123","author":"Jain","year":"2014","journal-title":"Appl. Energy"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1016\/j.apenergy.2015.01.026","article-title":"Automated measurement and verification: Performance of public domain whole-building electric baseline models","volume":"144","author":"Granderson","year":"2015","journal-title":"Appl. Energy"},{"key":"ref_9","unstructured":"Efficiency Valuation Organisation (2019, August 27). 
International Performance Measurement and Verification Protocol. Available online: http:\/\/www.eeperformance.org\/uploads\/8\/6\/5\/0\/8650231\/ipmvp_volume_i__2012.pdf."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1192","DOI":"10.1016\/j.rser.2017.04.095","article-title":"A review of data-driven building energy consumption prediction studies","volume":"81","author":"Amasyali","year":"2018","journal-title":"Renew. Sustain. Energy Rev."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"272","DOI":"10.1016\/j.rser.2013.03.004","article-title":"State of the art in building modelling and energy performances prediction: A review","volume":"23","author":"Foucquier","year":"2013","journal-title":"Renew. Sustain. Energy Rev."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"322","DOI":"10.1016\/j.enbuild.2015.02.007","article-title":"Short-term load forecasting in a non-residential building contrasting models and attributes","volume":"92","author":"Massana","year":"2015","journal-title":"Energy Build."},{"key":"ref_13","unstructured":"Stephanie, T.C.Y. (2018, November 29). Model Tuning and the Bias-Variance Tradeoff. Available online: http:\/\/www.r2d3.us\/visual-intro-to-machine-learning-part-2\/."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1016\/j.apenergy.2016.04.049","article-title":"Accuracy of automated measurement and verification (M&V) techniques for energy savings in commercial buildings","volume":"173","author":"Granderson","year":"2016","journal-title":"Appl. 
Energy"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/j.enbuild.2017.02.040","article-title":"Application of automated measurement and verification to utility energy efficiency program data","volume":"142","author":"Granderson","year":"2017","journal-title":"Energy Build."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1023\/A:1024988512476","article-title":"On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration","volume":"7","author":"Keogh","year":"2003","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1007\/s10618-016-0483-9","article-title":"The great time series classification bake off: A review and experimental evaluation of recent algorithmic advances","volume":"31","author":"Bagnall","year":"2017","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_18","unstructured":"Chen, Y., Keogh, E., Hu, B., Begum, N., Bagnall, A., Mueen, A., and Batista, G. (2019, August 27). The UCR Time Series Classification Archive 2015. Available online: https:\/\/www.cs.ucr.edu\/~eamonn\/time_series_data\/2015."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1093\/nar\/gkl812","article-title":"A protein classification benchmark collection for machine learning","volume":"35","author":"Sonego","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"ref_20","first-page":"3","article-title":"LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval","volume":"Volume 3","author":"Liu","year":"2007","journal-title":"Proceedings of the SIGIR 2007 Workshop Learning to Rank for Information Retrieval"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kayac\u0131k, H.G., and Zincir-Heywood, N. (2005, January 19\u201320). Analysis of Three Intrusion Detection System Benchmark Datasets Using Machine Learning Algorithms. 
Proceedings of the International Conference on Intelligence and Security Informatics, Atlanta, GA, USA.","DOI":"10.1007\/11427995_29"},{"key":"ref_22","first-page":"1104","article-title":"Predicting hourly building energy use: The great energy predictor shootout\u2014Overview and discussion of results","volume":"100","author":"Kreider","year":"1994","journal-title":"ASHRAE Trans."},{"key":"ref_23","unstructured":"Haberl, J.S., and Thamilseran, S. (1996, January 22\u201326). Great Energy Predictor Shootout II: Measuring Retrofit Savings\u2014Overview and Discussion of Results. Proceedings of the 1996 Annual Meeting of the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE), Inc., San Antonio, TX, USA."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1016\/j.enbuild.2004.09.006","article-title":"Prediction of hourly energy consumption in buildings based on a feedback artificial neural network","volume":"37","year":"2005","journal-title":"Energy Build."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"949","DOI":"10.1016\/j.enbuild.2005.11.005","article-title":"Modeling and predicting building\u2019s energy use with artificial neural networks: Methods and results","volume":"38","author":"Karatasou","year":"2006","journal-title":"Energy Build."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.autcon.2014.09.004","article-title":"Automated daily pattern filtering of measured building performance data","volume":"49","author":"Miller","year":"2015","journal-title":"Autom. 
Constr."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1016\/j.enbuild.2017.09.056","article-title":"Mining electrical meter data to predict principal building use, performance class, and operations strategy for hundreds of non-residential buildings","volume":"156","author":"Miller","year":"2017","journal-title":"Energy Build."},{"key":"ref_28","first-page":"3","article-title":"STL: A Seasonal-Trend Decomposition Procedure Based on Loess","volume":"6","author":"Cleveland","year":"1990","journal-title":"J. Off. Stat."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s12273-017-0396-6","article-title":"Occupant behavior models: A critical review of implementation and representation approaches in building performance simulation programs","volume":"11","author":"Hong","year":"2018","journal-title":"Build. Simul."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1016\/j.apenergy.2007.06.020","article-title":"Measuring industrial energy savings","volume":"85","author":"Eger","year":"2008","journal-title":"Appl. Energy"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1109\/TSG.2011.2145010","article-title":"Quantifying changes in building electricity use, with application to demand response","volume":"2","author":"Mathieu","year":"2011","journal-title":"IEEE Trans. Smart Grid"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"James, N.A., Kejariwal, A., and Matteson, D.S. (2016, January 5\u20138). Leveraging cloud data to mitigate user experience from \u2018breaking bad\u2019. 
Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA.","DOI":"10.1109\/BigData.2016.7841013"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1016\/j.egypro.2017.07.400","article-title":"The Building Data Genome Project: An open, public data set from non-residential building electrical meters","volume":"122","author":"Miller","year":"2017","journal-title":"Energy Procedia"},{"key":"ref_34","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_35","unstructured":"ASHRAE (2019, August 27). Available online: https:\/\/www.techstreet.com\/mss\/products\/preview\/1888937."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"902","DOI":"10.1016\/j.rser.2017.02.085","article-title":"A review on time series forecasting techniques for building energy consumption","volume":"74","author":"Deb","year":"2017","journal-title":"Renew. Sustain. Energy Rev."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.enbuild.2015.04.011","article-title":"Short-term electricity load forecasting of buildings in microgrids","volume":"99","author":"Chitsaz","year":"2015","journal-title":"Energy Build."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"516","DOI":"10.1016\/j.enpol.2012.02.064","article-title":"Forecasting monthly peak demand of electricity in India-A critique","volume":"45","author":"Rallapalli","year":"2012","journal-title":"Energy Policy"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"2249","DOI":"10.1016\/j.apenergy.2008.11.035","article-title":"Applying support vector machine to predict hourly cooling load in the building","volume":"86","author":"Li","year":"2009","journal-title":"Appl. 
Energy"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1016\/j.asoc.2014.11.043","article-title":"A new linguistic out-sample approach of fuzzy time series for daily forecasting of Malaysian electricity load demand","volume":"28","author":"Efendi","year":"2015","journal-title":"Appl. Soft Comput. J."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1016\/j.ijrefrig.2003.12.001","article-title":"Applying grey forecasting to predicting the operating energy performance of air cooled water chillers","volume":"27","author":"Jiang","year":"2004","journal-title":"Int. J. Refrig."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1109\/TPWRS.2011.2161780","article-title":"Short-term load forecasting with exponentially weighted methods","volume":"27","author":"Taylor","year":"2012","journal-title":"IEEE Trans. Power Syst."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1016\/j.enpol.2014.02.001","article-title":"Development of surrogate models using artificial neural network for building shell energy labelling","volume":"69","author":"Melo","year":"2014","journal-title":"Energy Policy"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"646","DOI":"10.1016\/j.enbuild.2016.01.030","article-title":"Unsupervised energy prediction in a Smart Grid context using reinforcement cross-building transfer learning","volume":"116","author":"Mocanu","year":"2016","journal-title":"Energy Build."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.segan.2016.02.005","article-title":"Deep learning for estimating building energy consumption","volume":"6","author":"Mocanu","year":"2016","journal-title":"Sustain. 
Energy Grids Netw."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1016\/j.apenergy.2017.03.064","article-title":"A short-term building cooling load prediction method using deep learning algorithms","volume":"195","author":"Fan","year":"2017","journal-title":"Appl. Energy"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"3698","DOI":"10.1109\/TSG.2018.2834219","article-title":"On-Line Building Energy Optimization Using Deep Reinforcement Learning","volume":"10","author":"Mocanu","year":"2019","journal-title":"IEEE Trans. Smart Grid"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Mangal, A., and Kumar, N. (2016, January 5\u20138). Using big data to enhance the bosch production line performance: A Kaggle challenge. Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA.","DOI":"10.1109\/BigData.2016.7840826"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Baba, Y., Nori, N., Saito, S., and Kashima, H. (November, January 30). Crowdsourced data analytics: A case study of a predictive modeling competition. 
Proceedings of the DSAA 2014\u2014Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics, Shanghai, China.","DOI":"10.1109\/DSAA.2014.7058086"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1016\/j.enbuild.2017.08.069","article-title":"Bayesian calibration of building energy models with large datasets","volume":"154","author":"Chong","year":"2017","journal-title":"Energy Build."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/1\/3\/56\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:15:05Z","timestamp":1760188505000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/1\/3\/56"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,8,29]]},"references-count":50,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2019,9]]}},"alternative-id":["make1030056"],"URL":"https:\/\/doi.org\/10.3390\/make1030056","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,8,29]]}}}