{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T20:09:34Z","timestamp":1775160574131,"version":"3.50.1"},"reference-count":31,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2023,2,21]],"date-time":"2023-02-21T00:00:00Z","timestamp":1676937600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union\u2019s Horizon 2020 research and innovation program under the Marie Sk\u0142odowska-Curie","award":["778196"],"award-info":[{"award-number":["778196"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>The need for artificial intelligence (AI) and machine learning (ML) models to optimize data center (DC) operations increases as the volume of operations management data upsurges tremendously. These strategies can assist operators in better understanding their DC operations and help them make informed decisions upfront to maintain service reliability and availability. The strategies include developing models that optimize energy efficiency, identifying inefficient resource utilization and scheduling policies, and predicting outages. In addition to model hyperparameter tuning, feature subset selection (FSS) is critical for identifying relevant features for effectively modeling DC operations to provide insight into the data, optimize model performance, and reduce computational expenses. Hence, this paper introduces the Shapley Additive exPlanation (SHAP) values method, a class of additive feature attribution values for identifying relevant features that is rarely discussed in the literature. We compared its effectiveness with several commonly used, importance-based feature selection methods. The methods were tested on real DC operations data streams obtained from the ENEA CRESCO6 cluster with 20,832 cores. 
To demonstrate the effectiveness of SHAP compared to other methods, we selected the top ten most important features from each method, retrained the predictive models, and evaluated their performance using the MAE, RMSE, and MPAE evaluation criteria. The results presented in this paper demonstrate that the predictive models trained using features selected with the SHAP-assisted method performed well, with a lower error and a reasonable execution time compared to other methods.<\/jats:p>","DOI":"10.3390\/fi15030088","type":"journal-article","created":{"date-parts":[[2023,2,22]],"date-time":"2023-02-22T04:33:56Z","timestamp":1677040436000},"page":"88","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":101,"title":["Machine Learning for Data Center Optimizations: Feature Selection Using Shapley Additive exPlanation (SHAP)"],"prefix":"10.3390","volume":"15","author":[{"given":"Yibrah","family":"Gebreyesus","sequence":"first","affiliation":[{"name":"School of Computer Science, University College of Dublin, D04 V1W8 Dublin, Ireland"}]},{"given":"Damian","family":"Dalton","sequence":"additional","affiliation":[{"name":"School of Computer Science, University College of Dublin, D04 V1W8 Dublin, Ireland"}]},{"given":"Sebastian","family":"Nixon","sequence":"additional","affiliation":[{"name":"School of Computer Science, Wolaita Sodo University, Wolaita P.O. Box 138, Ethiopia"}]},{"given":"Davide","family":"De Chiara","sequence":"additional","affiliation":[{"name":"ENEA-R.C. Portici, 80055 Portici (NA), Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8123-2791","authenticated-orcid":false,"given":"Marta","family":"Chinnici","sequence":"additional","affiliation":[{"name":"ENEA-R.C. 
Casaccia, 00196 Rome, Italy"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,21]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"325","DOI":"10.24136\/eq.2021.012","article-title":"The impact of digital transformation on European countries: Insights from a comparative analysis","volume":"16","author":"Urbaniec","year":"2021","journal-title":"Equilibrium Q. J. Econ. Econ. Policy"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Hoosain, M.S., Paul, B.S., and Ramakrishna, S. (2020). The impact of 4ir digital technologies and circular thinking on the United Nations sustainable development goals. Sustainability, 12.","DOI":"10.3390\/su122310143"},{"key":"ref_3","unstructured":"Nicholson, J. (2020). How is coronavirus impacting the news? Our analysis of global traffic and coverage data. Chartbeat Blog."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"117","DOI":"10.3390\/challe6010117","article-title":"On global electricity usage of communication technology: Trends to 2030","volume":"6","author":"Andrae","year":"2015","journal-title":"Challenges"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1145\/3364684","article-title":"Toward ml-centric cloud platforms","volume":"63","author":"Bianchini","year":"2020","journal-title":"Commun. ACM"},{"key":"ref_6","first-page":"158","article-title":"Deepmind ai reduces google data centre cooling bill by 40%","volume":"20","author":"Evans","year":"2016","journal-title":"Deep. Blog."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Grishina, A., Chinnici, M., Kor, A.-L., Rondeau, E., and Georges, J.-P. (2020). A machine learning solution for data center thermal characteristics analysis. Energies, 13.","DOI":"10.20944\/preprints202007.0325.v1"},{"key":"ref_8","unstructured":"Lundberg, S.M., and Lee, S.-I. (2017). A unified approach to interpreting model predictions. Adv. Neural Inf. Process. 
Syst., 30."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"012026","DOI":"10.1088\/1742-6596\/1284\/1\/012026","article-title":"A comparison of feature selection methodology for solving classification problems in finance","volume":"1284","author":"Xiaomao","year":"2019","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"R713","DOI":"10.1016\/j.cub.2017.05.064","article-title":"Animal communication: When i\u2019m calling you, will you answer too?","volume":"27","author":"Vickers","year":"2017","journal-title":"Curr. Biol."},{"key":"ref_11","unstructured":"Molina, L.C., Belanche, L., and Nebot, A. (2002, January 9\u201312). Feature selection algorithms: A survey and experimental evaluation. Proceedings of the 2002 IEEE International Conference on Data Mining, Maebashi City, Japan."},{"key":"ref_12","first-page":"1157","article-title":"An introduction to variable and feature selection","volume":"3","author":"Guyon","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_13","unstructured":"Cunningham, P., Kathirgamanathan, B., and Delany, S.J. (2021). Feature selection tutorial with python examples. arXiv."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"106337","DOI":"10.1016\/j.asoc.2020.106337","article-title":"A novel hybrid feature selection method based on dynamic feature importance","volume":"93","author":"Wei","year":"2020","journal-title":"Appl. Soft Comput."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1016\/j.compeleceng.2013.11.024","article-title":"A survey on feature selection methods","volume":"40","author":"Chandrashekar","year":"2014","journal-title":"Comput. Electr. Eng."},{"key":"ref_16","unstructured":"Yang, K., and Shahabi, C. (2005, January 27\u201330). On the stationarity of multivariate time series for correlation-based data analysis. 
Proceedings of the Fifth IEEE International Conference on Data Mining (ICDM\u201905), IEEE, Houston, TX, USA."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1260\/1748-3018.6.3.385","article-title":"Sigmis: A feature selection algorithm using correlation-based method","volume":"6","author":"Blessie","year":"2012","journal-title":"J. Algorithms Comput. Technol."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"659","DOI":"10.1016\/0098-3004(87)90035-5","article-title":"Corank: A fortran-77 program to calculate and test matrices of pearson, spearman, and kendall correlation coefficients with pairwise treatment of missing values","volume":"13","author":"Rock","year":"1987","journal-title":"Comput. Geosci."},{"key":"ref_19","unstructured":"University of Alabama at Birmingham, and National Institutes of Health (NIH) (2018). Autoantibody Reduction Therapy in Patients with Idiopathic Pulmonary Fibrosis (Art-Ipf), National Institutes of Health."},{"key":"ref_20","first-page":"129","article-title":"Correlation and symmetrical uncertainty-based feature selection for multivariate time series classification","volume":"12","author":"Saikhu","year":"2019","journal-title":"Int. J. Intell. Eng. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.neucom.2012.02.031","article-title":"Feature selection with missing data using mutual information estimators","volume":"90","author":"Doquire","year":"2012","journal-title":"Neurocomputing"},{"key":"ref_22","unstructured":"Kathirgamanathan, B., and Cunningham, P. (2021). Correlation based feature subset selection for multivariate time-series data. arXiv."},{"key":"ref_23","unstructured":"Hall, M.A. (1999). Correlation-Based Feature Selection for Machine Learning. [Ph.D. 
Dissertation, The University of Waikato]."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1111\/jiec.13155","article-title":"Increasing the energy efficiency of a data center based on machine learning","volume":"26","author":"Yang","year":"2022","journal-title":"J. Ind. Ecol."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Ribeiro, M.T., Singh, S., and Guestrin, C. (2016, January 13\u201317). \u201cWhy should I trust you?\u201d explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939778"},{"key":"ref_26","unstructured":"Shrikumar, A., Greenside, P., and Kundaje, A. (2017, January 6\u201311). Learning important features through propagating activation differences. Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1109\/ICDAR.1995.598994","article-title":"Random decision forests","volume":"Volume 1","author":"Ho","year":"1995","journal-title":"Proceedings of the 3rd International Conference on Document Analysis and Recognition"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). Xgboost: A scalable tree boosting system. 
Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_31","unstructured":"Lundberg, S.M., Erion, G.G., and Lee, S.-I. (2018). Consistent individualized feature attribution for tree ensembles. arXiv."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/15\/3\/88\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:38:40Z","timestamp":1760121520000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/15\/3\/88"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,21]]},"references-count":31,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,3]]}},"alternative-id":["fi15030088"],"URL":"https:\/\/doi.org\/10.3390\/fi15030088","relation":{"has-preprint":[{"id-type":"doi","id":"10.20944\/preprints202212.0482.v1","asserted-by":"object"}]},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,21]]}}}