{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T17:41:54Z","timestamp":1776879714939,"version":"3.51.2"},"reference-count":50,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,2,8]],"date-time":"2023-02-08T00:00:00Z","timestamp":1675814400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e Tecnologia","doi-asserted-by":"publisher","award":["UIDB\/04728\/2020"],"award-info":[{"award-number":["UIDB\/04728\/2020"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e Tecnologia","doi-asserted-by":"publisher","award":["EXPL\/CCI-COM\/0706\/2021"],"award-info":[{"award-number":["EXPL\/CCI-COM\/0706\/2021"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e Tecnologia","doi-asserted-by":"publisher","award":["CPCA-IAC\/AV\/475278\/2022"],"award-info":[{"award-number":["CPCA-IAC\/AV\/475278\/2022"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Electronics"],"abstract":"<jats:p>Despite major advances in recent years, the field of Machine Learning continues to face research and technical challenges. Mostly, these stem from big data and streaming data, which require models to be frequently updated or re-trained, at the expense of significant computational resources. One solution is the use of distributed learning algorithms, which can learn in a distributed manner, from distributed datasets. In this paper, we describe CEDEs\u2014a distributed learning system in which models are heterogeneous distributed Ensembles, i.e., complex models constituted by different base models, trained with different and distributed subsets of data. Specifically, we address the issue of predicting the training time of a given model, given its characteristics and the characteristics of the data. Given that the creation of an Ensemble may imply the training of hundreds of base models, information about the predicted duration of each of these individual tasks is paramount for an efficient management of the cluster\u2019s computational resources and for minimizing makespan, i.e., the time it takes to train the whole Ensemble. Results show that the proposed approach is able to predict the training time of Decision Trees with an average error of 0.103 s, and the training time of Neural Networks with an average error of 21.263 s. We also show how results depend significantly on the hyperparameters of the model and on the characteristics of the input data.<\/jats:p>","DOI":"10.3390\/electronics12040871","type":"journal-article","created":{"date-parts":[[2023,2,9]],"date-time":"2023-02-09T02:55:54Z","timestamp":1675911354000},"page":"871","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Predicting Model Training Time to Optimize Distributed Machine Learning Applications"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0573-9122","authenticated-orcid":false,"given":"Miguel","family":"Guimar\u00e3es","sequence":"first","affiliation":[{"name":"CIICESI, ESTG, Polit\u00e9cnico do Porto, 4610-156 Felgueiras, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6650-0388","authenticated-orcid":false,"given":"Davide","family":"Carneiro","sequence":"additional","affiliation":[{"name":"CIICESI, ESTG, Polit\u00e9cnico do Porto, 4610-156 Felgueiras, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6633-3033","authenticated-orcid":false,"given":"Guilherme","family":"Palumbo","sequence":"additional","affiliation":[{"name":"CIICESI, ESTG, Polit\u00e9cnico do Porto, 4610-156 Felgueiras, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2424-5024","authenticated-orcid":false,"given":"Filipe","family":"Oliveira","sequence":"additional","affiliation":[{"name":"CIICESI, ESTG, Polit\u00e9cnico do Porto, 4610-156 Felgueiras, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3807-7292","authenticated-orcid":false,"given":"\u00d3scar","family":"Oliveira","sequence":"additional","affiliation":[{"name":"CIICESI, ESTG, Polit\u00e9cnico do Porto, 4610-156 Felgueiras, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1819-7051","authenticated-orcid":false,"given":"Victor","family":"Alves","sequence":"additional","affiliation":[{"name":"ALGORITMI Research Centre\/LASI, University of Minho, 4710-057 Braga, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3549-0754","authenticated-orcid":false,"given":"Paulo","family":"Novais","sequence":"additional","affiliation":[{"name":"ALGORITMI Research Centre\/LASI, University of Minho, 4710-057 Braga, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1146\/annurev-matsci-070218-010015","article-title":"Opportunities and Challenges for Machine Learning in Materials Science","volume":"50","author":"Morgan","year":"2020","journal-title":"Annu. Rev. Mater. Res."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"350","DOI":"10.1016\/j.neucom.2017.01.026","article-title":"Machine learning on big data: Opportunities and challenges","volume":"237","author":"Zhou","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1145\/3373464.3373470","article-title":"Machine learning for streaming data: State of the art, challenges, and opportunities","volume":"21","author":"Gomes","year":"2019","journal-title":"ACM SIGKDD Explor. Newsl."},{"key":"ref_4","first-page":"1","article-title":"Data quality considerations for big data and machine learning: Going beyond data cleaning and transformations","volume":"10","author":"Gudivada","year":"2017","journal-title":"Int. J. Adv. Softw."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3377454","article-title":"A survey on distributed machine learning","volume":"53","author":"Verbraeken","year":"2020","journal-title":"ACM Comput. Surv. (csur)"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1016\/j.jpdc.2019.07.007","article-title":"Estimation of energy consumption in machine learning","volume":"134","author":"Rodrigues","year":"2019","journal-title":"J. Parallel Distrib. Comput."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/j.neucom.2020.07.061","article-title":"On hyperparameter optimization of machine learning algorithms: Theory and practice","volume":"415","author":"Yang","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_8","first-page":"1","article-title":"MFE: Towards reproducible meta-feature extraction","volume":"21","author":"Siqueira","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"ref_9","unstructured":"Bellosa, F., Weissel, A., Waitz, M., and Kellner, S. (2003, January 27). Event-driven energy accounting for dynamic thermal management. Proceedings of the Workshop on Compilers and Operating Systems for Low Power (COLP\u201903), New Orleans, LA, USA."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Bertran, R., Gonzalez, M., Martorell, X., Navarro, N., and Ayguade, E. (2010, January 2\u20134). Decomposable and Responsive Power Models for Multicore Processors Using Performance Counters. Proceedings of the 24th ACM International Conference on Supercomputing, Tsukuba, Japan. ICS \u201910.","DOI":"10.1145\/1810085.1810108"},{"key":"ref_11","unstructured":"Economou, D., Rivoire, S., Kozyrakis, C., and Ranganathan, P. (2006, January 17\u201321). Full-system power analysis and modeling for server environments. Proceedings of the ISCA06: The 33rd Annual International Symposium on Computer Architecture, New York, NY, USA."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Goel, B., McKee, S.A., Gioiosa, R., Singh, K., Bhadauria, M., and Cesati, M. (2010, January 15\u201318). Portable, scalable, per-core power estimation for intelligent resource management. Proceedings of the International Conference on Green Computing, Chicago, IL, USA.","DOI":"10.1109\/GREENCOMP.2010.5598313"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Mazouz, A., Wong, D.C., Kuck, D., and Jalby, W. (2017, January 22\u201326). An Incremental Methodology for Energy Measurement and Modeling. Proceedings of the 8th ACM\/SPEC on International Conference on Performance Engineering, L\u2019Aquila, Italy. ICPE \u201917.","DOI":"10.1145\/3030207.3030224"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Rajamani, K., Hanson, H., Rubio, J., Ghiasi, S., and Rawson, F. (2006, January 25\u201327). Application-Aware Power Management. Proceedings of the 2006 IEEE International Symposium on Workload Characterization, San Jose, CA, USA.","DOI":"10.1109\/IISWC.2006.302728"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Spiliopoulos, V., Sembrant, A., and Kaxiras, S. (2012, January 7\u20139). Power-Sleuth: A Tool for Investigating Your Program\u2019s Power Behavior. Proceedings of the 2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, Washington, DC, USA.","DOI":"10.1109\/MASCOTS.2012.36"},{"key":"ref_16","unstructured":"Walker, M.J., Das, A.K., Merrett, G.V., and Hashimi, B. (2015, January 21). Run-time power estimation for mobile and embedded asymmetric multi-core cpus. Proceedings of the Workshop on High Performance Energy Efficient Embedded Systems (HIP3ES), Amsterdam, The Netherlands. Collocated with HIPEAC 2015 Conference."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1145\/342001.339657","article-title":"Wattch: A framework for architectural-level power analysis and optimizations","volume":"28","author":"Brooks","year":"2000","journal-title":"ACM SIGARCH Comput. Archit. News"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1145\/1168917.1168881","article-title":"Accurate and efficient regression modeling for microarchitectural performance and power prediction","volume":"40","author":"Lee","year":"2006","journal-title":"ACM SIGOPS Oper. Syst. Rev."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Li, S., Ahn, J.H., Strong, R.D., Brockman, J.B., Tullsen, D.M., and Jouppi, N.P. (2009, January 12\u201316). McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures. Proceedings of the 42nd Annual IEEE\/ACM International Symposium on Microarchitecture, New York, NY, USA. MICRO 42.","DOI":"10.1145\/1669112.1669172"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Yang, T.J., Chen, Y.H., and Sze, V. (2017, January 21\u201326). Designing Energy-Efficient Convolutional Neural Networks Using Energy-Aware Pruning. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.643"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"David, H., Gorbatov, E., Hanebutte, U.R., Khanna, R., and Le, C. (2010, January 18\u201320). RAPL: Memory Power Estimation and Capping. Proceedings of the 16th ACM\/IEEE International Symposium on Low Power Electronics and Design, Austin, TX, USA. ISLPED \u201910.","DOI":"10.1145\/1840845.1840883"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Shao, Y.S., and Brooks, D. (2014, January 11\u201313). Energy characterization and instruction-level energy model of Intel\u2019s Xeon Phi processor. Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), La Jolla, CA, USA.","DOI":"10.1109\/ISLPED.2013.6629328"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Paun, I., Moshfeghi, Y., and Ntarmos, N. (2021, January 21\u201323). Are we there yet? Estimating training time for recommendation systems. Proceedings of the 1st Workshop on Machine Learning and Systems, Bangalore, India.","DOI":"10.1145\/3437984.3458832"},{"key":"ref_24","unstructured":"Tang, Y. (2021). Distributed Machine Learning Patterns, Manning Publications Co.. [2nd ed.]."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"2802","DOI":"10.1109\/TPDS.2020.3003307","article-title":"Distributed Training of Deep Learning Models: A Taxonomic Perspective","volume":"31","author":"Langer","year":"2020","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Galakatos, A., Crotty, A., and Kraska, T. (2017). Encyclopedia of Database Systems, Springer. Available online: https:\/\/doi.org\/10.1007\/978-1-4899-7993-3_80647-1.","DOI":"10.1007\/978-1-4899-7993-3_80647-1"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1016\/j.bushor.2019.10.005","article-title":"Machine learning for enterprises: Applications, algorithm selection, and challenges","volume":"63","author":"Lee","year":"2020","journal-title":"Bus. Horizons"},{"key":"ref_28","unstructured":"Elshawi, R., Maher, M., and Sakr, S. (2019). Automated machine learning: State-of-the-art and open challenges. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Carneiro, D., Guimaraes, M., Carvalho, M., and Novais, P. (2023). Using meta-learning to predict performance metrics in machine learning problems. Expert Syst., 40.","DOI":"10.1111\/exsy.12900"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Vanschoren, J. (2018). Meta-learning: A survey. arXiv.","DOI":"10.1007\/978-3-030-05318-5_2"},{"key":"ref_31","first-page":"1934","article-title":"Tunability: Importance of hyperparameters of machine learning algorithms","volume":"20","author":"Probst","year":"2019","journal-title":"J. Mach. Learn. Res."},{"key":"ref_32","unstructured":"Weerts, H.J., Mueller, A.C., and Vanschoren, J. (2020). Importance of tuning hyperparameters of machine learning algorithms. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Rivolli, A., Garcia, L.P., Soares, C., Vanschoren, J., and de Carvalho, A.C. (2022). Meta-features for meta-learning. Knowl.-Based Syst., 240.","DOI":"10.1016\/j.knosys.2021.108101"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"497","DOI":"10.2307\/1910129","article-title":"An Automatic Method of Solving Discrete Programming Problems","volume":"28","author":"Land","year":"1960","journal-title":"Econometrica"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1090\/S0002-9904-1954-09848-8","article-title":"The Theory of Dynamic Programming","volume":"60","author":"Bellman","year":"1954","journal-title":"Bull. Am. Math. Soc."},{"key":"ref_36","unstructured":"Pardalos, P., and Resende, M. (2014). Handbook of Applied Optimization, Oxford University Press."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Feo, T., and Resende, M.G.C. (1995). Greedy randomized adaptive search procedures. J. Glob. Optim., 109\u2013133.","DOI":"10.1007\/BF01096763"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1111\/j.1540-5915.1977.tb01074.x","article-title":"Heuristics for integer programming using surrogate constraints","volume":"8","author":"Glover","year":"1977","journal-title":"Decis. Sci."},{"key":"ref_39","first-page":"653","article-title":"Fundamentals of scatter search and path relinking","volume":"39","author":"Glover","year":"2000","journal-title":"Control. Cybern."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"759","DOI":"10.1007\/978-3-319-07124-4_19","article-title":"Variable neighborhood search","volume":"1\u20132","author":"Hansen","year":"2018","journal-title":"Handb. Heuristics"},{"key":"ref_41","first-page":"589","article-title":"Survey of genetic algorithms and genetic programming","volume":"1995","author":"Koza","year":"1995","journal-title":"Wescon Conf. Rec."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Shukla, A., Pandey, H.M., and Mehrotra, D. (2015, January 25\u201327). Comparative review of selection techniques in genetic algorithm. Proceedings of the 2015 1st International Conference on Futuristic Trends in Computational Analysis and Knowledge Management, ABLAZE 2015, Noida, India.","DOI":"10.1109\/ABLAZE.2015.7154916"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Mart\u00ed, R., Pardalos, P.M., and Resende, M.G.C. (2018). Handbook of Heuristics, Springer International Publishing.","DOI":"10.1007\/978-3-319-07124-4"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1145\/937503.937505","article-title":"Metaheuristics in Combinatorial Optimization: Overview and Conceptual Comparison","volume":"35","author":"Blum","year":"2003","journal-title":"ACM Comput. Surv."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"732","DOI":"10.1007\/s10559-009-9134-0","article-title":"Classification of applied methods of combinatorial optimization","volume":"45","author":"Sergienko","year":"2009","journal-title":"Cybern. Syst. Anal."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1016\/j.ejor.2021.04.032","article-title":"Machine learning at the service of meta-heuristics for solving combinatorial optimization problems: A state-of-the-art","volume":"296","author":"Mohammadi","year":"2022","journal-title":"Eur. J. Oper. Res."},{"key":"ref_47","unstructured":"Sun, S., Cao, Z., Zhu, H., and Zhao, J. (2019). A Survey of Optimization Methods from a Machine Learning Perspective. arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1016\/j.ejor.2020.07.063","article-title":"Machine learning for combinatorial optimization: A methodological tour d\u2019horizon","volume":"290","author":"Bengio","year":"2021","journal-title":"Eur. J. Oper. Res."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Tillman, R.E. (2009, January 14\u201318). Structure learning with independent non-identically distributed data. Proceedings of the 26th Annual International Conference on Machine Learning, Montreal, QC, Canada.","DOI":"10.1145\/1553374.1553507"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Palumbo, G., Carneiro, D., Guimar\u00e3es, M., Alves, V., and Novais, P. (2023). Algorithm Recommendation and Performance Prediction Using Meta-Learning. Int. J. Neural Syst., in press.","DOI":"10.1142\/S0129065723500119"}],"container-title":["Electronics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-9292\/12\/4\/871\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:27:43Z","timestamp":1760120863000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-9292\/12\/4\/871"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,8]]},"references-count":50,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["electronics12040871"],"URL":"https:\/\/doi.org\/10.3390\/electronics12040871","relation":{},"ISSN":["2079-9292"],"issn-type":[{"value":"2079-9292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,8]]}}}