{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,25]],"date-time":"2025-09-25T11:22:14Z","timestamp":1758799334725,"version":"3.44.0"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T00:00:00Z","timestamp":1744934400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T00:00:00Z","timestamp":1744934400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006752","name":"Universidade do Porto","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006752","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Data Sci Anal"],"published-print":{"date-parts":[[2025,10]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Many current AutoML platforms include a very large space of alternatives (the <jats:italic>configuration space<\/jats:italic>). This increases the probability of including the best one for any dataset but makes the task of identifying it for a new dataset more difficult. In this paper, we explore a method that can reduce a large configuration space to a significantly smaller one and so help to reduce the search time for the potentially best algorithm configuration, with limited risk of significant loss of predictive performance. We empirically validate the method with a large set of alternatives based on five ML algorithms with different sets of hyperparameters and one preprocessing method (feature selection). Our results show that it is possible to reduce the given search space by more than one order of magnitude, from a few thousands to a few hundred items. After reduction, the search for the best algorithm configuration is about one order of magnitude faster than on the original space without significant loss in predictive performance.<\/jats:p>","DOI":"10.1007\/s41060-025-00764-5","type":"journal-article","created":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T06:55:48Z","timestamp":1744959348000},"page":"4973-4993","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Reducing algorithm configuration spaces for efficient search"],"prefix":"10.1007","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-4406-3113","authenticated-orcid":false,"given":"Fernando","family":"Freitas","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4720-0486","authenticated-orcid":false,"given":"Pavel","family":"Brazdil","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4549-8917","authenticated-orcid":false,"given":"Carlos","family":"Soares","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,4,18]]},"reference":[{"key":"764_CR1","doi-asserted-by":"crossref","unstructured":"Abdulrahman, S.M., Brazdil, P., van Rijn, J.N., Vanschoren, J.: Speeding up algorithm selection using average ranking and active testing by introducing runtime. Machine learning, Special Issue on Metalearning and Algorithm Selection (2017)","DOI":"10.1007\/s10994-017-5687-8"},{"key":"764_CR2","doi-asserted-by":"crossref","unstructured":"Abdulrahman, S.M., Brazdil, P., Zinon, M., Adamu, A.: Simplifying the Algorithm Selection Using Reduction of Rankings of Classification Algorithms. In: ICSCA \u201919 Proceedings of the 8th International Conference on Software and Computer Applications, Malaysia, ACM New York. pp. 140\u2013148 (2019)","DOI":"10.1145\/3316615.3316674"},{"key":"764_CR3","first-page":"281","volume":"13","author":"J Bergstra","year":"2012","unstructured":"Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281\u2013305 (2012)","journal-title":"J. Mach. Learn. Res."},{"key":"764_CR4","unstructured":"Bischl, B., Casalicchio, G., Feurer, M., Hutter, F., Lang, M., Mantovani, R.G., van Rijn, J.N., Vanschoren, J.: Openml benchmarking suites. arXiv:1708.03731v2 [stat.ML] (2019)"},{"key":"764_CR5","doi-asserted-by":"crossref","unstructured":"Brazdil, P., van Rijn, J., Soares, C., Vanschoren, J.: Chapter 17, Learning from Metadata in Repositories. In: Metalearning: Applications to Automated Machine Learning and Data Mining, pp. 311\u2013327. Springer (2022)","DOI":"10.1007\/978-3-030-67024-5_17"},{"key":"764_CR6","doi-asserted-by":"crossref","unstructured":"Brazdil, P., van Rijn, J., Soares, C., Vanschoren, J.: Chapter 2, Metalearning Approaches to Algorithm Selection I (Exploiting Rankings). In: Metalearning: Applications to Automated Machine Learning and Data Mining, pp. 19\u201337. Springer (2022)","DOI":"10.1007\/978-3-030-67024-5_2"},{"key":"764_CR7","doi-asserted-by":"crossref","unstructured":"Brazdil, P., van Rijn, J., Soares, C., Vanschoren, J.: Chapter 8, Setting-up Configuration Spaces and Experiments. In: Metalearning: Applications to Automated Machine Learning and Data Mining, pp. 143\u2013168. Springer (2022)","DOI":"10.1007\/978-3-030-67024-5_8"},{"key":"764_CR8","unstructured":"Eggensperger, K., Feurer, M., Hutter, F., Bergstra, J., Snoek, J., Hoos, H., Leyton-Brown, K.: Towards an empirical foundation for assessing bayesian optimization of hyperparameters. In: NIPS workshop on Bayesian Optimization in Theory and Practice. pp.\u00a01\u20135 (2013)"},{"issue":"4","key":"764_CR9","doi-asserted-by":"publisher","first-page":"431","DOI":"10.1007\/s10732-014-9275-9","volume":"22","author":"C Fawcett","year":"2016","unstructured":"Fawcett, C., Hoos, H.: Analysing differences between algorithm configurations through ablation. J. Heurist. 22(4), 431\u2013458 (2016)","journal-title":"J. Heurist."},{"key":"764_CR10","first-page":"3133","volume":"15","author":"M Fern\u00e1ndez-Delgado","year":"2014","unstructured":"Fern\u00e1ndez-Delgado, M., Cernadas, E., Barro, S., Amorim, D.: Do we need hundreds of classifiers to solve real world classification problems? J. Mach. Learn. Res. 15, 3133\u20133181 (2014)","journal-title":"J. Mach. Learn. Res."},{"key":"764_CR11","unstructured":"Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., Hutter, F.: Efficient and robust automated machine learning. In: Advances in Neural Information Processing Systems. pp. 2962\u20132970 (2015)"},{"key":"764_CR12","first-page":"1","volume":"23","author":"M Feurer","year":"2022","unstructured":"Feurer, M., Eggensperger, K., Falkner, S., Lindauer, M., Hutter, F.: Auto-sklearn 2.0: The next generation. J. Mach. Learn. Res. 23, 1\u201341 (2022)","journal-title":"J. Mach. Learn. Res."},{"key":"764_CR13","doi-asserted-by":"crossref","unstructured":"Feurer, M., Klein, A., Eggensperger, K., Springenberg, J.T., Blum, M., Hutter, F.: Auto-sklearn: Efficient and robust automated machine learning. In: Hutter, F., Kotthoff, L., Vanschoren, J. (eds.) Automated Machine Learning: Methods, Systems, Challenges, pp. 113\u2013134. Springer (2019)","DOI":"10.1007\/978-3-030-05318-5_6"},{"key":"764_CR14","doi-asserted-by":"crossref","unstructured":"Fr\u00e9chette, A., Kotthoff, L., Rahwan, T., Hoos, H., Leyton-Brown, K., Michalak, T.: Using the Shapley value to analyze algorithm portfolios. In: 30th AAAI Conference on Artificial Intelligence (2016)","DOI":"10.1609\/aaai.v30i1.10440"},{"key":"764_CR15","doi-asserted-by":"crossref","unstructured":"Freitas, F., Brazdil, P., Soares, C.: Exploring the reduction of configuration spaces of workflows. In: International Conference on Discovery Science. pp. 33\u201347. Springer (2023)","DOI":"10.1007\/978-3-031-45275-8_3"},{"key":"764_CR16","unstructured":"F\u00fcrnkranz, J., Petrak, J., Brazdil, P., Soares, C.: On the use of fast subsampling estimates for algorithm recommendation. In: \u00d6sterreichisches Forschungsinstitut f\u00fcr Artificial Intelligence (2002)"},{"key":"764_CR17","doi-asserted-by":"crossref","unstructured":"Hetlerovi\u010d, D., Popel\u00ednsk\u1ef3, L., Brazdil, P., Soares, C., Freitas, F.: On usefulness of outlier elimination in classification tasks. In: International Symposium on Intelligent Data Analysis. pp. 143\u2013156 (2022)","DOI":"10.1007\/978-3-031-01333-1_12"},{"key":"764_CR18","unstructured":"Hutter, F., Hoos, H., Leyton-Brown, K.: An efficient approach for assessing hyperparameter importance. In: Proceedings of the 31st International Conference on Machine Learning. pp. 754\u2013762. ICML\u201914 (2014)"},{"key":"764_CR19","doi-asserted-by":"crossref","unstructured":"Kuhn, M.: Applied Predictive Modeling. Springer (2013)","DOI":"10.1007\/978-1-4614-6849-3"},{"key":"764_CR20","doi-asserted-by":"crossref","unstructured":"Leite, R., Brazdil, P.: Predicting Relative Performance of Classifiers from Samples. In: Proceedings of the 22nd International Conference on Machine Learning. pp. 497\u2013503. ACM (2005)","DOI":"10.1145\/1102351.1102414"},{"key":"764_CR21","unstructured":"Leite, R., Brazdil, P.: Selecting Classifiers Using Meta-Learning with Sampling Landmarks and Data Characterization. In: Proceedings of the 2nd Planning to Learn Workshop (PlanLearn) at ICML\/COLT\/UAI 2008, pp. 35\u201341. ICML\/COLT\/UAI 2008 (2008)"},{"key":"764_CR22","doi-asserted-by":"crossref","unstructured":"Mohr, F., Wever, M.: Naive automated machine learning. Mach. Learn. 112, 1131\u20131170 (2022)","DOI":"10.1007\/s10994-022-06200-0"},{"key":"764_CR23","unstructured":"Perrone, V., Shen, H., Seeger, M., Archambeau, C., Jenatton, R.: Learning search spaces for bayesian optimization: Another view of hyperparameter transfer learning. In: Proceedings of the AAAI Conference on Artificial Intelligence (2019)"},{"key":"764_CR24","doi-asserted-by":"crossref","unstructured":"Pfisterer, F., van Rijn, J., Probst, P., M\u00fcller, A., Bischl, B.: Learning multiple defaults for machine learning algorithms. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion. pp. 241\u2013242 (2021)","DOI":"10.1145\/3449726.3459523"},{"key":"764_CR25","unstructured":"van Rijn, J.N.: Massively Collaborative Machine Learning. PhD dissertation, Leiden University (2016)"},{"key":"764_CR26","doi-asserted-by":"crossref","unstructured":"van Rijn, J.N., Abdulrahman, S.M., Brazdil, P., Vanschoren, J.: Fast Algorithm Selection using Learning Curves. In: Advances in Intelligent Data Analysis XIV. Springer (2015)","DOI":"10.1007\/978-3-319-24465-5_26"},{"key":"764_CR27","doi-asserted-by":"crossref","unstructured":"van Rijn, J.N., Hutter, F.: Hyperparameter importance across datasets. In: KDD \u201918: The 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM (2018)","DOI":"10.1145\/3219819.3220058"},{"key":"764_CR28","doi-asserted-by":"crossref","unstructured":"Soares, C., Petrak, J., Brazdil, P.: Sampling-based relative landmarks: Systematically test-driving algorithms before choosing. In: In Portuguese Conference on Artificial Intelligence, pp. 88\u201395. Springer Berlin Heidelberg (2001)","DOI":"10.1007\/3-540-45329-6_12"},{"key":"764_CR29","doi-asserted-by":"crossref","unstructured":"Thornton, C., Hutter, F., Hoos, H.H., Leyton-Brown, K.: Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. In: In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. pp. 847\u2013855. ACM (2013)","DOI":"10.1145\/2487575.2487629"},{"issue":"2","key":"764_CR30","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1145\/2641190.2641198","volume":"15","author":"J Vanschoren","year":"2013","unstructured":"Vanschoren, J., van Rijn, J.N., Bischl, B., Torgo, L.: Openml: networked science in machine learning. ACM SIGKDD Explor. Newsl. 15(2), 49\u201360 (2013)","journal-title":"ACM SIGKDD Explor. Newsl."},{"key":"764_CR31","doi-asserted-by":"crossref","unstructured":"Wistuba, M., Schilling, N., Schmidt-Thieme, L.: Learning hyperparameter optimization initializations. In: 2015 IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015. pp. 1\u201310 (2015)","DOI":"10.1109\/DSAA.2015.7344817"},{"key":"764_CR32","doi-asserted-by":"crossref","unstructured":"Xu, L., Hutter, F., Hoos, H., Leyton-Brown, K.: Evaluating component solver contributions to portfolio-based algorithm selectors. In: International Conference on Theory and Applications of Satisfiability Testing. pp. 228\u2013241. Springer (2012)","DOI":"10.1007\/978-3-642-31612-8_18"}],"container-title":["International Journal of Data Science and Analytics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41060-025-00764-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41060-025-00764-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41060-025-00764-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,25]],"date-time":"2025-09-25T10:49:06Z","timestamp":1758797346000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41060-025-00764-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,18]]},"references-count":32,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,10]]}},"alternative-id":["764"],"URL":"https:\/\/doi.org\/10.1007\/s41060-025-00764-5","relation":{},"ISSN":["2364-415X","2364-4168"],"issn-type":[{"type":"print","value":"2364-415X"},{"type":"electronic","value":"2364-4168"}],"subject":[],"published":{"date-parts":[[2025,4,18]]},"assertion":[{"value":"18 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 March 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"18 April 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}