{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T01:35:43Z","timestamp":1777685743242,"version":"3.51.4"},"reference-count":36,"publisher":"SAGE Publications","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["HIS"],"published-print":{"date-parts":[[2024,9,19]]},"abstract":"<jats:p>The genetic algorithm with aggressive mutations (GAAM) is a specialised algorithm for feature selection. This algorithm is dedicated to the selection of a small number of features and allows the user to specify the maximum number of features desired. A major obstacle to the use of this algorithm is its high computational cost, which increases significantly with the number of dimensions to be retained. To solve this problem, we introduce a surrogate model based on machine learning, which reduces the number of evaluations of the fitness function by an average of 48% on the datasets tested, using the standard parameters specified in the original paper. Additionally, we experimentally demonstrate that eliminating the crossover step in the original algorithm does not result in any visible changes in the algorithm\u2019s results. We also demonstrate that the original algorithm uses an artificially complex mutation method that could be replaced by a simpler method without loss of efficiency. The sum of the improvements resulted in an average reduction of 53% in the number of evaluations of the fitness function. Finally, we have shown that these outcomes apply to parameters beyond those utilized in the initial article, while still achieving a comparable decrease in the count of evaluation function calls. 
Tests were conducted on 9 datasets of varying dimensions, using two different classifiers.<\/jats:p>","DOI":"10.3233\/his-240019","type":"journal-article","created":{"date-parts":[[2024,7,16]],"date-time":"2024-07-16T12:30:32Z","timestamp":1721133032000},"page":"259-274","source":"Crossref","is-referenced-by-count":2,"title":["Machine Learning-Based Surrogate Model for Genetic Algorithm with Aggressive Mutation for Feature Selection"],"prefix":"10.1177","volume":"20","author":[{"given":"Marc","family":"Chevallier","sequence":"first","affiliation":[{"name":"LIPN Laboratory, Sorbonne Paris Nord University, Villetaneuse, France"}]},{"given":"Charly","family":"Clairmont","sequence":"additional","affiliation":[{"name":"Synaltic, Vincennes, France"}]}],"member":"179","reference":[{"key":"10.3233\/HIS-240019_ref1","doi-asserted-by":"crossref","first-page":"26766","DOI":"10.1109\/ACCESS.2021.3056407","article-title":"Metaheuristic algorithms on feature selection: A survey of one decade of research (2009\u20132019)","volume":"9","author":"Agrawal","year":"2021","journal-title":"IEEE Access"},{"issue":"11","key":"10.3233\/HIS-240019_ref2","doi-asserted-by":"crossref","first-page":"11797","DOI":"10.1007\/s11227-023-05132-3","article-title":"A comparative analysis of meta-heuristic optimization algorithms for feature selection on ML-based classification of heart-related diseases.","volume":"79","author":"Ay","year":"2023","journal-title":"The Journal of Supercomputing"},{"issue":"1","key":"10.3233\/HIS-240019_ref3","doi-asserted-by":"crossref","first-page":"9","DOI":"10.3390\/biomimetics9010009","article-title":"Feature selection problem and metaheuristics: A systematic literature review about its formulation, evaluation and applications.","volume":"9","author":"Barrera-Garc\u00eda","year":"2023","journal-title":"Biomimetics"},{"key":"10.3233\/HIS-240019_ref5","doi-asserted-by":"crossref","unstructured":"A. Blum, J. Hopcroft and R. 
Kannan, Foundations of Data Science, Cambridge University Press (2020), pp.\u00a012\u201332.","DOI":"10.1017\/9781108755528"},{"key":"10.3233\/HIS-240019_ref6","doi-asserted-by":"crossref","unstructured":"L. Breiman, J.H. Friedman, R.A. Olshen and C.J. Stone, Classification And Regression Trees, Routledge (Oct 2017), pp.\u00a098\u2013137.","DOI":"10.1201\/9781315139470"},{"issue":"70","key":"10.3233\/HIS-240019_ref7","first-page":"2079","article-title":"On over-fitting in model selection and subsequent selection bias in performance evaluation.","volume":"11","author":"Cawley","year":"2010","journal-title":"Journal of Machine Learning Research"},{"key":"10.3233\/HIS-240019_ref8","unstructured":"M. Chevallier, L\u2019Apprentissage artificiel au service du profilage des donn\u00e9es. Theses, Universit\u00e9 Paris-Nord \u2013 Paris XIII (Nov 2022)."},{"key":"10.3233\/HIS-240019_ref9","doi-asserted-by":"crossref","unstructured":"M. Chevallier and C. Clairmont, Machine learning-based surrogate model for genetic algorithm with aggressive mutation for feature selection. In: Proceedings of the 15th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2023). Springer International Publishing (2024).","DOI":"10.3233\/HIS-240019"},{"key":"10.3233\/HIS-240019_ref10","doi-asserted-by":"crossref","unstructured":"M. Chevallier, N. Grozavu, F. Boufar\u00e8s, N. Rogovschi and C. Clairmont, Trade between population size and\u00a0mutation rate for GAAM (genetic algorithm with aggressive mutation) for feature selection. 
In: IFIP Advances in Information and Communication Technology, Springer International Publishing (2022), pp.\u00a0432\u2013444.","DOI":"10.1007\/978-3-031-08333-4_35"},{"issue":"1, 2","key":"10.3233\/HIS-240019_ref11","doi-asserted-by":"crossref","first-page":"79","DOI":"10.3233\/HIS-230006","article-title":"Parallel swarm-based algorithms for scheduling independent tasks.","volume":"19","author":"Dietze","year":"2023","journal-title":"International Journal of Hybrid Intelligent Systems"},{"key":"10.3233\/HIS-240019_ref12","doi-asserted-by":"crossref","unstructured":"Q. Fournier and D. Aloise, Empirical comparison between autoencoders and traditional dimensionality reduction methods. In: 2019 IEEE Second International Conference on Artificial Intelligence and Knowledge Engineering (AIKE). IEEE (2019), pp.\u00a0211\u2013214.","DOI":"10.1109\/AIKE.2019.00044"},{"issue":"5","key":"10.3233\/HIS-240019_ref13","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1214\/aos\/1013203451","article-title":"Greedy function approximation: A gradient boosting machine.","volume":"29","author":"Friedman","year":"2001","journal-title":"The Annals of Statistics"},{"key":"10.3233\/HIS-240019_ref14","first-page":"1","article-title":"A genetic scheduling strategy with spatial reuse for dense wireless networks.","author":"Fulber-Garcia","year":"2023","journal-title":"International Journal of Hybrid Intelligent Systems"},{"key":"10.3233\/HIS-240019_ref15","unstructured":"B. Ghojogh, M.N. Samad, S.A. Mashhadi, T. Kapoor, W. Ali, F. Karray and M. Crowley, Feature selection and feature extraction in pattern analysis: A literature review. ArXiv (2019)."},{"key":"10.3233\/HIS-240019_ref16","unstructured":"K. Gokcesu and H. 
Gokcesu, Generalized huber loss for robust learning and its efficient minimization for a robust statistics (2021)."},{"issue":"7825","key":"10.3233\/HIS-240019_ref17","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"10.3233\/HIS-240019_ref18","doi-asserted-by":"crossref","unstructured":"J. He, L. Ding, L. Jiang and L. Ma, Kernel ridge regression classification. In: 2014 International Joint Conference on Neural Networks (IJCNN). (2014), pp.\u00a02263\u20132267.","DOI":"10.1109\/IJCNN.2014.6889396"},{"issue":"1","key":"10.3233\/HIS-240019_ref19","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1080\/08839514.2018.1451032","article-title":"Fraudulent firm classification: A case study of an external audit.","volume":"32","author":"Hooda","year":"2018","journal-title":"Applied Artificial Intelligence"},{"key":"10.3233\/HIS-240019_ref20","doi-asserted-by":"crossref","unstructured":"R. Izabela and L. Krzysztof, GAAMmf: genetic algorithm with aggressive mutation and decreasing feature set for feature selection. Genetic Programming and Evolvable Machines 24(2) (Jul 2023).","DOI":"10.1007\/s10710-023-09458-y"},{"issue":"2","key":"10.3233\/HIS-240019_ref21","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.swevo.2011.05.001","article-title":"Surrogate-assisted evolutionary computation: Recent advances and future challenges.","volume":"1","author":"Jin","year":"2011","journal-title":"Swarm and Evolutionary Computation"},{"issue":"2065","key":"10.3233\/HIS-240019_ref22","first-page":"20150202","article-title":"Principal component analysis: a review and recent developments. 
Philosophical Transactions of the Royal Society A: Mathematical","volume":"374","author":"Jolliffe","year":"2016","journal-title":"Physical and Engineering Sciences"},{"key":"10.3233\/HIS-240019_ref23","doi-asserted-by":"crossref","unstructured":"K.A. de\u00a0Jong, Evolutionary computation, Evolutionary Computation, Bradford Books, Cambridge, MA (Feb 2006), pp.\u00a01\u201321.","DOI":"10.1145\/3583131.3600058"},{"issue":"3","key":"10.3233\/HIS-240019_ref24","doi-asserted-by":"crossref","first-page":"1863","DOI":"10.1007\/s11831-022-09853-1","article-title":"A systematic review on metaheuristic optimization techniques for feature selections in disease diagnosis: Open issues and challenges.","volume":"30","author":"Kaur","year":"2023","journal-title":"Arch. Comput. Methods Eng."},{"key":"10.3233\/HIS-240019_ref25","doi-asserted-by":"crossref","unstructured":"O. Kramer, Genetic algorithm essentials, Studies in computational intelligence, Springer International Publishing, Cham, Switzerland, 1 edn. (Jan 2017), pp.\u00a011\u201319.","DOI":"10.1007\/978-3-319-52156-5_2"},{"key":"10.3233\/HIS-240019_ref27","unstructured":"S. Luke, Essentials of Metaheuristics, Lulu, second edn. (2013), pp.\u00a09\u201310."},{"issue":"3, 4","key":"10.3233\/HIS-240019_ref28","doi-asserted-by":"crossref","first-page":"183","DOI":"10.3233\/HIS-230012","article-title":"Ga evolved cgp configuration data for digital circuit design on embryonic architecture.","volume":"19","author":"Malhotra","year":"2023","journal-title":"International Journal of Hybrid Intelligent Systems"},{"key":"10.3233\/HIS-240019_ref29","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python.","volume":"12","author":"Pedregosa","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"10.3233\/HIS-240019_ref30","doi-asserted-by":"crossref","unstructured":"N. Pudjihartono, T. Fadason, A.W. Kempa-Liehr and J.M. 
O\u2019Sullivan, A review of feature selection methods for machine learning-based disease risk prediction. Frontiers in Bioinformatics 2 (Jun 2022).","DOI":"10.3389\/fbinf.2022.927312"},{"issue":"2","key":"10.3233\/HIS-240019_ref31","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1007\/s10044-021-01000-z","article-title":"fGAAM: A fast and resizable genetic algorithm with aggressive mutation for feature selection.","volume":"25","author":"Rejer","year":"2021","journal-title":"Pattern Analysis and Applications"},{"issue":"4","key":"10.3233\/HIS-240019_ref32","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1109\/JBHI.2013.2245674","article-title":"Collection and analysis of a Parkinson speech dataset with multiple types of sound recordings.","volume":"17","author":"Sakar","year":"2013","journal-title":"IEEE J Biomed Health Inform"},{"key":"10.3233\/HIS-240019_ref33","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1016\/j.ins.2021.03.002","article-title":"Surrogate models in evolutionary single-objective optimization: A new taxonomy and experimental study.","volume":"562","author":"Tong","year":"2021","journal-title":"Information Sciences"},{"issue":"2","key":"10.3233\/HIS-240019_ref34","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1145\/2641190.2641198","article-title":"Openml: networked science in machine learning.","volume":"15","author":"Vanschoren","year":"2013","journal-title":"SIGKDD Explorations"},{"key":"10.3233\/HIS-240019_ref35","unstructured":"G. Vanwinckelen, H. Blockeel, B. De\u00a0Baets, B. Manderick, M. Rademaker and W. 
Waegeman, On estimating model accuracy with repeated cross-validation (2012-01-01)."},{"issue":"1","key":"10.3233\/HIS-240019_ref36","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/S0169-7439(00)00122-2","article-title":"Monte carlo cross validation.","volume":"56","author":"Xu","year":"2001","journal-title":"Chemometrics and Intelligent Laboratory Systems"},{"key":"10.3233\/HIS-240019_ref37","unstructured":"H. Zhang, The optimality of naive bayes (2004)."},{"key":"10.3233\/HIS-240019_ref38","doi-asserted-by":"crossref","unstructured":"J. Ziegler and W. Banzhaf, Decreasing the number of evaluations in evolutionary algorithms by using a meta-model of the fitness function. In: C. Ryan, T. Soule, M. Keijzer, E. Tsang, R. Poli, E. Costa, (eds.) Genetic Programming. Springer Berlin Heidelberg, Berlin, Heidelberg (2003), pp.\u00a0264\u2013275.","DOI":"10.1007\/3-540-36599-0_24"}],"container-title":["International Journal of Hybrid Intelligent Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/HIS-240019","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T08:53:04Z","timestamp":1777452784000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/HIS-240019"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,19]]},"references-count":36,"journal-issue":{"issue":"3"},"URL":"https:\/\/doi.org\/10.3233\/his-240019","relation":{},"ISSN":["1448-5869","1875-8819"],"issn-type":[{"value":"1448-5869","type":"print"},{"value":"1875-8819","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,9,19]]}}}