{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T17:31:51Z","timestamp":1776274311059,"version":"3.50.1"},"reference-count":41,"publisher":"PeerJ","license":[{"start":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T00:00:00Z","timestamp":1774915200000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004515","name":"Universiti Kebangsaan Malaysia","doi-asserted-by":"crossref","award":["DIP-2016-024"],"award-info":[{"award-number":["DIP-2016-024"]}],"id":[{"id":"10.13039\/501100004515","id-type":"DOI","asserted-by":"crossref"}]},{"name":"American University of the Middle East in Kuwait"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p>Feature selection (FS) constitutes an indispensable process in medical data analysis, pivotal for mitigating the curse of dimensionality while concurrently augmenting classification performance and fostering model interpretability. This article introduces a novel hybrid framework, the Adaptive Ensemble of Multivariate Filters (AE-MVF), which synergistically integrates Multivariate Mutual Information (MMI) with extreme gradient boosting (XGBoost)-based feature importance scores within a dynamically weighted ensemble architecture. The core of our contribution is the application of the Harris Hawks Optimization (HHO) algorithm to adaptively determine optimal filter weights and feature subset cardinality, a process explicitly designed to navigate the trade-off between maximizing feature relevance and minimizing inter-feature redundancy. The efficacy and robustness of AE-MVF were rigorously evaluated on a diverse suite of 22 benchmark medical datasets, spanning various dimensionalities, sample sizes, and class distributions. Empirical results demonstrate that AE-MVF yields statistically significant improvements over contemporary filter- and wrapper-based FS methodologies, establishing a new performance benchmark in both classification accuracy and feature subset parsimony. Notably, on high-dimensional, low-sample-size (HDLSS) datasets such as \u2018Leukemia\u2019 and \u2018Colon\u2019, the framework achieved substantial dimensionality reduction while preserving or enhancing predictive accuracy. In lower-dimensional contexts, AE-MVF delivered competitive performance with highly parsimonious feature sets, often comprising just 3\u20135 variables. These findings underscore the scalability and generalizability of the AE-MVF framework, positioning it as a potent tool for developing interpretable and computationally efficient models in real-world medical diagnostics.<\/jats:p>","DOI":"10.7717\/peerj-cs.3739","type":"journal-article","created":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T08:40:21Z","timestamp":1774946421000},"page":"e3739","source":"Crossref","is-referenced-by-count":2,"title":["Adaptive ensemble of multivariate filters with Harris Hawks optimization for feature selection in medical diagnosis"],"prefix":"10.7717","volume":"12","author":[{"given":"Safaa","family":"Al-Adwan","sequence":"first","affiliation":[{"name":"Data Mining and Computational Intelligence Research Group, Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia"},{"name":"Prince Abdullah Bin Ghazi Faculty of Information and Communication Technology, Al-Balqa Applied University, Al-Salt, Balqa, Jordan"}]},{"given":"Salwani","family":"Abdullah","sequence":"additional","affiliation":[{"name":"Data Mining and Computational Intelligence Research Group, Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia"}]},{"given":"Mohammed","family":"Alweshah","sequence":"additional","affiliation":[{"name":"Prince Abdullah Bin Ghazi Faculty of Information and Communication Technology, Al-Balqa Applied University, Al-Salt, Balqa, Jordan"},{"name":"Software Engineering Department, Faculty of Information Technology, Aqaba University of Technology, Aqaba, Jordan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2025-3710","authenticated-orcid":true,"given":"Wandeep Kaur","family":"Ratan Singh","sequence":"additional","affiliation":[{"name":"Sentiment Analysis Research Lab, Center for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, Bangi, Selangor, Malaysia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1929-5668","authenticated-orcid":true,"given":"Aws","family":"Al-Qaisi","sequence":"additional","affiliation":[{"name":"College of Engineering and Technology, American University of the Middle East, Kuwait"}]},{"given":"Maen","family":"Takruri","sequence":"additional","affiliation":[{"name":"College of Engineering and Technology, American University of the Middle East, Kuwait"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0586-1961","authenticated-orcid":true,"given":"Sofian","family":"Kassaymeh","sequence":"additional","affiliation":[{"name":"Department of Robotics and Artificial Intelligence, Faculty of Information Technology, Jadara University, Irbid, Jordan"}]}],"member":"4443","published-online":{"date-parts":[[2026,3,31]]},"reference":[{"issue":"2","key":"10.7717\/peerj-cs.3739\/ref-1","doi-asserted-by":"publisher","first-page":"100240","DOI":"10.1016\/j.dajour.2023.100240","article-title":"A new univariate feature selection algorithm based on the best-worst multi-attribute decision-making method","volume":"7","author":"Abellana","year":"2023","journal-title":"Decision Analytics Journal"},{"issue":"12","key":"10.7717\/peerj-cs.3739\/ref-2","doi-asserted-by":"publisher","first-page":"7165","DOI":"10.1007\/s00521-020-05483-5","article-title":"An intelligent feature selection approach based on moth flame optimization for medical diagnosis","volume":"33","author":"Abu Khurmaa","year":"2021","journal-title":"Neural Computing and Applications"},{"key":"10.7717\/peerj-cs.3739\/ref-3","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2503.16577","article-title":"Feature selection strategies for optimized heart disease diagnosis using ML and DL models","author":"Ahmad","year":"2025"},{"issue":"6","key":"10.7717\/peerj-cs.3739\/ref-4","doi-asserted-by":"publisher","first-page":"1221","DOI":"10.14569\/ijacsa.2023.01406130","article-title":"Parameter identification of a multilayer perceptron neural network using an optimized salp swarm algorithm","volume":"14","author":"Al-Laham","year":"2023","journal-title":"International Journal of Advanced Computer Science and Applications"},{"issue":"2","key":"10.7717\/peerj-cs.3739\/ref-5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.5455\/jjcit.71-1639410312","article-title":"Hybridization of arithmetic optimization with great deluge algorithms for feature selection problems in medical diagnosis","volume":"8","author":"Alweshah","year":"2022","journal-title":"Jordanian Journal of Computers and Information Technology"},{"issue":"3","key":"10.7717\/peerj-cs.3739\/ref-6","doi-asserted-by":"publisher","first-page":"293","DOI":"10.1504\/ijdmmm.2024.140529","article-title":"Improving intrusion detection in the IoT with African vultures optimization algorithm-based feature selection","volume":"16","author":"Alweshah","year":"2024","journal-title":"International Journal of Data Mining, Modelling and Management"},{"issue":"2","key":"10.7717\/peerj-cs.3739\/ref-7","doi-asserted-by":"publisher","first-page":"107629","DOI":"10.1016\/j.knosys.2021.107629","article-title":"Coronavirus herd immunity optimizer with greedy crossover for feature selection in medical diagnosis","volume":"235","author":"Alweshah","year":"2022","journal-title":"Knowledge-Based Systems"},{"issue":"5","key":"10.7717\/peerj-cs.3739\/ref-8","doi-asserted-by":"publisher","first-page":"6349","DOI":"10.1007\/s12652-022-04407-6","article-title":"Intrusion detection for the internet of things (IoT) based on the emperor penguin colony optimization algorithm","volume":"14","author":"Alweshah","year":"2023","journal-title":"Journal of Ambient Intelligence and Humanized Computing"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-9","first-page":"101","article-title":"Hybrid filter-wrapper feature selection using equilibrium optimization","volume":"55","author":"Ansari Shiri","year":"2023","journal-title":"Journal of Algorithms and Computation"},{"key":"10.7717\/peerj-cs.3739\/ref-10","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2501.06805","article-title":"A pan-cancer classification model using multi-view feature selection method and ensemble classifier","author":"Chowdhury","year":"2025"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-11","doi-asserted-by":"publisher","first-page":"15202","DOI":"10.1088\/2631-8695\/ada1a4","article-title":"An optimized phishing detection model using hybrid feature selection and a fine-tuned narrow neural network with dynamic jaya optimization to overcome cyberthreats","volume":"7","author":"Diviya","year":"2025","journal-title":"Engineering Research Express"},{"key":"10.7717\/peerj-cs.3739\/ref-12","doi-asserted-by":"publisher","first-page":"269","DOI":"10.1016\/j.neucom.2022.04.083","article-title":"A comprehensive survey on recent metaheuristics for feature selection","volume":"494","author":"Dokeroglu","year":"2022","journal-title":"Neurocomputing"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-13","doi-asserted-by":"publisher","first-page":"426","DOI":"10.1007\/s42235-023-00433-y","article-title":"An improved binary quantum-based avian navigation optimizer algorithm to select effective feature subset from medical data: a COVID-19 case study","volume":"21","author":"Fatahi","year":"2024","journal-title":"Journal of Bionic Engineering"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-14","doi-asserted-by":"publisher","first-page":"221","DOI":"10.31449\/inf.v48i1.4759","article-title":"A hybrid feature selection based on fisher score and svm-rfe for microarray data","volume":"48","author":"Hamla","year":"2024","journal-title":"Informatica"},{"key":"10.7717\/peerj-cs.3739\/ref-15","doi-asserted-by":"publisher","first-page":"849","DOI":"10.1016\/j.future.2019.02.028","article-title":"Harris hawks optimization: algorithm and applications","volume":"97","author":"Heidari","year":"2019","journal-title":"Future Generation Computer Systems"},{"key":"10.7717\/peerj-cs.3739\/ref-16","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2302.12205","article-title":"Harris hawks feature selection in distributed machine learning for secure IoT environments","author":"Hijazi","year":"2023"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-17","doi-asserted-by":"publisher","first-page":"665","DOI":"10.1007\/s10479-022-04933-8","article-title":"Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: a comprehensive analysis","volume":"328","author":"Karimi","year":"2023","journal-title":"Annals of Operations Research"},{"issue":"17","key":"10.7717\/peerj-cs.3739\/ref-18","doi-asserted-by":"publisher","first-page":"12675","DOI":"10.1007\/s00521-023-08383-6","article-title":"Wrapper-based optimized feature selection using nature-inspired algorithms","volume":"35","author":"Karlupia","year":"2023","journal-title":"Neural Computing and Applications"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-19","doi-asserted-by":"publisher","first-page":"645","DOI":"10.32604\/iasc.2023.027039","article-title":"Intrusion detection using ensemble wrapper filter based feature selection with stacking model","volume":"35","author":"Karthikeyan","year":"2023","journal-title":"Intelligent Automation & Soft Computing"},{"issue":"4","key":"10.7717\/peerj-cs.3739\/ref-20","doi-asserted-by":"publisher","first-page":"1060","DOI":"10.1016\/j.jksuci.2019.06.012","article-title":"Stability of feature selection algorithm: a review","volume":"34","author":"Khaire","year":"2022","journal-title":"Journal of King Saud University-Computer and Information Sciences"},{"issue":"2","key":"10.7717\/peerj-cs.3739\/ref-21","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1089\/big.2021.0132","article-title":"An efficient, ensemble-based classification framework for big medical data","volume":"10","author":"Khan","year":"2022","journal-title":"Big Data"},{"key":"10.7717\/peerj-cs.3739\/ref-22","doi-asserted-by":"publisher","first-page":"104718","DOI":"10.1016\/j.bspc.2023.104718","article-title":"An augmented Snake Optimizer for diseases and COVID-19 diagnosis","volume":"84","author":"Khurma","year":"2023","journal-title":"Biomedical Signal Processing and Control"},{"issue":"6","key":"10.7717\/peerj-cs.3739\/ref-23","doi-asserted-by":"publisher","first-page":"170","DOI":"10.1016\/j.patrec.2023.05.007","article-title":"A novel improved binary harris hawks optimization for high dimensionality feature selection","volume":"171","author":"Lahmar","year":"2023","journal-title":"Pattern Recognition Letters"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12859-023-05247-7","article-title":"A two-stage hybrid biomarker selection method based on ensemble filter and binary differential evolution incorporating binary African vultures optimization","volume":"24","author":"Li","year":"2023","journal-title":"BMC Bioinformatics"},{"key":"10.7717\/peerj-cs.3739\/ref-25","doi-asserted-by":"publisher","first-page":"78","DOI":"10.12688\/f1000research.160393.1","article-title":"Feature optimized hybrid model for prediction of myocardial infarction","volume":"14","author":"Mishra","year":"2025","journal-title":"F1000Research"},{"key":"10.7717\/peerj-cs.3739\/ref-26","doi-asserted-by":"crossref","DOI":"10.1109\/EPEC.2018.8598326","article-title":"Multivariate mutual information-based feature selection for cyber intrusion detection","author":"Mohammadi","year":"2018"},{"issue":"6","key":"10.7717\/peerj-cs.3739\/ref-27","doi-asserted-by":"publisher","first-page":"105971","DOI":"10.1016\/j.envsoft.2024.105971","article-title":"Applications of XGBoost in water resources engineering: a systematic literature review (dec 2018\u2013may 2023)","volume":"174","author":"Niazkar","year":"2024","journal-title":"Environmental Modelling & Software"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-28","doi-asserted-by":"publisher","first-page":"22588","DOI":"10.1038\/s41598-023-49962-w","article-title":"Analyzing the impact of feature selection methods on machine learning algorithms for heart disease prediction","volume":"13","author":"Noroozi","year":"2023","journal-title":"Scientific Reports"},{"issue":"8","key":"10.7717\/peerj-cs.3739\/ref-29","doi-asserted-by":"publisher","first-page":"1226","DOI":"10.1109\/TPAMI.2005.159","article-title":"Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy","volume":"27","author":"Peng","year":"2005","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"11","key":"10.7717\/peerj-cs.3739\/ref-30","doi-asserted-by":"publisher","first-page":"100376\u2013100396","DOI":"10.1109\/access.2022.3203400","article-title":"An enhanced binary multiobjective hybrid filter-wrapper chimp optimization based feature selection method for COVID-19 patient health prediction","volume":"10","author":"Piri","year":"2022","journal-title":"IEEE Access"},{"key":"10.7717\/peerj-cs.3739\/ref-31","doi-asserted-by":"publisher","first-page":"927312","DOI":"10.3389\/fbinf.2022.927312","article-title":"A review of feature selection methods for machine learning-based disease risk prediction","volume":"2","author":"Pudjihartono","year":"2022","journal-title":"Frontiers in Bioinformatics"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-32","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1016\/j.dsm.2023.10.003","article-title":"Hybrid distributed feature selection using particle swarm optimization-mutual information","volume":"7","author":"Robindro","year":"2024","journal-title":"Data Science and Management"},{"issue":"1","key":"10.7717\/peerj-cs.3739\/ref-33","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s44196-024-00716-0","article-title":"Adaptive ensemble learning model-based binary white shark optimizer for software defect classification","volume":"18","author":"Saraireh","year":"2025","journal-title":"International Journal of Computational Intelligence Systems"},{"key":"10.7717\/peerj-cs.3739\/ref-34","doi-asserted-by":"publisher","first-page":"61368","DOI":"10.1109\/access.2023.3287484","article-title":"Wrapper-based feature selection for medical diagnosis: the BTLBO-KNN algorithm","volume":"11","author":"Seghir","year":"2023","journal-title":"IEEE Access"},{"issue":"20","key":"10.7717\/peerj-cs.3739\/ref-35","doi-asserted-by":"publisher","first-page":"12299","DOI":"10.1007\/s00521-024-09713-y","article-title":"A hybrid feature weighting and selection-based strategy to classify the high-dimensional and imbalanced medical data","volume":"36","author":"Singh","year":"2024","journal-title":"Neural Computing and Applications"},{"key":"10.7717\/peerj-cs.3739\/ref-36","doi-asserted-by":"publisher","first-page":"104396","DOI":"10.1016\/j.chemolab.2021.104396","article-title":"A hybrid ensemble-filter wrapper feature selection approach for medical data classification","volume":"217","author":"Singh","year":"2021","journal-title":"Chemometrics and Intelligent Laboratory Systems"},{"issue":"3","key":"10.7717\/peerj-cs.3739\/ref-37","doi-asserted-by":"publisher","first-page":"1575","DOI":"10.1007\/s10115-023-02010-5","article-title":"Feature selection techniques for machine learning: a survey of more than two decades of research","volume":"66","author":"Theng","year":"2024","journal-title":"Knowledge and Information Systems"},{"issue":"6","key":"10.7717\/peerj-cs.3739\/ref-38","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10614-023-10400-8","article-title":"Two-stage hybrid feature selection approach using levy\u2019s flight based chicken swarm optimization for stock market forecasting","volume":"63","author":"Verma","year":"2023","journal-title":"Computational Economics"},{"issue":"17","key":"10.7717\/peerj-cs.3739\/ref-39","doi-asserted-by":"publisher","first-page":"119612","DOI":"10.1016\/j.eswa.2023.119612","article-title":"A hybrid filter-wrapper feature selection using fuzzy KNN based on bonferroni mean for medical datasets classification: a COVID-19 case study","volume":"218","author":"Vommi","year":"2023","journal-title":"Expert Systems with Applications"},{"issue":"12","key":"10.7717\/peerj-cs.3739\/ref-40","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1007\/s00262-024-03843-x","article-title":"Optimizing cancer classification: a hybrid rdo-XGBoost approach for feature selection and predictive insights","volume":"73","author":"Yaqoob","year":"2024","journal-title":"Cancer Immunology, Immunotherapy"},{"issue":"4","key":"10.7717\/peerj-cs.3739\/ref-41","doi-asserted-by":"publisher","first-page":"1096","DOI":"10.3390\/rs15041096","article-title":"A proposed ensemble feature selection method for estimating forest aboveground biomass from multiple satellite data","volume":"15","author":"Zhang","year":"2023","journal-title":"Remote Sensing"}],"container-title":["PeerJ Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/peerj.com\/articles\/cs-3739.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-3739.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-3739.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-3739.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T08:40:27Z","timestamp":1774946427000},"score":1,"resource":{"primary":{"URL":"https:\/\/peerj.com\/articles\/cs-3739"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,31]]},"references-count":41,"alternative-id":["10.7717\/peerj-cs.3739"],"URL":"https:\/\/doi.org\/10.7717\/peerj-cs.3739","archive":["CLOCKSS","LOCKSS","Portico"],"relation":{},"ISSN":["2376-5992"],"issn-type":[{"value":"2376-5992","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,31]]},"article-number":"e3739"}}