{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T06:59:45Z","timestamp":1780469985695,"version":"3.54.1"},"reference-count":52,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,4,28]],"date-time":"2022-04-28T00:00:00Z","timestamp":1651104000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,4,28]],"date-time":"2022-04-28T00:00:00Z","timestamp":1651104000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"The Second Forum for Women in Research Award."}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>To mitigate the curse of dimensionality in high-dimensional datasets, feature selection has become a crucial step in most data mining applications. However, no feature selection method consistently delivers the best performance across different domains. For this reason and in order to improve the stability of the feature selection process, ensemble feature selection frameworks have become increasingly popular. While many have examined the construction of ensemble techniques under various considerations, little work has been done to shed light on the influence of the aggregation process on the stability of the ensemble feature selection. In contribution to this field, this work aims to explore the impact of some selected aggregation strategies on the ensemble\u2019s stability and accuracy. Using twelve classification real datasets from various domains, the stability and accuracy of five different aggregation techniques were examined under four standard filter feature selection methods. The experimental analysis revealed significant differences in both the stability and accuracy behavior of the ensemble under different aggregations, especially between score-based and rank-based aggregation strategies. Moreover, it was observed that the simpler score-based strategies based on the Arithmetic Mean or L2-norm aggregation appear to be efficient and compelling in most cases. Given the data structure or associated application domain, this work\u2019s findings can guide the construction of feature selection ensembles using the most efficient and suitable aggregation rules.<\/jats:p>","DOI":"10.1186\/s40537-022-00607-1","type":"journal-article","created":{"date-parts":[[2022,4,28]],"date-time":"2022-04-28T13:07:37Z","timestamp":1651151257000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":41,"title":["The stability of different aggregation techniques in ensemble feature selection"],"prefix":"10.1186","volume":"9","author":[{"given":"Reem","family":"Salman","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ayman","family":"Alzaatreh","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Hana","family":"Sulieman","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2022,4,28]]},"reference":[{"issue":"3","key":"607_CR1","doi-asserted-by":"publisher","first-page":"211","DOI":"10.6029\/smartcr.2014.03.007","volume":"4","author":"V Kumar","year":"2014","unstructured":"Kumar V, Minz S. Feature selection: a literature review. SmartCR. 2014;4(3):211\u201329.","journal-title":"SmartCR"},{"issue":"Mar","key":"607_CR2","first-page":"1157","volume":"3","author":"I Guyon","year":"2003","unstructured":"Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res. 2003;3(Mar):1157\u201382.","journal-title":"J Mach Learn Res"},{"issue":"1","key":"607_CR3","first-page":"03","volume":"5","author":"H Sulieman","year":"2018","unstructured":"Sulieman H, Alzaatreh A. A supervised feature selection approach based on global sensitivity. Arch Data Sci Ser A (Online First). 2018;5(1):03.","journal-title":"Arch Data Sci Ser A (Online First)"},{"issue":"19","key":"607_CR4","doi-asserted-by":"publisher","first-page":"2507","DOI":"10.1093\/bioinformatics\/btm344","volume":"23","author":"Y Saeys","year":"2007","unstructured":"Saeys Y, Inza I, Larranaga P. A review of feature selection techniques in bioinformatics. Bioinformatics. 2007;23(19):2507\u201317.","journal-title":"Bioinformatics"},{"issue":"1","key":"607_CR5","first-page":"3","volume":"19","author":"B Venkatesh","year":"2019","unstructured":"Venkatesh B, Anuradha J. A review of feature selection and its methods. Cybern Inf Technol. 2019;19(1):3\u201326.","journal-title":"Cybern Inf Technol"},{"key":"607_CR6","doi-asserted-by":"crossref","unstructured":"Pes B. Evaluating feature selection robustness on high-dimensional data. In: International conference on hybrid artificial intelligence systems. Springer; 2018. p. 235\u2013247.","DOI":"10.1007\/978-3-319-92639-1_20"},{"issue":"1","key":"607_CR7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40537-020-00385-8","volume":"8","author":"S Alelyani","year":"2021","unstructured":"Alelyani S. Stable bagging feature selection on medical data. J Big Data. 2021;8(1):1\u201318.","journal-title":"J Big Data"},{"key":"607_CR8","first-page":"15","volume":"312","author":"G Brown","year":"2010","unstructured":"Brown G. Ensemble learning. Encycl Mach Learn. 2010;312:15\u20139.","journal-title":"Encycl Mach Learn"},{"issue":"2","key":"607_CR9","doi-asserted-by":"publisher","first-page":"200","DOI":"10.3390\/e23020200","volume":"23","author":"R Salman","year":"2021","unstructured":"Salman R, Alzaatreh A, Sulieman H, Faisal S. A bootstrap framework for aggregating within and between feature selection methods. Entropy. 2021;23(2):200.","journal-title":"Entropy"},{"key":"607_CR10","doi-asserted-by":"crossref","unstructured":"Saeys Y, Abeel T, Van\u00a0de Peer Y. Robust feature selection using ensemble feature selection techniques. In: Joint European conference on machine learning and knowledge discovery in databases. Springer; 2008. p. 313\u2013325.","DOI":"10.1007\/978-3-540-87481-2_21"},{"key":"607_CR11","doi-asserted-by":"crossref","unstructured":"Wang H, Khoshgoftaar TM, Napolitano A. A comparative study of ensemble feature selection techniques for software defect prediction. In: 2010 Ninth international conference on machine learning and applications. IEEE; 2010. p. 135\u2013140.","DOI":"10.1109\/ICMLA.2010.27"},{"issue":"2","key":"607_CR12","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1007\/s40747-017-0060-x","volume":"4","author":"N Hoque","year":"2018","unstructured":"Hoque N, Singh M, Bhattacharyya DK. Efs-mi: an ensemble feature selection method for classification. Complex Intell Syst. 2018;4(2):105\u201318.","journal-title":"Complex Intell Syst"},{"key":"607_CR13","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1016\/j.ins.2018.12.033","volume":"480","author":"P Drot\u00e1r","year":"2019","unstructured":"Drot\u00e1r P, Gazda M, Vokorokos L. Ensemble feature selection using election methods and ranker clustering. Inf Sci. 2019;480:365\u201380.","journal-title":"Inf Sci"},{"issue":"5","key":"607_CR14","doi-asserted-by":"publisher","first-page":"12553","DOI":"10.1111\/exsy.12553","volume":"37","author":"C-W Chen","year":"2020","unstructured":"Chen C-W, Tsai Y-H, Chang F-R, Lin W-C. Ensemble feature selection in medical datasets: combining filter, wrapper, and embedded feature selection results. Expert Syst. 2020;37(5):12553.","journal-title":"Expert Syst"},{"key":"607_CR15","doi-asserted-by":"publisher","unstructured":"Abeel T, Helleputte T, Van de Peer Y, Dupont P, Saeys Y. Robust biomarker identification for cancer diagnosis with ensemble feature selection methods. Bioinformatics. 2009;26(3):392\u20138. https:\/\/doi.org\/10.1093\/bioinformatics\/btp630. https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/3\/392\/16896736\/btp630.pdf","DOI":"10.1093\/bioinformatics\/btp630"},{"issue":"10","key":"607_CR16","doi-asserted-by":"publisher","first-page":"5951","DOI":"10.1007\/s00521-019-04082-3","volume":"32","author":"B Pes","year":"2020","unstructured":"Pes B. Ensemble feature selection for high-dimensional data: a stability analysis across multiple domains. Neural Comput Appl. 2020;32(10):5951\u201373.","journal-title":"Neural Comput Appl"},{"key":"607_CR17","unstructured":"Liu H, Motoda H, Setiono R, Zhao Z. Feature selection: An ever evolving frontier in data mining. In: Liu, H., Motoda, H., Setiono, R., Zhao, Z. (eds.) Proceedings of the Fourth International Workshop on Feature Selection in Data Mining. Proceedings of Machine Learning Research, vol. 10, pp. 4\u201313. PMLR, Hyderabad, India (2010). https:\/\/proceedings.mlr.press\/v10\/liu10b.html."},{"issue":"2","key":"607_CR18","doi-asserted-by":"publisher","first-page":"483","DOI":"10.1016\/S0377-2217(02)00911-6","volume":"156","author":"S Piramuthu","year":"2004","unstructured":"Piramuthu S. Evaluating feature selection methods for learning in data mining applications. Eur J Oper Res. 2004;156(2):483\u201394.","journal-title":"Eur J Oper Res"},{"key":"607_CR19","doi-asserted-by":"publisher","DOI":"10.1201\/9781584888796","volume-title":"Computational methods of feature selection","author":"H Liu","year":"2007","unstructured":"Liu H, Motoda H. Computational methods of feature selection. Cham: CRC Press; 2007."},{"issue":"3","key":"607_CR20","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1080\/02564602.2014.906859","volume":"31","author":"D Guan","year":"2014","unstructured":"Guan D, Yuan W, Lee Y-K, Najeebullah K, Rasel MK. A review of ensemble learning based feature selection. IETE Tech Rev. 2014;31(3):190\u20138.","journal-title":"IETE Tech Rev"},{"key":"607_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.inffus.2018.11.008","volume":"52","author":"V Bol\u00f3n-Canedo","year":"2019","unstructured":"Bol\u00f3n-Canedo V, Alonso-Betanzos A. Ensembles for feature selection: a review and future trends. Inf Fusion. 2019;52:1\u201312.","journal-title":"Inf Fusion"},{"issue":"1","key":"607_CR22","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1177\/0165551515613226","volume":"43","author":"A Onan","year":"2017","unstructured":"Onan A, Koruko\u011flu S. A feature selection model based on genetic rank aggregation for text sentiment classification. J Inf Sci. 2017;43(1):25\u201338.","journal-title":"J Inf Sci"},{"issue":"1","key":"607_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12938-016-0292-9","volume":"16","author":"S Najdi","year":"2017","unstructured":"Najdi S, Gharbali AA, Fonseca JM. Feature ranking and rank aggregation for automatic sleep stage classification: a comparative study. Biomed Eng Online. 2017;16(1):1\u201319.","journal-title":"Biomed Eng Online"},{"key":"607_CR24","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1016\/j.jneumeth.2018.04.002","volume":"303","author":"JD L\u00f3pez-Cabrera","year":"2018","unstructured":"L\u00f3pez-Cabrera JD, Lorenzo-Ginori JV. Feature selection for the classification of traced neurons. J Neurosci Methods. 2018;303:41\u201354.","journal-title":"J Neurosci Methods"},{"issue":"5","key":"607_CR25","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1002\/wics.111","volume":"2","author":"S Lin","year":"2010","unstructured":"Lin S. Rank aggregation methods. Wiley Interdiscip Rev Comput Stat. 2010;2(5):555\u201370.","journal-title":"Wiley Interdiscip Rev Comput Stat"},{"issue":"5","key":"607_CR26","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1038\/nbt1203","volume":"24","author":"S Aerts","year":"2006","unstructured":"Aerts S, Lambrechts D, Maity S, Van Loo P, Coessens B, De Smet F, Tranchevent L-C, De Moor B, Marynen P, Hassan B, et al. Gene prioritization through genomic data fusion. Nat Biotechnol. 2006;24(5):537\u201344.","journal-title":"Nat Biotechnol"},{"issue":"4","key":"607_CR27","doi-asserted-by":"publisher","first-page":"573","DOI":"10.1093\/bioinformatics\/btr709","volume":"28","author":"R Kolde","year":"2012","unstructured":"Kolde R, Laur S, Adler P, Vilo J. Robust rank aggregation for gene list integration and meta-analysis. Bioinformatics. 2012;28(4):573\u201380.","journal-title":"Bioinformatics"},{"key":"607_CR28","doi-asserted-by":"crossref","unstructured":"Joachims T. Optimizing search engines using clickthrough data. In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining. 2002. p. 133\u2013142","DOI":"10.1145\/775047.775067"},{"key":"607_CR29","unstructured":"Dittman DJ, Khoshgoftaar TM, Wald R, Napolitano A. Classification performance of rank aggregation techniques for ensemble gene selection. In: The twenty-sixth international FLAIRS conference 2013."},{"key":"607_CR30","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1016\/j.knosys.2016.11.017","volume":"118","author":"B Seijo-Pardo","year":"2017","unstructured":"Seijo-Pardo B, Porto-D\u00edaz I, Bol\u00f3n-Canedo V, Alonso-Betanzos A. Ensemble feature selection: homogeneous and heterogeneous approaches. Knowl Based Syst. 2017;118:124\u201339.","journal-title":"Knowl Based Syst"},{"key":"607_CR31","doi-asserted-by":"crossref","unstructured":"Seijo-Pardo B, Bol\u00f3n-Canedo V, Alonso-Betanzos A. Using a feature selection ensemble on dna microarray datasets. In: ESANN 2016.","DOI":"10.1007\/978-3-319-21858-8_4"},{"issue":"3","key":"607_CR32","doi-asserted-by":"publisher","first-page":"857","DOI":"10.1007\/s11063-017-9619-1","volume":"46","author":"B Seijo-Pardo","year":"2017","unstructured":"Seijo-Pardo B, Bol\u00f3n-Canedo V, Alonso-Betanzos A. Testing different ensemble configurations for feature selection. Neural Process Lett. 2017;46(3):857\u201380.","journal-title":"Neural Process Lett"},{"key":"607_CR33","doi-asserted-by":"crossref","unstructured":"Wald R, Khoshgoftaar TM, Dittman D, Awada W, Napolitano A. An extensive comparison of feature ranking aggregation techniques in bioinformatics. In: 2012 IEEE 13th international conference on information reuse and integration (IRI). IEEE; 2012. p. 377\u2013384.","DOI":"10.1109\/IRI.2012.6303034"},{"key":"607_CR34","doi-asserted-by":"crossref","unstructured":"Wald R, Khoshgoftaar TM, Dittman D. Mean aggregation versus robust rank aggregation for ensemble gene selection. In: 2012 11th international conference on machine learning and applications, vol. 1. IEEE; 2012. p. 63\u201369.","DOI":"10.1109\/ICMLA.2012.20"},{"key":"607_CR35","doi-asserted-by":"crossref","unstructured":"Dess\u00ec N, Pes B, Angioni M. On stability of ensemble gene selection. In: International conference on intelligent data engineering and automated learning. Springer; 2015. p. 416\u2013423.","DOI":"10.1007\/978-3-319-24834-9_48"},{"issue":"1","key":"607_CR36","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1021\/ci300547g","volume":"53","author":"P Willett","year":"2013","unstructured":"Willett P. Combination of similarity rankings using data fusion. J Chem Inf Model. 2013;53(1):1\u201310.","journal-title":"J Chem Inf Model"},{"key":"607_CR37","doi-asserted-by":"crossref","unstructured":"Dittman DJ, Khoshgoftaar TM, Wald R, Napolitano A. Comparison of rank-based vs. score-based aggregation for ensemble gene selection. In: 2013 IEEE 14th international conference on information reuse and integration (IRI). IEEE; 2013. p. 225\u2013231.","DOI":"10.1109\/IRI.2013.6642476"},{"key":"607_CR38","doi-asserted-by":"crossref","unstructured":"Dernoncourt D, Hanczar B, Zucker J-D. Stability of ensemble feature selection on high-dimension and low-sample size data. In: Proceedings of the 3rd international conference on pattern recognition applications and methods. 2014. p. 325\u2013330.","DOI":"10.5220\/0004922203250330"},{"key":"607_CR39","doi-asserted-by":"crossref","unstructured":"Li Y, Hsu DF, Chung SM. Combining multiple feature selection methods for text categorization by using rank-score characteristics. In: 2009 21st IEEE international conference on tools with artificial intelligence. IEEE; 2009. p. 508\u2013517.","DOI":"10.1109\/ICTAI.2009.129"},{"key":"607_CR40","doi-asserted-by":"crossref","unstructured":"Alelyani S, Zhao Z, Liu H. A dilemma in assessing stability of feature selection algorithms. In: 2011 IEEE international conference on high performance computing and communications. IEEE; 2011. p. 701\u2013707.","DOI":"10.1109\/HPCC.2011.99"},{"key":"607_CR41","doi-asserted-by":"crossref","unstructured":"Dittman D, Khoshgoftaar T, Wald R, Napolitano A. Similarity analysis of feature ranking techniques on imbalanced dna microarray datasets. In: 2012 IEEE international conference on bioinformatics and biomedicine. IEEE; 2012. p. 1\u20135.","DOI":"10.1109\/BIBM.2012.6392708"},{"key":"607_CR42","doi-asserted-by":"crossref","unstructured":"Wald R, Khoshgoftaar TM, Napolitano A. Stability of filter-and wrapper-based feature subset selection. In: 2013 IEEE 25th international conference on tools with artificial intelligence. IEEE; 2013. p. 374\u2013380.","DOI":"10.1109\/ICTAI.2013.63"},{"key":"607_CR43","unstructured":"Lustgarten JL, Gopalakrishnan V, Visweswaran S. Measuring stability of feature selection in biomedical datasets. In: AMIA annual symposium proceedings, vol. 2009. American Medical Informatics Association; 2009. p. 406."},{"key":"607_CR44","doi-asserted-by":"crossref","unstructured":"Nogueira S, Brown G. Measuring the stability of feature selection with applications to ensemble methods. In: International workshop on multiple classifier systems. Springer; 2015. p. 135\u2013146.","DOI":"10.1007\/978-3-319-20248-8_12"},{"key":"607_CR45","unstructured":"Kuncheva LI. A stability index for feature selection. In: Artificial intelligence and applications. 2007. p. 421\u2013427."},{"issue":"1","key":"607_CR46","first-page":"6345","volume":"18","author":"S Nogueira","year":"2017","unstructured":"Nogueira S, Sechidis K, Brown G. On the stability of feature selection algorithms. J Mach Learn Res. 2017;18(1):6345\u201398.","journal-title":"J Mach Learn Res"},{"issue":"1","key":"607_CR47","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1007\/s10115-006-0040-8","volume":"12","author":"A Kalousis","year":"2007","unstructured":"Kalousis A, Prados J, Hilario M. Stability of feature selection algorithms: a study on high-dimensional spaces. Knowl Inf Syst. 2007;12(1):95\u2013116.","journal-title":"Knowl Inf Syst"},{"key":"607_CR48","doi-asserted-by":"crossref","unstructured":"Bommert, A., Rahnenf\u00fchrer, J.: Adjusted measures for feature selection stability for data sets with similar features. In: International conference on machine learning, optimization, and data science. Springer; 2010. p. 203\u2013214","DOI":"10.1007\/978-3-030-64583-0_19"},{"issue":"1","key":"607_CR49","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1016\/j.cie.2006.07.004","volume":"51","author":"E Yu","year":"2006","unstructured":"Yu E, Cho S. Ensemble based on ga wrapper feature selection. Comput Ind Eng. 2006;51(1):111\u20136.","journal-title":"Comput Ind Eng"},{"key":"607_CR50","doi-asserted-by":"publisher","unstructured":"Khaire UM, Dhanalakshmi R. Stability of feature selection algorithm: a review. J King Saud Univ Comput Inf Sci 2019;34(4):1060\u20131073. https:\/\/doi.org\/10.1016\/j.jksuci.2019.06.012","DOI":"10.1016\/j.jksuci.2019.06.012"},{"issue":"1","key":"607_CR51","doi-asserted-by":"publisher","first-page":"163","DOI":"10.1093\/biomet\/70.1.163","volume":"70","author":"JT Kent","year":"1983","unstructured":"Kent JT. Information gain and a general measure of correlation. Biometrika. 1983;70(1):163\u201373.","journal-title":"Biometrika"},{"issue":"39","key":"607_CR52","doi-asserted-by":"publisher","first-page":"283","DOI":"10.2307\/3603556","volume":"2","author":"R Muirhead","year":"1903","unstructured":"Muirhead R. Proofs that the arithmetic mean is greater than the geometric mean. Math Gaz. 1903;2(39):283\u20137.","journal-title":"Math Gaz"}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-022-00607-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-022-00607-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-022-00607-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,23]],"date-time":"2024-09-23T05:29:51Z","timestamp":1727069391000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-022-00607-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,28]]},"references-count":52,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["607"],"URL":"https:\/\/doi.org\/10.1186\/s40537-022-00607-1","relation":{},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,28]]},"assertion":[{"value":"3 October 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 April 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 April 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"51"}}