{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,6]],"date-time":"2025-11-06T12:35:18Z","timestamp":1762432518416,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2024,11,9]],"date-time":"2024-11-09T00:00:00Z","timestamp":1731110400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>This study introduces a novel measure for evaluating attribute relevance, specifically designed to accurately identify attributes that are intrinsically related to a phenomenon, while being sensitive to the asymmetry of those relationships and noise conditions. Traditional variable selection techniques, such as filter and wrapper methods, often fall short in capturing these complexities. Our methodology, grounded in decision trees but extendable to other machine learning models, was rigorously evaluated across various data scenarios. The results demonstrate that our measure effectively distinguishes relevant from irrelevant attributes and highlights how relevance is influenced by noise, providing a more nuanced understanding compared to established methods such as Pearson, Spearman, Kendall, MIC, MAS, MEV, GMIC, and Phik. This research underscores the importance of phenomenon-centric explainability, reproducibility, and robust attribute relevance evaluation in the development of predictive models. 
By enhancing both the interpretability and contextual accuracy of models, our approach not only supports more informed decision making but also contributes to a deeper understanding of the underlying mechanisms in diverse application domains, such as biomedical research, financial modeling, astronomy, and others.<\/jats:p>","DOI":"10.3390\/a17110518","type":"journal-article","created":{"date-parts":[[2024,11,11]],"date-time":"2024-11-11T11:34:11Z","timestamp":1731324851000},"page":"518","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Attribute Relevance Score: A Novel Measure for Identifying Attribute Importance"],"prefix":"10.3390","volume":"17","author":[{"given":"Pablo","family":"Neirz","sequence":"first","affiliation":[{"name":"Departamento de Inform\u00e1tica, Universidad T\u00e9cnica Federico Santa Mar\u00eda, Valpara\u00edso 1680, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hector","family":"Allende","sequence":"additional","affiliation":[{"name":"Departamento de Inform\u00e1tica, Universidad T\u00e9cnica Federico Santa Mar\u00eda, Valpara\u00edso 1680, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4130-0010","authenticated-orcid":false,"given":"Carolina","family":"Saavedra","sequence":"additional","affiliation":[{"name":"Departamento de Inform\u00e1tica, Universidad T\u00e9cnica Federico Santa Mar\u00eda, Valpara\u00edso 1680, Chile"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,11,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ullah, S., Mahmood, Z., Ali, N., Ahmad, T., and Buriro, A. (2023). Machine Learning-Based Dynamic Attribute Selection Technique for DDoS Attack Classification in IoT Networks. 
Computers, 12.","DOI":"10.3390\/computers12060115"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kang, I.A., Njimbouom, S.N., and Kim, J.D. (2023). Optimal Feature Selection-Based Dental Caries Prediction Model Using Machine Learning for Decision Support System. Bioengineering, 10.","DOI":"10.20944\/preprints202301.0304.v1"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Kiratsoudis, S., and Tsiantos, V. (2024). Enhancing Personnel Selection through the Integration of the Entropy Synergy Analysis of Multi-Attribute Decision Making Model: A Novel Approach. Information, 15.","DOI":"10.3390\/info15010001"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"AL-Gburi, A.F.J., Nazri, M.Z.A., Yaakub, M.R.B., and Alyasseri, Z.A.A. (2024). Multi-Objective Unsupervised Feature Selection and Cluster Based on Symbiotic Organism Search. Algorithms, 17.","DOI":"10.3390\/a17080355"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"\u0110urasevi\u0107, M., Jakobovi\u0107, D., Picek, S., and Mariot, L. (2024). Assessing the Ability of Genetic Programming for Feature Selection in Constructing Dispatching Rules for Unrelated Machine Environments. Algorithms, 17.","DOI":"10.3390\/a17020067"},{"key":"ref_6","first-page":"589","article-title":"Consistent Feature Selection for Pattern Recognition in Polynomial Time","volume":"8","author":"Nilsson","year":"2007","journal-title":"J. Mach. Learn. Res."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/S0004-3702(97)00043-X","article-title":"Wrappers for feature subset selection","volume":"97","author":"Kohavi","year":"1997","journal-title":"Artif. Intell."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1016\/j.ins.2020.03.024","article-title":"All-relevant feature selection using multidimensional filters with exhaustive search","volume":"524","author":"Mnich","year":"2020","journal-title":"Inf. 
Sci."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Strobl, C., Boulesteix, A.L., Zeileis, A., and Hothorn, T. (2007). Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinform., 8.","DOI":"10.1186\/1471-2105-8-25"},{"key":"ref_11","first-page":"1399","article-title":"Ranking a Random Feature For Variable And Feature Selection","volume":"3","author":"Stoppiglia","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"492","DOI":"10.1093\/bib\/bbx124","article-title":"Evaluation of variable selection methods for random forests and omics data sets","volume":"20","author":"Degenhardt","year":"2019","journal-title":"Briefings Bioinform."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v036.i11","article-title":"Feature selection with the Boruta package","volume":"36","author":"Kursa","year":"2010","journal-title":"J. Stat. Softw."},{"key":"ref_14","unstructured":"Wetschoreck, F. (2024, November 07). RIP Correlation. Introducing the Predictive Power Score. Towards Data Science, Medium. Available online: https:\/\/towardsdatascience.com\/rip-correlation-introducing-the-predictive-power-score-3d90808b9598."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"341ps12","DOI":"10.1126\/scitranslmed.aaf5027","article-title":"What does research reproducibility mean?","volume":"8","author":"Goodman","year":"2016","journal-title":"Sci. Transl. Med."},{"key":"ref_16","unstructured":"Bouthillier, X., Laurent, C., and Vincent, P. (2019, January 9\u201315). Unreproducible Research is Reproducible. 
Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_17","first-page":"747","article-title":"Accounting for Variance in Machine Learning Benchmarks","volume":"3","author":"Bouthillier","year":"2021","journal-title":"Proc. Mach. Learn. Syst."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Sammut, C., and Webb, G.I. (2017). Online Controlled Experiments and A\/B Testing. Encyclopedia of Machine Learning and Data Mining, Springer.","DOI":"10.1007\/978-1-4899-7687-1"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Deng, A., Xu, Y., Kohavi, R., and Walker, T. (2013, January 4\u20138). Improving the sensitivity of online controlled experiments by utilizing pre-experiment data. Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, New York, NY, USA. WSDM\u201913.","DOI":"10.1145\/2433396.2433413"},{"key":"ref_20","unstructured":"Lundberg, S.M., and Lee, S.I. (2017, January 4\u20139). A unified approach to interpreting model predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems, Red Hook, NY, USA. NIPS\u201917."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Ribeiro, M.T., Singh, S., and Guestrin, C. (2016, January 13\u201317). \u201cWhy Should I Trust You?\u201d. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, San Francisco, CA, USA. Available online: https:\/\/arxiv.org\/pdf\/1602.04938.pdf.","DOI":"10.1145\/2939672.2939778"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Sokol, K., and Flach, P. (2020, January 27\u201330). Explainability fact sheets: A framework for systematic assessment of explainable approaches. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, New York, NY, USA. 
FAT*\u201920.","DOI":"10.1145\/3351095.3372870"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1096","DOI":"10.1038\/s41467-019-08987-4","article-title":"Unmasking Clever Hans predictors and assessing what machines really learn","volume":"10","author":"Lapuschkin","year":"2019","journal-title":"Nat. Commun."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Pfungst, O. (1911). Clever Hans (The Horse of Mr. von Osten): A Contribution to Experimental, Animal, and Human Psychology, Holt, Rinehart, and Winston.","DOI":"10.5962\/bhl.title.56164"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1038\/s42256-021-00307-0","article-title":"Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans","volume":"3","author":"Roberts","year":"2021","journal-title":"Nat. Mach. Intell."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1038\/s42256-021-00338-7","article-title":"AI for radiographic COVID-19 detection selects shortcuts over signal","volume":"3","author":"DeGrave","year":"2021","journal-title":"Nat. Mach. Intell."},{"key":"ref_27","unstructured":"Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2014, January 14\u201316). Intriguing properties of neural networks. Proceedings of the International Conference on Learning Representations (ICLR), Banff, AB, Canada."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1038\/s41562-023-01784-6","article-title":"Modelling dataset bias in machine-learned theories of economic decision-making","volume":"8","author":"Thomas","year":"2024","journal-title":"Nat. Hum. Behav."},{"key":"ref_29","first-page":"6638","article-title":"CatBoost: Unbiased boosting with categorical features","volume":"31","author":"Prokhorenkova","year":"2018","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"107043","DOI":"10.1016\/j.csda.2020.107043","article-title":"A new correlation coefficient between categorical, ordinal and interval variables with Pearson characteristics","volume":"152","author":"Baak","year":"2020","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Ross, B.C. (2014). Mutual information between discrete and continuous data sets. PLoS ONE, 9.","DOI":"10.1371\/journal.pone.0087357"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1518","DOI":"10.1126\/science.1205438","article-title":"Detecting novel associations in large data sets","volume":"334","author":"Reshef","year":"2011","journal-title":"Science"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"giy032","DOI":"10.1093\/gigascience\/giy032","article-title":"A practical tool for maximal information coefficient analysis","volume":"7","author":"Albanese","year":"2018","journal-title":"GigaScience"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1093\/bioinformatics\/bts707","article-title":"minerva and minepy: A C engine for the MINE suite and its R, Python and MATLAB wrappers","volume":"29","author":"Albanese","year":"2012","journal-title":"Bioinformatics"},{"key":"ref_35","unstructured":"Luedtke, A., and Tran, L.H. (2013). The Generalized Mean Information Coefficient. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Chen, Z., and Zhang, W. (2013). Integrative analysis using module-guided random forests reveals correlated genetic factors related to mouse weight. PLoS Comput. 
Biol., 9.","DOI":"10.1371\/journal.pcbi.1002956"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/17\/11\/518\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:29:24Z","timestamp":1760113764000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/17\/11\/518"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,9]]},"references-count":36,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,11]]}},"alternative-id":["a17110518"],"URL":"https:\/\/doi.org\/10.3390\/a17110518","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2024,11,9]]}}}