{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,21]],"date-time":"2026-05-21T09:04:25Z","timestamp":1779354265480,"version":"3.51.4"},"reference-count":94,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2025,5,6]],"date-time":"2025-05-06T00:00:00Z","timestamp":1746489600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>This research paper focuses on dimensionality reduction, which is a major subproblem in any data processing operation. Dimensionality reduction based on principal components is the most used methodology. Our paper examines three heuristics, namely Kaiser\u2019s rule, the broken stick, and the conditional number rule, for selecting informative principal components when using principal component analysis to reduce high-dimensional data to lower dimensions. This study uses 22 classification datasets and three classifiers, namely Fisher\u2019s discriminant classifier, logistic regression, and K nearest neighbors, to test the effectiveness of the three heuristics. The results show that there is no universal answer to the best intrinsic dimension, but the conditional number heuristic performs better, on average. 
This means that the conditional number heuristic is the best candidate for automatic data pre-processing.<\/jats:p>","DOI":"10.3390\/data10050070","type":"journal-article","created":{"date-parts":[[2025,5,7]],"date-time":"2025-05-07T05:18:20Z","timestamp":1746595100000},"page":"70","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Linear Dimensionality Reduction: What Is Better?"],"prefix":"10.3390","volume":"10","author":[{"given":"Mohit","family":"Baliyan","sequence":"first","affiliation":[{"name":"School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1474-1734","authenticated-orcid":false,"given":"Evgeny M.","family":"Mirkes","sequence":"additional","affiliation":[{"name":"School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,5,6]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"van der Ploeg, T., Austin, P.C., and Steyerberg, E.W. (2014). Modern modelling techniques are data hungry: A simulation study for predicting dichotomous endpoints. BMC Med. Res. 
Methodol., 14.","DOI":"10.1186\/1471-2288-14-137"},{"key":"ref_2","first-page":"32","article-title":"High-dimensional data analysis: The curses and blessings of dimensionality","volume":"1","author":"Donoho","year":"2000","journal-title":"AMS Math Challenges Lect."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1016\/j.neunet.2021.01.034","article-title":"General stochastic separation theorems with optimal bounds","volume":"138","author":"Grechuk","year":"2021","journal-title":"Neural Netw."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Gorban, A.N., Grechuk, B., Mirkes, E.M., Stasenko, S.V., and Tyukin, I.Y. (2021). High-dimensional separability for one-and few-shot learning. Entropy, 23.","DOI":"10.20944\/preprints202106.0718.v1"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Mirkes, E.M., Allohibi, J., and Gorban, A. (2020). Fractional norms and quasinorms do not help to overcome the curse of dimensionality. Entropy, 22.","DOI":"10.3390\/e22101105"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Bac, J., Mirkes, E.M., Gorban, A.N., Tyukin, I., and Zinovyev, A. (2021). Scikit-dimension: A python package for intrinsic dimension estimation. Entropy, 23.","DOI":"10.3390\/e23101368"},{"key":"ref_7","unstructured":"Jiang, H., Kim, B., Guan, M., and Gupta, M. (2018, January 3\u20138). To trust or not to trust a classifier. Proceedings of the Advances in Neural Information Processing Systems, Montr\u00e9al, Canada."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Bac, J., and Zinovyev, A. (2020). Lizard brain: Tackling locally low-dimensional yet globally complex organization of multi-dimensional datasets. Front. 
Neurorobotics, 13.","DOI":"10.3389\/fnbot.2019.00110"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"329","DOI":"10.32614\/RJ-2017-054","article-title":"ider: Intrinsic Dimension Estimation with R","volume":"9","author":"Hino","year":"2017","journal-title":"R J."},{"key":"ref_10","first-page":"13","article-title":"Dimensionality reduction: A comparative review","volume":"10","author":"Postma","year":"2009","journal-title":"J. Mach. Learn. Res."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Grassberger, P., and Procaccia, I. (2004). Measuring the strangeness of strange attractors. The theory of Chaotic Attractors, Springer.","DOI":"10.1007\/978-0-387-21830-4_12"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Farahmand, A.M., Szepesv\u00e1ri, C., and Audibert, J.Y. (2007, January 20\u201324). Manifold-adaptive dimension estimation. Proceedings of the 24th International Conference on Machine Learning, Corvalis, OR, USA.","DOI":"10.1145\/1273496.1273530"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1768","DOI":"10.1007\/s10618-018-0578-6","article-title":"Extreme-value-theoretic estimation of local intrinsic dimensionality","volume":"32","author":"Amsaleg","year":"2018","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_15","first-page":"119","article-title":"Statistical inference using extreme order statistics","volume":"3","author":"Pickands","year":"1975","journal-title":"Ann. Statist."},{"key":"ref_16","unstructured":"Levina, E., and Bickel, P.J. (2003, January 9\u201311). Maximum Likelihood Estimation of Intrinsic Dimension. 
Proceedings of the 17th International Conference on Neural Information Processing Systems, Cambridge, MA, USA."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"358","DOI":"10.1007\/s11263-008-0144-6","article-title":"Translated poisson mixture model for stratification learning","volume":"80","author":"Haro","year":"2008","journal-title":"Int. J. Comput. Vis."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Albergante, L., Bac, J., and Zinovyev, A. (2019, January 14\u201319). Estimating the effective dimension of large biological datasets using Fisher separability analysis. Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.","DOI":"10.1109\/IJCNN.2019.8852450"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1007\/s10994-012-5294-7","article-title":"Novel high intrinsic dimensionality estimators","volume":"89","author":"Rozza","year":"2012","journal-title":"Mach. Learn."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2569","DOI":"10.1016\/j.patcog.2014.02.013","article-title":"DANCo: An intrinsic dimensionality estimator exploiting angle and norm concentration","volume":"47","author":"Ceruti","year":"2014","journal-title":"Pattern Recognit."},{"key":"ref_21","unstructured":"Johnsson, K. (2016). Structures in High-Dimensional Data: Intrinsic Dimension and Cluster Analysis, Centre for Mathematical Sciences, Lund University."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Facco, E., D\u2019Errico, M., Rodriguez, A., and Laio, A. (2017). Estimating the intrinsic dimension of datasets by a minimal neighborhood information. Sci. Rep., 7.","DOI":"10.1038\/s41598-017-11873-y"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1016\/j.ins.2018.07.040","article-title":"Correction of AI systems by linear discriminants: Probabilistic foundations","volume":"466","author":"Gorban","year":"2018","journal-title":"Inf. 
Sci."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Amsaleg, L., Chelly, O., Houle, M.E., Kawarabayashi, K.I., Radovanovi\u0107, M., and Treeratanajaru, W. (2019, January 2\u20134). Intrinsic dimensionality estimation within tight localities. Proceedings of the 2019 SIAM International Conference on Data Mining, Calgary, AB, Canada.","DOI":"10.1137\/1.9781611975673.21"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"650","DOI":"10.1109\/TSP.2009.2031722","article-title":"On local intrinsic dimension estimation and its applications","volume":"58","author":"Carter","year":"2009","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"759567","DOI":"10.1155\/2015\/759567","article-title":"Intrinsic dimension estimation: Relevant techniques and a benchmark framework","volume":"2015","author":"Campadelli","year":"2015","journal-title":"Math. Probl. Eng."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.ins.2015.08.029","article-title":"Intrinsic dimension estimation: Advances and open problems","volume":"328","author":"Camastra","year":"2016","journal-title":"Inf. Sci."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"067401","DOI":"10.1103\/PhysRevLett.130.067401","article-title":"Intrinsic dimension estimation for discrete metrics","volume":"130","author":"Macocco","year":"2023","journal-title":"Phys. Rev. Lett."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1080\/14786440109462720","article-title":"LIII. On lines and planes of closest fit to systems of points in space","volume":"2","author":"Pearson","year":"1901","journal-title":"Lond. Edinb. Dublin Philos. Mag. J. 
Sci."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1357","DOI":"10.1016\/j.patcog.2010.12.015","article-title":"Supervised principal component analysis: Visualization, classification and regression on subspaces and submanifolds","volume":"44","author":"Barshan","year":"2011","journal-title":"Pattern Recognit."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1109\/TVCG.2004.17","article-title":"Robust linear dimensionality reduction","volume":"10","author":"Koren","year":"2004","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"2789","DOI":"10.1016\/j.patcog.2008.01.001","article-title":"A unified framework for semi-supervised dimensionality reduction","volume":"41","author":"Song","year":"2008","journal-title":"Pattern Recognit."},{"key":"ref_33","unstructured":"Gorban, A., Mirkes, E., and Zinovyev, A. (2023, January 21). Supervised PCA. Available online: https:\/\/github.com\/Mirkes\/SupervisedPCA."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1198\/016214505000000628","article-title":"Prediction by supervised principal components","volume":"101","author":"Bair","year":"2006","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Mirkes, E.M., Bac, J., Fouch\u00e9, A., Stasenko, S.V., Zinovyev, A., and Gorban, A.N. (2023). Domain Adaptation Principal Component Analysis: Base linear method for learning with out-of-distribution data. Entropy, 25.","DOI":"10.3390\/e25010033"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1198\/106186006X113430","article-title":"Sparse principal component analysis","volume":"15","author":"Zou","year":"2006","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Gorban, A.N., K\u00e9gl, B., Wunsch, D.C., and Zinovyev, A.Y. (2008). 
Principal Manifolds for Data Visualization and Dimension Reduction, Springer.","DOI":"10.1007\/978-3-540-73750-6"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Gu, Q., Li, Z., and Han, J. (2011, January 5\u20139). Linear discriminant dimensionality reduction. Proceedings of the Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2011, Athens, Greece.","DOI":"10.1007\/978-3-642-23780-5_45"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1007\/BF02289162","article-title":"Some necessary conditions for common-factor analysis","volume":"19","author":"Guttman","year":"1954","journal-title":"Psychometrika"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1177\/001316446002000116","article-title":"The application of electronic computers to factor analysis","volume":"20","author":"Kaiser","year":"1960","journal-title":"Educ. Psychol. Meas."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"2204","DOI":"10.2307\/1939574","article-title":"Stopping rules in principal components analysis: A comparison of heuristical and statistical approaches","volume":"74","author":"Jackson","year":"1993","journal-title":"Ecology"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1109\/T-C.1971.223208","article-title":"An algorithm for finding intrinsic dimensionality of data","volume":"C-20","author":"Fukunaga","year":"1971","journal-title":"IEEE Trans. Comput."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/j.inffus.2020.01.005","article-title":"Overview and comparative study of dimensionality reduction techniques for high dimensional data","volume":"59","author":"Ayesha","year":"2020","journal-title":"Inf. Fusion"},{"key":"ref_44","unstructured":"Tang, B., Shepherd, M., Milios, E., and Heywood, M.I. (2005, January 21\u201323). Comparing and combining dimension reduction techniques for efficient text clustering. 
Proceedings of the SIAM International Workshop on Feature Selection for Data Mining, Newport Beach, CA, USA."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Konstorum, A., Jekel, N., Vidal, E., and Laubenbacher, R. (2018). Comparative analysis of linear and nonlinear dimension reduction techniques on mass cytometry data. BioRxiv.","DOI":"10.1101\/273862"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"931","DOI":"10.1080\/00207720500381573","article-title":"Analysis of linear and nonlinear dimensionality reduction methods for gender classification of face images","volume":"36","author":"Buchala","year":"2005","journal-title":"Int. J. Syst. Sci."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Fodor, I.K. (2002). A Survey of Dimension Reduction Techniques, Lawrence Livermore National Lab. (LLNL). Technical Report.","DOI":"10.2172\/15002155"},{"key":"ref_48","unstructured":"Deegalla, S., Bostr\u00f6m, H., and Walgama, K. (2012, January 9\u201312). Choice of dimensionality reduction methods for feature and classifier fusion with nearest neighbor classifiers. Proceedings of the 2012 15th International Conference on Information Fusion, Singapore."},{"key":"ref_49","first-page":"2","article-title":"Determining the number of factors to retain in EFA: An easy-to-use computer program for carrying out parallel analysis","volume":"12","author":"Ledesma","year":"2007","journal-title":"Pract. Assess. Res. Eval."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Cangelosi, R., and Goriely, A. (2007). Component retention in principal component analysis with application to cDNA microarray data. Biol. Direct, 2.","DOI":"10.1186\/1745-6150-2-2"},{"key":"ref_51","unstructured":"Kelly, M., Longjohn, R., and Nottingham, K. (2025, May 01). The UCI Machine Learning Repository. Available online: http:\/\/archive.ics.uci.edu."},{"key":"ref_52","unstructured":"(2025, May 01). Banknote Authentication. 
Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/banknote+authentication."},{"key":"ref_53","unstructured":"(2025, May 01). Blood Transfusion Service Center. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Blood+Transfusion+Service+Center."},{"key":"ref_54","unstructured":"(2025, May 01). Breast Cancer Wisconsin (Diagnostic). Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Breast+Cancer+Wisconsin+%28Diagnostic%29."},{"key":"ref_55","unstructured":"(2025, May 01). Climate Model Simulation Crashes. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Climate+Model+Simulation+Crashes."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1016\/j.compbiomed.2017.01.001","article-title":"An expert system for selecting wart treatment method","volume":"81","author":"Khozeimeh","year":"2017","journal-title":"Comput. Biol. Med."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"474","DOI":"10.1111\/ijd.13535","article-title":"Intralesional immunotherapy compared to cryotherapy in the treatment of warts","volume":"56","author":"Khozeimeh","year":"2017","journal-title":"Int. J. Dermatol."},{"key":"ref_58","unstructured":"(2025, May 01). Cryotherapy. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Cryotherapy+Dataset."},{"key":"ref_59","unstructured":"(2025, May 01). Diabetic Retinopathy Debrecen Data Set. Available online: https:\/\/archive.ics.uci.edu\/dataset\/329\/diabetic+retinopathy+debrecen."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.knosys.2013.12.023","article-title":"An ensemble-based system for automatic screening of diabetic retinopathy","volume":"60","author":"Antal","year":"2014","journal-title":"Knowl.-Based Syst."},{"key":"ref_61","unstructured":"(2025, May 01). EEG Eye State. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/EEG+Eye+State#."},{"key":"ref_62","unstructured":"(2025, May 01). HTRU2. 
Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/HTRU2."},{"key":"ref_63","unstructured":"Lyon, R.J. (2025, May 01). HTRU2. Available online: https:\/\/figshare.com\/articles\/dataset\/HTRU2\/3080389\/1?file=4787626."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"1104","DOI":"10.1093\/mnras\/stw656","article-title":"Fifty years of pulsar candidate selection: From simple filters to a new principled real-time classification approach","volume":"459","author":"Lyon","year":"2016","journal-title":"Mon. Not. R. Astron. Soc."},{"key":"ref_65","unstructured":"(2025, May 01). Immunotherapy. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Immunotherapy+Dataset."},{"key":"ref_66","unstructured":"(2025, May 01). Indian Liver Patient. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/ILPD+%28Indian+Liver+Patient+Dataset%29."},{"key":"ref_67","unstructured":"Saul, L.K., Weiss, Y., and Bottou, L. (2005). Result analysis of the NIPS 2003 feature selection challenge. Proceedings of the Advances in Neural Information Processing Systems 17, MIT Press. Available online: https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2004\/file\/5e751896e527c862bf67251a474b3819-Paper.pdf."},{"key":"ref_68","unstructured":"(2025, May 01). Madelon. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Madelon."},{"key":"ref_69","unstructured":"(2025, May 01). MiniBooNE Particle Identification. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/MiniBooNE+particle+identification."},{"key":"ref_70","unstructured":"(2025, May 01). Musk 1 and 2. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Musk+%28Version+1%29."},{"key":"ref_71","unstructured":"Bhatt, R. (2025, May 01). Planning-Relax Dataset for Automatic Classification of EEG Signals. 
Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Planning+Relax."},{"key":"ref_72","doi-asserted-by":"crossref","first-page":"867","DOI":"10.1021\/ci4000213","article-title":"Quantitative structure\u2013activity relationship models for ready biodegradability of chemicals","volume":"53","author":"Mansouri","year":"2013","journal-title":"J. Chem. Inf. Model"},{"key":"ref_73","unstructured":"(2025, May 01). QSAR Biodegradation. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/QSAR+biodegradation."},{"key":"ref_74","unstructured":"Bhatt, R., and Dhall, A. (2025, May 01). Skin Segmentation. UCI Machine Learning Repository. Available online: https:\/\/archive.ics.uci.edu\/dataset\/229\/skin+segmentation."},{"key":"ref_75","unstructured":"(2025, May 01). Connectionist Bench (Sonar Mines vs. Rocks). Available online: https:\/\/http:\/\/archive.ics.uci.edu\/ml\/datasets\/connectionist+bench+%28sonar,+mines+vs%2E+rocks%29."},{"key":"ref_76","unstructured":"(2025, May 01). SPECTF Heart. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/SPECTF+Heart."},{"key":"ref_77","unstructured":"(2025, May 01). MAGIC Gamma Telescope. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/MAGIC+Gamma+Telescope."},{"key":"ref_78","unstructured":"(2025, May 01). Vertebral Column. Available online: https:\/\/archive.ics.uci.edu\/ml\/datasets\/Vertebral+Column."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1037\/0033-2909.99.3.432","article-title":"Comparison of five rules for determining the number of components to retain","volume":"99","author":"Zwick","year":"1986","journal-title":"Psychol. Bull."},{"key":"ref_80","unstructured":"Belsley, D.A., Kuh, E., and Welsch, R.E. (2005). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, John Wiley & Sons."},{"key":"ref_81","doi-asserted-by":"crossref","unstructured":"Student (1908). The probable error of a mean. 
Biometrika, 6, 1\u201325.","DOI":"10.2307\/2331554"},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Wilcoxon, F. (1992). Individual comparisons by ranking methods. Breakthroughs in Statistics: Methodology and Distribution, Springer.","DOI":"10.1007\/978-1-4612-4380-9_16"},{"key":"ref_83","unstructured":"Sager, T. (2010). Kolmogorov\u2013Smirnov Test. Encycl. Res. Des., 664\u2013668."},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Miller, R.G., and Miller, R.G. (1981). Normal univariate techniques. Simultaneous Statistical Inference, Springer.","DOI":"10.1007\/978-1-4613-8122-8"},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Clarkson, K.L. (2006). Nearest-Neighbor Searching and Metric Space Dimensions. Nearest-Neighbor Methods in Learning and Vision, The MIT Press.","DOI":"10.7551\/mitpress\/4908.003.0005"},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Hosmer, D.W., and Lemeshow, S. (2000). Applied Logistic Regression, John Wiley & Sons.","DOI":"10.1002\/0471722146"},{"key":"ref_87","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1111\/j.1469-1809.1936.tb02137.x","article-title":"The Use of Multiple Measurements in Taxonomic Problems","volume":"7","author":"Fisher","year":"1936","journal-title":"Ann. Eugen."},{"key":"ref_88","unstructured":"(2024, June 20). Confusion Matrix. Available online: https:\/\/en.wikipedia.org\/wiki\/Confusion_matrix#Table_of_confusion."},{"key":"ref_89","doi-asserted-by":"crossref","unstructured":"Devroye, L., Gy\u00f6rfi, L., and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition, Springer.","DOI":"10.1007\/978-1-4612-0711-5"},{"key":"ref_90","unstructured":"Baliyan, M., and Mirkes, E. (2025, May 01). Linear dimensionality reduction: Data and code. Available online: https:\/\/github.com\/mohit-baliyan\/Linear-Dimensionality-Reduction\/."},{"key":"ref_91","unstructured":"(2025, May 01). scipy.stats.rankdata in SciPy Documentation 1.14.0. 
Available online: https:\/\/docs.scipy.org\/doc\/scipy\/reference\/generated\/scipy.stats.rankdata.html."},{"key":"ref_92","unstructured":"Dodge, Y. (2008). The Concise Encyclopedia of Statistics, Springer Science & Business Media."},{"key":"ref_93","unstructured":"Gujarati, D., and Porter, D. (2003). Multicollinearity: What happens if the regressors are correlated. Basic Econometrics, McGraw-Hill."},{"key":"ref_94","unstructured":"Tikhonov, A.N., and Arsenin, V.Y. (1977). Solutions of Ill-Posed Problems, Winston."}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/5\/70\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:28:16Z","timestamp":1760030896000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/5\/70"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,6]]},"references-count":94,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2025,5]]}},"alternative-id":["data10050070"],"URL":"https:\/\/doi.org\/10.3390\/data10050070","relation":{},"ISSN":["2306-5729"],"issn-type":[{"value":"2306-5729","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,6]]}}}