{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T14:49:02Z","timestamp":1775486942651,"version":"3.50.1"},"reference-count":79,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,1,4]],"date-time":"2021-01-04T00:00:00Z","timestamp":1609718400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2021,4,30]]},"abstract":"<jats:p>Dimensionality reduction is a commonly used technique in data analytics. Reducing the dimensionality of datasets helps not only with managing their analytical complexity but also with removing redundancy. Over the years, several such algorithms have been proposed with their aims ranging from generating simple linear projections to complex non-linear transformations of the input data. Subsequently, researchers have defined several quality metrics in order to evaluate the performances of different algorithms. Hence, given a plethora of dimensionality reduction algorithms and metrics for their quality analysis, there is a long-existing need for guidelines on how to select the most appropriate algorithm in a given scenario. In order to bridge this gap, in this article, we have compiled 12 state-of-the-art quality metrics and categorized them into 5 identified analytical contexts. Furthermore, we assessed 15 most popular dimensionality reduction algorithms on the chosen quality metrics using a large-scale and systematic experimental study. Later, using a set of robust non-parametric statistical tests, we assessed the generalizability of our evaluation on 40 real-world datasets. 
Finally, based on our results, we present practitioners\u2019 guidelines for the selection of an appropriate dimensionality reduction algorithm in the present analytical contexts.<\/jats:p>","DOI":"10.1145\/3428077","type":"journal-article","created":{"date-parts":[[2021,1,4]],"date-time":"2021-01-04T14:34:22Z","timestamp":1609770862000},"page":"1-40","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Context-Based Evaluation of Dimensionality Reduction Algorithms\u2014Experiments and Statistical Significance Analysis"],"prefix":"10.1145","volume":"15","author":[{"given":"Aindrila","family":"Ghosh","sequence":"first","affiliation":[{"name":"Electrical and Computer Engineering, University of Alberta, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mona","family":"Nashaat","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering, University of Alberta, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"James","family":"Miller","sequence":"additional","affiliation":[{"name":"Electrical and Computer Engineering, University of Alberta, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shaikh","family":"Quader","sequence":"additional","affiliation":[{"name":"IBM Canada Software Lab, Unionville, Ontario, IBM Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,1,4]]},"reference":[{"key":"e_1_2_2_1_1","first-page":"1","article-title":"Dimensionality reduction: A comparative review","volume":"10","author":"van der Maaten L.","year":"2008","journal-title":"J. Mach. Learn. 
Res."},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/2789272.2912091"},{"key":"e_1_2_2_3_1","volume-title":"Proceedings of the 8th International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD\u201902)","author":"Vlachos M."},{"key":"e_1_2_2_4_1","first-page":"3","article-title":"Fast feature selection using fractal dimension","volume":"1","author":"Jr C. T.","year":"2010","journal-title":"J. Inf. Data Manag."},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1038\/nbt.4314"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDMW.2016.0093"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2011.2162339"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1039\/C3AY41907J"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.21105\/joss.00861"},{"key":"e_1_2_2_10_1","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"van der Maaten L.","year":"2008","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2014.12.095"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2008.12.017"},{"key":"e_1_2_2_13_1","unstructured":"E. Amid and M. K. Warmuth. 2019. A more globally accurate dimensionality reduction method using triplets. arXiv:1803.00854. Retrieved April 8 2019 from http:\/\/arxiv.org\/abs\/1803.00854."},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2008.204"},{"key":"e_1_2_2_15_1","doi-asserted-by":"crossref","unstructured":"J. L. Su\u00e1rez S. Garc\u00eda and F. Herrera. 2020. A tutorial on distance metric learning: Mathematical foundations algorithms experimental analysis prospects and challenges. Neurocomputing. 
In press. https:\/\/doi.org\/10.1016\/j.neucom.2020.08.017","DOI":"10.1016\/j.neucom.2020.08.017"},{"key":"e_1_2_2_16_1","volume-title":"Robust dimensionality reduction via feature space to feature space distance metric learning. Neural Netw. 112 (Apr","author":"Li B.","year":"2019"},{"key":"e_1_2_2_17_1","volume-title":"Proceedings of the Safe Machine Learning Workshop at ICLR. 7.","author":"Bibal A."},{"key":"e_1_2_2_18_1","doi-asserted-by":"crossref","unstructured":"T. Hastie R. Tibshirani and J. Friedman. 2001. The Elements of Statistical Learning (2nd ed.). Springer New York NY.","DOI":"10.1007\/978-0-387-21606-5"},{"key":"e_1_2_2_19_1","volume-title":"Proceeding of the 33rd International Conference on Software Engineering (ICSE\u201911)","author":"Arcuri A.","year":"1985"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/1248547.1248548"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2010.04.013"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.5555\/3227223.3227408"},{"key":"e_1_2_2_23_1","volume-title":"Retrieved","author":"Johannemann J.","year":"2019"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.visinf.2018.12.004"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/3301294"},{"key":"e_1_2_2_26_1","volume-title":"Proceedings of the 25th International Conference on World Wide Web (WWW\u201916)","author":"Tang 
J."},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2016.06.004"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41592-018-0308-4"},{"key":"e_1_2_2_29_1","volume-title":"Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV\u201905)","volume":"1","author":"He Xiaofei","year":"2005"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/375551.383213"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2745258"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1186\/s12874-019-0737-5"},{"key":"e_1_2_2_33_1","unstructured":"C. O. S. Sorzano J. Vargas and A. Pascual. A survey of dimensionality reduction techniques. arXiv:1403.2877."},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2008.08.003"},{"key":"e_1_2_2_35_1","doi-asserted-by":"crossref","unstructured":"B. Rieck and H. Leitte. 2017. Agreement analysis of quality measures for dimensionality reduction. In Topological Methods in Data Analysis and Visualization IV H. Carr C. Garth and T. Weinkauf (Eds.). Springer International Publishing Cham 103--117.","DOI":"10.1007\/978-3-319-44684-4_6"},{"key":"e_1_2_2_36_1","doi-asserted-by":"crossref","unstructured":"J. A. Lee E. Renard G. Bernard P. Dupont and M. Verleysen. 2013. 
Type 1 and 2 mixtures of Kullback--Leibler divergences as cost functions in dimensionality reduction based on similarity preservation. Neurocomputing 112 (2013) 92--108. DOI:https:\/\/doi.org\/10.1016\/j.neucom.2012.12.036","DOI":"10.1016\/j.neucom.2012.12.036"},{"key":"e_1_2_2_37_1","unstructured":"J. Goldberger G. E. Hinton S. T. Roweis and R. R. Salakhutdinov. 2005. Neighbourhood components analysis. In Advances in Neural Information Processing Systems. MIT Press 513--520."},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICMLA.2011.55"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1186\/s41044-017-0025-5"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jesp.2013.03.013"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2003.819861"},{"key":"e_1_2_2_42_1","unstructured":"M. Gashler D. Ventura and T. Martinez. 2008. Iterative non-linear dimensionality reduction by manifold sculpting. In Advances in Neural Information Processing Systems. 
MIT Press 513--520."},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2018.2842019"},{"key":"e_1_2_2_44_1","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics","volume":"1","author":"Dror R."},{"key":"e_1_2_2_45_1","first-page":"396","article-title":"Error of the normal approximation to the sum of N random variables","volume":"58","author":"Sherman R.","year":"1971","journal-title":"Biometrika"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1080\/14786440109462720"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF02289694"},{"key":"e_1_2_2_48_1","volume-title":"Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI\u201910)","author":"Yang Y."},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1162\/089976698300017467"},{"key":"e_1_2_2_50_1","volume-title":"Github Repository. Retrieved","author":"Ulyanov D.","year":"2016"},{"key":"e_1_2_2_51_1","volume-title":"Pac. Symp. Biocomput. 24","author":"Hu Q.","year":"2019"},{"key":"e_1_2_2_52_1","unstructured":"E. Levina and P. J. Bickel. 2005. Maximum likelihood estimation of intrinsic dimension. In Advances in Neural Information Processing Systems. MIT Press 777--784."},{"key":"e_1_2_2_53_1","volume-title":"Advances in Neural Information Processing Systems","volume":"14","author":"Xing E. P."},{"key":"e_1_2_2_54_1","volume-title":"An Overview of Distance Metric Learning. Technical report. School of Computer Science","author":"Yang L."},{"key":"e_1_2_2_55_1","unstructured":"L. Yang and R. Jin. 2006. Distance metric learning: A comprehensive survey. 
Michigan State University."},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/TGRS.2016.2645703"},{"key":"e_1_2_2_57_1","volume-title":"Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics (HILDA\u201917)","author":"Wenskovitch J."},{"key":"e_1_2_2_58_1","unstructured":"D. Dheeru and G. Casey. 2017. UCI Machine Learning Repository. Retrieved from http:\/\/archive.ics.uci.edu\/ml."},{"key":"e_1_2_2_59_1","volume-title":"Retrieved","year":"2019"},{"key":"e_1_2_2_60_1","volume-title":"Retrieved","author":"Bischl B.","year":"2019"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2744098"},{"key":"e_1_2_2_62_1","doi-asserted-by":"publisher","DOI":"10.1109\/CIDM.2011.5949443"},{"key":"e_1_2_2_63_1","volume-title":"Biostatistics: A Foundation for Analysis in the Health Sciences","author":"Daniel W. W.","year":"2018","edition":"10"},{"key":"e_1_2_2_64_1","volume-title":"Retrieved","author":"Efraimidis P. S.","year":"2019"},{"key":"e_1_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.5555\/1855075"},{"key":"e_1_2_2_66_1","doi-asserted-by":"publisher","DOI":"10.5555\/1182635.1164180"},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1177\/002224376500200107"},{"key":"e_1_2_2_68_1","first-page":"2677","article-title":"An extension on \u201cstatistical comparisons of classifiers over multiple data sets\u201d for all pairwise comparisons","volume":"9","author":"Garc\u00eda S.","year":"2008","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_2_2_69_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2009.12.010"},{"key":"e_1_2_2_70_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015338"},{"key":"e_1_2_2_71_1","unstructured":"V. D. Silva and J. B. Tenenbaum. 2003. Global versus local methods in nonlinear dimensionality reduction. 
In Advances in Neural Information Processing Systems. MIT Press 721--728."},{"key":"e_1_2_2_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2010.10.011"},{"key":"e_1_2_2_73_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0031-3203(03)00176-6"},{"key":"e_1_2_2_74_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2015.08.029"},{"key":"e_1_2_2_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00411"},{"key":"e_1_2_2_76_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2701829"},{"key":"e_1_2_2_77_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2019.01.001"},{"key":"e_1_2_2_78_1","volume-title":"Proceedings of the 22nd International Conference on Software Engineering & Knowledge Engineering (SEKE\u201910)","author":"Feldt R."},{"key":"e_1_2_2_79_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2008.12.009"}],"container-title":["ACM Transactions on Knowledge Discovery from 
Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3428077","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3428077","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:24:23Z","timestamp":1750195463000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3428077"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1,4]]},"references-count":79,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,4,30]]}},"alternative-id":["10.1145\/3428077"],"URL":"https:\/\/doi.org\/10.1145\/3428077","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1,4]]},"assertion":[{"value":"2019-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-01-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}