{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T15:52:13Z","timestamp":1778082733029,"version":"3.51.4"},"reference-count":192,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2018,1,23]],"date-time":"2018-01-23T00:00:00Z","timestamp":1516665600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2018,4,30]]},"abstract":"<jats:p>Technological advancement has enabled us to store and process huge amounts of data in relatively short spans of time. The nature of data is rapidly changing; in particular, it is increasingly multi- and high-dimensional. There is an immediate need to expand our focus to include analysis of high-dimensional and large datasets. Data analysis is becoming a mammoth task, due to the incremental increase in data volume and in complexity in terms of the heterogeneity of data. Because of this dynamic computing environment, existing techniques need to be either modified or discarded to handle new data in multiple high dimensions. Data clustering is a tool that is used in many disciplines, including data mining, so that meaningful knowledge can be extracted from seemingly unstructured data. The aim of this article is to understand the problem of clustering and the various approaches addressing this problem. This article discusses the process of clustering from both microviews (data treating) and macroviews (overall clustering process). Different distance and similarity measures, which form the cornerstone of effective data clustering, are also identified. 
Further, an in-depth analysis of different clustering approaches that focus on data mining and deal with large-scale datasets is given. These approaches are comprehensively compared to bring out a clear differentiation among them. This article also surveys the problem of high-dimensional data and the existing approaches, which makes it more relevant. It also explores the latest trends in cluster analysis and the real-life applications of this concept. This survey is exhaustive, as it tries to cover all aspects of clustering in the field of data mining.<\/jats:p>","DOI":"10.1145\/3132088","type":"journal-article","created":{"date-parts":[[2018,1,23]],"date-time":"2018-01-23T13:26:43Z","timestamp":1516714003000},"page":"1-68","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":69,"title":["Systematic Review of Clustering High-Dimensional and Large Datasets"],"prefix":"10.1145","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8694-1538","authenticated-orcid":false,"given":"Divya","family":"Pandove","sequence":"first","affiliation":[{"name":"Thapar University, Patiala, Punjab, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shivan","family":"Goel","sequence":"additional","affiliation":[{"name":"Bennett University, Greater Noida, U.P., India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rinkl","family":"Rani","sequence":"additional","affiliation":[{"name":"Thapar University, Patiala, Punjab, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,1,23]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/SSDBM.2007.21"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972771.37"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/SSDBM.2006.35"},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of the 29th International 
Conference on Very Large Data Bases-Volume 29 (VLDB\u201903)","author":"Aggarwal Charu C."},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the International Conference on Extending Database Technology, Advances in Database Technology (EDBT\u201904)","author":"Aggarwal Charu C."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/304182.304188"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/335191.335383"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/69.991713"},{"key":"e_1_2_1_9_1","volume-title":"Aggarwal and ChengXiang Zhai","author":"Charu","year":"2012"},{"key":"e_1_2_1_10_1","volume-title":"Aggarwal and ChengXiang Zhai","author":"Charu","year":"2012"},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","volume-title":"Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications","author":"Agrawal Rakesh","DOI":"10.1145\/276304.276314"},{"key":"e_1_2_1_12_1","volume-title":"Dimitrios Gunopulos, and Prabhakar Raghavan.","author":"Agrawal Rakesh","year":"1999"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10791-008-9066-8"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-014-1416-y"},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Rajaraman Anand and D. U. Jeffrey. 2012. 
Mining of Massive Datasets.","DOI":"10.1017\/CBO9781139058452"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCIMA.2007.127"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/CONFLUENCE.2014.6949256"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1002\/widm.1062"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2007.49"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR\u201914)","author":"Ayed Abdelkarim Ben"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1016\/0893-6080(89)90014-2"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/347090.347145"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic (NIPS\u201901)","volume":"14","author":"Belkin Mikhail","year":"2001"},{"key":"e_1_2_1_24_1","first-page":"548","article-title":"Survey of text mining","volume":"45","author":"Berry Michael W.","year":"2004","journal-title":"Computing Reviews"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.5555\/645503.656271"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.865189"},{"key":"e_1_2_1_27_1","volume-title":"Neural Networks for Pattern Recognition","author":"Bishop Christopher M."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/21.97475"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007568.1007620"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2008.03.002"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1504\/IJSSC.2011.040339"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-39658-1_52"},{"key":"e_1_2_1_33_1","volume-title":"Proceedings of the Conference on Data Mining and Data Warehouses 
(SiKDD\u201905)","author":"Brank Janez","year":"2005"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.199"},{"key":"e_1_2_1_35_1","volume-title":"Optimization and Operations Research","author":"Brucker Peter"},{"key":"e_1_2_1_36_1","volume-title":"The Handbook of Brain Theory and Neural Networks, Michael A","author":"Buhmann Joachim"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1016\/0893-6080(91)90056-B"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1090\/conm\/588\/11712"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 26th VLDB Conference. 89--100","author":"Chakrabarti Kaushik","year":"2000"},{"key":"e_1_2_1_40_1","volume-title":"Astrostatistical Challenges for the New Astronomy","author":"Chattopadhyay Asis Kumar"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2014.01.015"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11036-013-0489-0"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.5555\/1287369.1287398"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/312129.312199"},{"key":"e_1_2_1_45_1","first-page":"93","article-title":"Biclustering of expression data","volume":"8","author":"Cheng Yizong","year":"2000","journal-title":"ISMB"},{"key":"e_1_2_1_46_1","volume-title":"Mulier","author":"Cherkassky Vladimir","year":"2007"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2751521"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1162\/15324430152733142"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277784"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_2_1_51_1","volume-title":"Data Clustering: Algorithms and Applications, Charu C","author":"Deng 
Hongbo"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/502512.502550"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015330.1015408"},{"key":"e_1_2_1_54_1","volume-title":"Proceedings of the 2002 IEEE International Conference on Data Mining (ICDM\u201902)","author":"Ding Chris"},{"key":"e_1_2_1_55_1","volume-title":"Proceedings of the IEEE International Conference on Data Mining (ICDM\u201901)","author":"Ding Chris H. Q."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2012.57"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1038\/nbt1406"},{"key":"e_1_2_1_58_1","volume-title":"Handbook of Pattern Recognition & Computer Vision","author":"Dubes Richard C."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.72.027104"},{"key":"e_1_2_1_60_1","volume-title":"Stork","author":"Duda Richard O.","year":"2001"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.4153\/CJM-1965-045-4"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.95.25.14863"},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the 30th International Conference on Machine Learning (ICML\u201913)","author":"Ermon Stefano","year":"2013"},{"key":"e_1_2_1_64_1","first-page":"226","article-title":"A density-based algorithm for discovering clusters in large spatial databases with noise","volume":"96","author":"Ester Martin","year":"1996","journal-title":"Kdd"},{"key":"e_1_2_1_65_1","doi-asserted-by":"crossref","volume-title":"An Introduction to Applied Multivariate Analysis with R","author":"Everitt 
Brian","DOI":"10.1007\/978-1-4419-9650-3"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/TETC.2014.2330519"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1080\/15427951.2004.10129093"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/41.8.578"},{"key":"e_1_2_1_69_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2013.05.040"},{"key":"e_1_2_1_70_1","first-page":"166","article-title":"Hybrid clustering algorithm with modifications enhanced K-means and hierarchal clustering","volume":"3","author":"Galluccio Laurent","year":"2013","journal-title":"International Journal of Advanced Research in Computer Science and Software Engineering"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1145\/2723372.2737792"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.laa.2012.10.051"},{"key":"e_1_2_1_73_1","volume-title":"German-Japanese Interchange of Data Analysis Results, Wolfgang Gaul, Andreas Geyer-Schulz","author":"Geyer-Schulz Andreas"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1016\/0031-3203(91)90022-W"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.5555\/795666.796588"},{"key":"e_1_2_1_76_1","doi-asserted-by":"publisher","DOI":"10.1145\/276304.276312"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2016.2522412"},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1012801612483"},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1145\/584792.584890"},{"key":"e_1_2_1_80_1","volume-title":"Data Mining: Concepts and Techniques","author":"Han 
Jiawei","year":"2011"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2014.07.006"},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1109\/91.873580"},{"key":"e_1_2_1_83_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-013-3158-3"},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2006.05.006"},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1145\/1148170.1148222"},{"key":"e_1_2_1_86_1","volume-title":"Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD\u201998)","volume":"98","author":"Hinneburg Alexander"},{"key":"e_1_2_1_87_1","volume-title":"Proceedings of the 25th VLDB Conference.","author":"Hinneburg Alexander"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2014.2337335"},{"key":"e_1_2_1_89_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDEW.2010.5452747"},{"key":"e_1_2_1_90_1","first-page":"94","article-title":"Survey on independent component analysis","volume":"2","author":"Hyvarinen Aapo","year":"1999","journal-title":"Neural Computing Surveys"},{"key":"e_1_2_1_91_1","volume-title":"Dubes","author":"Jain Anil K.","year":"1988"},{"key":"e_1_2_1_92_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.824819"},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.1145\/331499.331504"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.1145\/775047.775126"},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1109\/TITB.2007.900808"},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.1137\/1.9781611972740.23"},{"key":"e_1_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.1109\/2.781637"},{"key":"e_1_2_1_98_1","volume-title":"Rousseeuw","author":"Kaufman Leonard","year":"2009"},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.1109\/TAC.2005.861710"},{"key":"e_1_2_1_100_1","volume-title":"Proceedings of the 15th International Conference on Neural 
Information Processing Systems. 463--470","author":"Kleinberg Jon","year":"2003"},{"key":"e_1_2_1_101_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.58325"},{"key":"e_1_2_1_102_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.846729"},{"key":"e_1_2_1_103_1","doi-asserted-by":"crossref","unstructured":"Teuvo Kohonen M. R. Schroeder and T. S. Huang. 2001. Self-Organizing Maps. Springer-Verlag New York Inc. Secaucus NJ 43.","DOI":"10.1007\/978-3-642-56927-2"},{"key":"e_1_2_1_104_1","doi-asserted-by":"publisher","DOI":"10.14778\/1454159.1454223"},{"key":"e_1_2_1_105_1","doi-asserted-by":"publisher","DOI":"10.1145\/1497577.1497578"},{"key":"e_1_2_1_106_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/10.3.271"},{"key":"e_1_2_1_107_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btm563"},{"key":"e_1_2_1_108_1","doi-asserted-by":"publisher","DOI":"10.1145\/2094114.2094118"},{"key":"e_1_2_1_109_1","volume-title":"Data field for hierarchical clustering. Developments in Data Extraction, Management, and Analysis","author":"Li Deyi","year":"2012"},{"key":"e_1_2_1_110_1","doi-asserted-by":"publisher","DOI":"10.5555\/1764441.1764512"},{"key":"e_1_2_1_111_1","doi-asserted-by":"publisher","DOI":"10.1109\/SNPD.2012.31"},{"key":"e_1_2_1_112_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2012.06.023"},{"key":"e_1_2_1_113_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIC.2003.1167344"},{"key":"e_1_2_1_114_1","doi-asserted-by":"publisher","DOI":"10.1145\/354756.354775"},{"key":"e_1_2_1_115_1","volume-title":"Introduction to Combinatorial Mathematics","author":"Liu Chung Laung"},{"key":"e_1_2_1_116_1","first-page":"75","article-title":"Network based framework for author name disambiguation applications. 
International Journal of u-and e-Service","volume":"8","author":"Liu Yuechang","year":"2015","journal-title":"Science and Technology"},{"key":"e_1_2_1_117_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1982.1056489"},{"key":"e_1_2_1_118_1","volume-title":"Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability","volume":"1","author":"James"},{"key":"e_1_2_1_119_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCBB.2004.2"},{"key":"e_1_2_1_120_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.478389"},{"key":"e_1_2_1_121_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijar.2010.04.007"},{"key":"e_1_2_1_122_1","volume-title":"Basford","author":"McLachlan Geoffrey J.","year":"1988"},{"key":"e_1_2_1_123_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.1983.4767409"},{"key":"e_1_2_1_124_1","volume-title":"Workshop on Big Data Benchmarks, Tilmann Rabl, Nambiar Raghunath, Meikel Poess, Milind Bhandarkar, Hans-Arno Jacobsen, and Chaitanya Baru (Eds.). Springer, 138--154","author":"Ming Zijian","year":"2013"},{"key":"e_1_2_1_125_1","volume-title":"Data Mining: Concepts and Techniques","author":"Han Jiawei","year":"2001"},{"key":"e_1_2_1_126_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2014.08.027"},{"key":"e_1_2_1_127_1","volume-title":"Pacific Symposium on Biocomputing","volume":"8","author":"Murali T. 
M.","year":"2003"},{"key":"e_1_2_1_128_1","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/26.4.354"},{"key":"e_1_2_1_129_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-010-0538-7"},{"key":"e_1_2_1_130_1","first-page":"307","article-title":"A more accurate clustering method by using co-author social networks for author name disambiguation","volume":"1","author":"Nadimi Mohammad Hossein","year":"2015","journal-title":"Journal of Computing and Security"},{"key":"e_1_2_1_131_1","doi-asserted-by":"publisher","DOI":"10.1140\/epjb\/e2004-00124-y"},{"key":"e_1_2_1_132_1","volume-title":"Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic (NIPS\u201901)","volume":"14","author":"Ng Andrew Y.","year":"2001"},{"key":"e_1_2_1_133_1","volume-title":"Proceedings of the 20th International Conference on Very Large Data Bases (VLDB\u201994)","author":"Ng R."},{"key":"e_1_2_1_134_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2002.1033770"},{"key":"e_1_2_1_135_1","doi-asserted-by":"publisher","DOI":"10.5555\/1888305.1888335"},{"key":"e_1_2_1_136_1","doi-asserted-by":"publisher","DOI":"10.1145\/2623330.2623726"},{"key":"e_1_2_1_137_1","volume-title":"Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI\u201916)","author":"Nie Feiping","year":"2016"},{"key":"e_1_2_1_138_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCB.2011.2161607"},{"key":"e_1_2_1_139_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2011.2162000"},{"key":"e_1_2_1_140_1","doi-asserted-by":"publisher","DOI":"10.5555\/876875.878995"},{"key":"e_1_2_1_141_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0893-6080(05)80089-9"},{"key":"e_1_2_1_142_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.238310"},{"key":"e_1_2_1_143_1","doi-asserted-by":"publisher","DOI":"10.1109\/ECS.2015.7124801"},{"key":"e_1_2_1_144_1","volume-title":"Proceedings of the 2015 IEEE International Conference on Computer and Information 
Technology","author":"Pandove Divya"},{"key":"e_1_2_1_145_1","doi-asserted-by":"publisher","DOI":"10.1145\/1007730.1007731"},{"key":"e_1_2_1_146_1","doi-asserted-by":"publisher","DOI":"10.1145\/1559845.1559865"},{"key":"e_1_2_1_147_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10799-008-0044-z"},{"key":"e_1_2_1_148_1","volume-title":"Proceedings of the 2013 IEEE 3rd International Advance Computing Conference (IACC). IEEE, 726--731","author":"Rajendra Prasad K."},{"key":"e_1_2_1_149_1","volume-title":"International Symposium on Graph Drawing. Springer, 197--210","author":"Quigley Aaron","year":"2000"},{"key":"e_1_2_1_150_1","doi-asserted-by":"publisher","DOI":"10.22436\/jmcs.05.03.11"},{"key":"e_1_2_1_151_1","volume-title":"Jeffrey David Ullman, and Jeffrey David Ullman","author":"Rajaraman Anand","year":"2012"},{"key":"e_1_2_1_152_1","volume-title":"Proceedings of DRTC Workshop on Semantic Web","volume":"8","author":"Ravichandra Rao I. K.","year":"2003"},{"key":"e_1_2_1_153_1","volume-title":"Adaptive Control Processes: A Guided Tour","author":"Richard Bellman"},{"key":"e_1_2_1_154_1","first-page":"410","article-title":"V-Measure: A conditional entropy-based external cluster evaluation measure","volume":"7","author":"Rosenberg Andrew","year":"2007","journal-title":"Proceedings of EMNLP-CoNLL"},{"key":"e_1_2_1_155_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cosrev.2007.05.001"},{"key":"e_1_2_1_156_1","volume-title":"Social Network Analysis","author":"Scott John"},{"key":"e_1_2_1_157_1","volume-title":"Proceedings of 15th IEEE International Conference on Machine Learning and Applications (ICMLA\u201916)","author":"Omair Shafiq M.","year":"2016"},{"key":"e_1_2_1_158_1","unstructured":"B. A. Shboul and Sung-Hyon Myaeng. 2009. Initializing k-means using genetic algorithms. 
(2009)."},{"key":"e_1_2_1_159_1","first-page":"428","article-title":"Wavecluster: A multi-resolution clustering approach for very large spatial databases","volume":"98","author":"Sheikholeslami Gholamhosein","year":"1998","journal-title":"Proceedings of VLDB"},{"key":"e_1_2_1_160_1","doi-asserted-by":"publisher","DOI":"10.1099\/00221287-17-1-201"},{"key":"e_1_2_1_161_1","volume-title":"A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Biologiske skrifter 5","author":"S\u00f8rensen Thorvald","year":"1948"},{"key":"e_1_2_1_162_1","volume-title":"Proceedings of KDD Workshop on Text Mining","volume":"400","author":"Steinbach Michael","year":"2000"},{"key":"e_1_2_1_163_1","first-page":"424","article-title":"Probabilistic topic models","volume":"427","author":"Steyvers Mark","year":"2007","journal-title":"Handbook of Latent Semantic Analysis"},{"key":"e_1_2_1_164_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pgen.1000479"},{"key":"e_1_2_1_165_1","doi-asserted-by":"publisher","DOI":"10.1109\/34.927466"},{"key":"e_1_2_1_166_1","doi-asserted-by":"publisher","DOI":"10.1109\/TST.2013.6574671"},{"key":"e_1_2_1_167_1","doi-asserted-by":"publisher","DOI":"10.1145\/2481244.2481248"},{"key":"e_1_2_1_168_1","doi-asserted-by":"publisher","DOI":"10.14778\/3402707.3402736"},{"key":"e_1_2_1_169_1","doi-asserted-by":"publisher","DOI":"10.1145\/1516360.1516426"},{"key":"e_1_2_1_170_1","doi-asserted-by":"publisher","DOI":"10.1145\/2500492"},{"key":"e_1_2_1_171_1","doi-asserted-by":"publisher","DOI":"10.1145\/1557019.1557107"},{"key":"e_1_2_1_172_1","first-page":"38","article-title":"Sparse nonnegative matrix approximation: New formulations and algorithms","volume":"193","author":"Tandon Rashish","year":"2010","journal-title":"Rapport 
Technique"},{"key":"e_1_2_1_173_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.4109"},{"key":"e_1_2_1_174_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.290.5500.2319"},{"key":"e_1_2_1_175_1","doi-asserted-by":"publisher","DOI":"10.1145\/1066157.1066211"},{"key":"e_1_2_1_176_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2014.6835958"},{"key":"e_1_2_1_177_1","first-page":"186","article-title":"STING: A statistical information grid approach to spatial data mining","volume":"97","author":"Wang Wei","year":"1997","journal-title":"Proceedings of VLDB"},{"key":"e_1_2_1_178_1","doi-asserted-by":"publisher","DOI":"10.1080\/00949658208810560"},{"key":"e_1_2_1_179_1","volume-title":"Introduction to Graph Theory","author":"West Douglas Brent"},{"key":"e_1_2_1_180_1","volume-title":"Hadoop: The Definitive Guide. O\u2019Reilly Media","author":"White Tom","year":"2012"},{"key":"e_1_2_1_181_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2005.845141"},{"key":"e_1_2_1_182_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2005.845141"},{"key":"e_1_2_1_183_1","doi-asserted-by":"publisher","DOI":"10.1145\/1247480.1247602"},{"key":"e_1_2_1_184_1","doi-asserted-by":"publisher","DOI":"10.4324\/9780203767719"},{"key":"e_1_2_1_185_1","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2005.1423333"},{"key":"e_1_2_1_186_1","volume-title":"Proceedings of the International Conference on Big Data, Cloud and Applications (BDCA\u201915)","author":"Zerhari 
Btissam","year":"2015"},{"key":"e_1_2_1_187_1","doi-asserted-by":"publisher","DOI":"10.1145\/233269.233324"},{"key":"e_1_2_1_188_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-10665-1_71"},{"key":"e_1_2_1_189_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICDM.2007.57"},{"key":"e_1_2_1_190_1","doi-asserted-by":"publisher","DOI":"10.14778\/1687627.1687709"},{"key":"e_1_2_1_191_1","doi-asserted-by":"publisher","DOI":"10.1109\/83.535841"},{"key":"e_1_2_1_192_1","doi-asserted-by":"publisher","DOI":"10.1145\/1656274.1656286"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3132088","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3132088","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:30:34Z","timestamp":1750217434000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3132088"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,1,23]]},"references-count":192,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2018,4,30]]}},"alternative-id":["10.1145\/3132088"],"URL":"https:\/\/doi.org\/10.1145\/3132088","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,1,23]]},"assertion":[{"value":"2016-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-08-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2018-01-23","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}