{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T08:54:04Z","timestamp":1775120044659,"version":"3.50.1"},"reference-count":225,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2024,5,7]],"date-time":"2024-05-07T00:00:00Z","timestamp":1715040000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Numerous real-world applications apply categorical data clustering to find hidden patterns in the data. The K-modes-based algorithm is a popular algorithm for solving common issues in categorical data, from outlier and noise sensitivity to local optima, utilizing metaheuristic methods. Many studies have focused on increasing clustering performance, with new methods now outperforming the traditional K-modes algorithm. It is important to investigate this evolution to help scholars understand how the existing algorithms overcome the common issues of categorical data. Using a research-area-based bibliometric analysis, this study retrieved articles from the Web of Science (WoS) Core Collection published between 2014 and 2023. This study presents a deep analysis of 64 articles to develop a new taxonomy of categorical data clustering algorithms. This study also discusses the potential challenges and opportunities in possible alternative solutions to categorical data clustering.<\/jats:p>","DOI":"10.3390\/make6020047","type":"journal-article","created":{"date-parts":[[2024,5,7]],"date-time":"2024-05-07T11:00:10Z","timestamp":1715079610000},"page":"1009-1054","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Categorical Data Clustering: A Bibliometric Analysis and Taxonomy"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4981-7893","authenticated-orcid":false,"given":"Maya","family":"Cendana","sequence":"first","affiliation":[{"name":"Department of Industrial Management, National Taiwan University of Science and Technology, No. 43, Section 4, Kee-Lung Road, Taipei 106, Taiwan"}]},{"given":"Ren-Jieh","family":"Kuo","sequence":"additional","affiliation":[{"name":"Department of Industrial Management, National Taiwan University of Science and Technology, No. 43, Section 4, Kee-Lung Road, Taipei 106, Taiwan"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,7]]},"reference":[{"key":"ref_1","first-page":"434","article-title":"Customer segmentation and profiling for life insurance using k-modes clustering and decision tree classifier","volume":"12","author":"Arifin","year":"2021","journal-title":"Int. J. Adv. Comput. Sc."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"298","DOI":"10.1016\/j.cie.2018.04.050","article-title":"Application of metaheuristic based fuzzy k-modes algorithm to supplier clustering","volume":"120","author":"Kuo","year":"2018","journal-title":"Comput. Ind. Eng."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Hendricks, R., and Khasawneh, M. (2021). Cluster analysis of categorical variables of parkinson\u2019s disease patients. Brain Sci., 11.","DOI":"10.3390\/brainsci11101290"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1038\/s41398-020-00951-x","article-title":"Clustering by phenotype and genome-wide association study in autism","volume":"10","author":"Narita","year":"2020","journal-title":"Transl. Psychiat"},{"key":"ref_5","first-page":"9","article-title":"Face extraction from image based on k-means clustering algorithms","volume":"8","author":"Farhang","year":"2017","journal-title":"Int. J. Adv. Comput. Sc."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"12386","DOI":"10.1109\/ACCESS.2019.2893063","article-title":"Brain image segmentation based on FCM clustering algorithm and rough set","volume":"7","author":"Huang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_7","first-page":"1","article-title":"Research on face feature extraction based on k-mean algorithm","volume":"2018","author":"Wei","year":"2018","journal-title":"Eurasip. J. Image Vide"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1142\/S021972000900400X","article-title":"Clustering of gene expression data and end-point measurements by simulated annealing","volume":"7","author":"Bushel","year":"2009","journal-title":"J. Bioinform. Comput. Biol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1089\/cmb.2018.0245","article-title":"A fast parallel k-modes algorithm for clustering nucleotide sequences to predict translation initiation sites","volume":"26","author":"Castro","year":"2019","journal-title":"J. Comput. Biol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1080\/13645579.2012.716973","article-title":"Clustering in the field of social sciences: That is your choice","volume":"16","author":"Fonseca","year":"2013","journal-title":"Int. J. Soc. Res. Method."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"362","DOI":"10.20965\/jaciii.2019.p0362","article-title":"Massive data mining algorithm for web text based on clustering algorithm","volume":"23","author":"Luo","year":"2019","journal-title":"J. Adv. Comput. Intell. Inform."},{"key":"ref_12","unstructured":"Dua, D.G. (2024, January 10). UCI Machine Learning Repository. Available online: https:\/\/archive.ics.uci.edu\/."},{"key":"ref_13","unstructured":"Tan, P.-N., Steinbach, M.S., Karpatne, A., and Kumar, V. (2019). Introduction to Data Mining, Pearson Education, Inc.. [2nd ed.]."},{"key":"ref_14","unstructured":"MacQueen, J. (1967, January 21). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1023\/A:1009769707641","article-title":"Extensions to the k-means algorithm for clustering large data sets with categorical values","volume":"2","author":"Huang","year":"1998","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1145\/331499.331504","article-title":"Data clustering: A review","volume":"31","author":"Jain","year":"1999","journal-title":"ACM Comput. Surv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1142\/S0219622019300064","article-title":"Clustering categorical data: A survey","volume":"19","author":"Naouali","year":"2020","journal-title":"Int. J. Inf. Technol. Decis. Mak."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Alamuri, M., Surampudi, B.R., and Negi, A. (2014, January 6\u201311). A survey of distance\/similarity measures for categorical data. Proceedings of the 2014 International Joint Conference on Neural Networks (IJCNN), Beijing, China.","DOI":"10.1109\/IJCNN.2014.6889941"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1016\/j.swevo.2016.06.004","article-title":"A comprehensive survey of traditional, merge-split and evolutionary approaches proposed for determination of cluster number","volume":"32","author":"Hancer","year":"2017","journal-title":"Swarm Evol. Comput."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Alloghani, M., Al-Jumeily, D., Mustafina, J., Hussain, A., and Aljaaf, A.J. (2020). Supervised and Unsupervised Learning for Data Science, Springer. Unsupervised and Semi-Supervised Learning.","DOI":"10.1007\/978-3-030-22475-2_1"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"6","DOI":"10.31449\/inf.v47i6.4445","article-title":"Big data clustering techniques challenged and perspectives: Review","volume":"47","author":"Awad","year":"2023","journal-title":"Informatica"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1016\/j.ins.2022.11.139","article-title":"K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data","volume":"622","author":"Ikotun","year":"2022","journal-title":"Inf. Sci."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"121860","DOI":"10.1016\/j.eswa.2023.121860","article-title":"Density peak clustering algorithms: A review on the decade 2014\u20132023","volume":"238","author":"Wang","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1145\/1007730.1007731","article-title":"Subspace clustering for high dimensional data: A review","volume":"6","author":"Parsons","year":"2004","journal-title":"SIGKDD Explor."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"6247","DOI":"10.1007\/s00521-020-05395-4","article-title":"Automatic clustering algorithms: A systematic review and bibliometric analysis of relevant literature","volume":"33","author":"Ezugwu","year":"2020","journal-title":"Neural Comput. Appl."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/s42452-020-2073-0","article-title":"Nature-inspired metaheuristic techniques for automatic clustering: A survey and performance study","volume":"2","author":"Ezugwu","year":"2020","journal-title":"SN Appl. Sci."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"n71","DOI":"10.1136\/bmj.n71","article-title":"The PRISMA 2020 statement: An updated guideline for reporting systematic reviews","volume":"372","author":"Page","year":"2021","journal-title":"BMJ"},{"key":"ref_28","first-page":"1275","article-title":"Some bibliometric procedures for analyzing and evaluating research fields","volume":"48","author":"Cobo","year":"2017","journal-title":"Appl. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1016\/j.jbusres.2021.04.070","article-title":"How to conduct a bibliometric analysis: An overview and guidelines","volume":"133","author":"Donthu","year":"2021","journal-title":"J. Bus. Res."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1382","DOI":"10.1002\/asi.21525","article-title":"science mapping software tools: Review, analysis, and cooperative study among tools","volume":"62","author":"Cobo","year":"2011","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"959","DOI":"10.1016\/j.joi.2017.08.007","article-title":"Bibliometrix: An R-tool for comprehensive science mapping analysis","volume":"11","author":"Aria","year":"2017","journal-title":"J. Informetr."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Pranckut\u0117, R. (2021). Web of Science (WoS) and Scopus: The titans of bibliographic information in today\u2019s academic world. Publications, 9.","DOI":"10.3390\/publications9010012"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1016\/j.ijinfomgt.2017.04.007","article-title":"Co-citation and cluster analyses of extant literature on social networks","volume":"37","author":"Shiau","year":"2017","journal-title":"Int. J. Inf. Manag."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1178","DOI":"10.1016\/j.joi.2016.10.006","article-title":"Constructing bibliometric networks: A comparison between full and fractional counting","volume":"10","author":"Waltman","year":"2016","journal-title":"J. Informetr."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1053","DOI":"10.1007\/s11192-017-2300-7","article-title":"Citation-based clustering of publications using CitNetExplorer and VOSviewer","volume":"111","author":"Waltman","year":"2017","journal-title":"Scientometrics"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"8153","DOI":"10.1007\/s11192-021-04082-y","article-title":"Link-based approach to study scientific software usage: The case of VOSviewer","volume":"126","author":"Costas","year":"2021","journal-title":"Scientometrics"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1016\/j.ins.2015.11.005","article-title":"Initialization of k-modes clustering using outlier detection techniques","volume":"332","author":"Jiang","year":"2016","journal-title":"Inf. Sci."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.knosys.2014.04.008","article-title":"Hierarchical clustering algorithm for categorical data using a probabilistic rough set model","volume":"65","author":"Li","year":"2014","journal-title":"Knowl. -Based Syst."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/j.neucom.2013.11.024","article-title":"The k-modes type clustering plus between-cluster information for categorical data","volume":"133","author":"Bai","year":"2014","journal-title":"Neurocomputing"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"4593","DOI":"10.1109\/TNNLS.2017.2770167","article-title":"An algorithm for clustering categorical data with set-valued features","volume":"29","author":"Cao","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1016\/j.knosys.2014.03.013","article-title":"MGR: An information theory based hierarchical divisive clustering algorithm for categorical data","volume":"67","author":"Qin","year":"2014","journal-title":"Knowl. -Based Syst."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.engappai.2016.01.026","article-title":"A modified fuzzy k-partition based on indiscernibility relation for categorical data clustering","volume":"53","author":"Yanto","year":"2016","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1007\/s00357-016-9211-9","article-title":"Model-based clustering","volume":"33","author":"McNicholas","year":"2016","journal-title":"J. Classif."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1093\/biomet\/61.2.215","article-title":"Exploratory latent structure analysis using both identifiable and unidentifiable models","volume":"61","author":"Goodman","year":"1974","journal-title":"Biometrika"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1177\/0095798420930932","article-title":"Latent class analysis: A guide to best practice","volume":"46","author":"Weller","year":"2020","journal-title":"J. Black Psychol."},{"key":"ref_46","first-page":"1","article-title":"Maximum likelihood from incomplete data via the EM algorithm","volume":"39","author":"Dempster","year":"2018","journal-title":"J. R. Stat. Soc. Ser. B (Methodol.)"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1016\/j.neucom.2019.02.043","article-title":"Hierarchical division clustering framework for categorical data","volume":"341","author":"Wei","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1007\/s00357-019-09317-5","article-title":"Comparison of similarity measures for categorical data in hierarchical clustering","volume":"36","author":"Sulc","year":"2019","journal-title":"J. Classif."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"3213","DOI":"10.1007\/s13042-019-01012-6","article-title":"Fuzzy rough clustering for categorical data","volume":"10","author":"Xu","year":"2019","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.fss.2018.02.007","article-title":"Integrated rough fuzzy clustering for categorical data analysis","volume":"361","author":"Saha","year":"2019","journal-title":"Fuzzy Sets Syst."},{"key":"ref_51","first-page":"S6171","article-title":"Attribute weights-based clustering centres algorithm for initialising k-modes clustering","volume":"22","author":"Peng","year":"2019","journal-title":"Clust. Comput. -J. Netw. Softw. Tools Appl."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"112662","DOI":"10.1109\/ACCESS.2019.2935089","article-title":"Heterogeneous graph based similarity measure for categorical data unsupervised learning","volume":"7","author":"Ye","year":"2019","journal-title":"IEEE Access"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"99721","DOI":"10.1109\/ACCESS.2019.2927593","article-title":"Automatic fuzzy clustering using non-dominated sorting particle swarm optimization algorithm for categorical data","volume":"7","author":"Nguyen","year":"2019","journal-title":"IEEE Access"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1016\/j.asoc.2018.11.028","article-title":"Partition-and-merge based fuzzy genetic clustering algorithm for categorical data","volume":"75","author":"Nguyen","year":"2019","journal-title":"Appl. Soft Comput."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"116","DOI":"10.1016\/j.neucom.2018.11.016","article-title":"Genetic intuitionistic weighted fuzzy k-modes algorithm for categorical data","volume":"330","author":"Kuo","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1016\/j.patcog.2019.01.042","article-title":"Optimal mathematical programming and variable neighborhood search for k-modes categorical data clustering","volume":"90","author":"Xiao","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_57","first-page":"486","article-title":"CUBOS: An internal cluster validity index for categorical data","volume":"26","author":"Gao","year":"2019","journal-title":"Teh. Vjesn. -Tech. Gaz."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"1065","DOI":"10.1109\/TNNLS.2015.2436432","article-title":"A new distance metric for unsupervised learning of categorical data","volume":"27","author":"Jia","year":"2016","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"2047","DOI":"10.1109\/TNNLS.2015.2451151","article-title":"Space structure and clustering of categorical data","volume":"27","author":"Qian","year":"2016","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1016\/j.asoc.2015.01.031","article-title":"Non-dominated sorting genetic algorithm using fuzzy membership chromosome for categorical data clustering","volume":"30","author":"Yang","year":"2015","journal-title":"Appl. Soft Comput."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"322","DOI":"10.1016\/j.patcog.2015.09.027","article-title":"Soft subspace clustering of categorical data with probabilistic distance","volume":"51","author":"Chen","year":"2016","journal-title":"Pattern Recognit."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1016\/j.is.2014.06.008","article-title":"Rough set approach for clustering categorical data using information-theoretic dependency measure","volume":"48","author":"Park","year":"2015","journal-title":"Inf. Syst."},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"230","DOI":"10.1016\/j.eswa.2017.12.013","article-title":"Many-objective fuzzy centroids clustering algorithm for categorical data","volume":"96","author":"Zhu","year":"2018","journal-title":"Expert. Syst. Appl."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1016\/j.compeleceng.2018.04.023","article-title":"A fast and effective partitional clustering algorithm for large categorical datasets using a k-means based approach","volume":"68","author":"Naouali","year":"2018","journal-title":"Comput. Electr. Eng."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"1560","DOI":"10.1007\/s10618-014-0387-5","article-title":"Cluster validity functions for categorical data: A solution-space perspective","volume":"29","author":"Bai","year":"2015","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_66","doi-asserted-by":"crossref","first-page":"108694","DOI":"10.1016\/j.patcog.2022.108694","article-title":"A categorical data clustering framework on graph representation","volume":"128","author":"Bai","year":"2022","journal-title":"Pattern Recognit."},{"key":"ref_67","first-page":"1","article-title":"A fuzzy SV-k-modes algorithm for clustering categorical data with set-valued attributes","volume":"295","author":"Cao","year":"2017","journal-title":"Appl. Math. Comput."},{"key":"ref_68","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1016\/j.asoc.2017.04.019","article-title":"K-mw-modes: An algorithm for clustering categorical matrix-object data","volume":"57","author":"Cao","year":"2017","journal-title":"Appl. Soft Comput."},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.ins.2020.12.051","article-title":"Metaheuristic-based possibilistic fuzzy k-modes algorithms for categorical data clustering","volume":"557","author":"Kuo","year":"2021","journal-title":"Inf. Sci."},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1109\/MIS.2020.3038837","article-title":"The DRk-M for clustering categorical datasets with uncertainty","volume":"36","author":"Naouali","year":"2021","journal-title":"IEEE Intell. Syst."},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"2069","DOI":"10.1007\/s13042-021-01293-w","article-title":"A rough set based algorithm for updating the modes in categorical clustering","volume":"12","author":"Naouali","year":"2021","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_72","doi-asserted-by":"crossref","first-page":"113555","DOI":"10.1016\/j.eswa.2020.113555","article-title":"Uncertainty mode selection in categorical clustering using the rough set theory","volume":"158","author":"Naouali","year":"2020","journal-title":"Expert. Syst. Appl."},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"758","DOI":"10.1109\/TCYB.2020.2983073","article-title":"A new distance metric exploiting heterogeneous interattribute relationship for ordinal-and-nominal-attribute data clustering","volume":"52","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Cybern."},{"key":"ref_74","first-page":"3560","article-title":"Learnable weighting of intra-attribute distances for categorical data clustering with nominal and ordinal attributes","volume":"44","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_75","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1109\/TNNLS.2019.2899381","article-title":"A unified entropy-based distance metric for ordinal-and-nominal-attribute data clustering","volume":"31","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Chen, H., Xu, K.P., Chen, L.F., and Jiang, Q.S. (2021). Self-expressive kernel subspace clustering algorithm for categorical data with embedded feature selection. Mathematics, 9.","DOI":"10.3390\/math9141680"},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"072104:1","DOI":"10.1007\/s11432-014-5267-5","article-title":"A probabilistic framework for optimizing projected clusters with categorical attributes","volume":"58","author":"Chen","year":"2015","journal-title":"Sci. China-Inf. Sci."},{"key":"ref_78","doi-asserted-by":"crossref","first-page":"1498","DOI":"10.1007\/s10489-019-01583-5","article-title":"A dissimilarity measure for mixed nominal and ordinal attribute data in k-modes algorithm","volume":"50","author":"Yuan","year":"2020","journal-title":"Appl. Intell."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"111494","DOI":"10.1016\/j.chaos.2021.111494","article-title":"FKMAWCW: Categorical fuzzy k-modes clustering with automated attribute-weight and cluster-weight learning","volume":"153","author":"Oskouei","year":"2021","journal-title":"Chaos Solitons Fractals"},{"key":"ref_80","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1016\/j.neucom.2015.03.037","article-title":"Categorical fuzzy k-modes clustering with automated feature weight learning","volume":"166","author":"Saha","year":"2015","journal-title":"Neurocomputing"},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"320","DOI":"10.1016\/j.neucom.2017.06.011","article-title":"A multi-act sequential game-based multi-objective clustering approach for categorical data","volume":"267","author":"Heloulou","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_82","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1002\/sam.11546","article-title":"An efficient k-modes algorithm for clustering categorical datasets","volume":"15","author":"Dorman","year":"2022","journal-title":"Stat. Anal. Data Min."},{"key":"ref_83","doi-asserted-by":"crossref","unstructured":"Rios, E.J.R., Medina-P\u00e9rez, M.A., Lazo-Cort\u00e9s, M.S., and Monroy, R. (2021). Learning-based dissimilarity for clustering categorical data. Appl. Sci. -Basel, 11.","DOI":"10.3390\/app11083509"},{"key":"ref_84","doi-asserted-by":"crossref","first-page":"2061","DOI":"10.1007\/s00607-021-00950-w","article-title":"A novel rough value set categorical clustering technique for supplier base management","volume":"103","author":"Uddin","year":"2021","journal-title":"Computing"},{"key":"ref_85","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1007\/s11047-015-9489-2","article-title":"Detecting outliers in categorical data through rough clustering","volume":"15","author":"Suri","year":"2016","journal-title":"Nat. Comput."},{"key":"ref_86","doi-asserted-by":"crossref","first-page":"105795","DOI":"10.1016\/j.engappai.2022.105795","article-title":"An efficient entropy based dissimilarity measure to cluster categorical data","volume":"119","author":"Kar","year":"2023","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_87","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1016\/j.neucom.2018.03.048","article-title":"Learning category distance metric for data clustering","volume":"306","author":"Chen","year":"2018","journal-title":"Neurocomputing"},{"key":"ref_88","doi-asserted-by":"crossref","first-page":"1810","DOI":"10.1109\/TKDE.2018.2808532","article-title":"Unsupervised coupled metric similarity for Non-IID categorical data","volume":"30","author":"Jian","year":"2018","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_89","doi-asserted-by":"crossref","first-page":"810","DOI":"10.1109\/TFUZZ.2022.3189831","article-title":"Graph enhanced fuzzy clustering for categorical data using a bayesian dissimilarity measure","volume":"31","author":"Zhang","year":"2023","journal-title":"IEEE Trans. Fuzzy Syst."},{"key":"ref_90","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1504\/IJBIC.2018.092801","article-title":"EGA-FMC: Enhanced genetic algorithm-based fuzzy k-modes clustering for categorical data","volume":"11","author":"Narasimhan","year":"2018","journal-title":"Int. J. Bio-Inspired Comput."},{"key":"ref_91","doi-asserted-by":"crossref","first-page":"927","DOI":"10.1109\/TNNLS.2019.2911118","article-title":"From whole to part: Reference-based representation for clustering categorical data","volume":"31","author":"Zheng","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_92","doi-asserted-by":"crossref","unstructured":"Faouzi, T., Firinguetti-Limone, L., Avilez-Bozo, J.M., and Carvajal-Schiaffino, R. (2022). The \u03b1-Groups under condorcet clustering. Mathematics, 10.","DOI":"10.3390\/math10050718"},{"key":"ref_93","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1016\/j.neucom.2023.01.020","article-title":"A kernel-based intuitionistic weight fuzzy k-modes algorithm using coupled chained P system combines DNA genetic rules for categorical data","volume":"528","author":"Jiang","year":"2023","journal-title":"Neurocomputing"},{"key":"ref_94","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1080\/10618600.2017.1305278","article-title":"Clustering categorical data via ensembling dissimilarity matrices","volume":"27","author":"Amiri","year":"2018","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_95","doi-asserted-by":"crossref","first-page":"979","DOI":"10.3233\/JIFS-16157","article-title":"A weighted k-modes clustering using new weighting method based on within-cluster and between-cluster impurity measures","volume":"32","author":"Kim","year":"2017","journal-title":"J. Intell. Fuzzy Syst."},{"key":"ref_96","doi-asserted-by":"crossref","first-page":"303","DOI":"10.15388\/Informatica.2017.131","article-title":"Holo-entropy based categorical data hierarchical clustering","volume":"28","author":"Sun","year":"2017","journal-title":"Informatica"},{"key":"ref_97","doi-asserted-by":"crossref","first-page":"34196","DOI":"10.1109\/ACCESS.2022.3162690","article-title":"A novel cluster prediction approach based on locality-sensitive hashing for fuzzy clustering of categorical data","volume":"10","author":"Mau","year":"2022","journal-title":"IEEE Access"},{"key":"ref_98","doi-asserted-by":"crossref","first-page":"2610","DOI":"10.1007\/s10489-020-01677-5","article-title":"k-PbC: An improved cluster center initialization for categorical data clustering","volume":"50","author":"Dinh","year":"2020","journal-title":"Appl. Intell."},{"key":"ref_99","doi-asserted-by":"crossref","first-page":"879","DOI":"10.1016\/j.datak.2007.05.005","article-title":"MMR: An algorithm for clustering categorical data using rough set theory","volume":"63","author":"Parmar","year":"2007","journal-title":"Data Knowl. Eng."},{"key":"ref_100","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1016\/j.inffus.2006.05.006","article-title":"K-ANMI: A mutual information based clustering algorithm for categorical data","volume":"9","author":"He","year":"2008","journal-title":"Inf. Fusion."},{"key":"ref_101","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1016\/j.knosys.2009.11.001","article-title":"G-ANMI: A mutual information based genetic clustering algorithm for categorical data","volume":"23","author":"Deng","year":"2010","journal-title":"Knowl. -Based Syst."},{"key":"ref_102","doi-asserted-by":"crossref","unstructured":"Barbar\u00e1, D., Li, Y., and Couto, J. (2002, January 4\u20139). COOLCAT: An entropy-based algorithm for categorical clustering. Proceedings of the Eleventh International Conference on Information and Knowledge Management, McLean, VA, USA.","DOI":"10.1145\/584792.584888"},{"key":"ref_103","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1016\/j.knosys.2009.12.003","article-title":"A rough set approach for selecting clustering attribute","volume":"23","author":"Herawan","year":"2010","journal-title":"Knowl. -Based Syst."},{"key":"ref_104","unstructured":"Mazlack, L., He, A., Zhu, Y., and Coppock, S. (2000, January 1\u20133). A rough set approach in choosing partitioning attributes. Proceedings of the ISCA 13th International Conference (CAINE-2000), Honolulu, HI, USA."},{"key":"ref_105","unstructured":"Andritsos, P., Tsaparas, P., Miller, R.J., and Sevcik, K.C. (2003, January 7\u201310). Limbo: A scalable algorithm to cluster categorical data. Proceedings of the International Conference on Extending Database Technology, Berlin\/Heidelberg, Germany."},{"key":"ref_106","doi-asserted-by":"crossref","first-page":"553","DOI":"10.32604\/iasc.2023.027579","article-title":"P-ROCK: A sustainable clustering algorithm for large categorical datasets","volume":"35","author":"Altameem","year":"2023","journal-title":"Intell. Autom. Soft Comput."},{"key":"ref_107","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1016\/S0306-4379(00)00022-3","article-title":"ROCK: A robust clustering algorithm for categorical attributes","volume":"25","author":"Guha","year":"2000","journal-title":"Inf. Syst."},{"key":"ref_108","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1109\/TKDE.2011.261","article-title":"Information-theoretic outlier detection for large-scale categorical data","volume":"25","author":"Wu","year":"2013","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_109","doi-asserted-by":"crossref","first-page":"2364","DOI":"10.1016\/j.patrec.2005.04.008","article-title":"QROCK: A quick version of the ROCK algorithm for clustering of categorical data","volume":"26","author":"Dutta","year":"2005","journal-title":"Pattern Recognit. Lett."},{"key":"ref_110","first-page":"518","article-title":"Modified rock (MROCK) algorithm for clustering categorical data","volume":"9","author":"Saruladha","year":"2015","journal-title":"Adv. Nat. Appl. Sci."},{"key":"ref_111","doi-asserted-by":"crossref","first-page":"409","DOI":"10.3233\/IDA-140648","article-title":"New dynamic clustering approaches within belief function framework","volume":"18","author":"Elouedi","year":"2014","journal-title":"Intell. Data Anal."},{"key":"ref_112","unstructured":"Smets, P. (1990, January 27\u201329). The transferable belief model and other interpretations of Dempster-Shafer\u2019s model. Proceedings of the Conference on Uncertainty in Artificial Intelligence, Cambridge, MA, USA."},{"key":"ref_113","doi-asserted-by":"crossref","unstructured":"Ben Hariz, S., Elouedi, Z., and Mellouli, K. (2006). Clustering Approach Using Belief Function Theory, Springer.","DOI":"10.1007\/11861461_18"},{"key":"ref_114","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1016\/j.neucom.2012.11.009","article-title":"A weighting k-modes algorithm for subspace clustering of categorical data","volume":"108","author":"Cao","year":"2013","journal-title":"Neurocomputing"},{"key":"ref_115","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.knosys.2011.07.011","article-title":"A dissimilarity measure for the k-modes clustering algorithm","volume":"26","author":"Cao","year":"2012","journal-title":"Knowl. -Based Syst."},{"key":"ref_116","unstructured":"Chi-Hyon, O., Honda, K., and Ichihashi, H. (2001, January 25\u201328). Fuzzy clustering for categorical multivariate data. Proceedings of the Joint 9th IFSA World Congress and 20th NAFIPS International Conference (Cat. No. 01TH8569), Vancouver, BC, Canada."},{"key":"ref_117","doi-asserted-by":"crossref","unstructured":"Heloulou, I., Radjef, M.S., and Kechadi, M.T. (2014). Clustering Based on Sequential Multi-Objective Games, Springer International Publishing.","DOI":"10.1007\/978-3-319-10160-6_33"},{"key":"ref_118","doi-asserted-by":"crossref","unstructured":"Kaufman, L., and Rousseeuw, P. (1990). Finding Groups in Data: An Introduction to Cluster Analysis, John Wiley & Sons.","DOI":"10.1002\/9780470316801"},{"key":"ref_119","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1007\/s10489-007-0111-x","article-title":"Multi-instance clustering with applications to multi-instance prediction","volume":"31","author":"Zhang","year":"2009","journal-title":"Appl. Intell."},{"key":"ref_120","doi-asserted-by":"crossref","unstructured":"Giannotti, F., Gozzi, C., and Manco, G. (2002). Clustering Transactional Data, Springer.","DOI":"10.1007\/3-540-45681-3_15"},{"key":"ref_121","doi-asserted-by":"crossref","first-page":"7444","DOI":"10.1016\/j.eswa.2013.07.002","article-title":"Cluster center initialization algorithm for k-modes clustering","volume":"40","author":"Khan","year":"2013","journal-title":"Expert. Syst. Appl."},{"key":"ref_122","unstructured":"Wu, S., Jiang, Q., and Huang, J.Z. (2007, January 22\u201325). A new initialization method for clustering categorical data. Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining, Nanjing, China."},{"key":"ref_123","unstructured":"Arthur, D., and Vassilvitskii, S. (2007, January 7\u20139). K-means++: The advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA."},{"key":"ref_124","doi-asserted-by":"crossref","first-page":"622","DOI":"10.14778\/2180912.2180915","article-title":"Scalable k-means++","volume":"5","author":"Bahmani","year":"2012","journal-title":"Proc. VLDB Endow."},{"key":"ref_125","doi-asserted-by":"crossref","first-page":"10223","DOI":"10.1016\/j.eswa.2009.01.060","article-title":"A new initialization method for categorical data clustering","volume":"36","author":"Fuyuan","year":"2009","journal-title":"Expert. Syst. Appl."},{"key":"ref_126","first-page":"241","article-title":"An alternative extension of the k-means algorithm for clustering categorical data","volume":"14","author":"San","year":"2004","journal-title":"Int. J. Appl. Math. Comput. Sci."},{"key":"ref_127","doi-asserted-by":"crossref","unstructured":"Nguyen, T.-H.T., and Huynh, V.-N. (2016, January 7\u201311). A k-means-like algorithm for clustering categorical data using an information theoretic-based dissimilarity measure. Proceedings of the International Symposium on Foundations of Information and Knowledge Systems, Linz, Austria.","DOI":"10.1007\/978-3-319-30024-5_7"},{"key":"ref_128","doi-asserted-by":"crossref","first-page":"15011","DOI":"10.1007\/s12652-019-01445-5","article-title":"A method for k-means-like clustering of categorical data","volume":"14","author":"Nguyen","year":"2019","journal-title":"J. Ambient. Intell. Humaniz. Comput."},{"key":"ref_129","doi-asserted-by":"crossref","first-page":"8986360","DOI":"10.1155\/2017\/8986360","article-title":"Clustering categorical data using community detection techniques","volume":"2017","author":"Nguyen","year":"2017","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_130","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1016\/j.patcog.2003.11.003","article-title":"An optimization algorithm for clustering using weighted dissimilarity measures","volume":"37","author":"Chan","year":"2004","journal-title":"Pattern Recognit."},{"key":"ref_131","doi-asserted-by":"crossref","first-page":"2843","DOI":"10.1016\/j.patcog.2011.04.024","article-title":"A novel attribute weighting algorithm for clustering high-dimensional categorical data","volume":"44","author":"Bai","year":"2011","journal-title":"Pattern Recognit."},{"key":"ref_132","unstructured":"Ng, A.Y., Jordan, M.I., and Weiss, Y. (2001, January 3\u20138). On spectral clustering: Analysis and an algorithm. Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, Vancouver, BC, Canada."},{"key":"ref_133","unstructured":"Lee, D.D., and Seung, H.S. (2000, January 28\u201330). Algorithms for non-negative matrix factorization. Proceedings of the 13th International Conference on Neural Information Processing Systems, Denver, CO, USA."},{"key":"ref_134","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1126\/science.1127647","article-title":"Reducing the dimensionality of data with neural networks","volume":"313","author":"Hinton","year":"2006","journal-title":"Science"},{"key":"ref_135","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1016\/0167-8655(95)00075-R","article-title":"A conceptual version of the k-means algorithm","volume":"16","author":"Ralambondrainy","year":"1995","journal-title":"Pattern Recognit. Lett."},{"key":"ref_136","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1109\/TKDE.2010.268","article-title":"A link-based cluster ensemble approach for categorical data clustering","volume":"24","author":"Boongeon","year":"2012","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_137","doi-asserted-by":"crossref","unstructured":"Jian, S., Cao, L., Pang, G., Lu, K., and Gao, H. (2017, January 19\u201325). Embedding-based representation of categorical data by hierarchical value coupling learning. Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia.","DOI":"10.24963\/ijcai.2017\/269"},{"key":"ref_138","first-page":"21","article-title":"Agregation de similarites en classification automatique","volume":"30","author":"Marcotorchino","year":"1982","journal-title":"Rev. De Stat. Appliqu\u00e9e"},{"key":"ref_139","unstructured":"Hariz, S.B., and Elouedi, Z. (2010, January 16\u201319). IK-BKM: An incremental clustering approach based on intra-cluster distance. Proceedings of the ACS\/IEEE International Conference on Computer Systems and Applications\u2014AICCSA 2010, Washington, DC, USA."},{"key":"ref_140","doi-asserted-by":"crossref","unstructured":"Ben Hariz, S., and Elouedi, Z. (2010). DK-BKM: Decremental k Belief k-Modes Method, Springer.","DOI":"10.1007\/978-3-642-15951-0_13"},{"key":"ref_141","first-page":"100","article-title":"A k-means clustering algorithm","volume":"28","author":"Hartigan","year":"1979","journal-title":"J. R. Stat. Society. Ser. C (Appl. Stat.)"},{"key":"ref_142","unstructured":"Grahne, G., and Zhu, J. (2003, January 1\u20133). High performance mining of maximal frequent itemsets. Proceedings of the 6th International Workshop on High Performance Data Mining, San Francisco, CA, USA."},{"key":"ref_143","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1109\/TPAMI.2007.53","article-title":"On the impact of dissimilarity measure in k-modes clustering algorithm","volume":"29","author":"Ng","year":"2007","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_144","first-page":"708","article-title":"Clustering categorical data using the k-means algorithm and the attribute\u2019s relative frequency","volume":"11","author":"Naouali","year":"2017","journal-title":"World Acad. Sci. Eng. Technol. Int. J. Comput. Electr. Autom. Control Inf. Eng."},{"key":"ref_145","first-page":"14","article-title":"A computational cost-effective clustering algorithm in multidimensional space using the manhattan metric: Application to the global terrorism database","volume":"2017","author":"Sami","year":"2017","journal-title":"World Acad. Sci. Eng. Technol. Int. J. Comput. Electr. Autom. Control Inf. Eng."},{"key":"ref_146","doi-asserted-by":"crossref","first-page":"1615","DOI":"10.1016\/j.eswa.2007.11.045","article-title":"A genetic fuzzy k-Modes algorithm for clustering categorical data","volume":"36","author":"Gan","year":"2009","journal-title":"Expert. Syst. Appl."},{"key":"ref_147","doi-asserted-by":"crossref","first-page":"991","DOI":"10.1109\/TEVC.2009.2012163","article-title":"Multiobjective genetic algorithm-based fuzzy clustering of categorical attributes","volume":"13","author":"Mukhopadhyay","year":"2009","journal-title":"IEEE Trans. Evol. Comput."},{"key":"ref_148","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1007\/s10044-015-0465-3","article-title":"Multivariate fuzzy k-modes algorithm","volume":"20","author":"Maciel","year":"2017","journal-title":"Pattern Anal. Appl."},{"key":"ref_149","unstructured":"Trigo, M. (2005). Using Fuzzy k-Modes to Analyze Patterns of System Calls for Intrusion Detection. [Master\u2019s Thesis, California State University]."},{"key":"ref_150","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1016\/j.patrec.2004.04.004","article-title":"Fuzzy clustering of categorical data using fuzzy centroids","volume":"25","author":"Kim","year":"2004","journal-title":"Pattern Recognit. Lett."},{"key":"ref_151","doi-asserted-by":"crossref","first-page":"1607","DOI":"10.1109\/TKDE.2007.190649","article-title":"Top-down parameter-free clustering of high-dimensional categorical data","volume":"19","author":"Cesario","year":"2007","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_152","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1007\/s10618-011-0221-2","article-title":"DHCC: Divisive hierarchical clustering of categorical data","volume":"24","author":"Tengke","year":"2012","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_153","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10618-013-0336-8","article-title":"Clustering categorical data in projected spaces","volume":"29","author":"Bouguessa","year":"2015","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_154","first-page":"7","article-title":"A comparative study of categorical variable encoding techniques for neural network classifiers","volume":"175","author":"Potdar","year":"2017","journal-title":"Int. J. Comput. Appl."},{"key":"ref_155","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1016\/0003-2670(93)80130-D","article-title":"On k-medoid clustering of large data sets with the aid of a genetic algorithm: Background, feasiblity and comparison","volume":"282","author":"Lucasius","year":"1993","journal-title":"Anal. Chim. Acta"},{"key":"ref_156","unstructured":"Toan Nguyen, M., and Van-Nam, H. (2021, January 11\u201314). Kernel-based k-representatives algorithm for fuzzy clustering of categorical data. Proceedings of the 2021 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Luxembourg."},{"key":"ref_157","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1016\/j.neucom.2021.08.050","article-title":"An LSH-based k-representatives clustering method for large categorical data","volume":"463","author":"Mau","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_158","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1016\/j.knosys.2018.12.007","article-title":"Density-sensitive fuzzy kernel maximum entropy clustering algorithm","volume":"166","author":"Tao","year":"2019","journal-title":"Knowl. -Based Syst."},{"key":"ref_159","doi-asserted-by":"crossref","first-page":"106981","DOI":"10.1016\/j.asoc.2020.106981","article-title":"Two graph-regularized fuzzy subspace clustering methods","volume":"100","author":"Teng","year":"2021","journal-title":"Appl. Soft Comput."},{"key":"ref_160","doi-asserted-by":"crossref","first-page":"517","DOI":"10.1109\/TFUZZ.2004.840099","article-title":"A possibilistic fuzzy c-means clustering algorithm","volume":"13","author":"Pal","year":"2005","journal-title":"IEEE Trans. Fuzzy Syst."},{"key":"ref_161","first-page":"238237","article-title":"Intuitionistic fuzzy possibilistic c means clustering algorithms","volume":"2015","author":"Chaudhuri","year":"2015","journal-title":"Adv. Fuzzy Syst."},{"key":"ref_162","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.knosys.2013.07.020","article-title":"A spectral clustering algorithm based on intuitionistic fuzzy information","volume":"53","author":"Xu","year":"2013","journal-title":"Knowl. -Based Syst."},{"key":"ref_163","doi-asserted-by":"crossref","first-page":"3775","DOI":"10.1016\/j.ins.2008.06.008","article-title":"Clustering algorithm for intuitionistic fuzzy sets","volume":"178","author":"Xu","year":"2008","journal-title":"Inf. Sci."},{"key":"ref_164","first-page":"90","article-title":"Intuitionistic fuzzy hierarchical clustering algorithms","volume":"20","author":"Zeshui","year":"2009","journal-title":"J. Syst. Eng. Electron."},{"key":"ref_165","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1006\/jcss.1999.1693","article-title":"Computing with membranes","volume":"61","year":"2000","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_166","doi-asserted-by":"crossref","first-page":"3763","DOI":"10.1166\/jctn.2016.5209","article-title":"A DNA genetic algorithm inspired by biological membrane structure","volume":"13","author":"Zang","year":"2016","journal-title":"J. Comput. Theor. Nanosci."},{"key":"ref_167","doi-asserted-by":"crossref","first-page":"676","DOI":"10.1002\/int.21723","article-title":"Semantically segmented clustering based on possibilistic and rough set theories","volume":"30","author":"Ammar","year":"2015","journal-title":"Int. J. Intell. Syst."},{"key":"ref_168","doi-asserted-by":"crossref","unstructured":"Tripathy, B.K., and Ghosh, A. (2011, January 22\u201324). SDR: An algorithm for clustering categorical data using rough set theory. Proceedings of the 2011 IEEE Recent Advances in Intelligent Computational Systems, Trivandrum, India.","DOI":"10.1109\/RAICS.2011.6069433"},{"key":"ref_169","first-page":"314","article-title":"SSDR: An algorithm for clustering categorical data using rough set theory","volume":"2","author":"Tripathy","year":"2011","journal-title":"Adv. Appl. Sci. Res."},{"key":"ref_170","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1016\/j.fss.2007.08.012","article-title":"A fuzzy k-partitions model for categorical data and its comparison to the GoM model","volume":"159","author":"Yang","year":"2008","journal-title":"Fuzzy Sets Syst."},{"key":"ref_171","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1016\/j.inffus.2004.03.001","article-title":"A cluster ensemble method for clustering categorical data","volume":"6","author":"Zengyou","year":"2005","journal-title":"Inf. Fusion."},{"key":"ref_172","doi-asserted-by":"crossref","first-page":"2783","DOI":"10.1016\/S0031-3203(02)00021-3","article-title":"Clustering categorical data sets using tabu search techniques","volume":"35","author":"Ng","year":"2002","journal-title":"Pattern Recognit."},{"key":"ref_173","unstructured":"Jain, A.K., and Dubes, R.C. (1988). Algorithms for Clustering Data, Prentice-Hall, Inc."},{"key":"ref_174","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1016\/j.knosys.2015.01.008","article-title":"Ensemble based rough fuzzy clustering for categorical data","volume":"77","author":"Saha","year":"2015","journal-title":"Knowl. -Based Syst."},{"key":"ref_175","doi-asserted-by":"crossref","unstructured":"Peters, J.F., and Skowron, A. (2008). Transactions on Rough Sets VIII, Springer.","DOI":"10.1007\/978-3-540-85064-9"},{"key":"ref_176","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1109\/TPAMI.2005.95","article-title":"Automated variable weighting in k-means type clustering","volume":"27","author":"Huang","year":"2005","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_177","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.knosys.2012.06.001","article-title":"A novel soft set approach in selecting clustering attribute","volume":"36","author":"Qin","year":"2012","journal-title":"Knowl. -Based Syst."},{"key":"ref_178","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1016\/j.fss.2012.06.005","article-title":"A novel fuzzy clustering algorithm with between-cluster information for categorical data","volume":"215","author":"Bai","year":"2013","journal-title":"Fuzzy Sets Syst."},{"key":"ref_179","doi-asserted-by":"crossref","first-page":"53","DOI":"10.14257\/ijdta.2013.6.5.06","article-title":"An algorithm for selecting clustering attribute using significance of attributes","volume":"6","author":"Hassanein","year":"2013","journal-title":"Int. J. Database Theory Appl."},{"key":"ref_180","doi-asserted-by":"crossref","unstructured":"Ammar, A., Elouedi, Z., and Lingras, P. (2013, January 24\u201328). The k-modes method using possibility and rough set theories. Proceedings of the 2013 Joint IFSA World Congress and NAFIPS Annual Meeting (IFSA\/NAFIPS), Edmonton, AB, Canada.","DOI":"10.1109\/IFSA-NAFIPS.2013.6608589"},{"key":"ref_181","doi-asserted-by":"crossref","first-page":"743","DOI":"10.1007\/s10115-012-0599-1","article-title":"An effective dissimilarity measure for clustering of high-dimensional categorical data","volume":"38","author":"Lee","year":"2014","journal-title":"Knowl. Inf. Syst."},{"key":"ref_182","unstructured":"Tao, L., Sheng, M., and Mitsunori, O. (2004, January 4\u20138). Entropy-based criterion in categorical clustering. Proceedings of the Twenty-First International Conference on Machine Learning, Banff, AB, Canada."},{"key":"ref_183","doi-asserted-by":"crossref","first-page":"1509","DOI":"10.1109\/TPAMI.2012.228","article-title":"The impact of cluster representatives on the convergence of the k-modes type clustering","volume":"35","author":"Liang","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_184","unstructured":"Esposito, F., Malerba, D., Tamma, V., and Bock, H.-H. (2000). Classical Resemblance Measures, Springer."},{"key":"ref_185","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1016\/j.patrec.2006.06.006","article-title":"A method to compute distance between two categorical values of same attribute in unsupervised learning for categorical data set","volume":"28","author":"Ahmad","year":"2007","journal-title":"Pattern Recognit. Lett."},{"key":"ref_186","unstructured":"Knorr, E.M., and Ng, R.T. (1998, January 24\u201327). Algorithms for mining distance-based outliers in large datasets. Proceedings of the Very Large Data Bases Conference, New York, NY, USA."},{"key":"ref_187","unstructured":"Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996, January 2\u20134). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA."},{"key":"ref_188","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1198\/016214502760047131","article-title":"Model-based clustering, discriminant analysis, and density estimation","volume":"97","author":"Fraley","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_189","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1016\/j.datak.2007.03.016","article-title":"A k-mean clustering algorithm for mixed numeric and categorical data","volume":"63","author":"Ahmad","year":"2007","journal-title":"Data Knowl. Eng."},{"key":"ref_190","doi-asserted-by":"crossref","unstructured":"Wang, C., Cao, L., Wang, M., Li, J., Wei, W., and Ou, Y. (2011, January 24\u201328). Coupled nominal similarity in unsupervised learning. Proceedings of the 20th ACM international conference on Information and knowledge management, Glasgow, Scotland.","DOI":"10.1145\/2063576.2063715"},{"key":"ref_191","doi-asserted-by":"crossref","first-page":"781","DOI":"10.1109\/TNNLS.2014.2325872","article-title":"Coupled Attribute Similarity learning on categorical data","volume":"26","author":"Wang","year":"2015","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_192","doi-asserted-by":"crossref","unstructured":"Boriah, S., Chandola, V., and Kumar, V. Similarity measures for categorical data: A comparative evaluation. Proceedings of the 2008 SIAM International Conference on Data Mining (SDM).","DOI":"10.1137\/1.9781611972788.22"},{"key":"ref_193","unstructured":"Bock, H.-H., and Diday, E. (2000). Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data, Springer Science & Business Media."},{"key":"ref_194","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1007\/s11222-007-9033-z","article-title":"A tutorial on spectral clustering","volume":"17","year":"2007","journal-title":"Stat. Comput."},{"key":"ref_195","unstructured":"Jones, K.S. (1988). Document Retrieval Systems, Taylor Graham Publishing."},{"key":"ref_196","first-page":"882","article-title":"A new similarity index based on probability","volume":"1966","author":"David","year":"1966","journal-title":"Biometrics"},{"key":"ref_197","doi-asserted-by":"crossref","first-page":"1213","DOI":"10.1016\/j.patrec.2012.01.011","article-title":"A modified short and fukunaga metric based on the attribute independence assumption","volume":"33","author":"Li","year":"2012","journal-title":"Pattern Recognit. Lett."},{"key":"ref_198","doi-asserted-by":"crossref","unstructured":"Barbar\u00e1, D., and Jajodia, S. (2002). Applications of Data Mining in Computer Security, Springer.","DOI":"10.1007\/978-1-4615-0953-0"},{"key":"ref_199","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1007\/s00357-012-9107-2","article-title":"A new class of weighted similarity indices using polytomous variables","volume":"29","author":"Morlini","year":"2012","journal-title":"J. Classif."},{"key":"ref_200","unstructured":"Lin, D. (1998, January 24\u201327). An information-theoretic definition of similarity. Proceedings of the Fifteenth International Conference on Machine Learning, Wisconson, DC, USA."},{"key":"ref_201","first-page":"1409","article-title":"A statistical method for evaluating systematic relationships","volume":"38","author":"Sokal","year":"1958","journal-title":"Univ. Kans. Sci. Bull."},{"key":"ref_202","unstructured":"Dino, I., Ruggero, G.P., and Rosa, M. (2009). Context-Based Distance Learning for Categorical Data Clustering, Springer."},{"key":"ref_203","first-page":"1","article-title":"From context to distance: Learning dissimilarity for categorical data clustering","volume":"6","author":"Dino","year":"2012","journal-title":"ACM Trans. Knowl. Discov. Data"},{"key":"ref_204","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1109\/TKDE.2007.1048","article-title":"An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data","volume":"19","author":"Liping","year":"2007","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_205","doi-asserted-by":"crossref","first-page":"3308","DOI":"10.1109\/TNNLS.2017.2728138","article-title":"Subspace clustering of categorical and numerical data with an unknown number of clusters","volume":"29","author":"Jia","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_206","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1109\/TKDE.2018.2848902","article-title":"CURE: Flexible categorical data representation by hierarchical coupling learning","volume":"31","author":"Jian","year":"2019","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_207","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1109\/TPAMI.2020.3010953","article-title":"Unsupervised heterogeneous coupling learning for categorical representation","volume":"44","author":"Zhu","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_208","first-page":"6869","article-title":"An ordinal data clustering algorithm with automated distance learning","volume":"34","author":"Zhang","year":"2020","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_209","unstructured":"Murthy, K.P.N. (2006). Ludwig boltzmann, transport equation and the second law. arXiv."},{"key":"ref_210","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.patrec.2017.07.001","article-title":"A novel density peaks clustering algorithm for mixed data","volume":"97","author":"Du","year":"2017","journal-title":"Pattern Recognit. Lett."},{"key":"ref_211","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1002\/j.1538-7305.1950.tb00463.x","article-title":"Error detecting and error correcting codes","volume":"29","author":"Hamming","year":"1950","journal-title":"Bell Syst. Tech. J."},{"key":"ref_212","first-page":"47","article-title":"A mathematical model of taxonomy","volume":"17","author":"Gambaryan","year":"1964","journal-title":"Izvest. Akad. Nauk. Armen. SSR"},{"key":"ref_213","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1007\/BF02332078","article-title":"On a method for character weighting a similarity coefficient, employing the concept of information","volume":"2","author":"Burnaby","year":"1970","journal-title":"J. Int. Assoc. Math. Geol."},{"key":"ref_214","doi-asserted-by":"crossref","first-page":"8684","DOI":"10.1016\/j.eswa.2011.01.074","article-title":"A fuzzy c-means-type algorithm for clustering of data with mixed numeric and categorical attributes employing a probabilistic dissimilarity functional","volume":"38","author":"Chatzis","year":"2011","journal-title":"Expert. Syst. Appl."},{"key":"ref_215","doi-asserted-by":"crossref","first-page":"700","DOI":"10.1016\/j.neucom.2015.08.018","article-title":"Applying subclustering and Lp distance in weighted k-means with distributed centroids","volume":"173","author":"Makarenkov","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_216","unstructured":"Mahamadou, A.J.D., Antoine, V., Nguifo, E.M., and Moreno, S. (2020, January 19\u201324). Categorical fuzzy entropy c-means. Proceedings of the 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Glasgow, UK."},{"key":"ref_217","doi-asserted-by":"crossref","first-page":"446","DOI":"10.1109\/91.784206","article-title":"A fuzzy k-modes algorithm for clustering categorical data","volume":"7","author":"Huang","year":"1999","journal-title":"IEEE Trans. Fuzzy Syst."},{"key":"ref_218","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1016\/j.asoc.2019.02.038","article-title":"New fuzzy C-means clustering method based on feature-weight and cluster-weight learning","volume":"78","author":"Hashemzadeh","year":"2019","journal-title":"Appl. Soft Comput."},{"key":"ref_219","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.neucom.2012.12.074","article-title":"Robust local feature weighting hard c-means clustering algorithm","volume":"134","author":"Zhi","year":"2014","journal-title":"Neurocomputing"},{"key":"ref_220","doi-asserted-by":"crossref","unstructured":"He, Z., Deng, S., and Xu, X. (2005). Improving k-Modes Algorithm Considering Frequencies of Attribute Values in Mode, Springer.","DOI":"10.1007\/11596448_23"},{"key":"ref_221","unstructured":"Huang, J.Z. (1997, January 11). A fast clustering algorithm to cluster very large categorical data sets in data mining. Proceedings of the Data Mining and Knowledge Discovery, Tucson, AZ, USA."},{"key":"ref_222","unstructured":"Gluck, M., and Corter, J. (1985, January 15\u201317). Information uncertainty, and the utility of categories. Proceedings of the Seventh Annual Conference of the Cognitive Science Society, Irvine, CA, USA."},{"key":"ref_223","doi-asserted-by":"crossref","first-page":"1643","DOI":"10.1007\/s00500-012-0972-8","article-title":"Rough subspace-based clustering ensemble for categorical data","volume":"17","author":"Gao","year":"2013","journal-title":"Soft Comput."},{"key":"ref_224","doi-asserted-by":"crossref","unstructured":"Chang, C.-H., and Ding, Z.-K. (2004). Categorical Data Visualization and Clustering Using Subjective Factors, Springer.","DOI":"10.1007\/978-3-540-30076-2_23"},{"key":"ref_225","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1016\/S0167-739X(97)00017-4","article-title":"Clustering techniques","volume":"13","author":"Michaud","year":"1997","journal-title":"Future Gener. Comput. Syst."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/2\/47\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:41:06Z","timestamp":1760107266000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/6\/2\/47"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,7]]},"references-count":225,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["make6020047"],"URL":"https:\/\/doi.org\/10.3390\/make6020047","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,7]]}}}