{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T17:37:45Z","timestamp":1778866665059,"version":"3.51.4"},"reference-count":57,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2019,2,1]],"date-time":"2019-02-01T00:00:00Z","timestamp":1548979200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the Key Natural Science Foundation of the Colleges and Universities in Anhui Province of China","award":["KJ2016A592, KJ2017A547, KJ2018A0564"],"award-info":[{"award-number":["KJ2016A592, KJ2017A547, KJ2018A0564"]}]},{"name":"the Outstanding Youth Talent Foundation of Hefei University","award":["16YQ11RC"],"award-info":[{"award-number":["16YQ11RC"]}]},{"name":"the Fostering Master\u2019s Degree Empowerment Point Project of Hefei University","award":["2018xs03"],"award-info":[{"award-number":["2018xs03"]}]},{"name":"the Postgraduate Research & Practice Innovation Program of Jiangsu Province of China","award":["KYCX17_0486"],"award-info":[{"award-number":["KYCX17_0486"]}]},{"name":"the Fundamental Research Funds for the Central Universities","award":["2017B708X14"],"award-info":[{"award-number":["2017B708X14"]}]},{"name":"the Research Project of Human Resources and Social Security in Hebei Province of China","award":["JRSHZ-2018-08018"],"award-info":[{"award-number":["JRSHZ-2018-08018"]}]},{"name":"the Natural Science Foundation of the Colleges and Universities in Anhui Province of China","award":["KJ2017B016"],"award-info":[{"award-number":["KJ2017B016"]}]},{"name":"Jiangsu Provincial Key Constructive Laboratory for Big Data of Psychology and Cognitive Science","award":["PDLAB201807"],"award-info":[{"award-number":["PDLAB201807"]}]},{"name":"Natural Science Project of the Higher Education Institutions of Jiangsu Province of 
China","award":["18KJD520005"],"award-info":[{"award-number":["18KJD520005"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Because mixed data with numerical and categorical attributes are ubiquitous in the real world, a variety of clustering algorithms have been developed to discover the information hidden in such data. Most existing clustering algorithms compute the distances or similarities between data objects from the original data, which may make the clustering results unstable because of noise. In this paper, a clustering framework is proposed to explore the grouping structure of mixed data. First, the categorical attributes, transformed by one-hot encoding, and the normalized numerical attributes are input to a stacked denoising autoencoder to learn the internal feature representations. Second, based on these feature representations, the distances between all data objects in the feature space are calculated, and the local density and relative distance of each data object are computed. Third, the density peaks clustering algorithm is improved and employed to allocate the data objects to different clusters. 
Finally, experiments conducted on some UCI datasets have demonstrated that our proposed algorithm for clustering mixed data outperforms three baseline algorithms in terms of the clustering accuracy and the rand index.<\/jats:p>","DOI":"10.3390\/sym11020163","type":"journal-article","created":{"date-parts":[[2019,2,1]],"date-time":"2019-02-01T11:19:58Z","timestamp":1549019998000},"page":"163","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Clustering Mixed Data Based on Density Peaks and Stacked Denoising Autoencoders"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9269-0912","authenticated-orcid":false,"given":"Baobin","family":"Duan","sequence":"first","affiliation":[{"name":"College of Computer and Information, Hohai University, Nanjing 211100, China"},{"name":"Department of Mathematics and Physics, Hefei University, Hefei 230601, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lixin","family":"Han","sequence":"additional","affiliation":[{"name":"College of Computer and Information, Hohai University, Nanjing 211100, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhinan","family":"Gou","sequence":"additional","affiliation":[{"name":"College of Computer and Information, Hohai University, Nanjing 211100, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yi","family":"Yang","sequence":"additional","affiliation":[{"name":"College of Computer and Information, Hohai University, Nanjing 211100, China"},{"name":"College of Computer Science and Technology, HuaiBei Normal University, HuaiBei 235000, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6337-2747","authenticated-orcid":false,"given":"Shuangshuang","family":"Chen","sequence":"additional","affiliation":[{"name":"Jiangsu Provincial Key Constructive Laboratory for Big Data of 
Psychology and Cognitive Science, Yancheng Teachers University, Yancheng 224002, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,2,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1109\/TCBB.2016.2622692","article-title":"Bi-level and Bi-objective p-Median Type Problems for Integrative Clustering: Application to Analysis of Cancer Gene-Expression and Drug-Response Data","volume":"15","author":"Ushakov","year":"2018","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1157","DOI":"10.1007\/s13042-015-0486-7","article-title":"Fuzzy soft subspace clustering method for gene co-expression network analysis","volume":"8","author":"Wang","year":"2017","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1007\/s40595-018-0116-x","article-title":"A hybrid mobile call fraud detection model using optimized fuzzy C-means clustering and group method of data handling-based network","volume":"5","author":"Subudhi","year":"2018","journal-title":"Vietnam J. Comput. Sci."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1017","DOI":"10.1007\/s10586-017-0792-9","article-title":"Improved SLIC imagine segmentation algorithm based on K-means","volume":"20","author":"Han","year":"2017","journal-title":"Clust. Comput."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1007\/s11634-017-0280-3","article-title":"Cluster-based sparse topical coding for topic mining and document clustering","volume":"12","author":"Ahmadi","year":"2018","journal-title":"Adv. Data Anal. 
Classif."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s13278-018-0508-z","article-title":"Fine-grained document clustering via ranking and its application to social media analytics","volume":"8","author":"Sutanto","year":"2018","journal-title":"Soc. Netw. Anal. Min."},{"key":"ref_7","unstructured":"MacQueen, J. (July, January 21). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA."},{"key":"ref_8","unstructured":"Ester, M., Kriegel, H.P., and Xu, X. (1996, January 2\u20134). A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining(KDD\u201996), Portland, OR, USA."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1147\/rd.175.0420","article-title":"Lower Bounds for the Partitioning of Graphs","volume":"17","author":"Donath","year":"1973","journal-title":"IBM J. Res. Dev."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.asoc.2017.12.004","article-title":"A spectral clustering method with semantic interpretation based on axiomatic fuzzy set theory","volume":"64","author":"Wang","year":"2018","journal-title":"Appl. Soft Comput."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"955","DOI":"10.1007\/s11590-015-0980-6","article-title":"A min-cut approach to functional regionalization, with a case study of the Italian local labour market areas","volume":"10","author":"Bianchi","year":"2016","journal-title":"Optim. Lett."},{"key":"ref_12","unstructured":"Huang, Z. (1997, January 22\u201323). Clustering Large Data Sets with Mixed Numeric and Categorical Values. 
Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD\u201997), Singapore."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2228","DOI":"10.1016\/j.patcog.2013.01.027","article-title":"Categorical-and-numerical-attribute data clustering based on a unified similarity metric without knowing cluster number","volume":"46","author":"Cheung","year":"2013","journal-title":"Pattern Recognit."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1016\/j.knosys.2017.07.027","article-title":"An entropy-based density peaks clustering algorithm for mixed type data employing fuzzy neighborhood","volume":"133","author":"Ding","year":"2017","journal-title":"Knowl. Based Syst."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1016\/0167-8655(95)00075-R","article-title":"A conceptual version of the K-means algorithm","volume":"16","author":"Ralambondrainy","year":"1995","journal-title":"Pattern Recognit. Lett."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1077","DOI":"10.1002\/int.20108","article-title":"Scalable algorithms for clustering large datasets with mixed type attributes","volume":"20","author":"He","year":"2010","journal-title":"Int. J. Intell. Syst."},{"key":"ref_17","unstructured":"Huang, Z. (1997, January 11). A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining. Proceedings of the SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD\u201997), Tucson, AZ, USA."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/j.knosys.2012.01.006","article-title":"A fuzzy k-prototype clustering algorithm for mixed numeric and categorical data","volume":"30","author":"Ji","year":"2012","journal-title":"Knowl. 
Based Syst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"8684","DOI":"10.1016\/j.eswa.2011.01.074","article-title":"A fuzzy c-means-type algorithm for clustering of data with mixed numeric and categorical attributes employing a probabilistic dissimilarity functional","volume":"38","author":"Chatzis","year":"2011","journal-title":"Exp. Syst. Appl."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1492","DOI":"10.1126\/science.1242072","article-title":"Clustering by fast search and find of density peaks","volume":"344","author":"Rodriguez","year":"2014","journal-title":"Science"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1016\/j.patrec.2017.07.001","article-title":"A novel density peaks clustering algorithm for mixed data","volume":"97","author":"Du","year":"2017","journal-title":"Pattern Recognit. Lett."},{"key":"ref_22","first-page":"5060842","article-title":"Clustering Mixed Data by Fast Search and Find of Density Peaks","volume":"2017","author":"Liu","year":"2017","journal-title":"Math. Probl. Eng."},{"key":"ref_23","unstructured":"Xie, J., Girshick, R., and Farhadi, A. (2016, January 19\u201324). Unsupervised Deep Embedding for Clustering Analysis. Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), New York, NY, USA."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1016\/j.patcog.2018.05.019","article-title":"Discriminatively boosted image clustering with fully convolutional auto-encoders","volume":"83","author":"Li","year":"2018","journal-title":"Pattern Recognit."},{"key":"ref_25","unstructured":"Chen, G. (2015, January 13). Deep Learning with Nonparametric Clustering. 
Available online: http:\/\/arxiv.org\/abs\/1501.03084."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1177","DOI":"10.1016\/j.eswa.2007.08.049","article-title":"Incremental clustering of mixed data based on distance hierarchy","volume":"35","author":"Hsu","year":"2008","journal-title":"Expert Syst. Appl."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zhang, K., Wang, Q., Chen, Z., Marsic, I., Kumar, V., Jiang, G., and Zhang, J. (May, January 30). From Categorical to Numerical: Multiple Transitive Distance Learning and Embedding. Proceedings of the 2015 SIAM International Conference on Data Mining (SIAM 2015), Vancouver, BC, Canada.","DOI":"10.1137\/1.9781611974010.6"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"416","DOI":"10.1016\/j.patcog.2011.07.006","article-title":"SpectralCAT: Categorical spectral clustering of numerical and nominal data","volume":"45","author":"David","year":"2012","journal-title":"Pattern Recognit."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3308","DOI":"10.1109\/TNNLS.2017.2728138","article-title":"Subspace Clustering of Categorical and Numerical Data with an Unknown Number of Clusters","volume":"29","author":"Jia","year":"2018","journal-title":"IEEE Trans. Neural. Netw. Learn. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Gong, M., Ma, J., Jiao, L., and Wu, Q. (2010, January 18\u201323). Unsupervised evolutionary clustering algorithm for mixed type data. Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2010), Barcelona, Spain.","DOI":"10.1109\/CEC.2010.5586136"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1313","DOI":"10.1007\/s10586-017-0818-3","article-title":"A novel DBSCAN with entropy and probability for mixed data","volume":"20","author":"Liu","year":"2017","journal-title":"Clust. Comput."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Behzadi, S., Ibrahim, M.A., and Plant, C. 
(2018, January 3\u20136). Parameter Free Mixed-Type Density-Based Clustering. Proceedings of the 29th International Conference Database and Expert Systems Applications (DEXA 2018), Regensburg, Germany.","DOI":"10.1007\/978-3-319-98812-2_2"},{"key":"ref_33","first-page":"3371","article-title":"Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion","volume":"11","author":"Vincent","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1109\/TMM.2017.2745702","article-title":"CNN-Based Joint Clustering and Representation Learning with Feature Drift Compensation for Large-Scale Image Data","volume":"20","author":"Hsu","year":"2018","journal-title":"IEEE Trans. Multimed."},{"key":"ref_35","unstructured":"Kingma, D.P., and Welling, M. (2014, May 01). Auto-Encoding Variational Bayes. Available online: http:\/\/arxiv.org\/abs\/1312.6114."},{"key":"ref_36","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014, January 8\u201313). Generative Adversarial Nets. Proceedings of the Advances in Neural Information Processing Systems 27 (NIPS\u201914), Montreal, QC, Canada."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Jiang, Z., Zheng, Y., Tan, H., Tang, B., and Zhou, H. (2017, January 19\u201325). Variational Deep Embedding: An Unsupervised and Generative Approach to Clustering. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI 2017), Melbourne, Australia.","DOI":"10.24963\/ijcai.2017\/273"},{"key":"ref_38","unstructured":"Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., and Abbeel, P. (2016, January 5\u201310). InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. 
Proceedings of the Advances in Neural Information Processing Systems 29 (NIPS\u201916), Barcelona, Spain."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1605","DOI":"10.1109\/ACCESS.2015.2477216","article-title":"Clustering Data of Mixed Categorical and Numerical Type with Unsupervised Feature Learning","volume":"3","author":"Lam","year":"2015","journal-title":"IEEE Access"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"11687","DOI":"10.1109\/ACCESS.2017.2759509","article-title":"A High-Order Clustering Algorithm Based on Dropout Deep Learning for Heterogeneous Data in Cyber-Physical-Social Systems","volume":"6","author":"Bu","year":"2018","journal-title":"IEEE Access"},{"key":"ref_41","unstructured":"Aljalbout, E., Golkov, V., Siddiqui, Y., and Cremers, D. (2018, September 13). Clustering with Deep Learning: Taxonomy and New Methods. Available online: http:\/\/arxiv.org\/abs\/1801.07648."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"39501","DOI":"10.1109\/ACCESS.2018.2855437","article-title":"A Survey of Clustering with Deep Learning: From the Perspective of Network Architecture","volume":"6","author":"Min","year":"2018","journal-title":"IEEE Access"},{"key":"ref_43","unstructured":"Zhang, W., Du, T., and Wang, J. (2016, January 20\u201323). Deep Learning over Multi-field Categorical Data: A Case Study on User Response Prediction. Proceedings of the European Conference on Information Retrieval (ECIR 2016), Padua, Italy."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Bengio, Y., Lamblin, P., Dan, P., and Larochelle, H. (2006, January 4\u20137). Greedy layer-wise training of deep networks. Proceedings of the Advances in Neural Information Processing Systems 19 (NIPS\u201906), Vancouver, BC, Canada.","DOI":"10.7551\/mitpress\/7503.003.0024"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Ranzato, M.A., Poultney, C.S., Chopra, S., and LeCun, Y. (2006, January 4\u20137). 
Efficient Learning of Sparse Representations with an Energy-Based Model. Proceedings of the Advances in Neural Information Processing Systems 19 (NIPS\u201906), Vancouver, BC, Canada.","DOI":"10.7551\/mitpress\/7503.003.0147"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1016\/j.neucom.2017.03.014","article-title":"Knock-Knock: Acoustic object recognition by using stacked denoising autoencoders","volume":"267","author":"Luo","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_47","first-page":"257","article-title":"Adaptive Subgradient Methods for Online Learning and Stochastic Optimization","volume":"12","author":"Duchi","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_49","first-page":"1929","article-title":"Dropout: A Simple Way to Prevent Neural Networks from Overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1109\/TPAMI.2016.2572683","article-title":"Fully Convolutional Networks for Semantic Segmentation","volume":"39","author":"Shelhamer","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1007\/s00779-016-0954-4","article-title":"Adaptive fuzzy clustering by fast search and find of density peaks","volume":"20","author":"Bie","year":"2016","journal-title":"Personal Ubiquitous Comput."},{"key":"ref_52","unstructured":"Salvador, S., and Chan, P. (2004, January 15\u201317). Determining the number of clusters\/segments in hierarchical clustering\/segmentation algorithms. 
Proceedings of the 2004 IEEE 16th International Conference on Tools with Artificial Intelligence (ICTAI 2004), Boca Raton, FL, USA."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/j.solener.2014.01.021","article-title":"On the determination of coherent solar microclimates for utility planning and operations","volume":"102","author":"Zagouras","year":"2014","journal-title":"Sol. Energy"},{"key":"ref_54","unstructured":"Dua, D., and Taniskidou, E.K. (2019, January 05). UCI Machine Learning Repository. Available online: http:\/\/archive.ics.uci.edu\/ml."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"2047","DOI":"10.1109\/TNNLS.2015.2451151","article-title":"Space Structure and Clustering of Categorical Data","volume":"27","author":"Qian","year":"2017","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1080\/01621459.1971.10482356","article-title":"Objective Criteria for the Evaluation of Clustering Methods","volume":"66","author":"Rand","year":"1971","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1002\/nav.3800020109","article-title":"The Hungarian method for the assignment problem","volume":"2","author":"Kuhn","year":"1955","journal-title":"Nav. Res. 
Logist."}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/2\/163\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:30:13Z","timestamp":1760185813000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/11\/2\/163"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,2,1]]},"references-count":57,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2019,2]]}},"alternative-id":["sym11020163"],"URL":"https:\/\/doi.org\/10.3390\/sym11020163","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,2,1]]}}}