{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T10:06:08Z","timestamp":1760609168046,"version":"build-2065373602"},"reference-count":38,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2019,7,12]],"date-time":"2019-07-12T00:00:00Z","timestamp":1562889600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61562027"],"award-info":[{"award-number":["61562027"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Although within-cluster information is commonly used in most clustering approaches, other important information such as between-cluster information is rarely considered in some cases. Hence, in this study, we propose a new novel measure of between-cluster distance in subspace, which is to maximize the distance between the center of a cluster and the points that do not belong to this cluster. Based on this idea, we firstly design an optimization objective function integrating the between-cluster distance and entropy regularization in this paper. Then, updating rules are given by theoretical analysis. In the following, the properties of our proposed algorithm are investigated, and the performance is evaluated experimentally using two synthetic and seven real-life datasets. Finally, the experimental studies demonstrate that the results of the proposed algorithm (ERKM) outperform most existing state-of-the-art k-means-type clustering algorithms in most cases.<\/jats:p>","DOI":"10.3390\/e21070683","type":"journal-article","created":{"date-parts":[[2019,7,12]],"date-time":"2019-07-12T11:49:38Z","timestamp":1562932178000},"page":"683","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["An Entropy Regularization k-Means Algorithm with a New Measure of between-Cluster Distance in Subspace Clustering"],"prefix":"10.3390","volume":"21","author":[{"given":"Liyan","family":"Xiong","sequence":"first","affiliation":[{"name":"School of Information Engineering Department, East China Jiaotong University, R.d 808, East Shuanggang Avenue, Nanchang 330013, China"}]},{"given":"Cheng","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Information Engineering Department, East China Jiaotong University, R.d 808, East Shuanggang Avenue, Nanchang 330013, China"}]},{"given":"Xiaohui","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Information Engineering Department, East China Jiaotong University, R.d 808, East Shuanggang Avenue, Nanchang 330013, China"}]},{"given":"Hui","family":"Zeng","sequence":"additional","affiliation":[{"name":"School of Information Engineering Department, East China Jiaotong University, R.d 808, East Shuanggang Avenue, Nanchang 330013, China"}]}],"member":"1968","published-online":{"date-parts":[[2019,7,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1023\/A:1009769707641","article-title":"Extensions to the k-means algorithm for clustering large datasets with categorical values","volume":"2","author":"Huang","year":"1998","journal-title":"Data Min. Knowl. Discov."},{"key":"ref_2","unstructured":"MacQueen, J. (July, January 21). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1007\/BF01908720","article-title":"A preliminary study of optimal variable weighting in k-means clustering","volume":"7","author":"Green","year":"1990","journal-title":"J. Classif."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"947","DOI":"10.1093\/bioinformatics\/btt064","article-title":"Phylogenomic clustering for selecting non-redundant genomes for comparative genomics","volume":"29","author":"ElSherbiny","year":"2013","journal-title":"Bioinformatics"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"767","DOI":"10.1016\/j.patcog.2009.09.010","article-title":"Enhanced soft subspace clustering integrating within-cluster and between-cluster information","volume":"43","author":"Deng","year":"2010","journal-title":"Pattern Recognit."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Sardana, M., and Agrawal, R. (2012). A comparative study of clustering methods for relevant gene selection in microarray data. Advances in Computer Science, Engineering & Applications, Springer.","DOI":"10.1007\/978-3-642-30157-5_78"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1109\/TKDE.2011.159","article-title":"Identifying evolving groups in dynamic multimode networks","volume":"24","author":"Tang","year":"2012","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"264","DOI":"10.1145\/331499.331504","article-title":"Data clustering: A review","volume":"31","author":"Jain","year":"1999","journal-title":"ACM Comput. Surv."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1016\/S0893-6080(01)00108-3","article-title":"Projective ART for clustering datasets in high dimensional spaces","volume":"15","author":"Cao","year":"2002","journal-title":"Neural Netw."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1109\/TPAMI.2005.95","article-title":"Automated variable weighting in k-means type clustering","volume":"27","author":"Huang","year":"2005","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1007\/BF02294206","article-title":"Synthesized clustering: A method for amalgamating alternative clustering bases with differential weighting of variables","volume":"49","author":"DeSarbo","year":"1984","journal-title":"Psychometrika"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1007\/BF00227423","article-title":"Optimal variable weighting for ultrametric and additive tree clustering","volume":"20","year":"1986","journal-title":"Qual. Quant."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1007\/BF01901677","article-title":"OVWTRE: A program for optimal variable weighting for ultrametric and additive tree fitting","volume":"5","year":"1988","journal-title":"J. Classif."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1007\/s00357-001-0018-x","article-title":"Optimal variable weighting for ultrametric and additive trees and k-means partitioning: Methods and software","volume":"18","author":"Makarenkov","year":"2001","journal-title":"J. Classif."},{"key":"ref_15","first-page":"320","article-title":"Noisy sparse subspace clustering","volume":"17","author":"Wang","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1109\/TKDE.2007.1048","article-title":"An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data","volume":"19","author":"Jing","year":"2007","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1016\/j.patrec.2004.09.016","article-title":"A novel fuzzy clustering algorithm based on a fuzzy scatter matrix with optimality tests","volume":"26","author":"Wu","year":"2005","journal-title":"Pattern Recognit. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1433","DOI":"10.1109\/TNNLS.2013.2293795","article-title":"Extensions of kmeans-type algorithms: A new clustering framework by integrating intracluster compactness and intercluster separation","volume":"25","author":"Huang","year":"2014","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1016\/j.knosys.2014.07.009","article-title":"DSKmeans: A new kmeans-type approach to discriminative subspace clustering","volume":"70","author":"Huang","year":"2014","journal-title":"Knowl.-Based Syst."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Han, K.J., and Narayanan, S.S. (April, January 31). Novel inter-cluster distance measure combining GLR and ICR for improved agglomerative hierarchical speaker clustering. Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA.","DOI":"10.1109\/ICASSP.2008.4518624"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1016\/j.fss.2012.06.005","article-title":"A novel fuzzy clustering algorithm with between-cluster information for categorical data","volume":"215","author":"Bai","year":"2013","journal-title":"Fuzzy Sets Syst."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/j.neucom.2013.11.024","article-title":"The k-modes type clustering plus between-cluster information for categorical data","volume":"133","author":"Bai","year":"2014","journal-title":"Neurocomputing"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1016\/j.neucom.2015.09.127","article-title":"Fuzzy clustering with the entropy of attribute weights","volume":"198","author":"Zhou","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1016\/j.ins.2016.01.101","article-title":"A survey on soft subspace clustering","volume":"348","author":"Deng","year":"2016","journal-title":"Inf. Sci."},{"key":"ref_25","first-page":"1265","article-title":"Sparse k-means with \u2113\u221e\/\u21130 penalty for high-dimensional data clustering","volume":"28","author":"Chang","year":"2018","journal-title":"Stat. Sin."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1198\/jasa.2010.tm09415","article-title":"A framework for feature selection in clustering","volume":"105","author":"Witten","year":"2010","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_27","first-page":"1145","article-title":"Penalized model-based clustering with application to variable selection","volume":"8","author":"Pan","year":"2007","journal-title":"J. Mach. Learn. Res."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhou, J., and Chen, C.P. (2011, January 8\u201310). Attribute weight entropy regularization in fuzzy c-means algorithm for feature selection. Proceedings of the 2011 International Conference on System Science and Engineering, Macao, China.","DOI":"10.1109\/ICSSE.2011.5961874"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"23","DOI":"10.5121\/ijdkp.2015.5203","article-title":"Improved Text Clustering with Neighbours","volume":"5","author":"Govardhan","year":"2015","journal-title":"Int. J. Data Min. Knowl. Manag. Process"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1016\/j.patcog.2017.10.011","article-title":"Comment on \u201cEnhanced soft subspace clustering integrating within-cluster and between-cluster information\u201d by Z. Deng et al. (Pattern Recognition, vol. 43, pp. 767\u2013781, 2010)","volume":"77","author":"Forghani","year":"2018","journal-title":"Pattern Recognit."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1109\/TSMCA.2007.909595","article-title":"Automatic clustering using an improved differential evolution algorithm","volume":"38","author":"Das","year":"2008","journal-title":"IEEE Trans. Syst. Man Cybern. Part A Syst. Hum."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"379","DOI":"10.1016\/S0167-9473(02)00183-4","article-title":"Modelling high-dimensional data by mixtures of factor analyzers","volume":"41","author":"McLachlan","year":"2003","journal-title":"Comput. Stat. Data Anal."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"2616","DOI":"10.1109\/TCYB.2016.2627686","article-title":"Sparse Regularization in Fuzzy c-Means for High-Dimensional Data Clustering","volume":"47","author":"Chang","year":"2017","journal-title":"IEEE Trans. Cybern."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TPAMI.1980.4766964","article-title":"A convergence theorem for the fuzzy ISODATA clustering algorithms","volume":"PAMI-2","author":"Bezdek","year":"1980","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1109\/TPAMI.1984.4767478","article-title":"K-means-type algorithms: A generalized convergence theorem and characterization of local optimality","volume":"PAMI-6","author":"Selim","year":"1984","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_36","unstructured":"Bachem, O., Lucic, M., Hassani, H., and Krause, A. (2016, January 5\u201310). Fast and provably good seedings for k-means. Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_37","unstructured":"Tarn, C., Zhang, Y., and Feng, Y. (2018). Sampling Clustering. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"4081","DOI":"10.1109\/TIT.2018.2812824","article-title":"Noisy subspace clustering via matching pursuits","volume":"64","author":"Tschannen","year":"2018","journal-title":"IEEE Trans. Inf. Theory"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/7\/683\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:05:01Z","timestamp":1760187901000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/7\/683"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,7,12]]},"references-count":38,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2019,7]]}},"alternative-id":["e21070683"],"URL":"https:\/\/doi.org\/10.3390\/e21070683","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2019,7,12]]}}}