{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T15:33:20Z","timestamp":1772120000398,"version":"3.50.1"},"reference-count":57,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2022,6,22]],"date-time":"2022-06-22T00:00:00Z","timestamp":1655856000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,6,22]],"date-time":"2022-06-22T00:00:00Z","timestamp":1655856000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004063","name":"Knut och Alice Wallenbergs Stiftelse","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004063","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002835","name":"Chalmers University of Technology","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100002835","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2023,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Several clustering methods (e.g., <jats:italic>Normalized Cut<\/jats:italic> and <jats:italic>Ratio Cut<\/jats:italic>) divide the <jats:italic>Min Cut<\/jats:italic> cost function by a cluster dependent factor (e.g., the size or the degree of the clusters), in order to yield a more balanced partitioning. We, instead, investigate adding such regularizations to the original cost function. We first consider the case where the regularization term is the sum of the squared size of the clusters, and then generalize it to adaptive regularization of the pairwise similarities. This leads to shifting (adaptively) the pairwise similarities which might make some of them negative. 
We then study the connection of this method to <jats:italic>Correlation Clustering<\/jats:italic> and then propose an efficient <jats:italic>local search<\/jats:italic> optimization algorithm with fast theoretical convergence rate to solve the new clustering problem. In the following, we investigate the shift of pairwise similarities on some common clustering methods, and finally, we demonstrate the superior performance of the method by extensive experiments on different datasets.<\/jats:p>","DOI":"10.1007\/s10994-022-06189-6","type":"journal-article","created":{"date-parts":[[2022,6,22]],"date-time":"2022-06-22T17:04:45Z","timestamp":1655917485000},"page":"2025-2051","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Shift of pairwise similarities for data clustering"],"prefix":"10.1007","volume":"112","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2912-7422","authenticated-orcid":false,"given":"Morteza","family":"Haghir Chehreghani","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,6,22]]},"reference":[{"key":"6189_CR1","doi-asserted-by":"crossref","unstructured":"Bailey, K. (1994). Numerical taxonomy and cluster analysis. SAGE Publications.","DOI":"10.4135\/9781412986397.n3"},{"issue":"1\u20133","key":"6189_CR2","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1023\/B:MACH.0000033116.57574.95","volume":"56","author":"N Bansal","year":"2004","unstructured":"Bansal, N., Blum, A., & Chawla, S. (2004). Correlation clustering. Machine Learning, 56(1\u20133), 89\u2013113.","journal-title":"Machine Learning"},{"key":"6189_CR3","doi-asserted-by":"crossref","unstructured":"B\u00fchler, T., & Hein, M. (2009). Spectral clustering based on the graph p-laplacian. In Proceedings of the 26th annual international conference on machine learning, ICML \u201909, pp. 81\u201388. 
ACM.","DOI":"10.1145\/1553374.1553385"},{"issue":"7","key":"6189_CR4","doi-asserted-by":"publisher","first-page":"984","DOI":"10.1016\/j.cviu.2010.12.004","volume":"115","author":"SR Bul\u00f2","year":"2011","unstructured":"Bul\u00f2, S. R., Pelillo, M., & Bomze, I. M. (2011). Graph-based quadratic optimization: A fast evolutionary approach. Computer Vision and Image Understanding, 115(7), 984\u2013995.","journal-title":"Computer Vision and Image Understanding"},{"issue":"4","key":"6189_CR5","doi-asserted-by":"publisher","first-page":"476","DOI":"10.1037\/h0054116","volume":"38","author":"RB Cattell","year":"1943","unstructured":"Cattell, R. B. (1943). The description of personality: Basic traits resolved into clusters. The Journal of Abnormal and Social Psychology, 38(4), 476\u2013506.","journal-title":"The Journal of Abnormal and Social Psychology"},{"issue":"9","key":"6189_CR6","doi-asserted-by":"publisher","first-page":"1088","DOI":"10.1109\/43.310898","volume":"13","author":"PK Chan","year":"1994","unstructured":"Chan, P. K., Schlag, M. D. F., & Zien, J. Y. (1994). Spectral k-way ratio-cut partitioning and clustering. IEEE Transactions on CAD of Integrated Circuits and Systems, 13(9), 1088\u20131096.","journal-title":"IEEE Transactions on CAD of Integrated Circuits and Systems"},{"key":"6189_CR7","unstructured":"Chehreghani, M. H. (2013). Information-theoretic validation of clustering algorithms. PhD thesis, ETH Zurich."},{"key":"6189_CR8","doi-asserted-by":"crossref","unstructured":"Chehreghani, M. H. (2017). Clustering by shift. In 2017 IEEE international conference on data mining, ICDM, pp. 793\u2013798.","DOI":"10.1109\/ICDM.2017.94"},{"key":"6189_CR9","doi-asserted-by":"crossref","unstructured":"Chehreghani, M. H. (2021). Reliable agglomerative clustering. In International joint conference on neural networks (IJCNN). IEEE.","DOI":"10.1109\/IJCNN52387.2021.9534228"},{"key":"6189_CR10","unstructured":"Chehreghani, M. H., Busetto, A. G., & Buhmann, J. M. 
(2012). Information theoretic model validation for spectral clustering. In Proceedings of the fifteenth international conference on artificial intelligence and statistics, AISTATS, vol.\u00a022, pp. 495\u2013503."},{"issue":"2\u20133","key":"6189_CR11","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1007\/s10994-016-5573-9","volume":"104","author":"MH Chehreghani","year":"2016","unstructured":"Chehreghani, M. H. (2016). Adaptive trajectory analysis of replicator dynamics for data clustering. Machine Learning, 104(2\u20133), 271\u2013289.","journal-title":"Machine Learning"},{"issue":"1","key":"6189_CR12","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1016\/j.datak.2008.06.006","volume":"67","author":"MH Chehreghani","year":"2008","unstructured":"Chehreghani, M. H., Abolhassani, H., & Chehreghani, M. H. (2008). Improving density-based methods for hierarchical clustering of web pages. Data & Knowledge Engineering, 67(1), 30\u201350.","journal-title":"Data & Knowledge Engineering"},{"key":"6189_CR13","first-page":"211","volume":"18","author":"Y Chen","year":"2005","unstructured":"Chen, Y., Zhang, Y., & Ji, X. (2005). Size regularized cut for data clustering. Advances in Neural Information Processing Systems (NIPS), 18, 211\u2013218.","journal-title":"Advances in Neural Information Processing Systems (NIPS)"},{"issue":"2\u20133","key":"6189_CR14","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1016\/j.tcs.2006.05.008","volume":"361","author":"ED Demaine","year":"2006","unstructured":"Demaine, E. D., Emanuel, D., Fiat, A., & Immorlica, N. (2006). Correlation clustering in general weighted graphs. Theoretical Computer Science, 361(2\u20133), 172\u2013187.","journal-title":"Theoretical Computer Science"},{"key":"6189_CR15","unstructured":"Demetriou, A., A\u00e5g, H., Rahrovani, S., & Chehreghani, M. H. (2020). A deep learning framework for generation and analysis of driving scenario trajectories. 
CoRR, arXiv: 2007.14524."},{"key":"6189_CR16","doi-asserted-by":"crossref","unstructured":"Dhillon, I. S., Guan, Y., & Kulis, B. (2004). Kernel k-means: Spectral clustering and normalized cuts. In Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining, KDD \u201904, pp. 551\u2013556. ACM.","DOI":"10.1145\/1014052.1014118"},{"key":"6189_CR17","doi-asserted-by":"crossref","unstructured":"Dhillon, I. S., Guan, Y., & Kulis, B. (2005). A unified view of kernel k-means, spectral clustering and graph cuts. Technical Report TR-04-25.","DOI":"10.1145\/1014052.1014118"},{"key":"6189_CR18","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1016\/j.tcs.2020.07.022","volume":"842","author":"H Ding","year":"2020","unstructured":"Ding, H. (2020). Faster balanced clusterings in high dimension. Theoretical Computer Science, 842, 28\u201340.","journal-title":"Theoretical Computer Science"},{"key":"6189_CR19","unstructured":"Ester, M., Kriegel, H.-P., Sander, J., & Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of the second international conference on knowledge discovery and data mining (KDD), pp. 226\u2013231."},{"key":"6189_CR20","doi-asserted-by":"crossref","unstructured":"Frank, M., Chehreghani, M. H., & Buhmann, J. M. (2011). The minimum transfer cost principle for model-order selection. In European conference on machine learning and knowledge discovery in databases (ECML-PKDD), Lecture Notes in Computer Science, pp. 423\u2013438.","DOI":"10.1007\/978-3-642-23780-5_37"},{"issue":"1","key":"6189_CR21","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1287\/moor.19.1.24","volume":"19","author":"O Goldschmidt","year":"1994","unstructured":"Goldschmidt, O., & Hochbaum, D. S. (1994). A polynomial algorithm for the k-cut problem for fixed k. 
Mathematics of Operations Research, 19(1), 24\u201337.","journal-title":"Mathematics of Operations Research"},{"issue":"10","key":"6189_CR22","doi-asserted-by":"publisher","first-page":"3059","DOI":"10.1109\/TNNLS.2018.2870131","volume":"30","author":"J Han","year":"2019","unstructured":"Han, J., Liu, H., & Nie, F. (2019). A local and global discriminative framework and optimization for balanced clustering. IEEE Transactions on Neural Networks and Learning Systems, 30(10), 3059\u20133071.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"issue":"1","key":"6189_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/34.566806","volume":"19","author":"T Hofmann","year":"1997","unstructured":"Hofmann, T., & Buhmann, J. M. (1997). Pairwise data clustering by deterministic annealing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(1), 1\u201314.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"1","key":"6189_CR24","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1007\/BF01908075","volume":"2","author":"L Hubert","year":"1985","unstructured":"Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193\u2013218.","journal-title":"Journal of Classification"},{"issue":"4","key":"6189_CR25","doi-asserted-by":"publisher","first-page":"601","DOI":"10.1145\/234533.234534","volume":"43","author":"DR Karger","year":"1996","unstructured":"Karger, D. R., & Stein, C. (1996). A new approach to the minimum cut problem. Journal of the ACM (JACM), 43(4), 601\u2013640.","journal-title":"Journal of the ACM (JACM)"},{"issue":"4","key":"6189_CR26","doi-asserted-by":"publisher","first-page":"373","DOI":"10.1093\/comjnl\/9.4.373","volume":"9","author":"GN Lance","year":"1967","unstructured":"Lance, G. N., & Williams, W. T. (1967). A general theory of classificatory sorting strategies. 
The Computer Journal, 9(4), 373\u2013380.","journal-title":"The Computer Journal"},{"issue":"6","key":"6189_CR27","doi-asserted-by":"publisher","first-page":"787","DOI":"10.1145\/331524.331526","volume":"46","author":"T Leighton","year":"1999","unstructured":"Leighton, T., & Rao, S. (1999). Multicommodity max-flow min-cut theorems and their use in designing approximation algorithms. Journal of the ACM (JACM), 46(6), 787\u2013832.","journal-title":"Journal of the ACM (JACM)"},{"key":"6189_CR28","unstructured":"Lichman, M. (2013). UCI machine learning repository."},{"key":"6189_CR29","unstructured":"Lin, F., & Cohen, W. W. (2010). Power iteration clustering. In Proceedings of the 27th international conference on machine learning (ICML-10), pp. 655\u2013662."},{"key":"6189_CR30","doi-asserted-by":"crossref","unstructured":"Lin, W., He, Z., & Xiao, M. (2019). Balanced clustering: A uniform model and fast algorithm. In Proceedings of the twenty-eighth international joint conference on artificial intelligence (IJCAI), pp. 2987\u20132993. International Joint Conferences on Artificial Intelligence Organization.","DOI":"10.24963\/ijcai.2019\/414"},{"key":"6189_CR31","doi-asserted-by":"crossref","unstructured":"Liu, H., Han, J., Nie, F., & Li, X. (2017). Balanced clustering with least square regression. In Proceedings of the thirty-first AAAI conference on artificial intelligence, pp. 2231\u20132237. AAAI Press.","DOI":"10.1609\/aaai.v31i1.10877"},{"key":"6189_CR32","doi-asserted-by":"crossref","unstructured":"Liu, H., Huang, Z., Chen, Q., Li, M., Fu, Y., & Zhang, L. (2018). Fast clustering with flexible balance constraints. In IEEE international conference on big data (big data), pp. 743\u2013750.","DOI":"10.1109\/BigData.2018.8621917"},{"issue":"9","key":"6189_CR33","doi-asserted-by":"publisher","first-page":"2131","DOI":"10.1109\/TPAMI.2013.16","volume":"35","author":"H Liu","year":"2013","unstructured":"Liu, H., Latecki, L. J., & Yan, S. (2013). 
Fast detection of dense subgraphs with iterative shrinking and expansion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(9), 2131\u20132142.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"4","key":"6189_CR34","doi-asserted-by":"publisher","first-page":"395","DOI":"10.1007\/s11222-007-9033-z","volume":"17","author":"U Luxburg","year":"2007","unstructured":"Luxburg, U. (2007). A tutorial on spectral clustering. Statistics and Computing, 17(4), 395\u2013416.","journal-title":"Statistics and Computing"},{"key":"6189_CR35","unstructured":"Macqueen, J. (1967). Some methods for classification and analysis of multivariate observations. In 5-th Berkeley symposium on mathematical statistics and probability, pp. 281\u2013297."},{"key":"6189_CR36","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1007\/978-3-662-44415-3_4","volume-title":"Structural, syntactic, and statistical pattern recognition","author":"MI Malinen","year":"2014","unstructured":"Malinen, M. I., & Fr\u00e4nti, P. (2014). Balanced k-means for clustering. In P. Fr\u00e4nti, G. Brown, M. Loog, F. Escolano, & M. Pelillo (Eds.), Structural, syntactic, and statistical pattern recognition, Lecture Notes in Computer Science, vol. 8621, pp. 32\u201341. Berlin, Heidelberg: Springer. https:\/\/doi.org\/10.1007\/978-3-662-44415-3_4."},{"key":"6189_CR37","doi-asserted-by":"crossref","unstructured":"Manning, C. D., Raghavan, P., & Sch\u00fctze, H. (2008). Introduction to information retrieval. Cambridge University Press.","DOI":"10.1017\/CBO9780511809071"},{"key":"6189_CR38","first-page":"849","volume":"14","author":"AY Ng","year":"2001","unstructured":"Ng, A. Y., Jordan, M. I., & Weiss, Y. (2001). On spectral clustering: Analysis and an algorithm. 
Advances in Neural Information Processing Systems, 14, 849\u2013856.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"3","key":"6189_CR39","doi-asserted-by":"publisher","first-page":"576","DOI":"10.1109\/TMI.2011.2173699","volume":"31","author":"B Ng","year":"2012","unstructured":"Ng, B., McKeown, M. J., & Abugharbieh, R. (2012). Group replicator dynamics: A novel group-wise evolutionary approach for sparse brain network detection. IEEE Transactions on Medical Imaging, 31(3), 576\u2013585.","journal-title":"IEEE Transactions on Medical Imaging"},{"key":"6189_CR40","doi-asserted-by":"crossref","unstructured":"Pavan, M., & Pelillo, M. (2003). Dominant sets and hierarchical clustering. In 9th IEEE international conference on computer vision (ICCV), pp. 362\u2013369.","DOI":"10.1109\/ICCV.2003.1238367"},{"issue":"1","key":"6189_CR41","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1109\/TPAMI.2007.250608","volume":"29","author":"M Pavan","year":"2007","unstructured":"Pavan, M., & Pelillo, M. (2007). Dominant sets and pairwise clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(1), 167\u2013172.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"6189_CR42","doi-asserted-by":"crossref","unstructured":"Reddi, S. J., Sra, S., P\u00f3czos, B., & Smola, A. J. (2016). Stochastic frank-wolfe methods for nonconvex optimization. In 54th annual Allerton conference on communication, control, and computing, Allerton 2016, Monticello, IL, USA, September 27\u201330, 2016, pp. 1244\u20131251.","DOI":"10.1109\/ALLERTON.2016.7852377"},{"key":"6189_CR43","unstructured":"Rosenberg, A., & Hirschberg, J. (2007). V-measure: A conditional entropy-based external cluster evaluation measure. In EMNLP-CoNLL, pp. 410\u2013420. 
ACL."},{"issue":"12","key":"6189_CR44","doi-asserted-by":"publisher","first-page":"1540","DOI":"10.1109\/TPAMI.2003.1251147","volume":"25","author":"V Roth","year":"2003","unstructured":"Roth, V., Laub, J., Kawanabe, M., & Buhmann, J. M. (2003). Optimal cluster preserving embedding of nonmetric proximity data. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(12), 1540\u20131551.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"issue":"5","key":"6189_CR45","doi-asserted-by":"publisher","first-page":"1299","DOI":"10.1162\/089976698300017467","volume":"10","author":"B Sch\u00f6lkopf","year":"1998","unstructured":"Sch\u00f6lkopf, B., Smola, A., & M\u00fcller, K.-R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5), 1299\u20131319.","journal-title":"Neural Computation"},{"key":"6189_CR46","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1016\/0022-5193(83)90445-9","volume":"100","author":"P Schuster","year":"1983","unstructured":"Schuster, P., & Sigmund, K. (1983). Replicator dynamics. Journal of Theoretical Biology, 100, 533\u2013538.","journal-title":"Journal of Theoretical Biology"},{"issue":"8","key":"6189_CR47","doi-asserted-by":"publisher","first-page":"888","DOI":"10.1109\/34.868688","volume":"22","author":"J Shi","year":"2000","unstructured":"Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8), 888\u2013905.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"6189_CR48","first-page":"201","volume":"17","author":"PHA Sneath","year":"1957","unstructured":"Sneath, P. H. A. (1957). The application of computers to taxonomy. Journal of General Microbiology, 17, 201\u2013226.","journal-title":"Journal of General Microbiology"},{"key":"6189_CR49","first-page":"1409","volume":"38","author":"RR Sokal","year":"1958","unstructured":"Sokal, R. 
R., & Michener, C. D. (1958). A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin, 38, 1409\u20131438.","journal-title":"University of Kansas Science Bulletin"},{"key":"6189_CR50","doi-asserted-by":"crossref","unstructured":"Soundararajan, P., & Sarkar, S. (2001). Investigation of measures for grouping by graph partitioning. In Proceedings of conference on computer vision and pattern recognition (CVPR), pp. 239\u2013246.","DOI":"10.1109\/CVPR.2001.990482"},{"key":"6189_CR51","doi-asserted-by":"crossref","unstructured":"Thiel, E., Chehreghani, M. H., & Dubhashi, D. P. (2019). A non-convex optimization approach to correlation clustering. In The thirty-third AAAI conference on artificial intelligence, AAAI, pp. 5159\u20135166.","DOI":"10.1609\/aaai.v33i01.33015159"},{"key":"6189_CR52","unstructured":"Tryon, R. C. (1939). Cluster analysis: Correlation profile and orthometric (factor) analysis for the isolation of unities in mind and personality. Edwards Brother, Incorporated, Lithoprinters and Publishers."},{"key":"6189_CR53","first-page":"2837","volume":"11","author":"NX Vinh","year":"2010","unstructured":"Vinh, N. X., Epps, J., & Bailey, J. (2010). Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. The Journal of Machine Learning Research, 11, 2837\u20132854.","journal-title":"The Journal of Machine Learning Research"},{"issue":"301","key":"6189_CR54","doi-asserted-by":"publisher","first-page":"236","DOI":"10.1080\/01621459.1963.10500845","volume":"58","author":"JH Ward","year":"1963","unstructured":"Ward, J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58(301), 236\u2013244.","journal-title":"Journal of the American Statistical Association"},{"key":"6189_CR55","unstructured":"Weibull, J. W. (1997). Evolutionary game theory. MIT Press, Cambridge, Mass. 
[u.a.]."},{"issue":"11","key":"6189_CR56","doi-asserted-by":"publisher","first-page":"1101","DOI":"10.1109\/34.244673","volume":"15","author":"Z Wu","year":"1993","unstructured":"Wu, Z., & Leahy, R. (1993). An optimal graph theoretic approach to data clustering: Theory and its application to image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11), 1101\u20131113.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"6189_CR57","doi-asserted-by":"crossref","unstructured":"Yang, L., Cheung, N.-M., Li, J., & Fang, J. (2019). Deep clustering by Gaussian mixture variational autoencoders with graph embedding. In International conference on computer vision (ICCV), pp. 6439\u20136448.","DOI":"10.1109\/ICCV.2019.00654"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06189-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-022-06189-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-022-06189-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,6]],"date-time":"2023-06-06T19:11:57Z","timestamp":1686078717000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-022-06189-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,22]]},"references-count":57,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,6]]}},"alternative-id":["6189"],"URL":"https:\/\/doi.org\/10.1007\/s10994-022-06189-6","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","type":"e
lectronic"}],"subject":[],"published":{"date-parts":[[2022,6,22]]},"assertion":[{"value":"29 November 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 February 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 May 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 June 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not Applicable. No conflict of interest occurs.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"This research is mainly focused on conceptual and methodological developments in unsupervised learning and clustering. Clustering is usually used for data management and exploratory data analytics. Thus, this contribution provides methods to further understand, explore and explain the data and obtain deeper insights. Such possibilities can be used for example to understand gender-specific features, data irregularities, private and sensitive information and explainability aspects. On the other hand, the use of clustering for data management and summarization provides a systematic way to compress the data to yield more efficient data processing in terms of energy and memory usage. This, itself, can be helpful for better environmental conditions. These properties are critical when dealing with large amounts of data, in particular for environment-friendly solutions. 
Finally, we would like to emphasize that in this work the experimental studies use the datasets which do not contain any private and sensitive information.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval"}},{"value":"Not Applicable. There is no human study in this research.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"Not Applicable. No human study is performed in this research. There is no sensitive information.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The code will be available through the author\u2019s home page and will be maintained there with a reference to this publication.","order":6,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}}]}}