{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T16:13:42Z","timestamp":1776874422572,"version":"3.51.2"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2024,2,15]],"date-time":"2024-02-15T00:00:00Z","timestamp":1707955200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,15]],"date-time":"2024-02-15T00:00:00Z","timestamp":1707955200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["SN COMPUT. SCI."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In this paper, we address an issue of finding explainable clusters of class-uniform data in labeled datasets. The issue falls into the domain of interpretable supervised clustering. Unlike traditional clustering, supervised clustering aims at forming clusters of labeled data with high probability densities. We are particularly interested in finding clusters of data of a given class and describing the clusters with the set of comprehensive rules. We propose an iterative method to extract high-density clusters with the help of decision-tree-based classifiers as the most intuitive learning method, and discuss the method of node selection to maximize quality of identified groups.<\/jats:p>","DOI":"10.1007\/s42979-023-02590-7","type":"journal-article","created":{"date-parts":[[2024,2,15]],"date-time":"2024-02-15T09:02:36Z","timestamp":1707987756000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Using Decision Trees for Interpretable Supervised Clustering"],"prefix":"10.1007","volume":"5","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3639-1245","authenticated-orcid":false,"given":"Natallia","family":"Kokash","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Leonid","family":"Makhnist","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,2,15]]},"reference":[{"key":"2590_CR1","doi-asserted-by":"publisher","DOI":"10.3389\/fdata.2021.688969","author":"V Belle","year":"2021","unstructured":"Belle V, Papantonis I. Principles and practice of explainable machine learning. Front Big Data. 2021. https:\/\/doi.org\/10.3389\/fdata.2021.688969.","journal-title":"Front Big Data"},{"issue":"1","key":"2590_CR2","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L. Random forests. Mach Learn. 2001;45(1):5\u201332.","journal-title":"Mach Learn"},{"key":"2590_CR3","volume-title":"Classification and regression trees","author":"L Breiman","year":"1984","unstructured":"Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and regression trees. Wadsworth and Brooks; 1984."},{"issue":"1","key":"2590_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1017\/S0269888997000015","volume":"12","author":"LA Breslow","year":"1997","unstructured":"Breslow LA, Aha DW. Simplifying decision trees: a survey. Knowl Eng Rev. 1997;12(1):1\u201340.","journal-title":"Knowl Eng Rev"},{"key":"2590_CR5","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1613\/jair.1.12228","volume":"70","author":"N Burkart","year":"2021","unstructured":"Burkart N, Huber MF. A survey on the explainability of supervised machine learning. J Artif Intell Res. 2021;70:245\u2013317. https:\/\/doi.org\/10.1613\/jair.1.12228.","journal-title":"J Artif Intell Res"},{"key":"2590_CR6","doi-asserted-by":"crossref","unstructured":"Carvalho DV, Pereira EM, Cardoso JS. Machine learning interpretability: a survey on methods and metrics. Electronics. 2019;8(8), p. 832.","DOI":"10.3390\/electronics8080832"},{"key":"2590_CR7","unstructured":"Castin L, Fr\u00e9nay B. Clustering with decision trees: divisive and agglomerative approach. In: Proc. of ESANN 2018; p. 455\u2013460."},{"key":"2590_CR8","doi-asserted-by":"crossref","unstructured":"Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Krishnapuram B, Shah M, Smola AJ, Aggarwal C, Shen D, Rastogi R. editors. Proc. of KDD, ACM 2016; p. 785\u2013794.","DOI":"10.1145\/2939672.2939785"},{"key":"2590_CR9","unstructured":"Deng H, Runger G. Feature selection via regularized trees. In: Proc. of the Int. Joint Conf. on Neural Networks (IJCNN), 2012."},{"key":"2590_CR10","unstructured":"Dua D, Graff C. UCI machine learning repository 2017. http:\/\/archive.ics.uci.edu\/ml\/datasets\/Adult. Accessed 09 Feb 2024."},{"key":"2590_CR11","doi-asserted-by":"crossref","unstructured":"Eick C, Zeidat N, Zhao Z. Supervised clustering\u2014algorithms and benefits. In: Proc. of ICTAI, 2004; p. 774\u2013 776.","DOI":"10.1109\/ICTAI.2004.111"},{"key":"2590_CR12","doi-asserted-by":"publisher","first-page":"367","DOI":"10.1016\/S0167-9473(01)00065-2","volume":"38","author":"J Friedman","year":"2002","unstructured":"Friedman J. Stochastic gradient boosting. Comput Stat Data Anal. 2002;38:367\u201378.","journal-title":"Comput Stat Data Anal"},{"key":"2590_CR13","doi-asserted-by":"crossref","unstructured":"Guidotti R, Ruggieri S. On the stability of interpretable models. In: Proc. of the Int. Joint Conf. on Neural Networks (IJCNN), 2019; p. 1\u20138.","DOI":"10.1109\/IJCNN.2019.8852158"},{"key":"2590_CR14","first-page":"19","volume":"141","author":"P Gulati","year":"2016","unstructured":"Gulati P, Sharma A, Gupta M. Theoretical study of decision tree algorithms to identify pivotal factors for performance improvement: a review. Int J Comput Appl. 2016;141:19\u201325.","journal-title":"Int J Comput Appl"},{"key":"2590_CR15","first-page":"29","volume":"63","author":"S Jahirabadkar","year":"2013","unstructured":"Jahirabadkar S, Kulkarni P. Clustering for high dimensional data: density based subspace clustering algorithms. Int J Comput Appl. 2013;63:29\u201335.","journal-title":"Int J Comput Appl"},{"key":"2590_CR16","doi-asserted-by":"publisher","first-page":"491","DOI":"10.1002\/sim.4780140510","volume":"14","author":"MA Jaro","year":"1995","unstructured":"Jaro MA. Probabilistic linkage of large public health data file. Stat Med. 1995;14:491\u20138.","journal-title":"Stat Med"},{"key":"2590_CR17","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1002\/widm.30","volume":"1","author":"HP Kriegel","year":"2011","unstructured":"Kriegel HP, Kr\u00f6ger P, Sander J, Zimek A. Density-based clustering. Wiley Interdiscpl Rew Data Min Knowl Discov. 2011;1:231\u201340.","journal-title":"Wiley Interdiscpl Rew Data Min Knowl Discov"},{"key":"2590_CR18","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2022.109239","volume":"137","author":"E Laber","year":"2023","unstructured":"Laber E, Murtinho L, Oliveira F. Shallow decision trees for explainable k-means clustering. Pattern Recogn. 2023;137: 109239. https:\/\/doi.org\/10.1016\/j.patcog.2022.109239.","journal-title":"Pattern Recogn"},{"key":"2590_CR19","first-page":"97","volume-title":"Clustering via decision tree construction","author":"B Liu","year":"2005","unstructured":"Liu B, Xia Y, Yu P. Clustering via decision tree construction. Berlin: Springer; 2005. p. 97\u2013124."},{"key":"2590_CR20","doi-asserted-by":"crossref","unstructured":"Maqbool O, Babri HA. A stability analysis of clustering algorithms. In: 2006 IEEE International Multitopic Conference 2006; p. 314\u2013319.","DOI":"10.1109\/INMIC.2006.358184"},{"key":"2590_CR21","unstructured":"Moshkovitz M, Dasgupta S, Rashtchian C, Frost N. Explainable k-means and k-medians clustering. In: Proc. of the ICML, Proc. of Machine Learning Research, 2020;119:7055\u20137065 PMLR. https:\/\/proceedings.mlr.press\/v119\/moshkovitz20a.html. Accessed 09 Feb 2024."},{"key":"2590_CR22","first-page":"1","volume":"4","author":"S Mussard","year":"2003","unstructured":"Mussard S, Terraza M, Seyte F. Decomposition of Gini and the generalized entropy inequality measures. Econ Bull. 2003;4:1\u20135.","journal-title":"Econ Bull"},{"key":"2590_CR23","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, et al. Scikit-learn: machine learning in python. J Mach Learn Res. 2011;12:2825\u201330.","journal-title":"J Mach Learn Res"},{"key":"2590_CR24","doi-asserted-by":"publisher","first-page":"42200","DOI":"10.1109\/ACCESS.2020.2976199","volume":"8","author":"R Roscher","year":"2020","unstructured":"Roscher R, Bohn B, Duarte M, Garcke J. Explainable machine learning for scientific insights and discoveries. IEEE Access. 2020;8:42200\u201316.","journal-title":"IEEE Access"},{"issue":"3","key":"2590_CR25","doi-asserted-by":"publisher","first-page":"660","DOI":"10.1109\/21.97458","volume":"21","author":"SR Safavian","year":"1991","unstructured":"Safavian SR, Landgrebe D. A survey of decision tree classifier methodology. IEEE Trans Syst Man Cybern. 1991;21(3):660\u201374.","journal-title":"IEEE Trans Syst Man Cybern"},{"key":"2590_CR26","doi-asserted-by":"crossref","unstructured":"Scheffer T. Nonparametric regularization of decision trees. In: L\u00f3pez de M\u00e1ntaras R, Plaza E, editors. Machine learning: ECML 2000. Springer; 2000. p. 344\u201356.","DOI":"10.1007\/3-540-45164-1_36"},{"key":"2590_CR27","first-page":"12365","volume-title":"Advances in neural information processing systems","author":"VF Souza","year":"2022","unstructured":"Souza VF, Cicalese F, Laber E, Molinaro M. Decision trees with short explainable rules. In: Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A, editors. Advances in neural information processing systems, vol. 35. Curran Associates Inc.; 2022. p. 12365\u201379."},{"key":"2590_CR28","doi-asserted-by":"publisher","first-page":"18069","DOI":"10.1007\/s00521-019-04051-w","volume":"32","author":"A Vellido","year":"2020","unstructured":"Vellido A. The importance of interpretability and visualization in machine learning for applications in medicine and health care. Neural Comput Appl. 2020;32:18069\u201383.","journal-title":"Neural Comput Appl."},{"key":"2590_CR29","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1016\/j.eswa.2018.03.036","volume":"105","author":"L Wang","year":"2018","unstructured":"Wang L, Li Q, Yu Y, Liu J. Region compatibility based stability assessment for decision trees. Expert Syst Appl. 2018;105:112\u201328.","journal-title":"Expert Syst Appl"},{"key":"2590_CR30","unstructured":"Winkler W. Advanced methods for record linkage. In: Proc. of the Section on Survey Research Methods, American Statistical Association 1994; p. 467\u2013472."},{"key":"2590_CR31","doi-asserted-by":"crossref","unstructured":"Yu J, Amores J, Sebe N, Tian Q. A new study on distance metrics as similarity measurement. In: Proc. of ICME, 2006; p. 533\u2013536.","DOI":"10.1109\/ICME.2006.262443"}],"container-title":["SN Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-023-02590-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s42979-023-02590-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s42979-023-02590-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,11]],"date-time":"2024-11-11T12:51:33Z","timestamp":1731329493000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s42979-023-02590-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,15]]},"references-count":31,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,2]]}},"alternative-id":["2590"],"URL":"https:\/\/doi.org\/10.1007\/s42979-023-02590-7","relation":{},"ISSN":["2661-8907"],"issn-type":[{"value":"2661-8907","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,15]]},"assertion":[{"value":"6 January 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 December 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"On behalf of all the authors, the corresponding author states that there is no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interest"}}],"article-number":"268"}}