{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:08:04Z","timestamp":1760148484741,"version":"build-2065373602"},"reference-count":50,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2023,5,9]],"date-time":"2023-05-09T00:00:00Z","timestamp":1683590400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012190","name":"Ministry of Science and Higher Education of the Russian Federation","doi-asserted-by":"publisher","award":["075-15-2022-1121"],"award-info":[{"award-number":["075-15-2022-1121"]}],"id":[{"id":"10.13039\/501100012190","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>A number of real-world problems of automatic grouping of objects or clustering require a reasonable solution and the possibility of interpreting the result. More specific is the problem of identifying homogeneous subgroups of objects. The number of groups in such a dataset is not specified, and it is required to justify and describe the proposed grouping model. As a tool for interpretable machine learning, we consider formal concept analysis (FCA). To reduce the problem with real attributes to a problem that allows the use of FCA, we use the search for the optimal number and location of cut points and the optimization of the support set of attributes. The approach to identifying homogeneous subgroups was tested on tasks for which interpretability is important: the problem of clustering industrial products according to primary tests (for example, transistors, diodes, and microcircuits) as well as gene expression data (collected to solve the problem of predicting cancerous tumors). 
For the data under consideration, logical concepts are identified, formed in the form of a lattice of formal concepts. Revealed concepts are evaluated according to indicators of informativeness and can be considered as homogeneous subgroups of elements and their indicative descriptions. The proposed approach makes it possible to single out homogeneous subgroups of elements and provides a description of their characteristics, which can be considered as tougher norms that the elements of the subgroup satisfy. A comparison is made with the COBWEB algorithm designed for conceptual clustering of objects. This algorithm is aimed at discovering probabilistic concepts. The resulting lattices of logical concepts and probabilistic concepts for the considered datasets are simple and easy to interpret.<\/jats:p>","DOI":"10.3390\/a16050246","type":"journal-article","created":{"date-parts":[[2023,5,10]],"date-time":"2023-05-10T01:57:51Z","timestamp":1683683871000},"page":"246","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Subgroup Discovery in Machine Learning Problems with Formal Concepts Analysis and Test Theory Algorithms"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3997-342X","authenticated-orcid":false,"given":"Igor","family":"Masich","sequence":"first","affiliation":[{"name":"Laboratory \u201cHybrid Methods of Modeling and Optimization in Complex Systems\u201d, Siberian Federal University, 79 Svobodny Prospekt, 660041 Krasnoyarsk, Russia"},{"name":"Institute of Informatics and Telecommunications, Reshetnev Siberian State University of Science and Technology, 31 Krasnoyarsky Rabochy Prospekt, 660037 Krasnoyarsk, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1149-3299","authenticated-orcid":false,"given":"Natalya","family":"Rezova","sequence":"additional","affiliation":[{"name":"Institute of Informatics and Telecommunications, Reshetnev Siberian State 
University of Science and Technology, 31 Krasnoyarsky Rabochy Prospekt, 660037 Krasnoyarsk, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8257-7329","authenticated-orcid":false,"given":"Guzel","family":"Shkaberina","sequence":"additional","affiliation":[{"name":"Laboratory \u201cHybrid Methods of Modeling and Optimization in Complex Systems\u201d, Siberian Federal University, 79 Svobodny Prospekt, 660041 Krasnoyarsk, Russia"},{"name":"Institute of Informatics and Telecommunications, Reshetnev Siberian State University of Science and Technology, 31 Krasnoyarsky Rabochy Prospekt, 660037 Krasnoyarsk, Russia"}]},{"given":"Sergei","family":"Mironov","sequence":"additional","affiliation":[{"name":"Institute of Informatics and Telecommunications, Reshetnev Siberian State University of Science and Technology, 31 Krasnoyarsky Rabochy Prospekt, 660037 Krasnoyarsk, Russia"}]},{"given":"Mariya","family":"Bartosh","sequence":"additional","affiliation":[{"name":"Laboratory \u201cHybrid Methods of Modeling and Optimization in Complex Systems\u201d, Siberian Federal University, 79 Svobodny Prospekt, 660041 Krasnoyarsk, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0667-4001","authenticated-orcid":false,"given":"Lev","family":"Kazakovtsev","sequence":"additional","affiliation":[{"name":"Laboratory \u201cHybrid Methods of Modeling and Optimization in Complex Systems\u201d, Siberian Federal University, 79 Svobodny Prospekt, 660041 Krasnoyarsk, Russia"},{"name":"Institute of Informatics and Telecommunications, Reshetnev Siberian State University of Science and Technology, 31 Krasnoyarsky Rabochy Prospekt, 660037 Krasnoyarsk, Russia"}]}],"member":"1968","published-online":{"date-parts":[[2023,5,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"202","DOI":"10.1016\/j.ins.2017.02.037","article-title":"A methodology for analysis of concept lattice reduction","volume":"396","author":"Dias","year":"2017","journal-title":"Inf. 
Sci."},{"key":"ref_2","unstructured":"Hammer, P.L. (1986). Lecture at the International Conference on Multi-Attribute Decision Making via OR-Based Expert Systems, University of Passau."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Chikalov, I. (2013). Three Approaches to Data Analysis. Intelligent Systems Reference Library, 41, Springer.","DOI":"10.1007\/978-3-642-28667-4"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1007\/s10845-009-0351-1","article-title":"Rogue components: Their effect and control using Logical Analysis of Data","volume":"23","author":"Mortada","year":"2012","journal-title":"J. Intell. Manuf."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"041004","DOI":"10.1115\/1.4029955","article-title":"Tool wear monitoring and alarm system based on pattern recognition with Logical Analysis of Data","volume":"137","author":"Shaban","year":"2015","journal-title":"J. Manuf. Sci. Eng."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1429","DOI":"10.1007\/s10845-013-0750-1","article-title":"Fault diagnosis in power transformers using multi-class Logical Analysis of Data","volume":"25","author":"Mortada","year":"2014","journal-title":"J. Intell. Manuf."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"943","DOI":"10.1007\/s10845-014-0926-3","article-title":"Remaining useful life prediction using prognostic methodology based on Logical Analysis of Data and Kaplan-Meier estimation","volume":"27","author":"Ragab","year":"2016","journal-title":"J. Intell. Manuf."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1016\/j.ress.2016.11.015","article-title":"Application of Logical Analysis of Data to machinery-related accident prevention based on scarce data","volume":"159","author":"Jocelyn","year":"2017","journal-title":"Reliab. Eng. Syst. 
Saf."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1016\/j.jairtraman.2011.10.004","article-title":"Logical Analysis of Data for estimating passenger show rates at Air Canada","volume":"18","author":"Dupuis","year":"2012","journal-title":"J. Air Transp. Manag."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1134\/S1054661817020092","article-title":"Face recognition using multi-class Logical Analysis of Data","volume":"27","author":"Ragab","year":"2017","journal-title":"Pattern Recognit. Image Anal."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1100","DOI":"10.1016\/j.dam.2004.10.010","article-title":"Subset-conjunctive rules for breast cancer diagnosis","volume":"154","author":"Kohli","year":"2006","journal-title":"Discret. Appl. Math."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1114","DOI":"10.1007\/11752578_135","article-title":"Parallel implementation of Logical Analysis of Data (LAD) for discriminatory analysis of protein mass spectrometry data","volume":"3911","year":"2006","journal-title":"Lect. Notes Comput. Sci."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1023\/A:1022970120229","article-title":"Coronary risk prediction by Logical Analysis of Data","volume":"119","author":"Alexe","year":"2003","journal-title":"Ann. Oper. Res."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Reddy, A., Wang, H., Yu, H., Bonates, T.O., Gulabani, V., Azok, J., Hoehn, G., Hammer, P.L., Baird, A.E., and Li, K.C. (2008). Logical Analysis of Data (LAD) model for the early diagnosis of acute ischemic stroke. BMC Med. Inform. Decis. Mak., 8.","DOI":"10.1186\/1472-6947-8-30"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Lee, C.-F., and Lee, J. (2014). 
Handbook of Financial Econometrics and Statistics, Springer.","DOI":"10.1007\/978-1-4614-7750-1"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1356","DOI":"10.1287\/opre.1120.1120","article-title":"Pattern-based modeling and solution of probabilistically constrained optimization problems","volume":"60","author":"Lejeune","year":"2012","journal-title":"Oper. Res."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Rival, I. (1982). Ordered Sets: Proceedings, NATO Advanced Studies Institute, 83, Reidel.","DOI":"10.1007\/978-94-009-7798-3"},{"key":"ref_18","unstructured":"Ganter, B., and Wille, R. (1999). Mathematical Foundations, Springer."},{"key":"ref_19","unstructured":"Tilley, T., and Eklund, P. (2007). A Case Study in Software Engineering. In Database and Expert Systems Applications, DEXA\u201907, 18th International Workshop on, Springer."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ganter, B., and Mineau, G.W. (2000). Conceptual Structures: Logical, Linguistic, and Computational Issues. ICCS 2000. Lecture Notes in Computer Science, Springer.","DOI":"10.1007\/10722280"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Perner, P. (2012). Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2012. Lecture Notes in Computer Science, Springer.","DOI":"10.1007\/978-3-642-31488-9"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1774","DOI":"10.1016\/j.ins.2010.04.011","article-title":"Evaluation of IPAQ questionnaires supported by formal concept analysis","volume":"181","author":"Belohlavek","year":"2011","journal-title":"Inf. Sci."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1989","DOI":"10.1016\/j.ins.2010.07.007","article-title":"Mining gene expression data with pattern structures in formal concept analysis","volume":"181","author":"Kaytoue","year":"2011","journal-title":"Inf. 
Sci."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Amin, I.I., and Kassim, S.K. (2013, January 28\u201329). Applying formal concept analysis for visualizing DNA methylation status in breast cancer tumor subtypes. Proceedings of the 2013 9th International Computer Engineering Conference (ICENCO), Giza, Egypt.","DOI":"10.1109\/ICENCO.2013.6736473"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/j.dam.2003.11.002","article-title":"Complexity of learning in concept lattices from positive and negative examples","volume":"142","author":"Kuznetsov","year":"2004","journal-title":"Discret. Appl. Math."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"792","DOI":"10.1016\/j.ejor.2020.01.015","article-title":"Interface between Logical Analysis of Data and Formal Concept Analysis","volume":"284","author":"Janostik","year":"2020","journal-title":"Eur. J. Oper. Res."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1007\/s10479-006-0084-x","article-title":"Pattern-based feature selection in genomics and proteomics","volume":"148","author":"Alexe","year":"2006","journal-title":"Ann. OR"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1007\/BF02614316","article-title":"Logical analysis of numerical data","volume":"79","author":"Boros","year":"1997","journal-title":"Math. Program."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Shkaberina, G., Rezova, N., Tovbis, E., and Kazakovtsev, L. (2023). Visual Assessment of Cluster Tendency with Variations of Distance Measures. Algorithms, 16.","DOI":"10.3390\/a16010005"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","article-title":"Least Squares Quantization in PCM","volume":"28","author":"Lloyd","year":"1982","journal-title":"IEEE Trans. Inf. 
Theory"},{"key":"ref_31","first-page":"219","article-title":"Knowledge acquisition through conceptual clustering: A theoretical framework and an algorithm for partitioning data into conjunctive concepts. A special issue on knowledge acquisition and induction","volume":"4","author":"Michalski","year":"1980","journal-title":"Int. J. Policy Anal. Inf. Syst."},{"key":"ref_32","first-page":"145","article-title":"Conceptual clustering of multi-relational data","volume":"2011","author":"Fonseca","year":"2012","journal-title":"Proc. ILP"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1007\/BF00114265","article-title":"Knowledge acquisition via incremental conceptual clustering","volume":"2","author":"Fisher","year":"1987","journal-title":"Mach. Learn."},{"key":"ref_34","first-page":"71","article-title":"Fuzzy conceptual clustering","volume":"Volume 6171","author":"Perner","year":"2010","journal-title":"Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2010. Berlin, Germany, 12\u201314 July. Lecture Notes in Computer Science"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"752","DOI":"10.1016\/j.ipm.2006.06.001","article-title":"Topic discovery based on text mining techniques","volume":"43","year":"2007","journal-title":"Inf. Process. Manag."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"1267","DOI":"10.1007\/s10462-018-9627-1","article-title":"A review of conceptual clustering algorithms","volume":"52","year":"2019","journal-title":"Artif. Intell. Rev."},{"key":"ref_37","first-page":"349","article-title":"Hierarchical distance-based conceptual clustering","volume":"Volume 5211","author":"Daelemans","year":"2008","journal-title":"Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2008. 
Lecture Notes in Computer Science"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1145\/272682.272714","article-title":"An error-based conceptual clustering method for providing approximate query answers","volume":"39","author":"Chu","year":"1996","journal-title":"Commun. ACM"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1016\/j.knosys.2015.02.019","article-title":"Mining patterns for clustering on numerical datasets using unsupervised decision trees","volume":"82","year":"2015","journal-title":"Knowl. Based Syst."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1109\/TEVC.2008.915995","article-title":"A multiobjective evolutionary conceptual clustering methodology for gene annotation within structural databases: A case of study on the gene ontology database","volume":"12","author":"Herrera","year":"2008","journal-title":"IEEE Trans. Evol. Comput."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Fanizzi, N., Amato, C., and Esposito, F. (2007, January 17\u201319). Evolutionary conceptual clustering of semantically annotated resources. Proceedings of the International Conference on Semantic Computing 2007 (ICSC2007), Irvine, CA, USA.","DOI":"10.1109\/ICSC.2007.92"},{"key":"ref_42","unstructured":"Segal, E., Battle, A., and Koller, D. (2003, January 3\u20137). Decomposing gene expression into cellular processes. Proceedings of the Pacific Symposium on Biocomputing, Kauai, HI, USA."},{"key":"ref_43","unstructured":"Pei, J., Zhang, X., Cho, M., Wang, H., and Yu, P.S. (2003, January 19\u201322). MaPle: A fast algorithm for maximal pattern-based clustering. 
Proceedings of the Third IEEE International Conference on Data Mining 2003, ICDM 2003, Melbourne, FL, USA."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1016\/j.ins.2021.06.024","article-title":"Systematic categorization and evaluation of CbO-based algorithms in FCA","volume":"575","author":"Konecny","year":"2021","journal-title":"Inf. Sci."},{"key":"ref_45","first-page":"17","article-title":"A fast algorithm for computing all intersections of objects from an arbitrary semilattice","volume":"1","author":"Kuznetsov","year":"1993","journal-title":"Nauchno Tekhnicheskaya Inf. Seriya 2 Inf. Protsessy I Sist."},{"key":"ref_46","unstructured":"Sivogolovko, E., and Novikov, B. (2012). EDBT-ICDT\u201912, Association for Computing Machinery."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"022035","DOI":"10.1088\/1757-899X\/537\/2\/022035","article-title":"Recursive clustering algorithm based on silhouette criterion maximization for sorting semiconductor devices by homogeneous batches","volume":"537","author":"Golovanov","year":"2019","journal-title":"IOP Conf. Ser. Mater. Sci. Eng."},{"key":"ref_48","unstructured":"Lemmerich, F. (2014). Novel Techniques for Efficient and Effective Subgroup Discovery. [Ph.D. Thesis, Bavarian Julius Maximilian University]."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Orlov, V.I., Rozhnov, I.P., Kazakovtsev, L.A., Rezova, N.L., Popov, V.P., and Mikhnev, D.L. (2021, January 19\u201321). Application of the K-Standards Algorithm for the Clustering Problem of Production Batches of Semiconductor Devices. Proceedings of the 2021 XV International Scientific-Technical Conference on Actual Problems Of Electronic Instrument Engineering (APEIE), Novosibirsk, Russia.","DOI":"10.1109\/APEIE52976.2021.9647632"},{"key":"ref_50","unstructured":"(2023, March 10). 
National Library of Medicine, Available online: https:\/\/www.ncbi.nlm.nih.gov\/."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/5\/246\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:32:24Z","timestamp":1760124744000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/16\/5\/246"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,9]]},"references-count":50,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2023,5]]}},"alternative-id":["a16050246"],"URL":"https:\/\/doi.org\/10.3390\/a16050246","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2023,5,9]]}}}