{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,27]],"date-time":"2026-05-27T20:35:53Z","timestamp":1779914153638,"version":"3.53.1"},"reference-count":42,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2023,10,19]],"date-time":"2023-10-19T00:00:00Z","timestamp":1697673600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Reducing the size of the training set, which involves replacing it with a condensed set, is a widely adopted practice to enhance the efficiency of instance-based classifiers while trying to maintain high classification accuracy. This objective can be achieved through the use of data reduction techniques, also known as prototype selection or generation algorithms. Although there are numerous algorithms available in the literature that effectively address single-label classification problems, most of them are not applicable to multilabel data, where an instance can belong to multiple classes. Well-known transformation methods cannot be combined with a data reduction technique due to different reasons. The Condensed Nearest Neighbor rule is a popular parameter-free single-label prototype selection algorithm. The IB2 algorithm is the one-pass variation of the Condensed Nearest Neighbor rule. This paper proposes variations of these algorithms for multilabel data. 
Through an experimental study conducted on nine distinct datasets as well as statistical tests, we demonstrate that the eight proposed approaches (four for each algorithm) offer significant reduction rates without compromising the classification accuracy.<\/jats:p>","DOI":"10.3390\/info14100572","type":"journal-article","created":{"date-parts":[[2023,10,19]],"date-time":"2023-10-19T05:43:28Z","timestamp":1697694208000},"page":"572","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Prototype Selection for Multilabel Instance-Based Learning"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-2611-7319","authenticated-orcid":false,"given":"Panagiotis","family":"Filippakis","sequence":"first","affiliation":[{"name":"Department of Information and Electronic Engineering, School of Engineering, International Hellenic University, 57400 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1094-2520","authenticated-orcid":false,"given":"Stefanos","family":"Ougiaroglou","sequence":"additional","affiliation":[{"name":"Department of Information and Electronic Engineering, School of Engineering, International Hellenic University, 57400 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1639-2152","authenticated-orcid":false,"given":"Georgios","family":"Evangelidis","sequence":"additional","affiliation":[{"name":"Department of Applied Informatics, School of Information Sciences, University of Macedonia, 156 Egnatia Street, 54636 Thessaloniki, Greece"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4018\/jdwm.2007070101","article-title":"Multi-label classification: An 
overview","volume":"3","author":"Tsoumakas","year":"2007","journal-title":"Int. J. Data Warehous. Min."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1109\/TIT.1967.1053964","article-title":"Nearest neighbor pattern classification","volume":"13","author":"Cover","year":"1967","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Liu, H., and Motoda, H. (1998). Feature Selection for Knowledge Discovery and Data Mining, Kluwer Academic Publishers.","DOI":"10.1007\/978-1-4615-5689-3"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1109\/TPAMI.2011.142","article-title":"Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study","volume":"34","author":"Garcia","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1109\/TSMCC.2010.2103939","article-title":"A Taxonomy and Experimental Study on Prototype Generation for Nearest Neighbor Classification","volume":"42","author":"Triguero","year":"2012","journal-title":"Trans. Systems Man Cyber Part C"},{"key":"ref_6","unstructured":"Darzentas, J., Vouros, G.A., Vosinakis, S., and Arnellos, A. An Empirical Study of Lazy Multilabel Classification Algorithms. Proceedings of the Artificial Intelligence: Theories, Models and Applications."},{"key":"ref_7","first-page":"515","article-title":"The condensed nearest neighbor rule","volume":"18","author":"Hart","year":"1967","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1007\/BF00153759","article-title":"Instance-based learning algorithms","volume":"6","author":"Aha","year":"1991","journal-title":"Mach. Learn."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Filippakis, P., Ougiaroglou, S., and Evangelidis, G. (2023, January 5\u20137). 
Condensed Nearest Neighbour Rules for Multi-Label Datasets. Proceedings of the International Database Engineered Applications Symposium Conference, Heraklion, Greece.","DOI":"10.1145\/3589462.3589492"},{"key":"ref_10","first-page":"707","article-title":"Binary codes capable of correcting deletions, insertions, and reversals","volume":"10","author":"Levenshtein","year":"1965","journal-title":"Sov. Phys. Dokl."},{"key":"ref_11","first-page":"2411","article-title":"Mulan: A Java Library for Multi-Label Learning","volume":"12","author":"Tsoumakas","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_12","first-page":"1","article-title":"MEKA: A Multi-label\/Multi-target Extension to WEKA","volume":"17","author":"Read","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_13","unstructured":"Charte, F., Rivera, A.J., del Jesus, M.J., and Herrera, F. (2014). Intelligent Data Engineering and Automated Learning\u2013IDEAL 2014, Springer."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1109\/TSMC.1972.4309137","article-title":"Asymptotic Properties of Nearest Neighbor Rules Using Edited Data","volume":"SMC-2","author":"Wilson","year":"1972","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1007\/s10044-015-0452-8","article-title":"Editing training data for multi-label classification with the k-nearest neighbor rule","volume":"19","author":"Kanj","year":"2015","journal-title":"Pattern Anal. Appl."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1016\/j.asoc.2018.04.016","article-title":"Local sets for multi-label instance selection","volume":"68","year":"2018","journal-title":"Appl. 
Soft Comput."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1523","DOI":"10.1016\/j.patcog.2014.10.001","article-title":"Three new instance selection methods based on local sets: A comparative study with several approaches from a bi-objective perspective","volume":"48","author":"Leyva","year":"2015","journal-title":"Pattern Recognit."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Li, H., Fang, M., Li, H., and Wang, P. (2023). Prototype selection for multi-label data based on label correlation. Neural Comput. Appl.","DOI":"10.1007\/s00521-023-08617-7"},{"key":"ref_19","unstructured":"Chou, C.H., Kuo, B.H., and Chang, F. (2006, January 20\u201324). The Generalized Condensed Nearest Neighbor Rule as A Data Reduction Method. Proceedings of the 18th International Conference on Pattern Recognition (ICPR\u201906), Hong Kong, China."},{"key":"ref_20","unstructured":"Suyal, H., and Singh, A. (2021). Computational Intelligence and Healthcare Informatics, Wiley."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2038","DOI":"10.1016\/j.patcog.2006.12.019","article-title":"ML-KNN: A lazy learning approach to multi-label learning","volume":"40","author":"Zhang","year":"2007","journal-title":"Pattern Recognit."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1016\/j.eswa.2018.05.017","article-title":"Study of data transformation techniques for adapting single-label prototype selection algorithms to multi-label learning","volume":"109","year":"2018","journal-title":"Expert Syst. Appl."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1608","DOI":"10.1016\/j.patcog.2014.11.015","article-title":"Improving kNN multi-label classification in Prototype Selection scenarios using class proposals","volume":"48","year":"2015","journal-title":"Pattern Recognit."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Gonz\u00e1lez, M., Cano, J.R., and Garc\u00eda, S. (2020). 
ProLSFEO-LDL: Prototype Selection and Label-Specific Feature Evolutionary Optimization for Label Distribution Learning. Appl. Sci., 10.","DOI":"10.3390\/app10093089"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1734","DOI":"10.1109\/TKDE.2016.2545658","article-title":"Label Distribution Learning","volume":"28","author":"Geng","year":"2016","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Sanjurjo Gonz\u00e1lez, H., Pastor L\u00f3pez, I., Garc\u00eda Bringas, P., Quinti\u00e1n, H., and Corchado, E. (2021). Proceedings of the Hybrid Artificial Intelligent Systems, Springer.","DOI":"10.1007\/978-3-030-86271-8"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.neucom.2023.01.004","article-title":"Data reduction via multi-label prototype generation","volume":"526","author":"Ougiaroglou","year":"2023","journal-title":"Neurocomputing"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1561","DOI":"10.1016\/j.patcog.2003.12.012","article-title":"High training set size reduction by space partitioning and prototype abstraction","volume":"37","year":"2004","journal-title":"Pattern Recognit."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"109190","DOI":"10.1016\/j.patcog.2022.109190","article-title":"Multilabel Prototype Generation for data reduction in K-Nearest Neighbour classification","volume":"135","author":"Gallego","year":"2023","journal-title":"Pattern Recognit."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"819","DOI":"10.1016\/0167-8655(96)00041-4","article-title":"A sample set condensation algorithm for the class sensitive artificial neural network","volume":"17","author":"Chen","year":"1996","journal-title":"Pattern Recognit. Lett."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Sun, L., Ji, S., and Ye, J. (2008, January 24\u201327). Hypergraph Spectral Learning for Multi-Label Classification.
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA.","DOI":"10.1145\/1401890.1401971"},{"key":"ref_32","unstructured":"Byerly, A., and Kalganova, T. (2022). Class Density and Dataset Quality in High-Dimensional, Unstructured Data. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhang, S., Hu, Y., and Bian, G. (2017, January 25\u201326). Research on string similarity algorithm based on Levenshtein Distance. Proceedings of the 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China.","DOI":"10.1109\/IAEAC.2017.8054419"},{"key":"ref_34","first-page":"2825","article-title":"Scikit-learn: Machine Learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Gunopulos, D., Hofmann, T., Malerba, D., and Vazirgiannis, M. (2011).
Proceedings of the Machine Learning and Knowledge Discovery in Databases, Springer.","DOI":"10.1007\/978-3-642-23780-5"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"7404627","DOI":"10.1155\/2018\/7404627","article-title":"An Approach to Data Reduction for Learning from Big Datasets: Integrating Stacking, Rotation, and Agent Population Learning Techniques","volume":"2018","author":"Czarnowski","year":"2018","journal-title":"Complexity"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1016\/j.patcog.2017.09.038","article-title":"Clustering-Based k-Nearest Neighbor Classification for Large-Scale Data with Neural Codes Representation","volume":"74","author":"Gallego","year":"2018","journal-title":"Pattern Recogn."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1007\/s10044-014-0393-7","article-title":"RHC: Non-Parametric Cluster-Based Data Reduction for Efficient k-NN Classification","volume":"19","author":"Ougiaroglou","year":"2016","journal-title":"Pattern Anal. Appl."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1016\/j.asoc.2015.12.015","article-title":"PGGP: Prototype Generation via Genetic Programming","volume":"40","author":"Escalante","year":"2016","journal-title":"Appl. Soft Comput."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1007\/s10044-015-0454-6","article-title":"MOPG: A Multi-Objective Evolutionary Algorithm for Prototype Generation","volume":"20","author":"Escalante","year":"2017","journal-title":"Pattern Anal. Appl."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"2415","DOI":"10.1007\/s00521-016-2278-8","article-title":"Prototype Generation on Structural Data Using Dissimilarity Space Representation","volume":"28","year":"2017","journal-title":"Neural Comput. Appl."},{"key":"ref_42","unstructured":"Sheskin, D. (2011). Handbook of Parametric and Nonparametric Statistical Procedures, Chapman & Hall\/CRC. 
A Chapman & Hall Book."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/14\/10\/572\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:09:30Z","timestamp":1760130570000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/14\/10\/572"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,19]]},"references-count":42,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["info14100572"],"URL":"https:\/\/doi.org\/10.3390\/info14100572","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,19]]}}}