{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,2]],"date-time":"2025-12-02T06:19:10Z","timestamp":1764656350201,"version":"build-2065373602"},"reference-count":49,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T00:00:00Z","timestamp":1744934400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"King Mongkut\u2019s Institute of Technology Ladkrabang Research Fund","award":["2564-02-05-009"],"award-info":[{"award-number":["2564-02-05-009"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>Imbalanced classification presents a significant challenge in real-world datasets, requiring innovative solutions to enhance performance. This study introduces a hybrid binary classification algorithm designed to effectively address this challenge. The algorithm identifies different data types, pairs them, and trains multiple models, which then vote on predictions using weighted strategies to ensure stable performance and minimize overfitting. Unlike some methods, it is designed to work consistently with both noisy and noise-free datasets, prioritizing overall stability rather than specific noise adjustments. The algorithm\u2019s effectiveness is evaluated using Recall, G-Mean, and AUC, measuring its ability to detect the minority class while maintaining balance. The results reveal notable improvements in minority class detection, with Recall outperforming other methods in 16 out of 22 datasets, supported by paired t-tests. The algorithm also shows promising improvements in G-Mean and AUC, ranking first in 17 and 18 datasets, respectively. To further evaluate its performance, the study compares the proposed algorithm with previous methods using G-Mean. The comparison confirms that the proposed algorithm also exhibits strong performance, further highlighting its potential. These findings emphasize the algorithm\u2019s versatility in handling diverse datasets and its ability to balance minority class detection with overall accuracy.<\/jats:p>","DOI":"10.3390\/data10040054","type":"journal-article","created":{"date-parts":[[2025,4,18]],"date-time":"2025-04-18T02:25:31Z","timestamp":1744943131000},"page":"54","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["A Partition-Based Hybrid Algorithm for Effective Imbalanced Classification"],"prefix":"10.3390","volume":"10","author":[{"given":"Kittipong","family":"Theephoowiang","sequence":"first","affiliation":[{"name":"Computer Science, School of Science, King Mongkut\u2019s Institute of Technology Ladkrabang, Bangkok 10152, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anantaporn","family":"Hanskunatai","sequence":"additional","affiliation":[{"name":"Computer Science, School of Science, King Mongkut\u2019s Institute of Technology Ladkrabang, Bangkok 10152, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,4,18]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","article-title":"Learning from imbalanced data","volume":"21","author":"He","year":"2009","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1186\/s12645-021-00082-y","article-title":"Ultrasensitive bioassaying of HER-2 protein for diagnosis of breast cancer using reduced graphene oxide\/chitosan as a nanobiocompatible platform","volume":"12","author":"Nasrollahpour","year":"2021","journal-title":"Cancer Nanotechnol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"100185","DOI":"10.1016\/j.bdr.2021.100185","article-title":"Core dataset extraction from unlabeled medical big data for lesion localization","volume":"24","author":"Guo","year":"2021","journal-title":"Big Data Res."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.ins.2021.07.091","article-title":"TWD-SFNN: Three-way decisions with a single hidden layer feedforward neural network","volume":"579","author":"Cheng","year":"2021","journal-title":"Inf. Sci."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"20021","DOI":"10.1109\/ACCESS.2018.2823979","article-title":"A greedy deep learning method for medical disease analysis","volume":"6","author":"Wu","year":"2018","journal-title":"IEEE Access"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1007\/s11280-012-0178-0","article-title":"Effective detection of sophisticated online banking fraud on extremely imbalanced data","volume":"16","author":"Wei","year":"2013","journal-title":"World Wide Web-Internet Web Inf. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.ins.2020.05.040","article-title":"Resampling ensemble model based on data distribution for imbalanced credit risk evaluation in P2P lending","volume":"536","author":"Niu","year":"2020","journal-title":"Inf. Sci."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"6503459","DOI":"10.1155\/2020\/6503459","article-title":"Using harmony search algorithm in neural networks to improve fraud detection in the banking system","volume":"2020","author":"Daliri","year":"2020","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"107835","DOI":"10.1016\/j.patcog.2021.107835","article-title":"Internet financing credit risk evaluation using multiple structural interacting elastic net feature selection","volume":"114","author":"Cui","year":"2021","journal-title":"Pattern Recognit."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1109\/JSYST.2011.2165600","article-title":"A fingerprint recognition scheme based on assembling invariant moments for cloud computing communications","volume":"5","author":"Yang","year":"2011","journal-title":"IEEE Syst. J."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1231","DOI":"10.1016\/j.sysarc.2013.10.007","article-title":"Adaptive GTS allocation in IEEE 802.15.4 for real-time wireless sensor networks","volume":"59 Pt D","author":"Xia","year":"2013","journal-title":"J. Syst. Archit."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"110415","DOI":"10.1016\/j.asoc.2023.110415","article-title":"A broad review on class imbalance learning techniques","volume":"143","author":"Rezvani","year":"2023","journal-title":"Appl. Soft Comput."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic Minority Over-sampling Technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. Res."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.patrec.2016.10.006","article-title":"Redundancy-driven modified Tomek-link based undersampling: A solution to class imbalance","volume":"93","author":"Devi","year":"2017","journal-title":"Pattern Recognit. Lett."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1006\/jcss.1997.1504","article-title":"A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting","volume":"55","author":"Freund","year":"1997","journal-title":"J. Comput. Syst. Sci."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Chawla, N.V., Lazarevic, A., Hall, L.O., and Bowyer, K.W. (2003). SMOTEBoost: Improving Prediction of the Minority Class in Boosting. Knowledge Discovery in Databases: PKDD 2003, Proceedings of the European Conference on Principles and Practice of Knowledge Discovery in Databases, Cavtat-Dubrovnik, Croatia, 22\u201326 September 2003, Springer.","DOI":"10.1007\/978-3-540-39804-2_12"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Batuwita, R., and Palade, V. (2010, January 18\u201323). Efficient resampling methods for training support vector machines with imbalanced datasets. Proceedings of the International Joint Conference on Neural Networks 2010, Barcelona, Spain.","DOI":"10.1109\/IJCNN.2010.5596787"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1111\/j.0824-7935.2004.t01-1-00228.x","article-title":"A multiple resampling method for learning from imbalanced datasets","volume":"20","author":"Estabrooks","year":"2004","journal-title":"Comput. Intell."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.ins.2017.05.008","article-title":"Clustering-based undersampling in class-imbalanced data","volume":"409\u2013410","author":"Lin","year":"2017","journal-title":"Inf. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2378","DOI":"10.1016\/j.fss.2007.12.023","article-title":"A study of the behaviour of linguistic fuzzy rule-based classification systems in the framework of imbalanced datasets","volume":"159","author":"Fernandez","year":"2008","journal-title":"Fuzzy Sets Syst."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1268","DOI":"10.1016\/j.ins.2009.12.014","article-title":"On the 2-tuples based genetic tuning performance for fuzzy rule-based classification systems in imbalanced datasets","volume":"180","author":"Fernandez","year":"2010","journal-title":"Inf. Sci."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.neucom.2014.06.021","article-title":"A Resampling Ensemble Algorithm for Classification of Imbalance Problems","volume":"143","author":"Qian","year":"2014","journal-title":"Neurocomputing"},{"key":"ref_23","unstructured":"Batista, G., Bazzan, A., and Monard, M.C. (2003, January 3\u20135). Balancing Training Data for Automated Annotation of Keywords: A Case Study. Proceedings of the II Brazilian Workshop on Bioinformatics, S\u00e3o Paulo, Brazil."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"2326","DOI":"10.1109\/TNSE.2021.3089435","article-title":"PPSF: A Privacy-Preserving and Secure Framework Using Blockchain-Based Machine Learning for IoT-Driven Smart Cities","volume":"8","author":"Kumar","year":"2021","journal-title":"IEEE Trans. Netw. Sci. Eng."},{"key":"ref_25","unstructured":"Elkan, C. (2001, January 4\u201310). The Foundations of Cost-Sensitive Learning. Proceedings of the IJCAI International Joint Conference on Artificial Intelligence, Seattle, WA, USA."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1007\/s10044-003-0192-z","article-title":"New Applications of Ensembles of Classifiers","volume":"6","author":"Barandela","year":"2003","journal-title":"Pattern Anal. Appl."},{"key":"ref_27","unstructured":"Chen, C., Liaw, A., and Breiman, L. (2004). Using Random Forest to Learn Imbalanced Data, Technical Report; University of California."},{"key":"ref_28","unstructured":"Yang, X., Song, Q., and Cao, A. (August, January 31). Weighted Support Vector Machine for Data Classification. Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, Montreal, QC, Canada."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1109\/TSMCA.2009.2029559","article-title":"RUSBoost: A Hybrid Approach to Alleviating Class Imbalance","volume":"40","author":"Seiffert","year":"2010","journal-title":"IEEE Trans. Syst. Man Cybern.\u2014Part A Syst. Hum."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_31","first-page":"878","article-title":"Borderline-SMOTE: A new oversampling method in imbalanced datasets learning","volume":"Volume 3644","author":"Han","year":"2005","journal-title":"Advances in Intelligent Computing, Proceedings of the ICIC 2005, Hefei, China, 23\u201326 August 2005"},{"key":"ref_32","first-page":"475","article-title":"Safe-level-SMOTE: Safe-level-synthetic minority over-sampling technique for handling the class imbalance problem","volume":"Volume 5476","author":"Bunkhumpornpat","year":"2009","journal-title":"Advances in Knowledge Discovery and Data Mining, Proceedings of the PAKDD 2009, Bangkok, Thailand, 27\u201330 April 2009"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1186\/s40537-019-0192-5","article-title":"Survey on deep learning with class imbalance","volume":"6","author":"Johnson","year":"2019","journal-title":"J. Big Data"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/s10462-024-10759-6","article-title":"A survey on imbalanced learning: Latest research, applications and future directions","volume":"57","author":"Chen","year":"2024","journal-title":"Artif. Intell. Rev."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.ins.2022.12.046","article-title":"A Hybrid Imbalanced Classification Model Based on Data Density","volume":"624","author":"Shi","year":"2023","journal-title":"Inf. Sci."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1016\/j.ins.2022.12.029","article-title":"An Imbalanced Binary Classification Method via Space Mapping Using Normalizing Flows with Class Discrepancy Constraints","volume":"623","author":"Huang","year":"2023","journal-title":"Inf. Sci."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"108217","DOI":"10.1016\/j.knosys.2022.108217","article-title":"Two Density-Based Sampling Approaches for Imbalanced and Overlapping Data","volume":"241","author":"Mayabadi","year":"2022","journal-title":"Knowl.-Based Syst."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"110795","DOI":"10.1016\/j.knosys.2023.110795","article-title":"Self-Adaptive Oversampling Method Based on the Complexity of Minority Data in Imbalanced Datasets Classification","volume":"277","author":"Tao","year":"2023","journal-title":"Knowl.-Based Syst."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-Vector Networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn."},{"key":"ref_40","unstructured":"Nakai, K. (1991). Yeast. UCI Machine Learning Repository, University of California, Irvine, School of Information and Computer Sciences."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"547","DOI":"10.1016\/j.dss.2009.05.016","article-title":"Modeling Wine Preferences by Data Mining from Physicochemical Properties","volume":"47","author":"Cortez","year":"2009","journal-title":"Decis. Support Syst."},{"key":"ref_42","first-page":"255","article-title":"KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework","volume":"17","author":"Fernandez","year":"2011","journal-title":"J. Mult.-Valued Log. Soft Comput."},{"key":"ref_43","unstructured":"Fedesoriano (2024, November 15). Stroke Prediction Dataset. Kaggle. Available online: https:\/\/www.kaggle.com\/datasets\/fedesoriano\/stroke-prediction-dataset\/data."},{"key":"ref_44","unstructured":"Mssmartypants (2024, November 15). Water Quality Dataset. Kaggle. Available online: https:\/\/www.kaggle.com\/datasets\/mssmartypants\/water-quality."},{"key":"ref_45","unstructured":"Sudhanshu (2024, November 15). Microcalcification Classification Dataset. Kaggle. Available online: https:\/\/www.kaggle.com\/datasets\/sudhanshu2198\/microcalcification-classification\/data."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"4065","DOI":"10.1109\/TNNLS.2017.2751612","article-title":"Classification of Imbalanced Data by Oversampling in Kernel Space of Support Vector Machines","volume":"29","author":"Mathew","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"106087","DOI":"10.1016\/j.knosys.2020.106087","article-title":"A Weighted Hybrid Ensemble Method for Classifying Imbalanced Data","volume":"203","author":"Zhao","year":"2020","journal-title":"Knowl.-Based Syst."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"110986","DOI":"10.1016\/j.asoc.2023.110986","article-title":"Adaptive SV-Borderline SMOTE-SVM Algorithm for Imbalanced Data Classification","volume":"150","author":"Guo","year":"2024","journal-title":"Appl. Soft Comput."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"118955","DOI":"10.1016\/j.ins.2023.118955","article-title":"An Overlapping Oriented Imbalanced Ensemble Learning Algorithm with Weighted Projection Clustering Grouping and Consistent Fuzzy Sample Transformation","volume":"637","author":"Li","year":"2023","journal-title":"Inf. Sci."}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/4\/54\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:16:58Z","timestamp":1760030218000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/4\/54"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,18]]},"references-count":49,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,4]]}},"alternative-id":["data10040054"],"URL":"https:\/\/doi.org\/10.3390\/data10040054","relation":{},"ISSN":["2306-5729"],"issn-type":[{"type":"electronic","value":"2306-5729"}],"subject":[],"published":{"date-parts":[[2025,4,18]]}}}