{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T04:29:34Z","timestamp":1777696174804,"version":"3.51.4"},"reference-count":31,"publisher":"SAGE Publications","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IDA"],"published-print":{"date-parts":[[2022,4,18]]},"abstract":"<jats:p>The imbalanced data problem is widespread in the real world. In the process of training machine learning models, ignoring imbalanced data problems will cause the performance of the model to deteriorate. At present, researchers have proposed many methods to deal with the imbalanced data problems, but these methods mainly focus on the imbalanced data problems in two-class classification tasks. Learning from multi-class imbalanced data sets is still an open problem. In this paper, an ensemble method for classifying multi-class imbalanced data sets is put forward, called multi-class WHMBoost. It is an extension of WHMBoost that we proposed earlier. We do not use the algorithm used in WHMBoost to process the data, but use random balance based on average size so as to balance the data distribution. The weak classifiers we use in the boosting algorithm are support vector machine and decision tree classifier. In the process of training the model, they participate in training with given weights in order to complement each other\u2019s advantages. On 18 multi-class imbalanced data sets, we compared the performance of multi-class WHMBoost with state of the art ensemble algorithms using MAUC, MG-mean and MMCC as evaluation criteria. The results demonstrate that it has obvious advantages compared with state of the art ensemble algorithms and can effectively deal with multi-class imbalanced data sets.<\/jats:p>","DOI":"10.3233\/ida-215874","type":"journal-article","created":{"date-parts":[[2022,4,26]],"date-time":"2022-04-26T13:26:05Z","timestamp":1650979565000},"page":"599-614","source":"Crossref","is-referenced-by-count":2,"title":["Multi-class WHMBoost: An ensemble algorithm for multi-class imbalanced data"],"prefix":"10.1177","volume":"26","author":[{"given":"Jiakun","family":"Zhao","sequence":"first","affiliation":[]},{"given":"Ju","family":"Jin","sequence":"additional","affiliation":[]},{"given":"Yibo","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Ruifeng","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Si","family":"Chen","sequence":"additional","affiliation":[]}],"member":"179","reference":[{"key":"10.3233\/IDA-215874_ref1","unstructured":"N. Japkowicz, Learning from Imbalanced Data Sets: A Comparison of Various Strategies *, 2000."},{"key":"10.3233\/IDA-215874_ref2","doi-asserted-by":"crossref","first-page":"52","DOI":"10.3390\/informatics7040052","article-title":"Multi-class imbalance in text classification: A feature engineering approach to detect cyberbullying in twitter","volume":"7","author":"Talpur","year":"2020","journal-title":"Informatics"},{"key":"10.3233\/IDA-215874_ref5","doi-asserted-by":"crossref","unstructured":"C. Arun and C. Lakshmi, Class Imbalance in Software Fault Prediction Data Set, 2020.","DOI":"10.1007\/978-981-15-0199-9_64"},{"key":"10.3233\/IDA-215874_ref6","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1145\/1007730.1007735","article-title":"A study of the behavior of several methods for balancing machine learning training data","volume":"6","author":"Batista","year":"2004","journal-title":"SIGKDD Explor"},{"key":"10.3233\/IDA-215874_ref7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40537-019-0192-5","article-title":"Survey on deep learning with class imbalance","volume":"6","author":"Johnson","year":"2019","journal-title":"Journal of Big Data"},{"key":"10.3233\/IDA-215874_ref8","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic minority over-sampling technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J Artif Intell Res"},{"key":"10.3233\/IDA-215874_ref9","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1109\/TSMC.1972.4309137","article-title":"Asymptotic properties of nearest neighbor rules using edited data","volume":"2","author":"Wilson","year":"1972","journal-title":"IEEE Trans Syst Man Cybern"},{"key":"10.3233\/IDA-215874_ref10","first-page":"1322","article-title":"ADASYN: Adaptive synthetic sampling approach for imbalanced learning","author":"He","year":"2008","journal-title":"2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)"},{"key":"10.3233\/IDA-215874_ref11","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1109\/TIT.1968.1054155","article-title":"The condensed nearest neighbor rule (Corresp.)","volume":"14","author":"Hart","year":"1968","journal-title":"IEEE Trans. Inf. Theory"},{"key":"10.3233\/IDA-215874_ref12","doi-asserted-by":"crossref","first-page":"106087","DOI":"10.1016\/j.knosys.2020.106087","article-title":"A weighted hybrid ensemble method for classifying imbalanced data","volume":"203","author":"Zhao","year":"2020","journal-title":"Knowl Based Syst"},{"key":"10.3233\/IDA-215874_ref13","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/TPAMI.2018.2858826","article-title":"Focal Loss for Dense Object Detection","volume":"42","author":"Lin","year":"2020","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"10.3233\/IDA-215874_ref14","doi-asserted-by":"crossref","first-page":"909","DOI":"10.3233\/IDA-194647","article-title":"A novel adaptive k-NN classifier for handling imbalance: Application to brain MRI","volume":"24","author":"Kirtania","year":"2020","journal-title":"Intell Data Anal"},{"key":"10.3233\/IDA-215874_ref15","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","article-title":"Learning from Imbalanced Data","volume":"21","author":"He","year":"2009","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"10.3233\/IDA-215874_ref16","doi-asserted-by":"crossref","first-page":"349","DOI":"10.4310\/SII.2009.v2.n3.a8","article-title":"Multi-class AdaBoost","volume":"2","author":"Hastie","year":"2009","journal-title":"Statistics and Its Interface"},{"key":"10.3233\/IDA-215874_ref17","doi-asserted-by":"crossref","unstructured":"N.V. Chawla, A. Lazarevic, L.O. Hall and K.W. Bowyer, SMOTEBoost: Improving Prediction of the Minority Class in Boosting, in: PKDD, 2003.","DOI":"10.1007\/978-3-540-39804-2_12"},{"key":"10.3233\/IDA-215874_ref18","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1214\/aos\/1013203451","article-title":"Greedy function approximation: A gradient boosting machine.","volume":"29","author":"Friedman","year":"2001","journal-title":"Annals of Statistics"},{"key":"10.3233\/IDA-215874_ref19","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1016\/S0167-9473(01)00065-2","article-title":"Stochastic gradient boosting","volume":"38","author":"Friedman","year":"2002","journal-title":"Computational Statistics & Data Analysis"},{"key":"10.3233\/IDA-215874_ref20","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1109\/TSMCA.2009.2029559","article-title":"RUSBoost: A Hybrid Approach to Alleviating Class Imbalance","volume":"40","author":"Seiffert","year":"2010","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics\u00a0\u2013 Part A: Systems and Humans"},{"key":"10.3233\/IDA-215874_ref21","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1016\/j.knosys.2015.04.022","article-title":"Random Balance: Ensembles of variable priors classifiers for imbalanced data","volume":"85","author":"D\u00edez-Pastor","year":"2015","journal-title":"Knowl Based Syst"},{"key":"10.3233\/IDA-215874_ref22","doi-asserted-by":"crossref","first-page":"1119","DOI":"10.1109\/TSMCB.2012.2187280","article-title":"Multiclass Imbalance Problems: Analysis and Potential Solutions","volume":"42","author":"Wang","year":"2012","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)"},{"key":"10.3233\/IDA-215874_ref23","doi-asserted-by":"crossref","first-page":"274","DOI":"10.1109\/ICBDA.2018.8367691","article-title":"A noise classification algorithm based on SAMME and BP neural network","author":"Guo-qiang","year":"2018","journal-title":"2018 IEEE 3rd International Conference on Big Data Analysis (ICBDA)"},{"issue":"1","key":"10.3233\/IDA-215874_ref24","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1186\/s40537-020-00349-y","article-title":"Boosting methods for multi-class imbalanced data classification: an experimental review","volume":"7","author":"Tanha","year":"2020","journal-title":"Journal Of Big Data"},{"key":"10.3233\/IDA-215874_ref25","first-page":"1","article-title":"MEBoost: Mixing estimators with boosting for imbalanced data classification","author":"Rayhan","year":"2017","journal-title":"2017 11th International Conference on Software, Knowledge, Information Management and Applications (SKIMA)"},{"key":"10.3233\/IDA-215874_ref26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.csda.2017.01.00","article-title":"RHSBoost: Improving classification performance in imbalance data","volume":"111(C","author":"Gong","year":"2017","journal-title":"Computational Statistics & Data Analysis"},{"key":"10.3233\/IDA-215874_ref27","unstructured":"W. Fan, S. Stolfo, J. Zhang and P. Chan, AdaCost: Misclassification Cost-Sensitive Boosting, in: ICML, 1999."},{"key":"10.3233\/IDA-215874_ref28","doi-asserted-by":"crossref","unstructured":"L. Zhen and L. Qiong, A New Feature Selection Method for Internet Traffic Classification Using ML, Physics Procedia 33(none) (2012).","DOI":"10.1016\/j.phpro.2012.05.220"},{"key":"10.3233\/IDA-215874_ref29","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1109\/ICDM.2006.29","article-title":"Boosting for Learning Multiple Classes with Imbalanced Class Distribution","author":"Sun","year":"2006","journal-title":"Sixth International Conference on Data Mining (ICDM\u201906)"},{"key":"10.3233\/IDA-215874_ref30","first-page":"255","article-title":"KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework","volume":"17","author":"Alcal\u00e1-Fdez","year":"2011","journal-title":"J Multiple Valued Log Soft Comput"},{"key":"10.3233\/IDA-215874_ref31","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1109\/ICPR.2016.7899680","article-title":"An optimal multiclass classifier design","author":"Fiori","year":"2016","journal-title":"2016 23rd International Conference on Pattern Recognition (ICPR)"},{"key":"10.3233\/IDA-215874_ref32","first-page":"1","article-title":"Statistical Comparisons of Classifiers over Multiple Data Sets","volume":"7","author":"Demsar","year":"2006","journal-title":"J Mach Learn Res"},{"key":"10.3233\/IDA-215874_ref33","doi-asserted-by":"crossref","first-page":"800","DOI":"10.1093\/biomet\/75.4.800","article-title":"A sharper Bonferroni procedure for multiple tests of significance","volume":"75","author":"Hochberg","year":"1988","journal-title":"Biometrika"}],"container-title":["Intelligent Data Analysis"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/IDA-215874","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:19:28Z","timestamp":1777454368000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/IDA-215874"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,18]]},"references-count":31,"journal-issue":{"issue":"3"},"URL":"https:\/\/doi.org\/10.3233\/ida-215874","relation":{},"ISSN":["1088-467X","1571-4128"],"issn-type":[{"value":"1088-467X","type":"print"},{"value":"1571-4128","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,4,18]]}}}