{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T10:21:04Z","timestamp":1773829264823,"version":"3.50.1"},"reference-count":65,"publisher":"Springer Science and Business Media LLC","issue":"12","license":[{"start":{"date-parts":[[2014,4,30]],"date-time":"2014-04-30T00:00:00Z","timestamp":1398816000000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Soft Comput"],"published-print":{"date-parts":[[2015,12]]},"DOI":"10.1007\/s00500-014-1291-z","type":"journal-article","created":{"date-parts":[[2014,4,29]],"date-time":"2014-04-29T06:57:26Z","timestamp":1398754646000},"page":"3369-3385","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":36,"title":["To combat multi-class imbalanced problems by means of over-sampling and boosting techniques"],"prefix":"10.1007","volume":"19","author":[{"given":"Lida","family":"Abdi","sequence":"first","affiliation":[]},{"given":"Sattar","family":"Hashemi","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2014,4,30]]},"reference":[{"key":"1291_CR1","unstructured":"Alcal J, Fernndez A, Luengo J, Derrac J, Garca S, Snchez L, Herrera F (2010) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Mult Valued Log Soft Comput"},{"key":"1291_CR2","doi-asserted-by":"crossref","unstructured":"Alibeigi M, Hashemi S, Hamzeh A (2012) DBFS: an effective density based feature selection scheme for small sample size and high dimensional imbalanced data sets. Data Knowl Eng","DOI":"10.1016\/j.datak.2012.08.001"},{"key":"1291_CR3","unstructured":"Aczl J, Darczy Z (1975) On measures of information and their characterizations. New York"},{"key":"1291_CR4","unstructured":"Bishop CM (2007) Pattern recognition and machine learning (information science and statistics)"},{"issue":"7","key":"1291_CR5","doi-asserted-by":"crossref","first-page":"1145","DOI":"10.1016\/S0031-3203(96)00142-2","volume":"30","author":"AP Bradley","year":"1997","unstructured":"Bradley AP (1997) The use of the area under the roc curve in the evaluation of machine learning algorithms. Pattern Recognit 30(7):1145\u20131159","journal-title":"Pattern Recognit"},{"key":"1291_CR6","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1613\/jair.953","volume":"16","author":"NV Chawla","year":"2002","unstructured":"Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) Smote: synthetic minority over-sampling technique. J Artif Intell Res 16:341\u2013378","journal-title":"J Artif Intell Res"},{"key":"1291_CR7","doi-asserted-by":"crossref","unstructured":"Chawla NV, Lazarevic A, Hall LO, Bowyer KW (2003) SMOTEBoost: improving prediction of the minority class in boosting. In: Knowledge discovery in databases: PKDD 2003. Springer, Berlin, pp 107\u2013119","DOI":"10.1007\/978-3-540-39804-2_12"},{"key":"1291_CR8","unstructured":"Chawla NV (2003) C4.5 and imbalanced data sets: investigating the effect of sampling method, probabilistic estimate, and decision tree structure. In: Proceedings of the ICML, vol 3"},{"issue":"1","key":"1291_CR9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1007730.1007733","volume":"6","author":"NV Chawla","year":"2004","unstructured":"Chawla NV, Japkowicz N, Kotcz A (2004) Editorial: special issue on learning from imbalanced data sets. SIGKDD Explor Newsl 6(1):1\u20136","journal-title":"SIGKDD Explor Newsl"},{"key":"1291_CR10","doi-asserted-by":"crossref","unstructured":"Chen K, Lu BL, Kwok JT (2006) Efficient classification of multi-label and imbalanced data using min-max modular classifiers. In: International joint conference on neural networks, IJCNN\u201906, IEEE, pp 1770\u20131775","DOI":"10.1109\/IJCNN.2006.246893"},{"key":"1291_CR11","doi-asserted-by":"crossref","unstructured":"Chen XW, Wasikowski M (2008) Fast: a roc-based feature selection metric for small samples and imbalanced data classification problems. In: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, New York, pp 124\u2013132","DOI":"10.1145\/1401890.1401910"},{"key":"1291_CR12","doi-asserted-by":"crossref","unstructured":"Cohen WW (1995) Fast effective rule induction. In: ICML, vol 95, pp 115\u2013123","DOI":"10.1016\/B978-1-55860-377-6.50023-2"},{"issue":"1","key":"1291_CR13","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1007\/s00211-007-0114-x","volume":"108","author":"J Demmel","year":"2007","unstructured":"Demmel J, Dumitriu I, Holtz O (2007) Fast linear algebra is stable. Numer Math 108(1):59\u201391","journal-title":"Numer Math"},{"key":"1291_CR14","first-page":"1","volume":"7","author":"J Demar","year":"2006","unstructured":"Demar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1\u201330","journal-title":"J Mach Learn Res"},{"key":"1291_CR15","doi-asserted-by":"crossref","unstructured":"Dietterich TG (2000) Ensemble methods in machine learning. In: Multiple classifier systems. Springer, Berlin, pp 1\u201315","DOI":"10.1007\/3-540-45014-9_1"},{"issue":"293","key":"1291_CR16","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1080\/01621459.1961.10482090","volume":"56","author":"OJ Dunn","year":"1961","unstructured":"Dunn OJ (1961) Multiple comparisons among means. J Am Stat Assoc 56(293):52\u201364","journal-title":"J Am Stat Assoc"},{"key":"1291_CR17","doi-asserted-by":"crossref","unstructured":"Elkan C, Noto K (2008) Learning classifiers from only positive and unlabeled data. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, pp 213\u2013220","DOI":"10.1145\/1401890.1401920"},{"key":"1291_CR18","doi-asserted-by":"crossref","unstructured":"Fernndez A, Del Jesus MJ, Herrera F (2010) Multi-class imbalanced data-sets with linguistic fuzzy rule based classification systems based on pairwise learning. In: Computational intelligence for knowledge-based systems design. Springer, Berlin, pp 89\u201398","DOI":"10.1007\/978-3-642-14049-5_10"},{"key":"1291_CR19","unstructured":"Frank A, Asuncion A (2010) UCI machine learning repository: http:\/\/archive.ics.uci.edu\/ml"},{"key":"1291_CR20","unstructured":"Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: ICML, vol 96, pp 148\u2013156"},{"issue":"200","key":"1291_CR21","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1080\/01621459.1937.10503522","volume":"32","author":"M Friedman","year":"1937","unstructured":"Friedman M (1937) The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J Am Stat Assoc 32(200):675\u2013701","journal-title":"J Am Stat Assoc"},{"issue":"5439","key":"1291_CR22","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","volume":"286","author":"TR Golub","year":"1999","unstructured":"Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286(5439):531\u2013537","journal-title":"Science"},{"issue":"10","key":"1291_CR23","doi-asserted-by":"crossref","first-page":"3595","DOI":"10.1080\/03610929008830400","volume":"19","author":"N Henze","year":"1990","unstructured":"Henze N, Zirkler B (1990) A class of invariant consistent tests for multivariate normality. Commun Statist Theor Meth 19(10):3595\u20133618","journal-title":"Commun Statist Theor Meth"},{"key":"1291_CR24","doi-asserted-by":"crossref","unstructured":"Han H, Wang WY, Mao BH (2005) Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. In: Advances in intelligent computing. Springer, Berlin, pp 878\u2013887","DOI":"10.1007\/11538059_91"},{"issue":"2","key":"1291_CR25","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1023\/A:1010920819831","volume":"45","author":"DJ Hand","year":"2001","unstructured":"Hand DJ, Till RJ (2001) A simple generalisation of the area under the roc curve for multiple class classification problems. Mach Learn 45(2):171\u2013186","journal-title":"Mach Learn"},{"issue":"3","key":"1291_CR26","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1109\/TIT.1968.1054155","volume":"14","author":"PE Hart","year":"1968","unstructured":"Hart PE (1968) The condensed nearest neighbour rule. IEEE Trans Inform Theory 14(3):515\u2013516","journal-title":"IEEE Trans Inform Theory"},{"issue":"2","key":"1291_CR27","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1214\/aos\/1028144844","volume":"26","author":"T Hastie","year":"1998","unstructured":"Hastie T, Tibshirani R (1998) Classification by pairwise coupling. Ann Stat 26(2):451\u2013471","journal-title":"Ann Stat"},{"key":"1291_CR28","doi-asserted-by":"crossref","unstructured":"He H, Bai Y, Garcia EA, Li S (2008) ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: IEEE international joint conference on neural networks. IJCNN (IEEE world congress on computational intelligence), IEEE, pp 1322\u20131328)","DOI":"10.1109\/IJCNN.2008.4633969"},{"issue":"9","key":"1291_CR29","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","volume":"21","author":"H He","year":"2009","unstructured":"He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263\u20131284","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"6","key":"1291_CR30","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1080\/03610928008827904","volume":"9","author":"RL Iman","year":"1980","unstructured":"Iman RL, Davenport JM (1980) Approximations of the critical region of the fbietkan statistic. Commun Stat Theory Methods 9(6):571\u2013595","journal-title":"Commun Stat Theory Methods"},{"key":"1291_CR31","doi-asserted-by":"crossref","unstructured":"Joshi MV, Agarwal RC, Kumar V (2002) Predicting rare classes: can boosting make any weak learner strong? In: Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, New York, pp 297\u2013306","DOI":"10.1145\/775047.775092"},{"issue":"1","key":"1291_CR32","first-page":"25","volume":"30","author":"S Kotsiantis","year":"2006","unstructured":"Kotsiantis S, Kanellopoulos D, Pintelas P (2006) Handling imbalanced datasets: a review. GESTS Int Trans Comput Sci Eng 30(1):25\u201336","journal-title":"GESTS Int Trans Comput Sci Eng"},{"key":"1291_CR33","doi-asserted-by":"crossref","unstructured":"Kotsiantis SB, Zaharakis ID, Pintelas PE (2007) Supervised machine learning: a review of classification techniques","DOI":"10.1007\/s10462-007-9052-3"},{"key":"1291_CR34","unstructured":"Kubat M, Matwin S (1997) Addressing the curse of imbalanced training sets: one-sided selection. In: Proceeding 14th international conference on machine learning, p 179\u2013186"},{"key":"1291_CR35","doi-asserted-by":"crossref","unstructured":"Kubat M, Holte R, Matwin S (1997) Learning when negative examples abound. In: Machine learning: ECML-97. Springer, Berlin, pp 146\u2013153","DOI":"10.1007\/3-540-62858-4_79"},{"issue":"1","key":"1291_CR36","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1214\/aoms\/1177729694","volume":"22","author":"S Kullback","year":"1951","unstructured":"Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22(1):79\u201386","journal-title":"Ann Math Stat"},{"key":"1291_CR37","doi-asserted-by":"crossref","unstructured":"Lee HJ, Cho S (2006) The novelty detection approach for different degrees of class imbalance. In: Lecture notes in computer science series as the proceedings of international conference on neural information processing, vol 4233, pp 21\u201330","DOI":"10.1007\/11893257_3"},{"issue":"3","key":"1291_CR38","doi-asserted-by":"crossref","first-page":"1041","DOI":"10.1016\/j.eswa.2007.08.044","volume":"35","author":"TW Liao","year":"2008","unstructured":"Liao TW (2008) Classification of weld flaws with imbalanced class data. Expert Syst Appl 35(3):1041\u20131052","journal-title":"Expert Syst Appl"},{"key":"1291_CR39","first-page":"49","volume":"2","author":"PC Mahalanobis","year":"1936","unstructured":"Mahalanobis PC (1936) On the generalized distance in statistics. Proc Natl Inst Sci (Calcutta) 2:49\u201355","journal-title":"Proc Natl Inst Sci (Calcutta)"},{"key":"1291_CR40","volume-title":"Machine learning","author":"TM Mitchell","year":"1997","unstructured":"Mitchell TM (1997) Machine learning. McGraw-Hill, New York"},{"issue":"3","key":"1291_CR41","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1109\/MCAS.2006.1688199","volume":"6","author":"R Polikar","year":"2006","unstructured":"Polikar R (2006) Ensemble based systems in decision making. Circuits Syst Mag IEEE 6(3):21\u201345","journal-title":"Circuits Syst Mag IEEE"},{"key":"1291_CR42","unstructured":"Perrone MP, Cooper LN (1992) When networks disagree: ensemble methods for hybrid neural networks (No. TR-61). Brown University Providence RI Institute for Brain and Neural Systems"},{"key":"1291_CR43","doi-asserted-by":"crossref","unstructured":"Prati RC, Batista GE, Monard MC (2004) Class imbalances versus class overlapping: an analysis of a learning system behavior. In MICAI 2004: advances in artificial intelligence. Springer, Berlin, pp 312\u2013321","DOI":"10.1007\/978-3-540-24694-7_32"},{"key":"1291_CR44","unstructured":"Quinlan JR (1993) C4. 5: programs for machine learning, vol 1. Morgan kaufmann"},{"issue":"1","key":"1291_CR45","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1145\/1007730.1007739","volume":"6","author":"B Raskutti","year":"2004","unstructured":"Raskutti B, Kowalczyk A (2004) Extreme re-balancing for svms: a case study. SIGKDD Explor 6(1):60\u201369","journal-title":"SIGKDD Explor"},{"key":"1291_CR46","first-page":"101","volume":"5","author":"R Rifkin","year":"2004","unstructured":"Rifkin R, Klautau A (2004) In defense of one-vs-all classification. J Mach Learn Res 5:101\u2013141","journal-title":"J Mach Learn Res"},{"key":"1291_CR47","volume-title":"Information retrieval","author":"CV Rijsbergen","year":"1979","unstructured":"Rijsbergen CV (1979) Information retrieval. Butterworths, London"},{"key":"1291_CR48","doi-asserted-by":"crossref","unstructured":"Seiffert C, Khoshgoftaar TM, Van Hulse J, Napolitano A (2008) Improving learner performance with data sampling and boosting. In: Proceeding of the 20th IEEE international conference on tools with artificial intelligence. ICTAI\u201908, IEEE, vol 1, pp 452\u2013459","DOI":"10.1109\/ICTAI.2008.58"},{"issue":"1","key":"1291_CR49","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1109\/TSMCA.2009.2029559","volume":"40","author":"C Seiffert","year":"2010","unstructured":"Seiffert C, Khoshgoftaar TM, Van Hulse J, Napolitano A (2010) Rusboost: a hybrid approach to alleviating class imbalance. IEEE Trans Syst Man Cybern Part B Cybern 40(1):185\u2013197","journal-title":"IEEE Trans Syst Man Cybern Part B Cybern"},{"key":"1291_CR50","doi-asserted-by":"crossref","unstructured":"Sun Y, Wong AK, Wang Y (2005) Parameter inference of cost-sensitive boosting algorithms. In: Machine learning and data mining in pattern recognition. Springer, Berlin, pp 21\u201330","DOI":"10.1007\/11510888_3"},{"key":"1291_CR51","doi-asserted-by":"crossref","unstructured":"Sun Y, Kamel MS, Wang Y (2006) Boosting for learning multiple classes with imbalanced class distribution. In: Proceeding of the sixth international conference on data mining. ICDM\u201906, IEEE, pp 592\u2013602","DOI":"10.1109\/ICDM.2006.29"},{"issue":"12","key":"1291_CR52","doi-asserted-by":"crossref","first-page":"3358","DOI":"10.1016\/j.patcog.2007.04.009","volume":"40","author":"Y Sun","year":"2007","unstructured":"Sun Y, Kamel MS, Wong AK, Wang Y (2007) Cost-sensitive boosting for classification of imbalanced data. Pattern Recognit 40(12):3358\u20133378","journal-title":"Pattern Recognit"},{"key":"1291_CR53","unstructured":"Tan A, Gilbert D, Deville Y (2003) Multi-class protein fold classification using a new ensemble machine learning approach"},{"key":"1291_CR54","first-page":"769","volume":"11","author":"I Tomek","year":"1976","unstructured":"Tomek I (1976) Two modifications of cnn. IEEE Trans Syst Man Cybern 11:769\u2013772","journal-title":"IEEE Trans Syst Man Cybern"},{"key":"1291_CR55","unstructured":"Trujillo-Ortiz A, Hernandez-Walls R, Barba-Rojo K, Cupul-Magana L (2007) HZmvntest: Henze\u2013Zirkler\u2019s multivariate normality test. A MATLAB file [WWW document]. http:\/\/www.mathworks.com\/matlabcentral\/fileexchange\/loadFile.do?objectId=17931"},{"key":"1291_CR56","unstructured":"Wang BX, Japkowicz N (2004) Imbalanced data set learning with synthetic samples. In: Proceeding of the IRIS machine learning workshop"},{"issue":"4","key":"1291_CR57","doi-asserted-by":"crossref","first-page":"1119","DOI":"10.1109\/TSMCB.2012.2187280","volume":"42","author":"S Wang","year":"2012","unstructured":"Wang S, Yao X (2012) Multiclass imbalance problems: analysis and potential solutions. IEEE Trans Syst Man Cybern Part B Cybern 42(4):1119\u20131130","journal-title":"IEEE Trans Syst Man Cybern Part B Cybern"},{"key":"1291_CR58","unstructured":"Weiss GM, Provost F (2001) The effect of class distribution on classifier learning: an empirical study. Rutgers University, USA"},{"key":"1291_CR59","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1613\/jair.1199","volume":"19","author":"GM Weiss","year":"2003","unstructured":"Weiss GM, Provost FJ (2003) Learning when training data are costly: the effect of class distribution on tree induction. J Artif Intell Res (JAIR) 19:315\u2013354","journal-title":"J Artif Intell Res (JAIR)"},{"issue":"1","key":"1291_CR60","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1145\/1007730.1007734","volume":"6","author":"GM Weiss","year":"2004","unstructured":"Weiss GM (2004) Mining with rarity: a unifying framework. ACM SIGKDD Explor Newsl 6(1):7\u201319","journal-title":"ACM SIGKDD Explor Newsl"},{"issue":"3","key":"1291_CR61","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1109\/TSMC.1972.4309137","volume":"2","author":"DL Wilson","year":"1972","unstructured":"Wilson DL (1972) Asymptotic properties of nearest neighbour rules using edited data. IEEE Trans Syst Man Cybern 2(3):408\u2013421","journal-title":"IEEE Trans Syst Man Cybern"},{"key":"1291_CR62","unstructured":"Witten IH (2005) and Frank, E. Data mining, practical machine learning tools and techniques. Morgan Kaufmann"},{"issue":"4","key":"1291_CR63","doi-asserted-by":"crossref","first-page":"1125","DOI":"10.1002\/prot.21870","volume":"70","author":"XM Zhao","year":"2008","unstructured":"Zhao XM, Li X, Chen L, Aihara K (2008) Protein classification with imbalanced data. Proteins Struct Funct Bioinform 70(4):1125\u20131132","journal-title":"Proteins Struct Funct Bioinform"},{"issue":"1","key":"1291_CR64","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1109\/TKDE.2006.17","volume":"18","author":"ZH Zhou","year":"2006","unstructured":"Zhou ZH, Liu XY (2006) Training cost-sensitive neural networks with methods addressing the class imbalance problem. IEEE Trans Knowl Data Eng 18(1):63\u201377","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"7","key":"1291_CR65","doi-asserted-by":"crossref","first-page":"32","DOI":"10.4304\/jcp.1.7.32-40","volume":"1","author":"L Zhuang","year":"2006","unstructured":"Zhuang L, Dai H (2006) Parameter optimization of kernel-based one-class classifier on imbalance learning. J Comput 1(7):32\u201340","journal-title":"J Comput"}],"container-title":["Soft Computing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00500-014-1291-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s00500-014-1291-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00500-014-1291-z","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,2]],"date-time":"2025-05-02T16:01:31Z","timestamp":1746201691000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s00500-014-1291-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,4,30]]},"references-count":65,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2015,12]]}},"alternative-id":["1291"],"URL":"https:\/\/doi.org\/10.1007\/s00500-014-1291-z","relation":{},"ISSN":["1432-7643","1433-7479"],"issn-type":[{"value":"1432-7643","type":"print"},{"value":"1433-7479","type":"electronic"}],"subject":[],"published":{"date-parts":[[2014,4,30]]}}}