{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T01:01:08Z","timestamp":1768870868166,"version":"3.49.0"},"reference-count":73,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2017,1,5]],"date-time":"2017-01-05T00:00:00Z","timestamp":1483574400000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Empir Software Eng"],"published-print":{"date-parts":[[2017,12]]},"DOI":"10.1007\/s10664-016-9488-7","type":"journal-article","created":{"date-parts":[[2017,1,5]],"date-time":"2017-01-05T01:07:36Z","timestamp":1483578456000},"page":"2806-2851","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":58,"title":["An empirical study for software change prediction using imbalanced data"],"prefix":"10.1007","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4379-1837","authenticated-orcid":false,"given":"Ruchika","family":"Malhotra","sequence":"first","affiliation":[]},{"given":"Megha","family":"Khanna","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2017,1,5]]},"reference":[{"key":"9488_CR1","unstructured":"Apandi ZFM, Mustapha N, Affendey LS (2011) Evaluating integrated weight linear method to class imbalanced learning in video data. In 3rd Conference on Data Mining and Optimization, 243\u2013247"},{"issue":"1","key":"9488_CR2","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1016\/j.jss.2009.06.055","volume":"83","author":"E Arisholm","year":"2010","unstructured":"Arisholm E, Briand LC, Johannessen EB (2010) A systematic and comprehensive investigation of methods to build and evaluate fault prediction models. 
J Syst Softw 83(1):2\u201317","journal-title":"J Syst Softw"},{"issue":"10","key":"9488_CR3","first-page":"27","volume":"3","author":"M Bekkar","year":"2013","unstructured":"Bekkar M, Djemaa HK, Alitouche TA (2013) Evaluation measures for models assessment over imbalanced data sets. J Inf Eng Appl 3(10):27\u201338","journal-title":"J Inf Eng Appl"},{"key":"9488_CR4","doi-asserted-by":"crossref","unstructured":"Bieman J, Jain D, Yang H (2001) OO design patterns, design structure, and program changes: an industrial case study. In proceedings of 17th International Conference on Software Maintenance, 580\u2013589","DOI":"10.1109\/ICSM.2001.972775"},{"key":"9488_CR5","first-page":"123","volume":"24","author":"L Breiman","year":"1996","unstructured":"Breiman L (1996) Bagging predictors. Mach Learn 24:123\u2013140","journal-title":"Mach Learn"},{"issue":"1","key":"9488_CR6","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1023\/A:1009783721306","volume":"3","author":"L Briand","year":"1998","unstructured":"Briand L, Daly J, Wust J (1998) A unified framework for cohesion measurement in object-oriented systems. Empir Softw Eng 3(1):65\u2013117","journal-title":"Empir Softw Eng"},{"issue":"1","key":"9488_CR7","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1109\/32.748920","volume":"25","author":"L Briand","year":"1999","unstructured":"Briand L, Daly J, Wust J (1999) A unified framework for coupling measurement in object-oriented systems. IEEE Trans Softw Eng 25(1):91\u2013121","journal-title":"IEEE Trans Softw Eng"},{"issue":"3","key":"9488_CR8","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/S0164-1212(99)00102-8","volume":"51","author":"L Briand","year":"2000","unstructured":"Briand L, Wust J, Daly JW (2000) Exploring the relationship between design measures and software quality in object-oriented systems. 
J Syst Softw 51(3):245\u2013273","journal-title":"J Syst Softw"},{"issue":"1","key":"9488_CR9","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1023\/A:1009815306478","volume":"6","author":"L Briand","year":"2001","unstructured":"Briand L, Wust J, Lounis H (2001) Replicated case studies for investigating quality factors in object oriented designs. Empir Softw Eng J 6(1):11\u201358","journal-title":"Empir Softw Eng J"},{"issue":"1","key":"9488_CR10","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L (2001) Random forests. Mach Learn 45(1):5\u201332","journal-title":"Mach Learn"},{"issue":"8","key":"9488_CR11","doi-asserted-by":"crossref","first-page":"786","DOI":"10.1109\/32.879814","volume":"26","author":"M Cartwright","year":"2000","unstructured":"Cartwright M, Shepperd M (2000) An empirical investigation of an object-oriented software system. IEEE Trans Softw Eng 26(8):786\u2013796","journal-title":"IEEE Trans Softw Eng"},{"issue":"5","key":"9488_CR12","doi-asserted-by":"crossref","first-page":"868","DOI":"10.1016\/j.jss.2009.12.023","volume":"83","author":"ABD Carvalho","year":"2010","unstructured":"Carvalho ABD, Pozo A, Vergilio SR (2010) A symbolic fault-prediction model based on multi-objective particle swarm optimization. J Syst Softw 83(5):868\u2013882","journal-title":"J Syst Softw"},{"issue":"8","key":"9488_CR13","doi-asserted-by":"crossref","first-page":"1040","DOI":"10.1016\/j.ins.2008.12.001","volume":"179","author":"C Catal","year":"2009","unstructured":"Catal C, Diri B (2009) Investigating the effect of dataset size, metrics sets, and feature selection techniques on software fault prediction problem. 
Inf Sci 179(8):1040\u20131058","journal-title":"Inf Sci"},{"key":"9488_CR14","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","volume":"16","author":"NV Chawla","year":"2002","unstructured":"Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16:321\u2013357","journal-title":"J Artif Intell Res"},{"issue":"6","key":"9488_CR15","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1109\/32.295895","volume":"20","author":"SR Chidamber","year":"1994","unstructured":"Chidamber SR, Kemerer CF (1994) A metrics suite for object oriented design. IEEE Tran Softw Eng 20(6):476\u2013493","journal-title":"IEEE Tran Softw Eng"},{"key":"9488_CR16","first-page":"1","volume":"7","author":"J Dem\u0161ar","year":"2006","unstructured":"Dem\u0161ar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1\u201330","journal-title":"J Mach Learn Res"},{"key":"9488_CR17","doi-asserted-by":"crossref","unstructured":"Domingos P (1999) Metacost: a general method for making classifiers cost-sensitive. In Proc. of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, CA, 155\u2013164","DOI":"10.1145\/312129.312220"},{"issue":"5","key":"9488_CR18","first-page":"407","volume":"25","author":"MO Elish","year":"2013","unstructured":"Elish MO, Al-Khiaty MA (2013) A suite of metrics for quantifying historical changes to predict future change-prone classes in object-oriented software. J Softw: Evol Process 25(5):407\u2013437","journal-title":"J Softw: Evol Process"},{"key":"9488_CR19","doi-asserted-by":"crossref","unstructured":"Eski S, Buzluca F (2011) An empirical study on object-oriented metrics and software evolution in order to reduce testing cost by predicting change prone classes. In Proc. 
of International Conference on Software Testing, Verification and Validation Workshop, 566\u2013571","DOI":"10.1109\/ICSTW.2011.43"},{"issue":"8","key":"9488_CR20","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","volume":"27","author":"T Fawcett","year":"2006","unstructured":"Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27(8):861\u2013874","journal-title":"Pattern Recogn Lett"},{"issue":"2","key":"9488_CR21","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1214\/aos\/1016218223","volume":"28","author":"J Friedman","year":"2000","unstructured":"Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting. Ann Stat 28(2):337\u2013407","journal-title":"Ann Stat"},{"issue":"4","key":"9488_CR22","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1109\/TSMCC.2011.2161285","volume":"42","author":"M Galar","year":"2012","unstructured":"Galar M, Fernandez A, Barrenechea E, Bustince H, Herrera F (2012) A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Trans Syst Man Cybern Part C Appl Rev 42(4):463\u2013484","journal-title":"IEEE Trans Syst Man Cybern Part C Appl Rev"},{"key":"9488_CR23","doi-asserted-by":"crossref","unstructured":"Gao K, Khoshgoftaar TM, Napolitano A (2015) Combining feature subset selection and data sampling for coping with highly imbalanced software data. In Proc. of 27th International Conf. on Software Engineering and Knowledge Engineering, Pittsburgh, 2015","DOI":"10.18293\/SEKE2015-182"},{"key":"9488_CR24","doi-asserted-by":"crossref","unstructured":"Giger E, Pinzger M, Gall HC (2012) Can we predict type of code changes? An empirical analysis. In Proc. 
of 9th IEEE Working Conference on Mining Software Repositories, 217\u2013226","DOI":"10.1109\/MSR.2012.6224284"},{"key":"9488_CR25","unstructured":"Hall MA (2000) Correlation-based feature selection for discrete and numeric class machine learning. In Proc. of the Seventeenth International Conference on Machine Learning, 359\u2013366"},{"issue":"6","key":"9488_CR26","doi-asserted-by":"crossref","first-page":"1437","DOI":"10.1109\/TKDE.2003.1245283","volume":"15","author":"MA Hall","year":"2003","unstructured":"Hall MA, Holmes G (2003) Benchmarking attribute selection techniques for discrete class data mining. IEEE Trans Knowl Data Eng 15(6):1437\u20131447","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"9488_CR27","doi-asserted-by":"crossref","unstructured":"Harman M, Islam S, Jia Y, Minku LL, Sarro F, Sirivisut K (2014) less is more: temporal fault predictive performance over multiple Hadoop releases. In Proc. 6th International Symposium on Search Based Software Engineering, 240\u2013246","DOI":"10.1007\/978-3-319-09940-8_19"},{"key":"9488_CR28","volume-title":"Neural networks: a comprehensive foundation","author":"S Haykin","year":"2004","unstructured":"Haykin S (2004) Neural networks: a comprehensive foundation, 2nd edn. Pearson education, Delhi","edition":"2"},{"issue":"9","key":"9488_CR29","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","volume":"21","author":"H He","year":"2009","unstructured":"He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263\u20131284","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"9488_CR30","unstructured":"Henderson-Sellers B (1996) Object-oriented metrics, measures of complexity. 
Prentice Hall"},{"issue":"4","key":"9488_CR31","doi-asserted-by":"crossref","first-page":"1347","DOI":"10.1093\/ietisy\/e89-d.4.1347","volume":"89","author":"AMAN Hirohisa","year":"2006","unstructured":"Hirohisa AMAN, Mochiduki N, Yamada H (2006) A model for detecting cost-prone classes based on Mahalanobis-Taguchi method. IEICE Trans Inf Syst 89(4):1347\u20131358","journal-title":"IEICE Trans Inf Syst"},{"key":"9488_CR32","doi-asserted-by":"crossref","unstructured":"Hulse JV, Khoshgoftaar TM, Napolitano A, Wald R (2009) Feature selection with high-dimensional imbalanced data. In Proc. of International Conference on Data Mining Workshops, 507\u2013514","DOI":"10.1109\/ICDMW.2009.35"},{"key":"9488_CR33","doi-asserted-by":"crossref","unstructured":"Jeni L, Cohn JF, De La Torre F (2013) Facing imbalanced data--recommendations for the use of performance metrics. In Proc. of Humane Association Conf. on Affective Computing and Intelligent Interaction, 245\u2013251","DOI":"10.1109\/ACII.2013.47"},{"key":"9488_CR34","doi-asserted-by":"crossref","unstructured":"Kamei Y, Monden A, Matsumoto S, Kakimoto T, Matsumoto K (2007) The effects of over and under sampling on fault-prone module detection. In Proc. 1st International Symposium on Empirical Software Engineering and Measurement, 196\u2013204","DOI":"10.1109\/ESEM.2007.28"},{"issue":"2","key":"9488_CR35","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1007\/s11219-006-7597-z","volume":"14","author":"TM Khoshgoftaar","year":"2006","unstructured":"Khoshgoftaar TM, Seliya N, Sundaresh N (2006) An empirical study of predicting faults with case-based reasoning. Softw Qual J 14(2):85\u2013111","journal-title":"Softw Qual J"},{"key":"9488_CR36","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1016\/j.jss.2006.05.017","volume":"80","author":"AG Koru","year":"2007","unstructured":"Koru AG, Liu H (2007) Identifying and characterizing change-prone classes in two large-scale open-source products. 
J Syst Softw 80:63\u201373","journal-title":"J Syst Softw"},{"issue":"8","key":"9488_CR37","doi-asserted-by":"crossref","first-page":"625","DOI":"10.1109\/TSE.2005.89","volume":"31","author":"AG Koru","year":"2005","unstructured":"Koru AG, Tian J (2005) Comparing high-change modules and modules with the highest measurement values in two large-scale open-source products. IEEE Trans Softw Eng 31(8):625\u2013642","journal-title":"IEEE Trans Softw Eng"},{"key":"9488_CR38","unstructured":"Kubat M, Matwin S (1997) Addressing the curse of imbalanced training sets: one sided selection. In Proc. of 14th International Conference on Machine Learning 97: 179\u2013186"},{"issue":"4","key":"9488_CR39","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1109\/TSE.2008.35","volume":"34","author":"S Lessmann","year":"2008","unstructured":"Lessmann S, Baesans B, Mues C, Pietsch S (2008) Benchmarking classification models for software defect prediction: a proposed framework and novel finding. IEEE Trans Softw Eng 34(4):485\u2013496","journal-title":"IEEE Trans Softw Eng"},{"issue":"2","key":"9488_CR40","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1007\/s10515-011-0092-1","volume":"19","author":"M Li","year":"2012","unstructured":"Li M, Zhang H, Whu R, Zhou Z (2012) Sample-based software defect prediction with active and semi-supervised learning. Autom Softw Eng 19(2):201\u2013230","journal-title":"Autom Softw Eng"},{"key":"9488_CR41","doi-asserted-by":"crossref","unstructured":"Liu Y, An A, Huang X (2006) Boosting prediction accuracy on imbalanced datasets with SVM ensembles. 
In Advances in Knowledge Discovery and Data Mining, 107\u2013118","DOI":"10.1007\/11731139_15"},{"key":"9488_CR42","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1016\/j.ins.2013.07.007","volume":"250","author":"V Lopez","year":"2013","unstructured":"Lopez V, Fernandez A, Garcia S, Palade V, Herrera F (2013) An insight into classification with imbalanced data: empirical results and current trends on using data intrinsic characteristics. Inf Sci 250:113\u2013141","journal-title":"Inf Sci"},{"issue":"3","key":"9488_CR43","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1007\/s10664-011-9170-z","volume":"17","author":"H Lu","year":"2012","unstructured":"Lu H, Zhou Y, Xu B, Leung H, Chen L (2012) The ability of object-oriented metrics to predict change-proneness: a meta-analysis. Empir Softw Eng J 17(3):200\u2013242","journal-title":"Empir Softw Eng J"},{"key":"9488_CR44","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1016\/j.asoc.2014.11.023","volume":"27","author":"R Malhotra","year":"2015","unstructured":"Malhotra R (2015) A systematic review of machine learning techniques for software fault prediction. Appl Soft Comput 27:504\u2013518","journal-title":"Appl Soft Comput"},{"key":"9488_CR45","doi-asserted-by":"crossref","unstructured":"Malhotra R, Khanna M (2013) Investigation of relationship between object-oriented metrics and change proneness. Int J Mach Learn Cybern. Springer-Verlag 4(4): 273\u2013286","DOI":"10.1007\/s13042-012-0095-7"},{"key":"9488_CR46","doi-asserted-by":"crossref","unstructured":"Malhotra R, Nagpal K, Upmanyu P, Pritam N (2014) Defect collection and reporting system for git based open source software. In Proc. of International Conf. 
on Data Mining and Intelligent Computing, 1\u20137","DOI":"10.1109\/ICDMIC.2014.6954234"},{"key":"9488_CR47","volume-title":"Agile software development: principles, patterns, and practices","author":"RC Martin","year":"2002","unstructured":"Martin RC (2002) Agile software development: principles, patterns, and practices. Prentice Hall, USA"},{"issue":"1","key":"9488_CR48","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1109\/TSE.2007.256941","volume":"33","author":"T Menzies","year":"2007","unstructured":"Menzies T, Greenwald J, Frank A (2007a) Data mining static code attributes to learn defect predictors. IEEE Trans Softw Eng 33(1):2\u201313","journal-title":"IEEE Trans Softw Eng"},{"issue":"9","key":"9488_CR49","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1109\/TSE.2007.70721","volume":"33","author":"T Menzies","year":"2007","unstructured":"Menzies T, Dekhtyar A, Distefance J, Greenwald J (2007b) Problems with precision: a response to comments on \u2018data mining static code attributes to learn defect predictors\u2019. IEEE Trans Softw Eng 33(9):637\u2013640","journal-title":"IEEE Trans Softw Eng"},{"issue":"7","key":"9488_CR50","first-page":"1","volume":"16","author":"T Munkhdalai","year":"2015","unstructured":"Munkhdalai T, Namsrai OE, Ryu KH (2015) Self-training in significance space of support vectors for imbalanced biomedical event data. BMC Bioinf 16(7):1","journal-title":"BMC Bioinf"},{"key":"9488_CR51","unstructured":"Murphy KP (2006) Naive Bayes classifiers, Technical Report"},{"issue":"10","key":"9488_CR52","doi-asserted-by":"crossref","first-page":"402","DOI":"10.1109\/TSE.2007.1015","volume":"33","author":"H Olague","year":"2007","unstructured":"Olague H, Etzkorn L, Gholston S, Quattlebaum S (2007) Empirical validation of three software metric suites to predict the fault-proneness of object-oriented classes developed using highly iterative or agile software development processes. 
IEEE Trans Softw Eng 33(10):402\u2013419","journal-title":"IEEE Trans Softw Eng"},{"issue":"10","key":"9488_CR53","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1109\/TSE.2007.70722","volume":"33","author":"GJ Pai","year":"2007","unstructured":"Pai GJ, Dugan JB (2007) Empirical analysis of software fault content and fault proneness using Bayesian methods. IEEE Trans Softw Eng 33(10):675\u2013686","journal-title":"IEEE Trans Softw Eng"},{"key":"9488_CR54","doi-asserted-by":"crossref","unstructured":"Phua C, Alahakoon D, Lee V (2004) Minority report in fraud detection: classification of skewed data. SIGKDD Explorations 6(1): 50\u201359","DOI":"10.1145\/1007730.1007738"},{"key":"9488_CR55","doi-asserted-by":"crossref","unstructured":"Rodriguez D, Herraiz I, Harrison R, Dolado J, Riquelme JC (2014) Preliminary comparison of techniques for dealing with imbalance in software defect prediction. In Proc. of the 18th International Conf. on Evaluation and Assessment in Software Engineering, 43","DOI":"10.1145\/2601248.2601294"},{"key":"9488_CR56","doi-asserted-by":"crossref","unstructured":"Romano D, Pinzger M (2011) Using source code metrics to predict change-prone java interfaces. 27th IEEE International Conference on Software Maintenance, 303\u2013312","DOI":"10.1109\/ICSM.2011.6080797"},{"key":"9488_CR57","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1016\/j.ins.2010.12.016","volume":"259","author":"C Seiffert","year":"2014","unstructured":"Seiffert C, Khoshgoftaar TM, Hulse JV, Folleco A (2014) An empirical study of the classification performance of learners on imbalanced and noisy software quality data. Inf Sci 259:571\u2013595","journal-title":"Inf Sci"},{"key":"9488_CR58","first-page":"448","volume":"1","author":"N Seliya","year":"2011","unstructured":"Seliya N, Khoshgoftaar TM (2011) The use of decision trees for cost-sensitive classification: an empirical study in software quality prediction. 
Wiley Interdiscip Rev: Data Min Knowl Disc 1:448\u2013459","journal-title":"Wiley Interdiscip Rev: Data Min Knowl Disc"},{"key":"9488_CR59","doi-asserted-by":"crossref","unstructured":"Shatnawi R (2012) Improving software fault-prediction for imbalanced data. In Proc. of International Conf. on Innovations in Information Technology, 54\u201359","DOI":"10.1109\/INNOVATIONS.2012.6207774"},{"key":"9488_CR60","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s11219-009-9079-6","volume":"18","author":"Y Singh","year":"2009","unstructured":"Singh Y, Kaur A, Malhotra R (2009) Empirical validation of object-oriented metrics for predicting fault proneness models. Softw Qual J 18:3\u201335","journal-title":"Softw Qual J"},{"key":"9488_CR61","first-page":"111","volume":"36","author":"M Stone","year":"1974","unstructured":"Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Soc A 36:111\u2013114","journal-title":"J R Soc A"},{"issue":"10","key":"9488_CR62","doi-asserted-by":"crossref","first-page":"1321","DOI":"10.1109\/TKDE.2007.190623","volume":"19","author":"CT Su","year":"2007","unstructured":"Su CT, Hsiao YH (2007) An evaluation of the robustness of MTS for imbalanced data. IEEE Trans Knowl Data Eng 19(10):1321\u20131332","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"9488_CR63","doi-asserted-by":"crossref","unstructured":"Tan M, Tan L, Dara S, Mayeux C (2015) Online defect prediction for imbalanced data. In Proc. of 37th International Conf. on Software Engineering","DOI":"10.1109\/ICSE.2015.139"},{"key":"9488_CR64","unstructured":"Visa S, Ralescu A (2005) Issues in mining imbalanced data sets- a review paper. In Proc. 
of 16th Conference on Artificial Intelligence and Cognitive Science, 67\u201373"},{"key":"9488_CR65","doi-asserted-by":"crossref","first-page":"434","DOI":"10.1109\/TR.2013.2259203","volume":"62","author":"S Wang","year":"2013","unstructured":"Wang S, Yao X (2013) Using class imbalance learning for software defect prediction. IEEE Trans Reliab 62:434\u2013443","journal-title":"IEEE Trans Reliab"},{"issue":"1","key":"9488_CR66","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1145\/1007730.1007734","volume":"6","author":"GM Weiss","year":"2004","unstructured":"Weiss GM (2004) Mining with rarity: a unifying framework. ACM SIGKDD Explor Newslett 6(1):7\u201319","journal-title":"ACM SIGKDD Explor Newslett"},{"key":"9488_CR67","unstructured":"Weiss GM, McCarthy K, Zabar B (2007) Cost-sensitive learning vs. sampling: which is best for handling unbalanced classes with unequal error costs?. In Proc. of International Conf. on Data Mining, 35\u201341"},{"key":"9488_CR68","unstructured":"Weng CG, Poon J (2008) A new evaluation measure for imbalanced datasets. In Proc. of the 7th Australian Data Mining Conference, 27\u201332"},{"key":"9488_CR69","volume-title":"Data mining: practical machine learning tools and techniques","author":"IH Witten","year":"2011","unstructured":"Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, San Francisco","edition":"3"},{"issue":"2","key":"9488_CR70","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1007\/s12559-015-9319-y","volume":"7","author":"R Xu","year":"2015","unstructured":"Xu R, Chen T, Xia Y, Lu Q, Liu B, Wang X (2015) Word embedding composition for data imbalances in sentiment and emotion classification. 
Cogn Comput 7(2):226\u2013240","journal-title":"Cogn Comput"},{"issue":"3","key":"9488_CR71","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1109\/TCYB.2013.2257480","volume":"44","author":"P Yang","year":"2014","unstructured":"Yang P, Yoo PD, Fernando J, Zhou BB, Zhang Z, Zomaya AY (2014) Sample subset optimization techniques for imbalanced and ensemble learning problems in bioinformatics applications. IEEE Trans Cybern 44(3):445\u2013455","journal-title":"IEEE Trans Cybern"},{"key":"9488_CR72","unstructured":"Zhang X, Li Y (2011) An empirical study of learning from imbalanced data. In Proc. of the 22nd Australasian Database Conf, 85\u201394"},{"issue":"5","key":"9488_CR73","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1109\/TSE.2009.32","volume":"35","author":"Y Zhou","year":"2009","unstructured":"Zhou Y, Leung H, Xu B (2009) Examining the potentially confounding effect of class size on the associations between object metrics and change proneness. IEEE Trans Softw Eng 35(5):607\u2013623","journal-title":"IEEE Trans Softw Eng"}],"container-title":["Empirical Software 
Engineering"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s10664-016-9488-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-016-9488-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s10664-016-9488-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,9,17]],"date-time":"2019-09-17T00:58:17Z","timestamp":1568681897000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s10664-016-9488-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,1,5]]},"references-count":73,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2017,12]]}},"alternative-id":["9488"],"URL":"https:\/\/doi.org\/10.1007\/s10664-016-9488-7","relation":{},"ISSN":["1382-3256","1573-7616"],"issn-type":[{"value":"1382-3256","type":"print"},{"value":"1573-7616","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,1,5]]}}}