{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T18:10:55Z","timestamp":1774721455288,"version":"3.50.1"},"reference-count":38,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2022,2,20]],"date-time":"2022-02-20T00:00:00Z","timestamp":1645315200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,2,20]],"date-time":"2022-02-20T00:00:00Z","timestamp":1645315200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004307","name":"Distinguished Middle-Aged and Young Scientist Encourage and Reward Foundation of Shandong Province","doi-asserted-by":"publisher","award":["No.ZQ2021-19"],"award-info":[{"award-number":["No.ZQ2021-19"]}],"id":[{"id":"10.13039\/501100004307","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2022,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Software testing guarantees the delivery of high-quality software products, and software defect prediction (SDP) has become an important part of software testing. Software defect prediction is divided into traditional software defect prediction and just-in-time software defect prediction (JIT-SDP). However, most of the existing software defect prediction frameworks are relatively simplified, which makes it extremely difficult to provide developers with more detailed reference information. To improve the effectiveness of software defect prediction and realize effective software testing resource allocation, this paper proposes a software defect prediction framework based on Nested-Stacking and heterogeneous feature selection. The framework includes three stages: data set preprocessing and feature selection, Nested-Stacking classifier, and model classification performance evaluation. The novel heterogeneous feature selection and nested custom classifiers in the framework can effectively improve the accuracy of software defect prediction. This paper conducts experiments on two software defect data sets (Kamei, PROMISE), and demonstrates the classification performance of the model through two comprehensive evaluation indicators, AUC, and F1-score. The experiment carried out large-scale within-project defect prediction (WPDP) and cross-project defect prediction (CPDP). The results show that the framework proposed in this paper has an excellent classification performance on the two types of software defect data sets, and has been greatly improved compared with the baseline models.<\/jats:p>","DOI":"10.1007\/s40747-022-00676-y","type":"journal-article","created":{"date-parts":[[2022,2,20]],"date-time":"2022-02-20T06:08:26Z","timestamp":1645337306000},"page":"3333-3348","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":58,"title":["Software defect prediction based on nested-stacking and heterogeneous feature selection"],"prefix":"10.1007","volume":"8","author":[{"given":"Li-qiong","family":"Chen","sequence":"first","affiliation":[]},{"given":"Can","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Shi-long","family":"Song","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,2,20]]},"reference":[{"issue":"2","key":"676_CR1","doi-asserted-by":"publisher","first-page":"525","DOI":"10.1007\/s11219-016-9353-3","volume":"26","author":"D Bowes","year":"2018","unstructured":"Bowes D, Hall T, Petri\u0107 J (2018) Software defect prediction: do different classifiers find the same defects? Softw Qual J 26(2):525\u2013552","journal-title":"Softw Qual J"},{"issue":"1","key":"676_CR2","doi-asserted-by":"publisher","first-page":"917","DOI":"10.3233\/JIFS-179459","volume":"38","author":"K Bashir","year":"2020","unstructured":"Bashir K, Li T, Yohannese CW et al (2020) SMOTEFRIS-INFFC: handling the challenge of borderline and noisy examples in imbalanced learning for software defect prediction. J Intell Fuzzy Syst 38(1):917\u2013933","journal-title":"J Intell Fuzzy Syst"},{"key":"676_CR3","doi-asserted-by":"crossref","unstructured":"Goyal S (2020) Heterogeneous Stacked Ensemble Classifier for Software Defect Prediction.2020 Sixth International Conference on Parallel, Distributed and Grid Computing (PDGC). IEEE, 6\u20138 November 2020, pp 126-130","DOI":"10.1109\/PDGC50313.2020.9315754"},{"key":"676_CR4","doi-asserted-by":"publisher","DOI":"10.1016\/j.energy.2020.118874","volume":"214","author":"M Massaoudi","year":"2021","unstructured":"Massaoudi M, Refaat SS, Chihi I et al (2021) A novel stacked generalization ensemble-based hybrid LGBM-XGB-MLP model for Short-Term Load Forecasting. Energy 214:118874","journal-title":"Energy"},{"key":"676_CR5","doi-asserted-by":"crossref","unstructured":"Khuat T T, Le M H (2020) Evaluation of sampling-based ensembles of classifiers on imbalanced data for software defect prediction problems.SN Computer Science 1(2):1-16","DOI":"10.1007\/s42979-020-0119-4"},{"key":"676_CR6","doi-asserted-by":"crossref","unstructured":"Zhu K, Zhang N, Ying S et al (2020) Within-project and cross-project just-in-time defect prediction based on denoising autoencoder and convolutional neural network. IET Softw 14(3):185\u2013195","DOI":"10.1049\/iet-sen.2019.0278"},{"key":"676_CR7","doi-asserted-by":"crossref","unstructured":"Pascarella L, Palomba F, Bacchelli A (2019) Fine-grained just-in-time defect prediction. J Syst Softw 150:22-36","DOI":"10.1016\/j.jss.2018.12.001"},{"key":"676_CR8","unstructured":"Yan M, Xia X, Fan Y et al (2020) Just-in-time defect identification and localization: a two-phase framework. IEEE Trans Softw Eng"},{"key":"676_CR9","doi-asserted-by":"crossref","unstructured":"Bejjanki KK, Gyani J, Gugulothu N (2020) Class imbalance reduction (CIR): a novel approach to software defect prediction in the presence of class imbalance. Symmetry 2(3):407","DOI":"10.3390\/sym12030407"},{"issue":"03","key":"676_CR10","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1142\/S0218194021500108","volume":"31","author":"X Yang","year":"2021","unstructured":"Yang X, Yu H, Fan G et al (2021) DEJIT: a differential evolution algorithm for effort-aware just-in-time software defect prediction. Int J Softw Eng Knowl Eng 31(03):289\u2013310","journal-title":"Int J Softw Eng Knowl Eng"},{"key":"676_CR11","doi-asserted-by":"crossref","unstructured":"Alsawalqah H, Hijazi N, Eshtay M et al (2020) Software defect prediction using heterogeneous ensemble classification based on segmented patterns. Appl Sci 10(5):1745","DOI":"10.3390\/app10051745"},{"key":"676_CR12","doi-asserted-by":"crossref","unstructured":"Malhotra R, Jain J (2020) Handling imbalanced data using ensemble learning in software defect prediction. In: 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence). IEEE, 29\u201331 January 2020, pp 300\u2013304","DOI":"10.1109\/Confluence47617.2020.9058124"},{"key":"676_CR13","doi-asserted-by":"crossref","unstructured":"Matloob F, Aftab S, Iqbal A (2019) A framework for software defect prediction using feature selection and ensemble learning techniques. Int J Modern Educ Comput Sci 11(12)","DOI":"10.5815\/ijmecs.2019.12.01"},{"key":"676_CR14","doi-asserted-by":"crossref","unstructured":"Li Z, Jing X Y, Zhu X, et al (2019) Heterogeneous defect prediction with two-stage ensemble learning. Autom Softw Eng 26(3):599\u2013651","DOI":"10.1007\/s10515-019-00259-1"},{"key":"676_CR15","doi-asserted-by":"crossref","unstructured":"Iqbal A, Aftab S (2020) A classification framework for software defect prediction using multi-filter feature selection technique and MLP. Int J Modern Educ Comput Sci 12(1)","DOI":"10.5815\/ijmecs.2020.01.03"},{"key":"676_CR16","unstructured":"Maruf OM (2019) The impact of parameter optimization of ensemble learning on defect prediction. Comput Sci J Moldova 79(1):85\u2013128"},{"key":"676_CR17","doi-asserted-by":"crossref","unstructured":"Kakkar M, Jain S, Bansal A, et al (2021) Combining data preprocessing methods with imputation techniques for software defect prediction. Research Anthology on Recent Trends, Tools, and Implications of Computer Programming. IGI Global, pp 1792\u20131811","DOI":"10.4018\/978-1-7998-3016-0.ch081"},{"key":"676_CR18","doi-asserted-by":"crossref","unstructured":"Ni C, Chen X, Wu F et al (2019) An empirical study on pareto based multi-objective feature selection for software defect prediction. J Syst Softw 152:215\u2013238","DOI":"10.1016\/j.jss.2019.03.012"},{"key":"676_CR19","doi-asserted-by":"crossref","unstructured":"Balogun AO, Basri S, Abdulkadir SJ et al (2019) Performance analysis of feature selection methods in software defect prediction: a search method approach. Appl Sci 9(13):2764","DOI":"10.3390\/app9132764"},{"key":"676_CR20","unstructured":"Oluwagbemiga BA, Shuib B, Abdulkadir SJ, et al (2019) A hybrid multi-filter wrapper feature selection method for software defect predictors. Int J Supply Chain Manag 8(2):916\u2013922"},{"issue":"5","key":"676_CR21","first-page":"721","volume":"17","author":"K Bashir","year":"2020","unstructured":"Bashir K, Li T, Yahaya M (2020) A novel feature selection method based on maximum likelihood logistic regression for imbalanced learning in software defect prediction. Int Arab J Inf Technol 17(5):721\u2013730","journal-title":"Int Arab J Inf Technol"},{"key":"676_CR22","doi-asserted-by":"crossref","unstructured":"Liu Y, Mu Y, Chen K, et al (2020) Daily activity feature selection in smart homes based on pearson correlation coefficient. Neural Process Lett pp 1\u201317","DOI":"10.1007\/s11063-019-10185-8"},{"key":"676_CR23","doi-asserted-by":"crossref","unstructured":"Cavallo B (2020) Functional relations and Spearman correlation between consistency indices. J Oper Res Soc 71(2):301\u2013311","DOI":"10.1080\/01605682.2018.1516178"},{"key":"676_CR24","doi-asserted-by":"crossref","unstructured":"Novaes MT, de Carvalho OLF, Ferreira PHG, et al (2021) Prediction of secondary testosterone deficiency using machine learning: a comparative analysis of ensemble and base classifiers, probability calibration, and sampling strategies in a slightly imbalanced dataset. Inf Med Unlock 23:100538","DOI":"10.1016\/j.imu.2021.100538"},{"key":"676_CR25","doi-asserted-by":"crossref","unstructured":"Saifan AA, Abu-wardih L (2020) Software defect prediction based on feature subset selection and ensemble classification. ECTI Trans Comput Inf Technol (ECTI-CIT) 14(2):213\u2013228","DOI":"10.37936\/ecti-cit.2020142.224489"},{"key":"676_CR26","doi-asserted-by":"crossref","unstructured":"Wu Y, Ke Y, Chen Z et al (2020) Application of alternating decision tree with AdaBoost and bagging ensembles for landslide susceptibility mapping. Catena 187:104396","DOI":"10.1016\/j.catena.2019.104396"},{"key":"676_CR27","doi-asserted-by":"crossref","unstructured":"Li X K, Chen W, Zhang Q et al (2020) Building auto-encoder intrusion detection system based on random forest feature selection. Comput Secur 95:101851","DOI":"10.1016\/j.cose.2020.101851"},{"key":"676_CR28","doi-asserted-by":"crossref","unstructured":"Kamei Y, Shihab E, Adams B et al (2012) A large-scale empirical study of just-in-time quality assurance. IEEE Trans Softw Eng 39(6):757\u2013773","DOI":"10.1109\/TSE.2012.70"},{"key":"676_CR29","doi-asserted-by":"crossref","unstructured":"Sohan M F, Kabir M A, Rahman M et al (2020) Prevalence of machine learning techniques in software defect prediction. In: International Conference on Cyber Security and Computer Science, Springer, Cham, 15\u201316 February 2020, pp 257\u2013269","DOI":"10.1007\/978-3-030-52856-0_20"},{"key":"676_CR30","doi-asserted-by":"crossref","unstructured":"Jureczko M, Madeyski L (2010) Towards identifying software project clusters with regard to defect prediction. In: Proceedings of the 6th international conference on predictive models in software engineering September 2010, pp 1\u201310","DOI":"10.1145\/1868328.1868342"},{"key":"676_CR31","doi-asserted-by":"crossref","unstructured":"Fay M P, Proschan MA (2010) Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules. Stat Surv 4:1","DOI":"10.1214\/09-SS051"},{"key":"676_CR32","doi-asserted-by":"crossref","unstructured":"Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc: Ser B (Methodol) 57(1):289-300","DOI":"10.1111\/j.2517-6161.1995.tb02031.x"},{"key":"676_CR33","doi-asserted-by":"crossref","unstructured":"Yang X, Yu H, Fan G et al (2019) Local versus global models for just-in-time software defect prediction. Sci Program, pp 1-13","DOI":"10.1155\/2019\/2384706"},{"key":"676_CR34","doi-asserted-by":"crossref","unstructured":"Pan C, Lu M, Xu B, et al (2019) An improved cnn model for within-project software defect prediction. Appl Sci 9(10):2138","DOI":"10.3390\/app9102138"},{"key":"676_CR35","doi-asserted-by":"crossref","unstructured":"Feng S, Keung J, Yu X, et al (2019) COSTE: Complexity-based OverSampling TEchnique to alleviate the class imbalance problem in software defect prediction. Inf Softw Technol 129:106432","DOI":"10.1016\/j.infsof.2020.106432"},{"key":"676_CR36","doi-asserted-by":"crossref","unstructured":"Wang S, Liu T, Nam J et al (2018) Deep semantic feature learning for software defect prediction. IEEE Trans Softw Eng 46(12):1267\u20131293","DOI":"10.1109\/TSE.2018.2877612"},{"key":"676_CR37","doi-asserted-by":"crossref","unstructured":"Nam J, Pan SJ, Kim S (2013) Transfer defect learning. In: 2013 35th international conference on software engineering (ICSE). IEEE, 18\u201326 May 2013, pp 382\u2013391","DOI":"10.1109\/ICSE.2013.6606584"},{"key":"676_CR38","doi-asserted-by":"crossref","unstructured":"Chen J, Hu K, Yu Y et al (2020) Software visualization and deep transfer learning for effective software defect prediction. In: Proceedings of the ACM\/IEEE 42nd international conference on software engineering, June 2020, pp 578\u2013589","DOI":"10.1145\/3377811.3380389"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00676-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-022-00676-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00676-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,3]],"date-time":"2022-08-03T10:33:22Z","timestamp":1659522802000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-022-00676-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,20]]},"references-count":38,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,8]]}},"alternative-id":["676"],"URL":"https:\/\/doi.org\/10.1007\/s40747-022-00676-y","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,20]]},"assertion":[{"value":"25 May 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 January 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 February 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}