{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T16:16:53Z","timestamp":1778084213788,"version":"3.51.4"},"reference-count":56,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2024,2,16]],"date-time":"2024-02-16T00:00:00Z","timestamp":1708041600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>Software defect prediction models enable test managers to predict defect-prone modules and assist with delivering quality products. A test manager would be willing to identify the attributes that can influence defect prediction and should be able to trust the model outcomes. The objective of this research is to create software defect prediction models with a focus on interpretability. Additionally, it aims to investigate the impact of size, complexity, and other source code metrics on the prediction of software defects. This research also assesses the reliability of cross-project defect prediction. Well-known machine learning techniques, such as support vector machines, k-nearest neighbors, random forest classifiers, and artificial neural networks, were applied to publicly available PROMISE datasets. The interpretability of this approach was demonstrated by SHapley Additive exPlanations (SHAP) and local interpretable model-agnostic explanations (LIME) techniques. The developed interpretable software defect prediction models showed reliability on independent and cross-project data. 
Finally, the results demonstrate that static code metrics can contribute to the defect prediction models, and the inclusion of explainability assists in establishing trust in the developed models.<\/jats:p>","DOI":"10.3390\/computers13020052","type":"journal-article","created":{"date-parts":[[2024,2,16]],"date-time":"2024-02-16T06:00:25Z","timestamp":1708063225000},"page":"52","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Interpretable Software Defect Prediction from Project Effort and Static Code Metrics"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-4566-1433","authenticated-orcid":false,"given":"Susmita","family":"Haldar","sequence":"first","affiliation":[{"name":"School of Information Technology, Fanshawe College, London, ON N5Y 5R6, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6966-2369","authenticated-orcid":false,"given":"Luiz Fernando","family":"Capretz","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, Western University, London, ON N6A 3K7, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,2,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Punitha, K., and Chitra, S. (2013, January 21\u201322). Software defect prediction using software metrics\u2014A survey. Proceedings of the 2013 International Conference on Information Communication and Embedded Systems (ICICES), Chennai, India.","DOI":"10.1109\/ICICES.2013.6508369"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1208","DOI":"10.1109\/TSE.2013.11","article-title":"Data quality: Some comments on the nasa software defect datasets","volume":"39","author":"Shepperd","year":"2013","journal-title":"IEEE Trans. Softw. 
Eng."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"161","DOI":"10.1049\/iet-sen.2017.0148","article-title":"Progress on approaches to software defect prediction","volume":"12","author":"Li","year":"2018","journal-title":"IET Softw."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.infsof.2014.11.006","article-title":"An empirical study on software defect prediction with a simplified metric set","volume":"59","author":"He","year":"2015","journal-title":"Inf. Softw. Technol."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Balogun, A.O., Basri, S., Mahamad, S., Abdulkadir, S.J., Capretz, L.F., Imam, A.A., Almomani, M.A., Adeyemo, V.E., and Kumar, G. (2021). Empirical analysis of rank aggregation-based multi-filter feature selection methods in software defect prediction. Electronics, 10.","DOI":"10.3390\/electronics10020179"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ghotra, B., McIntosh, S., and Hassan, A.E. (2017, January 20\u201321). A large-scale study of the impact of feature selection techniques on defect classification models. Proceedings of the 2017 IEEE\/ACM 14th International Conference on Mining Software Repositories (MSR), Buenos Aires, Argentina.","DOI":"10.1109\/MSR.2017.18"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Haldar, S., and Capretz, L.F. (2023, January 17\u201319). Explainable Software Defect Prediction from Cross Company Project Metrics using Machine Learning. Proceedings of the 2023 7th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India.","DOI":"10.1109\/ICICCS56967.2023.10142534"},{"key":"ref_8","first-page":"11","article-title":"Benchmarking Machine Learning Techniques for Software Defect Detection","volume":"6","author":"Aleem","year":"2015","journal-title":"Int. J. Softw. Eng. Appl."},{"key":"ref_9","unstructured":"Aydin, Z.B.G., and Samli, R. (2020, January 9\u201311). 
Performance Evaluation of Some Machine Learning Algorithms in NASA Defect Prediction Data Sets. Proceedings of the 2020 5th International Conference on Computer Science and Engineering (UBMK), Diyarbakir, Turkey."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1109\/TSE.2007.256941","article-title":"Data mining static code attributes to learn defect predictors","volume":"33","author":"Menzies","year":"2007","journal-title":"IEEE Trans. Softw. Eng."},{"key":"ref_11","unstructured":"Nassif, A.B., Ho, D., and Capretz, L.F. (2011, January 16\u201318). Regression model for software effort estimation based on the use case point method. Proceedings of the 2011 International Conference on Computer and Software Modeling, Singapore."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"681","DOI":"10.1007\/s13198-021-01326-1","article-title":"Effective software defect prediction using support vector machines (SVMs)","volume":"13","author":"Goyal","year":"2022","journal-title":"Int. J. Syst. Assur. Eng. Manag."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"969","DOI":"10.1007\/s11390-015-1575-5","article-title":"A hybrid instance selection using nearest-neighbor for cross-project defect prediction","volume":"30","author":"Ryu","year":"2015","journal-title":"J. Comput. Sci. Technol."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Thapa, S., Alsadoon, A., Prasad, P., Al-Dala\u2019in, T., and Rashid, T.A. (2020, January 25\u201327). Software Defect Prediction Using Atomic Rule Mining and Random Forest. 
Proceedings of the 2020 5th International Conference on Innovative Technologies in Intelligent Systems and Industrial Applications (CITISIA), Sydney, Australia.","DOI":"10.1109\/CITISIA50690.2020.9371797"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1007\/s10586-018-1730-1","article-title":"Software defect prediction techniques using metrics based on neural network classifier","volume":"22","author":"Jayanthi","year":"2019","journal-title":"Clust. Comput."},{"key":"ref_16","first-page":"6230953","article-title":"Software defect prediction via attention-based recurrent neural network","volume":"2019","author":"Fan","year":"2019","journal-title":"Sci. Program."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1967","DOI":"10.1007\/s13042-022-01740-2","article-title":"Software defect prediction ensemble learning algorithm based on adaptive variable sparrow search algorithm","volume":"14","author":"Tang","year":"2023","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"103138","DOI":"10.1016\/j.advengsoft.2022.103138","article-title":"Software defect prediction via optimal trained convolutional neural network","volume":"169","author":"Balasubramaniam","year":"2022","journal-title":"Adv. Eng. Softw."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"106985","DOI":"10.1016\/j.infsof.2022.106985","article-title":"A three-stage transfer learning framework for multi-source cross-project software defect prediction","volume":"150","author":"Bai","year":"2022","journal-title":"Inf. Softw. Technol."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Cao, Q., Sun, Q., Cao, Q., and Tan, H. (2015, January 21\u201323). Software defect prediction via transfer learning based neural network. 
Proceedings of the 2015 First International Conference on Reliability Systems Engineering (ICRSE), Beijing, China.","DOI":"10.1109\/ICRSE.2015.7366475"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Joon, A., Kumar Tyagi, R., and Kumar, K. (2020, January 10\u201312). Noise Filtering and Imbalance Class Distribution Removal for Optimizing Software Fault Prediction using Best Software Metrics Suite. Proceedings of the 2020 5th International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India.","DOI":"10.1109\/ICCES48766.2020.9137899"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Aggarwal, C.C., and Aggarwal, C.C. (2017). An introduction to Outlier Analysis, Springer.","DOI":"10.1007\/978-3-319-47578-3"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic minority over-sampling technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. Res."},{"key":"ref_24","first-page":"3294","article-title":"Software Defect Prediction: Analysis Of Class Imbalance and Performance Stability","volume":"14","author":"Balogun","year":"2019","journal-title":"J. Eng. Sci. Technol."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Pelayo, L., and Dick, S. (2007, January 24\u201327). Applying novel resampling strategies to software defect prediction. Proceedings of the NAFIPS 2007\u20142007 Annual Meeting of the North American Fuzzy Information Processing Society, San Diego, CA, USA.","DOI":"10.1109\/NAFIPS.2007.383813"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Dipa, W.A., and Sunindyo, W.D. (2021, January 3\u20134). Software Defect Prediction Using SMOTE and Artificial Neural Network. 
Proceedings of the 2021 International Conference on Data and Software Engineering (ICoDSE), Bandung, Indonesia.","DOI":"10.1109\/ICoDSE53690.2021.9648476"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"3103","DOI":"10.1109\/TSE.2021.3079841","article-title":"On the value of oversampling for deep learning in software defect prediction","volume":"48","author":"Yedida","year":"2021","journal-title":"IEEE Trans. Softw. Eng."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"184832","DOI":"10.1109\/ACCESS.2019.2961129","article-title":"DeepCPDP: Deep learning based cross-project defect prediction","volume":"7","author":"Chen","year":"2019","journal-title":"IEEE Access"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1080\/00401706.1999.10485936","article-title":"Regression analysis: Statistical modeling of a response variable","volume":"41","author":"Altland","year":"1999","journal-title":"Technometrics"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"885","DOI":"10.1109\/TR.2018.2847353","article-title":"Ridge and Lasso Regression Models for Cross-Version Defect Prediction","volume":"67","author":"Yang","year":"2018","journal-title":"IEEE Trans. Reliab."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Gezici, B., and Tarhan, A.K. (2022, January 14\u201316). Explainable AI for Software Defect Prediction with Gradient Boosting Classifier. Proceedings of the 2022 7th International Conference on Computer Science and Engineering (UBMK), Diyarbakir, Turkey.","DOI":"10.1109\/UBMK55850.2022.9919490"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Jiarpakdee, J., Tantithamthavorn, C.K., and Grundy, J. (2021, January 17\u201319). Practitioners\u2019 Perceptions of the Goals and Visual Explanations of Defect Prediction Models. 
Proceedings of the 2021 IEEE\/ACM 18th International Conference on Mining Software Repositories (MSR), Madrid, Spain.","DOI":"10.1109\/MSR52588.2021.00055"},{"key":"ref_33","unstructured":"Sayyad Shirabad, J., and Menzies, T.J. (2024, February 11). The PROMISE Repository of Software Engineering Databases. Available online: http:\/\/promise.site.uottawa.ca\/SERepository."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Gray, D., Bowes, D., Davey, N., Sun, Y., and Christianson, B. (2011, January 11\u201312). The misuse of the NASA metrics data program data sets for automated software defect prediction. Proceedings of the 15th Annual Conference on Evaluation & Assessment in Software Engineering (EASE 2011), Durham, UK.","DOI":"10.1049\/ic.2011.0012"},{"key":"ref_35","first-page":"94","article-title":"Feature Selection: A Data Perspective","volume":"50","author":"Li","year":"2017","journal-title":"ACM Comput. Surv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Rahman Khan Mamun, M.M., and Alouani, A. (2021, January 12\u201317). Arrhythmia Classification Using Hybrid Feature Selection Approach and Ensemble Learning Technique. Proceedings of the 2021 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), Virtual Event, ON, Canada.","DOI":"10.1109\/CCECE53047.2021.9569067"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Rosati, S., Gianfreda, C.M., Balestra, G., Martincich, L., Giannini, V., and Regge, D. (2018, January 11\u201313). Correlation based Feature Selection impact on the classification of breast cancer patients response to neoadjuvant chemotherapy. 
Proceedings of the 2018 IEEE International Symposium on Medical Measurements and Applications (MeMeA), Rome, Italy.","DOI":"10.1109\/MeMeA.2018.8438698"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2161","DOI":"10.1007\/s10586-021-03254-y","article-title":"A novel feature selection method for data mining tasks using hybrid sine cosine algorithm and genetic algorithm","volume":"24","author":"Abualigah","year":"2021","journal-title":"Clust. Comput."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1016\/j.compeleceng.2013.11.024","article-title":"A survey on feature selection methods","volume":"40","author":"Chandrashekar","year":"2014","journal-title":"Comput. Electr. Eng."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Thant, M.W., and Aung, N.T.T. (2019, January 6\u20137). Software defect prediction using hybrid approach. Proceedings of the 2019 International Conference on Advanced Information Technologies (ICAIT), Yangon, Myanmar.","DOI":"10.1109\/AITC.2019.8921374"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Rajnish, K., Bhattacharjee, V., and Chandrabanshi, V. (2021, January 5\u20137). Applying Cognitive and Neural Network Approach over Control Flow Graph for Software Defect Prediction. Proceedings of the 2021 Thirteenth International Conference on Contemporary Computing (IC3-2021), Noida, India.","DOI":"10.1145\/3474124.3474127"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Jindal, R., Malhotra, R., and Jain, A. (2014, January 8\u201310). Software defect prediction using neural networks. Proceedings of the 3rd International Conference on Reliability, Infocom Technologies and Optimization, Noida, India.","DOI":"10.1109\/ICRITO.2014.7014673"},{"key":"ref_43","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Rana, G., Haq, E.u., Bhatia, E., and Katarya, R. (2020, January 5\u20137). A Study of Hyper-Parameter Tuning in The Field of Software Analytics. Proceedings of the 2020 4th International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India.","DOI":"10.1109\/ICECA49313.2020.9297613"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Osman, H., Ghafari, M., and Nierstrasz, O. (2017, January 21\u201321). Hyperparameter optimization to improve bug prediction accuracy. Proceedings of the 2017 IEEE Workshop on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE), Klagenfurt, Austria.","DOI":"10.1109\/MALTESQUE.2017.7882014"},{"key":"ref_46","first-page":"CP 653","article-title":"Software defect prediction model based on LLE and SVM","volume":"2014","author":"Shan","year":"2014","journal-title":"IET Conf. Publ."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1109\/TIT.1967.1053964","article-title":"Nearest neighbor pattern classification","volume":"13","author":"Cover","year":"1967","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Al-Sharafi, M.A., Al-Emran, M., Al-Kabi, M.N., and Shaalan, K. (2022, January 20\u201322). A Robust Tuned K-Nearest Neighbours Classifier for Software Defect Prediction. Proceedings of the 2nd International Conference on Emerging Technologies and Intelligent Systems, Sanya, China.","DOI":"10.1007\/978-3-031-25274-7"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Soe, Y.N., Santosa, P.I., and Hartanto, R. (2018, January 12\u201313). Software defect prediction using random forest algorithm. 
Proceedings of the 2018 12th South East Asian Technical University Consortium (SEATUC), Yogyakarta, Indonesia.","DOI":"10.1109\/SEATUC.2018.8788881"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Ribeiro, M.T., Singh, S., and Guestrin, C. (2016, January 13\u201317). \u201cWhy should I trust you?\u201d Explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939778"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"52138","DOI":"10.1109\/ACCESS.2018.2870052","article-title":"Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)","volume":"6","author":"Adadi","year":"2018","journal-title":"IEEE Access"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Biecek, P., and Burzykowski, T. (2021). Explanatory Model Analysis, Chapman and Hall\/CRC.","DOI":"10.1201\/9780429027192"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1109\/TSE.2020.2982385","article-title":"An empirical study of model-agnostic techniques for defect prediction models","volume":"48","author":"Jiarpakdee","year":"2020","journal-title":"IEEE Trans. Softw. Eng."},{"key":"ref_55","unstructured":"Lundberg, S.M., and Lee, S.I. (2017). Advances in Neural Information Processing Systems 30, Neural Information Processing Systems Foundation, Inc."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"369","DOI":"10.1007\/s10515-020-00277-4","article-title":"Understanding machine learning software defect predictions","volume":"27","author":"Esteves","year":"2020","journal-title":"Autom. Softw. 
Eng."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/13\/2\/52\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:00:45Z","timestamp":1760104845000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/13\/2\/52"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,16]]},"references-count":56,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,2]]}},"alternative-id":["computers13020052"],"URL":"https:\/\/doi.org\/10.3390\/computers13020052","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,16]]}}}