{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T14:52:23Z","timestamp":1778856743737,"version":"3.51.4"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T00:00:00Z","timestamp":1706745600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T00:00:00Z","timestamp":1706745600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Credit risk prediction is a crucial task for financial institutions. The technological advancements in machine learning, coupled with the availability of data and computing power, have given rise to more credit risk prediction models in financial institutions. In this paper, we propose a stacked classifier approach coupled with a filter-based feature selection (FS) technique to achieve efficient credit risk prediction using multiple datasets. The proposed stacked model includes the following base estimators: Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGB). Furthermore, the estimators in the stacked architecture were linked sequentially to extract the best performance. The filter-based FS method that is used in this research is based on information gain (IG) theory. The proposed algorithm was evaluated using the accuracy, the F1-Score and the Area Under the Curve (AUC). Furthermore, the stacked algorithm was compared to the following methods: Artificial Neural Network (ANN), Decision Tree (DT), and k-Nearest Neighbour (KNN). 
The experimental results show that the stacked model obtained AUCs of 0.934, 0.944 and 0.870 on the Australian, German and Taiwan datasets, respectively. These results, in conjunction with the accuracy and F1-score metrics, demonstrate that the proposed stacked classifier outperforms the individual estimators and other existing methods.<\/jats:p>","DOI":"10.1186\/s40537-024-00882-0","type":"journal-article","created":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T11:02:09Z","timestamp":1706785329000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":61,"title":["A machine learning-based credit risk prediction engine system using a stacked classifier and a filter-based feature selection method"],"prefix":"10.1186","volume":"11","author":[{"given":"Ileberi","family":"Emmanuel","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yanxia","family":"Sun","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zenghui","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,2,1]]},"reference":[{"issue":"1","key":"882_CR1","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1186\/s40854-019-0121-9","volume":"5","author":"S Moradi","year":"2019","unstructured":"Moradi S, Mokhatab RF. A dynamic credit risk assess- ment model with data mining techniques: evidence from Iranian banks. Financ Innov. 2019;5(1):15.","journal-title":"Financ Innov"},{"issue":"1","key":"882_CR2","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1186\/s40854-019-0159-8","volume":"5","author":"ZU Rehman","year":"2019","unstructured":"Rehman ZU, Muhammad N, Sarwar B, Raz MA. Impact of risk management strategies on the credit risk faced by commercial banks of Balochistan. Financ Innov. 
2019;5(1):44.","journal-title":"Financ Innov"},{"issue":"3","key":"882_CR3","doi-asserted-by":"publisher","first-page":"316","DOI":"10.1108\/RAF-07-2017-0143","volume":"17","author":"S Khemakhem","year":"2018","unstructured":"Khemakhem S, Boujelbene Y. Predicting credit risk on the basis of financial and non-financial variables and data mining. Rev Acc Financ. 2018;17(3):316\u201340.","journal-title":"Rev Acc Financ"},{"key":"882_CR4","doi-asserted-by":"publisher","first-page":"631","DOI":"10.1016\/j.procs.2020.01.057","volume":"165","author":"VN Dornadula","year":"2019","unstructured":"Dornadula VN, Geetha S. Credit card fraud detection using machine learning algorithms. Procedia Computer Science. 2019;165:631\u201341.","journal-title":"Procedia Computer Science"},{"key":"882_CR5","first-page":"68","volume":"67","author":"V Garc\u0131a","year":"2012","unstructured":"Garc\u0131a V, Marques AI, S\u00b4anchez J.S. Improving Risk Pre- dictions by Preprocessing Imbalanced Credit Data. Neural Information Processing. 2012;67:68\u201375.","journal-title":"Neural Information Processing"},{"key":"882_CR6","doi-asserted-by":"publisher","first-page":"84897","DOI":"10.1109\/ACCESS.2019.2924923","volume":"7","author":"Y Song","year":"2019","unstructured":"Song Y, Peng Y. A MCDM-Based Evaluation Approach for Imbalanced Classification Methods in Financial Risk Prediction. IEEE Access. 2019;7:84897\u2013906.","journal-title":"IEEE Access"},{"key":"882_CR7","doi-asserted-by":"publisher","first-page":"78549","DOI":"10.1109\/ACCESS.2019.2922676","volume":"7","author":"S Guo","year":"2019","unstructured":"Guo S, He H, Huang X. A multi-stage self-adaptive classi- fier ensemble model with application in credit scoring. IEEE Access. 2019;7:78549\u201359.","journal-title":"IEEE Access"},{"issue":"4","key":"882_CR8","doi-asserted-by":"publisher","first-page":"491","DOI":"10.1109\/TKDE.2005.66","volume":"17","author":"H Liu","year":"2005","unstructured":"Liu H, Yu L. 
Toward integrating feature selection algorithms for classification and clustering. IEEE Tran Knowl Data Eng. 2005;17(4):491\u2013502.","journal-title":"IEEE Tran Knowl Data Eng"},{"key":"882_CR9","doi-asserted-by":"crossref","unstructured":"Tang PS, Tang XL, Tao ZY, Li JP (2014) Research on feature selection algorithm based on mutual information and genetic algorithm. 11th Int. Comput. Conf. Wavelet Active Media Tech. Inf. Processing (ICCWAMTIP) IEEE, 403\u2013406.","DOI":"10.1109\/ICCWAMTIP.2014.7073436"},{"key":"882_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.patrec.2017.03.018","volume":"92","author":"C Liu","year":"2017","unstructured":"Liu C, Wang Q, Zhao Q, Shen X, Konan M. A new feature selection method based on a validity index of feature subset. Pattern Recogn Lett. 2017;92:1\u20138.","journal-title":"Pattern Recogn Lett"},{"key":"882_CR11","doi-asserted-by":"crossref","unstructured":"Pandey TN, Jagadev AK, Mohapatra SK, Dehuri S (2017) Credit risk analysis using machine learning classifiers. In: 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS) (pp. 1850\u20131854). IEEE.","DOI":"10.1109\/ICECDS.2017.8389769"},{"key":"882_CR12","doi-asserted-by":"crossref","unstructured":"Zhang L, Hui X, Wang L (2009) Application of adaptive support vector machines method in credit scoring. In: International Conference on Management Science and Engineering, 1410\u20131415.","DOI":"10.1109\/ICMSE.2009.5317970"},{"issue":"3","key":"882_CR13","first-page":"58","volume":"8","author":"N Mohammadi","year":"2016","unstructured":"Mohammadi N, Zangeneh M. Customer credit risk assess- ment using artificial neural networks. IJ Information Technol Computer Science. 
2016;8(3):58\u201366.","journal-title":"IJ Information Technol Computer Science"},{"key":"882_CR14","doi-asserted-by":"crossref","unstructured":"Hsu TC, Liou ST, Wang YP, Huang YS, Che-Lin (2019) Enhanced Recurrent Neural Network for Combining Static and Dynamic Features for Credit Card Default Prediction. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 1572\u20131576.","DOI":"10.1109\/ICASSP.2019.8682212"},{"key":"882_CR15","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1016\/j.eswa.2019.02.033","volume":"128","author":"W Bao","year":"2019","unstructured":"Bao W, Lianju N, Yue K. Integration of unsupervised and supervised machine learning algorithms for credit risk assessment. Expert Syst Appl. 2019;128:301\u201315.","journal-title":"Expert Syst Appl"},{"key":"882_CR16","doi-asserted-by":"crossref","unstructured":"Ha VS, Lu DN, Choi GS, Nguyen HN, Yoon B (2019) Improv- ing credit risk prediction in online peer-to-peer (P2P) lending using feature selection with deep learning. In: 21st International Conference on Advanced Communication Technology, 511\u2013515.","DOI":"10.23919\/ICACT.2019.8701943"},{"key":"882_CR17","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2020.103899","volume":"123","author":"C Chen","year":"2020","unstructured":"Chen C, Zhang Q, Yu B, Yu Z, Lawrence PJ, Ma Q, Zhang Y. Improving protein-protein interactions prediction accuracy using XGBoost feature selection and stacked ensemble classifier. Comput Biol Med. 2020;123: 103899.","journal-title":"Comput Biol Med"},{"key":"882_CR18","doi-asserted-by":"crossref","unstructured":"Chakrabarty N, Kundu T, Dandapat S, Sarkar A, Kole DK (2019) Flight arrival delay prediction using gradient boosting classifier. 
In: Emerging technologies in data mining and information security, 651-659","DOI":"10.1007\/978-981-13-1498-8_57"},{"key":"882_CR19","doi-asserted-by":"publisher","first-page":"17804","DOI":"10.1109\/ACCESS.2019.2960161","volume":"8","author":"HT Weldegebriel","year":"2019","unstructured":"Weldegebriel HT, Liu H, Haq AU, Bugingo E, Zhang D. A new hybrid convolutional neural network and eXtreme gradient boosting classifier for recognizing handwritten Ethiopian characters. IEEE Access. 2019;8:17804\u201318.","journal-title":"IEEE Access"},{"issue":"4","key":"882_CR20","doi-asserted-by":"publisher","first-page":"1632","DOI":"10.1109\/TDSC.2019.2922958","volume":"18","author":"J Liang","year":"2019","unstructured":"Liang J, Qin Z, Xiao S, Ou L, Lin X. Efficient & secure decision tree classification for cloud-assisted online diagnosis services. IEEE Trans Dependable Secure Comput. 2019;18(4):1632\u201344.","journal-title":"IEEE Trans Dependable Secure Comput"},{"issue":"1","key":"882_CR21","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L. Random forests. Mach Learn. 2001;45(1):5\u201332.","journal-title":"Mach Learn"},{"key":"882_CR22","doi-asserted-by":"publisher","first-page":"1356","DOI":"10.1016\/j.proeng.2014.03.129","volume":"69","author":"B Trstenjak","year":"2014","unstructured":"Trstenjak B, Mikac S, Donko D. KNN with TF-IDF based framework for text categorization. Procedia Eng. 2014;69:1356\u201364.","journal-title":"Procedia Eng"},{"issue":"2","key":"882_CR23","doi-asserted-by":"publisher","first-page":"290","DOI":"10.1016\/j.eswa.2005.07.019","volume":"3","author":"S Tan","year":"2006","unstructured":"Tan S. An effective refinement strategy for KNN text classifier. Expert Syst Appl. 
2006;3(2):290\u20138.","journal-title":"Expert Syst Appl"},{"key":"882_CR24","doi-asserted-by":"publisher","first-page":"38597","DOI":"10.1109\/ACCESS.2019.2905633","volume":"7","author":"SM Kasongo","year":"2019","unstructured":"Kasongo SM, Sun Y. A deep learning method with filter based feature engineering for wireless intrusion detection system. IEEE access. 2019;7:38597\u2013607.","journal-title":"IEEE access"},{"key":"882_CR25","unstructured":"\u201cUCI Machine Learning Repository: Stat-log (Australian Credit Approval) DataSet.\u201d http:\/\/archive.ics.uci.edu\/ml\/datasets\/statlog+(australian+credit+approval) (accessed Oct. 31, 2020)."},{"key":"882_CR26","unstructured":"\u201cUCI Machine Learning Repository: Stat-log (German Credit Data) Data Set.\u201d https:\/\/archive.ics.uci.edu\/ml\/datasets\/statlog+(german+credit+data) (accessed Oct. 31, 2020)."},{"key":"882_CR27","unstructured":"\u201cUCI Machine Learning Repository: default of credit card clients Data Set.\u201d https:\/\/archive.ics.uci.edu\/ml\/datasets\/default+of+credit+card+clients (accessed Mar. 14, 2020)."},{"key":"882_CR28","doi-asserted-by":"crossref","unstructured":"Gao Z, Xu Y, Meng F, Qi F, Lin Z (2014) Improved information gain-based feature selection for text categorization. Int. Conf. Wireless Commun. Vehicular Technol. Inform Theory and Aerosp. Electron. Sys. (VITAE) IEEE, 1\u20135.","DOI":"10.1109\/VITAE.2014.6934421"},{"issue":"1","key":"882_CR29","first-page":"3","volume":"5","author":"CE Shannon","year":"2001","unstructured":"Shannon CE. A mathematical theory of communication. ACM SIGMOBILE. 2001;5(1):3\u201355.","journal-title":"ACM SIGMOBILE"},{"key":"882_CR30","doi-asserted-by":"publisher","first-page":"208","DOI":"10.1016\/j.neucom.2016.07.036","volume":"216","author":"H Zhou","year":"2016","unstructured":"Zhou H, Deng Z, Xia Y, Fu M. A new sampling method in particle filter based on pearson correlation coefficient. Neurocomputing. 
2016;216:208\u201315.","journal-title":"Neurocomputing"},{"key":"882_CR31","unstructured":"Google Colab [Online]. Available: https:\/\/colab.research.google.com\/"},{"key":"882_CR32","unstructured":"Scikit-learn : machine learning in Python. https:\/\/scikit-learn.org\/stable\/"},{"key":"882_CR33","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1186\/s40537-022-00573-8","volume":"9","author":"E Ileberi","year":"2022","unstructured":"Ileberi E, Sun Y, Wang Z. A machine learning based credit card fraud detection using the GA algorithm for feature selection. J Big Data. 2022;9:24.","journal-title":"J Big Data"},{"key":"882_CR34","unstructured":"Lipton ZC, Elkan C, Narayanaswamy B (2014) Thresh- olding Classifiers to Maximize F1 Score. arXiv:1402.1892 [cs, stat], May 2014, Accessed: Nov. 01, 2020. http:\/\/arxiv.org\/abs\/1402.1892"},{"issue":"3","key":"882_CR35","doi-asserted-by":"publisher","first-page":"696","DOI":"10.1007\/s00357-019-09345-1","volume":"37","author":"J Muschelli","year":"2020","unstructured":"Muschelli J. ROC and AUC with a binary predictor: a poten- tially misleading metric. J Classif. 2020;37(3):696\u2013708.","journal-title":"J Classif"},{"issue":"1","key":"882_CR36","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1109\/MSP.2017.2765202","volume":"35","author":"A Creswell","year":"2018","unstructured":"Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA. Generative adversarial networks: An overview. IEEE Signal Process Mag. 2018;35(1):53\u201365.","journal-title":"IEEE Signal Process Mag"},{"key":"882_CR37","doi-asserted-by":"publisher","first-page":"108074","DOI":"10.1016\/j.compchemeng.2022.108074","volume":"169","author":"T Zhao","year":"2023","unstructured":"Zhao T, Zheng Y, Wu Z. Feature selection-based machine learning modeling for distributed model predictive control of nonlinear processes. Computers Chem Eng. 
2023;169:108074.","journal-title":"Computers Chem Eng."},{"issue":"8","key":"882_CR38","first-page":"5","volume":"2020","author":"C Edmond","year":"2020","unstructured":"Edmond C, Girsang AS. Classification performance for credit scoring using neural network. Int J. 2020;2020(8):5.","journal-title":"Int J"},{"issue":"2015","key":"882_CR39","first-page":"83","volume":"2015","author":"A Laudani","year":"2015","unstructured":"Laudani A, Lozito GM, Fulginei FR, Salvini A. On training efficiency and computational costs of a feed forward neural network: A review. Comput Intell Neurosci. 2015;2015(2015):83.","journal-title":"Comput Intell Neurosci"},{"key":"882_CR40","unstructured":"Stoffel M, Bamer F, Markert B. (2019). Stability of feed forward artificial neural networks versus nonlinear structural models in high speed deformations: A critical comparison. Arch Mech. 2019;71(2):34"}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-024-00882-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-024-00882-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-024-00882-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T11:15:56Z","timestamp":1706786156000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-024-00882-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,1]]},"references-count":40,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["882"],"URL":"https:\/\/doi.org\/10.1186\/s40537-024-00882-0","
relation":{},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,1]]},"assertion":[{"value":"4 October 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 January 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"23"}}