{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,3]],"date-time":"2026-05-03T12:38:40Z","timestamp":1777811920501,"version":"3.51.4"},"reference-count":49,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2024,7,14]],"date-time":"2024-07-14T00:00:00Z","timestamp":1720915200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of PR China","award":["72101121"],"award-info":[{"award-number":["72101121"]}]},{"name":"National Natural Science Foundation of PR China","award":["21YJC790054"],"award-info":[{"award-number":["21YJC790054"]}]},{"name":"Ministry of Education, Humanities, and social science projects","award":["72101121"],"award-info":[{"award-number":["72101121"]}]},{"name":"Ministry of Education, Humanities, and social science projects","award":["21YJC790054"],"award-info":[{"award-number":["21YJC790054"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Systems"],"abstract":"<jats:p>Credit evaluation has always been an important part of the financial field. The existing credit evaluation methods have difficulty in solving the problems of redundant data features and imbalanced samples. In response to the above issues, an ensemble model combining an advanced feature selection algorithm and an optimized loss function is proposed, which can be applied in the field of credit evaluation and improve the risk management ability of financial institutions. Firstly, the Boruta algorithm is embedded for feature selection, which can effectively reduce the data dimension and noise and improve the model\u2019s capacity for generalization by automatically identifying and screening out features that are highly correlated with target variables. Then, the GHM loss function is incorporated into the XGBoost model to tackle the issue of skewed sample distribution, which is common in classification, and further improve the classification and prediction performance of the model. The comparative experiments on four large datasets demonstrate that the proposed method is superior to the existing mainstream methods and can effectively extract features and handle the problem of imbalanced samples.<\/jats:p>","DOI":"10.3390\/systems12070254","type":"journal-article","created":{"date-parts":[[2024,7,15]],"date-time":"2024-07-15T08:35:17Z","timestamp":1721032517000},"page":"254","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":22,"title":["XGBoost-B-GHM: An Ensemble Model with Feature Selection and GHM Loss Function Optimization for Credit Scoring"],"prefix":"10.3390","volume":"12","author":[{"given":"Yuxuan","family":"Xia","sequence":"first","affiliation":[{"name":"School of Management Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China"}]},{"given":"Shanshan","family":"Jiang","sequence":"additional","affiliation":[{"name":"School of Management Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China"}]},{"given":"Lingyi","family":"Meng","sequence":"additional","affiliation":[{"name":"School of Management Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China"},{"name":"School of Politics, Economics and International Relations, University of Reading, Whiteknights, Reading RG6 6DH, UK"}]},{"given":"Xin","family":"Ju","sequence":"additional","affiliation":[{"name":"School of Politics, Economics and International Relations, University of Reading, Whiteknights, Reading RG6 6DH, UK"}]}],"member":"1968","published-online":{"date-parts":[[2024,7,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"114835","DOI":"10.1016\/j.eswa.2021.114835","article-title":"A conservative approach for online credit scoring","volume":"176","author":"Ashofteh","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"115946","DOI":"10.1016\/j.eswa.2021.115946","article-title":"Density-oriented linear discriminant analysis","volume":"187","author":"Bahraini","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1007\/s40304-021-00261-8","article-title":"Consistency of the k-Nearest Neighbor Classifier for Spatially Dependent Data","volume":"11","author":"Younso","year":"2023","journal-title":"Commun. Math. Stat."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"4557","DOI":"10.1109\/JSYST.2019.2937552","article-title":"Classification methods applied to credit scoring with collateral","volume":"14","author":"Teles","year":"2019","journal-title":"IEEE Syst. J."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3566","DOI":"10.3758\/s13428-022-01976-4","article-title":"A comparison of logistic regression methods for Ising model estimation","volume":"55","author":"Brusco","year":"2023","journal-title":"Behav. Res. Methods"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"21582440231189693","DOI":"10.1177\/21582440231189693","article-title":"Modeling tenant\u2019s credit scoring using logistic regression","volume":"13","author":"Ling","year":"2023","journal-title":"SAGE Open"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1007\/s12065-020-00519-0","article-title":"A novel approach to build accurate and diverse decision tree forest","volume":"15","author":"Panhalkar","year":"2022","journal-title":"Evol. Intell."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"109239","DOI":"10.1016\/j.patcog.2022.109239","article-title":"Shallow decision trees for explainable k-means clustering","volume":"137","author":"Laber","year":"2023","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Deng, J., Li, Q., and Wei, W. (2023). Improved Cascade Correlation Neural Network Model Based on Group Intelligence Optimization Algorithm. Axioms, 12.","DOI":"10.3390\/axioms12020164"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3523057","article-title":"Machine learning for computer systems and networking: A survey","volume":"55","author":"Kanakis","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1016\/j.eswa.2010.06.048","article-title":"A comparative assessment of ensemble learning for credit scoring","volume":"38","author":"Wang","year":"2011","journal-title":"Expert Syst. Appl."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1178","DOI":"10.1016\/j.ejor.2021.06.053","article-title":"Machine learning for credit scoring: Improving logistic regression with non-linear decision-tree effects","volume":"297","author":"Dumitrescu","year":"2022","journal-title":"Eur. J. Oper. Res."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1016\/j.eswa.2017.02.017","article-title":"A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring","volume":"78","author":"Xia","year":"2017","journal-title":"Expert Syst. Appl."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"104036","DOI":"10.1016\/j.engappai.2020.104036","article-title":"Step-wise multi-grained augmented gradient boosting decision trees for credit scoring","volume":"97","author":"Liu","year":"2021","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"106852","DOI":"10.1016\/j.asoc.2020.106852","article-title":"A new deep learning ensemble credit risk evaluation model with an improved synthetic minority oversampling technique","volume":"98","author":"Shen","year":"2021","journal-title":"Appl. Soft Comput."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1016\/j.ejor.2021.03.006","article-title":"Deep learning for credit scoring: Do or don\u2019t?","volume":"295","author":"Gunnarsson","year":"2021","journal-title":"Eur. J. Oper. Res."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"50426","DOI":"10.1109\/ACCESS.2021.3068854","article-title":"Making deep learning-based predictions for credit scoring explainable","volume":"9","author":"Dastile","year":"2021","journal-title":"IEEE Access"},{"key":"ref_18","first-page":"197","article-title":"RankXGB-Based Enterprise Credit Scoring by Electricity Consumption in Edge Computing Environment","volume":"75","author":"Shen","year":"2023","journal-title":"CMC Comput. Mater. Contin."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1016\/j.engappai.2016.12.002","article-title":"A deep learning approach for credit scoring using credit default swaps","volume":"65","author":"Luo","year":"2017","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_20","first-page":"1","article-title":"Exploration of financial market credit scoring and risk management and prediction using deep learning and bionic algorithm","volume":"30","author":"Du","year":"2022","journal-title":"J. Glob. Inf. Manag. (JGIM)"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"39700","DOI":"10.1109\/ACCESS.2022.3166891","article-title":"Credit card fraud detection using state-of-the-art machine learning and deep learning algorithms","volume":"10","author":"Alarfaj","year":"2022","journal-title":"IEEE Access"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"4847","DOI":"10.1007\/s00521-023-09232-2","article-title":"Toward interpretable credit scoring: Integrating explainable artificial intelligence with deep learning for credit card default prediction","volume":"36","author":"Talaat","year":"2024","journal-title":"Neural Comput. Appl."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"3446","DOI":"10.1016\/j.eswa.2011.09.033","article-title":"An experimental comparison of classification algorithms for imbalanced credit scoring data sets","volume":"39","author":"Brown","year":"2012","journal-title":"Expert Syst. Appl."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10462-024-10759-6","article-title":"A survey on imbalanced learning: Latest research, applications and future directions","volume":"57","author":"Chen","year":"2024","journal-title":"Artif. Intell. Rev."},{"key":"ref_25","first-page":"59","article-title":"CrossValidation for Imbalanced Datasets: Avoiding Overoptimistic and Overfitting Approaches","volume":"13","author":"Abreu","year":"2018","journal-title":"Res. Front."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1080\/0952813X.2020.1864783","article-title":"Correlation-based oversampling aided cost sensitive ensemble learning technique for treatment of class imbalance","volume":"34","author":"Devi","year":"2022","journal-title":"J. Exp. Theor. Artif. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"8689","DOI":"10.1109\/ACCESS.2023.3239889","article-title":"Internet financial credit scoring models based on deep forest and resampling methods","volume":"11","author":"Zhong","year":"2023","journal-title":"IEEE Access"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"60","DOI":"10.1016\/j.neucom.2023.01.023","article-title":"Neural collapse inspired attraction\u2013repulsion-balanced loss for imbalanced learning","volume":"527","author":"Xie","year":"2023","journal-title":"Neurocomputing"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2328","DOI":"10.1007\/s10489-019-01624-z","article-title":"Cost-sensitive hierarchical classification for imbalance classes","volume":"50","author":"Zheng","year":"2020","journal-title":"Appl. Intell."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1016\/j.ins.2022.02.021","article-title":"Predict-then-optimize or predict-and-optimize? An empirical evaluation of cost-sensitive learning strategies","volume":"594","author":"Vanderschueren","year":"2022","journal-title":"Inf. Sci."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"105895","DOI":"10.1016\/j.engappai.2023.105895","article-title":"A high dimensional features-based cascaded forward neural network coupled with MVMD and Boruta-GBDT for multi-step ahead forecasting of surface soil moisture","volume":"120","author":"Jamei","year":"2023","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1016\/j.ins.2022.03.047","article-title":"Residual memory inference network for regression tracking with weighted gradient harmonized loss","volume":"597","author":"Zhang","year":"2022","journal-title":"Inf. Sci."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Gilani, N., Arabi Belaghi, R., Aftabi, Y., Faramarzi, E., Edguenlue, T., and Somi, M.H. (2022). Identifying potential miRNA biomarkers for gastric cancer diagnosis using machine learning variable selection approach. Front. Genet., 12.","DOI":"10.3389\/fgene.2021.779455"},{"key":"ref_34","first-page":"541","article-title":"Multi-Step-Ahead Forecasting of the CBOE Volatility Index in a Data-Rich Environment: Application of Random Forest with Boruta Algorithm","volume":"38","author":"Kim","year":"2022","journal-title":"Korean Econ. Rev."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"117943","DOI":"10.1016\/j.eswa.2022.117943","article-title":"Research on prediction of multi-class theft crimes by an optimized decomposition and fusion method based on XGBoost","volume":"207","author":"Yan","year":"2022","journal-title":"Expert Syst. Appl."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"3156","DOI":"10.1109\/TNNLS.2020.3009776","article-title":"GBDT-MO: Gradient-boosted decision trees for multiple outputs","volume":"32","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1214\/aos\/1013203451","article-title":"Greedy function approximation: A gradient boosting machine","volume":"29","author":"Friedman","year":"2001","journal-title":"Ann. Stat."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"e12553","DOI":"10.1111\/exsy.12553","article-title":"Ensemble feature selection in medical datasets: Combining filter, wrapper, and embedded feature selection results","volume":"37","author":"Chen","year":"2020","journal-title":"Expert Syst."},{"key":"ref_39","first-page":"8577","article-title":"Gradient harmonized single-stage detector","volume":"33","author":"Li","year":"2019","journal-title":"AAAI Conf. Artif. Intell."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/j.patrec.2005.10.010","article-title":"An introduction to ROC analysis","volume":"27","author":"Fawcett","year":"2006","journal-title":"Pattern Recognit. Lett."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"2821","DOI":"10.1007\/s10462-021-10072-6","article-title":"A survey on feature selection methods for mixed data","volume":"55","year":"2022","journal-title":"Artif. Intell. Rev."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"308","DOI":"10.1080\/26395940.2022.2102543","article-title":"Hyperspectral estimation of petroleum hydrocarbon content in soil using ensemble learning method and LASSO feature extraction","volume":"34","author":"Wu","year":"2022","journal-title":"Environ. Pollut. Bioavailab."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"450","DOI":"10.1109\/TNNLS.2020.2978755","article-title":"Data clustering via uncorrelated ridge regression","volume":"32","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1080\/00401706.2020.1742207","article-title":"Ridge regression: A historical context","volume":"62","author":"Hoerl","year":"2020","journal-title":"Technometrics"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"119728","DOI":"10.1016\/j.neuroimage.2022.119728","article-title":"Feature-space selection with banded ridge regression","volume":"264","author":"Eickenberg","year":"2022","journal-title":"NeuroImage"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"104088","DOI":"10.1016\/j.frl.2023.104088","article-title":"SAFE Artificial Intelligence in finance","volume":"56","author":"Giudici","year":"2023","journal-title":"Financ. Res. Lett."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1186\/s13018-024-04774-0","article-title":"Preoperative prediction model for risk of readmission after total joint replacement surgery: A random forest approach leveraging NLP and unfairness mitigation for improved patient care and cost-effectiveness","volume":"19","author":"Digumarthi","year":"2024","journal-title":"J. Orthop. Surg. Res."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"nwad292","DOI":"10.1093\/nsr\/nwad292","article-title":"Bilevel optimization for automated machine learning: A new perspective on framework and algorithm","volume":"11","author":"Liu","year":"2023","journal-title":"Natl. Sci. Rev."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1109\/TVCG.2018.2865020","article-title":"Evaluating multi-dimensional visualizations for understanding fuzzy clusters","volume":"25","author":"Zhao","year":"2018","journal-title":"IEEE Trans. Vis. Comput. Graph."}],"container-title":["Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-8954\/12\/7\/254\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:16:45Z","timestamp":1760109405000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-8954\/12\/7\/254"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,14]]},"references-count":49,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2024,7]]}},"alternative-id":["systems12070254"],"URL":"https:\/\/doi.org\/10.3390\/systems12070254","relation":{},"ISSN":["2079-8954"],"issn-type":[{"value":"2079-8954","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,14]]}}}