{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T18:51:12Z","timestamp":1777661472055,"version":"3.51.4"},"reference-count":15,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T00:00:00Z","timestamp":1739750400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T00:00:00Z","timestamp":1739750400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank"},{"DOI":"10.13039\/501100002386","name":"Cairo University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100002386","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Gradient Boosting Trees (GBT) is a powerful machine learning technique that is based on ensemble learning methods that leverage the idea of boosting. GBT combines multiple weak learners sequentially to boost its prediction power proving its outstanding efficiency in many problems, and hence it is now considered one of the top techniques used to solve prediction problems. In this paper, a hybrid approach is proposed that combines GBT with K-means and Bisecting K-means clustering to enhance the predictive power of the approach on regression datasets. The proposed approach is applied on 40 regression datasets from UCI and Kaggle websites and it achieves better efficiency than using only one GBT model. Statistical tests are applied, namely, Friedman and Wilcoxon signed-rank tests showing that the proposed approach achieves significant better results than using only one GBT model.<\/jats:p>","DOI":"10.1186\/s40537-025-01071-3","type":"journal-article","created":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T11:29:55Z","timestamp":1739791795000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":42,"title":["Enhancing the performance of gradient boosting trees on regression problems"],"prefix":"10.1186","volume":"12","author":[{"given":"Lydia Wahid","family":"Rizkallah","sequence":"first","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,2,17]]},"reference":[{"key":"1071_CR1","doi-asserted-by":"publisher","first-page":"955","DOI":"10.1007\/s00521-017-3128-z","volume":"31","author":"BA Tama","year":"2019","unstructured":"Tama BA, Rhee KH. An in-depth experimental study of anomaly detection using gradient boosted machine. Neural Comput Appl. 2019;31:955\u201365.","journal-title":"Neural Comput Appl"},{"key":"1071_CR2","doi-asserted-by":"crossref","unstructured":"Henzel J, Sikora M. Gradient boosting application in forecasting of performance indicators values for measuring the efficiency of promotions in FMCG retail. In 2020 15th Conference on Computer Science and Information Systems (FedCSIS). 2020, September; 59\u201368. IEEE.","DOI":"10.15439\/2020F118"},{"key":"1071_CR3","doi-asserted-by":"publisher","first-page":"1308","DOI":"10.1007\/s42452-020-3060-1","volume":"2","author":"EK Sahin","year":"2020","unstructured":"Sahin EK. Assessing the predictive capability of ensemble tree methods for landslide susceptibility mapping using XGBoost, gradient boosting machine, and random forest. SN Appl Sci. 2020;2:1308.","journal-title":"SN Appl Sci"},{"key":"1071_CR4","doi-asserted-by":"publisher","first-page":"105942","DOI":"10.1016\/j.asoc.2019.105942","volume":"86","author":"R Sun","year":"2020","unstructured":"Sun R, Wang G, Zhang W, Hsu LT, Ochieng WY. A gradient boosting decision tree based GPS signal reception classification algorithm. Appl Soft Comput. 2020;86:105942.","journal-title":"Appl Soft Comput"},{"key":"1071_CR5","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1007\/s42979-021-00558-z","volume":"2","author":"D Vassallo","year":"2021","unstructured":"Vassallo D, Vella V, Ellul J. Application of gradient boosting algorithms for anti-money laundering in cryptocurrencies. SN COMPUT SCI. 2021;2:143.","journal-title":"SN COMPUT SCI"},{"issue":"6","key":"1071_CR6","doi-asserted-by":"publisher","first-page":"3403","DOI":"10.1007\/s11440-022-01777-1","volume":"18","author":"S Demir","year":"2023","unstructured":"Demir S, Sahin EK. Predicting occurrence of liquefaction-induced lateral spreading using gradient boosting algorithms integrated with particle swarm optimization: PSO-XGBoost, PSO-LightGBM, and PSO-CatBoost. Acta Geotech. 2023;18(6):3403\u201319.","journal-title":"Acta Geotech"},{"key":"1071_CR7","doi-asserted-by":"publisher","first-page":"119030","DOI":"10.1016\/j.eswa.2022.119030","volume":"213","author":"MHL Louk","year":"2023","unstructured":"Louk MHL, Tama BA. Dual-IDS: A bagging-based gradient boosting decision tree model for network anomaly intrusion detection system. Expert Syst Appl. 2023;213:119030.","journal-title":"Expert Syst Appl"},{"key":"1071_CR8","doi-asserted-by":"publisher","first-page":"2257","DOI":"10.1007\/s11053-019-09576-4","volume":"29","author":"S Asante-Okyere","year":"2020","unstructured":"Asante-Okyere S, Shen C, Ziggah YY, Rulegeya MM, Zhu X. A novel hybrid technique of integrating gradient-boosted machine and clustering algorithms for lithology classification. Nat Resour Res. 2020;29:2257\u201373.","journal-title":"Nat Resour Res"},{"key":"1071_CR9","doi-asserted-by":"publisher","first-page":"211411","DOI":"10.1109\/ACCESS.2020.3038490","volume":"8","author":"CB Issaid","year":"2020","unstructured":"Issaid CB, Ant\u00f3n-Haro C, Mestre X, Alouini MS. User clustering for MIMO NOMA via classifier chains and gradient-boosting decision trees. IEEE Access. 2020;8:211411\u201321.","journal-title":"IEEE Access"},{"key":"1071_CR10","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1109\/ACCESS.2021.3137870","volume":"10","author":"G Rouwhorst","year":"2021","unstructured":"Rouwhorst G, Duque EMS, Nguyen PH, Slootweg H. Improving clustering-based forecasting of aggregated distribution transformer loadings with gradient boosting and feature selection. IEEE Access. 2021;10:443\u201355.","journal-title":"IEEE Access"},{"issue":"12","key":"1071_CR11","doi-asserted-by":"publisher","first-page":"6721","DOI":"10.1007\/s00521-020-05450-0","volume":"33","author":"AA Taha","year":"2021","unstructured":"Taha AA, Malebary SJ. Hybrid classification of Android malware based on fuzzy clustering and the gradient boosting machine. Neural Comput Appl. 2021;33(12):6721\u201332.","journal-title":"Neural Comput Appl"},{"key":"1071_CR12","doi-asserted-by":"publisher","first-page":"1723","DOI":"10.1038\/s41598-024-52251-9","volume":"14","author":"H Saito","year":"2024","unstructured":"Saito H, Yoshimura H, Tanaka K, et al. Predicting CKD progression using time-series clustering and light gradient boosting machines. Sci Rep. 2024;14:1723.","journal-title":"Sci Rep"},{"key":"1071_CR13","doi-asserted-by":"publisher","first-page":"1937","DOI":"10.1007\/s10462-020-09896-5","volume":"54","author":"C Bent\u00e9jac","year":"2021","unstructured":"Bent\u00e9jac C, Cs\u00f6rg\u0151 A, Mart\u00ednez-Mu\u00f1oz G. A comparative analysis of gradient boosting algorithms. Artif Intell Rev. 2021;54:1937\u201367.","journal-title":"Artif Intell Rev"},{"key":"1071_CR14","unstructured":"https:\/\/archive.ics.uci.edu\/. Accessed 3 July 2024."},{"key":"1071_CR15","unstructured":"https:\/\/www.kaggle.com\/. Accessed 3 July 2024."}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-025-01071-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-025-01071-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-025-01071-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T11:30:01Z","timestamp":1739791801000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-025-01071-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,2,17]]},"references-count":15,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["1071"],"URL":"https:\/\/doi.org\/10.1186\/s40537-025-01071-3","relation":{},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,2,17]]},"assertion":[{"value":"6 July 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 January 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 February 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"35"}}