{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"institution":[{"name":"medRxiv"}],"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T07:29:07Z","timestamp":1768548547781,"version":"3.49.0"},"posted":{"date-parts":[[2020,12,8]]},"group-title":"Epidemiology","reference-count":27,"publisher":"openRxiv","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"accepted":{"date-parts":[[2020,12,8]]},"abstract":"<jats:title>Abstract<\/jats:title>\n<jats:p>We compare the performance of major decision tree-based ensemble machine learning models on the task of COVID-19 death probability prediction, conditional on three risk factors: <jats:italic>age group, sex<\/jats:italic> and <jats:italic>underlying comorbidity or disease<\/jats:italic>, using the US Centers for Disease Control and Prevention (CDC)\u2019s COVID-19 case surveillance dataset. To evaluate the impact of the three risk factors on COVID-19 death probability, we extract and analyze the conditional probability profile produced by the best performer. The results show the presence of an exponential rise in death probability from COVID-19 with the age group, with males exhibiting a higher exponential growth rate than females, an effect that is stronger when an underlying comorbidity or disease is present, which also acts as an accelerator of COVID-19 death probability rise for both male and female subjects. 
The results are discussed in connection to healthcare and epidemiological concerns and in the degree to which they reinforce findings coming from other studies on COVID-19.\n                <\/jats:p>","DOI":"10.1101\/2020.12.06.20244756","type":"posted-content","created":{"date-parts":[[2020,12,8]],"date-time":"2020-12-08T18:00:42Z","timestamp":1607450442000},"source":"Crossref","is-referenced-by-count":1,"title":["Comparing Decision Tree-Based Ensemble Machine Learning Models for COVID-19 Death Probability Profiling"],"prefix":"10.64898","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0298-3974","authenticated-orcid":false,"given":"Carlos Pedro","family":"Gon\u00e7alves","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8710-0367","authenticated-orcid":false,"given":"Jos\u00e9","family":"Rouco","sequence":"additional","affiliation":[]}],"member":"54368","reference":[{"issue":"10","key":"2020121010350958000_2020.12.06.20244756v1.1","doi-asserted-by":"crossref","first-page":"e0239571","DOI":"10.1371\/journal.pone.0239571","article-title":"Association between COVID-19 prognosis and disease presentation, comorbidities and chronic treatment of hospitalized patients","volume":"15","year":"2020","journal-title":"PLoS ONE"},{"issue":"10","key":"2020121010350958000_2020.12.06.20244756v1.2","doi-asserted-by":"crossref","first-page":"e0240346","DOI":"10.1371\/journal.pone.0240346","article-title":"Physiological and socioeconomic characteristics predict COVID-19 mortality and resource utilization in Brazil","volume":"15","year":"2020","journal-title":"PLoS ONE"},{"issue":"10","key":"2020121010350958000_2020.12.06.20244756v1.3","doi-asserted-by":"crossref","first-page":"e0240400","DOI":"10.1371\/journal.pone.0240400","article-title":"In-hospital mortality is associated with inflammatory response in NAFLD patients admitted for COVID-19","volume":"15","year":"2020","journal-title":"PLoS 
ONE"},{"key":"2020121010350958000_2020.12.06.20244756v1.4","doi-asserted-by":"publisher","DOI":"10.1016\/j.jinf.2020.03.004"},{"key":"2020121010350958000_2020.12.06.20244756v1.5","doi-asserted-by":"publisher","DOI":"10.1136\/bmjresp-2020-000716"},{"issue":"5","key":"2020121010350958000_2020.12.06.20244756v1.6","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1038\/s42256-020-0180-7","article-title":"An interpretable mortality prediction model for COVID-19 patients","volume":"2","year":"2020","journal-title":"Nature Machine Intelligence"},{"issue":"1","key":"2020121010350958000_2020.12.06.20244756v1.7","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1002\/art.21695","article-title":"Short-Term Prediction of Mortality in Patients With Systemic Lupus Erythematosus: Classification of Outcomes Using Random Forests","volume":"55","year":"2006","journal-title":"Arthritis & Rheumatism (Arthritis Care & Research)"},{"issue":"5","key":"2020121010350958000_2020.12.06.20244756v1.8","first-page":"443","article-title":"Mortality Risk Score Prediction in an Elderly Population Using Machine Learning","volume":"177","year":"2012","journal-title":"American Journal of Epidemiology"},{"issue":"3","key":"2020121010350958000_2020.12.06.20244756v1.9","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1214\/08-AOAS169","article-title":"Random Survival Forests","volume":"2","year":"2008","journal-title":"The Annals of Applied Statistics"},{"key":"2020121010350958000_2020.12.06.20244756v1.10","doi-asserted-by":"crossref","unstructured":"Wongvibulsin S , Wu KC and Zeger SL . Clinical risk prediction with random forests for survival, longitudinal, and multivariate (RF-SLAM) data analysis. BMC Medical Research Methodology, 20(1). https:\/\/doi.org\/10.1186\/s12874-019-0863-0.","DOI":"10.1186\/s12874-019-0863-0"},{"key":"2020121010350958000_2020.12.06.20244756v1.11","doi-asserted-by":"crossref","unstructured":"Niculescu-Mizil A and Caruana RA (2005). 
Predicting good probabilities with supervised learning. ICML \u201805:Proceedings of the 22nd international conference on Machine learning August 2005, 625\u2013632. https:\/\/doi.org\/10.1145\/1102351.1102430.","DOI":"10.1145\/1102351.1102430"},{"key":"2020121010350958000_2020.12.06.20244756v1.12","doi-asserted-by":"publisher","DOI":"10.1175\/1520-0493(1950)078<0001:VOFEIT>2.0.CO;2"},{"key":"2020121010350958000_2020.12.06.20244756v1.13","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1186\/s41512-017-0020-3","article-title":"The Brier score does not evaluate the clinical utility of diagnostic tests or prediction models","volume":"1","year":"2017","journal-title":"Diagnostic and Prognostic Research"},{"key":"2020121010350958000_2020.12.06.20244756v1.14","first-page":"1321","article-title":"On calibration of modern neural networks","volume":"70","year":"2017","journal-title":"ICML\u201917:Proceedings of the 34th International Conference on Machine Learning"},{"key":"2020121010350958000_2020.12.06.20244756v1.15","unstructured":"Naeini MP , Cooper GF and Hauskrecht M (2015). Obtaining well calibrated probabilities using bayesian binning. 
AAAI, 2901, retrieved from https:\/\/people.cs.pitt.edu\/~milos\/research\/AAAI_Calibration.pdf."},{"key":"2020121010350958000_2020.12.06.20244756v1.16","first-page":"3","article-title":"Extremely randomized trees","volume":"63","year":"2005","journal-title":"Machine Learning"},{"key":"2020121010350958000_2020.12.06.20244756v1.17","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9781107415324.004"},{"key":"2020121010350958000_2020.12.06.20244756v1.18","doi-asserted-by":"crossref","first-page":"e9945","DOI":"10.7717\/peerj.9945","article-title":"A descriptive study of random forest algorithm for predicting COVID-19 patients outcome","volume":"8","year":"2020","journal-title":"PeerJ"},{"key":"2020121010350958000_2020.12.06.20244756v1.19","first-page":"357","article-title":"COVID-19 Patient Health Prediction Using Boosted Random Forest Algorithm. Front","volume":"8","year":"2020","journal-title":"Public Health"},{"key":"2020121010350958000_2020.12.06.20244756v1.20","doi-asserted-by":"publisher","DOI":"10.1006\/jcss.1997.1504"},{"issue":"1","key":"2020121010350958000_2020.12.06.20244756v1.21","first-page":"1","article-title":"Traffic Flow Prediction using Adaboost Algorithm with Random Forests as a Weak Learner","volume":"1","year":"2007","journal-title":"World Academy of Science, Engineering and Technology International Journal of Mathematical and Computational Sciences"},{"issue":"5","key":"2020121010350958000_2020.12.06.20244756v1.22","doi-asserted-by":"crossref","first-page":"1189","DOI":"10.1214\/aos\/1013203450","article-title":"Greedy function approximation: A gradient boosting machine","volume":"29","year":"2001","journal-title":"The Annals of Statistics"},{"key":"2020121010350958000_2020.12.06.20244756v1.23","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-9473(01)00065-2"},{"key":"2020121010350958000_2020.12.06.20244756v1.24","unstructured":"Ke G , Meng Q , Finley TW , Wang T , Chen W , Ma W , Ye Q , Liu, T-Y (2017). 
LightGBM: a highly efficient gradient boosting decision tree. NIPS\u201917:Proceedings of the 31st International Conference on Neural Information Processing Systems, 3149\u20133157, https:\/\/doi.org\/10.5555\/3294996.3295074."},{"key":"2020121010350958000_2020.12.06.20244756v1.25","doi-asserted-by":"crossref","unstructured":"Wu Z , Zhang J , Zhang H , Zeng H , Dai G , Kong W , Babiloni F , Yang C (2019). A LightGBM-Based EEG Analysis Method for Driver Mental States Classification. Computational Intelligence and Neuroscience Volume 2019, Article ID 3761203, 11p., https:\/\/doi.org\/10.1155\/2019\/3761203.","DOI":"10.1155\/2019\/3761203"},{"key":"2020121010350958000_2020.12.06.20244756v1.26","doi-asserted-by":"crossref","unstructured":"Fleitas, PE , Paz, JA , Simoy, MI , Vargas, C , Cimino, RO , Krolewiecki, AJ , Aparicio, JP (2020). Understanding the value of clinical symptoms of COVID-19. A logistic regression model. medRxiv, ID: ppmedrxiv-20207019, https:\/\/doi.org\/10.1101\/2020.10.07.20207019.","DOI":"10.1101\/2020.10.07.20207019"},{"key":"2020121010350958000_2020.12.06.20244756v1.27","doi-asserted-by":"crossref","unstructured":"Ejaz, H , Alsrhani, A , Zafar, A , Javed, H , Junaid, K , Abdalla, AE , Abosalif, KOA , Ahmed, Z , Younas, S (2020). COVID-19 and comorbidities: Deleterious impact on infected patients. 
Journal of Infection and Public Health, August, 4, https:\/\/doi.org\/10.1016\/j.jiph.2020.07.014.","DOI":"10.1016\/j.jiph.2020.07.014"}],"container-title":[],"original-title":[],"link":[{"URL":"https:\/\/syndication.highwire.org\/content\/doi\/10.1101\/2020.12.06.20244756","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T14:00:52Z","timestamp":1768485652000},"score":1,"resource":{"primary":{"URL":"http:\/\/medrxiv.org\/lookup\/doi\/10.1101\/2020.12.06.20244756"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,8]]},"references-count":27,"URL":"https:\/\/doi.org\/10.1101\/2020.12.06.20244756","relation":{},"subject":[],"published":{"date-parts":[[2020,12,8]]},"subtype":"preprint"}}