{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T16:01:03Z","timestamp":1776960063572,"version":"3.51.4"},"reference-count":33,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2025,6,15]],"date-time":"2025-06-15T00:00:00Z","timestamp":1749945600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>COVID-19 mortality is a complex phenomenon influenced by multiple factors. This study aimed to identify factors associated with death in COVID-19 patients by considering clinical, demographic, environmental, and socioeconomic conditions, using machine learning models and a national dataset from Mexico covering all pandemic waves. We integrated data from the national COVID-19 dataset, municipal-level socioeconomic indicators, and water quality contaminants (physicochemical and microbiological). Patients were assigned to one of four datasets (groundwater, lentic, lotic, and coastal) based on their municipality of residence. We trained XGBoost models to predict patient death or survival on balanced subsets of each dataset. Hyperparameters were optimized using a grid search and cross-validation, and feature importance was analyzed using SHAP values, point-biserial correlation, and XGBoost metrics. The models achieved strong predictive performance (F1 score &gt; 0.97). Key risk factors included older age (\u226550 years), pneumonia, intubation, obesity, diabetes, hypertension, and chronic kidney disease, while outpatient status, younger age (&lt;40 years), contact with a confirmed case, and care in private medical units were associated with survival. Female sex showed a protective trend. Higher socioeconomic levels appeared protective, whereas lower levels increased risk. 
Water quality contaminants (e.g., manganese, hardness, fluoride, dissolved oxygen, fecal coliforms) ranked among the top 30 features, suggesting an association between environmental factors and COVID-19 mortality.<\/jats:p>","DOI":"10.3390\/make7020055","type":"journal-article","created":{"date-parts":[[2025,6,16]],"date-time":"2025-06-16T09:51:22Z","timestamp":1750067482000},"page":"55","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Factors Associated with COVID-19 Mortality in Mexico: A Machine Learning Approach Using Clinical, Socioeconomic, and Environmental Data"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1577-5629","authenticated-orcid":false,"given":"Lorena","family":"D\u00edaz-Gonz\u00e1lez","sequence":"first","affiliation":[{"name":"Centro de Investigaci\u00f3n en Ciencias, Universidad Aut\u00f3noma del Estado de Morelos, Cuernavaca 62209, Morelos, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yael Sharim","family":"Toribio-Colin","sequence":"additional","affiliation":[{"name":"Licenciatura en Ciencias, Instituto de Investigaci\u00f3n en Ciencias B\u00e1sicas Aplicadas (IICBA), Universidad Aut\u00f3noma del Estado de Morelos, Cuernavaca 62209, Morelos, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Julio C\u00e9sar","family":"P\u00e9rez-Sansalvador","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Instituto Nacional de Astrof\u00edsica, \u00d3ptica y Electr\u00f3nica, Luis Enrique Erro 1, Tonantzintla 72840, Puebla, Mexico"},{"name":"Secretar\u00eda de Ciencia, Humanidades, Tecnolog\u00eda e Innovaci\u00f3n (SECIHTI), Insurgentes Sur 1582, Ciudad de Mexico 03940, 
Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7180-1038","authenticated-orcid":false,"given":"Noureddine","family":"Lakouari","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Instituto Nacional de Astrof\u00edsica, \u00d3ptica y Electr\u00f3nica, Luis Enrique Erro 1, Tonantzintla 72840, Puebla, Mexico"},{"name":"Secretar\u00eda de Ciencia, Humanidades, Tecnolog\u00eda e Innovaci\u00f3n (SECIHTI), Insurgentes Sur 1582, Ciudad de Mexico 03940, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,6,15]]},"reference":[{"key":"ref_1","unstructured":"PAHO (2025, April 25). WHO Characterizes COVID-19 as a Pandemic. Available online: https:\/\/www.paho.org\/en\/news\/11-3-2020-who-characterizes-covid-19-pandemic."},{"key":"ref_2","unstructured":"WHO (2025, April 25). WHO Coronavirus (COVID-19) Dashboard. Available online: https:\/\/covid19.who.int\/."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"104258","DOI":"10.1016\/j.ijmedinf.2020.104258","article-title":"Personalized predictive models for symptomatic COVID-19 patients using basic preconditions: Hospitalizations, mortality, and the need for an ICU or ventilator","volume":"142","author":"Cassandras","year":"2020","journal-title":"Int. J. Med. Inform."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Khadem, H., Nemat, H., Eissa, M.R., Elliott, J., and Benaissa, M. (2022). COVID-19 mortality risk assessments for individuals with and without diabetes mellitus: Machine learning models integrated with interpretation framework. Comput. Biol. 
Med., 144.","DOI":"10.1016\/j.compbiomed.2022.105361"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2240334","DOI":"10.1080\/27684520.2023.2240334","article-title":"Interpretable machine learning for mortality modeling on patients with chronic diseases considering the COVID-19 pandemic in a region of Chile: A Shapley value based approach","volume":"1","author":"Ferreira","year":"2023","journal-title":"Res. Stat."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Datta, D., George Dalmida, S., Martinez, L., Newman, D., Hashemi, J., Khoshgoftaar, T.M., Shorten, C., Sareli, C., and Eckardt, P. (2023). Using machine learning to identify patient characteristics to predict mortality of in-patients with COVID-19 in south Florida. Front. Digit. Health, 5.","DOI":"10.3389\/fdgth.2023.1193467"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Rojas-Garc\u00eda, M., V\u00e1zquez, B., Torres-Poveda, K., and Madrid-Marina, V. (2023). Lethality risk markers by sex and age-group for COVID-19 in Mexico: A cross-sectional study based on machine learning approach. BMC Infect. Dis., 23.","DOI":"10.1186\/s12879-022-07951-w"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Li, H., Ashrafi, N., Kang, C., Zhao, G., Chen, Y., and Pishgar, M. (2024). A machine learning-based prediction of hospital mortality in mechanically ventilated ICU patients. PLoS ONE, 19.","DOI":"10.1101\/2024.07.12.24310325"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Sharifi-Kia, A., Nahvijou, A., and Sheikhtaheri, A. (2023). Machine learning-based mortality prediction models for smoker COVID-19 patients. BMC Med. Inform. Decis. Mak., 23.","DOI":"10.1186\/s12911-023-02237-w"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Casillas, N., Ram\u00f3n, A., Torres, A.M., Blasco, P., and Mateo, J. (2023). Predictive model for mortality in severe COVID-19 patients across the six pandemic waves. 
Viruses, 15.","DOI":"10.3390\/v15112184"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1007\/s10916-023-01979-4","article-title":"Risk factors associated with COVID-19 lethality: A machine learning approach using Mexico database","volume":"47","year":"2023","journal-title":"J. Med. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1186\/s12963-024-00330-4","article-title":"Country-specific determinants for COVID-19 case fatality rate and response strategies from a global perspective: An interpretable machine learning framework","volume":"22","author":"Zhou","year":"2024","journal-title":"Popul. Health Metr."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2358851","DOI":"10.1080\/17538947.2024.2358851","article-title":"Relationships between geo-spatial features and COVID-19 hospitalisations revealed by machine learning models and SHAP values","volume":"17","author":"Chu","year":"2024","journal-title":"Int. J. Digit. Earth"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"101224","DOI":"10.1016\/j.ecoinf.2021.101224","article-title":"An evaluation of feature selection methods for environmental data","volume":"61","author":"Effrosynidis","year":"2021","journal-title":"Ecol. Inform."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic Minority Over-sampling Technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. Res."},{"key":"ref_16","unstructured":"National Epidemiological Surveillance System (2024, November 04). Datos Abiertos-Bases Hist\u00f3ricas-Direcci\u00f3n General de Epidemiolog\u00eda, Available online: https:\/\/www.gob.mx\/salud\/documentos\/datos-abiertos-bases-historicas-direccion-general-de-epidemiologia."},{"key":"ref_17","unstructured":"United Nations Development Programme (2024, November 04). 
\u00cdndice de Desarrollo Humano (IDH) Municipal Resultados 2010\u20132020 [Dataset]. Available online: https:\/\/drive.google.com\/drive\/folders\/1GRxyxSIPAL629vOnMLsLZgX70iqVo5ZX."},{"key":"ref_18","unstructured":"United Nations Development Programme (2024, November 04). Informe de Desarrollo Humano Municipal 2010\u20132020: Una D\u00e9cada de Transformaciones Locales en M\u00e9xico. Programa de las Naciones Unidas para el Desarrollo, p. 99. Available online: https:\/\/www.undp.org\/es\/mexico\/publicaciones\/informe-de-desarrollo-humano-municipal-2010-2020-una-decada-de-transformaciones-locales-en-mexico-0."},{"key":"ref_19","unstructured":"P\u00e9rez-Tamayo, R. (2016). Patolog\u00eda de la Pobreza, Fondo de Cultura Econ\u00f3mica."},{"key":"ref_20","unstructured":"Comisi\u00f3n Nacional del Agua (2025, February 11). Resultados de la Red Nacional de medici\u00f3n de Calidad del Agua (RENAMECA), Available online: https:\/\/www.gob.mx\/conagua\/articulos\/resultados-de-la-red-nacional-de-medicion-de-calidad-del-agua-renameca?idiom=es."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"104498","DOI":"10.1016\/j.jconhyd.2025.104498","article-title":"AQuA-P: A machine learning-based tool for water quality assessment","volume":"269","author":"Lakouari","year":"2025","journal-title":"J. Contam. Hydrol."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A scalable tree boosting system. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_23","unstructured":"Russell, S., and Norvig, P. (2020). Decision Trees. Artificial Intelligence: A Modern Approach, Pearson. [4th ed.]."},{"key":"ref_24","unstructured":"(2025, June 11). XGBoost Python Package. 
Available online: https:\/\/xgboost.readthedocs.io\/en\/stable\/python\/index.html."},{"key":"ref_25","first-page":"201","article-title":"Performance Comparison of Grid Search and Random Search Methods for Hyperparameter Tuning in Extreme Gradient Boosting Algorithm to Predict Chronic Kidney Failure","volume":"14","author":"Anggoro","year":"2021","journal-title":"Int. J. Intell. Eng. Syst."},{"key":"ref_26","unstructured":"XGBoost Contributors (2025, February 11). XGBoost Parameters. Available online: https:\/\/xgboost.readthedocs.io\/en\/stable\/parameter.html."},{"key":"ref_27","unstructured":"Bishop, C.M. (2006). Pattern Recognition and Machine Learning, Springer."},{"key":"ref_28","unstructured":"Lundberg, S.M., and Lee, S.I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, Curran Associates, Inc.. Available online: https:\/\/arxiv.org\/abs\/1705.07874."},{"key":"ref_29","unstructured":"Thomson, R.E. (2009). A value for n-person games. The Shapley Value, Cambridge University Press."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Kornbrot, D. (2014). Point Biserial Correlation. Wiley StatsRef: Statistics Reference Online, Wiley.","DOI":"10.1002\/9781118445112.stat06227"},{"key":"ref_31","unstructured":"Toribio-Colin, Y.S. (2025). Factors Associated with COVID-19 Mortality in Mexico: A Machine Learning Approach Using Clinical, Socioeconomic, and Environmental Data [Dataset]. 
Zenodo."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"e051458","DOI":"10.1136\/bmjopen-2021-051458","article-title":"Reopening Italy\u2019s schools in September 2020: A Bayesian estimation of the change in the growth rate of new SARS-CoV-2 cases","volume":"11","author":"Casini","year":"2021","journal-title":"BMJ Open"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"100092","DOI":"10.1016\/j.lanepe.2021.100092","article-title":"A cross-sectional and prospective cohort study of the role of schools in the SARS-CoV-2 second wave in Italy","volume":"5","author":"Gandini","year":"2021","journal-title":"Lancet Reg. Health Eur."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/2\/55\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:52:24Z","timestamp":1760032344000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/7\/2\/55"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,15]]},"references-count":33,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2025,6]]}},"alternative-id":["make7020055"],"URL":"https:\/\/doi.org\/10.3390\/make7020055","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,15]]}}}