{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,17]],"date-time":"2025-12-17T09:53:15Z","timestamp":1765965195345,"version":"3.48.0"},"reference-count":42,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2025,12,16]],"date-time":"2025-12-16T00:00:00Z","timestamp":1765843200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>This study presents a decision tree model-based approach to classify rural net migration across Colombian departments using sociodemographic and economic variables. In the model formulation, immigration is considered the movement of people to a destination area to settle there, while emigration is the movement of people from that specific area to other places. The target variable was defined as a binary category representing positive (when the immigration is greater than emigration) or negative net migration. Four classification models were trained and evaluated: Decision Tree, Random Forest, AdaBoost, and XGBoost. Data were preprocessed using cleaning techniques, categorical variable encoding, and class balance assessment. Model performance was evaluated using various metrics, including accuracy, precision, sensitivity, F1 score, and the area under the ROC curve. The results show that Random Forest achieves the highest accuracy, precision, sensitivity, and F1 score in the 10-variable and 15-variable settings, while XGBoost is competitive but not dominant. Furthermore, the importance of the model was analyzed to identify key factors influencing migration patterns. This approach allows for a more precise understanding of regional migration dynamics in Colombia and can serve as a basis for designing informed public policies.<\/jats:p>","DOI":"10.3390\/a18120797","type":"journal-article","created":{"date-parts":[[2025,12,16]],"date-time":"2025-12-16T17:22:58Z","timestamp":1765905778000},"page":"797","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Net Rural Migration Classification in Colombia Using Supervised Decision Tree Algorithms"],"prefix":"10.3390","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9101-2936","authenticated-orcid":false,"given":"Juan M.","family":"S\u00e1nchez","sequence":"first","affiliation":[{"name":"Facultad de Ingenier\u00eda, Universidad Distrital Francisco Jos\u00e9 de Caldas, Bogot\u00e1 110231, Colombia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0742-6069","authenticated-orcid":false,"given":"Helbert E.","family":"Espitia","sequence":"additional","affiliation":[{"name":"Facultad de Ingenier\u00eda, Universidad Distrital Francisco Jos\u00e9 de Caldas, Bogot\u00e1 110231, Colombia"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-9430-3703","authenticated-orcid":false,"given":"Cesar L.","family":"Gonz\u00e1lez","sequence":"additional","affiliation":[{"name":"Facultad de Ingenier\u00eda, Universidad Militar Nueva Granada, Bogot\u00e1 110111, Colombia"}]}],"member":"1968","published-online":{"date-parts":[[2025,12,16]]},"reference":[{"doi-asserted-by":"crossref","unstructured":"Galindo, G., Navarro, J., Reales, J., Castro, J., Romero, D., Rodriguez, A.S., and Rivera-Royero, D. (2022). Immigrants resettlement in developing countries: A data-driven decision tool applied to the case of Venezuelan immigrants in Colombia. PLoS ONE, 17.","key":"ref_1","DOI":"10.1371\/journal.pone.0262781"},{"doi-asserted-by":"crossref","unstructured":"Maldonado, A.D., Ramos-L\u00f3pez, D., and Aguilera, P.A. (2018). A Comparison of Machine-Learning Methods to Select Socioeconomic Indicators in Cultural Landscapes. Sustainability, 10.","key":"ref_2","DOI":"10.3390\/su10114312"},{"doi-asserted-by":"crossref","unstructured":"Alarc\u00e3o, V., Candeias, P., and Stefanovska-Petkovska, M. (2024). Chapter 13\u2014Environmental migration and human rights: Clues for the debate. Environmental Health Behavior, Concepts, Determinants, and Impacts, Academic Press.","key":"ref_3","DOI":"10.1016\/B978-0-12-824000-7.00004-0"},{"key":"ref_4","first-page":"103713","article-title":"Long-term effects of weather-induced migration on urban labor and housing markets","volume":"99","author":"Busso","year":"2025","journal-title":"J. Urban Econ."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"102540","DOI":"10.1016\/j.jdeveco.2020.102540","article-title":"Land reform and human capital development: Evidence from Peru","volume":"147","author":"Albertus","year":"2020","journal-title":"J. Dev. Econ."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"104787","DOI":"10.1016\/j.landusepol.2020.104787","article-title":"Worldwide trends in the scientific production on rural depopulation","volume":"97","year":"2020","journal-title":"Land Use Policy"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"103713","DOI":"10.1016\/j.regsciurbeco.2021.103713","article-title":"Rural\u2013urban migration in developing countries: Lessons from the literature","volume":"91","author":"Selod","year":"2021","journal-title":"Reg. Sci. Urban Econ."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1016\/j.jue.2015.09.002","article-title":"Demography, urbanization and development: Rural push, urban pull and urban push?","volume":"98","author":"Jedwab","year":"2017","journal-title":"J. Urban Econ."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"106727","DOI":"10.1016\/j.landusepol.2023.106727","article-title":"Worldwide research trends on land tenure","volume":"131","year":"2023","journal-title":"Land Use Policy"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1016\/j.worlddev.2014.12.013","article-title":"Cities, territories, and inclusive growth: Unraveling urban\u2013rural linkages in Chile, Colombia, and Mexico","volume":"73","author":"Carriazo","year":"2015","journal-title":"World Dev."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.19053\/uptc.01211048.17548","article-title":"Investigating the complex dynamics of rural\u2013urban migration: A bibliometric analysis","volume":"24","author":"Kumari","year":"2024","journal-title":"Rev. Inquietud Empres."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"103658","DOI":"10.1016\/j.regsciurbeco.2021.103658","article-title":"Rural\u2013urban migration at high urbanization levels","volume":"91","author":"Busso","year":"2021","journal-title":"Reg. Sci. Urban Econ."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"105941","DOI":"10.1016\/j.worlddev.2022.105941","article-title":"Economic and social development along the urban\u2013rural continuum: New opportunities to inform policy","volume":"157","author":"Cattaneo","year":"2022","journal-title":"World Dev."},{"doi-asserted-by":"crossref","unstructured":"Tufail, S., Riggs, H., Tariq, M., and Sarwat, A.I. (2023). Advancements and Challenges in Machine Learning: A Comprehensive Review of Models, Libraries, Applications, and Algorithms. Electronics, 12.","key":"ref_14","DOI":"10.3390\/electronics12081789"},{"doi-asserted-by":"crossref","unstructured":"Fu, M., Zhang, C., Hu, C., Wu, T., Dong, J., and Zhu, L. (2023). Achieving Verifiable Decision Tree Prediction on Hybrid Blockchains. Entropy, 25.","key":"ref_15","DOI":"10.3390\/e25071058"},{"doi-asserted-by":"crossref","unstructured":"Hafeez, M.A., Rashid, M., Tariq, H., Abideen, Z.U., Alotaibi, S.S., and Sinky, M.H. (2021). Performance Improvement of Decision Tree: A Robust Classifier Using Tabu Search Algorithm. Appl. Sci., 11.","key":"ref_16","DOI":"10.3390\/app11156728"},{"doi-asserted-by":"crossref","unstructured":"Tyralis, H., Papacharalampous, G., and Langousis, A. (2019). A Brief Review of Random Forests for Water Scientists and Practitioners and Their Recent History in Water Resources. Water, 11.","key":"ref_17","DOI":"10.3390\/w11050910"},{"doi-asserted-by":"crossref","unstructured":"Aldrich, C. (2020). Process Variable Importance Analysis by Use of Random Forests in a Shapley Regression Framework. Minerals, 10.","key":"ref_18","DOI":"10.3390\/min10050420"},{"doi-asserted-by":"crossref","unstructured":"Horny\u00e1k, O., and Iantovics, L.B. (2023). AdaBoost Algorithm Could Lead to Weak Results for Data with Certain Characteristics. Mathematics, 11.","key":"ref_19","DOI":"10.3390\/math11081801"},{"doi-asserted-by":"crossref","unstructured":"Wang, C., Xu, S., and Yang, J. (2021). Adaboost Algorithm in Artificial Intelligence for Optimizing the IRI Prediction Accuracy of Asphalt Concrete Pavement. Sensors, 21.","key":"ref_20","DOI":"10.3390\/s21175682"},{"doi-asserted-by":"crossref","unstructured":"Taser, P.Y. (2021). Application of Bagging and Boosting Approaches Using Decision Tree-Based Algorithms in Diabetes Risk Prediction. Proceedings, 74.","key":"ref_21","DOI":"10.3390\/proceedings2021074006"},{"doi-asserted-by":"crossref","unstructured":"Wang, L., Wang, X., Chen, A., Jin, X., and Che, H. (2020). Prediction of Type 2 Diabetes Risk and Its Effect Evaluation Based on the XGBoost Model. Healthcare, 8.","key":"ref_22","DOI":"10.3390\/healthcare8030247"},{"doi-asserted-by":"crossref","unstructured":"Zou, M., Jiang, W.-G., Qin, Q.-H., Liu, Y.-C., and Li, M.-L. (2022). Optimized XGBoost Model with Small Dataset for Predicting Relative Density of Ti-6Al-4V Parts Manufactured by Selective Laser Melting. Materials, 15.","key":"ref_23","DOI":"10.3390\/ma15155298"},{"unstructured":"Departamento Administrativo Nacional de Estad\u00edstica (DANE) (2024, November 03). Estad\u00edsticas\u00a0de Migraci\u00f3n, Available online: https:\/\/www.dane.gov.co\/index.php\/estadisticas-por-tema\/demografia-y-poblacion\/estadisticas-de-migracion.","key":"ref_24"},{"unstructured":"Departamento Administrativo Nacional de Estad\u00edstica (DANE) (2024, November 03). Encuesta de Calidad de Vida (ECV), Available online: https:\/\/www.dane.gov.co\/index.php\/estadisticas-por-tema\/pobreza-y-condiciones-de-vida\/calidad-de-vida-ecv.","key":"ref_25"},{"unstructured":"Departamento Administrativo Nacional de Estad\u00edstica (DANE) (2024, November 03). Demograf\u00eda y Poblaci\u00f3n, Available online: https:\/\/www.dane.gov.co\/index.php\/estadisticas-por-tema\/demografia-y-poblacion.","key":"ref_26"},{"unstructured":"Instituto Nacional de V\u00edas (INVIAS) (2024, November 03). Open Data Portal, Available online: https:\/\/inviasopendata-invias.opendata.arcgis.com.","key":"ref_27"},{"unstructured":"Polic\u00eda Nacional de Colombia (2024, November 03). Estad\u00edstica Delictiva, Available online: https:\/\/www.policia.gov.co\/estadistica-delictiva.","key":"ref_28"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"220311","DOI":"10.1098\/rsos.220311","article-title":"Published correlational effect sizes in social and developmental psychology","volume":"9","author":"Szucs","year":"2022","journal-title":"R. Soc. Open Sci."},{"key":"ref_30","first-page":"1","article-title":"A comparative analysis of Spearman and Pearson correlation using SPSS","volume":"5","author":"Suherman","year":"2025","journal-title":"OPTIMA J. Guid. Couns."},{"unstructured":"Breiman, L., Friedman, J., Olshen, R., and Stone, C. (1984). Classification and Regression Trees, Wadsworth.","key":"ref_31"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"e1301","DOI":"10.1002\/widm.1301","article-title":"Hyperparameters and Tuning Strategies for Random Forest","volume":"9","author":"Probst","year":"2019","journal-title":"WIREs Data Mining Knowl. Discov."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"doi-asserted-by":"crossref","unstructured":"Schapire, R., and Freund, Y. (2012). Boosting: Foundations and Algorithms, MIT Press.","key":"ref_34","DOI":"10.7551\/mitpress\/8291.001.0001"},{"doi-asserted-by":"crossref","unstructured":"Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning, Springer. Chapter 10.","key":"ref_35","DOI":"10.1007\/978-0-387-84858-7"},{"doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA.","key":"ref_36","DOI":"10.1145\/2939672.2939785"},{"doi-asserted-by":"crossref","unstructured":"Dietterich, T.G. (2000). Ensemble Methods in Machine Learning. Multiple Classifier Systems. Lecture Notes in Computer Science, Springer.","key":"ref_37","DOI":"10.1007\/3-540-45014-9_1"},{"unstructured":"G\u00e9ron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, O\u2019Reilly Media. [2nd ed.].","key":"ref_38"},{"doi-asserted-by":"crossref","unstructured":"Zhou, Z.-H. (2012). Ensemble Methods: Foundations and Algorithms, Chapman and Hall\/CRC.","key":"ref_39","DOI":"10.1201\/b12207"},{"key":"ref_40","first-page":"3133","article-title":"Do We Need Hundreds of Classifiers to Solve Real World Classification Problems?","volume":"15","author":"Cernadas","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1007\/s10113-022-01915-1","article-title":"Applying machine learning to social datasets: A study of migration in southwestern Bangladesh using random forests","volume":"22","author":"Best","year":"2022","journal-title":"Reg. Environ. Change"},{"doi-asserted-by":"crossref","unstructured":"Fiandrino, S., Cattuto, C., Paolotti, D., and Schifanella, R. (2023). Combining environmental and socioeconomic data to understand determinants of conflicts in Colombia. Front. Big Data, 6.","key":"ref_42","DOI":"10.3389\/fdata.2023.1107785"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/12\/797\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,17]],"date-time":"2025-12-17T09:48:55Z","timestamp":1765964935000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/12\/797"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,16]]},"references-count":42,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["a18120797"],"URL":"https:\/\/doi.org\/10.3390\/a18120797","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2025,12,16]]}}}