{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T23:17:01Z","timestamp":1775258221407,"version":"3.50.1"},"reference-count":27,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2022,2,28]],"date-time":"2022-02-28T00:00:00Z","timestamp":1646006400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Program COMPETE 2020, Portugal 2020","award":["POCI-05-5762-FSE-000199"],"award-info":[{"award-number":["POCI-05-5762-FSE-000199"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>In Portugal, the dropout rate of university courses is around 29%. Understanding the reasons behind such a high desertion rate can drastically improve the success of students and universities. This work applies existing data mining techniques to predict the academic dropout mainly using the academic grades. Four different machine learning techniques are presented and analyzed. The dataset consists of 331 students who were previously enrolled in the Computer Engineering degree at the Universidade de Tr\u00e1s-os-Montes e Alto Douro (UTAD). The study aims to detect students who may prematurely drop out using existing methods. The most relevant data features were identified using the Permutation Feature Importance technique. In the second phase, several methods to predict the dropouts were applied. Then, each machine learning technique\u2019s results were displayed and compared to select the best approach to predict academic dropout. The methods used achieved good results, reaching an F1-Score of 81% in the final test set, concluding that students\u2019 marks somehow incorporate their living conditions.<\/jats:p>","DOI":"10.3390\/fi14030076","type":"journal-article","created":{"date-parts":[[2022,2,28]],"date-time":"2022-02-28T20:09:57Z","timestamp":1646078997000},"page":"76","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":22,"title":["Forecasting Students Dropout: A UTAD University Study"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4236-545X","authenticated-orcid":false,"given":"Diogo E.","family":"Moreira da Silva","sequence":"first","affiliation":[{"name":"ECT\u2013UTAD Escola de Ci\u00eancias e Tecnologia, Universidade de Tr\u00e1s-os-Montes e Alto Douro, 5000-811 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3224-4926","authenticated-orcid":false,"given":"Eduardo J.","family":"Solteiro Pires","sequence":"additional","affiliation":[{"name":"ECT\u2013UTAD Escola de Ci\u00eancias e Tecnologia, Universidade de Tr\u00e1s-os-Montes e Alto Douro, 5000-811 Vila Real, Portugal"},{"name":"INESC TEC\u2014INESC Technology and Science (UTAD Pole), 5001-801 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9818-7090","authenticated-orcid":false,"given":"Ars\u00e9nio","family":"Reis","sequence":"additional","affiliation":[{"name":"ECT\u2013UTAD Escola de Ci\u00eancias e Tecnologia, Universidade de Tr\u00e1s-os-Montes e Alto Douro, 5000-811 Vila Real, Portugal"},{"name":"INESC TEC\u2014INESC Technology and Science (UTAD Pole), 5001-801 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4283-1243","authenticated-orcid":false,"given":"Paulo B.","family":"de Moura Oliveira","sequence":"additional","affiliation":[{"name":"ECT\u2013UTAD Escola de Ci\u00eancias e Tecnologia, Universidade de Tr\u00e1s-os-Montes e Alto Douro, 5000-811 Vila Real, Portugal"},{"name":"INESC TEC\u2014INESC Technology and Science (UTAD Pole), 5001-801 Vila Real, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4847-5104","authenticated-orcid":false,"given":"Jo\u00e3o","family":"Barroso","sequence":"additional","affiliation":[{"name":"ECT\u2013UTAD Escola de Ci\u00eancias e Tecnologia, Universidade de Tr\u00e1s-os-Montes e Alto Douro, 5000-811 Vila Real, Portugal"},{"name":"INESC TEC\u2014INESC Technology and Science (UTAD Pole), 5001-801 Vila Real, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2022,2,28]]},"reference":[{"key":"ref_1","unstructured":"Engr\u00e1cia, P., Oliveira, J., and DGEEC (2022, January 17). Percursos no Ensino Superior 2018. Available online: https:\/\/www.dgeec.mec.pt\/np4\/292\/%7B$clientServletPath%7D\/?newsId=516&fileName=DGEEC_SituacaoApos4AnosLicenciaturas.pdf."},{"key":"ref_2","first-page":"225","article-title":"Predicting Students\u2019 Dropout at University Using Artificial Neural Networks","volume":"7","author":"Siri","year":"2015","journal-title":"Ital. J. Sociol. Educ."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Queiroga, E.M., Lopes, J.L., Kappel, K., Aguiar, M., Ara\u00fajo, R.M., Munoz, R., Villarroel, R., and Cechinel, C. (2020). A Learning Analytics Approach to Identify Students at Risk of Dropout: A Case Study with a Technical Distance Education Course. Appl. Sci., 10.","DOI":"10.3390\/app10113998"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"107271","DOI":"10.1016\/j.compeleceng.2021.107271","article-title":"Deep analytic model for student dropout prediction in massive open online courses","volume":"93","author":"Mubarak","year":"2021","journal-title":"Comput. Electr. Eng."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Dass, S., Gary, K., and Cunningham, J. (2021). Predicting Student Dropout in Self-Paced MOOC Course Using Random Forest Model. Information, 12.","DOI":"10.3390\/info12110476"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"149","DOI":"10.18178\/ijmlc.2019.9.2.779","article-title":"Neural networks to predict dropout at the universities","volume":"9","author":"Alban","year":"2019","journal-title":"Int. J. Mach. Learn. Comput."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Plagge, M. (2013, January 4\u20136). Using Artificial Neural Networks to predict first-year traditional students second year retention rates. Proceedings of the Annual Southeast Conference, Savannah, GA, USA.","DOI":"10.1145\/2498328.2500061"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.childyouth.2018.11.030","article-title":"Dropout early warning systems for high school students using machine learning","volume":"96","author":"Chung","year":"2019","journal-title":"Child. Youth Serv. Rev."},{"key":"ref_9","unstructured":"Pereira, R.T., and Zambrano, J.C. (2017, January 18\u201321). Application of decision trees for detection of student dropout profiles. Proceedings of the 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Cancun, Mexico."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"133076","DOI":"10.1109\/ACCESS.2021.3115851","article-title":"A real-life machine learning experience for predicting university dropout at different stages using academic data","volume":"9","author":"Preciado","year":"2021","journal-title":"IEEE Access"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"206","DOI":"10.25046\/aj040425","article-title":"Predictive modelling of student dropout using ensemble classifier method in higher education","volume":"4","author":"Hutagaol","year":"2019","journal-title":"Adv. Sci. Technol. Eng. Syst. J."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Kiss, B., Nagy, M., Molontay, R., and Csabay, B. (2019, January 21\u201322). Predicting dropout using high school and first-semester academic achievement measures. Proceedings of the 2019 17th International Conference on Emerging eLearning Technologies and Applications (ICETA), Star\u00fd Smokovec, Slovakia.","DOI":"10.1109\/ICETA48886.2019.9040158"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Dharmawan, T., Ginardi, H., and Munif, A. (2018, January 7\u20138). Dropout detection using non-academic data. Proceedings of the 2018 4th International Conference on Science and Technology (ICST), Yogyakarta, Indonesia.","DOI":"10.1109\/ICSTC.2018.8528619"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Hasbun, T., Araya, A., and Villalon, J. (2016, January 7\u20138). Extracurricular activities as dropout prediction factors in higher education using decision trees. Proceedings of the 2016 IEEE 16th International Conference on Advanced Learning Technologies (ICALT), Yogyakarta, Indonesia.","DOI":"10.1109\/ICALT.2016.66"},{"key":"ref_15","unstructured":"Mduma, N., Kalegele, K., and Machuve, D. (2022, January 17). A survey of Machine Learning Approaches and Techniques for Student Dropout Prediction 2019. Available online: https:\/\/dspace.nm-aist.ac.tz\/handle\/20.500.12479\/71."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"de Oliveira, C.F., Sobral, S.R., Ferreira, M.J., and Moreira, F. (2021). How Does Learning Analytics Contribute to Prevent Students\u2019 Dropout in Higher Education: A Systematic Literature Review. Big Data Cogn. Comput., 5.","DOI":"10.3390\/bdcc5040064"},{"key":"ref_17","unstructured":"Kriesel, D. (2022, January 17). Neural Networks. Available online: https:\/\/www.dkriesel.com\/_media\/science\/neuronalenetze-en-zeta2-2col-dkrieselcom.pdf."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhou, Z.H. (2009). Ensemble Learning. Encyclopedia of Biometrics, Springer.","DOI":"10.1007\/978-0-387-73003-5_293"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/3-540-45014-9_1","article-title":"Ensemble methods in machine learning","volume":"Volume 1857","author":"Dietterich","year":"2000","journal-title":"International Workshop on Multiple Classifier Systems"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1016\/S0034-4257(97)00049-7","article-title":"Decision tree classification of land cover from remotely sensed data","volume":"61","author":"Brodley","year":"1997","journal-title":"Remote Sens. Environ."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/j.jbi.2018.03.007","article-title":"Wisdom of artificial crowds feature selection in untargeted metabolomics: An application to the development of a blood-based diagnostic test for thrombotic myocardial infarction","volume":"81","author":"Trainor","year":"2018","journal-title":"J. Biomed. Inform."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"21","DOI":"10.3389\/fnbot.2013.00021","article-title":"Gradient boosting machines, a tutorial","volume":"7","author":"Natekin","year":"2013","journal-title":"Front. Neurorobot."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A scalable tree boosting system. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_24","unstructured":"Dorogush, A.V., Ershov, V., and Gulin, A. (2018). CatBoost: Gradient boosting with categorical features support. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","article-title":"Learning from Imbalanced Data","volume":"21","author":"He","year":"2009","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"146","DOI":"10.1016\/j.trechm.2020.12.004","article-title":"Metrics for Benchmarking and Uncertainty Quantification: Quality, Applicability, and Best Practices for Machine Learning in Chemistry","volume":"3","author":"Vishwakarma","year":"2021","journal-title":"Trends Chem."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/14\/3\/76\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:29:17Z","timestamp":1760135357000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/14\/3\/76"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,28]]},"references-count":27,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,3]]}},"alternative-id":["fi14030076"],"URL":"https:\/\/doi.org\/10.3390\/fi14030076","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,28]]}}}