{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T14:51:57Z","timestamp":1773154317505,"version":"3.50.1"},"reference-count":69,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2024,6,4]],"date-time":"2024-06-04T00:00:00Z","timestamp":1717459200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Machine learning (ML) has become integral in educational decision-making through technologies such as learning analytics and educational data mining. However, the adoption of machine learning-driven tools without scrutiny risks perpetuating biases. Despite ongoing efforts to tackle fairness issues, their application to educational datasets remains limited. To address the mentioned gap in the literature, this research evaluates the effectiveness of four bias mitigation techniques in an educational dataset aiming at predicting students\u2019 dropout rate. The overarching research question is: \u201cHow effective are the techniques of reweighting, resampling, and Reject Option-based Classification (ROC) pivoting in mitigating the predictive bias associated with high school dropout rates in the HSLS:09 dataset?\u201d The effectiveness of these techniques was assessed based on performance metrics including false positive rate (FPR), accuracy, and F1 score. The study focused on the biological sex of students as the protected attribute. The reweighting technique was found to be ineffective, showing results identical to the baseline condition. Both uniform and preferential resampling techniques significantly reduced predictive bias, especially in the FPR metric but at the cost of reduced accuracy and F1 scores. 
The ROC pivot technique marginally reduced predictive bias while maintaining the original performance of the classifier, emerging as the optimal method for the HSLS:09 dataset. This research extends the understanding of bias mitigation in educational contexts, demonstrating practical applications of various techniques and providing insights for educators and policymakers. By focusing on an educational dataset, it contributes novel insights beyond the commonly studied datasets, highlighting the importance of context-specific approaches in bias mitigation.<\/jats:p>","DOI":"10.3390\/info15060326","type":"journal-article","created":{"date-parts":[[2024,6,4]],"date-time":"2024-06-04T05:17:30Z","timestamp":1717478250000},"page":"326","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["A Comparison of Bias Mitigation Techniques for Educational Classification Tasks Using Supervised Machine Learning"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9622-3780","authenticated-orcid":false,"given":"Tarid","family":"Wongvorachan","sequence":"first","affiliation":[{"name":"Measurement, Evaluation, and Data Science, University of Alberta, Edmonton, AB T6G 2G5, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5853-1267","authenticated-orcid":false,"given":"Okan","family":"Bulut","sequence":"additional","affiliation":[{"name":"Centre for Research in Applied Measurement and Evaluation, University of Alberta, Edmonton, AB T6G 2G5, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-2278-8966","authenticated-orcid":false,"given":"Joyce Xinle","family":"Liu","sequence":"additional","affiliation":[{"name":"Measurement, Evaluation, and Data Science, University of Alberta, Edmonton, AB T6G 2G5, 
Canada"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-4847-9934","authenticated-orcid":false,"given":"Elisabetta","family":"Mazzullo","sequence":"additional","affiliation":[{"name":"Measurement, Evaluation, and Data Science, University of Alberta, Edmonton, AB T6G 2G5, Canada"}]}],"member":"1968","published-online":{"date-parts":[[2024,6,4]]},"reference":[{"key":"ref_1","unstructured":"Barocas, S., Hardt, M., and Narayanan, A. (2023). Fairness and Machine Learning: Limitations and Opportunities, MIT Press."},{"key":"ref_2","unstructured":"Crawford, K. (2024, May 13). The Trouble with Bias. Available online: https:\/\/www.youtube.com\/watch?v=fMym_BKWQzk."},{"key":"ref_3","unstructured":"Shin, T. (2024, May 13). Real-Life Examples of Discriminating Artificial Intelligence. Towards Data Science. Available online: https:\/\/towardsdatascience.com\/real-life-examples-of-discriminating-artificial-intelligence-cae395a90070."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. (2012, January 8\u201310). Fairness through awareness. Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, Cambridge, MA, USA.","DOI":"10.1145\/2090236.2090255"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Chen, G., Rolim, V., Mello, R.F., and Ga\u0161evi\u0107, D. (2020, January 23\u201327). Let\u2019s shine together!: A comparative study between learning analytics and educational data mining. Proceedings of the Tenth International Conference on Learning Analytics & Knowledge, Frankfurt, Germany.","DOI":"10.1145\/3375462.3375500"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1207","DOI":"10.1111\/jcal.12577","article-title":"Artificial intelligence in educational assessment: \u2018Breakthrough? or buncombe and ballyhoo?\u2019","volume":"37","author":"Gardner","year":"2021","journal-title":"J. Comput. Assist. 
Learn."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1052","DOI":"10.1007\/s40593-021-00285-9","article-title":"Algorithmic Bias in Education","volume":"32","author":"Baker","year":"2022","journal-title":"Int. J. Artif. Intell. Educ."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1007\/s43681-021-00096-7","article-title":"Artificial intelligence in education: Addressing ethical challenges in K-12 settings","volume":"2","author":"Akgun","year":"2022","journal-title":"AI Ethics"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1007\/s13347-017-0279-x","article-title":"Fair, transparent, and accountable algorithmic decision-making processes: The premise, the proposed solutions, and the open challenges","volume":"31","author":"Lepri","year":"2018","journal-title":"Philos. Technol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3457607","article-title":"A survey on bias and fairness in machine learning","volume":"54","author":"Mehrabi","year":"2022","journal-title":"ACM Comput. Surv."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Rokach, L., Maimon, O., and Shmueli, E. (2023). Machine Learning for Data Science Handbook, Springer International Publishing.","DOI":"10.1007\/978-3-031-24628-9"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3616865","article-title":"Fairness in machine learning: A survey","volume":"56","author":"Caton","year":"2023","journal-title":"ACM Comput. 
Surv."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1089\/big.2016.0047","article-title":"Fair prediction with disparate impact: A study of bias in recidivism prediction instruments","volume":"5","author":"Chouldechova","year":"2017","journal-title":"Big Data"},{"key":"ref_14","first-page":"1","article-title":"dalex: Responsible machine learning with interactive explainability and fairness in Python","volume":"22","author":"Baniecki","year":"2021","journal-title":"J. Mach. Learn. Res."},{"key":"ref_15","unstructured":"Wi\u015bniewski, J., and Biecek, P. (2024, May 30). Hey, ML Engineer! Is Your Model Fair?. Available online: https:\/\/docs.mlinpl.org\/virtual-event\/2020\/posters\/11-Hey_ML_engineer_Is_your_model_fair.pdf."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Mashhadi, A., Zolyomi, A., and Quedado, J. (May, January 30). A Case Study of Integrating Fairness Visualization Tools in Machine Learning Education. Proceedings of the Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (CHI EA \u203222), New York, NY, USA.","DOI":"10.1145\/3491101.3503568"},{"key":"ref_17","unstructured":"Baniecki, H., Kretowicz, W., Piatyszek, P., Wisniewski, J., and Biecek, P. (2024, May 30). Module Dalex.Fairness. Available online: https:\/\/dalex.drwhy.ai\/python\/api\/fairness\/."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Mohanty, P.K., Das, P., and Roy, D.S. (2022, January 14\u201316). Predicting daily household energy usages by using Model Agnostic Language for Exploration and Explanation. 
Proceedings of the 2022 OITS International Conference on Information Technology (OCIT), Bhubaneswar, India.","DOI":"10.1109\/OCIT56763.2022.00106"},{"key":"ref_19","first-page":"149","article-title":"Fairness in machine learning: Lessons from political philosophy","volume":"Volume 81","author":"Friedler","year":"2018","journal-title":"Proceedings of the 1st Conference on Fairness, Accountability and Transparency"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Srivastava, M., Heidari, H., and Krause, A. (2019, January 4\u20138). Mathematical notions vs. Human perception of fairness: A descriptive approach to fairness for machine learning. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA.","DOI":"10.1145\/3292500.3330664"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1146\/annurev-statistics-042720-125902","article-title":"Algorithmic fairness: Choices, assumptions, and definitions","volume":"8","author":"Mitchell","year":"2021","journal-title":"Annu. Rev. Stat. Its Appl."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Feldman, M., Friedler, S.A., Moeller, J., Scheidegger, C., and Venkatasubramanian, S. (2015, January 10\u201313). Certifying and removing disparate impact. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia.","DOI":"10.1145\/2783258.2783311"},{"key":"ref_23","first-page":"103","article-title":"Data quality: \u201cGarbage in\u2013garbage out\u201d","volume":"47","author":"Kilkenny","year":"2018","journal-title":"Health Inf. Manag. J."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"3217","DOI":"10.1002\/int.22415","article-title":"Missing the missing values: The ugly duckling of fairness in machine learning","volume":"36","author":"Fernando","year":"2021","journal-title":"Int. J. Intell. 
Syst."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1613\/jair.1.13197","article-title":"Impact of imputation strategies on fairness in machine learning","volume":"74","author":"Caton","year":"2022","journal-title":"J. Artif. Intell. Res."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"381","DOI":"10.21275\/ART20203995","article-title":"Machine learning algorithms\u2014A review","volume":"9","author":"Mahesh","year":"2020","journal-title":"Int. J. Sci. Res. (IJSR)"},{"key":"ref_27","unstructured":"Ro\u00dfbach, P. (2024, May 13). Neural Networks vs. Random Forests\u2014Does It Always Have to Be Deep Learning?. Available online: https:\/\/blog.frankfurt-school.de\/wp-content\/uploads\/2018\/10\/Neural-Networks-vs-Random-Forests.pdf."},{"key":"ref_28","unstructured":"Li, H. (SAS Blogs, 2017). Which machine learning algorithm should I use?, SAS Blogs."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1016\/j.neucom.2020.07.061","article-title":"On hyperparameter optimization of machine learning algorithms: Theory and practice","volume":"415","author":"Yang","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Visalakshi, S., and Radha, V. (2014, January 18\u201320). A literature review of feature selection techniques and applications: Review of feature selection in data mining. Proceedings of the 2014 IEEE International Conference on Computational Intelligence and Computing Research, Coimbatore, India.","DOI":"10.1109\/ICCIC.2014.7238499"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"330","DOI":"10.1145\/230538.230561","article-title":"Bias in computer systems","volume":"14","author":"Friedman","year":"1996","journal-title":"ACM Trans. Inf. Syst."},{"key":"ref_32","unstructured":"Dobbe, R., Dean, S., Gilbert, T., and Kohli, N. (2018). A broader view on bias in automated decision-making: Reflecting on epistemology and dynamics. 
arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"2199","DOI":"10.30534\/ijeter\/2020\/117852020","article-title":"Analysis, prediction and evaluation of COVID-19 datasets using machine learning algorithms","volume":"8","author":"Prakash","year":"2020","journal-title":"Int. J. Emerg. Trends Eng. Res."},{"key":"ref_34","unstructured":"Haas, C. (2019, January 15\u201318). The price of fairness\u2014A framework to explore trade-offs in algorithmic fairness. Proceedings of the International Conference on Information Systems (ICIS), Munich, Germany."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1016\/j.cognition.2010.10.004","article-title":"Conceptual complexity and the bias\/variance tradeoff","volume":"118","author":"Briscoe","year":"2011","journal-title":"Cognition"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Speicher, T., Heidari, H., Grgic-Hlaca, N., Gummadi, K.P., Singla, A., Weller, A., and Zafar, M.B. (2018, January 19\u201323). A unified approach to quantifying algorithmic unfairness: Measuring individual & group unfairness via inequality indices. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK.","DOI":"10.1145\/3219819.3220046"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., and Huq, A. (2017, January 13\u201317). Algorithmic decision making and the cost of fairness. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD \u203217), Halifax, NS, Canada.","DOI":"10.1145\/3097983.3098095"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Veale, M., Van Kleek, M., and Binns, R. (2018, January 21\u201326). Fairness and accountability design needs for algorithmic support in high-stakes public sector decision-making. 
Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada.","DOI":"10.1145\/3173574.3174014"},{"key":"ref_39","first-page":"3133","article-title":"Do we need hundreds of classifiers to solve real world classification problems?","volume":"15","author":"Cernadas","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Noble, S.U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism, New York University Press.","DOI":"10.2307\/j.ctt1pwt9w5"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Selbst, A.D., Boyd, D., Friedler, S.A., Venkatasubramanian, S., and Vertesi, J. (2019, January 29\u201331). Fairness and abstraction in sociotechnical systems. Proceedings of the Conference on Fairness, Accountability, and Transparency, Atlanta, GA, USA.","DOI":"10.1145\/3287560.3287598"},{"key":"ref_42","unstructured":"Morozov, E. (2013). To Save Everything, Click Here: The Folly of Technological Solutionism, PublicAffairs. [1st ed.]."},{"key":"ref_43","first-page":"1","article-title":"Fairlearn: Assessing and improving fairness of AI systems","volume":"24","author":"Weerts","year":"2023","journal-title":"J. Mach. Learn. Res."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"205395171774353","DOI":"10.1177\/2053951717743530","article-title":"Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data","volume":"4","author":"Veale","year":"2017","journal-title":"Big Data Soc."},{"key":"ref_45","unstructured":"Vartan, S. (Scientific American, 2019). Racial bias found in a major health care risk algorithm, Scientific American."},{"key":"ref_46","unstructured":"Larson, J., Mattu, S., Kirchner, L., and Angwin, J. (ProPublica, 2016). How we analyzed the COMPAS recidivism algorithm, ProPublica."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Biecek, P., and Burzykowski, T. (2021). 
Explanatory Model Analysis, Chapman and Hall\/CRC.","DOI":"10.1201\/9780429027192"},{"key":"ref_48","unstructured":"Hardt, M., Price, E., and Srebro, N. (2016, January 5\u201310). Equality of opportunity in supervised learning. Proceedings of the 30th International Conference on Neural Information Processing Systems (NIPS\u201916), Barcelona, Spain."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Zafar, M.B., Valera, I., Gomez Rodriguez, M., and Gummadi, K.P. (2017, January 3\u20137). Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. Proceedings of the 26th International Conference on World Wide Web, Perth, Australia.","DOI":"10.1145\/3038912.3052660"},{"key":"ref_50","unstructured":"Bobko, P., and Roth, P.L. (2004). Research in Personnel and Human Resources Management, Emerald Group Publishing Limited."},{"key":"ref_51","first-page":"1","article-title":"Adverse impact in black student 6-year college graduation rates","volume":"39","author":"Hobson","year":"2021","journal-title":"Res. High. Educ."},{"key":"ref_52","unstructured":"Raghavan, M., and Kim, P.T. (2023). Limitations of the \u201cfour-fifths rule\u201d and statistical parity tests for measuring fairness. Georget. Law Technol. Rev., 8, Available online: https:\/\/ssrn.com\/abstract=4624571."},{"key":"ref_53","unstructured":"Watkins, E.A., McKenna, M., and Chen, J. (2022). The four-fifths rule is not disparate impact: A woeful tale of epistemic trespassing in algorithmic fairness. arXiv."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10115-011-0463-8","article-title":"Data preprocessing techniques for classification without discrimination","volume":"33","author":"Kamiran","year":"2012","journal-title":"Knowl. Inf. Syst."},{"key":"ref_55","first-page":"2825","article-title":"Scikit-learn: Machine learning in Python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Kamiran, F., Karim, A., and Zhang, X. (2012, January 10\u201312). Decision theory for discrimination-aware classification. Proceedings of the 2012 IEEE 12th International Conference on Data Mining, Brussels, Belgium.","DOI":"10.1109\/ICDM.2012.45"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez, A., Garc\u00eda, S., Galar, M., Prati, R.C., Krawczyk, B., and Herrera, F. (2018). Learning from Imbalanced Data Sets, Springer International Publishing.","DOI":"10.1007\/978-3-319-98074-4"},{"key":"ref_58","unstructured":"National Center for Educational Statistics [NCES] (2016). High School Longitudinal Study of 2009, NCES."},{"key":"ref_59","unstructured":"Van Rossum, G., and Drake, F.L. (2009). Python 3 Reference Manual, CreateSpace."},{"key":"ref_60","doi-asserted-by":"crossref","first-page":"52","DOI":"10.5539\/hes.v9n3p52","article-title":"Revisiting the Tinto\u2019s theoretical dropout model","volume":"9","author":"Nicoletti","year":"2019","journal-title":"High. Educ. Stud."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Bulut, O., Wongvorachan, T., and He, S. (2024, May 30). Enhancing High-School Dropout Identification: A Collaborative Approach Integrating Human and Machine Insights. Manuscript Submitted for Publication, Available online: https:\/\/www.researchsquare.com\/article\/rs-3871667\/v1.","DOI":"10.21203\/rs.3.rs-3871667\/v1"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"He, H., and Ma, Y. (2013). Imbalanced learning: Foundations, Algorithms, and Applications, John Wiley & Sons, Inc.","DOI":"10.1002\/9781118646106"},{"key":"ref_63","first-page":"116","article-title":"Classification of non-performing financing using logistic regression and synthetic minority over-sampling technique-nominal continuous (SMOTE-NC)","volume":"13","author":"Islahulhaq","year":"2021","journal-title":"Int. J. Adv. Soft Comput. 
Its Appl."},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Canbek, G., Sagiroglu, S., Temizel, T.T., and Baykal, N. (2017, January 5\u20138). Binary classification performance measures\/metrics: A comprehensive visualized roadmap to gain new insights. Proceedings of the 2017 International Conference on Computer Science and Engineering (UBMK), Antalya, Turkey.","DOI":"10.1109\/UBMK.2017.8093539"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, January 13\u201317). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD \u201916), San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_66","unstructured":"Chollet, F. (2024, May 30). Keras. Available online: https:\/\/keras.io."},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1080\/08839514.2021.1877481","article-title":"Classifier selection and ensemble model for multi-class imbalance learning in education grants prediction","volume":"35","author":"Sun","year":"2021","journal-title":"Appl. Artif. Intell."},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Barros, T.M., SouzaNeto, P.A., Silva, I., and Guedes, L.A. (2019). Predictive models for imbalanced data: A school dropout perspective. Educ. Sci., 9.","DOI":"10.3390\/educsci9040275"},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1007\/s10489-012-0374-8","article-title":"Predicting student failure at school using genetic programming and different data mining approaches with high dimensional and imbalanced data","volume":"38","author":"Cano","year":"2013","journal-title":"Appl. 
Intell."}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/6\/326\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:53:25Z","timestamp":1760108005000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/6\/326"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,6,4]]},"references-count":69,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["info15060326"],"URL":"https:\/\/doi.org\/10.3390\/info15060326","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,6,4]]}}}