{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T14:45:05Z","timestamp":1774622705007,"version":"3.50.1"},"reference-count":48,"publisher":"Wiley","issue":"6","license":[{"start":{"date-parts":[[2025,3,24]],"date-time":"2025-03-24T00:00:00Z","timestamp":1742774400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":["bera-journals.onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Brit J Educational Tech"],"published-print":{"date-parts":[[2025,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:label\/><jats:p>The use of predictive analytics powered by machine learning (ML) to model educational data has increasingly been identified to exhibit bias towards marginalized populations, prompting the need for more equitable applications of these techniques. To tackle bias that emerges in training data or models at different stages of the ML modelling pipeline, numerous debiasing approaches have been proposed. Yet, research into state\u2010of\u2010the\u2010art techniques for effectively employing these approaches to enhance fairness in educational predictive scenarios remains limited. Prior studies often focused on mitigating bias from a single source at a specific stage of model construction within narrowly defined scenarios, overlooking the complexities of bias originating from multiple sources across various stages. Moreover, these approaches were often evaluated using typical threshold\u2010dependent fairness metrics, which fail to account for real\u2010world educational scenarios where thresholds are typically unknown before evaluation. To bridge these gaps, this study systematically examined a total of 28 representative debiasing approaches, categorized by the sources of bias and the stage they targeted, for two critical educational predictive tasks, namely forum post classification and student career prediction. Both tasks involve a two\u2010phase modelling process where features learned from upstream models in the first phase are fed into classical ML models for final predictions, which is a common yet under\u2010explored setting for educational data modelling. The study observed that addressing local stereotypical bias, label bias or proxy discrimination in training data, as well as imposing fairness constraints on models, can effectively enhance predictive fairness. But their efficacy was often compromised when features from upstream models were inherently biased. Beyond that, this study proposes two novel strategies, namely Multi\u2010Stage and Multi\u2010Source debiasing to integrate existing approaches. These strategies demonstrated substantial improvements in mitigating unfairness, underscoring the importance of unified approaches capable of addressing biases from various sources across multiple stages.<\/jats:p><\/jats:sec><jats:sec><jats:label\/><jats:p>\n<jats:boxed-text content-type=\"box\" position=\"anchor\"><jats:caption><jats:title>Practitioner notes<\/jats:title><\/jats:caption><jats:p>What is already known about this topic\n<jats:list list-type=\"bullet\">\n<jats:list-item><jats:p>Predictive analytics for educational data modelling often exhibit bias against students from certain demographic groups based on sensitive attributes.<\/jats:p><\/jats:list-item>\n<jats:list-item><jats:p>Bias can emerge in training data or models at different time points of the ML modelling pipeline, resulting in unfair final predictions.<\/jats:p><\/jats:list-item>\n<jats:list-item><jats:p>Numerous debiasing approaches have been developed to tackle bias at different stages, including pre\u2010processing training data, in\u2010processing models, and post\u2010processing predicted outcomes or trained models.<\/jats:p><\/jats:list-item>\n<\/jats:list><\/jats:p><jats:p>What this paper adds\n<jats:list list-type=\"bullet\">\n<jats:list-item><jats:p>A systematic evaluation of 28 state\u2010of\u2010the\u2010art debiasing approaches covering multiple sources of biases and multiple stages across two different educational predictive scenarios, identifying leading sources of data biases contributing to predictive unfairness.<\/jats:p><\/jats:list-item>\n<jats:list-item><jats:p>Further enhancing predictive fairness with proposed debiasing strategies considering the multi\u2010source and multi\u2010stage characteristics of biases.<\/jats:p><\/jats:list-item>\n<jats:list-item><jats:p>Revealing potential risks of debiasing focused on a single sensitive attribute.<\/jats:p><\/jats:list-item>\n<\/jats:list><\/jats:p><jats:p>Implications for practitioners\n<jats:list list-type=\"bullet\">\n<jats:list-item><jats:p>Pre\u2010processing approaches, particularly those addressing stereotypical bias, label bias and proxy discrimination, are generally effective for improving fairness in educational predictions. Re\u2010weighing methods are especially useful for smaller datasets to tackle stereotypical bias.<\/jats:p><\/jats:list-item>\n<jats:list-item><jats:p>When dealing with two\u2010phase modelling, biases inherently encoded in the features generated from upstream models might not be effectively addressed by debiasing approaches applied to downstream models.<\/jats:p><\/jats:list-item>\n<jats:list-item><jats:p>Combining debiasing approaches to tackle multiple sources of biases across multiple stages significantly enhances predictive fairness.<\/jats:p><\/jats:list-item>\n<\/jats:list><\/jats:p><\/jats:boxed-text>\n<\/jats:p><\/jats:sec>","DOI":"10.1111\/bjet.13575","type":"journal-article","created":{"date-parts":[[2025,3,25]],"date-time":"2025-03-25T08:18:30Z","timestamp":1742890710000},"page":"2478-2501","update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["When and how biases seep in: Enhancing debiasing approaches for fair educational predictive analytics"],"prefix":"10.1111","volume":"56","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4205-7975","authenticated-orcid":false,"given":"Lin","family":"Li","sequence":"first","affiliation":[{"name":"Centre for Learning Analytics Monash University  Melbourne VIC Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4194-318X","authenticated-orcid":false,"given":"Namrata","family":"Srivastava","sequence":"additional","affiliation":[{"name":"Centre for Learning Analytics Monash University  Melbourne VIC Australia"},{"name":"Penn Center for Learning Analytics University of Pennsylvania  Philadelphia Pennsylvania USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9462-3924","authenticated-orcid":false,"given":"Jia","family":"Rong","sequence":"additional","affiliation":[{"name":"Centre for Learning Analytics Monash University  Melbourne VIC Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6911-3853","authenticated-orcid":false,"given":"Quanlong","family":"Guan","sequence":"additional","affiliation":[{"name":"Department of Computer Science, College of Information Science and Technology Jinan University  Guangzhou Guangdong China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9265-1908","authenticated-orcid":false,"given":"Dragan","family":"Ga\u0161evi\u0107","sequence":"additional","affiliation":[{"name":"Centre for Learning Analytics Monash University  Melbourne VIC Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8236-3133","authenticated-orcid":false,"given":"Guanliang","family":"Chen","sequence":"additional","affiliation":[{"name":"Centre for Learning Analytics Monash University  Melbourne VIC Australia"}]}],"member":"311","published-online":{"date-parts":[[2025,3,24]]},"reference":[{"key":"e_1_2_12_2_1","first-page":"60","volume-title":"Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmassan, Stockholm, Sweden","author":"Agarwal A.","year":"2018"},{"key":"e_1_2_12_3_1","doi-asserted-by":"publisher","DOI":"10.1187\/cbe.19-03-0048"},{"key":"e_1_2_12_4_1","first-page":"38747","article-title":"Beyond adult and compas: Fair multi\u2010class prediction via information projection","volume":"35","author":"Alghamdi W.","year":"2022","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_12_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compedu.2017.11.002"},{"key":"e_1_2_12_6_1","first-page":"1","volume-title":"Emerging trends in the social and behavioral sciences: An interdisciplinary, searchable, and linkable resource","author":"Artelt C.","year":"2015"},{"key":"e_1_2_12_7_1","first-page":"98","article-title":"Ethnicity, gender, and perceptions of online learning in higher education","volume":"8","author":"Ashong C. Y.","year":"2012","journal-title":"MERLOT Journal of Online Learning and Teaching"},{"key":"e_1_2_12_8_1","doi-asserted-by":"publisher","DOI":"10.1007\/s40593-021-00285-9"},{"key":"e_1_2_12_9_1","first-page":"71","volume-title":"International Conference on Artificial Intelligence in Education, Utrecht, The Netherlands","author":"Bayer V.","year":"2021"},{"key":"e_1_2_12_10_1","volume-title":"Women in stem: A gender gap to innovation","author":"Beede D. N.","year":"2011"},{"key":"e_1_2_12_11_1","first-page":"2022","volume-title":"Workshop on Trustworthy and Socially Responsible Machine Learning","author":"BehnamGhader P.","year":"2022"},{"key":"e_1_2_12_12_1","unstructured":"Bird S. Dud\u00edk M. Edgar R. Horn B. Lutz R. Milan V. Sameki M. Wallach H. &Walker K.(2020).Fairlearn: A toolkit for assessing and improving fairness in AI. Microsoft Tech. Rep. MSR\u2010TR\u20102020\u201032."},{"key":"e_1_2_12_13_1","doi-asserted-by":"publisher","DOI":"10.1126\/science.aal4230"},{"key":"e_1_2_12_14_1","doi-asserted-by":"publisher","DOI":"10.1017\/S0047279415000227"},{"key":"e_1_2_12_15_1","unstructured":"Caton S. &Haas C.(2020).Fairness in machine learning: A survey.arXiv preprint arXiv:2010.04053."},{"key":"e_1_2_12_16_1","doi-asserted-by":"publisher","DOI":"10.1613\/jair.953"},{"key":"e_1_2_12_17_1","unstructured":"Clavi\u00e9 B. &Gal K.(2019).Edubert: Pretrained deep language models for learning analytics.arXiv preprint arXiv:1912.00690."},{"key":"e_1_2_12_18_1","doi-asserted-by":"publisher","DOI":"10.1111\/bjet.13217"},{"key":"e_1_2_12_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2024.3361979"},{"key":"e_1_2_12_20_1","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1145\/3287560.3287589","volume-title":"Proceedings of the Conference on Fairness, Accountability, and Transparency, Atlanta, Georgia, USA","author":"Friedler S. A.","year":"2019"},{"key":"e_1_2_12_21_1","volume-title":"Education inequalities at the school starting gate: Gaps, trends, and strategies to address them","author":"Garc\u00eda E.","year":"2017"},{"key":"e_1_2_12_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3303772.3303791"},{"key":"e_1_2_12_23_1","doi-asserted-by":"crossref","first-page":"11335","DOI":"10.18653\/v1\/2022.emnlp-main.779","volume-title":"Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing","author":"Han X.","year":"2022"},{"key":"e_1_2_12_24_1","first-page":"68","volume-title":"Proceedings of the 2nd Conference of the Asia\u2010Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Han X.","year":"2022"},{"key":"e_1_2_12_25_1","first-page":"702","volume-title":"International Conference on Artificial Intelligence and Statistics, Palermo, Italy","author":"Jiang H.","year":"2020"},{"key":"e_1_2_12_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-011-0463-8"},{"key":"e_1_2_12_27_1","doi-asserted-by":"publisher","DOI":"10.4324\/9780429329067-10"},{"key":"e_1_2_12_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3468507.3468518"},{"key":"e_1_2_12_29_1","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1145\/3576050.3576119","volume-title":"LAK23: 13th International Learning Analytics and Knowledge Conference,","author":"Li L.","year":"2023"},{"key":"e_1_2_12_30_1","doi-asserted-by":"crossref","first-page":"608","DOI":"10.1145\/3636555.3636920","volume-title":"Proceedings of the 14th Learning Analytics and Knowledge Conference, Kyoto, Japan","author":"Li L.","year":"2024"},{"key":"e_1_2_12_31_1","first-page":"255","volume-title":"International Conference on Artificial Intelligence in Education, Utrecht, The Netherlands,","author":"Litman D.","year":"2021"},{"key":"e_1_2_12_32_1","doi-asserted-by":"crossref","unstructured":"Liu H. Jin W. Karimi H. Liu Z. &Tang J.(2021).The authors matter: Understanding and mitigating implicit bias in deep text classification.arXiv preprint arXiv:2105.02778.","DOI":"10.18653\/v1\/2021.findings-acl.7"},{"key":"e_1_2_12_33_1","doi-asserted-by":"publisher","DOI":"10.1177\/0004944116664618"},{"key":"e_1_2_12_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3457607"},{"key":"e_1_2_12_35_1","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1211286109"},{"key":"e_1_2_12_36_1","first-page":"i","article-title":"Assistments longitudinal data mining competition special issue: A preface","volume":"12","author":"Patikorn T.","year":"2020","journal-title":"Journal of Educational Data Mining"},{"key":"e_1_2_12_37_1","first-page":"5690","article-title":"On fairness and calibration","volume":"30","author":"Pleiss G.","year":"2017","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_12_38_1","volume-title":"9th International Conference on Learning Representations, Online only","author":"Roh Y.","year":"2021"},{"key":"e_1_2_12_39_1","volume-title":"Failing at fairness: How America's schools cheat girls","author":"Sadker M.","year":"2010"},{"key":"e_1_2_12_40_1","first-page":"1275","volume-title":"Proceedings of the 29th International Conference on Computational Linguistics, Gyeongju, Republic of Korea","author":"Sha L.","year":"2022"},{"key":"e_1_2_12_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3465416.3483305"},{"key":"e_1_2_12_42_1","doi-asserted-by":"crossref","first-page":"1993","DOI":"10.1145\/3531146.3533242","volume-title":"Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea","author":"Tschantz M. C.","year":"2022"},{"key":"e_1_2_12_43_1","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1145\/3506860.3506902","volume-title":"LAK22: 12th International Learning Analytics and Knowledge Conference, Online, USA","author":"Vasquez Verdugo J.","year":"2022"},{"key":"e_1_2_12_44_1","unstructured":"Verger M. Lall\u00e9 S. Bouchet F. &Luengo V.(2023).Is your model \u201cMADD\u201d? a novel metric to evaluate algorithmic fairness for predictive student models.arXiv preprint arXiv:2305.15342."},{"key":"e_1_2_12_45_1","first-page":"5310","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision, Los Alamitos, CA, USA","author":"Wang T.","year":"2019"},{"key":"e_1_2_12_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3340531.3411980"},{"key":"e_1_2_12_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/s40593-019-00175-1"},{"key":"e_1_2_12_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3278721.3278779"},{"key":"e_1_2_12_49_1","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1016\/j.chb.2016.02.073","article-title":"Is there a gender difference in interacting with intelligent tutoring system? Can bayesian knowledge tracing and learning curve analysis models answer this question?","volume":"61","author":"Zhuhadar L.","year":"2016","journal-title":"Computers in Human Behavior"}],"container-title":["British Journal of Educational Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/bera-journals.onlinelibrary.wiley.com\/doi\/pdf\/10.1111\/bjet.13575","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T03:31:23Z","timestamp":1760067083000},"score":1,"resource":{"primary":{"URL":"https:\/\/bera-journals.onlinelibrary.wiley.com\/doi\/10.1111\/bjet.13575"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,24]]},"references-count":48,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,11]]}},"alternative-id":["10.1111\/bjet.13575"],"URL":"https:\/\/doi.org\/10.1111\/bjet.13575","archive":["Portico"],"relation":{},"ISSN":["0007-1013","1467-8535"],"issn-type":[{"value":"0007-1013","type":"print"},{"value":"1467-8535","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,24]]},"assertion":[{"value":"2024-07-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-01-28","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-03-24","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}