{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T00:42:13Z","timestamp":1776732133660,"version":"3.51.2"},"reference-count":62,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T00:00:00Z","timestamp":1753056000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Detecting biased language in large-scale corpora, such as the Wiki Neutrality Corpus, is essential for promoting neutrality in digital content. This study systematically evaluates a range of machine learning (ML) and deep learning (DL) models for the detection of biased and pre-conditioned phrases. Conventional classifiers, including Extreme Gradient Boosting (XGBoost), Light Gradient-Boosting Machine (LightGBM), and Categorical Boosting (CatBoost), are compared with advanced neural architectures such as Bidirectional Encoder Representations from Transformers (BERT), Long Short-Term Memory (LSTM) networks, and Generative Adversarial Networks (GANs). A novel hybrid architecture is proposed, integrating DistilBERT, LSTM, and GANs within a unified framework. Extensive experimentation with intermediate variants DistilBERT + LSTM (without GAN) and DistilBERT + GAN (without LSTM) demonstrates that the fully integrated model consistently outperforms all alternatives. The proposed hybrid model achieves a cross-validation accuracy of 99.00%, significantly surpassing traditional baselines such as XGBoost (96.73%) and LightGBM (96.83%). It also exhibits superior stability, statistical significance (paired t-tests), and favorable trade-offs between performance and computational efficiency. 
The results underscore the potential of hybrid deep learning models for capturing subtle linguistic bias and advancing more objective and reliable automated content moderation systems.<\/jats:p>","DOI":"10.3390\/bdcc9070190","type":"journal-article","created":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T13:59:11Z","timestamp":1753106351000},"page":"190","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Detection of Biased Phrases in the Wiki Neutrality Corpus for Fairer Digital Content Management Using Artificial Intelligence"],"prefix":"10.3390","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7983-2189","authenticated-orcid":false,"family":"Abdullah","sequence":"first","affiliation":[{"name":"Centro de Investigaci\u00f3n en Computaci\u00f3n, Instituto Polit\u00e9cnico Nacional, Mexico City 07738, Mexico"},{"name":"Department of Computer Science, Bahria University Lahore Campus, Lahore 54600, Pakistan"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-5397-6768","authenticated-orcid":false,"given":"Muhammad","family":"Ateeb Ather","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Bahria University Lahore Campus, Lahore 54600, Pakistan"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1307-1647","authenticated-orcid":false,"given":"Olga","family":"Kolesnikova","sequence":"additional","affiliation":[{"name":"Centro de Investigaci\u00f3n en Computaci\u00f3n, Instituto Polit\u00e9cnico Nacional, Mexico City 07738, Mexico"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3901-3522","authenticated-orcid":false,"given":"Grigori","family":"Sidorov","sequence":"additional","affiliation":[{"name":"Centro de Investigaci\u00f3n en Computaci\u00f3n, Instituto Polit\u00e9cnico Nacional, Mexico City 07738, Mexico"}]}],"member":"1968","published-online":{"date-parts":[[2025,7,21]]},"reference":[{"key":"ref_1","first-page":"27","article-title":"Political bias in media: An 
NLP perspective","volume":"2","author":"Luo","year":"2025","journal-title":"Comput. Soc. Sci. Rev."},{"key":"ref_2","first-page":"210","article-title":"Investigating the effect of media bias in social perception: A computational analysis","volume":"9","author":"Raza","year":"2022","journal-title":"Int. J. Soc. Inform."},{"key":"ref_3","first-page":"801","article-title":"Detecting and Mitigating Bias in Textual Data","volume":"31","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_4","first-page":"5","article-title":"Fairness in algorithmic text analysis: Challenges and implications","volume":"3","author":"Berendt","year":"2023","journal-title":"AI Ethics"},{"key":"ref_5","first-page":"301","article-title":"Ethical content moderation on social platforms: Balancing fairness and automation","volume":"34","author":"Cai","year":"2024","journal-title":"Inf. Syst. J."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Hu, B., Sheng, Q., Cao, J., Li, Y., and Wang, D. (2025). Llm-generated fake news induces truth decay in news ecosystem: A case study on neural news recommendation. arXiv.","DOI":"10.1145\/3726302.3730027"},{"key":"ref_7","first-page":"2057","article-title":"Comparison of Approaches for Querying Formal Ontologies via Natural Language","volume":"28","author":"Capus","year":"2024","journal-title":"Computaci\u00f3n y Sistemas"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1186\/s13071-024-06618-6","article-title":"Decision tree-based learning and laboratory data mining: An efficient approach to amebiasis testing","volume":"18","author":"Tarawneh","year":"2025","journal-title":"Parasites Vectors"},{"key":"ref_9","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 
Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10462-025-11162-5","article-title":"BERT applications in natural language processing: A review","volume":"58","author":"Gardazi","year":"2025","journal-title":"Artif. Intell. Rev."},{"key":"ref_11","unstructured":"Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2020). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv."},{"key":"ref_12","first-page":"123","article-title":"Interpretable attention mechanisms for text bias classification","volume":"12","author":"Montanez","year":"2024","journal-title":"Trans. ACL"},{"key":"ref_13","first-page":"45","article-title":"Meta-linguistic signals for bias detection in political news","volume":"58","year":"2024","journal-title":"Lang. Resour. Eval."},{"key":"ref_14","unstructured":"Lee, H., Park, J., and Kim, S. (2022, January 10\u201315). Framing Bias Detection with Hybrid CNN-LSTM Architectures. Proceedings of the NAACL 2022, Seattle, WA, USA."},{"key":"ref_15","unstructured":"Wang, X., and Zhou, K. (2023, January 6\u201310). Domain-Robust Bias Detection Using Attention-Augmented Transformers. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), Honolulu, HI, USA."},{"key":"ref_16","first-page":"1655","article-title":"An Extension of the ELECTRE III Method based on the 2-tuple Linguistic Representation Model for Dealing with Heterogeneous Information","volume":"28","year":"2024","journal-title":"Comput. Sist."},{"key":"ref_17","unstructured":"Chen, X., Li, W., and Tang, Y. (2023, January 9\u201314). Seq2Seq Rewriting for Bias Reduction: Fluency vs. Neutrality. Proceedings of the Findings of ACL 2023, Taipei, Taiwan."},{"key":"ref_18","unstructured":"Smith, J., and Doe, A. (2023, January 9\u201314). Evaluating Annotation Noise in the Wiki Neutrality Corpus. 
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), Toronto, ON, Canada."},{"key":"ref_19","first-page":"75","article-title":"Cross-Lingual Perspectives on Bias Detection: A Survey of Non-English Corpora","volume":"50","author":"Patel","year":"2024","journal-title":"Comput. Linguist."},{"key":"ref_20","unstructured":"Jones, E., and Kumar, S. (2024, January 12\u201316). On the Inflation of BLEU-Based Metrics for Text Rewriting Tasks. Proceedings of the Findings of EMNLP 2024, Singapore."},{"key":"ref_21","unstructured":"Lee, S., and Nguyen, M. (2023, January 11\u201314). The Impact of Subtle Textual Bias on Critical Thinking in Education. Proceedings of the 2023 International Conference on Educational Data Mining, Berlin, Germany."},{"key":"ref_22","first-page":"104","article-title":"Media Literacy and Bias Awareness: Educational Implications of NLP Tools","volume":"158","author":"Ahmed","year":"2024","journal-title":"Comput. Educ."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"e70172","DOI":"10.1111\/cts.70172","article-title":"A tutorial and use case example of the eXtreme gradient boosting (XGBoost) artificial intelligence algorithm for drug development applications","volume":"18","author":"Wiens","year":"2025","journal-title":"Clin. Transl. Sci."},{"key":"ref_24","first-page":"3146","article-title":"LightGBM: A Highly Efficient Gradient Boosting Decision Tree","volume":"30","author":"Ke","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_25","unstructured":"Dorogush, A.V., Ershov, V., and Gulin, A. (2018, January 3\u20138). CatBoost: Gradient Boosting with Categorical Features Support. 
Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s12040-025-02567-5","article-title":"Enhancing daily streamflow prediction: A comparative analysis of univariate LSTM and N-BEATS models with coupled SWAT-LSTM and SWAT-N-BEATS models incorporating influential SWAT features","volume":"134","author":"Priya","year":"2025","journal-title":"J. Earth Syst. Sci."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Chen, P., Wang, S., Wang, C., Wang, S., Huang, B., Huang, L., and Zang, Z. (2025). A GAN-Enhanced Deep Learning Framework for Rooftop Detection from Historical Aerial Imagery. arXiv.","DOI":"10.1080\/01431161.2025.2534994"},{"key":"ref_28","first-page":"256","article-title":"From simple detection to quality-aware prediction: Exploring argument complexity with machine learning","volume":"45","author":"Bartyrshin","year":"2025","journal-title":"J. Mach. Learn. Res."},{"key":"ref_29","unstructured":"UNESCO (2025, February 24). Education and Media Literacy in the Age of AI: A Global Perspective. Available online: https:\/\/unesco.org\/media-literacy-ai-report-2022."},{"key":"ref_30","unstructured":"Garc\u00eda, M., and Silva, R. (2024, January 3\u20136). Bridging Ethical AI and Real-World Applications: Lessons from Misinformation Mitigation. Proceedings of the 2024 Conference on Fairness, Accountability, and Transparency (FAccT), Barcelona, Spain."},{"key":"ref_31","unstructured":"Basar, E., Sun, X., Hendrickx, I., de Wit, J., Bosse, T., De Bruijn, G.J., Bosch, J.A., and Krahmer, E. (2025, January 19\u201324). How Well Can Large Language Models Reflect? A Human Evaluation of LLM-generated Reflections for Motivational Interviewing Dialogues. 
Proceedings of the 31st International Conference on Computational Linguistics, Abu Dhabi, United Arab Emirates."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Balloccu, S., Schmidtov\u00e1, P., Lango, M., and Dusek, O. (2024, January 17\u201322). Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), St. Julian\u2019s, Malta.","DOI":"10.18653\/v1\/2024.eacl-long.5"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"El-Ghawi, Y., Marzouk, A., and Khamis, A. (2024, January 16). LexiconLadies at FIGNEWS 2024 Shared Task: Identifying Keywords for Bias Annotation Guidelines of Facebook News Headlines on the Israel-Palestine 2023 War. Proceedings of the Second Arabic Natural Language Processing Conference, Bangkok, Thailand.","DOI":"10.18653\/v1\/2024.arabicnlp-1.59"},{"key":"ref_34","first-page":"45","article-title":"Leveraging Large Language Models for Stereotypical Bias Detection and Explanation","volume":"22","author":"Liu","year":"2025","journal-title":"J. AI Ethics"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Blqees, B., Wardi, A., Al-Sibani, M., Al-Siyabi, H., and Zidjaly, N. (2024, January 16). BiasGanda at FIGNEWS 2024 Shared Task: A Quest to Uncover Biased Views in News Coverage. Proceedings of the Second Arabic Natural Language Processing Conference, Bangkok, Thailand.","DOI":"10.18653\/v1\/2024.arabicnlp-1.65"},{"key":"ref_36","first-page":"78","article-title":"Reducing Cognitive Biases in Security and Risk Decision-Making through AI-Driven Information Systems","volume":"30","author":"Dragomir","year":"2025","journal-title":"J. Secur. Risk Assess."},{"key":"ref_37","first-page":"200","article-title":"Automating Linguistic Intergroup Bias Detection using NLP Techniques","volume":"29","author":"Collins","year":"2025","journal-title":"J. Comput. 
Linguist."},{"key":"ref_38","first-page":"130","article-title":"A Multi-Bias Detection Framework for News Article Analysis","volume":"41","author":"Shah","year":"2025","journal-title":"J. Comput. Linguist."},{"key":"ref_39","unstructured":"Li, Z., Wang, Y., and Zhao, T. (2024, January 12\u201316). Bias Evaluation and Mitigation in GPT-3 and GPT-4. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), Miami, FL, USA."},{"key":"ref_40","first-page":"1877","article-title":"Language Models are Few-Shot Learners","volume":"33","author":"Brown","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_41","unstructured":"Li, X., Rong, H., and Wu, F. (2024, January 20\u201327). A Comparative Study of Generative LLMs: Claude, Gemini, Qwen and Beyond. Proceedings of the 2024 AAAI Conference on Artificial Intelligence, New York, NY, USA."},{"key":"ref_42","first-page":"101","article-title":"Bias Propagation in Large-Scale Transformer Systems","volume":"72","author":"Zhang","year":"2025","journal-title":"J. Artif. Intell. Res."},{"key":"ref_43","first-page":"89","article-title":"Mitigating Subtle Bias in Language Models Through Adversarial Training","volume":"37","author":"Smith","year":"2025","journal-title":"ACM Trans. Artif. Intell."},{"key":"ref_44","unstructured":"Universitat Oberta de Catalunya (UOC), and University of Luxembourg (2024, January 3\u20136). LangBiTe: An Open-Source Tool for Bias Detection in AI-Generated Content. Proceedings of the International Conference on Fairness and Transparency in AI, Rio de Janeiro, Brazil."},{"key":"ref_45","unstructured":"Johnson, R., Patel, A., and Green, S. (2025). Analyzing Gender Bias in Film Reviews: A Textual Perspective. PLoS ONE, 20."},{"key":"ref_46","first-page":"55","article-title":"Gender Bias and Victim-Blaming in Judicial Language: An Ethical Perspective","volume":"18","author":"AI","year":"2024","journal-title":"J. Soc. 
Justice AI Ethics"},{"key":"ref_47","unstructured":"Powers, M., Mavani, U., Jonala, H.R., Tiwari, A., and Wei, H. (2024). GUS-Net: Social Bias Classification in Text with Generalizations, Unfairness, and Stereotypes. arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Xie, B., Davidson, M.J., Franke, B., McLeod, E., Li, M., and Ko, A.J. (2021, January 22\u201325). Domain Experts\u2019 Interpretations of Assessment Bias in a Scaled, Online Computer Science Curriculum. Proceedings of the Eighth ACM Conference on Learning@ Scale, Virtual.","DOI":"10.1145\/3430895.3460141"},{"key":"ref_49","unstructured":"Duh, K., Gomez, H., and Bethard, S. (2024). Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Mexico City, Mexico, 16\u201321 June 2024, Association for Computational Linguistics."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"121542","DOI":"10.1016\/j.eswa.2023.121542","article-title":"Nbias: A Natural Language Processing Framework for BIAS Identification in Text","volume":"237","author":"Raza","year":"2024","journal-title":"Expert Syst. Appl."},{"key":"ref_51","unstructured":"Jourdan, F. (2024). Advancing Fairness in Natural Language Processing: From Traditional Methods to Explainability. arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Pryzant, R., Martinez, R.D., Dass, N., Kurohashi, S., Jurafsky, D., and Yang, D. (2019). Automatically Neutralizing Subjective Bias in Text. 
arXiv.","DOI":"10.1609\/aaai.v34i01.5385"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"7107","DOI":"10.1109\/ACCESS.2020.3043221","article-title":"Ensembling Classical Machine Learning and Deep Learning Approaches for Morbidity Identification From Clinical Notes","volume":"9","author":"Kumar","year":"2021","journal-title":"IEEE Access"},{"key":"ref_54","first-page":"109","article-title":"The Impact of Training Methods on the Development of Pre-trained Language Models","volume":"28","author":"Uribe","year":"2024","journal-title":"Comput. Sist."},{"key":"ref_55","unstructured":"Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv."},{"key":"ref_56","unstructured":"OpenAI (2025, March 20). GPT-3.5 Turbo. Available online: https:\/\/platform.openai.com\/docs\/models\/gpt-3-5."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Zhou, X., Zhang, X., Tao, C., Chen, J., Xu, B., Wang, W., and Xiao, J. (2021, January 6\u201311). Multi-Grained Knowledge Distillation for Named Entity Recognition. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online.","DOI":"10.18653\/v1\/2021.naacl-main.454"},{"key":"ref_58","first-page":"112","article-title":"Bias Detection in Text Using Transformer-based Models","volume":"28","author":"Zhang","year":"2021","journal-title":"J. Nat. Lang. Process."},{"key":"ref_59","first-page":"45","article-title":"Hybrid BERT and CNN Model for Textual Bias Detection","volume":"37","author":"Lee","year":"2023","journal-title":"AI J. Text Process."},{"key":"ref_60","unstructured":"Liang, H., Zheng, F., and Liu, J. (2020, January 13\u201318). Adversarial Learning for Bias Detection in Textual Data Using GANs. 
Proceedings of the International Conference on Machine Learning, Online."},{"key":"ref_61","first-page":"4002","article-title":"Enhancing BiLSTM with Attention Mechanism for Bias Detection in Large Text Corpora","volume":"33","author":"Swayamdipta","year":"2020","journal-title":"Trans. Neural Netw."},{"key":"ref_62","first-page":"815","article-title":"A Multi-Task Learning Approach to Detecting Bias in Social Media Texts","volume":"48","author":"Tan","year":"2022","journal-title":"Comput. Linguist."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/7\/190\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:13:23Z","timestamp":1760033603000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/9\/7\/190"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,21]]},"references-count":62,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2025,7]]}},"alternative-id":["bdcc9070190"],"URL":"https:\/\/doi.org\/10.3390\/bdcc9070190","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,21]]}}}