{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T12:11:30Z","timestamp":1772885490799,"version":"3.50.1"},"reference-count":67,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2024,12,17]],"date-time":"2024-12-17T00:00:00Z","timestamp":1734393600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,12,17]],"date-time":"2024-12-17T00:00:00Z","timestamp":1734393600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior","doi-asserted-by":"publisher","award":["Code 001"],"award-info":[{"award-number":["Code 001"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002322","name":"Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00eevel Superior","doi-asserted-by":"publisher","award":["88887.484211\/2020-00"],"award-info":[{"award-number":["88887.484211\/2020-00"]}],"id":[{"id":"10.13039\/501100002322","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput &amp; Applic"],"published-print":{"date-parts":[[2025,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Hate speech is a growing problem on social media due to the larger volume of content being shared. Recent works demonstrated the usefulness of distinct machine learning algorithms combined with natural language processing techniques to detect hateful content. 
However, when not constructed with the necessary care, learning models can magnify discriminatory behaviour and incorrectly associate comments containing specific identity terms (e.g., woman, black, and gay) with a particular class, such as hate speech. Moreover, some specific characteristics should be considered in the test set when evaluating the presence of bias, since the test set can follow the same biased distribution as the training set and compromise the results obtained by the bias metrics. This work argues that potential bias in hate speech detection must be considered and focuses on developing an intelligent system to address these limitations. Firstly, we propose a comprehensive, <jats:bold>unbiased dataset<\/jats:bold> for unintended gender bias evaluation. Secondly, we propose a framework to help analyse bias from feature extraction techniques. Then, we evaluate several state-of-the-art feature extraction techniques, specifically focusing on the bias towards identity terms. We consider six feature extraction techniques, namely TF, TF-IDF, FastText, GloVe, BERT, and RoBERTa, and six classifiers, namely LR, DT, SVM, XGB, MLP, and RF. The experimental study across hate speech datasets and a range of classification and unintended bias metrics demonstrates that the choice of the feature extraction technique can impact the bias on predictions, and its effectiveness can depend on the dataset analysed. For instance, combining TF and TF-IDF with DT and MLP resulted in higher bias, while BERT and RoBERTa showed lower bias with the same classifiers for the HE and WH datasets. 
The proposed dataset and source code will be publicly available when the paper is published.<\/jats:p>","DOI":"10.1007\/s00521-024-10841-8","type":"journal-article","created":{"date-parts":[[2024,12,17]],"date-time":"2024-12-17T16:25:31Z","timestamp":1734452731000},"page":"3887-3905","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Gender bias detection on hate speech classification: an analysis at feature-level"],"prefix":"10.1007","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7683-0196","authenticated-orcid":false,"given":"Francimaria R. S.","family":"Nascimento","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7714-2283","authenticated-orcid":false,"given":"George D. C.","family":"Cavalcanti","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7461-7570","authenticated-orcid":false,"given":"Marjory Da","family":"Costa-Abreu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,12,17]]},"reference":[{"issue":"4","key":"10841_CR1","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3232676","volume":"51","author":"P Fortuna","year":"2018","unstructured":"Fortuna P, Nunes S (2018) A survey on automatic detection of hate speech in text. ACM Comput Surv 51(4):1\u201330","journal-title":"ACM Comput Surv"},{"issue":"6","key":"10841_CR2","doi-asserted-by":"publisher","first-page":"6995","DOI":"10.3233\/JIFS-212872","volume":"43","author":"F Balouchzahi","year":"2022","unstructured":"Balouchzahi F, Shashirekha HL, Sidorov G, Gelbukh A (2022) A comparative study of syllables and character level n-grams for dravidian multi-script and code-mixed offensive language identification. 
J Intell Fuzzy Syst 43(6):6995\u20137005","journal-title":"J Intell Fuzzy Syst"},{"key":"10841_CR3","doi-asserted-by":"publisher","first-page":"100194","DOI":"10.1016\/j.osnem.2021.100194","volume":"28","author":"R.M.O. Cruz","year":"2022","unstructured":"Cruz R.M.O., de Sousa W.V., Cavalcanti G.D.C. (2022) Selecting and combining complementary feature representations and classifiers for hate speech detection. Online Soc Netw Med 28:100194. https:\/\/doi.org\/10.1016\/j.osnem.2021.100194","journal-title":"Online Soc Netw Med"},{"key":"10841_CR4","doi-asserted-by":"publisher","first-page":"106458","DOI":"10.1016\/j.knosys.2020.106458","volume":"210","author":"P Kapil","year":"2020","unstructured":"Kapil P, Ekbal A (2020) A deep neural network based multi-task learning approach to hate speech detection. Knowl-Based Syst 210:106458","journal-title":"Knowl-Based Syst"},{"issue":"1","key":"10841_CR5","first-page":"1","volume":"10","author":"J Salminen","year":"2020","unstructured":"Salminen J, Hopf M, Chowdhury SA, Jung S-G, Almerekhi H, Jansen BJ (2020) Developing an online hate classifier for multiple social media platforms. HCIS 10(1):1","journal-title":"HCIS"},{"key":"10841_CR6","doi-asserted-by":"publisher","first-page":"598","DOI":"10.1016\/j.neucom.2021.11.053","volume":"488","author":"A Sengupta","year":"2022","unstructured":"Sengupta A, Bhattacharjee SK, Akhtar MS, Chakraborty T (2022) Does aggression lead to hate? detecting and reasoning offensive traits in hinglish code-mixed texts. Neurocomputing 488:598\u2013617. https:\/\/doi.org\/10.1016\/j.neucom.2021.11.053","journal-title":"Neurocomputing"},{"key":"10841_CR7","doi-asserted-by":"publisher","first-page":"100205","DOI":"10.1016\/j.osnem.2022.100205","volume":"29","author":"Z Zhao","year":"2022","unstructured":"Zhao Z, Zhang Z, Hopfgartner F (2022) Utilizing subjectivity level to mitigate identity term bias in toxic comments classification. 
Online Soci Netw Med 29:100205","journal-title":"Online Soci Netw Med"},{"key":"10841_CR8","doi-asserted-by":"publisher","unstructured":"Dixon L, Li J, Sorensen J, Thain N, Vasserman L (2018) Measuring and mitigating unintended bias in text classification. In: Proceedings of the 2018 AAAI\/ACM Conference on AI, Ethics, and Society. AIES \u201918, pp. 67\u201373. ACM, New York, NY, USA . https:\/\/doi.org\/10.1145\/3278721.3278729","DOI":"10.1145\/3278721.3278729"},{"key":"10841_CR9","doi-asserted-by":"publisher","first-page":"117032","DOI":"10.1016\/j.eswa.2022.117032","volume":"201","author":"FRS Nascimento","year":"2022","unstructured":"Nascimento FRS, Cavalcanti GDC, Costa-Abreu MD (2022) Unintended bias evaluation: an analysis of hate speech detection and gender bias mitigation on social media using ensemble learning. Exp Syst Appl 201:117032. https:\/\/doi.org\/10.1016\/j.eswa.2022.117032","journal-title":"Exp Syst Appl"},{"key":"10841_CR10","doi-asserted-by":"crossref","unstructured":"Badjatiya P, Gupta M, Varma V (2019) Stereotypical bias removal for hate speech detection task using knowledge-based generalizations. In: The World Wide Web Conference, pp. 49\u201359. ACM, New York, NY, USA","DOI":"10.1145\/3308558.3313504"},{"key":"10841_CR11","doi-asserted-by":"publisher","first-page":"126232","DOI":"10.1016\/j.neucom.2023.126232","volume":"546","author":"MS Jahan","year":"2023","unstructured":"Jahan MS, Oussalah M (2023) A systematic review of hate speech automatic detection using natural language processing. Neurocomputing 546:126232. https:\/\/doi.org\/10.1016\/j.neucom.2023.126232","journal-title":"Neurocomputing"},{"issue":"8","key":"10841_CR12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pone.0237861","volume":"15","author":"M Mozafari","year":"2020","unstructured":"Mozafari M, Farahbakhsh R, Crespi N (2020) Hate speech detection and racial bias mitigation in social media based on bert model. 
PLoS ONE 15(8):1\u201326","journal-title":"PLoS ONE"},{"key":"10841_CR13","doi-asserted-by":"crossref","unstructured":"Sap M, Card D, Gabriel S, Choi Y, Smith NA (2019) The risk of racial bias in hate speech detection. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1668\u20131678","DOI":"10.18653\/v1\/P19-1163"},{"key":"10841_CR14","doi-asserted-by":"crossref","unstructured":"Park JH, Shin J, Fung P (2018) Reducing gender bias in abusive language detection. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2799\u20132804. ACL, Brussels, Belgium","DOI":"10.18653\/v1\/D18-1302"},{"key":"10841_CR15","doi-asserted-by":"publisher","unstructured":"Lee MSA, Singh J (2021) Risk identification questionnaire for detecting unintended bias in the machine learning development lifecycle. In: Proceedings of the 2021 AAAI\/ACM Conference on AI, Ethics, and Society. AIES \u201921, pp. 704\u2013714. ACM, New York, NY, USA . https:\/\/doi.org\/10.1145\/3461702.3462572","DOI":"10.1145\/3461702.3462572"},{"issue":"2","key":"10841_CR16","doi-asserted-by":"publisher","first-page":"215824402311813","DOI":"10.1177\/21582440231181311","volume":"13","author":"FRS Nascimento","year":"2023","unstructured":"Nascimento FRS, Cavalcanti GDC, Costa-Abreu MD (2023) Exploring automatic hate speech detection on social media: A focus on content-based analysis. SAGE Open 13(2):21582440231181310.https:\/\/doi.org\/10.1177\/21582440231181311","journal-title":"SAGE Open"},{"key":"10841_CR17","doi-asserted-by":"crossref","unstructured":"Senarath Y, Purohit H (2020) Evaluating semantic feature representations to efficiently detect hate intent on social media. In: 2020 IEEE 14th International Conference on Semantic Computing, pp. 199\u2013202. 
IEEE, San Diego, CA, USA","DOI":"10.1109\/ICSC.2020.00041"},{"key":"10841_CR18","doi-asserted-by":"crossref","unstructured":"Nobata C, Tetreault J, Thomas A, Mehdad Y, Chang Y (2016) Abusive language detection in online user content. In: Proceedings of the 25th International Conference on World Wide Web, pp. 145\u2013153. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE","DOI":"10.1145\/2872427.2883062"},{"key":"10841_CR19","doi-asserted-by":"crossref","unstructured":"Cao R, Lee RK-W, Hoang T-A (2020) Deephate: Hate speech detection via multi-faceted text representations. 12th ACM Conference on Web Science. WebSci \u201920. ACM, New York, NY, USA, pp 11\u201320","DOI":"10.1145\/3394231.3397890"},{"key":"10841_CR20","doi-asserted-by":"crossref","unstructured":"Founta AM, Chatzakou D, Kourtellis N, Blackburn J, Vakali A, Leontiadis I (2019) A unified deep learning architecture for abuse detection. In: Proceedings of the 10th ACM Conference on Web Science, pp. 105\u2013114. ACM, New York, NY, USA","DOI":"10.1145\/3292522.3326028"},{"key":"10841_CR21","unstructured":"Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, et al (2016) Google\u2019s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144"},{"issue":"1","key":"10841_CR22","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1007\/s10660-022-09630-z","volume":"23","author":"AL Karn","year":"2023","unstructured":"Karn AL, Karna RK, Kondamudi BR, Bagale G, Pustokhin DA, Pustokhina IV, Sengan S (2023) Customer centric hybrid recommendation system for e-commerce applications by integrating hybrid sentiment analysis. 
Electron Commer Res 23(1):279\u2013314","journal-title":"Electron Commer Res"},{"key":"10841_CR23","doi-asserted-by":"publisher","unstructured":"Sun T, Gaut A, Tang S, Huang Y, ElSherief M, Zhao J, Mirza D, Belding E, Chang K-W, Wang WY (2019) Mitigating gender bias in natural language processing: Literature review. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 1630\u20131640. ACL, Florence, Italy . https:\/\/doi.org\/10.18653\/v1\/P19-1159","DOI":"10.18653\/v1\/P19-1159"},{"key":"10841_CR24","first-page":"296","volume-title":"Ethics of data and analytics","author":"J Dastin","year":"2018","unstructured":"Dastin J (2018) Amazon scraps secret ai recruiting tool that showed bias against women. Ethics of data and analytics. Auerbach Publications, San Francisco, USA, pp 296\u2013299"},{"key":"10841_CR25","doi-asserted-by":"publisher","unstructured":"Deshpande KV, Pan S, Foulds JR (2020) Mitigating demographic bias in ai-based resume filtering. In: Adjunct Publication of the 28th ACM Conference on User Modeling, Adaptation and Personalization, pp. 268\u2013275. ACM, New York, NY, USA . https:\/\/doi.org\/10.1145\/3386392.3399569","DOI":"10.1145\/3386392.3399569"},{"key":"10841_CR26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10586-022-03956-x","volume":"27","author":"AC Mazari","year":"2023","unstructured":"Mazari AC, Boudoukhani N, Djeffal A (2023) Bert-based ensemble learning for multi-aspect hate speech detection. Cluster Comput 27:1\u201315. https:\/\/doi.org\/10.1007\/s10586-022-03956-x","journal-title":"Cluster Comput"},{"key":"10841_CR27","doi-asserted-by":"publisher","unstructured":"Indurthi V, Syed B, Shrivastava M, Chakravartula N, Gupta M, Varma V (2019) FERMI at SemEval-2019 task 5: Using sentence embeddings to identify hate speech against immigrants and women in Twitter. In: Proceedings of the 13th International Workshop on Semantic Evaluation, pp. 70\u201374. 
Association for Computational Linguistics, Minneapolis, Minnesota, USA . https:\/\/doi.org\/10.18653\/v1\/S19-2009 . https:\/\/aclanthology.org\/S19-2009","DOI":"10.18653\/v1\/S19-2009"},{"key":"10841_CR28","doi-asserted-by":"publisher","first-page":"121115","DOI":"10.1016\/j.eswa.2023.121115","volume":"235","author":"AA Firmino","year":"2024","unstructured":"Firmino AA, Souza Baptista C, Paiva AC (2024) Improving hate speech detection using cross-lingual learning. Expert Syst Appl 235:121115","journal-title":"Expert Syst Appl"},{"key":"10841_CR29","doi-asserted-by":"publisher","first-page":"300","DOI":"10.1162\/tacl_a_00550","volume":"11","author":"AM Davani","year":"2023","unstructured":"Davani AM, Atari M, Kennedy B, Dehghani M (2023) Hate speech classifiers learn normative social stereotypes. Trans Assoc Comput Linguist 11:300\u2013319","journal-title":"Trans Assoc Comput Linguist"},{"issue":"13s","key":"10841_CR30","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3580494","volume":"55","author":"T Garg","year":"2023","unstructured":"Garg T, Masud S, Suresh T, Chakraborty T (2023) Handling bias in toxic speech detection: a survey. ACM Comput Surv 55(13s):1\u201332","journal-title":"ACM Comput Surv"},{"key":"10841_CR31","doi-asserted-by":"crossref","unstructured":"\u015eahinu\u00e7 F, Yilmaz EH, Toraman C, Ko\u00e7 A (2022) The effect of gender bias on hate speech detection. Signal, Image and Video Processing, 1\u20137","DOI":"10.1109\/SIU53274.2021.9477781"},{"key":"10841_CR32","doi-asserted-by":"publisher","first-page":"105149","DOI":"10.1016\/j.cities.2024.105149","volume":"152","author":"K Shen","year":"2024","unstructured":"Shen K, Ding L, Kong L, Liu X (2024) From physical space to cyberspace: recessive gender biases in social media mirror the real world. Cities 152:105149","journal-title":"Cities"},{"key":"10841_CR33","doi-asserted-by":"crossref","unstructured":"Waseem Z, Hovy D (2016) Hateful symbols or hateful people? 
predictive features for hate speech detection on twitter. In: Proceedings of the NAACL Student Research Workshop, pp. 88\u201393. ACL, San Diego, California","DOI":"10.18653\/v1\/N16-2013"},{"key":"10841_CR34","doi-asserted-by":"crossref","unstructured":"Founta AM, Djouvas C, Chatzakou D, Leontiadis I, Blackburn J, Stringhini G, Vakali A, Sirivianos M, Kourtellis N (2018) Large scale crowdsourcing and characterization of twitter abusive behavior. In: Twelfth International AAAI Conference on Web and Social Media, pp. 491\u2013500. AAAI Press, California, USA","DOI":"10.1609\/icwsm.v12i1.14991"},{"key":"10841_CR35","doi-asserted-by":"crossref","unstructured":"Basile V, Bosco C, Fersini E, Nozza D, Patti V, Pardo FMR, Rosso P, Sanguinetti M (2019) Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter. In: Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, Minnesota, USA, pp. 54\u201363 . Association for Computational Linguistics","DOI":"10.18653\/v1\/S19-2007"},{"key":"10841_CR36","doi-asserted-by":"crossref","unstructured":"Salminen J, Almerekhi H, Milenkovic M, Jung S-g, An J, Kwak H, Jansen BJ (2018) Anatomy of online hate: Developing a taxonomy and machine learning models for identifying and classifying hate in online news media. In: Proceedings of the International AAAI Conference on Web and Social Media, California, USA, pp. 330\u2013339","DOI":"10.1609\/icwsm.v12i1.15028"},{"key":"10841_CR37","doi-asserted-by":"publisher","unstructured":"Almerekhi, H., Kwak, H., Jansen, B.J., Salminen, J.: Detecting toxicity triggers in online discussions. In: Proceedings of the 30th ACM Conference on Hypertext and Social Media, pp. 291\u2013292. ACM, New York, NY, USA (2019). https:\/\/doi.org\/10.1145\/3342220.3344933","DOI":"10.1145\/3342220.3344933"},{"key":"10841_CR38","doi-asserted-by":"publisher","unstructured":"Wulczyn E, Thain N, Dixon L (2017) Ex machina: Personal attacks seen at scale. 
In: Proceedings of the 26th International Conference on World Wide Web, pp. 1391\u20131399. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE . https:\/\/doi.org\/10.1145\/3038912.3052591","DOI":"10.1145\/3038912.3052591"},{"key":"10841_CR39","doi-asserted-by":"crossref","unstructured":"Davidson T, Warmsley D, Macy M, Weber I (2017) Automated hate speech detection and the problem of offensive language. In: Eleventh International AAAI Conference on Web and Social Media. AAAI Press, Montreal, Canada","DOI":"10.1609\/icwsm.v11i1.14955"},{"key":"10841_CR40","doi-asserted-by":"crossref","unstructured":"Zampieri M, Malmasi S, Nakov P, Rosenthal S, Farra N, Kumar R (2019) Predicting the type and target of offensive posts in social media. In: Proceedings of NAACL-HLT, pp. 1415\u20131420. ACL, Minneapolis, Minnesota. https:\/\/doi.org\/10.18653\/v1\/N19-1144","DOI":"10.18653\/v1\/N19-1144"},{"key":"10841_CR41","doi-asserted-by":"publisher","unstructured":"Golbeck J, Ashktorab Z, Banjo RO, Berlinger A, Bhagwan S, Buntain C, Cheakalos P, Geller AA, Gergory Q, Gnanasekaran RK (2017) et al.: A large labeled corpus for online harassment research. In: Proceedings of the 2017 ACM on Web Science Conference, pp. 229\u2013233. ACM, New York, NY, USA . https:\/\/doi.org\/10.1145\/3091478.3091509","DOI":"10.1145\/3091478.3091509"},{"key":"10841_CR42","doi-asserted-by":"publisher","unstructured":"Gibert O, Perez N, Garc\u0131a-Pablos A, Cuadros M (2018) Hate speech dataset from a white supremacy forum. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2), pp. 11\u201320. ACL, Brussels, Belgium. https:\/\/doi.org\/10.18653\/v1\/W18-5102","DOI":"10.18653\/v1\/W18-5102"},{"key":"10841_CR43","doi-asserted-by":"crossref","unstructured":"Waseem Z (2016) Are you a racist or am i seeing things? annotator influence on hate speech detection on twitter. In: Proceedings of the First Workshop on NLP and Computational Social Science, pp. 
138\u2013142. ACL, Austin, Texas","DOI":"10.18653\/v1\/W16-5618"},{"key":"10841_CR44","unstructured":"Toraman C, \u015eahinu\u00e7 F, Yilmaz E (2022) Large-scale hate speech detection with cross-domain transfer. In: Proceedings of the Thirteenth Language Resources and Evaluation Conference, pp. 2215\u20132225. European Language Resources Association, Marseille, France . https:\/\/aclanthology.org\/2022.lrec-1.238"},{"key":"10841_CR45","doi-asserted-by":"publisher","unstructured":"Almatarneh S, Gamallo P, Pena FJR, Alexeev A(2019) Supervised classifiers to identify hate speech on english and spanish tweets. In: International Conference on Asian Digital Libraries, pp. 23\u201330. Springer, Berlin, Heidelberg . https:\/\/doi.org\/10.1007\/978-3-030-34058-2_3","DOI":"10.1007\/978-3-030-34058-2_3"},{"issue":"2","key":"10841_CR46","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3377323","volume":"20","author":"M Corazza","year":"2020","unstructured":"Corazza M, Menini S, Cabrio E, Tonelli S, Villata S (2020) A multilingual evaluation for online hate speech detection. ACM Trans Internet Technol 20(2):1\u201322","journal-title":"ACM Trans Internet Technol"},{"issue":"4","key":"10841_CR47","doi-asserted-by":"publisher","first-page":"215","DOI":"10.14257\/ijmue.2015.10.4.21","volume":"10","author":"ND Gitari","year":"2015","unstructured":"Gitari ND, Zuping Z, Damien H, Long J (2015) A lexicon-based approach for hate speech detection. International Journal of Multimedia and Ubiquitous Engineering 10(4):215\u2013230","journal-title":"International Journal of Multimedia and Ubiquitous Engineering"},{"key":"10841_CR48","doi-asserted-by":"crossref","unstructured":"R\u00f6ttger P, Vidgen B, Nguyen D, Waseem Z, Margetts H, Pierrehumbert J (2021) et al.: Hatecheck: Functional tests for hate speech detection models. 
In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), p. 41 . Association for Computational Linguistics","DOI":"10.18653\/v1\/2021.acl-long.4"},{"issue":"16","key":"10841_CR49","doi-asserted-by":"publisher","first-page":"8000","DOI":"10.3390\/app12168000","volume":"12","author":"Y Asiri","year":"2022","unstructured":"Asiri Y, Halawani HT, Alghamdi HM, Abdalaha Hamza SH, Abdel-Khalek S, Mansour RF (2022) Enhanced seagull optimization with natural language processing based hate speech detection and classification. Appl Sci 12(16):8000","journal-title":"Appl Sci"},{"key":"10841_CR50","doi-asserted-by":"crossref","unstructured":"DeSouza GA, Da-Costa-Abreu M (2020) Automatic offensive language detection from twitter data using machine learning and feature selection of metadata. In: 2020 International Joint Conference on Neural Networks, pp. 1\u20136. IEEE, Glasgow, UK","DOI":"10.1109\/IJCNN48605.2020.9207652"},{"key":"10841_CR51","doi-asserted-by":"publisher","first-page":"102140","DOI":"10.1016\/j.inffus.2023.102140","volume":"103","author":"F Farhangian","year":"2024","unstructured":"Farhangian F, Cruz RM, Cavalcanti GD (2024) Fake news detection: taxonomy and comparative study. Information Fusion 103:102140","journal-title":"Information Fusion"},{"issue":"2","key":"10841_CR52","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3369869","volume":"20","author":"F-M Plaza-Del-Arco","year":"2020","unstructured":"Plaza-Del-Arco F-M, Molina-Gonz\u00e1lez MD, Ure\u00f1a-L\u00f3pez LA, Mart\u00edn-Valdivia MT (2020) Detecting misogyny and xenophobia in spanish tweets using language technologies. 
ACM Trans Int Technol (TOIT) 20(2):1\u201319","journal-title":"ACM Trans Int Technol (TOIT)"},{"key":"10841_CR53","doi-asserted-by":"crossref","unstructured":"Kumari K, Jamatia A (2022) An approach of hate speech identification on twitter corpus. In: International Conference on Frontiers of Intelligent Computing: Theory and Applications, pp. 115\u2013125 . Springer","DOI":"10.1007\/978-981-19-7513-4_11"},{"key":"10841_CR54","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825\u20132830","journal-title":"J Mach Learn Res"},{"key":"10841_CR55","volume-title":"Data mining for business analytics: concepts, techniques, and applications in R","author":"G Shmueli","year":"2017","unstructured":"Shmueli G, Bruce PC, Yahav I, Patel NR, Lichtendahl KC Jr (2017) Data mining for business analytics: concepts, techniques, and applications in R. Wiley, USA"},{"key":"10841_CR56","doi-asserted-by":"crossref","unstructured":"Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1532\u20131543","DOI":"10.3115\/v1\/D14-1162"},{"key":"10841_CR57","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1162\/tacl_a_00051","volume":"5","author":"P Bojanowski","year":"2017","unstructured":"Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Assoc Comput Linguist 5:135\u2013146","journal-title":"Trans Assoc Comput Linguist"},{"key":"10841_CR58","unstructured":"Devlin J, Chang M.-W, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. 
In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 4171\u20134186. ACL, Minneapolis, Minnesota"},{"key":"10841_CR59","unstructured":"Risch J, Krestel R (2020) Bagging bert models for robust aggression identification. In: Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying, pp. 55\u201361. ELRA, Marseille, France"},{"key":"10841_CR60","doi-asserted-by":"crossref","unstructured":"Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, Davison J, Shleifer S, Platen P, Ma C, Jernite Y, Plu J, Xu C, Scao TL, Gugger S, Drame M, Lhoest Q, Rush AM (2020) Transformers: State-of-the-art natural language processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 38\u201345. ACL, Online","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"10841_CR61","unstructured":"Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V (2019) Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692"},{"key":"10841_CR62","doi-asserted-by":"crossref","unstructured":"Cortes C, Vapnik V (1995) Support-vector networks. Machine learning 20:273\u2013297","DOI":"10.1007\/BF00994018"},{"key":"10841_CR63","doi-asserted-by":"publisher","unstructured":"Chen T, Guestrin C (2016) Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD \u201916, pp. 785\u2013794. Association for Computing Machinery, New York, NY, USA . https:\/\/doi.org\/10.1145\/2939672.2939785","DOI":"10.1145\/2939672.2939785"},{"key":"10841_CR64","doi-asserted-by":"crossref","unstructured":"Aggarwal CC et al (2018) Neural networks and deep learning. 
Springer 10(978):3","DOI":"10.1007\/978-3-319-94463-0"},{"key":"10841_CR65","doi-asserted-by":"publisher","unstructured":"Borkan D, Dixon L, Sorensen J, Thain N, Vasserman L (2019) Nuanced metrics for measuring unintended bias with real data for text classification. In: Companion Proceedings of the 2019 World Wide Web Conference, pp. 491\u2013500. ACM, New York, NY, USA . https:\/\/doi.org\/10.1145\/3308560.3317593","DOI":"10.1145\/3308560.3317593"},{"key":"10841_CR66","doi-asserted-by":"publisher","first-page":"100071","DOI":"10.1016\/j.osnem.2020.100071","volume":"17","author":"P Charitidis","year":"2020","unstructured":"Charitidis P, Doropoulos S, Vologiannidis S, Papastergiou I, Karakeva S (2020) Towards countering hate speech against journalists on social media. Online Soci Netw Med 17:100071","journal-title":"Online Soci Netw Med"},{"key":"10841_CR67","first-page":"1","volume":"7","author":"J Dem\u0161ar","year":"2006","unstructured":"Dem\u0161ar J (2006) Statistical comparisons of classifiers over multiple data sets. 
J Mach Learn Res 7:1\u201330","journal-title":"J Mach Learn Res"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-024-10841-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-024-10841-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-024-10841-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,7]],"date-time":"2025-02-07T08:59:21Z","timestamp":1738918761000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-024-10841-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,17]]},"references-count":67,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,2]]}},"alternative-id":["10841"],"URL":"https:\/\/doi.org\/10.1007\/s00521-024-10841-8","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,17]]},"assertion":[{"value":"28 November 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 October 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 December 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have 
appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest:"}}]}}