{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T11:02:08Z","timestamp":1777114928814,"version":"3.51.4"},"reference-count":120,"publisher":"Association for Computing Machinery (ACM)","issue":"13s","license":[{"start":{"date-parts":[[2023,7,13]],"date-time":"2023-07-13T00:00:00Z","timestamp":1689206400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Prime Minister Doctoral Fellowship"},{"name":"Ramanujan Fellowship"},{"name":"Wipro Research Grant"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2023,12,31]]},"abstract":"<jats:p>Detecting online toxicity has always been a challenge due to its inherent subjectivity. Factors such as the context, geography, socio-political climate, and background of the producers and consumers of the posts play a crucial role in determining if the content can be flagged as toxic. Adoption of automated toxicity detection models in production can thus lead to a sidelining of the various groups they aim to help in the first place. It has piqued researchers\u2019 interest in examining unintended biases and their mitigation. Due to the nascent and multi-faceted nature of the work, complete literature is chaotic in its terminologies, techniques, and findings. In this article, we put together a systematic study of the limitations and challenges of existing methods for mitigating bias in toxicity detection.<\/jats:p>\n          <jats:p>\n            We look closely at proposed methods for evaluating and mitigating bias in toxic speech detection. To examine the limitations of existing methods, we also conduct a case study to introduce the concept of\n            <jats:italic>bias shift<\/jats:italic>\n            due to knowledge-based bias mitigation. The survey concludes with an overview of the critical challenges, research gaps, and future directions. 
While reducing toxicity on online platforms continues to be an active area of research, a systematic study of various biases and their mitigation strategies will help the research community produce robust and fair models.\n            <jats:xref ref-type=\"fn\">\n              <jats:sup>1<\/jats:sup>\n            <\/jats:xref>\n          <\/jats:p>","DOI":"10.1145\/3580494","type":"journal-article","created":{"date-parts":[[2023,1,20]],"date-time":"2023-01-20T11:05:19Z","timestamp":1674212719000},"page":"1-32","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":62,"title":["Handling Bias in Toxic Speech Detection: A Survey"],"prefix":"10.1145","volume":"55","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4815-0975","authenticated-orcid":false,"given":"Tanmay","family":"Garg","sequence":"first","affiliation":[{"name":"IIT Delhi, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2668-5581","authenticated-orcid":false,"given":"Sarah","family":"Masud","sequence":"additional","affiliation":[{"name":"IIT Delhi, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5280-9328","authenticated-orcid":false,"given":"Tharun","family":"Suresh","sequence":"additional","affiliation":[{"name":"IIT Delhi, India"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0210-0369","authenticated-orcid":false,"given":"Tanmoy","family":"Chakraborty","sequence":"additional","affiliation":[{"name":"IIT Delhi, India"}]}],"member":"320","published-online":{"date-parts":[[2023,7,13]]},"reference":[{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3461702.3462557"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.alw-1.21"},{"key":"e_1_3_3_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASONAM.2018.8508247"},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331262"},{"key":"e_1_3_3_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3308558.3313504"},{"key":"e_1_3_3_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3383313.3418435"},{"key":"e_1_3_3_8_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.485"},{"key":"e_1_3_3_9_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1120"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1088\/1742-5468\/2008\/10\/P10008"},{"key":"e_1_3_3_11_2","doi-asserted-by":"publisher","DOI":"10.5555\/3157382.3157584"},{"key":"e_1_3_3_12_2","article-title":"Limitations of pinned AUC for measuring unintended bias","author":"Borkan Daniel","year":"2019","unstructured":"Daniel Borkan, Lucas Dixon, John Li, Jeffrey Sorensen, Nithum Thain, and Lucy Vasserman. 2019. Limitations of pinned AUC for measuring unintended bias. arXiv preprint arXiv:1903.02088 (2019).","journal-title":"arXiv preprint arXiv:1903.02088"},{"key":"e_1_3_3_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3308560.3317593"},{"key":"e_1_3_3_14_2","article-title":"Language models are few-shot learners","author":"Brown Tom B.","year":"2020","unstructured":"Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell et\u00a0al. 2020. Language models are few-shot learners. 
arXiv preprint arXiv:2005.14165 (2020).","journal-title":"arXiv preprint arXiv:2005.14165"},{"key":"e_1_3_3_15_2","series-title":"Proceedings of the 1st Conference on Fairness, Accountability and Transparency","first-page":"77","volume":"81","author":"Buolamwini Joy","year":"2018","unstructured":"Joy Buolamwini and Timnit Gebru. 2018. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency(Proceedings of Machine Learning Research, Vol. 81), Sorelle A. Friedler and Christo Wilson (Eds.). PMLR, 77\u201391. Retrieved from: https:\/\/proceedings.mlr.press\/v81\/buolamwini18a.html."},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.aal4230"},{"key":"e_1_3_3_17_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.aal4230"},{"key":"e_1_3_3_18_2","doi-asserted-by":"crossref","unstructured":"Tanmoy Chakraborty and Sarah Masud. 2022. Nipping in the bud: Detection diffusion and mitigation of hate speech on social media. arxiv:2201.00961 [cs.SI].","DOI":"10.1145\/3522598.3522601"},{"key":"e_1_3_3_19_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1418"},{"key":"e_1_3_3_20_2","article-title":"Examining racial bias in an online abuse corpus with structural topic modeling","author":"Davidson Thomas","year":"2020","unstructured":"Thomas Davidson and Debasmita Bhattacharya. 2020. Examining racial bias in an online abuse corpus with structural topic modeling. arXiv preprint arXiv:2005.13041 (2020).","journal-title":"arXiv preprint arXiv:2005.13041"},{"key":"e_1_3_3_21_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-3504"},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1609\/icwsm.v11i1.14955"},{"key":"e_1_3_3_23_2","article-title":"Hate speech dataset from a white supremacy forum","author":"Gibert Ona de","year":"2018","unstructured":"Ona de Gibert, Naiara Perez, Aitor Garc\u00eda-Pablos, and Montse Cuadros. 2018. Hate speech dataset from a white supremacy forum. arXiv preprint arXiv:1809.04444 (2018).","journal-title":"arXiv preprint arXiv:1809.04444"},{"key":"e_1_3_3_24_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.150"},{"key":"e_1_3_3_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3278721.3278729"},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.1002\/9781118827628.ch21"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","DOI":"10.5555\/2002472.2002641"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.29"},{"issue":"4","key":"e_1_3_3_29_2","first-page":"002190962095120","article-title":"Hate speech and election violence in Nigeria","volume":"56","author":"Ezeibe Christian","year":"2020","unstructured":"Christian Ezeibe. 2020. Hate speech and election violence in Nigeria. J. Asian Afric. Stud. 56, 4 (2020), 0021909620951208.","journal-title":"J. Asian Afric. Stud."},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.4000\/books.aaccademia.4497"},{"key":"e_1_3_3_31_2","doi-asserted-by":"publisher","DOI":"10.1177\/1754073917751229"},{"key":"e_1_3_3_32_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-3510"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3232676"},{"key":"e_1_3_3_34_2","first-page":"6786","volume-title":"Proceedings of the 12th Language Resources and Evaluation Conference","author":"Fortuna Paula","year":"2020","unstructured":"Paula Fortuna, Juan Soler, and Leo Wanner. 2020. 
Toxic, hateful, offensive or abusive? What are we really classifying? An empirical analysis of hate speech datasets. In Proceedings of the 12th Language Resources and Evaluation Conference. 6786\u20136794."},{"key":"e_1_3_3_35_2","doi-asserted-by":"publisher","DOI":"10.1609\/icwsm.v12i1.14991"},{"key":"e_1_3_3_36_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-5815"},{"key":"e_1_3_3_37_2","doi-asserted-by":"publisher","DOI":"10.3390\/app11073184"},{"key":"e_1_3_3_38_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.wnut-1.35"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3091478.3091509"},{"key":"e_1_3_3_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3411764.3445423"},{"key":"e_1_3_3_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACIIW.2019.8925049"},{"key":"e_1_3_3_42_2","doi-asserted-by":"publisher","DOI":"10.1037\/0022-3514.74.6.1464"},{"key":"e_1_3_3_43_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.234"},{"key":"e_1_3_3_44_2","doi-asserted-by":"publisher","DOI":"10.1002\/9780470939376.ch25"},{"key":"e_1_3_3_45_2","article-title":"Multilingual Twitter corpus and baselines for evaluating demographic bias in hate speech recognition","author":"Huang Xiaolei","year":"2020","unstructured":"Xiaolei Huang, Linzi Xing, Franck Dernoncourt, and Michael J. Paul. 2020. Multilingual Twitter corpus and baselines for evaluating demographic bias in hate speech recognition. arXiv preprint arXiv:2002.10361 (2020).","journal-title":"arXiv preprint arXiv:2002.10361"},{"key":"e_1_3_3_46_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2025334119"},{"key":"e_1_3_3_47_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S19-2081"},{"key":"e_1_3_3_48_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Jin Xisen","year":"2020","unstructured":"Xisen Jin, Zhongyu Wei, Junyi Du, Xiangyang Xue, and Xiang Ren. 2020. Towards hierarchical importance attribution: Explaining compositional semantics for neural sequence models. In Proceedings of the International Conference on Learning Representations. Retrieved from: https:\/\/openreview.net\/forum?id=BkxRRkSKwr."},{"key":"e_1_3_3_49_2","doi-asserted-by":"publisher","DOI":"10.31234\/osf.io\/hqjxn"},{"key":"e_1_3_3_50_2","article-title":"Contextualizing hate speech classifiers with post-hoc explanation","author":"Kennedy Brendan","year":"2020","unstructured":"Brendan Kennedy, Xisen Jin, Aida Mostafazadeh Davani, Morteza Dehghani, and Xiang Ren. 2020. Contextualizing hate speech classifiers with post-hoc explanation. arXiv preprint arXiv:2005.02439 (2020).","journal-title":"arXiv preprint arXiv:2005.02439"},{"key":"e_1_3_3_51_2","article-title":"The hateful memes challenge: Detecting hate speech in multimodal memes","author":"Kiela Douwe","year":"2020","unstructured":"Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet Singh, Pratik Ringshia, and Davide Testuggine. 2020. The hateful memes challenge: Detecting hate speech in multimodal memes. arXiv preprint arXiv:2005.04790 (2020).","journal-title":"arXiv preprint arXiv:2005.04790"},{"key":"e_1_3_3_52_2","unstructured":"Jae Yeon Kim Carlos Ortiz Sarah Nam Sarah Santiago and Vivek Datta. 2020. Intersectional Bias in Hate Speech and Abusive Language Datasets. 
arxiv:2005.05921 [cs.CL]."},{"key":"e_1_3_3_53_2","first-page":"34","volume-title":"Proceedings of the 1st Workshop on Language Technology for Equality, Diversity and Inclusion","author":"Kumar Senthil","year":"2021","unstructured":"Senthil Kumar, Aravindan Chandrabose, and Bharathi Raja Chakravarthi. 2021. An overview of fairness in data\u2014Illuminating the bias in data pipeline. In Proceedings of the 1st Workshop on Language Technology for Equality, Diversity and Inclusion. Association for Computational Linguistics, 34\u201345. Retrieved from https:\/\/aclanthology.org\/2021.ltedi-1.5."},{"key":"e_1_3_3_54_2","article-title":"Topics to avoid: Demoting latent confounds in text classification","author":"Kumar Sachin","year":"2019","unstructured":"Sachin Kumar, Shuly Wintner, Noah A. Smith, and Yulia Tsvetkov. 2019. Topics to avoid: Demoting latent confounds in text classification. arXiv preprint arXiv:1909.00453 (2019).","journal-title":"arXiv preprint arXiv:1909.00453"},{"key":"e_1_3_3_55_2","first-page":"1078","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Bras Ronan Le","year":"2020","unstructured":"Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew Peters, Ashish Sabharwal, and Yejin Choi. 2020. Adversarial filters of dataset biases. In Proceedings of the International Conference on Machine Learning. PMLR, 1078\u20131088."},{"key":"e_1_3_3_56_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.488"},{"key":"e_1_3_3_57_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-acl.7"},{"key":"e_1_3_3_58_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.woah-1.4"},{"key":"e_1_3_3_59_2","doi-asserted-by":"publisher","DOI":"10.1145\/3419249.3420142"},{"key":"e_1_3_3_60_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292522.3326034"},{"key":"e_1_3_3_61_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1063"},{"key":"e_1_3_3_62_2","volume-title":"Proceedings of the 1st International Conference on Learning Representations","author":"Mikolov Tom\u00e1s","year":"2013","unstructured":"Tom\u00e1s Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In Proceedings of the 1st International Conference on Learning Representations. Retrieved from: http:\/\/arxiv.org\/abs\/1301.3781."},{"key":"e_1_3_3_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/219717.219748"},{"key":"e_1_3_3_64_2","first-page":"1088","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Mishra Pushkar","year":"2018","unstructured":"Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis, and Ekaterina Shutova. 2018. Author profiling for abuse detection. In Proceedings of the 27th International Conference on Computational Linguistics. 
1088\u20131098."},{"key":"e_1_3_3_65_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0237861"},{"key":"e_1_3_3_66_2","doi-asserted-by":"publisher","DOI":"10.1057\/s41599-020-00550-7"},{"key":"e_1_3_3_67_2","doi-asserted-by":"publisher","DOI":"10.2174\/1874917801306010010"},{"key":"e_1_3_3_68_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-demos.2"},{"key":"e_1_3_3_69_2","doi-asserted-by":"publisher","DOI":"10.1145\/3350546.3352512"},{"key":"e_1_3_3_70_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.199"},{"key":"e_1_3_3_71_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-1302"},{"key":"e_1_3_3_72_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_3_73_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-acl.246"},{"key":"e_1_3_3_74_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-emnlp.379"},{"key":"e_1_3_3_75_2","first-page":"1534","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"Preo\u0163iuc-Pietro Daniel","year":"2018","unstructured":"Daniel Preo\u0163iuc-Pietro and Lyle Ungar. 2018. User-level race and ethnicity predictors from Twitter text. In Proceedings of the 27th International Conference on Computational Linguistics. 1534\u20131545."},{"key":"e_1_3_3_76_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-2019"},{"key":"e_1_3_3_77_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.183"},{"key":"e_1_3_3_78_2","volume-title":"Proceedings of the 35th Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)","author":"Rahman Md Mustafizur","year":"2021","unstructured":"Md Mustafizur Rahman, Dinesh Balakrishnan, Dhiraj Murthy, Mucahid Kutlu, and Matthew Lease. 2021. An information retrieval approach to building datasets for hate speech detection. In Proceedings of the 35th Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2). Retrieved from: https:\/\/openreview.net\/forum?id=jI_BbL-qjJN."},{"key":"e_1_3_3_79_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.alw-1.9"},{"key":"e_1_3_3_80_2","doi-asserted-by":"publisher","DOI":"10.1111\/ajps.12103"},{"key":"e_1_3_3_81_2","doi-asserted-by":"publisher","DOI":"10.1093\/oso\/9780190634728.001.0001"},{"key":"e_1_3_3_82_2","article-title":"Measuring the reliability of hate speech annotations: The case of the European refugee crisis","author":"Ross Bj\u00f6rn","year":"2017","unstructured":"Bj\u00f6rn Ross, Michael Rist, Guillermo Carbonell, Benjamin Cabrera, Nils Kurowsky, and Michael Wojatzki. 2017. Measuring the reliability of hate speech annotations: The case of the European refugee crisis. 
arXiv preprint arXiv:1701.08118 (2017).","journal-title":"arXiv preprint arXiv:1701.08118"},{"key":"e_1_3_3_83_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.naacl-main.13"},{"key":"e_1_3_3_84_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292522.3326032"},{"key":"e_1_3_3_85_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0228723"},{"key":"e_1_3_3_86_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1163"},{"key":"e_1_3_3_87_2","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1121"},{"key":"e_1_3_3_88_2","doi-asserted-by":"publisher","DOI":"10.1609\/icwsm.v16i1.19340"},{"key":"e_1_3_3_89_2","doi-asserted-by":"publisher","DOI":"10.1177\/1745691621991860"},{"key":"e_1_3_3_90_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-1101"},{"key":"e_1_3_3_91_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1341"},{"key":"e_1_3_3_92_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.468"},{"key":"e_1_3_3_93_2","doi-asserted-by":"publisher","DOI":"10.4324\/9781003165330-9"},{"key":"e_1_3_3_94_2","doi-asserted-by":"publisher","DOI":"10.3758\/s13423-016-1086-6"},{"key":"e_1_3_3_95_2","unstructured":"Julia Maria Stru\u00df Melanie Siegel Josef Ruppenhofer Michael Wiegand and Manfred Klenner. 2019. Overview of GermEval task 2 2019 shared task on the identification of offensive language. Preliminary Proceedings of the 15th Conference on Natural Language Processing (KONVENS\u201919 October 9\u201311 2019 at Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg) German Society for Computational Linguistics & Language Technology und Friedrich-Alexander-Universit\u00e4t Erlangen-N\u00fcrnberg M\u00fcnchen [u.a.] 352\u2013363. https:\/\/nbn-resolving.org\/urn:nbn:de:bsz:mh39-93197."},{"key":"e_1_3_3_96_2","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1089\/1094931041291295","article-title":"The online disinhibition effect","volume":"7","author":"Suler J.","year":"2004","unstructured":"J. Suler. 2004. The online disinhibition effect. Cyberpsychol. Behav.: Impact Internet, Multim. Virt. Real. Behav. Societ. 7, 3 (2004), 321\u2013326.","journal-title":"Cyberpsychol. Behav.: Impact Internet, Multim. Virt. Real. Behav. Societ."},{"key":"e_1_3_3_97_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1159"},{"key":"e_1_3_3_98_2","article-title":"Mitigating gender bias in natural language processing: Literature review","author":"Sun Tony","year":"2019","unstructured":"Tony Sun, Andrew Gaut, Shirlyn Tang, Yuxin Huang, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth Belding, Kai-Wei Chang, and William Yang Wang. 2019. Mitigating gender bias in natural language processing: Literature review. arXiv preprint arXiv:1906.08976 (2019).","journal-title":"arXiv preprint arXiv:1906.08976"},{"key":"e_1_3_3_99_2","article-title":"A framework for understanding unintended consequences of machine learning","author":"Suresh Harini","year":"2019","unstructured":"Harini Suresh and John V. Guttag. 2019. A framework for understanding unintended consequences of machine learning. arXiv preprint arXiv:1901.10002 (2019).","journal-title":"arXiv preprint arXiv:1901.10002"},{"key":"e_1_3_3_100_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.746"},{"key":"e_1_3_3_101_2","article-title":"Investigating biases in textual entailment datasets","author":"Tan Shawn","year":"2019","unstructured":"Shawn Tan, Yikang Shen, Chin-wei Huang, and Aaron Courville. 2019. Investigating biases in textual entailment datasets. 
arXiv preprint arXiv:1906.09635 (2019).","journal-title":"arXiv preprint arXiv:1906.09635"},{"key":"e_1_3_3_102_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-1606"},{"key":"e_1_3_3_103_2","doi-asserted-by":"publisher","DOI":"10.1080\/17440572.2019.1591952"},{"key":"e_1_3_3_104_2","doi-asserted-by":"publisher","DOI":"10.1609\/icwsm.v14i1.7334"},{"key":"e_1_3_3_105_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.gebnlp-1.8"},{"key":"e_1_3_3_106_2","first-page":"14","volume-title":"Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language","author":"Rosendaal Juliet Van","year":"2020","unstructured":"Juliet Van Rosendaal, Tommaso Caselli, and Malvina Nissim. 2020. Lower bias, higher density abusive language datasets: A recipe. In Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language. 14\u201319."},{"key":"e_1_3_3_107_2","doi-asserted-by":"publisher","DOI":"10.1089\/cyber.2022.0009"},{"key":"e_1_3_3_108_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00894"},{"key":"e_1_3_3_109_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W16-5618"},{"key":"e_1_3_3_110_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-2013"},{"key":"e_1_3_3_111_2","unstructured":"Laura Weidinger John Mellor Maribeth Rauh Conor Griffin Jonathan Uesato Po-Sen Huang Myra Cheng Mia Glaese Borja Balle Atoosa Kasirzadeh Zac Kenton Sasha Brown Will Hawkins Tom Stepleton Courtney Biles Abeba Birhane Julia Haas Laura Rimell Lisa Anne Hendricks William Isaac Sean Legassick Geoffrey Irving and Iason Gabriel. 2021. Ethical and Social Risks of Harm from Language Models. arxiv:2112.04359 [cs.CL]."},{"key":"e_1_3_3_112_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.alw-1.22"},{"key":"e_1_3_3_113_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.alw-1.7"},{"key":"e_1_3_3_114_2","first-page":"602","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Wiegand Michael","year":"2019","unstructured":"Michael Wiegand, Josef Ruppenhofer, and Thomas Kleinbauer. 2019. Detection of abusive language: The problem of biased datasets. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 602\u2013608."},{"key":"e_1_3_3_115_2","unstructured":"Michael Wiegand Melanie Siegel and Josef Ruppenhofer. 2018. 
Overview of the GermEval 2018 shared task on the identification of offensive language Austrian Academy of Sciences."},{"key":"e_1_3_3_116_2","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052591"},{"key":"e_1_3_3_117_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.socialnlp-1.2"},{"key":"e_1_3_3_118_2","doi-asserted-by":"publisher","DOI":"10.7717\/peerj-cs.598"},{"key":"e_1_3_3_119_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.380"},{"key":"e_1_3_3_120_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-2003"},{"key":"e_1_3_3_121_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.eacl-main.274"}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580494","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3580494","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:37:42Z","timestamp":1750178262000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3580494"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,13]]},"references-count":120,"journal-issue":{"issue":"13s","published-print":{"date-parts":[[2023,12,31]]}},"alternative-id":["10.1145\/3580494"],"URL":"https:\/\/doi.org\/10.1145\/3580494","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"value":"0360-0300","type":"print"},{"value":"1557-7341","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,13]]},"assertion":[{"value":"2022-01-26","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-01-09","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
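For readers who want to work with this record programmatically, the short sketch below shows one way to retrieve and flatten the same metadata. It is a minimal illustration, assuming Python 3 with only the standard library and the public Crossref REST API endpoint `https://api.crossref.org/works/{DOI}` (the endpoint is an assumption, not part of the record); every field name used comes from the record above, and the sketch should be read as a guide to the record's structure rather than a prescribed client.

```python
# Minimal sketch: fetch and summarize the Crossref work record shown above.
# Assumptions: Python 3 standard library only; the public Crossref endpoint
# https://api.crossref.org/works/{DOI} returns a JSON object shaped like this record.
import json
from urllib.request import urlopen

DOI = "10.1145/3580494"  # DOI of the article, taken from the record above

with urlopen(f"https://api.crossref.org/works/{DOI}") as resp:
    work = json.load(resp)["message"]  # the bibliographic payload sits under "message"

# Pull out a few fields that appear in the record: title, authors, venue, year, DOI.
title = work["title"][0]
authors = ", ".join(f'{a["given"]} {a["family"]}' for a in work["author"])
journal = work["container-title"][0]
year = work["issued"]["date-parts"][0][0]
refs = work["references-count"]

print(f"{authors}. {year}. {title}. {journal}. DOI: {work['DOI']} ({refs} references)")
```

Run against this DOI, the script would print the survey's author list, year, title, and journal, confirming that `title`, `container-title`, and `author` are arrays and that dates are nested as `date-parts` lists, as seen in the raw record.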