{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,4,12]],"date-time":"2024-04-12T07:12:46Z","timestamp":1712905966088},"reference-count":38,"publisher":"MIT Press","license":[{"start":{"date-parts":[[2023,1,3]],"date-time":"2023-01-03T00:00:00Z","timestamp":1672704000000},"content-version":"vor","delay-in-days":367,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,12,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>To proactively offer social media users a safe online experience, there is a need for systems that can detect harmful posts and promptly alert platform moderators. In order to guarantee the enforcement of a consistent policy, moderators are provided with detailed guidelines. In contrast, most state-of-the-art models learn what abuse is from labeled examples and as a result base their predictions on spurious cues, such as the presence of group identifiers, which can be unreliable. In this work we introduce the concept of policy-aware abuse detection, abandoning the unrealistic expectation that systems can reliably learn which phenomena constitute abuse from inspecting the data alone. We propose a machine-friendly representation of the policy that moderators wish to enforce, by breaking it down into a collection of intents and slots. 
We collect and annotate a dataset of 3,535 English posts with such slots, and show how architectures for intent classification and slot filling can be used for abuse detection, while providing a rationale for model decisions.<\/jats:p>","DOI":"10.1162\/tacl_a_00527","type":"journal-article","created":{"date-parts":[[2023,1,3]],"date-time":"2023-01-03T19:19:55Z","timestamp":1672773595000},"page":"1440-1454","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":1,"title":["Explainable Abuse Detection as Intent Classification and Slot Filling"],"prefix":"10.1162","volume":"10","author":[{"given":"Agostina","family":"Calabrese","sequence":"first","affiliation":[{"name":"Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom. a.calabrese@ed.ac.uk"}]},{"given":"Bj\u00f6rn","family":"Ross","sequence":"additional","affiliation":[{"name":"Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom. b.ross@ed.ac.uk"}]},{"given":"Mirella","family":"Lapata","sequence":"additional","affiliation":[{"name":"Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom. 
mlap@inf.ed.ac.uk"}]}],"member":"281","published-online":{"date-parts":[[2022,12,23]]},"reference":[{"key":"2023010319173278600_bib1","doi-asserted-by":"publisher","first-page":"5026","DOI":"10.18653\/v1\/2020.emnlp-main.408","article-title":"Conversational semantic parsing","volume-title":"Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16\u201320, 2020","author":"Aghajanyan","year":"2020"},{"key":"2023010319173278600_bib2","doi-asserted-by":"publisher","first-page":"4402","DOI":"10.18653\/v1\/2021.acl-long.340","article-title":"Intent classification and slot filling for privacy policies","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL\/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1\u20136, 2021","author":"Ahmad","year":"2021"},{"key":"2023010319173278600_bib3","doi-asserted-by":"publisher","first-page":"2672","DOI":"10.18653\/v1\/2022.naacl-main.192","article-title":"Necessity and sufficiency for explaining text classifiers: A case study in hate speech detection","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Balkir","year":"2022"},{"key":"2023010319173278600_bib4","doi-asserted-by":"publisher","first-page":"15","DOI":"10.18653\/v1\/2021.bppf-1.3","article-title":"We need to consider disagreement in evaluation","volume-title":"1st Workshop on Benchmarking: Past, Present and Future","author":"Basile","year":"2021"},{"key":"2023010319173278600_bib5","first-page":"8","article-title":"Explanation and justification in machine learning: A survey","volume-title":"IJCAI-17 Workshop on Explainable AI 
(XAI)","author":"Or","year":"2017"},{"key":"2023010319173278600_bib6","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1145\/3447535.3462484","article-title":"AAA: Fair evaluation for abuse detection systems wanted","volume-title":"WebSci \u201921: 13th ACM Web Science Conference 2021, Virtual Event, United Kingdom, June 21\u201325, 2021","author":"Calabrese","year":"2021"},{"key":"2023010319173278600_bib7","doi-asserted-by":"publisher","first-page":"4157","DOI":"10.18653\/v1\/2020.acl-main.382","article-title":"Make up your mind! Adversarial generation of inconsistent natural language explanations","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Camburu","year":"2020"},{"key":"2023010319173278600_bib8","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1162\/tacl_a_00449","article-title":"Dealing with disagreements: Looking beyond the majority vote in subjective annotations","volume":"10","author":"Davani","year":"2022","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2023010319173278600_bib9","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1145\/3278721.3278729","article-title":"Measuring and mitigating unintended bias in text classification","volume-title":"Proceedings of the 2018 AAAI\/ACM Conference on AI, Ethics, and Society, AIES 2018, New Orleans, LA, USA, February 02\u201303, 2018","author":"Dixon","year":"2018"},{"key":"2023010319173278600_bib10","doi-asserted-by":"publisher","first-page":"731","DOI":"10.18653\/v1\/P18-1068","article-title":"Coarse-to-fine decoding for neural semantic parsing","volume-title":"Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15\u201320, 2018, Volume 1: Long Papers","author":"Li","year":"2018"},{"key":"2023010319173278600_bib11","doi-asserted-by":"crossref","DOI":"10.1609\/icwsm.v12i1.14991","article-title":"Large 
scale crowdsourcing and characterization of twitter abusive behavior","volume-title":"Twelfth International AAAI Conference on Web and Social Media","author":"Founta","year":"2018"},{"key":"2023010319173278600_bib12","doi-asserted-by":"publisher","first-page":"2787","DOI":"10.18653\/v1\/D18-1300","article-title":"Semantic parsing for task oriented dialog using hierarchical representations","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 \u2013 November 4, 2018","author":"Gupta","year":"2018"},{"key":"2023010319173278600_bib13","doi-asserted-by":"publisher","first-page":"5435","DOI":"10.18653\/v1\/2020.acl-main.483","article-title":"Contextualizing hate speech classifiers with post-hoc explanation","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5\u201310, 2020","author":"Kennedy","year":"2020"},{"key":"2023010319173278600_bib14","doi-asserted-by":"publisher","first-page":"7871","DOI":"10.18653\/v1\/2020.acl-main.703","article-title":"BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics","author":"Lewis","year":"2020"},{"key":"2023010319173278600_bib15","doi-asserted-by":"publisher","first-page":"369","DOI":"10.1609\/icwsm.v13i01.3237","article-title":"Thou shalt not hate: Countering online hate speech","volume-title":"Proceedings of the International AAAI Conference on Web and Social Media","author":"Mathew","year":"2019"},{"issue":"3","key":"2023010319173278600_bib16","doi-asserted-by":"publisher","first-page":"276","DOI":"10.11613\/BM.2012.031","article-title":"Interrater reliability: The kappa statistic","volume":"22","author":"McHugh","year":"2012","journal-title":"Biochemia 
Medica"},{"key":"2023010319173278600_bib17","article-title":"Tackling online abuse: A survey of automated abuse detection methods","author":"Mishra","year":"2019","journal-title":"CoRR"},{"key":"2023010319173278600_bib18","doi-asserted-by":"publisher","first-page":"928","DOI":"10.1007\/978-3-030-36687-2_77","article-title":"A BERT-based transfer learning approach for hate speech detection in online social media","volume-title":"Complex Networks and Their Applications VIII - Volume 1 Proceedings of the Eighth International Conference on Complex Networks and Their Applications COMPLEX NETWORKS 2019, Lisbon, Portugal, December 10\u201312, 2019","author":"Mozafari","year":"2019"},{"key":"2023010319173278600_bib19","doi-asserted-by":"publisher","first-page":"4674","DOI":"10.18653\/v1\/D19-1474","article-title":"Multilingual and multi-aspect hate speech analysis","volume-title":"Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3\u20137, 2019","author":"Ousidhoum","year":"2019"},{"key":"2023010319173278600_bib20","doi-asserted-by":"publisher","first-page":"107528","DOI":"10.1016\/j.patcog.2020.107528","article-title":"One-vs-one classification for deep neural networks","volume":"108","author":"Pawara","year":"2020","journal-title":"Pattern Recognition"},{"key":"2023010319173278600_bib21","doi-asserted-by":"publisher","first-page":"1532","DOI":"10.3115\/v1\/D14-1162","article-title":"GloVe: Global vectors for word representation","volume-title":"Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)","author":"Pennington","year":"2014"},{"key":"2023010319173278600_bib22","doi-asserted-by":"publisher","first-page":"878","DOI":"10.3115\/v1\/P15-1085","article-title":"Language to code: Learning semantic parsers for if-this-then-that recipes","volume-title":"Proceedings of the 53rd 
Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26\u201331, 2015, Beijing, China, Volume 1: Long Papers","author":"Quirk","year":"2015"},{"key":"2023010319173278600_bib23","first-page":"6","article-title":"Measuring the reliability of hate speech annotations: The case of the European refugee crisis","volume-title":"3rd Workshop on Natural Language Processing for Computer-Mediated Communication\/Social Media","author":"Ross","year":"2016"},{"key":"2023010319173278600_bib24","doi-asserted-by":"publisher","first-page":"175","DOI":"10.18653\/v1\/2022.naacl-main.13","article-title":"Two contrasting data annotation paradigms for subjective NLP tasks","volume-title":"Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"R\u00f6ttger","year":"2022"},{"key":"2023010319173278600_bib25","doi-asserted-by":"publisher","first-page":"41","DOI":"10.18653\/v1\/2021.acl-long.4","article-title":"HateCheck: Functional tests for hate speech detection models","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL\/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1\u20136, 2021","author":"R\u00f6ttger","year":"2021"},{"key":"2023010319173278600_bib26","first-page":"5477","article-title":"Social bias frames: Reasoning about social and power implications of language","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5\u201310, 2020","author":"Sap","year":"2020"},{"key":"2023010319173278600_bib27","doi-asserted-by":"publisher","first-page":"484","DOI":"10.1162\/tacl_a_00472","article-title":"A 
neighborhood framework for resource-lean content flagging","volume":"10","author":"Sarwar","year":"2022","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"2023010319173278600_bib28","doi-asserted-by":"crossref","first-page":"1385","DOI":"10.1613\/jair.1.12752","article-title":"Learning from disagreement: A survey","volume":"72","author":"Uma","year":"2021","journal-title":"Journal of Artificial Intelligence Research"},{"key":"2023010319173278600_bib29","first-page":"5998","article-title":"Attention is all you need","volume-title":"Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4\u20139, 2017, Long Beach, CA, USA","author":"Vaswani","year":"2017"},{"key":"2023010319173278600_bib30","doi-asserted-by":"publisher","first-page":"80","DOI":"10.18653\/v1\/W19-3509","article-title":"Challenges and frontiers in abusive content detection","volume-title":"Proceedings of the Third Workshop on Abusive Language Online","author":"Vidgen","year":"2019"},{"key":"2023010319173278600_bib31","doi-asserted-by":"publisher","first-page":"2289","DOI":"10.18653\/v1\/2021.naacl-main.182","article-title":"Introducing CAD: The contextual abuse dataset","volume-title":"Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","author":"Vidgen","year":"2021"},{"key":"2023010319173278600_bib32","doi-asserted-by":"publisher","first-page":"1667","DOI":"10.18653\/v1\/2021.acl-long.132","article-title":"Learning from the worst: Dynamically generated datasets to improve online hate detection","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL\/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1\u20136, 
2021","author":"Vidgen","year":"2021"},{"key":"2023010319173278600_bib33","first-page":"19","article-title":"Detecting hate speech on the world wide web","volume-title":"Proceedings of the Second Workshop on Language in Social Media","author":"Warner","year":"2012"},{"key":"2023010319173278600_bib34","doi-asserted-by":"publisher","first-page":"78","DOI":"10.18653\/v1\/W17-3012","article-title":"Understanding abuse: A typology of abusive language detection subtasks","volume-title":"Proceedings of the First Workshop on Abusive Language Online, ALW @ACL 2017, Vancouver, BC, Canada, August 4, 2017","author":"Waseem","year":"2017"},{"key":"2023010319173278600_bib35","doi-asserted-by":"publisher","DOI":"10.1145\/3547138","article-title":"A survey of joint intent detection and slot-filling models in natural language understanding","author":"Weld","year":"2021","journal-title":"arXiv preprint arXiv:2101.08091"},{"key":"2023010319173278600_bib36","article-title":"The unreliability of explanations in few-shot in-context learning","author":"Xi","year":"2022","journal-title":"arXiv preprint arXiv:2205.03401"},{"key":"2023010319173278600_bib37","doi-asserted-by":"publisher","first-page":"4134","DOI":"10.18653\/v1\/2020.acl-main.380","article-title":"Demographics should not be the reason of toxicity: Mitigating discrimination in text classifications with instance weighting","volume-title":"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020, Online, July 5\u201310, 2020","author":"Zhang","year":"2020"},{"key":"2023010319173278600_bib38","article-title":"A legal approach to hate speech: Operationalizing the EU\u2019s legal framework against the expression of hatred as an NLP task","author":"Zufall","year":"2020","journal-title":"arXiv preprint arXiv:2004.03422"}],"container-title":["Transactions of the Association for Computational 
Linguistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00527\/2065936\/tacl_a_00527.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/tacl\/article-pdf\/doi\/10.1162\/tacl_a_00527\/2065936\/tacl_a_00527.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,3]],"date-time":"2023-01-03T19:20:03Z","timestamp":1672773603000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/tacl\/article\/doi\/10.1162\/tacl_a_00527\/114369\/Explainable-Abuse-Detection-as-Intent"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"references-count":38,"URL":"https:\/\/doi.org\/10.1162\/tacl_a_00527","relation":{},"ISSN":["2307-387X"],"issn-type":[{"value":"2307-387X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2022]]},"published":{"date-parts":[[2022]]}}}