{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T11:41:30Z","timestamp":1742384490503,"version":"3.37.3"},"reference-count":33,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,3,1]],"date-time":"2023-03-01T00:00:00Z","timestamp":1677628800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,10]],"date-time":"2023-03-10T00:00:00Z","timestamp":1678406400000},"content-version":"vor","delay-in-days":9,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Universit\u00e4t der Bundeswehr M\u00fcnchen"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Datenbank Spektrum"],"published-print":{"date-parts":[[2023,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Abusive language detection has become an integral part of the research, as reflected in numerous publications and several shared tasks conducted in recent years. It has been shown that the obtained models perform well on the datasets on which they were trained, but have difficulty generalizing to other datasets. This work also focuses on model generalization, but \u2013 in contrast to previous work \u2013 we use homogeneous datasets for our experiments, assuming that they have a\u00a0higher generalizability. We want to find out how similar datasets have to be for trained models to generalize and whether generalizability depends on the method used to obtain a\u00a0model. To this end, we selected four German datasets from popular shared tasks, three of which are from consecutive GermEval shared tasks. Furthermore, we evaluate two deep learning methods and three traditional machine learning methods to derive generalizability trends based on the results. Our experiments show that generalization is only partially given, although the annotation schemes for these datasets are almost identical. Our findings additionally show that generalizability depends solely on the (combinations of) training sets and is consistent no matter what the underlying method is.<\/jats:p>","DOI":"10.1007\/s13222-023-00438-1","type":"journal-article","created":{"date-parts":[[2023,3,10]],"date-time":"2023-03-10T14:04:24Z","timestamp":1678457064000},"page":"15-25","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Generalizability of Abusive Language Detection Models on Homogeneous German Datasets"],"prefix":"10.1007","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2619-3248","authenticated-orcid":false,"given":"Nina","family":"Seemann","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yeong Su","family":"Lee","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Julian","family":"H\u00f6llig","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michaela","family":"Geierhos","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,3,10]]},"reference":[{"key":"438_CR1","doi-asserted-by":"publisher","first-page":"54","DOI":"10.18653\/v1\/S19-2007","volume-title":"Proceedings of the 13th International Workshop on Semantic Evaluation. Association for Computational Linguistics, Minneapolis, Minnesota, USA","author":"V Basile","year":"2019","unstructured":"Basile V, Bosco C, Fersini E et al (2019) SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in Twitter. In: Proceedings of the 13th International Workshop on Semantic Evaluation. Association for Computational Linguistics, Minneapolis, Minnesota, USA, pp 54\u201363 https:\/\/doi.org\/10.18653\/v1\/S19-2007"},{"key":"438_CR2","doi-asserted-by":"publisher","first-page":"100","DOI":"10.1016\/j.osnem.2021.100153","volume":"24","author":"DR Beddiar","year":"2021","unstructured":"Beddiar DR, Jahan MS, Oussalah M (2021) Data expansion using back translation and paraphrasing for hate speech detection. Online Soc Netw Media 24:100\u2013153. https:\/\/doi.org\/10.1016\/j.osnem.2021.100153","journal-title":"Online Soc Netw Media"},{"key":"438_CR3","doi-asserted-by":"publisher","DOI":"10.1186\/s12864-019-6413-7","author":"D Chicco","year":"2020","unstructured":"Chicco D, Jurman G (2020) The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. Bmc Genomics. https:\/\/doi.org\/10.1186\/s12864-019-6413-7","journal-title":"Bmc Genomics"},{"key":"438_CR4","first-page":"512","volume-title":"Proceedings of the 11th International AAAI Conference on Web and Social Media, ICWSM \u201917","author":"T Davidson","year":"2017","unstructured":"Davidson T, Warmsley D, Macy M et al (2017) Automated hate speech detection and the problem of offensive language. In: Proceedings of the 11th International AAAI Conference on Web and Social Media, ICWSM \u201917, pp 512\u2013515"},{"key":"438_CR5","series-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","doi-asserted-by":"publisher","first-page":"4171","DOI":"10.18653\/v1\/N19-1423","volume-title":"Long and short papers","author":"J Devlin","year":"2019","unstructured":"Devlin J, Chang MW, Lee K et al (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. In: Long and short papers. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol 1. ACL, Minneapolis, pp 4171\u20134186 https:\/\/doi.org\/10.18653\/v1\/N19-1423"},{"key":"438_CR6","first-page":"214","volume-title":"Proceedings of the third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018) co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018)","author":"E Fersini","year":"2018","unstructured":"Fersini E, Rosso P, Anzovino M (2018) Overview of the task on automatic misogyny identification at IberEval 2018. In: Rosso P, Gonzalo J, Mart\u00ednez R et al (eds) Proceedings of the third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018) co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018), vol 2150. Ceur Workshop Proceedings, Sevilla, pp 214\u2013228"},{"issue":"3","key":"438_CR7","doi-asserted-by":"publisher","first-page":"102524","DOI":"10.1016\/j.ipm.2021.102524","volume":"58","author":"P Fortuna","year":"2021","unstructured":"Fortuna P, Soler-Company J, Wanner L (2021) How well do hate speech, toxicity, abusive and offensive language classification models generalize across datasets? Inf Process Manag 58(3):102524. https:\/\/doi.org\/10.1016\/j.ipm.2021.102524","journal-title":"Inf Process Manag"},{"issue":"1","key":"438_CR8","doi-asserted-by":"publisher","first-page":"491","DOI":"10.1609\/icwsm.v12i1.14991","volume":"12","author":"A Founta","year":"2018","unstructured":"Founta A, Djouvas C, Chatzakou D et al (2018) Large scale crowdsourcing and characterization of Twitter abusive behavior. ICWSM 12(1):491\u2013500","journal-title":"ICWSM"},{"key":"438_CR9","doi-asserted-by":"publisher","first-page":"260","DOI":"10.26615\/978-954-452-049-6_036","volume-title":"Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017. INCOMA Ltd., Varna, Bulgaria","author":"L Gao","year":"2017","unstructured":"Gao L, Huang R (2017) Detecting Online hate speech using context aware models. In: Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017. INCOMA Ltd., Varna, Bulgaria, pp 260\u2013266 https:\/\/doi.org\/10.26615\/978-954-452-049-6_036"},{"key":"438_CR10","doi-asserted-by":"publisher","first-page":"11","DOI":"10.18653\/v1\/W18-5102","volume-title":"Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Association for Computational Linguistics, Brussels, Belgium","author":"O de Gibert","year":"2018","unstructured":"de Gibert O, Perez N, Garc\u00eda-Pablos A et al (2018) Hate speech dataset from a\u00a0white supremacy forum. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Association for Computational Linguistics, Brussels, Belgium, pp 11\u201320 https:\/\/doi.org\/10.18653\/v1\/W18-5102"},{"key":"438_CR11","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1145\/3091478.3091509","volume-title":"Proceedings of the 2017 ACM on Web Science Conference. Association for Computing Machinery, New York, NY, USA, WebSci \u201917","author":"J Golbeck","year":"2017","unstructured":"Golbeck J, Ashktorab Z, Banjo RO et al (2017) A\u00a0large labeled corpus for Online harassment research. In: Proceedings of the 2017 ACM on Web Science Conference. Association for Computing Machinery, New York, NY, USA, WebSci \u201917, pp 229\u2013233 https:\/\/doi.org\/10.1145\/3091478.3091509"},{"key":"438_CR12","volume-title":"Proceedings of ACL workshop Automatic Information Extraction and Building of Lexical Semantic Resources for NLP Applications","author":"B Hamp","year":"1997","unstructured":"Hamp B, Feldweg H (1997) GermaNet - a\u00a0lexical-semantic net for German. In: Proceedings of ACL workshop Automatic Information Extraction and Building of Lexical Semantic Resources for NLP Applications"},{"key":"438_CR13","unstructured":"Jigsaw\/Conversation AI (2018) Toxic comment classification challenge. https:\/\/tinyurl.com\/y7qmd8lm. Accessed 09.05.2022"},{"key":"438_CR14","doi-asserted-by":"publisher","first-page":"132","DOI":"10.18653\/v1\/W18-5117","volume-title":"Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Association for Computational Linguistics, Brussels, Belgium","author":"M Karan","year":"2018","unstructured":"Karan M, \u0160najder J (2018) Cross-domain detection of abusive language Online. In: Proceedings of the 2nd Workshop on Abusive Language Online (ALW2). Association for Computational Linguistics, Brussels, Belgium, pp 132\u2013137 https:\/\/doi.org\/10.18653\/v1\/W18-5117"},{"key":"438_CR15","volume-title":"3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7\u20119, 2015, Conference Track Proceedings","author":"DP Kingma","year":"2015","unstructured":"Kingma DP, Ba J (2015) Adam: A\u00a0method for stochastic optimization. In: Bengio Y, LeCun Y (eds) 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7\u20119, 2015, Conference Track Proceedings"},{"key":"438_CR16","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1007\/s41701-019-00065-w","volume":"4","author":"V Kolhatkar","year":"2018","unstructured":"Kolhatkar V, Wu H, Cavasso L et al (2018) The SFU opinion and comments corpus: a\u00a0corpus for the analysis of Online news comments. Corpus Pragmat 4:155\u2013190. https:\/\/doi.org\/10.1007\/s41701-019-00065-w","journal-title":"Corpus Pragmat"},{"key":"438_CR17","volume-title":"Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Miyazaki, Japan","author":"R Kumar","year":"2018","unstructured":"Kumar R, Reganti AN, Bhatia A et al (2018) Aggression-annotated corpus of Hindi-English code-mixed data. In: Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). European Language Resources Association (ELRA), Miyazaki, Japan"},{"key":"438_CR18","series-title":"Lecture notes in computer science","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1007\/978-3-030-27947-9_9","volume-title":"TSD","author":"N Ljube\u0161i\u0107","year":"2019","unstructured":"Ljube\u0161i\u0107 N, Fi\u0161er D, Erjavec T (2019) The FRENK Datasets of socially unacceptable discourse in Slovene and English. In: TSD. Lecture notes in computer science, vol 11697. Springer, Cham, pp 103\u2013114"},{"key":"438_CR19","first-page":"153","volume-title":"Proceedings of the Third Workshop on Computational Modeling of People\u2019s Opinions, Personality, and Emotion\u2019s in Social Media. Association for Computational Linguistics, Barcelona, Spain","author":"N Ljube\u0161i\u0107","year":"2020","unstructured":"Ljube\u0161i\u0107 N, Markov I, Fi\u0161er D et al (2020) The LiLaH emotion lexicon of Croatian, Dutch and Slovene. In: Proceedings of the Third Workshop on Computational Modeling of People\u2019s Opinions, Personality, and Emotion\u2019s in Social Media. Association for Computational Linguistics, Barcelona, Spain, pp 153\u2013157"},{"key":"438_CR20","first-page":"149","volume-title":"Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Association for Computational Linguistics","author":"I Markov","year":"2021","unstructured":"Markov I, Ljube\u0161i\u0107 N, Fi\u0161er D et al (2021) Exploring stylometric and emotion-based features for multilingual cross-domain hate speech detection. In: Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis. Association for Computational Linguistics, pp 149\u2013159"},{"key":"438_CR21","first-page":"167","volume-title":"Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation. CEUR-WS, FIRE \u201919","author":"S Modha","year":"2019","unstructured":"Modha S, Mandl T, Majumder P et al (2019) Overview of the HASOC track at FIRE 2019: Hate speech and offensive content identification in Indo-European languages. In: Working Notes of FIRE 2019 - Forum for Information Retrieval Evaluation. CEUR-WS, FIRE \u201919, pp 167\u2013190"},{"key":"438_CR22","doi-asserted-by":"publisher","first-page":"363","DOI":"10.18653\/v1\/P19-2051","volume-title":"Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. Association for Computational Linguistics, Florence, Italy","author":"EW Pamungkas","year":"2019","unstructured":"Pamungkas EW, Patti V (2019) Cross-domain and cross-lingual abusive language detection: A\u00a0hybrid approach with deep learning and a\u00a0multilingual lexicon. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. Association for Computational Linguistics, Florence, Italy, pp 363\u2013370 https:\/\/doi.org\/10.18653\/v1\/P19-2051"},{"key":"438_CR23","unstructured":"Prakash A (2019) Fine-tuning BERT model using PyTorch. https:\/\/medium.com\/@prakashakshay90\/f34148d58a37. Accessed 14.10.2022"},{"key":"438_CR24","first-page":"1","volume-title":"Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments. Association for Computational Linguistics","author":"J Risch","year":"2021","unstructured":"Risch J, Stoll A, Wilms L et al (2021) Overview of the GermEval 2021 shared task on the identification of toxic, engaging, and fact-claiming comments. In: Proceedings of the GermEval 2021 Shared Task on the Identification of Toxic, Engaging, and Fact-Claiming Comments. Association for Computational Linguistics, pp 1\u201312"},{"key":"438_CR25","doi-asserted-by":"publisher","first-page":"354","DOI":"10.5167\/uzh-178687","volume-title":"Proceedings of the 15th Conference on Natural Language Processing (KONVENS)","author":"JM Stru\u00df","year":"2019","unstructured":"Stru\u00df JM, Siegel M, Ruppenhofer J et al (2019) Overview of germEval task 2, 2019 shared task on the identification of offensive language. In: German Society for Computational Linguistics (ed) Proceedings of the 15th Conference on Natural Language Processing (KONVENS), pp 354\u2013365 https:\/\/doi.org\/10.5167\/uzh-178687"},{"key":"438_CR26","doi-asserted-by":"publisher","first-page":"940","DOI":"10.18653\/v1\/K19-1088","volume-title":"Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). Association for Computational Linguistics, Hong Kong, China","author":"SD Swamy","year":"2019","unstructured":"Swamy SD, Jamatia A, Gamb\u00e4ck B (2019) Studying Generalisability across abusive language detection datasets. In: Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL). Association for Computational Linguistics, Hong Kong, China, pp 940\u2013950 https:\/\/doi.org\/10.18653\/v1\/K19-1088"},{"key":"438_CR27","first-page":"672","volume-title":"Proceedings of the International Conference Recent Advances in Natural Language Processing. INCOMA Ltd. Shoumen, Bulgaria, Hissar, Bulgaria","author":"C Van Hee","year":"2015","unstructured":"Van Hee C, Lefever E, Verhoeven B et al (2015) Detection and fine-grained classification of cyberbullying events. In: Proceedings of the International Conference Recent Advances in Natural Language Processing. INCOMA Ltd. Shoumen, Bulgaria, Hissar, Bulgaria, pp 672\u2013680"},{"key":"438_CR28","doi-asserted-by":"publisher","first-page":"138","DOI":"10.18653\/v1\/W16-5618","volume-title":"Proceedings of the First Workshop on NLP and Computational Social Science. Association for Computational Linguistics, Austin, Texas","author":"Z Waseem","year":"2016","unstructured":"Waseem Z (2016) Are you a\u00a0racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In: Proceedings of the First Workshop on NLP and Computational Social Science. Association for Computational Linguistics, Austin, Texas, pp 138\u2013142 https:\/\/doi.org\/10.18653\/v1\/W16-5618"},{"key":"438_CR29","doi-asserted-by":"publisher","first-page":"88","DOI":"10.18653\/v1\/N16-2013","volume-title":"Proceedings of the NAACL Student Research Workshop. Association for Computational Linguistics, San Diego, California","author":"Z Waseem","year":"2016","unstructured":"Waseem Z, Hovy D (2016) Hateful symbols or hateful people? Predictive features for hate speech detection on twitter. In: Proceedings of the NAACL Student Research Workshop. Association for Computational Linguistics, San Diego, California, pp 88\u201393 https:\/\/doi.org\/10.18653\/v1\/N16-2013"},{"key":"438_CR30","first-page":"1","volume-title":"Proceedings of GermEval 2018, 14th Conference on Natural Language Processing (KONVENS 2018)","author":"M Wiegand","year":"2018","unstructured":"Wiegand M, Siegel M, Ruppenhofer J (2018) Overview of the GermEval 2018 shared task on the identification of offensive language. In: Ruppenhofer J, Siegel M, Wiegand M (eds) Proceedings of GermEval 2018, 14th Conference on Natural Language Processing (KONVENS 2018), pp 1\u201310"},{"key":"438_CR31","doi-asserted-by":"publisher","first-page":"1391","DOI":"10.1145\/3038912.3052591","volume-title":"Proceedings of the 26th International Conference on World Wide Web","author":"E Wulczyn","year":"2017","unstructured":"Wulczyn E, Thain N, Dixon L (2017) Ex machina: Personal attacks seen at scale. In: Proceedings of the 26th International Conference on World Wide Web, pp 1391\u20131399 https:\/\/doi.org\/10.1145\/3038912.3052591"},{"key":"438_CR32","doi-asserted-by":"publisher","first-page":"4699","DOI":"10.18653\/v1\/2021.findings-emnlp.402","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic","author":"T Wullach","year":"2021","unstructured":"Wullach T, Adler A, Minkov E (2021) Fight fire with fire: Fine-tuning hate detectors using large samples of generated hate speech. In: Findings of the Association for Computational Linguistics: EMNLP 2021. Association for Computational Linguistics, Punta Cana, Dominican Republic, pp 4699\u20134705 https:\/\/doi.org\/10.18653\/v1\/2021.findings-emnlp.402"},{"key":"438_CR33","series-title":"Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies","doi-asserted-by":"publisher","first-page":"1415","DOI":"10.18653\/v1\/N19-1144","volume-title":"Long and short papers","author":"M Zampieri","year":"2019","unstructured":"Zampieri M, Malmasi S, Nakov P et al (2019) Predicting the type and target of offensive posts in social media. In: Long and short papers. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol 1. Association for Computational Linguistics, Minneapolis, pp 1415\u20131420 https:\/\/doi.org\/10.18653\/v1\/N19-1144"}],"container-title":["Datenbank-Spektrum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13222-023-00438-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13222-023-00438-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13222-023-00438-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T15:43:15Z","timestamp":1684165395000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s13222-023-00438-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,3]]}},"alternative-id":["438"],"URL":"https:\/\/doi.org\/10.1007\/s13222-023-00438-1","relation":{},"ISSN":["1618-2162","1610-1995"],"issn-type":[{"type":"print","value":"1618-2162"},{"type":"electronic","value":"1610-1995"}],"subject":[],"published":{"date-parts":[[2023,3]]},"assertion":[{"value":"18 October 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 February 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 March 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}