{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T06:01:47Z","timestamp":1778047307747,"version":"3.51.4"},"reference-count":57,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2020,8,3]],"date-time":"2020-08-03T00:00:00Z","timestamp":1596412800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>The proliferation of social media platforms changed the way people interact online. However, engagement with social media comes with a price, the users\u2019 privacy. Breaches of users\u2019 privacy, such as the Cambridge Analytica scandal, can reveal how the users\u2019 data can be weaponized in political campaigns, which many times trigger hate speech and anti-immigration views. Hate speech detection is a challenging task due to the different sources of hate that can have an impact on the language used, as well as the lack of relevant annotated data. To tackle this, we collected and manually annotated an immigration-related dataset of publicly available Tweets in UK, US, and Canadian English. In an empirical study, we explored anti-immigration speech detection utilizing various language features (word n-grams, character n-grams) and measured their impact on a number of trained classifiers. Our work demonstrates that using word n-grams results in higher precision, recall, and f-score as compared to character n-grams. 
Finally, we discuss the implications of these results for future work on hate-speech detection and social media data analysis in general.<\/jats:p>","DOI":"10.3390\/make2030011","type":"journal-article","created":{"date-parts":[[2020,8,3]],"date-time":"2020-08-03T07:45:57Z","timestamp":1596440757000},"page":"192-215","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":22,"title":["Monitoring Users\u2019 Behavior: Anti-Immigration Speech Detection on Twitter"],"prefix":"10.3390","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3392-9970","authenticated-orcid":false,"given":"Nikolaos","family":"Pitropakis","sequence":"first","affiliation":[{"name":"School of Computing, Edinburgh Napier University, Edinburgh EH10 5DT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kamil","family":"Kokot","sequence":"additional","affiliation":[{"name":"School of Computing, Edinburgh Napier University, Edinburgh EH10 5DT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8568-7806","authenticated-orcid":false,"given":"Dimitra","family":"Gkatzia","sequence":"additional","affiliation":[{"name":"School of Computing, Edinburgh Napier University, Edinburgh EH10 5DT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robert","family":"Ludwiniak","sequence":"additional","affiliation":[{"name":"School of Computing, Edinburgh Napier University, Edinburgh EH10 5DT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8819-5831","authenticated-orcid":false,"given":"Alexios","family":"Mylonas","sequence":"additional","affiliation":[{"name":"School of Computing and Informatics, Bournemouth University, Poole BH12 5BB, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Miltiadis","family":"Kandias","sequence":"additional","affiliation":[{"name":"Department of Informatics, Athens 
University of Economics and Business, 104 34 Athina, Greece"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,8,3]]},"reference":[{"key":"ref_1","unstructured":"Battisby, A. (2019). The Latest UK Social Media Statistics for 2019, Avocado Social. Available online: https:\/\/avocadosocial.com\/latest-social-media-statistics-and-demographics-for-the-uk-in-2019\/."},{"key":"ref_2","unstructured":"(2020, July 30). We Are Social, Hootsuite. The Global Digital Report 2019. Available online: https:\/\/wearesocial.com\/global-digital-report-2019."},{"key":"ref_3","unstructured":"Dictionary, C. (2008). Cambridge Advanced Learner\u2019s Dictionary, Cambridge University Press."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1199","DOI":"10.1086\/669605","article-title":"Terrorist events and attitudes toward immigrants: A natural experiment","volume":"118","author":"Legewie","year":"2013","journal-title":"Am. J. Sociol."},{"key":"ref_5","unstructured":"Grierson, J. (The Guardian, 2018). Hostile environment: Anatomy of a policy disaster, The Guardian, p.1."},{"key":"ref_6","first-page":"35","article-title":"\u201cTake Back Control of Our Borders\u201d: The Role of Arguments about Controlling Immigration in the Brexit Debate","volume":"15","author":"Goodman","year":"2017","journal-title":"Rocznik Instytutu Europy \u015arodkowo-Wschodniej"},{"key":"ref_7","unstructured":"Aguilera, J. (Time, 2020). Xenophobia \u2018is a pre-existing condition\u2019. How harmful stereotypes and racism are spreading around the coronavirus, Time, p.1."},{"key":"ref_8","unstructured":"Siegel, A., Nikitin, E., Barber\u00e1, P., Sterling, J., Pullen, B., Bonneau, R., Nagler, J., and Tucker, J.A. (2019). Trumping Hate on Twitter? Online Hate Speech in the 2016 US Election Campaign and Its Aftermath, Alexandra Siegel. 
Available online: https:\/\/alexandra-siegel.com\/wp-content\/uploads\/2019\/08\/qjps_election_hatespeech_RR.pdf."},{"key":"ref_9","first-page":"22","article-title":"Revealed: 50 million Facebook profiles harvested for Cambridge Analytica in major data breach","volume":"17","author":"Cadwalladr","year":"2018","journal-title":"Guardian"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1093\/bjc\/azv059","article-title":"Cyberhate on social media in the aftermath of Woolwich: A case study in computational criminology and big data","volume":"56","author":"Williams","year":"2016","journal-title":"Br. J. Criminol."},{"key":"ref_11","unstructured":"i Orts, \u00d2.G. (2019, January 6\u20137). Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter at SemEval-2019 Task 5: Frequency Analysis Interpolation for Hate in Speech Detection. Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, MN, USA."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1080\/1369183X.2018.1451308","article-title":"Racism and xenophobia experienced by Polish migrants in the UK before and after Brexit vote","volume":"45","author":"Rzepnikowska","year":"2019","journal-title":"J. Ethnic Migr. Stud."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1080\/1369118X.2017.1388427","article-title":"# Islamexit: Inter-group antagonism on Twitter","volume":"22","author":"Evolvi","year":"2019","journal-title":"Inf. Commun. Soc."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1080\/08838151.2015.1127245","article-title":"Why we share: A uses and gratifications approach to privacy regulation in social media use","volume":"60","author":"Quinn","year":"2016","journal-title":"J. Broadcast. Electron. Media"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Whiting, A., and Williams, D. (2013). 
Why people use social media: A uses and gratifications approach. Qual. Market Res. Int. J.","DOI":"10.1108\/QMR-06-2013-0041"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1016\/j.intmar.2012.01.002","article-title":"How does brand-related user-generated content differ across YouTube, Facebook, and Twitter?","volume":"26","author":"Smith","year":"2012","journal-title":"J. Interact. Market."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1177\/0163443711427199","article-title":"The institutionalization of YouTube: From user-generated content to professionally generated content","volume":"34","author":"Kim","year":"2012","journal-title":"Media Cult. Soc."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1016\/j.chb.2016.09.024","article-title":"Social media engagement: What motivates user participation and consumption on YouTube?","volume":"66","author":"Khan","year":"2017","journal-title":"Comput. Hum. Behav."},{"key":"ref_19","unstructured":"Nations, D. (2017). What Is Microblogging?, Sprout Social."},{"key":"ref_20","first-page":"128","article-title":"Exploring identity motives in Twitter usage in Saudi Arabia and the UK","volume":"199","author":"Selim","year":"2014","journal-title":"Annu. Rev. Cyberther. Telemed."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"406","DOI":"10.1016\/j.chb.2009.11.012","article-title":"All about me: Disclosure in online social networking profiles: The case of FACEBOOK","volume":"26","author":"Nosko","year":"2010","journal-title":"Comput. Hum. Behav."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1177\/1461444816660731","article-title":"Social media cultivating perceptions of privacy: A 5-year analysis of privacy attitudes and self-disclosure behaviors among Facebook users","volume":"20","author":"Shanahan","year":"2018","journal-title":"New Media Soc."},{"key":"ref_23","unstructured":"Twitter (2020, May 15). 
Twitter Privacy Policy. Available online: https:\/\/twitter.com\/en\/privacy."},{"key":"ref_24","first-page":"3","article-title":"The privacy paradox on social network sites revisited: The role of individual characteristics and group norms","volume":"2","author":"Utz","year":"2009","journal-title":"Cyberpsychology"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1080\/08838151.2012.732140","article-title":"The impact of context collapse and privacy on social network site disclosures","volume":"56","author":"Vitak","year":"2012","journal-title":"J. Broadcast. Electron. Media"},{"key":"ref_26","first-page":"2015","article-title":"Ted Cruz using firm that harvested data on millions of unwitting Facebook users","volume":"11","author":"Davies","year":"2015","journal-title":"Guardian"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1109\/MC.2018.3191268","article-title":"User data privacy: Facebook, Cambridge Analytica, and privacy protection","volume":"51","author":"Isaak","year":"2018","journal-title":"Computer"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1002\/bsl.2370040405","article-title":"Criminal profiling from crime scene analysis","volume":"4","author":"Douglas","year":"1986","journal-title":"Behav. Sci. Law"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Kandias, M., Mylonas, A., Virvilis, N., Theoharidou, M., and Gritzalis, D. (2010). An insider threat prediction model. International Conference on Trust, Privacy and Security in Digital Business, Springer.","DOI":"10.1007\/978-3-642-15152-1_3"},{"key":"ref_30","unstructured":"Kandias, M., Mitrou, L., Stavrou, V., and Gritzalis, D. (2013, January 29\u201331). Which side are you on? A new Panopticon vs. privacy. 
Proceedings of the 2013 International Conference on Security and Cryptography (SECRYPT), Reykjavik, Iceland."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Kandias, M., Stavrou, V., Bozovic, N., Mitrou, L., and Gritzalis, D. (2013, January 18\u201321). Can we trust this user? Predicting insider\u2019s attitude via YouTube usage profiling. Proceedings of the 2013 IEEE 10th International Conference on Ubiquitous Intelligence and Computing, Vietri sul Mare, Italy.","DOI":"10.1109\/UIC-ATC.2013.12"},{"key":"ref_32","unstructured":"Mitrou, L., Kandias, M., Stavrou, V., and Gritzalis, D. (2014, January 7\u20139). Social media profiling: A Panopticon or Omniopticon tool?. Proceedings of the 6th Conference of the Surveillance Studies Network, Barcelona, Spain."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1109\/MCI.2014.2307227","article-title":"Jumping NLP curves: A review of natural language processing research","volume":"9","author":"Cambria","year":"2014","journal-title":"IEEE Computat. Intell. Mag."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"402","DOI":"10.1177\/0894439310386557","article-title":"Election forecasts with Twitter: How 140 characters reflect the political landscape","volume":"29","author":"Tumasjan","year":"2011","journal-title":"Soc. Sci. Comput. Rev."},{"key":"ref_35","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 3). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1016\/j.inffus.2016.10.004","article-title":"A review of natural language processing techniques for opinion mining systems","volume":"36","author":"Sun","year":"2017","journal-title":"Inf. 
Fusion"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Schmidt, A., and Wiegand, M. (2017, January 3). A survey on hate speech detection using natural language processing. Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media, Valencia, Spain.","DOI":"10.18653\/v1\/W17-1101"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Nobata, C., Tetreault, J., Thomas, A., Mehdad, Y., and Chang, Y. (2016, January 11\u201315). Abusive language detection in online user content. Proceedings of the 25th International Conference on World Wide Web, Montreal, QC, Canada.","DOI":"10.1145\/2872427.2883062"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1002\/poi3.85","article-title":"Cyber hate speech on twitter: An application of machine classification and statistical modeling for policy and decision making","volume":"7","author":"Burnap","year":"2015","journal-title":"Policy Internet"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1140\/epjds\/s13688-016-0072-6","article-title":"Us and them: Identifying cyber hate on Twitter across multiple protected characteristics","volume":"5","author":"Burnap","year":"2016","journal-title":"EPJ Data Sci."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Chen, Y., Zhou, Y., Zhu, S., and Xu, H. (2012, January 3\u20135). Detecting offensive language in social media to protect adolescent online safety. Proceedings of the 2012 International Conference on Privacy, Security, Risk and Trust, Amsterdam, The Netherlands.","DOI":"10.1109\/SocialCom-PASSAT.2012.55"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Kwok, I., and Wang, Y. (2013, January 14\u201318). Locate the hate: Detecting tweets against blacks. 
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, Washington, DC, USA.","DOI":"10.1609\/aaai.v27i1.8539"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Mehdad, Y., and Tetreault, J. (2016, January 13\u201315). Do characters abuse more than words?. Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, Los Angeles, CA, USA.","DOI":"10.18653\/v1\/W16-3638"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Waseem, Z., and Hovy, D. (2016, January 13\u201315). Hateful symbols or hateful people? predictive features for hate speech detection on twitter. Proceedings of the NAACL Student Research Workshop, San Diego, CA, USA.","DOI":"10.18653\/v1\/N16-2013"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Basile, V., Bosco, C., Fersini, E., Nozza, D., Patti, V., Pardo, F.M.R., Rosso, P., and Sanguinetti, M. (2019, January 6\u20137). Semeval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter. Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, MN, USA.","DOI":"10.18653\/v1\/S19-2007"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1002\/asi.21690","article-title":"Automatic identification of personal insults on social news sites","volume":"63","author":"Sood","year":"2012","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2362394.2362400","article-title":"Common sense reasoning for detection, prevention, and mitigation of cyberbullying","volume":"2","author":"Dinakar","year":"2012","journal-title":"ACM Trans. Interact. Intell. Syst."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"215","DOI":"10.14257\/ijmue.2015.10.4.21","article-title":"A lexicon-based approach for hate speech detection","volume":"10","author":"Gitari","year":"2015","journal-title":"Int. J. Multimed. 
Ubiquitous Eng."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Rodr\u00edguez, A., Argueta, C., and Chen, Y.L. (2019, January 11\u201313). Automatic detection of hate speech on facebook using sentiment and emotion analysis. Proceedings of the 2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Okinawa, Japan.","DOI":"10.1109\/ICAIIC.2019.8669073"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"4730","DOI":"10.1007\/s10489-018-1242-y","article-title":"Effective hate-speech detection in Twitter data using recurrent neural networks","volume":"48","author":"Pitsilis","year":"2018","journal-title":"Appl. Intell."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Robinson, D., and Tepper, J. (2018). Detecting hate speech on twitter using a convolution-gru based deep neural network. European Semantic Web Conference, Springer.","DOI":"10.1007\/978-3-319-93417-4_48"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Badjatiya, P., Gupta, S., Gupta, M., and Varma, V. (2017, January 3\u20137). Deep learning for hate speech detection in tweets. Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia.","DOI":"10.1145\/3041021.3054223"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Florio, K., Basile, V., Lai, M., and Patti, V. (2019, January 3\u20136). Leveraging Hate Speech Detection to Investigate Immigration-related Phenomena in Italy. Proceedings of the 2019 8th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), Cambridge, UK.","DOI":"10.1109\/ACIIW.2019.8925079"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"450","DOI":"10.1177\/1369148117710799","article-title":"Taking back control? Investigating the role of immigration in the 2016 vote for Brexit","volume":"19","author":"Goodwin","year":"2017","journal-title":"Br. J. Politics Int. 
Relat."},{"key":"ref_55","unstructured":"Wright, T. (2020, July 30). Majority of Canadians Think Immigration Should Be Limited: Poll. Global News. Available online: https:\/\/globalnews.ca\/news\/5397306\/canada-immigration-poll\/."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Kintis, P., Miramirkhani, N., Lever, C., Chen, Y., Romero-Gomez, R., Pitropakis, N., Nikiforakis, N., and Antonakakis, M. (November, January 30). Hiding in plain sight: A longitudinal study of combosquatting abuse. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA.","DOI":"10.1145\/3133956.3134002"},{"key":"ref_57","first-page":"2056305117733226","article-title":"Donald Trump\u2019s \u201cpolitical incorrectness\u201d: Neoliberalism as frontstage racism on social media","volume":"3","year":"2017","journal-title":"Soc. Media Soc."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/2\/3\/11\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:53:56Z","timestamp":1760176436000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/2\/3\/11"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,3]]},"references-count":57,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["make2030011"],"URL":"https:\/\/doi.org\/10.3390\/make2030011","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,8,3]]}}}