{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,21]],"date-time":"2026-05-21T16:14:25Z","timestamp":1779380065957,"version":"3.53.1"},"reference-count":53,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,10,26]],"date-time":"2023-10-26T00:00:00Z","timestamp":1698278400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Scientific Research, King Faisal University, Saudi Arabia","award":["GRANT4,055"],"award-info":[{"award-number":["GRANT4,055"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Social media platforms have become the primary means of communication and information sharing, facilitating interactive exchanges among users. Unfortunately, these platforms also witness the dissemination of inappropriate and toxic content, including hate speech and insults. While significant efforts have been made to classify toxic content in the English language, the same level of attention has not been given to Arabic texts. This study addresses this gap by constructing a standardized Arabic dataset specifically designed for toxic tweet classification. The dataset is annotated automatically using Google\u2019s Perspective API and the expertise of three native Arabic speakers and linguists. To evaluate the performance of different models, we conduct a series of experiments using seven models: long short-term memory (LSTM), bidirectional LSTM, a convolutional neural network, a gated recurrent unit (GRU), bidirectional GRU, multilingual bidirectional encoder representations from transformers, and AraBERT. Additionally, we employ word embedding techniques. Our experimental findings demonstrate that the fine-tuned AraBERT model surpasses the performance of other models, achieving an impressive accuracy of 0.9960. 
Notably, this accuracy value outperforms similar approaches reported in recent literature. This study represents a significant advancement in Arabic toxic tweet classification, shedding light on the importance of addressing toxicity in social media platforms while considering diverse languages and cultures.<\/jats:p>","DOI":"10.3390\/bdcc7040170","type":"journal-article","created":{"date-parts":[[2023,10,26]],"date-time":"2023-10-26T06:20:16Z","timestamp":1698301216000},"page":"170","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":35,"title":["Arabic Toxic Tweet Classification: Leveraging the AraBERT Model"],"prefix":"10.3390","volume":"7","author":[{"given":"Amr Mohamed El","family":"Koshiry","sequence":"first","affiliation":[{"name":"Department of Curricula and Teaching Methods, College of Education, King Faisal University, P.O. Box 400, Al-Ahsa 31982, Saudi Arabia"},{"name":"Faculty of Specific Education, Minia University, Minia 61519, Egypt"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3217-2889","authenticated-orcid":false,"given":"Entesar Hamed I.","family":"Eliwa","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, College of Science, King Faisal University, P.O. 
Box 400, Al-Ahsa 31982, Saudi Arabia"},{"name":"Department of Computer Science, Faculty of Science, Minia University, Minia 61519, Egypt"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1785-1058","authenticated-orcid":false,"given":"Tarek","family":"Abd El-Hafeez","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Faculty of Science, Minia University, Minia 61519, Egypt"},{"name":"Computer Science Unit, Deraya University, Minia 61765, Egypt"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ahmed","family":"Omar","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Faculty of Science, Minia University, Minia 61519, Egypt"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1016\/j.neucom.2021.11.095","article-title":"Defining and detecting toxicity on social media: Context and knowledge are key","volume":"490","author":"Sheth","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_2","first-page":"7547","article-title":"AlexNet architecture based convolutional neural network for toxic comments classification","volume":"34","author":"Singh","year":"2022","journal-title":"J. King Saud Univ.\u2014Comput. Inf. Sci."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Chakrabarty, N. (2019). A Machine Learning Approach to Comment Toxicity Classification, Springer.","DOI":"10.1007\/978-981-13-9042-5_16"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"101785","DOI":"10.1016\/j.is.2021.101785","article-title":"Multi-label Arabic text classification in Online Social Networks","volume":"100","author":"Omar","year":"2021","journal-title":"Inf. Syst."},{"key":"ref_5","unstructured":"Omar, A., Mahmoud, T.M., and Abd-El-Hafeez, T. (2018). 
The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2018), Springer-Advances in Intelligent Systems and Computing."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Aldjanabi, W., Dahou, A., Al-Qaness, M.A.A., Elaziz, M.A., Helmi, A.M., and Dama\u0161evi\u010dius, R. (2021). Arabic offensive and hate speech detection using a cross-corpora multi-task learning model. Informatics, 8.","DOI":"10.3390\/informatics8040069"},{"key":"ref_7","unstructured":"Mubarak, H., Darwish, K., Magdy, W., Elsayed, T., and Al-Khalifa, H. (2020). Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, European Language Resource Association."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Mulki, H., Haddad, H., Ali, C.B., and Alshabani, H. (2019, January 1). L-HSAB: A Levantine Twitter Dataset for Hate Speech and Abusive Language. Proceedings of the Third Workshop on Abusive Language Online, Florence, Italy.","DOI":"10.18653\/v1\/W19-3512"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Haddad, H., Mulki, H., and Oueslati, A. (2019, January 6\u201317). T-hsab: A tunisian hate speech and abusive dataset. Proceedings of the International Conference on Arabic Language Processing, Nancy, France.","DOI":"10.1007\/978-3-030-32959-4_18"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1621","DOI":"10.1080\/08839514.2021.1988443","article-title":"Semi-Supervised Self-Training of Hate and Offensive Speech from Social Media","volume":"35","author":"Alsafari","year":"2021","journal-title":"Appl. Artif. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Muaad, A.Y., Davanagere, H.J., Al-antari, M.A., Benifa, J.V.B., and Chola, C. (2022). AI-Based Misogyny Detection from Arabic Levantine Twitter Tweets. Comput. Sci. Math. 
Forum, 2.","DOI":"10.3390\/IOCA2021-10880"},{"key":"ref_12","unstructured":"Farha, I.A., and Magdy, W. (2020, January 12). Multitask Learning for Arabic Offensive Language and Hate-Speech Detection. Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, Marseille, France."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Alshalan, R., and Al-Khalifa, H. (2020). A deep learning approach for automatic hate speech detection in the saudi twittersphere. Appl. Sci., 10.","DOI":"10.3390\/app10238614"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Albayari, R., and Abdallah, S. (2022). Instagram-Based Benchmark Dataset for Cyberbullying Detection in Arabic Text. Data, 7.","DOI":"10.3390\/data7070083"},{"key":"ref_15","first-page":"972","article-title":"BERT-based Approach to Arabic Hate Speech and Offensive Language Detection in Twitter: Exploiting Emojis and Sentiment Analysis","volume":"13","author":"Althobaiti","year":"2022","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Mubarak, H., Hassan, S., and Chowdhury, S.A. (2022). Emojis as Anchors to Detect Arabic Offensive Language and Hate Speech. arXiv.","DOI":"10.1017\/S1351324923000402"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Reynolds, K., Kontostathis, A., and Edwards, L. (2011, January 18\u201321). Using machine learning to detect cyberbullying. Proceedings of the 2011 10th International Conference on Machine Learning and Applications and Workshops, Honolulu, HI, USA.","DOI":"10.1109\/ICMLA.2011.152"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2362394.2362400","article-title":"Common sense reasoning for detection, prevention, and mitigation of cyberbullying","volume":"2","author":"Dinakar","year":"2012","journal-title":"ACM Trans. Interact. Intell. Syst."},{"key":"ref_19","unstructured":"Nahar, V., Li, X., Pang, C., and Zhang, Y. 
(2013, January 13\u201315). Cyberbullying detection based on text-stream classification. Proceedings of the 11th Australasian Data Mining Conference (AusDM 2013), Canberra, Australia."},{"key":"ref_20","unstructured":"Dadvar, M., Trieschnigg, D., Ordelman, R., and De Jong, F. (2013). Advances in Information Retrieval, Proceedings of the 35th European Conference on IR Research, ECIR 2013, Moscow, Russia, 24\u201327 March 2013, Springer. Proceedings 35."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Feng, W., Huang, W., and Ren, J. (2018). Class imbalance ensemble learning based on the margin theory. Appl. Sci., 8.","DOI":"10.3390\/app8050815"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chavan, V.S., and Shylaja, S.S. (2015, January 10\u201313). Machine learning approach for detection of cyber-aggressive comments by peers on social media network. Proceedings of the 2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Kochi, India.","DOI":"10.1109\/ICACCI.2015.7275970"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Mangaonkar, A., Hayrapetian, A., and Raje, R. (2015, January 21\u201323). Collaborative detection of cyberbullying behavior in Twitter data. Proceedings of the 2015 IEEE International Conference on Electro\/Information Technology (EIT), DeKalb, IL, USA.","DOI":"10.1109\/EIT.2015.7293405"},{"key":"ref_24","unstructured":"Van Hee, C., Lefever, E., Verhoeven, B., Mennes, J., Desmet, B., De Pauw, G., Daelemans, W., and Hoste, V. (2015, January 7\u20139). Detection and fine-grained classification of cyberbullying events. 
Proceedings of the International Conference Recent Advances in Natural Language Processing, Hissar, Bulgaria."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.ijcci.2016.07.002","article-title":"Sustainable cyberbullying detection with category-maximized relevance of harmful phrases and double-filtered automatic optimization","volume":"8","author":"Ptaszynski","year":"2016","journal-title":"Int. J. Child-Comput. Interact."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Singh, V.K., Huang, Q., and Atrey, P.K. (2016, January 18\u201321). Cyberbullying detection using probabilistic socio-textual information fusion. Proceedings of the 2016 IEEE\/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, CA, USA.","DOI":"10.1109\/ASONAM.2016.7752342"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1016\/j.chb.2016.05.051","article-title":"Cybercrime detection in online communications: The experimental case of cyberbullying detection in the Twitter network","volume":"63","author":"Varathan","year":"2016","journal-title":"Comput. Hum. Behav."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhao, R., Zhou, A., and Mao, K. (2016, January 4\u20137). Automatic detection of cyberbullying on social networks based on bullying features. Proceedings of the 17th International Conference on Distributed Computing and Networking, Singapore.","DOI":"10.1145\/2833312.2849567"},{"key":"ref_29","first-page":"17","article-title":"Automatic monitoring and prevention of cyberbullying","volume":"8","author":"Sugandhi","year":"2016","journal-title":"Int. J. Comput. Appl."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Hosseinmardi, H., Rafiq, R.I., Han, R., Lv, Q., and Mishra, S. (2016, January 18\u201321). Prediction of cyberbullying incidents in a media-based social network. 
Proceedings of the 2016 IEEE\/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, CA, USA.","DOI":"10.1109\/ASONAM.2016.7752233"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhang, X., Tong, J., Vishwamitra, N., Whittaker, E., Mazer, J.P., Kowalski, R., Hu, H., Luo, F., Macbeth, J., and Dillon, E. (2016, January 18\u201320). Cyberbullying detection with a pronunciation based convolutional neural network. Proceedings of the 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), Anaheim, CA, USA.","DOI":"10.1109\/ICMLA.2016.0132"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1016\/j.chb.2018.12.021","article-title":"Automatic cyberbullying detection: A systematic review","volume":"93","author":"Rosa","year":"2019","journal-title":"Comput. Hum. Behav."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"275","DOI":"10.25046\/aj020634","article-title":"A Multilingual System for Cyberbullying Detection: Arabic Content Detection using Machine Learning","volume":"2","author":"Haidar","year":"2017","journal-title":"Adv. Sci. Technol. Eng. Syst. J."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Haidar, B., Chamoun, M., and Serhrouchni, A. (2018, January 19\u201320). Arabic cyberbullying detection: Using deep learning. Proceedings of the 2018 7th International Conference on Computer and Communication Engineering (ICCCE), Kuala Lumpur, Malaysia.","DOI":"10.1109\/ICCCE.2018.8539303"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Haidar, B., Chamoun, M., and Serhrouchni, A. (2019, January 14\u201317). Arabic cyberbullying detection: Enhancing performance by using ensemble machine learning. 
Proceedings of the 2019 International Conference on Internet of Things (Ithings) and Ieee Green Computing and Communications (Greencom) and IEEE Cyber, Physical and Social Computing (Cpscom) and IEEE Smart Data (Smartdata), Atlanta, GA, USA.","DOI":"10.1109\/iThings\/GreenCom\/CPSCom\/SmartData.2019.00074"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Mouheb, D., Abushamleh, M.H., Abushamleh, M.H., Al Aghbari, Z., and Kamel, I. (2019, January 24\u201326). Real-time detection of cyberbullying in arabic twitter streams. Proceedings of the 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS), Canary Islands, Spain.","DOI":"10.1109\/NTMS.2019.8763808"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Mouheb, D., Albarghash, R., Mowakeh, M.F., Al Aghbari, Z., and Kamel, I. (2019, January 3\u20137). Detection of Arabic cyberbullying on social networks using machine learning. Proceedings of the 2019 IEEE\/ACS 16th International Conference on Computer Systems and Applications (AICCSA), Abu Dhabi, United Arab Emirates.","DOI":"10.1109\/AICCSA47632.2019.9035276"},{"key":"ref_38","first-page":"2330","article-title":"Automatic cyber bullying detection in Arabic social media","volume":"12","author":"AlHarbi","year":"2019","journal-title":"Int. J. Eng. Res. Technol."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Rachid, B.A., Azza, H., and Ghezala, H.H.B. (2020, January 19\u201324). Classification of cyberbullying text in Arabic. Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9206643"},{"key":"ref_40","first-page":"1409","article-title":"Cyber-bullying and cyber-harassment detection using supervised machine learning techniques in Arabic social media contents","volume":"21","author":"Kanan","year":"2020","journal-title":"J. 
Internet Technol."},{"key":"ref_41","first-page":"34","article-title":"Detection of cyberbullying in tweets in Egyptian dialects","volume":"18","author":"Farid","year":"2020","journal-title":"Int. J. Comput. Sci. Inf. Secur. IJCSIS"},{"key":"ref_42","first-page":"123","article-title":"Using machine learning algorithms for automatic cyber bullying detection in Arabic social media","volume":"12","author":"AlHarbi","year":"2020","journal-title":"J. Inf. Technol. Manag."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"7585","DOI":"10.1016\/j.aej.2022.01.011","article-title":"Comparative analysis of Gated Recurrent Units (GRU), long Short-Term memory (LSTM) cells, autoregressive Integrated moving average (ARIMA), seasonal autoregressive Integrated moving average (SARIMA) for forecasting COVID-19 trends","volume":"61","author":"ArunKumar","year":"2022","journal-title":"Alex. Eng. J."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"106363","DOI":"10.1109\/ACCESS.2021.3100435","article-title":"Detecting White Supremacist Hate Speech Using Domain Specific Word Embedding with Deep Learning and BERT","volume":"9","author":"Alatawi","year":"2021","journal-title":"IEEE Access"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Cho, K., van Merri\u00ebnboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014, January 25\u201329). Learning phrase representations using RNN encoder-decoder for statistical machine translation. Proceedings of the EMNLP 2014\u20142014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar.","DOI":"10.3115\/v1\/D14-1179"},{"key":"ref_46","unstructured":"Antoun, W., Baly, F., and Hajj, H. (2020). AraBERT: Transformer-based Model for Arabic Language Understanding. arXiv."},{"key":"ref_47","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2019, January 2\u20137). BERT: Pre-training of deep bidirectional transformers for language understanding. 
Proceedings of the NAACL HLT 2019\u2014Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, MN, USA."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Omar, A., Mahmoud, T.M., and Abd-El-Hafeez, T. (2020). Comparative Performance of Machine Learning and Deep Learning Algorithms for Arabic Hate Speech Detection in OSNs, Springer International Publishing.","DOI":"10.1007\/978-3-030-44289-7_24"},{"key":"ref_49","unstructured":"Twitter (2022, January 01). Twitter API Wiki\/Twitter API Documentation. Available online: http:\/\/apiwiki.twitter.com\/w\/page\/22554679\/Twitter-API-Documentation."},{"key":"ref_50","unstructured":"Google and Jigsaw (2022, February 01). Perspective API. Available online: https:\/\/perspectiveapi.com."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Han, X., and Tsvetkov, Y. (2020, January 16\u201320). Fortifying Toxic Speech Detectors Against Veiled Toxicity. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online.","DOI":"10.18653\/v1\/2020.emnlp-main.622"},{"key":"ref_52","first-page":"100019","article-title":"PROVOKE: Toxicity trigger detection in conversations from the top 100 subreddits","volume":"6","author":"Almerekhi","year":"2022","journal-title":"Data Inf. Manag."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Pavlopoulos, J., Thain, N., Dixon, L., and Androutsopoulos, I. (2019, January 6\u20137). ConvAI at SemEval-2019 Task 6: Offensive Language Identification and Categorization with Perspective and BERT. 
Proceedings of the 13th International Workshop on Semantic Evaluation, Minneapolis, MN, USA.","DOI":"10.18653\/v1\/S19-2102"}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/7\/4\/170\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:12:14Z","timestamp":1760130734000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/7\/4\/170"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,26]]},"references-count":53,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["bdcc7040170"],"URL":"https:\/\/doi.org\/10.3390\/bdcc7040170","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,26]]}}}