{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T13:46:25Z","timestamp":1778852785944,"version":"3.51.4"},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2023,8,13]],"date-time":"2023-08-13T00:00:00Z","timestamp":1691884800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,8,13]],"date-time":"2023-08-13T00:00:00Z","timestamp":1691884800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100012533","name":"Minia University","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100012533","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Lang Resources &amp; Evaluation"],"published-print":{"date-parts":[[2024,6]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Since cyberbullying impacts both individual victims and entire society, research on abusive language and its detection has attracted attention in recent years. Because social media sites like Facebook, Instagram, Twitter, and others are so widely accessible, hate speech, bullying, sexism, racism, aggressive material, harassment, poisonous comments, and other types of abuse have all substantially increased. Due to the critical requirement to detect, regulate, and limit the spread of harmful content on social networking sites, we conducted this study to automate the detection of offensive language or cyberbullying. We created a new Arabic balanced data set to be used in the offensive detection process because having a balanced data set for a model would result in improved accuracy models. 
Recently, the performance of single classifiers has been improved using ensemble machine learning. The purpose of this study is to examine the effectiveness of several single and ensemble machine learning algorithms in identifying Arabic text that contains foul language and cyberbullying. To this end, we selected three single machine learning classifiers and three ensemble models and applied them to three Arabic datasets. Two are publicly available offensive-language datasets, while the third was created for this study. The results showed that the single learner machine learning strategy is inferior to the ensemble machine learning methodology. Voting is the best-performing trained ensemble machine learning classifier, with accuracy scores of 71.1%, 76.7%, and 98.5% on the three datasets, outperforming the best single learner classifier (65.1%, 76.2%, and 98%) on the same datasets. Finally, we improve the voting technique\u2019s performance through hyperparameter tuning on the Arabic cyberbullying data set.<\/jats:p>","DOI":"10.1007\/s10579-023-09683-y","type":"journal-article","created":{"date-parts":[[2023,8,13]],"date-time":"2023-08-13T12:01:35Z","timestamp":1691928095000},"page":"695-712","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["Comparative performance of ensemble machine learning for Arabic cyberbullying and offensive language detection"],"prefix":"10.1007","volume":"58","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0124-7908","authenticated-orcid":false,"given":"Marwa","family":"Khairy","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0309-8088","authenticated-orcid":false,"given":"Tarek 
M.","family":"Mahmoud","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3105-0417","authenticated-orcid":false,"given":"Ahmed","family":"Omar","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1785-1058","authenticated-orcid":false,"given":"Tarek","family":"Abd El-Hafeez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2023,8,13]]},"reference":[{"issue":"6","key":"9683_CR2","doi-asserted-by":"publisher","first-page":"17","DOI":"10.5121\/ijdkp.2016.6602","volume":"6","author":"EA Abozinadah","year":"2016","unstructured":"Abozinadah, E. A., & Jones, J. H., Jr. (2016). Improved micro-blog classification for detecting abusive Arabic Twitter accounts. International Journal of Data Mining & Knowledge Management Process, 6(6), 17\u201328.","journal-title":"International Journal of Data Mining & Knowledge Management Process"},{"key":"9683_CR1","doi-asserted-by":"crossref","unstructured":"Abozinadah, E. A., & Jones, J. H., Jr. (2017). A statistical learning approach to detect abusive Twitter accounts. In Proceedings of the international conference on computing data analysis\u2014ICCDA \u201817 (pp. 6\u201313).","DOI":"10.1145\/3093241.3093281"},{"issue":"2","key":"9683_CR3","doi-asserted-by":"publisher","first-page":"113","DOI":"10.7763\/IJKE.2015.V1.19","volume":"1","author":"EA Abozinadah","year":"2015","unstructured":"Abozinadah, E. A., Mbaziira, A. V., & Jones, J. H., Jr. (2015). Detection of abusive accounts with Arabic tweets. 
International Journal of Knowledge Engineering, 1(2), 113\u2013119.","journal-title":"International Journal of Knowledge Engineering"},{"key":"9683_CR4","doi-asserted-by":"publisher","first-page":"174","DOI":"10.1016\/j.procs.2018.10.473","volume":"142","author":"A Alakrot","year":"2018","unstructured":"Alakrot, A., Murray, L., & Nikolov, N. S. (2018). Dataset construction for the detection of anti-social behavior in online communication in Arabic. Procedia Computer Science, 142, 174\u2013181.","journal-title":"Procedia Computer Science"},{"key":"9683_CR5","doi-asserted-by":"crossref","unstructured":"Alam, K. S., Bhowmik, S., & Prosun, P. R. K. (2021). Cyberbullying detection: An ensemble based machine learning approach. In 2021 third international conference on intelligent communication technologies and virtual mobile networks (ICICV) (pp. 710\u2013715).","DOI":"10.1109\/ICICV50876.2021.9388499"},{"key":"9683_CR6","unstructured":"Brownlee, J. (2016). Machine learning mastery with Python (Vol. 527, pp. 100\u2013120). Machine Learning Mastery Pty Ltd"},{"key":"9683_CR7","unstructured":"Bushr, H., Zoher, O., Anas, A., & Nada, G. (2020). Arabic offensive language detection with attention-based deep neural networks. In Proceedings of the 4th workshop on open-source Arabic corpora and processing tools (pp. 76\u201381)."},{"key":"9683_CR8","doi-asserted-by":"publisher","unstructured":"Dietterich T. G. (2000). Ensemble methods in machine learning. In Multiple classifier systems. MCS 2000. Lecture notes in computer science (Vol. 1857). Springer. https:\/\/doi.org\/10.1007\/3-540-45014-9_1","DOI":"10.1007\/3-540-45014-9_1"},{"key":"9683_CR9","doi-asserted-by":"publisher","unstructured":"D\u017eeroski, S., Panov, P., & \u017denko, B. (2009). Machine learning, ensemble methods in. In R. Meyers (Ed.), Encyclopedia of complexity and systems science. Springer. 
https:\/\/doi.org\/10.1007\/978-0-387-30440-3_315","DOI":"10.1007\/978-0-387-30440-3_315"},{"key":"9683_CR10","doi-asserted-by":"crossref","unstructured":"Haidar, B., Chamoun, M., & Serhrouchni, A. (2019). Arabic cyberbullying detection enhancing performance by using ensemble machine learning. In International conference of Internet of Things (pp. 323\u2013327).","DOI":"10.1109\/iThings\/GreenCom\/CPSCom\/SmartData.2019.00074"},{"key":"9683_CR11","unstructured":"https:\/\/github.com\/omammar167\/Arabic-Abusive-Datasets"},{"key":"9683_CR12","unstructured":"Husain, F. (2020). Arabic offensive language detection using machine learning and ensemble machine learning approaches. ArXiv Preprint. https:\/\/arxiv.org\/abs\/2005.08946"},{"key":"9683_CR14","doi-asserted-by":"publisher","unstructured":"Khairy, M., Mahmoud, T. M., Abd-El-Hafeez, T., & Mahfouz, A. (2021). User awareness of privacy, reporting system and cyberbullying on Facebook. In A. E. Hassanien, K. C. Chang, & T. Mincong (Eds.), Advanced machine learning technologies and applications. AMLTA 2021. Advances in intelligent systems and computing.  (Vol. 1339). Springer. https:\/\/doi.org\/10.1007\/978-3-030-69717-4_58","DOI":"10.1007\/978-3-030-69717-4_58"},{"key":"9683_CR15","doi-asserted-by":"publisher","first-page":"211","DOI":"10.30958\/ajmmc.1-3-4","volume":"1","author":"M Meng\u00fc","year":"2015","unstructured":"Meng\u00fc, M., & Meng\u00fc, S. (2015). Violence and social media. Athens Journal of Mass Media and Communications, 1, 211\u2013228.","journal-title":"Athens Journal of Mass Media and Communications"},{"key":"9683_CR16","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1016\/j.eswa.2018.03.058","volume":"106","author":"MM Mironczuk","year":"2018","unstructured":"Mironczuk, M. M., & Protasiewicz, J. (2018). A recent overview of the state-of-the-art elements of text classification. Expert Systems with Applications, 106, 36\u201354. 
https:\/\/doi.org\/10.1016\/j.eswa.2018.03.058","journal-title":"Expert Systems with Applications"},{"key":"9683_CR17","doi-asserted-by":"crossref","unstructured":"Mubarak, H., & Darwish, K. (2019). Arabic offensive language classification on twitter. In: International conference on social informatics (pp. 269\u2013276). Springer.","DOI":"10.1007\/978-3-030-34971-4_18"},{"key":"9683_CR18","doi-asserted-by":"crossref","unstructured":"Mubarak, H., Darwish, K., & Magdy, W. (2017). Abusive language detection on Arabic social media. In Proceedings of the first workshop on abusive language online. Vancouver, Canada (pp. 52\u201356).","DOI":"10.18653\/v1\/W17-3008"},{"key":"9683_CR19","doi-asserted-by":"crossref","unstructured":"Nadali, S., Murad, M., Sharef, N., Mustapha, A., & Shojaee, S. (2013). A review of cyberbullying detection: An overview. In 13th international conference on intelligent systems design and applications, Bangi (pp. 325\u2013330).","DOI":"10.1109\/ISDA.2013.6920758"},{"key":"9683_CR20","unstructured":"Retrieved February 16, 2021, from http:\/\/istizada.com\/complete-list-of-arabic-speaking-countries-2014\/"},{"key":"9683_CR21","unstructured":"Retrieved June 2, 2021, from https:\/\/courses.analyticsvidhya.com\/courses\/ensemble-learning-and-ensemble-learning-techniques"},{"key":"9683_CR22","unstructured":"Salem, F. (2017). The Arab social media report 2017: Social media and the Internet of Things: Towards data-driven policymaking in the Arab world. MBR School of Government."},{"key":"9683_CR23","unstructured":"Shammur, A., Hamdy, M., Ahmed, A., Soongyo, J., Beard, J., & Joni, S. (2020). a multi-platform arabic news comment dataset for offensive language detection. In Proceedings of the 12th conference on language resources and evaluation (LREC 2020) (pp. 
6203\u20136212) Marseille, 11\u201316."},{"key":"9683_CR24","doi-asserted-by":"publisher","first-page":"815","DOI":"10.3390\/app8050815","volume":"8","author":"F Wei","year":"2018","unstructured":"Wei, F., Wenjiang, H., & Jinchang, R. (2018). Class imbalance ensemble learning based on the margin theory. Applied Sciences, 8, 815. https:\/\/doi.org\/10.3390\/app8050815","journal-title":"Applied Sciences"},{"issue":"7","key":"9683_CR25","doi-asserted-by":"publisher","first-page":"e67863","DOI":"10.1371\/journal.pone.0067863","volume":"8","author":"Q Wei","year":"2013","unstructured":"Wei, Q., & Dunbrack, R. L., Jr. (2013). The role of balanced training and testing data sets for binary classifiers in bioinformatics. PLoS ONE, 8(7), e67863. https:\/\/doi.org\/10.1371\/journal.pone.0067863","journal-title":"PLoS ONE"}],"container-title":["Language Resources and Evaluation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10579-023-09683-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10579-023-09683-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10579-023-09683-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T14:31:15Z","timestamp":1716906675000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10579-023-09683-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,13]]},"references-count":24,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,6]]}},"alternative-id":["9683"],"URL":"https:\/\/doi.org\/10.1007\/s10579-023-09683-y","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-1730412\/
v1","asserted-by":"object"}]},"ISSN":["1574-020X","1574-0218"],"issn-type":[{"value":"1574-020X","type":"print"},{"value":"1574-0218","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,13]]},"assertion":[{"value":"23 July 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 August 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and\/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}},{"value":"Informed consent was obtained from all individual participants included in the study.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Informed consent"}}]}}