{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T06:03:46Z","timestamp":1776060226730,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2024,2,6]],"date-time":"2024-02-06T00:00:00Z","timestamp":1707177600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2022QY1403"],"award-info":[{"award-number":["2022QY1403"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>The popularization of the internet and the widespread use of smartphones have led to a rapid growth in the number of social media users. While information technology has brought convenience to people, it has also given rise to cyberbullying, which has a serious negative impact. The identity of online users is hidden, and due to the lack of supervision and the imperfections of relevant laws and policies, cyberbullying occurs from time to time, bringing serious mental harm and psychological trauma to the victims. The pre-trained language model BERT (Bidirectional Encoder Representations from Transformers) has achieved good results in the field of natural language processing, which can be used for cyberbullying detection. In this research, we construct a variety of traditional machine learning, deep learning and Chinese pre-trained language models as a baseline, and propose a hybrid model based on a variant of BERT: XLNet, and deep Bi-LSTM for Chinese cyberbullying detection. In addition, real cyber bullying remarks are collected to expand the Chinese offensive language dataset COLDATASET. 
The performance of the proposed model outperforms all baseline models on this dataset, improving 4.29% compared to SVM\u2014the best performing method in traditional machine learning, 1.49% compared to GRU\u2014the best performing method in deep learning, and 1.13% compared to BERT.<\/jats:p>","DOI":"10.3390\/info15020093","type":"journal-article","created":{"date-parts":[[2024,2,6]],"date-time":"2024-02-06T12:38:24Z","timestamp":1707223104000},"page":"93","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model"],"prefix":"10.3390","volume":"15","author":[{"given":"Shifeng","family":"Chen","sequence":"first","affiliation":[{"name":"School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China"}]},{"given":"Jialin","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China"}]},{"given":"Ketai","family":"He","sequence":"additional","affiliation":[{"name":"School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,2,6]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"23973","DOI":"10.1007\/s11042-019-7234-z","article-title":"Cyberbullying detection on social multimedia using soft computing techniques: A meta-analysis","volume":"78","author":"Kumar","year":"2019","journal-title":"Multimed. Tools Appl."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1111\/j.1469-7610.2007.01846.x","article-title":"Cyberbullying: Its nature and impact in secondary school pupils","volume":"49","author":"Smith","year":"2008","journal-title":"J. Child Psychol. 
Psychiatry"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1089\/cyber.2019.0370","article-title":"Cyberbullying and children and young people\u2019s mental health: A systematic map of systematic reviews","volume":"23","author":"Kwan","year":"2020","journal-title":"Cyberpsychol. Behav. Soc. Netw."},{"key":"ref_4","unstructured":"Smith, P.K., Del Barrio, C., and Tokunaga, R.S. (2013). Principles of Cyberbullying Research: Definitions, Measures, and Methodology, Routledge."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"S148","DOI":"10.1542\/peds.2016-1758U","article-title":"Defining cyberbullying","volume":"140","author":"Englander","year":"2017","journal-title":"Pediatrics"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1016\/j.appdev.2013.04.002","article-title":"Relevant dimensions of cyberbullying\u2014Results from two experimental studies","volume":"34","author":"Pieschl","year":"2013","journal-title":"J. Appl. Dev. Psychol."},{"key":"ref_7","first-page":"143","article-title":"Current perspectives: The impact of cyberbullying on adolescent health","volume":"5","author":"Nixon","year":"2014","journal-title":"Adolesc. Health Med. Ther."},{"key":"ref_8","first-page":"182","article-title":"Cyberbullying versus face-to-face bullying: A theoretical and conceptual review","volume":"217","author":"Dooley","year":"2009","journal-title":"Z. Psychol. Psychol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.chb.2012.05.024","article-title":"The nature of cyberbullying, and strategies for prevention","volume":"29","author":"Slonje","year":"2013","journal-title":"Comput. Hum. 
Behav."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"634909","DOI":"10.3389\/fpubh.2021.634909","article-title":"Cyberbullying among adolescents and children: A comprehensive review of the global situation, risk factors, and preventive measures","volume":"9","author":"Zhu","year":"2021","journal-title":"Front. Public Health"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Deng, J., Zhou, J., Sun, H., Zheng, C., Mi, F., Meng, H., and Huang, M. (2022). Cold: A benchmark for chinese offensive language detection. arXiv.","DOI":"10.18653\/v1\/2022.emnlp-main.796"},{"key":"ref_12","unstructured":"Yin, D., Xue, Z., Hong, L., Davison, B.D., Kontostathis, A., and Edwards, L. (2009, January 21). Detection of harassment on web 2.0. Proceedings of the Content Analysis in the WEB, Madrid, Spain."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Reynolds, K., Kontostathis, A., and Edwards, L. (2011, January 18\u201321). Using machine learning to detect cyberbullying. Proceedings of the 2011 10th International Conference on Machine Learning and Applications and Workshops, Honolulu, HI, USA.","DOI":"10.1109\/ICMLA.2011.152"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1609\/icwsm.v5i3.14209","article-title":"Modeling the detection of textual cyberbullying","volume":"5","author":"Dinakar","year":"2011","journal-title":"Proc. Int. Aaai Conf. Web Soc. Media"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1007\/s13042-015-0463-1","article-title":"Content based approach to find the credibility of user in social networks: An application of cyberbullying","volume":"8","author":"Sarna","year":"2017","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Islam, M.M., Uddin, M.A., Islam, L., Akter, A., Sharmin, S., and Acharjee, U.K. (2020, January 16\u201318). Cyberbullying detection on social networks using machine learning approaches. 
Proceedings of the 2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), Gold Coast, Australia.","DOI":"10.1109\/CSDE50874.2020.9411601"},{"key":"ref_17","unstructured":"Zhang, A., Li, B., Wan, S., and Wang, K. (2019). International Conference on Machine Learning and Intelligent Communications, Springer International Publishing."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1186\/s40537-021-00550-7","article-title":"Cyberbullying detection: Advanced preprocessing techniques & deep learning architecture for Roman Urdu data","volume":"8","author":"Dewani","year":"2021","journal-title":"J. Big Data"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"102616","DOI":"10.1016\/j.ipm.2021.102616","article-title":"Improving classifier training efficiency for automatic cyberbullying detection with feature density","volume":"58","author":"Eronen","year":"2021","journal-title":"Inf. Process. Manag."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1537","DOI":"10.1007\/s11280-021-00920-4","article-title":"A Bi-GRU with attention and CapsNet hybrid model for cyberbullying detection on social media","volume":"25","author":"Kumar","year":"2022","journal-title":"World Wide Web"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"6644652","DOI":"10.1155\/2021\/6644652","article-title":"Nature-inspired-based approach for automated cyberbullying classification on multimedia social networking","volume":"2021","author":"Yuvaraj","year":"2021","journal-title":"Math. Probl. Eng."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1897","DOI":"10.1007\/s00530-020-00710-4","article-title":"CyberBERT: BERT for cyberbullying identification","volume":"28","author":"Paul","year":"2022","journal-title":"Multimed. 
Syst."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1941","DOI":"10.1007\/s00530-020-00690-5","article-title":"ALBERT-based fine-tuning model for cyberbullying analysis","volume":"28","author":"Tripathy","year":"2022","journal-title":"Multimed. Syst."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"113362","DOI":"10.1016\/j.dss.2020.113362","article-title":"Antisocial online behavior detection using deep learning","volume":"138","author":"Zinovyeva","year":"2020","journal-title":"Decis. Support Syst."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"126232","DOI":"10.1016\/j.neucom.2023.126232","article-title":"A systematic review of Hate Speech automatic detection using Natural Language Processing","volume":"546","author":"Jahan","year":"2023","journal-title":"Neurocomputing"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Li, W. (2019, January 23\u201325). A Content-Based Approach for Analysing Cyberbullying on Sina Weibo. Proceedings of the 2nd International Conference on Information Management and Management Sciences, Chengdu, China.","DOI":"10.1145\/3357292.3357294"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"861823","DOI":"10.3389\/fpsyg.2022.861823","article-title":"To be ethical and responsible digital citizens or not: A linguistic analysis of cyberbullying on social media","volume":"13","author":"Zhong","year":"2022","journal-title":"Front. Psychol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1007\/s11196-020-09790-x","article-title":"From flaming to incited crime: Recognising cyberbullying on Chinese wechat account","volume":"34","author":"Zhang","year":"2021","journal-title":"Int. J. Semiot. Law-Rev. Int. S\u00e9Miotique Jurid."},{"key":"ref_29","unstructured":"Zhang, A., Lipton, Z.C., Li, M., and Smola, A.J. (2021). Dive into deep learning. arXiv."},{"key":"ref_30","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). 
Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., and Zettlemoyer, L. (2018, January 1\u20136). Deep Contextualized Word Representations. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), New Orleans, LA, USA. Association for Computational Linguistics.","DOI":"10.18653\/v1\/N18-1202"},{"key":"ref_32","unstructured":"Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2023, December 05). Improving Language Understanding by Generative Pre-Training. Available online: https:\/\/www.mikecaptain.com\/resources\/pdf\/GPT-1.pdf."},{"key":"ref_33","unstructured":"Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.R., and Le, Q.V. (2019). Advances in Neural Information Processing Systems, Available online: https:\/\/papers.nips.cc\/paper_files\/paper\/2019\/hash\/dc6a7e655d7e5840e66733e9ee67cc69-Abstract.html."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2673","DOI":"10.1109\/78.650093","article-title":"Bidirectional recurrent neural networks","volume":"45","author":"Schuster","year":"1997","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Cui, Y., Che, W., Liu, T., Qin, B., Wang, S., and Hu, G. (2020). Revisiting pre-trained models for Chinese natural language processing. arXiv.","DOI":"10.18653\/v1\/2020.findings-emnlp.58"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Le, Q.V., and Salakhutdinov, R. (2019). Transformer-xl: Attentive language models beyond a fixed-length context. arXiv.","DOI":"10.18653\/v1\/P19-1285"},{"key":"ref_37","unstructured":"Zhang, Y., and Wallace, B. (2015). 
A sensitivity analysis of (and practitioners\u2019 guide to) convolutional neural networks for sentence classification. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Cho, K., Van Merri\u00ebnboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., and Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv.","DOI":"10.3115\/v1\/D14-1179"},{"key":"ref_40","unstructured":"Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). Roberta: A robustly optimized bert pretraining approach. arXiv."},{"key":"ref_41","unstructured":"Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., and Soricut, R. (2019). Albert: A lite bert for self-supervised learning of language representations. arXiv."},{"key":"ref_42","unstructured":"Sun, Y., Wang, S., Feng, S., Ding, S., Pang, C., Shang, J., Liu, J., Chen, X., Zhao, Y., and Lu, Y. (2021). Ernie 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation. arXiv."},{"key":"ref_43","unstructured":"Cui, Y., Che, W., Wang, S., and Liu, T. (2022). Lert: A linguistically-motivated pre-trained language model. arXiv."},{"key":"ref_44","unstructured":"Clark, K., Luong, M.T., Le, Q.V., and Manning, C.D. (2020). Electra: Pre-training text encoders as discriminators rather than generators. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Bender, E.M., Gebru, T., McMillan-Major, A., and Shmitchell, S. (2021, January 3\u201310). On the dangers of stochastic parrots: Can language models be too big?. 
Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event, Canada.","DOI":"10.1145\/3442188.3445922"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Hamid, O.H. (2023, January 24\u201325). ChatGPT and the Chinese Room Argument: An Eloquent AI Conversationalist Lacking True Understanding and Consciousness. Proceedings of the 2023 9th International Conference on Information Technology Trends (ITT), Dubai, United Arab Emirates.","DOI":"10.1109\/ITT59889.2023.10184233"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"193","DOI":"10.23919\/JSC.2023.0020","article-title":"Unlearning Descartes: Sentient AI is a Political Problem","volume":"4","author":"Hull","year":"2023","journal-title":"J. Soc. Comput."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Hamid, O.H. (2022, January 5\u20138). There Is More to AI than Meets the Eye: Aligning Man-made Algorithms with Nature-inspired Mechanisms. Proceedings of the 2022 IEEE\/ACS 19th International Conference on Computer Systems and Applications (AICCSA), Abu Dhabi, United Arab.","DOI":"10.1109\/AICCSA56895.2022.10017523"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/2\/93\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:55:52Z","timestamp":1760104552000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/15\/2\/93"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,6]]},"references-count":48,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,2]]}},"alternative-id":["info15020093"],"URL":"https:\/\/doi.org\/10.3390\/info15020093","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,6]]}}}