{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T20:41:09Z","timestamp":1770496869383,"version":"3.49.0"},"reference-count":39,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T00:00:00Z","timestamp":1721174400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T00:00:00Z","timestamp":1721174400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Data Sci Anal"],"published-print":{"date-parts":[[2025,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Cyberbullying on social media platforms is pervasive and challenging to detect due to linguistic subtleties and the need for extensive data annotation. We introduce a Deep Contrastive Self-Supervised Learning (DCSSL) model that integrates a Natural Language Inference (NLI) dataset, a fine-tuned sentence encoder, and data augmentation to enhance the understanding of cyberbullying's nuanced semantics and offensiveness. The DCSSL model effectively captures contextual dependencies and the varied semantic implications inherent in cyberbullying instances, addressing the limitations of manual data annotation processes when compared against established models such as BERT and Bi-LSTM. 
Our proposed model registers a significant improvement, achieving a macro average F1 score of 0.9231 on cyberbullying datasets, highlighting its applicability in environments where manual annotation is impractical or unavailable.<\/jats:p>","DOI":"10.1007\/s41060-024-00607-9","type":"journal-article","created":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T10:01:58Z","timestamp":1721210518000},"page":"469-490","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Towards a cyberbullying detection approach: fine-tuned contrastive self-supervised learning for data augmentation"],"prefix":"10.1007","volume":"19","author":[{"given":"Lulwah M.","family":"Al-Harigy","sequence":"first","affiliation":[]},{"given":"Hana A.","family":"Al-Nuaim","sequence":"additional","affiliation":[]},{"given":"Naghmeh","family":"Moradpoor","sequence":"additional","affiliation":[]},{"given":"Zhiyuan","family":"Tan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,7,17]]},"reference":[{"key":"607_CR1","unstructured":"Taylor, P.: \"Statista. [Online]. Available: https:\/\/www.statista.com\/statistics\/1190263\/internet-users-worldwide\/. Accessed 5 February 2023"},{"issue":"6","key":"607_CR2","doi-asserted-by":"crossref","first-page":"2043","DOI":"10.1007\/s00530-020-00747-5","volume":"8","author":"A Kumar","year":"2022","unstructured":"Kumar, A., Sachdeva, N.: Multimodal cyberbullying detection using capsule network with dynamic routing and deep convolutional neural network. Multimed. Syst. 8(6), 2043\u20132052 (2022)","journal-title":"Multimed. Syst."},{"key":"607_CR3","doi-asserted-by":"crossref","first-page":"1537","DOI":"10.1007\/s11280-021-00920-4","volume":"25","author":"A Kumar","year":"2021","unstructured":"Kumar, A., Sachdeva, N.: A Bi-GRU with attention and CapsNet hybrid model for cyberbullying detection on social media. 
World Wide Web 25, 1537\u20131550 (2021)","journal-title":"World Wide Web"},{"key":"607_CR4","first-page":"4794227","volume":"2022","author":"LM Al-Harigy","year":"2022","unstructured":"Al-Harigy, L.M., Al-Nuaim, H.A., Moradpoor, N., Tan, Z.: \u201cBuilding towards automated cyberbullying detection: a comparative analysis.\u201d Comput. Intell. Neurosci. 2022, 4794227 (2022)","journal-title":"Comput. Intell. Neurosci."},{"issue":"5","key":"607_CR5","first-page":"5549","volume":"45","author":"X Wang","year":"2022","unstructured":"Wang, X., Qi, G.-J.: Contrastive learning with stronger augmentations. IEEE Trans. Pattern Anal. Mach. Intell. 45(5), 5549\u20135560 (2022)","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"607_CR6","doi-asserted-by":"crossref","unstructured":"Miyai, A., Yu, Q., Ikami, D., Irie, G., Aizawa, K.: \"Rethinking rotation in self-supervised contrastive learning: adaptive positive or negative data augmentation,\" in IEEE\/CVF Winter Conference on Applications of Computer Vision, (2023)","DOI":"10.1109\/WACV56688.2023.00283"},{"key":"607_CR7","unstructured":"Falcon, W., Cho, K.: \"A framework for contrastive self-supervised learning and designing a new approach,\" in arXiv preprint arXiv, (2020)"},{"key":"607_CR8","unstructured":"Saunshi, N., Ash, J. T., Goel, S., Misra, D., Zhang, C., Arora, S., Kakade, S., Krishnamurthy, A.: \"Understanding contrastive learning requires incorporating inductive biases,\" in the 39th International Conference on Machine Learning, (2022)"},{"key":"607_CR9","unstructured":"Xiao, T., Wang, X., Efros, A. 
A., Darrell, T.: \"What should not be contrastive in contrastive learning,\" in The 9th International Conference on Learning Representations, (2021)"},{"key":"607_CR10","doi-asserted-by":"crossref","unstructured":"Fang, H., Wang, S., Zhou, M., Ding, J., Xie, P.: \"CERT: Contrastive self-supervised learning for language understanding,\" in arXiv preprint arXiv:2005.12766, (2020)","DOI":"10.36227\/techrxiv.12308378.v1"},{"key":"607_CR11","doi-asserted-by":"crossref","unstructured":"Al-Harigy, L., Al-Nuaim, H., Moradpoor, N.: \"Deep pre-trained contrastive self-supervised learning: a cyberbullying detection approach with augmented datasets,\" in 14th International Conference on Computational Intelligence and Communication Networks (CICN), Al-Khobar - KSA, (2022)","DOI":"10.1109\/CICN56167.2022.10008274"},{"key":"607_CR12","doi-asserted-by":"crossref","unstructured":"Gao, T., Yao, X., Chen, D.: \"SimCSE: simple contrastive learning of sentence embeddings,\" In: Conference on Empirical Methods in Natural Language Processing, (2021)","DOI":"10.18653\/v1\/2021.emnlp-main.552"},{"key":"607_CR13","doi-asserted-by":"crossref","unstructured":"Zampieri, M., Malmasi, S., Nakov, P., Rosenthal, S., Farra, N., Kumar, R.: \"Predicting the type and target of offensive posts in social media,\" in Proceedings of the 2019 conference of the north american chapter of the association for computational linguistics: human language technologies, (2019)","DOI":"10.18653\/v1\/N19-1144"},{"key":"607_CR14","doi-asserted-by":"crossref","unstructured":"Rosenthal, S., Atanasova, P., Karadzhov, G., Zampieri, M., Nakov, P.: \"SOLID: a large-scale semi-supervised dataset for offensive language identification,\" in arXiv:2004.14454, (2021)","DOI":"10.18653\/v1\/2021.findings-acl.80"},{"key":"607_CR15","unstructured":"Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: \"A simple framework for contrastive learning of visual representations,\" in The 37th International Conference on Machine Learning 
(ICML'20), (2020)"},{"key":"607_CR16","unstructured":"Wu, Z., Wang, S., Gu, J., Khabsa, M., Sun, F., Ma, H.: \"CLEAR: contrastive learning for sentence representation,\" in arXiv preprint arXiv:2012.15466, (2020)"},{"key":"607_CR17","doi-asserted-by":"crossref","unstructured":"Giorgi, J., Nitski, O., Wang, B., Bader, G.: \"DeCLUTR: deep contrastive learning for unsupervised textual representations,\" in arXiv preprint arXiv:2006.03659, (2020)","DOI":"10.18653\/v1\/2021.acl-long.72"},{"key":"607_CR18","unstructured":"Chen, Q., Zhang, R., Zheng, Y., Mao, Y.: \"Dual contrastive learning: text classification via label-aware data augmentation,\" in arXiv preprint arXiv:2201.08702, (2022)"},{"key":"607_CR19","unstructured":"Mao, Z., Zhu, D., Lu, J., Zhao, R., Tan, F.: \"SDA: simple discrete augmentation for contrastive sentence representation learning,\" in arXiv preprint arXiv:2210.03963, (2022)"},{"key":"607_CR20","doi-asserted-by":"crossref","unstructured":"Chen, J., Zhang, R., Mao, Y., Xu, J.: \"ContrastNet: a contrastive learning framework for few-shot text classification,\" in The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22), (2022)","DOI":"10.1609\/aaai.v36i10.21292"},{"key":"607_CR21","doi-asserted-by":"crossref","unstructured":"Febriana, T., Budiarto, A.: \"Twitter dataset for hate speech and cyberbullying detection in indonesian language,\" in International Conference on Information Management and Technology (ICIMTech), (2019)","DOI":"10.1109\/ICIMTech.2019.8843722"},{"key":"607_CR22","unstructured":"Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: \"BERT: pre-training of deep bidirectional transformers for language understanding,\" arXiv preprint arXiv:1810.04805, (2018)"},{"key":"607_CR23","volume-title":"Improving Language Understanding by Generative Pre-Training","author":"A Radford","year":"2018","unstructured":"Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving Language Understanding by Generative 
Pre-Training. OpenAI (2018)"},{"key":"607_CR24","doi-asserted-by":"crossref","unstructured":"Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., Stoyanov, V., Zettlemoyer, L.: \"BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension,\" in arXiv preprint arXiv:1910.13461, (2019)","DOI":"10.18653\/v1\/2020.acl-main.703"},{"issue":"6","key":"607_CR25","doi-asserted-by":"crossref","first-page":"1897","DOI":"10.1007\/s00530-020-00710-4","volume":"28","author":"S Paul","year":"2020","unstructured":"Paul, S., Saha, S.: CyberBERT: BERT for cyberbullying identification. Multimed. Syst. 28(6), 1897\u20131904 (2020)","journal-title":"Multimed. Syst."},{"key":"607_CR26","doi-asserted-by":"crossref","first-page":"103541","DOI":"10.1109\/ACCESS.2021.3098979","volume":"9","author":"F Elsafoury","year":"2021","unstructured":"Elsafoury, F., Katsigiannis, S., Pervez, Z., Ramzan, N.: When the timeline meets the pipeline: a survey on automated cyberbullying detection. IEEE access 9, 103541\u2013103563 (2021)","journal-title":"IEEE access"},{"key":"607_CR27","doi-asserted-by":"crossref","unstructured":"Guo, X., Anjum, U., Zhan, J.: \"Cyberbully detection using BERT with augmented texts,\" in International Conference on Big Data (Big Data), (2022)","DOI":"10.1109\/BigData55660.2022.10020581"},{"key":"607_CR28","doi-asserted-by":"crossref","first-page":"1941","DOI":"10.1007\/s00530-020-00690-5","volume":"28","author":"JK Tripathy","year":"2020","unstructured":"Tripathy, J.K., Chakkaravarthy, S.S., Satapathy, S.C., Sahoo, M., Vaidehi, V.: ALBERT-based fine-tuning model for cyberbullying analysis. Multimed. Syst. 28, 1941\u20131949 (2020)","journal-title":"Multimed. 
Syst."},{"key":"607_CR29","doi-asserted-by":"crossref","unstructured":"Nouri, N.: \"Data augmentation with dual training for offensive span detection,\" in the 2022 conference of the north american chapter of the association for computational linguistics: human language technologies, (2022)","DOI":"10.18653\/v1\/2022.naacl-main.185"},{"key":"607_CR30","doi-asserted-by":"crossref","unstructured":"Gonzalez-Pizarro, F., Zannettou, S.: \"Understanding and detecting hateful content using contrastive learning,\" in arXiv preprint arXiv:2201.08387, (2022)","DOI":"10.1609\/icwsm.v17i1.22143"},{"key":"607_CR31","doi-asserted-by":"crossref","unstructured":"B. Bhatia, A. Verma, Anjum and R. Katarya, \"Analysing Cyberbullying using Natural Language Processing by Understanding Jargon in Social Media,\" Sustainable Advanced Computing, Springer, Singapore (2022)","DOI":"10.1007\/978-981-16-9012-9_32"},{"key":"607_CR32","doi-asserted-by":"crossref","first-page":"101710","DOI":"10.1016\/j.cose.2019.101710","volume":"90","author":"V Balakrishnan","year":"2020","unstructured":"Balakrishnan, V., Khan, S., Arabnia, H.R.: Improving cyberbullying detection using twitter users\u2019 psychological features and machine learning. Comput. Secur. 90, 101710 (2020)","journal-title":"Comput. 
Secur."},{"key":"607_CR33","doi-asserted-by":"crossref","unstructured":"Zampieri, M., Malmasi, S., Nakov, P., Rosenthal, S., Farra, N., Kumar, R.: \"SemEval-2019 Task 6: identifying and categorizing offensive language in social media (OffensEval),\" The 13th International Workshop on Semantic Evaluation, (2019)","DOI":"10.18653\/v1\/S19-2010"},{"key":"607_CR34","doi-asserted-by":"crossref","unstructured":"Zampieri, M., Nakov, P., Rosenthal, S., Atanasova, P., Karadzhov, G., Mubarak, H., Derczynski, L., Pitenis, Z., Coltekin, C.: \"SemEval-2020 task 12: multilingual offensive language identification in social media (OffensEval 2020),\" in The Fourteenth Workshop on Semantic Evaluation, (2020)","DOI":"10.18653\/v1\/2020.semeval-1.188"},{"key":"607_CR35","doi-asserted-by":"crossref","unstructured":"Li, B., Hou, Y., Che, W.: \"Data augmentation approaches in natural language processing: a survey,\" in Ai Open 3, (2022)","DOI":"10.1016\/j.aiopen.2022.03.001"},{"key":"607_CR36","doi-asserted-by":"crossref","unstructured":"Wiedemann, G., Yimam, S. 
M., Biemann, C.: \"UHH-LT at SemEval-2020 task 12: fine-tuning of pre-trained transformer networks for offensive language detection,\" in The International Workshop on Semantic Evaluation (SemEval), (2020)","DOI":"10.18653\/v1\/2020.semeval-1.213"},{"key":"607_CR37","doi-asserted-by":"crossref","unstructured":"Wang, S., Liu, J., Ouyang, X., Sun, Y.: \"Galileo at SemEval-2020 task 12: multi-lingual learning for offensive language identification using pre-trained language models,\" in the International Workshop on Semantic Evaluation (SemEval)., (2020)","DOI":"10.18653\/v1\/2020.semeval-1.189"},{"key":"607_CR38","doi-asserted-by":"crossref","unstructured":"Dadu, T., Pant, K.: \"Team rouges at SemEval-2020 task 12: cross-lingual inductive transfer to detect offensive language,\" in the International Workshop on Semantic Evaluation (SemEval), (2020)","DOI":"10.18653\/v1\/2020.semeval-1.290"},{"key":"607_CR39","unstructured":"Zhang, X., Zhao, J., LeCun, Y.: \"Character-level convolutional network for text classification applied to chinese corpus,\" in arXiv preprint arXiv:1611.04358, (2016)"}],"container-title":["International Journal of Data Science and 
Analytics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41060-024-00607-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41060-024-00607-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41060-024-00607-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,23]],"date-time":"2025-03-23T07:19:42Z","timestamp":1742714382000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41060-024-00607-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,17]]},"references-count":39,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,4]]}},"alternative-id":["607"],"URL":"https:\/\/doi.org\/10.1007\/s41060-024-00607-9","relation":{},"ISSN":["2364-415X","2364-4168"],"issn-type":[{"value":"2364-415X","type":"print"},{"value":"2364-4168","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,17]]},"assertion":[{"value":"26 December 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 July 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 July 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}