{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,24]],"date-time":"2025-12-24T17:50:06Z","timestamp":1766598606384,"version":"build-2065373602"},"reference-count":19,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2023,10,29]],"date-time":"2023-10-29T00:00:00Z","timestamp":1698537600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Electronics"],"abstract":"<jats:p>The rise of social networks and the increasing amount of time people spend on them have created a perfect place for the dissemination of false narratives, propaganda, and manipulated content. In order to prevent the spread of disinformation, content moderation is needed. However, manual moderation is unfeasible due to the large amount of daily posts. This paper studies the impact of using different loss functions on a multi-label classification problem with an imbalanced dataset, consisting of 20 persuasion techniques and only 950 samples, provided by SemEval\u2019s 2021 Task 6. We used machine learning models, such as Naive Bayes and Decision Trees, and a custom deep learning architecture, based on DistilBERT and Convolutional Layers. Overall, the machine learning models achieved far worse results than the deep learning model, using Binary Cross Entropy, which we considered our baseline deep learning model. To address the class imbalance problem, we trained our model using different loss functions, such as Focal Loss and Asymmetric Loss. The latter providing the best results, particularly for the least represented classes.<\/jats:p>","DOI":"10.3390\/electronics12214447","type":"journal-article","created":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T13:00:49Z","timestamp":1698670849000},"page":"4447","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Detecting Persuasion Attempts on Social Networks: Unearthing the Potential of Loss Functions and Text Pre-Processing in Imbalanced Data Settings"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-5332-3861","authenticated-orcid":false,"given":"R\u00faben","family":"Teimas","sequence":"first","affiliation":[{"name":"School of Science and Technology, University of \u00c9vora, 7000-671 \u00c9vora, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3025-0687","authenticated-orcid":false,"given":"Jos\u00e9","family":"Saias","sequence":"additional","affiliation":[{"name":"School of Science and Technology, University of \u00c9vora, 7000-671 \u00c9vora, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,29]]},"reference":[{"key":"ref_1","first-page":"1","article-title":"The Future of False Information Detection on Social Media: New Perspectives and Trends","volume":"53","author":"Guo","year":"2020","journal-title":"ACM Comput. Surv."},{"doi-asserted-by":"crossref","unstructured":"Da San Martino, G., Yu, S., Barr\u00f3n-Cede\u00f1o, A., Petrov, R., and Nakov, P. (2019, January 3\u20137). Fine-Grained Analysis of Propaganda in News Article. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China.","key":"ref_2","DOI":"10.18653\/v1\/D19-1565"},{"doi-asserted-by":"crossref","unstructured":"Dimitrov, D., Bin Ali, B., Shaar, S., Alam, F., Silvestri, F., Firooz, H., Nakov, P., and Da San Martino, G. (2021, January 5\u20136). SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images. Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Online.","key":"ref_3","DOI":"10.18653\/v1\/2021.semeval-1.7"},{"doi-asserted-by":"crossref","unstructured":"Huang, Y., Giledereli, B., K\u00f6ksal, A., \u00d6zg\u00fcr, A., and Ozkirimli, E. (2021). Balancing Methods for Multi-label Text Classification with Long-Tailed Class Distribution. arXiv.","key":"ref_4","DOI":"10.18653\/v1\/2021.emnlp-main.643"},{"doi-asserted-by":"crossref","unstructured":"Ibrohim, M.O., and Budi, I. (2019, January 1). Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter. Proceedings of the Third Workshop on Abusive Language Online, Florence, Italy.","key":"ref_5","DOI":"10.18653\/v1\/W19-3506"},{"doi-asserted-by":"crossref","unstructured":"Abujaber, D., Qarqaz, A., and Abdullah, M.A. (2021, January 5\u20136). LeCun at SemEval-2021 Task 6: Detecting Persuasion Techniques in Text Using Ensembled Pretrained Transformers and Data Augmentation. Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Online.","key":"ref_6","DOI":"10.18653\/v1\/2021.semeval-1.148"},{"doi-asserted-by":"crossref","unstructured":"Gupta, V., and Sharma, R. (2021, January 5\u20136). NLPIITR at SemEval-2021 Task 6: RoBERTa Model with Data Augmentation for Persuasion Techniques Detection. Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Online.","key":"ref_7","DOI":"10.18653\/v1\/2021.semeval-1.147"},{"doi-asserted-by":"crossref","unstructured":"Tian, J., Gui, M., Li, C., Yan, M., and Xiao, W. (2021, January 5\u20136). MinD at SemEval-2021 Task 6: Propaganda Detection using Transfer Learning and Multimodal Fusion. Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Online.","key":"ref_8","DOI":"10.18653\/v1\/2021.semeval-1.150"},{"doi-asserted-by":"crossref","unstructured":"Gupta, K., Gautam, D., and Mamidi, R. (2021, January 5\u20136). Volta at SemEval-2021 Task 6: Towards Detecting Persuasive Texts and Images using Textual and Multimodal Ensemble. Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Online.","key":"ref_9","DOI":"10.18653\/v1\/2021.semeval-1.149"},{"doi-asserted-by":"crossref","unstructured":"Messina, N., Falchi, F., Gennaro, C., and Amato, G. (2021, January 5\u20136). AIMH at SemEval-2021 Task 6: Multimodal Classification Using an Ensemble of Transformer Models. Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Online.","key":"ref_10","DOI":"10.18653\/v1\/2021.semeval-1.140"},{"unstructured":"Sanh, V., Debut, L., Chaumond, J., and Wolf, T. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. arXiv.","key":"ref_11"},{"unstructured":"Alshubaily, I. (2021). TextCNN with Attention for Text Classification. arXiv.","key":"ref_12"},{"doi-asserted-by":"crossref","unstructured":"Yu, L., Chen, L., Dong, J., Li, M., Liu, L., Zhao, B., and Zhang, C. (2020, January 13\u201317). Detecting Malicious Web Requests Using an Enhanced TextCNN. Proceedings of the 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), Madrid, Spain.","key":"ref_13","DOI":"10.1109\/COMPSAC48688.2020.0-167"},{"doi-asserted-by":"crossref","unstructured":"Zhu, X., Wang, J., and Zhang, X. (2021, January 5\u20136). YNU-HPCC at SemEval-2021 Task 6: Combining ALBERT and Text-CNN for Persuasion Detection in Texts and Images. Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021), Online.","key":"ref_14","DOI":"10.18653\/v1\/2021.semeval-1.144"},{"doi-asserted-by":"crossref","unstructured":"Safaya, A., Abdullatif, M., and Yuret, D. (2020, January 12\u201313). KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media. Proceedings of the Fourteenth Workshop on Semantic Evaluation, Barcelona (online).","key":"ref_15","DOI":"10.18653\/v1\/2020.semeval-1.271"},{"doi-asserted-by":"crossref","unstructured":"Lin, T., Goyal, P., Girshick, R.B., He, K., and Doll\u00e1r, P. (2017). Focal Loss for Dense Object Detection. arXiv.","key":"ref_16","DOI":"10.1109\/ICCV.2017.324"},{"unstructured":"Baruch, E.B., Ridnik, T., Zamir, N., Noy, A., Friedman, I., Protter, M., and Zelnik-Manor, L. (2020). Asymmetric Loss For Multi-Label Classification. arXiv.","key":"ref_17"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3544558","article-title":"A Survey on Data Augmentation for Text Classification","volume":"55","author":"Bayer","year":"2022","journal-title":"ACM Comput. Surv."},{"doi-asserted-by":"crossref","unstructured":"Piskorski, J., Stefanovitch, N., Da San Martino, G., and Nakov, P. (2023, January 13\u201314). SemEval-2023 Task 3: Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multi-lingual Setup. Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Toronto, ON, Canada.","key":"ref_19","DOI":"10.18653\/v1\/2023.semeval-1.317"}],"container-title":["Electronics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-9292\/12\/21\/4447\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:13:46Z","timestamp":1760130826000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-9292\/12\/21\/4447"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,29]]},"references-count":19,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2023,11]]}},"alternative-id":["electronics12214447"],"URL":"https:\/\/doi.org\/10.3390\/electronics12214447","relation":{},"ISSN":["2079-9292"],"issn-type":[{"type":"electronic","value":"2079-9292"}],"subject":[],"published":{"date-parts":[[2023,10,29]]}}}