{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T12:38:13Z","timestamp":1775911093712,"version":"3.50.1"},"reference-count":48,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2021,8,17]],"date-time":"2021-08-17T00:00:00Z","timestamp":1629158400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,8,17]],"date-time":"2021-08-17T00:00:00Z","timestamp":1629158400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Sci. Eng."],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Sexism, a permeate form of oppression, causes profound suffering through various manifestations. Given the increasing number of experiences of sexism shared online, categorizing these recollections automatically can support the battle against sexism, since it can promote successful evaluations by gender studies researchers and government representatives engaged in policy making. In this paper, we examine the fine-grained, multi-label classification of accounts (reports) of sexism. To the best of our knowledge, we consider substantially more categories of sexism than any related prior work through our 23-class problem formulation. Moreover, we present the first semi-supervised work for the multi-label classification of accounts describing any type(s) of sexism. We devise self-training-based techniques tailor-made for the multi-label nature of the problem to utilize unlabeled samples for augmenting the labeled set. 
We identify high textual diversity with respect to the existing labeled set as a desirable quality for candidate unlabeled instances and develop methods for incorporating it into our approach. We also explore ways of infusing class imbalance alleviation for multi-label classification into our semi-supervised learning, independently and in conjunction with the method involving diversity. In addition to data augmentation methods, we develop a neural model which combines biLSTM and attention with a domain-adapted BERT model in an end-to-end trainable manner. Further, we formulate a multi-level training approach in which models are sequentially trained using categories of sexism of different levels of granularity. Moreover, we devise a loss function that exploits any label confidence scores associated with the data. Several proposed methods outperform various baselines on a recently released dataset for multi-label sexism categorization across several standard metrics.<\/jats:p>","DOI":"10.1007\/s41019-021-00168-y","type":"journal-article","created":{"date-parts":[[2021,8,17]],"date-time":"2021-08-17T16:03:04Z","timestamp":1629216184000},"page":"359-379","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Fine-Grained Multi-label Sexism Classification Using a Semi-Supervised Multi-level Neural 
Approach"],"prefix":"10.1007","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8280-6152","authenticated-orcid":false,"given":"Harika","family":"Abburi","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Pulkit","family":"Parikh","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Niyati","family":"Chhaya","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vasudeva","family":"Varma","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,8,17]]},"reference":[{"key":"168_CR1","doi-asserted-by":"crossref","unstructured":"Abburi H, Parikh P, Chhaya N, Varma V (2020) Semi-supervised multi-task learning for multi-label fine-grained sexism classification. In: Proceedings of the 28th international conference on computational linguistics, international committee on computational linguistics, Barcelona, Spain (Online), pp 5810\u20135820","DOI":"10.18653\/v1\/2020.coling-main.511"},{"key":"168_CR2","doi-asserted-by":"crossref","unstructured":"Abney S (2007) Semisupervised learning for computational linguistics. Chapman and Hall\/CRC","DOI":"10.1201\/9781420010800"},{"key":"168_CR3","doi-asserted-by":"crossref","unstructured":"Agrawal S, Awekar A (2018) Deep learning for detecting cyberbullying across multiple social media platforms. In: European Conference on Information Retrieval, Springer, pp 141\u2013153","DOI":"10.1007\/978-3-319-76941-7_11"},{"key":"168_CR4","doi-asserted-by":"crossref","unstructured":"Anzovino M, Fersini E, Rosso P (2018) Automatic identification and classification of misogynistic language on twitter. 
In: International conference on applications of natural language to information systems, Springer, pp 57\u201364","DOI":"10.1007\/978-3-319-91947-8_6"},{"key":"168_CR5","doi-asserted-by":"crossref","unstructured":"Badjatiya P, Gupta S, Gupta M, Varma V (2017) Deep learning for hate speech detection in tweets. In: Proceedings of the 26th international conference on world wide web companion, International World Wide Web Conferences Steering Committee, pp 759\u2013760","DOI":"10.1145\/3041021.3054223"},{"issue":"1","key":"168_CR6","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1140\/epjds\/s13688-016-0072-6","volume":"5","author":"P Burnap","year":"2016","unstructured":"Burnap P, Williams ML (2016) Us and them: identifying cyber hate on twitter across multiple protected characteristics. EPJ Data Sci 5(1):11","journal-title":"EPJ Data Sci"},{"key":"168_CR7","doi-asserted-by":"crossref","unstructured":"Cer D, Yang Y, Kong Sy, Hua N, Limtiaco N, John RS, Constant N, Guajardo-Cespedes M, Yuan S, Tar C, et\u00a0al. (2018) Universal sentence encoder. arXiv preprint arXiv:180311175","DOI":"10.18653\/v1\/D18-2029"},{"key":"168_CR8","unstructured":"Chiril P, Moriceau V, Benamara F, Mari A, Origgi G, Coulomb-Gully M (2020) An annotated corpus for sexism detection in french tweets. In: Proceedings of the 12th language resources and evaluation conference, pp 1397\u20131403"},{"key":"168_CR9","doi-asserted-by":"crossref","unstructured":"Chiril P, Moriceau V, Benamara F, Mari A, Origgi G, Coulomb-Gully M (2020) He said \u201cwho\u2019s gonna take care of your children when you are at acl?\u2019: Reported sexist acts are not sexist. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp 4055\u20134066","DOI":"10.18653\/v1\/2020.acl-main.373"},{"key":"168_CR10","doi-asserted-by":"crossref","unstructured":"Chowdhury AG, Sawhney R, Shah R, Mahata D (2019) # youtoo? 
detection of personal recollections of sexual harassment on social media. In: Proceedings of the 57th annual meeting of the association for computational linguistics, pp 2527\u20132537","DOI":"10.18653\/v1\/P19-1241"},{"issue":"2","key":"168_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3377323","volume":"20","author":"M Corazza","year":"2020","unstructured":"Corazza M, Menini S, Cabrio E, Tonelli S, Villata S (2020) A multilingual evaluation for online hate speech detection. ACM Trans Internet Technol TOIT 20(2):1\u201322","journal-title":"ACM Trans Internet Technol TOIT"},{"key":"168_CR12","doi-asserted-by":"crossref","unstructured":"Davidson T, Warmsley D, Macy M, Weber I (2017) Automated hate speech detection and the problem of offensive language. In: Eleventh international aaai conference on web and social media","DOI":"10.1609\/icwsm.v11i1.14955"},{"key":"168_CR13","unstructured":"Devlin J, Chang MW, Lee K, Toutanova K (2018) Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:181004805"},{"key":"168_CR14","doi-asserted-by":"crossref","unstructured":"Dutta D, Sircar O (2013) India\u2019s winter of discontent: Some feminist dilemmas in the wake of a rape. Fem Stud 39(1):293\u2013306","DOI":"10.1353\/fem.2013.0023"},{"key":"168_CR15","doi-asserted-by":"crossref","unstructured":"Eccles JS, Jacobs JE, Harold RD (1990) Gender role stereotypes, expectancy effects, and parents\u2019 socialization of gender differences. J Soc Issues 46(2):183\u2013201","DOI":"10.1111\/j.1540-4560.1990.tb01929.x"},{"key":"168_CR16","doi-asserted-by":"crossref","unstructured":"ElSherief M, Belding E, Nguyen D (2017) # notokay: Understanding gender-based violence in social media. 
In: Eleventh international AAAI conference on web and social media","DOI":"10.1609\/icwsm.v11i1.14877"},{"issue":"5","key":"168_CR17","doi-asserted-by":"publisher","first-page":"4743","DOI":"10.3233\/JIFS-179023","volume":"36","author":"S Frenda","year":"2019","unstructured":"Frenda S, Ghanem B, Montes-y G\u00f3mez M, Rosso P (2019) Online hate speech against women: automatic identification of misogyny and sexism on twitter. J Intell Fuzzy Syst 36(5):4743\u20134752","journal-title":"J Intell Fuzzy Syst"},{"key":"168_CR18","unstructured":"Gao L, Kuppersmith A, Huang R (2017) Recognizing explicit and implicit hate speech using a weakly supervised two-path bootstrapping approach. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp 774\u2013782"},{"key":"168_CR19","unstructured":"Jafarpour B, Matwin S et\u00a0al. (2018) Boosting text classification performance on sexist tweets by text augmentation and text generation using a combination of knowledge graphs. In: Proceedings of the 2nd workshop on abusive language online (ALW2), pp 107\u2013114"},{"key":"168_CR20","doi-asserted-by":"crossref","unstructured":"Jha A, Mamidi R (2017) When does a compliment become sexist? analysis and classification of ambivalent sexism using twitter data. In: Proceedings of the second workshop on NLP and computational social science, pp 7\u201316","DOI":"10.18653\/v1\/W17-2902"},{"key":"168_CR21","doi-asserted-by":"crossref","unstructured":"Karlekar S, Bansal M (2018) Safecity: understanding diverse forms of sexual harassment personal stories. In: Proceedings of the 2018 conference on empirical methods in natural language processing (EMNLP), pp 2805\u20132811","DOI":"10.18653\/v1\/D18-1303"},{"key":"168_CR22","doi-asserted-by":"crossref","unstructured":"Khatua A, Cambria E, Khatua A (2018) Sounds of silence breakers: exploring sexual violence on twitter. 
In: 2018 IEEE\/ACM international conference on advances in social networks analysis and mining (ASONAM), pp 397\u2013400","DOI":"10.1109\/ASONAM.2018.8508576"},{"key":"168_CR23","doi-asserted-by":"crossref","unstructured":"Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1746\u20131751","DOI":"10.3115\/v1\/D14-1181"},{"key":"168_CR24","unstructured":"Mead M (1963) Sex and temperament in three primitive societies, vol 370. Morrow New York"},{"key":"168_CR25","doi-asserted-by":"publisher","first-page":"28","DOI":"10.3389\/fdigh.2018.00028","volume":"5","author":"S Melville","year":"2018","unstructured":"Melville S, Eccles K, Yasseri T (2018) Topic modelling of everyday sexism project entries. Front Dig Human 5:28","journal-title":"Front Dig Human"},{"key":"168_CR26","volume-title":"Seeing like a feminist","author":"N Menon","year":"2012","unstructured":"Menon N (2012) Seeing like a feminist. Penguin, Westminster"},{"key":"168_CR27","doi-asserted-by":"crossref","unstructured":"Nobata C, Tetreault J, Thomas A, Mehdad Y, Chang Y (2016) Abusive language detection in online user content. In: Proceedings of the 25th international conference on world wide web, International World Wide web conferences steering committee, pp 145\u2013153","DOI":"10.1145\/2872427.2883062"},{"key":"168_CR28","doi-asserted-by":"crossref","unstructured":"Parikh P, Abburi H, Badjatiya P, Krishnan R, Chhaya N, Gupta M, Varma V (2019) Multi-label categorization of accounts of sexism using a neural framework. 
In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp 1642\u20131652","DOI":"10.18653\/v1\/D19-1174"},{"key":"168_CR29","unstructured":"Pennebaker JW, Boyd RL, Jordan K, Blackburn K (2015) The development and psychometric properties of liwc2015. Tech. rep"},{"key":"168_CR30","doi-asserted-by":"crossref","unstructured":"Pennington J, Socher R, Manning C (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532\u20131543","DOI":"10.3115\/v1\/D14-1162"},{"key":"168_CR31","doi-asserted-by":"crossref","unstructured":"Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of NAACL","DOI":"10.18653\/v1\/N18-1202"},{"issue":"2","key":"168_CR32","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3369869","volume":"20","author":"FM Plaza-Del-Arco","year":"2020","unstructured":"Plaza-Del-Arco FM, Molina-Gonz\u00e1lez MD, Ure\u00f1a-L\u00f3pez LA, Mart\u00edn-Valdivia MT (2020) Detecting misogyny and xenophobia in spanish tweets using language technologies. ACM Trans Internet Technol TOIT 20(2):1\u201319","journal-title":"ACM Trans Internet Technol TOIT"},{"key":"168_CR33","doi-asserted-by":"crossref","unstructured":"Qian J, ElSherief M, Belding E, Wang WY (2018) Hierarchical cvae for fine-grained hate speech classification. 
In: Proceedings of the 2018 conference on empirical methods in natural language processing, pp 3550\u20133559","DOI":"10.18653\/v1\/D18-1391"},{"key":"168_CR34","doi-asserted-by":"publisher","first-page":"219563","DOI":"10.1109\/ACCESS.2020.3042604","volume":"8","author":"F Rodr\u00edguez-S\u00e1nchez","year":"2020","unstructured":"Rodr\u00edguez-S\u00e1nchez F, Carrillo-de Albornoz J, Plaza L (2020) Automatic classification of sexism in social networks: an empirical study on twitter data. IEEE Access 8:219563\u2013219576","journal-title":"IEEE Access"},{"key":"168_CR35","doi-asserted-by":"crossref","unstructured":"Schrading N, Alm CO, Ptucha R, Homan C (2015) An analysis of domestic abuse discourse on reddit. In: Proceedings of the 2015 conference on empirical methods in natural language processing, pp 2577\u20132583","DOI":"10.18653\/v1\/D15-1309"},{"key":"168_CR36","doi-asserted-by":"crossref","unstructured":"Suvarna A, Bhalla G (2020) # notawhore! a computational linguistic perspective of rape culture and victimization on social media. In: Proceedings of the 58th annual meeting of the association for computational linguistics: student research workshop, pp 328\u2013335","DOI":"10.18653\/v1\/2020.acl-srw.43"},{"key":"168_CR37","unstructured":"Van\u00a0Hee C, Lefever E, Verhoeven B, Mennes J, Desmet B, De\u00a0Pauw G, Daelemans W, Hoste V (2015) Detection and fine-grained classification of cyberbullying events. In: Proceedings of the international conference recent advances in natural language processing, pp 672\u2013680"},{"key":"168_CR38","doi-asserted-by":"crossref","unstructured":"Wang J, Yu LC, Lai KR, Zhang X (2016) Dimensional sentiment analysis using a regional cnn-lstm model. 
In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), vol\u00a02, pp 225\u2013230","DOI":"10.18653\/v1\/P16-2037"},{"key":"168_CR39","unstructured":"Warner W, Hirschberg J (2012) Detecting hate speech on the world wide web. In: Proceedings of the second workshop on language in social media, Association for Computational Linguistics, pp 19\u201326"},{"key":"168_CR40","doi-asserted-by":"crossref","unstructured":"Waseem Z, Hovy D (2016) Hateful symbols or hateful people? predictive features for hate speech detection on twitter. In: Proceedings of the NAACL student research workshop, pp 88\u201393","DOI":"10.18653\/v1\/N16-2013"},{"key":"168_CR41","unstructured":"Xiao H (2018) bert-as-service. https:\/\/github.com\/hanxiao\/bert-as-service"},{"key":"168_CR42","doi-asserted-by":"crossref","unstructured":"Yan P, Li L, Chen W, Zeng D (2019) Quantum-inspired density matrix encoder for sexual harassment personal stories classification. In: 2019 IEEE international conference on intelligence and security informatics (ISI), IEEE, pp 218\u2013220","DOI":"10.1109\/ISI.2019.8823281"},{"key":"168_CR43","doi-asserted-by":"crossref","unstructured":"Yang Z, Yang D, Dyer C, He X, Smola A, Hovy E (2016) Hierarchical attention networks for document classification. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, pp 1480\u20131489","DOI":"10.18653\/v1\/N16-1174"},{"key":"168_CR44","doi-asserted-by":"crossref","unstructured":"Yarowsky D (1995) Unsupervised word sense disambiguation rivaling supervised methods. 
In: 33rd annual meeting of the association for computational linguistics, pp 189\u2013196","DOI":"10.3115\/981658.981684"},{"issue":"8","key":"168_CR45","doi-asserted-by":"publisher","first-page":"1819","DOI":"10.1109\/TKDE.2013.39","volume":"26","author":"ML Zhang","year":"2014","unstructured":"Zhang ML, Zhou ZH (2014) A review on multi-label learning algorithms. IEEE Trans Knowl Data Eng 26(8):1819\u20131837","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"168_CR46","unstructured":"Zhang Z, Luo L (2018) Hate speech detection: A solved problem? the challenging case of long tail on twitter. Semantic Web pp 1\u201321"},{"key":"168_CR47","unstructured":"Zhong H, Li H, Squicciarini AC, Rajtmajer SM, Griffin C, Miller DJ, Caragea C (2016) Content-driven detection of cyberbullying on the instagram social network. In: IJCAI, pp 3952\u20133958"},{"key":"168_CR48","unstructured":"Zhou C, Sun C, Liu Z, Lau F (2015) A c-lstm neural network for text classification. arXiv preprint arXiv:151108630"}],"container-title":["Data Science and 
Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-021-00168-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41019-021-00168-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-021-00168-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,7]],"date-time":"2023-01-07T14:20:58Z","timestamp":1673101258000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41019-021-00168-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,17]]},"references-count":48,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["168"],"URL":"https:\/\/doi.org\/10.1007\/s41019-021-00168-y","relation":{},"ISSN":["2364-1185","2364-1541"],"issn-type":[{"value":"2364-1185","type":"print"},{"value":"2364-1541","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,8,17]]},"assertion":[{"value":"20 March 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 June 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 July 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 August 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}