{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T07:09:50Z","timestamp":1777705790449,"version":"3.51.4"},"reference-count":37,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2021,12,24]],"date-time":"2021-12-24T00:00:00Z","timestamp":1640304000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"published-print":{"date-parts":[[2022,3,31]]},"abstract":"<jats:p>Multi-label text classification aims at assigning more than one class to a given text document, which makes the task more ambiguous and challenging at the same time. The ambiguities come from the fact that often several labels in the prescribed label set are semantically close to each other, making clear demarcation between them difficult. As a consequence, any Machine Learning based approach for developing multi-label classification scheme needs to define its feature space by choosing features beyond linguistic or semi-linguistic features, so that the semantic closeness between the labels is also taken into account. The present work describes a scheme of feature extraction where the training document set and the prescribed label set are intertwined in a novel way to capture the ambiguity in a meaningful way. In particular, experiments were conducted using Topic Modeling and Fuzzy C-Means clustering which aim at measuring the underlying uncertainty using probability and membership based measures, respectively. Several Nonparametric hypothesis tests establish the effectiveness of the features obtained through Fuzzy C-Means clustering in multi-label classification. A new algorithm has been proposed for training the system for multi-label classification using the above set of features.<\/jats:p>","DOI":"10.3233\/jifs-219232","type":"journal-article","created":{"date-parts":[[2021,12,31]],"date-time":"2021-12-31T08:41:56Z","timestamp":1640940116000},"page":"4425-4436","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["Multi-label text classification with an ensemble feature space"],"prefix":"10.1177","volume":"42","author":[{"given":"Kushagri","family":"Tandon","sequence":"first","affiliation":[{"name":"Department of Mathematics, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Niladri","family":"Chatterjee","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2021,12,24]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"AkbikA. BergmannT. BlytheD. RasulK. SchweterS. and VollgrafR. FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP in: (2019) 54\u201359. doi:10.18653\/v1\/N19-4010."},{"key":"e_1_3_1_3_2","unstructured":"AkbikA. BlytheD. and VollgrafR. Contextual String Embeddings for Sequence Labeling in: (2018) 1638\u20131649."},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","unstructured":"AlvesR. Information Retrieval Dataset \u2013Internet Movie Database(IMDB) 2 (2017). doi: 10.17632\/rth2kr5hxf.2.","DOI":"10.17632\/rth2kr5hxf.2"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/0098-3004(84)90020-7"},{"key":"e_1_3_1_6_2","unstructured":"BianchiF. TerragniS. and HovyD. Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence ArXiv:2004.03974 [Cs]. (2021). http:\/\/arxiv.org\/abs\/2004.03974"},{"key":"e_1_3_1_7_2","doi-asserted-by":"crossref","unstructured":"BianchiF. TerragniS. HovyD. NozzaD. and FersiniE. Cross-lingual Contextualized Topic Models with Zero-shot Learning in: (2021) 1676\u20131683.","DOI":"10.18653\/v1\/2021.eacl-main.143"},{"key":"e_1_3_1_8_2","first-page":"993","article-title":"Latent dirichlet allocation","volume":"3","author":"Blei D.M.","year":"2003","unstructured":"BleiD.M., NgA.Y. and JordanM.I., Latent dirichlet allocation, J. Mach. Learn. Res.3 (2003), 993\u20131022.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_1_9_2","unstructured":"BojanowskiP. GraveE. JoulinA. and MikolovT. Enriching Word Vectors with Subword Information ArXiv:1607.04606 [Cs] (2017). http:\/\/arxiv.org\/abs\/1607.04606"},{"key":"e_1_3_1_10_2","unstructured":"DevlinJ. ChangM.-W. LeeK. and ToutanovaK. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding ArXiv:1810.04805 [Cs] (2019). https:\/\/arxiv.org\/abs\/1810.04805"},{"key":"e_1_3_1_11_2","unstructured":"DiasM. omadson\/fuzzy-c-means 2021. https:\/\/github.com\/omadson\/fuzzy-c-means"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1080\/01969727308546046"},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"ElisseeffA. and WestonJ. A kernel method for multi-labelled classification in: Proceedings of the 14th International Conference on Neural Information Processing Systems: Natural and Synthetic MIT Press Vancouver British Columbia Canada 2001: pp. 681\u2013687.","DOI":"10.7551\/mitpress\/1120.003.0092"},{"key":"e_1_3_1_14_2","unstructured":"GibbonsJ.D. and ChakrabortiS. Nonparametric Statistical Inference Fourth Edition: Revised and Expanded Taylor & Francis 2014."},{"key":"e_1_3_1_15_2","doi-asserted-by":"crossref","unstructured":"GodboleS. and SarawagiS. Discriminative Methods for Multi-labeled Classification in: H. Dai R. Srikant and C. Zhang (Eds.) Advances in Knowledge Discovery and Data Mining Springer Berlin Heidelberg 2004: pp. 22\u201330. doi:10.1007\/978-3-540-24775-3_5.","DOI":"10.1007\/978-3-540-24775-3_5"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2011.08.141"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","unstructured":"KlimtB. and YangY. The enron corpus: a new dataset for email classification research in: Proceedings of the 15th European Conference on Machine Learning Springer-Verlag Pisa Italy (2004) 217\u2013226. doi:10.1007\/978-3-540-30115-8_22.","DOI":"10.1007\/978-3-540-30115-8_22"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TFUZZ.2013.2294355"},{"key":"e_1_3_1_19_2","doi-asserted-by":"crossref","unstructured":"LoperE. and BirdS. NLTK: the Natural Language Toolkit in: Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics - Association for Computational Linguistics Philadelphia Pennsylvania (2002) 63\u201370. doi:10.3115\/1118108.1118117.","DOI":"10.3115\/1118108.1118117"},{"key":"e_1_3_1_20_2","unstructured":"McCallumA.K. Multi-label text classification with a mixture model trained by EM in: AAAI 99 Workshop on Text Learning (1999)."},{"key":"e_1_3_1_21_2","article-title":"PyTorch: An imperative style,high-performance deep learning library","volume":"32","author":"Paszke A.","year":"2019","unstructured":"PaszkeA., GrossS., MassaF., LererA., BradburyJ., ChananG., KilleenT., LinZ., GimelsheinN., AntigaL., DesmaisonA., KopfA., YangE., DeVitoZ., RaisonM., TejaniA., ChilamkurthyS., SteinerB., FangL., BaiJ. and ChintalaS., PyTorch: An imperative style,high-performance deep learning library, Advances in NeuralInformation Processing Systems32 (2019).","journal-title":"Advances in NeuralInformation Processing Systems"},{"key":"e_1_3_1_22_2","first-page":"2825","article-title":"Scikit-learn: Machine learning in python","volume":"12","author":"Pedregosa F.","year":"2011","unstructured":"PedregosaF., VaroquauxG., GramfortA., MichelV., ThirionB., GriselO., BlondelM., PrettenhoferP., WeissR., DubourgV., VanderplasJ., PassosA., CournapeauD., BrucherM., PerrotM. and Duchesnay\u00c9., Scikit-learn: Machine learning in python, Journal of Machine Learning Research12 (2011), 2825\u20132830.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_3_1_23_2","doi-asserted-by":"crossref","unstructured":"PenningtonJ. SocherR. and ManningC.D. GloVe: Global Vectors for Word Representation in: (2014) 1532\u20131543. doi:10.3115\/v1\/D14-1162.","DOI":"10.3115\/v1\/D14-1162"},{"key":"e_1_3_1_24_2","doi-asserted-by":"crossref","unstructured":"ReimersN. and GurevychI. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks in: (2019) 3982\u20133992. doi:10.18653\/v1\/D19-1410.","DOI":"10.18653\/v1\/D19-1410"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007649029923"},{"key":"e_1_3_1_26_2","unstructured":"ScottS. and MatwinS. Feature Engineering for Text Classification in: Proceedings of ICML-99 16th International Conference on Machine Learning Morgan Kaufmann Publishers (1999) 379\u2013388."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","unstructured":"SculleyD. Web-scale k-means clustering in: Proceedings of the 19th International Conference on World Wide Web \u2013 WWW \u201910 ACM Press Raleigh North Carolina USA (2010) 1177. doi:10.1145\/1772690.1772862.","DOI":"10.1145\/1772690.1772862"},{"key":"e_1_3_1_28_2","article-title":"A literature survey on algorithms for multi-label learning","volume":"18","author":"Sorower M.S.","year":"2010","unstructured":"SorowerM.S., A literature survey on algorithms for multi-label learning, Oregon State University, Corvallis18 (2010).","journal-title":"Oregon State University, Corvallis"},{"key":"e_1_3_1_29_2","unstructured":"SrivastavaA. and SuttonC. Autoencoding Variational Inference For Topic Models ArXiv:1703.01488 [Stat] (2017). http:\/\/arxiv.org\/abs\/1703.01488"},{"key":"e_1_3_1_30_2","unstructured":"Szyma\u0144skiP. and KajdanowiczT. A scikit-based Python environment for performing multi-label classification ArXiv:1702.01460 [Cs] (2018). http:\/\/arxiv.org\/abs\/1702.01460"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.4018\/jdwm.2007070101"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","unstructured":"TsoumakasG. and VlahavasI. Random k-Labelsets: An Ensemble Method for Multilabel Classification in: J.N. Kok J. Koronacki R.L. de Mantaras S. Matwin D. Mladeni\u010d and A. Skowron (Eds.) Machine Learning: ECML2007 Springer Berlin Heidelberg (2007) 406\u2013417. doi:10.1007\/978-3-540-74958-5_38.","DOI":"10.1007\/978-3-540-74958-5_38"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","unstructured":"WangH. HuangM. and ZhuX. A Generative Probabilistic Model for Multi-label Classification in: 2008 Eighth IEEE International Conference on Data Mining (2008) 628\u2013637. doi:10.1109\/ICDM.2008.86.","DOI":"10.1109\/ICDM.2008.86"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.2307\/3001968"},{"key":"e_1_3_1_35_2","doi-asserted-by":"crossref","unstructured":"YounesZ. AbdallahF. and Den\u00e6uxT. Fuzzy multi-label learning under veristic variables in: International Conference on Fuzzy Systems (2010) 1\u20138. doi:10.1109\/FUZZY.2010.5584079.","DOI":"10.1109\/FUZZY.2010.5584079"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11063-009-9095-3"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2009.06.010"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","unstructured":"ZhangM.-L. and ZhouZ.-H. A k-nearest neighbor based algorithm for multi-label classification in: 2005 IEEE International Conference on Granular Computing 2 (2005) 718\u2013721. doi:10.1109\/GRC.2005.1547385.","DOI":"10.1109\/GRC.2005.1547385"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-219232","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-219232","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-219232","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:45:13Z","timestamp":1777455913000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-219232"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,24]]},"references-count":37,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,3,31]]}},"alternative-id":["10.3233\/JIFS-219232"],"URL":"https:\/\/doi.org\/10.3233\/jifs-219232","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12,24]]}}}