{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T06:29:44Z","timestamp":1777703384531,"version":"3.51.4"},"reference-count":24,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2018,3,22]],"date-time":"2018-03-22T00:00:00Z","timestamp":1521676800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"published-print":{"date-parts":[[2018,3,22]]},"abstract":"<jats:p>High-dimensional data analysis is quite inevitable due to emerging technologies in various domains such as finance, healthcare, genomics and signal processing. Though data sets generated in these domains are high-dimensional, intrinsic dimensions that provide meaningful information are often much smaller. Conventionally, unsupervised clustering methods known as subspace clustering are utilized for finding clusters in different subspaces of high dimensional data, by identifying relevant features, irrespective of labels associated with each instance. Available label information, if incorporated in clustering algorithm, can bias the algorithm towards solutions more consistent with our knowledge, leading to improved cluster quality. Therefore, an Information Gain based Semi-supervised- subspace Clustering (IGSC) is proposed that identifies a subset of important attributes based on the known label for each data instance. The information about the labels associated with data sets is integrated with the search strategy for subspaces to leverage them into a model based clustering approach. Our experimentation on 13 real world labeled data sets proves the feasibility of IGSC and we validate the clusters obtained, using an improvised Davies Bouldin Index (DBI) for semi-supervised clusters.<\/jats:p>","DOI":"10.3233\/jifs-169456","type":"journal-article","created":{"date-parts":[[2018,3,23]],"date-time":"2018-03-23T12:25:35Z","timestamp":1521807935000},"page":"1619-1629","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":6,"title":["Semi supervised approach towards subspace clustering"],"prefix":"10.1177","volume":"34","author":[{"given":"Sandhya","family":"Harikumar","sequence":"first","affiliation":[{"name":"Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India"}]},{"given":"A.S.","family":"Akhil","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India"}]}],"member":"179","published-online":{"date-parts":[[2018,3,22]]},"reference":[{"issue":"1","key":"e_1_3_2_2_2","doi-asserted-by":"crossref","DOI":"10.1145\/1497577.1497578","article-title":"Clustering highdimensional data: A survey on subspace clustering, patternbased clustering, and correlation clustering","volume":"3","author":"Kriegel H.P.","year":"2009","unstructured":"KriegelH.P., KrogerP. and ZimekA., Clustering highdimensional data: A survey on subspace clustering, patternbased clustering, and correlation clustering, ACM Transactions on Knowledge Discovery from Data3(1) (2009).","journal-title":"ACM Transactions on Knowledge Discovery from Data"},{"issue":"4","key":"e_1_3_2_3_2","first-page":"351","article-title":"Subspace clustering","volume":"2","author":"Kriegel H.P.","year":"2012","unstructured":"KriegelH.P., KrogerP. and ZimekA., Subspace clustering, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery2(4) (2012), 351\u2013364.","journal-title":"Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery"},{"key":"e_1_3_2_4_2","first-page":"217","volume-title":"When is nearest neighbors meaningful","author":"Beyer K.","year":"1999","unstructured":"BeyerK., GoldsteinJ., RamakrishnanR., ShaftU., When is nearest neighbors meaningfulProceedings International Conference on Database Theory (ICDT) (1999), 217\u2013235."},{"key":"e_1_3_2_5_2","volume-title":"Irrelevant features and the subset selection problem","author":"John G.H.","year":"1994","unstructured":"JohnG.H., KohaviR. and PflegerP., Irrelevant features and the subset selection problem. Machine Learning: Proceedings of the Eleventh International ConferenceMorgan Kaufmann, 1994."},{"key":"e_1_3_2_6_2","article-title":"Selection of relevant features and examples in machine learning","author":"Langley P.","year":"1994","unstructured":"LangleyP. and BlumA.L., Selection of relevant features and examples in machine learning, Special issue of Artificial Intelligence on Relevance (1994).","journal-title":"Special issue of Artificial Intelligence on Relevance"},{"key":"e_1_3_2_7_2","first-page":"48","volume-title":"Using information gain as feature weight","author":"Ayan N.F.","year":"1999","unstructured":"AyanN.F., Using information gain as feature weight, TAINN\u201999 8th Turkish Symposium on Artificial Intelligence and Neural NetworksIstanbul48\u201357, (1999)."},{"key":"e_1_3_2_8_2","doi-asserted-by":"crossref","DOI":"10.1007\/BF00116251","article-title":"Induction of decision trees","volume":"1","author":"Quinlan J.R.","year":"1986","unstructured":"QuinlanJ.R., Induction of decision trees, Machine Learning1 (1986).","journal-title":"Machine Learning"},{"key":"e_1_3_2_9_2","first-page":"61 72","volume-title":"Fast algorithms for projected clustering","author":"Aggarwal C.C.","year":"1999","unstructured":"AggarwalC.C., WolfJ.L., YuP.S., ProcopiucC. and ParkJ.S., Fast algorithms for projected clustering. Proceedings of the 1999 ACM SIGMOD international conference on Management of data199961\u201372. ACM Press."},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/342009.335383"},{"key":"e_1_3_2_11_2","unstructured":"WooK.G. and LeeJ.H. FINDIT: A Fast and Intelligent Subspace Clustering Algorithm using Dimension Voting. PhD thesis Korea Advanced Institute of Science and Technology Taejon Korea 2002."},{"key":"e_1_3_2_12_2","first-page":"517","article-title":"\u00cet\u2019-clusters: Capturing subspace correlation in a large data set","year":"2002","unstructured":"Yang, et al., \u00cet\u2019-clusters: Capturing subspace correlation in a large data set. In ICDE (2002), pp. 517\u2013528.","journal-title":"ICDE"},{"key":"e_1_3_2_13_2","first-page":"1022","volume-title":"Multi-interval discretization of continuous valued attributes for classification learning","author":"Fayyad U.M.","year":"1993","unstructured":"FayyadU.M. and IraniK.B., Multi-interval discretization of continuous valued attributes for classification learning, 13th International Joint Conference on Artificial Intelligence (1993), 1022\u20131027."},{"key":"e_1_3_2_14_2","first-page":"445","volume-title":"Improving Classification Performance with Discretization on Biomedical Datasets","author":"Lustgarten J.L.","year":"2008","unstructured":"LustgartenJ.L., GopalakrishnanV., GroverH. and VisweswaranS., Improving Classification Performance with Discretization on Biomedical Datasets, in AMIA Annu Symp Proc (2008), 445\u2013449."},{"key":"e_1_3_2_15_2","first-page":"46","volume-title":"Density-connected subspace clustering for high dimensional data","author":"Kailing K.","year":"2004","unstructured":"KailingK., KriegelH.P. and KrogerP., Density-connected subspace clustering for high dimensional data, in proceedings of the 4th SIAM International Conference on Data Mining (2004), 46\u2013257Orlando, FL."},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/276304.276314"},{"key":"e_1_3_2_17_2","article-title":"Introduction to Semi-Supervised Learning","author":"Zhu X.","year":"2009","unstructured":"ZhuX. and GoldbergA., Introduction to Semi-Supervised Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning (2009).","journal-title":"Synthesis Lectures on Artificial Intelligence and Machine Learning"},{"key":"e_1_3_2_18_2","article-title":"A survey on soft subspace clustering","author":"Deng Z.","year":"2016","unstructured":"DengZ., ChoiK.-S., JiangY., WangJ. and WangS., A survey on soft subspace clustering, Information Sciences (2016).","journal-title":"Information Sciences"},{"key":"e_1_3_2_19_2","unstructured":"GoilS. NageshH. and ChoudharyA. MAFIA: Efficient and scalable subspace clustering for very large data sets Technical Report CPDC-TR-9906-010 Northwestern University 1999."},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1024016609528"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2007.11.011"},{"key":"e_1_3_2_22_2","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1007\/978-3-319-23258-4_29","article-title":"Data integration of heterogeneous data sources using QR decomposition","volume":"385","author":"Sandhya H.","year":"2016","unstructured":"SandhyaH. and RoyM.M., Data integration of heterogeneous data sources using QR decomposition, Advances in Intelligent Systems and Computing385 (2016), 333\u2013344.","journal-title":"Advances in Intelligent Systems and Computing"},{"key":"e_1_3_2_23_2","volume-title":"Apriori algorithm for association rule mining in high dimensional data","author":"Harikumar S.","year":"2016","unstructured":"HarikumarS. and DilipkumarD.U., Apriori algorithm for association rule mining in high dimensional data, in Proceedings of the 2016 International Conference on Data Science and Engineering, ICDSE 2016, 2016."},{"key":"e_1_3_2_24_2","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1007\/3-540-44503-X_27","volume-title":"Database Theory-ICDT 2001, Lecture Notes in Computer Science","author":"Aggarwal C.","year":"2001","unstructured":"AggarwalC., HinneburgA., KeimD.On the surprising behavior of distance metrics in high dimensional spaceDatabase Theory-ICDT 2001, Lecture Notes in Computer Science (2001), 420\u2013434 , Berlin , HeidelbergSpringer."},{"issue":"2","key":"e_1_3_2_25_2","doi-asserted-by":"crossref","DOI":"10.1109\/TPAMI.1979.4766909","article-title":"A cluster separation measure","volume":"1","author":"Davies D.L.","year":"1979","unstructured":"DaviesD.L. and BouldinD.W., A cluster separation measure, IEEE Transactions on Pattern Analysis and Machine Intelligence1(2) (1979).","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"}],"container-title":["Journal of Intelligent &amp; Fuzzy Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-169456","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/JIFS-169456","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/JIFS-169456","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:38:35Z","timestamp":1777455515000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/JIFS-169456"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,3,22]]},"references-count":24,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2018,3,22]]}},"alternative-id":["10.3233\/JIFS-169456"],"URL":"https:\/\/doi.org\/10.3233\/jifs-169456","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,3,22]]}}}