{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:10:42Z","timestamp":1760238642308,"version":"build-2065373602"},"reference-count":57,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2020,8,31]],"date-time":"2020-08-31T00:00:00Z","timestamp":1598832000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>This paper proposes a novel approach for selecting a subset of features in semi-supervised datasets where only some of the patterns are labeled. The whole process is completed in two phases. In the first phase, i.e., Phase-I, the whole dataset is divided into two parts: The first part, which contains labeled patterns, and the second part, which contains unlabeled patterns. In the first part, a small number of features are identified using well-known maximum relevance (from first part) and minimum redundancy (whole dataset) based feature selection approaches using the correlation coefficient. The subset of features from the identified set of features, which produces a high classification accuracy using any supervised classifier from labeled patterns, is selected for later processing. In the second phase, i.e., Phase-II, the patterns belonging to the first and second part are clustered separately into the available number of classes of the dataset. In the clusters of the first part, take the majority of patterns belonging to a cluster as the class for that cluster, which is given already. Form the pairs of cluster centroids made in the first and second part. The centroid of the second part nearest to a centroid of the first part will be paired. As the class of the first centroid is known, the same class can be assigned to the centroid of the cluster of the second part, which is unknown. 
The actual class of the patterns if known for the second part of the dataset can be used to test the classification accuracy of patterns in the second part. The proposed two-phase approach performs well in terms of classification accuracy and number of features selected on the given benchmarked datasets.<\/jats:p>","DOI":"10.3390\/a13090215","type":"journal-article","created":{"date-parts":[[2020,8,31]],"date-time":"2020-08-31T11:53:49Z","timestamp":1598874829000},"page":"215","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A Two-Phase Approach for Semi-Supervised Feature Selection"],"prefix":"10.3390","volume":"13","author":[{"given":"Amit","family":"Saxena","sequence":"first","affiliation":[{"name":"Department of Computer Science and Information Technology, Guru Ghasidas University, Bilaspur, Chhattisgarh 495009, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2468-0667","authenticated-orcid":false,"given":"Shreya","family":"Pare","sequence":"additional","affiliation":[{"name":"School of Computer Science, FEIT, University of Technology Sydney, Sydney, NSW 2007, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3805-1087","authenticated-orcid":false,"given":"Mahendra Singh","family":"Meena","sequence":"additional","affiliation":[{"name":"School of Computer Science, FEIT, University of Technology Sydney, Sydney, NSW 2007, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6375-8615","authenticated-orcid":false,"given":"Deepak","family":"Gupta","sequence":"additional","affiliation":[{"name":"Department of Computer Science &amp; Engineering, National Institute of Technology Arunachal Pradesh, Yupia 791112, 
India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Akshansh","family":"Gupta","sequence":"additional","affiliation":[{"name":"Central Electronics Engineering Research Institute, Delhi 110028, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Imran","family":"Razzak","sequence":"additional","affiliation":[{"name":"School of Information Technology, Deakin University, Geelong, VIC 3217, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chin-Teng","family":"Lin","sequence":"additional","affiliation":[{"name":"School of Computer Science, FEIT, University of Technology Sydney, Sydney, NSW 2007, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7745-9667","authenticated-orcid":false,"given":"Mukesh","family":"Prasad","sequence":"additional","affiliation":[{"name":"School of Computer Science, FEIT, University of Technology Sydney, Sydney, NSW 2007, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,8,31]]},"reference":[{"key":"ref_1","unstructured":"Duda, R.O., Hart, P.E., and Stork, D.G. (2001). Pattern Classification, John Wiley and Sons (Asia)."},{"key":"ref_2","first-page":"315","article-title":"Hybrid Feature Selection Methods for High Dimensional Multi\u2013class Datasets","volume":"9","author":"Saxena","year":"2017","journal-title":"Int. J. Data Min. Model. Manag."},{"key":"ref_3","first-page":"494","article-title":"An Evolutionary Feature Selection Technique using Polynomial Neural Network","volume":"8","author":"Saxena","year":"2011","journal-title":"Int. J. Comput. Sci. Issues"},{"key":"ref_4","unstructured":"Michalski, R.S., Karbonell, J.G., and Kubat, M. (1998). Machine Learning and Data Mining: Methods and Applications, John Wiley and Sons."},{"key":"ref_5","unstructured":"Kamber, M., and Han, J. (2006). Data Mining: Concepts and Techniques, Morgan Kaufmann Publisher. 
[2nd ed.]."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/360402.360406","article-title":"Web Mining Research: A Survey","volume":"2","author":"Kosala","year":"2000","journal-title":"ACM Sig Kdd Explor. Newsl."},{"key":"ref_7","unstructured":"Baldi, P., and Brunak, S. (1998). Bioinformatics: The Machine Learning Approach, MIT Press. [2nd ed.]."},{"key":"ref_8","unstructured":"Boero, G., and Cavalli, E. (1996). Forecasting the Exchange Rate: A Comparison between Econometric and Neural Network Models, AFIR Colloquium."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"447","DOI":"10.2307\/253819","article-title":"Fuzzy Techniques of Pattern Recognition in Risk and Claim Classification","volume":"62","author":"Derrig","year":"1995","journal-title":"J. Risk Insur."},{"key":"ref_10","unstructured":"Mitchell, T.M. (1997). Machine Learning, McGraw Hill."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1016\/j.patrec.2013.12.008","article-title":"Integration of Dense Subgraph Finding with Feature Clustering for Unsupervised Feature Selection","volume":"40","author":"Bandyopadhyay","year":"2014","journal-title":"Pattern Recognit. Lett."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1389","DOI":"10.1109\/TSMC.2015.2406855","article-title":"An Improved Polynomial Neural Network Classifier Using Real-Coded Genetic Algorithm","volume":"45","author":"Lin","year":"2015","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1109\/TSA.2002.1011533","article-title":"Speaker Recognition with Polynomial Classifiers","volume":"10","author":"Campbell","year":"2002","journal-title":"IEEE Trans. 
Speech Audio Process."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1109\/TFUZZ.2002.1006431","article-title":"Fuzzy Logic Approaches to Structure Preserving Dimensionality Reduction","volume":"10","author":"Pal","year":"2002","journal-title":"IEEE Trans. Fuzzy Syst."},{"key":"ref_15","unstructured":"Jain, A.K., and Dubes, R.C. (1988). Algorithms for Clustering Data, Prentice\u2013Hall."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1109\/T-C.1969.222678","article-title":"A Nonlinear Mapping for Data Structure Analysis","volume":"18","author":"Sammon","year":"1969","journal-title":"IEEE Trans. Comput."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1016\/0146-664X(78)90053-9","article-title":"A Nonlinear Mapping Algorithm for Large Databases","volume":"7","author":"Schachter","year":"1978","journal-title":"Comput. Graph. Image Process."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1049\/el:19780539","article-title":"Improving the Efficiency of Sammon\u2019s Nonlinear Mapping by using Clustering Archetypes","volume":"14","author":"Pykett","year":"1980","journal-title":"Electron. Lett."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1016\/S0165-0114(98)00222-X","article-title":"Soft Computing for Feature Analysis","volume":"103","author":"Pal","year":"1999","journal-title":"Fuzzy Sets Syst."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1109\/TEVC.2004.825567","article-title":"A Novel Approach for Designing Classifiers Using Genetic Programming","volume":"8","author":"Muni","year":"2004","journal-title":"IEEE Trans. Evol. Comput."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1109\/34.990133","article-title":"Unsupervised Feature Selection using Feature Similarity","volume":"24","author":"Mitra","year":"2002","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Dash, M., and Liu, H. (2000, January 18\u201320). Feature Selection for Clustering. Proceedings of the Asia Pacific Conference on Knowledge Discovery and Data Mining, Kyoto, Japan.","DOI":"10.1007\/3-540-45571-X_13"},{"key":"ref_23","unstructured":"Dy, J.G., and Brodley, C.E. (2000). Feature Subset Selection and Order Identification for Unsupervised Learning, ICML."},{"key":"ref_24","unstructured":"Basu, S., Micchelli, C.A., and Olsen, P. (2000, January 28\u201331). Maximum Entropy and Maximum Likelihood Criteria for Feature Selection from Multivariate Data. Proceedings of the 2000 IEEE International Symposium on Circuits and Systems (ISCAS), Geneva, Switzerland."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"366","DOI":"10.1109\/72.839007","article-title":"Unsupervised Feature Evaluation: A Neuro\u2013Fuzzy Approach","volume":"1","author":"Pal","year":"2000","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TSMCB.2005.854499","article-title":"Genetic Programming for Simultaneous Feature Selection and Classifier Design","volume":"36","author":"Muni","year":"2006","journal-title":"IEEE Trans. Syst. Man Cyber."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1051","DOI":"10.1109\/T-C.1971.223401","article-title":"Redundancy in Feature Extraction","volume":"100","author":"Heydorn","year":"1971","journal-title":"IEEE Trans. Comput."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1106","DOI":"10.1109\/T-C.1971.223412","article-title":"Feature Selection with a Linear Dependence Measure","volume":"100","author":"Das","year":"1971","journal-title":"IEEE Trans. 
Comput."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1007\/s12543-010-0047-4","article-title":"Evolutionary Methods for Unsupervised Feature Selection using Sammon\u2019s Stress Function","volume":"2","author":"Saxena","year":"2010","journal-title":"Fuzzy Inf. Eng."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1226","DOI":"10.1109\/TPAMI.2005.159","article-title":"Feature Selection based on Mutual Information Criteria of Max\u2013dependency, Max\u2013relevance, and Min\u2013redundancy","volume":"27","author":"Peng","year":"2005","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Xu, J., Yang, G., Man, H., and He, H. (2013, January 4\u20136). L1 Graph base on Sparse Coding for Feature Selection. Proceedings of the International Symposium on Neural Networks, Dalian, China.","DOI":"10.1007\/978-3-642-39065-4_71"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Xu, J., Yin, Y., Man, H., and He, H. (2012, January 10\u201315). Feature Selection based on Sparse Imputation. Proceedings of the 2012 International Joint Conference on Neural Networks (IJCNN), Brisbane, QLD, Australia.","DOI":"10.1109\/IJCNN.2012.6252639"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"954","DOI":"10.1109\/TNN.2011.2128342","article-title":"Feature Selection using Probabilistic Prediction of Support Vector Regression","volume":"22","author":"Yang","year":"2011","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_34","unstructured":"Weston, J., Mukherjee, S., Chapelle, O., Pontil, M., Poggio, T., and Vapnik, V. (2000, January 27\u201330). Feature Selection for SVMs. 
Proceedings of the Advances in Neural Information Processing Systems 13, Cambridge, MA, USA."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1974","DOI":"10.1109\/TNNLS.2016.2562670","article-title":"Semi\u2013supervised Feature Selection based on Relevance and Redundancy Criteria","volume":"28","author":"Xu","year":"2016","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_36","first-page":"1157","article-title":"An Introduction to Variable and Feature Selection","volume":"3","author":"Guyon","year":"2003","journal-title":"J. Mach. Learn. Res."},{"key":"ref_37","unstructured":"Hall, M.A. (July, January 29). Correlation\u2013based Feature Selection for Discrete and Numeric Class Machine Learning. Proceedings of the 17th International Conference on Machine Learning (ICML2000), Stanford University, Stanford, CA, USA."},{"key":"ref_38","unstructured":"He, X., Cai, D., and Niyogi, P. (2005, January 5\u20138). Laplacian Score for Feature Selection. Proceedings of the Advances in Neural Information Processing Systems 18, Vancouver, BC, Canada."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1033","DOI":"10.1109\/TNN.2010.2047114","article-title":"Discriminative Semi-supervised Feature Selection via Manifold Regularization","volume":"21","author":"Xu","year":"2010","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Zhao, Z., and Liu, H. (2007, January 26\u201328). Semi\u2013supervised Feature Selection via Spectral Analysis. Proceedings of the Seventh SIAM International Conference on Data Mining, Minneapolis, MN, USA.","DOI":"10.1137\/1.9781611972771.75"},{"key":"ref_41","unstructured":"Ren, J., Qiu, Z., Fan, W., Cheng, H., and Yu, P.S. (2008, January 20\u201323). Forward Semi\u2013supervised Feature Selection. 
Proceedings of the 12th Pacific-Asia Conference, PAKDD 2008, Osaka, Japan."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/j.patcog.2016.11.003","article-title":"A Survey on Semi\u2013supervised Feature Selection Methods","volume":"64","author":"Sheikhpour","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_43","unstructured":"MacQueen, J. (1965\u20137, January 27). Some Methods for Classification and Analysis of Multivariate Observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"40046","DOI":"10.1063\/1.5033710","article-title":"An Improved Initialization Center k\u2013means Clustering Algorithm based on Distance and Density","volume":"1955","author":"Duan","year":"2018","journal-title":"AIP Conf. Proc."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1080\/01969727308546046","article-title":"A Fuzzy Relative of the ISO DATA Process and Its Use in Detecting Compact Well\u2013Separated Clusters","volume":"3","author":"Dunn","year":"1973","journal-title":"J. Cybern."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Bezdek, J.C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms, Kluwer Academic Publishers.","DOI":"10.1007\/978-1-4757-0450-1"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"664","DOI":"10.1016\/j.neucom.2017.06.053","article-title":"A Review of Clustering Techniques and Developments","volume":"267","author":"Saxena","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"301","DOI":"10.7763\/IJET.2016.V8.902","article-title":"An Overview of Semi\u2013supervised Fuzzy Clustering Algorithms","volume":"8","author":"Thong","year":"2016","journal-title":"Int. J. Eng. 
Technol."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Li, L., Garibaldi, J.M., He, D., and Wang, M. (2015). Semi\u2013supervised Fuzzy Clustering with Feature Discrimination. PLoS ONE, 10.","DOI":"10.1371\/journal.pone.0131160"},{"key":"ref_50","first-page":"20004","article-title":"A Comprehensive Foundation","volume":"2","author":"Haykin","year":"2004","journal-title":"Neural Netw."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1109\/TSMC.1971.4308320","article-title":"Polynomial Theory of Complex Systems","volume":"4","author":"Ivakhnenko","year":"1971","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Madala, H.R. (2019). Inductive Learning Algorithms for Complex Systems Modeling, CRC Press.","DOI":"10.1201\/9781351073493"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"1705","DOI":"10.1016\/j.patrec.2008.04.012","article-title":"A Reduced and Comprehensible Polynomial Neural Network for Classification","volume":"29","author":"Misra","year":"2008","journal-title":"Pattern Recognit. Lett."},{"key":"ref_54","unstructured":"Kennedy, J., and Eberhart, R. (December, January 27). Particle Swarm Optimization. Proceedings of the ICNN\u201995\u2013International Conference on Neural Networks, Perth, WA, Australia."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"3106","DOI":"10.1016\/j.asoc.2010.12.013","article-title":"A Condensed Polynomial Neural Network for Classification using Swarm Intelligence","volume":"11","author":"Dehuri","year":"2011","journal-title":"Appl. Soft Comput."},{"key":"ref_56","unstructured":"Dheeru, D., and Taniskidou, E.K. (2020, August 30). UCI Machine Learning Repository. Available online: http:\/\/archive.ics.uci.edu\/ml."},{"key":"ref_57","unstructured":"Rossi, R.A., and Ahmad, N.K. (2020, August 30). The Network Data Repository with Interactive Graph Analytics and Visualization. 
Available online: http:\/\/networkrepository.com."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/13\/9\/215\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:05:15Z","timestamp":1760177115000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/13\/9\/215"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,31]]},"references-count":57,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["a13090215"],"URL":"https:\/\/doi.org\/10.3390\/a13090215","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2020,8,31]]}}}