{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T04:29:04Z","timestamp":1777696144554,"version":"3.51.4"},"reference-count":88,"publisher":"SAGE Publications","issue":"2","license":[{"start":{"date-parts":[[2022,3,1]],"date-time":"2022-03-01T00:00:00Z","timestamp":1646092800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Intelligent Data Analysis: An International Journal"],"published-print":{"date-parts":[[2022,3]]},"abstract":"<jats:p>Software maintainability is a significant contributor while choosing particular software. It is helpful in estimation of the efforts required after delivering the software to the customer. However, issues like imbalanced distribution of datasets, and redundant and irrelevant occurrence of various features degrade the performance of maintainability prediction models. Therefore, current study applies ImpS algorithm to handle imbalanced data and extensively investigates several Feature Selection (FS) techniques including Symmetrical Uncertainty (SU), RandomForest filter, and Correlation-based FS using one open-source, three proprietaries and two commercial datasets. Eight different machine learning algorithms are utilized for developing prediction models. The performance of models is evaluated using Accuracy, G-Mean, Balance, &amp; Area under the ROC Curve. Two statistical tests, Friedman Test and Wilcoxon Signed Ranks Test are conducted for assessing different FS techniques. The results substantiate that FS techniques significantly improve the performance of various prediction models with an overall improvement of 18.58%, 129.73%, 80.00%, and 45.76% in the median values of Accuracy, G-Mean, Balance, &amp; AUC, respectively for all the datasets taken together. 
Friedman test advocates the supremacy of SU FS technique. Wilcoxon Signed Ranks test showcases that SU FS technique is significantly superior to the CFS technique for three out of six datasets.<\/jats:p>","DOI":"10.3233\/ida-215825","type":"journal-article","created":{"date-parts":[[2022,3,22]],"date-time":"2022-03-22T18:20:19Z","timestamp":1647973219000},"page":"311-344","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":2,"title":["A feature selection strategy for improving software maintainability prediction"],"prefix":"10.1177","volume":"26","author":[{"given":"Shikha","family":"Gupta","sequence":"first","affiliation":[{"name":"GGSIP University","place":["India"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anuradha","family":"Chug","sequence":"additional","affiliation":[{"name":"GGSIP University","place":["India"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2022,3]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.1058483"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1049\/iet-sen.2013.0046"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","unstructured":"AlsolaiH. RoperM. and NassarD. Predicting software maintainability in object-oriented systems using ensemble techniques. In: IEEE International Conference on Software Maintenance and Evolution (ICSME) IEEE September 23\u201329 2018 pp. 716\u2013721. doi: 10.1109\/ICSME.2018.00088.","DOI":"10.1109\/ICSME.2018.00088"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","unstructured":"AlsolaiH. and RoperM. Application of ensemble techniques in predicting object-oriented software maintainability. In: Proceedings of the Evaluation and Assessment on Software Engineering ACM April 15\u201317 2019 pp. 370\u2013373. 
doi: 10.1145\/3319008.3319716.","DOI":"10.1145\/3319008.3319716"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2009.06.055"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.5120\/ijca2018916305"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1080\/09720510."},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00058655"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.3390\/app8091521"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2009.12.023"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2013.11.024"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/MS.2005.151"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.1999.788639"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/32.295895"},{"issue":"2","key":"e_1_3_2_17_2","first-page":"615","article-title":"Benchmarking framework for maintainability prediction of open source software using object oriented metrics","volume":"12","author":"Chug A.","year":"2016","unstructured":"ChugA. 
and MalhotraR., Benchmarking framework for maintainability prediction of open source software using object oriented metrics, International Journal of Innovative Computing, Information and Control 12(2) (2016), 615\u2013634.","journal-title":"International Journal of Innovative Computing, Information and Control"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1053964"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/WCRE.2003.1287246"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/0020-7373(89)90027-8"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2013.07.005"},{"key":"e_1_3_2_22_2","first-page":"74","article-title":"Filters, wrappers and a boosting-based hybrid for feature selection","author":"Das S.","year":"2001","unstructured":"DasS., Filters, wrappers and a boosting-based hybrid for feature selection, In: Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), June 28\u2013July 1, 2001, Vol. 1, pp. 74\u201381.","journal-title":"Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001)"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/S1088-467X(97)00008-5"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.14445\/22315381\/IJETT-V37P215"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/2347696.2347703"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00500-014-1576-2"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","unstructured":"ElishM.O. and ElishK.O. Application of treenet in predicting object-oriented software maintainability: A comparative study In: 13th European Conference on Software Maintenance and Reengineering IEEE March 24\u201327 2009 pp. 69\u201378. doi: 10.1109\/CSMR.2009.57.","DOI":"10.1109\/CSMR.2009.57"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","unstructured":"GaoK. KhoshgoftaarT.M. 
and NapolitanoA. Combining Feature Subset Selection and Data Sampling for Coping with Highly Imbalanced Software Data In: The 27th International Conference on Software Engineering and Knowledge Engineering (SEKE) July 2015 pp. 439\u2013444. doi: 10.18293\/SEKE2015-182.","DOI":"10.18293\/SEKE2015-182"},{"key":"e_1_3_2_29_2","unstructured":"GunnalanR. MenziesT. AppukuttyK. SrinivasanA. and HuY. Feature subset selection with tar2less 2003."},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","unstructured":"GuptaS. and ChugA. Assessing Cross-Project Technique for Software Maintainability Prediction Procedia Computer Science 167 (2020) 656\u2013665. doi: 10.1016\/j.procs.2020.03.332.","DOI":"10.1016\/j.procs.2020.03.332"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1080\/09720510.2020.1799501"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1080\/09720529.2020.1728898"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2016.12.035"},{"key":"e_1_3_2_34_2","unstructured":"HallM.A. Correlation-based feature selection for machine learning The University of Waikato 1999."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2003.1245283"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.4236\/jis.2016.73009"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2008.239"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1093\/biostatistics\/kxj011"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1198\/106186006X133933"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.574797"},{"issue":"1","key":"e_1_3_2_41_2","first-page":"1","article-title":"Data Reduction Techniques: A Comparative Study for Attribute Selection Methods","volume":"8","author":"Janabi K.B.A.","year":"2018","unstructured":"JanabiK.B.A. 
and KadhimR., Data Reduction Techniques: A Comparative Study for Attribute Selection Methods, International Journal of Advanced Computer Science and Technology(IJACST) 8(1) (2018), 1\u201313.","journal-title":"International Journal of Advanced Computer Science and Technology(IJACST)"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2913349"},{"issue":"1","key":"e_1_3_2_43_2","first-page":"39","article-title":"Software Maintainability Prediction Model Based on Fuzzy Neural Network","volume":"20","author":"Jia L.","year":"2013","unstructured":"JiaL. YangB. ParkD.H. TanF. and ParkM., Software Maintainability Prediction Model Based on Fuzzy Neural Network, Journal of Multiple-Valued Logic & Soft Computing 20(1-2) (2013), 39\u201353.","journal-title":"Journal of Multiple-Valued Logic & Soft Computing"},{"key":"e_1_3_2_44_2","first-page":"152","article-title":"A hybrid feature selection algorithm: Combination of symmetrical uncertainty and genetic algorithms","author":"Jiang B.","year":"2008","unstructured":"JiangB. DingX. MaL. HeY. WangT. and XieW., A hybrid feature selection algorithm: Combination of symmetrical uncertainty and genetic algorithms, In: The second international symposium on optimization and systems biology (OSB\u201908), ORSC and APORC, October 31\u2013November 3, 2008, pp. 152\u2013157.","journal-title":"The second international symposium on optimization and systems biology (OSB\u201908)"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2010.03.016"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.5120\/339-515"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","unstructured":"KhalidS. KhalilT. and NasreenS. A survey of feature selection and feature extraction techniques in machine learning In: Science and Information Conference IEEE August 27\u201329 2014 pp. 372\u2013378. 
doi: 10.1109\/SAI.2014.6918213.","DOI":"10.1109\/SAI.2014.6918213"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0004-3702(97)00043-X"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.infsof.2005.03.002"},{"key":"e_1_3_2_50_2","first-page":"179","article-title":"Addressing the curse of imbalanced training sets: one-sided selection","author":"Kubat M.","year":"1997","unstructured":"KubatM. MatwinS. and others, Addressing the curse of imbalanced training sets: one-sided selection, In: Proceedings of the 14th International Conference on Machine Learning (ICML), Citeseer, 1997, Vol. 97, pp. 179\u2013186.","journal-title":"Proceedings of the 14th International Conference on Machine Learning (ICML)"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","unstructured":"KumarL. NaikD.K. and RathS.K. Validating the effectiveness of object-oriented metrics for predicting maintainability Procedia Computer Science 57 (2015) 798\u2013806. doi: 10.1016\/j.procs.2015.07.479.","DOI":"10.1016\/j.procs.2015.07.479"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2016.01.003"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13198-017-0618-4"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10515-011-0092-1"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1016\/0164-1212(93)90077-B"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2005.66"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2019.02.023"},{"key":"e_1_3_2_58_2","first-page":"133","article-title":"Introduction to Gaussian processes","volume":"168","author":"MacKay D.J.C.","year":"1998","unstructured":"MacKayD.J.C., Introduction to Gaussian processes, NATO ASI Series F Computer and Systems Sciences 168 (1998), 133\u2013166.","journal-title":"NATO ASI Series F Computer and Systems 
Sciences"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2014.11.023"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","unstructured":"MalhotraR. and ChugA. Application of evolutionary algorithms for software maintainability prediction using object-oriented metrics In: Proceedings of the 8th International Conference on Bioinspired Information and Communications Technologies ACM December 1\u20133 2014 pp. 348\u2013351. doi: 10.4108\/icst.bict.2014.258044.","DOI":"10.4108\/icst.bict.2014.258044"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1007\/s13198-014-0227-4"},{"issue":"2","key":"e_1_3_2_62_2","first-page":"19","article-title":"Software maintainability prediction using machine learning algorithms","volume":"2","author":"Malhotra R.","year":"2012","unstructured":"MalhotraR. and ChugA., Software maintainability prediction using machine learning algorithms, Software Engineering: An International Journal (SEIJ) 2(2) (2012), 19\u201336.","journal-title":"Software Engineering: An International Journal (SEIJ)"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/ESEM.2009.5316048"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.21917\/ijsc.2013.0077"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.2174\/092986609789839250"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.3390\/sym11040498"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.25103\/JESTR.106.20"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2018.10.036"},{"key":"e_1_3_2_69_2","unstructured":"RStudio Team and others RStudio: integrated development for R RStudio Inc. Boston MA 42 (2015) 14. https:\/\/rstudio.com\/."},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-87481-2_21"},{"key":"e_1_3_2_71_2","unstructured":"QuinlanJ. C4. 5: programs for machine learning Morgan Kaufmann 2014. 
(ISBN 1-55860-238-0)."},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-017-2988-6"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.2307\/2333709"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","unstructured":"ShatnawiR. Improving software fault-prediction for imbalanced data In: International Conference on Innovations in Information Technology (IIT) IEEE March 18\u201320 2012 pp. 54\u201359. doi: 10.1109\/INNOVATIONS.2012.6207774.","DOI":"10.1109\/INNOVATIONS.2012.6207774"},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-019-09682-y"},{"issue":"6","key":"e_1_3_2_76_2","first-page":"7255","article-title":"Automatic outlier identification in data mining using IQR in real-time data","volume":"3","author":"Sunitha L.","year":"2014","unstructured":"SunithaL. BalRajuM. SasikiranJ. and RamanaE.V., Automatic outlier identification in data mining using IQR in real-time data, International Journal of Advanced Research in Computer and Communication Engineering 3(6) (2014), 7255\u20137257.","journal-title":"International Journal of Advanced Research in Computer and Communication Engineering"},{"key":"e_1_3_2_77_2","doi-asserted-by":"publisher","DOI":"10.1111\/eva.12524"},{"key":"e_1_3_2_78_2","unstructured":"TherneauT.M. and AtkinsonE.J. An introduction to recursive partitioning using the RPART routines. 2018 Mayo Foundation 2019."},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","unstructured":"QuahT. and ThwinM.M.T. Application of neural networks for software quality prediction using object-oriented metrics In: Proceedings of International Conference on Software Maintenance (ICSM 2003) IEEE September 22\u201326 2003 pp. 116\u2013125. 
doi: 10.1109\/ICSM.2003.1235412.","DOI":"10.1109\/ICSM.2003.1235412"},{"key":"e_1_3_2_80_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-019-08537-6"},{"key":"e_1_3_2_81_2","doi-asserted-by":"publisher","DOI":"10.1142\/S0218488519500375"},{"key":"e_1_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1109\/TEVC.2015.2504420"},{"key":"e_1_3_2_83_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2018.06.029"},{"key":"e_1_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jss.2006.10.049"},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.1080\/00220973.1993.9943832"},{"key":"e_1_3_2_86_2","first-page":"1","article-title":"Selecting Attributes","author":"Romanski P.","unstructured":"RomanskiP. KotthoffL. and KotthoffM.L., Selecting Attributes, Package \u2018FSelector\u2019, pp. 1\u201318. https:\/\/cran.r-project.org\/web\/packages\/FSelector\/FSelector.pdf.","journal-title":"Package \u2018FSelector\u2019"},{"key":"e_1_3_2_87_2","doi-asserted-by":"publisher","unstructured":"SES Subcommittee IEEE standard for software maintenance IEEE Std 1992 pp. 1219\u20131993. doi: 10.1109\/IEEESTD.1993.115570.","DOI":"10.1109\/IEEESTD.1993.115570"},{"key":"e_1_3_2_88_2","unstructured":"R Core Team and others R: A language and environment for statistical computing 2013. https:\/\/cran.r-project.org\/bin\/windows\/base\/old\/4.0.0\/."},{"key":"e_1_3_2_89_2","article-title":"An Implementation of Re-Sampling Approaches to Utility-Based Learning for Both Classification and Regression Tasks","author":"Paula B.","year":"2017","unstructured":"PaulaB. RitaR. and LuisT., An Implementation of Re-Sampling Approaches to Utility-Based Learning for Both Classification and Regression Tasks, R Package \u2018UBL\u2019 (2017)1\u201361. 
https:\/\/cran.r-project.org\/web\/packages\/UBL\/UBL.pdf, https:\/\/github.com\/paobranco\/UBL.","journal-title":"R Package \u2018UBL\u2019"}],"container-title":["Intelligent Data Analysis: An International Journal"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/IDA-215825","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.3233\/IDA-215825","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.3233\/IDA-215825","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:19:23Z","timestamp":1777454363000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.3233\/IDA-215825"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3]]},"references-count":88,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,3]]}},"alternative-id":["10.3233\/IDA-215825"],"URL":"https:\/\/doi.org\/10.3233\/ida-215825","relation":{},"ISSN":["1088-467X","1571-4128"],"issn-type":[{"value":"1088-467X","type":"print"},{"value":"1571-4128","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3]]}}}