{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T02:41:49Z","timestamp":1775702509541,"version":"3.50.1"},"reference-count":19,"publisher":"World Scientific Pub Co Pte Lt","issue":"02","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Soft. Eng. Knowl. Eng."],"published-print":{"date-parts":[[2010,3]]},"abstract":"<jats:p> Feature selection for supervised learning concerns the problem of selecting a number of important features (w.r.t. the class labels) for the purposes of training accurate prediction models. Traditional feature selection methods, however, fail to take the sample distributions into consideration which may lead to poor prediction for minority class examples. Due to the sophistication and the cost involved in the data collection process, many applications, such as biomedical research, commonly face biased data collections with one class of examples (e.g., diseased samples) significantly less than other classes (e.g., normal samples). For these applications, the minority class examples, such as disease samples, credit card frauds, and network intrusions, are only a small portion of the data but deserve full attention for accurate prediction. In this paper, we propose three filtering techniques, Higher Weight (HW), Differential Minority Repeat (DMR) and Balanced Minority Repeat (BMR), to identify important features from datasets with biased sample distribution. Experimental comparisons with the ReliefF method on five datasets demonstrate the effectiveness of the proposed methods in selecting informative features for accurate prediction of minority class examples. <\/jats:p>","DOI":"10.1142\/s0218194010004645","type":"journal-article","created":{"date-parts":[[2010,6,4]],"date-time":"2010-06-04T08:20:18Z","timestamp":1275639618000},"page":"113-137","source":"Crossref","is-referenced-by-count":11,"title":["FEATURE SELECTION FOR DATASETS WITH IMBALANCED CLASS DISTRIBUTIONS"],"prefix":"10.1142","volume":"20","author":[{"given":"ABU H. M.","family":"KAMAL","sequence":"first","affiliation":[{"name":"Department of Computer Science &amp; Engineering, Florida Atlantic University, Boca Raton, FL 33431, USA"}]},{"given":"XINGQUAN","family":"ZHU","sequence":"additional","affiliation":[{"name":"Department of Computer Science &amp; Engineering, Florida Atlantic University, Boca Raton, FL 33431, USA"}]},{"given":"ABHIJIT","family":"PANDYA","sequence":"additional","affiliation":[{"name":"Department of Computer Science &amp; Engineering, Florida Atlantic University, Boca Raton, FL 33431, USA"}]},{"given":"SAM","family":"HSU","sequence":"additional","affiliation":[{"name":"Department of Computer Science &amp; Engineering, Florida Atlantic University, Boca Raton, FL 33431, USA"}]},{"given":"RAMASWAMY","family":"NARAYANAN","sequence":"additional","affiliation":[{"name":"Department of Chemistry &amp; Biochemistry, Florida Atlantic University, Boca Raton, FL 33431, USA"}]}],"member":"219","published-online":{"date-parts":[[2012,4,30]]},"reference":[{"key":"rf1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4615-5689-3"},{"key":"rf2","first-page":"1157","volume":"3","author":"Guyou I.","journal-title":"Journal of Machine Learning Research"},{"key":"rf3","volume":"13","author":"Kwak N.","journal-title":"IEEE Trans. on Neural Networks"},{"key":"rf4","volume":"1","author":"Golub T.","journal-title":"Science"},{"key":"rf5","volume-title":"Computational Genome Analysis, An Introduction","author":"Deonier R. C.","year":"2007"},{"key":"rf7","first-page":"2649","volume":"63","author":"Logsdon C.","journal-title":"Cancer Research"},{"key":"rf8","first-page":"1","volume":"97","author":"Kohavi R.","journal-title":"Artificial Intelligence"},{"key":"rf14","volume-title":"Rough Sets, Theoretical Aspects of Reasoning about Data","author":"Pawlak Z.","year":"1991"},{"key":"rf22","first-page":"81","volume":"1","author":"Quinlan J. R.","journal-title":"Mach. Learn."},{"key":"rf23","volume-title":"C4.5: Programs for Machine Learning","author":"Quinlan J. R.","year":"1993"},{"key":"rf28","doi-asserted-by":"publisher","DOI":"10.1007\/978-94-017-2053-3_9"},{"key":"rf29","volume":"6","author":"Japkowicz N.","journal-title":"Intelligent Data Analysis"},{"key":"rf33","volume":"53","author":"Robnik-\u0160ikonja M.","journal-title":"Mach. Learn."},{"key":"rf34","volume-title":"Data Mining: Practical Machine Learning Tools and Techniques","author":"Witten I.","year":"2003"},{"key":"rf35","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bth267"},{"key":"rf37","volume":"297","author":"Pedro D.","journal-title":"Machine Learning"},{"key":"rf38","doi-asserted-by":"publisher","DOI":"10.1023\/A:1010933404324"},{"key":"rf39","first-page":"37","volume":"6","author":"Aha D. W.","journal-title":"Mach. Learn."},{"key":"rf40","doi-asserted-by":"publisher","DOI":"10.4018\/978-1-59904-252-7"}],"container-title":["International Journal of Software Engineering and Knowledge Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S0218194010004645","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,8,7]],"date-time":"2019-08-07T13:08:15Z","timestamp":1565183295000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/abs\/10.1142\/S0218194010004645"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,3]]},"references-count":19,"journal-issue":{"issue":"02","published-online":{"date-parts":[[2012,4,30]]},"published-print":{"date-parts":[[2010,3]]}},"alternative-id":["10.1142\/S0218194010004645"],"URL":"https:\/\/doi.org\/10.1142\/s0218194010004645","relation":{},"ISSN":["0218-1940","1793-6403"],"issn-type":[{"value":"0218-1940","type":"print"},{"value":"1793-6403","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,3]]}}}