{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T01:42:22Z","timestamp":1780537342295,"version":"3.54.1"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"18","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Although the random forest classification procedure works well in datasets with many features, when the number of features is huge and the percentage of truly informative features is small, such as with DNA microarray data, its performance tends to decline significantly. In such instances, the procedure can be improved by reducing the contribution of trees whose nodes are populated by non-informative features. To some extent, this can be achieved by prefiltering, but we propose a novel, yet simple, adjustment that has demonstrably superior performance: choose the eligible subsets at each node by weighted random sampling instead of simple random sampling, with the weights tilted in favor of the informative features. This results in an \u2018enriched random forest\u2019. We illustrate the superior performance of this procedure in several actual microarray datasets.<\/jats:p>\n               <jats:p>Contact: \u00a0damaratu@prdus.jnj.com<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn356","type":"journal-article","created":{"date-parts":[[2008,7,24]],"date-time":"2008-07-24T00:14:51Z","timestamp":1216858491000},"page":"2010-2014","source":"Crossref","is-referenced-by-count":186,"title":["Enriched random forests"],"prefix":"10.1093","volume":"24","author":[{"given":"Dhammika","family":"Amaratunga","sequence":"first","affiliation":[{"name":"1 Department of Nonclinical Biostatistics, Johnson & Johnson PRD LLC, Raritan, NJ 08869, 2Department of Statistics, Rutgers University, 110 Frelinghuysen Ave, Piscataway, NJ 08854, USA and 3Department of Statistics, Dongguk University, Seoul, South Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Javier","family":"Cabrera","sequence":"additional","affiliation":[{"name":"1 Department of Nonclinical Biostatistics, Johnson & Johnson PRD LLC, Raritan, NJ 08869, 2Department of Statistics, Rutgers University, 110 Frelinghuysen Ave, Piscataway, NJ 08854, USA and 3Department of Statistics, Dongguk University, Seoul, South Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yung-Seop","family":"Lee","sequence":"additional","affiliation":[{"name":"1 Department of Nonclinical Biostatistics, Johnson & Johnson PRD LLC, Raritan, NJ 08869, 2Department of Statistics, Rutgers University, 110 Frelinghuysen Ave, Piscataway, NJ 08854, USA and 3Department of Statistics, Dongguk University, Seoul, South Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2008,7,22]]},"reference":[{"key":"2023020211121166200_B1","volume-title":"Exploration and Analysis of DNA Microarray and Protein Array Data.","author":"Amaratunga","year":"2004"},{"key":"2023020211121166200_B2","article-title":"A conditional t suite of tests for identifying differentially expressed genes in a DNA microarray experiment with little replication","author":"Amaratunga","year":"2007","journal-title":"Stat. Biopharmaceut. Res."},{"key":"2023020211121166200_B3","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1093\/biostatistics\/kxm017","article-title":"Microarray learning with ABC","volume":"9","author":"Amaratunga","year":"2008","journal-title":"Biostatistics"},{"key":"2023020211121166200_B4","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn."},{"key":"2023020211121166200_B5","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn."},{"key":"2023020211121166200_B6","article-title":"Random forests manual (version 4.0)","author":"Breiman","year":"2003","journal-title":"Technical Report of the University of California, Berkeley, Department of Statistics."},{"key":"2023020211121166200_B7","doi-asserted-by":"crossref","first-page":"1343","DOI":"10.1093\/carcin\/bgi100","article-title":"Gene expression profiling of NMU-induced rat mammary tumors: cross species comparison with human breast cancer","volume":"26","author":"Chan","year":"2005","journal-title":"Carcinogenesis"},{"key":"2023020211121166200_B8","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1186\/1471-2105-7-3","article-title":"Gene selection and classification of microarray data using random forest","volume":"7","author":"D\u00edaz-Uriarte","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023020211121166200_B9","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1198\/016214502753479248","article-title":"Comparison of discrimination methods for the classification of tumors using gene expression data","volume":"97","author":"Dudoit","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"2023020211121166200_B10","doi-asserted-by":"crossref","first-page":"869","DOI":"10.1016\/j.csda.2004.03.017","article-title":"An extensive evaluation of recent classification tools applied to microarray data","volume":"48","author":"Lee","year":"2005","journal-title":"Comput. Stat. Data Anal."},{"key":"2023020211121166200_B11","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/978-1-4615-0873-1_11","article-title":"How many genes are needed for a discriminant microarray data analysis?","volume-title":"Methods of Microarray Data Analysis.","author":"Li","year":"2002"},{"key":"2023020211121166200_B12","author":"MacDonald","year":"2001","journal-title":"Human glioblastoma."},{"key":"2023020211121166200_B13","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1002\/path.1921","article-title":"Differential expression of a gene signature for scavenger\/lectin receptors by endothelial cells and macrophages in human lymph node sinuses, the primary sites of regional metastasis","volume":"208","author":"Martens","year":"2006","journal-title":"J. Pathol."},{"key":"2023020211121166200_B14","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1016\/j.jneuroim.2004.08.026","article-title":"Patterns of gene dysregulation in the frontal cortex of patients with HIV encephalitis","volume":"157","author":"Masiliah","year":"2004","journal-title":"J. Neuroimmunol."},{"key":"2023020211121166200_B15","article-title":"Sialin-deficient mice: a novel animal model for infantile free sialic acid storage disease (ISSD)","volume-title":"Society for Neuroscience 35th Annual Meeting.","author":"Moechars","year":"2005"},{"key":"2023020211121166200_B16","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1038\/ng1180","article-title":"PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes","volume":"34","author":"Mootha","year":"2003","journal-title":"Nat. Genet."},{"key":"2023020211121166200_B17","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1186\/1471-2164-7-92","article-title":"Transcriptomes of human prostate cells","volume":"7","author":"Oudes","year":"2006","journal-title":"BMC Genomics"},{"key":"2023020211121166200_B18","doi-asserted-by":"crossref","first-page":"3032","DOI":"10.1093\/bioinformatics\/btm448","article-title":"The high-level similarity of some disparate gene expression measures","volume":"23","author":"Raghavan","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020211121166200_B19","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1038\/ng1060","article-title":"A molecular signature of metastasis in primary solid tumors","volume":"33","author":"Ramaswamy","year":"2002","journal-title":"Nat. Genet."},{"key":"2023020211121166200_B20","doi-asserted-by":"crossref","DOI":"10.2202\/1544-6115.1027","article-title":"Linear models and empirical Bayes methods for assessing differential expression in microarray experiments","volume":"3","author":"Smyth","year":"2004","journal-title":"Stat. Appl. Genet. Mol. Biol."},{"key":"2023020211121166200_B21","doi-asserted-by":"crossref","first-page":"9440","DOI":"10.1073\/pnas.1530509100","article-title":"Statistical significance for genome-wide studies","volume":"100","author":"Storey","year":"2007","journal-title":"Proc. Natl. Acad. Sci."},{"key":"2023020211121166200_B22","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1186\/1471-2105-8-25","article-title":"Bias in random forest variable importance measures: illustrations, sources and a solution","volume":"8","author":"Strobl","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023020211121166200_B23","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1385\/JMN:25:3:285","article-title":"Microarray analysis of postictal transcriptional regulation of neuropeptides","volume":"25","author":"Wilson","year":"2005","journal-title":"J. Mol. Neurosci."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/18\/2010\/49051677\/bioinformatics_24_18_2010.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/18\/2010\/49051677\/bioinformatics_24_18_2010.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T13:48:05Z","timestamp":1675345685000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/18\/2010\/190849"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,7,22]]},"references-count":23,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2008,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn356","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,9,15]]},"published":{"date-parts":[[2008,7,22]]}}}