{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,6]],"date-time":"2026-04-06T10:48:40Z","timestamp":1775472520123,"version":"3.50.1"},"reference-count":57,"publisher":"Oxford University Press (OUP)","issue":"14","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Genome-wide association (GWA) studies have proven to be a successful approach for helping unravel the genetic basis of complex genetic diseases. However, the identified associations are not well suited for disease prediction, and only a modest portion of the heritability can be explained for most diseases, such as Type 2 diabetes or Crohn's disease. This may partly be due to the low power of standard statistical approaches to detect gene\u2013gene and gene\u2013environment interactions when small marginal effects are present. A promising alternative is Random Forests, which have already been successfully applied in candidate gene analyses. Important single nucleotide polymorphisms are detected by permutation importance measures. To this day, the application to GWA data was highly cumbersome with existing implementations because of the high computational burden.<\/jats:p>\n               <jats:p>Results: Here, we present the new freely available software package Random Jungle (RJ), which facilitates the rapid analysis of GWA data. The program yields valid results and computes up to 159 times faster than the fastest alternative implementation, while still maintaining all options of other programs. Specifically, it offers the different permutation importance measures available. It includes new options such as the backward elimination method. We illustrate the application of RJ to a GWA of Crohn's disease. 
The most important single nucleotide polymorphisms (SNPs) validate recent findings in the literature and reveal potential interactions.<\/jats:p>\n               <jats:p>Availability: The RJ software package is freely available at http:\/\/www.randomjungle.org<\/jats:p>\n               <jats:p>Contact: \u00a0inke.koenig@imbs.uni-luebeck.de; ziegler@imbs.uni-luebeck.de<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq257","type":"journal-article","created":{"date-parts":[[2010,5,27]],"date-time":"2010-05-27T02:45:03Z","timestamp":1274928303000},"page":"1752-1758","source":"Crossref","is-referenced-by-count":198,"title":["On safari to Random Jungle: a fast implementation of Random Forests for high-dimensional data"],"prefix":"10.1093","volume":"26","author":[{"given":"Daniel F.","family":"Schwarz","sequence":"first","affiliation":[{"name":"Institut f\u00fcr Medizinische Biometrie und Statistik, Universit\u00e4t zu L\u00fcbeck, Universit\u00e4tsklinikum Schleswig-Holstein, Campus L\u00fcbeck, Maria-Goeppert-Strasse 1, 23562 L\u00fcbeck, Germany"}]},{"given":"Inke R.","family":"K\u00f6nig","sequence":"additional","affiliation":[{"name":"Institut f\u00fcr Medizinische Biometrie und Statistik, Universit\u00e4t zu L\u00fcbeck, Universit\u00e4tsklinikum Schleswig-Holstein, Campus L\u00fcbeck, Maria-Goeppert-Strasse 1, 23562 L\u00fcbeck, Germany"}]},{"given":"Andreas","family":"Ziegler","sequence":"additional","affiliation":[{"name":"Institut f\u00fcr Medizinische Biometrie und Statistik, Universit\u00e4t zu L\u00fcbeck, Universit\u00e4tsklinikum Schleswig-Holstein, Campus L\u00fcbeck, Maria-Goeppert-Strasse 1, 23562 L\u00fcbeck, 
Germany"}]}],"member":"286","published-online":{"date-parts":[[2010,5,26]]},"reference":[{"key":"2023012507574600300_B1","doi-asserted-by":"crossref","first-page":"2249","DOI":"10.1016\/j.csda.2007.08.015","article-title":"Empirical characterization of random forest variable importance measures","volume":"52","author":"Archer","year":"2008","journal-title":"Comput. Stat. Data Anal."},{"key":"2023012507574600300_B2","doi-asserted-by":"crossref","first-page":"7888","DOI":"10.1158\/0008-5472.CAN-04-4278","article-title":"Tumor necrosis factor-related apoptosis-inducing ligand-mediated proliferation of tumor cells with receptor-proximal apoptosis defects","volume":"65","author":"Baader","year":"2005","journal-title":"Cancer Res."},{"key":"2023012507574600300_B3","doi-asserted-by":"crossref","first-page":"3164","DOI":"10.4049\/jimmunol.167.6.3164","article-title":"Disruption of NF-kappaB signaling reveals a novel role for NF-kappaB in the regulation of TNF-related apoptosis-inducing ligand expression","volume":"167","author":"Baetu","year":"2001","journal-title":"J. Immunol."},{"key":"2023012507574600300_B4","doi-asserted-by":"crossref","first-page":"955","DOI":"10.1038\/ng.175","article-title":"Genome-wide association defines more than 30 distinct susceptibility loci for Crohn's disease","volume":"40","author":"Barrett","year":"2008","journal-title":"Nat. Genet."},{"key":"2023012507574600300_B5","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/BF00058655","article-title":"Bagging predictors","volume":"24","author":"Breiman","year":"1996","journal-title":"Mach. Learn."},{"key":"2023012507574600300_B6","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random Forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. 
Learn."},{"key":"2023012507574600300_B7","author":"Breiman","year":"2004","journal-title":"Random Forests 5.1."},{"key":"2023012507574600300_B8","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1002\/gepi.20041","article-title":"Identifying SNPs predictive of phenotype using random forests","volume":"28","author":"Bureau","year":"2005","journal-title":"Genet. Epidemiol."},{"key":"2023012507574600300_B9","doi-asserted-by":"crossref","first-page":"1097","DOI":"10.1111\/j.1365-2036.2006.02854.x","article-title":"Meta-analysis: colorectal and small bowel cancer risk in patients with Crohn's disease","volume":"23","author":"Canavan","year":"2006","journal-title":"Aliment Pharmacol. Ther."},{"key":"2023012507574600300_B10","doi-asserted-by":"crossref","first-page":"1368","DOI":"10.1158\/1055-9965.EPI-07-2830","article-title":"Pathway analysis of single-nucleotide polymorphisms potentially associated with glioblastoma multiforme susceptibility using random forests","volume":"17","author":"Chang","year":"2008","journal-title":"Cancer Epidemiol. Biomarkers Prev."},{"key":"2023012507574600300_B11","doi-asserted-by":"crossref","first-page":"392","DOI":"10.1038\/nrg2579","article-title":"Genome-wide association studies: detecting gene-gene interactions that underlie human diseases","volume":"10","author":"Cordell","year":"2009","journal-title":"Nat. Rev. Genet."},{"key":"2023012507574600300_B12","volume-title":"Multidimensional Scaling. 
Monographs on Statistics and Applied Probability.","author":"Cox","year":"2001"},{"key":"2023012507574600300_B13","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1186\/1471-2105-7-3","article-title":"Gene selection and classification of microarray data using random forest","volume":"7","author":"Diaz-Uriarte","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012507574600300_B14","doi-asserted-by":"crossref","first-page":"1461","DOI":"10.1126\/science.1135245","article-title":"A genome-wide association study identifies IL23R as an inflammatory bowel disease gene","volume":"314","author":"Duerr","year":"2006","journal-title":"Science"},{"key":"2023012507574600300_B15","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1016\/0140-6736(90)91889-I","article-title":"Increased risk of large-bowel cancer in Crohn's disease with colonic involvement","volume":"336","author":"Ekbom","year":"1990","journal-title":"Lancet"},{"key":"2023012507574600300_B16","doi-asserted-by":"crossref","first-page":"41701","DOI":"10.1074\/jbc.M206473200","article-title":"Induction of NOD2 in myelomonocytic and intestinal epithelial cells via nuclear factor-kappaB activation","volume":"277","author":"Gutierrez","year":"2002","journal-title":"J. Biol. Chem."},{"key":"2023012507574600300_B17","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1046\/j.1469-1809.2000.6450413.x","article-title":"Selecting SNPs in two-stage analysis of disease association data: a model-free approach","volume":"64","author":"Hoh","year":"2000","journal-title":"Ann. Hum. Genet."},{"key":"2023012507574600300_B18","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1198\/106186006X133933","article-title":"Unbiased recursive partitioning","volume":"15","author":"Hothorn","year":"2006","journal-title":"J. Comput. Graph. 
Stat."},{"key":"2023012507574600300_B19","doi-asserted-by":"crossref","first-page":"e1000337","DOI":"10.1371\/journal.pgen.1000337","article-title":"Interpretation of genetic association studies: markers with replicated highly significant odds ratios may be poor classifiers","volume":"5","author":"Jakobsdottir","year":"2009","journal-title":"PLoS Genet."},{"issue":"Suppl. 1","key":"2023012507574600300_B20","doi-asserted-by":"crossref","first-page":"S65","DOI":"10.1186\/1471-2105-10-S1-S65","article-title":"A random forest approach to the detection of epistatic interactions in case-control studies","volume":"10","author":"Jiang","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012507574600300_B21","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1504\/IJDMB.2008.022149","article-title":"Patient-centered yes\/no prognosis using learning machines","volume":"2","author":"K\u00f6nig","year":"2008","journal-title":"Int. J. Data Min. Bioinform."},{"key":"2023012507574600300_B22","first-page":"18","article-title":"Classification and Regression by randomForest","volume":"2","author":"Liaw","year":"2002","journal-title":"R News"},{"key":"2023012507574600300_B23","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1186\/1471-2156-5-32","article-title":"Screening large-scale association study data: exploiting interactions using random forests","volume":"5","author":"Lunetta","year":"2004","journal-title":"BMC Genet."},{"key":"2023012507574600300_B24","first-page":"281","article-title":"Some methods of classification and analysis of multivariate observations","volume-title":"Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability","author":"Macqueen","year":"1967"},{"key":"2023012507574600300_B25","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1038\/nature08494","article-title":"Finding the missing heritability of complex 
diseases","volume":"461","author":"Manolio","year":"2009","journal-title":"Nature"},{"key":"2023012507574600300_B26","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1038\/ng1537","article-title":"Genome-wide strategies for detecting multiple loci that influence complex diseases","volume":"37","author":"Marchini","year":"2005","journal-title":"Nat. Genet."},{"key":"2023012507574600300_B27","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1038\/ng2088","article-title":"A new multipoint method for genome-wide association studies by imputation of genotypes","volume":"39","author":"Marchini","year":"2007","journal-title":"Nat. Genet."},{"key":"2023012507574600300_B28","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1038\/nrg2344","article-title":"Genome-wide association studies for complex traits: consensus, uncertainty and challenges","volume":"9","author":"McCarthy","year":"2008","journal-title":"Nat. Rev. Genet."},{"key":"2023012507574600300_B29","doi-asserted-by":"crossref","first-page":"77","DOI":"10.2165\/00822942-200605020-00002","article-title":"Machine learning for detecting gene-gene interactions: a review","volume":"5","author":"McKinney","year":"2006","journal-title":"Appl. Bioinformatics"},{"key":"2023012507574600300_B30","doi-asserted-by":"crossref","first-page":"e1000432","DOI":"10.1371\/journal.pgen.1000432","article-title":"Capturing the spectrum of interaction effects in genetic association studies by simulated evaporative cooling network analysis","volume":"5","author":"McKinney","year":"2009","journal-title":"PLoS Genet."},{"key":"2023012507574600300_B31","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1186\/1471-2105-10-78","article-title":"Performance of random forest when SNPs are in linkage disequilibrium","volume":"10","author":"Meng","year":"2009","journal-title":"BMC Bioinformatics"},{"issue":"Suppl. 
1","key":"2023012507574600300_B32","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1753-6561-1-S1-S4","article-title":"Genetic Analysis Workshop 15: simulation of a complex genetic model for rheumatoid arthritis in nuclear families including a dense SNP map with linkage disequilibrium between marker loci and trait loci","volume":"1","author":"Miller","year":"2007","journal-title":"BMC Proc."},{"key":"2023012507574600300_B33","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1093\/bioinformatics\/btp713","article-title":"Bioinformatics challenges for genome-wide association studies","volume":"26","author":"Moore","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012507574600300_B34","doi-asserted-by":"crossref","first-page":"1884","DOI":"10.1093\/bioinformatics\/btp331","article-title":"Predictor correlation impacts machine learning algorithms: implications for genomic studies","volume":"25","author":"Nicodemus","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012507574600300_B35","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1186\/1471-2105-11-110","article-title":"The behaviour of random forest permutation-based variable importance measures under predictor correlation","volume":"11","author":"Nicodemus","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012507574600300_B36","first-page":"3276","article-title":"Overexpression of a dominant-negative signal transducer and activator of transcription 3 variant in tumor cells leads to production of soluble factors that induce apoptosis and cell cycle arrest","volume":"61","author":"Niu","year":"2001","journal-title":"Cancer Res."},{"key":"2023012507574600300_B37","doi-asserted-by":"crossref","first-page":"1378","DOI":"10.1126\/science.1089769","article-title":"Control of pancreas and liver gene expression by HNF transcription 
factors","volume":"303","author":"Odom","year":"2004","journal-title":"Science"},{"key":"2023012507574600300_B38","doi-asserted-by":"crossref","first-page":"5699","DOI":"10.4049\/jimmunol.168.11.5699","article-title":"A receptor for the heterodimeric cytokine IL-23 is composed of IL-12Rbeta1 and a novel cytokine receptor subunit, IL-23R","volume":"168","author":"Parham","year":"2002","journal-title":"J. Immunol."},{"key":"2023012507574600300_B39","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/S0065-2660(01)42028-1","article-title":"Classification methods for confronting heterogeneity","volume":"42","author":"Province","year":"2001","journal-title":"Adv. Genet."},{"key":"2023012507574600300_B40","author":"R Development Core Team","year":"2009","journal-title":"R: a language and environment for statistical computing."},{"key":"2023012507574600300_B41","doi-asserted-by":"crossref","first-page":"596","DOI":"10.1038\/ng2032","article-title":"Genome-wide association study identifies new susceptibility loci for Crohn disease and implicates autophagy in disease pathogenesis","volume":"39","author":"Rioux","year":"2007","journal-title":"Nat. Genet."},{"key":"2023012507574600300_B42","doi-asserted-by":"crossref","first-page":"4245","DOI":"10.1091\/mbc.e07-04-0309","article-title":"Parallels between global transcriptional programs of polarizing Caco-2 intestinal epithelial cells in vitro and gene expression programs in normal colon and colon cancer","volume":"18","author":"Saaf","year":"2007","journal-title":"Mol. Biol. Cell."},{"key":"2023012507574600300_B43","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1056\/NEJMoa072366","article-title":"Genomewide association analysis of coronary artery disease","volume":"357","author":"Samani","year":"2007","journal-title":"N. Engl J. 
Med."},{"key":"2023012507574600300_B44","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1007\/BF00116037","article-title":"The strength of weak learnability","volume":"5","author":"Schapire","year":"1990","journal-title":"Mach. Learn."},{"issue":"Suppl. 1","key":"2023012507574600300_B45","doi-asserted-by":"crossref","first-page":"S59","DOI":"10.1186\/1753-6561-1-S1-S59","article-title":"Picking single-nucleotide polymorphisms in forests","volume":"1","author":"Schwarz","year":"2007","journal-title":"BMC Proc."},{"key":"2023012507574600300_B46","doi-asserted-by":"crossref","first-page":"S65","DOI":"10.1186\/1753-6561-3-S7-S65","article-title":"Evaluation of single-nucleotide polymorphism imputation using random forests","volume":"3","author":"Schwarz","year":"2009","journal-title":"BMC Proc."},{"key":"2023012507574600300_B47","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1161\/hh0402.105898","article-title":"Sp1 transcription factor as a molecular target for nitric oxide- and cyclic nucleotide-mediated suppression of cGMP-dependent protein kinase-Ialpha expression in vascular smooth muscle cells","volume":"90","author":"Sellak","year":"2002","journal-title":"Circ. Res."},{"key":"2023012507574600300_B48","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1186\/1471-2105-9-307","article-title":"Conditional variable importance for random forests","volume":"9","author":"Strobl","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012507574600300_B49","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1186\/1471-2105-8-25","article-title":"Bias in random forest variable importance measures: illustrations, sources and a solution","volume":"8","author":"Strobl","year":"2007","journal-title":"BMC Bioinformatics"},{"issue":"Suppl. 
1","key":"2023012507574600300_B50","doi-asserted-by":"crossref","first-page":"S62","DOI":"10.1186\/1753-6561-1-S1-S62","article-title":"Classification of rheumatoid arthritis status with candidate gene and genome-wide single-nucleotide polymorphisms using random forests","volume":"1","author":"Sun","year":"2007","journal-title":"BMC Proc."},{"key":"2023012507574600300_B51","first-page":"4903","article-title":"Cyclooxygenase-2 overexpression inhibits death receptor 5 expression and confers resistance to tumor necrosis factor-related apoptosis-inducing ligand-induced apoptosis in human colon cancer cells","volume":"62","author":"Tang","year":"2002","journal-title":"Cancer Res."},{"key":"2023012507574600300_B52","first-page":"5118","article-title":"Rottlerin sensitizes colon carcinoma cells to tumor necrosis factor-related apoptosis-inducing ligand-induced apoptosis via uncoupling of the mitochondria independent of protein kinase C","volume":"63","author":"Tillman","year":"2003","journal-title":"Cancer Res."},{"key":"2023012507574600300_B53","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1038\/nature05911","article-title":"Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls","volume":"447","author":"Wellcome Trust Case Control Consortium","year":"2007","journal-title":"Nature"},{"key":"2023012507574600300_B54","doi-asserted-by":"crossref","first-page":"6718","DOI":"10.1158\/0008-5472.CAN-08-0657","article-title":"Sp1-mediated TRAIL induction in chemosensitization","volume":"68","author":"Xu","year":"2008","journal-title":"Cancer Res."},{"key":"2023012507574600300_B55","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1186\/1471-2105-10-130","article-title":"Willows: a memory efficient tree and forest construction package","volume":"10","author":"Zhang","year":"2009","journal-title":"BMC Bioinformatics"},{"issue":"Suppl. 
1","key":"2023012507574600300_B56","doi-asserted-by":"crossref","first-page":"S51","DOI":"10.1002\/gepi.20280","article-title":"Data mining, neural nets, trees\u2013problems 2 and 3 of Genetic Analysis Workshop 15","volume":"31","author":"Ziegler","year":"2007","journal-title":"Genet. Epidemiol."},{"key":"2023012507574600300_B57","doi-asserted-by":"crossref","DOI":"10.1002\/9783527633654","volume-title":"A Statistical Approach to Genetic Epidemiology: Concepts and Applications.","author":"Ziegler","year":"2010"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/14\/1752\/48851927\/bioinformatics_26_14_1752.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/14\/1752\/48851927\/bioinformatics_26_14_1752.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T07:58:12Z","timestamp":1674633492000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/14\/1752\/177075"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,5,26]]},"references-count":57,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2010,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq257","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,7,15]]},"published":{"date-parts":[[2010,5,26]]}}}