{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T22:30:56Z","timestamp":1774737056976,"version":"3.50.1"},"reference-count":53,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2017,12,18]],"date-time":"2017-12-18T00:00:00Z","timestamp":1513555200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01GM122084"],"award-info":[{"award-number":["R01GM122084"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,4,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Random forest (RF) has become a widely popular prediction generating mechanism. Its strength lies in its flexibility, interpretability and ability to handle large number of features, typically larger than the sample size. However, this methodology is of limited use if one wishes to identify statistically significant features. Several ranking schemes are available that provide information on the relative importance of the features, but there is a paucity of general inferential mechanism, particularly in a multi-variate set up. We use the conditional inference tree framework to generate a RF where features are deleted sequentially based on explicit hypothesis testing. The resulting sequential algorithm offers an inferentially justifiable, but model-free, variable selection procedure. Significant features are then used to generate predictive RF. An added advantage of our methodology is that both variable selection and prediction are based on conditional inference framework and hence are coherent.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We illustrate the performance of our Sequential Multi-Response Feature Selection approach through simulation studies and finally apply this methodology on Genomics of Drug Sensitivity for Cancer dataset to identify genetic characteristics that significantly impact drug sensitivities. Significant set of predictors obtained from our method are further validated from biological perspective.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>https:\/\/github.com\/jomayer\/SMuRF<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx784","type":"journal-article","created":{"date-parts":[[2017,12,15]],"date-time":"2017-12-15T22:44:33Z","timestamp":1513377873000},"page":"1336-1344","source":"Crossref","is-referenced-by-count":13,"title":["Sequential feature selection and inference using multi-variate random forests"],"prefix":"10.1093","volume":"34","author":[{"given":"Joshua","family":"Mayer","sequence":"first","affiliation":[{"name":"Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, USA"}]},{"given":"Raziur","family":"Rahman","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, USA"}]},{"given":"Souparno","family":"Ghosh","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, USA"}]},{"given":"Ranadip","family":"Pal","sequence":"additional","affiliation":[{"name":"Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,12,18]]},"reference":[{"key":"2023012713005146400_btx784-B1","doi-asserted-by":"crossref","first-page":"1545","DOI":"10.1162\/neco.1997.9.7.1545","article-title":"Shape quantization and recognition with randomized trees","volume":"9","author":"Amit","year":"1997","journal-title":"Neural Comput"},{"key":"2023012713005146400_btx784-B2","doi-asserted-by":"crossref","first-page":"25.","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet"},{"key":"2023012713005146400_btx784-B3","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1109\/TPAMI.2007.250609","article-title":"A comparison of decision tree ensemble creation techniques","volume":"29","author":"Banfield","year":"2007","journal-title":"IEEE Trans. Pattern Anal. Machine Intel"},{"key":"2023012713005146400_btx784-B4","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn"},{"key":"2023012713005146400_btx784-B5","author":"Buil","year":"2007"},{"key":"2023012713005146400_btx784-B6","author":"Chen","year":"2012"},{"key":"2023012713005146400_btx784-B7","doi-asserted-by":"crossref","first-page":"720","DOI":"10.1593\/neo.09398","article-title":"Growth-inhibitory and antiangiogenic activity of the mek inhibitor pd0325901 in malignant melanoma with or without braf mutations","volume":"11","author":"Ciuffreda","year":"2009","journal-title":"Neoplasia"},{"key":"2023012713005146400_btx784-B8","doi-asserted-by":"crossref","DOI":"10.1038\/nbt.2877","article-title":"A community effort to assess and improve drug sensitivity prediction algorithms","volume":"32","author":"Costello","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023012713005146400_btx784-B9","first-page":"1105","article-title":"Multivariate regression trees: a new technique for modeling species\u2013environment relationships","volume":"83","author":"De\u2019ath","year":"2002","journal-title":"Ecology"},{"key":"2023012713005146400_btx784-B10","doi-asserted-by":"crossref","first-page":"3.","DOI":"10.1186\/1471-2105-7-3","article-title":"Gene selection and classification of microarray data using random forest","volume":"7","author":"Diaz-Uriarte","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023012713005146400_btx784-B11","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1023\/A:1007607513941","article-title":"An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization","volume":"40","author":"Dietterich","year":"2000","journal-title":"Machine Learn"},{"key":"2023012713005146400_btx784-B12","first-page":"93","article-title":"Classification in microarray experiments","volume":"1","author":"Dudoit","year":"2003","journal-title":"Stat. Anal. Gene Expression Microarray Data"},{"key":"2023012713005146400_btx784-B13","doi-asserted-by":"crossref","first-page":"782","DOI":"10.1016\/S1470-2045(12)70269-3","article-title":"Activity of the oral mek inhibitor trametinib in patients with advanced melanoma: a phase 1 dose-escalation trial","volume":"13","author":"Falchook","year":"2012","journal-title":"Lancet Oncol"},{"key":"2023012713005146400_btx784-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. Software"},{"key":"2023012713005146400_btx784-B15","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1007\/s10994-006-6226-1","article-title":"Extremely randomized trees","volume":"63","author":"Geurts","year":"2006","journal-title":"Machine Learn"},{"key":"2023012713005146400_btx784-B16","doi-asserted-by":"crossref","first-page":"e0144490.","DOI":"10.1371\/journal.pone.0144490","article-title":"A copula based approach for design of multivariate random forests for drug sensitivity prediction","volume":"10","author":"Haider","year":"2015","journal-title":"PLoS One"},{"key":"2023012713005146400_btx784-B17","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.csda.2012.09.020","article-title":"A new variable selection approach using random forests","volume":"60","author":"Hapfelmeier","year":"2013","journal-title":"Comput. Stat. Data Anal"},{"key":"2023012713005146400_btx784-B18","doi-asserted-by":"crossref","first-page":"727","DOI":"10.1038\/nrd892","article-title":"The druggable genome","volume":"1","author":"Hopkins","year":"2002","journal-title":"Nat. Rev. Drug Discov"},{"key":"2023012713005146400_btx784-B19","doi-asserted-by":"crossref","first-page":"651","DOI":"10.1198\/106186006X133933","article-title":"Unbiased recursive partitioning: a conditional inference framework","volume":"15","author":"Hothorn","year":"2006","journal-title":"J. Comput. Graph. Stat"},{"key":"2023012713005146400_btx784-B20","first-page":"3905","article-title":"Partykit: a modular toolkit for recursive partytioning in r","volume":"16","author":"Hothorn","year":"2015","journal-title":"J. Machine Learn. Res"},{"key":"2023012713005146400_btx784-B21","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1016\/j.patcog.2008.08.001","article-title":"Performance of feature-selection methods in the classification of high-dimension data","volume":"42","author":"Hua","year":"2009","journal-title":"Pattern Recogn"},{"key":"2023012713005146400_btx784-B22","doi-asserted-by":"crossref","first-page":"821","DOI":"10.1038\/nrd2132","article-title":"Drugs, their targets and the nature and number of drug targets","volume":"5","author":"Imming","year":"2006","journal-title":"Nat. Rev. Drug Discov"},{"key":"2023012713005146400_btx784-B23","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"Kegg: kyoto encyclopedia of genes and genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023012713005146400_btx784-B24","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1111\/biom.12292","article-title":"Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure","volume":"71","author":"Li","year":"2015","journal-title":"Biometrics"},{"key":"2023012713005146400_btx784-B25","author":"Liu","year":"2009"},{"key":"2023012713005146400_btx784-B26","doi-asserted-by":"crossref","first-page":"3448","DOI":"10.1093\/bioinformatics\/bti551","article-title":"Bingo: a cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks","volume":"21","author":"Maere","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012713005146400_btx784-B27","doi-asserted-by":"crossref","first-page":"e1000591.","DOI":"10.1371\/journal.pcbi.1000591","article-title":"Identifying drug effects via pathway alterations using an integer linear programming optimization formulation on phosphoproteomic data","volume":"5","author":"Mitsos","year":"2009","journal-title":"PLoS Comput. Biol"},{"key":"2023012713005146400_btx784-B28","author":"Nie","year":"2010"},{"key":"2023012713005146400_btx784-B29","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1007\/s11222-008-9111-x","article-title":"Joint covariate selection and joint subspace selection for multiple classification problems","volume":"20","author":"Obozinski","year":"2010","journal-title":"Stat. Comput"},{"key":"2023012713005146400_btx784-B30","doi-asserted-by":"crossref","first-page":"1407","DOI":"10.1093\/bioinformatics\/btw765","article-title":"Integratedmrf: random forest-based framework for integrating prediction from different data types","volume":"33","author":"Rahman","year":"2017","journal-title":"Bioinformatics"},{"key":"2023012713005146400_btx784-B31","author":"Rahman","year":"2016"},{"key":"2023012713005146400_btx784-B32","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1023\/A:1025667309714","article-title":"Theoretical and empirical analysis of relieff and rrelieff","volume":"53","author":"Robnik-\u0160ikonja","year":"2003","journal-title":"Machine Learn"},{"key":"2023012713005146400_btx784-B33","author":"Robnik-Sikonja","year":"2016"},{"key":"2023012713005146400_btx784-B34","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1016\/j.isprsjprs.2011.11.002","article-title":"An assessment of the effectiveness of a random forest classifier for land-cover classification","volume":"67","author":"Rodriguez-Galiano","year":"2012","journal-title":"ISPRS J. Photogrammetry Remote Sensing"},{"key":"2023012713005146400_btx784-B35","doi-asserted-by":"crossref","first-page":"1752","DOI":"10.1093\/bioinformatics\/btq257","article-title":"On safari to random jungle: a fast implementation of random forests for high-dimensional data","volume":"26","author":"Schwarz","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012713005146400_btx784-B36","doi-asserted-by":"crossref","first-page":"2498","DOI":"10.1101\/gr.1239303","article-title":"Cytoscape: a software environment for integrated models of biomolecular interaction networks","volume":"13","author":"Shannon","year":"2003","journal-title":"Genome Res"},{"key":"2023012713005146400_btx784-B37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v039.i05","article-title":"Regularization paths for cox\u2019s proportional hazards model via coordinate descent","volume":"39","author":"Simon","year":"2011","journal-title":"J. Stat. Software"},{"key":"2023012713005146400_btx784-B38","doi-asserted-by":"crossref","first-page":"1727","DOI":"10.1172\/JCI37127","article-title":"Predicting drug susceptibility of non-small cell lung cancers based on genetic lesions","volume":"119","author":"Sos","year":"2009","journal-title":"J. Clin. Investig"},{"key":"2023012713005146400_btx784-B39","first-page":"220","article-title":"On the asymptotic theory of permutation statistics","volume":"8","author":"Strasser","year":"1999","journal-title":"Math. Methods Stat"},{"key":"2023012713005146400_btx784-B40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-8-25","article-title":"Bias in random forest variable importance measures: illustrations, sources and a solution","volume":"8","author":"Strobl","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012713005146400_btx784-B41","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1007\/978-3-540-25966-4_33","volume-title":"International Workshop on Multiple Classifier Systems","author":"Svetnik","year":"2004"},{"key":"2023012713005146400_btx784-B42","doi-asserted-by":"crossref","first-page":"D447","DOI":"10.1093\/nar\/gku1003","article-title":"String v10: protein\u2013protein interaction networks, integrated over the tree of life","volume":"43","author":"Szklarczyk","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023012713005146400_btx784-B43","doi-asserted-by":"crossref","DOI":"10.1038\/srep44016","article-title":"Principal components analysis based unsupervised feature extraction applied to gene expression analysis of blood from dengue haemorrhagic fever patients","volume":"7","author":"Taguchi","year":"2017","journal-title":"Sci. Rep"},{"key":"2023012713005146400_btx784-B44","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023012713005146400_btx784-B45","doi-asserted-by":"crossref","first-page":"1986","DOI":"10.1093\/bioinformatics\/btr300","article-title":"Classification with correlated features: unreliability of feature ranking and solutions","volume":"27","author":"Tolo\u015fi","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012713005146400_btx784-B46","doi-asserted-by":"crossref","first-page":"844","DOI":"10.1126\/science.1092472","article-title":"In vivo activation of the p53 pathway by small-molecule antagonists of mdm2","volume":"303","author":"Vassilev","year":"2004","journal-title":"Science"},{"key":"2023012713005146400_btx784-B47","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1097\/PPO.0b013e318212dd6d","article-title":"Molecular tumor profiling for prediction of response to anticancer therapies","volume":"17","author":"Walther","year":"2011","journal-title":"Cancer J"},{"key":"2023012713005146400_btx784-B48","author":"Wan","year":"2013"},{"key":"2023012713005146400_btx784-B49","doi-asserted-by":"crossref","first-page":"D668","DOI":"10.1093\/nar\/gkj067","article-title":"Drugbank: a comprehensive resource for in silico drug discovery and exploration","volume":"34","author":"Wishart","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023012713005146400_btx784-B50","doi-asserted-by":"crossref","first-page":"D955","DOI":"10.1093\/nar\/gks1111","article-title":"Genomics of drug sensitivity in cancer (gdsc): a resource for therapeutic biomarker discovery in cancer cells","volume":"41","author":"Yang","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023012713005146400_btx784-B51","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1016\/j.patcog.2012.09.005","article-title":"Stratified sampling for feature subspace selection in random forests for high dimensional data","volume":"46","author":"Ye","year":"2013","journal-title":"Pattern Recogn"},{"key":"2023012713005146400_btx784-B52","first-page":"1.","article-title":"Analysis of important gene ontology terms and biological pathways related to pancreatic cancer","volume":"2016","author":"Yin","year":"2016","journal-title":"BioMed Res. Int"},{"key":"2023012713005146400_btx784-B53","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","article-title":"Regularization and variable selection via the elastic net","volume":"67","author":"Zou","year":"2005","journal-title":"J. R. Stat. Soc. Ser. B"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/8\/1336\/48915052\/bioinformatics_34_8_1336.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/8\/1336\/48915052\/bioinformatics_34_8_1336.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,29]],"date-time":"2024-06-29T19:42:54Z","timestamp":1719690174000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/8\/1336\/4756094"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,12,18]]},"references-count":53,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2018,4,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx784","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,4,15]]},"published":{"date-parts":[[2017,12,18]]}}}