{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:33:57Z","timestamp":1772138037382,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2020,3,10]],"date-time":"2020-03-10T00:00:00Z","timestamp":1583798400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["P30 CA006927"],"award-info":[{"award-number":["P30 CA006927"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["625061"],"award-info":[{"award-number":["625061"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006751","name":"US Army","doi-asserted-by":"crossref","award":["W911NF-16-2-0189"],"award-info":[{"award-number":["W911NF-16-2-0189"]}],"id":[{"id":"10.13039\/100006751","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes which provide insight into the disease process. With rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of tens of thousands of genes and proteins resulting in enormous datasets where the number of genomic features is far greater than the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards (PH), which is unlikely to hold for each feature. When applied to genomic features exhibiting some form of non-proportional hazards (NPH), these methods could lead to an under- or over-estimation of the effects. We propose a broad array of marginal screening techniques that aid in feature ranking and selection by accommodating various forms of NPH. First, we develop an approach based on Kullback\u2013Leibler information divergence and the Yang\u2013Prentice model that includes methods for the PH and proportional odds (PO) models as special cases. Next, we propose R2 measures for the PH and PO models that can be interpreted in terms of explained randomness. Lastly, we propose a generalized pseudo-R2 index that includes PH, PO, crossing hazards and crossing odds models as special cases and can be interpreted as the percentage of separability between subjects experiencing the event and not experiencing the event according to feature measurements.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We evaluate the performance of our measures using extensive simulation studies and publicly available datasets in cancer genomics. We demonstrate that the proposed methods successfully address the issue of NPH in genomic feature selection and outperform existing methods.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>R code for the proposed methods is available at github.com\/lburns27\/Feature-Selection.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Contact<\/jats:title>\n                    <jats:p>karthik.devarajan@fccc.edu<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa161","type":"journal-article","created":{"date-parts":[[2020,3,6]],"date-time":"2020-03-06T07:26:22Z","timestamp":1583479582000},"page":"3409-3417","source":"Crossref","is-referenced-by-count":4,"title":["Unified methods for feature selection in large-scale genomic studies with censored survival outcomes"],"prefix":"10.1093","volume":"36","author":[{"given":"Lauren","family":"Spirko-Burns","sequence":"first","affiliation":[{"name":"Department of Statistical Science , Temple University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8475-383X","authenticated-orcid":false,"given":"Karthik","family":"Devarajan","sequence":"additional","affiliation":[{"name":"Department of Biostatistics & Bioinformatics , Fox Chase Cancer Center, Temple University Health System, Philadelphia, PA, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,3,10]]},"reference":[{"key":"2023062312020014900_btaa161-B1","volume-title":"Survival Analysis Using SAS: A Practical Guide","author":"Allison","year":"1995"},{"key":"2023062312020014900_btaa161-B2","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1038\/bjc.1995.364","article-title":"Review of survival analysis published in cancer journals","volume":"72","author":"Altman","year":"1995","journal-title":"Br. J. Cancer"},{"key":"2023062312020014900_btaa161-B3","doi-asserted-by":"crossref","first-page":"527","DOI":"10.1093\/biomet\/82.3.527","article-title":"Model misspecification in proportional hazards regression","volume":"82","author":"Anderson","year":"1995","journal-title":"Biometrika"},{"key":"2023062312020014900_btaa161-B5","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc"},{"key":"2023062312020014900_btaa161-B6","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1002\/sim.4780020223","article-title":"Analysis of survival data by the proportional odds model","volume":"2","author":"Bennett","year":"1983","journal-title":"Stat. Med"},{"key":"2023062312020014900_btaa161-B7","doi-asserted-by":"crossref","first-page":"13790","DOI":"10.1073\/pnas.191502998","article-title":"Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses","volume":"98","author":"Bhattacharjee","year":"2001","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062312020014900_btaa161-B8","doi-asserted-by":"crossref","first-page":"2627","DOI":"10.1002\/sim.4242","article-title":"A simulation study of predictive ability measures in a survival model I: explained variation measures","volume":"31","author":"Choodari-Oskooei","year":"2012","journal-title":"Stat. Med"},{"key":"2023062312020014900_btaa161-B9","doi-asserted-by":"crossref","first-page":"2644","DOI":"10.1002\/sim.5460","article-title":"A simulation study of predictive ability measures in a survival model II: explained randomness and predictive accuracy","volume":"31","author":"Choodari-Oskooei","year":"2012","journal-title":"Stat. Med"},{"key":"2023062312020014900_btaa161-B10","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1111\/j.2517-6161.1972.tb00899.x","article-title":"Regression models and life-tables","volume":"34","author":"Cox","year":"1972","journal-title":"J. R. Stat. Soc"},{"key":"2023062312020014900_btaa161-B11","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1007\/978-1-4612-0103-8_18","volume-title":"Goodness-of-Fit Tests and Model Validity","author":"Devarajan","year":"2002"},{"key":"2023062312020014900_btaa161-B12","doi-asserted-by":"crossref","first-page":"2333","DOI":"10.1080\/03610920802536958","article-title":"Testing for covariate effect in the cox proportional hazards regression model","volume":"38","author":"Devarajan","year":"2009","journal-title":"Commun. Stat"},{"key":"2023062312020014900_btaa161-B13","doi-asserted-by":"crossref","first-page":"667","DOI":"10.1016\/j.csda.2010.06.010","article-title":"A semi-parametric generalization of the Cox proportional hazards regression model: inference and applications","volume":"55","author":"Devarajan","year":"2011","journal-title":"Comput. Stat. Data Anal"},{"key":"2023062312020014900_btaa161-B14","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1186\/1471-2105-11-587","article-title":"Comparison of beta-value and M-value methods for quantifying methylation levels by microarray analysis","volume":"11","author":"Du","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023062312020014900_btaa161-B15","doi-asserted-by":"crossref","first-page":"784","DOI":"10.1093\/bioinformatics\/btq035","article-title":"Gene selection in microarray survival studies under possibly non-proportional hazards","volume":"26","author":"Dunkler","year":"2010","journal-title":"Bioinformatics"},{"key":"2023062312020014900_btaa161-B17","doi-asserted-by":"crossref","first-page":"1029","DOI":"10.1002\/bimj.200610301","article-title":"Consistent estimation of the expected Brier score in general survival models with right-censored event times","volume":"48","author":"Gerds","year":"2006","journal-title":"Biometric. J"},{"key":"2023062312020014900_btaa161-B18","doi-asserted-by":"crossref","first-page":"2529","DOI":"10.1002\/(SICI)1097-0258(19990915\/30)18:17\/18<2529::AID-SIM274>3.0.CO;2-5","article-title":"Assessment and comparison of prognostic classification schemes for survival data","volume":"18","author":"Graf","year":"1999","journal-title":"Stat. Med"},{"key":"2023062312020014900_btaa161-B19","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1093\/biomet\/81.3.515","article-title":"Proportional hazards tests and diagnostics based on weighted residuals","volume":"81","author":"Grambsch","year":"1994","journal-title":"Biometrika"},{"key":"2023062312020014900_btaa161-B20","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1093\/biostatistics\/4.2.249","article-title":"Exploration, normalization, and summaries of high density oligonucleotide array probe level data","volume":"4","author":"Irizarry","year":"2003","journal-title":"Biostatistics"},{"key":"2023062312020014900_btaa161-B22","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1093\/biomet\/75.3.525","article-title":"Measures of dependence for censored survival data","volume":"75","author":"Kent","year":"1988","journal-title":"Biometrika"},{"key":"2023062312020014900_btaa161-B23","doi-asserted-by":"crossref","DOI":"10.1007\/b97377","volume-title":"Survival Analysis: Techniques for Censored and Truncated Data","author":"Klein","year":"2003"},{"key":"2023062312020014900_btaa161-B25","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1016\/0167-9473(96)88029-7","article-title":"Effects of model misspecification in estimating covariate effects in survival analysis for small sample sizes","volume":"22","author":"Li","year":"1996","journal-title":"Comput. Stat. Data Anal"},{"key":"2023062312020014900_btaa161-B27","author":"Martinussen","year":"2006"},{"key":"2023062312020014900_btaa161-B28","doi-asserted-by":"crossref","first-page":"691","DOI":"10.1093\/biomet\/78.3.691","article-title":"A note on a general definition of the coefficient of determination","volume":"78","author":"Nagelkerke","year":"1991","journal-title":"Biometrika"},{"key":"2023062312020014900_btaa161-B29","doi-asserted-by":"crossref","first-page":"2310","DOI":"10.1073\/pnas.91.6.2310","article-title":"Predictive capability of proportional hazards regression","volume":"91","author":"O\u2019Quigley","year":"1994","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062312020014900_btaa161-B30","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1002\/sim.1946","article-title":"Explained randomness in proportional hazards models","volume":"24","author":"O\u2019Quigley","year":"2005","journal-title":"Stat. Med"},{"key":"2023062312020014900_btaa161-B33","doi-asserted-by":"crossref","first-page":"e47","DOI":"10.1093\/nar\/gkv007","article-title":"limma powers differential expression analyses for RNA-sequencing and microarray studies","volume":"43","author":"Ritchie","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023062312020014900_btaa161-B34","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2105-11-150","article-title":"Identifying common prognostic factors in genomic cancer studies: a novel index for censored outcomes","volume":"11","author":"Rouam","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023062312020014900_btaa161-B35","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1186\/1471-2288-11-28","article-title":"A pseudo-R2 measure for selecting genomic markers with crossing hazards functions","volume":"11","author":"Rouam","year":"2011","journal-title":"BMC Med. Res. Methodol"},{"key":"2023062312020014900_btaa161-B36","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1177\/1536867X0600600105","article-title":"Explained variation for survival models","volume":"6","author":"Royston","year":"2006","journal-title":"Stata J"},{"key":"2023062312020014900_btaa161-B37","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1002\/sim.1621","article-title":"A new measure of prognostic separation in survival data","volume":"23","author":"Royston","year":"2004","journal-title":"Stat. Med"},{"key":"2023062312020014900_btaa161-B38","doi-asserted-by":"crossref","first-page":"218","DOI":"10.1158\/1940-6207.CAPR-10-0155","article-title":"Gene expression profiling predicts the development of oral cancer","volume":"4","author":"Saintigny","year":"2011","journal-title":"Cancer Prev. Res. (Phila)"},{"key":"2023062312020014900_btaa161-B39","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1111\/j.0006-341X.2000.00249.x","article-title":"Predictive accuracy and explained variation in Cox regression","volume":"56","author":"Schemper","year":"2000","journal-title":"Biometrics"},{"key":"2023062312020014900_btaa161-B41","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2202\/1544-6115.1027","article-title":"Linear models and empirical Bayes methods for assessing differential expression in microarray experiments","volume":"3","author":"Smyth","year":"2004","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023062312020014900_btaa161-B42","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1080\/01621459.1995.10476560","article-title":"Information distinguishability with application to analysis of failure data","volume":"90","author":"Soofi","year":"1995","journal-title":"J. Am. Stat. Assoc"},{"key":"2023062312020014900_btaa161-B44","doi-asserted-by":"crossref","first-page":"9440","DOI":"10.1073\/pnas.1530509100","article-title":"Statistical significance for genomewide studies","volume":"100","author":"Storey","year":"2003","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062312020014900_btaa161-B45","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1093\/biomet\/73.2.363","article-title":"Misspecified proportional hazard models","volume":"73","author":"Struthers","year":"1986","journal-title":"Biometrika"},{"key":"2023062312020014900_btaa161-B46","doi-asserted-by":"crossref","first-page":"5198","DOI":"10.1158\/1078-0432.CCR-08-0196","article-title":"Novel molecular subtypes of serous and endometroid ovarian cancer linked to clinical outcome","volume":"14","author":"Tothill","year":"2008","journal-title":"Clin. Cancer Res"},{"key":"2023062312020014900_btaa161-B48","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1504\/IJCBDD.2008.022208","article-title":"Statistical issues in the analysis of DNA copy number variations","volume":"1","author":"Wineinger","year":"2008","journal-title":"Int. J. Comput. Biol. Drug Des"},{"key":"2023062312020014900_btaa161-B49","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1080\/10485259908832799","article-title":"A measure of dependence for proportional hazards models","volume":"12","author":"Xu","year":"1999","journal-title":"J. Nonparametric Stat"},{"key":"2023062312020014900_btaa161-B50","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1016\/j.compbiolchem.2005.02.001","article-title":"Survival analysis of microarray expression data by transformation models","volume":"29","author":"Xu","year":"2005","journal-title":"Comput. Biol. Chem"},{"key":"2023062312020014900_btaa161-B51","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/biomet\/92.1.1","article-title":"Semiparametric analysis of short-term and long-term hazard ratios with two-sample survival data","volume":"92","author":"Yang","year":"2005","journal-title":"Biometrika"},{"key":"2023062312020014900_btaa161-B52","doi-asserted-by":"crossref","first-page":"554","DOI":"10.1111\/j.1467-9469.2012.00804.x","article-title":"Checking the short-term and long-term hazard ratio model for survival data","volume":"39","author":"Yang","year":"2012","journal-title":"Scand. J. Stat"},{"key":"2023062312020014900_btaa161-B53","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1002\/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3","article-title":"Index for rating diagnostic tests","volume":"3","author":"Youden","year":"1950","journal-title":"Cancer"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa161\/33125534\/btaa161.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/11\/3409\/50670673\/bioinformatics_36_11_3409.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/11\/3409\/50670673\/bioinformatics_36_11_3409.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,1]],"date-time":"2024-08-01T14:49:49Z","timestamp":1722523789000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/11\/3409\/5802463"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,3,10]]},"references-count":43,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2020,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa161","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.02.14.944314","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,6]]},"published":{"date-parts":[[2020,3,10]]}}}