{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T12:23:26Z","timestamp":1772713406755,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"16","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":3406,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,8,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Survival prediction from gene expression data and other high-dimensional genomic data has been subject to much research during the last years. These kinds of data are associated with the methodological problem of having many more gene expression values than individuals. In addition, the responses are censored survival times. Most of the proposed methods handle this by using Cox's proportional hazards model and obtain parameter estimates by some dimension reduction or parameter shrinkage estimation technique. Using three well-known microarray gene expression data sets, we compare the prediction performance of seven such methods: univariate selection, forward stepwise selection, principal components regression (PCR), supervised principal components regression, partial least squares regression (PLS), ridge regression and the lasso.<\/jats:p>\n               <jats:p>Results: Statistical learning from subsets should be repeated several times in order to get a fair comparison between methods. Methods using coefficient shrinkage or linear combinations of the gene expression values have much better performance than the simple variable selection methods. For our data sets, ridge regression has the overall best performance.<\/jats:p>\n               <jats:p>Availability: Matlab and R code for the prediction methods are available at http:\/\/www.med.uio.no\/imb\/stat\/bmms\/software\/microsurv\/.<\/jats:p>\n               <jats:p>Contact: \u00a0hegembo@math.uio.no<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm305","type":"journal-article","created":{"date-parts":[[2007,6,7]],"date-time":"2007-06-07T00:25:06Z","timestamp":1181175906000},"page":"2080-2087","source":"Crossref","is-referenced-by-count":218,"title":["Predicting survival from microarray data\u2014a comparative study"],"prefix":"10.1093","volume":"23","author":[{"given":"H.M.","family":"B\u00f8velstad","sequence":"first","affiliation":[{"name":"1 Department of Mathematics, 2Department of Informatics, University of Oslo, 3Norwegian Computing Center and 4Institute of Basic Medical Sciences, Department of Biostatistics, University of Oslo and Statistics for Innovation \u2013 (sfi) 2, Norway"}]},{"given":"S.","family":"Nyg\u00e5rd","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2Department of Informatics, University of Oslo, 3Norwegian Computing Center and 4Institute of Basic Medical Sciences, Department of Biostatistics, University of Oslo and Statistics for Innovation \u2013 (sfi) 2, Norway"}]},{"given":"H.L.","family":"St\u00f8rvold","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2Department of Informatics, University of Oslo, 3Norwegian Computing Center and 4Institute of Basic Medical Sciences, Department of Biostatistics, University of Oslo and Statistics for Innovation \u2013 (sfi) 2, Norway"}]},{"given":"M.","family":"Aldrin","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2Department of Informatics, University of Oslo, 3Norwegian Computing Center and 4Institute of Basic Medical Sciences, Department of Biostatistics, University of Oslo and Statistics for Innovation \u2013 (sfi) 2, Norway"}]},{"given":"\u00d8.","family":"Borgan","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2Department of Informatics, University of Oslo, 3Norwegian Computing Center and 4Institute of Basic Medical Sciences, Department of Biostatistics, University of Oslo and Statistics for Innovation \u2013 (sfi) 2, Norway"}]},{"given":"A.","family":"Frigessi","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2Department of Informatics, University of Oslo, 3Norwegian Computing Center and 4Institute of Basic Medical Sciences, Department of Biostatistics, University of Oslo and Statistics for Innovation \u2013 (sfi) 2, Norway"}]},{"given":"O.C.","family":"Lingj\u00e6rde","sequence":"additional","affiliation":[{"name":"1 Department of Mathematics, 2Department of Informatics, University of Oslo, 3Norwegian Computing Center and 4Institute of Basic Medical Sciences, Department of Biostatistics, University of Oslo and Statistics for Innovation \u2013 (sfi) 2, Norway"}]}],"member":"286","published-online":{"date-parts":[[2007,6,6]]},"reference":[{"key":"2024121118015302600_B1","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1016\/S0167-9473(97)00015-7","article-title":"Length modified ridge regression","volume":"25","author":"Aldrin","year":"1997","journal-title":"Comput. Stat. Data Anal"},{"key":"2024121118015302600_B2","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1371\/journal.pbio.0020108","article-title":"Semi-supervised methods to predict patient survival from gene expression data","volume":"2","author":"Bair","year":"2004","journal-title":"PLoS Biol"},{"key":"2024121118015302600_B3","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1198\/016214505000000628","article-title":"Prediction by supervised principal components","volume":"101","author":"Bair","year":"2006","journal-title":"J. Am. Stat. Assoc"},{"key":"2024121118015302600_B4","doi-asserted-by":"crossref","first-page":"3738","DOI":"10.1073\/pnas.0409462102","article-title":"Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival","volume":"102","author":"Chang","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2024121118015302600_B5","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1111\/j.2517-6161.1972.tb00899.x","article-title":"Regression models and life tables (with discussion)","volume":"34","author":"Cox","year":"1972","journal-title":"J. R. Stat. Soc. B"},{"key":"2024121118015302600_B6","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1080\/00401706.1993.10485033","article-title":"A statistical view of some chemometrics regression tools (with discussion)","volume":"35","author":"Frank","year":"1993","journal-title":"Technometrics"},{"key":"2024121118015302600_B7","volume-title":"The Elements of Statistical Learning, Data Mining, Inference, and Prediction","author":"Hastie","year":"2001"},{"key":"2024121118015302600_B8","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1080\/00401706.1970.10488634","article-title":"Ridge regression: biased estimation for non-orthogonal problems","volume":"12","author":"Hoerl","year":"1970","journal-title":"Technometrics"},{"key":"2024121118015302600_B9","doi-asserted-by":"crossref","DOI":"10.1007\/b97377","volume-title":"Survival Analysis. Techniques for Censored and Truncated Data","author":"Klein","year":"2003","edition":"2nd"},{"key":"2024121118015302600_B10","volume-title":"Multivariate Calibration","author":"Martens","year":"1989"},{"key":"2024121118015302600_B11","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1080\/00401706.1996.10484549","article-title":"Iteratively reweighted partial least squares estimation for generalized linear regression","volume":"38","author":"Marx","year":"1996","journal-title":"Technometrics"},{"key":"2024121118015302600_B12","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1016\/S0140-6736(05)17866-0","article-title":"Prediction of cancer outcome with microarrays: a multiple random validation strategy","volume":"365","author":"Michiels","year":"2005","journal-title":"Lancet"},{"key":"2024121118015302600_B13","unstructured":"Nyg\u00e5rd\n              S\n            \n            \u00a0et al.\n          Partial least squares Cox regression on genomic data handling additional covariates. Statistical Research Report 5\/2006\n          2006\n          Department of Mathematics, University of Oslo\n          \n            http:\/\/www.math.uio.no\/eprint\/stat_report\/2006\/05-06.html"},{"key":"2024121118015302600_B14","unstructured":"Park\n              MP\n            \n            \u00a0HastieT\n          L1 regularization path algorithm for generalized linear models\n          Technical report. 2006\u201314.\n          2006\n          Department of Statistics, Stanford University\n          \n            http:\/\/www-stat.stanford.edu\/reports\/papers2006.html"},{"key":"2024121118015302600_B15","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1093\/bioinformatics\/18.suppl_1.S120","article-title":"Linking gene expression data with patient survival times using partial least squares","volume":"18","author":"Park","year":"2002","journal-title":"Bioinformatics"},{"key":"2024121118015302600_B16","doi-asserted-by":"crossref","first-page":"1937","DOI":"10.1056\/NEJMoa012914","article-title":"The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma","volume":"346","author":"Rosenwald","year":"2002","journal-title":"N. Engl. J. Med"},{"key":"2024121118015302600_B17","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1093\/biostatistics\/kxj006","article-title":"Microarray gene expression data with linked survival phenotypes: diffuse large-B-cell lymphoma revisited","volume":"7","author":"Segal","year":"2006","journal-title":"Biostatistics"},{"key":"2024121118015302600_B18","doi-asserted-by":"crossref","first-page":"486","DOI":"10.1080\/01621459.1993.10476299","article-title":"Linear model selection by cross-validation","volume":"88","author":"Shao","year":"1993","journal-title":"J. Am. Stat. Assoc"},{"key":"2024121118015302600_B19","doi-asserted-by":"crossref","first-page":"8418","DOI":"10.1073\/pnas.0932692100","article-title":"Repeated observation of breast tumor subtypes in independent gene expression data sets","volume":"100","author":"S\u00f8rlie","year":"2003","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2024121118015302600_B20","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the Lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. B"},{"key":"2024121118015302600_B21","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1002\/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3","article-title":"The lasso method for variable selection in the Cox model","volume":"16","author":"Tibshirani","year":"1997","journal-title":"Stat. Med"},{"key":"2024121118015302600_B22","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1056\/NEJMoa021967","article-title":"A gene-expression signature as a predictor of survival in breast cancer","volume":"347","author":"van de Vijver","year":"2002","journal-title":"N. Engl. J. Med"},{"key":"2024121118015302600_B23","doi-asserted-by":"crossref","first-page":"3201","DOI":"10.1002\/sim.2353","article-title":"Cross-validated Cox regression on microarray gene expression data","volume":"25","author":"van Houwelingen","year":"2006","journal-title":"Stat. Med"},{"key":"2024121118015302600_B24","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1038\/415530a","article-title":"Gene expression profiling predicts clinical outcome of breast cancer","volume":"415","author":"van't Veer","year":"2002","journal-title":"Nature"},{"key":"2024121118015302600_B25","doi-asserted-by":"crossref","first-page":"2305","DOI":"10.1002\/sim.4780122407","article-title":"Cross-validation in survival analysis","volume":"12","author":"Verweij","year":"1993","journal-title":"Stat. Med"},{"key":"2024121118015302600_B26","doi-asserted-by":"crossref","first-page":"2427","DOI":"10.1002\/sim.4780132307","article-title":"Penalized likelihood in Cox regression","volume":"13","author":"Verweij","year":"1994","journal-title":"Stat. Med"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/16\/2080\/61051962\/bioinformatics_23_16_2080.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/16\/2080\/61051962\/bioinformatics_23_16_2080.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,11]],"date-time":"2024-12-11T22:24:05Z","timestamp":1733955845000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/16\/2080\/198497"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,6,6]]},"references-count":26,"journal-issue":{"issue":"16","published-print":{"date-parts":[[2007,8,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm305","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,8,15]]},"published":{"date-parts":[[2007,6,6]]}}}