{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T00:44:55Z","timestamp":1740185095573,"version":"3.37.3"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"14","license":[{"start":{"date-parts":[[2017,7,12]],"date-time":"2017-07-12T00:00:00Z","timestamp":1499817600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DMS\u201314-07439"],"award-info":[{"award-number":["DMS\u201314-07439"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>The discovery of relationships between gene expression measurements and phenotypic responses is hampered by both computational and statistical impediments. Conventional statistical methods are less than ideal because they either fail to select relevant genes, predict poorly, ignore the unknown interaction structure between genes, or are computationally intractable. Thus, the creation of new methods which can handle many expression measurements on relatively small numbers of patients while also uncovering gene\u2013gene relationships and predicting well is desirable.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We develop a new technique for using the marginal relationship between gene expression measurements and patient survival outcomes to identify a small subset of genes which appear highly relevant for predicting survival, produce a low-dimensional embedding based on this small subset, and amplify this embedding with information from the remaining genes. We motivate our methodology by using gene expression measurements to predict survival time for patients with diffuse large B-cell lymphoma, illustrate the behavior of our methodology on carefully constructed synthetic examples, and test it on a number of other gene expression datasets. Our technique is computationally tractable, generally outperforms other methods, is extensible to other phenotypes, and also identifies different genes (relative to existing methods) for possible future study.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and Implementation<\/jats:title><jats:p>All of the code and data are available at http:\/\/mypage.iu.edu\/\u223cdajmcdon\/research\/.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary material is available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx265","type":"journal-article","created":{"date-parts":[[2017,4,20]],"date-time":"2017-04-20T07:52:13Z","timestamp":1492674733000},"page":"i350-i358","source":"Crossref","is-referenced-by-count":5,"title":["Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression"],"prefix":"10.1093","volume":"33","author":[{"given":"Lei","family":"Ding","sequence":"first","affiliation":[{"name":"Department of Statistics, Indiana University, Bloomington, IN, USA"}]},{"given":"Daniel J","family":"McDonald","sequence":"additional","affiliation":[{"name":"Department of Statistics, Indiana University, Bloomington, IN, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,7,12]]},"reference":[{"key":"2023051506490932100_btx265-B1","doi-asserted-by":"crossref","first-page":"10101","DOI":"10.1073\/pnas.97.18.10101","article-title":"Singular value decomposition for genome-wide expression data processing and modeling","volume":"97","author":"Alter","year":"2000","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051506490932100_btx265-B2","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1198\/016214505000000628","article-title":"Prediction by supervised principal components","volume":"101","author":"Bair","year":"2006","journal-title":"J. Am. Stat. Assoc"},{"key":"2023051506490932100_btx265-B3","doi-asserted-by":"crossref","first-page":"e108","DOI":"10.1371\/journal.pbio.0020108","article-title":"Semi-supervised methods to predict patient survival from gene expression data","volume":"2","author":"Bair","year":"2004","journal-title":"PLoS Biol"},{"key":"2023051506490932100_btx265-B4","doi-asserted-by":"crossref","first-page":"955","DOI":"10.1038\/ng.175","article-title":"Genome-wide association defines more than 30 distinct susceptibility loci for crohn\u2019s disease","volume":"40","author":"Barrett","year":"2008","journal-title":"Nat. Genet"},{"key":"2023051506490932100_btx265-B5","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1038\/nm733","article-title":"Gene-expression profiles predict survival of patients with lung adenocarcinoma","volume":"8","author":"Beer","year":"2002","journal-title":"Nat. Med"},{"key":"2023051506490932100_btx265-B6","doi-asserted-by":"crossref","first-page":"1605","DOI":"10.1056\/NEJMoa031046","article-title":"Gene expression profiling identifies new subclasses and improves outcome prediction in adult myeloid leukemia","volume":"350","author":"Bullinger","year":"2004","journal-title":"New Engl. J. Med"},{"key":"2023051506490932100_btx265-B7","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1038\/nature05911","article-title":"Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls","volume":"447","author":"Burton","year":"2007","journal-title":"Nature"},{"key":"2023051506490932100_btx265-B8","first-page":"2313","article-title":"The Dantzig selector: statistical estimation when p is much larger than n","volume":"35","author":"Candes","year":"2007","journal-title":"Ann. Stat"},{"key":"2023051506490932100_btx265-B9","doi-asserted-by":"crossref","first-page":"1077","DOI":"10.1038\/ng.714","article-title":"Thirty new loci for age at menarche identified by a meta-analysis of genome-wide association studies","volume":"42","author":"Elks","year":"2010","journal-title":"Nat. Genet"},{"key":"2023051506490932100_btx265-B10","doi-asserted-by":"crossref","first-page":"432","DOI":"10.1093\/biostatistics\/kxm045","article-title":"Sparse inverse covariance estimation with the graphical lasso","volume":"9","author":"Friedman","year":"2008","journal-title":"Biostatistics"},{"key":"2023051506490932100_btx265-B11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. Softw"},{"key":"2023051506490932100_btx265-B12","doi-asserted-by":"crossref","first-page":"research0003","DOI":"10.1186\/gb-2001-2-1-research0003","article-title":"Supervised harvesting of expression trees","volume":"2","author":"Hastie","year":"2001","journal-title":"Genome Biol"},{"key":"2023051506490932100_btx265-B13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2000-1-2-research0003","article-title":"Identifying distinct sets of genes with similar expression patterns via \u201cgene shaving\u201d","volume":"1","author":"Hastie","year":"2000","journal-title":"Genome Biol"},{"key":"2023051506490932100_btx265-B14","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1080\/00401706.1970.10488634","article-title":"Ridge regression: biased estimation for nonorthogonal problems","volume":"12","author":"Hoerl","year":"1970","journal-title":"Technometrics"},{"key":"2023051506490932100_btx265-B15","doi-asserted-by":"crossref","first-page":"344","DOI":"10.1080\/10618600.2014.995799","article-title":"On the Nystr\u00f6m and column-sampling methods for the approximate principal components analysis of large data sets","volume":"25","author":"Homrighausen","year":"2016","journal-title":"J. Comput. Graph. Stat"},{"key":"2023051506490932100_btx265-B16","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1111\/j.2044-8317.1957.tb00179.x","article-title":"The relations of the newer multivariate statistical methods to factor analysis","volume":"10","author":"Hotelling","year":"1957","journal-title":"Br. J. Stat. Psychol"},{"key":"2023051506490932100_btx265-B17","doi-asserted-by":"crossref","first-page":"2700","DOI":"10.1093\/hmg\/ddv028","article-title":"Genetic variants associated with motion sickness point to roles for inner ear development, neurological processes and glucose homeostasis","volume":"24","author":"Hromatka","year":"2015","journal-title":"Hum. Mol. Genet"},{"key":"2023051506490932100_btx265-B18","doi-asserted-by":"crossref","first-page":"682","DOI":"10.1198\/jasa.2009.0121","article-title":"On consistency and sparsity for principal components analysis in high dimensions","volume":"104","author":"Johnstone","year":"2009","journal-title":"J. Am. Stat. Assoc"},{"volume-title":"Principal Component Analysis","year":"2002","author":"Jolliffe","key":"2023051506490932100_btx265-B19"},{"volume-title":"A Course in Multivariate Analysis","year":"1965","author":"Kendall","key":"2023051506490932100_btx265-B20"},{"key":"2023051506490932100_btx265-B21","doi-asserted-by":"crossref","first-page":"1403","DOI":"10.1007\/s00439-012-1174-2","article-title":"Genome-wide analysis of polymorphisms associated with cytokine responses in smallpox vaccine recipients","volume":"131","author":"Kennedy","year":"2012","journal-title":"Hum. Genet"},{"key":"2023051506490932100_btx265-B22","doi-asserted-by":"crossref","first-page":"R48","DOI":"10.1093\/hmg\/ddp012","article-title":"Parkinson\u2019s disease: from monogenic forms to genetic susceptibility factors","volume":"18","author":"Lesage","year":"2009","journal-title":"Hum. Mol. Genet"},{"year":"2002","author":"Lu","key":"2023051506490932100_btx265-B23"},{"key":"2023051506490932100_btx265-B24","doi-asserted-by":"crossref","first-page":"1436","DOI":"10.1214\/009053606000000281","article-title":"High-dimensional graphs and variable selection with the lasso","volume":"34","author":"Meinshausen","year":"2006","journal-title":"Ann. Stat"},{"key":"2023051506490932100_btx265-B25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v018.i02","article-title":"The pls package: principal component and partial least squares regression in r","volume":"18","author":"Mevik","year":"2007","journal-title":"J. Stat. Softw"},{"key":"2023051506490932100_btx265-B26","doi-asserted-by":"crossref","first-page":"1595","DOI":"10.1214\/009053607000000578","article-title":"Preconditioning\u2019 for feature selection and regression in high-dimensional problems","volume":"36","author":"Paul","year":"2008","journal-title":"Ann. Stat"},{"key":"2023051506490932100_btx265-B27","first-page":"566","article-title":"Principal components analysis","volume":"6","author":"Pearson","year":"1901","journal-title":"Lond. Edinb. Dublin Philos. Mag. J"},{"key":"2023051506490932100_btx265-B28","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1038\/nature13545","article-title":"Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche","volume":"514","author":"Perry","year":"2014","journal-title":"Nature"},{"key":"2023051506490932100_btx265-B29","doi-asserted-by":"crossref","first-page":"1937","DOI":"10.1056\/NEJMoa012914","article-title":"The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma","volume":"346","author":"Rosenwald","year":"2002","journal-title":"New Engl. J. Med"},{"key":"2023051506490932100_btx265-B30","doi-asserted-by":"crossref","first-page":"636","DOI":"10.1016\/j.biopsych.2015.12.006","article-title":"Pharmacogenomic study of clozapine-induced agranulocytosis\/granulocytopenia in a Japanese population","volume":"80","author":"Saito","year":"2016","journal-title":"Biol. Psychiatry"},{"key":"2023051506490932100_btx265-B31","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1038\/nature05616","article-title":"A genome-wide association study identifies novel risk loci for type 2 diabetes","volume":"445","author":"Sladek","year":"2007","journal-title":"Nature"},{"key":"2023051506490932100_btx265-B32","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. Roy. Stat. Soc. B"},{"key":"2023051506490932100_btx265-B33","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1038\/415530a","article-title":"Gene expression profiling predicts clinical outcome of breast cancer","volume":"415","author":"Van\u2019t Veer","year":"2002","journal-title":"Nature"},{"key":"2023051506490932100_btx265-B34","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1007\/0-306-47815-3_5","volume-title":"A Practical Approach to Microarray Data Analysis","author":"Wall","year":"2003"},{"key":"2023051506490932100_btx265-B35","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1111\/j.1467-9868.2005.00532.x","article-title":"Model selection and estimation in regression with grouped variables","volume":"68","author":"Yuan","year":"2006","journal-title":"J. Roy. Stat. Soc. B"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/14\/i350\/50314760\/bioinformatics_33_14_i350.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/14\/i350\/50314760\/bioinformatics_33_14_i350.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,23]],"date-time":"2024-06-23T17:42:23Z","timestamp":1719164543000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/14\/i350\/3953978"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,7,12]]},"references-count":35,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2017,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx265","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2017,7,15]]},"published":{"date-parts":[[2017,7,12]]}}}