{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,29]],"date-time":"2025-10-29T03:18:58Z","timestamp":1761707938594},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"21","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The nearest shrunken centroids classifier has become a popular algorithm in tumor classification problems using gene expression microarray data. Feature selection is an embedded part of the method to select top-ranking genes based on a univariate distance statistic calculated for each gene individually. The univariate statistics summarize gene expression profiles outside of the gene co-regulation network context, leading to redundant information being included in the selection procedure.<\/jats:p>\n               <jats:p>Results: We propose an Eigengene-based Linear Discriminant Analysis (ELDA) to address gene selection in a multivariate framework. The algorithm uses a modified rotated Spectral Decomposition (SpD) technique to select \u2018hub\u2019 genes that associate with the most important eigenvectors. Using three benchmark cancer microarray datasets, we show that ELDA selects the most characteristic genes, leading to substantially smaller classifiers than the univariate feature selection based analogues. The resulting de-correlated expression profiles make the gene-wise independence assumption more realistic and applicable for the shrunken centroids classifier and other diagonal linear discriminant type of models. Our algorithm further incorporates a misclassification cost matrix, allowing differential penalization of one type of error over another. 
In the breast cancer data, we show false negative prognosis can be controlled via a cost-adjusted discriminant function.<\/jats:p>\n               <jats:p>Availability: R code for the ELDA algorithm is available from author upon request.<\/jats:p>\n               <jats:p>Contact: zhaoling.meng@sanofi-aventis.com<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl442","type":"journal-article","created":{"date-parts":[[2006,8,23]],"date-time":"2006-08-23T03:33:41Z","timestamp":1156304021000},"page":"2635-2642","source":"Crossref","is-referenced-by-count":34,"title":["Eigengene-based linear discriminant model for tumor classification using gene expression microarray data"],"prefix":"10.1093","volume":"22","author":[{"given":"Ronglai","family":"Shen","sequence":"first","affiliation":[{"name":"Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-0602, USA"}]},{"given":"Debashis","family":"Ghosh","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109-0602, USA"}]},{"given":"Arul","family":"Chinnaiyan","sequence":"additional","affiliation":[{"name":"Department of Pathology and Urology, University of Michigan, Ann Arbor, MI 48109-0602, USA"}]},{"given":"Zhaoling","family":"Meng","sequence":"additional","affiliation":[{"name":"Biostatistics and Programming, Sanofi aventis, PO Box 6800, Bridgewater, NJ 08807-0800, USA"}]}],"member":"286","published-online":{"date-parts":[[2006,8,22]]},"reference":[{"key":"2023012408430616500_b1","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1006\/mvre.2001.2380","article-title":"Identification of endothelial cell genes expressed in an in vitro model of angiogenesis: induction of esm-1, (beta)ig-h3, and 
nrcam","volume":"63","author":"Aitkenhead","year":"2002","journal-title":"Microvasc. Res."},{"key":"2023012408430616500_b2","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1038\/35000501","article-title":"Distinct types of diffuse large b-cell lymphoma identified by gene expression profiling","volume":"403","author":"Alizadeh","year":"2000","journal-title":"Nature"},{"key":"2023012408430616500_b3","doi-asserted-by":"crossref","first-page":"10101","DOI":"10.1073\/pnas.97.18.10101","article-title":"Singular value decomposition for genome-wide expression data processing and modeling","volume":"97","author":"Alter","year":"2000","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012408430616500_b4","first-page":"1","article-title":"Igfbp-3 and igfbp-5 association with endothelial cells: role of c-terminal heparin binding domain","volume":"5","author":"Booth","year":"1995","journal-title":"Growth Regul."},{"key":"2023012408430616500_b5","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1093\/bioinformatics\/btg419","article-title":"Is cross-validation valid for small-sample microarray classification?","volume":"20","author":"Braga-Neto","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012408430616500_b6","doi-asserted-by":"crossref","first-page":"3738","DOI":"10.1073\/pnas.0409462102","article-title":"Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival","volume":"102","author":"Chang","year":"2005","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012408430616500_b7","article-title":"Optimal feature selection for nearest centroid classifiers, with applications to gene expression microarrays","author":"Dabney","year":"2005"},{"key":"2023012408430616500_b8","doi-asserted-by":"crossref","first-page":"4148","DOI":"10.1093\/bioinformatics\/bti681","article-title":"Classification of microarrays to nearest centroids","volume":"21","author":"Dabney","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012408430616500_b9","first-page":"4059","article-title":"A cell proliferation signature is a marker of extremely poor outcome in a subpopulation of breast cancer patients","volume":"65","author":"Dai","year":"2005","journal-title":"J. Natl Cancer Inst."},{"key":"2023012408430616500_b10","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1198\/016214502753479248","article-title":"Comparison of discrimination methods for the classification of tumors using gene expression data","volume":"97","author":"Dudoit","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012408430616500_b11","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1093\/bioinformatics\/bth469","article-title":"Outcome signature genes in breast cancer: is there a unique set?","volume":"21","author":"Ein-Dor","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012408430616500_b12","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","article-title":"Molecular classification of cancer: class discovery and class prediction by gene expression monitoring","volume":"286","author":"Golub","year":"1999","journal-title":"Science"},{"key":"2023012408430616500_b13","doi-asserted-by":"crossref","first-page":"9608","DOI":"10.1073\/pnas.1632587100","article-title":"Prediction of clinical drug efficacy by classification of drug-induced genomic expression profiles in vitro","volume":"100","author":"Gunther","year":"2003","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012408430616500_b14","article-title":"Connectivity, Module-Conformity, and Significance: Understanding Gene Co-Expression Network Methods","author":"Horvath","year":"2006"},{"key":"2023012408430616500_b15","volume-title":"Applied Multivariate Statistical Analysis","author":"Johnson","year":"1998","edition":"4th edn"},{"key":"2023012408430616500_b16","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1007\/BF02289233","article-title":"The varimax criterion for analytic rotation in factor analysis","volume":"23","author":"Kaiser","year":"1958","journal-title":"Psychometrika"},{"key":"2023012408430616500_b17","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1038\/89044","article-title":"Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks","volume":"7","author":"Khan","year":"2001","journal-title":"Nat. Med."},{"key":"2023012408430616500_b18","first-page":"1137","article-title":"A study of cross-validation and bootstrap for accuracy estimation and model selection","author":"Kohavi","year":"1995"},{"key":"2023012408430616500_b19","doi-asserted-by":"crossref","first-page":"784","DOI":"10.1002\/jcb.10093","article-title":"Signalling pathways involved in the direct effects of igfbp-5 on breast epithelial cell attachment and survival","volume":"84","author":"McCaig","year":"2000","journal-title":"J. Cell Biochem."},{"key":"2023012408430616500_b20","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1086\/376561","article-title":"Selection of genetic markers for association analyses, using linkage disequilibrium and haplotypes","volume":"73","author":"Meng","year":"2003","journal-title":"Am. J. Hum. 
Genet."},{"key":"2023012408430616500_b21","doi-asserted-by":"crossref","first-page":"9309","DOI":"10.1073\/pnas.0401994101","article-title":"Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression","volume":"101","author":"Rhodes","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012408430616500_b22","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1093\/jnci\/95.1.14","article-title":"Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification","volume":"95","author":"Simon","year":"2003","journal-title":"J. Natl Cancer Inst."},{"key":"2023012408430616500_b23","doi-asserted-by":"crossref","first-page":"6567","DOI":"10.1073\/pnas.082099299","article-title":"Diagnosis of multiple cancer types by shrunken centroids of gene expression","volume":"99","author":"Tibshirani","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012408430616500_b24","doi-asserted-by":"crossref","first-page":"530","DOI":"10.1038\/415530a","article-title":"Gene expression profiling predicts clinical outcome of breast cancer","volume":"415","author":"van\u2019t Veer","year":"2002","journal-title":"Nature"},{"key":"2023012408430616500_b25","doi-asserted-by":"crossref","first-page":"1999","DOI":"10.1056\/NEJMoa021967","article-title":"A gene-expression signature as a predictor of survival in breast cancer","volume":"347","author":"van de Vijver","year":"2002","journal-title":"N. Engl. J. 
Med."},{"key":"2023012408430616500_b26","first-page":"8917","article-title":"Elevated levels of connective tissue growth factor, wisp-1, and cyr61 in primary breast cancers associated with more advanced features","volume":"61","author":"Xie","year":"2001","journal-title":"Cancer Res."},{"key":"2023012408430616500_b27","doi-asserted-by":"crossref","first-page":"763","DOI":"10.1093\/bioinformatics\/17.9.763","article-title":"Principal component analysis for clustering gene expression data","volume":"17","author":"Yeung","year":"2000","journal-title":"Bioinformatics"},{"key":"2023012408430616500_b28","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2105-6-168","article-title":"High-throughput gominer, an industrial-strength integrative gene ontology tool for interpretation of multiple-microarray experiments, with application to studies of common variable immune deficiency (cvid)","volume":"6","author":"Zeeberg","year":"2005","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/21\/2635\/48839105\/bioinformatics_22_21_2635.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/21\/2635\/48839105\/bioinformatics_22_21_2635.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T09:04:33Z","timestamp":1674551073000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/21\/2635\/250858"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,8,22]]},"references-count":28,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2006,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl442","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[
{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2006,11,1]]},"published":{"date-parts":[[2006,8,22]]}}}