{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,12]],"date-time":"2026-02-12T10:30:48Z","timestamp":1770892248405,"version":"3.50.1"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"9","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Pathway and gene set-based approaches for the analysis of gene expression profiling experiments have become increasingly popular for addressing problems associated with individual gene analysis. Since most genes are not differently expressed, existing gene set tests, which consider all the genes within a gene set, are subject to considerable noise and power loss, a concern exacerbated in studies in which the degree of differential expression is moderate for truly differentially expressed genes. For a significantly differentially expressed pathway, it is also of substantial interest to select important genes that drive the differential expression of the pathway.<\/jats:p><jats:p>Methods: We develop a unified framework to jointly test the significance of a pathway and to select a subset of genes that drive the significant pathway effect. To achieve dimension reduction and gene selection, we decompose each gene pathway into a single score by using a regularized form of linear discriminant analysis, called sparse linear discriminant analysis (sLDA). Testing for the significance of the pathway effect proceeds via permutation of the sLDA score. The sLDA-based test is compared with competing approaches with simulations and two applications: a study on the effect of metal fume exposure on immune response and a study of gene expression profiles among Type II Diabetes patients.<\/jats:p><jats:p>Results: Our results show that sLDA-based testing provides a powerful approach to test for the significance of a differentially expressed pathway and gene selection.<\/jats:p><jats:p>Availability: An implementation of the proposed sLDA-based pathway test in the R statistical computing environment is available at http:\/\/www.hsph.harvard.edu\/\u223cmwu\/software\/<\/jats:p><jats:p>Contact: \u00a0xlin@hsph.harvard.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp019","type":"journal-article","created":{"date-parts":[[2009,1,26]],"date-time":"2009-01-26T01:13:06Z","timestamp":1232932386000},"page":"1145-1151","source":"Crossref","is-referenced-by-count":92,"title":["Sparse linear discriminant analysis for simultaneous testing for the significance of a gene set\/pathway and gene selection"],"prefix":"10.1093","volume":"25","author":[{"given":"Michael C.","family":"Wu","sequence":"first","affiliation":[{"name":"1 Department of Biostatistics and 2Department of Environmental Health, Harvard School of Public Health, 655 Huntington Ave., Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lingsong","family":"Zhang","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics and 2Department of Environmental Health, Harvard School of Public Health, 655 Huntington Ave., Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhaoxi","family":"Wang","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics and 2Department of Environmental Health, Harvard School of Public Health, 655 Huntington Ave., Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David C.","family":"Christiani","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics and 2Department of Environmental Health, Harvard School of Public Health, 655 Huntington Ave., Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xihong","family":"Lin","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics and 2Department of Environmental Health, Harvard School of Public Health, 655 Huntington Ave., Boston, MA 02115, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2009,1,25]]},"reference":[{"key":"2023013110282844200_B1","doi-asserted-by":"crossref","first-page":"760","DOI":"10.1093\/biomet\/asm050","article-title":"The high-dimension, low-sample-size geometric representation holds under mild conditions","volume":"94","author":"Ahn","year":"2007","journal-title":"Biometrika"},{"key":"2023013110282844200_B2","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"2023013110282844200_B3","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1198\/016214505000000628","article-title":"Prediction by supervised principal components","volume":"101","author":"Bair","year":"2006","journal-title":"J. Am. Stat. Assoc."},{"key":"2023013110282844200_B4","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser. B Methodol."},{"key":"2023013110282844200_B5","first-page":"98","article-title":"Global functional profiling of gene expression","volume":"81","author":"Draghici","year":"2003","journal-title":"Genomics"},{"key":"2023013110282844200_B6","first-page":"25","article-title":"High dimensional feature selection for discriminant microarray data analysis","volume":"15","author":"Feng","year":"2003","journal-title":"Adv. Data Mining Model"},{"key":"2023013110282844200_B7","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1111\/j.1469-1809.1936.tb02137.x","article-title":"The use of multiple measurements in taxonomic problems","volume":"7","author":"Fisher","year":"1936","journal-title":"Ann. Eugen."},{"key":"2023013110282844200_B8","doi-asserted-by":"crossref","first-page":"230","DOI":"10.6026\/97320630002230","article-title":"On sparse Fisher discriminant method for microarray data analysis","volume":"2","author":"Fung","year":"2007","journal-title":"Bioinformation"},{"key":"2023013110282844200_B9","doi-asserted-by":"crossref","first-page":"980","DOI":"10.1093\/bioinformatics\/btm051","article-title":"Analyzing gene expression data in terms of gene sets: methodological issues","volume":"23","author":"Goeman","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013110282844200_B10","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1093\/bioinformatics\/btg382","article-title":"A global test for groups of genes: testing association with a clinical outcome","volume":"20","author":"Goeman","year":"2004","journal-title":"Bioinformatics"},{"key":"2023013110282844200_B11","doi-asserted-by":"crossref","first-page":"1283","DOI":"10.2337\/diabetes.54.5.1283","article-title":"Proteome analysis of skeletal muscle from obese and morbidly obese women","volume":"54","author":"Hittel","year":"2005","journal-title":"Diabetes"},{"key":"2023013110282844200_B12","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1139\/y84-010","article-title":"Renal enzymes during experimental diabetes mellitus in the rat. Role of insulin, carbohydrate metabolism, and ketoacidosis","volume":"62","author":"Lemieux","year":"1984","journal-title":"Can. J. Physiol. Pharmacol."},{"key":"2023013110282844200_B13","doi-asserted-by":"crossref","first-page":"0032.1","DOI":"10.1186\/gb-2001-2-8-research0032","article-title":"Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application","volume":"2","author":"Li","year":"2001","journal-title":"Genome Biol."},{"key":"2023013110282844200_B14","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1016\/j.tox.2005.07.012","article-title":"Detection of autoantibody to aldolase B in sera from patients with troglitazone-induced liver dysfunction","volume":"216","author":"Maniratanachote","year":"2005","journal-title":"Toxicology"},{"key":"2023013110282844200_B15","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1016\/S0021-9258(19)85918-5","article-title":"Purification and properties of liver fructose 1, 6-bisphosphatase from C57BL\/KsJ normal and diabetic mice","volume":"255","author":"Marcus","year":"1980","journal-title":"J. Biol.Chem."},{"key":"2023013110282844200_B16","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1038\/ng1180","article-title":"PGC-1 \u03b1-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes","volume":"34","author":"Mootha","year":"2003","journal-title":"Nat. Genet."},{"key":"2023013110282844200_B17","doi-asserted-by":"crossref","first-page":"1499","DOI":"10.1194\/jlr.M700090-JLR200","article-title":"Effects of glucose metabolism on the regulation of genes of fatty acid synthesis and triglyceride secretion in the liver","volume":"48","author":"Morral","year":"2007","journal-title":"J. Lipid. Res."},{"key":"2023013110282844200_B18","doi-asserted-by":"crossref","first-page":"1427","DOI":"10.2337\/diacare.27.6.1427","article-title":"Serum \u03b3-glutamyltransferase and risk of metabolic syndrome and type 2 diabetes in middle-aged Japanese men","volume":"27","author":"Nakanishi","year":"2004","journal-title":"Diabetes Care"},{"key":"2023013110282844200_B19","first-page":"426","article-title":"Identification of novel diagnostic marker candidates for diabetic retinopathy by serological proteome analysis","volume":"46","author":"Oh","year":"2005","journal-title":"Invest. Ophtalmol. Vis. Sci."},{"key":"2023013110282844200_B20","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1042\/bj2080333","article-title":"Insulin mediates the stimulation of pyruvate kinase by a dual mechanism","volume":"208","author":"Park","year":"1982","journal-title":"Biochem. J."},{"key":"2023013110282844200_B21","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1214\/aos\/1176344136","article-title":"Estimating the dimension of a model","volume":"6","author":"Schwarz","year":"1978","journal-title":"Ann. Stat."},{"key":"2023013110282844200_B22","doi-asserted-by":"crossref","first-page":"15545","DOI":"10.1073\/pnas.0506580102","article-title":"Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles","volume":"102","author":"Subramanian","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110282844200_B23","doi-asserted-by":"crossref","first-page":"13544","DOI":"10.1073\/pnas.0506577102","article-title":"Discovering statistically significant pathways in expression profiling studies","volume":"103","author":"Tian","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110282844200_B24","doi-asserted-by":"crossref","first-page":"6567","DOI":"10.1073\/pnas.082099299","article-title":"Diagnosis of multiple cancer types by shrunken centroids of gene expression","volume":"99","author":"Tibshirani","year":"2002","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023013110282844200_B25","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023013110282844200_B26","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1186\/1471-2105-6-225","article-title":"Pathway level analysis of gene expression using singular value decomposition","volume":"6","author":"Tomfohr","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023013110282844200_B27","article-title":"Two-group classification via sparse linear discriminant analysis","volume-title":"Technical report.","author":"Wu","year":"2008"},{"key":"2023013110282844200_B28","doi-asserted-by":"crossref","first-page":"1584","DOI":"10.1007\/s00125-002-0905-7","article-title":"Microarray profiling of skeletal muscle tissues from equally obese, non-diabetic insulin-sensitive and insulin-resistant Pima Indians","volume":"45","author":"Yang","year":"2002","journal-title":"Diabetologia"},{"key":"2023013110282844200_B29","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","article-title":"Regularization and variable selection via the elastic net","volume":"67","author":"Zou","year":"2005","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/9\/1145\/48985709\/bioinformatics_25_9_1145.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/9\/1145\/48985709\/bioinformatics_25_9_1145.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,7]],"date-time":"2025-02-07T06:17:23Z","timestamp":1738909043000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/9\/1145\/203375"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,1,25]]},"references-count":29,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2009,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp019","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,5,1]]},"published":{"date-parts":[[2009,1,25]]}}}