{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T03:04:26Z","timestamp":1775012666897,"version":"3.50.1"},"reference-count":27,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>The analysis of high-throughput gene expression data with respect to sets of genes rather than individual genes has many advantages. A variety of methods have been developed for assessing the enrichment of sets of genes with respect to differential expression. In this paper we provide a comparative study of four of these methods: Fisher's exact test, Gene Set Enrichment Analysis (GSEA), Random-Sets (RS), and Gene List Analysis with Prediction Accuracy (GLAPA). The first three methods use associative statistics, while the fourth uses predictive statistics. We first compare all four methods on simulated data sets to verify that Fisher's exact test is markedly worse than the other three approaches. We then validate the other three methods on seven real data sets with known genetic perturbations and then compare the methods on two cancer data sets where our a priori knowledge is limited.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>The simulation study highlights that none of the three method outperforms all others consistently. GSEA and RS are able to detect weak signals of deregulation and they perform differently when genes in a gene set are both differentially up and down regulated. GLAPA is more conservative and large differences between the two phenotypes are required to allow the method to detect differential deregulation in gene sets. This is due to the fact that the enrichment statistic in GLAPA is prediction error which is a stronger criteria than classical two sample statistic as used in RS and GSEA. This was reflected in the analysis on real data sets as GSEA and RS were seen to be significant for particular gene sets while GLAPA was not, suggesting a small effect size. We find that the rank of gene set enrichment induced by GLAPA is more similar to RS than GSEA. More importantly, the rankings of the three methods share significant overlap.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>The three methods considered in our study recover relevant gene sets known to be deregulated in the experimental conditions and pathologies analyzed. There are differences between the three methods and GSEA seems to be more consistent in finding enriched gene sets, although no method uniformly dominates over all data sets. Our analysis highlights the deep difference existing between associative and predictive methods for detecting enrichment and the use of both to better interpret results of pathway analysis. We close with suggestions for users of gene set methods.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-10-275","type":"journal-article","created":{"date-parts":[[2009,9,2]],"date-time":"2009-09-02T18:13:47Z","timestamp":1251915227000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":91,"title":["Comparative study of gene set enrichment methods"],"prefix":"10.1186","volume":"10","author":[{"given":"Luca","family":"Abatangelo","sequence":"first","affiliation":[]},{"given":"Rosalia","family":"Maglietta","sequence":"additional","affiliation":[]},{"given":"Angela","family":"Distaso","sequence":"additional","affiliation":[]},{"given":"Annarita","family":"D'Addabbo","sequence":"additional","affiliation":[]},{"given":"Teresa Maria","family":"Creanza","sequence":"additional","affiliation":[]},{"given":"Sayan","family":"Mukherjee","sequence":"additional","affiliation":[]},{"given":"Nicola","family":"Ancona","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2009,9,2]]},"reference":[{"key":"3005_CR1","doi-asserted-by":"publisher","first-page":"789","DOI":"10.1038\/nm1087","volume":"10","author":"B Vogelstein","year":"2004","unstructured":"Vogelstein B, Kinzler KW: Cancer genes and the pathways they control. Nature Medicine 2004, 10: 789\u2013799. 10.1038\/nm1087","journal-title":"Nature Medicine"},{"key":"3005_CR2","doi-asserted-by":"publisher","first-page":"15545","DOI":"10.1073\/pnas.0506580102","volume":"102","author":"A Subramanian","year":"2005","unstructured":"Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci 2005, 102: 15545\u201315550. 10.1073\/pnas.0506580102","journal-title":"Proc Natl Acad Sci"},{"issue":"2","key":"3005_CR3","doi-asserted-by":"publisher","first-page":"266","DOI":"10.1006\/geno.2002.6698","volume":"79","author":"P Khatri","year":"2002","unstructured":"Khatri P, Draghici S, Ostermeier GC, Krawetz SA: Profiling gene expression using onto-express. Genomics 2002, 79(2):266\u2013270. 10.1006\/geno.2002.6698","journal-title":"Genomics"},{"key":"3005_CR4","doi-asserted-by":"publisher","first-page":"1943","DOI":"10.1093\/bioinformatics\/bti260","volume":"21","author":"WT Barry","year":"2005","unstructured":"Barry WT, Nobel AB, Wright FA: Significance analysis of functional categories in gene expression studies: a structured permutation approach. Bioinformatics 2005, 21: 1943\u20131949. 10.1093\/bioinformatics\/bti260","journal-title":"Bioinformatics"},{"key":"3005_CR5","doi-asserted-by":"publisher","first-page":"13544","DOI":"10.1073\/pnas.0506577102","volume":"102","author":"L Tian","year":"2005","unstructured":"Tian L, Greenberg SA, Kong SW, Altschuler J, Kohane IS, Park PJ: Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci 2005, 102: 13544\u201313549. 10.1073\/pnas.0506577102","journal-title":"Proc Natl Acad Sci"},{"key":"3005_CR6","doi-asserted-by":"publisher","first-page":"3587","DOI":"10.1093\/bioinformatics\/bti565","volume":"21","author":"P Khatri","year":"2005","unstructured":"Khatri P, Draghici S: Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics 2005, 21: 3587\u20133595. 10.1093\/bioinformatics\/bti565","journal-title":"Bioinformatics"},{"issue":"16","key":"3005_CR7","doi-asserted-by":"publisher","first-page":"2063","DOI":"10.1093\/bioinformatics\/btm289","volume":"23","author":"R Maglietta","year":"2007","unstructured":"Maglietta R, Piepoli A, Catalano D, Licciulli F, Carella M, Liuni S, Pesole G, Perri F, Ancona N: Statistical assessment of functional categories of genes deregulated in pathological conditions by using microarray data. Bioinformatics 2007, 23(16):2063\u20132072. 10.1093\/bioinformatics\/btm289","journal-title":"Bioinformatics"},{"issue":"1","key":"3005_CR8","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1214\/07-AOAS104","volume":"1","author":"MA Newton","year":"2007","unstructured":"Newton MA, Quintana FA, Den Boon JA, Sengupta S, Ahlquist P: Random-Set methods identify distinct aspects of the enrichment signal in gene-set analysis. The Annals of Applied Statistics 2007, 1(1):85\u2013106. 10.1214\/07-AOAS104","journal-title":"The Annals of Applied Statistics"},{"issue":"1","key":"3005_CR9","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1214\/07-AOAS101","volume":"1","author":"B Efron","year":"2007","unstructured":"Efron B, Tibshirani R: On testing the significance of sets of genes. The Annals of Applied Statistics 2007, 1(1):107\u2013129. 10.1214\/07-AOAS101","journal-title":"The Annals of Applied Statistics"},{"key":"3005_CR10","doi-asserted-by":"publisher","first-page":"353","DOI":"10.1038\/nature04296","volume":"439","author":"AH Bild","year":"2006","unstructured":"Bild AH, Yao G, Chang JT, Wang Q, Potti A, Chasse D, Joshi MB, Harpole D, Lancaster JM, Berchuck A, Olson JAJ, Marks JR, Dressman HK, West M, Nevins JR: Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature 2006, 439: 353\u2013357. 10.1038\/nature04296","journal-title":"Nature"},{"key":"3005_CR11","doi-asserted-by":"publisher","first-page":"5974","DOI":"10.1073\/pnas.0931261100","volume":"100","author":"XJ Ma","year":"2003","unstructured":"Ma XJ, Salunga R, Tuggle JT, Gaudet J, Enright E, McQuary P, Payette T, Pistone M, Stecker K, Zhang BM, Zhou YX, Varnholt H, Smith B, Gadd M, Chatfield E, Kessler J, Baer TM, Erlander MG, Sgroi D: Gene expression profiles of human breast cancer progression. Proc Natl Acad Sci USA 2003, 100: 5974\u20135979. 10.1073\/pnas.0931261100","journal-title":"Proc Natl Acad Sci USA"},{"key":"3005_CR12","doi-asserted-by":"publisher","first-page":"570","DOI":"10.1056\/NEJMoa060467","volume":"355","author":"A Potti","year":"2006","unstructured":"Potti A, Mukherjee S, Petersen R, Dressman HK, Bild A, Koontz J, Kratzke R, Watson MA, Kelley M, Ginsburg GS, West M, Harpole DHJ, Nevins JR: A genomic strategy to refine prognosis in early stage non-small cell lung carcinoma. N Engl J Med 2006, 355: 570\u2013580. 10.1056\/NEJMoa060467","journal-title":"N Engl J Med"},{"key":"3005_CR13","doi-asserted-by":"publisher","first-page":"435","DOI":"10.1152\/physiolgenomics.00315.2005","volume":"25","author":"SM Mense","year":"2006","unstructured":"Mense SM, Sengupta A, Zhou M, Lan C, Bentsman G, Volsky DJ, L Z: Gene expression profiling reveals the profound upregulation of hypoxia-responsive genes in primary human astrocytes. Physiol Genomics 2006, 25: 435\u2013449. 10.1152\/physiolgenomics.00315.2005","journal-title":"Physiol Genomics"},{"issue":"4","key":"3005_CR14","doi-asserted-by":"publisher","first-page":"e15","DOI":"10.1093\/nar\/30.4.e15","volume":"30","author":"YH Yang","year":"2002","unstructured":"Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucl Acids Res 2002, 30(4):e15. 10.1093\/nar\/30.4.e15","journal-title":"Nucl Acids Res"},{"issue":"2","key":"3005_CR15","doi-asserted-by":"publisher","first-page":"e28","DOI":"10.1371\/journal.pcbi.0040028","volume":"4","author":"EJ Edelman","year":"2008","unstructured":"Edelman EJ, Guinney J, Chi JT, Febbo PG, Mukherjee S: Modeling cancer progression via pathway dependences. PLoS Comput Biol 2008, 4(2):e28. 10.1371\/journal.pcbi.0040028","journal-title":"PLoS Comput Biol"},{"key":"3005_CR16","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-2346-5","volume-title":"Permutation tests: a practical guide to resampling methods for testing hypotheses","author":"P Good","year":"1994","unstructured":"Good P: Permutation tests: a practical guide to resampling methods for testing hypotheses. New York: Springer Verlag; 1994."},{"key":"3005_CR17","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1089\/106652703321825928","volume":"10","author":"S Mukherjee","year":"2003","unstructured":"Mukherjee S, Tamayo P, Rogers S, Rifkin R, Engle A, Campbell C, Golub TR, Mesirov JP: Estimating dataset size requirements for classifying DNA microarray data. J Comp Biol 2003, 10: 119\u2013142. 10.1089\/106652703321825928","journal-title":"J Comp Biol"},{"key":"3005_CR18","doi-asserted-by":"publisher","first-page":"1139","DOI":"10.1142\/S0219720007003041","volume":"5","author":"L Klebanov","year":"2007","unstructured":"Klebanov L, Glazko G, Salzman P, Yakovlev A: A multivariate extension of the gene set enrichment analysis. Journal of Bioinformatics and Computational Biology 2007, 5: 1139\u20131153. 10.1142\/S0219720007003041","journal-title":"Journal of Bioinformatics and Computational Biology"},{"key":"3005_CR19","doi-asserted-by":"publisher","first-page":"307","DOI":"10.1038\/35042675","volume":"408","author":"B Vogelstein","year":"2000","unstructured":"Vogelstein B, Lane D, Levine AJ: Surfing the p53 network. Nature 2000, 408: 307\u2013310. 10.1038\/35042675","journal-title":"Nature"},{"issue":"39","key":"3005_CR20","doi-asserted-by":"publisher","first-page":"36329","DOI":"10.1074\/jbc.M204962200","volume":"277","author":"Q Wu","year":"2002","unstructured":"Wu Q, Kirschmeier P, Hockenberry T, Yang TY, Brassard DL, Wang L, McClanahan T, Black S, Rizzi G, Musco ML, Mirza A, Liu S: Transcriptional regulation during p21WAF1\/CIP1-induced apoptosis in human ovarian cancer cells. J Biol Chem 2002, 277(39):36329\u201336337. 10.1074\/jbc.M204962200","journal-title":"J Biol Chem"},{"key":"3005_CR21","doi-asserted-by":"publisher","first-page":"3749","DOI":"10.1038\/sj.onc.1206439","volume":"22","author":"PP Ongusaha","year":"2003","unstructured":"Ongusaha PP, Ouchi T, Kim KT, Nytko E, Kwak JC, Duda RB, Deng CX, Lee SW: BRCA1 shifts p53-mediated cellular outcomes towards irreversible growth arrest. Oncogene 2003, 22: 3749\u20133758. 10.1038\/sj.onc.1206439","journal-title":"Oncogene"},{"issue":"6","key":"3005_CR22","first-page":"453","volume":"1","author":"Y Jiang","year":"2003","unstructured":"Jiang Y, Zhang W, Kondo K, Klco JM, St Martin TB, Dufault MR, Madden SL, Kaelin WGJ, Nacht M: Gene expression profiling in a renal cell carcinoma cell line: dissecting VHL and hypoxia-dependent pathways. Mol Cancer Res 2003, 1(6):453\u2013462.","journal-title":"Mol Cancer Res"},{"key":"3005_CR23","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1023\/A:1006163101948","volume":"52","author":"R Elledge","year":"1998","unstructured":"Elledge R, Allred C: Prognostic and predictive value of p53 and p21 in breast cancer. Breast Cancer Res Treat 1998, 52: 79\u201398. 10.1023\/A:1006163101948","journal-title":"Breast Cancer Res Treat"},{"key":"3005_CR24","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1016\/S0092-8674(00)81683-9","volume":"1","author":"D Hanahan","year":"2000","unstructured":"Hanahan D, Weinberg RA: The hallmarks of cancer. Cell 2000, 1: 57\u201370. 10.1016\/S0092-8674(00)81683-9","journal-title":"Cell"},{"issue":"10","key":"3005_CR25","doi-asserted-by":"publisher","first-page":"e1047","DOI":"10.1371\/journal.pone.0001047","volume":"2","author":"MH van Vliet","year":"2007","unstructured":"van Vliet MH, Klijn CN, Wessels LFA, Reinders MJT: Module-Based Outcome Prediction Using Breast Cancer Compendia. PLoS ONE 2007, 2(10):e1047. 10.1371\/journal.pone.0001047","journal-title":"PLoS ONE"},{"key":"3005_CR26","first-page":"105","volume":"20","author":"GE Richardson","year":"1993","unstructured":"Richardson GE, Johnson BE: The biology of lung cancer. Semin Oncol 1993, 20: 105\u201327.","journal-title":"Semin Oncol"},{"key":"3005_CR27","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1007\/s00438-005-0014-7","volume":"274","author":"Z Ju","year":"2005","unstructured":"Ju Z, Kapoor M, Newton K, Cheon K, Ramaswamy A, Lotan R, Strong LC, Koo JS: Global detection of molecular changes reveals concurrent alteration of several biological pathways in nonsmall cell lung cancer cells. Mol Gen Genomics 2005, 274: 141\u2013154. 10.1007\/s00438-005-0014-7","journal-title":"Mol Gen Genomics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-10-275.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,31]],"date-time":"2021-08-31T21:35:49Z","timestamp":1630445749000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-10-275"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,9,2]]},"references-count":27,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,12]]}},"alternative-id":["3005"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-10-275","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2009,9,2]]},"assertion":[{"value":"11 November 2008","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 September 2009","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 September 2009","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"275"}}