{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T23:39:11Z","timestamp":1781134751247,"version":"3.54.1"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"15","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: DNA methylation is an epigenetic mark that can stably repress gene expression. Because of its biological and clinical significance, several methods have been developed to compare genome-wide patterns of methylation between groups of samples. The application of gene set analysis to identify relevant groups of genes that are enriched for differentially methylated genes is often a major component of the analysis of these data. This can be used, for example, to identify processes or pathways that are perturbed in disease development. We show that gene-set analysis, as it is typically applied to genome-wide methylation assays, is severely biased as a result of differences in the numbers of CpG sites associated with different classes of genes and gene promoters.<\/jats:p>\n               <jats:p>Results: We demonstrate this bias using published data from a study of differential CpG island methylation in lung cancer and a dataset we generated to study methylation changes in patients with long-standing ulcerative colitis. We show that several of the gene sets that seem enriched would also be identified with randomized data. We suggest two existing approaches that can be adapted to correct the bias. Accounting for the bias in the lung cancer and ulcerative colitis datasets provides novel biological insights into the role of methylation in cancer development and chronic inflammation, respectively. 
Our results have significant implications for many previous genome-wide methylation studies that have drawn conclusions on the basis of such strongly biased analysis.<\/jats:p>\n               <jats:p>Contact: \u00a0cathal.seoighe@nuigalway.ie<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt311","type":"journal-article","created":{"date-parts":[[2013,6,4]],"date-time":"2013-06-04T01:26:49Z","timestamp":1370309209000},"page":"1851-1857","source":"Crossref","is-referenced-by-count":133,"title":["Gene-set analysis is severely biased when applied to genome-wide methylation data"],"prefix":"10.1093","volume":"29","author":[{"given":"Paul","family":"Geeleher","sequence":"first","affiliation":[{"name":"1 Section of Hematology\/Oncology, Department of Medicine, University of Chicago, Chicago, IL 60637 USA, 2Department of Mathematics, Statistics and Applied Mathematics and 3Department of Pharmacology and Therapeutics, National University of Ireland, Galway, Ireland and 4Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Lori","family":"Hartnett","sequence":"additional","affiliation":[{"name":"1 Section of Hematology\/Oncology, Department of Medicine, University of Chicago, Chicago, IL 60637 USA, 2Department of Mathematics, Statistics and Applied Mathematics and 3Department of Pharmacology and Therapeutics, National University of Ireland, Galway, Ireland and 
4Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Laurance J.","family":"Egan","sequence":"additional","affiliation":[{"name":"1 Section of Hematology\/Oncology, Department of Medicine, University of Chicago, Chicago, IL 60637 USA, 2Department of Mathematics, Statistics and Applied Mathematics and 3Department of Pharmacology and Therapeutics, National University of Ireland, Galway, Ireland and 4Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Aaron","family":"Golden","sequence":"additional","affiliation":[{"name":"1 Section of Hematology\/Oncology, Department of Medicine, University of Chicago, Chicago, IL 60637 USA, 2Department of Mathematics, Statistics and Applied Mathematics and 3Department of Pharmacology and Therapeutics, National University of Ireland, Galway, Ireland and 4Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Raja Affendi","family":"Raja Ali","sequence":"additional","affiliation":[{"name":"1 Section of Hematology\/Oncology, Department of Medicine, University of Chicago, Chicago, IL 60637 USA, 2Department of Mathematics, Statistics and Applied Mathematics and 3Department of Pharmacology and Therapeutics, National University of Ireland, Galway, Ireland and 4Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Cathal","family":"Seoighe","sequence":"additional","affiliation":[{"name":"1 Section of Hematology\/Oncology, Department of Medicine, University of Chicago, Chicago, IL 60637 USA, 2Department of Mathematics, Statistics and Applied Mathematics and 3Department of Pharmacology 
and Therapeutics, National University of Ireland, Galway, Ireland and 4Department of Genetics, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2013,6,3]]},"reference":[{"key":"2023012810451847000_btt311-B1","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. the gene ontology consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat Genet."},{"key":"2023012810451847000_btt311-B2","doi-asserted-by":"crossref","first-page":"1943","DOI":"10.1093\/bioinformatics\/bti260","article-title":"Significance analysis of functional categories in gene expression studies: a structured permutation approach","volume":"21","author":"Barry","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012810451847000_btt311-B3","doi-asserted-by":"crossref","first-page":"R10","DOI":"10.1186\/gb-2011-12-1-r10","article-title":"DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines","volume":"12","author":"Bell","year":"2011","journal-title":"Genome Biol."},{"key":"2023012810451847000_btt311-B4","doi-asserted-by":"crossref","first-page":"934","DOI":"10.1126\/science.1220671","article-title":"Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution","volume":"336","author":"Booth","year":"2012","journal-title":"Science"},{"key":"2023012810451847000_btt311-B5","doi-asserted-by":"crossref","first-page":"2483","DOI":"10.1200\/JCO.2011.39.3090","article-title":"Quantitative DNA methylation analysis identifies a single CpG dinucleotide important for ZAP-70 expression and predictive of prognosis in chronic lymphocytic leukemia","volume":"30","author":"Claus","year":"2012","journal-title":"J. Clin. 
Oncol."},{"key":"2023012810451847000_btt311-B6","first-page":"2029","article-title":"Methylation of CpG in a small region of the hMLH1 promoter invariably correlates with the absence of gene expression","volume":"59","author":"Deng","year":"1999","journal-title":"Cancer Res."},{"key":"2023012810451847000_btt311-B7","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1038\/nbt.1530","article-title":"Targeted bisulfite sequencing reveals changes in DNA methylation associated with nuclear reprogramming","volume":"27","author":"Deng","year":"2009","journal-title":"Nat. Biotechnol."},{"key":"2023012810451847000_btt311-B8","doi-asserted-by":"crossref","first-page":"1350","DOI":"10.1038\/ng.471","article-title":"Differential methylation of tissue- and cancer-specific CpG island shores distinguishes human induced pluripotent stem cells, embryonic stem cells and fibroblasts","volume":"41","author":"Doi","year":"2009","journal-title":"Nat. Genet."},{"key":"2023012810451847000_btt311-B9","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1186\/1476-4598-9-44","article-title":"A genome-wide screen identifies frequently methylated genes in haematological and epithelial cancers","volume":"9","author":"Dunwell","year":"2010","journal-title":"Mol. Cancer"},{"key":"2023012810451847000_btt311-B10","doi-asserted-by":"crossref","first-page":"526","DOI":"10.1136\/gut.48.4.526","article-title":"The risk of colorectal cancer in ulcerative colitis: a meta-analysis","volume":"48","author":"Eaden","year":"2001","journal-title":"Gut"},{"key":"2023012810451847000_btt311-B11","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1214\/07-AOAS101","article-title":"On testing the significance of sets of genes","volume":"1","author":"Efron","year":"2007","journal-title":"Ann. Appl. 
Stat."},{"key":"2023012810451847000_btt311-B12","doi-asserted-by":"crossref","first-page":"11206","DOI":"10.1073\/pnas.0900301106","article-title":"DNA methylation is widespread and associated with differential gene expression in castes of the honeybee, Apis mellifera","volume":"106","author":"Elango","year":"2009","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012810451847000_btt311-B13","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1093\/bioinformatics\/btl567","article-title":"Using gostats to test gene lists for go term association","volume":"23","author":"Falcon","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012810451847000_btt311-B14","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1038\/onc.2011.307","article-title":"DNA hypermethylation in lung cancer is targeted at differentiation-associated genes","volume":"31","author":"Helman","year":"2012","journal-title":"Oncogene"},{"key":"2023012810451847000_btt311-B15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/nar\/gkn923","article-title":"Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists","volume":"37","author":"Huang","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012810451847000_btt311-B16","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1038\/nprot.2008.211","article-title":"Systematic and integrative analysis of large gene lists using david bioinformatics resources","volume":"4","author":"Huang","year":"2009","journal-title":"Nat. Protoc."},{"key":"2023012810451847000_btt311-B17","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1038\/ng.298","article-title":"The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores","volume":"41","author":"Irizarry","year":"2009","journal-title":"Nat. 
Genet."},{"key":"2023012810451847000_btt311-B18","article-title":"The DNA methylation landscape of small cell lung cancer suggests a differentiation defect of neuroendocrine cells","author":"Kalari","year":"2012","journal-title":"Oncogene"},{"key":"2023012810451847000_btt311-B19","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"KEGG: kyoto encyclopedia of genes and genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012810451847000_btt311-B20","doi-asserted-by":"crossref","first-page":"e10028","DOI":"10.1371\/journal.pone.0010028","article-title":"A study of the influence of sex on genome wide methylation","volume":"5","author":"Liu","year":"2010","journal-title":"PLoS One"},{"key":"2023012810451847000_btt311-B21","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1038\/nbt.1630","article-title":"Great improves functional interpretation of cis-regulatory regions","volume":"28","author":"McLean","year":"2010","journal-title":"Nat. Biotechnol."},{"key":"2023012810451847000_btt311-B22","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1007\/978-1-59745-522-0_5","article-title":"Methylated DNA immunoprecipitation (medip)","volume":"507","author":"Mohn","year":"2009","journal-title":"Methods Mol. Biol."},{"key":"2023012810451847000_btt311-B23","doi-asserted-by":"crossref","first-page":"3829","DOI":"10.1093\/nar\/gkp260","article-title":"High-resolution genome-wide cytosine methylation profiling with simultaneous copy number analysis and optimization for limited cell numbers","volume":"37","author":"Oda","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"2023012810451847000_btt311-B24","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1186\/1745-6150-4-14","article-title":"Transcript length bias in RNA-seq data confounds systems biology","volume":"4","author":"Oshlack","year":"2009","journal-title":"Biol. 
Direct"},{"key":"2023012810451847000_btt311-B25","doi-asserted-by":"crossref","first-page":"252","DOI":"10.1073\/pnas.0710735105","article-title":"High-resolution mapping of DNA hypermethylation and hypomethylation in lung cancer","volume":"105","author":"Rauch","year":"2008","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012810451847000_btt311-B26","doi-asserted-by":"crossref","first-page":"1583","DOI":"10.1101\/gr.119131.110","article-title":"Large-scale methylation domains mark a functional subset of neuronally expressed genes","volume":"21","author":"Schroeder","year":"2011","journal-title":"Genome Res."},{"key":"2023012810451847000_btt311-B27","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1038\/nature08683","article-title":"DNMT1 maintains progenitor function in self-renewing somatic tissue","volume":"463","author":"Sen","year":"2010","journal-title":"Nature"},{"key":"2023012810451847000_btt311-B28","first-page":"397","volume-title":"Limma: Linear Models for Microarray Data","author":"Smyth","year":"2005"},{"key":"2023012810451847000_btt311-B29","doi-asserted-by":"crossref","first-page":"1898","DOI":"10.1053\/j.gastro.2009.12.044","article-title":"Functional switching of TGF-beta1 signaling in liver cancer via epigenetic modulation of a single CpG site in TTP promoter","volume":"138","author":"Sohn","year":"2010","journal-title":"Gastroenterology"},{"key":"2023012810451847000_btt311-B30","doi-asserted-by":"crossref","first-page":"R84","DOI":"10.1186\/gb-2012-13-10-r84","article-title":"Tissue of origin determines cancer-associated CpG island promoter hypermethylation patterns","volume":"13","author":"Sproul","year":"2012","journal-title":"Genome Biol."},{"key":"2023012810451847000_btt311-B31","doi-asserted-by":"crossref","first-page":"15545","DOI":"10.1073\/pnas.0506580102","article-title":"Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression 
profiles","volume":"102","author":"Subramanian","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012810451847000_btt311-B32","doi-asserted-by":"crossref","first-page":"1974","DOI":"10.1101\/gr.093310.109","article-title":"The presence of RNA polymerase II, active or stalled, predicts epigenetic fate of promoter CpG islands","volume":"19","author":"Takeshima","year":"2009","journal-title":"Genome Res."},{"key":"2023012810451847000_btt311-B33","doi-asserted-by":"crossref","first-page":"853","DOI":"10.1038\/ng1598","article-title":"Chromosome-wide and promoter-specific analyses identify sites of differential DNA methylation in normal and transformed human cells","volume":"37","author":"Weber","year":"2005","journal-title":"Nat. Genet."},{"key":"2023012810451847000_btt311-B34","doi-asserted-by":"crossref","first-page":"R14","DOI":"10.1186\/gb-2010-11-2-r14","article-title":"Gene ontology analysis for RNA-seq: accounting for selection bias","volume":"11","author":"Young","year":"2010","journal-title":"Genome Biol."},{"key":"2023012810451847000_btt311-B35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s12013-012-9336-3","article-title":"Differential DNA methylation status between human preadipocytes and mature adipocytes","volume":"63","author":"Zhu","year":"2012","journal-title":"Cell Biochem. 
Biophys."},{"key":"2023012810451847000_btt311-B36","doi-asserted-by":"crossref","first-page":"1835","DOI":"10.1053\/j.gastro.2006.09.050","article-title":"Correlation between the single-site CpG methylation and expression silencing of the XAF1 gene in human gastric and colon cancers","volume":"131","author":"Zou","year":"2006","journal-title":"Gastroenterology"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/15\/1851\/48888210\/bioinformatics_29_15_1851.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/29\/15\/1851\/48888210\/bioinformatics_29_15_1851.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,28]],"date-time":"2023-01-28T12:29:03Z","timestamp":1674908943000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/29\/15\/1851\/265573"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,6,3]]},"references-count":36,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2013,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt311","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2013,8,1]]},"published":{"date-parts":[[2013,6,3]]}}}