{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T18:27:29Z","timestamp":1776277649258,"version":"3.50.1"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"22","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,11,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The Illumina BeadArray is a popular platform for profiling DNA methylation, an important epigenetic event associated with gene silencing and chromosomal instability. However, current approaches rely on an arbitrary detection P-value cutoff for excluding probes and samples from subsequent analysis as a quality control step, which results in missing observations and information loss. It is desirable to have an approach that incorporates the whole data, but accounts for the different quality of individual observations.<\/jats:p><jats:p>Results: We first investigate and propose a statistical framework for removing the source of biases in Illumina Methylation BeadArray based on several positive control samples. We then introduce a weighted model-based clustering called LumiWCluster for Illumina BeadArray that weights each observation according to the detection P-values systematically and avoids discarding subsets of the data. LumiWCluster allows for discovery of distinct methylation patterns and automatic selection of informative CpG loci. 
We demonstrate the advantages of LumiWCluster on two publicly available Illumina GoldenGate Methylation datasets (ovarian cancer and hepatocellular carcinoma).<\/jats:p><jats:p>Availability: \u00a0R package LumiWCluster can be downloaded from http:\/\/www.unc.edu\/~pfkuan\/LumiWCluster<\/jats:p><jats:p>Contact: \u00a0pfkuan@bios.unc.edu<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq553","type":"journal-article","created":{"date-parts":[[2010,9,30]],"date-time":"2010-09-30T00:45:30Z","timestamp":1285807530000},"page":"2849-2855","source":"Crossref","is-referenced-by-count":78,"title":["A statistical framework for Illumina DNA methylation arrays"],"prefix":"10.1093","volume":"26","author":[{"given":"Pei Fen","family":"Kuan","sequence":"first","affiliation":[{"name":"1 Department of Biostatistics, 2Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, 3Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53792 and 4Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA"},{"name":"1 Department of Biostatistics, 2Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, 3Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53792 and 4Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA"}]},{"given":"Sijian","family":"Wang","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics, 2Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, 3Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53792 and 4Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA"}]},{"given":"Xin","family":"Zhou","sequence":"additional","affiliation":[{"name":"1 Department of 
Biostatistics, 2Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, 3Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53792 and 4Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA"}]},{"given":"Haitao","family":"Chu","sequence":"additional","affiliation":[{"name":"1 Department of Biostatistics, 2Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599, 3Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI 53792 and 4Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA"}]}],"member":"286","published-online":{"date-parts":[[2010,9,29]]},"reference":[{"key":"2023012507564110400_B1","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1007\/s00438-010-0522-y","article-title":"High-throughput assessment of CpG site methylation for distinguishing between HCV-cirrhosis and HCV-associated hepatocellular carcinoma","volume":"283","author":"Archer","year":"2010","journal-title":"Mol. Genet. Genomics"},{"key":"2023012507564110400_B2","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","article-title":"Controlling the false discovery rate: a practical and powerful approach to multiple testing","volume":"57","author":"Benjamini","year":"1995","journal-title":"J. R. Stat. Soc. Ser. 
B"},{"key":"2023012507564110400_B3","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1101\/gr.4410706","article-title":"High-throughput DNA methylation profiling using universal bead arrays","volume":"16","author":"Bibikova","year":"2006","journal-title":"Genome Res."},{"key":"2023012507564110400_B4","doi-asserted-by":"crossref","first-page":"989","DOI":"10.3150\/bj\/1106314847","article-title":"Some theory for Fisher's linear discriminant function, \u201cnaive Bayes\u201d, and some alternatives when there are many more variables than observations","volume":"10","author":"Bickel","year":"2004","journal-title":"Bernoulli"},{"key":"2023012507564110400_B5","doi-asserted-by":"crossref","first-page":"e1000602","DOI":"10.1371\/journal.pgen.1000602","article-title":"Aging and environmental exposures alter tissue-specific DNA methylation dependent upon CpG island context","volume":"5","author":"Christensen","year":"2009","journal-title":"PLoS Genet."},{"key":"2023012507564110400_B6","doi-asserted-by":"crossref","DOI":"10.1038\/nbt1414","article-title":"A Bayesian deconvolution strategy for immunoprecipitation based DNA methylome analysis","volume":"26","author":"Down","year":"2008","journal-title":"Nat. Biotechnol."},{"key":"2023012507564110400_B7","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1186\/1756-0500-1-18","article-title":"Spike-in validation of an Illumina-specific variance-stabilizing transformation","volume":"1","author":"Dunning","year":"2008","journal-title":"BMC Res. 
Notes"},{"key":"2023012507564110400_B8","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1186\/1471-2105-9-85","article-title":"Statistical issues in the analysis of Illumina data","volume":"9","author":"Dunning","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012507564110400_B9","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1038\/nrg2005","article-title":"Cancer epigenomics: DNA methylomes and histone-modifications maps","volume":"8","author":"Esteller","year":"2007","journal-title":"Nat. Rev. Genet."},{"key":"2023012507564110400_B10","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1198\/016214502760047131","article-title":"Model-based clustering, discriminant analysis, and density estimation","volume":"97","author":"Fraley","year":"2002","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012507564110400_B11","doi-asserted-by":"crossref","first-page":"e9749","DOI":"10.1371\/journal.pone.0009749","article-title":"Hepatocellular carcinoma displays distinct DNA methylation signatures with potential as clinical predictors","volume":"5","author":"Hernandez-Vargas","year":"2010","journal-title":"PLoS One"},{"key":"2023012507564110400_B12","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1186\/1471-2105-9-365","article-title":"Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distribution","volume":"9","author":"Houseman","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012507564110400_B13","doi-asserted-by":"crossref","first-page":"e9359","DOI":"10.1371\/journal.pone.0009359","article-title":"DNA methylation profiles of ovarian epithelial carcinoma tumors and cell lines","volume":"5","author":"Houshdaran","year":"2009","journal-title":"PLoS ONE"},{"key":"2023012507564110400_B14","author":"Illumina","year":"2006","journal-title":"GoldenGate methylation cancer panel 
I."},{"key":"2023012507564110400_B15","doi-asserted-by":"crossref","first-page":"780","DOI":"10.1101\/gr.7301508","article-title":"Comprehensive high-throughput arrays for relative methylation (CHARM)","volume":"18","author":"Irizarry","year":"2008","journal-title":"Genome Res."},{"key":"2023012507564110400_B16","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1038\/ng.298","article-title":"The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores","volume":"41","author":"Irizarry","year":"2009","journal-title":"Nat. Genet."},{"key":"2023012507564110400_B17","doi-asserted-by":"crossref","first-page":"2118","DOI":"10.1093\/bioinformatics\/bti318","article-title":"Applications of beta-mixture models in bioinformatics","volume":"21","author":"Ji","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012507564110400_B18","doi-asserted-by":"crossref","DOI":"10.1002\/9780470316801","volume-title":"Finding Groups in Data: An Introduction to Cluster Analysis.","author":"Kaufman","year":"1990"},{"key":"2023012507564110400_B19","doi-asserted-by":"crossref","first-page":"1462","DOI":"10.1101\/gr.091447.109","article-title":"Genome-wide screen of promoter methylation identifies novel markers in melanoma","volume":"19","author":"Koga","year":"2009","journal-title":"Genome Res."},{"key":"2023012507564110400_B20","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1038\/nrg2732","article-title":"Principles and challenges of genome-wide DNA methylation analysis","volume":"11","author":"Laird","year":"2010","journal-title":"Nat. Rev. Genet."},{"key":"2023012507564110400_B21","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1177\/0962280208099451","article-title":"Considerations for processing and analysis of GoldenGate-based two-colour Illumina platforms","volume":"18","author":"Lynch","year":"2009","journal-title":"Stat. Methods Med. 
Res."},{"key":"2023012507564110400_B22","doi-asserted-by":"crossref","first-page":"416","DOI":"10.1093\/carcin\/bgp006","article-title":"Epigenetic profiling reveals etiologically distinct patterns of DNA methylation in head and neck squamous cell carcinoma","volume":"30","author":"Marsit","year":"2009","journal-title":"Carcinogenesis"},{"key":"2023012507564110400_B23","first-page":"1145","article-title":"Penalized model-based clustering with application to variable selection","volume":"80","author":"Pan","year":"2007","journal-title":"J. Mach. Learn. Res."},{"key":"2023012507564110400_B24","first-page":"266","article-title":"Discussion of \u201cBayesian clustering with variable selection and transformation selection\u201d by Liu et al.","volume":"7","author":"Raftery","year":"2003","journal-title":"Bayesian Stat."},{"key":"2023012507564110400_B25","doi-asserted-by":"crossref","first-page":"1518","DOI":"10.1101\/gr.077479.108","article-title":"An integrated resource for genome-wide identification and analysis of human tissue-specific differential methylated regions (tDMRs)","volume":"18","author":"Rakyan","year":"2008","journal-title":"Genome Res."},{"key":"2023012507564110400_B26","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1007\/s10596-009-9136-z","article-title":"Weighted model-based clustering for remote sensing image analysis","volume":"14","author":"Richards","year":"2009","journal-title":"Comput. Geosci."},{"key":"2023012507564110400_B27","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: a graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. 
Math."},{"key":"2023012507564110400_B28","doi-asserted-by":"crossref","first-page":"2534","DOI":"10.1093\/bioinformatics\/bth280","article-title":"Interactively optimizing signal-to-noise ratios in expression profiling, project-specific algorithm selection and detection p-value weighting in Affymetrix microarrays","volume":"20","author":"Seo","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012507564110400_B29","doi-asserted-by":"crossref","first-page":"2906","DOI":"10.1093\/bioinformatics\/btp543","article-title":"Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis","volume":"25","author":"Shen","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012507564110400_B30","doi-asserted-by":"crossref","first-page":"1896","DOI":"10.1093\/bioinformatics\/bth176","article-title":"A comparison of cluster analysis methods using DNA methylation data","volume":"20","author":"Siegmund","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012507564110400_B31","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1111\/j.1541-0420.2007.00922.x","article-title":"Variable selection for model-based high dimensional clustering and its application to microarray data","volume":"64","author":"Wang","year":"2008","journal-title":"Biometrics"},{"key":"2023012507564110400_B32","doi-asserted-by":"crossref","first-page":"2926","DOI":"10.1093\/nar\/gkn133","article-title":"A study of the relationships between oligonucleotide properties and hybridization signal intensities from NimbleGen microarray datasets","volume":"36","author":"Wei","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012507564110400_B33","doi-asserted-by":"crossref","DOI":"10.1186\/1745-6150-3-23","article-title":"On the necessity of different statistical treatment for Illumina BeadChip and Affymetrix GeneChip data and its significance for biological 
interpretation","volume":"3","author":"Wong","year":"2008","journal-title":"Biol. Direct"},{"key":"2023012507564110400_B34","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1093\/bioinformatics\/btp040","article-title":"Statistical methods of background correction for Illumina BeadArray data","volume":"25","author":"Xie","year":"2009","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/22\/2849\/48853356\/bioinformatics_26_22_2849.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/22\/2849\/48853356\/bioinformatics_26_22_2849.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,26]],"date-time":"2025-02-26T04:03:05Z","timestamp":1740542585000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/22\/2849\/228179"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,9,29]]},"references-count":34,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2010,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq553","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,11,15]]},"published":{"date-parts":[[2010,9,29]]}}}