{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T23:09:05Z","timestamp":1772838545415,"version":"3.50.1"},"reference-count":25,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2018,8,23]],"date-time":"2018-08-23T00:00:00Z","timestamp":1534982400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000269","name":"Economic and Social Research Council","doi-asserted-by":"publisher","award":["ES\/N00812X\/1"],"award-info":[{"award-number":["ES\/N00812X\/1"]}],"id":[{"id":"10.13039\/501100000269","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Essex University and ESRC","award":["ES\/M008592\/1"],"award-info":[{"award-number":["ES\/M008592\/1"]}]},{"DOI":"10.13039\/501100000269","name":"ESRC","doi-asserted-by":"publisher","award":["ES\/M008592\/1"],"award-info":[{"award-number":["ES\/M008592\/1"]}],"id":[{"id":"10.13039\/501100000269","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000265","name":"Medical Research Council","doi-asserted-by":"publisher","award":["MR\/K013807\/1"],"award-info":[{"award-number":["MR\/K013807\/1"]}],"id":[{"id":"10.13039\/501100000265","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Essex University"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The datasets generated by DNA methylation analyses are getting bigger. With the release of the HumanMethylationEPIC micro-array and datasets containing thousands of samples, analyses of these large datasets using R are becoming impractical due to large memory requirements. As a result there is an increasing need for computationally efficient methodologies to perform meaningful analysis on high dimensional data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Here we introduce the bigmelon R package, which provides a memory efficient workflow that enables users to perform the complex, large scale analyses required in epigenome wide association studies (EWAS) without the need for large RAM. Building on top of the CoreArray Genomic Data Structure file format and libraries packaged in the gdsfmt package, we provide a practical workflow that facilitates the reading-in, preprocessing, quality control and statistical analysis of DNA methylation data.<\/jats:p>\n                  <jats:p>We demonstrate the capabilities of the bigmelon package using a large dataset consisting of 1193 human blood samples from the Understanding Society: UK Household Longitudinal Study, assayed on the EPIC micro-array platform.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The bigmelon package is available on Bioconductor (http:\/\/bioconductor.org\/packages\/bigmelon\/). The Understanding Society dataset is available at https:\/\/www.understandingsociety.ac.uk\/about\/health\/data upon request.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty713","type":"journal-article","created":{"date-parts":[[2018,8,22]],"date-time":"2018-08-22T19:19:20Z","timestamp":1534965560000},"page":"981-986","source":"Crossref","is-referenced-by-count":61,"title":["Bigmelon: tools for analysing large DNA methylation datasets"],"prefix":"10.1093","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1817-1495","authenticated-orcid":false,"given":"Tyler J","family":"Gorrie-Stone","sequence":"first","affiliation":[{"name":"School of Biological Sciences, University of Essex, Colchester, UK"}]},{"given":"Melissa C","family":"Smart","sequence":"additional","affiliation":[{"name":"Institute for Social and Economic Research, University of Essex, Colchester, UK"}]},{"given":"Ayden","family":"Saffari","sequence":"additional","affiliation":[{"name":"Department of Psychological Sciences, Birkbeck, University of London, London, UK"},{"name":"Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK"},{"name":"MRC Unit, The Gambia and MRC International Nutrition Group, London School of Hygiene and Tropical Medicine, London, UK"}]},{"given":"Karim","family":"Malki","sequence":"additional","affiliation":[{"name":"Institute of Psychiatry, Psychology and Neuroscience, King\u2019s College London, London, UK"}]},{"given":"Eilis","family":"Hannon","sequence":"additional","affiliation":[{"name":"University of Exeter Medical School, University of Exeter, Exeter, UK"}]},{"given":"Joe","family":"Burrage","sequence":"additional","affiliation":[{"name":"University of Exeter Medical School, University of Exeter, Exeter, UK"}]},{"given":"Jonathan","family":"Mill","sequence":"additional","affiliation":[{"name":"University of Exeter Medical School, University of Exeter, Exeter, UK"}]},{"given":"Meena","family":"Kumari","sequence":"additional","affiliation":[{"name":"Institute for Social and Economic Research, University of Essex, Colchester, UK"}]},{"given":"Leonard C","family":"Schalkwyk","sequence":"additional","affiliation":[{"name":"School of Biological Sciences, University of Essex, Colchester, UK"}]}],"member":"286","published-online":{"date-parts":[[2018,8,23]]},"reference":[{"key":"2023013107270927500_bty713-B1","doi-asserted-by":"crossref","first-page":"1363","DOI":"10.1093\/bioinformatics\/btu049","article-title":"Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays","volume":"30","author":"Aryee","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013107270927500_bty713-B2","doi-asserted-by":"crossref","first-page":"1138","DOI":"10.1038\/nmeth.3115","article-title":"Comprehensive analysis of DNA methylation data with RnBeads","volume":"11","author":"Assenov","year":"2014","journal-title":"Nat. Methods"},{"key":"2023013107270927500_bty713-B3","doi-asserted-by":"crossref","first-page":"288","DOI":"10.1016\/j.ygeno.2011.07.007","article-title":"High density DNA methylation array with single CpG site resolution","volume":"98","author":"Bibikova","year":"2011","journal-title":"Genomics"},{"key":"2023013107270927500_bty713-B4","doi-asserted-by":"crossref","first-page":"R80.","DOI":"10.1186\/gb-2004-5-10-r80","article-title":"Bioconductor: open software development for computational biology and bioinformatics","volume":"5","author":"Gentleman","year":"2004","journal-title":"Genome Biol"},{"key":"2023013107270927500_bty713-B5","doi-asserted-by":"crossref","first-page":"3329","DOI":"10.1093\/bioinformatics\/bts610","article-title":"GWASTools: an R\/Bioconductor package for quality control and analysis of genome-wide association studies","volume":"28","author":"Gogarten","year":"2012","journal-title":"Bioinformatics"},{"key":"2023013107270927500_bty713-B6","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1186\/s13059-016-1041-x","article-title":"An integrated genetic-epigenetic analysis of schizophrenia: evidence for co-localization of genetic associations and differential DNA methylation","volume":"17","author":"Hannon","year":"2016","journal-title":"Genome Biol"},{"key":"2023013107270927500_bty713-B7","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/j.molcel.2012.10.016","article-title":"Genome-wide methylation profiles reveal quantitative views of human aging rates","volume":"49","author":"Hannum","year":"2013","journal-title":"Mol. Cell"},{"key":"2023013107270927500_bty713-B8","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1038\/ng.865","article-title":"Increased methylation variation in epigenetic domains across cancer types","volume":"43","author":"Hansen","year":"2011","journal-title":"Nat. Genet"},{"key":"2023013107270927500_bty713-B9","doi-asserted-by":"crossref","first-page":"R115.","DOI":"10.1186\/gb-2013-14-10-r115","article-title":"DNA methylation age of human tissues and cell types","volume":"14","author":"Horvath","year":"2013","journal-title":"Genome Biol"},{"key":"2023013107270927500_bty713-B10","doi-asserted-by":"crossref","first-page":"86.","DOI":"10.1186\/1471-2105-13-86","article-title":"DNA methylation arrays as surrogate measures of cell mixture distribution","volume":"13","author":"Houseman","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023013107270927500_bty713-B11","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1093\/ije\/dyr238","article-title":"Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies","volume":"41","author":"Jaffe","year":"2012","journal-title":"Int. J. Epidemiol"},{"key":"2023013107270927500_bty713-B12","doi-asserted-by":"crossref","first-page":"40. 7","DOI":"10.1038\/nn.4181","article-title":"Mapping DNA methylation across development, genotype and schizophrenia in the human frontal cortex","volume":"19","author":"Jaffe","year":"2016","journal-title":"Nat. Neurosci"},{"key":"2023013107270927500_bty713-B13","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1038\/nbt.2487","article-title":"Epigenome-wide association data implicate DNA methylation as an intermediary of genetic risk in rheumatoid arthritis","volume":"31","author":"Liu","year":"2013","journal-title":"Nat. Biotechnol"},{"key":"2023013107270927500_bty713-B14","doi-asserted-by":"crossref","first-page":"359.","DOI":"10.1186\/1471-2105-14-359","article-title":"Marmal-aid \u2013 a database for Infinium HumanMethylation450","volume":"14","author":"Lowe","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023013107270927500_bty713-B15","author":"Mersmann","year":"2015"},{"key":"2023013107270927500_bty713-B16","author":"Min","year":"2017"},{"key":"2023013107270927500_bty713-B17","doi-asserted-by":"crossref","first-page":"389","DOI":"10.2217\/epi.15.114","article-title":"Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences","volume":"8","author":"Moran","year":"2016","journal-title":"Epigenomics"},{"key":"2023013107270927500_bty713-B18","doi-asserted-by":"crossref","first-page":"428","DOI":"10.1093\/bioinformatics\/btt684","article-title":"ChAMP: 450k Chip Analysis Methylation Pipeline","volume":"30","author":"Morris","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013107270927500_bty713-B19","doi-asserted-by":"crossref","first-page":"293.","DOI":"10.1186\/1471-2164-14-293","article-title":"A data-driven approach to preprocessing Illumina 450K methylation array data","volume":"14","author":"Pidsley","year":"2013","journal-title":"BMC Genomics"},{"key":"2023013107270927500_bty713-B20","doi-asserted-by":"crossref","first-page":"529","DOI":"10.1038\/nrg3000","article-title":"Epigenome-wide association studies for common human diseases","volume":"12","author":"Rakyan","year":"2011","journal-title":"Nat. Rev. Genet"},{"key":"2023013107270927500_bty713-B21","doi-asserted-by":"crossref","first-page":"e47","DOI":"10.1093\/nar\/gkv007","article-title":"limma powers differential expression analyses for RNA-sequencing and microarray studies","volume":"43","author":"Ritchie","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023013107270927500_bty713-B22","doi-asserted-by":"crossref","first-page":"264.","DOI":"10.12688\/f1000research.2-264.v1","article-title":"illuminaio: an open source IDAT parsing tool for Illumina microarrays","volume":"2","author":"Smith","year":"2013","journal-title":"F1000Research"},{"key":"2023013107270927500_bty713-B23","doi-asserted-by":"crossref","first-page":"e90","DOI":"10.1093\/nar\/gkt090","article-title":"Low-level processing of Illumina Infinium DNA Methylation BeadArrays","volume":"41","author":"Triche","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023013107270927500_bty713-B24","doi-asserted-by":"crossref","first-page":"3435","DOI":"10.1093\/bioinformatics\/btu566","article-title":"MethylAid: visual and interactive quality control of large Illumina 450k datasets","volume":"30","author":"van Iterson","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013107270927500_bty713-B25","doi-asserted-by":"crossref","first-page":"3326","DOI":"10.1093\/bioinformatics\/bts606","article-title":"A high-performance computing toolset for relatedness and principal component analysis of SNP data","volume":"28","author":"Zheng","year":"2012","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/6\/981\/48968076\/bioinformatics_35_6_981.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/6\/981\/48968076\/bioinformatics_35_6_981.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T10:26:33Z","timestamp":1675160793000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/6\/981\/5078475"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,8,23]]},"references-count":25,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2019,3,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty713","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,3,15]]},"published":{"date-parts":[[2018,8,23]]}}}