{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,13]],"date-time":"2026-05-13T08:37:08Z","timestamp":1778661428770,"version":"3.51.4"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"18","license":[{"start":{"date-parts":[[2019,2,1]],"date-time":"2019-02-01T00:00:00Z","timestamp":1548979200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010663","name":"European Research Council","doi-asserted-by":"publisher","award":["640275"],"award-info":[{"award-number":["640275"]}],"id":[{"id":"10.13039\/100010663","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100006313","name":"Swedish Childhood Cancer Foundation","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006313","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002794","name":"Swedish Cancer Society","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002794","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Swedish Research Council and the Ragnar S\u00f6derberg\u2019s Foundation"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Medulloblastoma (MB) is a brain cancer predominantly arising in children. Roughly 70% of patients are cured today, but survivors often suffer from severe sequelae. MB has been extensively studied by molecular profiling, but often in small and scattered cohorts. To improve cure rates and reduce treatment side effects, accurate integration of such data to increase analytical power will be important, if not essential.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We have integrated 23 transcription datasets, spanning 1350 MB and 291 normal brain samples. To remove batch effects, we combined the Removal of Unwanted Variation (RUV) method with a novel pipeline for determining empirical negative control genes and a panel of metrics to evaluate normalization performance. The documented approach enabled the removal of a majority of batch effects, producing a large-scale, integrative dataset of MB and cerebellar expression data. The proposed strategy will be broadly applicable for accurate integration of data and incorporation of normal reference samples for studies of various diseases. We hope that the integrated dataset will improve current research in the field of MB by allowing more large-scale gene expression analyses.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The RUV-normalized expression data is available through the Gene Expression Omnibus (GEO; https:\/\/www.ncbi.nlm.nih.gov\/geo\/) and can be accessed via the GSE series number GSE124814.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz066","type":"journal-article","created":{"date-parts":[[2019,1,30]],"date-time":"2019-01-30T21:20:03Z","timestamp":1548883203000},"page":"3357-3364","source":"Crossref","is-referenced-by-count":59,"title":["Batch-normalization of cerebellar and medulloblastoma gene expression datasets utilizing empirically defined negative control genes"],"prefix":"10.1093","volume":"35","author":[{"given":"Holger","family":"Weishaupt","sequence":"first","affiliation":[{"name":"Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University , Uppsala, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Patrik","family":"Johansson","sequence":"additional","affiliation":[{"name":"Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University , Uppsala, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anders","family":"Sundstr\u00f6m","sequence":"additional","affiliation":[{"name":"Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University , Uppsala, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zelmina","family":"Lubovac-Pilav","sequence":"additional","affiliation":[{"name":"Division for Biology and Bioinformatics, School of Bioscience, The Systems Biology Research Centre, University of Sk\u00f6vde , Sk\u00f6vde, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bj\u00f6rn","family":"Olsson","sequence":"additional","affiliation":[{"name":"Division for Biology and Bioinformatics, School of Bioscience, The Systems Biology Research Centre, University of Sk\u00f6vde , Sk\u00f6vde, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sven","family":"Nelander","sequence":"additional","affiliation":[{"name":"Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University , Uppsala, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fredrik J","family":"Swartling","sequence":"additional","affiliation":[{"name":"Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University , Uppsala, Sweden"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2019,2,1]]},"reference":[{"key":"2023013108061307300_btz066-B36","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nrg2918","article-title":"Network medicine: a network-based approach to human disease","volume":"12","author":"Barab\u00e1si","year":"2011","journal-title":"Nat Rev Genet"},{"key":"2023013108061307300_btz066-B1","doi-asserted-by":"crossref","first-page":"737","DOI":"10.1016\/j.ccell.2017.05.005","article-title":"Intertumoral heterogeneity within medulloblastoma subgroups","volume":"31","author":"Cavalli","year":"2017","journal-title":"Cancer Cell"},{"key":"2023013108061307300_btz066-B2","doi-asserted-by":"crossref","first-page":"e17238","DOI":"10.1371\/journal.pone.0017238","article-title":"Removing batch effects in analysis of expression microarray data: an evaluation of six batch adjustment methods","volume":"6","author":"Chen","year":"2011","journal-title":"PLoS One"},{"key":"2023013108061307300_btz066-B3","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1186\/s13148-015-0103-3","article-title":"MethPed: a DNA methylation classifier tool for the identification of pediatric brain tumor subtypes","volume":"7","author":"Danielsson","year":"2015","journal-title":"Clin. Epigenet."},{"key":"2023013108061307300_btz066-B4","doi-asserted-by":"crossref","first-page":"373","DOI":"10.1007\/s00401-012-1070-9","article-title":"Aberrant patterns of H3K4 and H3K27 histone lysine methylation occur across subgroups in medulloblastoma","volume":"125","author":"Dubuc","year":"2013","journal-title":"Acta Neuropathol."},{"key":"2023013108061307300_btz066-B5","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1016\/S0168-9525(03)00140-9","article-title":"Human housekeeping genes are compact","volume":"19","author":"Eisenberg","year":"2003","journal-title":"Trends Genet."},{"key":"2023013108061307300_btz066-B6","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1016\/j.tig.2013.05.010","article-title":"Human housekeeping genes, revisited","volume":"29","author":"Eisenberg","year":"2013","journal-title":"Trends Genet."},{"key":"2023013108061307300_btz066-B7","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1093\/biostatistics\/kxr034","article-title":"Using control genes to correct for unwanted variation in microarray data","volume":"13","author":"Gagnon-Bartsch","year":"2012","journal-title":"Biostatistics"},{"key":"2023013108061307300_btz066-B8","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1007\/s12561-013-9081-1","article-title":"A two-stage procedure for the removal of batch effects in microarray studies","volume":"6","author":"Giordan","year":"2014","journal-title":"Stat. Biosci."},{"key":"2023013108061307300_btz066-B9","doi-asserted-by":"crossref","first-page":"79","DOI":"10.2217\/cns.14.58","article-title":"Medulloblastoma development: tumor biology informs treatment decisions","volume":"4.2","author":"Gopalakrishnan","year":"2015","journal-title":"CNS Oncology"},{"key":"2023013108061307300_btz066-B10","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1186\/1471-2105-14-75","article-title":"virtualArray: a R\/bioconductor package to merge raw data from different microarray platforms","volume":"14","author":"Heider","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023013108061307300_btz066-B11","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1038\/nature13268","article-title":"Decoding the regulatory landscape of medulloblastoma using DNA methylation sequencing","volume":"510","author":"Hovestadt","year":"2014","journal-title":"Nature"},{"key":"2023013108061307300_btz066-B12","doi-asserted-by":"crossref","first-page":"1182","DOI":"10.1093\/bioinformatics\/bts096","article-title":"R\/DWD: distance-weighted discrimination for classification, visualization and batch adjustment","volume":"28","author":"Huang","year":"2012","journal-title":"Bioinformatics"},{"key":"2023013108061307300_btz066-B13","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1093\/biostatistics\/kxv026","article-title":"Correcting gene expression data when neither the unwanted variation nor the factor of interest are observed","volume":"17","author":"Jacob","year":"2016","journal-title":"Biostatistics"},{"key":"2023013108061307300_btz066-B14","doi-asserted-by":"crossref","first-page":"118","DOI":"10.1093\/biostatistics\/kxj037","article-title":"Adjusting batch effects in microarray expression data using empirical Bayes methods","volume":"8","author":"Johnson","year":"2007","journal-title":"Biostatistics"},{"key":"2023013108061307300_btz066-B15","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1007\/s00401-012-0958-8","article-title":"Molecular subgroups of medulloblastoma: an international meta-analysis of transcriptome, genetic aberrations, and clinical data of WNT, SHH, Group 3, and Group 4 medulloblastomas","volume":"123","author":"Kool","year":"2012","journal-title":"Acta Neuropathol."},{"key":"2023013108061307300_btz066-B17","doi-asserted-by":"crossref","first-page":"469","DOI":"10.1093\/bib\/bbs037","article-title":"Batch effect removal methods for microarray gene expression data integration: a survey","volume":"14","author":"Lazar","year":"2013","journal-title":"Brief. Bioinf."},{"key":"2023013108061307300_btz066-B18","doi-asserted-by":"crossref","first-page":"882","DOI":"10.1093\/bioinformatics\/bts034","article-title":"The sva package for removing batch effects and other unwanted variation in high-throughput experiments","volume":"28","author":"Leek","year":"2012","journal-title":"Bioinformatics"},{"key":"2023013108061307300_btz066-B19","volume-title":"WHO Classification of Tumours of the Central Nervous System","author":"Louis","year":"2016"},{"key":"2023013108061307300_btz066-B20","doi-asserted-by":"crossref","first-page":"1457","DOI":"10.1158\/1078-0432.CCR-14-1144","article-title":"Tumor-associated macrophages in SHH subgroup of medulloblastomas","volume":"21","author":"Margol","year":"2015","journal-title":"Clin. Cancer Res."},{"key":"2023013108061307300_btz066-B21","first-page":"99","article-title":"Multiplex meta-analysis of medulloblastoma expression studies with external controls","volume-title":"Biocomputing 2014","author":"Morgan","year":"2014"},{"key":"2023013108061307300_btz066-B22","doi-asserted-by":"crossref","first-page":"1711","DOI":"10.1101\/gr.135129.111","article-title":"Predicting cell-type-specific gene expression from regions of open chromatin","volume":"22","author":"Natarajan","year":"2012","journal-title":"Genome Res."},{"key":"2023013108061307300_btz066-B23","doi-asserted-by":"crossref","first-page":"1408","DOI":"10.1200\/JCO.2009.27.4324","article-title":"Medulloblastoma comprises four distinct molecular variants","volume":"29","author":"Northcott","year":"2011","journal-title":"J. Clin. Oncol."},{"key":"2023013108061307300_btz066-B24","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1038\/nature11327","article-title":"Subgroup-specific structural variation across 1\u2009000 medulloblastoma genomes","volume":"488","author":"Northcott","year":"2012","journal-title":"Nature"},{"key":"2023013108061307300_btz066-B25","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1007\/s00401-011-0899-7","article-title":"Rapid, reliable, and reproducible molecular sub-grouping of clinical medulloblastoma samples","volume":"123","author":"Northcott","year":"2012","journal-title":"Acta Neuropathol."},{"key":"2023013108061307300_btz066-B26","doi-asserted-by":"crossref","first-page":"428","DOI":"10.1038\/nature13379","article-title":"Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma","volume":"511","author":"Northcott","year":"2014","journal-title":"Nature"},{"key":"2023013108061307300_btz066-B27","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1038\/nature22973","article-title":"The whole-genome landscape of medulloblastoma subtypes","volume":"547","author":"Northcott","year":"2017","journal-title":"Nature"},{"key":"2023013108061307300_btz066-B28","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1093\/biostatistics\/kxv027","article-title":"Methods that remove batch effects while retaining group differences may lead to exaggerated confidence in downstream analyses","volume":"17","author":"Nygaard","year":"2016","journal-title":"Biostatistics"},{"key":"2023013108061307300_btz066-B29","doi-asserted-by":"crossref","first-page":"2757","DOI":"10.1093\/bioinformatics\/btu375","article-title":"Preserving biological heterogeneity with a permuted surrogate variable analysis for genomics batch correction","volume":"30","author":"Parker","year":"2014","journal-title":"Bioinformatics"},{"key":"2023013108061307300_btz066-B30","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1007\/s00401-014-1297-8","article-title":"Genomic and transcriptomic analyses match medulloblastoma mouse models to their human counterparts","volume":"128","author":"P\u00f6schl","year":"2014","journal-title":"Acta Neuropathol."},{"key":"2023013108061307300_btz066-B31","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1038\/nature11213","article-title":"Novel mutations target distinct subgroups of medulloblastoma","volume":"488","author":"Robinson","year":"2012","journal-title":"Nature"},{"key":"2023013108061307300_btz066-B37","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1007\/s00401-012-1077-2","article-title":"DNA methylation profiling of medulloblastoma allows robust subclassification and improved outcome prediction using formalin-fixed biopsies","volume":"125","author":"Schwalbe","year":"2013","journal-title":"Acta neuropathologica"},{"key":"2023013108061307300_btz066-B38","doi-asserted-by":"crossref","first-page":"958","DOI":"10.1016\/S1470-2045(17)30243-7","article-title":"Novel molecular subgroups for clinical classification and outcome prediction in childhood medulloblastoma: a cohort study","volume":"18","author":"Schwalbe","year":"2017","journal-title":"The Lancet Oncology"},{"key":"2023013108061307300_btz066-B32","doi-asserted-by":"crossref","DOI":"10.12688\/f1000research.10859.1","article-title":"The evolution of medulloblastoma therapy to personalized medicine","volume":"6","author":"Sengupta","year":"2017","journal-title":"F1000Research"},{"key":"2023013108061307300_btz066-B34","doi-asserted-by":"crossref","first-page":"465","DOI":"10.1007\/s00401-011-0922-z","article-title":"Molecular subgroups of medulloblastoma: the current consensus","volume":"123","author":"Taylor","year":"2012","journal-title":"Acta Neuropathol."},{"key":"2023013108061307300_btz066-B35","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1016\/j.cell.2011.02.016","article-title":"Interactome networks and human disease","volume":"144","author":"Vidal","year":"2011","journal-title":"Cell"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/18\/3357\/48975671\/bioinformatics_35_18_3357.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/18\/3357\/48975671\/bioinformatics_35_18_3357.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T13:54:26Z","timestamp":1675173266000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/18\/3357\/5305636"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2019,2,1]]},"references-count":36,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2019,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz066","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,9,15]]},"published":{"date-parts":[[2019,2,1]]}}}