{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T04:18:17Z","timestamp":1773116297819,"version":"3.50.1"},"reference-count":18,"publisher":"Oxford University Press (OUP)","issue":"22","license":[{"start":{"date-parts":[[2018,9,1]],"date-time":"2018-09-01T00:00:00Z","timestamp":1535760000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["RO1 CA190766"],"award-info":[{"award-number":["RO1 CA190766"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["RO1 MD011764"],"award-info":[{"award-number":["RO1 MD011764"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["RO1 HL117191"],"award-info":[{"award-number":["RO1 HL117191"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,11,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Integrative analysis of multi-omics data from different high-throughput experimental platforms provides valuable insight into regulatory mechanisms associated with complex diseases, and gains statistical power to detect markers that are otherwise overlooked by single-platform omics analysis. In practice, a significant portion of samples may not be measured completely due to insufficient tissues or restricted budget (e.g. 
gene expression profiles are measured but not methylation). Current multi-omics integrative methods require complete data. A common practice is to ignore samples with any missing platform and perform complete case analysis, which leads to substantial loss of statistical power.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>In this article, inspired by the popular Integrative Bayesian Analysis of Genomics data (iBAG), we propose a full Bayesian model that allows incorporation of samples with missing omics data.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Simulation results show improvement of the new full Bayesian approach in terms of outcome prediction accuracy and feature selection performance when the sample size is limited and the proportion of missingness is large. When the sample size is large or the proportion of missingness is low, incorporating samples with missingness may introduce extra inference uncertainty and generate worse prediction and feature selection performance. To determine whether and how to incorporate samples with missingness, we propose a self-learning cross-validation (CV) decision scheme. 
Simulations and a real application on a child asthma dataset demonstrate superior performance of the CV decision scheme when various types of missing mechanisms are evaluated.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>Freely available on GitHub at https:\/\/github.com\/CHPGenetics\/FBM<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty775","type":"journal-article","created":{"date-parts":[[2018,9,1]],"date-time":"2018-09-01T04:01:06Z","timestamp":1535774466000},"page":"3801-3808","source":"Crossref","is-referenced-by-count":27,"title":["Bayesian integrative model for multi-omics data with missingness"],"prefix":"10.1093","volume":"34","author":[{"given":"Zhou","family":"Fang","sequence":"first","affiliation":[{"name":"Department of Biostatistics, University of Pittsburgh, Pittsburgh, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tianzhou","family":"Ma","sequence":"additional","affiliation":[{"name":"Department of Epidemiology and Biostatistics, University of Maryland, College Park, MD, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gong","family":"Tang","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Pittsburgh, Pittsburgh, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Li","family":"Zhu","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Pittsburgh, Pittsburgh, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qi","family":"Yan","sequence":"additional","affiliation":[{"name":"Division of Pediatric Pulmonology, Allergy and Immunology, Children's Hospital of Pittsburgh of UPMC, Pittsburgh, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ting","family":"Wang","sequence":"additional","affiliation":[{"name":"Division of Pediatric Pulmonology, Allergy and Immunology, Children's Hospital of Pittsburgh of UPMC, Pittsburgh, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Juan C","family":"Celed\u00f3n","sequence":"additional","affiliation":[{"name":"Division of Pediatric Pulmonology, Allergy and Immunology, Children's Hospital of Pittsburgh of UPMC, Pittsburgh, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wei","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Pittsburgh, Pittsburgh, USA"},{"name":"Division of Pediatric Pulmonology, Allergy and Immunology, Children's Hospital of Pittsburgh of UPMC, Pittsburgh, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5447-1014","authenticated-orcid":false,"given":"George C","family":"Tseng","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Pittsburgh, Pittsburgh, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2018,9,1]]},"reference":[{"key":"2023012712294593800_bty775-B1","first-page":"12","article-title":"Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes","volume":"9","author":"Brock","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012712294593800_bty775-B2","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1093\/oso\/9780198522669.003.0010","volume-title":"Bayesian Statistics 4","author":"Geweke","year":"1992"},{"key":"2023012712294593800_bty775-B3","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1214\/17-AOAS1033","article-title":"Integrative sparse K-means with overlapping group lasso in genomic applications for disease subtype 
discovery","volume":"11","author":"Huo","year":"2017","journal-title":"Ann. Appl. Stat"},{"key":"2023012712294593800_bty775-B4","doi-asserted-by":"crossref","first-page":"55","DOI":"10.2307\/3315865","article-title":"Bayesian methods for generalized linear models with covariates missing at random","volume":"30","author":"Ibrahim","year":"2002","journal-title":"Can. J. Stat"},{"key":"2023012712294593800_bty775-B5","doi-asserted-by":"crossref","first-page":"730","DOI":"10.1214\/009053604000001147","article-title":"Spike and slab variable selection: frequentist and Bayesian strategies","volume":"33","author":"Ishwaran","year":"2005","journal-title":"Ann. Stat"},{"key":"2023012712294593800_bty775-B6","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1093\/biostatistics\/kxw039","article-title":"Integrative clustering of multi-level omics data for disease subtype discovery using sequential double regularization","volume":"18","author":"Kim","year":"2017","journal-title":"Biostatistics"},{"key":"2023012712294593800_bty775-B7","doi-asserted-by":"crossref","DOI":"10.1002\/9781119013563","volume-title":"Statistical Analysis with Missing Data","author":"Little","year":"2002","edition":"2"},{"key":"2023012712294593800_bty775-B8","doi-asserted-by":"crossref","first-page":"2610","DOI":"10.1093\/bioinformatics\/btt425","article-title":"Bayesian consensus clustering","volume":"29","author":"Lock","year":"2013","journal-title":"Bioinformatics"},{"key":"2023012712294593800_bty775-B9","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1093\/biostatistics\/5.2.155","article-title":"Detecting differential gene expression with a semiparametric hierarchical mixture method","volume":"5","author":"Newton","year":"2004","journal-title":"Biostatistics"},{"key":"2023012712294593800_bty775-B10","first-page":"78","article-title":"Biological impact of missing-value imputation on downstream analyses of gene expression 
profiles","volume":"27","author":"Oh","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012712294593800_bty775-B11","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1186\/1471-2105-11-587","article-title":"Comparison of beta-value and M-value methods for quantifying methylation levels by microarray analysis","volume":"11","author":"Du","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023012712294593800_bty775-B12","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1146\/annurev-statistics-041715-033506","article-title":"Statistical methods in integrative genomics","volume":"3","author":"Richardson","year":"2016","journal-title":"Annu. Rev. Stat. Appl"},{"key":"2023012712294593800_bty775-B13","doi-asserted-by":"crossref","first-page":"2906","DOI":"10.1093\/bioinformatics\/btp543","article-title":"Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis","volume":"25","author":"Shen","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012712294593800_bty775-B14","doi-asserted-by":"crossref","first-page":"528","DOI":"10.1080\/01621459.1987.10478458","article-title":"The calculation of posterior distributions by data augmentation","volume":"82","author":"Tanner","year":"1987","journal-title":"J. Am. Stat. 
Assoc"},{"key":"2023012712294593800_bty775-B15","doi-asserted-by":"crossref","first-page":"3785","DOI":"10.1093\/nar\/gkr1265","article-title":"Comprehensive literature review and statistical considerations for microarray meta-analysis","volume":"40","author":"Tseng","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023012712294593800_bty775-B16","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781107706484","volume-title":"Integrating Omics Data","author":"Tseng","year":"2015"},{"key":"2023012712294593800_bty775-B17","doi-asserted-by":"crossref","first-page":"402","DOI":"10.1186\/s12859-016-1273-5","article-title":"Handling missing rows in multi-omics data integration: multiple imputation in multiple factor analysis framework","volume":"17","author":"Voillet","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023012712294593800_bty775-B18","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1093\/bioinformatics\/bts655","article-title":"iBAG: integrative Bayesian analysis of high-dimensional multiplatform genomics 
data","volume":"29","author":"Wang","year":"2013","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/22\/3801\/48920278\/bioinformatics_34_22_3801.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/22\/3801\/48920278\/bioinformatics_34_22_3801.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,9]],"date-time":"2024-07-09T20:53:47Z","timestamp":1720558427000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/22\/3801\/5089231"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2018,9,1]]},"references-count":18,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2018,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty775","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,11,15]]},"published":{"date-parts":[[2018,9,1]]}}}