{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:43:39Z","timestamp":1753875819196,"version":"3.41.2"},"reference-count":53,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2021,10,26]],"date-time":"2021-10-26T00:00:00Z","timestamp":1635206400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,17]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>As our understanding of the microbiome has expanded, so has the recognition of its critical role in human health and disease, thereby emphasizing the importance of testing whether microbes are associated with environmental factors or clinical outcomes. However, many of the fundamental challenges that concern microbiome surveys arise from statistical and experimental design issues, such as the sparse and overdispersed nature of microbiome count data and the complex correlation structure among samples. For example, in the human microbiome project (HMP) dataset, the repeated observations across time points (level 1) are nested within body sites (level 2), which are further nested within subjects (level 3). Therefore, there is a great need for the development of specialized and sophisticated statistical tests. In this paper, we propose multilevel zero-inflated negative-binomial models for association analysis in microbiome surveys. We develop a variational approximation method for maximum likelihood estimation and inference. It uses optimization, rather than sampling, to approximate the log-likelihood and compute parameter estimates, provides a robust estimate of the covariance of parameter estimates and constructs a Wald-type test statistic for association testing. We evaluate and demonstrate the performance of our method using extensive simulation studies and an application to the HMP dataset. We have developed an R package MZINBVA to implement the proposed method, which is available from the GitHub repository https:\/\/github.com\/liudoubletian\/MZINBVA.<\/jats:p>","DOI":"10.1093\/bib\/bbab443","type":"journal-article","created":{"date-parts":[[2021,9,30]],"date-time":"2021-09-30T13:33:48Z","timestamp":1633008828000},"source":"Crossref","is-referenced-by-count":5,"title":["MZINBVA: variational approximation for multilevel zero-inflated negative-binomial models for association analysis in microbiome surveys"],"prefix":"10.1093","volume":"23","author":[{"given":"Tiantian","family":"Liu","sequence":"first","affiliation":[{"name":"SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan RD, 200240, Shanghai, China"}]},{"given":"Peirong","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of Breast Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 200127, Shanghai, China"}]},{"given":"Yueyao","family":"Du","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Yale University, 60 College Stree, CT 06520, New Haven, USA"},{"name":"MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, 800 Dongchuan RD, 200240, Shanghai, China"}]},{"given":"Hui","family":"Lu","sequence":"additional","affiliation":[{"name":"SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan RD, 200240, Shanghai, China"}]},{"given":"Hongyu","family":"Zhao","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Yale University, 60 College Stree, CT 06520, New Haven, USA"}]},{"given":"Tao","family":"Wang","sequence":"additional","affiliation":[{"name":"SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan RD, 200240, Shanghai, China"}]}],"member":"286","published-online":{"date-parts":[[2021,10,26]]},"reference":[{"issue":"6289","key":"2022012508571939800_ref1","doi-asserted-by":"crossref","first-page":"1116","DOI":"10.1126\/science.aad9948","article-title":"Gene-microbiota interactions contribute to the pathogenesis of inflammatory bowel disease","volume":"352","author":"Chu","year":"2016","journal-title":"Science"},{"issue":"6285","key":"2022012508571939800_ref2","doi-asserted-by":"crossref","first-page":"560","DOI":"10.1126\/science.aad3503","article-title":"Population-level analysis of gut microbiome variation","volume":"352","author":"Falony","year":"2016","journal-title":"Science"},{"issue":"4","key":"2022012508571939800_ref3","doi-asserted-by":"crossref","first-page":"611","DOI":"10.1016\/j.cmet.2017.09.008","article-title":"Improvement of insulin sensitivity after lean donor feces in metabolic syndrome is driven by baseline intestinal microbiota composition","volume":"26","author":"Kootte","year":"2017","journal-title":"Cell Metab"},{"issue":"6","key":"2022012508571939800_ref4","doi-asserted-by":"crossref","first-page":"968","DOI":"10.1038\/s41591-019-0458-7","article-title":"Metagenomic and metabolomic analyses reveal distinct stage-specific phenotypes of the gut microbiota in colorectal cancer","volume":"25","author":"Yachida","year":"2019","journal-title":"Nat Med"},{"issue":"9","key":"2022012508571939800_ref5","doi-asserted-by":"crossref","first-page":"2302","DOI":"10.1016\/j.cell.2021.03.024","article-title":"The long-term genetic stability and individual specificity of the human gut microbiome","volume":"184","author":"Chen","year":"2021","journal-title":"Cell"},{"issue":"12","key":"2022012508571939800_ref6","doi-asserted-by":"crossref","first-page":"1200","DOI":"10.1038\/nmeth.2658","article-title":"Differential abundance analysis for microbial marker-gene surveys","volume":"10","author":"Paulson","year":"2013","journal-title":"Nat Methods"},{"issue":"2","key":"2022012508571939800_ref7","doi-asserted-by":"crossref","first-page":"250","DOI":"10.1016\/j.cell.2014.06.037","article-title":"Conducting a microbiome study","volume":"158","author":"Goodrich","year":"2014","journal-title":"Cell"},{"issue":"7","key":"2022012508571939800_ref8","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1038\/s41579-018-0029-9","article-title":"Best practices for analysing microbiomes","volume":"16","author":"Knight","year":"2018","journal-title":"Nat Rev Microbiol"},{"issue":"12","key":"2022012508571939800_ref9","doi-asserted-by":"crossref","first-page":"2317","DOI":"10.1101\/gr.096651.109","article-title":"The NIH human microbiome project","volume":"19","author":"Peterson","year":"2009","journal-title":"Genome Res"},{"issue":"4","key":"2022012508571939800_ref10","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1093\/bioinformatics\/btx650","article-title":"An omnibus test for differential distribution analysis of microbiome sequencing data","volume":"34","author":"Chen","year":"2018","journal-title":"Bioinformatics"},{"issue":"2","key":"2022012508571939800_ref11","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1089\/cmb.2015.0157","article-title":"Zero-inflated beta regression for differential abundance analysis with metagenomics data","volume":"23","author":"Peng","year":"2016","journal-title":"J Comput Biol"},{"issue":"1","key":"2022012508571939800_ref12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-019-2744-2","article-title":"metamicrobiomeR: an R package for analysis of microbiome relative abundance data using zero-inflated beta GAMLSS and meta-analysis across studies using random effects models","volume":"20","author":"Ho","year":"2019","journal-title":"BMC Bioinfo"},{"issue":"7","key":"2022012508571939800_ref13","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1006329","article-title":"A marginalized two-part beta regression model for microbiome compositional data","volume":"14","author":"Chai","year":"2018","journal-title":"PLoS Comput Biol"},{"issue":"1","key":"2022012508571939800_ref14","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1111\/j.0006-341X.2001.00219.x","article-title":"Dem\u00e9trio. A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives","volume":"57","author":"Ridout","year":"2001","journal-title":"Biometrics"},{"issue":"2","key":"2022012508571939800_ref15","article-title":"Zero-inflated negative binomial regression for differential abundance testing in microbiome studies","volume":"2","author":"Zhang","year":"2016","journal-title":"J Bioinfo Gen"},{"key":"2022012508571939800_ref16","doi-asserted-by":"crossref","DOI":"10.7717\/peerj.4600","article-title":"GMPR: a robust normalization method for zero-inflated count data with application to microbiome sequencing data","volume":"6","author":"Chen","year":"2018","journal-title":"PeerJ"},{"issue":"13","key":"2022012508571939800_ref17","doi-asserted-by":"crossref","first-page":"3959","DOI":"10.1093\/bioinformatics\/btaa255","article-title":"A novel normalization and differential abundance test framework for microbiome data","volume":"36","author":"Ma","year":"2020","journal-title":"Bioinformatics"},{"key":"2022012508571939800_ref18","doi-asserted-by":"publisher","DOI":"10.1093\/biostatistics\/kxz050","article-title":"A Bayesian zero-inflated negative binomial regression model for the integrative analysis of microbiome data","author":"Jiang","journal-title":"Biostatistics"},{"issue":"17","key":"2022012508571939800_ref19","doi-asserted-by":"crossref","first-page":"2611","DOI":"10.1093\/bioinformatics\/btw308","article-title":"A two-part mixed-effects model for analyzing longitudinal microbiome compositional data","volume":"32","author":"Chen","year":"2016","journal-title":"Bioinformatics"},{"key":"2022012508571939800_ref20","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1007\/s12561-021-09302-w","article-title":"Modeling longitudinal microbiome compositional data: a two-part linear mixed model with shared random effects","volume":"13","author":"Han","year":"2021","journal-title":"Stat Biosci"},{"issue":"11","key":"2022012508571939800_ref21","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0242073","article-title":"Zero-inflated Gaussian mixed models for analyzing longitudinal microbiome data","volume":"15","author":"Zhang","year":"2020","journal-title":"PLoS One"},{"issue":"2","key":"2022012508571939800_ref22","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1214\/18-STS681","article-title":"Statistical analysis of zero-inflated nonnegative continuous data: a review","volume":"34","author":"Liu","year":"2019","journal-title":"Stat Sci"},{"issue":"1","key":"2022012508571939800_ref23","first-page":"1","article-title":"The composition and stability of the vaginal microbiota of normal pregnant women is different from that of non-pregnant women","volume":"2","author":"Romero","year":"2014","journal-title":"Microbiome"},{"issue":"11","key":"2022012508571939800_ref24","doi-asserted-by":"crossref","first-page":"2447","DOI":"10.1017\/S0950268816000662","article-title":"Zero-inflated negative binomial mixed model: an application to two microbial organisms important in oesophagitis","volume":"144","author":"Fang","year":"2016","journal-title":"Epidemiol Infect"},{"key":"2022012508571939800_ref25","doi-asserted-by":"crossref","DOI":"10.1201\/9780203489437","volume-title":"Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models","author":"Skrondal","year":"2004"},{"issue":"2","key":"2022012508571939800_ref26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i02","article-title":"MCMC methods for multi-response generalised linear mixed models: the MCMCglmm R package","volume":"33","author":"Hadfield","year":"2010","journal-title":"J Stat Softw"},{"key":"2022012508571939800_ref27","article-title":"GLMMadaptive: generalized linear mixed models using adaptive Gaussian quadrature","author":"Rizopoulos","year":"2021","journal-title":"R package version 0.8\u20132."},{"issue":"2","key":"2022012508571939800_ref28","doi-asserted-by":"crossref","first-page":"378","DOI":"10.32614\/RJ-2017-066","article-title":"glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling","volume":"9","author":"Brooks","year":"2017","journal-title":"R J"},{"volume-title":"Pattern Recognition and Machine Learning","year":"2006","author":"Bishop","key":"2022012508571939800_ref29"},{"issue":"8","key":"2022012508571939800_ref30","doi-asserted-by":"crossref","first-page":"2345","DOI":"10.1093\/bioinformatics\/btz973","article-title":"Fast zero-inflated negative binomial mixed modeling approach for analyzing longitudinal metagenomics data","volume":"36","author":"Zhang","year":"2020","journal-title":"Bioinformatics"},{"volume-title":"Linear and Generalized Linear Mixed Models and Their Applications","year":"2007","author":"Jiang","key":"2022012508571939800_ref31"},{"issue":"1","key":"2022012508571939800_ref32","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1198\/jcgs.2011.09118","article-title":"Gaussian variational approximate inference for generalized linear mixed models","volume":"21","author":"Ormerod","year":"2012","journal-title":"J Comput Graph Stat"},{"key":"2022012508571939800_ref33","doi-asserted-by":"crossref","first-page":"140","DOI":"10.1198\/tast.2010.09058","article-title":"Explaining variational approximations","volume":"64","author":"Ormerod","year":"2010","journal-title":"Am Stat"},{"issue":"518","key":"2022012508571939800_ref34","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1080\/01621459.2017.1285773","article-title":"Variational inference: a review for statisticians","volume":"112","author":"Blei","year":"2017","journal-title":"J Am Stat Assoc"},{"issue":"3","key":"2022012508571939800_ref35","doi-asserted-by":"crossref","first-page":"693","DOI":"10.1007\/s11336-017-9555-z","article-title":"A variational maximization-maximization algorithm for generalized linear mixed models with crossed random effects","volume":"82","author":"Jeon","year":"2017","journal-title":"Psychometrika"},{"issue":"2","key":"2022012508571939800_ref36","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1111\/j.2517-6161.1982.tb01203.x","article-title":"Finding the observed information matrix when using the EM algorithm","volume":"44","author":"Louis","year":"1982","journal-title":"J R Stat Soc Ser B"},{"issue":"528","key":"2022012508571939800_ref37","doi-asserted-by":"crossref","first-page":"1765","DOI":"10.1080\/01621459.2018.1518235","article-title":"Semiparametric regression using variational approximations","volume":"114","author":"Hui","year":"2019","journal-title":"J Am Stat Assoc"},{"issue":"4","key":"2022012508571939800_ref38","doi-asserted-by":"crossref","first-page":"778","DOI":"10.1080\/10618600.2019.1609977","article-title":"Beyond prediction: a framework for inference with variational approximations in mixture models","volume":"28","author":"Westling","year":"2019","journal-title":"J Comput Graph Stat"},{"issue":"527","key":"2022012508571939800_ref39","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1080\/01621459.2018.1473776","article-title":"Frequentist consistency of variational bayes","volume":"114","author":"Wang","year":"2019","journal-title":"J Am Stat Assoc"},{"issue":"1","key":"2022012508571939800_ref40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-020-03803-z","article-title":"NBZIMM: negative binomial and zero-inflated mixed models, with application to microbiome\/metagenomics data analysis","volume":"21","author":"Zhang","year":"2020","journal-title":"BMC Bioinfo"},{"key":"2022012508571939800_ref41","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1186\/s13059-014-0550-8","article-title":"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2","volume":"15","author":"Love","year":"2014","journal-title":"Genome Biol"},{"issue":"4","key":"2022012508571939800_ref42","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1038\/nmeth.3805","article-title":"iCOBRA: open, reproducible, standardized and live method benchmarking","volume":"13","author":"Soneson","year":"2016","journal-title":"Nat Methods"},{"issue":"4","key":"2022012508571939800_ref43","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0224909","article-title":"Sequence count data are poorly fit by the negative binomial distribution","volume":"15","author":"Hawinkel","year":"2020","journal-title":"PLoS One"},{"issue":"10","key":"2022012508571939800_ref44","doi-asserted-by":"crossref","first-page":"3276","DOI":"10.1093\/bioinformatics\/btaa105","article-title":"SPsimSeq: semi-parametric simulation of bulk and single-cell RNA-sequencing data","volume":"36","author":"Assefa","year":"2020","journal-title":"Bioinformatics"},{"issue":"4","key":"2022012508571939800_ref45","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1003531","article-title":"Waste not, want not: why rarefying microbiome data is inadmissible","volume":"10","author":"McMurdie","year":"2014","journal-title":"PLoS Comput Biol"},{"issue":"7758","key":"2022012508571939800_ref46","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1038\/s41586-019-1238-8","article-title":"The integrative human microbiome project","volume":"569","author":"Proctor","year":"2019","journal-title":"Nature"},{"issue":"4","key":"2022012508571939800_ref47","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0061217","article-title":"phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data","volume":"8","author":"McMurdie","year":"2013","journal-title":"PLoS One"},{"issue":"5","key":"2022012508571939800_ref48","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2014-15-5-r66","article-title":"Exploration of bacterial community classes in major human habitats","volume":"15","author":"Zhou","year":"2014","journal-title":"Genome Biol"},{"key":"2022012508571939800_ref49","doi-asserted-by":"crossref","first-page":"767","DOI":"10.3389\/fmicb.2019.00767","article-title":"Gender-specific associations between saliva microbiota and body size","volume":"10","author":"Raju","year":"2019","journal-title":"Front Microbiol"},{"key":"2022012508571939800_ref50","doi-asserted-by":"crossref","DOI":"10.7717\/peerj.4458","article-title":"Characterization of the salivary microbiome in people with obesity","volume":"6","author":"Wu","year":"2018","journal-title":"PeerJ"},{"issue":"11","key":"2022012508571939800_ref51","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0181693","article-title":"Intestinal Ralstonia pickettii augments glucose intolerance in obesity","volume":"12","author":"Udayappan","year":"2017","journal-title":"PLoS One"},{"issue":"10","key":"2022012508571939800_ref52","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0204724","article-title":"Obesity alters composition and diversity of the oral microbiota in patients with type 2 diabetes mellitus independently of glycemic control","volume":"13","author":"Tam","year":"2018","journal-title":"PLoS One"},{"issue":"13","key":"2022012508571939800_ref53","doi-asserted-by":"crossref","first-page":"13090","DOI":"10.18632\/aging.103399","article-title":"Changes of saliva microbiota in the onset and after the treatment of diabetes in patients with periodontitis","volume":"12","author":"Yang","year":"2020","journal-title":"Aging (Albany NY)"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/1\/bbab443\/42258638\/bbab443.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/23\/1\/bbab443\/42258638\/bbab443.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,9]],"date-time":"2024-09-09T02:32:35Z","timestamp":1725849155000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbab443\/6409694"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,26]]},"references-count":53,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,1,17]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbab443","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"type":"print","value":"1467-5463"},{"type":"electronic","value":"1477-4054"}],"subject":[],"published-other":{"date-parts":[[2022,1]]},"published":{"date-parts":[[2021,10,26]]},"article-number":"bbab443"}}