{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:02Z","timestamp":1772138042049,"version":"3.50.1"},"reference-count":30,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2021,1,22]],"date-time":"2021-01-22T00:00:00Z","timestamp":1611273600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01GM116065"],"award-info":[{"award-number":["R01GM116065"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,7,19]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Many methods for testing association between the microbiome and covariates of interest (e.g. clinical outcomes, environmental factors) assume that these associations are driven by changes in the relative abundance of taxa. However, these associations may also result from changes in which taxa are present and which are absent. Analyses of such presence\u2013absence associations face a unique challenge: confounding by library size (total sample read count), which occurs when library size is associated with covariates in the analysis. It is known that rarefaction (subsampling to a common library size) controls this bias, but at the potential cost of information loss as well as the introduction of a stochastic component into the analysis. 
Currently, there is a need for robust and efficient methods for testing presence\u2013absence associations in the presence of such confounding, both at the community level and at the individual-taxon level, that avoid the drawbacks of rarefaction.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We have previously developed the linear decomposition model (LDM) that unifies the community-level and taxon-level tests into one framework. Here, we present an extension of the LDM for testing presence\u2013absence associations. The extended LDM is a non-stochastic approach that repeatedly applies the LDM to all rarefied taxa count tables, averages the residual sum-of-squares (RSS) terms over the rarefaction replicates, and then forms an F-statistic based on these average RSS terms. We show that this approach compares favorably to averaging the F-statistic from R rarefaction replicates, which can only be calculated stochastically. The flexible nature of the LDM allows discrete or continuous traits or interactions to be tested while allowing confounding covariates to be adjusted for. Our simulations indicate that our proposed method is robust to any systematic differences in library size and has better power than alternative approaches. 
We illustrate our method using an analysis of data on inflammatory bowel disease (IBD) in which cases have systematically smaller library sizes than controls.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The R package LDM is available on GitHub at https:\/\/github.com\/yijuanhu\/LDM in formats appropriate for Macintosh or Windows.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab012","type":"journal-article","created":{"date-parts":[[2021,1,5]],"date-time":"2021-01-05T17:31:42Z","timestamp":1609867902000},"page":"1652-1657","source":"Crossref","is-referenced-by-count":15,"title":["A rarefaction-based extension of the LDM for testing presence\u2013absence associations in the microbiome"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2171-9041","authenticated-orcid":false,"given":"Yi-Juan","family":"Hu","sequence":"first","affiliation":[{"name":"Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA"}]},{"given":"Andrea","family":"Lane","sequence":"additional","affiliation":[{"name":"Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7275-5371","authenticated-orcid":false,"given":"Glen A","family":"Satten","sequence":"additional","affiliation":[{"name":"Department of Gynecology and Obstetrics, Emory University School of Medicine, Atlanta, GA 30322, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2021,1,22]]},"reference":[{"key":"2023051709560106600_btab012-B2","doi-asserted-by":"crossref","first-page":"2611","DOI":"10.1093\/bioinformatics\/btw308","article-title":"A two-part mixed-effects model for analyzing longitudinal microbiome compositional data","volume":"32","author":"Chen","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051709560106600_btab012-B4","doi-asserted-by":"crossref","first-page":"643","DOI":"10.1093\/bioinformatics\/btx650","article-title":"An omnibus test for differential distribution analysis of microbiome sequencing data","volume":"34","author":"Chen","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051709560106600_btab012-B3"},{"key":"2023051709560106600_btab012-B5","doi-asserted-by":"crossref","first-page":"R88","DOI":"10.1093\/hmg\/ddt398","article-title":"Sequencing the human microbiome in health and disease","volume":"22","author":"Cox","year":"2013","journal-title":"Hum. Mol. 
Genet"},{"key":"2023051709560106600_btab012-B6","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/2049-2618-2-15","article-title":"Unifying the analysis of high-throughput sequencing datasets: characterizing rna-seq, 16s rrna gene sequencing and selective growth experiments by compositional data analysis","volume":"2","author":"Fernandes","year":"2014","journal-title":"Microbiome"},{"key":"2023051709560106600_btab012-B7","doi-asserted-by":"crossref","first-page":"382","DOI":"10.1016\/j.chom.2014.02.005","article-title":"The treatment-naive microbiome in new-onset crohn\u2019s disease","volume":"15","author":"Gevers","year":"2014","journal-title":"Cell Host Microbe"},{"key":"2023051709560106600_btab012-B8","doi-asserted-by":"crossref","first-page":"4106","DOI":"10.1093\/bioinformatics\/btaa260","article-title":"Testing hypotheses about the microbiome using the linear decomposition model (ldm)","volume":"36","author":"Hu","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051709560106600_btab012-B9","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1016\/S0076-6879(05)97017-1","article-title":"The application of rarefaction techniques to molecular inventories of microbial diversity","volume":"397","author":"Hughes","year":"2005","journal-title":"Methods Enzymol"},{"key":"2023051709560106600_btab012-B10","doi-asserted-by":"crossref","first-page":"2114","DOI":"10.3389\/fmicb.2017.02114","article-title":"Analysis of microbiome data in the presence of excess zeros","volume":"8","author":"Kaul","year":"2017","journal-title":"Front. 
Microbiol"},{"key":"2023051709560106600_btab012-B11","doi-asserted-by":"crossref","first-page":"292","DOI":"10.1101\/gr.126573.111","article-title":"Genomic analysis identifies association of fusobacterium with colorectal carcinoma","volume":"22","author":"Kostic","year":"2012","journal-title":"Genome Res"},{"key":"2023051709560106600_btab012-B12","volume-title":"Numerical Ecology, 3rd Edition","author":"Legendre","year":"2012"},{"key":"2023051709560106600_btab012-B13","doi-asserted-by":"crossref","first-page":"550","DOI":"10.1186\/s13059-014-0550-8","article-title":"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2","volume":"15","author":"Love","year":"2014","journal-title":"Genome Biol"},{"key":"2023051709560106600_btab012-B14","doi-asserted-by":"crossref","first-page":"8228","DOI":"10.1128\/AEM.71.12.8228-8235.2005","article-title":"Unifrac: a new phylogenetic method for comparing microbial communities","volume":"71","author":"Lozupone","year":"2005","journal-title":"Appl. Environ. Microbiol"},{"key":"2023051709560106600_btab012-B15","doi-asserted-by":"crossref","first-page":"1576","DOI":"10.1128\/AEM.01996-06","article-title":"Quantitative and qualitative \u03b2 diversity measures lead to different insights into factors that structure microbial communities","volume":"73","author":"Lozupone","year":"2007","journal-title":"Appl. Environ. Microbiol"},{"key":"2023051709560106600_btab012-B16","first-page":"27663","article-title":"Analysis of composition of microbiomes: a novel method for studying microbial composition","volume":"26","author":"Mandal","year":"2015","journal-title":"Microb. Ecol. 
Health Dis"},{"key":"2023051709560106600_btab012-B17","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1890\/0012-9658(2001)082[0290:FMMTCD]2.0.CO;2","article-title":"Fitting multivariate models to community data: a comment on distance-based redundancy analysis","volume":"82","author":"McArdle","year":"2001","journal-title":"Ecology"},{"key":"2023051709560106600_btab012-B18","doi-asserted-by":"crossref","first-page":"e46923","DOI":"10.7554\/eLife.46923","article-title":"Consistent and correctable bias in metagenomic sequencing experiments","volume":"8","author":"McLaren","year":"2019","journal-title":"Elife"},{"key":"2023051709560106600_btab012-B19","first-page":"371","volume-title":"Methods in Enzymology","author":"Navas-Molina","year":"2013"},{"key":"2023051709560106600_btab012-B20","doi-asserted-by":"crossref","first-page":"e39242","DOI":"10.1371\/journal.pone.0039242","article-title":"Non-invasive mapping of the gastrointestinal microbiota identifies children with inflammatory bowel disease","volume":"7","author":"Papa","year":"2012","journal-title":"PLoS One"},{"key":"2023051709560106600_btab012-B21","doi-asserted-by":"crossref","first-page":"1200","DOI":"10.1038\/nmeth.2658","article-title":"Differential abundance analysis for microbial marker-gene surveys","volume":"10","author":"Paulson","year":"2013","journal-title":"Nat. Methods"},{"key":"2023051709560106600_btab012-B22","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1111\/j.1753-4887.2012.00489.x","article-title":"The human microbiome: ecosystem resilience and health","volume":"70","author":"Relman","year":"2012","journal-title":"Nutr. 
Rev"},{"key":"2023051709560106600_btab012-B23","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1093\/bioinformatics\/btp616","article-title":"edgeR: a Bioconductor package for differential expression analysis of digital gene expression data","volume":"26","author":"Robinson","year":"2010","journal-title":"Bioinformatics"},{"key":"2023051709560106600_btab012-B24","doi-asserted-by":"crossref","first-page":"417","DOI":"10.3389\/fmicb.2012.00417","article-title":"Fundamentals of microbial community resistance and resilience","volume":"3","author":"Shade","year":"2012","journal-title":"Front. Microbiol"},{"key":"2023051709560106600_btab012-B25","doi-asserted-by":"crossref","first-page":"1971","DOI":"10.1002\/ibd.21606","article-title":"Invasive potential of gut mucosa-derived fusobacterium nucleatum positively correlates with ibd status of the host","volume":"17","author":"Strauss","year":"2011","journal-title":"Inflam. Bowel Dis"},{"key":"2023051709560106600_btab012-B26","doi-asserted-by":"crossref","first-page":"698","DOI":"10.1093\/biostatistics\/kxy025","article-title":"Zero-inflated generalized Dirichlet multinomial regression model for microbiome compositional data analysis","volume":"20","author":"Tang","year":"2019","journal-title":"Biostatistics"},{"key":"2023051709560106600_btab012-B27","doi-asserted-by":"crossref","first-page":"1278","DOI":"10.1093\/bioinformatics\/btw804","article-title":"A general framework for association analysis of microbial communities on a taxonomic tree","volume":"33","author":"Tang","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051709560106600_btab012-B28","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1186\/s40168-017-0237-y","article-title":"Normalization and microbial differential abundance strategies depend upon data characteristics","volume":"5","author":"Weiss","year":"2017","journal-title":"Microbiome"},{"key":"2023051709560106600_btab012-B29","volume-title":"Resampling-Based Multiple Testing: 
Examples and Methods for p-Value Adjustment","author":"Westfall","year":"1993"},{"key":"2023051709560106600_btab012-B30","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1111\/1471-0528.14390","article-title":"Why do lactobacilli dominate the human vaginal microbiota?","volume":"124","author":"Witkin","year":"2017","journal-title":"BJOG Int. J. Obstetr. Gynaecol"},{"key":"2023051709560106600_btab012-B31","author":"Zhu","year":"2020"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab012\/36113910\/btab012.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/12\/1652\/50361394\/btab012.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/12\/1652\/50361394\/btab012.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T06:40:50Z","timestamp":1684305650000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/12\/1652\/6105226"}},"subtitle":[],"editor":[{"given":"Peter","family":"Robinson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,1,22]]},"references-count":30,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2021,7,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab012","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.05.26.117879","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,6,15]]},"published":
{"date-parts":[[2021,1,22]]}}}