{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,19]],"date-time":"2025-11-19T19:29:24Z","timestamp":1763580564280,"version":"3.41.2"},"reference-count":30,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2025,7,12]],"date-time":"2025-07-12T00:00:00Z","timestamp":1752278400000},"content-version":"vor","delay-in-days":11,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["12371287"],"award-info":[{"award-number":["12371287"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key R&D Program of China","doi-asserted-by":"crossref","award":["2022YFA1305400"],"award-info":[{"award-number":["2022YFA1305400"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000002","name":"U.S. National Institutes of Health","doi-asserted-by":"crossref","award":["R01 GM152812","12325110","12288201"],"award-info":[{"award-number":["R01 GM152812","12325110","12288201"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"crossref"}]},{"name":"CAS Project for Young Scientists in Basic Research","award":["YSBR-034"],"award-info":[{"award-number":["YSBR-034"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Compositional data are frequently encountered in many disciplines, such as in next-generation sequencing experiments widely used in biomedical studies. Regression analysis with compositional data as either responses or predictors has been well studied. 
However, when both responses and predictors are compositional, the inventory of analysis tools is surprisingly limited, especially in the high-dimensional setting. Among the few existing methods, most of them rely on a log-ratio transformation to move compositional data from the simplex to real numbers. Yet, a serious weakness of these methods is their failure to handle the substantial fraction of zeroes observed in data collected from next-generation sequencing experiments.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>To investigate associations between two high-dimensional multi-omics compositions, we propose a composition-on-composition (COC) regression analysis method which does not require log-ratio transformations and hence can handle zeroes in the data. To account for high dimensionality, we estimate regression coefficients using a penalized estimation equation approach. Finally, inference procedures for COC regression are also proposed. 
Superior performance of COC is demonstrated through both comprehensive numerical simulations and case studies.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Source R code implementing the COC method is available at https:\/\/github.com\/nrios4\/COC.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf387","type":"journal-article","created":{"date-parts":[[2025,7,12]],"date-time":"2025-07-12T05:24:41Z","timestamp":1752297881000},"source":"Crossref","is-referenced-by-count":1,"title":["Composition-on-composition regression analysis for multi-omics integration of metagenomic data"],"prefix":"10.1093","volume":"41","author":[{"given":"Nicholas","family":"Rios","sequence":"first","affiliation":[{"name":"Department of Statistics, George Mason University , Fairfax, VA 22030,","place":["United States"]}]},{"given":"Yuke","family":"Shi","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences , Beijing 100190,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1273-5624","authenticated-orcid":false,"given":"Jun","family":"Chen","sequence":"additional","affiliation":[{"name":"Biomedical Statistics and Informatics, Mayo Clinic , Rochester, MN 55905,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9650-143X","authenticated-orcid":false,"given":"Xiang","family":"Zhan","sequence":"additional","affiliation":[{"name":"School of Statistics and Data Science, Southeast University , Nanjing 211189,","place":["China"]}]},{"given":"Lingzhou","family":"Xue","sequence":"additional","affiliation":[{"name":"Department of Statistics, Pennsylvania State University , University Park, PA 16802,","place":["United 
States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3325-8265","authenticated-orcid":false,"given":"Qizhai","family":"Li","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences , Beijing 100190,","place":["China"]},{"name":"School of Mathematical Sciences, University of Chinese Academy of Sciences , Beijing 101408,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2025,7,12]]},"reference":[{"key":"2025072116512919700_btaf387-B1","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1111\/j.2517-6161.1982.tb01195.x","article-title":"The statistical analysis of compositional data","volume":"44","author":"Aitchison","year":"1982","journal-title":"J R Stat Soc B Methodol"},{"key":"2025072116512919700_btaf387-B2","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1093\/biomet\/71.2.323","article-title":"Log contrast models for experiments with mixtures","volume":"71","author":"Aitchison","year":"1984","journal-title":"Biometrika"},{"key":"2025072116512919700_btaf387-B3","doi-asserted-by":"crossref","first-page":"219","DOI":"10.6339\/JDS.201901_17(1).0010","article-title":"Regression for compositional data with compositional data as predictor variables with or without zero values","volume":"17","author":"Alenazi","year":"2021","journal-title":"J Data Sci"},{"key":"2025072116512919700_btaf387-B4","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1038\/nmeth.f.303","article-title":"QIIME allows analysis of high-throughput community sequencing data","volume":"7","author":"Caporaso","year":"2010","journal-title":"Nat Methods"},{"key":"2025072116512919700_btaf387-B5","doi-asserted-by":"crossref","first-page":"btae071","DOI":"10.1093\/bioinformatics\/btae071","article-title":"Zero is not absence: censoring-based differential abundance analysis for microbiome 
data","volume":"40","author":"Chan","year":"2024","journal-title":"Bioinformatics"},{"key":"2025072116512919700_btaf387-B6","doi-asserted-by":"crossref","first-page":"2270","DOI":"10.1080\/02664763.2016.1157145","article-title":"Multiple linear regression with compositional response and covariates","volume":"44","author":"Chen","year":"2017","journal-title":"J Appl Stat"},{"key":"2025072116512919700_btaf387-B7","doi-asserted-by":"crossref","first-page":"D141","DOI":"10.1093\/nar\/gkn879","article-title":"The ribosomal database project: improved alignments and new tools for rRNA analysis","volume":"37","author":"Cole","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2025072116512919700_btaf387-B8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/gb-2012-13-7-r60","article-title":"A tool kit for quantifying eukaryotic rRNA gene sequences from human microbiome samples","volume":"13","author":"Dollive","year":"2012","journal-title":"Genome Biol"},{"key":"2025072116512919700_btaf387-B9","doi-asserted-by":"crossref","first-page":"974","DOI":"10.1111\/biom.13465","article-title":"A transformation-free linear regression for compositional outcomes and predictors","volume":"78","author":"Fiksel","year":"2022","journal-title":"Biometrics"},{"key":"2025072116512919700_btaf387-B10","doi-asserted-by":"crossref","first-page":"322","DOI":"10.1016\/j.annepidem.2016.03.003","article-title":"It\u2019s all relative: analyzing microbiome data as compositions","volume":"26","author":"Gloor","year":"2016","journal-title":"Ann Epidemiol"},{"key":"2025072116512919700_btaf387-B11","doi-asserted-by":"crossref","first-page":"2224","DOI":"10.3389\/fmicb.2017.02224","article-title":"Microbiome datasets are compositional: and this is not optional","volume":"8","author":"Gloor","year":"2017","journal-title":"Front 
Microbiol"},{"key":"2025072116512919700_btaf387-B12","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1146\/annurev-statistics-042720-124436","article-title":"Compositional data analysis","volume":"8","author":"Greenacre","year":"2021","journal-title":"Annu Rev Stat Appl"},{"key":"2025072116512919700_btaf387-B13","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1007\/s11634-024-00600-x","article-title":"The chiPower transformation: a valid alternative to logratio transformations in compositional data analysis","volume":"18","author":"Greenacre","year":"2024","journal-title":"Adv Data Anal Classif"},{"key":"2025072116512919700_btaf387-B14","doi-asserted-by":"crossref","first-page":"210","DOI":"10.1093\/bib\/bbx104","article-title":"A broken promise: microbiome differential abundance methods do not control the false discovery rate","volume":"20","author":"Hawinkel","year":"2019","journal-title":"Brief Bioinform"},{"key":"2025072116512919700_btaf387-B15","doi-asserted-by":"crossref","first-page":"e66019","DOI":"10.1371\/journal.pone.0066019","article-title":"Archaea and fungi of the human gut microbiome: correlations with diet and bacterial residents","volume":"8","author":"Hoffmann","year":"2013","journal-title":"PLoS One"},{"key":"2025072116512919700_btaf387-B16","doi-asserted-by":"crossref","first-page":"792","DOI":"10.1080\/01621459.2022.2151447","article-title":"A flexible zero-inflated Poisson-gamma model with application to microbiome sequence count data","volume":"118","author":"Jiang","year":"2023","journal-title":"J Am Stat Assoc"},{"key":"2025072116512919700_btaf387-B17","doi-asserted-by":"crossref","first-page":"1094","DOI":"10.1080\/01621459.2017.1307116","article-title":"Distribution-free predictive inference for regression","volume":"113","author":"Lei","year":"2018","journal-title":"J Am Stat 
Assoc"},{"key":"2025072116512919700_btaf387-B18","doi-asserted-by":"crossref","first-page":"1318","DOI":"10.1111\/biom.13703","article-title":"It\u2019s all relative: regression analysis with compositional predictors","volume":"79","author":"Li","year":"2023","journal-title":"Biometrics"},{"key":"2025072116512919700_btaf387-B19","doi-asserted-by":"crossref","first-page":"785","DOI":"10.1093\/biomet\/asu031","article-title":"Variable selection in regression with compositional covariates","volume":"101","author":"Lin","year":"2014","journal-title":"Biometrika"},{"key":"2025072116512919700_btaf387-B20","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1186\/s13059-015-0637-x","article-title":"Associations between host gene expression, the mucosal microbiome, and clinical outcome in the pelvic pouch of patients with inflammatory bowel disease","volume":"16","author":"Morgan","year":"2015","journal-title":"Genome Biol"},{"key":"2025072116512919700_btaf387-B21","doi-asserted-by":"crossref","first-page":"1306","DOI":"10.1038\/s41592-019-0616-3","article-title":"Learning representations of microbe\u2013metabolite interactions","volume":"16","author":"Morton","year":"2019","journal-title":"Nat Methods"},{"key":"2025072116512919700_btaf387-B22","doi-asserted-by":"crossref","first-page":"2719","DOI":"10.1038\/s41467-019-10656-5","article-title":"Establishing microbial composition measurement standards with reference frames","volume":"10","author":"Morton","year":"2019","journal-title":"Nat Commun"},{"key":"2025072116512919700_btaf387-B23","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1038\/s41467-022-28034-z","article-title":"Microbiome differential abundance methods produce different results across 38 datasets","volume":"13","author":"Nearing","year":"2022","journal-title":"Nat Commun"},{"key":"2025072116512919700_btaf387-B24","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1144\/GSL.SP.2006.264.01.01","article-title":"Compositional data and their 
analysis: an introduction","volume":"264","author":"Pawlowsky-Glahn","year":"2006","journal-title":"SP"},{"key":"2025072116512919700_btaf387-B25","doi-asserted-by":"crossref","first-page":"3253","DOI":"10.1214\/24-AOAS1935","article-title":"A latent variable mixture model for composition-on-composition regression with application to chemical recycling","volume":"18","author":"Rios","year":"2024","journal-title":"Ann Appl Stat"},{"key":"2025072116512919700_btaf387-B26","first-page":"371","article-title":"A tutorial on conformal prediction","volume":"9","author":"Shafer","year":"2008","journal-title":"J Mach Learn Res"},{"key":"2025072116512919700_btaf387-B27","doi-asserted-by":"crossref","first-page":"1019","DOI":"10.1214\/16-AOAS928","article-title":"Regression analysis for microbiome compositional data","volume":"10","author":"Shi","year":"2016","journal-title":"Ann Appl Stat"},{"key":"2025072116512919700_btaf387-B28","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1038\/nature24460","article-title":"Quantitative microbiome profiling links gut community variation to microbial load","volume":"551","author":"Vandeputte","year":"2017","journal-title":"Nature"},{"volume-title":"Algorithmic Learning in a Random World","year":"2005","author":"Vovk","key":"2025072116512919700_btaf387-B29"},{"key":"2025072116512919700_btaf387-B30","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1126\/science.1208344","article-title":"Linking long-term dietary patterns with gut microbial 
enterotypes","volume":"334","author":"Wu","year":"2011","journal-title":"Science"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf387\/63731789\/btaf387.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/7\/btaf387\/63731789\/btaf387.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/7\/btaf387\/63731789\/btaf387.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T20:51:37Z","timestamp":1753131097000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf387\/8197195"}},"subtitle":[],"editor":[{"given":"Peter","family":"Robinson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,7,1]]},"references-count":30,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2025,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf387","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,7,1]]},"article-number":"btaf387"}}