{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T06:26:25Z","timestamp":1769927185503,"version":"3.49.0"},"reference-count":59,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2024,12,21]],"date-time":"2024-12-21T00:00:00Z","timestamp":1734739200000},"content-version":"vor","delay-in-days":29,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"National Institute of Health","award":["R01ES027013"],"award-info":[{"award-number":["R01ES027013"]}]},{"name":"National Institute of Health","award":["R01AI149754"],"award-info":[{"award-number":["R01AI149754"]}]},{"DOI":"10.13039\/100000199","name":"United States Department of Agriculture","doi-asserted-by":"publisher","award":["ARZT-1361620-H22-149"],"award-info":[{"award-number":["ARZT-1361620-H22-149"]}],"id":[{"id":"10.13039\/100000199","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,11,22]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Sequencing-based microbial count data analysis is a challenging task due to the presence of numerous non-biological zeros, which can impede downstream analysis. To tackle this issue, we introduce two novel approaches, PhyImpute and UniFracImpute, which leverage similar microbial samples to identify and impute non-biological zeros in microbial count data. Our proposed methods utilize the probability of non-biological zeros and phylogenetic trees to estimate sample-to-sample similarity, thus addressing this challenge. To evaluate the performance of our proposed methods, we conduct experiments using both simulated and real microbial data. 
The results demonstrate that PhyImpute and UniFracImpute outperform existing methods in recovering the zeros and empowering downstream analyses such as differential abundance analysis, and disease status classification.<\/jats:p>","DOI":"10.1093\/bib\/bbae653","type":"journal-article","created":{"date-parts":[[2024,12,21]],"date-time":"2024-12-21T23:01:46Z","timestamp":1734822106000},"source":"Crossref","is-referenced-by-count":1,"title":["PhyImpute and UniFracImpute: two imputation approaches incorporating phylogeny information for microbial count data"],"prefix":"10.1093","volume":"26","author":[{"given":"Qianwen","family":"Luo","sequence":"first","affiliation":[{"name":"Department of Biosystems Engineering, University of Arizona , Tucson, AZ 85721 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shanshan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Interdisciplinary Program in Statistics and Data Science, University of Arizona , Tucson, AZ 85721 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hamza","family":"Butt","sequence":"additional","affiliation":[{"name":"Department of Epidemiology and Biostatistics, University of Arizona , Tucson, AZ 85721 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yin","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Pharmacology and Toxicology, School of Pharmacy, University of Arizona , Tucson, AZ 85721 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hongmei","family":"Jiang","sequence":"additional","affiliation":[{"name":"Department of Statistics and Data Science, Northwestern University , Evanston, IL 60208 ,","place":["United 
States"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8273-0776","authenticated-orcid":false,"given":"Lingling","family":"An","sequence":"additional","affiliation":[{"name":"Department of Biosystems Engineering, University of Arizona , Tucson, AZ 85721 ,","place":["United States"]},{"name":"Interdisciplinary Program in Statistics and Data Science, University of Arizona , Tucson, AZ 85721 ,","place":["United States"]},{"name":"Department of Epidemiology and Biostatistics, University of Arizona , Tucson, AZ 85721 ,","place":["United States"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,12,21]]},"reference":[{"key":"2024122123012963700_ref1","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1186\/s40168-020-00875-0","article-title":"Microbiome definition re-visited: old concepts and new challenges","volume":"8","author":"Berg","year":"2020","journal-title":"Microbiome"},{"key":"2024122123012963700_ref2","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1007\/s12561-021-09307-5","article-title":"Introduction to special issue on statistics in microbiome and metagenomics","volume":"13","author":"Li","year":"2021","journal-title":"Stat Biosci"},{"key":"2024122123012963700_ref3","doi-asserted-by":"crossref","first-page":"2114","DOI":"10.3389\/fmicb.2017.02114","article-title":"Analysis of microbiome data in the presence of excess zeros","volume":"8","author":"Kaul","year":"2017","journal-title":"Front Microbiol"},{"key":"2024122123012963700_ref4","doi-asserted-by":"publisher","first-page":"8","DOI":"10.7554\/eLife.46923","article-title":"Consistent and correctable bias in metagenomic sequencing experiments","volume":"8","author":"McLaren","year":"2019","journal-title":"Elife"},{"key":"2024122123012963700_ref5","doi-asserted-by":"publisher","first-page":"2789","DOI":"10.1016\/j.csbj.2020.09.014","article-title":"Naught all zeros in sequence 
count data are the same","volume":"18","author":"Silverman","year":"2020","journal-title":"Comput Struct Biotechnol J"},{"key":"2024122123012963700_ref6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-020-02104-1","article-title":"Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome data","volume":"21","author":"Calgaro","year":"2020","journal-title":"Genome Biol"},{"key":"2024122123012963700_ref7","doi-asserted-by":"publisher","first-page":"4233","DOI":"10.1093\/bioinformatics\/btaa283","article-title":"scDoc: correcting drop-out events in single-cell RNA-seq data","volume":"36","author":"Ran","year":"2020","journal-title":"Bioinformatics"},{"key":"2024122123012963700_ref8","doi-asserted-by":"publisher","first-page":"997","DOI":"10.1038\/s41467-018-03405-7","article-title":"An accurate and robust imputation method scImpute for single-cell RNA-seq data","volume":"9","author":"Li","year":"2018","journal-title":"Nat Commun"},{"key":"2024122123012963700_ref9","doi-asserted-by":"publisher","first-page":"539","DOI":"10.1038\/s41592-018-0033-z","article-title":"SAVER: gene expression recovery for single-cell RNA sequencing","volume":"15","author":"Huang","year":"2018","journal-title":"Nat Methods"},{"key":"2024122123012963700_ref10","doi-asserted-by":"publisher","first-page":"716","DOI":"10.1016\/j.cell.2018.05.061","article-title":"Recovering gene interactions from single-cell data using data diffusion","volume":"174","author":"van Dijk","year":"2018","journal-title":"Cell"},{"key":"2024122123012963700_ref11","article-title":"Matrix completion and low-rank SVD via fast alternating least squares","volume-title":"The Journal of Machine Learning Research","author":""},{"key":"2024122123012963700_ref12","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-020-1926-6","article-title":"Eleven grand challenges in single-cell data 
science","volume":"21","author":"L\u00e4hnemann","year":"2020","journal-title":"Genome Biol"},{"key":"2024122123012963700_ref13","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13059-020-02132-x","article-title":"A systematic evaluation of single-cell RNA-sequencing imputation methods","volume":"21","author":"Hou","year":"2020","journal-title":"Genome Biol"},{"key":"2024122123012963700_ref14","doi-asserted-by":"publisher","first-page":"4877","DOI":"10.1093\/nar\/gkac317","article-title":"scIMC: a platform for benchmarking comparison and visualization analysis of scRNA-seq data imputation methods","volume":"50","author":"Dai","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2024122123012963700_ref15","doi-asserted-by":"publisher","first-page":"3108","DOI":"10.1038\/s41467-018-05469-x","article-title":"Network enhancement as a general method to denoise weighted biological networks","volume":"9","author":"Wang","year":"2018","journal-title":"Nat Commun"},{"key":"2024122123012963700_ref16","doi-asserted-by":"publisher","first-page":"302","DOI":"10.1186\/s12859-023-05417-7","article-title":"Evaluating imputation methods for single-cell RNA-seq data","volume":"24","author":"Cheng","year":"2023","journal-title":"BMC Bioinform"},{"key":"2024122123012963700_ref17","doi-asserted-by":"publisher","first-page":"192","DOI":"10.1186\/s13059-021-02400-4","article-title":"mbImpute: an accurate and robust imputation method for microbiome data","volume":"22","author":"Jiang","year":"2021","journal-title":"Genome Biol"},{"key":"2024122123012963700_ref18","doi-asserted-by":"publisher","first-page":"1000","DOI":"10.1093\/bioinformatics\/btz686","article-title":"Phylogenetic tree-based microbiome association test","volume":"36","author":"Kim","year":"2020","journal-title":"Bioinformatics"},{"key":"2024122123012963700_ref19","doi-asserted-by":"publisher","first-page":"3567","DOI":"10.1093\/bioinformatics\/btz120","article-title":"pldist: ecological 
dissimilarities for paired and longitudinal microbiome association analysis","volume":"35","author":"Plantinga","year":"2019","journal-title":"Bioinformatics"},{"key":"2024122123012963700_ref20","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1186\/s13073-016-0302-3","article-title":"An adaptive association test for microbiome data","volume":"8","author":"Wu","year":"2016","journal-title":"Genome Med"},{"key":"2024122123012963700_ref21","doi-asserted-by":"publisher","first-page":"8228","DOI":"10.1128\/AEM.71.12.8228-8235.2005","article-title":"UniFrac: a new phylogenetic method for comparing microbial communities","volume":"71","author":"Lozupone","year":"2005","journal-title":"Appl Environ Microbiol"},{"key":"2024122123012963700_ref22","doi-asserted-by":"publisher","first-page":"1576","DOI":"10.1128\/AEM.01996-06","article-title":"Quantitative and qualitative beta diversity measures lead to different insights into factors that structure microbial communities","volume":"73","author":"Lozupone","year":"2007","journal-title":"Appl Environ Microbiol"},{"key":"2024122123012963700_ref23","doi-asserted-by":"publisher","first-page":"2106","DOI":"10.1093\/bioinformatics\/bts342","article-title":"Associating microbiome composition with environmental covariates using generalized UniFrac distances","volume":"28","author":"Chen","year":"2012","journal-title":"Bioinformatics"},{"key":"2024122123012963700_ref24","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1038\/nature12198","article-title":"Gut metagenome in European women with normal, impaired and diabetic glucose control","volume":"498","author":"Karlsson","year":"2013","journal-title":"Nature"},{"key":"2024122123012963700_ref25","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1038\/nature11450","article-title":"A metagenome-wide association study of gut microbiota in type 2 
diabetes","volume":"490","author":"Qin","year":"2012","journal-title":"Nature"},{"key":"2024122123012963700_ref26","doi-asserted-by":"publisher","first-page":"766","DOI":"10.15252\/msb.20145645","article-title":"Potential of fecal microbiota for early-stage detection of colorectal cancer","volume":"10","author":"Zeller","year":"2014","journal-title":"Mol Syst Biol"},{"key":"2024122123012963700_ref27","doi-asserted-by":"publisher","first-page":"e1008913","DOI":"10.1371\/journal.pcbi.1008913","article-title":"A statistical model for describing and simulating microbial community profiles","volume":"17","author":"Ma","year":"2021","journal-title":"PLoS Comput Biol"},{"key":"2024122123012963700_ref28","doi-asserted-by":"publisher","first-page":"449","DOI":"10.1016\/j.jmoldx.2019.01.006","article-title":"Leveraging human microbiome features to diagnose and stratify children with irritable bowel syndrome","volume":"21","author":"Hollister","year":"2019","journal-title":"J Mol Diagn"},{"key":"2024122123012963700_ref29","doi-asserted-by":"publisher","first-page":"822","DOI":"10.1038\/nbt.2939","article-title":"Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes","volume":"32","author":"Nielsen","year":"2014","journal-title":"Nat Biotechnol"},{"key":"2024122123012963700_ref30","doi-asserted-by":"publisher","first-page":"e01926","DOI":"10.1128\/mBio.01926-14","article-title":"Dynamic changes in the subgingival microbiome and their potential for diagnosis and prognosis of periodontitis","volume":"6","author":"Shi","year":"2015","journal-title":"MBio"},{"key":"2024122123012963700_ref31","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1038\/s41467-022-28034-z","article-title":"Microbiome differential abundance methods produce different results across 38 datasets","volume":"13","author":"Nearing","year":"2022","journal-title":"Nat 
Commun"},{"key":"2024122123012963700_ref32","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1186\/2049-2618-2-15","article-title":"Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis","volume":"2","author":"Fernandes","year":"2014","journal-title":"Microbiome"},{"key":"2024122123012963700_ref33","doi-asserted-by":"publisher","first-page":"27663","DOI":"10.3402\/mehd.v26.27663","article-title":"Analysis of composition of microbiomes: A novel method for studying microbial composition","volume":"26","author":"Mandal","year":"2015","journal-title":"Microb Ecol Health Dis"},{"key":"2024122123012963700_ref34","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1214\/19-aoas1283","article-title":"Modeling microbial abundances and dysbiosis with beta-binomial regression","volume":"14","author":"Martin","year":"2020","journal-title":"Ann Appl Stat"},{"key":"2024122123012963700_ref35","doi-asserted-by":"publisher","first-page":"550","DOI":"10.1186\/s13059-014-0550-8","article-title":"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2","volume":"15","author":"Love","year":"2014","journal-title":"Genome Biol"},{"key":"2024122123012963700_ref36","doi-asserted-by":"publisher","first-page":"e1009442","DOI":"10.1371\/journal.pcbi.1009442","article-title":"Multivariable association discovery in population-scale meta-omics studies","volume":"17","author":"Mallick","year":"2021","journal-title":"PLoS Comput Biol"},{"key":"2024122123012963700_ref37","doi-asserted-by":"publisher","first-page":"643","DOI":"10.1093\/bioinformatics\/btx650","article-title":"An omnibus test for differential distribution analysis of microbiome sequencing 
data","volume":"34","author":"Chen","year":"2018","journal-title":"Bioinformatics"},{"key":"2024122123012963700_ref38","doi-asserted-by":"publisher","first-page":"1200","DOI":"10.1038\/nmeth.2658","article-title":"Differential abundance analysis for microbial marker-gene surveys","volume":"10","author":"Paulson","year":"2013","journal-title":"Nat Methods"},{"key":"2024122123012963700_ref39","article-title":"Individual Comparisons by Ranking Methods","volume-title":"Breakthroughs in Statistics: Methodology and Distribution","author":""},{"key":"2024122123012963700_ref40","doi-asserted-by":"publisher","first-page":"1023","DOI":"10.1038\/nmeth.4468","article-title":"Accessible, curated metagenomic data through ExperimentHub","volume":"14","author":"Pasolli","year":"2017","journal-title":"Nat Methods"},{"key":"2024122123012963700_ref41","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1186\/s13059-022-02655-5","article-title":"LinDA: linear models for differential abundance analysis of microbiome compositional data","volume":"23","author":"Zhou","year":"2022","journal-title":"Genome Biol"},{"key":"2024122123012963700_ref42","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1136\/gutjnl-2015-309800","article-title":"Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer","volume":"66","author":"Yu","year":"2017","journal-title":"Gut"},{"key":"2024122123012963700_ref43","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1038\/s41522-017-0022-5","article-title":"Unexplored diversity and strain-level structure of the skin microbiome associated with psoriasis","volume":"3","author":"Tett","year":"2017","journal-title":"NPJ Biofilms Microbiomes"},{"key":"2024122123012963700_ref44","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1186\/s13059-020-02104-1","article-title":"Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome 
data","volume":"21","author":"Calgaro","year":"2020","journal-title":"Genome Biol"},{"key":"2024122123012963700_ref45","doi-asserted-by":"publisher","first-page":"40200","DOI":"10.1038\/srep40200","article-title":"MicroPattern: a web-based tool for microbe set enrichment analysis and disease similarity calculation based on a list of microbes","volume":"7","author":"Ma","year":"2017","journal-title":"Sci Rep"},{"key":"2024122123012963700_ref46","doi-asserted-by":"publisher","first-page":"297","DOI":"10.5217\/ir.2016.14.4.297","article-title":"Irritable bowel syndrome and inflammatory bowel disease overlap syndrome: pieces of the puzzle are falling into place","volume":"14","author":"Abdul Rani","year":"2016","journal-title":"Intest Res"},{"key":"2024122123012963700_ref47","doi-asserted-by":"publisher","first-page":"S604","DOI":"10.1093\/ecco-jcc\/jjab076.805","article-title":"P685 gut microbiota in patients with inflammatory bowel disease during remission","volume":"15","author":"Pisani","year":"2021","journal-title":"J Crohn's Colitis"},{"key":"2024122123012963700_ref48","doi-asserted-by":"publisher","first-page":"497","DOI":"10.1038\/s41579-019-0213-6","article-title":"Microbial genes and pathways in inflammatory bowel disease","volume":"17","author":"Schirmer","year":"2019","journal-title":"Nat Rev Microbiol"},{"key":"2024122123012963700_ref49","doi-asserted-by":"publisher","first-page":"173","DOI":"10.2147\/CEG.S33858","article-title":"Gut bacterial profile in patients newly diagnosed with treatment-naive Crohn's disease","volume":"5","author":"Ricanek","year":"2012","journal-title":"Clin Exp Gastroenterol"},{"key":"2024122123012963700_ref50","doi-asserted-by":"publisher","first-page":"514","DOI":"10.1038\/tpj.2012.43","article-title":"Metagenomic sequencing of the human gut microbiome before and after bariatric surgery in obese patients with type 2 diabetes: Correlation with inflammatory and metabolic 
parameters","volume":"13","author":"Graessler","year":"2013","journal-title":"Pharmacogenomics J"},{"key":"2024122123012963700_ref51","doi-asserted-by":"publisher","first-page":"102590","DOI":"10.1016\/j.ebiom.2019.11.051","article-title":"Role of gut microbiota in type 2 diabetes pathophysiology","volume":"51","author":"Gurung","year":"2020","journal-title":"EBioMedicine"},{"key":"2024122123012963700_ref52","doi-asserted-by":"publisher","first-page":"628426","DOI":"10.3389\/fmicb.2021.628426","article-title":"Discovering potential taxonomic biomarkers of type 2 diabetes from human gut microbiota via different feature selection methods","volume":"12","author":"Bakir-Gungor","year":"2021","journal-title":"Front Microbiol"},{"key":"2024122123012963700_ref53","doi-asserted-by":"publisher","first-page":"e0157516","DOI":"10.1371\/journal.pone.0157516","article-title":"Identification of Veillonella species in the tongue biofilm by using a novel one-step polymerase chain reaction method","volume":"11","author":"Mashima","year":"2016","journal-title":"PloS One"},{"key":"2024122123012963700_ref54","doi-asserted-by":"publisher","first-page":"e0248308","DOI":"10.1371\/journal.pone.0248308","article-title":"A concerted probiotic activity to inhibit periodontitis-associated bacteria","volume":"16","author":"Jansen","year":"2021","journal-title":"PloS One"},{"key":"2024122123012963700_ref55","doi-asserted-by":"publisher","first-page":"1421","DOI":"10.1902\/jop.2009.090185","article-title":"Comparisons of subgingival microbial profiles of refractory periodontitis, severe periodontitis, and periodontal health using the human oral microbe identification microarray","volume":"80","author":"Colombo","year":"2009","journal-title":"J Periodontol"},{"key":"2024122123012963700_ref56","doi-asserted-by":"publisher","first-page":"2324709620910645","DOI":"10.1177\/2324709620910645","article-title":"Bacterial endocarditis caused by Actinomyces oris: first reported case and literature 
review","volume":"8","author":"Phichaphop","year":"2020","journal-title":"J Investig Med High Impact Case Rep"},{"key":"2024122123012963700_ref57","doi-asserted-by":"publisher","first-page":"785191","DOI":"10.3389\/fmicb.2021.775570","article-title":"Periodontal and peri-implant microbiome dysbiosis is associated with alterations in the microbial community structure and local stability","volume":"12","author":"Zhang","year":"2021","journal-title":"Front Microbiol"},{"key":"2024122123012963700_ref58","doi-asserted-by":"publisher","first-page":"787176","DOI":"10.3389\/fgene.2021.787176","article-title":"Colorectal cancer-associated microbiome patterns and signatures","volume":"12","author":"Zhao","year":"2021","journal-title":"Front Genet"},{"key":"2024122123012963700_ref59","article-title":"Microbiota, Inflammation and Colorectal Cancer","volume":"18","author":"","journal-title":"International Journal of Molecular Sciences"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/1\/bbae653\/61251718\/bbae653.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/26\/1\/bbae653\/61251718\/bbae653.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,21]],"date-time":"2024-12-21T23:01:49Z","timestamp":1734822109000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae653\/7930070"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,22]]},"references-count":59,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,11,22]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae653","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"s
ubject":[],"published-other":{"date-parts":[[2025,1]]},"published":{"date-parts":[[2024,11,22]]},"article-number":"bbae653"}}