{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T03:10:32Z","timestamp":1777605032754,"version":"3.51.4"},"reference-count":97,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,10,5]],"date-time":"2023-10-05T00:00:00Z","timestamp":1696464000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Microbiol."],"abstract":"<jats:p>Although metagenomic sequencing is now the preferred technique to study microbiome-host interactions, analyzing and interpreting microbiome sequencing data presents challenges primarily attributed to the statistical specificities of the data (e.g., sparse, over-dispersed, compositional, inter-variable dependency). This mini review explores preprocessing and transformation methods applied in recent human microbiome studies to address microbiome data analysis challenges. Our results indicate a limited adoption of transformation methods targeting the statistical characteristics of microbiome sequencing data. Instead, there is a prevalent usage of relative and normalization-based transformations that do not specifically account for the specific attributes of microbiome data. The information on preprocessing and transformations applied to the data before analysis was incomplete or missing in many publications, leading to reproducibility concerns, comparability issues, and questionable results. 
We hope this mini review will provide researchers and newcomers to the field of human microbiome research with an up-to-date point of reference for various data transformation tools and assist them in choosing the most suitable transformation method based on their research questions, objectives, and data characteristics.<\/jats:p>","DOI":"10.3389\/fmicb.2023.1250909","type":"journal-article","created":{"date-parts":[[2023,10,5]],"date-time":"2023-10-05T12:51:20Z","timestamp":1696510280000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":27,"title":["Overview of data preprocessing for machine learning applications in human microbiome research"],"prefix":"10.3389","volume":"14","author":[{"given":"Eliana","family":"Ibrahimi","sequence":"first","affiliation":[]},{"given":"Marta B.","family":"Lopes","sequence":"additional","affiliation":[]},{"given":"Xhilda","family":"Dhamo","sequence":"additional","affiliation":[]},{"given":"Andrea","family":"Simeon","sequence":"additional","affiliation":[]},{"given":"Rajesh","family":"Shigdel","sequence":"additional","affiliation":[]},{"given":"Karel","family":"Hron","sequence":"additional","affiliation":[]},{"given":"Bla\u017e","family":"Stres","sequence":"additional","affiliation":[]},{"given":"Domenica","family":"D\u2019Elia","sequence":"additional","affiliation":[]},{"given":"Magali","family":"Berland","sequence":"additional","affiliation":[]},{"given":"Laura Judith","family":"Marcos-Zambrano","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2023,10,5]]},"reference":[{"key":"ref1","doi-asserted-by":"publisher","first-page":"275","DOI":"10.1016\/j.coemr.2021.04.005","article-title":"Recent progress in analyzing the spatial structure of the human microbiome: Distinguishing biogeography and architecture in the oral and gut communities","volume":"18","author":"Adade","year":"2021","journal-title":"Curr. Opin. Endocr. Metab. 
Res."},{"key":"ref9001","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1111\/j.2517-6161.1982.tb01195.x","article-title":"The statistical analysis of compositional data (with discussion)","volume":"44","author":"Aitchison","year":"1982","journal-title":"J R Stat Soc Series B"},{"key":"ref2","doi-asserted-by":"crossref","DOI":"10.1007\/978-94-009-4109-0","volume-title":"The statistical analysis of compositional data","author":"Aitchison","year":"1986"},{"key":"ref3","doi-asserted-by":"publisher","first-page":"e00191-16","DOI":"10.1128\/mSystems.00191-16","article-title":"Deblur rapidly resolves single-nucleotide community sequence patterns","volume":"2","author":"Amir","year":"2017","journal-title":"MSystems"},{"key":"ref4","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1080\/1364557032000119616","article-title":"Scoping studies: towards a methodological framework","volume":"8","author":"Arksey","year":"2005","journal-title":"Int. J. Soc. Res. Methodol."},{"key":"ref5","doi-asserted-by":"publisher","first-page":"36","DOI":"10.3389\/fmicb.2018.00036","article-title":"\u2018TIME\u2019: a web application for obtaining insights into microbial ecology using longitudinal microbiome data","volume":"9","author":"Baksi","year":"2018","journal-title":"Front. Microbiol."},{"key":"ref6","doi-asserted-by":"publisher","first-page":"e65088","DOI":"10.7554\/eLife.65088","article-title":"Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3","volume":"10","author":"Beghini","year":"2021","journal-title":"elife"},{"key":"ref7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41587-023-01688-w","article-title":"Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4","author":"Blanco-M\u00edguez","year":"2023","journal-title":"Nat. 
Biotechnol."},{"key":"ref8","doi-asserted-by":"publisher","first-page":"186","DOI":"10.1186\/s13059-019-1788-y","article-title":"MITRE: inferring features from microbiota time-series data linked to host status","volume":"20","author":"Bogart","year":"2019","journal-title":"Genome Biol."},{"key":"ref9","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1038\/nmeth.2276","article-title":"Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing","volume":"10","author":"Bokulich","year":"2013","journal-title":"Nat. Methods"},{"key":"ref10","doi-asserted-by":"publisher","first-page":"2114","DOI":"10.1093\/bioinformatics\/btu170","article-title":"Trimmomatic: a flexible trimmer for Illumina sequence data","volume":"30","author":"Bolger","year":"2014","journal-title":"Bioinformatics"},{"key":"ref11","doi-asserted-by":"publisher","first-page":"e0185056","DOI":"10.1371\/journal.pone.0185056","article-title":"BBMerge \u2013 Accurate paired shotgun read merging via overlap","volume":"12","author":"Bushnell","year":"2017","journal-title":"PLoS One"},{"key":"ref9002","doi-asserted-by":"publisher","first-page":"581","DOI":"10.1038\/nmeth.3869","article-title":"DADA2: High-resolution sample inference from Illumina amplicon data","volume":"13","author":"Callahan","year":"2016","journal-title":"Nat. 
Methods"},{"key":"ref12","doi-asserted-by":"publisher","first-page":"e4600","DOI":"10.7717\/peerj.4600","article-title":"GMPR: A robust normalization method for zero-inflated count data with application to microbiome sequencing data","volume":"6","author":"Chen","year":"2018","journal-title":"PeerJ"},{"key":"ref13","doi-asserted-by":"publisher","first-page":"2149","DOI":"10.3390\/microorganisms9102149","article-title":"Predicting the role of the human gut microbiome in constipation using machine-learning methods: a meta-analysis","volume":"9","author":"Chen","year":"2021","journal-title":"Microorganisms"},{"key":"ref14","doi-asserted-by":"publisher","first-page":"100570","DOI":"10.1016\/j.spasta.2021.100570","article-title":"A new class of \u03b1-transformations for the spatial analysis of compositional data","volume":"47","author":"Clarotto","year":"2022","journal-title":"Spat. Stat."},{"key":"ref15","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1038\/nmeth.2897","article-title":"A fair comparison","volume":"11","author":"Costea","year":"2014","journal-title":"Nat. Methods"},{"key":"ref9003","doi-asserted-by":"publisher","first-page":"1257002","DOI":"10.3389\/fmicb.2023.1257002","article-title":"Advancing microbiome research with machine learning: key findings from the ML4Microbiome COST action","volume":"14","author":"D\u2019Elia","year":"2023","journal-title":"Front. 
Microbiol."},{"key":"ref16","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1186\/s12859-020-03933-4","article-title":"MegaR: an interactive R package for rapid sample classification and phenotype prediction using metagenome profiles and machine learning","volume":"22","author":"Dhungel","year":"2021","journal-title":"BMC Bioinformatics"},{"key":"ref17","doi-asserted-by":"publisher","first-page":"441","DOI":"10.1186\/s12859-017-1843-1","article-title":"Interpretation of microbiota-based diagnostics by explaining individual classifier decisions","volume":"18","author":"Eck","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"ref18","doi-asserted-by":"publisher","first-page":"2194","DOI":"10.1093\/bioinformatics\/btr381","article-title":"UCHIME improves sensitivity and speed of chimera detection","volume":"27","author":"Edgar","year":"2011","journal-title":"Bioinformatics"},{"key":"ref19","doi-asserted-by":"publisher","first-page":"795","DOI":"10.1007\/s11004-005-7381-9","article-title":"Groups of parts and their balances in compositional data analysis","volume":"37","author":"Egozcue","year":"2005","journal-title":"Math. Geol."},{"key":"ref20","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1023\/A:1023818214614","article-title":"Isometric logratio transformations for compositional data analysis","volume":"35","author":"Egozcue","year":"2003","journal-title":"Math. Geol."},{"key":"ref21","doi-asserted-by":"publisher","first-page":"509","DOI":"10.1007\/978-1-4939-3572-7_26","article-title":"Big data, evolution, and metagenomes: predicting disease from gut microbiota codon usage profiles","volume":"1415","author":"Fabijani\u0107","year":"2016","journal-title":"Methods Mol. 
Biol."},{"key":"ref22","doi-asserted-by":"publisher","first-page":"115648","DOI":"10.1016\/j.eswa.2021.115648","article-title":"Machine Learning analysis of the human infant gut microbiome identifies influential species in type 1 diabetes","volume":"185","author":"Fern\u00e1ndez-Edreira","year":"2021","journal-title":"Expert Syst. Appl."},{"key":"ref23","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-319-96422-5","volume-title":"Applied compositional data analysis","author":"Filzmoser","year":"2018"},{"key":"ref24","doi-asserted-by":"publisher","first-page":"194","DOI":"10.1016\/j.chroma.2014.08.050","article-title":"What can go wrong at the data normalization step for identification of biomarkers?","volume":"1362","author":"Filzmoser","year":"2014","journal-title":"J. Chromatogr. A"},{"key":"ref25","doi-asserted-by":"publisher","first-page":"1454","DOI":"10.1136\/gutjnl-2017-314814","article-title":"The oral microbiota in colorectal cancer is distinctive and predictive","volume":"67","author":"Flemer","year":"2018","journal-title":"Gut"},{"key":"ref26","doi-asserted-by":"publisher","first-page":"1930872","DOI":"10.1080\/19490976.2021.1930872","article-title":"A microbial signature following bariatric surgery is robustly consistent across multiple cohorts","volume":"13","author":"Fouladi","year":"2021","journal-title":"Gut Microbes"},{"key":"ref27","doi-asserted-by":"publisher","first-page":"2403","DOI":"10.3390\/jcm9082403","article-title":"Usefulness of machine learning-based gut microbiome analysis for identifying patients with irritable bowels syndrome","volume":"9","author":"Fukui","year":"2020","journal-title":"J. Clin. 
Med."},{"key":"ref28","doi-asserted-by":"publisher","first-page":"101199","DOI":"10.1016\/j.isci.2020.101199","article-title":"Human gut microbiome aging clock based on taxonomic profiling and deep learning","volume":"23","author":"Galkin","year":"2020","journal-title":"IScience"},{"key":"ref29","doi-asserted-by":"publisher","first-page":"322","DOI":"10.1016\/j.annepidem.2016.03.003","article-title":"It\u2019s all relative: analyzing microbiome data as compositions","volume":"26","author":"Gloor","year":"2016","journal-title":"Ann. Epidemiol."},{"key":"ref30","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1007\/s11004-008-9212-2","article-title":"Log-ratio analysis is a limiting case of correspondence analysis","volume":"42","author":"Greenacre","year":"2010","journal-title":"Math. Geosci."},{"key":"ref31","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1007\/s11004-011-9338-5","article-title":"Measuring subcompositional incoherence","volume":"43","author":"Greenacre","year":"2011","journal-title":"Math. Geosci."},{"key":"ref32","doi-asserted-by":"publisher","first-page":"727398","DOI":"10.3389\/fmicb.2021.727398","article-title":"Compositional data analysis of microbiome and any-omics datasets: a validation of the additive logratio transformation","volume":"12","author":"Greenacre","year":"2021","journal-title":"Front. 
Microbiol."},{"key":"ref33","doi-asserted-by":"publisher","first-page":"e00438-19","DOI":"10.1128\/mSystems.00438-19","article-title":"Association of Flavonifractor plautii, a flavonoid-degrading bacterium, with the gut microbiome of colorectal cancer patients in India","volume":"4","author":"Gupta","year":"2019","journal-title":"MSystems"},{"key":"ref34","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1007\/s13199-021-00778-0","article-title":"Survey of artificial intelligence approaches in the study of anthropogenic impacts on symbiotic organisms \u2013 a holistic view","volume":"84","author":"Gupta","year":"2021","journal-title":"Symbiosis"},{"key":"ref35","doi-asserted-by":"publisher","first-page":"FSO474","DOI":"10.2144\/fsoa-2020-0028","article-title":"New EU projects delivering human microbiome applications","volume":"6","author":"Hadrich","year":"2020","journal-title":"Fut. Sci. OA"},{"key":"ref36","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1038\/s43705-022-00182-9","article-title":"Machine learning and deep learning applications in microbiome research","volume":"2","author":"Hern\u00e1ndez Medina","year":"2022","journal-title":"ISME Commun."},{"key":"ref37","doi-asserted-by":"publisher","first-page":"e30126","DOI":"10.1371\/journal.pone.0030126","article-title":"Dirichlet Multinomial Mixtures: Generative Models for Microbial Metagenomics","volume":"7","author":"Holmes","year":"2012","journal-title":"PLoS One"},{"key":"ref38","doi-asserted-by":"publisher","first-page":"1079","DOI":"10.1038\/s41564-020-0743-8","article-title":"Genome-wide associations of human gut microbiome variation and implications for causal inference analyses","volume":"5","author":"Hughes","year":"2020","journal-title":"Nat. 
Microbiol."},{"key":"ref39","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1007\/s13253-021-00447-1","article-title":"A statistical perspective on the challenges in molecular microbial biology","volume":"26","author":"Jeganathan","year":"2021","journal-title":"J. Agric. Biol. Environ. Stat."},{"key":"ref40","doi-asserted-by":"publisher","first-page":"e0227285","DOI":"10.1371\/journal.pone.0227285","article-title":"Quantitative PCR provides a simple and accessible method for quantitative microbiota profiling","volume":"15","author":"Jian","year":"2020","journal-title":"PLoS One"},{"key":"ref41","doi-asserted-by":"publisher","first-page":"290","DOI":"10.1038\/s41598-021-04373-7","article-title":"Accurate diagnosis of atopic dermatitis by combining transcriptome and microbiota data with supervised machine learning","volume":"12","author":"Jiang","year":"2022","journal-title":"Sci. Rep."},{"key":"ref42","doi-asserted-by":"publisher","first-page":"522","DOI":"10.1093\/biostatistics\/kxz050","article-title":"A Bayesian zero-inflated negative binomial regression model for the integrative analysis of microbiome data","volume":"22","author":"Jiang","year":"2021","journal-title":"Biostatistics"},{"key":"ref43","article-title":"Leakage and the reproducibility crisis in ML-based science","author":"Kapoor","year":"2022"},{"key":"ref44","doi-asserted-by":"publisher","first-page":"784397","DOI":"10.3389\/fgene.2022.784397","article-title":"Benchmark of data processing methods and machine learning models for gut microbiome-based diagnosis of inflammatory bowel disease","volume":"13","author":"Kubinski","year":"2022","journal-title":"Front. 
Genet."},{"key":"ref45","doi-asserted-by":"publisher","first-page":"e32","DOI":"10.7717\/peerj.32","article-title":"Associations between the human intestinal microbiota, Lactobacillus rhamnosus GG and serum lipids indicated by integrated analysis of high-throughput profiling data","volume":"1","author":"Lahti","year":"2013","journal-title":"PeerJ"},{"key":"ref46","doi-asserted-by":"publisher","first-page":"e0160169","DOI":"10.1371\/journal.pone.0160169","article-title":"MixMC: A multivariate statistical framework to gain insight into microbial communities","volume":"11","author":"L\u00ea Cao","year":"2016","journal-title":"PLoS One"},{"key":"ref47","doi-asserted-by":"publisher","first-page":"104892","DOI":"10.1016\/j.micinf.2021.104892","article-title":"Machine learning-based investigation of the relationship between gut microbiome and obesity status","volume":"24","author":"Liu","year":"2022","journal-title":"Microbes Infect."},{"key":"ref48","doi-asserted-by":"publisher","first-page":"3242","DOI":"10.1093\/bioinformatics\/btr547","article-title":"Sparse distance-based learning for simultaneous multiclass classification and feature selection of metagenomic data","volume":"27","author":"Liu","year":"2011","journal-title":"Bioinformatics"},{"key":"ref49","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1016\/j.cmet.2022.03.002","article-title":"Early prediction of incident liver disease using conventional risk factors and gut-microbiome-augmented gradient boosting","volume":"34","author":"Liu","year":"2022","journal-title":"Cell Metab."},{"key":"ref50","doi-asserted-by":"publisher","first-page":"3562","DOI":"10.1038\/s41467-021-23821-6","article-title":"Benchmarking microbiome transformations favors experimental quantitative approaches to address compositionality and sampling depth biases","volume":"12","author":"Llor\u00e9ns-Rico","year":"2021","journal-title":"Nat. 
Commun."},{"key":"ref51","doi-asserted-by":"publisher","first-page":"314","DOI":"10.1186\/s12859-019-2833-2","article-title":"MetaNN: accurate classification of host phenotypes from metagenomic data using neural networks","volume":"20","author":"Lo","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"ref52","doi-asserted-by":"publisher","first-page":"550","DOI":"10.1186\/s13059-014-0550-8","article-title":"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2","volume":"15","author":"Love","year":"2014","journal-title":"Genome Biol."},{"key":"ref53","doi-asserted-by":"publisher","first-page":"634511","DOI":"10.3389\/fmicb.2021.634511","article-title":"Applications of machine learning in human microbiome studies: a review on feature selection, biomarker identification, disease prediction and treatment","volume":"12","author":"Marcos-Zambrano","year":"2021","journal-title":"Front. Microbiol."},{"key":"ref54","doi-asserted-by":"publisher","first-page":"10","DOI":"10.14806\/ej.17.1.200","article-title":"Cutadapt removes adapter sequences from high-throughput sequencing reads","volume":"17","author":"Martin","year":"2011","journal-title":"EMBnet.Journal"},{"key":"ref55","doi-asserted-by":"publisher","first-page":"389","DOI":"10.1111\/2041-210X.13115","article-title":"Methods for normalizing microbiome data: An ecological perspective","volume":"10","author":"McKnight","year":"2019","journal-title":"Methods Ecol. Evol."},{"key":"ref56","doi-asserted-by":"publisher","first-page":"1885","DOI":"10.1038\/s41591-021-01552-x","article-title":"Reporting guidelines for human microbiome research: the STORMS checklist","volume":"27","author":"Mirzayi","year":"2021","journal-title":"Nat. 
Med."},{"key":"ref57","doi-asserted-by":"publisher","first-page":"635781","DOI":"10.3389\/fmicb.2021.635781","article-title":"Statistical and machine learning techniques in human microbiome studies: contemporary challenges and solutions","volume":"12","author":"Moreno-Indias","year":"2021","journal-title":"Front. Microbiol."},{"key":"ref58","doi-asserted-by":"publisher","first-page":"23565","DOI":"10.1109\/ACCESS.2021.3050838","article-title":"Feature extension of gut microbiome data for deep neural network-based colorectal cancer classification","volume":"9","author":"Mulenga","year":"2021","journal-title":"IEEE Access"},{"key":"ref59","doi-asserted-by":"publisher","first-page":"336","DOI":"10.3390\/metabo11060336","article-title":"General unified microbiome profiling pipeline (GUMPP) for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data to predicted microbial metagenomes, enzymatic reactions and metabolic pathways","volume":"11","author":"Murovec","year":"2021","journal-title":"Metabolites"},{"key":"ref60","doi-asserted-by":"publisher","first-page":"3207","DOI":"10.1038\/s41396-021-00998-8","article-title":"Distinct composition and metabolic functions of human gut microbiota are associated with cachexia in lung cancer patients","volume":"15","author":"Ni","year":"2021","journal-title":"ISME J."},{"key":"ref61","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1186\/s40168-015-0114-5","article-title":"Phylogenetic approaches to microbial community classification","volume":"3","author":"Ning","year":"2015","journal-title":"Microbiome"},{"key":"ref9004","doi-asserted-by":"publisher","first-page":"1261889","DOI":"10.3389\/fmicb.2023.1261889","article-title":"Machine learning approaches in microbiome research: challenges and best practices","volume":"14","author":"Papoutsoglou","year":"2023","journal-title":"Front. 
Microbiol."},{"key":"ref62","doi-asserted-by":"crossref","DOI":"10.1002\/9781119003144","volume-title":"Modelling and analysis of compositional data","author":"Pawlowsky-Glahn","year":"2015"},{"key":"ref63","doi-asserted-by":"publisher","first-page":"584","DOI":"10.1038\/ismej.2016.117","article-title":"Absolute quantification of microbial taxon abundances","volume":"11","author":"Props","year":"2017","journal-title":"ISME J."},{"key":"ref64","doi-asserted-by":"publisher","first-page":"e00230-19","DOI":"10.1128\/mSystems.00230-19","article-title":"Interpretable log contrasts for the classification of health biomarkers: a new approach to balance selection","volume":"5","author":"Quinn","year":"2020","journal-title":"MSystems"},{"key":"ref65","doi-asserted-by":"publisher","first-page":"2870","DOI":"10.1093\/bioinformatics\/bty175","article-title":"Understanding sequencing data as compositions: an outlook and review","volume":"34","author":"Quinn","year":"2018","journal-title":"Bioinformatics"},{"key":"ref66","doi-asserted-by":"publisher","first-page":"e1009021","DOI":"10.1371\/journal.pcbi.1009021","article-title":"MiMeNet: Exploring microbiome-metabolome relationships using neural networks","volume":"17","author":"Reiman","year":"2021","journal-title":"PLoS Comput. 
Biol."},{"key":"ref67","doi-asserted-by":"publisher","first-page":"139","DOI":"10.1093\/bioinformatics\/btp616","article-title":"edgeR: a Bioconductor package for differential expression analysis of digital gene expression data","volume":"26","author":"Robinson","year":"2010","journal-title":"Bioinformatics"},{"key":"ref68","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1080\/19490976.2021.1888673","article-title":"Links between gut microbiome composition and fatty liver disease in a large population sample","volume":"13","author":"Ruuskanen","year":"2021","journal-title":"Gut Microbes"},{"key":"ref69","doi-asserted-by":"publisher","first-page":"1512","DOI":"10.1038\/s41467-020-15342-5","article-title":"Colonic microbiota is associated with inflammation and host epigenomic alterations in inflammatory bowel disease","volume":"11","author":"Ryan","year":"2020","journal-title":"Nat. Commun."},{"key":"ref70","doi-asserted-by":"publisher","first-page":"2789","DOI":"10.1016\/j.csbj.2020.09.014","article-title":"Naught all zeros in sequence count data are the same","volume":"18","author":"Silverman","year":"2020","journal-title":"Comput. Struct. Biotechnol. 
J."},{"key":"ref71","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1186\/s40168-016-0175-0","article-title":"Adjusting microbiome profiles for differences in microbial load by spike-in bacteria","volume":"4","author":"St\u00e4mmler","year":"2016","journal-title":"Microbiome"},{"key":"ref72","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1186\/2049-2618-1-11","article-title":"A comprehensive evaluation of multicategory classification methods for microbiomic data","volume":"1","author":"Statnikov","year":"2013","journal-title":"Microbiome"},{"key":"ref73","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1002\/sam.11514","article-title":"Weighted pivot coordinates for partial least squares-based marker discovery in high-throughput compositional data","volume":"14","author":"\u0160tefelov\u00e1","year":"2021","journal-title":"Stat. Anal. Data Mining ASA Data Sci. J."},{"key":"ref74","doi-asserted-by":"publisher","first-page":"e1586","DOI":"10.1002\/wics.1586","article-title":"A review of normalization and differential abundance methods for microbiome counts data","volume":"15","author":"Swift","year":"2023","journal-title":"WIREs Comput. Stat."},{"key":"ref75","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1053\/j.gastro.2016.09.049","article-title":"Identification of an intestinal microbiota signature associated with severity of irritable bowel syndrome","volume":"152","author":"Tap","year":"2017","journal-title":"Gastroenterology"},{"key":"ref76","doi-asserted-by":"publisher","first-page":"667","DOI":"10.1038\/s41591-019-0405-7","article-title":"Metagenomic analysis of colorectal cancer datasets identifies cross-cohort microbial diagnostic signatures and a link with choline degradation","volume":"25","author":"Thomas","year":"2019","journal-title":"Nat. 
Med."},{"key":"ref77","doi-asserted-by":"publisher","first-page":"62","DOI":"10.1186\/s40168-016-0208-8","article-title":"Large-scale benchmarking reveals false discoveries and count transformation sensitivity in 16S rRNA gene amplicon data analysis methods used in microbiome studies","volume":"4","author":"Thorsen","year":"2016","journal-title":"Microbiome"},{"key":"ref78","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1007\/978-3-319-23497-7_11","article-title":"Predicting the metagenomics content with multiple CART trees","volume-title":"Mathematical Models in Biology","author":"Travisany","year":"2015"},{"key":"ref79","doi-asserted-by":"publisher","first-page":"320","DOI":"10.1016\/j.cageo.2006.11.017","article-title":"\u201ccompositions\u201d: A unified R package to analyze compositional data","volume":"34","author":"van den Boogaart","year":"2008","journal-title":"Comput. Geosci."},{"key":"ref80","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1038\/nature24460","article-title":"Quantitative microbiome profiling links gut community variation to microbial load","volume":"551","author":"Vandeputte","year":"2017","journal-title":"Nature"},{"key":"ref81","doi-asserted-by":"publisher","first-page":"giz042","DOI":"10.1093\/gigascience\/giz042","article-title":"Microbiome Learning Repo (ML Repo): A public repository of microbiome regression and classification tasks","volume":"8","author":"Vangay","year":"2019","journal-title":"GigaScience"},{"key":"ref82","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1186\/s40168-017-0237-y","article-title":"Normalization and microbial differential abundance strategies depend upon data characteristics","volume":"5","author":"Weiss","year":"2017","journal-title":"Microbiome"},{"key":"ref83","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1038\/s41591-019-0406-6","article-title":"Meta-analysis of fecal metagenomes reveals global microbial signatures that are specific for colorectal 
cancer","volume":"25","author":"Wirbel","year":"2019","journal-title":"Nat. Med."},{"key":"ref84","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2018\/2936257","article-title":"Metagenomics biomarkers selected for prediction of three different diseases in Chinese population","volume":"2018","author":"Wu","year":"2018","journal-title":"Biomed. Res. Int."},{"key":"ref85","doi-asserted-by":"publisher","first-page":"2742","DOI":"10.1016\/j.csbj.2021.04.054","article-title":"Towards multi-label classification: Next step of machine learning for microbiome research","volume":"19","author":"Wu","year":"2021","journal-title":"Comput. Struct. Biotechnol. J."},{"key":"ref86","doi-asserted-by":"publisher","first-page":"104568","DOI":"10.1016\/j.micpath.2020.104568","article-title":"Potential of gut microbiome for detection of autism spectrum disorder","volume":"149","author":"Wu","year":"2020","journal-title":"Microb. Pathog."},{"key":"ref87","doi-asserted-by":"crossref","DOI":"10.1007\/978-981-13-1534-3","volume-title":"Statistical Analysis of Microbiome Data with R","author":"Xia","year":"2018"},{"key":"ref88","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1186\/s13040-021-00241-2","article-title":"LightCUD: a program for diagnosing IBD based on human gut microbiome data","volume":"14","author":"Xu","year":"2021","journal-title":"BioData Mining"},{"key":"ref89","doi-asserted-by":"publisher","first-page":"968","DOI":"10.1038\/s41591-019-0458-7","article-title":"Metagenomic and metabolomic analyses reveal distinct stage-specific phenotypes of the gut microbiota in colorectal cancer","volume":"25","author":"Yachida","year":"2019","journal-title":"Nat. 
Med."},{"key":"ref90","doi-asserted-by":"publisher","first-page":"baaa050","DOI":"10.1093\/database\/baaa050","article-title":"mAML: an automated machine learning pipeline with a microbiome repository for human disease classification","volume":"2020","author":"Yang","year":"2020","journal-title":"Database"},{"key":"ref92","doi-asserted-by":"publisher","first-page":"bbaa436","DOI":"10.1093\/bib\/bbaa436","article-title":"GutBalance: a server for the human gut microbiome-based disease prediction and biomarker discovery with compositionality addressed","volume":"22","author":"Yang","year":"2021","journal-title":"Brief. Bioinform."},{"key":"ref93","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/s12859-016-1441-7","article-title":"Negative binomial mixed models for analyzing microbiome count data","volume":"18","author":"Zhang","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"ref94","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/s12866-021-02414-9","article-title":"Determine independent gut microbiota-diseases association by eliminating the effects of human lifestyle factors","volume":"22","author":"Zhu","year":"2022","journal-title":"BMC Microbiol."}],"container-title":["Frontiers in 
Microbiology"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fmicb.2023.1250909\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,30]],"date-time":"2024-10-30T03:02:16Z","timestamp":1730257336000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fmicb.2023.1250909\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,5]]},"references-count":97,"alternative-id":["10.3389\/fmicb.2023.1250909"],"URL":"https:\/\/doi.org\/10.3389\/fmicb.2023.1250909","relation":{},"ISSN":["1664-302X"],"issn-type":[{"value":"1664-302X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,5]]},"article-number":"1250909"}}