{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T06:19:05Z","timestamp":1772173145552,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1010820","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T00:00:00Z","timestamp":1674518400000}}],"reference-count":80,"publisher":"Public Library of Science (PLoS)","issue":"1","license":[{"start":{"date-parts":[[2023,1,6]],"date-time":"2023-01-06T00:00:00Z","timestamp":1672963200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["01IS18036A"],"award-info":[{"award-number":["01IS18036A"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["BO3139\/7-1"],"award-info":[{"award-number":["BO3139\/7-1"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>In recent years, unsupervised analysis of microbiome data, such as microbial network analysis and clustering, has increased in popularity. Many new statistical and computational methods have been proposed for these tasks. This multiplicity of analysis strategies poses a challenge for researchers, who are often unsure which method(s) to use and might be tempted to try different methods on their dataset to look for the \u201cbest\u201d ones. 
However, if only the best results are selectively reported, this may cause over-optimism: the \u201cbest\u201d method is overly fitted to the specific dataset, and the results might be non-replicable on validation data. Such effects will ultimately hinder research progress. Yet so far, these topics have been given little attention in the context of unsupervised microbiome analysis. In our illustrative study, we aim to quantify over-optimism effects in this context. We model the approach of a hypothetical microbiome researcher who undertakes four unsupervised research tasks: clustering of bacterial genera, hub detection in microbial networks, differential microbial network analysis, and clustering of samples. While these tasks are unsupervised, the researcher might still have certain expectations as to what constitutes interesting results. We translate these expectations into concrete evaluation criteria that the hypothetical researcher might want to optimize. We then randomly split an exemplary dataset from the American Gut Project into discovery and validation sets multiple times. For each research task, multiple method combinations (e.g., methods for data normalization, network generation, and\/or clustering) are tried on the discovery data, and the combination that yields the best result according to the evaluation criterion is chosen. While the hypothetical researcher might only report this result, we also apply the \u201cbest\u201d method combination to the validation dataset. The results are then compared between discovery and validation data. In all four research tasks, there are notable over-optimism effects; the results on the validation data set are worse compared to the discovery data, averaged over multiple random splits into discovery\/validation data. 
Our study thus highlights the importance of validation and replication in microbiome analysis to obtain reliable results and demonstrates that the issue of over-optimism goes beyond the context of statistical testing and fishing for significance.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1010820","type":"journal-article","created":{"date-parts":[[2023,1,6]],"date-time":"2023-01-06T13:57:27Z","timestamp":1673013447000},"page":"e1010820","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":10,"title":["Over-optimism in unsupervised microbiome analysis: Insights from network learning and clustering"],"prefix":"10.1371","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1215-8561","authenticated-orcid":true,"given":"Theresa","family":"Ullmann","sequence":"first","affiliation":[]},{"given":"Stefanie","family":"Peschel","sequence":"additional","affiliation":[]},{"given":"Philipp","family":"Finger","sequence":"additional","affiliation":[]},{"given":"Christian L.","family":"M\u00fcller","sequence":"additional","affiliation":[]},{"given":"Anne-Laure","family":"Boulesteix","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2023,1,6]]},"reference":[{"issue":"477","key":"pcbi.1010820.ref001","doi-asserted-by":"crossref","first-page":"eaaw1815","DOI":"10.1126\/scitranslmed.aaw1815","article-title":"Transforming medicine with the microbiome","volume":"11","author":"N Zmora","year":"2019","journal-title":"Science Translational Medicine"},{"issue":"1","key":"pcbi.1010820.ref002","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.tips.2016.10.001","article-title":"Introducing the microbiome into precision medicine","volume":"38","author":"TM Kuntz","year":"2017","journal-title":"Trends in Pharmacological 
Sciences"},{"issue":"1","key":"pcbi.1010820.ref003","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1186\/s40168-017-0267-5","article-title":"Optimizing methods and dodging pitfalls in microbiome research","volume":"5","author":"D Kim","year":"2017","journal-title":"Microbiome"},{"issue":"3","key":"pcbi.1010820.ref004","doi-asserted-by":"crossref","first-page":"e00525","DOI":"10.1128\/mBio.00525-18","article-title":"Identifying and overcoming threats to reproducibility, replicability, robustness, and generalizability in microbiome research","volume":"9","author":"PD Schloss","year":"2018","journal-title":"mBio"},{"issue":"6251","key":"pcbi.1010820.ref005","doi-asserted-by":"crossref","first-page":"aac4716","DOI":"10.1126\/science.aac4716","article-title":"Estimating the reproducibility of psychological science","volume":"349","author":"Open Science Collaboration","year":"2015","journal-title":"Science"},{"key":"pcbi.1010820.ref006","doi-asserted-by":"crossref","first-page":"201925","DOI":"10.1098\/rsos.201925","article-title":"The multiplicity of analysis strategies jeopardizes replicability: lessons learned across disciplines","volume":"8","author":"S Hoffmann","year":"2021","journal-title":"Royal Society Open Science"},{"issue":"11","key":"pcbi.1010820.ref007","doi-asserted-by":"crossref","first-page":"1359","DOI":"10.1177\/0956797611417632","article-title":"False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant","volume":"22","author":"JP Simmons","year":"2011","journal-title":"Psychological Science"},{"issue":"3","key":"pcbi.1010820.ref008","doi-asserted-by":"crossref","first-page":"670","DOI":"10.1002\/bimj.201800309","article-title":"Sampling uncertainty versus method uncertainty: A general framework with applications to omics biomarker selection","volume":"62","author":"S Klau","year":"2020","journal-title":"Biometrical 
Journal"},{"issue":"4","key":"pcbi.1010820.ref009","doi-asserted-by":"crossref","first-page":"bbaa290","DOI":"10.1093\/bib\/bbaa290","article-title":"NetCoMi: network construction and comparison for microbiome data in R","volume":"22","author":"S Peschel","year":"2020","journal-title":"Briefings in Bioinformatics"},{"issue":"3","key":"pcbi.1010820.ref010","doi-asserted-by":"crossref","first-page":"e3000691","DOI":"10.1371\/journal.pbio.3000691","article-title":"What is replication?","volume":"18","author":"BA Nosek","year":"2020","journal-title":"PLoS Biology"},{"issue":"3","key":"pcbi.1010820.ref011","first-page":"e1444","article-title":"Validation of cluster analysis results on validation data: A systematic framework","volume":"12","author":"T Ullmann","year":"2022","journal-title":"Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery"},{"issue":"8","key":"pcbi.1010820.ref012","doi-asserted-by":"crossref","first-page":"e124","DOI":"10.1371\/journal.pmed.0020124","article-title":"Why most published research findings are false","volume":"2","author":"JP Ioannidis","year":"2005","journal-title":"PLoS Medicine"},{"issue":"6","key":"pcbi.1010820.ref013","doi-asserted-by":"crossref","first-page":"460","DOI":"10.1511\/2014.111.460","article-title":"The statistical crisis in science","volume":"102","author":"A Gelman","year":"2014","journal-title":"American Scientist"},{"issue":"3","key":"pcbi.1010820.ref014","doi-asserted-by":"crossref","first-page":"e1002106","DOI":"10.1371\/journal.pbio.1002106","article-title":"The extent and consequences of p-hacking in science","volume":"13","author":"ML Head","year":"2015","journal-title":"PLoS Biology"},{"issue":"3","key":"pcbi.1010820.ref015","doi-asserted-by":"crossref","first-page":"e00031","DOI":"10.1128\/mSystems.00031-18","article-title":"American gut: an open platform for citizen science microbiome research","volume":"3","author":"D 
McDonald","year":"2018","journal-title":"Msystems"},{"issue":"11","key":"pcbi.1010820.ref016","doi-asserted-by":"crossref","first-page":"1077","DOI":"10.1038\/nbt.3981","article-title":"Assessment of variation in microbial community amplicon sequencing by the Microbiome Quality Control (MBQC) project consortium","volume":"35","author":"R Sinha","year":"2017","journal-title":"Nature Biotechnology"},{"issue":"1","key":"pcbi.1010820.ref017","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1186\/s12866-017-1101-8","article-title":"A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome","volume":"17","author":"I Allali","year":"2017","journal-title":"BMC Microbiology"},{"key":"pcbi.1010820.ref018","first-page":"kxab048","article-title":"Evaluating replicability in microbiome data","author":"DS Clausen","year":"2021","journal-title":"Biostatistics"},{"issue":"3","key":"pcbi.1010820.ref019","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pbio.3001556","article-title":"Systematically assessing microbiome\u2013disease associations identifies drivers of inconsistency in metagenomic research","volume":"20","author":"BT Tierney","year":"2022","journal-title":"PLoS Biology"},{"issue":"1","key":"pcbi.1010820.ref020","first-page":"1","article-title":"Microbiome differential abundance methods produce different results across 38 datasets","volume":"13","author":"JT Nearing","year":"2022","journal-title":"Nature Communications"},{"issue":"11","key":"pcbi.1010820.ref021","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pone.0259973","article-title":"Analysing microbiome intervention design studies: Comparison of alternative multivariate statistical methods","volume":"16","author":"M Khomich","year":"2021","journal-title":"PLoS One"},{"issue":"1","key":"pcbi.1010820.ref022","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing 
partitions","volume":"2","author":"L Hubert","year":"1985","journal-title":"Journal of Classification"},{"issue":"4","key":"pcbi.1010820.ref023","doi-asserted-by":"crossref","first-page":"lqaa100","DOI":"10.1093\/nargab\/lqaa100","article-title":"Shrinkage improves estimation of microbial associations under different normalization methods","volume":"2","author":"M Badri","year":"2020","journal-title":"NAR Genomics and Bioinformatics"},{"key":"pcbi.1010820.ref024","doi-asserted-by":"crossref","first-page":"219","DOI":"10.3389\/fmicb.2014.00219","article-title":"Deciphering microbial interactions and detecting keystone species with co-occurrence networks","volume":"5","author":"D Berry","year":"2014","journal-title":"Frontiers in Microbiology"},{"issue":"1","key":"pcbi.1010820.ref025","doi-asserted-by":"crossref","first-page":"e1002352","DOI":"10.1371\/journal.pbio.1002352","article-title":"Microbial hub taxa link host and abiotic factors to plant microbiome variation","volume":"14","author":"MT Agler","year":"2016","journal-title":"PLoS Biology"},{"issue":"9","key":"pcbi.1010820.ref026","doi-asserted-by":"crossref","first-page":"567","DOI":"10.1038\/s41579-018-0024-1","article-title":"Keystone taxa as drivers of microbiome structure and functioning","volume":"16","author":"S Banerjee","year":"2018","journal-title":"Nature Reviews Microbiology"},{"issue":"6","key":"pcbi.1010820.ref027","doi-asserted-by":"crossref","first-page":"761","DOI":"10.1093\/femsre\/fuy030","article-title":"From hairballs to hypotheses\u2013biological insights from microbial networks","volume":"42","author":"L R\u00f6ttjers","year":"2018","journal-title":"FEMS Microbiology Reviews"},{"issue":"1","key":"pcbi.1010820.ref028","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1038\/s41396-020-00777-x","article-title":"A network approach to elucidate and prioritize microbial dark matter in microbial communities","volume":"15","author":"T Zamkovaya","year":"2021","journal-title":"The ISME 
Journal"},{"key":"pcbi.1010820.ref029","doi-asserted-by":"crossref","first-page":"1543","DOI":"10.3389\/fmicb.2015.01543","article-title":"Antibiotics and the human gut microbiome: dysbioses and accumulation of resistances","volume":"6","author":"M Francino","year":"2016","journal-title":"Frontiers in microbiology"},{"issue":"6086","key":"pcbi.1010820.ref030","doi-asserted-by":"crossref","first-page":"1255","DOI":"10.1126\/science.1224203","article-title":"The application of ecological theory toward an understanding of the human microbiome","volume":"336","author":"EK Costello","year":"2012","journal-title":"Science"},{"issue":"1","key":"pcbi.1010820.ref031","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/srep04547","article-title":"Revealing the hidden language of complex networks","volume":"4","author":"\u00d6N Yavero\u011flu","year":"2014","journal-title":"Scientific Reports"},{"issue":"1","key":"pcbi.1010820.ref032","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13073-016-0297-9","article-title":"Antibiotic perturbation of the murine gut microbiome enhances the adiposity, insulin resistance, and liver disease associated with high-fat diet","volume":"8","author":"D Mahana","year":"2016","journal-title":"Genome Medicine"},{"issue":"1","key":"pcbi.1010820.ref033","first-page":"1","article-title":"A single early-in-life macrolide course has lasting effects on murine microbial network topology and immunity","volume":"8","author":"VE Ruiz","year":"2017","journal-title":"Nature Communications"},{"issue":"1","key":"pcbi.1010820.ref034","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40168-018-0412-9","article-title":"Individual and household attributes influence the dynamics of the personal skin microbiota and its association network","volume":"6","author":"MH 
Leung","year":"2018","journal-title":"Microbiome"},{"key":"pcbi.1010820.ref035","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1038\/nature09944","article-title":"Enterotypes of the human gut microbiome","volume":"473","author":"M Arumugam","year":"2011","journal-title":"Nature"},{"issue":"9","key":"pcbi.1010820.ref036","doi-asserted-by":"crossref","first-page":"591","DOI":"10.1038\/nrmicro2859","article-title":"Categorization of the gut microbiota: enterotypes or gradients?","volume":"10","author":"IB Jeffery","year":"2012","journal-title":"Nature Reviews Microbiology"},{"issue":"1","key":"pcbi.1010820.ref037","doi-asserted-by":"crossref","first-page":"e1002863","DOI":"10.1371\/journal.pcbi.1002863","article-title":"A guide to enterotypes across the human body: meta-analysis of microbial community structures in human microbiome datasets","volume":"9","author":"O Koren","year":"2013","journal-title":"PLoS Computational Biology"},{"issue":"4","key":"pcbi.1010820.ref038","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1016\/j.chom.2014.09.013","article-title":"Rethinking \u201centerotypes\u201d","volume":"16","author":"D Knights","year":"2014","journal-title":"Cell Host & Microbe"},{"key":"pcbi.1010820.ref039","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1038\/s41564-017-0072-8","article-title":"Enterotypes in the landscape of gut microbial community composition","volume":"3","author":"PI Costea","year":"2018","journal-title":"Nature Microbiology"},{"issue":"1","key":"pcbi.1010820.ref040","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/j.gpb.2018.02.004","article-title":"Stereotypes about enterotype: the old and new ideas","volume":"17","author":"M Cheng","year":"2019","journal-title":"Genomics, Proteomics & Bioinformatics"},{"issue":"6052","key":"pcbi.1010820.ref041","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1126\/science.1208344","article-title":"Linking long-term dietary patterns with gut microbial 
enterotypes","volume":"334","author":"GD Wu","year":"2011","journal-title":"Science"},{"key":"pcbi.1010820.ref042","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: a graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"PJ Rousseeuw","year":"1987","journal-title":"Journal of Computational and Applied Mathematics"},{"issue":"2","key":"pcbi.1010820.ref043","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1111\/j.2517-6161.1982.tb01195.x","article-title":"The statistical analysis of compositional data","volume":"44","author":"J Aitchison","year":"1982","journal-title":"Journal of the Royal Statistical Society: Series B (Methodological)"},{"key":"pcbi.1010820.ref044","doi-asserted-by":"crossref","first-page":"516","DOI":"10.3389\/fgene.2019.00516","article-title":"Microbial networks in SPRING\u2014Semi-parametric rank-based correlation and partial correlation estimation for quantitative microbiome data","volume":"10","author":"G Yoon","year":"2019","journal-title":"Frontiers in Genetics"},{"key":"pcbi.1010820.ref045","doi-asserted-by":"crossref","first-page":"R106","DOI":"10.1186\/gb-2010-11-10-r106","article-title":"Differential expression analysis for sequence count data","volume":"11","author":"S Anders","year":"2010","journal-title":"Genome Biology"},{"issue":"3","key":"pcbi.1010820.ref046","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1093\/biomet\/asaa007","article-title":"Sparse semiparametric canonical correlation analysis for data of mixed types","volume":"107","author":"G Yoon","year":"2020","journal-title":"Biometrika"},{"issue":"3","key":"pcbi.1010820.ref047","doi-asserted-by":"crossref","first-page":"e1004075","DOI":"10.1371\/journal.pcbi.1004075","article-title":"Proportionality: a valid alternative to correlation for relative data","volume":"11","author":"D Lovell","year":"2015","journal-title":"PLoS Computational 
Biology"},{"key":"pcbi.1010820.ref048","first-page":"849","article-title":"On spectral clustering: analysis and an algorithm","volume":"14","author":"A Ng","year":"2001","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"6","key":"pcbi.1010820.ref049","doi-asserted-by":"crossref","first-page":"066111","DOI":"10.1103\/PhysRevE.70.066111","article-title":"Finding community structure in very large networks","volume":"70","author":"A Clauset","year":"2004","journal-title":"Physical Review E"},{"issue":"10","key":"pcbi.1010820.ref050","doi-asserted-by":"crossref","first-page":"P10008","DOI":"10.1088\/1742-5468\/2008\/10\/P10008","article-title":"Fast unfolding of communities in large networks","volume":"2008","author":"VD Blondel","year":"2008","journal-title":"Journal of Statistical Mechanics: Theory and Experiment"},{"issue":"1","key":"pcbi.1010820.ref051","doi-asserted-by":"crossref","first-page":"e00903","DOI":"10.1128\/mSystems.00903-19","article-title":"Manta: A clustering algorithm for weighted ecological networks","volume":"5","author":"L R\u00f6ttjers","year":"2020","journal-title":"Msystems"},{"issue":"4","key":"pcbi.1010820.ref052","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1007\/BF00891269","article-title":"On criteria for measures of compositional difference","volume":"24","author":"J Aitchison","year":"1992","journal-title":"Mathematical Geology"},{"key":"pcbi.1010820.ref053","unstructured":"Mart\u00edn-Fern\u00e1ndez JA, Bren M, Barcel\u00f3-Vidal C, Pawlowsky-Glahn V. A measure of difference for compositional data based on measures of divergence. In: Proceedings of the Fifth Annual Conference of the International Association for Mathematical Geology. vol. 1; 1999. p. 
211\u2013215."},{"issue":"4","key":"pcbi.1010820.ref054","first-page":"326","article-title":"An ordination of the upland forest communities of southern Wisconsin","volume":"27","author":"JR Bray","year":"1957","journal-title":"Ecological Monographs"},{"issue":"2","key":"pcbi.1010820.ref055","doi-asserted-by":"crossref","first-page":"e30126","DOI":"10.1371\/journal.pone.0030126","article-title":"Dirichlet multinomial mixtures: generative models for microbial metagenomics","volume":"7","author":"I Holmes","year":"2012","journal-title":"PloS One"},{"key":"pcbi.1010820.ref056","doi-asserted-by":"crossref","DOI":"10.1002\/9780470316801","volume-title":"Finding Groups in Data","author":"L Kaufman","year":"1990"},{"issue":"4","key":"pcbi.1010820.ref057","doi-asserted-by":"crossref","first-page":"e61562","DOI":"10.1371\/journal.pone.0061562","article-title":"A plea for neutral comparison studies in computational sciences","volume":"8","author":"AL Boulesteix","year":"2013","journal-title":"PloS One"},{"key":"pcbi.1010820.ref058","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1186\/s12874-017-0417-2","article-title":"Towards evidence-based computational statistics: lessons from clinical research on the role and design of real-data benchmark studies","volume":"17","author":"AL Boulesteix","year":"2017","journal-title":"BMC Medical Research Methodology"},{"issue":"1","key":"pcbi.1010820.ref059","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-021-04193-6","article-title":"Comparison study of differential abundance testing methods using two large Parkinson disease gut microbiome datasets derived from 16S amplicon sequencing","volume":"22","author":"ZD Wallen","year":"2021","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1010820.ref060","doi-asserted-by":"crossref","first-page":"4048","DOI":"10.1016\/j.csbj.2020.11.049","article-title":"Measuring the microbiome: Best practices for developing and benchmarking microbiomics 
methods","volume":"18","author":"NA Bokulich","year":"2020","journal-title":"Computational and Structural Biotechnology Journal"},{"issue":"11","key":"pcbi.1010820.ref061","doi-asserted-by":"crossref","first-page":"2600","DOI":"10.1073\/pnas.1708274114","article-title":"The preregistration revolution","volume":"115","author":"BA Nosek","year":"2018","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"pcbi.1010820.ref062","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1186\/s13059-021-02306-1","article-title":"Microbiome meta-analysis and cross-disease comparison enabled by the SIAMCAT machine learning toolbox","volume":"22","author":"J Wirbel","year":"2021","journal-title":"Genome Biology"},{"issue":"1","key":"pcbi.1010820.ref063","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-021-93645-3","article-title":"Tree-aggregated predictive modeling of microbiome data","volume":"11","author":"J Bien","year":"2021","journal-title":"Scientific Reports"},{"key":"pcbi.1010820.ref064","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1038\/s41591-022-01688-4","article-title":"Microbiome and metabolome features of the cardiometabolic disease spectrum","volume":"28","author":"S Fromentin","year":"2022","journal-title":"Nature Medicine"},{"issue":"7","key":"pcbi.1010820.ref065","doi-asserted-by":"crossref","first-page":"e177","DOI":"10.1371\/journal.pbio.0050177","article-title":"Development of the human infant intestinal microbiota","volume":"5","author":"C Palmer","year":"2007","journal-title":"PLoS Biology"},{"key":"pcbi.1010820.ref066","doi-asserted-by":"crossref","first-page":"4586","DOI":"10.1073\/pnas.1000097107","article-title":"Composition, variability, and temporal stability of the intestinal microbiota of the elderly","volume":"108","author":"MJ Claesson","year":"2011","journal-title":"Proceedings of the National Academy of 
Sciences"},{"issue":"12","key":"pcbi.1010820.ref067","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1016\/j.tim.2019.08.001","article-title":"The gut microbiota in the first decade of life","volume":"27","author":"M Derrien","year":"2019","journal-title":"Trends in Microbiology"},{"key":"pcbi.1010820.ref068","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1186\/s40168-018-0608-z","article-title":"Impact of early events and lifestyle on the gut microbiota and metabolic phenotypes in young school-age children","volume":"7","author":"H Zhong","year":"2019","journal-title":"Microbiome"},{"issue":"4","key":"pcbi.1010820.ref069","doi-asserted-by":"crossref","first-page":"1249","DOI":"10.1080\/10618600.2021.1882468","article-title":"Fast computation of latent correlations","volume":"30","author":"G Yoon","year":"2021","journal-title":"Journal of Computational and Graphical Statistics"},{"issue":"1","key":"pcbi.1010820.ref070","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-017-16520-0","article-title":"propr: an R-package for identifying proportionally abundant features using compositional data analysis","volume":"7","author":"TP Quinn","year":"2017","journal-title":"Scientific Reports"},{"issue":"3","key":"pcbi.1010820.ref071","doi-asserted-by":"crossref","first-page":"1436","DOI":"10.1214\/009053606000000281","article-title":"High-dimensional graphs and variable selection with the lasso","volume":"34","author":"N Meinshausen","year":"2006","journal-title":"Annals of Statistics"},{"key":"pcbi.1010820.ref072","volume-title":"Local False Discovery Rates","author":"B Efron","year":"2005"},{"issue":"3","key":"pcbi.1010820.ref073","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1016\/0378-8733(78)90021-7","article-title":"Centrality in social networks conceptual clarification","volume":"1","author":"LC Freeman","year":"1978","journal-title":"Social 
Networks"},{"issue":"2","key":"pcbi.1010820.ref074","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1111\/j.1469-8137.1912.tb05611.x","article-title":"The distribution of the flora in the alpine zone","volume":"11","author":"P Jaccard","year":"1912","journal-title":"New Phytologist"},{"issue":"7500","key":"pcbi.1010820.ref075","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nature13178","article-title":"Dynamics and associations of microbial community types across the human body","volume":"509","author":"T Ding","year":"2014","journal-title":"Nature"},{"key":"pcbi.1010820.ref076","first-page":"1695","article-title":"The igraph software package for complex network research","author":"G Cs\u00e1rdi","year":"2006","journal-title":"InterJournal"},{"key":"pcbi.1010820.ref077","unstructured":"Ushey K, Allaire J, Tang Y. reticulate: interface to \u2018Python\u2019; 2022. Available from: https:\/\/rstudio.github.io\/reticulate\/."},{"key":"pcbi.1010820.ref078","unstructured":"Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K. cluster: cluster analysis basics and extensions; 2022. Available from: https:\/\/CRAN.R-project.org\/package=cluster."},{"key":"pcbi.1010820.ref079","unstructured":"Morgan M. DirichletMultinomial: Dirichlet-multinomial mixture model machine learning for microbiome data; 2022. 
Available from: https:\/\/www.bioconductor.org\/packages\/release\/bioc\/html\/DirichletMultinomial.html."},{"issue":"10","key":"pcbi.1010820.ref080","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v071.i10","article-title":"Computation of graphlet orbits for nodes and edges in sparse graphs","volume":"71","author":"T Ho\u010devar","year":"2016","journal-title":"Journal of Statistical Software"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1010820","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T00:00:00Z","timestamp":1674518400000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010820","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,11]],"date-time":"2024-10-11T15:54:23Z","timestamp":1728662063000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010820"}},"subtitle":[],"editor":[{"given":"Luis Pedro","family":"Coelho","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,1,6]]},"references-count":80,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,1,6]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1010820","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2022.06.24.497500","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,6]]}}}