{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T06:22:27Z","timestamp":1772173347882,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1012268","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2025,5,12]],"date-time":"2025-05-12T00:00:00Z","timestamp":1747008000000}}],"reference-count":40,"publisher":"Public Library of Science (PLoS)","issue":"5","license":[{"start":{"date-parts":[[2025,5,2]],"date-time":"2025-05-02T00:00:00Z","timestamp":1746144000000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100002341","name":"Academy of Finland","doi-asserted-by":"publisher","award":["339172"],"award-info":[{"award-number":["339172"]}],"id":[{"id":"10.13039\/501100002341","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000268","name":"Biotechnology and Biological Sciences Research Council","doi-asserted-by":"publisher","award":["BB\/X011054\/1"],"award-info":[{"award-number":["BB\/X011054\/1"]}],"id":[{"id":"10.13039\/501100000268","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000268","name":"Biotechnology and Biological Sciences Research Council","doi-asserted-by":"publisher","award":["BB\/CCG2260\/1"],"award-info":[{"award-number":["BB\/CCG2260\/1"]}],"id":[{"id":"10.13039\/501100000268","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>The growing interest in the role of the gut virome in human health and disease, has led to several recent large-scale viral catalogue projects mining human gut metagenomes each using varied computational tools and quality control criteria. 
Importantly, there has been to date no consistent comparison of these catalogues\u2019 quality, diversity, and overlap. In this project, we therefore systematically surveyed nine previously published human gut viral catalogues. While these catalogues collectively screened &gt;40,000 human fecal metagenomes, 82% of the recovered 345,613 viral sequences were unique to one catalogue, highlighting limited redundancy between the resources and suggesting the need for an aggregated resource bringing these viral sequences together. We further expanded these viral catalogues by mining 7,867 infant gut metagenomes from 12 large-scale infant studies collected in 9 different countries. From these datasets, we constructed the Aggregated Gut Viral Catalogue (AVrC), a unified modular resource containing 1,018,941 dereplicated viral sequences (449,859 species-level vOTUs). Using computational inference tools, annotations were obtained for each vOTU representative sequence quality, viral taxonomy, predicted viral lifestyle, and putative host. 
This project aims to facilitate the reuse of previously published viral catalogues by the research community and follows a modular framework to enable future expansions as novel data becomes available.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1012268","type":"journal-article","created":{"date-parts":[[2025,5,2]],"date-time":"2025-05-02T16:04:29Z","timestamp":1746201869000},"page":"e1012268","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":2,"title":["The Aggregated Gut Viral Catalogue (AVrC): A unified resource for exploring the viral diversity of the human gut"],"prefix":"10.1371","volume":"21","author":[{"given":"Anastasia","family":"Galperina","sequence":"first","affiliation":[]},{"given":"Gabriele Andrea","family":"Lugli","sequence":"additional","affiliation":[]},{"given":"Christian","family":"Milani","sequence":"additional","affiliation":[]},{"given":"Willem M.","family":"De Vos","sequence":"additional","affiliation":[]},{"given":"Marco","family":"Ventura","sequence":"additional","affiliation":[]},{"given":"Anne","family":"Salonen","sequence":"additional","affiliation":[]},{"given":"Bonnie","family":"Hurwitz","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4125-7561","authenticated-orcid":true,"given":"Alise Jany","family":"Ponsero","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2025,5,2]]},"reference":[{"key":"pcbi.1012268.ref001","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1186\/s13100-017-0095-y","article-title":"Viral communities of the human gut: metagenomic analysis of composition and dynamics","volume":"8","author":"V Aggarwala","year":"2017","journal-title":"Mob DNA"},{"issue":"2","key":"pcbi.1012268.ref002","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/j.chom.2019.01.017","article-title":"Bacteriophages of the human gut: the known unknown of the 
microbiome","volume":"25","author":"AN Shkoporov","year":"2019","journal-title":"Cell Host Microbe"},{"issue":"1","key":"pcbi.1012268.ref003","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1186\/s40168-017-0283-5","article-title":"VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data","volume":"5","author":"J Ren","year":"2017","journal-title":"Microbiome"},{"issue":"1","key":"pcbi.1012268.ref004","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1007\/s40484-019-0187-4","article-title":"Identifying viruses from metagenomic data using deep learning","volume":"8","author":"J Ren","year":"2020","journal-title":"Quant Biol"},{"issue":"21","key":"pcbi.1012268.ref005","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkaa856","article-title":"Seeker: alignment-free identification of bacteriophage genomes by deep learning","volume":"48","author":"N Auslander","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"pcbi.1012268.ref006","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1186\/s40168-020-00990-y","article-title":"VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses","volume":"9","author":"J Guo","year":"2021","journal-title":"Microbiome"},{"issue":"1","key":"pcbi.1012268.ref007","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1186\/s40168-019-0657-y","article-title":"Mining, analyzing, and integrating viral signals from metagenomic data","volume":"7","author":"T Zheng","year":"2019","journal-title":"Microbiome"},{"issue":"1","key":"pcbi.1012268.ref008","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1186\/s40168-020-00867-0","article-title":"Vibrant: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences","volume":"8","author":"K 
Kieft","year":"2020","journal-title":"Microbiome"},{"key":"pcbi.1012268.ref009","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkaa946","article-title":"Img\/vr v3: an integrated ecological and evolutionary framework for interrogating genomes of uncultivated viruses","volume":"49","author":"S Roux","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1012268.ref010","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkac1037","article-title":"Img\/vr v4: an expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata","volume":"51","author":"AP Camargo","year":"2023","journal-title":"Nucleic Acids Res"},{"issue":"5","key":"pcbi.1012268.ref011","doi-asserted-by":"crossref","DOI":"10.1016\/j.chom.2020.08.003","article-title":"The gut virome database reveals age-dependent patterns of virome diversity in the human gut","volume":"28","author":"A Gregory","year":"2020","journal-title":"Cell Host Microbe"},{"issue":"23","key":"pcbi.1012268.ref012","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2023202118","article-title":"A catalog of tens of thousands of viruses from human metagenomes reveals hidden associations with chronic diseases","volume":"118","author":"MJ Tisza","year":"2021","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"1","key":"pcbi.1012268.ref013","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1186\/s40168-021-01017-w","article-title":"Thousands of previously unknown phages discovered in whole-community human gut metagenomes","volume":"9","author":"S Benler","year":"2021","journal-title":"Microbiome"},{"issue":"1","key":"pcbi.1012268.ref014","doi-asserted-by":"crossref","first-page":"5252","DOI":"10.1038\/s41467-022-32832-w","article-title":"Extensive gut virome variation and its associations with host and environmental factors in a population-level cohort","volume":"13","author":"S Nishijima","year":"2022","journal-title":"Nat 
Commun"},{"issue":"5","key":"pcbi.1012268.ref015","doi-asserted-by":"crossref","DOI":"10.1128\/msystems.00382-21","article-title":"A previously undescribed highly prevalent phage identified in a Danish enteric virome catalog","volume":"6","author":"L Van Espen","year":"2021","journal-title":"mSystems"},{"issue":"7","key":"pcbi.1012268.ref016","doi-asserted-by":"crossref","first-page":"960","DOI":"10.1038\/s41564-021-00928-6","article-title":"Metagenomic compendium of 189,680 DNA viruses from the human gut microbiome","volume":"6","author":"S Nayfach","year":"2021","journal-title":"Nat Microbiol"},{"issue":"4","key":"pcbi.1012268.ref017","doi-asserted-by":"crossref","DOI":"10.1016\/j.cell.2021.01.029","article-title":"Massive expansion of human gut bacteriophage diversity","volume":"184","author":"L Camarillo-Guerrero","year":"2021","journal-title":"Cell"},{"issue":"5","key":"pcbi.1012268.ref018","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1038\/s41564-023-01345-7","article-title":"Expanding known viral diversity in the healthy infant gut","volume":"8","author":"SA Shah","year":"2023","journal-title":"Nat Microbiol"},{"issue":"2","key":"pcbi.1012268.ref019","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1038\/s41396-021-01090-x","article-title":"Phages in the infant gut: a framework for virome development during early life","volume":"16","author":"M Shamash","year":"2022","journal-title":"ISME J"},{"issue":"10","key":"pcbi.1012268.ref020","doi-asserted-by":"crossref","first-page":"801","DOI":"10.1016\/j.tim.2016.06.001","article-title":"The bacterial microbiome and virome milestones of infant development","volume":"24","author":"ES Lim","year":"2016","journal-title":"Trends Microbiol"},{"key":"pcbi.1012268.ref021","first-page":"1","article-title":"Identification of mobile genetic elements with geNomad","volume":"41","author":"A Camargo","year":"2023","journal-title":"Nat 
Biotechnol"},{"issue":"1","key":"pcbi.1012268.ref022","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1186\/s13059-024-03236-4","article-title":"Benchmarking bioinformatic virus identification tools using real-world metagenomic data across biomes","volume":"25","author":"L-Y Wu","year":"2024","journal-title":"Genome Biol"},{"key":"pcbi.1012268.ref023","first-page":"14","article-title":"Evaluation of computational phage detection tools for metagenomic datasets.","author":"KE Schackart","year":"2023"},{"issue":"6","key":"pcbi.1012268.ref024","doi-asserted-by":"crossref","DOI":"10.1136\/bmjopen-2018-028500","article-title":"Cohort profile: Finnish Health and Early Life Microbiota (HELMi) longitudinal birth cohort","volume":"9","author":"K Korpela","year":"2019","journal-title":"BMJ Open"},{"issue":"1","key":"pcbi.1012268.ref025","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1186\/s12866-021-02337-5","article-title":"Ecology impacts the decrease of spirochaetes and prevotella in the fecal gut microbiota of urban humans","volume":"21","author":"L Thingholm","year":"2021","journal-title":"BMC Microbiol"},{"issue":"5","key":"pcbi.1012268.ref026","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1038\/s41587-020-00774-7","article-title":"CheckV assesses the quality and completeness of metagenome-assembled viral genomes","volume":"39","author":"S Nayfach","year":"2021","journal-title":"Nat Biotechnol"},{"key":"pcbi.1012268.ref027","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btab293","article-title":"Bacteriophage classification for assembled contigs using graph convolutional network","volume":"37","author":"J Shang","year":"2021","journal-title":"Bioinform"},{"issue":"1","key":"pcbi.1012268.ref028","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbac487","article-title":"PhaTYP: predicting the lifestyle for bacteriophages using BERT","volume":"24","author":"J Shang","year":"2023","journal-title":"Brief 
Bioinform"},{"issue":"4","key":"pcbi.1012268.ref029","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pbio.3002083","article-title":"iPHoP: An integrated machine learning framework to maximize host prediction for metagenome-derived viruses of archaea and bacteria","volume":"21","author":"S Roux","year":"2023","journal-title":"PLoS Biol"},{"issue":"1","key":"pcbi.1012268.ref030","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1186\/s40168-024-01905-x","article-title":"Viromes vs. mixed community metagenomes: choice of method dictates interpretation of viral community ecology","volume":"12","author":"JC Kosmopoulos","year":"2024","journal-title":"Microbiome"},{"key":"pcbi.1012268.ref031","first-page":"14","article-title":"An extended catalog of integrated prophages in the infant and adult fecal microbiome shows high prevalence of lysogeny.","author":"E Dikareva","year":"2023"},{"issue":"1","key":"pcbi.1012268.ref032","doi-asserted-by":"crossref","first-page":"1864","DOI":"10.1038\/s41467-024-45793-z","article-title":"A metagenomic catalog of the early-life human gut virome","volume":"15","author":"S Zeng","year":"2024","journal-title":"Nat Commun"},{"issue":"1","key":"pcbi.1012268.ref033","doi-asserted-by":"crossref","first-page":"1791","DOI":"10.1038\/s41467-024-46033-0","article-title":"Exploring the gut DNA virome in fecal immunochemical test stool samples reveals associations with lifestyle in a large population-based study","volume":"15","author":"P Istvan","year":"2024","journal-title":"Nat Commun"},{"issue":"2","key":"pcbi.1012268.ref034","first-page":"001198","article-title":"The long and short of it: benchmarking viromics using Illumina, Nanopore and PacBio sequencing technologies","volume":"10","author":"R Cook","year":"2024","journal-title":"Microb Genom"},{"issue":"4","key":"pcbi.1012268.ref035","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1089\/phage.2021.0007","article-title":"Infrastructure for a phage reference database: 
identification of large-scale biases in the current collection of cultured phage genomes","volume":"2","author":"R Cook","year":"2021","journal-title":"Phage"},{"issue":"10","key":"pcbi.1012268.ref036","doi-asserted-by":"crossref","first-page":"1674","DOI":"10.1093\/bioinformatics\/btv033","article-title":"MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph","volume":"31","author":"D Li","year":"2015","journal-title":"Bioinformatics"},{"issue":"3","key":"pcbi.1012268.ref037","doi-asserted-by":"crossref","DOI":"10.1128\/mSystems.00583-21","article-title":"METAnnotatorX2: a comprehensive tool for deep and shallow metagenomic data set analyses","volume":"6","author":"C Milani","year":"2021","journal-title":"mSystems"},{"issue":"14","key":"pcbi.1012268.ref038","doi-asserted-by":"crossref","first-page":"4126","DOI":"10.1093\/bioinformatics\/btaa490","article-title":"Metaviral SPAdes: assembly of viruses from metagenomic data","volume":"36","author":"D Antipov","year":"2020","journal-title":"Bioinformatics"},{"issue":"11","key":"pcbi.1012268.ref039","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"M Steinegger","year":"2017","journal-title":"Nat Biotechnol"},{"issue":"7","key":"pcbi.1012268.ref040","doi-asserted-by":"crossref","first-page":"908","DOI":"10.1016\/j.chom.2022.06.003","article-title":"Advances and challenges in cataloging the human gut virome","volume":"30","author":"J Li","year":"2022","journal-title":"Cell Host Microbe"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1012268","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2025,5,12]],"date-time":"2025-05-12T00:00:00Z","timestamp":1747008000000}}],"container-title":["PLOS Computational 
Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1012268","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,5,12]],"date-time":"2025-05-12T14:15:52Z","timestamp":1747059352000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1012268"}},"subtitle":[],"editor":[{"given":"Iddo","family":"Friedberg","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,5,2]]},"references-count":40,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2025,5,2]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1012268","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.06.24.600367","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,2]]}}}