{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,26]],"date-time":"2025-09-26T08:14:04Z","timestamp":1758874444193,"version":"3.37.3"},"reference-count":84,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2021,10,12]],"date-time":"2021-10-12T00:00:00Z","timestamp":1633996800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Health Institute Director\u2019s Pioneer Award"},{"name":"National Health Institute\u2019s Eureka","award":["R01-GM098465"],"award-info":[{"award-number":["R01-GM098465"]}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,1,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Metagenomes offer a glimpse into the total genomic diversity contained within a sample. Currently, however, there is no straightforward way to obtain a non-redundant list of all putative homologs of a set of reference sequences present in a metagenome.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>To address this problem, we developed a novel clustering approach called \u2018metagenomic clustering by reference library\u2019 (MCRL), where a reference library containing a set of reference genes is clustered with respect to an assembled metagenome. According to our proposed approach, reference genes homologous to similar sets of metagenomic sequences, termed \u2018signatures\u2019, are iteratively clustered in a greedy fashion, retaining at each step the reference genes yielding the lowest E values, and terminating when signatures of remaining reference genes have a minimal overlap. The outcome of this computation is a non-redundant list of reference genes homologous to minimally overlapping sets of contigs, representing potential candidates for gene families present in the metagenome. Unlike metagenomic clustering methods, there is no need for contigs to overlap to be associated with a cluster, enabling MCRL to draw on more information encoded in the metagenome when computing tentative gene families. We demonstrate how MCRL can be used to extract candidate viral gene families from an oral metagenome and an oral virome that otherwise could not be determined using standard approaches. We evaluate the sensitivity, accuracy and robustness of our proposed method for the viral case study and compare it with existing analysis approaches.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>https:\/\/github.com\/a-tadmor\/MCRL.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab703","type":"journal-article","created":{"date-parts":[[2021,10,8]],"date-time":"2021-10-08T04:05:01Z","timestamp":1633665901000},"page":"631-647","source":"Crossref","is-referenced-by-count":4,"title":["MCRL: using a reference library to compress a metagenome into a non-redundant list of sequences, considering viruses as a case study"],"prefix":"10.1093","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1617-2010","authenticated-orcid":false,"given":"Arbel D","family":"Tadmor","sequence":"first","affiliation":[{"name":"TRON - Translational Oncology at the University Medical Center of Johannes Gutenberg University, 55131 Mainz, Germany"},{"name":"Department of Biochemistry and Molecular Biophysics, California Institute of Technology , Pasadena, CA 91125, USA"}]},{"given":"Rob","family":"Phillips","sequence":"additional","affiliation":[{"name":"Department of Bioengineering, California Institute of Technology , Pasadena, CA 91125, USA"},{"name":"Department of Applied Physics, California Institute of Technology , Pasadena, CA 91125, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,10,12]]},"reference":[{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"e1002358","DOI":"10.1371\/journal.pcbi.1002358","article-title":"Metabolic reconstruction for metagenomic data and its application to the human microbiome","volume":"8","author":"Abubucker","year":"2012","journal-title":"PLoS comput. Biol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"9743","DOI":"10.1038\/srep09743","article-title":"MICCA: a complete and accurate software for taxonomic profiling of metagenomic data","volume":"5","author":"Albanese","year":"2015","journal-title":"Sci. Rep"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40168-018-0401-z","article-title":"DeepARG: a deep learning approach for predicting antibiotic resistance genes from metagenomic data","volume":"6","author":"Arango-Argoty","year":"2018","journal-title":"Microbiome"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1093\/bioinformatics\/bti770","article-title":"The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling","volume":"22","author":"Arnold","year":"2006","journal-title":"Bioinformatics"},{"issue":"Suppl. 1","key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D138","DOI":"10.1093\/nar\/gkh121","article-title":"The Pfam protein families database","volume":"32","author":"Bateman","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1038\/ismej.2011.85","article-title":"The oral metagenome in health and disease","volume":"6","author":"Belda-Ferre","year":"2012","journal-title":"ISME J"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"7629","DOI":"10.1128\/AEM.00938-07","article-title":"Metagenomic characterization of Chesapeake Bay virioplankton","volume":"73","author":"Bench","year":"2007","journal-title":"Appl. Environ. Microbiol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1186\/s40168-019-0670-1","article-title":"Identification and reconstruction of novel antibiotic resistance genes from metagenomes","volume":"7","author":"Berglund","year":"2019","journal-title":"Microbiome"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"1545","DOI":"10.1128\/AEM.03305-12","article-title":"Phylogenetic distribution of potential cellulases in bacteria","volume":"79","author":"Berlemont","year":"2013","journal-title":"Appl. Environ. Microbiol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1093\/oxfordjournals.molbev.a025797","article-title":"Recombinant DNA sequences generated by PCR amplification","volume":"14","author":"Bradley","year":"1997","journal-title":"Mol. Biol. Evol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"8365","DOI":"10.1038\/srep08365","article-title":"RASTtk: a modular and extensible implementation of the RAST algorithm for building custom annotation pipelines and annotating batches of genomes","volume":"5","author":"Brettin","year":"2015","journal-title":"Sci. Rep"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using DIAMOND","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat. Methods"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1186\/1471-2105-10-421","article-title":"BLAST+: architecture and applications","volume":"10","author":"Camacho","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1046\/j.1365-2958.2003.03580.x","article-title":"Prophages and bacterial genomics: what have we learned so far?","volume":"49","author":"Casjens","year":"2003","journal-title":"Mol. Microbiol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"1091","DOI":"10.1128\/JB.187.3.1091-1104.2005","article-title":"The generalized transducing Salmonella bacteriophage ES18: complete genome sequence and DNA packaging strategy","volume":"187","author":"Casjens","year":"2005","journal-title":"J. Bacteriol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"1315","DOI":"10.1038\/nbt1004-1315","article-title":"What is a hidden Markov model?","volume":"22","author":"Eddy","year":"2004","journal-title":"Nat. Biotechnol"},{"key":"2023033004312870000_","first-page":"205","article-title":"A new generation of homology search tools based on probabilistic inference","volume":"23","author":"Eddy","year":"2009","journal-title":"Genome Inform"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"2460","DOI":"10.1093\/bioinformatics\/btq461","article-title":"Search and clustering orders of magnitude faster than BLAST","volume":"26","author":"Edgar","year":"2010","journal-title":"Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"3316","DOI":"10.1093\/bioinformatics\/bts599","article-title":"Real time metagenomics: using k-mers to annotate metagenomes","volume":"28","author":"Edwards","year":"2012","journal-title":"Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"504","DOI":"10.1038\/nrmicro1163","article-title":"Viral metagenomics","volume":"3","author":"Edwards","year":"2005","journal-title":"Nat. Rev. Microbiol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"271","DOI":"10.1186\/1471-2105-12-271","article-title":"DNACLUST: accurate and efficient clustering of phylogenetic marker genes","volume":"12","author":"Ghodsi","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1038\/ismej.2014.106","article-title":"Improved annotation of antibiotic resistance determinants reveals microbial resistomes cluster by ecology","volume":"9","author":"Gibson","year":"2015","journal-title":"ISME J"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"prot5368","DOI":"10.1101\/pdb.prot5368","article-title":"Using the metagenomics RAST server (MG-RAST) for analyzing shotgun metagenomes","volume":"2010","author":"Glass","year":"2010","journal-title":"Cold Spring Harb. Protoc"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1093\/nar\/gkg128","article-title":"The TIGRFAMs database of protein families","volume":"31","author":"Haft","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"506","DOI":"10.1016\/j.mib.2003.09.004","article-title":"Bacteriophage genomics","volume":"6","author":"Hendrix","year":"2003","journal-title":"Curr. Opin. Microbiol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/j.virol.2014.09.019","article-title":"Development of a virus detection and discovery pipeline using next generation sequencing","volume":"471","author":"Ho","year":"2014","journal-title":"Virology"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1093\/bioinformatics\/14.5.423","article-title":"Removing near-neighbour redundancy from large protein sequence collections","volume":"14","author":"Holm","year":"1998","journal-title":"Bioinformatics"},{"key":"2023033004312870000_","first-page":"e000131","article-title":"ARIBA: rapid antimicrobial resistance genotyping directly from sequencing reads","volume":"3","author":"Hunt","year":"2017","journal-title":"Microb. Genom"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1186\/s40793-016-0138-x","article-title":"The standard operating procedure of the DOE-JGI Metagenome Annotation Pipeline (MAP v. 4)","volume":"11","author":"Huntemann","year":"2016","journal-title":"Stand. Genomic Sci"},{"key":"2023033004312870000_","first-page":"D211","article-title":"InterPro: the integrative protein signature database","volume":"37 (Suppl. 1","author":"Hunter","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"fnw077","DOI":"10.1093\/femsle\/fnw077","article-title":"Computational prospecting the great viral unknown","volume":"363","author":"Hurwitz","year":"2016","journal-title":"FEMS Microbiol. Lett"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"e1004957","DOI":"10.1371\/journal.pcbi.1004957","article-title":"MEGAN community edition-interactive exploration and analysis of large-scale microbiome sequencing data","volume":"12","author":"Huson","year":"2016","journal-title":"PLoS Comput. Biol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/s40168-017-0233-2","article-title":"Fast and simple protein-alignment-guided assembly of orthologous gene families from microbiome sequencing reads","volume":"5","author":"Huson","year":"2017","journal-title":"Microbiome"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"KEGG: kyoto encyclopedia of genes and genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"e00003","DOI":"10.1128\/mSystems.00003-15","article-title":"Open-source sequence clustering methods improve the state of the art","volume":"1","author":"Kopylova","year":"2016","journal-title":"MSystems"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"E2401","DOI":"10.1073\/pnas.1621061114","article-title":"Multiple origins of viral capsid proteins from cellular ancestors","volume":"114","author":"Krupovic","year":"2017","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D574","DOI":"10.1093\/nar\/gkw1009","article-title":"MEGARes: an antimicrobial resistance database for high throughput sequencing","volume":"45","author":"Lakin","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1038\/nbt.2942","article-title":"An integrated catalog of reference genes in the human gut microbiome","volume":"32","author":"Li","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1093\/bioinformatics\/17.3.282","article-title":"Clustering of highly homologous sequences to reduce the size of large protein databases","volume":"17","author":"Li","year":"2001","journal-title":"Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"656","DOI":"10.1093\/bib\/bbs035","article-title":"Ultrafast clustering algorithms for metagenomic sequence analysis","volume":"13","author":"Li","year":"2012","journal-title":"Brief. Bioinform"},{"key":"2023033004312870000_","first-page":"1","article-title":"VIP: an integrated pipeline for metagenomics of virus identification and discovery","volume":"6","author":"Li","year":"2016","journal-title":"Sci. Rep"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"1658","DOI":"10.1093\/bioinformatics\/btl158","article-title":"Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences","volume":"22","author":"Li","year":"2006","journal-title":"Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D687","DOI":"10.1093\/nar\/gky1080","article-title":"VFDB 2019: a comparative pathogenomic platform with an interactive web interface","volume":"47","author":"Liu","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D98","DOI":"10.1093\/nar\/gkr1032","article-title":"GeneDB\u2014an annotation database for pathogens","volume":"40","author":"Logan-Klumpler","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D265","DOI":"10.1093\/nar\/gkz991","article-title":"CDD\/SPARCLE: the conserved domain database in 2020","volume":"48","author":"Lu","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1038\/nature10576","article-title":"Metagenomic analysis of a permafrost microbial community reveals a rapid response to thaw","volume":"480","author":"Mackelprang","year":"2011","journal-title":"Nature"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"e593","DOI":"10.7717\/peerj.593","article-title":"Swarm: robust and fast clustering method for amplicon-based studies","volume":"2","author":"Mah\u00e9","year":"2014","journal-title":"PeerJ"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"e1420","DOI":"10.7717\/peerj.1420","article-title":"Swarm v2: highly-scalable and high-resolution amplicon clustering","volume":"3","author":"Mah\u00e9","year":"2015","journal-title":"PeerJ"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"63","DOI":"10.4056\/sigs.632","article-title":"The DOE-JGI Standard operating procedure for the annotations of microbial genomes","volume":"1","author":"Mavromatis","year":"2009","journal-title":"Stand. Genomic Sci"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"3348","DOI":"10.1128\/AAC.00419-13","article-title":"The comprehensive antibiotic resistance database","volume":"57","author":"McArthur","year":"2013","journal-title":"Antimicrob. Agents Chemother"},{"key":"2023033004312870000_","first-page":"D347","article-title":"The National Microbial Pathogen Database Resource (NMPDR): a genomics platform based on subsystem annotation","volume":"35 (Suppl. 1","author":"McNeil","year":"2006","journal-title":"Nucleic Acids Res"},{"first-page":"27","year":"2013","author":"Mercier","key":"2023033004312870000_"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1038\/nature11209","article-title":"A framework for human microbiome research","volume":"486","author":"Meth\u00e9","year":"2012","journal-title":"Nature"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1186\/1471-2105-9-386","article-title":"The metagenomics RAST server\u2013a public resource for the automatic phylogenetic and functional analysis of metagenomes","volume":"9","author":"Meyer","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"6643","DOI":"10.1093\/nar\/gkp698","article-title":"FIGfams: yet another set of protein families","volume":"37","author":"Meyer","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"1180","DOI":"10.1101\/gr.171934.113","article-title":"A cloud-compatible bioinformatics pipeline for ultrarapid pathogen identification from next-generation sequencing of clinical samples","volume":"24","author":"Naccache","year":"2014","journal-title":"Genome Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1016\/B978-0-12-407863-5.00019-8","article-title":"Advancing our understanding of the human microbiome using QIIME","volume":"531","author":"Navas-Molina","year":"2013","journal-title":"Methods Enzymol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"75","DOI":"10.4137\/BBI.S12462","article-title":"Metagenomics: tools and insights for analyzing next-generation sequencing data derived from biodiversity studies","volume":"9","author":"Oulas","year":"2015","journal-title":"Bioinform. Biol. Insights"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1038\/nature19094","article-title":"Uncovering Earth\u2019s virome","volume":"536","author":"Paez-Espino","year":"2016","journal-title":"Nature"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"2192","DOI":"10.1128\/AEM.01285-09","article-title":"Detection and quantification of functional genes of cellulose-degrading, fermentative, and sulfate-reducing bacteria and methanogenic archaea","volume":"76","author":"Pereyra","year":"2010","journal-title":"Appl. Environ. Microbiol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D593","DOI":"10.1093\/nar\/gkr859","article-title":"ViPR: an open bioinformatics database and analysis resource for virology research","volume":"40","author":"Pickett","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"915","DOI":"10.1038\/ismej.2011.169","article-title":"Evidence of a robust resident bacteriophage population revealed through analysis of the human salivary virome","volume":"6","author":"Pride","year":"2012","journal-title":"ISME J"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D61","DOI":"10.1093\/nar\/gkl842","article-title":"NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins","volume":"35 (Suppl. 1","author":"Pruitt","year":"2007","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1146\/annurev.genet.42.110807.091545","article-title":"The bacteriophage DNA packaging motor","volume":"42","author":"Rao","year":"2008","journal-title":"Annu. Rev. Genet"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1038\/s41564-018-0292-6","article-title":"Prediction of the intestinal resistome by a three-dimensional structure-based method","volume":"4","author":"Rupp\u00e9","year":"2018","journal-title":"Nat. Microbiol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D1","DOI":"10.1093\/nar\/gkz899","article-title":"Database resources of the National Center for Biotechnology Information","volume":"48","author":"Sayers","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D693","DOI":"10.1093\/nar\/gky999","article-title":"Victors: a web-based knowledge base of virulence factors in human and animal pathogens","volume":"47","author":"Sayers","year":"2019","journal-title":"Nucleic acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"7537","DOI":"10.1128\/AEM.01541-09","article-title":"Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities","volume":"75","author":"Schloss","year":"2009","journal-title":"Appl. Environ. Microbiol"},{"key":"2023033004312870000_","first-page":"D546","article-title":"Community cyberinfrastructure for advanced microbial ecology research and analysis: the CAMERA resource","volume":"39 (Suppl. 1","author":"Sun","year":"2010","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"1282","DOI":"10.1093\/bioinformatics\/btm098","article-title":"UniRef: comprehensive and non-redundant UniProt reference clusters","volume":"23","author":"Suzek","year":"2007","journal-title":"Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1126\/science.1200758","article-title":"Probing individual environmental bacteria for viruses by using microfluidic digital PCR","volume":"333","author":"Tadmor","year":"2011","journal-title":"Science"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1093\/nar\/28.1.33","article-title":"The COG database: a tool for genome-scale analysis of protein functions and evolution","volume":"8","author":"Tatusov","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"e64465","DOI":"10.1371\/journal.pone.0064465","article-title":"VirusFinder: software for efficient and accurate detection of viruses and their integration sites in host genomes through next generation sequencing data","volume":"8","author":"Wang","year":"2013","journal-title":"PLoS One"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1186\/s40168-015-0093-6","article-title":"Xander: employing a novel method for efficient gene-targeted metagenomic assembly","volume":"3","author":"Wang","year":"2015","journal-title":"Microbiome"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"560","DOI":"10.1038\/nature06269","article-title":"Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite","volume":"450","author":"Warnecke","year":"2007","journal-title":"Nature"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"391","DOI":"10.1111\/j.2041-1014.2010.00587.x","article-title":"Community and gene composition of a human dental plaque microbiota obtained by metagenomic sequencing","volume":"25","author":"Xie","year":"2010","journal-title":"Mol. Microbiol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"2346","DOI":"10.1093\/bioinformatics\/btw136","article-title":"ARGs-OAP: online analysis pipeline for antibiotic resistance genes detection from metagenomic data using an integrated structured ARG-database","volume":"32","author":"Yang","year":"2016","journal-title":"Bioinformatics"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D624","DOI":"10.1093\/nar\/gku985","article-title":"PAIDB v2. 0: exploration and analysis of pathogenicity and resistance islands","volume":"43","author":"Yoon","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"2640","DOI":"10.1093\/jac\/dks261","article-title":"Identification of acquired antimicrobial resistance genes","volume":"67","author":"Zankari","year":"2012","journal-title":"J. Antimicrob. Chemother"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"e1003737","DOI":"10.1371\/journal.pcbi.1003737","article-title":"A scalable and accurate targeted gene assembly tool (SAT-Assembler) for next-generation sequencing data","volume":"10","author":"Zhang","year":"2014","journal-title":"PLoS Comput. Biol"},{"key":"2023033004312870000_","doi-asserted-by":"crossref","first-page":"D466","DOI":"10.1093\/nar\/gkw857","article-title":"Influenza Research Database: an integrated bioinformatics resource for influenza virus research","volume":"45","author":"Zhang","year":"2017","journal-title":"Nucleic Acids Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab703\/41646194\/btab703.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/3\/631\/49692893\/btab703.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/38\/3\/631\/49692893\/btab703.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,10]],"date-time":"2023-11-10T14:06:45Z","timestamp":1699625205000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/38\/3\/631\/6390794"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,10,12]]},"references-count":84,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,1,12]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab703","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2022,2,1]]},"published":{"date-parts":[[2021,10,12]]}}}