{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T18:47:52Z","timestamp":1779216472135,"version":"3.51.4"},"reference-count":374,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,3,3]],"date-time":"2023-03-03T00:00:00Z","timestamp":1677801600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>Metagenomics has enabled accessing the genetic repertoire of natural microbial communities. Metagenome shotgun sequencing has become the method of choice for studying and classifying microorganisms from various environments. To this end, several methods have been developed to process and analyze the sequence data from raw reads to end-products such as predicted protein sequences or families. In this article, we provide a thorough review to simplify such processes and discuss the alternative methodologies that can be followed in order to explore biodiversity at the protein family level. We provide details for analysis tools and we comment on their scalability as well as their advantages and disadvantages. 
Finally, we report the available data repositories and recommend various approaches for protein family annotation related to phylogenetic distribution, structure prediction and metadata enrichment.<\/jats:p>","DOI":"10.3389\/fbinf.2023.1157956","type":"journal-article","created":{"date-parts":[[2023,3,3]],"date-time":"2023-03-03T18:41:56Z","timestamp":1677868916000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["Exploring microbial functional biodiversity at the protein family level\u2014From metagenomic sequence reads to annotated protein clusters"],"prefix":"10.3389","volume":"3","author":[{"given":"Fotis A.","family":"Baltoumas","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Evangelos","family":"Karatzas","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Paez-Espino","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nefeli K.","family":"Venetsianou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eleni","family":"Aplakidou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Anastasis","family":"Oulas","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robert D.","family":"Finn","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sergey","family":"Ovchinnikov","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Evangelos","family":"Pafilis","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nikos 
C.","family":"Kyrpides","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Georgios A.","family":"Pavlopoulos","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2023,3,3]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"e126","DOI":"10.1093\/nar\/gks406","article-title":"PhiSpy: A novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies","volume":"40","author":"Akhter","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"B2","volume-title":"Concoct: Clustering cONtigs on COverage and ComposiTion","author":"Alneberg","year":"2013"},{"key":"B3","doi-asserted-by":"publisher","first-page":"693","DOI":"10.1016\/0022-2836(87)90352-4","article-title":"Correlation of co-ordinated amino acid substitutions with function in viruses related to tobacco mosaic virus","volume":"193","author":"Altschuh","year":"1987","journal-title":"J. Mol. Biol."},{"key":"B4","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/s0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol."},{"key":"B5","doi-asserted-by":"publisher","first-page":"304","DOI":"10.3389\/fgene.2018.00304","article-title":"MARVEL, a tool for prediction of bacteriophage sequences in metagenomic bins","volume":"9","author":"Amgarten","year":"2018","journal-title":"Front. 
Genet."},{"key":"B6","doi-asserted-by":"publisher","first-page":"D376","DOI":"10.1093\/nar\/gkz1064","article-title":"The SCOP database in 2020: Expanded classification of representative family and superfamily domains of known protein structures","volume":"48","author":"Andreeva","year":"2020","journal-title":"Nucleic Acids Res."},{"key":"B7","doi-asserted-by":"crossref","first-page":"367","DOI":"10.5220\/0003350803670368","volume-title":"Proceedings of the international conference on Bioinformatics models, methods and algorithms","year":"2011"},{"key":"B8","doi-asserted-by":"publisher","first-page":"4126","DOI":"10.1093\/bioinformatics\/btaa490","article-title":"Metaviral SPAdes: Assembly of viruses from metagenomic data","volume":"36","author":"Antipov","year":"2020","journal-title":"Bioinformatics"},{"key":"B9","doi-asserted-by":"publisher","first-page":"824","DOI":"10.1038\/s41587-020-0561-9","article-title":"Genome editing with CRISPR\u2013Cas nucleases, base editors, transposases and prime editors","volume":"38","author":"Anzalone","year":"2020","journal-title":"Nat. 
Biotechnol."},{"key":"B10","doi-asserted-by":"publisher","first-page":"W16","DOI":"10.1093\/nar\/gkw387","article-title":"Phaster: A better, faster version of the PHAST phage search tool","volume":"44","author":"Arndt","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"B11","doi-asserted-by":"publisher","first-page":"e121","DOI":"10.1093\/nar\/gkaa856","article-title":"Seeker: Alignment-free identification of bacteriophage genomes by deep learning","volume":"48","author":"Auslander","year":"2020","journal-title":"Nucleic Acids Res."},{"key":"B12","doi-asserted-by":"publisher","first-page":"e33","DOI":"10.1093\/nar\/gkx1313","article-title":"HipMCL: A high-performance parallel implementation of the markov clustering algorithm for large-scale networks","volume":"46","author":"Azad","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"B13","doi-asserted-by":"publisher","first-page":"12364","DOI":"10.3390\/ijms150712364","article-title":"Exploring neighborhoods in the metagenome universe","volume":"15","author":"A\u00dfhauer","year":"2014","journal-title":"Int. J. Mol. 
Sci."},{"key":"B14","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1126\/science.1065659","article-title":"Protein structure prediction and structural genomics","volume":"294","author":"Baker","year":"2001","journal-title":"Science"},{"key":"B15","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1186\/1471-2105-4-2","article-title":"An automated method for finding molecular complexes in large protein interaction networks","volume":"4","author":"Bader","year":"2003","journal-title":"BMC Bioinforma."},{"key":"B16","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1126\/science.abj8754","article-title":"Accurate prediction of protein structures and interactions using a three-track neural network","volume":"373","author":"Baek","year":"2021","journal-title":"Science"},{"key":"B17","doi-asserted-by":"publisher","first-page":"4264","DOI":"10.1093\/bioinformatics\/btac509","article-title":"Identification of bacteriophage genome sequences with representation learning","volume":"38","author":"Bai","year":"2022","journal-title":"Bioinformatics"},{"key":"B18","doi-asserted-by":"publisher","first-page":"1245","DOI":"10.3390\/biom11081245","article-title":"Biomolecule and bioentity interaction databases in systems biology: A comprehensive review","volume":"11","author":"Baltoumas","year":"","journal-title":"Biomolecules"},{"key":"B19","doi-asserted-by":"publisher","first-page":"lqab090","DOI":"10.1101\/2021.05.14.444150","article-title":"OnTheFly 2.0: A text-mining web application for automated biomedical entity recognition, document annotation, network and functional enrichment analysis","volume":"3","author":"Baltoumas","year":"","journal-title":"Bioinformatics"},{"key":"B20","doi-asserted-by":"publisher","first-page":"D480","DOI":"10.1093\/nar\/gkaa1100","article-title":"UniProt: The universal protein knowledgebase in 2021","volume":"49","author":"Bateman","year":"2021","journal-title":"Nucleic Acids 
Res."},{"key":"B21","first-page":"1","article-title":"Folding@home: Lessons from eight years of volunteer distributed computing","volume-title":"2009 IEEE international symposium on parallel and distributed processing","author":"Beberg","year":"2009"},{"key":"B22","doi-asserted-by":"publisher","first-page":"D41","DOI":"10.1093\/nar\/gkx1094","article-title":"GenBank","volume":"46","author":"Benson","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"B23","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"B24","doi-asserted-by":"publisher","first-page":"D1515","DOI":"10.1093\/nar\/gkaa887","article-title":"NASA GeneLab: Interfaces for the exploration of space omics data","volume":"49","author":"Berrios","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B25","doi-asserted-by":"publisher","first-page":"2607","DOI":"10.1093\/nar\/29.12.2607","article-title":"GeneMarkS: A self-training method for prediction of gene starts in microbial genomes. 
Implications for finding sequence motifs in regulatory regions","volume":"29","author":"Besemer","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"B26","doi-asserted-by":"publisher","first-page":"3911","DOI":"10.1093\/nar\/27.19.3911","article-title":"Heuristic approach to deriving models for gene finding","volume":"27","author":"Besemer","year":"1999","journal-title":"Nucleic Acids Res."},{"key":"B27","doi-asserted-by":"publisher","first-page":"W252","DOI":"10.1093\/nar\/gku340","article-title":"SWISS-MODEL: Modelling protein tertiary and quaternary structure using evolutionary information","volume":"42","author":"Biasini","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"B28","doi-asserted-by":"publisher","first-page":"1067","DOI":"10.1038\/nbt.4266","article-title":"High-quality genome sequences of uncultured microbes by assembly of read clouds","volume":"36","author":"Bishara","year":"2018","journal-title":"Nat. Biotechnol."},{"key":"B29","doi-asserted-by":"crossref","DOI":"10.1007\/978-81-322-1856-2","volume-title":"Recent advances in information technology","author":"Biswas","year":"2014"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.1101\/2022.08.22.504593","article-title":"Extending and improving metagenomic taxonomic profiling with uncharacterized species with MetaPhlAn 4","author":"Blanco-Miguez","year":"2022","journal-title":"bioRxiv"},{"key":"B31","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1186\/1471-2105-8-209","article-title":"CRISPR recognition tool (CRT): A tool for automatic detection of clustered regularly interspaced palindromic repeats","volume":"8","author":"Bland","year":"2007","journal-title":"BMC Bioinforma."},{"key":"B32","doi-asserted-by":"publisher","first-page":"W29","DOI":"10.1093\/nar\/gkab335","article-title":"antiSMASH 6.0: improving cluster detection and comparison capabilities","volume":"49","author":"Blin","year":"2021","journal-title":"Nucleic Acids 
Res."},{"key":"B33","doi-asserted-by":"publisher","first-page":"P10008","DOI":"10.1088\/1742-5468\/2008\/10\/p10008","article-title":"Fast unfolding of communities in large networks","volume":"2008","author":"Blondel","year":"2008","journal-title":"J. Stat. Mech."},{"key":"B34","doi-asserted-by":"publisher","first-page":"D344","DOI":"10.1093\/nar\/gkaa977","article-title":"The InterPro protein families and domains database: 20 years on","volume":"49","author":"Blum","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B35","doi-asserted-by":"publisher","first-page":"851","DOI":"10.1038\/nature00831","article-title":"A global analysis of Caenorhabditis elegans operons","volume":"417","author":"Blumenthal","year":"2002","journal-title":"Nature"},{"key":"B36","doi-asserted-by":"publisher","first-page":"R122","DOI":"10.1186\/gb-2012-13-12-r122","article-title":"Ray meta: Scalable de novo metagenome assembly and profiling","volume":"13","author":"Boisvert","year":"2012","journal-title":"Genome Biol."},{"key":"B37","doi-asserted-by":"publisher","first-page":"2114","DOI":"10.1093\/bioinformatics\/btu170","article-title":"Trimmomatic: A flexible trimmer for Illumina sequence data","volume":"30","author":"Bolger","year":"2014","journal-title":"Bioinformatics"},{"key":"B38","doi-asserted-by":"publisher","first-page":"852","DOI":"10.1038\/s41587-019-0209-9","article-title":"Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2","volume":"37","author":"Bolyen","year":"2019","journal-title":"Nat. 
Biotechnol."},{"key":"B39","doi-asserted-by":"publisher","first-page":"lqab009","DOI":"10.1093\/nargab\/lqab009","article-title":"A comprehensive evaluation of binning methods to recover human gut microbial species from a non-redundant reference gene catalog","volume":"3","author":"Borderes","year":"2021","journal-title":"NAR Genomics Bioinforma."},{"key":"B40","doi-asserted-by":"publisher","first-page":"666","DOI":"10.1038\/nature01216","article-title":"Large clusters of co-expressed genes in the Drosophila genome","volume":"420","author":"Boutanaev","year":"2002","journal-title":"Nature"},{"key":"B41","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1109\/SC.2006.54","article-title":"Scalable algorithms for molecular dynamics simulations on commodity clusters","volume-title":"ACM\/IEEE SC 2006 conference (SC\u201906)","author":"Bowers","year":"2006"},{"key":"B42","doi-asserted-by":"publisher","first-page":"673","DOI":"10.1038\/nmeth.1358","article-title":"Phymm and PhymmBL: Metagenomic phylogenetic classification with interpolated markov models","volume":"6","author":"Brady","year":"2009","journal-title":"Nat. 
Methods"},{"key":"B43","doi-asserted-by":"publisher","first-page":"488","DOI":"10.1186\/1471-2105-7-488","article-title":"Evaluation of clustering algorithms for protein-protein interaction networks","volume":"7","author":"Broh\u00e9e","year":"2006","journal-title":"BMC Bioinforma."},{"key":"B44","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1186\/s13059-020-02066-4","article-title":"Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity","volume":"21","author":"Brown","year":"2020","journal-title":"Genome Biol."},{"key":"B45","doi-asserted-by":"publisher","first-page":"W402","DOI":"10.1093\/nar\/gkz297","article-title":"The PSIPRED protein analysis workbench: 20 years on","volume":"47","author":"Buchan","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B46","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using DIAMOND","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat. Methods"},{"key":"B47","doi-asserted-by":"publisher","first-page":"e0185056","DOI":"10.1371\/journal.pone.0185056","article-title":"BBMerge \u2013 accurate paired shotgun read merging via overlap","volume":"12","author":"Bushnell","year":"2017","journal-title":"PLoS ONE"},{"key":"B48","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1186\/s13326-016-0097-6","article-title":"The environment ontology in 2016: Bridging domains with increased scope, semantic density, and interoperation","volume":"7","author":"Buttigieg","year":"2016","journal-title":"J. Biomed. Semant."},{"key":"B49","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1038\/d41586-022-03539-1","article-title":"AlphaFold\u2019s new rival? 
Meta AI predicts shape of 600 million proteins","volume":"611","author":"Callaway","year":"2022","journal-title":"Nature"},{"key":"B50","doi-asserted-by":"publisher","first-page":"D733","DOI":"10.1093\/nar\/gkac1037","article-title":"IMG\/VR v4: An expanded database of uncultivated virus genomes within a framework of extensive functional, taxonomic, and ecological metadata","volume":"51","author":"Camargo","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B51","doi-asserted-by":"publisher","first-page":"5825","DOI":"10.1093\/molbev\/msab293","article-title":"eggNOG-mapper v2: Functional annotation, Orthology assignments, and domain prediction at the metagenomic scale","volume":"38","author":"Cantalapiedra","year":"2021","journal-title":"Mol. Biol. Evol."},{"key":"B52","doi-asserted-by":"publisher","first-page":"1972","DOI":"10.1093\/bioinformatics\/btp348","article-title":"trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses","volume":"25","author":"Capella-Guti\u00e9rrez","year":"2009","journal-title":"Bioinformatics"},{"key":"B53","doi-asserted-by":"publisher","first-page":"D325","DOI":"10.1093\/nar\/gkaa1113","article-title":"The gene ontology resource: Enriching a GOld mine","volume":"49","author":"Carbon","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B54","doi-asserted-by":"publisher","first-page":"1559","DOI":"10.1126\/science.1112014","article-title":"The transcriptional landscape of the mammalian genome","volume":"309","author":"Carninci","year":"2005","journal-title":"Science"},{"key":"B55","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/978-981-15-0702-1_1","article-title":"Structure and organization of virus genomes","volume-title":"Genome and genomics: From archaea to eukaryotes","author":"Chaitanya","year":"2019"},{"key":"B56","doi-asserted-by":"publisher","first-page":"9077","DOI":"10.1093\/nar\/gkab688","article-title":"tRNAscan-SE 2.0: improved detection and functional 
classification of transfer RNA genes","volume":"49","author":"Chan","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B57","doi-asserted-by":"publisher","first-page":"D553","DOI":"10.1093\/nar\/gkab1054","article-title":"SCOPe: Improvements to the structural classification of proteins - extended database to facilitate variant interpretation and machine learning","volume":"50","author":"Chandonia","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B58","doi-asserted-by":"publisher","first-page":"9","DOI":"10.1038\/s41581-022-00636-2","article-title":"Advances in CRISPR therapeutics","volume":"19","author":"Chavez","year":"2022","journal-title":"Nat. Rev. Nephrol."},{"key":"B59","doi-asserted-by":"publisher","first-page":"D666","DOI":"10.1093\/nar\/gky901","article-title":"IMG\/M v.5.0: An integrated data management and comparative analysis system for microbial genomes and microbiomes","volume":"47","author":"Chen","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"B60","doi-asserted-by":"publisher","first-page":"gkac976","DOI":"10.1093\/nar\/gkac976","article-title":"The IMG\/M data management and analysis system v.7: Content updates and new features","volume":"51","author":"Chen","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B61","doi-asserted-by":"publisher","first-page":"e24","DOI":"10.1371\/journal.pcbi.0010024","article-title":"Bioinformatics for whole-genome shotgun sequencing of microbial communities","volume":"1","author":"Chen","year":"2005","journal-title":"PLoS Comput. Biol."},{"key":"B62","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1038\/s41467-020-20236-7","article-title":"Efficient assembly of nanopore reads via highly accurate and intact error correction","volume":"12","author":"Chen","year":"2021","journal-title":"Nat. 
Commun."},{"key":"B63","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1007\/s13721-016-0132-7","article-title":"MetaG: A graph-based metagenomic gene analysis for big DNA data","volume":"5","author":"Chowdhury","year":"2016","journal-title":"Netw. Model. Anal. Health Inf. Bioinforma."},{"key":"B64","doi-asserted-by":"publisher","first-page":"e00804","DOI":"10.1128\/msystems.00804-20","article-title":"DOE JGI metagenome workflow","volume":"6","author":"Clum","year":"2021","journal-title":"mSystems"},{"key":"B65","doi-asserted-by":"publisher","first-page":"1422","DOI":"10.1093\/bioinformatics\/btp163","article-title":"Biopython: Freely available Python tools for computational molecular biology and bioinformatics","volume":"25","author":"Cock","year":"2009","journal-title":"Bioinformatics"},{"key":"B66","doi-asserted-by":"publisher","first-page":"D626","DOI":"10.1093\/nar\/gkz994","article-title":"TerrestrialMetagenomeDB: A public repository of curated and standardized metadata for terrestrial metagenomes","volume":"48","author":"Corr\u00eaa","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B67","doi-asserted-by":"publisher","first-page":"D1500","DOI":"10.1093\/nar\/gkab1046","article-title":"BioSamples database: FAIRer samples metadata to accelerate research data management","volume":"50","author":"Courtot","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B68","doi-asserted-by":"publisher","first-page":"210","DOI":"10.1186\/1471-2148-10-210","article-title":"BMGE (block mapping and gathering with entropy): A new software for selection of phylogenetic informative regions from multiple sequence alignments","volume":"10","author":"Criscuolo","year":"2010","journal-title":"BMC Evol. 
Biol."},{"key":"B69","doi-asserted-by":"publisher","first-page":"1188","DOI":"10.1101\/gr.849004","article-title":"WebLogo: A sequence logo generator","volume":"14","author":"Crooks","year":"2004","journal-title":"Genome Res."},{"key":"B70","doi-asserted-by":"publisher","first-page":"D106","DOI":"10.1093\/nar\/gkab1051","article-title":"The European nucleotide archive in 2021","volume":"50","author":"Cummins","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B71","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1007\/bf01890115","article-title":"Efficient algorithms for agglomerative hierarchical clustering methods","volume":"1","author":"Day","year":"1984","journal-title":"J. Classif."},{"key":"B72","doi-asserted-by":"publisher","first-page":"e2005849","DOI":"10.1371\/journal.pbio.2005849","article-title":"EukRef: Phylogenetic curation of ribosomal RNA to enhance understanding of eukaryotic diversity and distribution","volume":"16","author":"Del Campo","year":"2018","journal-title":"PLoS Biol."},{"key":"B73","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1186\/s12864-018-4870-z","article-title":"WHAM!: A web-based visualization suite for user-defined analysis of metagenomic shotgun sequencing data","volume":"19","author":"Devlin","year":"2018","journal-title":"BMC Genomics"},{"key":"B74","doi-asserted-by":"publisher","first-page":"W13","DOI":"10.1093\/nar\/gkr245","article-title":"T-coffee: A web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension","volume":"39","author":"Di Tommaso","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"B75","doi-asserted-by":"publisher","first-page":"1198","DOI":"10.1093\/bioinformatics\/btab827","article-title":"No one tool to rule them all: Prokaryotic gene prediction tool annotations are highly dependent on the organism of 
study","volume":"38","author":"Dimonaco","year":"2022","journal-title":"Bioinformatics"},{"key":"B76","doi-asserted-by":"publisher","first-page":"815","DOI":"10.1093\/bioinformatics\/btt647","article-title":"Rapid detection of expanded short tandem repeats in personal genomics using hybrid sequencing","volume":"30","author":"Doi","year":"2014","journal-title":"Bioinformatics"},{"key":"B77","doi-asserted-by":"publisher","first-page":"1719","DOI":"10.1093\/bioinformatics\/btx828","article-title":"mTM-align: an algorithm for fast and accurate multiple protein structure alignment","volume":"34","author":"Dong","year":"2018","journal-title":"Bioinformatics"},{"key":"B78","doi-asserted-by":"publisher","first-page":"999","DOI":"10.3389\/fgene.2019.00999","article-title":"An integrated pipeline for annotation and visualization of metagenomic contigs","volume":"10","author":"Dong","year":"2019","journal-title":"Front. Genet."},{"key":"B79","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1186\/1748-7188-3-7","article-title":"Noisy: Identification of problematic columns in multiple sequence alignments","volume":"3","author":"Dress","year":"2008","journal-title":"Algorithms Mol. Biol."},{"key":"B80","doi-asserted-by":"publisher","first-page":"3030","DOI":"10.1038\/s41598-021-82726-y","article-title":"Comparison between 16S rRNA and shotgun sequencing data for the taxonomic characterization of the gut microbiota","volume":"11","author":"Durazzi","year":"2021","journal-title":"Sci. Rep."},{"key":"B81","doi-asserted-by":"publisher","first-page":"e1005659","DOI":"10.1371\/journal.pcbi.1005659","article-title":"OpenMM 7: Rapid development of high performance algorithms for molecular dynamics","volume":"13","author":"Eastman","year":"2017","journal-title":"PLoS Comput. 
Biol."},{"key":"B82","doi-asserted-by":"publisher","first-page":"969","DOI":"10.1093\/bioinformatics\/btp092","article-title":"Mom: Maximum oligonucleotide mapping","volume":"25","author":"Eaves","year":"2009","journal-title":"Bioinformatics"},{"key":"B83","doi-asserted-by":"publisher","first-page":"e1002195","DOI":"10.1371\/journal.pcbi.1002195","article-title":"Accelerated profile HMM searches","volume":"7","author":"Eddy","year":"2011","journal-title":"PLoS Comput. Biol."},{"key":"B84","doi-asserted-by":"publisher","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"Muscle: Multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res."},{"key":"B85","doi-asserted-by":"publisher","first-page":"2460","DOI":"10.1093\/bioinformatics\/btq461","article-title":"Search and clustering orders of magnitude faster than BLAST","volume":"26","author":"Edgar","year":"2010","journal-title":"Bioinformatics"},{"key":"B86","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1093\/bioinformatics\/btp601","article-title":"MicroRazerS: Rapid alignment of small RNA reads","volume":"26","author":"Emde","year":"2010","journal-title":"Bioinformatics"},{"key":"B87","doi-asserted-by":"publisher","first-page":"348","DOI":"10.3389\/fgene.2015.00348","article-title":"The road to metagenomics: From microbiology to DNA sequencing technologies and bioinformatics","volume":"6","author":"Escobar-Zepeda","year":"2015","journal-title":"Front. 
Genet."},{"key":"B88","doi-asserted-by":"publisher","first-page":"D941","DOI":"10.1093\/nar\/gkz836","article-title":"The International Genome Sample Resource (IGSR) collection of open human genomic variation resources","volume":"48","author":"Fairley","year":"2020","journal-title":"Nucleic Acids Res."},{"key":"B89","doi-asserted-by":"publisher","first-page":"W29","DOI":"10.1093\/nar\/gkr367","article-title":"HMMER web server: Interactive sequence similarity searching","volume":"39","author":"Finn","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"B90","doi-asserted-by":"publisher","first-page":"5839","DOI":"10.1093\/nar\/gkl732","article-title":"Phage_Finder: Automated identification and classification of prophage regions in complete bacterial genome sequences","volume":"34","author":"Fouts","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"B91","doi-asserted-by":"publisher","first-page":"972","DOI":"10.1126\/science.1136800","article-title":"Clustering by passing messages between data points","volume":"315","author":"Frey","year":"2007","journal-title":"Science"},{"key":"B92","doi-asserted-by":"publisher","first-page":"e23","DOI":"10.1093\/nar\/gkq1212","article-title":"A new repeat-masking method enables specific detection of homologous sequences","volume":"39","author":"Frith","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"B93","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/1471-2164-5-4","article-title":"Inter-species differences of co-expression of neighboring genes in eukaryotic genomes","volume":"5","author":"Fukuoka","year":"2004","journal-title":"BMC Genomics"},{"key":"B94","doi-asserted-by":"publisher","first-page":"D274","DOI":"10.1093\/nar\/gkaa1018","article-title":"COG database update: Focus on microbial diversity, model organisms, and widespread pathogens","volume":"49","author":"Galperin","year":"2021","journal-title":"Nucleic Acids 
Res."},{"key":"B95","doi-asserted-by":"publisher","first-page":"37","DOI":"10.3389\/fmicb.2020.00037","article-title":"FeGenie: A comprehensive tool for the identification of iron genes and iron gene neighborhoods in genome and metagenome assemblies","volume":"11","author":"Garber","year":"2020","journal-title":"Front. Microbiol."},{"key":"B96","doi-asserted-by":"publisher","first-page":"R80","DOI":"10.1186\/gb-2004-5-10-r80","article-title":"Bioconductor: Open software development for computational biology and bioinformatics","volume":"5","author":"Gentleman","year":"2004","journal-title":"Genome Biol."},{"key":"B97","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1074\/jbc.rev119.006794","article-title":"Successes and challenges in simulating the folding of large proteins","volume":"295","author":"Gershenson","year":"2020","journal-title":"J. Biol. Chem."},{"key":"B98","doi-asserted-by":"publisher","first-page":"e3035","DOI":"10.7717\/peerj.3035","article-title":"BinSanity: Unsupervised clustering of environmental microbial assemblies using coverage and affinity propagation","volume":"5","author":"Graham","year":"2017","journal-title":"PeerJ"},{"key":"B99","doi-asserted-by":"publisher","first-page":"D507","DOI":"10.1093\/nar\/gkq968","article-title":"The BRENDA tissue ontology (BTO): The first all-integrating ontology of all organisms for enzyme sources","volume":"39","author":"Gremse","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"B100","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1186\/s40168-020-00990-y","article-title":"VirSorter2: A multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses","volume":"9","author":"Guo","year":"2021","journal-title":"Microbiome"},{"key":"B101","doi-asserted-by":"publisher","first-page":"371","DOI":"10.1093\/nar\/gkg128","article-title":"The TIGRFAMs database of protein families","volume":"31","author":"Haft","year":"2003","journal-title":"Nucleic Acids 
Res."},{"key":"B102","doi-asserted-by":"publisher","first-page":"2717","DOI":"10.1093\/bioinformatics\/btu395","article-title":"Omega: an Overlap-graph de novo Assembler for Metagenomics","volume":"30","author":"Haider","year":"2014","journal-title":"Bioinformatics"},{"key":"B103","doi-asserted-by":"publisher","first-page":"1571","DOI":"10.1093\/bioinformatics\/btw025","article-title":"Inclusion of dyad-repeat pattern improves topology prediction of transmembrane \u03b2-barrel proteins","volume":"32","author":"Hayat","year":"2016","journal-title":"Bioinformatics"},{"key":"B104","doi-asserted-by":"publisher","first-page":"5413","DOI":"10.1073\/pnas.1419956112","article-title":"All-atom 3D structure prediction of transmembrane \u03b2-barrel proteins from sequences","volume":"112","author":"Hayat","year":"2015","journal-title":"Proc. Natl. Acad. Sci. U. S. A."},{"key":"B105","doi-asserted-by":"publisher","first-page":"520","DOI":"10.1186\/1471-2164-10-520","article-title":"The effect of sequencing errors on metagenomic gene prediction","volume":"10","author":"Hoff","year":"2009","journal-title":"BMC Genomics"},{"key":"B106","doi-asserted-by":"publisher","first-page":"e57","DOI":"10.1002\/cpbi.57","article-title":"Predicting genes in single genomes with AUGUSTUS","volume":"65","author":"Hoff","year":"2019","journal-title":"Curr. Protoc. 
Bioinforma."},{"key":"B107","doi-asserted-by":"publisher","first-page":"W210","DOI":"10.1093\/nar\/gkac387","article-title":"Dali server: Structural unification of protein families","volume":"50","author":"Holm","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B108","doi-asserted-by":"publisher","first-page":"1607","DOI":"10.1016\/j.cell.2012.04.012","article-title":"Three-dimensional structures of membrane proteins from genomic sequencing","volume":"149","author":"Hopf","year":"2012","journal-title":"Cell."},{"key":"B109","doi-asserted-by":"publisher","first-page":"1582","DOI":"10.1093\/bioinformatics\/bty862","article-title":"The EVcouplings Python framework for coevolutionary sequence analysis","volume":"35","author":"Hopf","year":"2019","journal-title":"Bioinformatics"},{"key":"B110","doi-asserted-by":"publisher","DOI":"10.1101\/2021.10.26.466018","article-title":"DeepMicrobeFinder sorts metagenomes into prokaryotes, eukaryotes and viruses, with marine applications","author":"Hou","year":"2021"},{"key":"B111","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1016\/j.compbiolchem.2018.03.024","article-title":"Hardware acceleration of BWA-MEM genomic short read mapping for longer read lengths","volume":"75","author":"Houtgast","year":"2018","journal-title":"Comput. Biol. Chem."},{"key":"B112","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1038\/nmeth.4067","article-title":"CHARMM36m: An improved force field for folded and intrinsically disordered proteins","volume":"14","author":"Huang","year":"2017","journal-title":"Nat. Methods"},{"key":"B113","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1109\/mcse.2007.55","article-title":"Matplotlib: A 2D graphics environment","volume":"9","author":"Hunter","year":"2007","journal-title":"Comput. Sci. 
Eng."},{"key":"B114","doi-asserted-by":"publisher","first-page":"1204","DOI":"10.1101\/gr.10.8.1204","article-title":"Predicting protein function by genomic context: Quantitative evaluation and qualitative inferences","volume":"10","author":"Huynen","year":"2000","journal-title":"Genome Res."},{"key":"B115","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1186\/1471-2105-11-119","article-title":"Prodigal: Prokaryotic gene recognition and translation initiation site identification","volume":"11","author":"Hyatt","year":"2010","journal-title":"BMC Bioinforma."},{"key":"B116","doi-asserted-by":"publisher","first-page":"2223","DOI":"10.1093\/bioinformatics\/bts429","article-title":"Gene and translation initiation site prediction in metagenomic sequences","volume":"28","author":"Hyatt","year":"2012","journal-title":"Bioinformatics"},{"key":"B117","doi-asserted-by":"publisher","first-page":"e603","DOI":"10.7717\/peerj.603","article-title":"GroopM: An automated tool for the recovery of population genomes from related metagenomes","volume":"2","author":"Imelfort","year":"2014","journal-title":"PeerJ"},{"key":"B118","doi-asserted-by":"publisher","first-page":"1803","DOI":"10.1111\/j.1462-2920.2010.02270.x","article-title":"A call for standardized classification of metagenome projects","volume":"12","author":"Ivanova","year":"2010","journal-title":"Environ. 
Microbiol."},{"key":"B119","doi-asserted-by":"publisher","first-page":"767","DOI":"10.1126\/science.1207943","article-title":"The birth of the operon","volume":"332","author":"Jacob","year":"2011","journal-title":"Science"},{"key":"B120","doi-asserted-by":"publisher","first-page":"182","DOI":"10.1186\/1471-2105-15-182","article-title":"Skewer: A fast and accurate adapter trimmer for next-generation sequencing paired-end reads","volume":"15","author":"Jiang","year":"2014","journal-title":"BMC Bioinforma."},{"key":"B121","doi-asserted-by":"publisher","first-page":"1105","DOI":"10.1093\/bioinformatics\/btq078","article-title":"SPICi: A fast clustering algorithm for large biological networks","volume":"26","author":"Jiang","year":"2010","journal-title":"Bioinformatics"},{"key":"B122","doi-asserted-by":"publisher","first-page":"965","DOI":"10.1038\/s41467-022-28581-5","article-title":"Genome binning of viral entities from bulk metagenomics data","volume":"13","author":"Johansen","year":"2022","journal-title":"Nat. 
Commun."},{"key":"B123","doi-asserted-by":"publisher","first-page":"1236","DOI":"10.1093\/bioinformatics\/btu031","article-title":"InterProScan 5: Genome-scale protein function classification","volume":"30","author":"Jones","year":"2014","journal-title":"Bioinformatics"},{"key":"B124","doi-asserted-by":"publisher","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"B125","doi-asserted-by":"publisher","first-page":"e0163111","DOI":"10.1371\/journal.pone.0163111","article-title":"MetaPhinder\u2014identifying bacteriophage sequences in metagenomic data sets","volume":"11","author":"Jurtz","year":"2016","journal-title":"PLoS ONE"},{"key":"B126","doi-asserted-by":"publisher","first-page":"W429","DOI":"10.1093\/nar\/gkm256","article-title":"Advantages of combined transmembrane topology and signal peptide prediction\u2013the Phobius web server","volume":"35","author":"K\u00e4ll","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"B127","doi-asserted-by":"publisher","first-page":"1511","DOI":"10.1038\/nprot.2012.085","article-title":"Template-based protein structure modeling using the RaptorX web server","volume":"7","author":"K\u00e4llberg","year":"2012","journal-title":"Nat. 
Protoc."},{"key":"B128","doi-asserted-by":"publisher","first-page":"D192","DOI":"10.1093\/nar\/gkaa1047","article-title":"Rfam 14: Expanded coverage of metagenomic, viral and microRNA families","volume":"49","author":"Kalvari","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B129","doi-asserted-by":"publisher","first-page":"28","DOI":"10.1002\/pro.3711","article-title":"KEGG Mapper for inferring cellular functions from protein sequences","volume":"29","author":"Kanehisa","year":"2020","journal-title":"Protein Sci."},{"key":"B130","doi-asserted-by":"publisher","first-page":"e7359","DOI":"10.7717\/peerj.7359","article-title":"MetaBAT 2: An adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies","volume":"7","author":"Kang","year":"2019","journal-title":"PeerJ"},{"key":"B131","doi-asserted-by":"publisher","first-page":"428","DOI":"10.1038\/s41576-020-0233-0","article-title":"Phylogenetic tree building in the genomic age","volume":"21","author":"Kapli","year":"2020","journal-title":"Nat. Rev. Genet."},{"key":"B132","doi-asserted-by":"publisher","first-page":"W36","DOI":"10.1093\/nar\/gkab278","article-title":"Arena3Dweb: Interactive 3D visualization of multilayered networks","volume":"49","author":"Karatzas","year":"","journal-title":"Nucleic Acids Res."},{"key":"B133","doi-asserted-by":"publisher","first-page":"520","DOI":"10.3390\/biom12040520","article-title":"Darling: A web application for detecting disease-related biomedical entity associations with literature mining","volume":"12","author":"Karatzas","year":"","journal-title":"Biomolecules"},{"key":"B134","doi-asserted-by":"publisher","first-page":"104557","DOI":"10.1016\/j.compbiomed.2021.104557","article-title":"Victor: A visual analytics web application for comparing cluster sets","volume":"135","author":"Karatzas","year":"","journal-title":"Comput. Biol. 
Med."},{"key":"B135","doi-asserted-by":"publisher","first-page":"vbac036","DOI":"10.1093\/bioadv\/vbac036","article-title":"The network makeup artist (NORMA-2.0): Distinguishing annotated groups in a network using innovative layout strategies","volume":"2","author":"Karatzas","year":"","journal-title":"Bioinforma. Adv."},{"key":"B136","doi-asserted-by":"publisher","first-page":"344","DOI":"10.1093\/bioinformatics\/btab672","article-title":"Tiara: Deep learning-based classification system for eukaryotic sequences","volume":"38","author":"Karlicki","year":"2021","journal-title":"Bioinformatics"},{"key":"B137","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1038\/nbt.4045","article-title":"Retrieval of a million high-quality, full-length microbial 16S and 18S rRNA gene sequences without primer bias","volume":"36","author":"Karst","year":"2018","journal-title":"Nat. Biotechnol."},{"key":"B138","doi-asserted-by":"publisher","first-page":"D743","DOI":"10.1093\/nar\/gkaa1031","article-title":"HumanMetagenomeDB: A public repository of curated and standardized metadata for human metagenomes","volume":"49","author":"Kasmanas","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B139","doi-asserted-by":"publisher","first-page":"772","DOI":"10.1093\/molbev\/mst010","article-title":"MAFFT multiple sequence alignment software version 7: Improvements in performance and usability","volume":"30","author":"Katoh","year":"2013","journal-title":"Mol. Biol. Evol."},{"key":"B140","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1038\/s41568-022-00441-w","article-title":"CRISPR in cancer biology and therapy","volume":"22","author":"Katti","year":"2022","journal-title":"Nat. Rev. 
Cancer"},{"key":"B141","doi-asserted-by":"publisher","first-page":"e1002541","DOI":"10.1371\/journal.pcbi.1002541","article-title":"A platform-independent method for detecting errors in metagenomic sequencing data: Drisee","volume":"8","author":"Keegan","year":"2012","journal-title":"PLoS Comput. Biol."},{"key":"B142","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1038\/nrmicro819","article-title":"Tapping into microbial diversity","volume":"2","author":"Keller","year":"2004","journal-title":"Nat. Rev. Microbiol."},{"key":"B143","doi-asserted-by":"publisher","first-page":"544","DOI":"10.1186\/1471-2105-11-544","article-title":"Clustering metagenomic sequences with interpolated Markov models","volume":"11","author":"Kelley","year":"2010","journal-title":"BMC Bioinforma."},{"key":"B144","doi-asserted-by":"publisher","first-page":"845","DOI":"10.1038\/nprot.2015.053","article-title":"The Phyre2 web portal for protein modeling, prediction and analysis","volume":"10","author":"Kelley","year":"2015","journal-title":"Nat. 
Protoc."},{"key":"B145","doi-asserted-by":"publisher","DOI":"10.1101\/2022.02.07.479398","article-title":"Foldseek: Fast and accurate protein structure search","author":"Kempen","year":"2022","journal-title":"bioRxiv"},{"key":"B146","doi-asserted-by":"publisher","first-page":"656","DOI":"10.1101\/gr.229202","article-title":"BLAT\u2013the BLAST-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"B147","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1186\/s40168-020-00867-0","article-title":"Vibrant: Automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences","volume":"8","author":"Kieft","year":"2020","journal-title":"Microbiome"},{"key":"B148","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1080\/19768354.2017.1382388","article-title":"Functional gene networks based on the gene neighborhood in metagenomes","volume":"21","author":"Kim","year":"2017","journal-title":"Animal Cells Syst."},{"key":"B149","doi-asserted-by":"publisher","first-page":"1721","DOI":"10.1101\/gr.210641.116","article-title":"Centrifuge: Rapid and sensitive classification of metagenomic sequences","volume":"26","author":"Kim","year":"2016","journal-title":"Genome Res."},{"key":"B150","doi-asserted-by":"publisher","first-page":"316","DOI":"10.1186\/1471-2105-10-316","article-title":"Unsupervised statistical clustering of environmental shotgun sequences","volume":"10","author":"Kislyuk","year":"2009","journal-title":"BMC Bioinforma."},{"key":"B151","doi-asserted-by":"publisher","first-page":"D692","DOI":"10.1093\/nar\/gkx1036","article-title":"The MAR databases: Development and implementation of databases specific for marine metagenomics","volume":"46","author":"Klemetsen","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"B152","doi-asserted-by":"publisher","first-page":"D54","DOI":"10.1093\/nar\/gkr854","article-title":"The sequence read 
archive: Explosive growth of sequencing data","volume":"40","author":"Kodama","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"B153","doi-asserted-by":"publisher","DOI":"10.1101\/2022.10.01.510435","article-title":"Arena3D web: Interactive 3D visualization of multilayered networks supporting multiple directional information channels, clustering analysis and application integration","author":"Kokoli","year":"2022","journal-title":"biorxiv"},{"key":"B154","doi-asserted-by":"publisher","first-page":"353","DOI":"10.1007\/s00335-019-09821-4","article-title":"The JAX Synteny Browser for mouse-human comparative genomics","volume":"30","author":"Kolishovski","year":"2019","journal-title":"Mamm. Genome"},{"key":"B155","doi-asserted-by":"publisher","first-page":"1103","DOI":"10.1038\/s41592-020-00971-x","article-title":"metaFlye: scalable long-read metagenome assembly using repeat graphs","volume":"17","author":"Kolmogorov","year":"2020","journal-title":"Nat. Methods"},{"key":"B156","doi-asserted-by":"publisher","first-page":"722","DOI":"10.1101\/gr.215087.116","article-title":"Canu: Scalable and accurate long-read assembly via adaptive k -mer weighting and repeat separation","volume":"27","author":"Koren","year":"2017","journal-title":"Genome Res."},{"key":"B157","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1089\/omi.2008.0a10","article-title":"A standard MIGS\/MIMS compliant XML schema: Toward the development of the genomic contextual data markup language (GCDML)","volume":"12","author":"Kottmann","year":"2008","journal-title":"OMICS"},{"key":"B158","article-title":"Exploring networks in the STRING and reactome database","volume-title":"Reference module in biomedical Sciences","author":"Koutrouli","year":""},{"key":"B159","doi-asserted-by":"publisher","first-page":"34","DOI":"10.3389\/fbioe.2020.00034","article-title":"A guide to conquer the biological network era using graph 
theory","volume":"8","author":"Koutrouli","year":"","journal-title":"Front. Bioeng. Biotechnol."},{"key":"B160","doi-asserted-by":"publisher","first-page":"e943","DOI":"10.14806\/ej.26.0.943","article-title":"The network analysis profiler (NAP v2.0): A web tool for visual topological comparison between multiple networks","volume":"26","author":"Koutrouli","year":"2021","journal-title":"EMBnet J."},{"key":"B161","doi-asserted-by":"publisher","first-page":"386","DOI":"10.1002\/wics.1314","article-title":"Why the Monte Carlo method is so important today","volume":"6","author":"Kroese","year":"2014","journal-title":"WIREs Comp. Stat."},{"key":"B162","doi-asserted-by":"publisher","first-page":"567","DOI":"10.1006\/jmbi.2000.4315","article-title":"Predicting transmembrane protein topology with a hidden markov model: Application to complete genomes","volume":"305","author":"Krogh","year":"2001","journal-title":"J. Mol. Biol."},{"key":"B163","doi-asserted-by":"publisher","first-page":"1607","DOI":"10.1002\/prot.26237","article-title":"Critical assessment of methods of protein structure prediction (CASP)\u2014round XIV","volume":"89","author":"Kryshtafovych","year":"2021","journal-title":"Proteins"},{"key":"B164","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1038\/nbt.3416","article-title":"Synthetic long-read sequencing reveals intraspecies diversity in the human microbiome","volume":"34","author":"Kuleshov","year":"2016","journal-title":"Nat. 
Biotechnol."},{"key":"B165","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1093\/bioinformatics\/btm563","article-title":"Defining clusters from a hierarchical cluster tree: The dynamic tree cut package for R","volume":"24","author":"Langfelder","year":"2008","journal-title":"Bioinformatics"},{"key":"B166","doi-asserted-by":"publisher","first-page":"814","DOI":"10.1038\/nbt.2676","article-title":"Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences","volume":"31","author":"Langille","year":"2013","journal-title":"Nat. Biotechnol."},{"key":"B167","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. Methods"},{"key":"B168","doi-asserted-by":"publisher","first-page":"613791","DOI":"10.3389\/fmicb.2021.613791","article-title":"Metagenomic data assembly \u2013 the way of decoding unknown microorganisms","volume":"12","author":"Lapidus","year":"2021","journal-title":"Front. Microbiol."},{"key":"B169","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1186\/1471-2105-6-298","article-title":"Kalign\u2013an accurate and fast multiple sequence alignment algorithm","volume":"6","author":"Lassmann","year":"2005","journal-title":"BMC Bioinforma."},{"key":"B170","doi-asserted-by":"publisher","first-page":"875","DOI":"10.1101\/gr.737703","article-title":"Genomic gene clustering analysis of pathways in eukaryotes","volume":"13","author":"Lee","year":"2003","journal-title":"Genome Res."},{"key":"B171","doi-asserted-by":"publisher","first-page":"665","DOI":"10.1038\/s41592-020-0848-2","article-title":"Macromolecular modeling and design in Rosetta: Recent methods and frameworks","volume":"17","author":"Leman","year":"2020","journal-title":"Nat. 
Methods"},{"key":"B172","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1186\/1471-2105-13-253","article-title":"G-NEST: A gene neighborhood scoring tool to identify co-conserved, co-expressed genes","volume":"13","author":"Lemay","year":"2012","journal-title":"BMC Bioinforma."},{"key":"B173","doi-asserted-by":"publisher","first-page":"3753","DOI":"10.1093\/bioinformatics\/bty454","article-title":"MIDORI server: A webserver for taxonomic assignment of unknown metazoan mitochondrial-encoded sequences using a curated database","volume":"34","author":"Leray","year":"2018","journal-title":"Bioinformatics"},{"key":"B174","doi-asserted-by":"publisher","first-page":"2909","DOI":"10.1016\/j.celrep.2020.02.036","article-title":"An integrated metagenome catalog reveals new insights into the murine gut microbiome","volume":"30","author":"Lesker","year":"2020","journal-title":"Cell. Rep."},{"key":"B175","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1186\/s40168-020-00808-x","article-title":"MetaEuk\u2014Sensitive, high-throughput gene discovery, and annotation for large-scale eukaryotic metagenomics","volume":"8","author":"Levy Karin","year":"2020","journal-title":"Microbiome"},{"key":"B176","doi-asserted-by":"publisher","first-page":"1674","DOI":"10.1093\/bioinformatics\/btv033","article-title":"Megahit: An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph","volume":"31","author":"Li","year":"2015","journal-title":"Bioinformatics"},{"key":"B177","doi-asserted-by":"publisher","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows-Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"B178","doi-asserted-by":"publisher","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality 
scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res."},{"key":"B179","doi-asserted-by":"publisher","first-page":"1658","DOI":"10.1093\/bioinformatics\/btl158","article-title":"Cd-Hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences","volume":"22","author":"Li","year":"2006","journal-title":"Bioinformatics"},{"key":"B180","doi-asserted-by":"publisher","first-page":"D1020","DOI":"10.1093\/nar\/gkaa1105","article-title":"RefSeq: Expanding the prokaryotic genome annotation pipeline reach with protein family model curation","volume":"49","author":"Li","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B181","doi-asserted-by":"publisher","first-page":"W60","DOI":"10.1093\/nar\/gkaa443","article-title":"Fatcat 2.0: Towards a better understanding of the structural diversity of proteins","volume":"48","author":"Li","year":"2020","journal-title":"Nucleic Acids Res."},{"key":"B182","doi-asserted-by":"publisher","first-page":"W199","DOI":"10.1093\/nar\/gkz401","article-title":"WebGestalt 2019: Gene set analysis toolkit with revamped UIs and APIs","volume":"47","author":"Liao","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B183","doi-asserted-by":"publisher","first-page":"863","DOI":"10.1093\/bioinformatics\/btn043","article-title":"Prophinder: A computational tool for prophage prediction in prokaryotic genomes","volume":"24","author":"Lima-Mendez","year":"2008","journal-title":"Bioinformatics"},{"key":"B184","doi-asserted-by":"publisher","first-page":"24175","DOI":"10.1038\/srep24175","article-title":"Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes","volume":"6","author":"Lin","year":"2016","journal-title":"Sci. Rep."},{"key":"B185","doi-asserted-by":"publisher","DOI":"10.1101\/2022.07.20.500902","article-title":"Evolutionary-scale prediction of atomic level protein structure with a language model","author":"Lin","year":"2022"},{"key":"B186","doi-asserted-by":"publisher","first-page":"58","DOI":"10.1186\/s40168-021-01015-y","article-title":"Accurate and sensitive detection of microbial eukaryotes from whole metagenome shotgun sequencing","volume":"9","author":"Lind","year":"2021","journal-title":"Microbiome"},{"key":"B187","doi-asserted-by":"publisher","first-page":"878","DOI":"10.1093\/bioinformatics\/bts061","article-title":"SOAP3: Ultra-fast GPU-based parallel alignment tool for short reads","volume":"28","author":"Liu","year":"2012","journal-title":"Bioinformatics"},{"key":"B188","doi-asserted-by":"publisher","first-page":"763","DOI":"10.1109\/TCBB.2022.3161135","article-title":"virSearcher: Identifying bacteriophages from metagenomes by combining convolutional neural network and gene information","volume":"20","author":"Liu","year":"2022","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinforma."},{"key":"B189","doi-asserted-by":"publisher","first-page":"491","DOI":"10.1002\/prot.10514","article-title":"The number of protein folds and their distribution over families in nature","volume":"54","author":"Liu","year":"2004","journal-title":"Proteins"},{"key":"B190","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1007\/s13238-020-00724-8","article-title":"A practical guide to amplicon and metagenomic analysis of microbiome data","volume":"12","author":"Liu","year":"2021","journal-title":"Protein Cell."},{"key":"B191","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1038\/nature23889","article-title":"Strains, functions and dynamics in the expanded human microbiome project","volume":"550","author":"Lloyd-Price","year":"2017","journal-title":"Nature"},{"key":"B192","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1093\/nar\/28.1.257","article-title":"SCOP: A structural classification of proteins database","volume":"28","author":"Lo Conte","year":"2000","journal-title":"Nucleic Acids 
Res."},{"key":"B193","doi-asserted-by":"publisher","first-page":"5970","DOI":"10.1073\/pnas.1521291113","article-title":"Scaling laws predict global microbial diversity","volume":"113","author":"Locey","year":"2016","journal-title":"Proc. Natl. Acad. Sci."},{"key":"B194","doi-asserted-by":"publisher","first-page":"e119","DOI":"10.1093\/nar\/gku557","article-title":"Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm","volume":"42","author":"Lomsadze","year":"2014","journal-title":"Nucleic Acids Res."},{"key":"B195","doi-asserted-by":"publisher","first-page":"1079","DOI":"10.1101\/gr.230615.117","article-title":"Modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes","volume":"28","author":"Lomsadze","year":"2018","journal-title":"Genome Res."},{"key":"B196","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1007\/978-1-62703-646-7_10","article-title":"Phylogeny-aware alignment with PRANK","volume":"1079","author":"L\u00f6ytynoja","year":"2014","journal-title":"Methods Mol. Biol."},{"key":"B197","doi-asserted-by":"publisher","first-page":"791","DOI":"10.1093\/bioinformatics\/btw290","article-title":"Cocacola: Binning metagenomic contigs using sequence COmposition, read CoverAge, CO-alignment and paired-end read LinkAge","volume":"33","author":"Lu","year":"2016","journal-title":"Bioinformatics"},{"key":"B198","doi-asserted-by":"publisher","first-page":"936","DOI":"10.1101\/gr.111120.110","article-title":"Stampy: A statistical algorithm for sensitive and fast mapping of Illumina sequence reads","volume":"21","author":"Lunter","year":"2011","journal-title":"Genome Res."},{"key":"B199","doi-asserted-by":"publisher","first-page":"67","DOI":"10.1038\/s41579-019-0299-x","article-title":"Evolutionary classification of CRISPR\u2013cas systems: A burst of class 2 and derived variants","volume":"18","author":"Makarova","year":"2020","journal-title":"Nat. Rev. 
Microbiol."},{"key":"B200","doi-asserted-by":"publisher","first-page":"669","DOI":"10.1093\/bib\/bbs054","article-title":"Classification of metagenomic sequences: Methods and challenges","volume":"13","author":"Mande","year":"2012","journal-title":"Briefings Bioinforma."},{"key":"B201","doi-asserted-by":"publisher","first-page":"e28766","DOI":"10.1371\/journal.pone.0028766","article-title":"Protein 3D structure computed from evolutionary sequence variation","volume":"6","author":"Marks","year":"2011","journal-title":"PLoS One"},{"key":"B202","doi-asserted-by":"publisher","first-page":"291","DOI":"10.1146\/annurev.biophys.29.1.291","article-title":"Comparative protein structure modeling of genes and genomes","volume":"29","author":"Mart\u00ed-Renom","year":"2000","journal-title":"Annu. Rev. Biophys. Biomol. Struct."},{"key":"B203","doi-asserted-by":"publisher","first-page":"D51","DOI":"10.1093\/nar\/gkv1105","article-title":"DNA data bank of Japan (DDBJ) progress report","volume":"44","author":"Mashima","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"B204","doi-asserted-by":"publisher","first-page":"3808","DOI":"10.1093\/bioinformatics\/btx517","article-title":"MAPseq: Highly efficient k-mer search with confidence estimates, for rRNA sequence analysis","volume":"33","author":"Matias Rodrigues","year":"2017","journal-title":"Bioinformatics"},{"key":"B205","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1023\/a:1007618624809","article-title":"Some PAC-bayesian theorems","volume":"37","author":"McAllester","year":"1999","journal-title":"Mach. 
Learn."},{"key":"B206","doi-asserted-by":"publisher","first-page":"P1","DOI":"10.1186\/gb-2003-4-2-p1","article-title":"Positional clustering of differentially expressed genes on human chromosomes 20, 21 and 22","volume":"4","author":"M\u00e9gy","year":"2003","journal-title":"Genome Biol."},{"key":"B207","doi-asserted-by":"publisher","first-page":"11257","DOI":"10.1038\/ncomms11257","article-title":"Fast and sensitive taxonomic classification for metagenomics with Kaiju","volume":"7","author":"Menzel","year":"2016","journal-title":"Nat. Commun."},{"key":"B208","doi-asserted-by":"publisher","first-page":"1151","DOI":"10.1093\/bib\/bbx105","article-title":"MG-RAST version 4-lessons learned from a decade of low-budget ultra-high-throughput metagenome analysis","volume":"20","author":"Meyer","year":"2019","journal-title":"Brief. Bioinform"},{"key":"B209","doi-asserted-by":"publisher","first-page":"386","DOI":"10.1186\/1471-2105-9-386","article-title":"The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes","volume":"9","author":"Meyer","year":"2008","journal-title":"BMC Bioinforma."},{"key":"B210","doi-asserted-by":"publisher","first-page":"e57","DOI":"10.1093\/nar\/gkz148","article-title":"Autometa: Automated extraction of microbial genomes from individual shotgun metagenomes","volume":"47","author":"Miller","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B211","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1038\/s41592-022-01488-1","article-title":"ColabFold: Making protein folding accessible to all","volume":"19","author":"Mirdita","year":"2022","journal-title":"Nat. 
Methods"},{"key":"B212","doi-asserted-by":"publisher","first-page":"3029","DOI":"10.1093\/bioinformatics\/btab184","article-title":"Fast and sensitive taxonomic assignment to metagenomic contigs","volume":"37","author":"Mirdita","year":"2021","journal-title":"Bioinformatics"},{"key":"B213","doi-asserted-by":"publisher","first-page":"1885","DOI":"10.1038\/s41591-021-01552-x","article-title":"Reporting guidelines for human microbiome research: The STORMS checklist","volume":"27","author":"Mirzayi","year":"2021","journal-title":"Nat. Med."},{"key":"B214","doi-asserted-by":"publisher","first-page":"D412","DOI":"10.1093\/nar\/gkaa913","article-title":"Pfam: The protein families database in 2021","volume":"49","author":"Mistry","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B215","doi-asserted-by":"publisher","first-page":"D570","DOI":"10.1093\/nar\/gkz1035","article-title":"MGnify: The microbiome analysis resource in 2020","volume":"48","author":"Mitchell","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B216","doi-asserted-by":"publisher","first-page":"D726","DOI":"10.1093\/nar\/gkx967","article-title":"EBI metagenomics in 2017: Enriching the analysis of microbial communities, from sequence reads to assemblies","volume":"46","author":"Mitchell","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"B217","doi-asserted-by":"publisher","first-page":"81","DOI":"10.30491\/jabr.2020.109380","article-title":"CRISPR arrays: A review on its mechanism","volume":"7","author":"Mohamadi","year":"2020","journal-title":"J. Appl. Biotechnol. Rep."},{"key":"B218","doi-asserted-by":"publisher","first-page":"E1293","DOI":"10.1073\/pnas.1111471108","article-title":"Direct-coupling analysis of residue coevolution captures native contacts across many protein families","volume":"108","author":"Morcos","year":"2011","journal-title":"Proc. Natl. Acad. Sci. U. S. 
A."},{"key":"B219","doi-asserted-by":"publisher","first-page":"1028","DOI":"10.1089\/cmb.2006.13.1028","article-title":"A fast and symmetric DUST implementation to mask low-complexity DNA sequences","volume":"13","author":"Morgulis","year":"2006","journal-title":"J. Comput. Biol."},{"key":"B220","doi-asserted-by":"publisher","first-page":"5011","DOI":"10.1038\/s41467-021-25316-w","article-title":"Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions","volume":"12","author":"Mortuza","year":"2021","journal-title":"Nat. Commun."},{"key":"B221","doi-asserted-by":"publisher","first-page":"549","DOI":"10.1186\/1756-0500-4-549","article-title":"Which clustering algorithm is better for predicting protein complexes?","volume":"4","author":"Moschopoulos","year":"2011","journal-title":"BMC Res. Notes"},{"key":"B222","doi-asserted-by":"publisher","first-page":"676","DOI":"10.1038\/nbt.3886","article-title":"1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life","volume":"35","author":"Mukherjee","year":"2017","journal-title":"Nat. 
Biotechnol."},{"key":"B223","doi-asserted-by":"publisher","first-page":"D957","DOI":"10.1093\/nar\/gkac974","article-title":"Twenty-five years of genomes OnLine database (GOLD): Data updates and new features in v.9","volume":"51","author":"Mukherjee","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B224","doi-asserted-by":"publisher","first-page":"e83","DOI":"10.1093\/nar\/gkp318","article-title":"MM-Align: A quick algorithm for aligning multiple-chain protein complex structures using iterative dynamic programming","volume":"37","author":"Mukherjee","year":"2009","journal-title":"Nucleic Acids Res."},{"key":"B225","doi-asserted-by":"publisher","first-page":"R5","DOI":"10.1186\/gb-2012-13-1-r5","article-title":"Uberon, an integrative multi-species anatomy ontology","volume":"13","author":"Mungall","year":"2012","journal-title":"Genome Biol."},{"key":"B226","doi-asserted-by":"publisher","first-page":"e155","DOI":"10.1093\/nar\/gks678","article-title":"MetaVelvet: An extension of velvet assembler to de novo metagenome assembly from short sequence reads","volume":"40","author":"Namiki","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"B227","doi-asserted-by":"publisher","first-page":"giac077","DOI":"10.1093\/gigascience\/giac077","article-title":"A machine learning framework for discovery and enrichment of metagenomics metadata from open access publications","volume":"11","author":"Nassar","year":"2022","journal-title":"GigaScience"},{"key":"B228","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1186\/s40793-022-00449-7","article-title":"MarineMetagenomeDB: A public repository for curated and standardized metadata for marine metagenomes","volume":"17","author":"Nata\u2019ala","year":"2022","journal-title":"Environ. 
Microbiome"},{"key":"B229","doi-asserted-by":"publisher","first-page":"2933","DOI":"10.1093\/bioinformatics\/btt509","article-title":"Infernal 1.1: 100-fold faster RNA homology searches","volume":"29","author":"Nawrocki","year":"2013","journal-title":"Bioinformatics"},{"key":"B230","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1038\/s41587-020-0718-6","article-title":"A genomic catalog of Earth\u2019s microbiomes","volume":"39","author":"Nayfach","year":"2021","journal-title":"Nat. Biotechnol."},{"key":"B231","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol."},{"key":"B232","doi-asserted-by":"publisher","DOI":"10.1101\/2021.07.25.453296","article-title":"The high-throughput gene prediction of more than 1,700 eukaryote genomes using the software package EukMetaSanity","author":"Neely","year":"2021","journal-title":"Bioinformatics"},{"key":"B233","doi-asserted-by":"publisher","first-page":"3327","DOI":"10.3390\/ijms22073327","article-title":"Novel CRISPR-cas systems: An updated review of the current achievements, applications, and future research perspectives","volume":"22","author":"Nidhi","year":"2021","journal-title":"Int. J. Mol. 
Sci."},{"key":"B234","doi-asserted-by":"publisher","first-page":"D259","DOI":"10.1093\/nar\/gky1022","article-title":"The UNITE database for molecular identification of fungi: Handling dark taxa and parallel taxonomic classifications","volume":"47","author":"Nilsson","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B235","doi-asserted-by":"publisher","DOI":"10.1101\/490078","article-title":"Binning microbial genomes using deep learning","author":"Nissen","year":"2018","journal-title":"biorxiv"},{"key":"B236","doi-asserted-by":"publisher","first-page":"5623","DOI":"10.1093\/nar\/gkl723","article-title":"MetaGene: Prokaryotic gene finding from environmental genome shotgun sequences","volume":"34","author":"Noguchi","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"B237","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1093\/dnares\/dsn027","article-title":"MetaGeneAnnotator: Detecting species-specific patterns of ribosomal binding site for precise gene prediction in anonymous prokaryotic and phage genomes","volume":"15","author":"Noguchi","year":"2008","journal-title":"DNA Res."},{"key":"B238","doi-asserted-by":"publisher","first-page":"824","DOI":"10.1101\/gr.213959.116","article-title":"metaSPAdes: a new versatile metagenomic assembler","volume":"27","author":"Nurk","year":"2017","journal-title":"Genome Res."},{"key":"B239","doi-asserted-by":"publisher","first-page":"S2","DOI":"10.1038\/nmeth.f.301","article-title":"Visualizing biological data-now and in the future","volume":"7","author":"O\u2019Donoghue","year":"2010","journal-title":"Nat. 
Methods"},{"key":"B240","doi-asserted-by":"publisher","first-page":"D102","DOI":"10.1093\/nar\/gkab995","article-title":"DNA Data Bank of Japan (DDBJ) update report 2021","volume":"50","author":"Okido","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B241","doi-asserted-by":"publisher","first-page":"385","DOI":"10.1186\/1471-2105-12-385","article-title":"Interactive metagenomic visualization in a Web browser","volume":"12","author":"Ondov","year":"2011","journal-title":"BMC Bioinforma."},{"key":"B242","doi-asserted-by":"publisher","first-page":"BBI.S12462","DOI":"10.4137\/bbi.s12462","article-title":"Metagenomics: Tools and insights for analyzing next-generation sequencing data derived from biodiversity studies","volume":"9","author":"Oulas","year":"2015","journal-title":"Bioinform Biol. Insights"},{"key":"B243","doi-asserted-by":"publisher","first-page":"294","DOI":"10.1126\/science.aah4043","article-title":"Protein structure determination using metagenome sequence data","volume":"355","author":"Ovchinnikov","year":"2017","journal-title":"Science"},{"key":"B244","doi-asserted-by":"publisher","first-page":"e02030","DOI":"10.7554\/elife.02030","article-title":"Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information","volume":"3","author":"Ovchinnikov","year":"2014","journal-title":"eLife"},{"key":"B245","doi-asserted-by":"publisher","first-page":"D457","DOI":"10.1093\/nar\/gkw1030","article-title":"IMG\/VR: A database of cultured and uncultured DNA viruses and retroviruses","volume":"45","author":"Paez-Espino","year":"","journal-title":"Nucleic Acids Res."},{"key":"B246","doi-asserted-by":"publisher","first-page":"425","DOI":"10.1038\/nature19094","article-title":"Uncovering Earth\u2019s 
virome","volume":"536","author":"Paez-Espino","year":"2016","journal-title":"Nature"},{"key":"B247","doi-asserted-by":"publisher","first-page":"1673","DOI":"10.1038\/nprot.2017.063","article-title":"Nontargeted virus sequence discovery pipeline and virus clustering for metagenomic data","volume":"12","author":"Paez-Espino","year":"","journal-title":"Nat. Protoc."},{"key":"B248","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1186\/s40168-019-0768-5","article-title":"Diversity, evolution, and classification of virophages uncovered through global metagenomics","volume":"7","author":"Paez-Espino","year":"2019","journal-title":"Microbiome"},{"key":"B249","doi-asserted-by":"publisher","first-page":"baw005","DOI":"10.1093\/database\/baw005","article-title":"Extract: Interactive extraction of environment metadata and term suggestion for metagenomic sample annotation","volume":"2016","author":"Pafilis","year":"2016","journal-title":"Database"},{"key":"B250","doi-asserted-by":"publisher","first-page":"134110","DOI":"10.1063\/5.0018516","article-title":"Heterogeneous parallelization and acceleration of molecular dynamics simulations in GROMACS","volume":"153","author":"P\u00e1ll","year":"2020","journal-title":"J. Chem. Phys."},{"key":"B251","doi-asserted-by":"publisher","first-page":"5607","DOI":"10.1099\/ijsem.0.004332","article-title":"List of prokaryotic names with standing in nomenclature (LPSN) moves to the DSMZ","volume":"70","author":"Parte","year":"2020","journal-title":"Int. J. Syst. Evol. Microbiol."},{"key":"B252","doi-asserted-by":"publisher","first-page":"407","DOI":"10.1016\/j.aej.2015.11.003","article-title":"Divisive Analysis (DIANA) of hierarchical clustering and GPS data for level of service criteria of urban streets","volume":"55","author":"Patnaik","year":"2016","journal-title":"Alexandria Eng. 
J."},{"key":"B253","doi-asserted-by":"publisher","first-page":"158","DOI":"10.15406\/mojpb.2017.05.00174","article-title":"How to cluster protein sequences: Tools, tips and commands","volume":"5","author":"Pavlopoulos","year":"2017","journal-title":"MOJPB"},{"key":"B254","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/gigascience\/giy014","article-title":"Bipartite graphs in systems biology and medicine: A survey of methods and applications","volume":"7","author":"Pavlopoulos","year":"2018","journal-title":"Gigascience"},{"key":"B255","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2017\/1278932","article-title":"Empirical comparison of visualization tools for larger-scale network analysis","volume":"2017","author":"Pavlopoulos","year":"2017","journal-title":"Adv. Bioinforma."},{"key":"B256","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1186\/1756-0381-4-10","article-title":"Using graph theory to analyze biological networks","volume":"4","author":"Pavlopoulos","year":"2011","journal-title":"BioData Min."},{"key":"B257","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1756-0381-3-1","article-title":"A reference guide for tree analysis and visualization","volume":"3","author":"Pavlopoulos","year":"2010","journal-title":"BioData Min."},{"key":"B258","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1186\/1756-0381-1-12","article-title":"A survey of visualization tools for biological network analysis","volume":"1","author":"Pavlopoulos","year":"2008","journal-title":"BioData Min."},{"key":"B259","doi-asserted-by":"publisher","first-page":"e1010539","DOI":"10.1371\/journal.pcbi.1010539","article-title":"Fast and accurate ab initio Protein structure prediction using deep learning potentials","volume":"18","author":"Pearce","year":"2022","journal-title":"PLoS Comput. 
Biol."},{"key":"B260","doi-asserted-by":"publisher","first-page":"1420","DOI":"10.1093\/bioinformatics\/bts174","article-title":"IDBA-UD: A de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth","volume":"28","author":"Peng","year":"2012","journal-title":"Bioinformatics"},{"key":"B261","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1002\/prot.10505","article-title":"Detection of functional modules from protein interaction networks","volume":"54","author":"Pereira-Leal","year":"2004","journal-title":"Proteins"},{"key":"B262","doi-asserted-by":"publisher","first-page":"mgen000409","DOI":"10.1099\/mgen.0.000409","article-title":"Metagenomic approaches in microbial ecology: An update on whole-genome and marker gene sequencing analyses","volume":"6","author":"P\u00e9rez-Cobas","year":"2020","journal-title":"Microb. Genomics"},{"key":"B263","doi-asserted-by":"publisher","first-page":"e0176469","DOI":"10.1371\/journal.pone.0176469","article-title":"MGmapper: Reference based mapping and taxonomy annotation of metagenomics sequence reads","volume":"12","author":"Petersen","year":"2017","journal-title":"PLoS One"},{"key":"B264","doi-asserted-by":"publisher","first-page":"044130","DOI":"10.1063\/5.0014475","article-title":"Scalable molecular dynamics on CPU and GPU architectures with NAMD","volume":"153","author":"Phillips","year":"2020","journal-title":"J. Chem. Phys."},{"key":"B265","doi-asserted-by":"publisher","first-page":"248","DOI":"10.3389\/fevo.2020.00248","article-title":"Putting COI metabarcoding in context: The utility of exact sequence variants (ESVs) in biodiversity analysis","volume":"8","author":"Porter","year":"2020","journal-title":"Front. Ecol. 
Evol."},{"key":"B266","doi-asserted-by":"publisher","first-page":"R233","DOI":"10.1186\/gb-2007-8-11-r233","article-title":"The determinants of gene order conservation in yeasts","volume":"8","author":"Poyatos","year":"2007","journal-title":"Genome Biol."},{"key":"B267","doi-asserted-by":"publisher","first-page":"mgen000823","DOI":"10.1099\/mgen.0.000823","article-title":"Whokaryote: Distinguishing eukaryotic and prokaryotic contigs in metagenomes based on gene structure","volume":"8","author":"Pronk","year":"2022","journal-title":"Microb. Genomics"},{"key":"B268","doi-asserted-by":"publisher","first-page":"7188","DOI":"10.1093\/nar\/gkm864","article-title":"Silva: A comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB","volume":"35","author":"Pruesse","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"B269","doi-asserted-by":"publisher","first-page":"ii56","DOI":"10.1101\/2021.11.05.467408","article-title":"3CAC: Improving the classification of phages and plasmids in metagenomic assemblies using assembly graphs","volume":"38","author":"Pu","year":"2022","journal-title":"Bioinformatics"},{"key":"B270","doi-asserted-by":"publisher","first-page":"833","DOI":"10.1038\/nbt.3935","article-title":"Shotgun metagenomics, from sampling to analysis","volume":"35","author":"Quince","year":"2017","journal-title":"Nat. 
Biotechnol."},{"key":"B271","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1186\/s40168-019-0743-1","article-title":"Comparative analysis of amplicon and metagenomic sequencing methods reveals key features in the evolution of animal metaorganisms","volume":"7","author":"Rausch","year":"2019","journal-title":"Microbiome"},{"key":"B272","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1186\/s40168-017-0283-5","article-title":"VirFinder: A novel k-mer based tool for identifying viral sequences from assembled metagenomic data","volume":"5","author":"Ren","year":"2017","journal-title":"Microbiome"},{"key":"B273","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1007\/s40484-019-0187-4","article-title":"Identifying viruses from metagenomic data using deep learning","volume":"8","author":"Ren","year":"2020","journal-title":"Quant. Biol."},{"key":"B274","doi-asserted-by":"publisher","first-page":"e191","DOI":"10.1093\/nar\/gkq747","article-title":"FragGeneScan: Predicting genes in short and error-prone reads","volume":"38","author":"Rho","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"B275","doi-asserted-by":"publisher","first-page":"3499","DOI":"10.1021\/acs.jctc.5b00356","article-title":"Improved peptide and protein torsional energetics with the OPLS-AA force field","volume":"11","author":"Robertson","year":"2015","journal-title":"J. Chem. Theory Comput."},{"key":"B276","doi-asserted-by":"publisher","first-page":"e2584","DOI":"10.7717\/peerj.2584","article-title":"Vsearch: A versatile open source tool for metagenomics","volume":"4","author":"Rognes","year":"2016","journal-title":"PeerJ"},{"key":"B277","doi-asserted-by":"publisher","first-page":"539","DOI":"10.1093\/sysbio\/sys029","article-title":"MrBayes 3.2: Efficient bayesian phylogenetic inference and model choice across a large model space","volume":"61","author":"Ronquist","year":"2012","journal-title":"Syst. 
Biol."},{"key":"B278","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1093\/bioinformatics\/btq619","article-title":"NBC: The naive Bayes classification tool webserver for taxonomic classification of metagenomic reads","volume":"27","author":"Rosen","year":"2011","journal-title":"Bioinformatics"},{"key":"B279","doi-asserted-by":"publisher","first-page":"85","DOI":"10.1093\/protein\/12.2.85","article-title":"Twilight zone of protein sequence alignments","volume":"12","author":"Rost","year":"1999","journal-title":"Protein Eng."},{"key":"B280","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1186\/s12859-018-2320-1","article-title":"Selection of marker genes for genetic barcoding of microorganisms and binning of metagenomic reads by Barcoder software tools","volume":"19","author":"Rotimi","year":"2018","journal-title":"BMC Bioinforma."},{"key":"B281","doi-asserted-by":"publisher","first-page":"D764","DOI":"10.1093\/nar\/gkaa946","article-title":"IMG\/VR v3: An integrated ecological and evolutionary framework for interrogating genomes of uncultivated viruses","volume":"49","author":"Roux","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B282","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1038\/s41592-019-0669-3","article-title":"Fast and accurate long-read assembly with wtdbg2","volume":"17","author":"Ruan","year":"2020","journal-title":"Nat. Methods"},{"key":"B283","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1038\/s41564-018-0292-6","article-title":"Prediction of the intestinal resistome by a three-dimensional structure-based method","volume":"4","author":"Rupp\u00e9","year":"2019","journal-title":"Nat. Microbiol."},{"key":"B284","doi-asserted-by":"publisher","first-page":"1069","DOI":"10.1038\/nmeth.2212","article-title":"A travel guide to Cytoscape plugins","volume":"9","author":"Saito","year":"2012","journal-title":"Nat. 
Methods"},{"key":"B285","doi-asserted-by":"publisher","first-page":"406","DOI":"10.1093\/oxfordjournals.molbev.a040454","article-title":"The neighbor-joining method: A new method for reconstructing phylogenetic trees","volume":"4","author":"Saitou","year":"1987","journal-title":"Mol. Biol. Evol."},{"key":"B286","doi-asserted-by":"publisher","first-page":"2244","DOI":"10.1128\/jb.01811-07","article-title":"Polarity in archaeal operon transcription in Thermococcus kodakaraensis","volume":"190","author":"Santangelo","year":"2008","journal-title":"J. Bacteriol."},{"key":"B287","doi-asserted-by":"publisher","first-page":"D161","DOI":"10.1093\/nar\/gkab1135","article-title":"GenBank","volume":"50","author":"Sayers","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B288","doi-asserted-by":"publisher","first-page":"baaa062","DOI":"10.1093\/database\/baaa062","article-title":"NCBI taxonomy: A comprehensive update on curation, resources and tools","volume":"2020","author":"Schoch","year":"2020","journal-title":"Database"},{"key":"B289","doi-asserted-by":"publisher","first-page":"1003","DOI":"10.1038\/nmeth.3621","article-title":"Avoiding abundance bias in the functional annotation of post-translationally modified proteins","volume":"12","author":"Sch\u00f6lz","year":"2015","journal-title":"Nat. 
Methods"},{"key":"B290","doi-asserted-by":"publisher","first-page":"D940","DOI":"10.1093\/nar\/gkr972","article-title":"Disease ontology: A backbone for disease semantic integration","volume":"40","author":"Schriml","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"B291","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1016\/j.str.2008.12.014","article-title":"Outcome of a workshop on applications of protein models in biomedical research","volume":"17","author":"Schwede","year":"2009","journal-title":"Structure"},{"key":"B292","doi-asserted-by":"publisher","DOI":"10.3389\/fmicb.2015.01451","article-title":"gbtools: Interactive visualization of metagenome bins in R","volume":"6","author":"Seah","year":"2015","journal-title":"Front. Microbiol."},{"key":"B293","doi-asserted-by":"publisher","first-page":"329","DOI":"10.1038\/s41576-018-0003-4","article-title":"Piercing the dark matter: Bioinformatics of long-range sequencing and mapping","volume":"19","author":"Sedlazeck","year":"2018","journal-title":"Nat. Rev. Genet."},{"key":"B294","doi-asserted-by":"publisher","first-page":"2068","DOI":"10.1093\/bioinformatics\/btu153","article-title":"Prokka: Rapid prokaryotic genome annotation","volume":"30","author":"Seemann","year":"2014","journal-title":"Bioinformatics"},{"key":"B295","doi-asserted-by":"publisher","first-page":"811","DOI":"10.1038\/nmeth.2066","article-title":"Metagenomic microbial community profiling using unique clade-specific marker genes","volume":"9","author":"Segata","year":"2012","journal-title":"Nat. 
Methods"},{"key":"B296","first-page":"1","article-title":"Extreme-scale many-against-many protein similarity search","author":"Selvitopi","year":"2022"},{"key":"B297","first-page":"1","article-title":"Distributed many-to-many protein sequence alignment using sparse matrices","author":"Selvitopi","year":"2020"},{"key":"B298","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1038\/nbt.4110","article-title":"Cultivation and sequencing of rumen microbiome members from the Hungate1000 Collection","volume":"36","author":"Seshadri","year":"2018","journal-title":"Nat. Biotechnol."},{"key":"B299","doi-asserted-by":"publisher","first-page":"2128","DOI":"10.1038\/s41564-022-01266-x","article-title":"Standardized multi-omics of Earth\u2019s microbiomes reveals microbial and metabolite diversity","volume":"7","author":"Shaffer","year":"2022","journal-title":"Nat. Microbiol."},{"key":"B300","doi-asserted-by":"publisher","first-page":"e1003918","DOI":"10.1371\/journal.pcbi.1003918","article-title":"BiomeNet: A bayesian model for inference of metabolic divergence among microbial communities","volume":"10","author":"Shafiei","year":"2014","journal-title":"PLOS Comput. 
Biol."},{"key":"B301","doi-asserted-by":"publisher","first-page":"bbac258","DOI":"10.1093\/bib\/bbac258","article-title":"Accurate identification of bacteriophages from metagenomic data using Transformer","volume":"23","author":"Shang","year":"2022","journal-title":"Briefings Bioinforma."},{"key":"B302","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1186\/s12866-021-02321-z","article-title":"MetaGeneBank: A standardized database to study deep sequenced metagenomic data from human fecal specimen","volume":"21","author":"Shao","year":"2021","journal-title":"BMC Microbiol."},{"key":"B303","doi-asserted-by":"publisher","first-page":"D637","DOI":"10.1093\/nar\/gky1008","article-title":"gcMeta: a Global Catalogue of Metagenomics platform to support the archiving, standardization and analysis of microbiome data","volume":"47","author":"Shi","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B304","doi-asserted-by":"publisher","first-page":"539","DOI":"10.1038\/msb.2011.75","article-title":"Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega","volume":"7","author":"Sievers","year":"2011","journal-title":"Mol. Syst. 
Biol."},{"key":"B305","doi-asserted-by":"crossref","DOI":"10.1201\/9780429447273","volume-title":"Interactive web-based data visualization with R, plotly, and shiny","author":"Sievert","year":"2020"},{"key":"B306","doi-asserted-by":"publisher","first-page":"D344","DOI":"10.1093\/nar\/gks1067","article-title":"New and continuing developments at PROSITE","volume":"41","author":"Sigrist","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"B307","doi-asserted-by":"publisher","first-page":"D266","DOI":"10.1093\/nar\/gkaa1079","article-title":"Cath: Increased structural coverage of functional space","volume":"49","author":"Sillitoe","year":"2021","journal-title":"Nucleic Acids Res."},{"key":"B308","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1038\/73723","article-title":"Structural genomics and its importance for gene function analysis","volume":"18","author":"Skolnick","year":"2000","journal-title":"Nat. Biotechnol."},{"key":"B309","doi-asserted-by":"publisher","first-page":"e48998","DOI":"10.1371\/journal.pone.0048998","article-title":"MetaSee: An interactive and extendable visualization toolbox for metagenomic sample analysis and comparison","volume":"7","author":"Song","year":"2012","journal-title":"PLOS ONE"},{"key":"B310","doi-asserted-by":"publisher","first-page":"W74","DOI":"10.1093\/nar\/gkz380","article-title":"Prophage hunter: An integrative hunting tool for active prophages","volume":"47","author":"Song","year":"2019","journal-title":"Nucleic Acids Res."},{"key":"B311","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1016\/j.str.2013.08.005","article-title":"High-resolution comparative modeling with RosettaCM","volume":"21","author":"Song","year":"2013","journal-title":"Structure"},{"key":"B312","doi-asserted-by":"publisher","first-page":"e3001007","DOI":"10.1371\/journal.pbio.3001007","article-title":"ClipKIT: A multiple sequence alignment trimming software for accurate phylogenomic 
inference","volume":"18","author":"Steenwyk","year":"2020","journal-title":"PLoS Biol."},{"key":"B313","doi-asserted-by":"publisher","first-page":"473","DOI":"10.1186\/s12859-019-3019-7","article-title":"HH-suite3 for fast remote homology detection and deep protein annotation","volume":"20","author":"Steinegger","year":"","journal-title":"BMC Bioinforma."},{"key":"B314","doi-asserted-by":"publisher","first-page":"603","DOI":"10.1038\/s41592-019-0437-4","article-title":"Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold","volume":"16","author":"Steinegger","year":"","journal-title":"Nat. Methods"},{"key":"B315","doi-asserted-by":"publisher","first-page":"2542","DOI":"10.1038\/s41467-018-04964-5","article-title":"Clustering huge protein sequence sets in linear time","volume":"9","author":"Steinegger","year":"2018","journal-title":"Nat. Commun."},{"key":"B316","doi-asserted-by":"publisher","first-page":"1026","DOI":"10.1038\/nbt.3988","article-title":"MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets","volume":"35","author":"Steinegger","year":"2017","journal-title":"Nat. Biotechnol."},{"key":"B317","doi-asserted-by":"publisher","first-page":"3.1.1","DOI":"10.1002\/0471250953.bi0301s27","article-title":"An introduction to sequence similarity (\u201chomology\u201d) searching","author":"Stormo","year":"2009","journal-title":"Curr. Protoc. Bioinforma."},{"key":"B318","doi-asserted-by":"publisher","first-page":"410","DOI":"10.3389\/fmicb.2012.00410","article-title":"The binning of metagenomic contigs for microbial physiology of mixed cultures","volume":"3","author":"Strous","year":"2012","journal-title":"Front. Microbiol."},{"key":"B319","doi-asserted-by":"publisher","first-page":"15545","DOI":"10.1073\/pnas.0506580102","article-title":"Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles","volume":"102","author":"Subramanian","year":"2005","journal-title":"Proc. Natl. Acad. Sci. U. S. A."},{"key":"B320","doi-asserted-by":"publisher","first-page":"37","DOI":"10.21775\/cimb.024.037","article-title":"Methods for the metagenomic data visualization and analysis","volume":"24","author":"Sudarikov","year":"2017","journal-title":"Curr. Issues Mol. Biol."},{"key":"B321","doi-asserted-by":"publisher","first-page":"1261359","DOI":"10.1126\/science.1261359","article-title":"Ocean plankton. Structure and function of the global ocean microbiome","volume":"348","author":"Sunagawa","year":"2015","journal-title":"Science"},{"key":"B322","doi-asserted-by":"publisher","first-page":"564","DOI":"10.1080\/10635150701472164","article-title":"Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments","volume":"56","author":"Talavera","year":"2007","journal-title":"Syst. 
Biol."},{"key":"B323","doi-asserted-by":"publisher","first-page":"486","DOI":"10.1126\/science.1153917","article-title":"Synteny and collinearity in plant genomes","volume":"320","author":"Tang","year":"2008","journal-title":"Science"},{"key":"B324","doi-asserted-by":"publisher","first-page":"1037","DOI":"10.1093\/bioinformatics\/btx713","article-title":"Dfast: A flexible prokaryotic genome annotation pipeline for faster genome publication","volume":"34","author":"Tanizawa","year":"2018","journal-title":"Bioinformatics"},{"key":"B325","doi-asserted-by":"publisher","first-page":"6614","DOI":"10.1093\/nar\/gkw569","article-title":"NCBI prokaryotic genome annotation pipeline","volume":"44","author":"Tatusova","year":"2016","journal-title":"Nucleic Acids Res."},{"key":"B326","year":"2015"},{"key":"B327","doi-asserted-by":"publisher","first-page":"1023","DOI":"10.1038\/s41587-021-01156-3","article-title":"SignalP 6.0 predicts all five types of signal peptides using protein language models","volume":"40","author":"Teufel","year":"2022","journal-title":"Nat. Biotechnol."},{"key":"B328","doi-asserted-by":"publisher","first-page":"665","DOI":"10.3390\/biology10070665","article-title":"Flame: A web tool for functional and literature enrichment analysis of multiple gene lists","volume":"10","author":"Thanati","year":"2021","journal-title":"Biology"},{"key":"B329","doi-asserted-by":"publisher","first-page":"457","DOI":"10.1038\/nature24621","article-title":"A communal catalogue reveals Earth\u2019s multiscale microbial diversity","volume":"551","author":"Thompson","year":"2017","journal-title":"Nature"},{"key":"B330","doi-asserted-by":"publisher","first-page":"528","DOI":"10.1021\/acs.jctc.9b00591","article-title":"ff19SB: Amino-Acid-Specific protein backbone parameters trained against Quantum mechanics energy surfaces in solution","volume":"16","author":"Tian","year":"2020","journal-title":"J. Chem. 
Theory Comput."},{"key":"B331","doi-asserted-by":"publisher","first-page":"i61","DOI":"10.1093\/bioinformatics\/btz349","article-title":"cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs","volume":"35","author":"Tolstoganov","year":"2019","journal-title":"Bioinformatics"},{"key":"B332","doi-asserted-by":"publisher","first-page":"902","DOI":"10.1038\/nmeth.3589","article-title":"MetaPhlAn2 for enhanced metagenomic taxonomic profiling","volume":"12","author":"Truong","year":"2015","journal-title":"Nat. Methods"},{"key":"B333","doi-asserted-by":"publisher","first-page":"590","DOI":"10.1038\/s41586-021-03828-1","article-title":"Highly accurate protein structure prediction for the human proteome","volume":"596","author":"Tunyasuvunakool","year":"2021","journal-title":"Nature"},{"key":"B334","doi-asserted-by":"publisher","first-page":"D626","DOI":"10.1093\/nar\/gkw1134","article-title":"The UCSC genome browser database: 2017 update","volume":"45","author":"Tyner","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"B335","doi-asserted-by":"publisher","first-page":"2699","DOI":"10.1093\/nar\/gky092","article-title":"UniProt: The universal protein knowledgebase","volume":"46","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"B336","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1002\/prot.10146","article-title":"Scoring residue conservation","volume":"48","author":"Valdar","year":"2002","journal-title":"Proteins Struct. Funct. 
Bioinforma."},{"key":"B337","doi-asserted-by":"publisher","first-page":"D517","DOI":"10.1093\/nar\/gkw1101","article-title":"MicroScope in 2017: An expanding and evolving integrated resource for community expertise of microbial genomes","volume":"45","author":"Vallenet","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"B338","doi-asserted-by":"publisher","first-page":"01194","DOI":"10.1128\/msystems.01194-20","article-title":"Microbiome metadata standards: Report of the national microbiome data collaborative\u2019s workshop and follow-on activities","volume":"6","author":"Vangay","year":"2021","journal-title":"mSystems"},{"key":"B339","doi-asserted-by":"publisher","first-page":"D439","DOI":"10.1093\/nar\/gkab1061","article-title":"AlphaFold protein structure database: Massively expanding the structural coverage of protein-sequence space with high-accuracy models","volume":"50","author":"Varadi","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B340","doi-asserted-by":"publisher","first-page":"9977","DOI":"10.1016\/j.jksuci.2022.09.015","article-title":"MetaViz \u2013 a graphical meta-model instantiator for generating information dashboards and visualizations","volume":"34","author":"V\u00e1zquez-Ingelmo","year":"2022","journal-title":"J. King Saud Univ. - Comput. Inf. Sci."},{"key":"B341","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1186\/s13059-019-1817-x","article-title":"Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT","volume":"20","author":"von Meijenfeldt","year":"2019","journal-title":"Genome Biol."},{"key":"B342","doi-asserted-by":"publisher","first-page":"641","DOI":"10.1038\/s41579-022-00739-4","article-title":"Structural biology of CRISPR\u2013Cas immunity and genome editing enzymes","volume":"20","author":"Wang","year":"2022","journal-title":"Nat. Rev. 
Microbiol."},{"key":"B343","doi-asserted-by":"publisher","first-page":"i356","DOI":"10.1093\/bioinformatics\/bts397","article-title":"MetaCluster 5.0: A two-round binning approach for metagenomic data for low-abundance species in a noisy sample","volume":"28","author":"Wang","year":"2012","journal-title":"Bioinformatics"},{"key":"B344","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1186\/s13059-019-1823-z","article-title":"Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families","volume":"20","author":"Wang","year":"2019","journal-title":"Genome Biol."},{"key":"B345","doi-asserted-by":"publisher","first-page":"425","DOI":"10.1186\/s12859-017-1835-1","article-title":"Improving contig binning of metagenomic data using $d_2^S$ oligonucleotide frequency dissimilarity","volume":"18","author":"Wang","year":"2017","journal-title":"BMC Bioinforma."},{"key":"B346","doi-asserted-by":"publisher","first-page":"4229","DOI":"10.1093\/bioinformatics\/btz253","article-title":"SolidBin: Improving metagenome binning with semi-supervised normalized cut","volume":"35","author":"Wang","year":"2019","journal-title":"Bioinformatics"},{"key":"B347","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1007\/978-1-0716-0892-0_14","article-title":"Protein structure modeling with MODELLER","volume":"2199","author":"Webb","year":"2021","journal-title":"Methods Mol. 
Biol."},{"key":"B348","doi-asserted-by":"publisher","first-page":"569","DOI":"10.1101\/gr.228429.117","article-title":"Genome-reconstruction for eukaryotes from complex natural microbial communities","volume":"28","author":"West","year":"2018","journal-title":"Genome Res."},{"key":"B349","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1186\/1471-2105-15-7","article-title":"Skylign: A tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models","volume":"15","author":"Wheeler","year":"2014","journal-title":"BMC Bioinforma."},{"key":"B350","doi-asserted-by":"publisher","first-page":"6578","DOI":"10.1073\/pnas.95.12.6578","article-title":"Prokaryotes: The unseen majority","volume":"95","author":"Whitman","year":"1998","journal-title":"Proc. Natl. Acad. Sci. U. S. A."},{"key":"B351","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1186\/1471-2105-13-141","article-title":"The M5nr: A novel non-redundant database containing protein sequences and annotations from multiple sources and associated tools","volume":"13","author":"Wilke","year":"2012","journal-title":"BMC Bioinforma."},{"key":"B352","doi-asserted-by":"publisher","first-page":"160018","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR Guiding Principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Sci. 
Data"},{"key":"B353","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1101\/gr.161901","article-title":"Genome alignment, evolution of prokaryotic genome organization, and prediction of gene function using genomic context","volume":"11","author":"Wolf","year":"2001","journal-title":"Genome Res."},{"key":"B354","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1186\/s13059-019-1891-0","article-title":"Improved metagenomic analysis with Kraken 2","volume":"20","author":"Wood","year":"2019","journal-title":"Genome Biol."},{"key":"B355","doi-asserted-by":"publisher","first-page":"580","DOI":"10.1093\/bioinformatics\/btq698","article-title":"X-MATE: A flexible system for mapping short read data","volume":"27","author":"Wood","year":"2011","journal-title":"Bioinformatics"},{"key":"B356","doi-asserted-by":"publisher","DOI":"10.1101\/2022.07.21.500999","article-title":"High-resolution de novo structure prediction from primary sequence","author":"Wu","year":"2022","journal-title":"bioRxiv"},{"key":"B357","doi-asserted-by":"publisher","first-page":"605","DOI":"10.1093\/bioinformatics\/btv638","article-title":"MaxBin 2.0: An automated binning algorithm to recover genomes from multiple metagenomic datasets","volume":"32","author":"Wu","year":"2016","journal-title":"Bioinformatics"},{"key":"B358","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1089\/cmb.2010.0245","article-title":"A novel abundance-based algorithm for binning metagenomic sequences using l-tuples","volume":"18","author":"Wu","year":"2011","journal-title":"J. Comput. 
Biol."},{"key":"B359","doi-asserted-by":"publisher","first-page":"1715","DOI":"10.1002\/prot.24065","article-title":"Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field","volume":"80","author":"Xu","year":"2012","journal-title":"Proteins"},{"key":"B360","doi-asserted-by":"publisher","first-page":"645","DOI":"10.1109\/tnn.2005.845141","article-title":"Survey of clustering algorithms","volume":"16","author":"Xu","year":"2005","journal-title":"IEEE Trans. Neural Netw."},{"key":"B361","doi-asserted-by":"publisher","first-page":"6301","DOI":"10.1016\/j.csbj.2021.11.028","article-title":"A review of computational tools for generating metagenome-assembled genomes from metagenomic sequencing data","volume":"19","author":"Yang","year":"2021","journal-title":"Comput. Struct. Biotechnol. J."},{"key":"B362","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1038\/nmeth.3213","article-title":"The I-TASSER Suite: Protein structure and function prediction","volume":"12","author":"Yang","year":"2015","journal-title":"Nat. Methods"},{"key":"B363","doi-asserted-by":"publisher","first-page":"e2110828118","DOI":"10.1073\/pnas.2110828118","article-title":"Decoding the link of microbiome niches with homologous sequences enables accurately targeted protein structure prediction","volume":"118","author":"Yang","year":"2021","journal-title":"Proc. Natl. Acad. Sci. 
U.S.A."},{"key":"B364","doi-asserted-by":"publisher","first-page":"1565","DOI":"10.1038\/ismej.2011.39","article-title":"The Genomic Standards Consortium: Bringing standards to life for microbial ecology","volume":"5","author":"Yilmaz","year":"2011","journal-title":"ISME J."},{"key":"B365","doi-asserted-by":"publisher","first-page":"4172","DOI":"10.1093\/bioinformatics\/bty519","article-title":"BMC3C: Binning metagenomic contigs using codon usage, sequence composition and read coverage","volume":"34","author":"Yu","year":"2018","journal-title":"Bioinformatics"},{"key":"B366","doi-asserted-by":"publisher","first-page":"334","DOI":"10.1186\/s12859-020-03667-3","article-title":"Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets","volume":"21","author":"Yue","year":"2020","journal-title":"BMC Bioinforma."},{"key":"B367","doi-asserted-by":"publisher","first-page":"293","DOI":"10.3390\/microorganisms10020293","article-title":"PREGO: A literature and data-mining resource to associate microorganisms, biological processes, and environment types","volume":"10","author":"Zafeiropoulos","year":"2022","journal-title":"Microorganisms"},{"key":"B368","doi-asserted-by":"publisher","first-page":"4169","DOI":"10.1021\/acs.biochem.9b00735","article-title":"The EFI web resource for genomic enzymology tools: Leveraging protein, genome, and metagenome databases to discover novel enzymes and metabolic pathways","volume":"58","author":"Zallot","year":"2019","journal-title":"Biochemistry"},{"key":"B369","doi-asserted-by":"publisher","first-page":"276","DOI":"10.1186\/s12859-016-1112-8","article-title":"Clustering analysis of proteins from microbial genomes at multiple levels of resolution","volume":"17","author":"Zaslavsky","year":"2016","journal-title":"BMC Bioinforma."},{"key":"B370","doi-asserted-by":"publisher","first-page":"D754","DOI":"10.1093\/nar\/gkx1098","article-title":"Ensembl 
2018","volume":"46","author":"Zerbino","year":"2018","journal-title":"Nucleic Acids Res."},{"key":"B371","doi-asserted-by":"publisher","first-page":"702","DOI":"10.1002\/prot.20264","article-title":"Scoring function for automated assessment of protein structure template quality","volume":"57","author":"Zhang","year":"2004","journal-title":"Proteins"},{"key":"B372","doi-asserted-by":"publisher","first-page":"2302","DOI":"10.1093\/nar\/gki524","article-title":"TM-align: A protein structure alignment algorithm based on the TM-score","volume":"33","author":"Zhang","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"B373","doi-asserted-by":"publisher","first-page":"W527","DOI":"10.1093\/nar\/gkac376","article-title":"OmicsNet 2.0: A web-based platform for multi-omics integration and network visual analytics","volume":"50","author":"Zhou","year":"2022","journal-title":"Nucleic Acids Res."},{"key":"B374","doi-asserted-by":"publisher","first-page":"e132","DOI":"10.1093\/nar\/gkq275","article-title":"Ab initio gene identification in metagenomic sequences","volume":"38","author":"Zhu","year":"2010","journal-title":"Nucleic Acids Res."}],"container-title":["Frontiers in 
Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2023.1157956\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,7]],"date-time":"2023-03-07T16:37:50Z","timestamp":1678207070000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2023.1157956\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,3]]},"references-count":374,"alternative-id":["10.3389\/fbinf.2023.1157956"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2023.1157956","relation":{},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,3]]},"article-number":"1157956"}}