{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T08:01:40Z","timestamp":1775116900859,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2020,7,13]],"date-time":"2020-07-13T00:00:00Z","timestamp":1594598400000},"content-version":"vor","delay-in-days":12,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"CAPES - Ci\u00eancia sem Fronteiras","award":["BEX 13472\/13-5"],"award-info":[{"award-number":["BEX 13472\/13-5"]}]},{"DOI":"10.13039\/501100002347","name":"BMBF","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002347","name":"BMBF","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics Infrastructure","doi-asserted-by":"crossref","award":["031A537B"],"award-info":[{"award-number":["031A537B"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics Infrastructure","doi-asserted-by":"crossref","award":["031A533A"],"award-info":[{"award-number":["031A533A"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics Infrastructure","doi-asserted-by":"crossref","award":["031A538A"],"award-info":[{"award-number":["031A538A"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics 
Infrastructure","doi-asserted-by":"crossref","award":["031A533B"],"award-info":[{"award-number":["031A533B"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics Infrastructure","doi-asserted-by":"crossref","award":["031A535A"],"award-info":[{"award-number":["031A535A"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics Infrastructure","doi-asserted-by":"crossref","award":["031A537C"],"award-info":[{"award-number":["031A537C"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics Infrastructure","doi-asserted-by":"crossref","award":["031A534A"],"award-info":[{"award-number":["031A534A"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100018929","name":"German Network for Bioinformatics Infrastructure","doi-asserted-by":"crossref","award":["031A532B"],"award-info":[{"award-number":["031A532B"]}],"id":[{"id":"10.13039\/501100018929","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>The exponential growth of assembled genome sequences greatly benefits metagenomics studies. However, currently available methods struggle to manage the increasing amount of sequences and their frequent updates. Indexing the current RefSeq can take days and hundreds of GB of memory on large servers. 
Few methods address these issues thus far, and even though many can theoretically handle large amounts of references, time\/memory requirements are prohibitive in practice. As a result, many studies that require sequence classification use often outdated and almost never truly up-to-date indices.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Motivated by those limitations, we created ganon, a k-mer-based read classification tool that uses Interleaved Bloom Filters in conjunction with a taxonomic clustering and a k-mer counting\/filtering scheme. Ganon provides an efficient method for indexing references, keeping them updated. It requires &lt;55\u00a0min to index the complete RefSeq of bacteria, archaea, fungi and viruses. The tool can further keep these indices up-to-date in a fraction of the time necessary to create them. Ganon makes it possible to query against very large reference sets and therefore it classifies significantly more reads and identifies more species than similar methods. When classifying a high-complexity CAMI challenge dataset against complete genomes from RefSeq, ganon shows strongly increased precision with equal or better sensitivity compared with state-of-the-art tools. With the same dataset against the complete RefSeq, ganon improved the F1-score by 65% at the genus level. 
It supports taxonomy- and assembly-level classification, multiple indices and hierarchical classification.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The software is open-source and available at: https:\/\/gitlab.com\/rki_bioinformatics\/ganon.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa458","type":"journal-article","created":{"date-parts":[[2020,7,1]],"date-time":"2020-07-01T23:19:22Z","timestamp":1593645562000},"page":"i12-i20","source":"Crossref","is-referenced-by-count":67,"title":["ganon: precise metagenomics classification against large and up-to-date sets of reference sequences"],"prefix":"10.1093","volume":"36","author":[{"given":"Vitor C","family":"Piro","sequence":"first","affiliation":[{"name":"Bioinformatics Unit (MF1), Robert Koch Institute, Berlin 13353, Germany"},{"name":"CAPES Foundation, Ministry of Education of Brazil, Bras\u00edlia 70040-020, Brazil"},{"name":"Data Analytics and Computational Statistics, Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany"}]},{"given":"Temesgen H","family":"Dadi","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Freie Universit\u00e4t Berlin, Berlin 14195, Germany"}]},{"given":"Enrico","family":"Seiler","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Freie Universit\u00e4t Berlin, Berlin 14195, Germany"}]},{"given":"Knut","family":"Reinert","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Freie Universit\u00e4t Berlin, 
Berlin 14195, Germany"}]},{"given":"Bernhard Y","family":"Renard","sequence":"additional","affiliation":[{"name":"Bioinformatics Unit (MF1), Robert Koch Institute, Berlin 13353, Germany"},{"name":"Data Analytics and Computational Statistics, Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam 14482, Germany"}]}],"member":"286","published-online":{"date-parts":[[2020,7,13]]},"reference":[{"key":"2024041416573369800_btaa458-B1","doi-asserted-by":"crossref","first-page":"114","DOI":"10.1186\/s12864-017-3501-4","article-title":"SILVA, RDP, Greengenes, NCBI and OTT\u2014how do these taxonomies compare?","volume":"18","author":"Balvo\u010di\u016bt\u0117","year":"2017","journal-title":"BMC Genomics"},{"key":"2024041416573369800_btaa458-B2","doi-asserted-by":"crossref","first-page":"D41","DOI":"10.1093\/nar\/gkx1094","article-title":"GenBank","volume":"46","author":"Benson","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2024041416573369800_btaa458-B3","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1145\/362686.362692","article-title":"Space\/time trade-offs in hash coding with allowable errors","volume":"13","author":"Bloom","year":"1970","journal-title":"Commun. ACM"},{"key":"2024041416573369800_btaa458-B4","doi-asserted-by":"crossref","first-page":"1125","DOI":"10.1093\/bib\/bbx120","article-title":"A review of methods and databases for metagenomic classification and assembly","volume":"20","author":"Breitwieser","year":"2019","journal-title":"Brief. 
Bioinform"},{"key":"2024041416573369800_btaa458-B5","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1186\/s13059-018-1568-0","article-title":"KrakenUniq: confident and fast metagenomics classification using unique k-mer counts","volume":"19","author":"Breitwieser","year":"2018","journal-title":"Genome Biol"},{"key":"2024041416573369800_btaa458-B6","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using DIAMOND","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat. Methods"},{"key":"2024041416573369800_btaa458-B7","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1016\/j.ipl.2003.12.001","article-title":"Approximation algorithms for a hierarchically structured bin packing problem","volume":"89","author":"Codenotti","year":"2004","journal-title":"Inform. Process. Lett"},{"key":"2024041416573369800_btaa458-B8","doi-asserted-by":"crossref","first-page":"i766","DOI":"10.1093\/bioinformatics\/bty567","article-title":"DREAM-Yara: an exact read mapper for very large databases with short update time","volume":"34","author":"Dadi","year":"2018","journal-title":"Bioinformatics"},{"key":"2024041416573369800_btaa458-B9","doi-asserted-by":"crossref","first-page":"D136","DOI":"10.1093\/nar\/gkr1178","article-title":"The NCBI Taxonomy database","volume":"40","author":"Federhen","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2024041416573369800_btaa458-B10","doi-asserted-by":"crossref","first-page":"i124","DOI":"10.1093\/bioinformatics\/btx237","article-title":"Abundance estimation and differential testing on strain level in metagenomics data","volume":"33","author":"Fischer","year":"2017","journal-title":"Bioinformatics"},{"key":"2024041416573369800_btaa458-B11","doi-asserted-by":"crossref","first-page":"D851","DOI":"10.1093\/nar\/gkx1068","article-title":"RefSeq: an update on prokaryotic genome annotation and 
curation","volume":"46","author":"Haft","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2024041416573369800_btaa458-B12","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1101\/gr.5969107","article-title":"MEGAN analysis of metagenomic data","volume":"17","author":"Huson","year":"2007","journal-title":"Genome Res"},{"key":"2024041416573369800_btaa458-B13","doi-asserted-by":"crossref","first-page":"240","DOI":"10.1007\/3-540-54345-7_67","volume-title":"Mathematical Foundations of Computer Science 1991, Lecture Notes in Computer Science","author":"Jokinen","year":"1991"},{"key":"2024041416573369800_btaa458-B14","doi-asserted-by":"crossref","first-page":"1721","DOI":"10.1101\/gr.210641.116","article-title":"Centrifuge: rapid and sensitive classification of metagenomic sequences","volume":"26","author":"Kim","year":"2016","journal-title":"Genome Res"},{"key":"2024041416573369800_btaa458-B15","doi-asserted-by":"crossref","first-page":"e0198773","DOI":"10.1371\/journal.pone.0198773","article-title":"When old metagenomic data meet newly sequenced genomes, a case study","volume":"13","author":"Li","year":"2018","journal-title":"PLoS One"},{"key":"2024041416573369800_btaa458-B16","doi-asserted-by":"crossref","first-page":"19233","DOI":"10.1038\/srep19233","article-title":"An evaluation of the accuracy and speed of metagenome analysis tools","volume":"6","author":"Lindgreen","year":"2016","journal-title":"Sci. 
Rep"},{"key":"2024041416573369800_btaa458-B17","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1186\/s13059-017-1299-7","article-title":"Comprehensive benchmarking and ensemble approaches for metagenomic classifiers","volume":"18","author":"McIntyre","year":"2017","journal-title":"Genome Biol"},{"key":"2024041416573369800_btaa458-B18","doi-asserted-by":"crossref","first-page":"11257","DOI":"10.1038\/ncomms11257","article-title":"Fast and sensitive taxonomic classification for metagenomics with Kaiju","volume":"7","author":"Menzel","year":"2016","journal-title":"Nat. Commun"},{"key":"2024041416573369800_btaa458-B19","doi-asserted-by":"crossref","DOI":"10.1093\/gigascience\/giy069","article-title":"AMBER: assessment of metagenome BinnERs","volume":"7","author":"Meyer","year":"2018","journal-title":"Gigascience"},{"key":"2024041416573369800_btaa458-B20","doi-asserted-by":"crossref","first-page":"676","DOI":"10.1038\/nbt.3886","article-title":"1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life","volume":"35","author":"Mukherjee","year":"2017","journal-title":"Nat. Biotechnol"},{"key":"2024041416573369800_btaa458-B21","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1186\/s13059-018-1554-6","article-title":"RefSeq database growth influences the accuracy of k-mer-based lowest common ancestor species identification","volume":"19","author":"Nasko","year":"2018","journal-title":"Genome Biol"},{"key":"2024041416573369800_btaa458-B22","doi-asserted-by":"crossref","first-page":"75","DOI":"10.4137\/BBI.S12462","article-title":"Metagenomics: tools and insights for analyzing next-generation sequencing data derived from biodiversity studies","volume":"9","author":"Oulas","year":"2015","journal-title":"Bioinform. Biol. 
Insights"},{"key":"2024041416573369800_btaa458-B23","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1186\/s12864-015-1419-2","article-title":"CLARK: fast and accurate classification of metagenomic and genomic sequences using discriminative k-mers","volume":"16","author":"Ounit","year":"2015","journal-title":"BMC Genomics"},{"key":"2024041416573369800_btaa458-B24","doi-asserted-by":"crossref","first-page":"1533","DOI":"10.1038\/s41564-017-0012-7","article-title":"Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life","volume":"2","author":"Parks","year":"2017","journal-title":"Nat. Microbiol"},{"key":"2024041416573369800_btaa458-B25","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1186\/s12859-015-0788-5","article-title":"Evaluation of shotgun metagenomics sequence classification methods using in silico and in vitro simulated communities","volume":"16","author":"Peabody","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2024041416573369800_btaa458-B26","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1146\/annurev-genom-090413-025358","article-title":"Alignment of next-generation sequencing reads","volume":"16","author":"Reinert","year":"2015","journal-title":"Annu. Rev. Genomics Hum. Genet"},{"key":"2024041416573369800_btaa458-B27","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1016\/j.jbiotec.2017.07.017","article-title":"The SeqAn C++ template library for efficient sequence analysis: a resource for programmers","volume":"261","author":"Reinert","year":"2017","journal-title":"J. Biotechnol"},{"key":"2024041416573369800_btaa458-B28","doi-asserted-by":"crossref","first-page":"1063","DOI":"10.1038\/nmeth.4458","article-title":"Critical assessment of metagenome interpretation\u2013a benchmark of metagenomics software","volume":"14","author":"Sczyrba","year":"2017","journal-title":"Nat. 
Methods"},{"key":"2024041416573369800_btaa458-B29","doi-asserted-by":"crossref","first-page":"3750","DOI":"10.1093\/bioinformatics\/bty433","article-title":"Livekraken - real-time metagenomic classification of illumina data","volume":"34","author":"Tausch","year":"2018","journal-title":"Bioinformatics"},{"key":"2024041416573369800_btaa458-B30","doi-asserted-by":"crossref","first-page":"902","DOI":"10.1038\/nmeth.3589","article-title":"MetaPhlAn2 for enhanced metagenomic taxonomic profiling","volume":"12","author":"Truong","year":"2015","journal-title":"Nat. Methods"},{"key":"2024041416573369800_btaa458-B31","doi-asserted-by":"crossref","first-page":"170203","DOI":"10.1038\/sdata.2017.203","article-title":"The reconstruction of 2,631 draft metagenome-assembled genomes from the global oceans","volume":"5","author":"Tully","year":"2018","journal-title":"Sci. Data"},{"key":"2024041416573369800_btaa458-B32","doi-asserted-by":"crossref","first-page":"R46","DOI":"10.1186\/gb-2014-15-3-r46","article-title":"Kraken: ultrafast metagenomic sequence classification using exact alignments","volume":"15","author":"Wood","year":"2014","journal-title":"Genome Biol"},{"key":"2024041416573369800_btaa458-B33","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1186\/s13059-019-1891-0","article-title":"Improved metagenomic analysis with Kraken 2","volume":"20","author":"Wood","year":"2019","journal-title":"Genome 
Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_1\/i12\/57232196\/bioinformatics_36_supplement1_i12.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_1\/i12\/57232196\/bioinformatics_36_supplement1_i12.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,14]],"date-time":"2024-04-14T12:58:16Z","timestamp":1713099496000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/Supplement_1\/i12\/5870470"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,1]]},"references-count":33,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2020,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa458","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/406017","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,7]]},"published":{"date-parts":[[2020,7,1]]}}}