{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T21:14:04Z","timestamp":1775855644225,"version":"3.50.1"},"reference-count":26,"publisher":"PeerJ","license":[{"start":{"date-parts":[[2017,1,2]],"date-time":"2017-01-02T00:00:00Z","timestamp":1483315200000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"US National Institutes of Health","doi-asserted-by":"crossref","award":["R01-HG006677"],"award-info":[{"award-number":["R01-HG006677"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000002","name":"US National Institutes of Health","doi-asserted-by":"crossref","award":["R01-GM083873"],"award-info":[{"award-number":["R01-GM083873"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000183","name":"US Army Research Office","doi-asserted-by":"crossref","award":["W911NF-1410490"],"award-info":[{"award-number":["W911NF-1410490"]}],"id":[{"id":"10.13039\/100000183","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"abstract":"<jats:p>Metagenomic experiments attempt to characterize microbial communities using high-throughput DNA sequencing. Identification of the microorganisms in a sample provides information about the genetic profile, population structure, and role of microorganisms within an environment. Until recently, most metagenomics studies focused on high-level characterization at the level of phyla, or alternatively sequenced the 16S ribosomal RNA gene that is present in bacterial species. As the cost of sequencing has fallen, though, metagenomics experiments have increasingly used unbiased shotgun sequencing to capture all the organisms in a sample. 
This approach requires a method for estimating abundance directly from the raw read data. Here we describe a fast, accurate new method that computes the abundance at the species level using the reads collected in a metagenomics experiment. Bracken (Bayesian Reestimation of Abundance after Classification with KrakEN) uses the taxonomic assignments made by Kraken, a very fast read-level classifier, along with information about the genomes themselves to estimate abundance at the species level, the genus level, or above. We demonstrate that Bracken can produce accurate species- and genus-level abundance estimates even when a sample contains multiple near-identical species.<\/jats:p>","DOI":"10.7717\/peerj-cs.104","type":"journal-article","created":{"date-parts":[[2017,1,2]],"date-time":"2017-01-02T02:37:29Z","timestamp":1483324649000},"page":"e104","source":"Crossref","is-referenced-by-count":1825,"title":["Bracken: estimating species abundance in metagenomics data"],"prefix":"10.7717","volume":"3","author":[{"given":"Jennifer","family":"Lu","sequence":"first","affiliation":[{"name":"Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States"},{"name":"Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States"}]},{"given":"Florian P.","family":"Breitwieser","sequence":"additional","affiliation":[{"name":"Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States"}]},{"given":"Peter","family":"Thielen","sequence":"additional","affiliation":[{"name":"Applied Physics Laboratory, Johns Hopkins University, Laurel, MD, United States"}]},{"given":"Steven L.","family":"Salzberg","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States"},{"name":"Center for Computational Biology, 
McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States"},{"name":"Departments of Computer Science and Biostatistics, Johns Hopkins University, Baltimore, MD, United States"}]}],"member":"4443","published-online":{"date-parts":[[2017,1,2]]},"reference":[{"key":"10.7717\/peerj-cs.104\/ref-1","doi-asserted-by":"publisher","first-page":"e1000593","DOI":"10.1371\/journal.pcbi.1000593","article-title":"The GAAS metagenomic tool and its estimations of viral and microbial average genome size in four major biomes","volume":"5","author":"Angly","year":"2009","journal-title":"PLoS Computational Biology"},{"key":"10.7717\/peerj-cs.104\/ref-2","doi-asserted-by":"publisher","first-page":"D30","DOI":"10.1093\/nar\/gku1216","article-title":"GenBank","volume":"43","author":"Benson","year":"2015","journal-title":"Nucleic Acids Research"},{"key":"10.7717\/peerj-cs.104\/ref-3","doi-asserted-by":"publisher","first-page":"856","DOI":"10.1186\/s12864-015-2063-6","article-title":"Impact of library preparation protocols and template quantity on the metagenomic reconstruction of a mock microbial community","volume":"16","author":"Bowers","year":"2015","journal-title":"BMC Genomics"},{"key":"10.7717\/peerj-cs.104\/ref-4","doi-asserted-by":"publisher","first-page":"3684","DOI":"10.1073\/pnas.052548299","article-title":"A new evolutionary scenario for the Mycobacterium tuberculosis complex","volume":"99","author":"Brosch","year":"2002","journal-title":"Proceedings of the National Academy of Sciences of the United States of America"},{"key":"10.7717\/peerj-cs.104\/ref-5","doi-asserted-by":"publisher","first-page":"7877","DOI":"10.1073\/pnas.1130426100","article-title":"The complete genome sequence of Mycobacterium bovis","volume":"100","author":"Garnier","year":"2003","journal-title":"Proceedings of the National Academy of Sciences of the United States of 
America"},{"key":"10.7717\/peerj-cs.104\/ref-6","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1054\/tube.2000.0263","article-title":"Mycobacterium bovis infection in human beings","volume":"81","author":"Grange","year":"2001","journal-title":"Tuberculosis"},{"key":"10.7717\/peerj-cs.104\/ref-7","doi-asserted-by":"publisher","first-page":"2627","DOI":"10.1128\/AEM.66.6.2627-2630.2000","article-title":"Bacillus anthracis, Bacillus cereus, and Bacillus thuringiensis\u2013one species on the basis of genetic evidence","volume":"66","author":"Helgason","year":"2000","journal-title":"Applied and Environmental Microbiology"},{"key":"10.7717\/peerj-cs.104\/ref-8","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1038\/nature11234","article-title":"Structure, function and diversity of the healthy human microbiome","volume":"486","author":"Human Microbiome Project C","year":"2012","journal-title":"Nature"},{"key":"10.7717\/peerj-cs.104\/ref-9","doi-asserted-by":"publisher","first-page":"1125","DOI":"10.1016\/S1286-4579(02)01637-4","article-title":"Escherichia coli in disguise: molecular origins of Shigella","volume":"4","author":"Lan","year":"2002","journal-title":"Microbes and Infection"},{"key":"10.7717\/peerj-cs.104\/ref-10","doi-asserted-by":"publisher","DOI":"10.1038\/srep19233","article-title":"An evaluation of the accuracy and speed of metagenome analysis tools","volume":"6","author":"Lindgreen","year":"2016","journal-title":"Scientific Reports"},{"key":"10.7717\/peerj-cs.104\/ref-11","doi-asserted-by":"publisher","first-page":"e10","DOI":"10.1093\/nar\/gks803","article-title":"Metagenomic abundance estimation and diagnostic testing on species level","volume":"41","author":"Lindner","year":"2012","journal-title":"Nucleic Acids Research"},{"key":"10.7717\/peerj-cs.104\/ref-12","doi-asserted-by":"publisher","DOI":"10.1038\/srep14082","article-title":"Genomic insights into the taxonomic status of the Bacillus cereus 
group","volume":"5","author":"Liu","year":"2015","journal-title":"Scientific Reports"},{"key":"10.7717\/peerj-cs.104\/ref-13","doi-asserted-by":"publisher","first-page":"1045","DOI":"10.1038\/nbt.3319","article-title":"ConStrains identifies microbial strains in metagenomic datasets","volume":"33","author":"Luo","year":"2015","journal-title":"Nature Biotechnology"},{"key":"10.7717\/peerj-cs.104\/ref-14","doi-asserted-by":"publisher","first-page":"e31386","DOI":"10.1371\/journal.pone.0031386","article-title":"Assessment of metagenomic assembly using simulated next generation sequencing data","volume":"7","author":"Mende","year":"2012","journal-title":"PLoS ONE"},{"key":"10.7717\/peerj-cs.104\/ref-15","doi-asserted-by":"publisher","first-page":"1757","DOI":"10.1093\/bioinformatics\/btn322","article-title":"Database indexing for production MegaBLAST searches","volume":"24","author":"Morgulis","year":"2008","journal-title":"Bioinformatics"},{"key":"10.7717\/peerj-cs.104\/ref-16","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1186\/s12859-015-0788-5","article-title":"Evaluation of shotgun metagenomics sequence classification methods using in silico and in vitro simulated communities","volume":"16","author":"Peabody","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"10.7717\/peerj-cs.104\/ref-17","doi-asserted-by":"publisher","first-page":"525","DOI":"10.1146\/annurev.genet.38.072902.091216","article-title":"Metagenomics: genomic analysis of microbial communities","volume":"38","author":"Riesenfeld","year":"2004","journal-title":"Annual Review of Genetics"},{"key":"10.7717\/peerj-cs.104\/ref-18","article-title":"Pseudoalignment for metagenomic read assignment","author":"Schaeffer","year":"2015"},{"key":"10.7717\/peerj-cs.104\/ref-19","doi-asserted-by":"publisher","first-page":"811","DOI":"10.1038\/nmeth.2066","article-title":"Metagenomic microbial community profiling using unique clade-specific marker 
genes","volume":"9","author":"Segata","year":"2012","journal-title":"Nature Methods"},{"key":"10.7717\/peerj-cs.104\/ref-20","doi-asserted-by":"publisher","DOI":"10.1186\/1471-2105-15-242","article-title":"Accurate genome relative abundance estimation for closely related species in a metagenomic sample","volume":"15","author":"Sohn","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"10.7717\/peerj-cs.104\/ref-21","doi-asserted-by":"publisher","first-page":"860","DOI":"10.1038\/35057062","article-title":"Initial sequencing and analysis of the human genome","volume":"409","author":"The International Human Genome Sequencing Consortium","year":"2001","journal-title":"Nature"},{"key":"10.7717\/peerj-cs.104\/ref-22","doi-asserted-by":"publisher","first-page":"562","DOI":"10.4056\/sigs.3899418","article-title":"Complete genome sequence of Anabaena variabilis ATCC 29413","volume":"9","author":"Thiel","year":"2014","journal-title":"Standards in Genomic Sciences"},{"key":"10.7717\/peerj-cs.104\/ref-23","doi-asserted-by":"publisher","first-page":"1304","DOI":"10.1126\/science.1058040","article-title":"The sequence of the human genome","volume":"291","author":"Venter","year":"2001","journal-title":"Science"},{"key":"10.7717\/peerj-cs.104\/ref-24","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1126\/science.1093857","article-title":"Environmental genome shotgun sequencing of the Sargasso Sea","volume":"304","author":"Venter","year":"2004","journal-title":"Science"},{"key":"10.7717\/peerj-cs.104\/ref-25","doi-asserted-by":"publisher","DOI":"10.1186\/gb-2014-15-3-r46","article-title":"Kraken: ultrafast metagenomic sequence classification using exact alignments","volume":"15","author":"Wood","year":"2014","journal-title":"Genome Biology"},{"key":"10.7717\/peerj-cs.104\/ref-26","doi-asserted-by":"publisher","first-page":"e27992","DOI":"10.1371\/journal.pone.0027992","article-title":"Accurate genome relative abundance estimation based on shotgun metagenomic 
reads","volume":"6","author":"Xia","year":"2011","journal-title":"PLoS ONE"}],"container-title":["PeerJ Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/peerj.com\/articles\/cs-104.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-104.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-104.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/peerj.com\/articles\/cs-104.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2017,1,2]],"date-time":"2017-01-02T02:37:33Z","timestamp":1483324653000},"score":1,"resource":{"primary":{"URL":"https:\/\/peerj.com\/articles\/cs-104"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,1,2]]},"references-count":26,"alternative-id":["10.7717\/peerj-cs.104"],"URL":"https:\/\/doi.org\/10.7717\/peerj-cs.104","archive":["CLOCKSS","LOCKSS","Portico"],"relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/051813","asserted-by":"object"}]},"ISSN":["2376-5992"],"issn-type":[{"value":"2376-5992","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,1,2]]},"article-number":"e104"}}