{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T10:53:08Z","timestamp":1740135188951,"version":"3.37.3"},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"S5","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2013,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Environmental shotgun sequencing (ESS) has potential to give greater insight into microbial communities than targeted sequencing of 16S regions, but requires much higher sequence coverage. The advent of next-generation sequencing has made it feasible for the Human Microbiome Project and other initiatives to generate ESS data on a large scale, but computationally efficient methods for analysing such data sets are needed.<\/jats:p>\n          <jats:p>Here we present metaBEETL, a fast taxonomic classifier for environmental shotgun sequences. It uses a Burrows-Wheeler Transform (BWT) index of the sequencing reads and an indexed database of microbial reference sequences. Unlike other BWT-based tools, our method has no upper limit on the number or the total size of the reference sequences in its database. By capturing sequence relationships between strains, our reference index also allows us to classify reads which are not unique to an individual strain but are nevertheless specific to some higher phylogenetic order.<\/jats:p>\n          <jats:p>Tested on datasets with known taxonomic composition, metaBEETL gave results that are competitive with existing similarity-based tools: due to normalization steps which other classifiers lack, the taxonomic profile computed by metaBEETL closely matched the true environmental profile. At the same time, its moderate running time and low memory footprint allow metaBEETL to scale well to large data sets.<\/jats:p>\n          <jats:p>Code to construct the BWT indexed database and for the taxonomic classification is part of the BEETL library, available as a github repository at git@github.com:BEETL\/BEETL.git.<\/jats:p>","DOI":"10.1186\/1471-2105-14-s5-s2","type":"journal-article","created":{"date-parts":[[2013,10,8]],"date-time":"2013-10-08T12:09:57Z","timestamp":1381234197000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["metaBEETL: high-throughput analysis of heterogeneous microbial populations from shotgun DNA sequences"],"prefix":"10.1186","volume":"14","author":[{"given":"Christina","family":"Ander","sequence":"first","affiliation":[]},{"given":"Ole B","family":"Schulz-Trieglaff","sequence":"additional","affiliation":[]},{"given":"Jens","family":"Stoye","sequence":"additional","affiliation":[]},{"given":"Anthony J","family":"Cox","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2013,4,10]]},"reference":[{"issue":"21","key":"5770_CR1","doi-asserted-by":"publisher","first-page":"7188","DOI":"10.1093\/nar\/gkm864","volume":"35","author":"E Pruesse","year":"2007","unstructured":"Pruesse E, Quast C, Knittel K, Fuchs BM, Ludwig W, Peplies J, Glockner FO: SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res. 2007, 35 (21): 7188-7196. 10.1093\/nar\/gkm864.","journal-title":"Nucleic Acids Res"},{"issue":"16","key":"5770_CR2","doi-asserted-by":"publisher","first-page":"5261","DOI":"10.1128\/AEM.00062-07","volume":"73","author":"Q Wang","year":"2007","unstructured":"Wang Q, Garrity GM, Tiedje JM, Cole JR: Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007, 73 (16): 5261-5267. 10.1128\/AEM.00062-07.","journal-title":"Appl Environ Microbiol"},{"issue":"3","key":"5770_CR3","doi-asserted-by":"publisher","first-page":"717","DOI":"10.1128\/AEM.06516-11","volume":"78","author":"ES Wright","year":"2012","unstructured":"Wright ES, Yilmaz LS, Noguera DR: DECIPHER, a search-based approach to chimera identification for 16S rRNA sequences. Appl Environ Microbiol. 2012, 78 (3): 717-725. 10.1128\/AEM.06516-11.","journal-title":"Appl Environ Microbiol"},{"issue":"16","key":"5770_CR4","doi-asserted-by":"publisher","first-page":"5180","DOI":"10.1093\/nar\/gkn496","volume":"36","author":"C Manichanh","year":"2008","unstructured":"Manichanh C, Chapple CE, Frangeul L, Gloux K, Guigo R, Dore J: A comparison of random sequence reads versus 16S rDNA sequences for estimating the biodiversity of a metagenomic library. Nucleic Acids Res. 2008, 36 (16): 5180-5188. 10.1093\/nar\/gkn496.","journal-title":"Nucleic Acids Res"},{"issue":"5667","key":"5770_CR5","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1126\/science.1093857","volume":"304","author":"JC Venter","year":"2004","unstructured":"Venter JC, Remington K, Heidelberg JF, Halpern AL, Rusch D, Eisen JA, Wu D, Paulsen I, Nelson KE, Nelson W, Fouts DE, Levy S, Knap AH, Lomas MW, Nealson K, White O, Peterson J, Hoffman J, Parsons R, Baden-Tillson H, Pfannkoch C, Rogers YH, Smith HO: Environmental genome shotgun sequencing of the Sargasso Sea. Science. 2004, 304 (5667): 66-74. 10.1126\/science.1093857. [http:\/\/www.sciencemag.org\/content\/304\/5667\/66.abstract]","journal-title":"Science"},{"issue":"12","key":"5770_CR6","doi-asserted-by":"publisher","first-page":"2317","DOI":"10.1101\/gr.096651.109","volume":"19","author":"TNHW Group","year":"2009","unstructured":"Group TNHW, Peterson J, Garges S, Giovanni M, McInnes P, Wang L, Schloss JA, Bonazzi V, McEwen JE, Wetterstrand KA, Deal C, Baker CC, Di Francesco V, Howcroft TK, Karp RW, Lunsford RD, Wellington CR, Belachew T, Wright M, Giblin C, David H, Mills M, Salomon R, Mullins C, Akolkar B, Begg L, Davis C, Grandison L, Humble M, Khalsa J, Little AR, Peavy H, Pontzer C, Portnoy M, Sayre MH, Starke-Reed P, Zakhari S, Read J, Watson B, Guyer M: The NIH Human Microbiome Project. Genome Research. 2009, 19 (12): 2317-2323. [http:\/\/genome.cshlp.org\/content\/19\/12\/2317.abstract]","journal-title":"Genome Research"},{"issue":"3","key":"5770_CR7","doi-asserted-by":"publisher","first-page":"000","DOI":"10.1101\/gr.5969107","volume":"17","author":"DH Huson","year":"2007","unstructured":"Huson DH, Auch AF, Qi J, Schuster SC: MEGAN analysis of metagenomic data. Genome Research. 2007, 17 (3): 000-[http:\/\/genome.cshlp.org\/content\/early\/2007\/01\/01\/gr.5969107.abstract]","journal-title":"Genome Research"},{"issue":"14","key":"5770_CR8","doi-asserted-by":"publisher","first-page":"e91","DOI":"10.1093\/nar\/gkr225","volume":"39","author":"W Gerlach","year":"2011","unstructured":"Gerlach W, Stoye J: Taxonomic classification of metagenomic shotgun sequences with CARMA3. Nucleic Acids Res. 2011, 39 (14): e91-10.1093\/nar\/gkr225.","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"5770_CR9","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","volume":"215","author":"SF Altschul","year":"1990","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. Journal of Molecular Biology. 1990, 215 (3): 403-410. [http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0022283605803602]","journal-title":"Journal of Molecular Biology"},{"issue":"8","key":"5770_CR10","doi-asserted-by":"publisher","first-page":"e41224","DOI":"10.1371\/journal.pone.0041224","volume":"7","author":"CF Davenport","year":"2012","unstructured":"Davenport CF, Neugebauer J, Beckmann N, Friedrich B, Kameri B, Kokott S, Paetow M, Siekmann B, Wieding-Drewes M, Wienh\u00f6fer M, Wolf S, T\u00fcmmler B, Ahlers V, Sprengel F: Genometa - A fast and accurate classifier for short metagenomic shotgun reads. PLoS ONE. 2012, 7 (8): e41224-10.1371\/journal.pone.0041224. [http:\/\/dx.doi.org\/10.1371%2Fjournal.pone.0041224]","journal-title":"PLoS ONE"},{"issue":"3","key":"5770_CR11","doi-asserted-by":"publisher","first-page":"R25+","DOI":"10.1186\/gb-2009-10-3-r25","volume":"10","author":"B Langmead","year":"2009","unstructured":"Langmead B, Trapnell C, Pop M, Salzberg S: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 2009, 10 (3): R25+-[http:\/\/dx.doi.org\/10.1186\/gb-2009-10-3-r25]","journal-title":"Genome Biology"},{"issue":"14","key":"5770_CR12","doi-asserted-by":"publisher","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","volume":"25","author":"H Li","year":"2009","unstructured":"Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009, 25 (14): 1754-1760. 10.1093\/bioinformatics\/btp324. [http:\/\/bioinformatics.oxfordjournals.org\/content\/25\/14\/1754.abstract]","journal-title":"Bioinformatics"},{"key":"5770_CR13","volume-title":"Tech rep","author":"M Burrows","year":"1994","unstructured":"Burrows M, Wheeler DJ: A block sorting data compression algorithm. Tech rep. 1994, DIGITAL System Research Center"},{"key":"5770_CR14","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-78909-5","volume-title":"The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching","author":"D Adjeroh","year":"2008","unstructured":"Adjeroh D, Bell T, Mukherjee A: The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching. 2008, Springer Publishing Company, Incorporated, 1","edition":"1"},{"key":"5770_CR15","doi-asserted-by":"publisher","first-page":"390-","DOI":"10.1109\/SFCS.2000.892127","volume-title":"Proceedings of the 41st Annual Symposium on Foundations of Computer Science","author":"P Ferragina","year":"2000","unstructured":"Ferragina P, Manzini G: Opportunistic data structures with applications. Proceedings of the 41st Annual Symposium on Foundations of Computer Science. 2000, FOCS '00, Washington, DC, USA: IEEE Computer Society, 390--[http:\/\/dl.acm.org\/citation.cfm?id=795666.796543]"},{"key":"5770_CR16","unstructured":"NCBI Taxonomy. [ftp:\/\/ftp.ncbi.nlm.nih.gov\/pub\/taxonomy]"},{"key":"5770_CR17","first-page":"214","volume-title":"WABI 2012, Volume 7534 of LNBI","author":"AJ Cox","year":"2012","unstructured":"Cox AJ, Jakobi T, Rosone G, Schulz-Trieglaff OB: Comparing DNA sequence collections by direct comparison of compressed text indexes. WABI 2012, Volume 7534 of LNBI. 2012, 214-224."},{"key":"5770_CR18","first-page":"219","volume-title":"CPM 2011, Volume 6661 of LNCS","author":"MJ Bauer","year":"2011","unstructured":"Bauer MJ, Cox AJ, Rosone G: Lightweight BWT construction for very large string collections. CPM 2011, Volume 6661 of LNCS. 2011, Springer, 219-231."},{"issue":"Suppl 2","key":"5770_CR19","doi-asserted-by":"publisher","first-page":"S4","DOI":"10.1186\/1471-2164-12-S2-S4","volume":"12","author":"B Liu","year":"2011","unstructured":"Liu B, Gibbons T, Ghodsi M, Treangen T, Pop M: Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences. BMC Genomics. 2011, 12 (Suppl 2): S4-10.1186\/1471-2164-12-S2-S4.","journal-title":"BMC Genomics"},{"issue":"10","key":"5770_CR20","doi-asserted-by":"publisher","first-page":"e3373","DOI":"10.1371\/journal.pone.0003373","volume":"3","author":"DC Richter","year":"2008","unstructured":"Richter DC, Ott F, Auch AF, Schmid R, Huson DH: MetaSim: a sequencing simulator for genomics and metagenomics. PLoS ONE. 2008, 3 (10): e3373-10.1371\/journal.pone.0003373.","journal-title":"PLoS ONE"},{"key":"5770_CR21","unstructured":"NCBI Microbial Genomes. [ftp:\/\/ftp.ncbi.nlm.nih.gov\/genomes\/Bacteria]"},{"issue":"12","key":"5770_CR22","doi-asserted-by":"publisher","first-page":"i367","DOI":"10.1093\/bioinformatics\/btq217","volume":"26","author":"JT Simpson","year":"2010","unstructured":"Simpson JT, Durbin R: Efficient construction of an assembly string graph using the FM-index. Bioinformatics. 2010, 26 (12): i367-i373. 10.1093\/bioinformatics\/btq217. [http:\/\/dx.doi.org\/10.1093\/bioinformatics\/btq217]","journal-title":"Bioinformatics"},{"issue":"3","key":"5770_CR23","doi-asserted-by":"publisher","first-page":"549","DOI":"10.1101\/gr.126953.111","volume":"22","author":"JT Simpson","year":"2012","unstructured":"Simpson JT, Durbin R: Efficient de novo assembly of large genomes using compressed data structures. Genome Research. 2012, 22 (3): 549-556. 10.1101\/gr.126953.111. [http:\/\/genome.cshlp.org\/content\/22\/3\/549.abstract]","journal-title":"Genome Research"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-14-S5-S2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T22:51:28Z","timestamp":1630536688000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-14-S5-S2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,4]]},"references-count":23,"journal-issue":{"issue":"S5","published-print":{"date-parts":[[2013,4]]}},"alternative-id":["5770"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-14-s5-s2","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2013,4]]},"assertion":[{"value":"10 April 2013","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S2"}}