{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,8]],"date-time":"2026-01-08T22:26:22Z","timestamp":1767911182759,"version":"3.49.0"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":2132,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,1,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Metagenomics is a recent field of biology that studies microbial communities by analyzing their genomic content directly sequenced from the environment. A metagenomic dataset consists of many short DNA or RNA fragments called reads. One interesting problem in metagenomic data analysis is the discovery of the taxonomic composition of a given dataset. A simple method for this task, called the Lowest Common Ancestor (LCA), is employed in state-of-the-art computational tools for metagenomic data analysis of very short reads (about 100 bp). However, LCA has two main drawbacks: it possibly assigns many reads to high taxonomic ranks and it discards a high number of reads.<\/jats:p>\n               <jats:p>Results: We present MTR, a new method for tackling these drawbacks using clustering at Multiple Taxonomic Ranks. Unlike LCA, which processes the reads one-by-one, MTR exploits information shared by reads. Specifically, MTR consists of two main phases. First, for each taxonomic rank, a collection of potential clusters of reads is generated, and each potential cluster is associated with a taxon at that rank. Next, a small number of clusters is selected at each rank using a combinatorial optimization algorithm. 
The effectiveness of the resulting method is tested on a large number of simulated and real-life metagenomes. Results of experiments show that MTR improves on LCA by discarding a significantly smaller number of reads and by assigning many more reads at lower taxonomic ranks. Moreover, MTR provides a more faithful taxonomic characterization of the metagenome population distribution.<\/jats:p>\n               <jats:p>Availability: Matlab and C++ source code for the method is available at http:\/\/cs.ru.nl\/~gori\/software\/MTR.tar.gz.<\/jats:p>\n               <jats:p>Contact: gori@cs.ru.nl; elenam@cs.ru.nl<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq649","type":"journal-article","created":{"date-parts":[[2010,12,3]],"date-time":"2010-12-03T01:53:52Z","timestamp":1291341232000},"page":"196-203","source":"Crossref","is-referenced-by-count":47,"title":["MTR: taxonomic annotation of short metagenomic reads using clustering at multiple taxonomic ranks"],"prefix":"10.1093","volume":"27","author":[{"given":"Fabio","family":"Gori","sequence":"first","affiliation":[{"name":"1 Radboud University Nijmegen, Institute for Computing and Information Sciences, Nijmegen, The Netherlands, 2ICAR-CNR, Institute of High Performance Computing and Networking, Rende, Italy and 3Radboud University Nijmegen, IWWR, Department of Microbiology, Nijmegen, The Netherlands"}]},{"given":"Gianluigi","family":"Folino","sequence":"additional","affiliation":[{"name":"1 Radboud University Nijmegen, Institute for Computing and Information Sciences, Nijmegen, The Netherlands, 2ICAR-CNR, Institute of High Performance Computing and Networking, Rende, Italy and 3Radboud University Nijmegen, IWWR, Department of Microbiology, Nijmegen, The Netherlands"}]},{"given":"Mike S. 
M.","family":"Jetten","sequence":"additional","affiliation":[{"name":"1 Radboud University Nijmegen, Institute for Computing and Information Sciences, Nijmegen, The Netherlands, 2ICAR-CNR, Institute of High Performance Computing and Networking, Rende, Italy and 3Radboud University Nijmegen, IWWR, Department of Microbiology, Nijmegen, The Netherlands"}]},{"given":"Elena","family":"Marchiori","sequence":"additional","affiliation":[{"name":"1 Radboud University Nijmegen, Institute for Computing and Information Sciences, Nijmegen, The Netherlands, 2ICAR-CNR, Institute of High Performance Computing and Networking, Rende, Italy and 3Radboud University Nijmegen, IWWR, Department of Microbiology, Nijmegen, The Netherlands"}]}],"member":"286","published-online":{"date-parts":[[2010,12,1]]},"reference":[{"key":"2023012512180096900_B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023012512180096900_B2","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1128\/mr.59.1.143-169.1995","article-title":"Phylogenetic identification and in situ detection of individual microbial cells without cultivation","volume":"59","author":"Amann","year":"1995","journal-title":"Microbiol. Rev."},{"key":"2023012512180096900_B3","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1016\/0196-6774(81)90020-1","article-title":"A linear-time approximation algorithm for the weighted vertex cover problem","volume":"2","author":"Bar-Yehuda","year":"1981","journal-title":"J. Algorithms"},{"key":"2023012512180096900_B4","doi-asserted-by":"crossref","first-page":"771","DOI":"10.1146\/annurev.genet.38.072902.094318","article-title":"Comparative genomic structure of prokaryotes","volume":"38","author":"Bentley","year":"2004","journal-title":"Ann. Rev. 
Genet."},{"key":"2023012512180096900_B5","doi-asserted-by":"crossref","first-page":"960","DOI":"10.1101\/gr.5578007","article-title":"A framework for collaborative analysis of ENCODE data: making large-scale analyses biologist-friendly","volume":"17","author":"Blankenberg","year":"2007","journal-title":"Genome Res."},{"key":"2023012512180096900_B6","doi-asserted-by":"crossref","DOI":"10.1002\/0471142727.mb1910s89","article-title":"Galaxy: a web-based genome analysis tool for experimentalists","author":"Blankenberg","year":"2010","journal-title":"Curr. Protoc. Mol. Biol."},{"key":"2023012512180096900_B7","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1038\/nmeth.1358","article-title":"Phymm and phymmbl: metagenomic phylogenetic classification with interpolated markov models","volume":"6","author":"Brady","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012512180096900_B8","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1287\/moor.4.3.233","article-title":"A greedy heuristic for the set-covering problem","volume":"4","author":"Chvatal","year":"1979","journal-title":"Math. Operat. Res."},{"key":"2023012512180096900_B9","first-page":"3","article-title":"Accurate taxonomic assignment of short pyrosequencing reads","volume":"15","author":"Clemente","year":"2010","journal-title":"Pac. Symp. 
Biocomput."},{"key":"2023012512180096900_B10","doi-asserted-by":"crossref","first-page":"i7","DOI":"10.1093\/bioinformatics\/btn276","article-title":"Annotation of metagenome short reads using proxygenes","volume":"24","author":"Dalevi","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012512180096900_B11","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1093\/bioinformatics\/btm009","article-title":"Identifying bacterial genes and endosymbiont DNA with Glimmer","volume":"23","author":"Delcher","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012512180096900_B12","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1186\/1471-2164-7-57","article-title":"Using pyrosequencing to shed light on deep mine microbial ecology","volume":"7","author":"Edwards","year":"2006","journal-title":"BMC Genomics"},{"key":"2023012512180096900_B13","doi-asserted-by":"crossref","DOI":"10.1038\/sj.embor.7400538","article-title":"Environments shape the nucleotide composition of genomes","volume":"6","author":"Foerstner","year":"2005","journal-title":"EMBO Rep."},{"key":"2023012512180096900_B14","first-page":"152","article-title":"Clustering metagenome short reads using weighted proteins","volume-title":"EvoBIO","author":"Folino","year":"2009"},{"key":"2023012512180096900_B15","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1101\/gr.5969107","article-title":"Megan analysis of metagenomic data","volume":"17","author":"Huson","year":"2007","journal-title":"Genome Res."},{"key":"2023012512180096900_B16","volume-title":"BLAST.","author":"Korf","year":"2003"},{"key":"2023012512180096900_B17","doi-asserted-by":"crossref","first-page":"2230","DOI":"10.1093\/nar\/gkn038","article-title":"Phylogenetic classification of short environmental DNA fragments","volume":"36","author":"Krause","year":"2008","journal-title":"Nucleic Acids 
Res."},{"key":"2023012512180096900_B18","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1128\/MMBR.00009-08","article-title":"A bioinformatician's guide to metagenomics","volume":"72","author":"Kunin","year":"2008","journal-title":"Microbiol. Mol. Biol. Rev."},{"key":"2023012512180096900_B19","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1109\/18.61115","article-title":"Divergence measures based on the Shannon Entropy","volume":"37","author":"Lin","year":"1991","journal-title":"IEEE Trans. Inf. Theory"},{"key":"2023012512180096900_B20","doi-asserted-by":"crossref","first-page":"e120","DOI":"10.1093\/nar\/gkn491","article-title":"Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers","volume":"36","author":"Liu","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012512180096900_B21","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature03959","article-title":"Genome sequencing in microfabricated high-density picolitre reactors","volume":"437","author":"Margulies","year":"2005","journal-title":"Nature"},{"key":"2023012512180096900_B22","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1016\/j.mib.2007.08.004","article-title":"What's in the mix: phylogenetic classification of metagenome sequence samples","volume":"10","author":"McHardy","year":"2007","journal-title":"Curr. Opin. 
Microbiol."},{"key":"2023012512180096900_B23","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1186\/1471-2105-9-386","article-title":"The metagenomics rast server - a public resource for the automatic phylogenetic and functional analysis of metagenomes","volume":"9","author":"Meyer","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012512180096900_B24","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1145\/1029496.1029525","article-title":"The SEED: a peer-to-peer environment for genome annotation","volume":"47","author":"Overbeek","year":"2004","journal-title":"Comm. ACM"},{"key":"2023012512180096900_B25","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nature08821","article-title":"A human gut microbial gene catalogue established by metagenomic sequencing","volume":"464","author":"Qin","year":"2010","journal-title":"Nature"},{"key":"2023012512180096900_B26","doi-asserted-by":"crossref","first-page":"490","DOI":"10.1016\/j.mib.2007.09.001","article-title":"Get the most out of your metagenome: computational analysis of environmental sequence data","volume":"10","author":"Raes","year":"2007","journal-title":"Curr. Opin. Microbiol."},{"key":"2023012512180096900_B27","first-page":"2707","article-title":"Metagenomic analysis of the microbial community associated with the coral Porites astreoides","volume":"9","author":"Rodriguez-Brito","year":"2007","journal-title":"Environ. Microbiol."},{"key":"2023012512180096900_B28","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1016\/0022-2836(75)90213-2","article-title":"A rapid method for determining sequences in dna by primed synthesis with DNA polymerase","volume":"94","author":"Sanger","year":"1975","journal-title":"J. Mol. 
Biol."},{"key":"2023012512180096900_B29","doi-asserted-by":"crossref","first-page":"5463","DOI":"10.1073\/pnas.74.12.5463","article-title":"DNA sequencing with chain-terminating inhibitors","volume":"74","author":"Sanger","year":"1977","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512180096900_B30","doi-asserted-by":"crossref","first-page":"804","DOI":"10.1038\/nature06244","article-title":"The Human Microbiome Project","volume":"449","author":"Turnbaugh","year":"2007","journal-title":"Nature"},{"key":"2023012512180096900_B31","doi-asserted-by":"crossref","first-page":"5261","DOI":"10.1128\/AEM.00062-07","article-title":"Naive Bayesian classifier for rapid assignment of rRNA sequences into the New Bacterial Taxonomy","volume":"73","author":"Wang","year":"2007","journal-title":"Appl. Environ. Microbiol."},{"key":"2023012512180096900_B32","doi-asserted-by":"crossref","first-page":"e1000667","DOI":"10.1371\/journal.pcbi.1000667","article-title":"A primer on metagenomics","volume":"6","author":"Wooley","year":"2010","journal-title":"PLoS Comput. 
Biol."},{"key":"2023012512180096900_B33","doi-asserted-by":"crossref","first-page":"e16","DOI":"10.1371\/journal.pbio.0050016","article-title":"The Sorcerer II global ocean sampling expedition: expanding the universe of protein families","volume":"5","author":"Yooseph","year":"2007","journal-title":"PLoS Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/2\/196\/48869164\/bioinformatics_27_2_196.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/2\/196\/48869164\/bioinformatics_27_2_196.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T14:55:46Z","timestamp":1674658546000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/2\/196\/286378"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,12,1]]},"references-count":33,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2011,1,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq649","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,1,15]]},"published":{"date-parts":[[2010,12,1]]}}}