{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,13]],"date-time":"2026-02-13T19:05:38Z","timestamp":1771009538169,"version":"3.50.1"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"15","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2009,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Background: Metagenomics is the study of the genomic content of an environmental sample of microbes. Advances in the throughput and cost-efficiency of sequencing technology are fueling a rapid increase in the number and size of metagenomic datasets being generated. Bioinformatics is faced with the problem of how to handle and analyze these datasets in an efficient and useful way. One goal of these metagenomic studies is to get a basic understanding of the microbial world both surrounding us and within us. One major challenge is how to compare multiple datasets. Furthermore, there is a need for bioinformatics tools that can process many large datasets and are easy to use.<\/jats:p>\n               <jats:p>Results: This article describes two new and helpful techniques for comparing multiple metagenomic datasets. The first is a visualization technique for multiple datasets and the second is a new statistical method for highlighting the differences in a pairwise comparison. We have developed implementations of both methods that are suitable for very large datasets and provide these in Version 3 of our standalone metagenome analysis tool MEGAN.<\/jats:p>\n               <jats:p>Conclusion: These new methods are suitable for the visual comparison of many large metagenomes and the statistical comparison of two metagenomes at a time. 
Nevertheless, more work needs to be done to support the comparative analysis of multiple metagenome datasets.<\/jats:p>\n               <jats:p>Availability: Version 3 of MEGAN, which implements all ideas presented in this article, can be obtained from our web site at: www-ab.informatik.uni-tuebingen.de\/software\/megan.<\/jats:p>\n               <jats:p>Contact: \u00a0mitra@informatik.uni-tuebingen.de<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btp341","type":"journal-article","created":{"date-parts":[[2009,6,11]],"date-time":"2009-06-11T01:39:55Z","timestamp":1244684395000},"page":"1849-1855","source":"Crossref","is-referenced-by-count":62,"title":["Visual and statistical comparison of metagenomes"],"prefix":"10.1093","volume":"25","author":[{"given":"Suparna","family":"Mitra","sequence":"first","affiliation":[{"name":"1 Center for Bioinformatics ZBIT, T\u00fcbingen University, Sand 14, 72076 T\u00fcbingen and 2 Institute for Stochastics, Karlsruhe University, Kaiserstra\u00dfe 89, 76133 Karlsruhe, Germany"}]},{"given":"Bernhard","family":"Klar","sequence":"additional","affiliation":[{"name":"1 Center for Bioinformatics ZBIT, T\u00fcbingen University, Sand 14, 72076 T\u00fcbingen and 2 Institute for Stochastics, Karlsruhe University, Kaiserstra\u00dfe 89, 76133 Karlsruhe, Germany"}]},{"given":"Daniel H.","family":"Huson","sequence":"additional","affiliation":[{"name":"1 Center for Bioinformatics ZBIT, T\u00fcbingen University, Sand 14, 72076 T\u00fcbingen and 2 Institute for Stochastics, Karlsruhe University, Kaiserstra\u00dfe 89, 76133 Karlsruhe, Germany"}]}],"member":"286","published-online":{"date-parts":[[2009,6,10]]},"reference":[{"key":"2023013112045834100_B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search 
tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol."},{"key":"2023013112045834100_B2","doi-asserted-by":"crossref","first-page":"1477","DOI":"10.1093\/bioinformatics\/btg173","article-title":"Differential expression in sage: accounting for normal between-library variation","volume":"19","author":"Baggerly","year":"2003","journal-title":"Bioinformatics"},{"key":"2023013112045834100_B3","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1093\/nar\/29.1.126","article-title":"Genomes online database (GOLD): a monitor of genome projects world-wide","volume":"29","author":"Bernal","year":"2001","journal-title":"Nucleic Acids Res"},{"key":"2023013112045834100_B4","doi-asserted-by":"crossref","first-page":"W470","DOI":"10.1093\/nar\/gkn277","article-title":"Signature, a web server for taxonomic characterization of sequence samples using signature genes","volume":"36","author":"Dutilh","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023013112045834100_B5","doi-asserted-by":"crossref","first-page":"1354","DOI":"10.1890\/05-1839","article-title":"Toward an ecological classification of soil bacteria","volume":"88","author":"Fierer","year":"2007","journal-title":"J. Ecol."},{"key":"2023013112045834100_B6","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/S1074-5521(98)90108-9","article-title":"Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products","volume":"5","author":"Handelsman","year":"1998","journal-title":"Chem. Biol."},{"key":"2023013112045834100_B7","first-page":"65","article-title":"A simple sequentially rejective multiple test procedure","volume":"6","author":"Holm","year":"1979","journal-title":"Scand. J. 
Stat."},{"key":"2023013112045834100_B8","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1101\/gr.5969107","article-title":"MEGAN analysis of metagenomic data","volume":"17","author":"Huson","year":"2007","journal-title":"Genome Res."},{"key":"2023013112045834100_B9","doi-asserted-by":"crossref","first-page":"2230","DOI":"10.1093\/nar\/gkn038","article-title":"Phylogenetic classification of short environmental DNA fragments","volume":"36","author":"Krause","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023013112045834100_B10","doi-asserted-by":"crossref","first-page":"371","DOI":"10.1186\/1471-2105-7-371","article-title":"Unifrac\u2013an online tool for comparing microbial community diversity in a phylogenetic context","volume":"7","author":"Lozupone","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023013112045834100_B11","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1186\/1471-2105-6-165","article-title":"Identifying differential expression in multiple sage libraries: an overdispersed log-linear model approach","volume":"6","author":"Lu","year":"2005","journal-title":"BMC Bioinformatics"},{"issue":"Database issue","key":"2023013112045834100_B12","doi-asserted-by":"crossref","first-page":"344","DOI":"10.1093\/nar\/gkj024","article-title":"The integrated microbial genomes (IMG) system","volume":"34","author":"Markowitz","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023013112045834100_B13","doi-asserted-by":"crossref","first-page":"D534","DOI":"10.1093\/nar\/gkm869","article-title":"IMG\/M: a data management and analysis system for metagenomes","volume":"36","author":"Markowitz","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023013112045834100_B14","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1038\/nmeth976","article-title":"Accurate phylogenetic classification of variable-length DNA fragments","volume":"4","author":"McHardy","year":"2006","journal-title":"Nat. 
Methods."},{"key":"2023013112045834100_B15","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1186\/1471-2105-9-386","article-title":"The metagenomics rast server\u2013a public resource for the automatic phylogenetic and functional analysis of metagenomes","volume":"9","author":"Meyer","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023013112045834100_B16","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1101\/gr.082628.108","article-title":"The mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus)","volume":"19","author":"Miller","year":"2009","journal-title":"Genome Res."},{"key":"2023013112045834100_B17","doi-asserted-by":"crossref","first-page":"5691","DOI":"10.1093\/nar\/gki866","article-title":"The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes","volume":"33","author":"Overbeek","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023013112045834100_B18","doi-asserted-by":"crossref","first-page":"392","DOI":"10.1126\/science.1123360","article-title":"Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA","volume":"311","author":"Poinar","year":"2006","journal-title":"Science"},{"key":"2023013112045834100_B19","doi-asserted-by":"crossref","first-page":"e77","DOI":"10.1371\/journal.pbio.0050077","article-title":"The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific","volume":"5","author":"Rusch","year":"2007","journal-title":"PLoS Biol."},{"key":"2023013112045834100_B20","doi-asserted-by":"crossref","first-page":"2881","DOI":"10.1093\/bioinformatics\/btm453","article-title":"Moderated statistical tests for assessing differences in tag abundance","volume":"23","author":"Robinson","year":"2007","journal-title":"Bioinformatics"},{"key":"2023013112045834100_B21","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pbio.0050075","article-title":"CAMERA: a community resource for 
metagenomics","volume":"5","author":"Seshadri","year":"2007","journal-title":"PLoS Biol."},{"key":"2023013112045834100_B22","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1146\/annurev.ps.46.020195.003021","article-title":"Multiple hypothesis testing","volume":"46","author":"Shaffer","year":"1995","journal-title":"Ann. Rev. Psychol."},{"key":"2023013112045834100_B23","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1126\/science.278.5338.631","article-title":"A genomic perspective on protein families","volume":"278","author":"Tatusov","year":"1997","journal-title":"Science"},{"key":"2023013112045834100_B24","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1186\/1471-2105-5-163","article-title":"Tetra: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences","volume":"5","author":"Teeling","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023013112045834100_B25","doi-asserted-by":"crossref","first-page":"554","DOI":"10.1126\/science.1107851","article-title":"Comparative metagenomics of microbial communities","volume":"308","author":"Tringe","year":"2005","journal-title":"Science"},{"key":"2023013112045834100_B26","doi-asserted-by":"crossref","first-page":"1027","DOI":"10.1038\/nature05414","article-title":"An obesity-associated gut microbiome with increased capacity for energy harvest","volume":"444","author":"Turnbaugh","year":"2006","journal-title":"Nature"},{"key":"2023013112045834100_B27","doi-asserted-by":"crossref","first-page":"1126","DOI":"10.1126\/science.1133420","article-title":"Quantitative phylogenetic assessment of microbial communities in diverse environments","volume":"315","author":"von Mering","year":"2007","journal-title":"Science"},{"key":"2023013112045834100_B28","doi-asserted-by":"crossref","first-page":"e1000352","DOI":"10.1371\/journal.pcbi.1000352","article-title":"Statistical methods for detecting differentially abundant features in 
clinical metagenomic samples","volume":"5","author":"White","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023013112045834100_B29","doi-asserted-by":"crossref","first-page":"e16","DOI":"10.1371\/journal.pbio.0050016","article-title":"The Sorcerer II Global Ocean Sampling expedition: expanding the Universe of Protein Families","volume":"5","author":"Yooseph","year":"2007","journal-title":"PLoS Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/15\/1849\/48993021\/bioinformatics_25_15_1849.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/25\/15\/1849\/48993021\/bioinformatics_25_15_1849.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,31]],"date-time":"2023-01-31T21:19:08Z","timestamp":1675199948000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/25\/15\/1849\/212949"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,6,10]]},"references-count":29,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2009,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btp341","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2009,8,1]]},"published":{"date-parts":[[2009,6,10]]}}}