{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T23:34:07Z","timestamp":1773272047000,"version":"3.50.1"},"reference-count":19,"publisher":"Oxford University Press (OUP)","issue":"18","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1490,"URL":"http:\/\/creativecommons.org\/licenses\/by\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Metagenomic binning remains an important topic in metagenomic analysis. Existing unsupervised binning methods for next-generation sequencing (NGS) reads do not perform well on (i) samples with low-abundance species or (ii) samples (even with high abundance) when there are many extremely low-abundance species. These two problems are common for real metagenomic datasets. Binning methods that can solve these problems are desirable.<\/jats:p>\n               <jats:p>Results: We proposed a two-round binning method (MetaCluster 5.0) that aims at identifying both low-abundance and high-abundance species in the presence of a large amount of noise due to many extremely low-abundance species. In summary, MetaCluster 5.0 uses a filtering strategy to remove noise from the extremely low-abundance species. It separates reads of high-abundance species from those of low-abundance species in two different rounds. To overcome the issue of low coverage for low-abundance species, multiple w values are used to group reads with overlapping w-mers, whereas reads from high-abundance species are grouped with high confidence based on a large w and then binning expands to low-abundance species using a relaxed (shorter) w. 
Compared to the recent tools, TOSS and MetaCluster 4.0, MetaCluster 5.0 can find more species (especially those with low abundance of say 6\u00d7 to 10\u00d7) and can achieve better sensitivity and specificity using less memory and running time.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/i.cs.hku.hk\/~alse\/MetaCluster\/<\/jats:p>\n               <jats:p>Contact: \u00a0chin@cs.hku.hk<\/jats:p>","DOI":"10.1093\/bioinformatics\/bts397","type":"journal-article","created":{"date-parts":[[2012,9,7]],"date-time":"2012-09-07T20:35:22Z","timestamp":1347050122000},"page":"i356-i362","source":"Crossref","is-referenced-by-count":113,"title":["MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample"],"prefix":"10.1093","volume":"28","author":[{"given":"Yi","family":"Wang","sequence":"first","affiliation":[{"name":"Department of Computer Science, The University of Hong Kong, Hong Kong"}]},{"given":"Henry C.M.","family":"Leung","sequence":"additional","affiliation":[{"name":"Department of Computer Science, The University of Hong Kong, Hong Kong"}]},{"given":"S.M.","family":"Yiu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, The University of Hong Kong, Hong Kong"}]},{"given":"Francis Y.L.","family":"Chin","sequence":"additional","affiliation":[{"name":"Department of Computer Science, The University of Hong Kong, Hong Kong"}]}],"member":"286","published-online":{"date-parts":[[2012,9,3]]},"reference":[{"key":"2023012513021708700_B1","doi-asserted-by":"crossref","first-page":"1919","DOI":"10.1128\/aem.56.6.1919-1925.1990","article-title":"Combination of 16S rRNA-targeted oligonucleotide probes with flow cytometry for analyzing mixed microbial populations","volume":"56","author":"Amann","year":"1990","journal-title":"Appl. Environ. 
Microbiol."},{"key":"2023012513021708700_B2","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1038\/nmeth.1358","article-title":"Phymm and PhymmBL: metagenomic phylogenetic classification with interpolated Markov models","volume":"6","author":"Brady","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012513021708700_B3","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1128\/AEM.01177-06","article-title":"Use of 16S rRNA and rpoB genes as molecular markers for microbial ecology studies","volume":"73","author":"Case","year":"2007","journal-title":"Appl. Environ. Microbiol."},{"key":"2023012513021708700_B4","first-page":"17","volume-title":"CompostBin: A DNA composition-based algorithm for binning environmental shotgun reads","author":"Chatterji","year":"2008"},{"key":"2023012513021708700_B5","doi-asserted-by":"crossref","first-page":"D294","DOI":"10.1093\/nar\/gki038","article-title":"The Ribosomal Database Project (RDP-II): sequences and tools for high-throughput rRNA analysis","volume":"33","author":"Cole","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012513021708700_B6","doi-asserted-by":"crossref","first-page":"e82","DOI":"10.1371\/journal.pbio.0050082","article-title":"Environmental shotgun sequencing: its potential and challenges for studying the hidden world of microbes","volume":"5","author":"Eisen","year":"2007","journal-title":"PLoS Biol."},{"key":"2023012513021708700_B7","doi-asserted-by":"crossref","first-page":"2421","DOI":"10.1093\/bioinformatics\/bth266","article-title":"How independent are the appearances of n-mers in different genomes?","volume":"20","author":"Fofanov","year":"2004","journal-title":"Bioinformatics"},{"key":"2023012513021708700_B8","article-title":"Metagenomic analysis of phosphorus removing sludge communities","author":"Garcia Martin","year":"2008"},{"key":"2023012513021708700_B9","first-page":"656","article-title":"BLAT\u2014the BLAST-like alignment 
tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023012513021708700_B10","doi-asserted-by":"crossref","first-page":"e3064","DOI":"10.1371\/journal.pone.0003064","article-title":"Predominant role of host genetics in controlling the composition of gut microbiota","volume":"3","author":"Khachatryan","year":"2008","journal-title":"PLoS One"},{"key":"2023012513021708700_B11","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1038\/nmeth976","article-title":"Accurate phylogenetic classification of variable-length DNA fragments","volume":"4","author":"McHardy","year":"2006","journal-title":"Nat. Methods"},{"key":"2023012513021708700_B12","first-page":"191","article-title":"A two-way multi-dimensional mixture model for clustering metagenomic sequences","author":"Prabhakara","year":"2011","journal-title":"ACM-BCB"},{"key":"2023012513021708700_B13","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nature08821","article-title":"A human gut microbial gene catalogue established by metagenomic sequencing","volume":"464","author":"Qin","year":"2010","journal-title":"Nature"},{"key":"2023012513021708700_B14","doi-asserted-by":"crossref","first-page":"298","DOI":"10.1007\/978-3-642-23038-7_25","article-title":"Separating metagenomic short reads into genomes via clustering","volume":"6833\/2011","author":"Tanaseichuk","year":"2011","journal-title":"Algorithms Bioinformatics"},{"key":"2023012513021708700_B15","doi-asserted-by":"crossref","first-page":"163","DOI":"10.1186\/1471-2105-5-163","article-title":"TETRA: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences","volume":"5","author":"Teeling","year":"2004","journal-title":"BMC Bioinformatics"},{"key":"2023012513021708700_B16","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1089\/cmb.2011.0276","article-title":"MetaCluster 4.0: a novel binning algorithm for ngs reads and huge number of 
species","volume":"19","author":"Wang","year":"2012","journal-title":"J. Comput. Biol."},{"key":"2023012513021708700_B17","doi-asserted-by":"crossref","first-page":"523","DOI":"10.1089\/cmb.2010.0245","article-title":"A novel abundance-based algorithm for binning metagenomic sequences using l-tuples","volume":"18","author":"Wu","year":"2011","journal-title":"J. Comput. Biol."},{"key":"2023012513021708700_B18","first-page":"170","volume-title":"MetaCluster: unsupervised binning of environmental genomic fragments and taxonomic annotation","author":"Yang","year":"2010"},{"key":"2023012513021708700_B19","first-page":"S5","article-title":"Unsupervised binning of environmental genomic fragments based on an error robust selection of l-mers","volume":"11","author":"Yang","year":"2010","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/18\/i356\/48883454\/bioinformatics_28_18_i356.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/18\/i356\/48883454\/bioinformatics_28_18_i356.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T18:51:30Z","timestamp":1674672690000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/18\/i356\/248051"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,9,3]]},"references-count":19,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2012,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bts397","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,9,15]]},"published":{"date-parts":[[2012,9,3]]}}}