{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T09:18:51Z","timestamp":1775726331573,"version":"3.50.1"},"reference-count":33,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,7,11]],"date-time":"2022-07-11T00:00:00Z","timestamp":1657497600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,7,11]],"date-time":"2022-07-11T00:00:00Z","timestamp":1657497600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Algorithms Mol Biol"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Advancements in metagenomics sequencing allow the study of microbial communities directly from their environments. Metagenomics binning is a key step in the species characterisation of microbial communities. Next-generation sequencing reads are usually assembled into contigs for metagenomics binning mainly due to the limited information within short reads. Third-generation sequencing provides much longer reads that have lengths similar to the contigs assembled from short reads. However, existing contig-binning tools cannot be directly applied on long reads due to the absence of coverage information and the presence of high error rates. The few existing long-read binning tools either use only composition or use composition and coverage information separately. This may ignore bins that correspond to low-abundance species or erroneously split bins that correspond to species with non-uniform coverages. 
Here we present a reference-free binning approach, LRBinner, that combines composition and coverage information of complete long-read datasets. LRBinner also uses a distance-histogram-based clustering algorithm to extract clusters with varying sizes.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>The experimental results on both simulated and real datasets show that LRBinner achieves the best binning accuracy in most cases while handling the complete datasets without any sampling. Moreover, we show that binning reads using LRBinner prior to assembly reduces computational resources required for assembly while attaining satisfactory assembly qualities.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>LRBinner shows that deep-learning techniques can be used for effective feature aggregation to support the metagenomics binning of long reads. Furthermore, accurate binning of long reads supports improvements in metagenomics assembly, especially in complex datasets. Binning also helps to reduce the resources required for assembly. 
Source code for LRBinner is freely available at https:\/\/github.com\/anuradhawick\/LRBinner.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s13015-022-00221-z","type":"journal-article","created":{"date-parts":[[2022,7,12]],"date-time":"2022-07-12T20:11:57Z","timestamp":1657656717000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":23,"title":["Binning long reads in metagenomics datasets using composition and coverage information"],"prefix":"10.1186","volume":"17","author":[{"given":"Anuradha","family":"Wickramarachchi","sequence":"first","affiliation":[]},{"given":"Yu","family":"Lin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,7,11]]},"reference":[{"key":"221_CR1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.0010024","author":"K Chen","year":"2005","unstructured":"Chen K, Pachter L. Bioinformatics for whole-genome shotgun sequencing of microbial communities. PLOS Comput Biol. 2005. https:\/\/doi.org\/10.1371\/journal.pcbi.0010024.","journal-title":"PLOS Comput Biol"},{"issue":"3","key":"221_CR2","doi-asserted-by":"publisher","first-page":"46","DOI":"10.1186\/gb-2014-15-3-r46","volume":"15","author":"DE Wood","year":"2014","unstructured":"Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 2014;15(3):46.","journal-title":"Genome Biol"},{"issue":"12","key":"221_CR3","doi-asserted-by":"publisher","first-page":"1721","DOI":"10.1101\/gr.210641.116","volume":"26","author":"D Kim","year":"2016","unstructured":"Kim D, Song L, Breitwieser FP, Salzberg SL. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Res. 2016;26(12):1721\u20139","journal-title":"Genome Res"},{"key":"221_CR4","doi-asserted-by":"publisher","first-page":"11257","DOI":"10.1038\/ncomms11257","volume":"7","author":"P Menzel","year":"2016","unstructured":"Menzel P, Ng KL, Krogh A. 
Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nat Commun. 2016;7:11257.","journal-title":"Nat Commun"},{"key":"221_CR5","doi-asserted-by":"publisher","first-page":"1165","DOI":"10.7717\/peerj.1165","volume":"3","author":"DD Kang","year":"2015","unstructured":"Kang DD, Froula J, Egan R, Wang Z. MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. PeerJ. 2015;3:1165.","journal-title":"PeerJ"},{"key":"221_CR6","doi-asserted-by":"publisher","first-page":"7359","DOI":"10.7717\/peerj.7359","volume":"7","author":"DD Kang","year":"2019","unstructured":"Kang DD, Li F, Kirton E, Thomas A, et al. MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies. PeerJ. 2019;7:7359.","journal-title":"PeerJ"},{"issue":"1","key":"221_CR7","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1186\/2049-2618-2-26","volume":"2","author":"Y-W Wu","year":"2014","unstructured":"Wu Y-W, Tang Y-H, Tringe SG, et al. Maxbin: an automated binning method to recover individual genomes from metagenomes using an expectation-maximization algorithm. Microbiome. 2014;2(1):26.","journal-title":"Microbiome"},{"issue":"4","key":"221_CR8","doi-asserted-by":"publisher","first-page":"605","DOI":"10.1093\/bioinformatics\/btv638","volume":"32","author":"Y-W Wu","year":"2015","unstructured":"Wu Y-W, Simmons BA, Singer SW. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics. 2015;32(4):605\u20137.","journal-title":"Bioinformatics"},{"issue":"24","key":"221_CR9","doi-asserted-by":"crossref","first-page":"4172","DOI":"10.1093\/bioinformatics\/bty519","volume":"34","author":"G Yu","year":"2018","unstructured":"Yu G, Jiang Y, Wang J, et al. BMC3C: binning metagenomic contigs using codon usage, sequence composition and read coverage. Bioinformatics. 
2018;34(24):4172\u20139.","journal-title":"Bioinformatics"},{"issue":"W1","key":"221_CR10","doi-asserted-by":"publisher","first-page":"171","DOI":"10.1093\/nar\/gkx348","volume":"45","author":"CC Laczny","year":"2017","unstructured":"Laczny CC, Kiefer C, Galata V, et al. BusyBee Web: metagenomic data analysis by bootstrapped supervised binning and annotation. Nucleic Acids Res. 2017;45(W1):171\u20139.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"221_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40168-014-0066-1","volume":"3","author":"CC Laczny","year":"2015","unstructured":"Laczny CC, Sternal T, Plugaru V, Gawron P, Atashpendar A, Margossian HH, Coronado S, Van der Maaten L, Vlassis N, Wilmes P. Vizbin-an application for reference-independent visualization and human-augmented binning of metagenomic data. Microbiome. 2015;3(1):1\u20137.","journal-title":"Microbiome"},{"key":"221_CR12","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btz253","author":"Z Wang","year":"2019","unstructured":"Wang Z, Wang Z, Lu YY, et al. SolidBin: improving metagenome binning with semi-supervised normalized cut. Bioinformatics. 2019. https:\/\/doi.org\/10.1093\/bioinformatics\/btz253.","journal-title":"Bioinformatics"},{"issue":"11","key":"221_CR13","doi-asserted-by":"publisher","first-page":"1052","DOI":"10.1089\/cmb.2021.0270","volume":"28","author":"F Andreace","year":"2021","unstructured":"Andreace F, Pizzi C, Comin M. Metaprob 2: metagenomic reads binning based on assembly using minimizers and k-mers statistics. J Comput Biol. 2021;28(11):1052\u201362.","journal-title":"J Comput Biol"},{"key":"221_CR14","doi-asserted-by":"publisher","DOI":"10.1038\/s41587-020-00777-4","author":"JN Nissen","year":"2021","unstructured":"Nissen JN, Johansen J, Alles\u00f8e RL, S\u00f8nderby CK, Armenteros JJA, Gr\u00f8nbech CH, Jensen LJ, Nielsen HB, Petersen TN, Winther O, Rasmussen S. 
Improved metagenome binning and assembly using deep variational autoencoders. Nat Biotechnol. 2021. https:\/\/doi.org\/10.1038\/s41587-020-00777-4.","journal-title":"Nat Biotechnol"},{"issue":"Supplement 1","key":"221_CR15","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1093\/bioinformatics\/btaa441","volume":"36","author":"A Wickramarachchi","year":"2020","unstructured":"Wickramarachchi A, Mallawaarachchi V, Rajan V, Lin Y. MetaBCC-LR: metagenomics binning by coverage and composition for long reads. Bioinformatics. 2020;36(Supplement 1):3\u201311. https:\/\/doi.org\/10.1093\/bioinformatics\/btaa441.","journal-title":"Bioinformatics"},{"key":"221_CR16","doi-asserted-by":"publisher","first-page":"24175","DOI":"10.1038\/srep24175","volume":"6","author":"H-H Lin","year":"2016","unstructured":"Lin H-H, Liao Y-C. Accurate binning of metagenomic contigs via automated clustering sequences using information of genomic signatures and marker genes. Sci Rep. 2016;6:24175. https:\/\/doi.org\/10.1038\/srep24175.","journal-title":"Sci Rep"},{"issue":"10","key":"221_CR17","doi-asserted-by":"publisher","first-page":"1155","DOI":"10.1038\/s41587-019-0217-9","volume":"37","author":"AM Wenger","year":"2019","unstructured":"Wenger AM, Peluso P, Rowell WJ, Chang P-C, Hall RJ, Concepcion GT, Ebler J, Fungtammasan A, Kolesnikov A, Olson ND, T\u00f6pfer A, Alonge M, Mahmoud M, Qian Y, Chin C-S, Phillippy AM, Schatz MC, Myers G, DePristo MA, Ruan J, Marschall T, Sedlazeck FJ, Zook JM, Li H, Koren S, Carroll A, Rank DR, Hunkapiller MW. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol. 2019;37(10):1155\u201362. 
https:\/\/doi.org\/10.1038\/s41587-019-0217-9.","journal-title":"Nat Biotechnol"},{"issue":"4","key":"221_CR18","doi-asserted-by":"publisher","first-page":"693","DOI":"10.1101\/gr.634603","volume":"13","author":"T Abe","year":"2003","unstructured":"Abe T, Kanaya S, Kinouchi M, Ichiba Y, Kozuki T, Ikemura T. Informatics for unveiling hidden genome signatures. Genome Res. 2003;13(4):693\u2013702.","journal-title":"Genome Res"},{"issue":"10","key":"221_CR19","doi-asserted-by":"publisher","first-page":"1391","DOI":"10.1093\/oxfordjournals.molbev.a026048","volume":"16","author":"PJ Deschavanne","year":"1999","unstructured":"Deschavanne PJ, Giron A, Vilain J, Fagot G, Fertil B. Genomic signature: characterization and classification of species assessed by chaos game representation of sequences. Mol Biol Evol. 1999;16(10):1391\u20139.","journal-title":"Mol Biol Evol"},{"key":"221_CR20","doi-asserted-by":"publisher","first-page":"1144","DOI":"10.1038\/nmeth.3103","volume":"11","author":"J Alneberg","year":"2014","unstructured":"Alneberg J, Bjarnason BS, de Bruijn I, et al. Binning metagenomic contigs by coverage and composition. Nat Methods. 2014;11:1144.","journal-title":"Nat Methods"},{"issue":"4","key":"221_CR21","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pcbi.1007781","volume":"16","author":"D Pellow","year":"2020","unstructured":"Pellow D, Mizrahi I, Shamir R. Plasclass improves plasmid sequence classification. PLOS Comput Biol. 2020;16(4):1\u20139. https:\/\/doi.org\/10.1371\/journal.pcbi.1007781.","journal-title":"PLOS Comput Biol"},{"issue":"5","key":"221_CR22","doi-asserted-by":"publisher","first-page":"652","DOI":"10.1093\/bioinformatics\/btt020","volume":"29","author":"G Rizk","year":"2013","unstructured":"Rizk G, Lavenier D, Chikhi R. DSK: k-mer counting with very low memory usage. Bioinformatics. 
2013;29(5):652\u20133.","journal-title":"Bioinformatics"},{"issue":"1","key":"221_CR23","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1214\/aoms\/1177729694","volume":"22","author":"S Kullback","year":"1951","unstructured":"Kullback S, Leibler RA. On information and sufficiency. Ann Math Stat. 1951;22(1):79\u201386. https:\/\/doi.org\/10.1214\/aoms\/1177729694.","journal-title":"Ann Math Stat"},{"issue":"17","key":"221_CR24","doi-asserted-by":"publisher","first-page":"2704","DOI":"10.1093\/bioinformatics\/btw286","volume":"32","author":"BK St\u00f6cker","year":"2016","unstructured":"St\u00f6cker BK, K\u00f6ster J, Rahmann S. SimLoRD: simulation of long read data. Bioinformatics. 2016;32(17):2704\u20136.","journal-title":"Bioinformatics"},{"issue":"18","key":"221_CR25","doi-asserted-by":"publisher","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","volume":"34","author":"H Li","year":"2018","unstructured":"Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34(18):3094\u2013100.","journal-title":"Bioinformatics"},{"issue":"5","key":"221_CR26","doi-asserted-by":"publisher","first-page":"043","DOI":"10.1093\/gigascience\/giz043","volume":"8","author":"SM Nicholls","year":"2019","unstructured":"Nicholls SM, Quick JC, Tang S, Loman NJ. Ultra-deep, long-read nanopore sequencing of mock microbial community standards. Gigascience. 2019;8(5):043.","journal-title":"Gigascience"},{"key":"221_CR27","doi-asserted-by":"publisher","DOI":"10.1093\/gigascience\/giy069","author":"F Meyer","year":"2018","unstructured":"Meyer F, Hofmann P, Belmann P, Garrido-Oter R, Fritz A, Sczyrba A, McHardy AC. AMBER: assessment of metagenome BinnERs. GigaScience. 2018. https:\/\/doi.org\/10.1093\/gigascience\/giy069.","journal-title":"GigaScience."},{"issue":"2","key":"221_CR28","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1038\/s41592-019-0669-3","volume":"17","author":"J Ruan","year":"2020","unstructured":"Ruan J, Li H. 
Fast and accurate long-read assembly with wtdbg2. Nat Methods. 2020;17(2):155\u20138. https:\/\/doi.org\/10.1038\/s41592-019-0669-3.","journal-title":"Nat Methods"},{"issue":"11","key":"221_CR29","doi-asserted-by":"publisher","first-page":"1103","DOI":"10.1038\/s41592-020-00971-x","volume":"17","author":"M Kolmogorov","year":"2020","unstructured":"Kolmogorov M, Bickhart DM, Behsaz B, Gurevich A, Rayko M, Shin SB, Kuhn K, Yuan J, Polevikov E, Smith TPL, Pevzner PA. Metaflye: scalable long-read metagenome assembly using repeat graphs. Nat Methods. 2020;17(11):1103\u201310. https:\/\/doi.org\/10.1038\/s41592-020-00971-x.","journal-title":"Nat Methods"},{"issue":"7","key":"221_CR30","doi-asserted-by":"publisher","first-page":"1088","DOI":"10.1093\/bioinformatics\/btv697","volume":"32","author":"A Mikheenko","year":"2015","unstructured":"Mikheenko A, Saveliev V, Gurevich A. MetaQUAST: evaluation of metagenome assemblies. Bioinformatics. 2015;32(7):1088\u201390.","journal-title":"Bioinformatics"},{"key":"221_CR31","unstructured":"Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Kopf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S. Pytorch: An imperative style, high-performance deep learning library. In: Wallach H, Larochelle H, Beygelzimer A, d\u2019 Alch\u00e9-Buc F, Fox E, Garnett R. (eds.) Advances in Neural Information Processing Systems 32, Curran Associates Inc, New York. 2019, 8024\u20138035"},{"key":"221_CR32","doi-asserted-by":"publisher","unstructured":"Harris CR, Millman KJ, Van Der Walt SJ, Gommers R, Virtanen P, Cournapeau D, Wieser E, Taylor J, Berg S, Smith NJ, Kern R. Array programming with NumPy. Nature. 2020;585:357\u201362. 
https:\/\/doi.org\/10.1038\/s41586-020-2649-2.","DOI":"10.1038\/s41586-020-2649-2"},{"issue":"1","key":"221_CR33","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13015-021-00185-6","volume":"16","author":"VG Mallawaarachchi","year":"2021","unstructured":"Mallawaarachchi VG, Wickramarachchi AS, Lin Y. Improving metagenomic binning results with overlapped bins using assembly graphs. Algorithms Mol Biol. 2021;16(1):1\u201318.","journal-title":"Algorithms Mol Biol"}],"container-title":["Algorithms for Molecular Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13015-022-00221-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13015-022-00221-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13015-022-00221-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,11]],"date-time":"2023-02-11T11:59:52Z","timestamp":1676116792000},"score":1,"resource":{"primary":{"URL":"https:\/\/almob.biomedcentral.com\/articles\/10.1186\/s13015-022-00221-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,11]]},"references-count":33,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["221"],"URL":"https:\/\/doi.org\/10.1186\/s13015-022-00221-z","relation":{},"ISSN":["1748-7188"],"issn-type":[{"value":"1748-7188","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,7,11]]},"assertion":[{"value":"31 December 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 June 
2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 July 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"14"}}