{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T00:21:30Z","timestamp":1777335690466,"version":"3.51.4"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"18","license":[{"start":{"date-parts":[[2021,4,5]],"date-time":"2021-04-05T00:00:00Z","timestamp":1617580800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Science Foundation Division of Ocean Sciences","award":["1636402"],"award-info":[{"award-number":["1636402"]}]},{"DOI":"10.13039\/100000106","name":"Office of Integrative Activities","doi-asserted-by":"publisher","award":["1557349-Ike Wai"],"award-info":[{"award-number":["1557349-Ike Wai"]}],"id":[{"id":"10.13039\/100000106","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Securing Hawaii\u2019s Water Future","award":["1736030\u2013G2P"],"award-info":[{"award-number":["1736030\u2013G2P"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,9,29]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Metagenomic approaches hold the potential to characterize microbial communities and unravel the intricate link between the microbiome and biological processes. Assembly is one of the most critical steps in metagenomics experiments. It consists of transforming overlapping DNA sequencing reads into sufficiently accurate representations of the community\u2019s genomes. This process is computationally difficult and commonly results in genomes fragmented across many contigs. Computational binning methods are used to mitigate fragmentation by partitioning contigs based on their sequence composition, abundance or chromosome organization into bins representing the community\u2019s genomes. Existing binning methods have been principally tuned for bacterial genomes and do not perform favorably on viral metagenomes.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose Composition and Coverage Network (CoCoNet), a new binning method for viral metagenomes that leverages the flexibility and the effectiveness of deep learning to model the co-occurrence of contigs belonging to the same viral genome and provide a rigorous framework for binning viral contigs. Our results show that CoCoNet substantially outperforms existing binning methods on viral datasets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>CoCoNet was implemented in Python and is available for download on PyPi (https:\/\/pypi.org\/). The source code is hosted on GitHub at https:\/\/github.com\/Puumanamana\/CoCoNet and the documentation is available at https:\/\/coconet.readthedocs.io\/en\/latest\/index.html. CoCoNet does not require extensive resources to run. For example, binning 100k contigs took about 4\u2009h on 10 Intel CPU Cores (2.4\u2009GHz), with a memory peak at 27 GB (see Supplementary Fig. S9). To process a large dataset, CoCoNet may need to be run on a high RAM capacity server. Such servers are typically available in high-performance or cloud computing settings.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab213","type":"journal-article","created":{"date-parts":[[2021,4,3]],"date-time":"2021-04-03T11:08:58Z","timestamp":1617448138000},"page":"2803-2810","source":"Crossref","is-referenced-by-count":26,"title":["CoCoNet: an efficient deep learning tool for viral metagenome binning"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5255-0942","authenticated-orcid":false,"given":"C\u00e9dric G","family":"Arisdakessian","sequence":"first","affiliation":[{"name":"Department of Information and Computer Sciences, University of Hawai\u2018i at M\u0101noa , Honolulu, HI 96822, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Olivia D","family":"Nigro","sequence":"additional","affiliation":[{"name":"Department of Natural Science, Hawai\u2018i Pacific University , Honolulu, HI 96813, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Grieg F","family":"Steward","sequence":"additional","affiliation":[{"name":"Department of Oceanography, University of Hawai\u2018i at M\u0101noa , Honolulu, HI 96822, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guylaine","family":"Poisson","sequence":"additional","affiliation":[{"name":"Department of Information and Computer Sciences, University of Hawai\u2018i at M\u0101noa , Honolulu, HI 96822, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mahdi","family":"Belcaid","sequence":"additional","affiliation":[{"name":"Department of Information and Computer Sciences, University of Hawai\u2018i at M\u0101noa , Honolulu, HI 96822, USA"},{"name":"Hawai\u2018i Institute of Marine Biology , University of Hawai\u2018i at M\u0101noa, Honolulu, HI 96816, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2021,4,5]]},"reference":[{"key":"2023061402422499800_btab213-B1","doi-asserted-by":"crossref","first-page":"1144","DOI":"10.1038\/nmeth.3103","article-title":"Binning metagenomic contigs by coverage and composition","volume":"11","author":"Alneberg","year":"2014","journal-title":"Nat. Methods"},{"key":"2023061402422499800_btab213-B2","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1093\/bioinformatics\/btu638","article-title":"Htseq-a python framework to work with high-throughput sequencing data","volume":"31","author":"Anders","year":"2015","journal-title":"Bioinformatics"},{"key":"2023061402422499800_btab213-B3","doi-asserted-by":"crossref","first-page":"e368","DOI":"10.1371\/journal.pbio.0040368","article-title":"The marine viromes of four oceanic regions","volume":"4","author":"Angly","year":"2006","journal-title":"PLoS Biol"},{"key":"2023061402422499800_btab213-B4","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1101\/gr.251686.119","article-title":"Assembly-free single-molecule sequencing recovers complete virus genomes from natural microbial communities","volume":"30","author":"Beaulaurier","year":"2020","journal-title":"Genome Res"},{"key":"2023061402422499800_btab213-B5","first-page":"737","article-title":"Signature verification using a \u201csiamese\u201d time delay neural network","author":"Bromley","year":"1993","journal-title":"Proceedings of the 6th International Conference on Neural Information Processing Systems, NIPS\u201993"},{"key":"2023061402422499800_btab213-B6","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1007\/978-1-60327-565-1_7","volume-title":"Bacteriophages","author":"Casjens","year":"2009"},{"key":"2023061402422499800_btab213-B7","doi-asserted-by":"crossref","first-page":"i884","DOI":"10.1093\/bioinformatics\/bty560","article-title":"fastp: an ultra-fast all-in-one fastq preprocessor","volume":"34","author":"Chen","year":"2018","journal-title":"Bioinformatics"},{"key":"2023061402422499800_btab213-B8","doi-asserted-by":"crossref","first-page":"748","DOI":"10.1016\/j.drudis.2020.03.003","article-title":"Machine learning in drug\u2013target interaction prediction: current state and future directions","volume":"25","author":"D\u2019Souza","year":"2020","journal-title":"Drug Discov. Today"},{"key":"2023061402422499800_btab213-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40168-019-0633-6","article-title":"Camisim: simulating metagenomes and microbial communities","volume":"7","author":"Fritz","year":"2019","journal-title":"Microbiome"},{"key":"2023061402422499800_btab213-B10","doi-asserted-by":"crossref","first-page":"141","DOI":"10.3389\/fbioe.2015.00141","article-title":"Fragmentation and coverage variation in viral metagenome assemblies, and their effect in diversity calculations","volume":"3","author":"Garc\u00eda-L\u00f3pez","year":"2015","journal-title":"Front. Bioeng. Biotechnol"},{"key":"2023061402422499800_btab213-B11","doi-asserted-by":"crossref","first-page":"e1005838","DOI":"10.1371\/journal.pgen.1005838","article-title":"Continuous influx of genetic material from host to virus populations","volume":"12","author":"Gilbert","year":"2016","journal-title":"PLoS Genet"},{"key":"2023061402422499800_btab213-B12","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1007\/BF01908075","article-title":"Comparing partitions","volume":"2","author":"Hubert","year":"1985","journal-title":"J. Classif"},{"key":"2023061402422499800_btab213-B13","doi-asserted-by":"crossref","first-page":"1561","DOI":"10.3389\/fmicb.2017.01561","article-title":"Analysing microbial community composition through amplicon sequencing: from sampling to hypothesis testing","volume":"8","author":"Hugerth","year":"2017","journal-title":"Front. Microbiol"},{"key":"2023061402422499800_btab213-B14","doi-asserted-by":"crossref","first-page":"e57355","DOI":"10.1371\/journal.pone.0057355","article-title":"The Pacific Ocean Virome (POV): a marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology","volume":"8","author":"Hurwitz","year":"2013","journal-title":"PLoS One"},{"key":"2023061402422499800_btab213-B15","doi-asserted-by":"crossref","first-page":"e603","DOI":"10.7717\/peerj.603","article-title":"GroopM: an automated tool for the recovery of population genomes from related metagenomes","volume":"2","author":"Imelfort","year":"2014","journal-title":"PeerJ"},{"key":"2023061402422499800_btab213-B16","doi-asserted-by":"crossref","first-page":"e7359","DOI":"10.7717\/peerj.7359","article-title":"Metabat 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies","volume":"7","author":"Kang","year":"2019","journal-title":"PeerJ"},{"key":"2023061402422499800_btab213-B17","doi-asserted-by":"crossref","first-page":"S227","DOI":"10.1089\/bsp.2013.0008","article-title":"The effect of preprocessing by sequence-independent, single-primer amplification (SISPA) on metagenomic detection of viruses","volume":"11","author":"Karlsson","year":"2013","journal-title":"Biosecurity Bioterrorism Biodefense Strat. Pract. Sci"},{"key":"2023061402422499800_btab213-B18","article-title":"Adam: a method for stochastic optimization","author":"Kingma","year":"2014"},{"key":"2023061402422499800_btab213-B19","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1007\/978-3-642-77011-1_2","volume-title":"Genetic Diversity of RNA Viruses","author":"Lai","year":"1992"},{"key":"2023061402422499800_btab213-B20","article-title":"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM","author":"Li","year":"2013"},{"key":"2023061402422499800_btab213-B21","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2023061402422499800_btab213-B22","article-title":"Checkv: assessing the quality of metagenome-assembled viral genomes","author":"Nayfach","year":"2020","journal-title":"Nature Biotechnol., 1\u20138"},{"key":"2023061402422499800_btab213-B23","doi-asserted-by":"crossref","first-page":"036104","DOI":"10.1103\/PhysRevE.74.036104","article-title":"Finding community structure in networks using the eigenvectors of matrices","volume":"74","author":"Newman","year":"2006","journal-title":"Phys. Rev. E"},{"key":"2023061402422499800_btab213-B24","doi-asserted-by":"crossref","first-page":"824","DOI":"10.1101\/gr.213959.116","article-title":"metaspades: a new versatile metagenomic assembler","volume":"27","author":"Nurk","year":"2017","journal-title":"Genome Res"},{"key":"2023061402422499800_btab213-B25","doi-asserted-by":"crossref","first-page":"D733","DOI":"10.1093\/nar\/gkv1189","article-title":"Reference sequence (refseq) database at ncbi: current status, taxonomic expansion, and functional annotation","volume":"44","author":"O\u2019Leary","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023061402422499800_btab213-B26","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1186\/s40168-018-0507-3","article-title":"Evaluation of bias induced by viral enrichment and random amplification protocols in metagenomic surveys of saliva DNA viruses","volume":"6","author":"Parras-Molt\u00f3","year":"2018","journal-title":"Microbiome"},{"key":"2023061402422499800_btab213-B27","first-page":"130997","article-title":"GATTACA: lightweight metagenomic binning with compact indexing of kmer counts and minhash-based panel selection","author":"Popic","year":"2017","journal-title":"bioRxiv"},{"key":"2023061402422499800_btab213-B28","article-title":"Deep learning is robust to massive label noise","author":"Rolnick","year":"2017"},{"key":"2023061402422499800_btab213-B29","doi-asserted-by":"crossref","first-page":"e76144","DOI":"10.1371\/journal.pone.0076144","article-title":"The origin of biased sequence depth in sequence-independent nucleic acid amplification and optimization for efficient massive parallel sequencing","volume":"8","author":"Rosseel","year":"2013","journal-title":"PLoS One"},{"key":"2023061402422499800_btab213-B30","author":"Roux","year":"2009"},{"key":"2023061402422499800_btab213-B31","doi-asserted-by":"crossref","first-page":"410","DOI":"10.3389\/fmicb.2012.00410","article-title":"The binning of metagenomic contigs for microbial physiology of mixed cultures","volume":"3","author":"Strous","year":"2012","journal-title":"Front. Microbiol"},{"key":"2023061402422499800_btab213-B32","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1186\/s40168-019-0626-5","article-title":"Choice of assembly software has a critical impact on virome characterisation","volume":"7","author":"Sutton","year":"2019","journal-title":"Microbiome"},{"key":"2023061402422499800_btab213-B33","doi-asserted-by":"crossref","first-page":"5233","DOI":"10.1038\/s41598-019-41695-z","article-title":"From Louvain to Leiden: guaranteeing well-connected communities","volume":"9","author":"Traag","year":"2019","journal-title":"Sci. Rep"},{"key":"2023061402422499800_btab213-B34","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1007\/s00203-018-1615-y","article-title":"Shotgun metagenomics offers novel insights into taxonomic compositions, metabolic pathways and antibiotic resistance genes in fish gut microbiome","volume":"201","author":"Tyagi","year":"2019","journal-title":"Arch. Microbiol"},{"key":"2023061402422499800_btab213-B35","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1186\/1471-2164-15-37","article-title":"Comparison of different assembly and annotation tools on analysis of simulated viral metagenomic communities in the gut","volume":"15","author":"V\u00e1zquez-Castellanos","year":"2014","journal-title":"BMC Genomics"},{"key":"2023061402422499800_btab213-B36","doi-asserted-by":"crossref","first-page":"572","DOI":"10.1016\/j.cels.2016.10.004","article-title":"Shotgun metagenomics of 250 adult twins reveals genetic and environmental impacts on the gut microbiome","volume":"3","author":"Xie","year":"2016","journal-title":"Cell Syst"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab213\/37084552\/btab213.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/18\/2803\/50579237\/btab213.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/18\/2803\/50579237\/btab213.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,14]],"date-time":"2023-06-14T02:44:20Z","timestamp":1686710660000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/18\/2803\/6211156"}},"subtitle":[],"editor":[{"given":"Pier Luigi","family":"Martelli","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,4,5]]},"references-count":36,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2021,9,29]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab213","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,9,15]]},"published":{"date-parts":[[2021,4,5]]}}}