{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T23:24:35Z","timestamp":1773185075320,"version":"3.50.1"},"reference-count":46,"publisher":"Oxford University Press (OUP)","issue":"19","license":[{"start":{"date-parts":[[2016,11,7]],"date-time":"2016-11-07T00:00:00Z","timestamp":1478476800000},"content-version":"vor","delay-in-days":877,"URL":"http:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: To assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences.<\/jats:p><jats:p>Results: Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. 
This study will aid others looking to improve existing draft genome assemblies.<\/jats:p><jats:p>Availability and implementation: All assembly tools except CLC Genomics Workbench are freely available under GNU General Public License.<\/jats:p><jats:p>Contact: \u00a0brownsd@ornl.gov<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu391","type":"journal-article","created":{"date-parts":[[2014,6,16]],"date-time":"2014-06-16T20:35:17Z","timestamp":1402950917000},"page":"2709-2716","source":"Crossref","is-referenced-by-count":102,"title":["Evaluation and validation of <i>de novo<\/i> and hybrid assembly techniques to derive high-quality genome sequences"],"prefix":"10.1093","volume":"30","author":[{"given":"Sagar M.","family":"Utturkar","sequence":"first","affiliation":[{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dawn M.","family":"Klingeman","sequence":"additional","affiliation":[{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Miriam L.","family":"Land","sequence":"additional","affiliation":[{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christopher W.","family":"Schadt","sequence":"additional","affiliation":[{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences 
Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"},{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mitchel J.","family":"Doktycz","sequence":"additional","affiliation":[{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"},{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dale A.","family":"Pelletier","sequence":"additional","affiliation":[{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"},{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Steven D.","family":"Brown","sequence":"additional","affiliation":[{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA"},{"name":"1 Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37919, USA and 2 Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, 
USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2014,6,14]]},"reference":[{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1089\/cmb.2012.0021","article-title":"SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing","volume":"19","author":"Bankevich","year":"2012","journal-title":"J. Comput. Biol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1038\/nbt.2288","article-title":"A hybrid approach for the automated finishing of bacterial genomes","volume":"30","author":"Bashir","year":"2012","journal-title":"Nat. Biotechnol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1186\/2047-217X-2-10","article-title":"Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species","volume":"2","author":"Bradnam","year":"2013","journal-title":"GigaScience"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1186\/1754-6834-7-40","article-title":"Comparison of single-molecule sequencing and hybrid approaches for finishing the genome of Clostridium autoethanogenum and analysis of CRISPR systems in industrial relevant Clostridia","volume":"7","author":"Brown","year":"2014","journal-title":"Biotechnol. Biofuels"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"2383","DOI":"10.1128\/JB.00198-12","article-title":"Draft genome sequence of Rhizobium sp. strain PDO1-076, a bacterium isolated from Populus deltoides","volume":"194","author":"Brown","year":"2012","journal-title":"J. 
Bacteriol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"5991","DOI":"10.1128\/JB.01243-12","article-title":"Twenty-one genome sequences from Pseudomonas species and 19 genome sequences from diverse bacteria isolated from the rhizosphere and endosphere of Populus deltoides","volume":"194","author":"Brown","year":"2012","journal-title":"J. Bacteriol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"810","DOI":"10.1101\/gr.7337908","article-title":"ALLPATHS: de novo assembly of whole-genome shotgun microreads","volume":"18","author":"Butler","year":"2008","journal-title":"Genome Res."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1126\/science.1180614","article-title":"Genomics. Genome project standards in a new era of sequencing","volume":"326","author":"Chain","year":"2009","journal-title":"Science"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1186\/1471-2105-13-238","article-title":"Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory","volume":"13","author":"Chaisson","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1093\/bioinformatics\/btt310","article-title":"Informed and automated k-mer size selection for genome assembly","volume":"30","author":"Chikhi","year":"2014","journal-title":"Bioinformatics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1038\/nmeth.2474","article-title":"Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data","volume":"10","author":"Chin","year":"2013","journal-title":"Nat. 
Methods"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","DOI":"10.1002\/0471250953.bi1104s17","article-title":"Assembling genomic DNA sequences with PHRAP","author":"de la Bastide","year":"2007","journal-title":"Curr. Protoc. Bioinformatics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"2224","DOI":"10.1101\/gr.126599.111","article-title":"Assemblathon 1: a competitive assessment of de novo short read assembly methods","volume":"21","author":"Earl","year":"2011","journal-title":"Genome Res."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"e47768","DOI":"10.1371\/journal.pone.0047768","article-title":"Mind the gap: upgrading genomes with Pacific Biosciences RS long-read sequencing technology","volume":"7","author":"English","year":"2012","journal-title":"PLoS One"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"6403","DOI":"10.1128\/JB.184.23.6403-6405.2002","article-title":"The value of complete microbial genome sequencing (you get what you pay for)","volume":"184","author":"Fraser","year":"2002","journal-title":"J. Bacteriol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"1072","DOI":"10.1093\/bioinformatics\/btt086","article-title":"QUAST: quality assessment tool for genome assemblies","volume":"29","author":"Gurevich","year":"2013","journal-title":"Bioinformatics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1016\/j.mimet.2011.06.019","article-title":"A biologist's guide to de novo genome assembly using next-generation sequence data: A test with fungal genomes","volume":"86","author":"Haridas","year":"2011","journal-title":"J. Microbiol. 
Methods"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"R47","DOI":"10.1186\/gb-2013-14-5-r47","article-title":"REAPR: a universal tool for genome assembly evaluation","volume":"14","author":"Hunt","year":"2013","journal-title":"Genome Biol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1371\/journal.pone.0041295","article-title":"Sequencing intractable DNA to close microbial genomes","volume":"7","author":"Hurt","year":"2012","journal-title":"PLoS One"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1186\/1471-2105-11-119","article-title":"Prodigal: prokaryotic gene recognition and translation initiation site identification","volume":"11","author":"Hyatt","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"693","DOI":"10.1038\/nbt.2280","article-title":"Hybrid error correction and de novo assembly of single-molecule sequencing reads","volume":"30","author":"Koren","year":"2012","journal-title":"Nat. 
Biotechnol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"R101","DOI":"10.1186\/gb-2013-14-9-r101","article-title":"Reducing assembly complexity of microbial genomes with single-molecule sequencing","volume":"14","author":"Koren","year":"2013","journal-title":"Genome Biol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1186\/1471-2105-15-126","article-title":"Automated ensemble assembly and validation of microbial genomes","volume":"15","author":"Koren","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"3100","DOI":"10.1093\/nar\/gkm160","article-title":"RNAmmer: consistent and rapid annotation of ribosomal RNA genes","volume":"35","author":"Lagesen","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"251364","DOI":"10.1155\/2012\/251364","article-title":"Comparison of next-generation sequencing systems","volume":"2012","author":"Liu","year":"2012","journal-title":"J. Biomed. 
Biotechnol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1186\/2047-217X-1-18","article-title":"SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler","volume":"1","author":"Luo","year":"2012","journal-title":"Gigascience"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"R103","DOI":"10.1186\/gb-2009-10-10-r103","article-title":"ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads","volume":"10","author":"Maccallum","year":"2009","journal-title":"Genome Biol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"D115","DOI":"10.1093\/nar\/gkr1044","article-title":"IMG: the Integrated Microbial Genomes database and comparative analysis system","volume":"40","author":"Markowitz","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"e48837","DOI":"10.1371\/journal.pone.0048837","article-title":"The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation","volume":"7","author":"Mavromatis","year":"2012","journal-title":"PLoS One"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"2818","DOI":"10.1093\/bioinformatics\/btn548","article-title":"Aggressive assembly of pyrosequencing reads with mates","volume":"24","author":"Miller","year":"2008","journal-title":"Bioinformatics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1016\/j.ygeno.2010.03.001","article-title":"Assembly algorithms for next-generation sequencing data","volume":"95","author":"Miller","year":"2010","journal-title":"Genomics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1186\/1471-2164-11-242","article-title":"Finishing genomes with limited resources: lessons from an ensemble of microbial 
genomes","volume":"11","author":"Nagarajan","year":"2010","journal-title":"BMC Genomics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1038\/nrg3367","article-title":"Sequence assembly demystified","volume":"14","author":"Nagarajan","year":"2013","journal-title":"Nat. Rev. Genet."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1186\/1471-2164-14-675","article-title":"Efficient and accurate whole genome assembly and methylome profiling of E. coli","volume":"14","author":"Powers","year":"2013","journal-title":"BMC Genomics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1186\/1471-2164-13-341","article-title":"A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers","volume":"13","author":"Quail","year":"2012","journal-title":"BMC Genomics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"R8","DOI":"10.1186\/gb-2013-14-1-r8","article-title":"CGAL: computing genome assembly likelihoods","volume":"14","author":"Rahman","year":"2013","journal-title":"Genome Biol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"2270","DOI":"10.1101\/gr.141515.112","article-title":"Finished bacterial genomes from shotgun sequence data","volume":"22","author":"Ribeiro","year":"2012","journal-title":"Genome Res."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1186\/gb-2013-14-6-405","article-title":"The advantages of SMRT sequencing","volume":"14","author":"Roberts","year":"2013","journal-title":"Genome Biol."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1101\/gr.131383.111","article-title":"GAGE: a critical evaluation of genome assemblies and assembly algorithms","volume":"22","author":"Salzberg","year":"2012","journal-title":"Genome 
Res."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"e68824","DOI":"10.1371\/journal.pone.0068824","article-title":"Advantages of single-molecule real-time sequencing in high-GC content genomes","volume":"8","author":"Shin","year":"2013","journal-title":"PLoS One"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1101\/gr.089532.108","article-title":"ABySS: a parallel assembler for short read sequence data","volume":"19","author":"Simpson","year":"2009","journal-title":"Genome Res."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"64","DOI":"10.1186\/1471-2105-8-64","article-title":"Minimus: a fast, lightweight genome assembler","volume":"8","author":"Sommer","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"1260","DOI":"10.1038\/nprot.2012.068","article-title":"A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs","volume":"7","author":"Swain","year":"2012","journal-title":"Nat. Protoc."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1038\/nrg3117","article-title":"Repetitive DNA and next-generation sequencing: computational challenges and solutions","volume":"13","author":"Treangen","year":"2012","journal-title":"Nat. Rev. 
Genet."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"821","DOI":"10.1101\/gr.074492.107","article-title":"Velvet: algorithms for de novo short read assembly using de Bruijn graphs","volume":"18","author":"Zerbino","year":"2008","journal-title":"Genome Res."},{"key":"2023041303360879700_","doi-asserted-by":"crossref","first-page":"2669","DOI":"10.1093\/bioinformatics\/btt476","article-title":"The MaSuRCA genome assembler","volume":"29","author":"Zimin","year":"2013","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/19\/2709\/49872395\/bioinformatics_30_19_2709.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/19\/2709\/49872395\/bioinformatics_30_19_2709.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,14]],"date-time":"2023-07-14T09:00:34Z","timestamp":1689325234000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/19\/2709\/2422249"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,6,14]]},"references-count":46,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2014,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu391","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,10]]},"published":{"date-parts":[[2014,6,14]]}}}