{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T03:24:26Z","timestamp":1768879466235,"version":"3.49.0"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1011870","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T00:00:00Z","timestamp":1708560000000}}],"reference-count":41,"publisher":"Public Library of Science (PLoS)","issue":"2","license":[{"start":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T00:00:00Z","timestamp":1707436800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Chloroplasts are photosynthetic organelles in algal and plant cells that contain their own genome. Chloroplast genomes are commonly used in evolutionary studies and taxonomic identification and are increasingly becoming a target for crop improvement studies. As DNA sequencing becomes more affordable, researchers are collecting vast swathes of high-quality whole-genome sequence data from laboratory and field settings alike. Whole tissue read libraries sequenced with the primary goal of understanding the nuclear genome will inadvertently contain many reads derived from the chloroplast genome. These whole-genome, whole-tissue read libraries can additionally be used to assemble chloroplast genomes with little to no extra cost. While several tools exist that make use of short-read second generation and third-generation long-read sequencing data for chloroplast genome assembly, these tools may have complex installation steps, inadequate error reporting, poor expandability, and\/or lack scalability. Here, we present <jats:italic>CLAW<\/jats:italic> (Chloroplast Long-read Assembly Workflow), an easy to install, customise, and use Snakemake tool to assemble chloroplast genomes from chloroplast long-reads found in whole-genome read libraries (<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/aaronphillips7493\/CLAW\" xlink:type=\"simple\">https:\/\/github.com\/aaronphillips7493\/CLAW<\/jats:ext-link>). Using 19 publicly available reference chloroplast genome assemblies and long-read libraries from algal, monocot and eudicot species, we show that <jats:italic>CLAW<\/jats:italic> can rapidly produce chloroplast genome assemblies with high similarity to the reference assemblies. <jats:italic>CLAW<\/jats:italic> was designed such that users have complete control over parameterisation, allowing individuals to optimise <jats:italic>CLAW<\/jats:italic> to their specific use cases. We expect that <jats:italic>CLAW<\/jats:italic> will provide researchers (with varying levels of bioinformatics expertise) with an additional resource useful for contributing to the growing number of publicly available chloroplast genome assemblies.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1011870","type":"journal-article","created":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T18:35:03Z","timestamp":1707503703000},"page":"e1011870","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":2,"title":["CLAW: An automated Snakemake workflow for the assembly of chloroplast genomes from long-read data"],"prefix":"10.1371","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1351-0291","authenticated-orcid":true,"given":"Aaron L.","family":"Phillips","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4821-7490","authenticated-orcid":true,"given":"Scott","family":"Ferguson","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0638-4709","authenticated-orcid":true,"given":"Rachel A.","family":"Burton","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7935-6151","authenticated-orcid":true,"given":"Nathan S.","family":"Watson-Haigh","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2024,2,9]]},"reference":[{"issue":"5","key":"pcbi.1011870.ref001","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1093\/dnares\/6.5.283","article-title":"Complete Structure of the Chloroplast Genome of Arabidopsis thaliana","volume":"6","author":"S Sato","year":"1999","journal-title":"DNA Res"},{"issue":"49","key":"pcbi.1011870.ref002","doi-asserted-by":"crossref","first-page":"14323","DOI":"10.1021\/acs.jafc.0c03001","article-title":"Genomic Profiling: The Strengths and Limitations of Chloroplast Genome-Based Plant Variety Authentication","volume":"68","author":"D Teske","year":"2020","journal-title":"J Agric Food Chem"},{"issue":"11","key":"pcbi.1011870.ref003","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1007\/BF00418529","article-title":"Conservation of chloroplast genome structure among vascular plants","volume":"10","author":"JD Palmer","year":"1986","journal-title":"Curr Genet"},{"issue":"24","key":"pcbi.1011870.ref004","doi-asserted-by":"crossref","first-page":"9054","DOI":"10.1073\/pnas.84.24.9054","article-title":"Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs","volume":"84","author":"KH Wolfe","year":"1987","journal-title":"Proc Natl Acad Sci"},{"key":"pcbi.1011870.ref005","doi-asserted-by":"crossref","first-page":"57","DOI":"10.3389\/fpls.2016.00057","article-title":"Chloroplast DNA Copy Number Changes during Plant Development in Organelle DNA Polymerase Mutants.","volume":"7","author":"SA Morley","year":"2016","journal-title":"Front Plant Sci"},{"issue":"6","key":"pcbi.1011870.ref006","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1007\/s11738-020-03089-x","article-title":"The chloroplast genome: a review","volume":"42","author":"J Dobrogojski","year":"2020","journal-title":"Acta Physiol Plant"},{"issue":"8","key":"pcbi.1011870.ref007","doi-asserted-by":"crossref","first-page":"636","DOI":"10.1016\/j.plaphy.2010.04.009","article-title":"Plastid ndh genes in plant evolution","volume":"48","author":"M Mart\u00edn","year":"2010","journal-title":"Plant Physiol Biochem"},{"key":"pcbi.1011870.ref008","doi-asserted-by":"crossref","first-page":"107229","DOI":"10.1016\/j.ympev.2021.107229","article-title":"Phylogenetics and comparative plastome genomics of two of the largest genera of angiosperms, Piper and Peperomia (Piperaceae).","volume":"163","author":"SE Simmonds","year":"2021","journal-title":"Mol Phylogenet Evol"},{"issue":"23","key":"pcbi.1011870.ref009","doi-asserted-by":"crossref","first-page":"8369","DOI":"10.1073\/pnas.0503123102","article-title":"Use of DNA barcodes to identify flowering plants","volume":"102","author":"WJ Kress","year":"2005","journal-title":"Proc Natl Acad Sci"},{"issue":"31","key":"pcbi.1011870.ref010","doi-asserted-by":"crossref","first-page":"12794","DOI":"10.1073\/pnas.0905845106","article-title":"A DNA barcode for land plants","volume":"106","author":"CBOL Plant Working Group","year":"2009","journal-title":"Proc Natl Acad Sci"},{"issue":"4","key":"pcbi.1011870.ref011","doi-asserted-by":"crossref","first-page":"1119","DOI":"10.1093\/jxb\/ery445","article-title":"Feeding the world: improving photosynthetic efficiency for sustainable crop production","volume":"70","author":"AJ Simkin","year":"2019","journal-title":"J Exp Bot"},{"key":"pcbi.1011870.ref012","article-title":"Editorial: Chloroplast Biotechnology for Crop Improvement.","author":"C De-la-Pe\u00f1a","year":"2022","journal-title":"Front Plant Sci [Internet]."},{"issue":"1","key":"pcbi.1011870.ref013","doi-asserted-by":"crossref","first-page":"254","DOI":"10.1186\/s13059-020-02153-6","article-title":"A systematic comparison of chloroplast genome assembly tools","volume":"21","author":"JA Freudenthal","year":"2020","journal-title":"Genome Biol"},{"issue":"1","key":"pcbi.1011870.ref014","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1186\/s12864-021-07397-5","article-title":"Impact of short-read sequencing on the misassembly of a plant genome","volume":"22","author":"P Wang","year":"2021","journal-title":"BMC Genomics"},{"issue":"11","key":"pcbi.1011870.ref015","doi-asserted-by":"crossref","first-page":"e1008397","DOI":"10.1371\/journal.pcbi.1008397","article-title":"Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions.","volume":"16","author":"R Sethi","year":"2020","journal-title":"PLOS Comput Biol."},{"issue":"1","key":"pcbi.1011870.ref016","doi-asserted-by":"crossref","first-page":"977","DOI":"10.1186\/s12864-018-5348-8","article-title":"Assembly of chloroplast genomes with long- and short-read data: a comparison of approaches using Eucalyptus pauciflora as a test case","volume":"19","author":"W Wang","year":"2018","journal-title":"BMC Genomics"},{"issue":"12","key":"pcbi.1011870.ref017","first-page":"3372","article-title":"Long-Reads Reveal That the Chloroplast Genome Exists in Two Distinct Versions in Most Plants","volume":"11","author":"W Wang","year":"2019","journal-title":"Genome Biol Evol"},{"issue":"11","key":"pcbi.1011870.ref018","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1007\/BF00418530","article-title":"Structural evolution and flip-flop recombination of chloroplast DNA in the fern genus Osmunda","volume":"10","author":"DB Stein","year":"1986","journal-title":"Curr Genet"},{"issue":"1","key":"pcbi.1011870.ref019","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1111\/jpy.12811","article-title":"Flip-flop organization in the chloroplast genome of Capsosiphon fulvescens (Ulvophyceae, Chlorophyta).","volume":"55","author":"D Kim","year":"2019","journal-title":"J Phycol"},{"issue":"1","key":"pcbi.1011870.ref020","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1186\/s13059-020-1935-5","article-title":"Opportunities and challenges in long-read sequencing data analysis","volume":"21","author":"SL Amarasinghe","year":"2020","journal-title":"Genome Biol"},{"issue":"21","key":"pcbi.1011870.ref021","doi-asserted-by":"crossref","first-page":"464","DOI":"10.21105\/joss.00464","article-title":"chloroExtractor: extraction and assembly of the chloroplast genome from whole genome shotgun data","volume":"3","author":"MJ Ankenbrand","year":"2018","journal-title":"J Open Source Softw"},{"issue":"5","key":"pcbi.1011870.ref022","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1038\/s41587-019-0072-8","article-title":"Assembly of long, error-prone reads using repeat graphs","volume":"37","author":"M Kolmogorov","year":"2019","journal-title":"Nat Biotechnol"},{"issue":"6","key":"pcbi.1011870.ref023","doi-asserted-by":"crossref","first-page":"e1005595","DOI":"10.1371\/journal.pcbi.1005595","article-title":"Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads.","volume":"13","author":"RR Wick","year":"2017","journal-title":"PLOS Comput Biol"},{"issue":"12","key":"pcbi.1011870.ref024","doi-asserted-by":"crossref","first-page":"e1010705","DOI":"10.1371\/journal.pcbi.1010705","article-title":"Ten simple rules and a template for creating workflows-as-applications.","volume":"18","author":"M Roach","year":"2022","journal-title":"PLOS Comput Biol."},{"key":"pcbi.1011870.ref025","article-title":"Versatile and open software for comparing large genomes","volume":"9","author":"S Kurtz","year":"2004","journal-title":"Genome Biol"},{"issue":"20","key":"pcbi.1011870.ref026","doi-asserted-by":"crossref","first-page":"3350","DOI":"10.1093\/bioinformatics\/btv383","article-title":"Bandage: interactive visualization of de novo genome assemblies","volume":"31","author":"R Wick","year":"2015","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1011870.ref027","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1186\/1471-2105-10-421","article-title":"BLAST+: architecture and applications.","volume":"10","author":"C Camacho","year":"2009","journal-title":"BMC Bioinformatics"},{"issue":"6","key":"pcbi.1011870.ref028","doi-asserted-by":"crossref","first-page":"1442","DOI":"10.1111\/1755-0998.13787","article-title":"Plastid Genome Assembly Using Long-read data","volume":"23","author":"W Zhou","year":"2023","journal-title":"Molecular Ecology Resources"},{"issue":"1","key":"pcbi.1011870.ref029","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1186\/s13059-020-02154-5","article-title":"GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes","volume":"21","author":"JJ Jin","year":"2020","journal-title":"Genome Biol"},{"key":"pcbi.1011870.ref030","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1038\/s41592-022-01539-7","article-title":"Oxford Nanopore R10.4 long-read sequencing enables the generation of near-finished bacterial genomes from pure cultures and metagenomes without short-read or reference polishing.","volume":"19","author":"M. Sereika","year":"2022","journal-title":"Nat Methods."},{"key":"pcbi.1011870.ref031","doi-asserted-by":"crossref","first-page":"2352","DOI":"10.1016\/j.csbj.2023.03.038","article-title":"Benchmarking of Nanopore R10.4 and R9.4.1 flow cells in single-cell whole-genome amplification and whole-genome shotgun sequencing.","volume":"21","author":"Y Ni","year":"2023","journal-title":"Comput. Struct. Biotechnol. J."},{"issue":"1","key":"pcbi.1011870.ref032","first-page":"mgen000910","article-title":"Comparison of R9.4.1\/Kit10 and R10\/Kit12 Oxford Nanopore flowcells and chemistries in bacterial genome reconstruction","volume":"9","author":"N Sanderson","year":"2023","journal-title":"Microb Genom"},{"issue":"1","key":"pcbi.1011870.ref033","doi-asserted-by":"crossref","first-page":"lqab019","DOI":"10.1093\/nargab\/lqab019","article-title":"Sequencing error profiles of Illumina sequencing instruments","volume":"3","author":"N Stoler","year":"2021","journal-title":"NAR Genomics Bioinforma"},{"key":"pcbi.1011870.ref034","first-page":"1","article-title":"DeepConsensus improves the accuracy of sequences with a gap-aware sequence transformer","author":"G Baid","year":"2022","journal-title":"Nat Biotechnol"},{"issue":"5","key":"pcbi.1011870.ref035","doi-asserted-by":"crossref","first-page":"100129","DOI":"10.1016\/j.xgen.2022.100129","article-title":"PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions","volume":"2","author":"ND Olson","year":"2022","journal-title":"Cell Genomics"},{"issue":"1","key":"pcbi.1011870.ref036","doi-asserted-by":"crossref","first-page":"20740","DOI":"10.1038\/s41598-021-00178-w","article-title":"Comparative evaluation of Nanopore polishing tools for microbial genome assembly and polishing strategies for downstream analysis","volume":"11","author":"JY Lee","year":"2021","journal-title":"Sci Rep"},{"issue":"15","key":"pcbi.1011870.ref037","doi-asserted-by":"crossref","first-page":"R736","DOI":"10.1016\/j.cub.2019.06.040","article-title":"The mitochondrial and chloroplast genomes of the green alga Haematococcus are made up of nearly identical repetitive sequences","volume":"29","author":"X Zhang","year":"2019","journal-title":"Curr Biol"},{"issue":"5885","key":"pcbi.1011870.ref038","doi-asserted-by":"crossref","first-page":"698","DOI":"10.1038\/299698a0","article-title":"Mitochondrial and chloroplast genomes of maize have a 12-kilobase DNA sequence in common","volume":"299","author":"DB Stern","year":"1982","journal-title":"Nature"},{"issue":"9","key":"pcbi.1011870.ref039","doi-asserted-by":"crossref","first-page":"2040","DOI":"10.1093\/molbev\/msm133","article-title":"Transfer of Chloroplast Genomic DNA to Mitochondrial Genome Occurred At Least 300 MYA","volume":"24","author":"D Wang","year":"2007","journal-title":"Mol Biol Evol"},{"issue":"1","key":"pcbi.1011870.ref040","doi-asserted-by":"crossref","first-page":"210","DOI":"10.1186\/s12870-018-1421-3","article-title":"Interspecific chloroplast genome sequence diversity and genomic resources in Diospyros","volume":"18","author":"W Li","year":"2018","journal-title":"BMC Plant Biol"},{"issue":"10","key":"pcbi.1011870.ref041","doi-asserted-by":"crossref","first-page":"e0257521","DOI":"10.1371\/journal.pone.0257521","article-title":"Sequencing DNA with nanopores: Troubles and biases.","volume":"16","author":"C Delahaye","year":"2021","journal-title":"PLOS ONE"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1011870","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T00:00:00Z","timestamp":1708560000000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011870","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T18:36:35Z","timestamp":1708626995000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011870"}},"subtitle":[],"editor":[{"given":"Christos A.","family":"Ouzounis","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,2,9]]},"references-count":41,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,2,9]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1011870","relation":{"new_version":[{"id-type":"doi","id":"10.1371\/journal.pcbi.1011870","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,9]]}}}