{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T06:14:29Z","timestamp":1772172869905,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1009078","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T00:00:00Z","timestamp":1625097600000}}],"reference-count":29,"publisher":"Public Library of Science (PLoS)","issue":"6","license":[{"start":{"date-parts":[[2021,6,21]],"date-time":"2021-06-21T00:00:00Z","timestamp":1624233600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000051","name":"National Human Genome Research Institute","doi-asserted-by":"publisher","award":["U24HG007497."],"award-info":[{"award-number":["U24HG007497."]}],"id":[{"id":"10.13039\/100000051","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000051","name":"National Human Genome Research Institute","doi-asserted-by":"publisher","award":["1U01HG010973"],"award-info":[{"award-number":["1U01HG010973"]}],"id":[{"id":"10.13039\/100000051","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>\n                    It is computationally challenging to detect variation by aligning single-molecule sequencing (SMS) reads, or contigs from SMS assemblies. One approach to efficiently align SMS reads is sparse dynamic programming (SDP), where optimal chains of exact matches are found between the sequence and the genome. While straightforward implementations of SDP penalize gaps with a cost that is a linear function of gap length, biological variation is more accurately represented when gap cost is a concave function of gap length. 
We have developed a method, lra, that uses SDP with a concave-cost gap penalty, and used lra to align long-read sequences from PacBio and Oxford Nanopore (ONT) instruments as well as de novo assembly contigs. This alignment approach increases sensitivity and specificity for SV discovery, particularly for variants above 1 kb and when discovering variation from ONT reads, while having runtimes that are comparable (1.05-3.76\u00d7) to current methods. When applied to calling variation from\n                    <jats:italic>de novo<\/jats:italic>\n                    assembly contigs, there is a 3.2% increase in Truvari F1 score compared to minimap2+htsbox. lra is available in bioconda (\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/anaconda.org\/bioconda\/lra\" xlink:type=\"simple\">https:\/\/anaconda.org\/bioconda\/lra<\/jats:ext-link>\n                    ) and github (\n                    <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/ChaissonLab\/LRA\" xlink:type=\"simple\">https:\/\/github.com\/ChaissonLab\/LRA<\/jats:ext-link>\n                    ).\n                  <\/jats:p>","DOI":"10.1371\/journal.pcbi.1009078","type":"journal-article","created":{"date-parts":[[2021,6,21]],"date-time":"2021-06-21T13:37:29Z","timestamp":1624282649000},"page":"e1009078","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":87,"title":["lra: A long read aligner for sequences and contigs"],"prefix":"10.1371","volume":"17","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3356-3008","authenticated-orcid":true,"given":"Jingwen","family":"Ren","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5395-1457","authenticated-orcid":true,"given":"Mark J. 
P.","family":"Chaisson","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"340","published-online":{"date-parts":[[2021,6,21]]},"reference":[{"issue":"18","key":"pcbi.1009078.ref001","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"H Li","year":"2018","journal-title":"Bioinformatics"},{"issue":"6","key":"pcbi.1009078.ref002","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1038\/s41592-018-0001-7","article-title":"Accurate detection of complex structural variations using single-molecule sequencing","volume":"15","author":"FJ Sedlazeck","year":"2018","journal-title":"Nature methods"},{"issue":"1","key":"pcbi.1009078.ref003","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1186\/1471-2105-13-238","article-title":"Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory","volume":"13","author":"MJ Chaisson","year":"2012","journal-title":"BMC bioinformatics"},{"key":"pcbi.1009078.ref004","first-page":"1723","article-title":"Comprehensive variant detection in a human genome with highly accurate long reads","volume":"vol. 
27","author":"WJ Rowell","year":"2019","journal-title":"EUROPEAN JOURNAL OF HUMAN GENETICS"},{"issue":"2","key":"pcbi.1009078.ref005","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1006\/jagm.2002.1214","article-title":"Sparse dynamic programming for longest common subsequence from fragments","volume":"42","author":"BS Baker","year":"2002","journal-title":"Journal of algorithms"},{"issue":"5","key":"pcbi.1009078.ref006","doi-asserted-by":"crossref","first-page":"1382","DOI":"10.1073\/pnas.80.5.1382","article-title":"Optimal sequence alignments","volume":"80","author":"WM Fitch","year":"1983","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"3","key":"pcbi.1009078.ref007","doi-asserted-by":"crossref","first-page":"546","DOI":"10.1145\/146637.146656","article-title":"Sparse dynamic programming II: convex and concave cost functions","volume":"39","author":"D Eppstein","year":"1992","journal-title":"Journal of the ACM (JACM)"},{"key":"pcbi.1009078.ref008","first-page":"1","article-title":"Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes","author":"K Shafin","year":"2020","journal-title":"Nature Biotechnology"},{"issue":"5","key":"pcbi.1009078.ref009","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1038\/s41587-019-0072-8","article-title":"Assembly of long, error-prone reads using repeat graphs","volume":"37","author":"M Kolmogorov","year":"2019","journal-title":"Nature biotechnology"},{"issue":"2","key":"pcbi.1009078.ref010","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1038\/s41592-020-01056-5","article-title":"Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm","volume":"18","author":"H Cheng","year":"2021","journal-title":"Nature Methods"},{"issue":"1","key":"pcbi.1009078.ref011","doi-asserted-by":"crossref","first-page":"e1005944","DOI":"10.1371\/journal.pcbi.1005944","article-title":"MUMmer4: A fast and versatile genome 
alignment system","volume":"14","author":"G Mar\u00e7ais","year":"2018","journal-title":"PLoS computational biology"},{"issue":"17","key":"pcbi.1009078.ref012","doi-asserted-by":"crossref","first-page":"i748","DOI":"10.1093\/bioinformatics\/bty597","article-title":"A fast adaptive algorithm for computing whole-genome homology maps","volume":"34","author":"C Jain","year":"2018","journal-title":"Bioinformatics"},{"key":"pcbi.1009078.ref013","first-page":"1","article-title":"A robust benchmark for detection of germline large deletions and insertions","author":"JM Zook","year":"2020","journal-title":"Nature biotechnology"},{"issue":"1","key":"pcbi.1009078.ref014","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-018-08148-z","article-title":"Multi-platform discovery of haplotype-resolved structural variation in human genomes","volume":"10","author":"MJ Chaisson","year":"2019","journal-title":"Nature communications"},{"issue":"10","key":"pcbi.1009078.ref015","doi-asserted-by":"crossref","first-page":"1155","DOI":"10.1038\/s41587-019-0217-9","article-title":"Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome","volume":"37","author":"AM Wenger","year":"2019","journal-title":"Nature biotechnology"},{"issue":"1","key":"pcbi.1009078.ref016","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-020-02107-y","article-title":"Long-read-based human genomic structural variation detection with cuteSV","volume":"21","author":"T Jiang","year":"2020","journal-title":"Genome biology"},{"issue":"3","key":"pcbi.1009078.ref017","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1016\/j.cell.2018.12.019","article-title":"Characterizing the major structural variant alleles of the human genome","volume":"176","author":"PA 
Audano","year":"2019","journal-title":"Cell"},{"issue":"6537","key":"pcbi.1009078.ref018","doi-asserted-by":"crossref","DOI":"10.1126\/science.abf7117","article-title":"Haplotype-resolved diverse human genomes and integrated analysis of structural variation","volume":"372","author":"P Ebert","year":"2021","journal-title":"Science"},{"issue":"7571","key":"pcbi.1009078.ref019","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1038\/nature15394","article-title":"An integrated map of structural variation in 2,504 human genomes","volume":"526","author":"PH Sudmant","year":"2015","journal-title":"Nature"},{"issue":"20","key":"pcbi.1009078.ref020","doi-asserted-by":"crossref","first-page":"11484","DOI":"10.1073\/pnas.1932072100","article-title":"Evolution\u2019s cauldron: duplication, deletion, and rearrangement in the mouse and human genomes","volume":"100","author":"WJ Kent","year":"2003","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"21","key":"pcbi.1009078.ref021","doi-asserted-by":"crossref","first-page":"2790","DOI":"10.1093\/bioinformatics\/btt468","article-title":"NextGenMap: fast and accurate read mapping in highly polymorphic genomes","volume":"29","author":"FJ Sedlazeck","year":"2013","journal-title":"Bioinformatics"},{"issue":"18","key":"pcbi.1009078.ref022","doi-asserted-by":"crossref","first-page":"3363","DOI":"10.1093\/bioinformatics\/bth408","article-title":"Reducing storage requirements for biological sequence comparison","volume":"20","author":"M Roberts","year":"2004","journal-title":"Bioinformatics"},{"key":"pcbi.1009078.ref023","doi-asserted-by":"crossref","unstructured":"Jain C, Dilthey A, Koren S, Aluru S, Phillippy AM. A fast approximate algorithm for mapping long reads to large reference databases. In: International Conference on Research in Computational Molecular Biology. Springer; 2017. p. 
66\u201381.","DOI":"10.1007\/978-3-319-56970-3_5"},{"issue":"Supplement_1","key":"pcbi.1009078.ref024","doi-asserted-by":"crossref","first-page":"i111","DOI":"10.1093\/bioinformatics\/btaa435","article-title":"Weighted minimizer sampling improves long read mapping","volume":"36","author":"C Jain","year":"2020","journal-title":"Bioinformatics"},{"issue":"4","key":"pcbi.1009078.ref025","doi-asserted-by":"crossref","first-page":"721","DOI":"10.1101\/gr.926603","article-title":"LAGAN and Multi-LAGAN: efficient tools for large-scale multiple alignment of genomic DNA","volume":"13","author":"M Brudno","year":"2003","journal-title":"Genome research"},{"issue":"1","key":"pcbi.1009078.ref026","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1016\/0304-3975(89)90101-1","article-title":"Speeding up dynamic programming with applications to molecular biology","volume":"64","author":"Z Galil","year":"1989","journal-title":"Theoretical computer science"},{"issue":"4","key":"pcbi.1009078.ref027","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1145\/270563.571472","article-title":"Algorithms on strings, trees, and sequences: Computer science and computational biology","volume":"28","author":"D Gusfield","year":"1997","journal-title":"Acm Sigact News"},{"issue":"3","key":"pcbi.1009078.ref028","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1145\/146637.146650","article-title":"Sparse dynamic programming I: Linear Cost Functions","volume":"39","author":"D Eppstein","year":"1992","journal-title":"Journal of the ACM (JACM)"},{"issue":"9","key":"pcbi.1009078.ref029","doi-asserted-by":"crossref","first-page":"1394","DOI":"10.1093\/bioinformatics\/btw753","article-title":"Edlib: a C\/C++ library for fast, exact sequence alignment using edit distance","volume":"33","author":"M \u0160o\u0161i\u0107","year":"2017","journal-title":"Bioinformatics"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1009078","type":"new_version","label":"New 
version","source":"publisher","updated":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T00:00:00Z","timestamp":1625097600000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009078","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T14:54:19Z","timestamp":1625151259000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009078"}},"subtitle":[],"editor":[{"given":"Ferhat","family":"Ay","sequence":"first","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,6,21]]},"references-count":29,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2021,6,21]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1009078","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.11.15.383273","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,21]]}}}