{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T09:17:42Z","timestamp":1774689462766,"version":"3.50.1"},"reference-count":43,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2020,4,22]],"date-time":"2020-04-22T00:00:00Z","timestamp":1587513600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["R01-GM118568"],"award-info":[{"award-number":["R01-GM118568"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["R21-CA220411"],"award-info":[{"award-number":["R21-CA220411"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DBI-1350041"],"award-info":[{"award-number":["DBI-1350041"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["IIS-1349906"],"award-info":[{"award-number":["IIS-1349906"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Read alignment is central to many aspects of modern genomics. Most aligners use heuristics to accelerate processing, but these heuristics can fail to find the optimal alignments of reads. Alignment accuracy is typically measured through simulated reads; however, the simulated location may not be the (only) location with the optimal alignment score.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Vargas implements a heuristic-free algorithm guaranteed to find the highest-scoring alignment for real sequencing reads to a linear or graph genome. With semiglobal and local alignment modes and affine gap and quality-scaled mismatch penalties, it can implement the scoring functions of commonly used aligners to calculate optimal alignments. While this is computationally intensive, Vargas uses multi-core parallelization and vectorized (SIMD) instructions to make it practical to optimally align large numbers of reads, achieving a maximum speed of 456 billion cell updates per second. We demonstrate how these \u2018gold standard\u2019 Vargas alignments can be used to improve heuristic alignment accuracy by optimizing command-line parameters in Bowtie 2, BWA-maximal exact match and vg to align more reads correctly.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Source code implemented in C++ and compiled binary releases are available at https:\/\/github.com\/langmead-lab\/vargas under the MIT license.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa265","type":"journal-article","created":{"date-parts":[[2020,4,15]],"date-time":"2020-04-15T15:25:54Z","timestamp":1586964354000},"page":"3712-3718","source":"Crossref","is-referenced-by-count":27,"title":["Vargas: heuristic-free alignment for assessing linear and graph read aligners"],"prefix":"10.1093","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2195-5300","authenticated-orcid":false,"given":"Charlotte A","family":"Darby","sequence":"first","affiliation":[{"name":"Department of Computer Science"}]},{"given":"Ravi","family":"Gaddipati","sequence":"additional","affiliation":[{"name":"Department of Biomedical Engineering"}]},{"given":"Michael C","family":"Schatz","sequence":"additional","affiliation":[{"name":"Department of Computer Science"},{"name":"Department of Biology , Johns Hopkins University, Baltimore, MD 21218, USA"},{"name":"Simons Center for Quantitative Biology , Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2437-1976","authenticated-orcid":false,"given":"Ben","family":"Langmead","sequence":"additional","affiliation":[{"name":"Department of Computer Science"}]}],"member":"286","published-online":{"date-parts":[[2020,4,22]]},"reference":[{"key":"2023063010281876300_btaa265-B1","doi-asserted-by":"crossref","first-page":"R18","DOI":"10.1186\/gb-2011-12-2-r18","article-title":"Analyzing and minimizing PCR amplification bias in illumina sequencing libraries","volume":"12","author":"Aird","year":"2011","journal-title":"Genome Biol"},{"key":"2023063010281876300_btaa265-B2","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1186\/s13059-019-1774-4","article-title":"Is it time to change the reference genome?","volume":"20","author":"Ballouz","year":"2019","journal-title":"Genome Biol"},{"key":"2023063010281876300_btaa265-B3","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1186\/s13059-015-0587-3","article-title":"Extending reference assembly models","volume":"16","author":"Church","year":"2015","journal-title":"Genome Biol"},{"key":"2023063010281876300_btaa265-B4","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1186\/s12859-016-0930-z","article-title":"Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments","volume":"17","author":"Daily","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023063010281876300_btaa265-B5","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1101\/gr.210500.116","article-title":"A reference data set of 5.4 million phased human variants validated by genetic inheritance from sequencing a three-generation 17-member pedigree","volume":"27","author":"Eberle","year":"2017","journal-title":"Genome Res"},{"key":"2023063010281876300_btaa265-B6","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1093\/bioinformatics\/btl582","article-title":"Striped Smith-Waterman speeds database searches six times over other SIMD implementations","volume":"23","author":"Farrar","year":"2007","journal-title":"Bioinformatics"},{"key":"2023063010281876300_btaa265-B7","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1038\/nbt.4227","article-title":"Variation graph toolkit improves read mapping by representing genetic variation in the reference","volume":"36","author":"Garrison","year":"2018","journal-title":"Nat. Biotechnol"},{"key":"2023063010281876300_btaa265-B8","doi-asserted-by":"crossref","first-page":"705","DOI":"10.1016\/0022-2836(82)90398-9","article-title":"An improved algorithm for matching biological sequences","volume":"162","author":"Gotoh","year":"1982","journal-title":"J. Mol. Biol"},{"key":"2023063010281876300_btaa265-B9","doi-asserted-by":"crossref","first-page":"210","DOI":"10.1186\/1471-2105-12-210","article-title":"A novel and well-defined benchmarking method for second generation read mapping","volume":"12","author":"Holtgrewe","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023063010281876300_btaa265-B10","doi-asserted-by":"crossref","first-page":"i361","DOI":"10.1093\/bioinformatics\/btt215","article-title":"Short read alignment with populations of genomes","volume":"29","author":"Huang","year":"2013","journal-title":"Bioinformatics"},{"key":"2023063010281876300_btaa265-B11","year":"2015"},{"key":"2023063010281876300_btaa265-B12","first-page":"451","author":"Jain","year":"2019"},{"key":"2023063010281876300_btaa265-B13","volume-title":"Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition","author":"Jeffers","year":"2016"},{"key":"2023063010281876300_btaa265-B14","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1038\/s41587-019-0201-4","article-title":"Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype","volume":"37","author":"Kim","year":"2019","journal-title":"Nat. Biotechnol"},{"key":"2023063010281876300_btaa265-B15","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1186\/s13059-017-1290-3","article-title":"A tandem simulation framework for predicting mapping quality","volume":"18","author":"Langmead","year":"2017","journal-title":"Genome Biol"},{"key":"2023063010281876300_btaa265-B16","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. Methods"},{"key":"2023063010281876300_btaa265-B17","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1093\/bioinformatics\/18.3.452","article-title":"Multiple sequence alignment using partial order graphs","volume":"18","author":"Lee","year":"2002","journal-title":"Bioinformatics"},{"key":"2023063010281876300_btaa265-B18","doi-asserted-by":"crossref","first-page":"2097","DOI":"10.1093\/bioinformatics\/bts330","article-title":"Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score","volume":"28","author":"Lee","year":"2012","journal-title":"Bioinformatics"},{"key":"2023063010281876300_btaa265-B19","author":"Li","year":"2013"},{"key":"2023063010281876300_btaa265-B20","first-page":"3094","article-title":"Minimap2: pairwise alignment for nucleotide sequences","author":"Li","year":"2018"},{"key":"2023063010281876300_btaa265-B21","first-page":"589","article-title":"Fast and accurate long-read alignment with Burrows-Wheeler transform","volume":"26","author":"Li","year":"2010","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023063010281876300_btaa265-B22","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res"},{"key":"2023063010281876300_btaa265-B23","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1038\/s41592-018-0054-7","article-title":"A synthetic-diploid benchmark for accurate variant-calling evaluation","volume":"15","author":"Li","year":"2018","journal-title":"Nat. Methods"},{"key":"2023063010281876300_btaa265-B24","first-page":"184","author":"Liu","year":"2014"},{"key":"2023063010281876300_btaa265-B25","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1186\/1471-2105-14-117","article-title":"CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions","volume":"14","author":"Liu","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023063010281876300_btaa265-B26","doi-asserted-by":"crossref","first-page":"50","DOI":"10.12688\/wellcomeopenres.15126.2","article-title":"Variant calling on the GRCh38 assembly with the data from phase three of the 1000 Genomes Project","volume":"4","author":"Lowy-Gallego","year":"2019","journal-title":"Wellcome Open Res"},{"key":"2023063010281876300_btaa265-B27","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1101\/gr.214155.116","article-title":"Genome graphs and the evolution of genome inference","volume":"27","author":"Paten","year":"2017","journal-title":"Genome Res"},{"key":"2023063010281876300_btaa265-B28","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1038\/nmeth.4197","article-title":"Salmon provides fast and bias-aware quantification of transcript expression","volume":"14","author":"Patro","year":"2017","journal-title":"Nat. Methods"},{"key":"2023063010281876300_btaa265-B29","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1186\/s13059-018-1595-x","article-title":"FORGe: prioritizing variants for graph genomes","volume":"19","author":"Pritt","year":"2018","journal-title":"Genome Biol"},{"key":"2023063010281876300_btaa265-B30","doi-asserted-by":"crossref","first-page":"3437","DOI":"10.1093\/bioinformatics\/bty380","article-title":"Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading","volume":"34","author":"Rahn","year":"2018","journal-title":"Bioinformatics"},{"key":"2023063010281876300_btaa265-B31","doi-asserted-by":"crossref","first-page":"3599","DOI":"10.1093\/bioinformatics\/btz162","article-title":"Bit-parallel sequence-to-graph alignment","volume":"35","author":"Rautiainen","year":"2019","journal-title":"Bioinformatics"},{"key":"2023063010281876300_btaa265-B32","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1186\/1471-2105-12-221","article-title":"Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation","volume":"12","author":"Rognes","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023063010281876300_btaa265-B33","first-page":"699","article-title":"Six-fold speed-up of Smith-Waterman sequence database searches using parallel processing on common microprocessors","volume":"16","author":"Rognes","year":"2000","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023063010281876300_btaa265-B34","doi-asserted-by":"crossref","first-page":"R98","DOI":"10.1186\/gb-2009-10-9-r98","article-title":"Simultaneous alignment of short reads against multiple genomes","volume":"10","author":"Schneeberger","year":"2009","journal-title":"Genome Biol"},{"key":"2023063010281876300_btaa265-B35","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol"},{"key":"2023063010281876300_btaa265-B36","article-title":"Teaser: individualized benchmarking and optimization of read mapping results for NGS data","volume":"16, 235","author":"Smolka","year":"2015","journal-title":"Genome Biol"},{"key":"2023063010281876300_btaa265-B37","author":"Sodani","year":"2015"},{"key":"2023063010281876300_btaa265-B38","first-page":"34","author":"Tam","year":"2018"},{"key":"2023063010281876300_btaa265-B39","doi-asserted-by":"crossref","first-page":"e127","DOI":"10.1093\/nar\/gks425","article-title":"A new strategy to reduce allelic bias in RNA-Seq readmapping","volume":"40","author":"Vijaya Satya","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023063010281876300_btaa265-B40","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1186\/s12896-019-0535-5","article-title":"VARSCOT: variant-aware detection and scoring enables sensitive and personalized off-target detection for CRISPR-Cas9","volume":"19","author":"Wilson","year":"2019","journal-title":"BMC Biotechnology"},{"key":"2023063010281876300_btaa265-B41","first-page":"145","article-title":"Using video-oriented instructions to speed up sequence comparison","volume":"13","author":"Wozniak","year":"1997","journal-title":"Comp. Appl. Biosci. CABIOS"},{"key":"2023063010281876300_btaa265-B42","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1186\/s13059-019-1717-0","article-title":"One reference genome is not enough","volume":"20","author":"Yang","year":"2019","journal-title":"Genome Biol"},{"key":"2023063010281876300_btaa265-B43","doi-asserted-by":"crossref","first-page":"246","DOI":"10.1038\/nbt.2835","article-title":"Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls","volume":"32","author":"Zook","year":"2014","journal-title":"Nat. Biotechnol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa265\/33363800\/btaa265.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/12\/3712\/50745949\/bioinformatics_36_12_3712.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/12\/3712\/50745949\/bioinformatics_36_12_3712.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,30]],"date-time":"2023-06-30T06:28:56Z","timestamp":1688106536000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/12\/3712\/5823884"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,4,22]]},"references-count":43,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2020,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa265","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2019.12.20.884676","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,6,15]]},"published":{"date-parts":[[2020,4,22]]}}}