{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,3]],"date-time":"2026-05-03T09:54:47Z","timestamp":1777802087474,"version":"3.51.4"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T00:00:00Z","timestamp":1700179200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union Regional Development Fund"},{"name":"Spanish Ministerio de Ciencia e Innovacion","award":["PRE2021-101059"],"award-info":[{"award-number":["PRE2021-101059"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,12,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Advances in genomics and sequencing technologies demand faster and more scalable analysis methods that can process longer sequences with higher accuracy. However, classical pairwise alignment methods, based on dynamic programming (DP), impose impractical computational requirements to align long and noisy sequences like those produced by PacBio and Nanopore technologies. The recently proposed wavefront alignment (WFA) algorithm paves the way for more efficient alignment tools, improving time and memory complexity over previous methods. However, high-performance computing (HPC) platforms require efficient parallel algorithms and tools to exploit the computing resources available on modern accelerator-based architectures.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>This paper presents WFA-GPU, a GPU (graphics processing unit)-accelerated tool to compute exact gap-affine alignments based on the WFA algorithm. We present the algorithmic adaptations and performance optimizations that allow exploiting the massively parallel capabilities of modern GPU devices to accelerate the alignment computations. In particular, we propose a CPU\u2013GPU co-design capable of performing inter-sequence and intra-sequence parallel sequence alignment, combining a succinct WFA-data representation with an efficient GPU implementation. As a result, we demonstrate that our implementation outperforms the original multi-threaded WFA implementation by up to 4.3\u00d7 and up to 18.2\u00d7 when using heuristic methods on long and noisy sequences. Compared to other state-of-the-art tools and libraries, the WFA-GPU is up to 29\u00d7 faster than other GPU implementations and up to four orders of magnitude faster than other CPU implementations. Furthermore, WFA-GPU is the only GPU solution capable of correctly aligning long reads using a commodity GPU.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>WFA-GPU code and documentation are publicly available at https:\/\/github.com\/quim0\/WFA-GPU.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad701","type":"journal-article","created":{"date-parts":[[2023,11,16]],"date-time":"2023-11-16T14:44:12Z","timestamp":1700145852000},"source":"Crossref","is-referenced-by-count":27,"title":["WFA-GPU: gap-affine pairwise read-alignment using GPUs"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4871-3192","authenticated-orcid":false,"given":"Quim","family":"Aguado-Puig","sequence":"first","affiliation":[{"name":"Departament d\u2019Arquitectura de Computadors i Sistemes Operatius, Universitat Aut\u00f2noma de Barcelona , Barcelona 08193, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Max","family":"Doblas","sequence":"additional","affiliation":[{"name":"Computer Sciences Department, Barcelona Supercomputing Center , Barcelona 08034, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christos","family":"Matzoros","sequence":"additional","affiliation":[{"name":"Computer Sciences Department, Barcelona Supercomputing Center , Barcelona 08034, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Antonio","family":"Espinosa","sequence":"additional","affiliation":[{"name":"Departament d\u2019Arquitectura de Computadors i Sistemes Operatius, Universitat Aut\u00f2noma de Barcelona , Barcelona 08193, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Juan Carlos","family":"Moure","sequence":"additional","affiliation":[{"name":"Departament d\u2019Arquitectura de Computadors i Sistemes Operatius, Universitat Aut\u00f2noma de Barcelona , Barcelona 08193, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7951-3914","authenticated-orcid":false,"given":"Santiago","family":"Marco-Sola","sequence":"additional","affiliation":[{"name":"Computer Sciences Department, Barcelona Supercomputing Center , Barcelona 08034, Spain"},{"name":"Departament d\u2019Arquitectura de Computadors, Universitat Polit\u00e8cnica de Catalunya , Barcelona 08034, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Miquel","family":"Moreto","sequence":"additional","affiliation":[{"name":"Computer Sciences Department, Barcelona Supercomputing Center , Barcelona 08034, Spain"},{"name":"Departament d\u2019Arquitectura de Computadors, Universitat Polit\u00e8cnica de Catalunya , Barcelona 08034, Spain"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2023,11,17]]},"reference":[{"key":"2023120522524497900_btad701-B1","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1186\/s12859-019-3086-9","article-title":"Gasal2: a GPU accelerated sequence alignment library for high-throughput NGS data","volume":"20","author":"Ahmed","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023120522524497900_btad701-B2","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1186\/s12859-020-03685-1","article-title":"GPU acceleration of Darwin read overlapper for de novo assembly of long DNA reads","volume":"21","author":"Ahmed","year":"2020","journal-title":"BMC Bioinformatics"},{"key":"2023120522524497900_btad701-B3","doi-asserted-by":"crossref","first-page":"3355","DOI":"10.1093\/bioinformatics\/btx342","article-title":"GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping","volume":"33","author":"Alser","year":"2017","journal-title":"Bioinformatics"},{"key":"2023120522524497900_btad701-B4","doi-asserted-by":"crossref","first-page":"4255","DOI":"10.1093\/bioinformatics\/btz234","article-title":"Shouji: a fast and efficient pre-alignment filter for sequence alignment","volume":"35","author":"Alser","year":"2019","journal-title":"Bioinformatics"},{"key":"2023120522524497900_btad701-B5","doi-asserted-by":"crossref","first-page":"406","DOI":"10.1186\/s12859-020-03720-1","article-title":"ADEPT: a domain independent sequence alignment strategy for GPU architectures","volume":"21","author":"Awan","year":"2020","journal-title":"BMC Bioinformatics"},{"key":"2023120522524497900_btad701-B6","author":"Baeza-Yates","year":"1989"},{"key":"2023120522524497900_btad701-B7","first-page":"465","author":"Baeza-Yates","year":"1992"},{"key":"2023120522524497900_btad701-B8","author":"Chac\u00f3n","year":"2014"},{"key":"2023120522524497900_btad701-B9","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1186\/s12859-016-0930-z","article-title":"Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments","volume":"17","author":"Daily","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023120522524497900_btad701-B10","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/1471-2105-9-11","article-title":"SeqAn an efficient, generic C++ library for sequence analysis","volume":"9","author":"D\u00f6ring","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023120522524497900_btad701-B11","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids","author":"Durbin","year":"1998"},{"key":"2023120522524497900_btad701-B12","author":"Eizenga","year":"2022"},{"key":"2023120522524497900_btad701-B13","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1093\/bioinformatics\/btl582","article-title":"Striped Smith\u2013Waterman speeds database searches six times over other SIMD implementations","volume":"23","author":"Farrar","year":"2007","journal-title":"Bioinformatics"},{"key":"2023120522524497900_btad701-B14","volume-title":"GPU Computing Gems Emerald Edition","author":"Hwu","year":"2011"},{"key":"2023120522524497900_btad701-B15","volume-title":"An Introduction to Bioinformatics Algorithms","author":"Jones","year":"2004"},{"key":"2023120522524497900_btad701-B16","doi-asserted-by":"crossref","first-page":"722","DOI":"10.1101\/gr.215087.116","article-title":"Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation","volume":"27","author":"Koren","year":"2017","journal-title":"Genome Res"},{"key":"2023120522524497900_btad701-B17","author":"Li","year":"2013"},{"key":"2023120522524497900_btad701-B18","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2023120522524497900_btad701-B19","doi-asserted-by":"crossref","first-page":"2639","DOI":"10.1109\/TPDS.2017.2674664","article-title":"Perfect hashing based parallel algorithms for multiple string matching on graphic processing units","volume":"28","author":"Lin","year":"2017","journal-title":"IEEE Trans Parallel Distrib Syst"},{"key":"2023120522524497900_btad701-B20","author":"Lindegger"},{"key":"2023120522524497900_btad701-B21","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1038\/nmeth.2221","article-title":"The GEM mapper: fast, accurate and versatile alignment by filtration","volume":"9","author":"Marco-Sola","year":"2012","journal-title":"Nat Methods"},{"key":"2023120522524497900_btad701-B22","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1093\/bioinformatics\/btaa777","article-title":"Fast gap-affine pairwise alignment using the wavefront algorithm","volume":"37","author":"Marco-Sola","year":"2021","journal-title":"Bioinformatics"},{"key":"2023120522524497900_btad701-B23","doi-asserted-by":"crossref","first-page":"1297","DOI":"10.1101\/gr.107524.110","article-title":"The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data","volume":"20","author":"McKenna","year":"2010","journal-title":"Genome Res"},{"key":"2023120522524497900_btad701-B24","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1007\/BF01840446","article-title":"An O(ND) difference algorithm and its variations","volume":"1","author":"Myers","year":"1986","journal-title":"Algorithmica"},{"key":"2023120522524497900_btad701-B25","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1145\/316542.316550","article-title":"A fast bit-vector algorithm for approximate string matching based on dynamic programming","volume":"46","author":"Myers","year":"1999","journal-title":"J ACM"},{"key":"2023120522524497900_btad701-B26","author":"Navarro"},{"key":"2023120522524497900_btad701-B27","doi-asserted-by":"crossref","first-page":"879","DOI":"10.1109\/JPROC.2008.917757","article-title":"GPU computing","volume":"96","author":"Owens","year":"2008","journal-title":"Proc IEEE"},{"key":"2023120522524497900_btad701-B28","doi-asserted-by":"crossref","first-page":"e01315\u201319","DOI":"10.1128\/JCM.01315-19","article-title":"Third-generation sequencing in the clinical laboratory: exploring the advantages and challenges of nanopore sequencing","volume":"58","author":"Petersen","year":"2019","journal-title":"J Clin Microbiol"},{"key":"2023120522524497900_btad701-B29","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/s12864-016-3404-9","article-title":"ChimPipe: accurate detection of fusion genes and transcription-induced chimeras from RNA-seq data","volume":"18","author":"Rodr\u00edguez-Mart\u00edn","year":"2017","journal-title":"BMC Genomics"},{"key":"2023120522524497900_btad701-B30","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1093\/bioinformatics\/16.8.699","article-title":"Six-fold speed-up of Smith\u2013Waterman sequence database searches using parallel processing on common microprocessors","volume":"16","author":"Rognes","year":"2000","journal-title":"Bioinformatics"},{"key":"2023120522524497900_btad701-B31","doi-asserted-by":"crossref","first-page":"359","DOI":"10.1016\/0196-6774(80)90016-4","article-title":"The theory and computation of evolutionary distances: pattern recognition","volume":"1","author":"Sellers","year":"1980","journal-title":"J Algorithms"},{"key":"2023120522524497900_btad701-B32","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1101\/gr.089532.108","article-title":"ABySS: a parallel assembler for short read sequence data","volume":"19","author":"Simpson","year":"2009","journal-title":"Genome Res"},{"key":"2023120522524497900_btad701-B33","doi-asserted-by":"crossref","first-page":"1394","DOI":"10.1093\/bioinformatics\/btw753","article-title":"Edlib: a C\/C++ library for fast, exact sequence alignment using edit distance","volume":"33","author":"\u0160o\u0161i\u0107","year":"2017","journal-title":"Bioinformatics"},{"key":"2023120522524497900_btad701-B34","author":"Suzuki","year":"2017"},{"key":"2023120522524497900_btad701-B35","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1186\/s12859-018-2014-8","article-title":"Introducing difference recurrence relations for faster semi-global alignment of long sequences","volume":"19","author":"Suzuki","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023120522524497900_btad701-B36","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1016\/0196-6774(85)90023-9","article-title":"Finding approximate patterns in strings","volume":"6","author":"Ukkonen","year":"1985","journal-title":"J Algorithms"},{"key":"2023120522524497900_btad701-B37","first-page":"145","article-title":"Using video-oriented instructions to speed up sequence comparison","volume":"13","author":"Wozniak","year":"1997","journal-title":"Comput Appl Biosci"},{"key":"2023120522524497900_btad701-B38","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1145\/135239.135244","article-title":"Fast text searching: allowing errors","volume":"35","author":"Wu","year":"1992","journal-title":"Commun ACM"},{"key":"2023120522524497900_btad701-B39","author":"Zeni","year":"2020"},{"key":"2023120522524497900_btad701-B40","doi-asserted-by":"crossref","first-page":"e82138","DOI":"10.1371\/journal.pone.0082138","article-title":"SSW library: an SIMD Smith-Waterman C\/C++ library for use in genomic applications","volume":"8","author":"Zhao","year":"2013","journal-title":"PLoS One"},{"key":"2023120522524497900_btad701-B41","doi-asserted-by":"crossref","first-page":"1913","DOI":"10.1093\/bioinformatics\/btv053","article-title":"Starcode: sequence clustering based on all-pairs search","volume":"31","author":"Zorita","year":"2015","journal-title":"Bioinformatics"},{"key":"2023120522524497900_btad701-B42","first-page":"1","article-title":"Sequence clustering in bioinformatics: an empirical study","volume":"21","author":"Zou","year":"2020","journal-title":"Brief Bioinform"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad701\/53483097\/btad701.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/12\/btad701\/54025738\/btad701.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/12\/btad701\/54025738\/btad701.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,5]],"date-time":"2023-12-05T23:01:02Z","timestamp":1701817262000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad701\/7425447"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2023,11,17]]},"references-count":42,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2023,12,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad701","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,12,1]]},"published":{"date-parts":[[2023,11,17]]},"article-number":"btad701"}}