{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:38:34Z","timestamp":1772138314871,"version":"3.50.1"},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2024,3,14]],"date-time":"2024-03-14T00:00:00Z","timestamp":1710374400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,7,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Increasing the accuracy of the nucleotide sequence alignment is an essential issue in genomics research. Although classic dynamic programming (DP) algorithms (e.g., Smith\u2013Waterman and Needleman\u2013Wunsch) guarantee to produce the optimal result, their time complexity hinders the application of large-scale sequence alignment. Many optimization efforts that aim to accelerate the alignment process generally come from three perspectives: redesigning data structures [e.g., diagonal or striped Single Instruction Multiple Data (SIMD) implementations], increasing the number of parallelisms in SIMD operations (e.g., difference recurrence relation), or reducing search space (e.g., banded DP). However, no methods combine all these three aspects to build an ultra-fast algorithm. In this study, we developed a Banded Striped Aligner (BSAlign) library that delivers accurate alignment results at an ultra-fast speed by knitting a series of novel methods together to take advantage of all of the aforementioned three perspectives with highlights such as active F-loop in striped vectorization and striped move in banded DP. We applied our new acceleration design on both regular and edit distance pairwise alignment. BSAlign achieved 2-fold speed-up than other SIMD-based implementations for regular pairwise alignment, and 1.5-fold to 4-fold speed-up in edit distance-based implementations for long reads. BSAlign is implemented in C programing language and is available at https:\/\/github.com\/ruanjue\/bsalign.<\/jats:p>","DOI":"10.1093\/gpbjnl\/qzae025","type":"journal-article","created":{"date-parts":[[2024,3,14]],"date-time":"2024-03-14T10:17:22Z","timestamp":1710411442000},"source":"Crossref","is-referenced-by-count":9,"title":["BSAlign: A Library for Nucleotide Sequence Alignment"],"prefix":"10.1093","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8806-197X","authenticated-orcid":false,"given":"Haojing","family":"Shao","sequence":"first","affiliation":[{"name":"Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences , Shenzhen 518120, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3713-3192","authenticated-orcid":false,"given":"Jue","family":"Ruan","sequence":"additional","affiliation":[{"name":"Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences , Shenzhen 518120, China"}]}],"member":"286","published-online":{"date-parts":[[2024,3,14]]},"reference":[{"key":"2024083002070104500_qzae025-B1","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J Mol Biol"},{"key":"2024083002070104500_qzae025-B2","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J Mol Biol"},{"key":"2024083002070104500_qzae025-B3","first-page":"145","article-title":"Using video-oriented instructions to speed up sequence comparison","volume":"13","author":"Wozniak","year":"1997","journal-title":"Comput Appl Biosci"},{"key":"2024083002070104500_qzae025-B4","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1093\/bioinformatics\/16.8.699","article-title":"Six-fold speed-up of Smith\u2013Waterman sequence database searches using parallel processing on common microprocessors","volume":"16","author":"Rognes","year":"2000","journal-title":"Bioinformatics"},{"key":"2024083002070104500_qzae025-B5","doi-asserted-by":"crossref","first-page":"2306","DOI":"10.1093\/bioinformatics\/bty930","article-title":"BGSA: a bit-parallel global sequence alignment toolkit for multi-core and many-core architectures","volume":"35","author":"Zhang","year":"2019","journal-title":"Bioinformatics"},{"key":"2024083002070104500_qzae025-B6","doi-asserted-by":"crossref","first-page":"3437","DOI":"10.1093\/bioinformatics\/bty380","article-title":"Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading","volume":"34","author":"Rahn","year":"2018","journal-title":"Bioinformatics"},{"key":"2024083002070104500_qzae025-B7","first-page":"1030","article-title":"AnySeq: a high performance sequence alignment library based on partial evaluation","author":"Muller","year":"2020","journal-title":"IEEE International Parallel and Distributed Processing Symposium"},{"key":"2024083002070104500_qzae025-B8","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1093\/bioinformatics\/btl582","article-title":"Striped Smith\u2013Waterman speeds database searches six times over other simd implementations","volume":"23","author":"Farrar","year":"2007","journal-title":"Bioinformatics"},{"key":"2024083002070104500_qzae025-B9","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows\u2013Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2024083002070104500_qzae025-B10","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat Methods"},{"key":"2024083002070104500_qzae025-B11","doi-asserted-by":"crossref","first-page":"e82138","DOI":"10.1371\/journal.pone.0082138","article-title":"SSW library: an SIMD Smith\u2013Waterman C\/V++ library for use in genomic applications","volume":"8","author":"Zhao","year":"2013","journal-title":"PLoS One"},{"key":"2024083002070104500_qzae025-B12","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1186\/s12859-018-2014-8","article-title":"Introducing difference recurrence relations for faster semi-global alignment of long sequences","volume":"19","author":"Suzuki","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2024083002070104500_qzae025-B13","first-page":"481","article-title":"Aligning two sequences within a specified diagonal band","volume":"8","author":"Chao","year":"1992","journal-title":"Comput Appl Biosci"},{"key":"2024083002070104500_qzae025-B14","author":"Suzuki","year":"2017"},{"key":"2024083002070104500_qzae025-B15","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2024083002070104500_qzae025-B16","article-title":"Block aligner: fast and flexible pairwise sequence alignment with SIMD-accelerated adaptive blocks","author":"Liu","year":"2021","journal-title":"bioRxiv"},{"key":"2024083002070104500_qzae025-B17","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1093\/bioinformatics\/btaa777","article-title":"Fast gap-affine pairwise alignment using the wavefront algorithm","volume":"37","author":"Marco-Sola","year":"2021","journal-title":"Bioinformatics"},{"key":"2024083002070104500_qzae025-B18","author":"Daily","year":"2015"},{"key":"2024083002070104500_qzae025-B19","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1186\/s12859-016-0930-z","article-title":"Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments","volume":"17","author":"Daily","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2024083002070104500_qzae025-B20","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1145\/316542.316550","article-title":"A fast bit-vector algorithm for approximate string matching based on dynamic programming","volume":"46","author":"Myers","year":"1999","journal-title":"JACM"},{"key":"2024083002070104500_qzae025-B21","doi-asserted-by":"crossref","first-page":"1394","DOI":"10.1093\/bioinformatics\/btw753","article-title":"Edlib: a C\/C++ library for fast, exact sequence alignment using edit distance","volume":"33","author":"\u0160o\u0161ic","year":"2017","journal-title":"Bioinformatics"},{"key":"2024083002070104500_qzae025-B22","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1093\/bioinformatics\/btaa835","article-title":"PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores","volume":"37","author":"Ono","year":"2021","journal-title":"Bioinformatics"}],"container-title":["Genomics, Proteomics &amp; Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/gpb\/advance-article-pdf\/doi\/10.1093\/gpbjnl\/qzae025\/56970403\/qzae025.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/gpb\/article-pdf\/22\/2\/qzae025\/58967255\/qzae025.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/gpb\/article-pdf\/22\/2\/qzae025\/58967255\/qzae025.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,29]],"date-time":"2024-08-29T22:07:13Z","timestamp":1724969233000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/gpb\/article\/doi\/10.1093\/gpbjnl\/qzae025\/7628627"}},"subtitle":[],"editor":[{"given":"Zhang","family":"Zhang","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,3,14]]},"references-count":22,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,7,3]]}},"URL":"https:\/\/doi.org\/10.1093\/gpbjnl\/qzae025","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.01.15.575791","asserted-by":"object"}]},"ISSN":["1672-0229","2210-3244"],"issn-type":[{"value":"1672-0229","type":"print"},{"value":"2210-3244","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,4]]},"published":{"date-parts":[[2024,3,14]]},"article-number":"qzae025"}}