{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T15:44:32Z","timestamp":1772725472278,"version":"3.50.1"},"reference-count":53,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2023,3,24]],"date-time":"2023-03-24T00:00:00Z","timestamp":1679616000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000028","name":"Semiconductor Research Corporation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000028","id-type":"DOI","asserted-by":"publisher"}]},{"name":"ETH Future Computing Laboratory"},{"name":"BioPIM"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,5,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Pairwise sequence alignment is a very time-consuming step in common bioinformatics pipelines. Speeding up this step requires heuristics, efficient implementations, and\/or hardware acceleration. A promising candidate for all of the above is the recently proposed GenASM algorithm. We identify and address three inefficiencies in the GenASM algorithm: it has a high amount of data movement, a large memory footprint, and does some unnecessary work.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose Scrooge, a fast and memory-frugal genomic sequence aligner. Scrooge includes three novel algorithmic improvements which reduce the data movement, memory footprint, and the number of operations in the GenASM algorithm. We provide efficient open-source implementations of the Scrooge algorithm for CPUs and GPUs, which demonstrate the significant benefits of our algorithmic improvements. For long reads, the CPU version of Scrooge achieves a 20.1\u00d7, 1.7\u00d7, and 2.1\u00d7 speedup over KSW2, Edlib, and a CPU implementation of GenASM, respectively. The GPU version of Scrooge achieves a 4.0\u00d7, 80.4\u00d7, 6.8\u00d7, 12.6\u00d7, and 5.9\u00d7 speedup over the CPU version of Scrooge, KSW2, Edlib, Darwin-GPU, and a GPU implementation of GenASM, respectively. We estimate an ASIC implementation of Scrooge to use 3.6\u00d7 less chip area and 2.1\u00d7 less power than a GenASM ASIC while maintaining the same throughput. Further, we systematically analyze the throughput and accuracy behavior of GenASM and Scrooge under various configurations. As the best configuration of Scrooge depends on the computing platform, we make several observations that can help guide future implementations of Scrooge.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/CMU-SAFARI\/Scrooge.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad151","type":"journal-article","created":{"date-parts":[[2023,3,24]],"date-time":"2023-03-24T14:06:47Z","timestamp":1679666807000},"source":"Crossref","is-referenced-by-count":15,"title":["Scrooge: a fast and memory-frugal genomic sequence aligner for CPUs, GPUs, and ASICs"],"prefix":"10.1093","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2581-8637","authenticated-orcid":false,"given":"Jo\u00ebl","family":"Lindegger","sequence":"first","affiliation":[{"name":"Department of Information Technology and Electrical Engineering, ETH Zurich , Zurich 8006, Switzerland"}]},{"given":"Damla","family":"Senol Cali","sequence":"additional","affiliation":[{"name":"Bionano Genomics , San Diego, CA 92121, United States"}]},{"given":"Mohammed","family":"Alser","sequence":"additional","affiliation":[{"name":"Department of Information Technology and Electrical Engineering, ETH Zurich , Zurich 8006, Switzerland"}]},{"given":"Juan","family":"G\u00f3mez-Luna","sequence":"additional","affiliation":[{"name":"Department of Information Technology and Electrical Engineering, ETH Zurich , Zurich 8006, Switzerland"}]},{"given":"Nika Mansouri","family":"Ghiasi","sequence":"additional","affiliation":[{"name":"Department of Information Technology and Electrical Engineering, ETH Zurich , Zurich 8006, Switzerland"}]},{"given":"Onur","family":"Mutlu","sequence":"additional","affiliation":[{"name":"Department of Information Technology and Electrical Engineering, ETH Zurich , Zurich 8006, Switzerland"}]}],"member":"286","published-online":{"date-parts":[[2023,3,24]]},"reference":[{"key":"2023051719353249600_btad151-B1","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1186\/s12859-019-3086-9","article-title":"GASAL2: a GPU accelerated sequence alignment library for high-throughput NGS data","volume":"20","author":"Ahmed","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023051719353249600_btad151-B2","doi-asserted-by":"crossref","first-page":"388","DOI":"10.1186\/s12859-020-03685-1","article-title":"GPU acceleration of Darwin read overlapper for de novo assembly of long DNA reads","volume":"21","author":"Ahmed","year":"2020","journal-title":"BMC Bioinformatics"},{"key":"2023051719353249600_btad151-B3","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1109\/MM.2020.3013728","article-title":"Accelerating genome analysis: a primer on an ongoing journey","volume":"40","author":"Alser","year":"2020","journal-title":"IEEE Micro"},{"key":"2023051719353249600_btad151-B4","doi-asserted-by":"crossref","first-page":"5282","DOI":"10.1093\/bioinformatics\/btaa1015","article-title":"SneakySnake: a fast and accurate universal genome pre-alignment filter for CPUs, GPUs and FPGAs","volume":"36","author":"Alser","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051719353249600_btad151-B5","doi-asserted-by":"crossref","first-page":"4579","DOI":"10.1016\/j.csbj.2022.08.019","article-title":"From molecules to genomic variations: accelerating genome analysis via intelligent algorithms and architectures","volume":"20","author":"Alser","year":"2022","journal-title":"Comput Struct Biotechnol J"},{"key":"2023051719353249600_btad151-B6","doi-asserted-by":"crossref","first-page":"4255","DOI":"10.1093\/bioinformatics\/btz234","article-title":"Shouji: a fast and efficient pre-alignment filter for sequence alignment","volume":"35","author":"Alser","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051719353249600_btad151-B7","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1186\/s13059-021-02443-7","article-title":"Technology dictates algorithms: recent developments in read alignment","volume":"22","author":"Alser","year":"2021","journal-title":"Genome Biol"},{"key":"2023051719353249600_btad151-B8","doi-asserted-by":"crossref","first-page":"406","DOI":"10.1186\/s12859-020-03720-1","article-title":"ADEPT: a domain independent sequence alignment strategy for GPU architectures","volume":"21","author":"Awan","year":"2020","journal-title":"BMC Bioinformatics"},{"key":"2023051719353249600_btad151-B9","first-page":"51","article-title":"Edit distance cannot be computed in strongly subquadratic time (unless SETH is false)","volume-title":"STOC","author":"Backurs","year":"2015"},{"key":"2023051719353249600_btad151-B10","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1145\/135239.135243","article-title":"A new approach to text searching","volume":"35","author":"Baeza-Yates","year":"1992","journal-title":"Commun ACM"},{"key":"2023051719353249600_btad151-B11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3085572","article-title":"CACTI 7: new tools for interconnect exploration in innovative off-chip memories","volume":"14","author":"Balasubramonian","year":"2017","journal-title":"ACM Trans Archit Code Optim"},{"key":"2023051719353249600_btad151-B12","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1109\/TVLSI.2008.2005314","article-title":"A highly parameterized and efficient FPGA-based skeleton for pairwise biological sequence alignment","volume":"17","author":"Benkrid","year":"2009","journal-title":"IEEE Trans VLSI Syst"},{"key":"2023051719353249600_btad151-B13","first-page":"316","volume-title":"ASPLOS","author":"Boroumand","year":"2018."},{"key":"2023051719353249600_btad151-B14","first-page":"159","author":"Boroumand","year":"2021"},{"key":"2023051719353249600_btad151-B15","doi-asserted-by":"crossref","first-page":"e53","DOI":"10.1093\/nar\/gkac039","article-title":"Dysgu: efficient structural variant calling using short or long reads","volume":"50","author":"Cleal","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2023051719353249600_btad151-B16","doi-asserted-by":"crossref","first-page":"2838","DOI":"10.1109\/TPDS.2016.2515597","article-title":"CUDAlign 4.0: incremental speculative traceback for exact chromosome-wide alignment in GPU clusters","volume":"27","author":"de Oliveira Sandes","year":"2016","journal-title":"IEEE Trans Parallel Distrib Syst"},{"key":"2023051719353249600_btad151-B17","volume-title":"A Christmas Carol","author":"Dickens","year":"1843"},{"key":"2023051719353249600_btad151-B18","author":"Eizenga","year":"2022"},{"key":"2023051719353249600_btad151-B19","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1007\/s12539-017-0225-8","article-title":"FPGASW: accelerating large-scale Smith\u2013Waterman sequence alignment application with backtracking on FPGA linear systolic array","volume":"10","author":"Fei","year":"2018","journal-title":"Interdiscip Sci"},{"key":"2023051719353249600_btad151-B20","author":"Fog","year":"2021"},{"key":"2023051719353249600_btad151-B21","first-page":"69","article-title":"GenAx: a genome sequencing accelerator","volume-title":"ISCA","author":"Fujiki","year":"2018"},{"key":"2023051719353249600_btad151-B22","first-page":"937","article-title":"SeedEx: a genome sequencing accelerator for optimal alignments in subminimal space","volume-title":"MICRO","author":"Fujiki","year":"2020"},{"key":"2023051719353249600_btad151-B23","doi-asserted-by":"crossref","first-page":"705","DOI":"10.1016\/0022-2836(82)90398-9","article-title":"An improved algorithm for matching biological sequences","volume":"162","author":"Gotoh","year":"1982","journal-title":"J Mol Biol"},{"key":"2023051719353249600_btad151-B24","first-page":"535","article-title":"Using FPGAs to accelerate Myers bit-vector algorithm","volume":"57","author":"Hoffmann","year":"2016","journal-title":"MEDICON"},{"key":"2023051719353249600_btad151-B25","first-page":"29","article-title":"A bit-vector algorithm for computing Levenshtein and Damerau edit distances","volume":"10","author":"Hyyr\u00f6","year":"2003","journal-title":"Nord J Comput"},{"key":"2023051719353249600_btad151-B26","doi-asserted-by":"crossref","first-page":"367","DOI":"10.1006\/jcss.2000.1727","article-title":"On the complexity of k-SAT","volume":"62","author":"Impagliazzo","year":"2001","journal-title":"J Comput Syst Sci"},{"key":"2023051719353249600_btad151-B27","author":"Intel","year":"2017"},{"key":"2023051719353249600_btad151-B28","first-page":"707","article-title":"Binary codes capable of correcting deletions, insertions, and reversals","volume":"10","author":"Levenshtein","year":"1966","journal-title":"Soviet Physics Doklady"},{"key":"2023051719353249600_btad151-B29","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051719353249600_btad151-B30","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1093\/bfgp\/elr035","article-title":"Comparison of the two major classes of assembly algorithms: overlap\u2013layout\u2013consensus and De-Bruijn-graph","volume":"11","author":"Li","year":"2011","journal-title":"Brief Funct Genomics"},{"key":"2023051719353249600_btad151-B31","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1109\/MM.2008.31","article-title":"NVIDIA tesla: a unified graphics and computing architecture","volume":"28","author":"Lindholm","year":"2008","journal-title":"IEEE Micro"},{"key":"2023051719353249600_btad151-B32","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1186\/1471-2105-14-117","article-title":"CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions","volume":"14","author":"Liu","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023051719353249600_btad151-B33","first-page":"635","article-title":"GenStore: a high-performance in-storage processing system for genome sequence analysis","volume-title":"ASPLOS","author":"Mansouri Ghiasi","year":"2022"},{"key":"2023051719353249600_btad151-B34","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1093\/bioinformatics\/btaa777","article-title":"Fast gap-affine pairwise alignment using the wavefront algorithm","volume":"37","author":"Marco-Sola","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051719353249600_btad151-B35","first-page":"1","article-title":"Hyper-threading technology architecture and microarchitecture","volume":"6","author":"Marr","year":"2002","journal-title":"Intel Technol J"},{"key":"2023051719353249600_btad151-B36","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1145\/316542.316550","article-title":"A fast bit-vector algorithm for approximate string matching based on dynamic programming","volume":"46","author":"Myers","year":"1999","journal-title":"J ACM"},{"key":"2023051719353249600_btad151-B37","author":"NVIDIA","year":"2020"},{"key":"2023051719353249600_btad151-B38","author":"NVIDIA","year":"2023"},{"key":"2023051719353249600_btad151-B39","first-page":"76","article-title":"Applying the roofline model","volume-title":"ISPASS","author":"Ofenbeck","year":"2014"},{"key":"2023051719353249600_btad151-B40","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1093\/bioinformatics\/btaa835","article-title":"PBSIM2: a simulator for long-read sequencers with a novel generative model of quality scores","volume":"37","author":"Ono","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051719353249600_btad151-B41","first-page":"951","article-title":"GenASM: a high-performance, low-power approximate string matching acceleration framework for genome sequence analysis","volume-title":"MICRO","author":"Senol Cali","year":"2020"},{"key":"2023051719353249600_btad151-B42","first-page":"638","article-title":"SeGraM: a universal hardware accelerator for genomic sequence-to-graph and sequence-to-sequence mapping","volume-title":"ISCA","author":"Senol Cali","year":"2022"},{"key":"2023051719353249600_btad151-B43","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1109\/MM.2021.3088396","article-title":"FPGA-based near-memory acceleration of modern data-intensive applications","volume":"41","author":"Singh","year":"2021","journal-title":"IEEE Micro"},{"key":"2023051719353249600_btad151-B44","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J Mol Biol"},{"key":"2023051719353249600_btad151-B45","doi-asserted-by":"crossref","first-page":"1394","DOI":"10.1093\/bioinformatics\/btw753","article-title":"Edlib: a C\/C++ library for fast, exact sequence alignment using edit distance","volume":"33","author":"\u0160o\u0161i\u0107","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051719353249600_btad151-B46","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1186\/s12859-018-2014-8","article-title":"Introducing difference recurrence relations for faster semi-global alignment of long sequences","volume":"19","author":"Suzuki","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023051719353249600_btad151-B47","first-page":"199","article-title":"Darwin: a genomics co-processor provides up to 15,000\u00d7 acceleration on long read assembly","volume":"53","author":"Turakhia","year":"2018","journal-title":"ASPLOS"},{"key":"2023051719353249600_btad151-B48","first-page":"359","author":"Turakhia","year":"2019"},{"key":"2023051719353249600_btad151-B49","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/S0019-9958(85)80046-2","article-title":"Algorithms for approximate string matching","volume":"64","author":"Ukkonen","year":"1985","journal-title":"Inf Control"},{"key":"2023051719353249600_btad151-B50","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1145\/1498765.1498785","article-title":"Roofline: an insightful visual performance model for multicore architectures","volume":"52","author":"Williams","year":"2009","journal-title":"Commun ACM"},{"key":"2023051719353249600_btad151-B51","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1145\/135239.135244","article-title":"Fast text searching: allowing errors","volume":"35","author":"Wu","year":"1992","journal-title":"Commun ACM"},{"key":"2023051719353249600_btad151-B52","doi-asserted-by":"crossref","first-page":"1553","DOI":"10.1093\/bioinformatics\/btu856","article-title":"Shifted hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping","volume":"31","author":"Xin","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051719353249600_btad151-B53","doi-asserted-by":"crossref","first-page":"S13","DOI":"10.1186\/1471-2164-14-S1-S13","article-title":"Accelerating read mapping with FastHASH","volume":"14","author":"Xin","year":"2013","journal-title":"BMC Genomics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad151\/49628103\/btad151.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/5\/btad151\/50378945\/btad151.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/5\/btad151\/50378945\/btad151.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T19:36:29Z","timestamp":1684352189000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad151\/7085594"}},"subtitle":[],"editor":[{"given":"Peter","family":"Robinson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,3,24]]},"references-count":53,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,5,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad151","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,5,1]]},"published":{"date-parts":[[2023,3,24]]},"article-number":"btad151"}}