{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T23:16:15Z","timestamp":1776122175502,"version":"3.50.1"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"21","license":[{"start":{"date-parts":[[2017,5,31]],"date-time":"2017-05-31T00:00:00Z","timestamp":1496188800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["R01 HG006004"],"award-info":[{"award-number":["R01 HG006004"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004963","name":"Seventh Framework Programme","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004963","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004410","name":"Scientific and Technological Research Council of Turkey","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004410","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments -called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and \u2018candidate\u2019 locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper\u2019s execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average\u2009&amp;gt;96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/BilkentCompGen\/GateKeeper<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx342","type":"journal-article","created":{"date-parts":[[2017,5,30]],"date-time":"2017-05-30T08:22:29Z","timestamp":1496132549000},"page":"3355-3363","source":"Crossref","is-referenced-by-count":80,"title":["GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping"],"prefix":"10.1093","volume":"33","author":[{"given":"Mohammed","family":"Alser","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, Bilkent University, Bilkent, Ankara, Turkey"}]},{"given":"Hasan","family":"Hassan","sequence":"additional","affiliation":[{"name":"TOBB University of Economics & Technology, Sogutozu, Ankara, Turkey"},{"name":"Department of Computer Science, ETH Z\u00fcrich, Z\u00fcrich, Switzerland"}]},{"given":"Hongyi","family":"Xin","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA"}]},{"given":"O\u011fuz","family":"Ergin","sequence":"additional","affiliation":[{"name":"TOBB University of Economics & Technology, Sogutozu, Ankara, Turkey"}]},{"given":"Onur","family":"Mutlu","sequence":"additional","affiliation":[{"name":"Department of Computer Science, ETH Z\u00fcrich, Z\u00fcrich, Switzerland"}]},{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Bilkent University, Bilkent, Ankara, Turkey"}]}],"member":"286","published-online":{"date-parts":[[2017,5,31]]},"reference":[{"key":"2023051506254170400_btx342-B1","doi-asserted-by":"crossref","first-page":"e41\u2013e41.","DOI":"10.1093\/nar\/gkr1246","article-title":"Hobbes: optimized gram-based methods for efficient read alignment","volume":"40","author":"Ahmadi","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023051506254170400_btx342-B2","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/ng.437","article-title":"Personalized copy number and segmental duplication maps using next-generation sequencing","volume":"41","author":"Alkan","year":"2009","journal-title":"Nature Genet"},{"key":"2023051506254170400_btx342-B3","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1109\/MDAT.2013.2293757","article-title":"A review of hardware acceleration for computational genomics","volume":"31","author":"Aluru","year":"2014","journal-title":"Des. Test IEEE"},{"key":"2023051506254170400_btx342-B4","author":"Arram","year":"2013"},{"key":"2023051506254170400_btx342-B5","author":"Canzar","year":"2015"},{"key":"2023051506254170400_btx342-B6","doi-asserted-by":"crossref","first-page":"192","DOI":"10.1186\/s12859-015-0626-9","article-title":"BitMapper: an efficient all-mapper based on bit-vector computing","volume":"16","author":"Cheng","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2023051506254170400_btx342-B7","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature11632","article-title":"An integrated map of genetic variation from 1,092 human genomes","volume":"491","author":"Consortium","year":"2012","journal-title":"Nature"},{"key":"2023051506254170400_btx342-B8","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1093\/bioinformatics\/btr046","article-title":"SHRiMP2: sensitive yet practical short read mapping","volume":"27","author":"David","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051506254170400_btx342-B9","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1038\/nmeth0810-576","article-title":"mrsFAST: a cache-oblivious algorithm for short-read mapping","volume":"7","author":"Hach","year":"2010","journal-title":"Nat. Methods"},{"key":"2023051506254170400_btx342-B10","author":"Hach","year":"2014"},{"key":"2023051506254170400_btx342-B11","doi-asserted-by":"crossref","first-page":"184.","DOI":"10.1186\/1471-2105-14-184","article-title":"Benchmarking short sequence mapping tools","volume":"14","author":"Hatem","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023051506254170400_btx342-B12","doi-asserted-by":"crossref","first-page":"50.","DOI":"10.1109\/MC.2007.79","article-title":"Achieving high performance with FPGA-based computing","volume":"40","author":"Herbordt","year":"2007","journal-title":"Computer"},{"key":"2023051506254170400_btx342-B13","doi-asserted-by":"crossref","first-page":"e7767.","DOI":"10.1371\/journal.pone.0007767","article-title":"BFAST: an alignment tool for large scale genome resequencing","volume":"4","author":"Homer","year":"2009","journal-title":"PloS One"},{"key":"2023051506254170400_btx342-B14","author":"Houtgast","year":"2015"},{"key":"2023051506254170400_btx342-B15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2815631","article-title":"RIFFA 2.1: a reusable integration framework for FPGA accelerators","volume":"8","author":"Jacobsen","year":"2015","journal-title":"ACM Trans. Reconfigurable Technol. Syst"},{"key":"2023051506254170400_btx342-B16","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. Methods"},{"key":"2023051506254170400_btx342-B17","doi-asserted-by":"crossref","first-page":"R25.","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol"},{"key":"2023051506254170400_btx342-B18","first-page":"707","article-title":"Binary codes capable of correcting deletions, insertions, and reversals","volume":"10","author":"Levenshtein","year":"1966","journal-title":"Soviet Phys. Doklady"},{"key":"2023051506254170400_btx342-B19","author":"Li","year":"2013"},{"key":"2023051506254170400_btx342-B20","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows\u2013Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023051506254170400_btx342-B21","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1142\/S0219720004000661","article-title":"PatternHunter II: highly sensitive and fast homology search","volume":"2","author":"Li","year":"2004","journal-title":"J. Bioinf. Comput. Biol"},{"key":"2023051506254170400_btx342-B22","author":"Lindner","year":"2016"},{"key":"2023051506254170400_btx342-B23","doi-asserted-by":"crossref","first-page":"878","DOI":"10.1093\/bioinformatics\/bts061","article-title":"SOAP3: ultra-fast GPU-based parallel alignment tool for short reads","volume":"28","author":"Liu","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051506254170400_btx342-B24","first-page":"e65632-e65632.","article-title":"SOAP3-dp: fast, accurate and sensitive GPU-based short read aligner","volume":"8","author":"Luo","year":"2013","journal-title":"PloS One"},{"key":"2023051506254170400_btx342-B25","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1038\/nmeth.2221","article-title":"The GEM mapper: fast, accurate and versatile alignment by filtration","volume":"9","author":"Marco-Sola","year":"2012","journal-title":"Nat. Methods"},{"key":"2023051506254170400_btx342-B26","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol"},{"key":"2023051506254170400_btx342-B27","author":"Olson","year":"2012"},{"key":"2023051506254170400_btx342-B28","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1089\/cmb.2006.13.296","article-title":"Efficient q-gram filters for finding all \u03b5-matches over a given length","volume":"13","author":"Rasmussen","year":"2006","journal-title":"J. Comput. Biol"},{"key":"2023051506254170400_btx342-B29","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol"},{"key":"2023051506254170400_btx342-B30","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/JPROC.2015.2392104","article-title":"Three ages of FPGAs: a retrospective on the first thirty years of FPGA technology","volume":"103","author":"Trimberger","year":"2015","journal-title":"Proc. IEEE"},{"key":"2023051506254170400_btx342-B31","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/S0019-9958(85)80046-2","article-title":"Algorithms for approximate string matching","volume":"64","author":"Ukkonen","year":"1985","journal-title":"Inf. Control"},{"key":"2023051506254170400_btx342-B32","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/0304-3975(92)90143-4","article-title":"Approximate string-matching with q-grams and maximal matches","volume":"92","author":"Ukkonen","year":"1992","journal-title":"Theor. Comput. Sci"},{"key":"2023051506254170400_btx342-B33","author":"Waidyasooriya","year":"2014"},{"key":"2023051506254170400_btx342-B34","doi-asserted-by":"crossref","first-page":"1646","DOI":"10.1101\/gr.088823.108","article-title":"RazerS\u2014fast read mapping with sensitivity control","volume":"19","author":"Weese","year":"2009","journal-title":"Genome Res"},{"key":"2023051506254170400_btx342-B35","doi-asserted-by":"crossref","first-page":"2592","DOI":"10.1093\/bioinformatics\/bts505","article-title":"RazerS 3: faster, fully sensitive read mapping","volume":"28","author":"Weese","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051506254170400_btx342-B36","author":"Xilinx","year":"2014"},{"key":"2023051506254170400_btx342-B37","author":"Xilinx","year":"2015"},{"key":"2023051506254170400_btx342-B38","doi-asserted-by":"crossref","first-page":"1553","DOI":"10.1093\/bioinformatics\/btu856","article-title":"Shifted hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping","volume":"31","author":"Xin","year":"2015","journal-title":"Bioinformatics"},{"key":"2023051506254170400_btx342-B39","doi-asserted-by":"crossref","first-page":"S13.","DOI":"10.1186\/1471-2164-14-S1-S13","article-title":"Accelerating read mapping with FastHASH","volume":"14","author":"Xin","year":"2013","journal-title":"BMC Genomics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/21\/3355\/50315285\/bioinformatics_33_21_3355.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/21\/3355\/50315285\/bioinformatics_33_21_3355.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T06:26:18Z","timestamp":1684131978000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/21\/3355\/3859176"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,5,31]]},"references-count":39,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2017,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx342","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,11,1]]},"published":{"date-parts":[[2017,5,31]]}}}