{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T13:37:33Z","timestamp":1760708253232},"reference-count":20,"publisher":"Oxford University Press (OUP)","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,1,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Recently, a number of programs have been proposed for mapping short reads to a reference genome. Many of them are heavily optimized for short-read mapping and hence are very efficient for shorter queries, but that makes them inefficient or not applicable for reads longer than 200 bp. However, many sequencers are already generating longer reads and more are expected to follow. For long read sequence mapping, there are limited options; BLAT, SSAHA2, FANGS and BWA-SW are among the popular ones. However, resequencing and personalized medicine need much faster software to map these long sequencing reads to a reference genome to identify SNPs or rare transcripts.<\/jats:p>\n               <jats:p>Results: We present AGILE (AliGnIng Long rEads), a hash table based high-throughput sequence mapping algorithm for longer 454 reads that uses diagonal multiple seed-match criteria, customized q-gram filtering and a dynamic incremental search approach among other heuristics to optimize every step of the mapping process. In our experiments, we observe that AGILE is more accurate than BLAT, and comparable to BWA-SW and SSAHA2. For practical error rates (&lt; 5%) and read lengths (200\u20131000 bp), AGILE is significantly faster than BLAT, SSAHA2 and BWA-SW. 
Even for the other cases, AGILE is comparable to BWA-SW and several times faster than BLAT and SSAHA2.<\/jats:p>\n               <jats:p>Availability: \u00a0http:\/\/www.ece.northwestern.edu\/~smi539\/agile.html.<\/jats:p>\n               <jats:p>Contact: \u00a0smi539@eecs.northwestern.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq648","type":"journal-article","created":{"date-parts":[[2010,11,19]],"date-time":"2010-11-19T03:58:31Z","timestamp":1290139111000},"page":"189-195","source":"Crossref","is-referenced-by-count":22,"title":["Anatomy of a hash-based long read sequence mapping algorithm for next generation DNA sequencing"],"prefix":"10.1093","volume":"27","author":[{"given":"Sanchit","family":"Misra","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, USA"}]},{"given":"Ankit","family":"Agrawal","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, USA"}]},{"given":"Wei-keng","family":"Liao","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, USA"}]},{"given":"Alok","family":"Choudhary","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, Northwestern University, 2145 Sheridan Road, Evanston, IL 60208, USA"}]}],"member":"286","published-online":{"date-parts":[[2010,11,18]]},"reference":[{"key":"2023012512170977800_B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. 
Biol."},{"key":"2023012512170977800_B2","doi-asserted-by":"crossref","first-page":"967","DOI":"10.1093\/bioinformatics\/btp087","article-title":"Pass: a program to align short sequences","volume":"25","author":"Campagna","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012512170977800_B3","first-page":"656","article-title":"Blat\u2013the blast-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023012512170977800_B4","doi-asserted-by":"crossref","first-page":"R25+","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short dna sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023012512170977800_B5","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1093\/bioinformatics\/btp698","article-title":"Fast and accurate long read alignment with burrows-wheeler transform","volume":"26","author":"Li","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512170977800_B6","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short dna sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res."},{"key":"2023012512170977800_B7","doi-asserted-by":"crossref","first-page":"713","DOI":"10.1093\/bioinformatics\/btn025","article-title":"Soap: short oligonucleotide alignment program","volume":"24","author":"Li","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012512170977800_B8","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1056\/NEJMoa0908094","article-title":"Whole-genome sequencing in a patient with charcot-marie-tooth neuropathy","volume":"362","author":"Lupski","year":"2010","journal-title":"N. Engl. J. 
Med."},{"key":"2023012512170977800_B9","article-title":"Fangs: high speed sequence mapping for next generation sequencers","volume-title":"Proceedings of ACM Symposium of Applied Computing (ACM SAC)","author":"Misra","year":"2009"},{"key":"2023012512170977800_B10","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol."},{"key":"2023012512170977800_B11","doi-asserted-by":"crossref","first-page":"1725","DOI":"10.1101\/gr.194201","article-title":"Ssaha: a fast search method for large dna databases","volume":"11","author":"Ning","year":"2001","journal-title":"Genome Res."},{"key":"2023012512170977800_B12","first-page":"191","article-title":"454 life sciences: illuminating the future of genome sequencing and personalized medicine","volume":"80","author":"Patrick","year":"2007","journal-title":"Yale J. Biol. Med."},{"key":"2023012512170977800_B13","doi-asserted-by":"crossref","first-page":"2444","DOI":"10.1073\/pnas.85.8.2444","article-title":"Improved tools for biological sequence comparison","volume":"85","author":"Pearson","year":"1988","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012512170977800_B14","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1007\/BF01188584","article-title":"Multiple filtration and approximate pattern matching","volume":"13","author":"Pevzner","year":"1995","journal-title":"Algorithmica"},{"key":"2023012512170977800_B15","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1089\/cmb.2006.13.296","article-title":"Efficient q-gram filters for finding all epsilon-matches over a given length","volume":"13","author":"Rasmussen","year":"2006","journal-title":"J. Comput. 
Biol."},{"key":"2023012512170977800_B16","doi-asserted-by":"crossref","first-page":"636","DOI":"10.1126\/science.1186802","article-title":"Analysis of genetic inheritance in a family quartet by whole-genome sequencing","volume":"328","author":"Roach","year":"2010","journal-title":"Science"},{"key":"2023012512170977800_B17","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1038\/nbt1485","article-title":"The development and impact of 454 sequencing","volume":"26","author":"Rothberg","year":"2008","journal-title":"Nat. Biotechnol."},{"key":"2023012512170977800_B18","doi-asserted-by":"crossref","first-page":"e1000386","DOI":"10.1371\/journal.pcbi.1000386","article-title":"Shrimp: accurate mapping of short color-space reads","volume":"5","author":"Rumble","year":"2009","journal-title":"PLoS Comput. Biol."},{"key":"2023012512170977800_B19","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. 
Biol."},{"key":"2023012512170977800_B20","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1186\/1471-2105-9-128","article-title":"Using quality scores and longer reads improves accuracy of solexa read mapping","volume":"9","author":"Smith","year":"2008","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/2\/189\/48866960\/bioinformatics_27_2_189.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/2\/189\/48866960\/bioinformatics_27_2_189.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T14:49:46Z","timestamp":1674658186000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/2\/189\/286308"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,11,18]]},"references-count":20,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2011,1,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq648","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,1,15]]},"published":{"date-parts":[[2010,11,18]]}}}