{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,26]],"date-time":"2025-11-26T16:23:40Z","timestamp":1764174220129},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"11","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Single Molecule Real-Time (SMRT) sequencing has been widely applied in cutting-edge genomic studies. However, it is still an expensive task to align the noisy long SMRT reads to reference genome by state-of-the-art aligners, which is becoming a bottleneck in applications with SMRT sequencing. Novel approach is on demand for improving the efficiency and effectiveness of SMRT read alignment.<\/jats:p>\n               <jats:p>Results: We propose Regional Hashing-based Alignment Tool (rHAT), a seed-and-extension-based read alignment approach specifically designed for noisy long reads. rHAT indexes reference genome by regional hash table (RHT), a hash table-based index which describes the short tokens within local windows of reference genome. In the seeding phase, rHAT utilizes RHT for efficiently calculating the occurrences of short token matches between partial read and local genomic windows to find highly possible candidate sites. In the extension phase, a sparse dynamic programming-based heuristic approach is used for reducing the cost of aligning read to the candidate sites. 
By benchmarking on the real and simulated datasets from various prokaryote and eukaryote genomes, we demonstrated that rHAT can effectively align SMRT reads with outstanding throughput.<\/jats:p>\n               <jats:p>Availability and implementation: rHAT is implemented in C++; the source code is available at https:\/\/github.com\/HIT-Bioinformatics\/rHAT.<\/jats:p>\n               <jats:p>Contact: \u00a0ydwang@hit.edu.cn<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv662","type":"journal-article","created":{"date-parts":[[2015,11,15]],"date-time":"2015-11-15T01:38:21Z","timestamp":1447551501000},"page":"1625-1631","source":"Crossref","is-referenced-by-count":33,"title":["rHAT: fast alignment of noisy long reads with regional hashing"],"prefix":"10.1093","volume":"32","author":[{"given":"Bo","family":"Liu","sequence":"first","affiliation":[{"name":"Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China"}]},{"given":"Dengfeng","family":"Guan","sequence":"additional","affiliation":[{"name":"Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China"}]},{"given":"Mingxiang","family":"Teng","sequence":"additional","affiliation":[{"name":"Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China"}]},{"given":"Yadong","family":"Wang","sequence":"additional","affiliation":[{"name":"Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China"}]}],"member":"286","published-online":{"date-parts":[[2015,11,14]]},"reference":[{"key":"2023020112290999100_btv662-B1","doi-asserted-by":"crossref","first-page":"e41","DOI":"10.1093\/nar\/gkr1246","article-title":"Hobbes: optimized gram-based methods for efficient read alignment","volume":"40","author":"Ahmadi","year":"2012","journal-title":"Nucleic Acids 
Res."},{"key":"2023020112290999100_btv662-B2","doi-asserted-by":"crossref","first-page":"375","DOI":"10.1186\/1471-2164-13-375","article-title":"Pacific biosciences sequencing technology for genotyping and variation discovery in human data","volume":"13","author":"Carneiro","year":"2012","journal-title":"BMC Genomics"},{"key":"2023020112290999100_btv662-B3","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1186\/1471-2105-13-238","article-title":"Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory","volume":"13","author":"Chaisson","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023020112290999100_btv662-B4","doi-asserted-by":"crossref","first-page":"608","DOI":"10.1038\/nature13907","article-title":"Resolving the complexity of the human genome using single-molecule sequencing","volume":"517","author":"Chaisson","year":"2015","journal-title":"Nature"},{"key":"2023020112290999100_btv662-B5","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1038\/nmeth.2474","article-title":"Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data","volume":"10","author":"Chin","year":"2013","journal-title":"Nat. 
Methods"},{"key":"2023020112290999100_btv662-B6","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1093\/bioinformatics\/btr046","article-title":"SHRiMP2: sensitive yet practical short read mapping","volume":"27","author":"David","year":"2011","journal-title":"Bioinformatics"},{"key":"2023020112290999100_btv662-B7","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1126\/science.1162986","article-title":"Real-time DNA sequencing from single polymerase molecules","volume":"323","author":"Eid","year":"2009","journal-title":"Science"},{"key":"2023020112290999100_btv662-B8","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1186\/1471-2105-15-180","article-title":"PBHoney: identifying genomic variants via long-read discordance and interrupted mapping","volume":"15","author":"English","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023020112290999100_btv662-B9","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1145\/146637.146650","article-title":"Sparse dynamic programming I: linear cost functions","volume":"39","author":"Eppstein","year":"1992","journal-title":"J. Assoc. Comput. 
Machinery"},{"key":"2023020112290999100_btv662-B10","doi-asserted-by":"crossref","first-page":"3169","DOI":"10.1093\/bioinformatics\/bts605","article-title":"Tools for mapping high-throughput sequencing data","volume":"28","author":"Fonseca","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020112290999100_btv662-B11","first-page":"390","article-title":"Opportunistic data structures with applications","author":"Ferragina","year":"2000"},{"key":"2023020112290999100_btv662-B12","doi-asserted-by":"crossref","first-page":"688","DOI":"10.1101\/gr.168450.113","article-title":"Reconstructing complex regions of genomes using long-read sequencing technology","volume":"24","author":"Huddleston","year":"2015","journal-title":"Genome Res."},{"key":"2023020112290999100_btv662-B13","first-page":"656","article-title":"BLAT\u2014the BLAST-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023020112290999100_btv662-B14","doi-asserted-by":"crossref","first-page":"487","DOI":"10.1101\/gr.113985.110","article-title":"Adaptive seeds tame genomic sequence comparison","volume":"21","author":"Kie\u0142basa","year":"2011","journal-title":"Genome Res."},{"key":"2023020112290999100_btv662-B15","doi-asserted-by":"crossref","first-page":"693","DOI":"10.1038\/nbt.2280","article-title":"Hybrid error correction and de novo assembly of single-molecule sequencing reads","volume":"30","author":"Koren","year":"2012","journal-title":"Nat. 
Biotechnol."},{"key":"2023020112290999100_btv662-B16","doi-asserted-by":"crossref","first-page":"R101","DOI":"10.1186\/gb-2013-14-9-r101","article-title":"Reducing assembly complexity of microbial genomes with single-molecule sequencing","volume":"14","author":"Koren","year":"2013","journal-title":"Genome Biol."},{"key":"2023020112290999100_btv662-B17","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023020112290999100_btv662-B18","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat Methods"},{"key":"2023020112290999100_btv662-B19","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1093\/bioinformatics\/btp698","article-title":"Fast and accurate long-read alignment with Burrows\u2013Wheeler transform","volume":"26","author":"Li","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020112290999100_btv662-B20","article-title":"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM","author":"Li","year":"2013"},{"key":"2023020112290999100_btv662-B21","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1093\/bioinformatics\/bts649","article-title":"PBSIM: PacBio reads simulator\u2014toward accurate genome assembly","volume":"29","author":"Ono","year":"2013","journal-title":"Bioinformatics"},{"key":"2023020112290999100_btv662-B22","doi-asserted-by":"crossref","first-page":"296","DOI":"10.1089\/cmb.2006.13.296","article-title":"Efficient q-gram filters for finding all epsilon-matches over a given length","volume":"13","author":"Rasmussen","year":"2006","journal-title":"J. Comput. 
Biol."},{"key":"2023020112290999100_btv662-B23","doi-asserted-by":"crossref","first-page":"405","DOI":"10.1186\/gb-2013-14-6-405","article-title":"The advantages of SMRT sequencing","volume":"14","author":"Roberts","year":"2013","journal-title":"Genome Biol."},{"key":"2023020112290999100_btv662-B24","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1038\/nrg3117","article-title":"Repetitive DNA and next-generation sequencing: computational challenges and solutions","volume":"13","author":"Treangen","year":"2011","journal-title":"Nat. Rev. Genet."},{"key":"2023020112290999100_btv662-B25","doi-asserted-by":"crossref","first-page":"2592","DOI":"10.1093\/bioinformatics\/bts505","article-title":"RazerS 3: faster, fully sensitive read mapping","volume":"28","author":"Weese","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020112290999100_btv662-B26","author":"Yanovsky","year":"2014"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/11\/1625\/49019381\/bioinformatics_32_11_1625.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/11\/1625\/49019381\/bioinformatics_32_11_1625.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T22:32:39Z","timestamp":1675290759000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/11\/1625\/1742681"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,11,14]]},"references-count":26,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2016,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv662","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","ty
pe":"print"}],"subject":[],"published-other":{"date-parts":[[2016,6,1]]},"published":{"date-parts":[[2015,11,14]]}}}