{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T00:36:22Z","timestamp":1774312582023,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"22","license":[{"start":{"date-parts":[[2021,6,12]],"date-time":"2021-06-12T00:00:00Z","timestamp":1623456000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["R01 HG009937"],"award-info":[{"award-number":["R01 HG009937"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["CCF-1750472"],"award-info":[{"award-number":["CCF-1750472"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["CNS-1763680"],"award-info":[{"award-number":["CNS-1763680"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,11,18]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Sequence alignment is one of the first steps in many modern genomic analyses, such as variant detection, transcript abundance estimation and metagenomic profiling. Unfortunately, it is often a computationally expensive procedure. As the quantity of data and wealth of different assays and applications continue to grow, the need for accurate and fast alignment tools that scale to large collections of reference sequences persists.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this article, we introduce PuffAligner, a fast, accurate and versatile aligner built on top of the Pufferfish index. PuffAligner is able to produce highly sensitive alignments, similar to those of Bowtie2, but much more quickly. While exhibiting similar speed to the ultrafast STAR aligner, PuffAligner requires considerably less memory to construct its index and align reads. PuffAligner strikes a desirable balance with respect to the time, space and accuracy tradeoffs made by different alignment tools and provides a promising foundation on which to test new alignment ideas over large collections of sequences.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>All the data used for preparing the results of this paper can be found with 10.5281\/zenodo.4902332. PuffAligner is a free and open-source software. It is implemented in C++14 and can be obtained from https:\/\/github.com\/COMBINE-lab\/pufferfish\/tree\/cigar-strings.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab408","type":"journal-article","created":{"date-parts":[[2021,6,11]],"date-time":"2021-06-11T23:22:23Z","timestamp":1623453743000},"page":"4048-4055","source":"Crossref","is-referenced-by-count":26,"title":["PuffAligner: a fast, efficient and accurate aligner based on the Pufferfish index"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0993-3600","authenticated-orcid":false,"given":"Fatemeh","family":"Almodaresi","sequence":"first","affiliation":[{"name":"Computer Science Department, University of Maryland , College Park, MD 20742, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9856-719X","authenticated-orcid":false,"given":"Mohsen","family":"Zakeri","sequence":"additional","affiliation":[{"name":"Computer Science Department, University of Maryland , College Park, MD 20742, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rob","family":"Patro","sequence":"additional","affiliation":[{"name":"Computer Science Department, University of Maryland , College Park, MD 20742, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2021,6,12]]},"reference":[{"key":"2023051607112620700_btab408-B1","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/ng.437","article-title":"Personalized copy number and segmental duplication maps using next-generation sequencing","volume":"41","author":"Alkan","year":"2009","journal-title":"Nat. Genet"},{"key":"2023051607112620700_btab408-B2","author":"Almodaresi","year":"2017"},{"key":"2023051607112620700_btab408-B3","doi-asserted-by":"crossref","first-page":"i169","DOI":"10.1093\/bioinformatics\/bty292","article-title":"A space and time-efficient index for the compacted colored de Bruijn graph","volume":"34","author":"Almodaresi","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B4","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1038\/nbt.3519","article-title":"Near-optimal probabilistic RNA-seq quantification","volume":"34","author":"Bray","year":"2016","journal-title":"Nat. Biotechnol"},{"key":"2023051607112620700_btab408-B5","doi-asserted-by":"crossref","first-page":"i884","DOI":"10.1093\/bioinformatics\/bty560","article-title":"fastp: an ultra-fast all-in-one fastq preprocessor","volume":"34","author":"Chen","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B6","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nature15393","article-title":"A global reference for human genetic variation","volume":"526","author":"Consortium","year":"2015","journal-title":"Nature"},{"key":"2023051607112620700_btab408-B7","doi-asserted-by":"crossref","first-page":"2938","DOI":"10.1093\/bioinformatics\/btx364","article-title":"Upsetr: an r package for the visualization of intersecting sets and their properties","volume":"33","author":"Conway","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B8","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1093\/bioinformatics\/btr046","article-title":"Shrimp2: sensitive yet practical short read mapping","volume":"27","author":"David","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B9","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1093\/bioinformatics\/bts635","article-title":"STAR: ultrafast universal RNA-seq aligner","volume":"29","author":"Dobin","year":"2013","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B10","author":"Fisher","year":"2020"},{"key":"2023051607112620700_btab408-B11","doi-asserted-by":"crossref","first-page":"D766","DOI":"10.1093\/nar\/gky955","article-title":"Gencode reference annotation for the human and mouse genomes","volume":"47","author":"Frankish","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023051607112620700_btab408-B12","doi-asserted-by":"crossref","first-page":"576","DOI":"10.1038\/nmeth0810-576","article-title":"mrsfast: a cache-oblivious algorithm for short-read mapping","volume":"7","author":"Hach","year":"2010","journal-title":"Nat. Methods"},{"key":"2023051607112620700_btab408-B13","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1186\/s12859-018-2319-7","article-title":"Browniealigner: accurate alignment of illumina sequencing data to de Bruijn graphs","volume":"19","author":"Heydari","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023051607112620700_btab408-B14","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1038\/ng.1028","article-title":"De novo assembly and genotyping of variants using colored de Bruijn graphs","volume":"44","author":"Iqbal","year":"2012","journal-title":"Nat. Genet"},{"key":"2023051607112620700_btab408-B15","doi-asserted-by":"crossref","first-page":"766","DOI":"10.1089\/cmb.2018.0036","article-title":"A fast approximate algorithm for mapping long reads to large reference databases","volume":"25","author":"Jain","year":"2018","journal-title":"J. Comput. Biol"},{"key":"2023051607112620700_btab408-B16","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.3317","article-title":"Hisat: a fast spliced aligner with low memory requirements","volume":"12","author":"Kim","year":"2015","journal-title":"Nat. Methods"},{"key":"2023051607112620700_btab408-B17","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1038\/s41587-019-0201-4","article-title":"Graph-based genome alignment and genotyping with hisat2 and hisat-genotype","volume":"37","author":"Kim","year":"2019","journal-title":"Nat. Biotechnol"},{"key":"2023051607112620700_btab408-B18","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. Methods"},{"key":"2023051607112620700_btab408-B19","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B20","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows\u2013Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B21","doi-asserted-by":"crossref","first-page":"e108","DOI":"10.1093\/nar\/gkt214","article-title":"The subread aligner: fast, accurate and scalable read mapping by seed-and-vote","volume":"41","author":"Liao","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023051607112620700_btab408-B22","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1186\/s12859-016-1103-9","article-title":"Read mapping on de Bruijn graphs","volume":"17","author":"Limasset","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023051607112620700_btab408-B23","doi-asserted-by":"crossref","first-page":"3224","DOI":"10.1093\/bioinformatics\/btw371","article-title":"debga: read alignment with de Bruijn graph-based seed and extension","volume":"32","author":"Liu","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B24","doi-asserted-by":"crossref","first-page":"e104","DOI":"10.7717\/peerj-cs.104","article-title":"Bracken: estimating species abundance in metagenomics data","volume":"3","author":"Lu","year":"2017","journal-title":"PeerJ Comput. Sci"},{"key":"2023051607112620700_btab408-B25","doi-asserted-by":"crossref","first-page":"3181","DOI":"10.1093\/bioinformatics\/btx067","article-title":"Succinct colored de Bruijn graphs","volume":"33","author":"Muggli","year":"2017","journal-title":"Bioinformatics"},{"key":"2023051607112620700_btab408-B26","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1016\/j.cels.2018.05.021","article-title":"Mantis: a fast, small, and exact large-scale sequence-search index","volume":"7","author":"Pandey","year":"2018","journal-title":"Cell Syst"},{"key":"2023051607112620700_btab408-B27","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1038\/nmeth.4197","article-title":"Salmon provides fast and bias-aware quantification of transcript expression","volume":"14","author":"Patro","year":"2017","journal-title":"Nat. Methods"},{"key":"2023051607112620700_btab408-B28","doi-asserted-by":"crossref","first-page":"e1006096","DOI":"10.1371\/journal.pcbi.1006096","article-title":"Using pseudoalignment and base quality to accurately quantify microbial community composition","volume":"14","author":"Reppell","year":"2018","journal-title":"PLoS Comput. Biol"},{"key":"2023051607112620700_btab408-B29","first-page":"27","author":"Sarkar","year":"2018"},{"key":"2023051607112620700_btab408-B30","doi-asserted-by":"crossref","first-page":"849","DOI":"10.1101\/gr.213611.116","article-title":"Evaluation of GRCH38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly","volume":"27","author":"Schneider","year":"2017","journal-title":"Genome Res"},{"key":"2023051607112620700_btab408-B31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-020-02151-8","article-title":"Alignment and mapping methodology influence transcript abundance estimation","volume":"21","author":"Srivastava","year":"2020","journal-title":"Genome Biol"},{"key":"2023051607112620700_btab408-B32","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1186\/s12859-018-2014-8","article-title":"Introducing difference recurrence relations for faster semi-global alignment of long sequences","volume":"19","author":"Suzuki","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023051607112620700_btab408-B33","author":"Vuong","year":"2018"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab408\/38711374\/btab408.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/22\/4048\/50335370\/btab408.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/22\/4048\/50335370\/btab408.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T03:19:18Z","timestamp":1684207158000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/22\/4048\/6297388"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[{"name":"Computer Science Department, University of Maryland , College Park, MD 20742, USA"}],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,6,12]]},"references-count":33,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2021,11,18]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab408","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.08.11.246892","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,11,15]]},"published":{"date-parts":[[2021,6,12]]}}}