{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T20:19:33Z","timestamp":1780517973076,"version":"3.54.1"},"reference-count":10,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1801,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Summary: With the wide application of next-generation sequencing (NGS) techniques, fast tools for protein similarity search that scale well to large query datasets and large databases are highly desirable. In a previous work, we developed RAPSearch, an algorithm that achieved a ~20\u201390-fold speedup relative to BLAST while still achieving similar levels of sensitivity for short protein fragments derived from NGS data. RAPSearch, however, requires a substantial memory footprint to identify alignment seeds, due to its use of a suffix array data structure. Here we present RAPSearch2, a new memory-efficient implementation of the RAPSearch algorithm that uses a collision-free hash table to index a similarity search database. The utilization of an optimized data structure further speeds up the similarity search\u2014another 2\u20133 times. We also implemented multi-threading in RAPSearch2, and the multi-thread modes achieve significant acceleration (e.g. 3.5X for 4-thread mode). RAPSearch2 requires up to 2G memory when running in single thread mode, or up to 3.5G memory when running in 4-thread mode.<\/jats:p>\n               <jats:p>Availability and implementation: Implemented in C++, the source code is freely available for download at the RAPSearch2 website: http:\/\/omics.informatics.indiana.edu\/mg\/RAPSearch2\/.<\/jats:p>\n               <jats:p>Contact: \u00a0yye@indiana.edu<\/jats:p>\n               <jats:p>Supplementary information: Available at the RAPSearch2 website.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr595","type":"journal-article","created":{"date-parts":[[2011,10,29]],"date-time":"2011-10-29T02:04:25Z","timestamp":1319853865000},"page":"125-126","source":"Crossref","is-referenced-by-count":368,"title":["RAPSearch2: a fast and memory-efficient protein similarity search tool for next-generation sequencing data"],"prefix":"10.1093","volume":"28","author":[{"given":"Yongan","family":"Zhao","sequence":"first","affiliation":[{"name":"1 School of Informatics and Computing and 2Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47404, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Haixu","family":"Tang","sequence":"additional","affiliation":[{"name":"1 School of Informatics and Computing and 2Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47404, USA"},{"name":"1 School of Informatics and Computing and 2Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47404, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yuzhen","family":"Ye","sequence":"additional","affiliation":[{"name":"1 School of Informatics and Computing and 2Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47404, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2011,10,28]]},"reference":[{"key":"2023061011444030100_B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol."},{"key":"2023061011444030100_B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023061011444030100_B3","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1038\/nmeth.1358","article-title":"Phymm and PhymmBL: metagenomic phylogenetic classification with interpolated Markov models","volume":"6","author":"Brady","year":"2009","journal-title":"Nat. Methods"},{"key":"2023061011444030100_B4","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1038\/nature06810","article-title":"Functional metagenomic profiling of nine biomes","volume":"452","author":"Dinsdale","year":"2008","journal-title":"Nature"},{"key":"2023061011444030100_B5","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1101\/gr.5969107","article-title":"MEGAN analysis of metagenomic data","volume":"17","author":"Huson","year":"2007","journal-title":"Genome Res."},{"key":"2023061011444030100_B6","first-page":"656","article-title":"BLAT\u2013the BLAST-like alignment tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023061011444030100_B7","doi-asserted-by":"crossref","first-page":"1509","DOI":"10.1101\/gr.079558.108","article-title":"RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays","volume":"18","author":"Marioni","year":"2008","journal-title":"Genome Res."},{"key":"2023061011444030100_B8","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1146\/annurev.genet.38.072902.091216","article-title":"Metagenomics: genomic analysis of microbial communities","volume":"38","author":"Riesenfeld","year":"2004","journal-title":"Annu. Rev. Genet."},{"key":"2023061011444030100_B9","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1038\/nature07540","article-title":"A core gut microbiome in obese and lean twins","volume":"457","author":"Turnbaugh","year":"2009","journal-title":"Nature"},{"key":"2023061011444030100_B10","doi-asserted-by":"crossref","first-page":"159","DOI":"10.1186\/1471-2105-12-159","article-title":"RAPSearch: a fast protein similarity search tool for short reads","volume":"12","author":"Ye","year":"2011","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/1\/125\/50568248\/bioinformatics_28_1_125.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/1\/125\/50568248\/bioinformatics_28_1_125.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,10]],"date-time":"2023-06-10T11:46:00Z","timestamp":1686397560000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/1\/125\/218953"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,10,28]]},"references-count":10,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2012,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr595","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,1,1]]},"published":{"date-parts":[[2011,10,28]]}}}