{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,3]],"date-time":"2024-08-03T09:17:28Z","timestamp":1722676648200},"reference-count":20,"publisher":"Oxford University Press (OUP)","issue":"17","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Currently, re-sequencing approaches use multiple modules serially to interpret raw sequencing data from next-generation sequencing platforms, while remaining oblivious to the genomic information until the final alignment step. Such approaches fail to exploit the full information from both raw sequencing data and the reference genome that can yield better quality sequence reads, SNP-calls, variant detection, as well as an alignment at the best possible location in the reference genome. Thus, there is a need for novel reference-guided bioinformatics algorithms for interpreting analog signals representing sequences of the bases ({A, C, G, T}), while simultaneously aligning possible sequence reads to a source reference genome whenever available.<\/jats:p><jats:p>Results: Here, we propose a new base-calling algorithm, TotalReCaller, to achieve improved performance. A linear error model for the raw intensity data and Burrows\u2013Wheeler transform (BWT) based alignment are combined utilizing a Bayesian score function, which is then globally optimized over all possible genomic locations using an efficient branch-and-bound approach. The algorithm has been implemented in soft- and hardware [field-programmable gate array (FPGA)] to achieve real-time performance. Empirical results on real high-throughput Illumina data were used to evaluate TotalReCaller's performance relative to its peers\u2014Bustard, BayesCall, Ibis and Rolexa\u2014based on several criteria, particularly those important in clinical and scientific applications. 
Namely, it was evaluated for (i) its base-calling speed and throughput, (ii) its read accuracy and (iii) its specificity and sensitivity in variant calling.<\/jats:p><jats:p>Availability: A software implementation of TotalReCaller, as well as additional information, is available at: http:\/\/bioinformatics.nyu.edu\/wordpress\/projects\/totalrecaller\/<\/jats:p><jats:p>Contact: fabian.menges@nyu.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr393","type":"journal-article","created":{"date-parts":[[2011,7,4]],"date-time":"2011-07-04T04:22:20Z","timestamp":1309753340000},"page":"2330-2337","source":"Crossref","is-referenced-by-count":10,"title":["T<scp>otal<\/scp>R<scp>e<\/scp>C<scp>aller<\/scp>: improved accuracy and performance via integrated alignment and base-calling"],"prefix":"10.1093","volume":"27","author":[{"given":"Fabian","family":"Menges","sequence":"first","affiliation":[{"name":"1 Computer Science Department, Courant Institute, New York University, NY 10012 and 2 Quantitative Biology Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11791 USA"}]},{"given":"Giuseppe","family":"Narzisi","sequence":"additional","affiliation":[{"name":"1 Computer Science Department, Courant Institute, New York University, NY 10012 and 2 Quantitative Biology Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11791 USA"}]},{"given":"Bud","family":"Mishra","sequence":"additional","affiliation":[{"name":"1 Computer Science Department, Courant Institute, New York University, NY 10012 and 2 Quantitative Biology Center, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11791 USA"}]}],
"member":"286","published-online":{"date-parts":[[2011,6,30]]},"reference":[{"key":"2023012511533181800_B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res."},{"key":"2023012511533181800_B2","doi-asserted-by":"crossref","first-page":"247","DOI":"10.2307\/2002797","article-title":"Functional approximations and dynamic programming","volume":"13","author":"Bellman","year":"1959","journal-title":"Mathematical Tables and Other Aids to Computation"},{"key":"2023012511533181800_B3","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1016\/j.gde.2006.10.009","article-title":"Whole-genome re-sequencing","volume":"16","author":"Bentley","year":"2006","journal-title":"Curr. Opin. Genet. Dev."},{"key":"2023012511533181800_B4","volume-title":"A block-sorting lossless data compression algorithm.","author":"Burrows","year":"1994"},{"key":"2023012511533181800_B5","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1287\/opre.50.1.48.17791","article-title":"Richard Bellman on the birth of dynamic programming","volume":"50","author":"Dreyfus","year":"2002","journal-title":"Operat. Res."},{"key":"2023012511533181800_B6","doi-asserted-by":"crossref","first-page":"679","DOI":"10.1038\/nmeth.1230","article-title":"Alta-Cyclic: a self-optimizing base caller for next-generation sequencing","volume":"5","author":"Erlich","year":"2008","journal-title":"Nat. Methods"},{"key":"2023012511533181800_B7","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1109\/SFCS.2000.892127","article-title":"Opportunistic data structures with applications","volume":"41","author":"Ferragina","year":"2000","journal-title":"Annu. Sympos. Found. Comput. 
Sci."},{"key":"2023012511533181800_B8","doi-asserted-by":"crossref","first-page":"1884","DOI":"10.1101\/gr.095299.109","article-title":"BayesCall: a model-based base-calling algorithm for high-throughput short-read sequencing","volume":"19","author":"Kao","year":"2009","journal-title":"Genome Res."},{"key":"2023012511533181800_B9","first-page":"656","article-title":"BLAT\u2014the BLAST-Like Alignment Tool","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023012511533181800_B10","doi-asserted-by":"crossref","first-page":"R83","DOI":"10.1186\/gb-2009-10-8-r83","article-title":"Improved base calling for the Illumina Genome Analyzer using machine learning strategies","volume":"10","author":"Kircher","year":"2009","journal-title":"Genome Biol."},{"key":"2023012511533181800_B11","doi-asserted-by":"crossref","first-page":"497","DOI":"10.2307\/1910129","article-title":"An automatic method of solving discrete programming problems","volume":"28","author":"Land","year":"1960","journal-title":"Econometrica"},{"key":"2023012511533181800_B12","doi-asserted-by":"crossref","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol."},{"key":"2023012511533181800_B13","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1287\/opre.14.4.699","article-title":"Branch-and-bound methods: a survey","volume":"14","author":"Lawler","year":"1966","journal-title":"Operat. 
Res."},{"key":"2023012511533181800_B14","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with burrows-wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012511533181800_B15","doi-asserted-by":"crossref","first-page":"1966","DOI":"10.1093\/bioinformatics\/btp336","article-title":"SOAP2: an improved ultrafast tool for short read alignment","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012511533181800_B16","doi-asserted-by":"crossref","first-page":"1767","DOI":"10.1101\/gr.3770505","article-title":"Emerging technologies in DNA sequencing","volume":"15","author":"Metzker","year":"2005","journal-title":"Genome Res."},{"key":"2023012511533181800_B17","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol."},{"key":"2023012511533181800_B18","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1186\/1471-2105-9-431","article-title":"Probabilistic base calling of Solexa sequencing data","volume":"9","author":"Rougemont","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012511533181800_B19","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol."},{"key":"2023012511533181800_B20","doi-asserted-by":"crossref","first-page":"1596","DOI":"10.1126\/science.1128691","article-title":"The genome of black cottonwood, Populus trichocarpa (Torr. 
& Gray)","volume":"313","author":"Tuskan","year":"2006","journal-title":"Science"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/17\/2330\/48866215\/bioinformatics_27_17_2330.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/17\/2330\/48866215\/bioinformatics_27_17_2330.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,8]],"date-time":"2024-04-08T10:52:53Z","timestamp":1712573573000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/17\/2330\/223750"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,6,30]]},"references-count":20,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2011,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr393","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,9,1]]},"published":{"date-parts":[[2011,6,30]]}}}