{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T00:44:36Z","timestamp":1740185076745,"version":"3.37.3"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2016,9,23]],"date-time":"2016-09-23T00:00:00Z","timestamp":1474588800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000185","name":"DARPA","doi-asserted-by":"crossref","award":["HR0011-15-2-0046"],"award-info":[{"award-number":["HR0011-15-2-0046"]}],"id":[{"id":"10.13039\/100000185","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Spain\u2019s","award":["TIN2013-42351-P","S2013\/ICE-2845"],"award-info":[{"award-number":["TIN2013-42351-P","S2013\/ICE-2845"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,1,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Although RNA-seq reads provide quality scores that represent the probability of calling a correct base, these values are not probabilistically integrated in most alignment algorithms. Based on the quality scores of the reads, we propose to calculate a lower bound of the probability of alignment of any fast alignment algorithm that generates SAM files. This bound is called the Fast Bayesian Bound (FBB) and serves as a canonical reference to compare alignment results across different algorithms. 
This Bayesian Bound is intended to provide additional support to the current state-of-the-art aligners, not to replace them.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose a feasible Bayesian bound that uses quality scores of the reads to align them to a reference genome. Two theorems are provided to efficiently calculate the Bayesian bound, which under some conditions becomes an equality. The algorithm reads the SAM files generated by the alignment algorithms using multiple command option values. The program options are mapped into the FBB reference values, and all the aligners can be compared with respect to the same accuracy values provided by the FBB. Stranded paired-read RNA-seq data were used for evaluation purposes. The errors of the alignments can be calculated based on the information contained in the distance between the pairs given by Theorem 2, and the alignments to the incorrect strand. 
Most of the algorithms (Bowtie, Bowtie 2, SHRiMP2, Soap 2, Novoalign) provide similar results with subtle variations.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and Implementation<\/jats:title>\n                  <jats:p>Current version of the FBB software is provided at https:\/\/bitbucket.org\/irenerodriguez\/fbb.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btw608","type":"journal-article","created":{"date-parts":[[2016,9,24]],"date-time":"2016-09-24T01:11:09Z","timestamp":1474679469000},"page":"210-218","source":"Crossref","is-referenced-by-count":1,"title":["FBB: a fast Bayesian-bound tool to calibrate RNA-seq aligners"],"prefix":"10.1093","volume":"33","author":[{"given":"Irene","family":"Rodriguez-Lujan","sequence":"first","affiliation":[{"name":"BioCircuits Institute, University of California, San Diego, La Jolla, CA, USA"},{"name":"Machine Learning Group, Escuela Polit\u00e9cnica Superior, Universidad Aut\u00f3noma de Madrid, Madrid, Spain"}]},{"given":"Jeff","family":"Hasty","sequence":"additional","affiliation":[{"name":"BioCircuits Institute, University of California, San Diego, La Jolla, CA, USA"},{"name":"Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA"},{"name":"Molecular Biology Section, Division of Biological Science, University of California, San Diego, La Jolla, CA, USA"}]},{"given":"Ram\u00f3n","family":"Huerta","sequence":"additional","affiliation":[{"name":"BioCircuits Institute, University of California, San Diego, La Jolla, CA, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2016,9,23]]},"reference":[{"key":"2023020204313473300_btw608-B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023020204313473300_btw608-B2","first-page":"370","article-title":"An essay toward solving a problem in the doctrine of chances","volume":"53","author":"Bayes","year":"1764","journal-title":"Philos. Trans. R. Soc. Lond"},{"key":"2023020204313473300_btw608-B3","doi-asserted-by":"crossref","first-page":"136","DOI":"10.1093\/bioinformatics\/btv524","article-title":"RNF: a general framework to evaluate NGS read mappers","volume":"32","author":"B\u0159inda","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020204313473300_btw608-B4","doi-asserted-by":"crossref","first-page":"264.","DOI":"10.1186\/1471-2164-15-264","article-title":"Comparison of mapping algorithms used in high-throughput sequencing: application to Ion Torrent data","volume":"15","author":"Caboche","year":"2014","journal-title":"BMC Genomics"},{"key":"2023020204313473300_btw608-B5","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1093\/bioinformatics\/bts723","article-title":"ALE: a generic assembly likelihood evaluation framework for assessing the accuracy of genome and metagenome assemblies","volume":"29","author":"Clark","year":"2013","journal-title":"Bioinformatics"},{"key":"2023020204313473300_btw608-B6","doi-asserted-by":"crossref","first-page":"2022","DOI":"10.1214\/aop\/1176988493","article-title":"Limit distribution of maximal non-aligned two-sequence segmental score","volume":"22","author":"Dembo","year":"1994","journal-title":"Ann. 
Probability"},{"key":"2023020204313473300_btw608-B7","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1038\/nmeth.2722","article-title":"Systematic evaluation of spliced alignment programs for RNA-seq data","volume":"10","author":"Engstr\u00f6m","year":"2013","journal-title":"Nat. Methods"},{"key":"2023020204313473300_btw608-B8","doi-asserted-by":"crossref","first-page":"3169","DOI":"10.1093\/bioinformatics\/bts605","article-title":"Tools for mapping high-throughput sequencing data","volume":"28","author":"Fonseca","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020204313473300_btw608-B9","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1093\/bioinformatics\/btt255","article-title":"Specificity control for read alignments using an artificial reference genome-guided false discovery rate","volume":"30","author":"Giese","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020204313473300_btw608-B10","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1038\/nmeth.1179","article-title":"Whole-genome sequencing and variant discovery in C. elegans","volume":"5","author":"Hillier","year":"2008","journal-title":"Nat. Methods"},{"key":"2023020204313473300_btw608-B11","doi-asserted-by":"crossref","first-page":"e1000502.","DOI":"10.1371\/journal.pcbi.1000502","article-title":"Fast mapping of short sequences with mismatches, insertions and deletions using index structures","volume":"5","author":"Hoffmann","year":"2009","journal-title":"PLoS Comput. 
Biol"},{"key":"2023020204313473300_btw608-B12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-12-210","article-title":"A novel and well-defined benchmarking method for second generation read mapping","volume":"12","author":"Holtgrewe","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023020204313473300_btw608-B13","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","article-title":"ART: a next-generation sequencing read simulator","volume":"28","author":"Huang","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020204313473300_btw608-B14","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1007\/BF02418571","article-title":"Sur les fonctions convexes et les in\u00e9galit\u00e9s entre les valeurs moyennes","volume":"30","author":"Jensen","year":"1906","journal-title":"Acta Math"},{"key":"2023020204313473300_btw608-B15","doi-asserted-by":"crossref","first-page":"2264","DOI":"10.1073\/pnas.87.6.2264","article-title":"Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes","volume":"87","author":"Karlin","year":"1990","journal-title":"Proc. Natl. Acad. Sci"},{"key":"2023020204313473300_btw608-B16","doi-asserted-by":"crossref","DOI":"10.1038\/srep13443","article-title":"CADBURE: a generic tool to evaluate the performance of spliced aligners on RNA-Seq data","volume":"5","author":"Kumar","year":"2015","journal-title":"Sci. Rep"},{"key":"2023020204313473300_btw608-B17","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with Bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. 
Methods"},{"key":"2023020204313473300_btw608-B18","doi-asserted-by":"crossref","first-page":"R25.","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol"},{"key":"2023020204313473300_btw608-B19","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-014-0553-5","article-title":"Evaluation of de novo transcriptome assemblies from RNA-Seq data","volume":"15","author":"Li","year":"2014","journal-title":"Genome Biol"},{"key":"2023020204313473300_btw608-B20","first-page":"1","article-title":"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM","volume":"1303","author":"Li","year":"2013","journal-title":"arXiv Preprint arXiv:1303.3997"},{"key":"2023020204313473300_btw608-B21","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows\u2013Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020204313473300_btw608-B22","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res"},{"key":"2023020204313473300_btw608-B23","doi-asserted-by":"crossref","first-page":"1966","DOI":"10.1093\/bioinformatics\/btp336","article-title":"SOAP2: an improved ultrafast tool for short read alignment","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020204313473300_btw608-B24","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1093\/bioinformatics\/btn565","article-title":"Slider\u2014maximum use of probability information for alignment of short sequence reads and SNP 
detection","volume":"25","author":"Malhis","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020204313473300_btw608-B25","doi-asserted-by":"crossref","first-page":"1.","DOI":"10.1186\/gb-2011-12-11-r112","article-title":"Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and genome analyzer systems","volume":"12","author":"Minoche","year":"2011","journal-title":"Genome Biol"},{"year":"2013","author":"Pfeiffer","key":"2023020204313473300_btw608-B26"},{"key":"2023020204313473300_btw608-B27","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1146\/annurev-genom-090413-025358","article-title":"Alignment of next-generation sequencing reads","volume":"16","author":"Reinert","year":"2015","journal-title":"Annu. Rev. Genomics Hum. Genet"},{"key":"2023020204313473300_btw608-B28","doi-asserted-by":"crossref","first-page":"e1000386.","DOI":"10.1371\/journal.pcbi.1000386","article-title":"SHRiMP: accurate mapping of short color-space reads","volume":"5","author":"Rumble","year":"2009","journal-title":"PLoS Comput. 
Biol"},{"volume-title":"Introduction to Modern Information Retrieval","year":"1986","author":"Salton","key":"2023020204313473300_btw608-B29"},{"key":"2023020204313473300_btw608-B30","doi-asserted-by":"crossref","first-page":", 128.","DOI":"10.1186\/1471-2105-9-128","article-title":"Using quality scores and longer reads improves accuracy of Solexa read mapping","volume":"9","author":"Smith","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023020204313473300_btw608-B31","doi-asserted-by":"crossref","first-page":"6.","DOI":"10.1186\/1756-0381-5-6","article-title":"How do alignment programs perform on sequencing data with varying qualities and from repetitive regions?","volume":"5","author":"Yu","year":"2012","journal-title":"BioData Mining"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/2\/210\/49037522\/bioinformatics_33_2_210.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/2\/210\/49037522\/bioinformatics_33_2_210.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T04:33:44Z","timestamp":1675312424000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/2\/210\/2584476"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2016,9,23]]},"references-count":31,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2017,1,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw608","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2017,1,15]]
},"published":{"date-parts":[[2016,9,23]]}}}