{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,1,14]],"date-time":"2023-01-14T03:51:14Z","timestamp":1673668274911},"reference-count":22,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Over the past few years, new massively parallel DNA sequencing technologies have emerged. These platforms generate massive amounts of data per run, greatly reducing the cost of DNA sequencing. However, these techniques also raise important computational difficulties mostly due to the huge volume of data produced, but also because of some of their specific characteristics such as read length and sequencing errors. Among the most critical problems is that of efficiently and accurately mapping reads to a reference genome in the context of re-sequencing projects.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We present an efficient method for the local alignment of pyrosequencing reads produced by the GS FLX (454) system against a reference sequence. Our approach explores the characteristics of the data in these re-sequencing applications and uses state of the art indexing techniques combined with a flexible seed-based approach, leading to a fast and accurate algorithm which needs very little user parameterization. An evaluation performed using real and simulated data shows that our proposed method outperforms a number of mainstream tools on the quantity and quality of successful alignments, as well as on the execution time.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>The proposed methodology was implemented in a software tool called TAPyR--Tool for the Alignment of Pyrosequencing Reads--which is publicly available from <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/www.tapyr.net\" ext-link-type=\"uri\">http:\/\/www.tapyr.net<\/jats:ext-link>.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-12-163","type":"journal-article","created":{"date-parts":[[2011,6,14]],"date-time":"2011-06-14T18:13:59Z","timestamp":1308075239000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Efficient alignment of pyrosequencing reads for re-sequencing applications"],"prefix":"10.1186","volume":"12","author":[{"given":"Francisco","family":"Fernandes","sequence":"first","affiliation":[]},{"given":"Paulo GS","family":"da Fonseca","sequence":"additional","affiliation":[]},{"given":"Luis MS","family":"Russo","sequence":"additional","affiliation":[]},{"given":"Arlindo L","family":"Oliveira","sequence":"additional","affiliation":[]},{"given":"Ana T","family":"Freitas","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,5,16]]},"reference":[{"key":"4552_CR1","first-page":"104","volume":"24","author":"F Sanger","year":"1992","unstructured":"Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain-terminating inhibitors. 1977. Biotechnology 1992, 24: 104\u20138.","journal-title":"Biotechnology"},{"key":"4552_CR2","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1038\/nmeth0206-141","volume":"3","author":"L Bonetta","year":"2006","unstructured":"Bonetta L: Genome sequencing in the fast lane. Nat Methods 2006, 3: 141\u2013147. 10.1038\/nmeth0206-141","journal-title":"Nat Methods"},{"key":"4552_CR3","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1038\/nmeth1156","volume":"5","author":"SC Schuster","year":"2008","unstructured":"Schuster SC: Next-generation sequencing transforms today's biology. Nat Methods 2008, 5: 16\u20138. 10.1038\/nmeth1156","journal-title":"Nat Methods"},{"issue":"3","key":"4552_CR4","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1016\/j.tig.2007.12.006","volume":"24","author":"M Pop","year":"2008","unstructured":"Pop M, Salzberg SL: Bioinformatics challenges of new sequencing technology. Trends Genet 2008, 24(3):142\u20139. 10.1016\/j.tig.2007.12.006","journal-title":"Trends Genet"},{"issue":"2","key":"4552_CR5","doi-asserted-by":"publisher","first-page":"324","DOI":"10.1101\/gr.7088808","volume":"18","author":"MJ Chaisson","year":"2008","unstructured":"Chaisson MJ, Pevzner PA: Short read fragment assembly of bacterial genomes. Genome Res 2008, 18(2):324\u201330. 10.1101\/gr.7088808","journal-title":"Genome Res"},{"issue":"5","key":"4552_CR6","doi-asserted-by":"publisher","first-page":"821","DOI":"10.1101\/gr.074492.107","volume":"18","author":"DR Zerbino","year":"2008","unstructured":"Zerbino DR, Birney E: Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 2008, 18(5):821\u20139. 10.1101\/gr.074492.107","journal-title":"Genome Res"},{"issue":"3","key":"4552_CR7","doi-asserted-by":"publisher","first-page":"R25","DOI":"10.1186\/gb-2009-10-3-r25","volume":"10","author":"B Langmead","year":"2009","unstructured":"Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 2009, 10(3):R25. 10.1186\/gb-2009-10-3-r25","journal-title":"Genome Biol"},{"issue":"14","key":"4552_CR8","doi-asserted-by":"publisher","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","volume":"25","author":"H Li","year":"2009","unstructured":"Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009, 25(14):1754\u201360. 10.1093\/bioinformatics\/btp324","journal-title":"Bioinformatics"},{"issue":"15","key":"4552_CR9","doi-asserted-by":"publisher","first-page":"1966","DOI":"10.1093\/bioinformatics\/btp336","volume":"25","author":"R Li","year":"2009","unstructured":"Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K, Wang J: SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 2009, 25(15):1966\u20137. 10.1093\/bioinformatics\/btp336","journal-title":"Bioinformatics"},{"issue":"9","key":"4552_CR10","doi-asserted-by":"publisher","first-page":"e1000502","DOI":"10.1371\/journal.pcbi.1000502","volume":"5","author":"S Hoffmann","year":"2009","unstructured":"Hoffmann S, Otto C, Kurtz S, Sharma CM, Khaitovich P, Vogel J, Stadler PF, Hackerm\u00fcller J: Fast mapping of short sequences with mismatches, insertions and deletions using index structures. PLoS Comput Biol 2009, 5(9):e1000502. 10.1371\/journal.pcbi.1000502","journal-title":"PLoS Comput Biol"},{"issue":"5","key":"4552_CR11","doi-asserted-by":"publisher","first-page":"589","DOI":"10.1093\/bioinformatics\/btp698","volume":"26","author":"H Li","year":"2010","unstructured":"Li H, Durbin R: Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010, 26(5):589\u201395. 10.1093\/bioinformatics\/btp698","journal-title":"Bioinformatics"},{"issue":"20","key":"4552_CR12","doi-asserted-by":"publisher","first-page":"2534","DOI":"10.1093\/bioinformatics\/btq485","volume":"26","author":"G Rizk","year":"2010","unstructured":"Rizk G, Lavenier D: GASSST: global alignment short sequence search tool. Bioinformatics 2010, 26(20):2534\u201340. 10.1093\/bioinformatics\/btq485","journal-title":"Bioinformatics"},{"issue":"10","key":"4552_CR13","doi-asserted-by":"publisher","first-page":"1725","DOI":"10.1101\/gr.194201","volume":"11","author":"Z Ning","year":"2001","unstructured":"Ning Z, Cox AJ, Mullikin JC: SSAHA: a fast search method for large DNA databases. Genome Res 2001, 11(10):1725\u20139. 10.1101\/gr.194201","journal-title":"Genome Res"},{"issue":"1-2","key":"4552_CR14","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.jbiotec.2008.03.021","volume":"136","author":"M Droege","year":"2008","unstructured":"Droege M, Hill B: The Genome Sequencer FLX System-longer reads, more applications, straight forward bioinformatics and more complete data sets. J Biotechnol 2008, 136(1\u20132):3\u201310. 10.1016\/j.jbiotec.2008.03.021","journal-title":"J Biotechnol"},{"key":"4552_CR15","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511574931","volume-title":"Algorithms on strings, trees, and sequences: computer science and computational biology","author":"D Gusfield","year":"1997","unstructured":"Gusfield D: Algorithms on strings, trees, and sequences: computer science and computational biology. Cambridge University Press, NY; 1997."},{"key":"4552_CR16","volume-title":"ACM Comp Surv","author":"G Navarro","year":"2007","unstructured":"Navarro G, M\u00e4kinen V: Compressed full-text indexes. ACM Comp Surv 2007., 32:"},{"key":"4552_CR17","doi-asserted-by":"publisher","first-page":"191","DOI":"10.1007\/978-3-540-69068-9_19","volume-title":"Proceedings of the 19th Annual Symposium on Combinatorial Pattern Matching (CPM 2008), LNCS","author":"L Russo","year":"2008","unstructured":"Russo L, Navarro G, Oliveira A: Dynamic Fully-Compressed Suffix Trees. Proceedings of the 19th Annual Symposium on Combinatorial Pattern Matching (CPM 2008), LNCS 2008, 191\u2013203."},{"key":"4552_CR18","doi-asserted-by":"crossref","unstructured":"Ferragina P, Manzini G, M\u00e4kinen V, Navarro G: Compressed representations of sequences and full-text indexes. Transactions on Algorithms (TALG 2007., 3(2):","DOI":"10.1145\/1240233.1240243"},{"key":"4552_CR19","volume-title":"Technical report 124, Palo Alto, CA, Digital Equipment Corporation","author":"M Burrows","year":"1994","unstructured":"Burrows M, Wheeler DJ: A Block-Sorting Lossless Data Compression Algorithm. Technical report 124, Palo Alto, CA, Digital Equipment Corporation 1994. [http:\/\/citeseer.ist.psu.edu\/76182]"},{"issue":"7","key":"4552_CR20","doi-asserted-by":"publisher","first-page":"R143","DOI":"10.1186\/gb-2007-8-7-r143","volume":"8","author":"SM Huse","year":"2007","unstructured":"Huse SM, Huber JA, Morrison HG, Sogin ML, Welch DM: Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol 2007, 8(7):R143. 10.1186\/gb-2007-8-7-r143","journal-title":"Genome Biol"},{"issue":"7057","key":"4552_CR21","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature03959","volume":"437","author":"M Margulies","year":"2005","unstructured":"Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Ho CH, Irzyk GP, Jando SC, Alenquer MLI, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM: Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005, 437(7057):376\u201380.","journal-title":"Nature"},{"key":"4552_CR22","doi-asserted-by":"publisher","first-page":"603","DOI":"10.1186\/1471-2164-9-603","volume":"9","author":"JM Aury","year":"2008","unstructured":"Aury JM, Cruaud C, Barbe V, Rogier O, Mangenot S, Samson G, Poulain J, Anthouard V, Scarpelli C, Artiguenave F, Wincker P: High quality draft sequences for prokaryotic genomes using a mix of new sequencing technologies. BMC Genomics 2008, 9: 603. 10.1186\/1471-2164-9-603","journal-title":"BMC Genomics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-12-163.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T13:54:55Z","timestamp":1630504495000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-12-163"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,5,16]]},"references-count":22,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["4552"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-12-163","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,5,16]]},"assertion":[{"value":"16 December 2010","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 May 2011","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 May 2011","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"163"}}