{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:49Z","timestamp":1772138089610,"version":"3.50.1"},"reference-count":25,"publisher":"Oxford University Press (OUP)","issue":"22","license":[{"start":{"date-parts":[[2018,5,29]],"date-time":"2018-05-29T00:00:00Z","timestamp":1527552000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000060","name":"National Institute Of Allergy And Infectious Diseases","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R00AI120851"],"award-info":[{"award-number":["R00AI120851"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01AI120009"],"award-info":[{"award-number":["R01AI120009"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["UM1AI068618"],"award-info":[{"award-number":["UM1AI068618"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000026","name":"National Institute on Drug Abuse","doi-asserted-by":"publisher","award":["R21DA041007"],"award-info":[{"award-number":["R21DA041007"]}],"id":[{"id":"10.13039\/100000026","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100005595","name":"University of California","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100005595","id-type":"DOI","asserted-by":"publisher"}]},{"name":"San Diego Center for AIDS 
Research","award":["P30AI036214"],"award-info":[{"award-number":["P30AI036214"]}]},{"name":"San Diego Center for AIDS Research","award":["R21AI115701"],"award-info":[{"award-number":["R21AI115701"]}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,11,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Protein coding genes can be studied using long-read next generation sequencing. However, high rates of indel sequencing errors are problematic, corrupting the reading frame. Even the consensus of multiple independent sequence reads retains indel errors. To solve this problem, we introduce Reference-Informed Frame-Resolving multiple-Alignment Free template inference algorithm (RIFRAF), a sequence consensus algorithm that takes a set of error-prone reads and a reference sequence and infers an accurate in-frame consensus. RIFRAF uses a novel structure, analogous to a two-layer hidden Markov model: the consensus is optimized to maximize alignment scores with both the set of noisy reads and with a reference. The template-to-reads component of the model encodes the preponderance of indels, and is sensitive to the per-base quality scores, giving greater weight to more accurate bases. The reference-to-template component of the model penalizes frame-destroying indels. 
A local search algorithm proceeds in stages to find the best consensus sequence for both objectives.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Using Pacific Biosciences SMRT sequences from an HIV-1 env clone, NL4-3, we compare our approach to other consensus and frame correction methods. RIFRAF consistently finds a consensus sequence that is more accurate and in-frame, especially with small numbers of reads. It was able to perfectly reconstruct over 80% of consensus sequences from as few as three reads, whereas the best alternative required twice as many. RIFRAF is able to achieve these results and keep the consensus in-frame even with a distantly related reference sequence. Moreover, unlike other frame correction methods, RIFRAF can detect and keep true indels while removing erroneous ones.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>RIFRAF is implemented in Julia, and source code is publicly available at https:\/\/github.com\/MurrellGroup\/Rifraf.jl.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty426","type":"journal-article","created":{"date-parts":[[2018,5,22]],"date-time":"2018-05-22T23:17:21Z","timestamp":1527031041000},"page":"3817-3824","source":"Crossref","is-referenced-by-count":2,"title":["RIFRAF: a frame-resolving consensus algorithm"],"prefix":"10.1093","volume":"34","author":[{"given":"Kemal","family":"Eren","sequence":"first","affiliation":[{"name":"Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, 
USA"}]},{"given":"Ben","family":"Murrell","sequence":"additional","affiliation":[{"name":"Department of Medicine, University of California San Diego, La Jolla, CA, USA"}]}],"member":"286","published-online":{"date-parts":[[2018,5,29]]},"reference":[{"key":"2023012712372949300_bty426-B1","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1137\/141000671","article-title":"Julia: a fresh approach to numerical computing","volume":"59","author":"Bezanson","year":"2017","journal-title":"SIAM Rev"},{"key":"2023012712372949300_bty426-B2","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1093\/bioinformatics\/8.5.481","article-title":"Aligning two sequences within a specified diagonal band","volume":"8","author":"Chao","year":"1992","journal-title":"Bioinformatics"},{"key":"2023012712372949300_bty426-B3","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1016\/S0092-8240(05)80237-X","article-title":"Constrained sequence alignment","volume":"55","author":"Chao","year":"1993","journal-title":"Bull. Math. 
Biol"},{"key":"2023012712372949300_bty426-B4","first-page":"563","article-title":"Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data","volume-title":"Nat Methods","author":"Chin","year":"2013"},{"key":"2023012712372949300_bty426-B5","doi-asserted-by":"crossref","first-page":"i529","DOI":"10.1093\/bioinformatics\/btw458","article-title":"Improve homology search sensitivity of PacBio data by correcting frameshifts","volume":"32","author":"Du","year":"2016","journal-title":"Bioinformatics"},{"key":"2023012712372949300_bty426-B6","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids","author":"Durbin","year":"1998"},{"key":"2023012712372949300_bty426-B7","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1126\/science.1162986","article-title":"Real-time DNA sequencing from single polymerase molecules","volume":"323","author":"Eid","year":"2009","journal-title":"Science"},{"key":"2023012712372949300_bty426-B8","doi-asserted-by":"crossref","first-page":"20166","DOI":"10.1073\/pnas.1110064108","article-title":"Accurate sampling and deep sequencing of the hiv-1 protease gene using a primer id","volume":"108","author":"Jabara","year":"2011","journal-title":"Proc. Natl. Acad. Sci"},{"key":"2023012712372949300_bty426-B9","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1093\/molbev\/mst010","article-title":"Mafft multiple sequence alignment software version 7: improvements in performance and usability","volume":"30","author":"Katoh","year":"2013","journal-title":"Mol. Biol. 
Evol"},{"key":"2023012712372949300_bty426-B10","doi-asserted-by":"crossref","first-page":"3059","DOI":"10.1093\/nar\/gkf436","article-title":"MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform","volume":"30","author":"Katoh","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"2023012712372949300_bty426-B11","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1093\/bmb\/58.1.19","article-title":"Evolutionary and immunological implications of contemporary hiv-1 variation","volume":"58","author":"Korber","year":"2001","journal-title":"Br. Med. Bull"},{"key":"2023012712372949300_bty426-B12","doi-asserted-by":"crossref","first-page":"vew018.","DOI":"10.1093\/ve\/vew018","article-title":"Rapid sequencing of complete env genes from primary HIV-1 samples","volume":"2","author":"Laird Smith","year":"2016","journal-title":"Virus Evol"},{"key":"2023012712372949300_bty426-B13","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1093\/bioinformatics\/btg109","article-title":"Generating consensus sequences from partial order multiple sequence alignment graphs","volume":"19","author":"Lee","year":"2003","journal-title":"Bioinformatics"},{"key":"2023012712372949300_bty426-B14","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1093\/bioinformatics\/18.3.452","article-title":"Multiple sequence alignment using partial order graphs","volume":"18","author":"Lee","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012712372949300_bty426-B15","doi-asserted-by":"crossref","first-page":"733","DOI":"10.1038\/nmeth.3444","article-title":"A complete bacterial genome assembled de novo using only nanopore sequencing data","volume":"12","author":"Loman","year":"2015","journal-title":"Nat. 
Methods"},{"key":"2023012712372949300_bty426-B16","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1038\/nrg3367","article-title":"Sequence assembly demystified","volume":"14","author":"Nagarajan","year":"2013","journal-title":"Nat. Rev. Genet"},{"key":"2023012712372949300_bty426-B17","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","article-title":"A general method applicable to the search for similarities in the amino acid sequence of two proteins","volume":"48","author":"Needleman","year":"1970","journal-title":"J. Mol. Biol"},{"key":"2023012712372949300_bty426-B18","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1109\/TSMCC.2005.855515","article-title":"Evolutionary computation in bioinformatics: a review","volume":"36","author":"Pal","year":"2006","journal-title":"IEEE Trans. Syst. Man Cybernetics Part C: Appl. Rev"},{"key":"2023012712372949300_bty426-B19","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1093\/bib\/bbq020","article-title":"De novo assembly of short sequence reads","volume":"11","author":"Paszkiewicz","year":"2010","journal-title":"Brief. Bioinformatics"},{"key":"2023012712372949300_bty426-B20","doi-asserted-by":"crossref","first-page":"205","DOI":"10.4137\/EBO.S19199","article-title":"Evaluating the accuracy and efficiency of multiple sequence alignment methods","volume":"10","author":"Pervez","year":"2014","journal-title":"Evol. 
Bioinformatics"},{"key":"2023012712372949300_bty426-B21","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1016\/S0168-9525(00)02024-2","article-title":"Emboss: the European molecular biology open software suite","volume":"6","author":"Rice","year":"2000","journal-title":"Trends Genet."},{"key":"2023012712372949300_bty426-B22","doi-asserted-by":"crossref","first-page":"3575","DOI":"10.1093\/bioinformatics\/btu576","article-title":"Frameshift alignment: statistics and post-genomic applications","volume":"30","author":"Sheetlin","year":"2014","journal-title":"Bioinformatics"},{"key":"2023012712372949300_bty426-B23","doi-asserted-by":"crossref","first-page":"E1330","DOI":"10.1073\/pnas.1203613109","article-title":"Degenerate primer ids and the birthday problem","volume":"109","author":"Sheward","year":"2012","journal-title":"Proc. Natl. Acad. Sci"},{"key":"2023012712372949300_bty426-B24","doi-asserted-by":"crossref","first-page":"e00592-13","DOI":"10.1128\/mBio.00592-13","article-title":"Ecological patterns of nifH genes in four terrestrial climatic zones explored with targeted metagenomics using framebot, a new informatics tool","volume":"4","author":"Wang","year":"2013","journal-title":"mBio"},{"key":"2023012712372949300_bty426-B25","doi-asserted-by":"crossref","first-page":"198.","DOI":"10.1186\/1471-2105-12-198","article-title":"HMM-FRAME: accurate protein domain classification for metagenomic sequences containing frameshift errors","volume":"12","author":"Zhang","year":"2011","journal-title":"BMC 
Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/22\/3817\/48921231\/bioinformatics_34_22_3817.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/22\/3817\/48921231\/bioinformatics_34_22_3817.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T08:27:39Z","timestamp":1674808059000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/22\/3817\/5021683"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,5,29]]},"references-count":25,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2018,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty426","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/227520","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,11,15]]},"published":{"date-parts":[[2018,5,29]]}}}