{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,7,13]],"date-time":"2024-07-13T11:56:06Z","timestamp":1720871766096},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"22","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,11,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Viral genomes tend to code in overlapping reading frames to maximize informational content. This may result in atypical codon bias and particular evolutionary constraints. Due to the fast mutation rate of viruses, there is additional strong evidence for varying selection between intra- and intergenomic regions. The presence of multiple coding regions complicates the concept of Ka\/Ks ratio, and thus begs for an alternative approach when investigating selection strengths. Building on the paper by McCauley and Hein, we develop a method for annotating a viral genome coding in overlapping reading frames. We introduce an evolutionary model capable of accounting for varying levels of selection along the genome, and incorporate it into our prior single sequence HMM methodology, extending it now to a phylogenetic HMM. Given an alignment of several homologous viruses to a reference sequence, we may thus achieve an annotation both of coding regions as well as selection strengths, allowing us to investigate different selection patterns and hypotheses.<\/jats:p><jats:p>Results: We illustrate our method by applying it to a multiple alignment of four HIV2 sequences, as well as of three Hepatitis B sequences. We obtain an annotation of the coding regions, as well as a posterior probability for each site of the strength of selection acting on it. From this we may deduce the average posterior selection acting on the different genes. Whilst we are encouraged to see in HIV2, that the known to be conserved genes gag and pol are indeed annotated as such, we also discover several sites of less stringent negative selection within the env gene. To the best of our knowledge, we are the first to subsequently provide a full selection annotation of the Hepatitis B genome by explicitly modelling the evolution within overlapping reading frames, and not relying on simple Ka\/Ks ratios.<\/jats:p><jats:p>Availability: The Matlab code can be downloaded from http:\/\/www.stats.ox.ac.uk\/mccauley\/<\/jats:p><jats:p>Contact: \u00a0degroot@stats.ox.ac.uk<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm472","type":"journal-article","created":{"date-parts":[[2007,10,6]],"date-time":"2007-10-06T00:14:42Z","timestamp":1191629682000},"page":"2978-2986","source":"Crossref","is-referenced-by-count":11,"title":["Annotation of selection strengths in viral genomes"],"prefix":"10.1093","volume":"23","author":[{"given":"Stephen","family":"McCauley","sequence":"first","affiliation":[{"name":"Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG, UK"}]},{"given":"Saskia","family":"de Groot","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG, UK"}]},{"given":"Thomas","family":"Mailund","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG, UK"}]},{"given":"Jotun","family":"Hein","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of Oxford, 1 South Parks Road, OX1 3TG, UK"}]}],"member":"286","published-online":{"date-parts":[[2007,10,5]]},"reference":[{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"3911","DOI":"10.1093\/nar\/27.19.3911","article-title":"Heuristic approach to deriving models for gene finding","volume":"27","author":"Besemer","year":"1999","journal-title":"Nucleic Acids Res."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"2607","DOI":"10.1093\/nar\/29.12.2607","article-title":"GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions","volume":"29","author":"Besemer","year":"2001","journal-title":"Nucleic Acids Res."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2180-5-33","article-title":"Variability and conservation in hepatitis B virus core protein","volume":"5","author":"Chain","year":"2005","journal-title":"BMC Microbiol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"1080","DOI":"10.1093\/bioinformatics\/btm078","article-title":"Comparative annotation of viral genomes with non-conserved gene structure","volume":"23","author":"de Groot","year":"2007","journal-title":"Bioinformatics"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"1047","DOI":"10.1534\/genetics.103.018135","article-title":"Mapping sites of positive selection and amino acid diversification in the HIV genome","volume":"167","author":"de Oliveira","year":"2004","journal-title":"Genetics"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"1077","DOI":"10.1093\/genetics\/153.3.1077","article-title":"Genealogical evidence for positive selection in the nef gene of HIV-1","volume":"153","author":"de Zanotto","year":"1999","journal-title":"Genetics"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1006\/viro.1994.1071","article-title":"New overlapping gene encoded by the cucumber mosaic virus genome","volume":"198","author":"Ding","year":"1994","journal-title":"Virology"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis","author":"Durbin","year":"1998"},{"key":"2023041208263015700_","first-page":"164","article-title":"PHYLIP \u2013 Phylogeny inference package (Version 3.2)","volume":"5","author":"Felsenstein","year":"1989","journal-title":"Cladistics"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1093\/bioinformatics\/bti007","article-title":"Detecting overlapping coding sequences with pairwise alignments","volume":"21","author":"Firth","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2105-7-75","article-title":"Detecting overlapping coding sequences in virus genomes","volume":"7","author":"Firth","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1016\/j.gene.2003.09.021","article-title":"On dynamics of overlapping genes in bacterial genomes","volume":"323","author":"Fukuda","year":"2003","journal-title":"Gene"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"1799","DOI":"10.1099\/0022-1317-83-7-1799","article-title":"Sequence analysis of Potato leafroll virus isolates reveals genetic stability, major evolutionary events and differential selection pressure between overlapping reading frame products","volume":"83","author":"Guyader","year":"2002","journal-title":"J. Gen. Virol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1007\/BF00167112","article-title":"A maximum-likelihood approach to analyzing nonoverlapping and overlapping reading frame","volume":"40","author":"Hein","year":"1995","journal-title":"J. Mol. Evol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.virusres.2005.03.030","article-title":"Patterns of nucleotide difference in overlapping and non-overlapping reading frames of papillomavirus genomes","volume":"113","author":"Hughes","year":"2005","journal-title":"Virus Res."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1093\/molbev\/msg039","article-title":"Detecting recombination in 4-taxa DNA sequence alignments with Bayesian Hidden Markov models and Markov chain Monte Carlo","volume":"20","author":"Husmeier","year":"2003","journal-title":"Mol. Biol. Evol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"2268","DOI":"10.1101\/gr.2433104","article-title":"Properties of overlapping genes are conserved across microbial genomes","volume":"14","author":"Johnson","year":"2006","journal-title":"Genome Res."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1023\/A:1026631030516","article-title":"Overlapping genes and variability of the genetic code","volume":"375","author":"Kozlov","year":"2000","journal-title":"Dokl. Biol. Sci."},{"key":"2023041208263015700_","first-page":"119","article-title":"Analysis of a set of overlapping genes","volume":"373","author":"Kozlov","year":"2000","journal-title":"Dokl. Biochem."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.compbiolchem.2004.12.006","article-title":"Overlapping genes in vertebrate genomes","volume":"29","author":"Makalowska","year":"2005","journal-title":"Comput. Biol. Chem."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btl092","article-title":"Using HMMs and observed evolution to annotate viral genomes","author":"McCauley","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"7041","DOI":"10.1093\/nar\/gkg878","article-title":"Improving gene annotation of complete viral genomes","volume":"31","author":"Mills","year":"2003","journal-title":"Nucleic Acids Res."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"3034","DOI":"10.1093\/bioinformatics\/bti459","article-title":"Dual multiple change-point model leads to more accurate recombination detection","volume":"21","author":"Minin","year":"2005","journal-title":"Bioinformatics"},{"issue":"Suppl. 1","key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1007\/PL00000061","article-title":"Constrained evolution with respect to gene overlap of Hepatitis B Virus","volume":"44","author":"Mizokami","year":"1997","journal-title":"J. Mol. Evol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"10307","DOI":"10.1128\/JVI.00996-06","article-title":"Molecular evolution of hepatitis B virus over 25 Years","volume":"80","author":"Osiowy","year":"2006","journal-title":"J. Virol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1007\/s002399910033","article-title":"Detection of signature sequences in overlapping genes and prediction of a novel overlapping gene in hepatitis G virus","volume":"50","author":"Pavesi","year":"2000","journal-title":"J. Mol. Evol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"1013","DOI":"10.1099\/vir.0.81375-0","article-title":"Origin and evolution of overlapping genes in the family Microviridae","volume":"87","author":"Pavesi","year":"2006","journal-title":"J. Gen. Virol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"625","DOI":"10.1007\/PL00006185","article-title":"On the informational content of overlapping genes in prokaryotic and eukaryotic viruses","volume":"44","author":"Pavesi","year":"1997","journal-title":"J. Mol. Evol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"763","DOI":"10.1093\/oxfordjournals.molbev.a003859","article-title":"A dependent-rates model and an MCMC-based methodology for the maximum-likelihood analysis of sequences with overlapping reading frames","volume":"18","author":"Pedersen","year":"2001","journal-title":"Mol. Biol. Evol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1093\/bioinformatics\/19.2.219","article-title":"Gene finding with a hidden Markov model of genome structure and evolution","volume":"19","author":"Pedersen","year":"2003","journal-title":"Bioinformatics"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1016\/S0168-9525(02)02649-5","article-title":"Purifying and directional selection in overlapping prokaryotic genes","volume":"18","author":"Rogozin","year":"2002","journal-title":"Trends Genet."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"2493","DOI":"10.1093\/bioinformatics\/btl427","article-title":"Robust inference of positive selection from recombining coding sequences","volume":"22","author":"Scheffler","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041208263015700_","first-page":"803","article-title":"Natural selection on the gag, pol, and env genes of human immunodeficiency virus 1 (HIV-1)","volume":"12","author":"Seibert","year":"1995","journal-title":"Mol. Biol. Evol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1089\/1066527041410472","article-title":"Combining phylogenetic and hidden Markov models in biosequence analysis","volume":"11","author":"Siepel","year":"2004","journal-title":"J. Comput. Biol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"5840","DOI":"10.1128\/jvi.64.12.5840-5850.1990","article-title":"Analysis of sequence diversity in hypervariable regions of the external glycoprotein of human immunodeficiency virus type 1","volume":"64","author":"Simmonds","year":"1990","journal-title":"J. Virol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"3103","DOI":"10.1128\/jvi.67.6.3103-3110.1993","article-title":"A small highly basic protein is encoded in overlapping reading frame within the P gene of vesicular stomatitis virus","volume":"67","author":"Spiropoulou","year":"1993","journal-title":"J. Virol."},{"key":"2023041208263015700_","first-page":"57","article-title":"Some probabilistic and statistical problems in the analysis of DNA sequences","volume-title":"Some Mathematical Questions in BiologyDNA Sequence Analysis","author":"Tavar\u00e9","year":"1986"},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"4673","DOI":"10.1093\/nar\/22.22.4673","article-title":"CLUSTALW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice","volume":"22","author":"Thompson","year":"1994","journal-title":"Nucleic Acids Res."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1093\/oxfordjournals.molbev.a003981","article-title":"Codon-substitution models to detect adaptive evolution that account for heterogeneous selective pressures among site classes","volume":"19","author":"Yang","year":"2002","journal-title":"Mol. Biol. Evol."},{"key":"2023041208263015700_","doi-asserted-by":"crossref","first-page":"710","DOI":"10.1017\/S1355838201010111","article-title":"Evidence for a new hepatitis C virus antigen encoded in an overlapping reading frame","volume":"7","author":"Walewski","year":"2001","journal-title":"RNA"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/22\/2978\/49858022\/bioinformatics_23_22_2978.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/22\/2978\/49858022\/bioinformatics_23_22_2978.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,14]],"date-time":"2023-05-14T10:06:06Z","timestamp":1684058766000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/22\/2978\/209202"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,10,5]]},"references-count":40,"journal-issue":{"issue":"22","published-print":{"date-parts":[[2007,11,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm472","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,11,15]]},"published":{"date-parts":[[2007,10,5]]}}}