{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,26]],"date-time":"2025-02-26T05:33:37Z","timestamp":1740548017827,"version":"3.38.0"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2010,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Models of sequence evolution typically assume that different nucleotide positions evolve independently. This assumption is widely appreciated to be an over-simplification. The best known violations involve biases due to adjacent nucleotides. There have also been suggestions that biases exist at larger scales, however this possibility has not been systematically explored.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>To address this we have developed a method which identifies over- and under-represented substitution patterns and assesses their overall impact on the evolution of genome composition. Our method is designed to account for biases at smaller pattern sizes, removing their effects. We used this method to investigate context bias in the human lineage after the divergence from chimpanzee. We examined bias effects in substitution patterns between 2 and 5 bp long and found significant effects at all sizes. This included some individual three and four base pair patterns with relatively large biases. We also found that bias effects vary across the genome, differing between transposons and non-transposons, between different classes of transposons, and also near and far from genes.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>We found that nucleotides beyond the immediately adjacent one are responsible for substantial context effects, and that these biases vary across the genome.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-11-462","type":"journal-article","created":{"date-parts":[[2010,9,16]],"date-time":"2010-09-16T06:13:55Z","timestamp":1284617635000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Context dependent substitution biases vary within the human genome"],"prefix":"10.1186","volume":"11","author":[{"given":"P Andrew","family":"Nevarez","sequence":"first","affiliation":[]},{"given":"Christopher M","family":"DeBoever","sequence":"additional","affiliation":[]},{"given":"Benjamin J","family":"Freeland","sequence":"additional","affiliation":[]},{"given":"Marissa A","family":"Quitt","sequence":"additional","affiliation":[]},{"given":"Eliot C","family":"Bush","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2010,9,15]]},"reference":[{"key":"3919_CR1","doi-asserted-by":"crossref","unstructured":"Jukes T, Cantor C: Evolution of protein molecules. In Mammalian Protein Metabolism. Edited by: Munro H. Academic Press; 21\u201332.","DOI":"10.1016\/B978-1-4832-3211-9.50009-7"},{"issue":"6","key":"3919_CR2","doi-asserted-by":"publisher","first-page":"368","DOI":"10.1007\/BF01734359","volume":"V17","author":"J Felsenstein","year":"1981","unstructured":"Felsenstein J: Evolutionary trees from DNA sequences: A maximum likelihood approach. Journal of Molecular Evolution 1981, V17(6):368\u2013376. 10.1007\/BF01734359","journal-title":"Journal of Molecular Evolution"},{"issue":"2","key":"3919_CR3","doi-asserted-by":"publisher","first-page":"160","DOI":"10.1007\/BF02101694","volume":"V22","author":"M Hasegawa","year":"1985","unstructured":"Hasegawa M, Kishino H, Yano T: Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution 1985, V22(2):160\u2013174. 10.1007\/BF02101694","journal-title":"Journal of Molecular Evolution"},{"issue":"3","key":"3919_CR4","doi-asserted-by":"publisher","first-page":"468","DOI":"10.1093\/molbev\/msh039","volume":"21","author":"A Siepel","year":"2004","unstructured":"Siepel A, Haussler D: Phylogenetic Estimation of Context-Dependent Substitution Rates by Maximum Likelihood. Mol Biol Evol 2004, 21(3):468\u2013488. 10.1093\/molbev\/msh039","journal-title":"Mol Biol Evol"},{"issue":"6422","key":"3919_CR5","doi-asserted-by":"publisher","first-page":"709","DOI":"10.1038\/362709a0","volume":"362","author":"T Lindahl","year":"1993","unstructured":"Lindahl T: Instability and decay of the primary structure of DNA. Nature 1993, 362(6422):709\u2013715. 10.1038\/362709a0","journal-title":"Nature"},{"issue":"5673","key":"3919_CR6","doi-asserted-by":"publisher","first-page":"775","DOI":"10.1038\/274775a0","volume":"274","author":"C Coulondre","year":"1978","unstructured":"Coulondre C, Miller JH, Farabaugh PJ, Gilbert W: Molecular basis of base substitution hotspots in Escherichia coli. Nature 1978, 274(5673):775\u2013780. 10.1038\/274775a0","journal-title":"Nature"},{"issue":"6","key":"3919_CR7","doi-asserted-by":"crossref","first-page":"1961","DOI":"10.1016\/S0021-9258(19)73967-2","volume":"237","author":"M Swartz","year":"1962","unstructured":"Swartz M, Trautner T, Kornberg A: Enzymatic Synthesis of Deoxyribonucleic Acid XI. Further Studies on Nearest Neighbor Base Sequences in Deoxyribonucleic acids. Journal of Biological Chemistry 1962, 237(6):1961\u20131967.","journal-title":"Journal of Biological Chemistry"},{"issue":"7","key":"3919_CR8","doi-asserted-by":"publisher","first-page":"1499","DOI":"10.1093\/nar\/8.7.1499","volume":"8","author":"AP Bird","year":"1980","unstructured":"Bird AP: DNA methylation and the frequency of CpG in animal DNA. Nucl. Acids Res. 1980, 8(7):1499\u20131504. 10.1093\/nar\/8.7.1499","journal-title":"Nucl. Acids Res"},{"issue":"4470","key":"3919_CR9","doi-asserted-by":"publisher","first-page":"604","DOI":"10.1126\/science.6254144","volume":"210","author":"A Razin","year":"1980","unstructured":"Razin A, Riggs A: DNA methylation and gene function. Science 1980, 210(4470):604\u2013610. 10.1126\/science.6254144","journal-title":"Science"},{"issue":"4","key":"3919_CR10","first-page":"322","volume":"3","author":"M Bulmer","year":"1986","unstructured":"Bulmer M: Neighboring base effects on substitution rates in pseudogenes. Mol Biol Evol 1986, 3(4):322\u2013329.","journal-title":"Mol Biol Evol"},{"issue":"3","key":"3919_CR11","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1007\/BF00162968","volume":"34","author":"RD Blake","year":"1992","unstructured":"Blake RD, Hess ST, NicholsonTuell J: The Influence Of Nearest Neighbors On The Rate And Pattern Of Spontaneous Point Mutations. Journal Of Molecular Evolution 1992, 34(3):189\u2013200. 10.1007\/BF00162968","journal-title":"Journal Of Molecular Evolution"},{"issue":"4","key":"3919_CR12","doi-asserted-by":"publisher","first-page":"1022","DOI":"10.1016\/0022-2836(94)90009-4","volume":"236","author":"ST Hess","year":"1994","unstructured":"Hess ST, Blake JD, Blake RD: Wide Variations In Neighbor-Dependent Substitution Rates. Journal Of Molecular Biology 1994, 236(4):1022\u20131033. 10.1016\/0022-2836(94)90009-4","journal-title":"Journal Of Molecular Biology"},{"issue":"21","key":"3919_CR13","doi-asserted-by":"publisher","first-page":"9717","DOI":"10.1073\/pnas.92.21.9717","volume":"92","author":"B Morton","year":"1995","unstructured":"Morton B: Neighboring Base Composition and Transversion\/Transition Bias in a Comparison of Rice and Maize Chloroplast Noncoding Regions. PNAS 1995, 92(21):9717\u20139721. 10.1073\/pnas.92.21.9717","journal-title":"PNAS"},{"issue":"3","key":"3919_CR14","doi-asserted-by":"publisher","first-page":"227","DOI":"10.1007\/PL00006224","volume":"45","author":"BR Morton","year":"1997","unstructured":"Morton BR, Oberholzer VM, Clegg MT: The Influence of Specific Neighboring Bases on Substitution Bias in Noncoding Regions of the Plant Chloroplast Genome. Journal of Molecular Evolution 1997, 45(3):227\u2013231. 10.1007\/PL00006224","journal-title":"Journal of Molecular Evolution"},{"issue":"6","key":"3919_CR15","doi-asserted-by":"publisher","first-page":"605","DOI":"10.1007\/s00239-006-0076-0","volume":"64","author":"T Zheng","year":"2007","unstructured":"Zheng T, Ichiba T, Morton B: Assessing Substitution Variation Across Sites in Grass Chloroplast DNA. Journal of Molecular Evolution 2007, 64(6):605\u2013613. 10.1007\/s00239-006-0076-0","journal-title":"Journal of Molecular Evolution"},{"key":"3919_CR16","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1007\/s00239-001-2310-0","volume":"55","author":"YW Yang","year":"2002","unstructured":"Yang YW, Chen Y, Li WH: The Influence of Adjacent Nucleotides on the Pattern of Nucleotide Substitution in Mitochondrial Introns of Angiosperms. Journal of Molecular Evolution 2002, 55: 111\u2013115. 10.1007\/s00239-001-2310-0","journal-title":"Journal of Molecular Evolution"},{"issue":"11","key":"3919_CR17","doi-asserted-by":"publisher","first-page":"1679","DOI":"10.1101\/gr.287302","volume":"12","author":"Z Zhao","year":"2002","unstructured":"Zhao Z, Boerwinkle E: Neighboring-Nucleotide Effects on Single Nucleotide Polymorphisms: A Study of 2.6 Million Polymorphisms Across the Human Genome. Genome Res. 2002, 12(11):1679\u20131686. 10.1101\/gr.287302","journal-title":"Genome Res"},{"issue":"2","key":"3919_CR18","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1239\/aap\/1013540176","volume":"32","author":"JL Jensen","year":"2000","unstructured":"Jensen JL, Pedersen AMK: Probabilistic Models of DNA Sequence Evolution with Context Dependent Rates of Substitution. Advances in Applied Probability 2000, 32(2):499\u2013517. 10.1239\/aap\/1013540176","journal-title":"Advances in Applied Probability"},{"issue":"10","key":"3919_CR19","doi-asserted-by":"publisher","first-page":"2322","DOI":"10.1093\/bioinformatics\/bti376","volume":"21","author":"PF Arndt","year":"2005","unstructured":"Arndt PF, Hwa T: Identification and measurement of neighbor-dependent nucleotide substitution processes. Bioinformatics 2005, 21(10):2322\u20132328. 10.1093\/bioinformatics\/bti376","journal-title":"Bioinformatics"},{"issue":"suppl 1","key":"3919_CR20","doi-asserted-by":"publisher","first-page":"i216","DOI":"10.1093\/bioinformatics\/bth901","volume":"20","author":"G Lunter","year":"2004","unstructured":"Lunter G, Hein J: A nucleotide substitution model with nearest-neighbour interactions. Bioinformatics 2004, 20(suppl 1):i216\u2013223. 10.1093\/bioinformatics\/bth901","journal-title":"Bioinformatics"},{"issue":"39","key":"3919_CR21","doi-asserted-by":"publisher","first-page":"13994","DOI":"10.1073\/pnas.0404142101","volume":"101","author":"DG Hwang","year":"2004","unstructured":"Hwang DG, Green P: Inaugural Article: Bayesian Markov chain Monte Carlo sequence analysis reveals varying neutral substitution patterns in mammalian evolution. PNAS 2004, 101(39):13994\u201314001. 10.1073\/pnas.0404142101","journal-title":"PNAS"},{"issue":"5","key":"3919_CR22","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1080\/10635150802422324","volume":"57","author":"G Baele","year":"2008","unstructured":"Baele G, Van de Peer Y, Vansteelandt S: A model-based approach to study nearest-neighbor influences reveals complex substitution patterns in non-coding sequences. Systematic biology 2008, 57(5):675. 10.1080\/10635150802422324","journal-title":"Systematic biology"},{"issue":"2","key":"3919_CR23","doi-asserted-by":"publisher","first-page":"e1000027","DOI":"10.1371\/journal.pbio.1000027","volume":"7","author":"A Hodgkinson","year":"2009","unstructured":"Hodgkinson A, Ladoukakis E, Eyre-Walker A: Cryptic Variation in the Human Mutation Rate. PLoS Biol 2009, 7(2):e1000027. 10.1371\/journal.pbio.1000027","journal-title":"PLoS Biol"},{"issue":"4","key":"3919_CR24","doi-asserted-by":"publisher","first-page":"1358","DOI":"10.1073\/pnas.89.4.1358","volume":"89","author":"C Burge","year":"1992","unstructured":"Burge C, Campbell A, Karlin S: Over- and Under-Representation of Short Oligonucleotides in DNA Sequences. PNAS 1992, 89(4):1358\u20131362. 10.1073\/pnas.89.4.1358","journal-title":"PNAS"},{"issue":"26","key":"3919_CR25","doi-asserted-by":"publisher","first-page":"12832","DOI":"10.1073\/pnas.91.26.12832","volume":"91","author":"S Karlin","year":"1994","unstructured":"Karlin S, Ladunga I: Comparisons of Eukaryotic Genomic Sequences. PNAS 1994, 91(26):12832\u201312836. 10.1073\/pnas.91.26.12832","journal-title":"PNAS"},{"issue":"12","key":"3919_CR26","doi-asserted-by":"crossref","first-page":"3899","DOI":"10.1128\/jb.179.12.3899-3913.1997","volume":"179","author":"S Karlin","year":"1997","unstructured":"Karlin S, Mrazek J, Campbell A: Compositional biases of bacterial genomes and evolutionary implications. J. Bacteriol. 1997, 179(12):3899\u20133913.","journal-title":"J. Bacteriol"},{"issue":"2","key":"3919_CR27","doi-asserted-by":"publisher","first-page":"151","DOI":"10.1089\/106652701300312922","volume":"8","author":"J Elhai","year":"2001","unstructured":"Elhai J: Determination of Bias in the Relative Abundance of Oligonucleotides in DNA Sequences. Journal of Computational Biology 2001, 8(2):151\u2013175. 10.1089\/106652701300312922","journal-title":"Journal of Computational Biology"},{"key":"3919_CR28","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","volume":"57","author":"Y Benjamini","year":"1995","unstructured":"Benjamini Y, Hochberg Y: Controlling the false discovery rate - a practical and powerful approach to multiple testing. Journal Of The Royal Statistical Society Series B-Methodological 1995, 57: 289\u2013300.","journal-title":"Journal Of The Royal Statistical Society Series B-Methodological"},{"key":"3919_CR29","unstructured":"Manly B: Randomization, Bootstrap and Monte Carlo Methods in Biology. 2nd edition. London: Chapman and Hall;"},{"key":"3919_CR30","volume-title":"RepeatMasker Open-3.0","author":"A Smit","year":"1996","unstructured":"Smit A, Hubley R, Green P: RepeatMasker Open-3.0.1996. [http:\/\/www.repeatmasker.org]"},{"issue":"9","key":"3919_CR31","doi-asserted-by":"publisher","first-page":"418","DOI":"10.1016\/S0168-9525(00)02093-X","volume":"16","author":"J Jurka","year":"2000","unstructured":"Jurka J: Repbase Update: a database and an electronic journal of repetitive elements. Trends in Genetics 2000, 16(9):418\u2013420. 10.1016\/S0168-9525(00)02093-X","journal-title":"Trends in Genetics"},{"issue":"90001","key":"3919_CR32","doi-asserted-by":"publisher","first-page":"D493","DOI":"10.1093\/nar\/gkh103","volume":"32","author":"D Karolchik","year":"2004","unstructured":"Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, Kent WJ: The UCSC Table Browser data retrieval tool. Nucl. Acids Res. 2004, 32(90001):D493\u2013496. 10.1093\/nar\/gkh103","journal-title":"Nucl. Acids Res"},{"key":"3919_CR33","volume-title":"Current Protocols in Bioinformatics","author":"J Taylor","year":"2007","unstructured":"Taylor J, Schenck I, Blankenberg D, Nekrutenko A: Using galaxy to perform large-scale interactive data analyses. Current Protocols in Bioinformatics 2007., Chapter 10(Unit 10.5):"},{"issue":"4","key":"3919_CR34","doi-asserted-by":"publisher","first-page":"708","DOI":"10.1101\/gr.1933104","volume":"14","author":"M Blanchette","year":"2004","unstructured":"Blanchette M, Kent WJ, Riemer C, Elnitski L, Smit AF, Roskin KM, Baertsch R, Rosenbloom K, Clawson H, Green ED, Haussler D, Miller W: Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner. Genome Research 2004, 14(4):708\u2013715. 10.1101\/gr.1933104","journal-title":"Genome Research"},{"issue":"5","key":"3919_CR35","doi-asserted-by":"publisher","first-page":"676","DOI":"10.1093\/bioinformatics\/bti079","volume":"21","author":"SLK Pond","year":"2005","unstructured":"Pond SLK, Frost SDW, Muse SV: HyPhy: hypothesis testing using phylogenies. Bioinformatics 2005, 21(5):676\u2013679. 10.1093\/bioinformatics\/bti079","journal-title":"Bioinformatics"},{"issue":"3","key":"3919_CR36","doi-asserted-by":"publisher","first-page":"122","DOI":"10.1016\/j.tig.2005.12.007","volume":"22","author":"S Taudien","year":"2006","unstructured":"Taudien S, Ebersberger I, Glockner G, Platzer M: Should the draft chimpanzee sequence be finished? Trends in Genetics 2006, 22(3):122\u2013125. 10.1016\/j.tig.2005.12.007","journal-title":"Trends in Genetics"},{"issue":"6","key":"3919_CR37","doi-asserted-by":"publisher","first-page":"803","DOI":"10.1007\/s00239-005-0228-7","volume":"62","author":"L Duret","year":"2006","unstructured":"Duret L: The GC Content of Primates and Rodents Genomes Is Not at Equilibrium: A Reply to Antezana. Journal of Molecular Evolution 2006, 62(6):803\u2013806. 10.1007\/s00239-005-0228-7","journal-title":"Journal of Molecular Evolution"},{"issue":"3-4","key":"3919_CR38","doi-asserted-by":"publisher","first-page":"313","DOI":"10.1089\/10665270360688039","volume":"10","author":"PF Arndt","year":"2003","unstructured":"Arndt PF, Burge CB, Hwa T: DNA Sequence Evolution with Neighbor-Dependent Mutation. Journal of Computational Biology 2003, 10(3\u20134):313\u2013322. 10.1089\/10665270360688039","journal-title":"Journal of Computational Biology"},{"issue":"15","key":"3919_CR39","doi-asserted-by":"publisher","first-page":"5471","DOI":"10.1073\/pnas.0408986102","volume":"102","author":"J Meunier","year":"2005","unstructured":"Meunier J, Khelifi A, Navratil V, Duret L: Homology-dependent methylation in primate repetitive DNA. Proceedings of the National Academy of Sciences of the United States of America 2005, 102(15):5471\u20135476. 10.1073\/pnas.0408986102","journal-title":"Proceedings of the National Academy of Sciences of the United States of America"},{"issue":"2","key":"3919_CR40","doi-asserted-by":"publisher","first-page":"e1000015","DOI":"10.1371\/journal.pcbi.1000015","volume":"4","author":"N Elango","year":"2008","unstructured":"Elango N, Kim SH, Vigoda E, Yi SV, NISC Comparative Sequencing Program: Mutations of Different Molecular Origins Exhibit Contrasting Patterns of Regional Substitution Rate Variation. PLoS Comput Biol 2008, 4(2):e1000015. 10.1371\/journal.pcbi.1000015","journal-title":"PLoS Comput Biol"},{"issue":"11","key":"3919_CR41","doi-asserted-by":"publisher","first-page":"e150","DOI":"10.1371\/journal.pcbi.0020150","volume":"2","author":"EC Bush","year":"2006","unstructured":"Bush EC, Lahn BT: The Evolution of Word Composition in Metazoan Promoter Sequence. PLoS Computational Biology 2006, 2(11):e150. 10.1371\/journal.pcbi.0020150","journal-title":"PLoS Computational Biology"},{"issue":"5507","key":"3919_CR42","doi-asserted-by":"publisher","first-page":"1304","DOI":"10.1126\/science.1058040","volume":"291","author":"JC Venter","year":"2001","unstructured":"Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al.: The Sequence of the Human Genome. Science 2001, 291(5507):1304\u20131351. 10.1126\/science.1058040","journal-title":"Science"},{"issue":"8","key":"3919_CR43","doi-asserted-by":"publisher","first-page":"335","DOI":"10.1016\/S0168-9525(97)01181-5","volume":"13","author":"JA Yoder","year":"1997","unstructured":"Yoder JA, Walsh CP, Bestor TH: Cytosine methylation and the ecology of intragenomic parasites. Trends in Genetics 1997, 13(8):335\u2013340. 10.1016\/S0168-9525(97)01181-5","journal-title":"Trends in Genetics"},{"issue":"6067","key":"3919_CR44","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1038\/321209a0","volume":"321","author":"AP Bird","year":"1986","unstructured":"Bird AP: CpG-rich islands and the function of DNA methylation. Nature 1986, 321(6067):209\u2013213. 10.1038\/321209a0","journal-title":"Nature"},{"issue":"4","key":"3919_CR45","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1038\/ng0496-363","volume":"12","author":"TH Bestor","year":"1996","unstructured":"Bestor TH, Tycko B: Creation of genomic methylation patterns. Nat Genet 1996, 12(4):363\u2013367. 10.1038\/ng0496-363","journal-title":"Nat Genet"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-11-462.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,25]],"date-time":"2025-02-25T21:33:52Z","timestamp":1740519232000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-11-462"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,9,15]]},"references-count":45,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2010,12]]}},"alternative-id":["3919"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-11-462","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2010,9,15]]},"assertion":[{"value":"2 April 2010","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 September 2010","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 September 2010","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"462"}}