{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,17]],"date-time":"2026-01-17T20:54:37Z","timestamp":1768683277611,"version":"3.49.0"},"reference-count":32,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2016,6,24]],"date-time":"2016-06-24T00:00:00Z","timestamp":1466726400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2016,6,24]],"date-time":"2016-06-24T00:00:00Z","timestamp":1466726400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100005010","name":"Associazione Italiana per la Ricerca sul Cancro","doi-asserted-by":"publisher","award":["12214"],"award-info":[{"award-number":["12214"]}],"id":[{"id":"10.13039\/501100005010","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Many models of protein sequence evolution, in particular those based on Point Accepted Mutation (PAM) matrices, assume that its dynamics is Markovian. Nevertheless, it has been observed that evolution seems to proceed differently at different time scales, questioning this assumption. In 2011 Kosiol and Goldman proved that, if evolution is Markovian at the codon level, it can not be Markovian at the amino acid level. However, it remains unclear up to which point the Markov assumption is verified at the codon level.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>Here we show how also the among-site variability of substitution rates makes the process of full protein sequence evolution effectively not Markovian even at the codon level. This may be the theoretical explanation behind the well known systematic underestimation of evolutionary distances observed when omitting rate variability. If the substitution rate variability is neglected the average amino acid and codon replacement probabilities are affected by systematic errors and those with the largest mismatches are the substitutions involving more than one nucleotide at a time. On the other hand, the instantaneous substitution matrices estimated from alignments with the Markov assumption tend to overestimate double and triple substitutions, even when learned from alignments at high sequence identity.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>These results discourage the use of simple Markov models to describe full protein sequence evolution and encourage to employ, whenever possible, models that account for rate variability by construction (such as hidden Markov models or mixture models) or substitution models of the type of Le and Gascuel (2008) that account for it explicitly.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-016-1135-1","type":"journal-article","created":{"date-parts":[[2016,6,24]],"date-time":"2016-06-24T12:40:45Z","timestamp":1466772045000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Non-Markovian effects on protein sequence evolution due to site dependent substitution rates"],"prefix":"10.1186","volume":"17","author":[{"given":"Francesca","family":"Rizzato","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alex","family":"Rodriguez","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alessandro","family":"Laio","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2016,6,24]]},"reference":[{"key":"1135_CR1","unstructured":"Dayhoff M, Eck R. Atlas of Protein Sequence and Structure 1967-68: Published by National Biomedical Research Foundation; 1968, pp. 33\u201341."},{"key":"1135_CR2","first-page":"345","volume":"5","author":"M Dayhoff","year":"1978","unstructured":"Dayhoff M, Schwartz R, Orcutt B. A model of evolutionary change in proteins. Atlas Protein Sequences Struct. 1978; 5:345\u201352.","journal-title":"Atlas Protein Sequences Struct"},{"issue":"3","key":"1135_CR3","first-page":"275","volume":"8","author":"DT Jones","year":"1992","unstructured":"Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci: CABIOS. 1992; 8(3):275\u201382.","journal-title":"Comput Appl Biosci: CABIOS"},{"issue":"5062","key":"1135_CR4","doi-asserted-by":"publisher","first-page":"1443","DOI":"10.1126\/science.1604319","volume":"256","author":"GH Gonnet","year":"1992","unstructured":"Gonnet GH, Cohen MA, Benner SA. Exhaustive matching of the entire protein sequence database. Science. 1992; 256(5062):1443\u20131445.","journal-title":"Science"},{"issue":"5","key":"1135_CR5","doi-asserted-by":"publisher","first-page":"691","DOI":"10.1093\/oxfordjournals.molbev.a003851","volume":"18","author":"S Whelan","year":"2001","unstructured":"Whelan S, Goldman N. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol. 2001; 18(5):691\u20139. http:\/\/mbe.oxfordjournals.org\/content\/18\/5\/691.full.pdf+html.","journal-title":"Mol Biol Evol"},{"issue":"1","key":"1135_CR6","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1093\/oxfordjournals.molbev.a003985","volume":"19","author":"T Mueller","year":"2002","unstructured":"Mueller T, Spang R, Vingron M. Estimating amino acid substitution models: A comparison of Dayhoff\u2019s estimator, the resolvent approach and a maximum likelihood method. Mol Biol Evol. 2002; 19(1):8\u201313. http:\/\/mbe.oxfordjournals.org\/content\/19\/1\/8.full.pdf+html.","journal-title":"Mol Biol Evol"},{"issue":"11","key":"1135_CR7","doi-asserted-by":"publisher","first-page":"1323","DOI":"10.1093\/protein\/7.11.1323","volume":"7","author":"SA Benner","year":"1994","unstructured":"Benner SA, Cohen MA, Gonnet GH. Amino acid substitution during functionally constrained divergent evolution of protein sequences. Protein Eng. 1994; 7(11):1323\u20131332. doi:10.1093\/protein\/7.11.132310.1093\/protein\/7.11.1323. http:\/\/peds.oxfordjournals.org\/content\/7\/11\/1323.full.pdf+html.","journal-title":"Protein Eng"},{"issue":"6","key":"1135_CR8","doi-asserted-by":"publisher","first-page":"1139","DOI":"10.1007\/BF00173195","volume":"41","author":"G Mitchison","year":"1995","unstructured":"Mitchison G, Durbin R. Tree-based maximal likelihood substitution matrices and hiddenMarkov models. J Mol Evol. 1995; 41(6):1139\u201351.","journal-title":"J Mol Evol"},{"key":"1135_CR9","doi-asserted-by":"publisher","first-page":"910","DOI":"10.1016\/j.jmb.2011.06.005","volume":"411.4-6","author":"C Kosiol","year":"2011","unstructured":"Kosiol C, Goldman N. Markovian and non-Markovian protein sequence evolution: Aggregated Markov process models. J Mol Biol. 2011; 411.4-6:910\u201323.","journal-title":"J Mol Biol"},{"issue":"7","key":"1135_CR10","doi-asserted-by":"publisher","first-page":"1464","DOI":"10.1093\/molbev\/msm064","volume":"24","author":"C Kosiol","year":"2007","unstructured":"Kosiol C, Holmes I, Goldman N. An empirical codon model for protein sequence evolution. Mol Biol Evol. 2007; 24(7):1464\u20131479. doi:10.1093\/molbev\/msm06410.1093\/molbev\/msm064. http:\/\/mbe.oxfordjournals.org\/content\/24\/7\/1464.full.pdf+html.","journal-title":"Mol Biol Evol"},{"issue":"1","key":"1135_CR11","doi-asserted-by":"publisher","first-page":"134","DOI":"10.1186\/1471-2105-6-134","volume":"6","author":"A Schneider","year":"2005","unstructured":"Schneider A, Cannarozzi G, Gonnet G. Empirical codon substitution matrix. BMC Bioinforma. 2005; 6(1):134. doi:10.1186\/1471-2105-6-134.","journal-title":"BMC Bioinforma"},{"issue":"2","key":"1135_CR12","doi-asserted-by":"publisher","first-page":"388","DOI":"10.1093\/molbev\/msl175","volume":"24","author":"A Doron-Faigenboim","year":"2007","unstructured":"Doron-Faigenboim A, Pupko T. A combined empirical and mechanistic codon model. Mol Biol Evol. 2007; 24(2):388\u201397. doi:10.1093\/molbev\/msl17510.1093\/molbev\/msl175. http:\/\/mbe.oxfordjournals.org\/content\/24\/2\/388.full.pdf+html.","journal-title":"Mol Biol Evol"},{"issue":"6","key":"1135_CR13","first-page":"1396","volume":"10","author":"Z Yang","year":"1993","unstructured":"Yang Z. Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites. Mol Biol Evol. 1993; 10(6):1396\u20131401. http:\/\/mbe.oxfordjournals.org\/content\/10\/6\/1396.full.pdf+html.","journal-title":"Mol Biol Evol"},{"issue":"2","key":"1135_CR14","first-page":"316","volume":"11","author":"Z Yang","year":"1994","unstructured":"Yang Z, Goldman N, Friday A. Comparison of models for nucleotide substitution used in maximum-likelihood phylogenetic estimation.Mol Biol Evol. 1994; 11(2):316\u201324. http:\/\/mbe.oxfordjournals.org\/content\/11\/2\/316.full.pdf+html.","journal-title":"Mol Biol Evol"},{"key":"1135_CR15","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1038\/nrg.2015.18","volume":"17","author":"J Echave","year":"2016","unstructured":"Echave J, Spielman SJ, Wilke CO. Causes of evolutionary rate variation among protein sites. Nat Rev Genet. 2016; 17:109\u2013121.","journal-title":"Nat Rev Genet"},{"issue":"2","key":"1135_CR16","doi-asserted-by":"crossref","first-page":"993","DOI":"10.1093\/genetics\/139.2.993","volume":"139","author":"Z Yang","year":"1995","unstructured":"Yang Z. A space-time process model for the evolution of DNA sequences. Genetics. 1995; 139(2):993\u20131005. http:\/\/www.genetics.org\/content\/139\/2\/993.full.pdf+html.","journal-title":"Genetics"},{"issue":"1","key":"1135_CR17","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1093\/oxfordjournals.molbev.a025575","volume":"13","author":"J Felsenstein","year":"1996","unstructured":"Felsenstein J, Churchill GA. A hidden Markov model approach to variation among sites in rate of evolution. Mol Biol Evol. 1996; 13(1):93\u2013104. http:\/\/mbe.oxfordjournals.org\/content\/13\/1\/93.full.pdf+html.","journal-title":"Mol Biol Evol"},{"issue":"7","key":"1135_CR18","doi-asserted-by":"publisher","first-page":"910","DOI":"10.1093\/oxfordjournals.molbev.a025995","volume":"15","author":"AL Halpern","year":"1998","unstructured":"Halpern AL, Bruno WJ. Evolutionary distances for protein-coding sequences: modeling site-specific residue frequencies. Mol Biol Evol. 1998; 15(7):910\u20137. http:\/\/mbe.oxfordjournals.org\/content\/15\/7\/910.full.pdf+html.","journal-title":"Mol Biol Evol"},{"issue":"4","key":"1135_CR19","doi-asserted-by":"publisher","first-page":"571","DOI":"10.1080\/10635150490468675","volume":"53","author":"M Pagel","year":"2004","unstructured":"Pagel M, Meade A. A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Syst Biol. 2004; 53(4):571\u201381. doi:10.1080\/1063515049046867510.1080\/10635150490468675. http:\/\/sysbio.oxfordjournals.org\/content\/53\/4\/571.full.pdf+html.","journal-title":"Syst Biol"},{"issue":"6","key":"1135_CR20","doi-asserted-by":"publisher","first-page":"1095","DOI":"10.1093\/molbev\/msh112","volume":"21","author":"N Lartillot","year":"2004","unstructured":"Lartillot N, Philippe H. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol Biol Evol. 2004; 21(6):1095\u20131109. doi:10.1093\/molbev\/msh11210.1093\/molbev\/msh112. http:\/\/mbe.oxfordjournals.org\/content\/21\/6\/1095.full.pdf+html.","journal-title":"Mol Biol Evol"},{"issue":"3","key":"1135_CR21","doi-asserted-by":"publisher","first-page":"306","DOI":"10.1007\/BF00160154","volume":"39","author":"Z Yang","year":"1994","unstructured":"Yang Z. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J Mol Evol. 1994; 39(3):306\u201314.","journal-title":"J Mol Evol"},{"issue":"7","key":"1135_CR22","doi-asserted-by":"publisher","first-page":"1307","DOI":"10.1093\/molbev\/msn067","volume":"25","author":"SQ Le","year":"2008","unstructured":"Le SQ, Gascuel O. An improved general amino acid replacement matrix. Mol Biol Evol. 2008; 25(7):1307\u20131320. doi:10.1093\/molbev\/msn06710.1093\/molbev\/msn067. http:\/\/mbe.oxfordjournals.org\/content\/25\/7\/1307.full.pdf+html.","journal-title":"Mol Biol Evol"},{"key":"1135_CR23","doi-asserted-by":"crossref","unstructured":"Cox DR, Miller HD. The theory of stochastic processes. CRC Press; 1977. 134.","DOI":"10.1176\/ajp.134.10.1160-a"},{"issue":"1","key":"1135_CR24","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1093\/genetics\/155.1.431","volume":"155","author":"Z Yang","year":"2000","unstructured":"Yang Z, Nielsen R, Goldman N, Pedersen A-MK. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000; 155(1):431\u201349. http:\/\/www.genetics.org\/content\/155\/1\/431.full.pdf+html.","journal-title":"Genetics"},{"issue":"3","key":"1135_CR25","doi-asserted-by":"crossref","first-page":"1615","DOI":"10.1093\/genetics\/149.3.1615","volume":"149","author":"J Zhang","year":"1998","unstructured":"Zhang J, Gu X. Correlation between the substitution rate and rate variation among sites in protein evolution. Genetics. 1998; 149(3):1615\u201325. http:\/\/www.genetics.org\/content\/149\/3\/1615.full.pdf.","journal-title":"Genetics"},{"issue":"1","key":"1135_CR26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1093\/oxfordjournals.molbev.a003973","volume":"19","author":"P Lopez","year":"2002","unstructured":"Lopez P, Casane D, Philippe H. Heterotachy, an important process of protein evolution. Mol Biol Evol. 2002; 19(1):1\u20137. http:\/\/mbe.oxfordjournals.org\/content\/19\/1\/1.full.pdf+html.","journal-title":"Mol Biol Evol"},{"key":"1135_CR27","unstructured":"Kemeny JG, Snell JL. Finite markov chains. van Nostrand Princeton, NJ; 1960. 356."},{"issue":"30","key":"1135_CR28","first-page":"725","volume":"266","author":"N De Maio","year":"2012","unstructured":"De Maio N, Holmes I, Schl\u00f6tterer C, Kosiol C. Estimating empirical codon hidden Markov models. Mol Biol Evol. 2012; 266(30):725\u2013736.","journal-title":"Mol Biol Evol"},{"issue":"1512","key":"1135_CR29","doi-asserted-by":"publisher","first-page":"3965","DOI":"10.1098\/rstb.2008.0180","volume":"363","author":"SQ Le","year":"2008","unstructured":"Le SQ, Lartillot N, Gascuel O. Phylogenetic mixture models for proteins. Philos Trans R Soc Lond B Biol Sci. 2008; 363(1512):3965\u2013976.","journal-title":"Philos Trans R Soc Lond B Biol Sci"},{"issue":"9","key":"1135_CR30","doi-asserted-by":"publisher","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","volume":"14","author":"SR Eddy","year":"1998","unstructured":"Eddy SR. Profile hidden Markov models. Bioinformatics. 1998; 14(9):755\u201363.","journal-title":"Bioinformatics"},{"issue":"5","key":"1135_CR31","doi-asserted-by":"publisher","first-page":"1501","DOI":"10.1006\/jmbi.1994.1104","volume":"235","author":"A Krogh","year":"1994","unstructured":"Krogh A, Brown M, Mian IS, Sj\u00f6lander K, Haussler D. Hidden Markov models in computational biology: Applications to protein modeling. J Mol Biol. 1994; 235(5):1501\u20131531.","journal-title":"J Mol Biol"},{"key":"1135_CR32","unstructured":"Papoulis A, Pillai SU. Probability, random variables, and stochastic processes. McGraw-Hill: 1985."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-016-1135-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-016-1135-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-016-1135-1","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-016-1135-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T17:59:14Z","timestamp":1706810354000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-016-1135-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,6,24]]},"references-count":32,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2016,12]]}},"alternative-id":["1135"],"URL":"https:\/\/doi.org\/10.1186\/s12859-016-1135-1","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2016,6,24]]},"assertion":[{"value":"7 April 2016","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 June 2016","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 June 2016","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"258"}}