{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T19:08:08Z","timestamp":1775329688325,"version":"3.50.1"},"reference-count":47,"publisher":"Oxford University Press (OUP)","issue":"19","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: To understand the evolution of molecular function within protein families, it is important to identify those amino acid residues responsible for functional divergence; i.e. those sites in a protein family that affect cofactor, protein or substrate binding preferences; affinity; catalysis; flexibility; or folding. Type I functional divergence (FD) results from changes in conservation (evolutionary rate) at a site between protein subfamilies, whereas type II FD occurs when there has been a shift in preferences for different amino acid chemical properties. A variety of methods have been developed for identifying both site types in protein subfamilies, both from phylogenetic and information-theoretic angles. However, evaluation of the performance of these methods has typically relied upon a handful of reasonably well-characterized biological datasets or analyses of a single biological example. While experimental validation of many truly functionally divergent sites (true positives) can be relatively straightforward, determining that particular sites do not contribute to functional divergence (i.e. false positives and true negatives) is much more difficult, resulting in noisy \u2018gold standard\u2019 examples.<\/jats:p>\n               <jats:p>Results: We describe a novel, phylogeny-based functional divergence classifier, FunDi. Unlike previous approaches, FunDi uses a unified mixture model-based approach to detect type I and type II FD. 
To assess FunDi's overall classification performance relative to other methods, we introduce two methods for simulating functionally divergent datasets. We find that the FunDi method performs better than several other predictors over a wide variety of simulation conditions.<\/jats:p>\n               <jats:p>Availability: http:\/\/rogerlab.biochem.dal.ca\/Software<\/jats:p>\n               <jats:p>Contact: andrew.roger@dal.ca<\/jats:p>\n               <jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr470","type":"journal-article","created":{"date-parts":[[2011,8,13]],"date-time":"2011-08-13T00:26:20Z","timestamp":1313195180000},"page":"2655-2663","source":"Crossref","is-referenced-by-count":33,"title":["A phylogenetic mixture model for the identification of functionally divergent protein residues"],"prefix":"10.1093","volume":"27","author":[{"given":"Daniel","family":"Gaston","sequence":"first","affiliation":[{"name":"1 Centre for Comparative Genomics and Evolutionary Bioinformatics, 2Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada, B3H 1X5 and 3Department of Mathematics and Statistics, Dalhousie University, Halifax, Canada, B3H 3J5"},{"name":"1 Centre for Comparative Genomics and Evolutionary Bioinformatics, 2Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada, B3H 1X5 and 3Department of Mathematics and Statistics, Dalhousie University, Halifax, Canada, B3H 3J5"}]},{"given":"Edward","family":"Susko","sequence":"additional","affiliation":[{"name":"1 Centre for Comparative Genomics and Evolutionary Bioinformatics, 2Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada, B3H 1X5 and 3Department of Mathematics and Statistics, Dalhousie University, Halifax, Canada, B3H 3J5"},{"name":"1 Centre for Comparative Genomics and Evolutionary Bioinformatics, 2Department of 
Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada, B3H 1X5 and 3Department of Mathematics and Statistics, Dalhousie University, Halifax, Canada, B3H 3J5"}]},{"given":"Andrew J.","family":"Roger","sequence":"additional","affiliation":[{"name":"1 Centre for Comparative Genomics and Evolutionary Bioinformatics, 2Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada, B3H 1X5 and 3Department of Mathematics and Statistics, Dalhousie University, Halifax, Canada, B3H 3J5"},{"name":"1 Centre for Comparative Genomics and Evolutionary Bioinformatics, 2Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, Canada, B3H 1X5 and 3Department of Mathematics and Statistics, Dalhousie University, Halifax, Canada, B3H 3J5"}]}],"member":"286","published-online":{"date-parts":[[2011,8,11]]},"reference":[{"key":"2023012512004207900_B1","doi-asserted-by":"crossref","first-page":"784","DOI":"10.1093\/molbev\/msi065","article-title":"Impact of taxon sampling on the estimation of rates of evolution at sites","volume":"22","author":"Blouin","year":"2005","journal-title":"Mol. Biol. 
Evol."},{"key":"2023012512004207900_B2","doi-asserted-by":"crossref","first-page":"W35","DOI":"10.1093\/nar\/gkq415","article-title":"Multi-Harmony: detecting functional specificity from sequence alignment","volume":"38","author":"Brandt","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"2023012512004207900_B3","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1186\/1471-2105-9-491","article-title":"Prediction of specificity-determining residues for small-molecule kinase inhibitors","volume":"9","author":"Caffrey","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012512004207900_B4","doi-asserted-by":"crossref","first-page":"1875","DOI":"10.1093\/bioinformatics\/btm270","article-title":"Predicting functionally important residues from sequence conservation","volume":"23","author":"Capra","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B5","doi-asserted-by":"crossref","first-page":"1473","DOI":"10.1093\/bioinformatics\/btn214","article-title":"Characterization and prediction of residues determining protein functional specificity","volume":"24","author":"Capra","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B6","doi-asserted-by":"crossref","first-page":"e1000585","DOI":"10.1371\/journal.pcbi.1000585","article-title":"Predicting protein ligand binding sites by combining evolutionary sequence conservation and 3D structure","volume":"5","author":"Capra","year":"2009","journal-title":"PLoS Comput. 
Biol."},{"key":"2023012512004207900_B7","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1186\/1471-2105-10-207","article-title":"Ensemble approach to predict specificity determinants: benchmarking and validation","volume":"10","author":"Chakrabarti","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023012512004207900_B8","doi-asserted-by":"crossref","first-page":"801","DOI":"10.1016\/j.jmb.2007.08.036","article-title":"Functional specificity lies within the properties and evolutionary changes of amino acids","volume":"373","author":"Chakrabarti","year":"2007","journal-title":"J. Mol. Biol."},{"key":"2023012512004207900_B9","doi-asserted-by":"crossref","DOI":"10.1145\/1143844.1143874","article-title":"The relationship between precision-recall and ROC curves","volume-title":"23rd International Conference on Machine Learning (ICML)","author":"Davis","year":"2006"},{"key":"2023012512004207900_B10","doi-asserted-by":"crossref","first-page":"3075","DOI":"10.1093\/bioinformatics\/btq595","article-title":"Identification of subfamily-specific sites based on active sites modeling and clustering","volume":"26","author":"de Melo-Minardi","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B11","doi-asserted-by":"crossref","first-page":"W495","DOI":"10.1093\/nar\/gkm406","article-title":"Sequence harmony: detecting functional specificity from alignments","volume":"35","author":"Feenstra","year":"2007","journal-title":"Nucleic Acids Res."},{"key":"2023012512004207900_B12","doi-asserted-by":"crossref","first-page":"1879","DOI":"10.1093\/molbev\/msp098","article-title":"INDELible: a flexible simulator of biological sequence evolution","volume":"26","author":"Fletcher","year":"2009","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B13","volume-title":"Can sequence determine function? 
Genome Biol","author":"Gerlt","year":"2000"},{"key":"2023012512004207900_B14","doi-asserted-by":"crossref","first-page":"1664","DOI":"10.1093\/oxfordjournals.molbev.a026080","article-title":"Statistical methods for testing functional divergence after gene duplication","volume":"16","author":"Gu","year":"1999","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B15","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1093\/oxfordjournals.molbev.a003824","article-title":"Maximum-likelihood approach for gene family evolution under functional divergence","volume":"18","author":"Gu","year":"2001","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B16","doi-asserted-by":"crossref","first-page":"500","DOI":"10.1093\/bioinformatics\/18.3.500","article-title":"DIVERGE: phylogeny-based analysis for functional-structural divergence of a protein family","volume":"18","author":"Gu","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B17","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1126\/science.278.5338.609","article-title":"Gene families: the taxonomy of protein paralogs and chimeras","volume":"278","author":"Henikoff","year":"1997","journal-title":"Science"},{"key":"2023012512004207900_B18","first-page":"275","article-title":"The rapid generation of mutation data matrices from protein sequences","volume":"8","author":"Jones","year":"1992","journal-title":"Comput. Appl. Biosci."},{"key":"2023012512004207900_B19","doi-asserted-by":"crossref","first-page":"14512","DOI":"10.1073\/pnas.251526398","article-title":"A likelihood ratio test for evolutionary rate shifts and functional divergence among proteins","volume":"98","author":"Knudsen","year":"2001","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012512004207900_B20","doi-asserted-by":"crossref","first-page":"1261","DOI":"10.1093\/genetics\/164.4.1261","article-title":"Using evolutionary rates to investigate protein functional divergence and conservation. A case study of the carbonic anhydrases","volume":"164","author":"Knudsen","year":"2003","journal-title":"Genetics"},{"key":"2023012512004207900_B21","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1214\/aoms\/1177729694","article-title":"On information and sufficiency","volume":"22","author":"Kullback","year":"1951","journal-title":"Ann. Math. Stat."},{"key":"2023012512004207900_B22","doi-asserted-by":"crossref","first-page":"1095","DOI":"10.1093\/molbev\/msh112","article-title":"A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process","volume":"21","author":"Lartillot","year":"2004","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B23","doi-asserted-by":"crossref","first-page":"1307","DOI":"10.1093\/molbev\/msn067","article-title":"An improved general amino acid replacement matrix","volume":"25","author":"Le","year":"2008","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B24","first-page":"14","article-title":"Evolution of duplicated genes","volume-title":"Evolution of Genes and Proteins","author":"Li","year":"1983"},{"key":"2023012512004207900_B25","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1006\/jmbi.1996.0167","article-title":"An evolutionary trace method defines binding surfaces common to protein families","volume":"257","author":"Lichtarge","year":"1996","journal-title":"J. Mol. Biol."},{"key":"2023012512004207900_B26","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1109\/18.61115","article-title":"Divergence measures based on the Shannon entropy","volume":"37","author":"Lin","year":"1991","journal-title":"IEEE Trans. Informat. 
Theory"},{"key":"2023012512004207900_B27","doi-asserted-by":"crossref","first-page":"8126","DOI":"10.1074\/jbc.M312671200","article-title":"Evolutionary trace of G protein-coupled receptors reveals clusters of residues that determine global and class-specific functions","volume":"279","author":"Madabushi","year":"2004","journal-title":"J. Biol. Chem."},{"key":"2023012512004207900_B28","doi-asserted-by":"crossref","first-page":"1265","DOI":"10.1016\/j.jmb.2003.12.078","article-title":"A family of evolution-entropy hybrid methods for ranking protein residues by importance","volume":"336","author":"Mihalek","year":"2004","journal-title":"J. Mol. Biol."},{"key":"2023012512004207900_B29","doi-asserted-by":"crossref","first-page":"793","DOI":"10.1006\/jmbi.2001.4630","article-title":"Surface map comparison: studying function diversity of homologous proteins","volume":"309","author":"Pawlowski","year":"2001","journal-title":"J. Mol. Biol."},{"key":"2023012512004207900_B30","doi-asserted-by":"crossref","first-page":"6540","DOI":"10.1093\/nar\/gkl901","article-title":"Sequence comparison by sequence harmony identifies subtype-specific functional sites","volume":"34","author":"Pirovano","year":"2006","journal-title":"Nucleic Acids Res."},{"key":"2023012512004207900_B31","doi-asserted-by":"crossref","first-page":"1641","DOI":"10.1093\/molbev\/msp077","article-title":"FastTree: computing large minimum-evolution trees with profiles instead of a distance matrix","volume":"26","author":"Price","year":"2009","journal-title":"Mol. Biol. 
Evol."},{"key":"2023012512004207900_B32","doi-asserted-by":"crossref","first-page":"e9490","DOI":"10.1371\/journal.pone.0009490","article-title":"FastTree 2 \u2013 approximately maximum-likelihood trees for large alignments","volume":"5","author":"Price","year":"2010","journal-title":"PLoS One"},{"key":"2023012512004207900_B33","first-page":"1046","article-title":"Evolutionary identification of a subtype specific functional site in the ligand binding domain of steroid receptors","volume":"1057","author":"Raviscioni","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B34","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/prot.22141","article-title":"Rapid comparison of properties on protein surface","volume":"73","author":"Sael","year":"2008","journal-title":"Proteins"},{"key":"2023012512004207900_B35","doi-asserted-by":"crossref","first-page":"617","DOI":"10.1093\/bioinformatics\/btq008","article-title":"Active site prediction using evolutionary and structural information","volume":"26","author":"Sankararaman","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B36","doi-asserted-by":"crossref","first-page":"502","DOI":"10.1093\/bioinformatics\/18.3.502","article-title":"TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing","volume":"18","author":"Schmidt","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B37","first-page":"327","article-title":"Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology","volume":"12","author":"Sj\u00f6lander","year":"1996","journal-title":"Comput. Appl. 
Biosci."},{"key":"2023012512004207900_B38","doi-asserted-by":"crossref","first-page":"2688","DOI":"10.1093\/bioinformatics\/btl446","article-title":"RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models","volume":"22","author":"Stamatakis","year":"2006","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B39","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1093\/molbev\/msl195","article-title":"indel-Seq-Gen: a new protein family simulator incorporating domains, motifs, and indels","volume":"24","author":"Strope","year":"2007","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B40","doi-asserted-by":"crossref","first-page":"2581","DOI":"10.1093\/molbev\/msp174","article-title":"Biological sequence simulation for testing complex evolutionary hypotheses: indel-Seq-Gen version 2.0","volume":"26","author":"Strope","year":"2009","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B41","doi-asserted-by":"crossref","first-page":"1514","DOI":"10.1093\/oxfordjournals.molbev.a004214","article-title":"Testing for differences in rates-across-sites distributions in phylogenetic trees","volume":"19","author":"Susko","year":"2002","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B42","doi-asserted-by":"crossref","first-page":"351","DOI":"10.1007\/s00239-004-0352-9","article-title":"Biases in phylogenetic estimation can be caused by random sequence segments","volume":"61","author":"Susko","year":"2005","journal-title":"J. Mol. Evol."},{"key":"2023012512004207900_B43","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1186\/1471-2148-8-331","article-title":"A class frequency mixture model that adjusts for site-specific amino acid frequencies and improves inference of protein phylogeny","volume":"8","author":"Wang","year":"2008","journal-title":"BMC Evol. 
Biol."},{"key":"2023012512004207900_B44","doi-asserted-by":"crossref","first-page":"691","DOI":"10.1093\/oxfordjournals.molbev.a003851","article-title":"A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach","volume":"18","author":"Whelan","year":"2001","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B45","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1093\/oxfordjournals.molbev.a025811","article-title":"Bayesian phylogenetic inferences using DNA sequences: a Markov chain Monte Carlo method","volume":"14","author":"Yang","year":"1997","journal-title":"Mol. Biol. Evol."},{"key":"2023012512004207900_B46","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1093\/bioinformatics\/btm537","article-title":"Multi-RELIEF: a method to recognize specificity determining residues from multiple sequence alignments using a Machine-Learning approach for feature weighting","volume":"24","author":"Ye","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012512004207900_B47","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1080\/10635150290102339","article-title":"Increased taxon sampling greatly reduces phylogenetic error","volume":"51","author":"Zwickl","year":"2002","journal-title":"Syst. 
Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/19\/2655\/48868934\/bioinformatics_27_19_2655.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/19\/2655\/48868934\/bioinformatics_27_19_2655.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T13:54:54Z","timestamp":1674654894000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/19\/2655\/232361"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,8,11]]},"references-count":47,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2011,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr470","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,10,1]]},"published":{"date-parts":[[2011,8,11]]}}}