{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T15:58:09Z","timestamp":1770739089182,"version":"3.49.0"},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"S4","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2007,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Phylogenetic profiles record the occurrence of homologs of genes across fully sequenced organisms. Proteins with similar profiles are typically components of protein complexes or metabolic pathways. Various existing methods measure similarity between two profiles and, hence, the likelihood that the two proteins co-evolve. Some methods ignore phylogenetic relationships between organisms while others account for such with metrics that explicitly model the likelihood of two proteins co-evolving on a tree. The latter methods more sensitively detect co-evolving proteins, but at a significant computational cost. Here we propose a novel heuristic to improve phylogenetic profile analysis that accounts for phylogenetic relationships between genomes in a computationally efficient fashion. We first order the genomes within profiles and then enumerate runs of consecutive matches and accurately compute the probability of observing these. We hypothesize that profiles with many runs are more likely to involve functionally related proteins than profiles in which all the matches are concentrated in one interval of the tree.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We compared our approach to various previously published methods that both ignore and incorporate the underlying phylogeny between organisms. To evaluate performance, we compare the functional similarity of rank-ordered lists of protein pairs that share similar phylogenetic profiles by assessing significance of overlap in their Gene Ontology annotations. Accounting for runs in phylogenetic profile matches improves our ability to identify functionally related pairs of proteins. Furthermore, the networks that result from our approach tend to have smaller clusters of co-evolving proteins than networks computed using previous approaches and are thus more useful for inferring functional relationships. Finally, we report that our approach is orders of magnitude more computationally efficient than full tree-based methods.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>We have developed an improved method for analyzing phylogenetic profiles. The method allows us to more accurately and efficiently infer functional relationships between proteins based on these profiles than other published approaches. As the number of fully sequenced genomes increases, it becomes more important to account for evolutionary relationships among organisms in comparative analyses. Our approach, therefore, serves as an important example of how these relationships may be accounted for in an efficient manner.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-8-s4-s7","type":"journal-article","created":{"date-parts":[[2007,5,23]],"date-time":"2007-05-23T16:35:21Z","timestamp":1179938121000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":41,"title":["An improved method for identifying functionally linked proteins using phylogenetic profiles"],"prefix":"10.1186","volume":"8","author":[{"given":"Shawn","family":"Cokus","sequence":"first","affiliation":[]},{"given":"Sayaka","family":"Mizutani","sequence":"additional","affiliation":[]},{"given":"Matteo","family":"Pellegrini","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2007,5,22]]},"reference":[{"key":"1910_CR1","doi-asserted-by":"publisher","first-page":"4285","DOI":"10.1073\/pnas.96.8.4285","volume":"96","author":"M Pellegrini","year":"1999","unstructured":"Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO: Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc Natl Acad Sci USA 1999, 96: 4285\u20134288. 10.1073\/pnas.96.8.4285","journal-title":"Proc Natl Acad Sci USA"},{"key":"1910_CR2","doi-asserted-by":"publisher","first-page":"1524","DOI":"10.1093\/bioinformatics\/btg187","volume":"19","author":"J Wu","year":"2003","unstructured":"Wu J, Kasif S, DeLisi C: Identification of functional links between genes using phylogenetic profiles. Bioinformatics 2003, 19: 1524\u20131530. 10.1093\/bioinformatics\/btg187","journal-title":"Bioinformatics"},{"key":"1910_CR3","doi-asserted-by":"publisher","first-page":"1055","DOI":"10.1038\/nbt861","volume":"21","author":"SV Date","year":"2003","unstructured":"Date SV, Marcotte EM: Discovery of uncharacterized cellular systems by genome-wide analysis of functional linkages. Nat Biotechnol 2003, 21: 1055\u20131062. 10.1038\/nbt861","journal-title":"Nat Biotechnol"},{"issue":"Suppl 1","key":"1910_CR4","doi-asserted-by":"publisher","first-page":"S276","DOI":"10.1093\/bioinformatics\/18.suppl_1.S276","volume":"18","author":"JP Vert","year":"2002","unstructured":"Vert JP: A tree kernel to analyse phylogenetic profiles. Bioinformatics 2002, 18(Suppl 1):S276-S284.","journal-title":"Bioinformatics"},{"key":"1910_CR5","doi-asserted-by":"publisher","first-page":"e3","DOI":"10.1371\/journal.pcbi.0010003","volume":"1","author":"D Barker","year":"2005","unstructured":"Barker D, Pagel M: Predicting functional gene links from phylogenetic-statistical analyses of whole genomes. PLoS Comput Biol 2005, 1: e3. 10.1371\/journal.pcbi.0010003","journal-title":"PLoS Comput Biol"},{"key":"1910_CR6","doi-asserted-by":"publisher","first-page":"1150","DOI":"10.1016\/j.jmb.2006.04.011","volume":"359","author":"Y Zhou","year":"2006","unstructured":"Zhou Y, Wang R, Li L, Xia X, Sun Z: Inferring functional linkages between proteins from evolutionary scenarios. J Mol Biol 2006, 359: 1150\u20131159. 10.1016\/j.jmb.2006.04.011","journal-title":"J Mol Biol"},{"key":"1910_CR7","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1093\/bioinformatics\/btl558","volume":"23","author":"D Barker","year":"2007","unstructured":"Barker D, Meade A, Pagel M: Constrained models of evolution lead to improved prediction of functional linkage from correlated gain and loss of genes. Bioinformatics 2007, 23: 14\u201320. 10.1093\/bioinformatics\/btl558","journal-title":"Bioinformatics"},{"key":"1910_CR8","doi-asserted-by":"publisher","first-page":"985","DOI":"10.1016\/j.bbrc.2006.12.146","volume":"353","author":"J Sun","year":"2007","unstructured":"Sun J, Li Y, Zhao Z: Phylogenetic profiles for the prediction of protein-protein interactions: How to select reference organisms? Biochem Biophys Res Commun 2007, 353: 985\u2013991. 10.1016\/j.bbrc.2006.12.146","journal-title":"Biochem Biophys Res Commun"},{"key":"1910_CR9","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1186\/1471-2105-7-177","volume":"7","author":"P Kharchenko","year":"2006","unstructured":"Kharchenko P, Chen L, Freund Y, Vitkup D, Church GM: Identifying metabolic enzymes with multiple types of association evidence. BMC Bioinformatics 2006, 7: 177. 10.1186\/1471-2105-7-177","journal-title":"BMC Bioinformatics"},{"key":"1910_CR10","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1038\/75556","volume":"25","author":"M Ashburner","year":"2000","unstructured":"Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al.: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000, 25: 25\u201329. 10.1038\/75556","journal-title":"Nat Genet"},{"key":"1910_CR11","doi-asserted-by":"publisher","first-page":"R35","DOI":"10.1186\/gb-2004-5-5-r35","volume":"5","author":"PM Bowers","year":"2004","unstructured":"Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D: Prolinks: a database of protein functional linkages derived from coevolution. Genome Biol 2004, 5: R35. 10.1186\/gb-2004-5-5-r35","journal-title":"Genome Biol"},{"key":"1910_CR12","doi-asserted-by":"publisher","first-page":"2006.0005","DOI":"10.1038\/msb4100047","volume":"2","author":"N Slonim","year":"2006","unstructured":"Slonim N, Elemento O, Tavazoie S: Ab initio genotype-phenotype association reveals intrinsic modularity in genetic networks. Mol Syst Biol 2006, 2: 2006.0005. 10.1038\/msb4100047","journal-title":"Mol Syst Biol"},{"key":"1910_CR13","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","volume":"215","author":"SF Altschul","year":"1990","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215: 403\u2013410.","journal-title":"J Mol Biol"},{"key":"1910_CR14","doi-asserted-by":"publisher","first-page":"4218","DOI":"10.1093\/nar\/27.21.4218","volume":"27","author":"ST Fitz-Gibbon","year":"1999","unstructured":"Fitz-Gibbon ST, House CH: Whole genome-based phylogenetic analysis of free-living microorganisms. Nucleic Acids Res 1999, 27: 4218\u20134222. 10.1093\/nar\/27.21.4218","journal-title":"Nucleic Acids Res"},{"key":"1910_CR15","doi-asserted-by":"publisher","first-page":"1070","DOI":"10.1093\/bioinformatics\/btg030","volume":"19","author":"Z Bar-Joseph","year":"2003","unstructured":"Bar-Joseph Z, Demaine ED, Gifford DK, Srebro N, Hamel AM, Jaakkola TS: K -ary clustering with optimal leaf ordering for gene expression data. Bioinformatics 2003, 19: 1070\u20131078. 10.1093\/bioinformatics\/btg030","journal-title":"Bioinformatics"},{"key":"1910_CR16","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1038\/nbt1065","volume":"23","author":"H Li","year":"2005","unstructured":"Li H, Pellegrini M, Eisenberg D: Detection of parallel functional modules by comparative analysis of genome sequences. Nat Biotechnol 2005, 23: 253\u2013260. 10.1038\/nbt1065","journal-title":"Nat Biotechnol"},{"key":"1910_CR17","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature03959","volume":"437","author":"M Margulies","year":"2005","unstructured":"Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, et al.: Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005, 437: 376\u2013380.","journal-title":"Nature"},{"key":"1910_CR18","unstructured":"Supplementary material for \" Ab initio genotype-phenotype association reveals intrinsic modularity in genetic networks\"[http:\/\/tavazoielab.princeton.edu\/genphen\/]"},{"key":"1910_CR19","unstructured":"UniProt GOA proteome sets[http:\/\/www.ebi.ac.uk\/GOA\/proteomes.html]"},{"key":"1910_CR20","unstructured":"GO downloads[http:\/\/www.geneontology.org\/GO.downloads.shtml]"},{"key":"1910_CR21","unstructured":"Batch Entrez[http:\/\/www.ncbi.nlm.nih.gov\/entrez\/batchentrez.cgi]"},{"key":"1910_CR22","unstructured":"Map a batch of IDs in the i ProClass database[http:\/\/pir.georgetown.edu\/pirwww\/search\/idmapping.shtml]"},{"key":"1910_CR23","unstructured":"Reading Evolutionary Biology Group \u2013 BayesTraits[http:\/\/www.evolution.rdg.ac.uk\/BayesTraits.html]"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-8-S4-S7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,31]],"date-time":"2021-08-31T21:19:38Z","timestamp":1630444778000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-8-S4-S7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,5]]},"references-count":23,"journal-issue":{"issue":"S4","published-print":{"date-parts":[[2007,5]]}},"alternative-id":["1910"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-8-s4-s7","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2007,5]]},"assertion":[{"value":"22 May 2007","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S7"}}