{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T16:00:19Z","timestamp":1769875219405,"version":"3.49.0"},"reference-count":41,"publisher":"Oxford University Press (OUP)","issue":"7","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2006,4,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Determining orthology relations among genes across multiple genomes is an important problem in the post-genomic era. Identifying orthologous genes can not only help predict functional annotations for newly sequenced or poorly characterized genomes, but can also help predict new protein\u2013protein interactions. Unfortunately, determining orthology relation through computational methods is not straightforward due to the presence of paralogs. Traditional approaches have relied on pairwise sequence comparisons to construct graphs, which were then partitioned into putative clusters of orthologous groups. These methods do not attempt to preserve the non-transitivity and hierarchic nature of the orthology relation.<\/jats:p>\n               <jats:p>Results: We propose a new method, COCO-CL, for hierarchical clustering of homology relations and identification of orthologous groups of genes. Unlike previous approaches, which are based on pairwise sequence comparisons, our method explores the correlation of evolutionary histories of individual genes in a more global context. COCO-CL can be used as a semi-independent method to delineate the orthology\/paralogy relation for a refined set of homologous proteins obtained using a less-conservative clustering approach, or as a refiner that removes putative out-paralogs from clusters computed using a more inclusive approach. We analyze our clustering results manually, with support from literature and functional annotations. 
Since our orthology determination procedure does not employ a species tree to infer duplication events, it can be used in situations when the species tree is unknown or uncertain.<\/jats:p>\n               <jats:p>Contact: \u00a0jothi@mail.nih.gov, przytyck@mail.nih.gov<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary materials are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btl009","type":"journal-article","created":{"date-parts":[[2006,1,25]],"date-time":"2006-01-25T02:48:15Z","timestamp":1138157295000},"page":"779-788","source":"Crossref","is-referenced-by-count":55,"title":["COCO-CL: hierarchical clustering of homology relations based on evolutionary correlations"],"prefix":"10.1093","volume":"22","author":[{"given":"Raja","family":"Jothi","sequence":"first","affiliation":[{"name":"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA"}]},{"given":"Elena","family":"Zotenko","sequence":"additional","affiliation":[{"name":"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA"},{"name":"Department of Computer Science, University of Maryland, College Park, MD 20742, USA"}]},{"given":"Asba","family":"Tasneem","sequence":"additional","affiliation":[{"name":"Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA"}]},{"given":"Teresa M.","family":"Przytycka","sequence":"additional","affiliation":[{"name":"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, 
USA"}]}],"member":"286","published-online":{"date-parts":[[2006,1,24]]},"reference":[{"key":"2023012409121960700_b1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol."},{"key":"2023012409121960700_b2","doi-asserted-by":"crossref","first-page":"1910","DOI":"10.1093\/oxfordjournals.molbev.a004015","article-title":"Multiple ribonuclease H-encoding genes in the Caenorhabditis elegans genome contrasts with the two typical ribonuclease H-encoding genes in the human genome","volume":"19","author":"Arudchandran","year":"2002","journal-title":"Mol. Biol. Evol."},{"key":"2023012409121960700_b3","doi-asserted-by":"crossref","first-page":"24585","DOI":"10.1074\/jbc.274.35.24585","article-title":"Signal peptide peptidase- and ClpP-like proteins of bacillus subtilis required for efficient translocation and processing of secretory proteins","volume":"274","author":"Bolhuis","year":"1999","journal-title":"J. Biol. Chem."},{"key":"2023012409121960700_b4","doi-asserted-by":"crossref","first-page":"485","DOI":"10.1007\/PL00006571","article-title":"Gene descent, duplication, and horizontal transfer in the evolution of glutamyl- and glutaminyl-tRNA synthetases","volume":"49","author":"Brown","year":"1999","journal-title":"J. Mol. Evol."},{"key":"2023012409121960700_b5","doi-asserted-by":"crossref","first-page":"429","DOI":"10.1089\/106652700750050871","article-title":"NOTUNG: a program for dating gene duplication and optimizing gene family trees","volume":"7","author":"Chen","year":"2000","journal-title":"J. Comput. 
Biol."},{"key":"2023012409121960700_b6","doi-asserted-by":"crossref","first-page":"2022","DOI":"10.1126\/science.282.5396.2022","article-title":"Comparison of the complete protein sets of worm and yeast: orthology and divergence","volume":"282","author":"Chervitz","year":"1998","journal-title":"Science"},{"key":"2023012409121960700_b7","doi-asserted-by":"crossref","first-page":"2596","DOI":"10.1093\/bioinformatics\/bti325","article-title":"Tree pattern matching in phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases","volume":"21","author":"Dufayard","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012409121960700_b8","first-page":"250","article-title":"A hybrid micro-macro approach to gene tree reconstruction","author":"Durand","year":"2005","journal-title":"RECOMB"},{"key":"2023012409121960700_b9","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1006\/tpbi.2002.1594","article-title":"Phylogenetic analysis and gene functional predictions: phylogenomics in action","volume":"61","author":"Eisen","year":"2002","journal-title":"Theor. Popul. Biol."},{"key":"2023012409121960700_b10","doi-asserted-by":"crossref","first-page":"e45","DOI":"10.1371\/journal.pcbi.0010045","article-title":"Protein molecular function prediction by Bayesian phylogenomics","volume":"1","author":"Engelhardt","year":"2005","journal-title":"PLoS Computat. Biol."},{"key":"2023012409121960700_b11","doi-asserted-by":"crossref","first-page":"99","DOI":"10.2307\/2412448","article-title":"Distinguishing homologous from analogous proteins","volume":"19","author":"Fitch","year":"1970","journal-title":"Syst. 
Zool."},{"key":"2023012409121960700_b12","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1016\/S0168-9525(00)02005-9","article-title":"Homology a personal view on some of the problems","volume":"16","author":"Fitch","year":"2000","journal-title":"Trends Genet."},{"key":"2023012409121960700_b13","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1006\/jmbi.2000.3732","article-title":"Co-evolution of proteins with their interaction partners","volume":"299","author":"Goh","year":"2000","journal-title":"J. Mol. Biol."},{"key":"2023012409121960700_b14","doi-asserted-by":"crossref","first-page":"177","DOI":"10.1016\/S0022-2836(02)01038-0","article-title":"Co-evolutionary analysis reveals insights into protein\u2013protein interactions","volume":"324","author":"Goh","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023012409121960700_b15","doi-asserted-by":"crossref","first-page":"132","DOI":"10.2307\/2412519","article-title":"Fitting the gene lineage into its species lineage: a parsimony strategy illustrated by cladograms constructed from globin sequences","volume":"28","author":"Goodman","year":"1979","journal-title":"Syst. Zool."},{"key":"2023012409121960700_b16","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1006\/mpev.1996.0071","article-title":"Reconstruction of ancient phylogenies","volume":"6","author":"Guigo","year":"1996","journal-title":"Mol. Phylogenet. 
Evol."},{"key":"2023012409121960700_b17","doi-asserted-by":"crossref","first-page":"241","DOI":"10.1007\/BF02289588","article-title":"Hierarchical clustering schemes","volume":"2","author":"Johnson","year":"1967","journal-title":"Psychometrika"},{"key":"2023012409121960700_b18","doi-asserted-by":"crossref","first-page":"i241","DOI":"10.1093\/bioinformatics\/bti1009","article-title":"Predicting protein\u2013protein interaction by searching evolutionary tree automorphism space","volume":"21","author":"Jothi","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012409121960700_b19","doi-asserted-by":"crossref","first-page":"8670","DOI":"10.1073\/pnas.91.18.8670","article-title":"Evolution of the Glx-tRNA synthetase family: the glutaminyl enzyme as a case of horizontal gene transfer","volume":"91","author":"Lamour","year":"1994","journal-title":"Proc. Natl Acad. Sci. USA."},{"key":"2023012409121960700_b20","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1101\/gr.212002","article-title":"Cross-referencing eukaryotic genomes: TIGR Orthologous Gene Alignments (TOGA)","volume":"12","author":"Lee","year":"2002","journal-title":"Genome Res."},{"key":"2023012409121960700_b21","doi-asserted-by":"crossref","first-page":"2178","DOI":"10.1101\/gr.1224503","article-title":"OrthoMCL: identification of ortholog groups for eukaryotic genomes","volume":"13","author":"Li","year":"2003","journal-title":"Genome Res."},{"key":"2023012409121960700_b22","doi-asserted-by":"crossref","first-page":"9407","DOI":"10.1073\/pnas.95.16.9407","article-title":"Evolutionary parameters of the transcribed mammalian genome: an analysis of 2,820 orthologous rodent and human sequences","volume":"95","author":"Makalowski","year":"1998","journal-title":"Proc. Natl Acad. Sci. 
USA."},{"key":"2023012409121960700_b23","doi-asserted-by":"crossref","first-page":"D192","DOI":"10.1093\/nar\/gki069","article-title":"CDD: a conserved domain database for protein classification","volume":"33","author":"Marchler-Bauer","year":"2005","journal-title":"Nucleic Acids Res."},{"key":"2023012409121960700_b24","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1089\/cmb.1995.2.493","article-title":"A biologically consistent model for comparing molecular phylogenies","volume":"2","author":"Mirkin","year":"1995","journal-title":"J. Comput. Biol."},{"key":"2023012409121960700_b25","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1006\/mpev.1996.0390","article-title":"From gene to organismal phylogeny: reconciled trees and gene tree\/species tree problem","volume":"7","author":"Page","year":"1997","journal-title":"Mol. Phylogenet. Evol."},{"key":"2023012409121960700_b26","first-page":"58","article-title":"Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas","volume":"43","author":"Page","year":"1994","journal-title":"Syst. Biol."},{"key":"2023012409121960700_b27","doi-asserted-by":"crossref","first-page":"609","DOI":"10.1093\/protein\/14.9.609","article-title":"Similarity of phylogenetic trees as indicator of protein\u2013protein interaction","volume":"14","author":"Pazos","year":"2001","journal-title":"Protein Eng."},{"key":"2023012409121960700_b28","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1016\/S0022-2836(03)00114-1","article-title":"Exploiting co-evolution of interacting proteins to discover interaction specificity","volume":"327","author":"Ramani","year":"2003","journal-title":"J. Mol. Biol."},{"key":"2023012409121960700_b29","doi-asserted-by":"crossref","first-page":"1041","DOI":"10.1006\/jmbi.2000.5197","article-title":"Automatic clustering of orthologs and in-paralogs from pairwise species comparisons","volume":"314","author":"Remm","year":"2002","journal-title":"J. Mol. 
Biol."},{"key":"2023012409121960700_b30","doi-asserted-by":"crossref","first-page":"1974","DOI":"10.1073\/pnas.0409522102","article-title":"Conserved patterns of protein interaction in multiple species","volume":"102","author":"Sharan","year":"2005","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012409121960700_b31","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1046\/j.1432-1327.1998.2560080.x","article-title":"Modular evolution of the Glx-tRNA synthetase family\u2013rooting of the evolutionary tree between the bacteria and archaea\/eukarya branches","volume":"256","author":"Siatecka","year":"1998","journal-title":"Eur. J. Biochem."},{"key":"2023012409121960700_b32","doi-asserted-by":"crossref","first-page":"619","DOI":"10.1016\/S0168-9525(02)02793-2","article-title":"Orthology, paralogy and proposed classification for paralog subtypes","volume":"18","author":"Sonnhammer","year":"2002","journal-title":"Trends Genet."},{"key":"2023012409121960700_b33","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1093\/bioinformatics\/18.1.92","article-title":"Automated ortholog inference from phylogenetic trees and calculation of orthology reliability","volume":"18","author":"Storm","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012409121960700_b36","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1126\/science.278.5338.631","article-title":"A genomic perspective on protein families","volume":"278","author":"Tatusov","year":"1997","journal-title":"Science"},{"key":"2023012409121960700_b35","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1093\/nar\/28.1.33","article-title":"The COG database: a tool for genome-scale analysis of protein functions and evolution","volume":"28","author":"Tatusov","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012409121960700_b34","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1186\/1471-2105-4-41","article-title":"The COG database: an updated version includes 
eukaryotes","volume":"4","author":"Tatusov","year":"2003","journal-title":"BMC Bioinformatics"},{"key":"2023012409121960700_b37","doi-asserted-by":"crossref","first-page":"689","DOI":"10.1101\/gr.9.8.689","article-title":"Evolution of aminoacyl-tRNA synthetases\u2013analysis of unique domain architectures and phylogenetic trees reveals a complex history of horizontal gene transfer events","volume":"9","author":"Wolf","year":"1999","journal-title":"Genome Res."},{"key":"2023012409121960700_b38","doi-asserted-by":"crossref","first-page":"1107","DOI":"10.1101\/gr.1774904","article-title":"Annotation transfer between genomes: protein\u2013protein interologs and protein-DNA regulogs","volume":"14","author":"Yu","year":"2004","journal-title":"Genome Res."},{"key":"2023012409121960700_b39","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1093\/bioinformatics\/14.3.285","article-title":"Towards detection of orthologues in sequence databases","volume":"14","author":"Yuan","year":"1998","journal-title":"Bioinformatics"},{"key":"2023012409121960700_b40","doi-asserted-by":"crossref","first-page":"821","DOI":"10.1093\/bioinformatics\/17.9.821","article-title":"A simple algorithm to infer gene duplication and speciation events on a gene tree","volume":"17","author":"Zmasek","year":"2001","journal-title":"Bioinformatics"},{"key":"2023012409121960700_b41","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1186\/1471-2105-3-14","article-title":"RIO: analyzing proteomes by automated phylogenomics using resampled inference of orthologs","volume":"3","author":"Zmasek","year":"2002","journal-title":"BMC 
Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/7\/779\/48840988\/bioinformatics_22_7_779.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/22\/7\/779\/48840988\/bioinformatics_22_7_779.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,24]],"date-time":"2023-01-24T09:49:02Z","timestamp":1674553742000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/22\/7\/779\/202484"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,1,24]]},"references-count":41,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2006,4,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btl009","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2006,4,1]]},"published":{"date-parts":[[2006,1,24]]}}}