{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,12]],"date-time":"2026-05-12T20:48:32Z","timestamp":1778618912954,"version":"3.51.4"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"14","license":[{"start":{"date-parts":[[2017,7,12]],"date-time":"2017-07-12T00:00:00Z","timestamp":1499817600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Accurate orthology inference is a fundamental step in many phylogenetics and comparative analysis. Many methods have been proposed, including OMA (Orthologous MAtrix). Yet substantial challenges remain, in particular in coping with fragmented genes or genes evolving at different rates after duplication, and in scaling to large datasets. With more and more genomes available, it is necessary to improve the scalability and robustness of orthology inference methods.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We present improvements in the OMA algorithm: (i) refining the pairwise orthology inference step to account for same-species paralogs evolving at different rates, and (ii) minimizing errors in the pairwise orthology verification step by testing the consistency of pairwise distance estimates, which can be problematic in the presence of fragmentary sequences. In addition we introduce a more scalable procedure for hierarchical orthologous group (HOG) clustering, which are several orders of magnitude faster on large datasets. Using the Quest for Orthologs consortium orthology benchmark service, we show that these changes translate into substantial improvement on multiple empirical datasets.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and Implementation<\/jats:title>\n                  <jats:p>This new OMA 2.0 algorithm is used in the OMA database (http:\/\/omabrowser.org) from the March 2017 release onwards, and can be run on custom genomes using OMA standalone version 2.0 and above (http:\/\/omabrowser.org\/standalone).<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx229","type":"journal-article","created":{"date-parts":[[2017,4,20]],"date-time":"2017-04-20T07:52:13Z","timestamp":1492674733000},"page":"i75-i82","source":"Crossref","is-referenced-by-count":97,"title":["Orthologous Matrix (OMA) algorithm 2.0: more robust to asymmetric evolutionary rates and more scalable hierarchical orthologous group inference"],"prefix":"10.1093","volume":"33","author":[{"given":"Cl\u00e9ment-Marie","family":"Train","sequence":"first","affiliation":[{"name":"Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland"},{"name":"Swiss Institute of Bioinformatics, Lausanne, Switzerland"},{"name":"Center of Integrative Genomics, University of Lausanne, Lausanne, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Natasha M","family":"Glover","sequence":"additional","affiliation":[{"name":"Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland"},{"name":"Swiss Institute of Bioinformatics, Lausanne, Switzerland"},{"name":"Center of Integrative Genomics, University of Lausanne, Lausanne, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gaston H","family":"Gonnet","sequence":"additional","affiliation":[{"name":"Department of Computer Science, ETH Zurich, Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Adrian M","family":"Altenhoff","sequence":"additional","affiliation":[{"name":"Swiss Institute of Bioinformatics, Lausanne, Switzerland"},{"name":"Department of Computer Science, ETH Zurich, Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Christophe","family":"Dessimoz","sequence":"additional","affiliation":[{"name":"Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland"},{"name":"Swiss Institute of Bioinformatics, Lausanne, Switzerland"},{"name":"Center of Integrative Genomics, University of Lausanne, Lausanne, Switzerland"},{"name":"Department of Genetics, Evolution and Environment, University College London, London, UK"},{"name":"Department of Computer Science, University College London, London, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2017,7,12]]},"reference":[{"key":"2023051506494159700_btx229-B1","doi-asserted-by":"crossref","first-page":"e53786.","DOI":"10.1371\/journal.pone.0053786","article-title":"Inferring hierarchical orthologous groups from orthologous gene pairs","volume":"8","author":"Altenhoff","year":"2013","journal-title":"PLoS One"},{"key":"2023051506494159700_btx229-B2","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1038\/nmeth.3830","article-title":"Standardized benchmarking in the quest for orthologs","volume":"13","author":"Altenhoff","year":"2016","journal-title":"Nat. Methods"},{"key":"2023051506494159700_btx229-B3","doi-asserted-by":"crossref","first-page":"D240","DOI":"10.1093\/nar\/gku1158","article-title":"The OMA orthology database in 2015: function predictions, better plant support, synteny view and other improvements","volume":"43","author":"Altenhoff","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023051506494159700_btx229-B4","doi-asserted-by":"crossref","first-page":"e1000262.","DOI":"10.1371\/journal.pcbi.1000262","article-title":"Phylogenetic and functional assessment of orthologs inference projects and methods","volume":"5","author":"Altenhoff","year":"2009","journal-title":"PLoS Comput. Biol"},{"key":"2023051506494159700_btx229-B5","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1093\/molbev\/msw153","article-title":"A new orthology assessment method for phylogenomic data: unrooted phylogenetic orthology","volume":"33","author":"Ballesteros","year":"2016","journal-title":"Mol. Biol. Evol"},{"key":"2023051506494159700_btx229-B6","doi-asserted-by":"crossref","first-page":"423","DOI":"10.1093\/bib\/bbr034","article-title":"Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees","volume":"12","author":"Boeckmann","year":"2011","journal-title":"Brief. Bioinformatics"},{"key":"2023051506494159700_btx229-B7","doi-asserted-by":"crossref","first-page":"1988","DOI":"10.1093\/gbe\/evv121","article-title":"Quest for orthologs entails quest for tree of life: in search of the gene stream","volume":"7","author":"Boeckmann","year":"2015","journal-title":"Genome Biol. Evol"},{"key":"2023051506494159700_btx229-B8","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1101\/gr.141978.112","article-title":"Genome-scale coestimation of species and gene trees","volume":"23","author":"Boussau","year":"2012","journal-title":"Genome Res"},{"key":"2023051506494159700_btx229-B9","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/0095-8956(74)90047-1","article-title":"A note on the metric properties of trees","volume":"17","author":"Buneman","year":"1974","journal-title":"J. Combin. Theory Ser. B"},{"key":"2023051506494159700_btx229-B10","author":"Cormen","year":"2009"},{"key":"2023051506494159700_btx229-B11","doi-asserted-by":"crossref","first-page":"1800","DOI":"10.1093\/gbe\/evt132","article-title":"Bidirectional best hits miss many orthologs in duplication-rich clades such as plants and animals","volume":"5","author":"Dalquen","year":"2013","journal-title":"Genome Biol. Evol"},{"key":"2023051506494159700_btx229-B12","doi-asserted-by":"crossref","first-page":"3309","DOI":"10.1093\/nar\/gkl433","article-title":"Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits","volume":"34","author":"Dessimoz","year":"2006","journal-title":"Nucleic Acids Res"},{"key":"2023051506494159700_btx229-B13","doi-asserted-by":"crossref","first-page":"529.","DOI":"10.1186\/1471-2105-7-529","article-title":"Fast estimation of the difference between two PAM\/JTT evolutionary distances in triplets of homologous sequences","volume":"7","author":"Dessimoz","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023051506494159700_btx229-B14","first-page":"61","volume-title":"RECOMB 2005 Workshop on Comparative Genomics","author":"Dessimoz","year":"2005"},{"key":"2023051506494159700_btx229-B15","doi-asserted-by":"crossref","first-page":"320","DOI":"10.1089\/cmb.2006.13.320","article-title":"A hybrid micro-macroevolutionary approach to gene tree reconstruction","volume":"13","author":"Durand","year":"2006","journal-title":"J. Comput. Biol"},{"key":"2023051506494159700_btx229-B16","doi-asserted-by":"crossref","first-page":"99","DOI":"10.2307\/2412448","article-title":"Distinguishing homologous from analogous proteins","volume":"19","author":"Fitch","year":"1970","journal-title":"Syst. Zool"},{"key":"2023051506494159700_btx229-B17","doi-asserted-by":"crossref","first-page":"D271","DOI":"10.1093\/nar\/gkm845","article-title":"OrthoDB: the hierarchical catalog of eukaryotic orthologs","volume":"36","author":"Kriventseva","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023051506494159700_btx229-B18","doi-asserted-by":"crossref","first-page":"S12.","DOI":"10.1186\/1471-2164-15-S6-S12","article-title":"Orthology and paralogy constraints: satisfiability and consistency","volume":"15(Suppl 6)","author":"Lafond","year":"2014","journal-title":"BMC Genomics"},{"key":"2023051506494159700_btx229-B19","doi-asserted-by":"crossref","first-page":"2178","DOI":"10.1101\/gr.1224503","article-title":"OrthoMCL: identification of ortholog groups for eukaryotic genomes","volume":"13","author":"Li","year":"2003","journal-title":"Genome Res"},{"key":"2023051506494159700_btx229-B20","doi-asserted-by":"crossref","first-page":"11.","DOI":"10.1186\/1471-2105-12-11","article-title":"OrthoInspector: comprehensive orthology analysis and visual exploration","volume":"12","author":"Linard","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023051506494159700_btx229-B21","doi-asserted-by":"crossref","first-page":"2896","DOI":"10.1073\/pnas.96.6.2896","article-title":"The use of gene clusters to infer functional coupling","volume":"96","author":"Overbeek","year":"1999","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023051506494159700_btx229-B22","doi-asserted-by":"crossref","first-page":"e1000602.","DOI":"10.1371\/journal.pbio.1000602","article-title":"Resolving difficult phylogenetic questions: why more sequences are not enough","volume":"9","author":"Philippe","year":"2011","journal-title":"PLoS Biol"},{"key":"2023051506494159700_btx229-B23","doi-asserted-by":"crossref","first-page":"1041","DOI":"10.1006\/jmbi.2000.5197","article-title":"Automatic clustering of orthologs and in-paralogs from pairwise species comparisons","volume":"314","author":"Remm","year":"2001","journal-title":"J. Mol. Biol"},{"key":"2023051506494159700_btx229-B24","doi-asserted-by":"crossref","first-page":"518.","DOI":"10.1186\/1471-2105-9-518","article-title":"Algorithm of OMA for large-scale orthology inference","volume":"9","author":"Roth","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023051506494159700_btx229-B25","doi-asserted-by":"crossref","first-page":"2072","DOI":"10.1016\/j.jmb.2013.02.018","article-title":"Hieranoid: hierarchical orthology inference","volume":"425","author":"Schreiber","year":"2013","journal-title":"J. Mol. Biol"},{"key":"2023051506494159700_btx229-B26","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol"},{"key":"2023051506494159700_btx229-B27","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1126\/science.278.5338.631","article-title":"A genomic perspective on protein families","volume":"278","author":"Tatusov","year":"1997","journal-title":"Science"},{"key":"2023051506494159700_btx229-B28","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1002\/bies.201100062","article-title":"Orthology prediction methods: a quality assessment using curated protein families","volume":"33","author":"Trachana","year":"2011","journal-title":"Bioessays"},{"key":"2023051506494159700_btx229-B29","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1101\/gr.073585.107","article-title":"EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates","volume":"19","author":"Vilella","year":"2008","journal-title":"Genome Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/14\/i75\/50314978\/bioinformatics_33_14_i75.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/14\/i75\/50314978\/bioinformatics_33_14_i75.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T06:50:23Z","timestamp":1684133423000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/14\/i75\/3953943"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,7,12]]},"references-count":29,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2017,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx229","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,7,15]]},"published":{"date-parts":[[2017,7,12]]}}}