{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:45:57Z","timestamp":1760150757708,"version":"build-2065373602"},"reference-count":31,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2023,12,28]],"date-time":"2023-12-28T00:00:00Z","timestamp":1703721600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e a Tecnologia (FCT)","doi-asserted-by":"publisher","award":["UIDB\/50021\/2020","LA\/P\/0078\/2020","951970","IPL\/IDI&CA2023\/PhyloLearn_ISEL"],"award-info":[{"award-number":["UIDB\/50021\/2020","LA\/P\/0078\/2020","951970","IPL\/IDI&CA2023\/PhyloLearn_ISEL"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"name":"European Union\u2019s Horizon 2020","award":["UIDB\/50021\/2020","LA\/P\/0078\/2020","951970","IPL\/IDI&CA2023\/PhyloLearn_ISEL"],"award-info":[{"award-number":["UIDB\/50021\/2020","LA\/P\/0078\/2020","951970","IPL\/IDI&CA2023\/PhyloLearn_ISEL"]}]},{"name":"Instituto Polit\u00e9cnico de Lisboa","award":["UIDB\/50021\/2020","LA\/P\/0078\/2020","951970","IPL\/IDI&CA2023\/PhyloLearn_ISEL"],"award-info":[{"award-number":["UIDB\/50021\/2020","LA\/P\/0078\/2020","951970","IPL\/IDI&CA2023\/PhyloLearn_ISEL"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>There are several tools available to infer phylogenetic trees, which depict the evolutionary relationships among biological entities such as viral and bacterial strains in infectious outbreaks or cancerous cells in tumor progression trees. These tools rely on several inference methods available to produce phylogenetic trees, with resulting trees not being unique. Thus, methods for comparing phylogenies that are capable of revealing where two phylogenetic trees agree or differ are required. An approach is then proposed to compute a similarity or dissimilarity measure between trees, with the Robinson\u2013Foulds distance being one of the most used, and which can be computed in linear time and space. Nevertheless, given the large and increasing volume of phylogenetic data, phylogenetic trees are becoming very large with hundreds of thousands of leaves. In this context, space requirements become an issue both while computing tree distances and while storing trees. We propose then an efficient implementation of the Robinson\u2013Foulds distance over tree succinct representations. Our implementation also generalizes the Robinson\u2013Foulds distances to labelled phylogenetic trees, i.e., trees containing labels on all nodes, instead of only on leaves. Experimental results show that we are able to still achieve linear time while requiring less space. Our implementation in C++ is available as an open-source tool.<\/jats:p>","DOI":"10.3390\/a17010015","type":"journal-article","created":{"date-parts":[[2023,12,28]],"date-time":"2023-12-28T09:35:21Z","timestamp":1703756121000},"page":"15","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Computing RF Tree Distance over Succinct Representations"],"prefix":"10.3390","volume":"17","author":[{"given":"Ant\u00f3nio Pedro","family":"Branco","sequence":"first","affiliation":[{"name":"Instituto de Engenharia de Sistemas e Computadores, Investiga\u00e7\u00e3o e Desenvolvimento em Lisboa (INESC-ID), 1000-029 Lisboa, Portugal"},{"name":"Instituto Superior T\u00e9cnico, Universidade de Lisboa, 1049-001 Lisboa, Portugal"}]},{"given":"C\u00e1tia","family":"Vaz","sequence":"additional","affiliation":[{"name":"Instituto de Engenharia de Sistemas e Computadores, Investiga\u00e7\u00e3o e Desenvolvimento em Lisboa (INESC-ID), 1000-029 Lisboa, Portugal"},{"name":"Instituto Superior de Engenharia de Lisboa, Instituto Polit\u00e9cnico de Lisboa, 1959-007 Lisboa, Portugal"}]},{"given":"Alexandre P.","family":"Francisco","sequence":"additional","affiliation":[{"name":"Instituto de Engenharia de Sistemas e Computadores, Investiga\u00e7\u00e3o e Desenvolvimento em Lisboa (INESC-ID), 1000-029 Lisboa, Portugal"},{"name":"Instituto Superior T\u00e9cnico, Universidade de Lisboa, 1049-001 Lisboa, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2023,12,28]]},"reference":[{"key":"ref_1","unstructured":"Felsenstein, J. (2004). Inferring Phylogenies, Sinauer Associates."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1093\/sysbio\/syu085","article-title":"Practical performance of tree comparison metrics","volume":"64","author":"Kuhner","year":"2015","journal-title":"Syst. Biol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1006\/jagm.1999.1010","article-title":"Twist\u2013rotation transformations of binary trees and arithmetic expressions","volume":"32","author":"Li","year":"1999","journal-title":"J. Algorithms"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s00026-001-8006-8","article-title":"Subtree transfer operations and their induced metrics on evolutionary trees","volume":"5","author":"Allen","year":"2001","journal-title":"Ann. Comb."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1007\/s00026-004-0229-z","article-title":"On the computational complexity of the rooted subtree prune and regraft distance","volume":"8","author":"Bordewich","year":"2005","journal-title":"Ann. Comb."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1016\/0025-5564(81)90043-2","article-title":"Comparison of phylogenetic trees","volume":"53","author":"Robinson","year":"1981","journal-title":"Math. Biosci."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1007\/BFb0102690","article-title":"Comparison of weighted labelled trees","volume":"Volume 748","author":"Robinson","year":"1979","journal-title":"Proceedings of the Combinatorial Mathematics VI: Proceedings of the Sixth Australian Conference on Combinatorial Mathematics, Armidale, Australia, August 1978"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"323","DOI":"10.1093\/sysbio\/45.3.323","article-title":"The triples distance for rooted bifurcating phylogenetic trees","volume":"45","author":"Critchlow","year":"1996","journal-title":"Syst. Biol."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1093\/bioinformatics\/bti720","article-title":"A novel algorithm and web-based tool for comparing two alternative phylogenetic trees","volume":"22","author":"Nye","year":"2006","journal-title":"Bioinformatics"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"519","DOI":"10.2307\/1218253","article-title":"On the comparison of two classifications of the same set of elements","volume":"20","author":"Williams","year":"1971","journal-title":"Taxon"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1038\/297197a0","article-title":"Testing the theory of evolution by comparing phylogenetic trees constructed from five different protein sequences","volume":"297","author":"Penny","year":"1982","journal-title":"Nature"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"193","DOI":"10.2307\/2413326","article-title":"Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units","volume":"34","author":"Estabrook","year":"1985","journal-title":"Syst. Zool."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5007","DOI":"10.1093\/bioinformatics\/btaa614","article-title":"Information theoretic generalized Robinson\u2013Foulds metrics for comparing phylogenetic trees","volume":"36","author":"Smith","year":"2020","journal-title":"Bioinformatics"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"733","DOI":"10.1006\/aama.2001.0759","article-title":"Geometry of the space of phylogenetic trees","volume":"27","author":"Billera","year":"2001","journal-title":"Adv. Appl. Math."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"577","DOI":"10.1089\/cmb.2008.0068","article-title":"An exact algorithm for the geodesic distance between phylogenetic trees","volume":"15","author":"Kupczok","year":"2008","journal-title":"J. Comput. Biol."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1181","DOI":"10.1089\/cmb.2021.0342","article-title":"The generalized Robinson-Foulds distance for phylogenetic trees","volume":"28","author":"Valiente","year":"2021","journal-title":"J. Comput. Biol."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1972","DOI":"10.1093\/bib\/bby062","article-title":"A review of metrics measuring dissimilarity for rooted phylogenetic networks","volume":"20","author":"Wang","year":"2019","journal-title":"Briefings Bioinform."},{"key":"ref_18","unstructured":"Tavares, B.L. (2019). An analysis of the Geodesic Distance and other comparative metrics for tree-like structures. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1007\/BF01908061","article-title":"Optimal algorithms for comparing trees with labeled leaves","volume":"2","author":"Day","year":"1985","journal-title":"J. Classif."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"724","DOI":"10.1089\/cmb.2007.R012","article-title":"Efficiently computing the Robinson-Foulds metric","volume":"14","author":"Pattengale","year":"2007","journal-title":"J. Comput. Biol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1391","DOI":"10.1093\/sysbio\/syac028","article-title":"A Linear Time Solution to the Labeled Robinson\u2013Foulds Distance Problem","volume":"71","author":"Briand","year":"2022","journal-title":"Syst. Biol."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1101\/gr.251678.119","article-title":"The EnteroBase user\u2019s guide, with case studies on Salmonella transmissions, Yersinia pestis phylogeny, and Escherichia core genomic diversity","volume":"30","author":"Zhou","year":"2020","journal-title":"Genome Res."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Navarro, G. (2016). Compact Data Structures: A Practical Approach, Cambridge University Press.","DOI":"10.1017\/CBO9781316588284"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Vaz, C., Nascimento, M., Carri\u00e7o, J.A., Rocher, T., and Francisco, A.P. (2021). Distance-based phylogenetic inference from typing data: A unifying view. Briefings Bioinform., 22.","DOI":"10.1093\/bib\/bbaa147"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Huson, D.H., Rupp, R., and Scornavacca, C. (2010). Phylogenetic Networks: Concepts, Algorithms and Applications, Cambridge University Press.","DOI":"10.1017\/CBO9780511974076"},{"key":"ref_26","unstructured":"G\u00f3recki, P., and Eulenstein, O. (2012). Proceedings of the International Symposium on Bioinformatics Research and Applications, Springer."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Francisco, A.P., Bugalho, M., Ramirez, M., and Carri\u00e7o, J.A. (2009). Global optimal eBURST analysis of multilocus typing data using a graphic matroid approach. BMC Bioinform., 10.","DOI":"10.1186\/1471-2105-10-152"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1395","DOI":"10.1101\/gr.232397.117","article-title":"GrapeTree: Visualization of core genomic relationships among 100,000 bacterial pathogens","volume":"28","author":"Zhou","year":"2018","journal-title":"Genome Res."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2601073","article-title":"Fully Functional Static and Dynamic Succinct Trees","volume":"10","author":"Navarro","year":"2014","journal-title":"ACM Trans. Algorithms"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Gog, S., Beller, T., Moffat, A., and Petri, M. (July, January 29). From Theory to Practice: Plug and Play with Succinct Data Structures. Proceedings of the 13th International Symposium on Experimental Algorithms (SEA 2014), Copenhagen, Denmark.","DOI":"10.1007\/978-3-319-07959-2_28"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/S1571-0661(04)81042-9","article-title":"Valgrind: A program supervision framework","volume":"89","author":"Nethercote","year":"2003","journal-title":"Electron. Notes Theor. Comput. Sci."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/17\/1\/15\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:43:26Z","timestamp":1760132606000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/17\/1\/15"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,28]]},"references-count":31,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,1]]}},"alternative-id":["a17010015"],"URL":"https:\/\/doi.org\/10.3390\/a17010015","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2023,12,28]]}}}