{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T12:12:48Z","timestamp":1763467968668},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"S1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2010,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Genomic data provide a wealth of new information for phylogenetic analysis. Yet making use of this data requires phylogenetic methods that can efficiently analyze extremely large data sets and account for processes of gene evolution, such as gene duplication and loss, incomplete lineage sorting (deep coalescence), or horizontal gene transfer, that cause incongruence among gene trees. One such approach is gene tree parsimony, which, given a set of gene trees, seeks a species tree that requires the smallest number of evolutionary events to explain the incongruence of the gene trees. However, the only existing algorithms for gene tree parsimony under the duplication-loss or deep coalescence reconciliation cost are prohibitively slow for large datasets.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We describe novel algorithms for SPR and TBR based local search heuristics under the duplication-loss cost, and we show how they can be adapted for the deep coalescence cost. These algorithms improve upon the best existing algorithms for these problems by a factor of <jats:italic>n<\/jats:italic>, where <jats:italic>n<\/jats:italic> is the number of species in the collection of gene trees. We implemented our new SPR based local search algorithm for the duplication-loss cost and demonstrate the tremendous improvement in runtime and scalability it provides compared to existing implementations. We also evaluate the performance of our algorithm on three large-scale genomic data sets.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>Our new algorithms enable, for the first time, gene tree parsimony analyses of thousands of genes from hundreds of taxa using the duplication-loss and deep coalescence reconciliation costs. Thus, this work expands both the size of data sets and the range of evolutionary models that can be incorporated into genome-scale phylogenetic analyses.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-11-s1-s42","type":"journal-article","created":{"date-parts":[[2010,1,19]],"date-time":"2010-01-19T07:17:41Z","timestamp":1263885461000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":46,"title":["Efficient genome-scale phylogenetic analysis under the duplication-loss and deep coalescence cost models"],"prefix":"10.1186","volume":"11","author":[{"given":"Mukul S","family":"Bansal","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"J Gordon","family":"Burleigh","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Oliver","family":"Eulenstein","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2010,1,18]]},"reference":[{"key":"3992_CR1","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1093\/sysbio\/46.3.523","volume":"46","author":"WP Maddison","year":"1997","unstructured":"Maddison WP: Gene Trees in Species Trees. Systematic Biology 1997, 46: 523-536.","journal-title":"Systematic Biology"},{"issue":"5","key":"3992_CR2","doi-asserted-by":"publisher","first-page":"e68","DOI":"10.1371\/journal.pgen.0020068","volume":"2","author":"JH Degnan","year":"2006","unstructured":"Degnan JH, Rosenberg NA: Discordance of Species Trees with Their Most Likely Gene Trees. PLoS Genetics 2006, 2(5):e68. 10.1371\/journal.pgen.0020068","journal-title":"PLoS Genetics"},{"key":"3992_CR3","doi-asserted-by":"publisher","first-page":"132","DOI":"10.2307\/2412519","volume":"28","author":"M Goodman","year":"1979","unstructured":"Goodman M, Czelusniak J, Moore GW, Romero-Herrera AE, Matsuda G: Fitting the gene lineage into its species lineage. A parsimony strategy illustrated by cladograms constructed from globin sequences. Systematic Zoology 1979, 28: 132-163. 10.2307\/2412519","journal-title":"Systematic Zoology"},{"key":"3992_CR4","first-page":"58","volume":"43","author":"RDM Page","year":"1994","unstructured":"Page RDM: Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas. Systematic Biology 1994, 43: 58-77.","journal-title":"Systematic Biology"},{"issue":"2","key":"3992_CR5","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1006\/mpev.1996.0071","volume":"6","author":"R Guig\u00f3","year":"1996","unstructured":"Guig\u00f3 R, Muchnik I, Smith TF: Reconstruction of Ancient Molecular Phylogeny. Molecular Phylogenetics and Evolution 1996, 6(2):189-213. 10.1006\/mpev.1996.0071","journal-title":"Molecular Phylogenetics and Evolution"},{"issue":"4","key":"3992_CR6","doi-asserted-by":"publisher","first-page":"493","DOI":"10.1089\/cmb.1995.2.493","volume":"2","author":"B Mirkin","year":"1995","unstructured":"Mirkin B, Muchnik I, Smith TF: A Biologically Consistent Model for Comparing Molecular Phylogenies. Journal of Computational Biology 1995, 2(4):493-507. 10.1089\/cmb.1995.2.493","journal-title":"Journal of Computational Biology"},{"key":"3992_CR7","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1016\/S0166-218X(98)00068-7","volume":"88","author":"O Eulenstein","year":"1998","unstructured":"Eulenstein O, Vingron M: On the equivalence of two tree mapping measures. Discrete Applied Mathematics 1998, 88: 101-126. 10.1016\/S0166-218X(98)00068-7","journal-title":"Discrete Applied Mathematics"},{"key":"3992_CR8","doi-asserted-by":"publisher","first-page":"138","DOI":"10.1145\/332306.332359","volume-title":"RECOMB","author":"MT Hallett","year":"2000","unstructured":"Hallett MT, Lagergren J: New algorithms for the duplication-loss model. RECOMB 2000, 138-146. full_text"},{"issue":"1-2","key":"3992_CR9","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1016\/j.tcs.2005.05.016","volume":"347","author":"P Bonizzoni","year":"2005","unstructured":"Bonizzoni P, Vedova GD, Dondi R: Reconciling a gene tree to a species tree under the duplication cost model. Theor Comput Sci 2005, 347(1-2):36-53. 10.1016\/j.tcs.2005.05.016","journal-title":"Theor Comput Sci"},{"issue":"1-3","key":"3992_CR10","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1016\/j.tcs.2006.05.019","volume":"359","author":"P G\u00f3recki","year":"2006","unstructured":"G\u00f3recki P, Tiuryn J: DLS-trees: A model of evolutionary scenarios. Theor Comput Sci 2006, 359(1-3):378-399. 10.1016\/j.tcs.2006.05.019","journal-title":"Theor Comput Sci"},{"issue":"2","key":"3992_CR11","doi-asserted-by":"publisher","first-page":"320","DOI":"10.1089\/cmb.2006.13.320","volume":"13","author":"D Durand","year":"2006","unstructured":"Durand D, Halld\u00f3rsson BV, Vernot B: A Hybrid Micro-Macroevolutionary Approach to Gene Tree Reconstruction. Journal of Computational Biology 2006, 13(2):320-335. 10.1089\/cmb.2006.13.320","journal-title":"Journal of Computational Biology"},{"issue":"8","key":"3992_CR12","doi-asserted-by":"publisher","first-page":"1043","DOI":"10.1089\/cmb.2008.0054","volume":"15","author":"C Chauve","year":"2008","unstructured":"Chauve C, Doyon JP, El-Mabrouk N: Gene Family Evolution by Duplication, Speciation, and Loss. Journal of Computational Biology 2008, 15(8):1043-1062. 10.1089\/cmb.2008.0054","journal-title":"Journal of Computational Biology"},{"key":"3992_CR13","first-page":"46","volume-title":"RECOMB","author":"C Chauve","year":"2009","unstructured":"Chauve C, El-Mabrouk N: New Perspectives on Gene Family Evolution: Losses in Reconciliation and a Link with Supertrees. RECOMB 2009, 46-58."},{"key":"3992_CR14","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1080\/10635150500354928","volume":"55","author":"WP Maddison","year":"2006","unstructured":"Maddison WP, Knowles LL: Inferring Phylogeny Despite Incomplete Lineage Sorting. Systematic Biology 2006, 55: 21-30. 10.1080\/10635150500354928","journal-title":"Systematic Biology"},{"key":"3992_CR15","first-page":"192","volume-title":"RECOMB","author":"L Zhang","year":"2000","unstructured":"Zhang L: Inferring a Species Tree from Gene Trees under the Deep Coalescence Cost. RECOMB 2000, 192-193."},{"issue":"9","key":"3992_CR16","doi-asserted-by":"publisher","first-page":"e1000501","DOI":"10.1371\/journal.pcbi.1000501","volume":"5","author":"C Than","year":"2009","unstructured":"Than C, Nakhleh L: Species tree inference by minimizing deep coalescences. PLoS Computational Biology 2009, 5(9):e1000501. 10.1371\/journal.pcbi.1000501","journal-title":"PLoS Computational Biology"},{"issue":"3","key":"3992_CR17","doi-asserted-by":"publisher","first-page":"729","DOI":"10.1137\/S0097539798343362","volume":"30","author":"B Ma","year":"2000","unstructured":"Ma B, Li M, Zhang L: From Gene Trees to Species Trees. SIAM J Comput 2000, 30(3):729-752. 10.1137\/S0097539798343362","journal-title":"SIAM J Comput"},{"key":"3992_CR18","doi-asserted-by":"publisher","first-page":"409","DOI":"10.1007\/s00026-004-0229-z","volume":"8","author":"M Bordewich","year":"2004","unstructured":"Bordewich M, Semple C: On the computational complexity of the rooted subtree prune and regraft distance. Annals of Combinatorics 2004, 8: 409-423. 10.1007\/s00026-004-0229-z","journal-title":"Annals of Combinatorics"},{"key":"3992_CR19","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1177\/117693430600200003","volume":"2","author":"D Chen","year":"2006","unstructured":"Chen D, Eulenstein O, Fern\u00e1ndez-Baca D, Burleigh JG: Improved Heuristics for Minimum-Flip Supertree Construction. Evolutionary Bioinformatics 2006, 2: 347-356.","journal-title":"Evolutionary Bioinformatics"},{"issue":"9","key":"3992_CR20","doi-asserted-by":"publisher","first-page":"819","DOI":"10.1093\/bioinformatics\/14.9.819","volume":"14","author":"RDM Page","year":"1998","unstructured":"Page RDM: comparing gene and species phylogenies using reconciled trees. Bioinformatics 1998, 14(9):819-820. 10.1093\/bioinformatics\/14.9.819","journal-title":"Bioinformatics"},{"key":"3992_CR21","volume-title":"Mesquite: a modular system for evolutionary analysis. Version 2.6","author":"WP Maddison","year":"2009","unstructured":"Maddison WP, Maddison D: Mesquite: a modular system for evolutionary analysis. Version 2.6.2009. [http:\/\/mesquiteproject.org]"},{"key":"3992_CR22","first-page":"238","volume-title":"RECOMB","author":"MS Bansal","year":"2007","unstructured":"Bansal MS, Burleigh JG, Eulenstein O, Wehe A: Heuristics for the Gene-Duplication Problem: A \u0398( n ) Speed-Up for the Local Search. RECOMB 2007, 238-252."},{"issue":"4","key":"3992_CR23","doi-asserted-by":"publisher","first-page":"514","DOI":"10.1109\/TCBB.2008.69","volume":"5","author":"MS Bansal","year":"2008","unstructured":"Bansal MS, Eulenstein O: An \u03a9( n2\/log n ) Speed-Up of TBR Heuristics for the Gene-Duplication Problem. IEEE\/ACM Transactions on Computational Biology and Bioinformatics 2008, 5(4):514-524. 10.1109\/TCBB.2008.69","journal-title":"IEEE\/ACM Transactions on Computational Biology and Bioinformatics"},{"issue":"3","key":"3992_CR24","doi-asserted-by":"publisher","first-page":"504","DOI":"10.1080\/10635150701429982","volume":"56","author":"L Liu","year":"2007","unstructured":"Liu L, Pearl DK: Species Trees from Gene Trees: Reconstructing Bayesian Posterior Distributions of a Species Phylogeny Using Estimated Gene Tree Distributions. Systematic Biology 2007, 56(3):504-514. 10.1080\/10635150701429982","journal-title":"Systematic Biology"},{"issue":"7","key":"3992_CR25","doi-asserted-by":"publisher","first-page":"971","DOI":"10.1093\/bioinformatics\/btp079","volume":"25","author":"LS Kubatko","year":"2009","unstructured":"Kubatko LS, Carstens BC, Knowles LL: STEM: species tree estimation using maximum likelihood for gene trees under coalescence. Bioinformatics 2009, 25(7):971-973. 10.1093\/bioinformatics\/btp079","journal-title":"Bioinformatics"},{"issue":"7","key":"3992_CR26","doi-asserted-by":"publisher","first-page":"1575","DOI":"10.1093\/molbev\/msm107","volume":"24","author":"C An\u00e9","year":"2007","unstructured":"An\u00e9 C, Larget B, Baum DA, Smith SD, Rokas A: Bayesian Estimation of Concordance Among Gene Trees. Mol Biol Evol 2007, 24(7):1575. 10.1093\/molbev\/msm107","journal-title":"Mol Biol Evol"},{"key":"3992_CR27","first-page":"7","volume-title":"ISMB (Supplement of Bioinformatics)","author":"L Arvestad","year":"2003","unstructured":"Arvestad L, Berglund AC, Lagergren J, Sennblad B: Bayesian gene\/species tree reconciliation and orthology analysis using MCMC. ISMB (Supplement of Bioinformatics) 2003, 7-15. 10.1093\/bioinformatics\/btg1000"},{"issue":"14","key":"3992_CR28","doi-asserted-by":"publisher","first-page":"5714","DOI":"10.1073\/pnas.0806251106","volume":"106","author":"O \u00c4kerborg","year":"2009","unstructured":"\u00c4kerborg O, Sennblad B, Arvestad L, Lagergren J: Simultaneous Bayesian gene tree reconstruction and reconciliation analysis. Proceedings of the National Academy of Sciences 2009, 106(14):5714-5719. 10.1073\/pnas.0806251106","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"3992_CR29","volume-title":"PhD thesis","author":"MS Bansal","year":"2009","unstructured":"Bansal MS: Algorithms for efficient phylogenetic tree construction. PhD thesis. Iowa State Univ; 2009."},{"key":"3992_CR30","doi-asserted-by":"publisher","first-page":"798","DOI":"10.1038\/nature02053","volume":"425","author":"A Rokas","year":"2003","unstructured":"Rokas A, Williams BL, King N, Carroll SB: Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature 2003, 425: 798-804. 10.1038\/nature02053","journal-title":"Nature"},{"issue":"12","key":"3992_CR31","doi-asserted-by":"publisher","first-page":"2689","DOI":"10.1093\/molbev\/msn213","volume":"25","author":"CH Kuo","year":"2008","unstructured":"Kuo CH, Wares JP, Kissinger JC: The Apicomplexan Whole-Genome Phylogeny: An Analysis of Incongruence among Gene Trees. Mol Biol Evol 2008, 25(12):2689-2698. 10.1093\/molbev\/msn213","journal-title":"Mol Biol Evol"},{"issue":"21","key":"3992_CR32","doi-asserted-by":"publisher","first-page":"2688","DOI":"10.1093\/bioinformatics\/btl446","volume":"22","author":"A Stamatakis","year":"2006","unstructured":"Stamatakis A: RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 2006, 22(21):2688-2690. 10.1093\/bioinformatics\/btl446","journal-title":"Bioinformatics"},{"key":"3992_CR33","unstructured":"Burleigh JG, Bansal MS, Eulenstein O, Hartmann S, Wehe A, Vision TJ: Genome-scale phylogenetics: inferring the plant tree of life from 18,896 discordant gene trees. Systematic Biology, in press."},{"issue":"2","key":"3992_CR34","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1109\/TCBB.2009.7","volume":"6","author":"MS Bansal","year":"2009","unstructured":"Bansal MS, Eulenstein O, Wehe A: The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI-Based Local Searches. IEEE\/ACM Transactions on Computational Biology and Bioinformatics 2009, 6(2):221-231. 10.1109\/TCBB.2009.7","journal-title":"IEEE\/ACM Transactions on Computational Biology and Bioinformatics"},{"key":"3992_CR35","doi-asserted-by":"crossref","unstructured":"Wehe A, Bansal MS, Burleigh JG, Eulenstein O: DupTree: a program for large-scale phylogenetic analyses using gene tree parsimony. Bioinformatics 2008., 24(13): 10.1093\/bioinformatics\/btn230","DOI":"10.1093\/bioinformatics\/btn230"},{"key":"3992_CR36","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1007\/978-1-4020-2330-9_6","volume-title":"Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life","author":"JA Cotton","year":"2004","unstructured":"Cotton JA, Page RDM: Tangled tales from multiple markers: reconciling conflict between phylogenies to build molecular supertrees. In Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life. Edited by: Bininda-Emonds ORP. Springer-Verlag; 2004:107-125."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-11-S1-S42.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T04:35:38Z","timestamp":1630470938000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-11-S1-S42"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,1]]},"references-count":36,"journal-issue":{"issue":"S1","published-print":{"date-parts":[[2010,1]]}},"alternative-id":["3992"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-11-s1-s42","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2010,1]]},"assertion":[{"value":"18 January 2010","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S42"}}