{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T23:52:00Z","timestamp":1773273120454,"version":"3.50.1"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2005,4,8]],"date-time":"2005-04-08T00:00:00Z","timestamp":1112918400000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/2.0\/"},{"start":{"date-parts":[[2005,4,8]],"date-time":"2005-04-08T00:00:00Z","timestamp":1112918400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/2.0\/"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                        <jats:title>Background<\/jats:title>\n                        <jats:p>Understanding the evolutionary relationships among species based on their genetic information is one of the primary objectives in phylogenetic analysis. Reconstructing phylogenies for large data sets is still a challenging task in Bioinformatics.<\/jats:p>\n                     <\/jats:sec><jats:sec>\n                        <jats:title>Results<\/jats:title>\n                        <jats:p>We propose a new distance-based clustering method, <jats:italic>the shortest triplet clustering algorithm (STC)<\/jats:italic>, to reconstruct phylogenies. The main idea is the introduction of a natural definition of so-called <jats:italic>k-representative sets<\/jats:italic>. Based on <jats:italic>k<\/jats:italic>-representative sets, <jats:italic>shortest triplets<\/jats:italic> are reconstructed and serve as building blocks for the STC algorithm to agglomerate sequences for tree reconstruction in <jats:italic>O<\/jats:italic>(<jats:italic>n<\/jats:italic><jats:sup>2<\/jats:sup>) time for <jats:italic>n<\/jats:italic> sequences.<\/jats:p>\n                        <jats:p>Simulations show that STC gives better topological accuracy than other tested methods that also build a first starting tree. STC appears as a very good method to start the tree reconstruction. However, all tested methods give similar results if balanced nearest neighbor interchange (BNNI) is applied as a post-processing step. BNNI leads to an improvement in all instances. The program is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"http:\/\/www.bi.uni-duesseldorf.de\/software\/stc\/\">http:\/\/www.bi.uni-duesseldorf.de\/software\/stc\/<\/jats:ext-link>.<\/jats:p>\n                     <\/jats:sec><jats:sec>\n                        <jats:title>Conclusion<\/jats:title>\n                        <jats:p>The results demonstrate that the new approach efficiently reconstructs phylogenies for large data sets. We found that BNNI boosts the topological accuracy of all methods including STC, therefore, one should use BNNI as a post-processing step to get better topological accuracy.<\/jats:p>\n                     <\/jats:sec>","DOI":"10.1186\/1471-2105-6-92","type":"journal-article","created":{"date-parts":[[2005,4,9]],"date-time":"2005-04-09T06:15:47Z","timestamp":1113027347000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Shortest triplet clustering: reconstructing large phylogenies using representative sets"],"prefix":"10.1186","volume":"6","author":[{"given":"Le","family":"Sy Vinh","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Arndt","family":"von Haeseler","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2005,4,8]]},"reference":[{"key":"417_CR1","first-page":"406","volume":"4","author":"N Saitou","year":"1987","unstructured":"Saitou N, Nei M: The Neighbor \u2013 joining Method: A New Method for Reconstructing Phylogenetic Trees. Mol Biol Evol 1987, 4: 406\u2013425.","journal-title":"Mol Biol Evol"},{"key":"417_CR2","doi-asserted-by":"publisher","first-page":"964","DOI":"10.1093\/oxfordjournals.molbev.a025664","volume":"13","author":"K Strimmer","year":"1996","unstructured":"Strimmer K, von Haeseler A: Quartet Puzzling: A Quartet Maximum \u2013 Likelihood Method for Reconstructing Tree Topologies. Mol Biol Evol 1996, 13: 964\u2013969.","journal-title":"Mol Biol Evol"},{"key":"417_CR3","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1093\/oxfordjournals.molbev.a025808","volume":"14","author":"O Gascuel","year":"1997","unstructured":"Gascuel O: BIONJ: An Improved Version of the NJ Algorithm Based on a Simple Model of Sequence Data. Mol Biol Evol 1997, 14: 685\u2013695.","journal-title":"Mol Biol Evol"},{"key":"417_CR4","doi-asserted-by":"publisher","first-page":"369","DOI":"10.1089\/106652799318337","volume":"6","author":"DH Huson","year":"1999","unstructured":"Huson DH, Nettles SM, Warnow TJ: Disk-Covering, a Fast-Converging Method for Phylogenetic Reconstruction. J Comput Biol 1999, 6: 369\u2013386. 10.1089\/106652799318337","journal-title":"J Comput Biol"},{"key":"417_CR5","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1093\/oxfordjournals.molbev.a026231","volume":"17","author":"WJ Bruno","year":"2000","unstructured":"Bruno WJ, Socci ND, Halpern AL: Weighted Neighbor Joining: A Likelihood Based-Approach to Distance-Based Phylogeny Reconstruction. Mol Biol Evol 2000, 17: 189\u2013197.","journal-title":"Mol Biol Evol"},{"key":"417_CR6","doi-asserted-by":"publisher","first-page":"687","DOI":"10.1089\/106652702761034136","volume":"9","author":"R Desper","year":"2002","unstructured":"Desper R, Gascuel O: Fast and Accurate Phylogeny Reconstruction Algorithms Based on the Minimum-Evolution Principle. J Comput Biol 2002, 9: 687\u2013706. 10.1089\/106652702761034136","journal-title":"J Comput Biol"},{"key":"417_CR7","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1089\/10665270252935467","volume":"9","author":"M Cs\u00fcr\u00f6s","year":"2002","unstructured":"Cs\u00fcr\u00f6s M: Fast Recovery of Evolutionary Trees with Thousands of Nodes. J Comput Biol 2002, 9: 277\u2013297. 10.1089\/10665270252935467","journal-title":"J Comput Biol"},{"key":"417_CR8","doi-asserted-by":"publisher","first-page":"696","DOI":"10.1080\/10635150390235520","volume":"52","author":"S Guindon","year":"2003","unstructured":"Guindon S, Gascuel O: A Simple, Fast and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood. Syst Biol 2003, 52: 696\u2013704. 10.1080\/10635150390235520","journal-title":"Syst Biol"},{"key":"417_CR9","doi-asserted-by":"publisher","first-page":"456","DOI":"10.1093\/bioinformatics\/bti191","volume":"21","author":"A Stamatakis","year":"2005","unstructured":"Stamatakis A, Ludwig T, Meier H: RAxML-III: A fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 2005, 21: 456\u2013463. 10.1093\/bioinformatics\/bti191","journal-title":"Bioinformatics"},{"key":"417_CR10","doi-asserted-by":"publisher","first-page":"1565","DOI":"10.1093\/molbev\/msh176","volume":"21","author":"LS Vinh","year":"2004","unstructured":"Vinh LS, von Haeseler A: IQPNNI: Moving fast through tree space and stopping in time. Mol Biol Evol 2004, 21: 1565\u20131571. 10.1093\/molbev\/msh176","journal-title":"Mol Biol Evol"},{"key":"417_CR11","volume-title":"Proceedings of the 28th Annual German Classification Society Conference (GfKl 2004)","author":"LS Vinh","year":"2004","unstructured":"Vinh LS, Schmidt HA, von Haeseler A: PhyNav: A Novel Approach to Reconstruct Large Phylogenies. In Proceedings of the 28th Annual German Classification Society Conference (GfKl 2004). Dortmund, Germany; 2004:in press."},{"key":"417_CR12","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1089\/106652701300099137","volume":"8","author":"MA Charleston","year":"2001","unstructured":"Charleston MA: Hitch-Hiking: A Parallel Heuristic Search Strategy, Applied to the Phylogeny Problem. J Comput Biol 2001, 8: 79\u201391. 10.1089\/106652701300099137","journal-title":"J Comput Biol"},{"key":"417_CR13","doi-asserted-by":"publisher","first-page":"1717","DOI":"10.1093\/oxfordjournals.molbev.a003994","volume":"19","author":"MJ Brauer","year":"2002","unstructured":"Brauer MJ, Holder MT, Dries LA, Zwickl DJ, Lewis PO, Hillis DM: Genetic Algorithms and Parallel Processing in Maximum-Likelihood Phylogeny Inference. Mol Biol Evol 2002, 19: 1717\u20131726.","journal-title":"Mol Biol Evol"},{"key":"417_CR14","doi-asserted-by":"publisher","first-page":"502","DOI":"10.1093\/bioinformatics\/18.3.502","volume":"18","author":"HA Schmidt","year":"2002","unstructured":"Schmidt HA, Strimmer K, Vingron M, von Haeseler A: TREE-PUZZLE: Maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 2002, 18: 502\u2013504. 10.1093\/bioinformatics\/18.3.502","journal-title":"Bioinformatics"},{"key":"417_CR15","first-page":"6.6.1","volume-title":"Current Protocols in Bioinformatics","author":"HA Schmidt","year":"2003","unstructured":"Schmidt HA, von Haeseler A: Maximum Likelihood Analysis Using TREE-PUZZLE. In Current Protocols in Bioinformatics. Edited by: Baxevanis AD, Davison DB, Page RDM, Stormo G, Stein L. New York, USA: Wiley and Sons; 2003:6.6.1\u20136.6.25."},{"key":"417_CR16","first-page":"233","volume":"19","author":"L Cavalli-Sforza","year":"1967","unstructured":"Cavalli-Sforza L, Edwards AWF: Phylogenetic analysis: Models and estimation procedures. Am J Hum Genet 1967, 19: 233\u2013257.","journal-title":"Am J Hum Genet"},{"key":"417_CR17","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1126\/science.155.3760.279","volume":"155","author":"W Fitch","year":"1967","unstructured":"Fitch W, Margoliash E: Construction of Phylogenetic trees. Science 1967, 155: 279\u2013284.","journal-title":"Science"},{"key":"417_CR18","volume-title":"Clustering Algorithms","author":"AJ Hartigan","year":"1975","unstructured":"Hartigan AJ: Clustering Algorithms. John Wiley and Sons, Inc; 1975."},{"key":"417_CR19","first-page":"823","volume-title":"On the phenetic approach to vertebrate classification","author":"J Farris","year":"1977","unstructured":"Farris J: On the phenetic approach to vertebrate classification. 1977, 17: 823\u2013850."},{"key":"417_CR20","doi-asserted-by":"publisher","first-page":"4516","DOI":"10.1073\/pnas.76.9.4516","volume":"76","author":"LC Klotz NKRB","year":"1979","unstructured":"Klotz NKRB LC, Mitchell RM: Calculation of evolutionary trees from sequence data. Proc Natl Acad Sci USA 1979, 76: 4516\u20134520.","journal-title":"Proc Natl Acad Sci USA"},{"key":"417_CR21","doi-asserted-by":"publisher","first-page":"1085","DOI":"10.1073\/pnas.78.2.1085","volume":"78","author":"WH Li","year":"1981","unstructured":"Li WH: Simple method for constructing phylogenetic trees from distance matrices. Proc Natl Acad Sci USA 1981, 78: 1085\u20131089.","journal-title":"Proc Natl Acad Sci USA"},{"key":"417_CR22","first-page":"235","volume":"13","author":"A Rambaut","year":"1997","unstructured":"Rambaut A, Crassly NC: Seq-Gen: An application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput Appl Biosci 1997, 13: 235\u2013238.","journal-title":"Comput Appl Biosci"},{"key":"417_CR23","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1007\/BF01731581","volume":"16","author":"M Kimura","year":"1980","unstructured":"Kimura M: A Simple Method for Estimating Evolutionary Rates of Base Substitutions through Comparative Studies of Nucleotide Sequences. J Mol Evol 1980, 16: 111\u2013120.","journal-title":"J Mol Evol"},{"key":"417_CR24","doi-asserted-by":"publisher","first-page":"44","DOI":"10.2307\/1426329","volume":"3","author":"EF Harding","year":"1971","unstructured":"Harding EF: The probabilities of rooted tree-shapes generated by random bifurcation. Adv Appl Prob 1971, 3: 44\u201377.","journal-title":"Adv Appl Prob"},{"key":"417_CR25","first-page":"407","volume-title":"Molecular Systematics","author":"DL Swofford","year":"1996","unstructured":"Swofford DL, Olsen GJ, Waddell PJ, Hillis DM: Phylogeny Reconstruction. In Molecular Systematics. 2nd edition. Edited by: Hillis DM, Moritz C, Mable BK. Sunderland, Massachusetts: Sinauer Associates; 1996:407\u2013514.","edition":"2"},{"key":"417_CR26","volume-title":"PHYLIP (Phylogeny Inference Package) version 3.5c","author":"J Felsenstein","year":"1993","unstructured":"Felsenstein J: PHYLIP (Phylogeny Inference Package) version 3.5c. Department of Genetics, University of Washington, Seattle 1993. [http:\/\/evolution.genetics.washington.edu\/phylip.html]"},{"key":"417_CR27","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1016\/0025-5564(81)90043-2","volume":"53","author":"DR Robinson","year":"1981","unstructured":"Robinson DR, Foulds LR: Comparison of phylogenetic trees. Mathematical Biosciences 1981, 53: 131\u2013147. 10.1016\/0025-5564(81)90043-2","journal-title":"Mathematical Biosciences"},{"key":"417_CR28","volume-title":"Mathematics in the archaeological and historical sciences","author":"P Buneman","year":"1971","unstructured":"Buneman P: The recovery of trees from measures of dissimilarity. In Mathematics in the archaeological and historical sciences. Edited by: Hodson, Lendall, Tautu. Edinburgh: Edinburgh university press; 1971."},{"key":"417_CR29","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198509424.001.0001","volume-title":"Phylogenetics","author":"C Semple","year":"2003","unstructured":"Semple C, Steel M: Phylogenetics. OXFORD univerity press; 2003."},{"key":"417_CR30","volume-title":"Inferring Phylogenies","author":"J Felsenstein","year":"2004","unstructured":"Felsenstein J: Inferring Phylogenies. Sunderland, Massachusetts: Sinauer Associates; 2004."},{"key":"417_CR31","volume-title":"The Design and Analysis of Computer Algorithms","author":"AV Aho","year":"1974","unstructured":"Aho AV, Hopcroft JE, Ullman JD: The Design and Analysis of Computer Algorithms. Addison-Wesley Publishing Company; 1974."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-6-92.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/1471-2105-6-92\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-6-92.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,7]],"date-time":"2024-10-07T12:10:30Z","timestamp":1728303030000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-6-92"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,4,8]]},"references-count":31,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2005,12]]}},"alternative-id":["417"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-6-92","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,4,8]]},"assertion":[{"value":"29 November 2004","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 April 2005","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 April 2005","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"92"}}