{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,18]],"date-time":"2026-02-18T22:50:55Z","timestamp":1771455055596,"version":"3.50.1"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2020,6,9]],"date-time":"2020-06-09T00:00:00Z","timestamp":1591660800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DMS 1916037"],"award-info":[{"award-number":["DMS 1916037"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Graduate Research Fellowship","award":["1752814"],"award-info":[{"award-number":["1752814"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Due to new technology for efficiently generating genome data, machine learning methods are urgently needed to analyze large sets of gene trees over the space of phylogenetic trees. However, the space of phylogenetic trees is not Euclidean, so ordinary machine learning methods cannot be directly applied. In 2019, Yoshida et al. introduced the notion of tropical principal component analysis (PCA), a statistical method for visualization and dimensionality reduction using a tropical polytope with a fixed number of vertices that minimizes the sum of tropical distances between each data point and its tropical projection. However, their work focused on the tropical projective space rather than the space of phylogenetic trees. We focus here on tropical PCA for dimension reduction and visualization over the space of phylogenetic trees.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>Our main results are 2-fold: (i) theoretical interpretations of the tropical principal components over the space of phylogenetic trees, namely, the existence of a tropical cell decomposition into regions of fixed tree topology; and (ii) the development of a stochastic optimization method to estimate tropical PCs over the space of phylogenetic trees using a Markov Chain Monte Carlo approach. This method performs well with simulation studies, and it is applied to three empirical datasets: Apicomplexa and African coelacanth genomes as well as sequences of hemagglutinin for influenza from New York.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Dataset: http:\/\/polytopes.net\/Data.tar.gz. Code: http:\/\/polytopes.net\/tropica_MCMC_codes.tar.gz.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa564","type":"journal-article","created":{"date-parts":[[2020,6,3]],"date-time":"2020-06-03T19:12:02Z","timestamp":1591211522000},"page":"4590-4598","source":"Crossref","is-referenced-by-count":25,"title":["Tropical principal component analysis on the space of phylogenetic trees"],"prefix":"10.1093","volume":"36","author":[{"given":"Robert","family":"Page","sequence":"first","affiliation":[{"name":"Department of Operations Research, Naval Postgraduate School , Monterey, CA 93943, USA"}]},{"given":"Ruriko","family":"Yoshida","sequence":"additional","affiliation":[{"name":"Department of Operations Research, Naval Postgraduate School , Monterey, CA 93943, USA"}]},{"given":"Leon","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University of California , Berkeley, Berkeley, CA 94720, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,6,9]]},"reference":[{"key":"2023062304264769400_btaa564-B1","doi-asserted-by":"crossref","first-page":"3261","DOI":"10.1016\/j.laa.2011.06.009","article-title":"Best approximation in max-plus semimodules","volume":"435","author":"Akian","year":"2011","journal-title":"Linear Algebra Appl"},{"key":"2023062304264769400_btaa564-B2","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1016\/j.jctb.2005.06.004","article-title":"The Bergman complex of a matroid and phylogenetic trees","volume":"96","author":"Ardila","year":"2006","journal-title":"J. Combin. Theory Ser. B"},{"key":"2023062304264769400_btaa564-B3","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1137\/18M1218741","article-title":"L-infinity optimization to Bergman fans of matroids with an application to phylogenetics","volume":"34","author":"Bernstein","year":"2020","journal-title":"SIAM J. Discrete Math"},{"key":"2023062304264769400_btaa564-B4","doi-asserted-by":"crossref","first-page":"733","DOI":"10.1006\/aama.2001.0759","article-title":"Geometry of the space of phylogenetic trees","volume":"27","author":"Billera","year":"2001","journal-title":"Adv. Appl. Math"},{"key":"2023062304264769400_btaa564-B98423631","volume":"32"},{"key":"2023062304264769400_btaa564-B5","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1016\/j.laa.2003.08.010","article-title":"Duality and separation theorems in idempotent semimodules","volume":"379","author":"Cohen","year":"2004","journal-title":"Linear Algebra Appl"},{"key":"2023062304264769400_btaa564-B6","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"Muscle: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023062304264769400_btaa564-B7","doi-asserted-by":"crossref","DOI":"10.37236\/5271","article-title":"Tropical linear spaces and tropical convexity","author":"Hampe","year":"2015","journal-title":"Electr. J. Comb"},{"key":"2023062304264769400_btaa564-B8","author":"Joswig","year":"2019"},{"key":"2023062304264769400_btaa564-B9","doi-asserted-by":"crossref","first-page":"2689","DOI":"10.1093\/molbev\/msn213","article-title":"The Apicomplexan whole-genome phylogeny: an analysis of incongruence among gene trees","volume":"25","author":"Kuo","year":"2008","journal-title":"Mol. Biol. Evol"},{"key":"2023062304264769400_btaa564-B10","doi-asserted-by":"crossref","first-page":"1803","DOI":"10.1093\/molbev\/mst072","article-title":"One thousand two hundred ninety nuclear genes from a genome-wide survey support lungfishes as the sister group of tetrapods","volume":"30","author":"Liang","year":"2013","journal-title":"Mol. Biol. Evol"},{"key":"2023062304264769400_btaa564-B11","doi-asserted-by":"crossref","first-page":"1229","DOI":"10.1137\/16M1071122","article-title":"Tropical Fermat\u2013Weber points","volume":"32","author":"Lin","year":"2018","journal-title":"SIAM J. Discrete Math"},{"key":"2023062304264769400_btaa564-B12","doi-asserted-by":"crossref","first-page":"2015","DOI":"10.1137\/16M1079841","article-title":"Convexity in tree spaces","volume":"31","author":"Lin","year":"2017","journal-title":"SIAM J. Discrete Math"},{"key":"2023062304264769400_btaa564-B13","author":"Maclagan","year":"2015"},{"key":"2023062304264769400_btaa564-B14","author":"Maddison","year":"2009"},{"key":"2023062304264769400_btaa564-B78452094","volume":"419","author":"Malcolm"},{"key":"2023062304264769400_btaa564-B15","author":"Monod","year":"2019"},{"key":"2023062304264769400_btaa564-B16","doi-asserted-by":"crossref","first-page":"2716","DOI":"10.1214\/11-AOS915","article-title":"Principal components analysis in the space of phylogenetic trees","volume":"39","author":"Nye","year":"2011","journal-title":"Ann. Stat"},{"key":"2023062304264769400_btaa564-B17","doi-asserted-by":"crossref","first-page":"901","DOI":"10.1093\/biomet\/asx047","article-title":"Principal component analysis and the locus of the Fr\u00e9chet mean in the space of phylogenetic trees","volume":"104","author":"Nye","year":"2017","journal-title":"Biometrika"},{"key":"2023062304264769400_btaa564-B18","first-page":"406","article-title":"The neighbor-joining method: a new method for reconstructing phylogenetic trees","volume":"4","author":"Saitou","year":"1987","journal-title":"Mol. Biol. Evol"},{"key":"2023062304264769400_btaa564-B19","volume-title":"Phylogenetics, Volume 161 of Mathematics and Its Applications","author":"Semple","year":"2003"},{"key":"2023062304264769400_btaa564-B20","first-page":"1359","article-title":"Normalizing kernels in the Billera\u2013Holmes\u2013Vogtmann treespace","author":"Weyenberg","year":"2016","journal-title":"IEEE ACM Trans. Comput. Biol"},{"key":"2023062304264769400_btaa564-B21","author":"Zairis","year":"2016"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa564\/33796492\/btaa564.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/17\/4590\/50677377\/btaa564.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/17\/4590\/50677377\/btaa564.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,24]],"date-time":"2023-06-24T19:35:29Z","timestamp":1687635329000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/17\/4590\/5855129"}},"subtitle":[],"editor":[{"given":"Pier","family":"Luigi Martelli","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,6,9]]},"references-count":23,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2020,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa564","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,9,1]]},"published":{"date-parts":[[2020,6,9]]}}}