{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T11:43:38Z","timestamp":1753875818689,"version":"3.41.2"},"reference-count":22,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T00:00:00Z","timestamp":1726704000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Shenzhen Basic Research Institutions","award":["JCKY2020-44"],"award-info":[{"award-number":["JCKY2020-44"]}]},{"DOI":"10.13039\/501100017607","name":"Shenzhen Fundamental Research Program","doi-asserted-by":"publisher","award":["JCYJ20220818103212025","20220817165436004"],"award-info":[{"award-number":["JCYJ20220818103212025","20220817165436004"]}],"id":[{"id":"10.13039\/501100017607","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Summary<\/jats:title>\n                  <jats:p>Sketching technologies have recently emerged as a promising solution for real-time, large-scale phylogenetic analysis. However, existing sketching-based phylogenetic tools exhibit drawbacks, including platform restrictions, deficiencies in tree visualization, and inherent distance estimation bias. These limitations collectively impede the overall convenience and efficiency of the analysis. In this study, we introduce Kssdtree, an interactive Python package designed to address these challenges. Kssdtree surpasses other sketching-based tools by demonstrating superior performance in terms of both accuracy and time efficiency on comprehensive benchmarking datasets. Notably, Kssdtree offers key advantages such as intra-species phylogenomic analysis and GTDB-based phylogenetic placement analysis, significantly enhancing the scope and depth of phylogenetic investigations. Through extensive evaluations and comparisons, Kssdtree stands out as an efficient and versatile method for real-time, large-scale phylogenetic analysis.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The Kssdtree Python package is freely accessible at https:\/\/pypi.org\/project\/kssdtree and source code is available at https:\/\/github.com\/yhlink\/kssdtree. The documentation and instantiation for the software is available at https:\/\/kssdtree.readthedocs.io\/en\/latest. The video tutorial is available at\u00a0https:\/\/youtu.be\/_6hg59Yn-Ws.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae566","type":"journal-article","created":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T17:39:36Z","timestamp":1726767576000},"source":"Crossref","is-referenced-by-count":0,"title":["Kssdtree: an interactive Python package for phylogenetic analysis based on sketching technique"],"prefix":"10.1093","volume":"40","author":[{"given":"Hang","family":"Yang","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology , Jinzhong 030600,","place":["China"]},{"name":"Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences , Shenzhen 518055,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoxin","family":"Lu","sequence":"additional","affiliation":[{"name":"Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences , Shenzhen 518055,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jiaxing","family":"Chang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology , Jinzhong 030600,","place":["China"]},{"name":"Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences , Shenzhen 518055,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qing","family":"Chang","sequence":"additional","affiliation":[{"name":"Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences , Shenzhen 518055,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wen","family":"Zheng","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology , Jinzhong 030600,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zehua","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology , Jinzhong 030600,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huiguang","family":"Yi","sequence":"additional","affiliation":[{"name":"Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences , Shenzhen 518055,","place":["China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,9,19]]},"reference":[{"key":"2024101105064354800_btae566-B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J Mol Biol"},{"key":"2024101105064354800_btae566-B2","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1109\/MCSE.2010.118","article-title":"Cython: the best of both worlds","volume":"13","author":"Behnel","year":"2011","journal-title":"Comput Sci Eng"},{"key":"2024101105064354800_btae566-B3","doi-asserted-by":"crossref","first-page":"btac774","DOI":"10.1093\/bioinformatics\/btac774","article-title":"Scaling neighbor joining to one million taxa with dynamic and heuristic neighbor joining","volume":"39","author":"Clausen","year":"2023","journal-title":"Bioinformatics"},{"key":"2024101105064354800_btae566-B4","doi-asserted-by":"crossref","first-page":"3019","DOI":"10.1093\/molbev\/msr108","article-title":"Large-scale phylogenomic analyses indicate a deep origin of primary plastids within cyanobacteria","volume":"28","author":"Criscuolo","year":"2011","journal-title":"Mol Biol Evol"},{"key":"2024101105064354800_btae566-B5","doi-asserted-by":"crossref","first-page":"1115","DOI":"10.1093\/molbev\/msr268","article-title":"ALF\u2014a simulation framework for genome evolution","volume":"29","author":"Dalquen","year":"2012","journal-title":"Mol Biol Evol"},{"key":"2024101105064354800_btae566-B6","doi-asserted-by":"crossref","first-page":"1792","DOI":"10.1093\/nar\/gkh340","article-title":"MUSCLE: multiple sequence alignment with high accuracy and high throughput","volume":"32","author":"Edgar","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2024101105064354800_btae566-B7","doi-asserted-by":"crossref","first-page":"522","DOI":"10.1186\/s12864-015-1647-5","article-title":"An assembly and alignment-free method of phylogeny reconstruction from next-generation sequencing data","volume":"16","author":"Fan","year":"2015","journal-title":"BMC Genomics"},{"key":"2024101105064354800_btae566-B8","doi-asserted-by":"crossref","first-page":"1635","DOI":"10.1093\/molbev\/msw046","article-title":"ETE 3: Reconstruction, analysis, and visualization of phylogenomic data","volume":"33","author":"Huerta-Cepas","year":"2016","journal-title":"Mol Biol Evol"},{"key":"2024101105064354800_btae566-B9","doi-asserted-by":"crossref","first-page":"1","DOI":"10.21105\/joss.01762","article-title":"Mashtree: a rapid comparison of whole genome sequence files","volume":"4","author":"Katz","year":"2019","journal-title":"J Open Source Softw"},{"key":"2024101105064354800_btae566-B10","doi-asserted-by":"crossref","first-page":"2798","DOI":"10.1093\/molbev\/msv150","article-title":"FastME 2.0: a comprehensive, accurate, and fast distance-based phylogeny inference program","volume":"32","author":"Lefort","year":"2015","journal-title":"Mol Biol Evol"},{"key":"2024101105064354800_btae566-B11","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1038\/s41586-023-05896-x","article-title":"A draft human pangenome reference","volume":"617","author":"Liao","year":"2023","journal-title":"Nature"},{"key":"2024101105064354800_btae566-B12","doi-asserted-by":"crossref","first-page":"132","DOI":"10.1186\/s13059-016-0997-x","article-title":"Mash: fast genome and metagenome distance estimation using MinHash","volume":"17","author":"Ondov","year":"2016","journal-title":"Genome Biol"},{"key":"2024101105064354800_btae566-B13","doi-asserted-by":"crossref","first-page":"D785","DOI":"10.1093\/nar\/gkab776","article-title":"GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy","volume":"50","author":"Parks","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2024101105064354800_btae566-B14","doi-asserted-by":"crossref","first-page":"1006","DOI":"10.12688\/f1000research.19675.1","article-title":"Large-scale sequence comparisons with sourmash","volume":"8","author":"Pierce","year":"2019","journal-title":"F1000Res"},{"key":"2024101105064354800_btae566-B15","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1016\/0025-5564(81)90043-2","article-title":"Comparison of phylogenetic trees","volume":"53","author":"Robinson","year":"1981","journal-title":"Math Biosci"},{"key":"2024101105064354800_btae566-B16","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1186\/s13059-019-1809-x","article-title":"When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data","volume":"20","author":"Rowe","year":"2019","journal-title":"Genome Biol"},{"key":"2024101105064354800_btae566-B17","first-page":"406","article-title":"The neighbor-joining method: a new method for reconstructing phylogenetic trees","volume":"4","author":"Saitou","year":"1987","journal-title":"Mol Biol Evol"},{"key":"2024101105064354800_btae566-B18","doi-asserted-by":"crossref","first-page":"4673","DOI":"10.1093\/nar\/22.22.4673","article-title":"CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice","volume":"22","author":"Thompson","year":"1994","journal-title":"Nucleic Acids Res"},{"key":"2024101105064354800_btae566-B19","doi-asserted-by":"crossref","first-page":"e75","DOI":"10.1093\/nar\/gkt003","article-title":"Co-phylog: an assembly-free phylogenomic approach for closely related organisms","volume":"41","author":"Yi","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2024101105064354800_btae566-B20","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1186\/s13059-021-02303-4","article-title":"KSSD: sequence dimensionality reduction by k-mer substring space sampling enables real-time large-scale datasets analysis","volume":"22","author":"Yi","year":"2021","journal-title":"Genome Biol"},{"key":"2024101105064354800_btae566-B21","doi-asserted-by":"crossref","first-page":"671","DOI":"10.1093\/bioinformatics\/bty651","article-title":"BinDash, software for fast genome distance estimation on a typical personal laptop","volume":"35","author":"Zhao","year":"2019","journal-title":"Bioinformatics"},{"key":"2024101105064354800_btae566-B22","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1186\/s13059-019-1755-7","article-title":"Benchmarking of alignment-free sequence comparison methods","volume":"20","author":"Zielezinski","year":"2019","journal-title":"Genome Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae566\/59204008\/btae566.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/10\/btae566\/59716446\/btae566.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/10\/btae566\/59716446\/btae566.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,11]],"date-time":"2024-10-11T05:06:58Z","timestamp":1728623218000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae566\/7762101"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2024,9,19]]},"references-count":22,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2024,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae566","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2024,10]]},"published":{"date-parts":[[2024,9,19]]},"article-number":"btae566"}}