{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,22]],"date-time":"2025-02-22T00:45:54Z","timestamp":1740185154600,"version":"3.37.3"},"reference-count":18,"publisher":"Oxford University Press (OUP)","issue":"21","license":[{"start":{"date-parts":[[2021,6,8]],"date-time":"2021-06-08T00:00:00Z","timestamp":1623110400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Museum National d\u2019Histoire Naturelle"},{"name":"Institut Universitaire de France"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,11,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Summary<\/jats:title>\n                  <jats:p>Genomic sequences are widely used to infer the evolutionary history of a given group of individuals. Many methods have been developed for sequence clustering and tree building. In the early days of genome sequencing, these were often limited to hundreds of sequences but due to the surge of high throughput sequencing, it is now common to have millions of sampled sequences at hand. We introduce MNHN-Tree-Tools, a high performance set of algorithms that builds multi-scale, nested clusters of sequences found in a FASTA file. MNHN-Tree-Tools does not rely on multiple sequence alignment and can thus be used on large datasets to infer a sequence tree. 
Herein, we outline two applications: a human alpha-satellite repeats classification and a tree of life derivation from 16S\/18S rDNA sequences.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Open source with a Zlib License via the Git protocol: https:\/\/gitlab.in2p3.fr\/mnhn-tools\/mnhn-tree-tools.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Manual<\/jats:title>\n                  <jats:p>A detailed users guide and tutorial: https:\/\/gitlab.in2p3.fr\/mnhn-tools\/mnhn-tree-tools-manual\/-\/raw\/master\/manual.pdf.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Website and FAQ<\/jats:title>\n                  <jats:p>http:\/\/treetools.haschka.net.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab430","type":"journal-article","created":{"date-parts":[[2021,6,7]],"date-time":"2021-06-07T11:55:56Z","timestamp":1623066956000},"page":"3947-3949","source":"Crossref","is-referenced-by-count":2,"title":["MNHN-Tree-Tools: a toolbox for tree inference using multi-scale clustering of a set of sequences"],"prefix":"10.1093","volume":"37","author":[{"given":"Thomas","family":"Haschka","sequence":"first","affiliation":[{"name":"Mus\u00e9um National d\u2019Histoire Naturelle, Structure et Instabilit\u00e9 des G\u00e9nomes, UMR7196 , Paris 75231, France"}]},{"given":"Loic","family":"Ponger","sequence":"additional","affiliation":[{"name":"Mus\u00e9um National d\u2019Histoire Naturelle, Structure et Instabilit\u00e9 des G\u00e9nomes, UMR7196 , Paris 75231, 
France"}]},{"given":"Christophe","family":"Escud\u00e9","sequence":"additional","affiliation":[{"name":"Mus\u00e9um National d\u2019Histoire Naturelle, Structure et Instabilit\u00e9 des G\u00e9nomes, UMR7196 , Paris 75231, France"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5652-0302","authenticated-orcid":false,"given":"Julien","family":"Mozziconacci","sequence":"additional","affiliation":[{"name":"Mus\u00e9um National d\u2019Histoire Naturelle, Structure et Instabilit\u00e9 des G\u00e9nomes, UMR7196 , Paris 75231, France"},{"name":"Institut Universitaire de France, 1 rue Descartes, 75005 Paris, France"}]}],"member":"286","published-online":{"date-parts":[[2021,6,8]]},"reference":[{"key":"2023051608254392600_btab430-B1","first-page":"49","volume-title":"Optics: Ordering Points to Identify the Clustering Structure","author":"Ankerst","year":"1999"},{"first-page":"17","year":"2008","author":"Chatterji","key":"2023051608254392600_btab430-B2"},{"key":"2023051608254392600_btab430-B3","doi-asserted-by":"crossref","first-page":"302","DOI":"10.3389\/fevo.2019.00302","article-title":"Review and interpretation of trends in DNA barcoding","volume":"7","author":"DeSalle","year":"2019","journal-title":"Front. Ecol. Evol"},{"first-page":"226","year":"1996","author":"Ester","key":"2023051608254392600_btab430-B4"},{"year":"1994","author":"Forum","key":"2023051608254392600_btab430-B5"},{"key":"2023051608254392600_btab430-B6","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1038\/nrg.2016.49","article-title":"Coming of age: ten years of next-generation sequencing technologies","volume":"17","author":"Goodwin","year":"2016","journal-title":"Nat. Rev. 
Genet"},{"key":"2023051608254392600_btab430-B7","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1016\/j.tig.2007.02.001","article-title":"DNA barcoding: how it complements taxonomy, molecular phylogenetics and population genetics","volume":"23","author":"Hajibabaei","year":"2007","journal-title":"Trends Genet"},{"first-page":"246","year":"2004","author":"Kailing","key":"2023051608254392600_btab430-B8"},{"key":"2023051608254392600_btab430-B9","doi-asserted-by":"crossref","first-page":"1435","DOI":"10.1126\/science.2983426","article-title":"Rapid and sensitive protein similarity searches","volume":"227","author":"Lipman","year":"1985","journal-title":"Science"},{"key":"2023051608254392600_btab430-B10","doi-asserted-by":"crossref","first-page":"1362","DOI":"10.1016\/j.proeng.2012.06.169","article-title":"Phylogenetic tree construction for DNA sequences using clustering methods","volume":"38","author":"Mahapatro","year":"2012","journal-title":"Proc. Eng"},{"key":"2023051608254392600_btab430-B11","doi-asserted-by":"crossref","first-page":"e1420","DOI":"10.7717\/peerj.1420","article-title":"Swarm v2: highly-scalable and high-resolution amplicon clustering","volume":"3","author":"Mah\u00e9","year":"2015","journal-title":"PeerJ"},{"key":"2023051608254392600_btab430-B12","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1038\/ismej.2011.139","article-title":"An improved greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea","volume":"6","author":"McDonald","year":"2012","journal-title":"ISME J"},{"key":"2023051608254392600_btab430-B13","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1016\/j.syapm.2011.03.001","article-title":"Release ltps104 of the all-species living tree","volume":"34","author":"Munoz","year":"2011","journal-title":"Syst. Appl. 
Microbiol"},{"key":"2023051608254392600_btab430-B14","doi-asserted-by":"crossref","first-page":"e2584","DOI":"10.7717\/peerj.2584","article-title":"Vsearch: a versatile open source tool for metagenomics","volume":"4","author":"Rognes","year":"2016","journal-title":"PeerJ"},{"key":"2023051608254392600_btab430-B15","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1016\/j.eswa.2011.06.049","article-title":"Clustering based distributed phylogenetic tree construction","volume":"39","author":"Ruzgar","year":"2012","journal-title":"Expert Syst. Appl"},{"key":"2023051608254392600_btab430-B16","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","article-title":"Identification of common molecular subsequences","volume":"147","author":"Smith","year":"1981","journal-title":"J. Mol. Biol"},{"key":"2023051608254392600_btab430-B17","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1109\/MCSE.2010.69","article-title":"Opencl: a parallel programming standard for heterogeneous computing systems","volume":"12","author":"Stone","year":"2010","journal-title":"Comput. Sci. 
Eng"},{"key":"2023051608254392600_btab430-B18","doi-asserted-by":"crossref","first-page":"103708","DOI":"10.1016\/j.dib.2019.103708","article-title":"Classification and monomer-by-monomer annotation dataset of suprachromosomal family 1 alpha satellite higher-order repeats in hg38 human genome assembly","volume":"24","author":"Uralsky","year":"2019","journal-title":"Data Brief"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab430\/38709390\/btab430.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/21\/3947\/50337033\/btab430.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/21\/3947\/50337033\/btab430.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T08:36:27Z","timestamp":1684226187000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/21\/3947\/6294927"}},"subtitle":[],"editor":[{"given":"Teresa","family":"Przytycka","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,6,8]]},"references-count":18,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2021,11,5]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab430","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2021,11,1]]},"published":{"date-parts":[[2021,6,8]]}}}