{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T03:53:16Z","timestamp":1772077996255,"version":"3.50.1"},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"24","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,12,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Next-generation sequencing (NGS) has revolutionized biomedical research in the past decade and led to a continuous stream of developments in bioinformatics, addressing the need for fast and space-efficient solutions for analyzing NGS data. Researchers often need to analyze a set of genomic sequences that stem from closely related species or even from individuals of the same species. Hence, the analyzed sequences are similar. For analyses where local changes in the examined sequence induce only local changes in the results, it is desirable to examine identical or similar regions only once.<\/jats:p>\n               <jats:p>Results: In this work, we provide a datatype that exploits data parallelism inherent in a set of similar sequences by analyzing shared regions only once. 
In real-world experiments, we show that algorithms that otherwise would scan each reference sequentially can be sped up by a factor of 115.<\/jats:p>\n               <jats:p>Availability: The data structure and associated tools are publicly available at http:\/\/www.seqan.de\/projects\/jst and are part of SeqAn, the C++ template library for sequence analysis.<\/jats:p>\n               <jats:p>Contact: rene.rahn@fu-berlin.de<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu438","type":"journal-article","created":{"date-parts":[[2014,7,16]],"date-time":"2014-07-16T00:23:13Z","timestamp":1405470193000},"page":"3499-3505","source":"Crossref","is-referenced-by-count":29,"title":["Journaled string tree\u2014a scalable data structure for analyzing thousands of similar genomes on your laptop"],"prefix":"10.1093","volume":"30","author":[{"given":"Ren\u00e9","family":"Rahn","sequence":"first","affiliation":[{"name":"Department of Mathematics and Computer Science, Freie Universit\u00e4t Berlin, Takustr. 9, 14195 Berlin, Germany"}]},{"given":"David","family":"Weese","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Freie Universit\u00e4t Berlin, Takustr. 9, 14195 Berlin, Germany"}]},{"given":"Knut","family":"Reinert","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Freie Universit\u00e4t Berlin, Takustr. 
9, 14195 Berlin, Germany"}]}],"member":"286","published-online":{"date-parts":[[2014,7,15]]},"reference":[{"key":"2023012712051725500_btu438-B1","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/nature09534","article-title":"A map of human genome variation from population-scale sequencing","volume":"467","author":"Altshuler","year":"2010","journal-title":"Nature"},{"key":"2023012712051725500_btu438-B2","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1126\/science.1216872","article-title":"A fine-scale chimpanzee genetic map from population sequencing","volume":"336","author":"Auton","year":"2012","journal-title":"Science"},{"key":"2023012712051725500_btu438-B3","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1145\/135239.135243","article-title":"A new approach to text searching","volume":"35","author":"Baeza-Yates","year":"1992","journal-title":"Commun. ACM"},{"key":"2023012712051725500_btu438-B4","doi-asserted-by":"crossref","first-page":"11920","DOI":"10.1073\/pnas.1201904109","article-title":"A public resource facilitating clinical use of genomes","volume":"109","author":"Ball","year":"2012","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023012712051725500_btu438-B5","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1504\/IJCBDD.2013.052206","article-title":"Querying highly similar sequences","volume":"6","author":"Barton","year":"2013","journal-title":"Int. J. Comput. Biol. 
Drug Des."},{"key":"2023012712051725500_btu438-B6","doi-asserted-by":"crossref","first-page":"i174","DOI":"10.1093\/bioinformatics\/btn300","article-title":"Optimal spliced alignments of short sequence reads","volume":"24","author":"De Bona","year":"2008","journal-title":"Bioinformatics"},{"key":"2023012712051725500_btu438-B7","doi-asserted-by":"crossref","first-page":"2572","DOI":"10.1093\/bioinformatics\/btt460","article-title":"Genome compression: a novel approach for large collections","volume":"29","author":"Deorowicz","year":"2013","journal-title":"Bioinformatics"},{"key":"2023012712051725500_btu438-B8","doi-asserted-by":"crossref","first-page":"2979","DOI":"10.1093\/bioinformatics\/btr505","article-title":"Robust relative compression of genomes with random access","volume":"27","author":"Deorowicz","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012712051725500_btu438-B9","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/1471-2105-9-11","article-title":"SeqAn an efficient, generic C++ library for sequence analysis","volume":"9","author":"D\u00f6ring","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012712051725500_btu438-B10","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/nature09534","article-title":"A map of human genome variation from population-scale sequencing","volume":"467","author":"Durbin","year":"2010","journal-title":"Nature"},{"key":"2023012712051725500_btu438-B11","doi-asserted-by":"crossref","DOI":"10.1109\/SFCS.2000.892127","article-title":"Opportunistic data structures with applications","volume-title":"Proceedings of the 41st Annual Symposium on Foundations of Computer Science","author":"Ferragina","year":"2000"},{"key":"2023012712051725500_btu438-B12","doi-asserted-by":"crossref","first-page":"851","DOI":"10.1038\/nature06258","article-title":"A second generation human haplotype map of over 3.1 million 
SNPs","volume":"449","author":"Frazer","year":"2007","journal-title":"Nature"},{"key":"2023012712051725500_btu438-B13","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1093\/nar\/28.1.228","article-title":"Increased coverage of protein families with the blocks database servers","volume":"28","author":"Henikoff","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023012712051725500_btu438-B14","doi-asserted-by":"crossref","first-page":"501","DOI":"10.1002\/spe.4380100608","article-title":"Practical fast searching in strings","volume":"10","author":"Horspool","year":"1980","journal-title":"Softw. Pract. Exper."},{"key":"2023012712051725500_btu438-B15","doi-asserted-by":"crossref","first-page":"i361","DOI":"10.1093\/bioinformatics\/btt215","article-title":"Short read alignment with populations of genomes","volume":"29","author":"Huang","year":"2013","journal-title":"Bioinformatics"},{"key":"2023012712051725500_btu438-B16","doi-asserted-by":"crossref","first-page":"728","DOI":"10.1126\/science.1197891","article-title":"On the future of genomic data","volume":"331","author":"Kahn","year":"2011","journal-title":"Science"},{"key":"2023012712051725500_btu438-B17","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1038\/nature10413","article-title":"Mouse genomic variation and its effect on phenotypes and gene regulation","volume":"477","author":"Keane","year":"2011","journal-title":"Nature"},{"key":"2023012712051725500_btu438-B18","article-title":"Optimized relative lempel-ziv compression of genomes","volume-title":"Proceedings of the Thirty-Fourth Australasian Computer Science Conference","author":"Kuruppu","year":"2011"},{"key":"2023012712051725500_btu438-B19","article-title":"Indexing similar dna sequences","volume-title":"Algorithmic Aspects in Information and 
Management","author":"Lam","year":"2010"},{"key":"2023012712051725500_btu438-B20","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with burrows\u2013wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012712051725500_btu438-B21","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1089\/cmb.2005.12.407","article-title":"Space-efficient whole genome comparisons with burrows-wheeler","volume":"12","author":"Lippert","year":"2005","journal-title":"J. Comput. Biol."},{"key":"2023012712051725500_btu438-B22","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1038\/nbt.2241","article-title":"Compressive genomics","volume":"30","author":"Loh","year":"2012","journal-title":"Nat. Biotechnol."},{"key":"2023012712051725500_btu438-B23","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-02008-7_9","article-title":"Storage and retrieval of individual genomes","volume-title":"Research in Computational Molecular Biology","author":"M\u00e4kinen","year":"2009"},{"key":"2023012712051725500_btu438-B24","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1145\/316542.316550","article-title":"A fast bit-vector algorithm for approximate string matching based on dynamic programming","volume":"46","author":"Myers","year":"1999","journal-title":"J. 
ACM"},{"key":"2023012712051725500_btu438-B25","doi-asserted-by":"crossref","first-page":"e27","DOI":"10.1093\/nar\/gkr1124","article-title":"GReEn: a tool for efficient compression of genome resequencing data","volume":"40","author":"Pinho","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012712051725500_btu438-B26","doi-asserted-by":"crossref","first-page":"R98","DOI":"10.1186\/gb-2009-10-9-r98","article-title":"Simultaneous alignment of short reads against multiple genomes","volume":"10","author":"Schneeberger","year":"2009","journal-title":"Genome Biol."},{"key":"2023012712051725500_btu438-B27","doi-asserted-by":"crossref","first-page":"799","DOI":"10.1093\/bioinformatics\/15.10.799","article-title":"Fingerprintscan: intelligent searching of the prints motif database","volume":"15","author":"Scordis","year":"1999","journal-title":"Bioinformatics"},{"key":"2023012712051725500_btu438-B28","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1007\/978-3-642-23038-7_23","article-title":"Indexing finite language representation of population genotypes","volume":"6833","author":"Sir\u00e9n","year":"2011","journal-title":"Algorithms Bioinformatics"},{"key":"2023012712051725500_btu438-B29","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature11632","article-title":"An integrated map of genetic variation from 1,092 human genomes","volume":"491","author":"The 1000 Genomes Project Consortium","year":"2012","journal-title":"Nature"},{"key":"2023012712051725500_btu438-B30","doi-asserted-by":"crossref","first-page":"993","DOI":"10.1038\/nature08987","article-title":"International network of cancer genome projects","volume":"464","author":"The International Cancer Genome Consortium","year":"2010","journal-title":"Nature"},{"key":"2023012712051725500_btu438-B31","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1186\/1748-7188-7-30","article-title":"Adaptive efficient compression of 
genomes","volume":"7","author":"Wandelt","year":"2012","journal-title":"Algorithms Mol. Biol."},{"key":"2023012712051725500_btu438-B32","doi-asserted-by":"crossref","first-page":"2592","DOI":"10.1093\/bioinformatics\/bts505","article-title":"Razers 3: faster, fully sensitive read mapping","volume":"28","author":"Weese","year":"2012","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/24\/3499\/48931568\/bioinformatics_30_24_3499.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/24\/3499\/48931568\/bioinformatics_30_24_3499.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T13:03:49Z","timestamp":1674824629000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/24\/3499\/2422167"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,7,15]]},"references-count":32,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2014,12,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu438","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,12,15]]},"published":{"date-parts":[[2014,7,15]]}}}