{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T16:50:32Z","timestamp":1759683032108},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"18","license":[{"start":{"date-parts":[[2017,5,11]],"date-time":"2017-05-11T00:00:00Z","timestamp":1494460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,9,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Next Generation Sequencing (NGS) platforms and, more generally, high-throughput technologies are giving rise to an exponential growth in the size of nucleotide sequence databases. Moreover, many emerging applications of nucleotide datasets\u2014as those related to personalized medicine\u2014require the compliance with regulations about the storage and processing of sensitive data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We have designed and carefully engineered E2FM-index, a new full-text index in minute space which was optimized for compressing and encrypting nucleotide sequence collections in FASTA format and for performing fast pattern-search queries. 
E2FM-index allows to build self-indexes which occupy till to 1\/20 of the storage required by the input FASTA file, thus permitting to save about 95% of storage when indexing collections of highly similar sequences; moreover, it can exactly search the built indexes for patterns in times ranging from few milliseconds to a few hundreds milliseconds, depending on pattern length.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Source code is available at https:\/\/github.com\/montecuollo\/E2FM.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx313","type":"journal-article","created":{"date-parts":[[2017,5,10]],"date-time":"2017-05-10T11:09:11Z","timestamp":1494414551000},"page":"2808-2817","source":"Crossref","is-referenced-by-count":2,"title":["<i>E<\/i>\n          \u00a02\n          \u00a0<i>FM<\/i>: an encrypted and compressed full-text index for collections of genomic sequences"],"prefix":"10.1093","volume":"33","author":[{"given":"Ferdinando","family":"Montecuollo","sequence":"first","affiliation":[{"name":"Centro Reti, Sistemi e Servizi Informatici\/CRESSI, Universit\u00e0 degli Studi della Campania \u201cLuigi Vanvitelli,\u201d Napoli, Italy"}]},{"given":"Giovanni","family":"Schmid","sequence":"additional","affiliation":[{"name":"Istituto di Calcolo e Reti ad Alte Prestazioni\/ICAR, Consiglio Nazionale delle Ricerche, Napoli, Italy"}]},{"given":"Roberto","family":"Tagliaferri","sequence":"additional","affiliation":[{"name":"Dipartimento di Scienze Aziendali \u2013 Management & Innovation Systems\/DISA-MIS, Universit\u00e0 di Salerno, Fisciano, 
Italy"}]}],"member":"286","published-online":{"date-parts":[[2017,5,11]]},"reference":[{"key":"2023020206414070500_btx313-B1","volume-title":"Combinatorial Pattern Matching","author":"Bauer","year":"2011"},{"key":"2023020206414070500_btx313-B2","first-page":"360","author":"Bentley","year":"1997"},{"key":"2023020206414070500_btx313-B3","doi-asserted-by":"crossref","first-page":"320","DOI":"10.1145\/5684.5688","article-title":"A locally adaptive data compression scheme","volume":"29","author":"Bentley","year":"1986","journal-title":"Commun. ACM"},{"key":"2023020206414070500_btx313-B100","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1007\/978-3-540-68351-3_8","article-title":"The Salsa20 Family of Stream Ciphers New Stream Cipher Designs","volume":"4986","author":"Bernstein","year":"2008","journal-title":"Lecture Notes In Computer Science"},{"key":"2023020206414070500_btx313-B4","author":"Bonwick","year":"2003"},{"key":"2023020206414070500_btx313-B5","author":"Burrows","year":"1994"},{"key":"2023020206414070500_btx313-B6","volume-title":"Introduction to Algorithms","author":"Cormen","year":"2009"},{"key":"2023020206414070500_btx313-B7","doi-asserted-by":"crossref","first-page":"1415","DOI":"10.1093\/bioinformatics\/bts173","article-title":"Large-scale compression of genomic sequence databases with the burrows\u2013wheeler transform","volume":"28","author":"Cox","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020206414070500_btx313-B8","doi-asserted-by":"crossref","first-page":"420.","DOI":"10.1145\/364520.364540","article-title":"Algorithm 235: random permutation","volume":"7","author":"Durstenfeld","year":"1964","journal-title":"Commun. 
ACM"},{"key":"2023020206414070500_btx313-B9","first-page":"390","author":"Ferragina","year":"2000"},{"key":"2023020206414070500_btx313-B10","author":"Jacobson","year":"1988"},{"key":"2023020206414070500_btx313-B11","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1016\/j.tcs.2007.07.018","article-title":"Fast bwt in small space by blockwise suffix sorting","volume":"387","author":"K\u00e4rkk\u00e4inen","year":"2007","journal-title":"Theor. Comput. Sci"},{"key":"2023020206414070500_btx313-B12","doi-asserted-by":"crossref","first-page":"1.","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short DNA sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol"},{"key":"2023020206414070500_btx313-B13","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1007\/11496656_16","volume-title":"Annual Symposium on Combinatorial Pattern Matching","author":"Mantaci","year":"2005"},{"key":"2023020206414070500_btx313-B14","volume-title":"Handbook of Applied Cryptography","author":"Menezes","year":"1997"},{"key":"2023020206414070500_btx313-B15","author":"Mouha","year":"2013"},{"key":"2023020206414070500_btx313-B16","doi-asserted-by":"crossref","first-page":"R131","DOI":"10.1093\/hmg\/ddq400","article-title":"Small insertions and deletions (indels) in human genomes","volume":"19","author":"Mullaney","year":"2010","journal-title":"Hum. Mol. 
Genet"},{"key":"2023020206414070500_btx313-B17","first-page":"16","article-title":"Data compression by means of a \u201cbook stack\u201d","volume":"16","author":"Ryabko","year":"1980","journal-title":"Problemy Peredachi Informatsii"},{"key":"2023020206414070500_btx313-B18","first-page":"btw505.","article-title":"Nrgc: a novel referential genome compression algorithm","author":"Saha","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020206414070500_btx313-B19","doi-asserted-by":"crossref","first-page":"1652","DOI":"10.1093\/bioinformatics\/btw050","article-title":"Efficient privacy-preserving string search and an application in genomics","volume":"32","author":"Shimizu","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020206414070500_btx313-B20","doi-asserted-by":"crossref","first-page":"1.","DOI":"10.1186\/1748-7188-6-23","article-title":"Recoil-an algorithm for compression of extremely large datasets of dna data","volume":"6","author":"Yanovsky","year":"2011","journal-title":"Algorithms for Molecular 
Biology"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/18\/2808\/49041269\/bioinformatics_33_18_2808.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/18\/2808\/49041269\/bioinformatics_33_18_2808.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T06:42:05Z","timestamp":1675320125000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/18\/2808\/3819194"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,5,11]]},"references-count":21,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2017,9,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx313","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,9,15]]},"published":{"date-parts":[[2017,5,11]]}}}