{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T03:53:42Z","timestamp":1772078022268,"version":"3.50.1"},"reference-count":13,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1082,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: The data deluge phenomenon is becoming a serious problem in most genomic centers. To alleviate it, general purpose tools, such as gzip, are used to compress the data. However, although pervasive and easy to use, these tools fall short when the intention is to reduce as much as possible the data, for example, for medium- and long-term storage. A number of algorithms have been proposed for the compression of genomics data, but unfortunately only a few of them have been made available as usable and reliable compression tools.<\/jats:p>\n               <jats:p>Results: In this article, we describe one such tool, MFCompress, specially designed for the compression of FASTA and multi-FASTA files. In comparison to gzip and applied to multi-FASTA files, MFCompress can provide additional average compression gains of almost 50%, i.e. it potentially doubles the available storage, although at the cost of some more computation time. On highly redundant datasets, and in comparison with gzip, 8-fold size reductions have been obtained.<\/jats:p>\n               <jats:p>Availability: Both source code and binaries for several operating systems are freely available for non-commercial use at http:\/\/bioinformatics.ua.pt\/software\/mfcompress\/.<\/jats:p>\n               <jats:p>Contact: \u00a0ap@ua.pt<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btt594","type":"journal-article","created":{"date-parts":[[2013,10,17]],"date-time":"2013-10-17T01:33:55Z","timestamp":1381973635000},"page":"117-118","source":"Crossref","is-referenced-by-count":96,"title":["MFCompress: a compression tool for FASTA and multi-FASTA data"],"prefix":"10.1093","volume":"30","author":[{"given":"Armando J.","family":"Pinho","sequence":"first","affiliation":[{"name":"IEETA, Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810\u2013193 Aveiro, Portugal"}]},{"given":"Diogo","family":"Pratas","sequence":"additional","affiliation":[{"name":"IEETA, Department of Electronics, Telecommunications and Informatics, University of Aveiro, 3810\u2013193 Aveiro, Portugal"}]}],"member":"286","published-online":{"date-parts":[[2013,10,16]]},"reference":[{"key":"2023012710375778000_btt594-B1","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1038\/nrg3433","article-title":"Computational solutions for omics data","volume":"14","author":"Berger","year":"2013","journal-title":"Nat. Rev. Genet."},{"key":"2023012710375778000_btt594-B2","doi-asserted-by":"crossref","first-page":"e59190","DOI":"10.1371\/journal.pone.0059190","article-title":"Compression of FASTQ and SAM format sequencing data","volume":"8","author":"Bonfield","year":"2013","journal-title":"PLoS One"},{"key":"2023012710375778000_btt594-B3","first-page":"43","article-title":"A simple statistical algorithm for biological sequence compression","volume-title":"Data Compression Conference, DCC-2007, Snowbird, Utah","author":"Cao","year":"2007"},{"key":"2023012710375778000_btt594-B4","doi-asserted-by":"crossref","first-page":"1415","DOI":"10.1093\/bioinformatics\/bts173","article-title":"Large-scale compression of genomic sequence databases with the Burrows-Wheeler transform","volume":"28","author":"Cox","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012710375778000_btt594-B5","first-page":"340","article-title":"Compression of DNA sequences","volume-title":"Data Compression Conference, DCC-93, Snowbird, Utah","author":"Grumbach","year":"1993"},{"key":"2023012710375778000_btt594-B6","doi-asserted-by":"crossref","first-page":"3051","DOI":"10.1093\/bioinformatics\/bts593","article-title":"SCALCE: boosting sequence compression algorithms using locally consistent encoding","volume":"28","author":"Hach","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012710375778000_btt594-B7","doi-asserted-by":"crossref","first-page":"e171","DOI":"10.1093\/nar\/gks754","article-title":"Compression of next-generation sequencing reads aided by highly efficient de novo assembly","volume":"40","author":"Jones","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012710375778000_btt594-B8","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1109\/DCC.2007.60","article-title":"Normalized maximum likelihood model of order-1 for the compression of DNA sequences","volume-title":"Data Compression Conference, DCC-2007, Snowbird, Utah","author":"Korodi","year":"2007"},{"key":"2023012710375778000_btt594-B9","doi-asserted-by":"crossref","first-page":"3189","DOI":"10.1109\/TIT.2012.2236605","article-title":"A compression model for DNA multiple sequence alignment blocks","volume":"59","author":"Matos","year":"2013","journal-title":"IEEE Trans. Inf. Theory"},{"key":"2023012710375778000_btt594-B10","doi-asserted-by":"crossref","first-page":"2527","DOI":"10.1093\/bioinformatics\/bts467","article-title":"DELIMINATE - a fast and efficient method for loss-less compression of genomic sequences","volume":"28","author":"Mohammed","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012710375778000_btt594-B11","doi-asserted-by":"crossref","first-page":"e21588","DOI":"10.1371\/journal.pone.0021588","article-title":"On the representability of complete genomes by multiple competing finite-context (Markov) models","volume":"6","author":"Pinho","year":"2011","journal-title":"PLoS One"},{"key":"2023012710375778000_btt594-B12","doi-asserted-by":"crossref","first-page":"e27","DOI":"10.1093\/nar\/gkr1124","article-title":"GReEn: a tool for efficient compression of genome resequencing data","volume":"40","author":"Pinho","year":"2012","journal-title":"Nucleic Acids Res."},{"key":"2023012710375778000_btt594-B13","doi-asserted-by":"crossref","first-page":"e27","DOI":"10.1093\/nar\/gks939","article-title":"NGC: lossless and lossy compression of aligned high-throughput sequencing data","volume":"41","author":"Popitsch","year":"2013","journal-title":"Nucleic Acids Res."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/1\/117\/48913003\/bioinformatics_30_1_117.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/1\/117\/48913003\/bioinformatics_30_1_117.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T10:40:51Z","timestamp":1674816051000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/1\/117\/236841"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,10,16]]},"references-count":13,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2014,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btt594","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,1,1]]},"published":{"date-parts":[[2013,10,16]]}}}