{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:30:55Z","timestamp":1760243455069,"version":"build-2065373602"},"reference-count":18,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2013,5,21]],"date-time":"2013-05-21T00:00:00Z","timestamp":1369094400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Data storage is a major and growing part of IT budgets for research since many years. Especially in biology, the amount of raw data products is growing continuously, and the advent of the so-called \"next-generation\" sequencers has made things worse. Affordable prices have pushed scientists to massively sequence whole genomes and to screen large cohort of patients, thereby producing tons of data as a side effect. The need for maximally fitting data into the available storage volumes has encouraged and welcomed new compression algorithms and tools. 
We focus here on state-of-the-art compression tools and measure their compression performance on ABI SOLiD data.<\/jats:p>","DOI":"10.3390\/a6020309","type":"journal-article","created":{"date-parts":[[2013,5,21]],"date-time":"2013-05-21T14:22:12Z","timestamp":1369146132000},"page":"309-318","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Multi-Sided Compression Performance Assessment of ABI SOLiD WES Data"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0434-8533","authenticated-orcid":false,"given":"Tommaso","family":"Mazza","sequence":"first","affiliation":[{"name":"IRCCS Casa Sollievo della Sofferenza-Mendel Institute, Regina Margherita Avenue, 261, Rome 00198, Italy"}]},{"given":"Stefano","family":"Castellana","sequence":"additional","affiliation":[{"name":"IRCCS Casa Sollievo della Sofferenza-Mendel Institute, Regina Margherita Avenue, 261, Rome 00198, Italy"}]}],"member":"1968","published-online":{"date-parts":[[2013,5,21]]},"reference":[{"key":"ref_1","unstructured":"Goddard, W.A., and Lynott, J. (1970). Direct Access Magnetic Disc Storage Device. (3,503,060), U.S. Patent."},{"key":"ref_2","unstructured":"Komorowski, M. A history of storage cost. 
Available online: http:\/\/www.mkomo.com\/cost-per-gigabyte."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/nature09534","article-title":"A map of human genome variation from population-scale sequencing","volume":"467","author":"Consortium","year":"2010","journal-title":"Nature"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"D871","DOI":"10.1093\/nar\/gkq1017","article-title":"ENCODE whole-genome data in the UCSC genome browser (2011 update)","volume":"39","author":"Rosenbloom","year":"2011","journal-title":"Nucleic Acids Res."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1038\/nature11690","article-title":"Analysis of 6515 exomes reveals the recent origin of most human protein-coding variants","volume":"493","author":"Fu","year":"2013","journal-title":"Nature"},{"key":"ref_6","unstructured":"The SAM Format Specification Working Group The SAM Format Specification (v1.4-r985). Available online: http:\/\/samtools.sourceforge.net\/SAM1.pdf."},{"key":"ref_7","unstructured":"The SAM Format Specification Working Group Picard. Available online: http:\/\/picard.sourceforge.net\/."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1038\/ng.806","article-title":"A framework for variation discovery and genotyping using next-generation DNA sequencing data","volume":"43","author":"DePristo","year":"2011","journal-title":"Nat. Genet."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1093\/bioinformatics\/btq033","article-title":"BEDTools: A flexible suite of utilities for comparing genomic features","volume":"26","author":"Quinlan","year":"2010","journal-title":"Bioinformatics"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1136\/jmedgenet-2012-100918","article-title":"wANNOVAR: Annotating genetic variants for personal genomes via the web","volume":"49","author":"Chang","year":"2012","journal-title":"J. Med. 
Genet."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Castellana, S., Romani, M., Valente, E., and Mazza, T. (2012). A solid quality-control analysis of AB SOLiD short-read sequencing data. Brief. Bioinform.","DOI":"10.1093\/bib\/bbs048"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"734","DOI":"10.1101\/gr.114819.110","article-title":"Efficient storage of high throughput DNA sequencing data using reference-based compression","volume":"21","author":"Fritz","year":"2011","journal-title":"Genome Res."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Jones, D., Ruzzo, W., Peng, X., and Katze, M. (2012). Compression of next-generation sequencing reads aided by highly efficient de novo assembly. Nucleic Acids Res., 40.","DOI":"10.1093\/nar\/gks754"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Popitsch, N., and von Haeseler, A. (2013). NGC: Lossless and lossy compression of aligned high-throughput sequencing data. Nucleic Acids Res., 7.","DOI":"10.1093\/nar\/gks939"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1109\/TIT.1966.1053907","article-title":"Run-Length encodings","volume":"12","author":"Golomb","year":"1966","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Stitziel, N., Kiezun, A., and Sunyaev, S. (2011). Computational and statistical approaches to analyzing variants identified by exome sequencing. Genome Biol., 12.","DOI":"10.1186\/gb-2011-12-9-227"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"747","DOI":"10.1038\/nature08494","article-title":"Finding the missing heritability of complex diseases","volume":"461","author":"Manolio","year":"2009","journal-title":"Nature"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Castellana, S., and Mazza, T. (2013). Congruency in the prediction of pathogenic missense mutations: State-of-the-art web-based tools. Brief. 
Bioinforma.","DOI":"10.1093\/bib\/bbt013"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/6\/2\/309\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T21:46:50Z","timestamp":1760219210000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/6\/2\/309"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,5,21]]},"references-count":18,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2013,6]]}},"alternative-id":["a6020309"],"URL":"https:\/\/doi.org\/10.3390\/a6020309","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2013,5,21]]}}}