{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:41Z","timestamp":1772138081073,"version":"3.50.1"},"reference-count":5,"publisher":"Oxford University Press (OUP)","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Summary: Large resequencing projects require a significant amount of storage for raw sequences, as well as alignment files. Because the raw sequences are redundant once the alignment has been generated, it is possible to keep only the alignment files. We present BamHash, a checksum based method to ensure that the read pairs in FASTQ files match exactly the read pairs stored in BAM files, regardless of the ordering of reads. BamHash can be used to verify the integrity of the files stored and discover any discrepancies. Thus, BamHash can be used to determine if it is safe to delete the FASTQ files storing raw sequencing read after alignment, without the loss of data.<\/jats:p>\n                  <jats:p>Availability and implementation: The software is implemented in C++, GPL licensed and available at https:\/\/github.com\/DecodeGenetics\/BamHash<\/jats:p>\n                  <jats:p>Contact: \u00a0pmelsted@hi.is<\/jats:p>","DOI":"10.1093\/bioinformatics\/btv539","type":"journal-article","created":{"date-parts":[[2015,9,11]],"date-time":"2015-09-11T20:23:51Z","timestamp":1442003031000},"page":"140-141","source":"Crossref","is-referenced-by-count":2,"title":["BamHash: a checksum program for verifying the integrity of sequence data"],"prefix":"10.1093","volume":"32","author":[{"given":"Arna","family":"\u00d3skarsd\u00f3ttir","sequence":"first","affiliation":[{"name":"1 deCODE Genetics\/Amgen, Reykjav\u00edk, Iceland and"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"G\u00edsli","family":"M\u00e1sson","sequence":"additional","affiliation":[{"name":"1 deCODE Genetics\/Amgen, Reykjav\u00edk, Iceland and"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"P\u00e1ll","family":"Melsted","sequence":"additional","affiliation":[{"name":"1 deCODE Genetics\/Amgen, Reykjav\u00edk, Iceland and"},{"name":"2 Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjav\u00edk, Iceland"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2015,9,10]]},"reference":[{"key":"2023020110220473200_btv539-B1","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1186\/1471-2105-9-11","article-title":"SeqAn an efficient, generic C++ library for sequence analysis","volume":"9.1","author":"D\u00f6ring","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023020110220473200_btv539-B2","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1038\/ng.3247","article-title":"Large-scale whole-genome sequencing of the Icelandic population","volume":"47","author":"Gudbjartsson","year":"2015","journal-title":"Nat. Genet."},{"key":"2023020110220473200_btv539-B3","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The sequence alignment\/map format and SAMtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020110220473200_btv539-B4","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1016\/0304-3975(92)90260-M","article-title":"On the distributional complexity of disjointness","volume":"106","author":"Razborov","year":"1992","journal-title":"Theor. Comput. Sci."},{"key":"2023020110220473200_btv539-B5","article-title":"RFC 1321: the MD5 message-digest algorithm. Internet Engineering Task Force","author":"Rivest","year":"1992"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/1\/140\/49016397\/bioinformatics_32_1_140.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/1\/140\/49016397\/bioinformatics_32_1_140.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T13:52:04Z","timestamp":1675259524000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/1\/140\/1743564"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,9,10]]},"references-count":5,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2016,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btv539","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/015867","asserted-by":"object"}]},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,1,1]]},"published":{"date-parts":[[2015,9,10]]}}}