{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T03:53:42Z","timestamp":1772078022284,"version":"3.50.1"},"reference-count":12,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2025,3,4]],"date-time":"2025-03-04T00:00:00Z","timestamp":1741046400000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100009894","name":"Lodz University of Technology","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100009894","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Faculty of Electrical, Electronic, Computer and Control Engineering","award":["501\/12-24-1-5418"],"award-info":[{"award-number":["501\/12-24-1-5418"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,3,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Summary<\/jats:title>\n                  <jats:p>The FASTQ format remains at the heart of high-throughput sequencing. Despite advances in specialized FASTQ compressors, they are still imperfect in terms of practical performance tradeoffs. We present a multi-threaded version of Pseudogenome-based Read Compressor (PgRC), an in-memory algorithm for compressing the DNA stream, based on the idea of approximating the shortest common superstring over high-quality reads. Redundancy in the obtained string is efficiently removed by using a compact temporary representation. The current version, v2.0, preserves the compression ratio of the previous one, reducing the compression (resp. decompression) time by a factor of 8\u20139 (resp. 2\u20132.5) on a 14-core\/28-thread machine.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>PgRC\u20092.0 can be downloaded from https:\/\/github.com\/kowallus\/PgRC and https:\/\/zenodo.org\/records\/14882486 (10.5281\/zenodo.14882486).<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf101","type":"journal-article","created":{"date-parts":[[2025,3,5]],"date-time":"2025-03-05T02:08:31Z","timestamp":1741140511000},"source":"Crossref","is-referenced-by-count":2,"title":["PgRC2: engineering the compression of sequencing reads"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0953-3762","authenticated-orcid":false,"given":"Tomasz M","family":"Kowalski","sequence":"first","affiliation":[{"name":"Institute of Applied Computer Science, Lodz University of Technology , Lodz 90-924,","place":["Poland"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1714-1224","authenticated-orcid":false,"given":"Szymon","family":"Grabowski","sequence":"additional","affiliation":[{"name":"Institute of Applied Computer Science, Lodz University of Technology , Lodz 90-924,","place":["Poland"]}]}],"member":"286","published-online":{"date-parts":[[2025,3,4]]},"reference":[{"key":"2025031417311442900_btaf101-B1","doi-asserted-by":"crossref","first-page":"e59190","DOI":"10.1371\/journal.pone.0059190","article-title":"Compression of FASTQ and SAM format sequencing data","volume":"8","author":"Bonfield","year":"2013","journal-title":"PLoS One"},{"key":"2025031417311442900_btaf101-B2","doi-asserted-by":"crossref","first-page":"2674","DOI":"10.1093\/bioinformatics\/bty1015","article-title":"SPRING: a next-generation compressor for FASTQ data","volume":"35","author":"Chandak","year":"2019","journal-title":"Bioinformatics"},{"key":"2025031417311442900_btaf101-B3","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1038\/s41598-020-57452-6","article-title":"FQSqueezer: k-mer-based compression of sequencing data","volume":"10","author":"Deorowicz","year":"2020","journal-title":"Sci Rep"},{"key":"2025031417311442900_btaf101-B4","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1093\/bioinformatics\/bty670","article-title":"copMEM: finding maximal exact matches via sampling both genomes","volume":"35","author":"Grabowski","year":"2019","journal-title":"Bioinformatics"},{"key":"2025031417311442900_btaf101-B5","doi-asserted-by":"crossref","first-page":"1389","DOI":"10.1093\/bioinformatics\/btu844","article-title":"Disk-based compression of data from genome sequencing","volume":"31","author":"Grabowski","year":"2015","journal-title":"Bioinformatics"},{"key":"2025031417311442900_btaf101-B6","doi-asserted-by":"crossref","first-page":"3051","DOI":"10.1093\/bioinformatics\/bts593","article-title":"SCALCE: boosting sequence compression algorithms using locally consistent encoding","volume":"28","author":"Hach","year":"2012","journal-title":"Bioinformatics"},{"key":"2025031417311442900_btaf101-B7","doi-asserted-by":"crossref","first-page":"2082","DOI":"10.1093\/bioinformatics\/btz919","article-title":"PgRC: pseudogenome-based read compressor","volume":"36","author":"Kowalski","year":"2020","journal-title":"Bioinformatics"},{"key":"2025031417311442900_btaf101-B8","doi-asserted-by":"crossref","first-page":"e0133198","DOI":"10.1371\/journal.pone.0133198","article-title":"Indexing arbitrary-length k-mers in sequencing reads","volume":"10","author":"Kowalski","year":"2015","journal-title":"PLoS One"},{"key":"2025031417311442900_btaf101-B9","doi-asserted-by":"crossref","first-page":"2225","DOI":"10.1093\/bioinformatics\/btab102","article-title":"Genozip: a universal extensible genomic data compressor","volume":"37","author":"Lan","year":"2021","journal-title":"Bioinformatics"},{"key":"2025031417311442900_btaf101-B10","doi-asserted-by":"crossref","first-page":"2066","DOI":"10.1093\/bioinformatics\/bty936","article-title":"Index suffix-prefix overlaps by (w,k)-minimizer to generate long contigs for reads compression","volume":"35","author":"Liu","year":"2019","journal-title":"Bioinformatics"},{"key":"2025031417311442900_btaf101-B11","doi-asserted-by":"crossref","first-page":"e1009229","DOI":"10.1371\/journal.pcbi.1009229","article-title":"Hamming-shifting graph of genomic short reads: efficient construction and its application for compression","volume":"17","author":"Liu","year":"2021","journal-title":"PLoS Comput Biol"},{"key":"2025031417311442900_btaf101-B12","doi-asserted-by":"crossref","first-page":"3294","DOI":"10.1093\/bioinformatics\/btac333","article-title":"CURC: a CUDA-based reference-free read compressor","volume":"38","author":"Xie","year":"2022","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf101\/62266545\/btaf101.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/3\/btaf101\/62266545\/btaf101.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/3\/btaf101\/62266545\/btaf101.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,14]],"date-time":"2025-03-14T17:31:28Z","timestamp":1741973488000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf101\/8051895"}},"subtitle":[],"editor":[{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,3]]},"references-count":12,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,3,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf101","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,3]]},"published":{"date-parts":[[2025,3]]},"article-number":"btaf101"}}