{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T21:17:59Z","timestamp":1773350279576,"version":"3.50.1"},"reference-count":24,"publisher":"Oxford University Press (OUP)","issue":"3","funder":[{"name":"NIH\/NHGRI","award":["R01HG011774"],"award-info":[{"award-number":["R01HG011774"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Genome-wide association studies (GWAS) are widely used to investigate the role of genetics in disease traits, but the resulting file sizes from these studies are large, posing barriers to efficient storage, sharing, and querying. This issue is especially important for biobanks like the UK Biobank that publish GWAS for thousands of traits, increasing the volume of data that must be effectively managed. Current compression and query methods reduce file sizes and allow for quick genomic position-based queries but do not provide utility for quickly finding loci based on their summary statistics. For example, finding all SNVs in a particular p-value range would require decompressing and scanning the whole file. We propose a new tool, STABIX, which introduces summary-statistic-based queries and improves upon the standard bgzip compression and Tabix query tool in both compression ratio and decompression speed.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>When applied to 10 GWAS files from PanUKBB, STABIX created smaller compressed data and indices than Tabix for all files, where bgzip and tbi files were an average of 1.2 times the size of STABIX compressed files and indexes. In the same 10 files, STABIX per gene decompression was, on average 7\u00d7 faster than Tabix per gene decompression, and achieved faster per gene decompression times for over 99% of nearly 20,000 genes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Software freely available for download at GitHub: https:\/\/github.com\/kristen-schneider\/stabix\/.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf264","type":"journal-article","created":{"date-parts":[[2025,4,29]],"date-time":"2025-04-29T07:50:53Z","timestamp":1745913053000},"source":"Crossref","is-referenced-by-count":0,"title":["STABIX: summary-statistic-based GWAS indexing and compression"],"prefix":"10.1093","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-4757-4500","authenticated-orcid":false,"given":"Kristen","family":"Schneider","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Colorado , Boulder, Boulder, 80309 CO,","place":["United States"]},{"name":"BioFrontiers Institute, University of Colorado , Boulder, Boulder, 80303 CO,","place":["United States"]}]},{"given":"Simon","family":"Walker","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Colorado , Boulder, Boulder, 80309 CO,","place":["United States"]}]},{"given":"Chris","family":"Gignoux","sequence":"additional","affiliation":[{"name":"Colorado Center for Personalized Medicine, University of Colorado, Anschutz Medical Campus , Aurora, 80045 CO,","place":["United States"]},{"name":"Department of Biomedical Informatics, University of Colorado, Anschutz Medical Campus , Aurora, 80045 CO,","place":["United States"]}]},{"given":"Ryan","family":"Layer","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Colorado , Boulder, Boulder, 80309 CO,","place":["United States"]},{"name":"BioFrontiers Institute, University of Colorado , Boulder, Boulder, 80303 CO,","place":["United States"]},{"name":"Department of Biomedical Informatics, University of Colorado, Anschutz Medical Campus , Aurora, 80045 CO,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,5,2]]},"reference":[{"key":"2026031206463732600_btaf264-B10","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1016\/j.ajhg.2022.12.011","article-title":"15 Years of GWAS discovery: realizing the promise","volume":"110","author":"Abdellaoui","year":"2023","journal-title":"Am J Hum Genet"},{"key":"2026031206463732600_btaf264-B20","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1001\/jama.2021.20356","article-title":"Phenome-wide association studies","volume":"327","author":"Bastarache","year":"2022","journal-title":"Jama"},{"key":"2026031206463732600_btaf264-B11","doi-asserted-by":"crossref","first-page":"D1005","DOI":"10.1093\/nar\/gky1120","article-title":"The NHGRI-EBI GWAS catalog of published genome-wide association studies, targeted arrays and summary statistics 2019","volume":"47","author":"Buniello","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2026031206463732600_btaf264-B18","author":"Chang","year":"2009"},{"key":"2026031206463732600_btaf264-B19","doi-asserted-by":"crossref","DOI":"10.1093\/gigascience\/giz082","article-title":"PRSice-2: polygenic risk score software for biobank-scale data","volume":"8","author":"Choi","year":"2019","journal-title":"Gigascience"},{"key":"2026031206463732600_btaf264-B13","doi-asserted-by":"crossref","first-page":"880","DOI":"10.1038\/nrg2898","article-title":"Predicting genetic predisposition in humans: the promise of whole-genome markers","volume":"11","author":"de los Campos","year":"2010","journal-title":"Nat Rev Genet"},{"key":"2026031206463732600_btaf264-B24","doi-asserted-by":"crossref","first-page":"e1004383","DOI":"10.1371\/journal.pgen.1004383","article-title":"Bayesian test for colocalisation between pairs of genetic association studies using summary statistics","volume":"10","author":"Giambartolomei","year":"2014","journal-title":"PLoS Genet"},{"key":"2026031206463732600_btaf264-B12","doi-asserted-by":"publisher","DOI":"10.1101\/2024.03.13.24303864","article-title":"Pan-UK biobank GWAS improves discovery, analysis of genetic architecture, and resolution into ancestry-enriched effects","author":"Karczewski","year":"2024","journal-title":"bioRxiv"},{"key":"2026031206463732600_btaf264-B22","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1093\/bioinformatics\/btw615","article-title":"Improved methods for multi-trait fine mapping of pleiotropic risk loci","volume":"33","author":"Kichaev","year":"2017","journal-title":"Bioinformatics"},{"key":"2026031206463732600_btaf264-B14","doi-asserted-by":"crossref","first-page":"524","DOI":"10.1038\/s41576-022-00470-z","article-title":"Polygenic scores in biomedical research","volume":"23","author":"Kullo","year":"2022","journal-title":"Nat Rev Genet"},{"key":"2026031206463732600_btaf264-B16","doi-asserted-by":"crossref","first-page":"718","DOI":"10.1093\/bioinformatics\/btq671","article-title":"Tabix: fast retrieval of sequence features from generic TAB-delimited files","volume":"27","author":"Li","year":"2011","journal-title":"Bioinformatics"},{"key":"2026031206463732600_btaf264-B6","doi-asserted-by":"crossref","first-page":"5900","DOI":"10.1038\/s41467-020-19653-5","article-title":"15 Years of genome-wide association studies and no signs of slowing down","volume":"11","author":"Loos","year":"2020","journal-title":"Nat Commun"},{"key":"2026031206463732600_btaf264-B23","doi-asserted-by":"crossref","first-page":"1388","DOI":"10.1016\/j.ajhg.2022.07.002","article-title":"Multi-ancestry fine-mapping improves precision to identify causal genes in transcriptome-wide association studies","volume":"109","author":"Lu","year":"2022","journal-title":"Am J Hum Genet"},{"key":"2026031206463732600_btaf264-B2","doi-asserted-by":"crossref","first-page":"D896","DOI":"10.1093\/nar\/gkw1133","article-title":"The new NHGRI-EBI catalog of published genome-wide association studies (GWAS catalog)","volume":"45","author":"MacArthur","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2026031206463732600_btaf264-B8","first-page":"696","article-title":"The BioBank Japan project","volume":"5","author":"Nakamura","year":"2007","journal-title":"Clin. Adv Hematol Oncol"},{"key":"2026031206463732600_btaf264-B21","doi-asserted-by":"crossref","first-page":"2336","DOI":"10.1093\/bioinformatics\/btq419","article-title":"LocusZoom: regional visualization of genome-wide association scan results","volume":"26","author":"Pruim","year":"2010","journal-title":"Bioinformatics"},{"key":"2026031206463732600_btaf264-B17","doi-asserted-by":"crossref","first-page":"399","DOI":"10.1375\/136905203770326402","article-title":"Heritability of adult body height: a comparative study of twin cohorts in eight countries","volume":"6","author":"Silventoinen","year":"2003","journal-title":"Twin Res"},{"key":"2026031206463732600_btaf264-B7","doi-asserted-by":"crossref","first-page":"e1001779","DOI":"10.1371\/journal.pmed.1001779","article-title":"UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of Middle and old age","volume":"12","author":"Sudlow","year":"2015","journal-title":"PLoS Med"},{"key":"2026031206463732600_btaf264-B15","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1038\/s41576-019-0127-1","article-title":"Benefits and limitations of genome-wide association studies","volume":"20","author":"Tam","year":"2019","journal-title":"Nat Rev Genet"},{"key":"2026031206463732600_btaf264-B1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s43586-021-00056-9","article-title":"Genome-wide association studies","volume":"1","author":"Uffelmann","year":"2021","journal-title":"Nat Rev Methods Primers"},{"key":"2026031206463732600_btaf264-B4","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1016\/j.ajhg.2011.11.029","article-title":"Five years of GWAS discovery","volume":"90","author":"Visscher","year":"2012","journal-title":"Am J Hum Genet"},{"key":"2026031206463732600_btaf264-B5","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/j.ajhg.2017.06.005","article-title":"10 Years of GWAS discovery: biology, function, and translation","volume":"101","author":"Visscher","year":"2017","journal-title":"Am J Hum Genet"},{"key":"2026031206463732600_btaf264-B3","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1038\/nature05911","article-title":"Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls","volume":"447","author":"Wellcome Trust Case Control Consortium","year":"2007","journal-title":"Nature"},{"key":"2026031206463732600_btaf264-B9","doi-asserted-by":"crossref","first-page":"100192","DOI":"10.1016\/j.xgen.2022.100192","article-title":"Global biobank meta-analysis initiative: powering genetic discovery across human disease","volume":"2","author":"Zhou","year":"2022","journal-title":"Cell Genom"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf264\/63045443\/btaf264.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/3\/btaf264\/63045443\/btaf264.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/42\/3\/btaf264\/63045443\/btaf264.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T10:46:54Z","timestamp":1773312414000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf264\/8124076"}},"subtitle":[],"editor":[{"given":"Macha","family":"Nikolski","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,5,2]]},"references-count":24,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,2,28]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf264","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.11.15.623812","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2026,3]]},"published":{"date-parts":[[2025,5,2]]},"article-number":"btaf264"}}