{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,9]],"date-time":"2025-11-09T11:06:36Z","timestamp":1762686396710,"version":"build-2065373602"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"11","license":[{"start":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T00:00:00Z","timestamp":1760659200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001804","name":"Canada Research Chair","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001804","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Machine Learning for Genomics and Healthcare","award":["CRC-2021-00547"],"award-info":[{"award-number":["CRC-2021-00547"]}]},{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Large-scale biobanks, with rich phenotypic and genomic data across hundreds of thousands of samples, provide ample opportunities to elucidate the genetics of complex traits and diseases. Consequently, there is growing demand for robust and scalable methods for disease risk prediction from genotype data. Inference in this setting is challenging due to the high-dimensionality of genomic data, especially when coupled with smaller sample sizes. Popular Polygenic Risk Score (PRS) inference methods address this challenge by adopting sparse Bayesian priors or penalized regression techniques, such as the Least Absolute Shrinkage and Selection Operator (LASSO). However, the former class of methods are not as scalable and do not produce exact sparsity, while the latter tends to over-shrink large coefficients.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>In this study, we present SSLPRS, a novel PRS method based on the Spike-and-Slab LASSO (SSL) prior, which offers a theoretical bridge between the two frameworks. We extend previous work to derive a coordinate-ascent inference algorithm that operates on GWAS summary statistics, which is orders-of-magnitude more efficient than corresponding individual-level-based implementations. To illustrate the statistical properties of the proposed model, we conducted experiments involving nine simulation configurations and nine quantitative phenotypes from the UK Biobank. Our results demonstrate that SSLPRS is competitive with state-of-the-art methods in terms of prediction accuracy and exhibits superior variable selection performance, especially in sparse genetic architectures. In simulations, this translates to upwards of 50% improvement in positive predictive value. In analysis of real phenotypes, we show that selected variants are highly enriched for meaningful genomic annotations and have better replication rates in larger meta-analyses.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>SSLPRS is available in the open-source package https:\/\/github.com\/li-lab-mcgill\/penprs.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf578","type":"journal-article","created":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T12:28:34Z","timestamp":1760531314000},"source":"Crossref","is-referenced-by-count":0,"title":["Sparse polygenic risk score inference with the spike-and-slab LASSO"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-6832-2452","authenticated-orcid":false,"given":"Junyi","family":"Song","sequence":"first","affiliation":[{"name":"School of Computer Science, McGill University, Montr\u00e9al, QC H3A 0G4,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8003-9284","authenticated-orcid":false,"given":"Shadi","family":"Zabad","sequence":"additional","affiliation":[{"name":"School of Computer Science, McGill University, Montr\u00e9al, QC H3A 0G4,","place":["Canada"]}]},{"given":"Archer","family":"Yang","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Statistics, McGill University, Montr\u00e9al, QC H3A 0G4,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9183-964X","authenticated-orcid":false,"given":"Simon","family":"Gravel","sequence":"additional","affiliation":[{"name":"Department of Human Genetics, McGill University , Montr\u00e9al, QC H3A 0G4,","place":["Canada"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3844-4865","authenticated-orcid":false,"given":"Yue","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer Science, McGill University, Montr\u00e9al, QC H3A 0G4,","place":["Canada"]}]}],"member":"286","published-online":{"date-parts":[[2025,10,17]]},"reference":[{"key":"2025110906040784700_btaf578-B1","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1201\/9781003089018-4","volume-title":"Handbook of Bayesian Variable Selection","author":"Bai","year":"2021"},{"key":"2025110906040784700_btaf578-B2","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1038\/s41586-018-0579-z","article-title":"The UK biobank resource with deep phenotyping and genomic data","volume":"562","author":"Bycroft","year":"2018","journal-title":"Nature"},{"key":"2025110906040784700_btaf578-B3","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1186\/s13742-015-0047-8","article-title":"Second-generation plink: rising to the challenge of larger and richer datasets","volume":"4","author":"Chang","year":"2015","journal-title":"Gigascience"},{"key":"2025110906040784700_btaf578-B4","doi-asserted-by":"publisher","first-page":"giz082","DOI":"10.1093\/gigascience\/giz082","article-title":"Prsice-2: polygenic risk score software for biobank-scale data","volume":"8","author":"Choi","year":"2019","journal-title":"Gigascience"},{"key":"2025110906040784700_btaf578-B5","doi-asserted-by":"publisher","first-page":"1228","DOI":"10.1038\/ng.3404","article-title":"Partitioning heritability by functional annotation using genome-wide association summary statistics","volume":"47","author":"Finucane","year":"2015","journal-title":"Nat Genet"},{"key":"2025110906040784700_btaf578-B6","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J Stat Softw"},{"key":"2025110906040784700_btaf578-B7","doi-asserted-by":"publisher","first-page":"1776","DOI":"10.1038\/s41467-019-09718-5","article-title":"Polygenic prediction via Bayesian regression and continuous shrinkage priors","volume":"10","author":"Ge","year":"2019","journal-title":"Nat Commun"},{"key":"2025110906040784700_btaf578-B8","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1038\/s41586-021-04064-3","article-title":"The power of genetic diversity in genome-wide association studies of lipids","volume":"600","author":"Graham","year":"2021","journal-title":"Nature"},{"key":"2025110906040784700_btaf578-B9","doi-asserted-by":"publisher","first-page":"2116","DOI":"10.1038\/s41588-025-02286-z","article-title":"LDAK-KVIK performs fast and powerful mixed-model association analysis of quantitative and binary phenotypes","volume":"57","author":"Hof","year":"2025","journal-title":"Nat Genet"},{"key":"2025110906040784700_btaf578-B10","doi-asserted-by":"crossref","first-page":"730","DOI":"10.1214\/009053604000001147","article-title":"Spike and slab variable selection: frequentist and Bayesian strategies","volume":"33","author":"Ishwaran","year":"2005","journal-title":"Ann Stat"},{"volume-title":"Introduction to Statistical Learning with Applications in R","year":"2019","author":"James","key":"2025110906040784700_btaf578-B11"},{"key":"2025110906040784700_btaf578-B12","doi-asserted-by":"publisher","first-page":"1401","DOI":"10.1007\/s00439-024-02716-8","article-title":"Advancements and limitations in polygenic risk score methods for genomic prediction: a scoping review","volume":"143","author":"Jayasinghe","year":"2024","journal-title":"Hum Genet"},{"author":"Karczewski","key":"2025110906040784700_btaf578-B13","doi-asserted-by":"publisher","DOI":"10.1101\/2024.03.13.24303864"},{"key":"2025110906040784700_btaf578-B14","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1186\/s13073-020-00742-5","article-title":"Polygenic risk scores: from research tools to clinical instruments","volume":"12","author":"Lewis","year":"2020","journal-title":"Genome Med"},{"key":"2025110906040784700_btaf578-B15","doi-asserted-by":"publisher","first-page":"5086","DOI":"10.1038\/s41467-019-12653-0","article-title":"Improved polygenic prediction by Bayesian multiple regression on summary statistics","volume":"10","author":"Lloyd-Jones","year":"2019","journal-title":"Nat Commun"},{"key":"2025110906040784700_btaf578-B16","doi-asserted-by":"publisher","first-page":"469","DOI":"10.1002\/gepi.22050","article-title":"Polygenic scores via penalized regression on summary statistics","volume":"41","author":"Mak","year":"2017","journal-title":"Genet Epidemiol"},{"key":"2025110906040784700_btaf578-B17","doi-asserted-by":"publisher","first-page":"1091","DOI":"10.1214\/19-BA1149","article-title":"Variance prior forms for high-dimensional Bayesian variable selection","volume":"14","author":"Moran","year":"2019","journal-title":"Bayesian Anal"},{"key":"2025110906040784700_btaf578-B18","doi-asserted-by":"publisher","first-page":"e1009021","DOI":"10.1371\/journal.pgen.1009021","article-title":"Evaluation of polygenic prediction methodology within a reference-standardized framework","volume":"17","author":"Pain","year":"2021","journal-title":"PLoS Genet"},{"key":"2025110906040784700_btaf578-B19","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1038\/nrg.2016.142","article-title":"Dissecting the genetics of complex traits using summary association statistics","volume":"18","author":"Pasaniuc","year":"2017","journal-title":"Nat Rev Genet"},{"key":"2025110906040784700_btaf578-B20","doi-asserted-by":"publisher","first-page":"5424","DOI":"10.1093\/bioinformatics\/btaa1029","article-title":"Ldpred2: better, faster, stronger","volume":"36","author":"Priv\u00e9","year":"2020","journal-title":"Bioinformatics"},{"key":"2025110906040784700_btaf578-B21","doi-asserted-by":"publisher","first-page":"431","DOI":"10.1080\/01621459.2016.1260469","article-title":"The spike-and-slab lasso","volume":"113","author":"Ro\u010dkov\u00e1","year":"2018","journal-title":"J Am Stat Assoc"},{"key":"2025110906040784700_btaf578-B22","doi-asserted-by":"publisher","first-page":"581","DOI":"10.1038\/s41576-018-0018-x","article-title":"The personal and clinical utility of polygenic risk scores","volume":"19","author":"Torkamani","year":"2018","journal-title":"Nat Rev Genet"},{"key":"2025110906040784700_btaf578-B23","doi-asserted-by":"publisher","first-page":"741","DOI":"10.1016\/j.ajhg.2023.03.009","article-title":"Fast and accurate Bayesian polygenic risk modeling with variational inference","volume":"110","author":"Zabad","year":"2023","journal-title":"Am J Hum Genet"},{"key":"2025110906040784700_btaf578-B24","doi-asserted-by":"publisher","first-page":"1528","DOI":"10.1016\/j.ajhg.2025.05.002","article-title":"Toward whole-genome inference of polygenic scores with fast and memory-efficient algorithms","volume":"112","author":"Zabad","year":"2025","journal-title":"Am J Hum Genet"},{"key":"2025110906040784700_btaf578-B25","doi-asserted-by":"publisher","first-page":"576","DOI":"10.1214\/12-STS399","article-title":"A general theory of concave regularization for high-dimensional sparse estimation problems","volume":"27","author":"Zhang","year":"2012","journal-title":"Stat Sci"},{"key":"2025110906040784700_btaf578-B26","doi-asserted-by":"publisher","first-page":"4192","DOI":"10.1038\/s41467-021-24485-y","article-title":"Improved genetic prediction of complex traits from individual-level data or summary statistics","volume":"12","author":"Zhang","year":"2021","journal-title":"Nat Commun"},{"key":"2025110906040784700_btaf578-B27","first-page":"2541","article-title":"On model selection consistency of lasso","volume":"7","author":"Zhao","year":"2006","journal-title":"J Mach Learn Res"},{"key":"2025110906040784700_btaf578-B28","doi-asserted-by":"publisher","first-page":"100192","DOI":"10.1016\/j.xgen.2022.100192","article-title":"Global biobank meta-analysis initiative: powering genetic discovery across human disease","volume":"2","author":"Zhou","year":"2022","journal-title":"Cell Genom"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf578\/64738987\/btaf578.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/11\/btaf578\/64738987\/btaf578.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/11\/btaf578\/64738987\/btaf578.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,9]],"date-time":"2025-11-09T11:04:13Z","timestamp":1762686253000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf578\/8292660"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,10,17]]},"references-count":28,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2025,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf578","relation":{},"ISSN":["1367-4811"],"issn-type":[{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2025,11]]},"published":{"date-parts":[[2025,10,17]]},"article-number":"btaf578"}}