{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,3]],"date-time":"2026-03-03T15:45:59Z","timestamp":1772552759962,"version":"3.50.1"},"reference-count":61,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2017,10,25]],"date-time":"2017-10-25T00:00:00Z","timestamp":1508889600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R03-DE024198, R03-DE025646"],"award-info":[{"award-number":["R03-DE024198, R03-DE025646"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["IIS-1502172"],"award-info":[{"award-number":["IIS-1502172"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100004543","name":"China Scholarship Council","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004543","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["81573253, 81773541, and 81673448"],"award-info":[{"award-number":["81573253, 81773541, and 81673448"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Large-scale molecular data have been increasingly used as an important resource for prognostic prediction of diseases and detection of associated genes. 
However, standard approaches for omics data analysis ignore the group structure among genes encoded in functional relationships or pathway information.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We propose new Bayesian hierarchical generalized linear models, called group spike-and-slab lasso GLMs, for predicting disease outcomes and detecting associated genes by incorporating large-scale molecular data and group structures. The proposed model employs a mixture double-exponential prior for coefficients that induces a self-adaptive amount of shrinkage on different coefficients. The group information is incorporated into the model by setting group-specific parameters. We have developed a fast and stable deterministic algorithm to fit the proposed hierarchical GLMs, which can perform variable selection within groups. We assess the performance of the proposed method on several simulated scenarios, by varying the overlap among groups, group size, number of non-null groups, and the correlation within groups. Compared with existing methods, the proposed method provides not only more accurate estimates of the parameters but also better prediction. We further demonstrate the application of the proposed procedure on three cancer datasets by utilizing pathway structures of genes. 
Our results show that the proposed method generates powerful models for predicting disease outcomes and detecting associated genes.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The methods have been implemented in a freely available R package BhGLM (http:\/\/www.ssg.uab.edu\/bhglm\/).<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx684","type":"journal-article","created":{"date-parts":[[2017,10,24]],"date-time":"2017-10-24T19:28:27Z","timestamp":1508873307000},"page":"901-910","source":"Crossref","is-referenced-by-count":25,"title":["Group spike-and-slab lasso generalized linear models for disease prediction and associated genes detection by incorporating pathway information"],"prefix":"10.1093","volume":"34","author":[{"given":"Zaixiang","family":"Tang","sequence":"first","affiliation":[{"name":"Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, China"},{"name":"Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou, China"},{"name":"Center for Genetic Epidemiology and Genomics, Medical College of Soochow University, Suzhou, China"},{"name":"Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA"}]},{"given":"Yueping","family":"Shen","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, School of Public Health, Medical College of Soochow University, Suzhou, China"},{"name":"Jiangsu Key Laboratory of Preventive and Translational Medicine for Geriatric Diseases, Medical College of Soochow University, Suzhou, China"}]},{"given":"Yan","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, 
USA"}]},{"given":"Xinyan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA"}]},{"given":"Jia","family":"Wen","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA"}]},{"given":"Chen\u2019ao","family":"Qian","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics, School of Biology & Basic Medical Science, Soochow University, Suzhou, China"}]},{"given":"Wenzhuo","family":"Zhuang","sequence":"additional","affiliation":[{"name":"Department of Cell Biology, School of Biology & Basic Medical Science, Soochow University, Suzhou, China"}]},{"given":"Xinghua","family":"Shi","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA"}]},{"given":"Nengjun","family":"Yi","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,10,25]]},"reference":[{"key":"2023012712474074700_btx684-B1","author":"Barillot","year":"2012"},{"key":"2023012712474074700_btx684-B2","doi-asserted-by":"crossref","first-page":"731","DOI":"10.1111\/biom.12300","article-title":"The group exponential lasso for bi-level variable selection","volume":"71","author":"Breheny","year":"2015","journal-title":"Biometrics"},{"key":"2023012712474074700_btx684-B3","doi-asserted-by":"crossref","first-page":"369","DOI":"10.4310\/SII.2009.v2.n3.a10","article-title":"Penalized methods for bi-level variable selection","volume":"2","author":"Breheny","year":"2009","journal-title":"Stat. 
Interf"},{"key":"2023012712474074700_btx684-B4","doi-asserted-by":"crossref","first-page":"2640","DOI":"10.1158\/1535-7163.MCT-16-0048","article-title":"Mitochondria-targeted doxorubicin: a new therapeutic strategy against doxorubicin-resistant osteosarcoma","volume":"15","author":"Buondonno","year":"2016","journal-title":"Mol. Cancer Ther"},{"key":"2023012712474074700_btx684-B5","author":"Chen","year":"2014"},{"key":"2023012712474074700_btx684-B6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/wics.1284","article-title":"Variable selection in linear models","volume":"6","author":"Chen","year":"2014","journal-title":"Wiley Interdiscip. Rev. Comput. Stat"},{"key":"2023012712474074700_btx684-B7","doi-asserted-by":"crossref","first-page":"17","DOI":"10.2307\/3315687","article-title":"Bayesian variable selection with related predictions","volume":"24","author":"Chipman","year":"1996","journal-title":"Can. J. Stat"},{"key":"2023012712474074700_btx684-B8","first-page":"65","article-title":"The Practical Implementation of Bayesian Model Selection","volume-title":"Lecture Notes-Monograph Series","author":"Chipman","year":"2001"},{"key":"2023012712474074700_btx684-B9","doi-asserted-by":"crossref","first-page":"880","DOI":"10.1038\/nrg2898","article-title":"Predicting genetic predisposition in humans: the promise of whole-genome markers","volume":"11","author":"de los Campos","year":"2010","journal-title":"Nat. Rev. Genet"},{"key":"2023012712474074700_btx684-B10","doi-asserted-by":"crossref","first-page":"1348","DOI":"10.1198\/016214501753382273","article-title":"Variable selection via nonconcave penalized likelihood and its oracle properties","volume":"96","author":"Fan","year":"2001","journal-title":"J. Am. Stat. 
Assoc"},{"key":"2023012712474074700_btx684-B11","author":"Friedman","year":"2010"},{"key":"2023012712474074700_btx684-B12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. Softw"},{"key":"2023012712474074700_btx684-B13","volume-title":"Bayesian Data Analysis","author":"Gelman","year":"2014"},{"key":"2023012712474074700_btx684-B14","volume-title":"Data Analysis Using Regression and Multilevel\/Hierarchical Models","author":"Gelman","year":"2007"},{"key":"2023012712474074700_btx684-B15","doi-asserted-by":"crossref","first-page":"D1049","DOI":"10.1093\/nar\/gku1179","article-title":"Gene Ontology Consortium: going forward","volume":"43","author":"Gene Ontology","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023012712474074700_btx684-B16","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1080\/01621459.1993.10476353","article-title":"Variable selection via Gibbs sampling","volume":"88","author":"George","year":"1993","journal-title":"J. Am. Stat. Assoc"},{"key":"2023012712474074700_btx684-B17","first-page":"339","article-title":"Approaches for Bayesian variable selection","volume":"7","author":"George","year":"1997","journal-title":"Stat. 
Sin"},{"key":"2023012712474074700_btx684-B18","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The Elements of Statistical Learning","author":"Hastie","year":"2009"},{"key":"2023012712474074700_btx684-B19","doi-asserted-by":"crossref","DOI":"10.1201\/b18401","volume-title":"Statistical Learning with Sparsity - the Lasso and Generalization","author":"Hastie","year":"2015"},{"key":"2023012712474074700_btx684-B20","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1214\/12-STS392","article-title":"A Selective review of group selection in high-dimensional models","volume":"27","author":"Huang","year":"2012","journal-title":"Stat. Sci"},{"key":"2023012712474074700_btx684-B21","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1093\/biomet\/asp020","article-title":"A group bridge approach for variable selection","volume":"96","author":"Huang","year":"2009","journal-title":"Biometrika"},{"key":"2023012712474074700_btx684-B22","doi-asserted-by":"crossref","first-page":"764","DOI":"10.1198\/016214505000000051","article-title":"Spike and slab gene selection for multigroup microarray data","volume":"100","author":"Ishwaran","year":"2005","journal-title":"J. Am. Stat. Assoc"},{"key":"2023012712474074700_btx684-B23","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1145\/1553374.1553431","volume-title":"Proceedings of the 26th Annual International Conference on Machine Learning","author":"Jacob","year":"2009"},{"key":"2023012712474074700_btx684-B24","doi-asserted-by":"crossref","first-page":"D457","DOI":"10.1093\/nar\/gkv1070","article-title":"KEGG as a reference resource for gene and protein annotation","volume":"44","author":"Kanehisa","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023012712474074700_btx684-B25","first-page":"1","article-title":"A doubly sparse approach for group variable selection","volume":"69","author":"Kwon","year":"2016","journal-title":"Ann. Inst. Stat. 
Math"},{"key":"2023012712474074700_btx684-B26","doi-asserted-by":"crossref","first-page":"664","DOI":"10.1002\/gepi.21932","article-title":"Multiple SNP set analysis for genome-wide association studies through Bayesian latent variable selection","volume":"39","author":"Lu","year":"2015","journal-title":"Genet. Epidemiol"},{"key":"2023012712474074700_btx684-B27","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4899-3242-6","volume-title":"Generalized Linear Models","author":"McCullagh","year":"1989"},{"key":"2023012712474074700_btx684-B28","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1111\/j.1467-9868.2007.00627.x","article-title":"The group lasso for logistic regression","volume":"70","author":"Meier","year":"2008","journal-title":"J. Royal Stat. Soc. Ser. B"},{"key":"2023012712474074700_btx684-B29","author":"Obozinski","year":"2011"},{"key":"2023012712474074700_btx684-B30","doi-asserted-by":"crossref","first-page":"S7.","DOI":"10.1186\/1753-6561-8-S5-S7","article-title":"Regularized group regression methods for genomic prediction: Bridge, MCP, SCAD, group bridge, group lasso, sparse group lasso, group MCP and group SCAD","volume":"8","author":"Ogutu","year":"2014","journal-title":"BMC Proc"},{"key":"2023012712474074700_btx684-B31","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1002\/bimj.201400110","article-title":"Agglomerative joint clustering of metabolic data with spike at zero: A Bayesian perspective","volume":"58","author":"Partovi Nia","year":"2016","journal-title":"Biom. 
J"},{"key":"2023012712474074700_btx684-B32","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-8-35","article-title":"Classification of microarray data using gene networks","volume":"8","author":"Rapaport","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023012712474074700_btx684-B33","doi-asserted-by":"crossref","first-page":"828","DOI":"10.1080\/01621459.2013.869223","article-title":"EMVS: the EM approach to Bayesian variable selection","volume":"109","author":"Ro\u010dkov\u00e1","year":"2014","journal-title":"J. Am. Stat. Assoc"},{"key":"2023012712474074700_btx684-B34","author":"Ro\u010dkov\u00e1","year":"2016"},{"key":"2023012712474074700_btx684-B35","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1007\/978-3-319-27099-9_11","volume-title":"Statistical Analysis for High-Dimensional Data: The Abel Symposium 2014","author":"Ro\u010dkov\u00e1","year":"2016"},{"key":"2023012712474074700_btx684-B36","doi-asserted-by":"crossref","first-page":"31.","DOI":"10.1186\/s12859-015-0467-6","article-title":"A systematic evaluation of high-dimensional, ensemble-based regression for exploring large model spaces in microbiome analyses","volume":"16","author":"Shankar","year":"2015","journal-title":"BMC Bioinformatics"},{"key":"2023012712474074700_btx684-B37","doi-asserted-by":"crossref","first-page":"e0124088","DOI":"10.1371\/journal.pone.0124088","article-title":"Nonlinear spike-and-slab sparse coding for interpretable image encoding","volume":"10","author":"Shelton","year":"2015","journal-title":"PLoS One"},{"key":"2023012712474074700_btx684-B38","doi-asserted-by":"crossref","first-page":"e1003939","DOI":"10.1371\/journal.pgen.1003939","article-title":"Pathways-driven sparse regression identifies pathways and genes associated with high-density lipoprotein cholesterol in two Asian cohorts","volume":"9","author":"Silver","year":"2013","journal-title":"PLoS 
Genet"},{"key":"2023012712474074700_btx684-B39","doi-asserted-by":"crossref","DOI":"10.2202\/1544-6115.1755","article-title":"Fast identification of biological pathways associated with a quantitative trait using group lasso with overlaps","volume":"11","author":"Silver","year":"2012","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023012712474074700_btx684-B40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v039.i05","article-title":"Regularization paths for cox\u2019s proportional hazards model via coordinate descent","volume":"39","author":"Simon","year":"2011","journal-title":"J. Stat. Softw"},{"key":"2023012712474074700_btx684-B41","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1080\/10618600.2012.681250","article-title":"A sparse-group Lasso","volume":"22","author":"Simon","year":"2013","journal-title":"J. Comput. Graph. Stat"},{"key":"2023012712474074700_btx684-B42","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1158\/1535-7163.MCT-14-0711","article-title":"MLN0128, an ATP-competitive mTOR kinase inhibitor with potent in vitro and in vivo antitumor activity, as potential therapy for bone and soft-tissue sarcoma","volume":"14","author":"Slotkin","year":"2015","journal-title":"Mol. 
Cancer Ther"},{"key":"2023012712474074700_btx684-B43","doi-asserted-by":"crossref","first-page":"e54089.","DOI":"10.1371\/journal.pone.0054089","article-title":"Predictive modeling using a somatic mutational profile in ovarian high grade serous carcinoma","volume":"8","author":"Sohn","year":"2013","journal-title":"PLoS One"},{"key":"2023012712474074700_btx684-B44","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-77244-8","volume-title":"Clinical Prediction Models: A Practical Approach to Development, Validation, and Updates","author":"Steyerberg","year":"2009"},{"key":"2023012712474074700_btx684-B45","doi-asserted-by":"crossref","first-page":"2799","DOI":"10.1093\/bioinformatics\/btx300","article-title":"The spike-and-slab lasso cox model for survival prediction and associated genes detection","volume":"33","author":"Tang","year":"2017","journal-title":"Bioinformatics"},{"key":"2023012712474074700_btx684-B46","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1534\/genetics.116.192195","article-title":"The spike-and-slab lasso generalized linear models for prediction and associated genes detection","volume":"205","author":"Tang","year":"2017","journal-title":"Genetics"},{"key":"2023012712474074700_btx684-B47","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. Royal Stat. Soc. Ser. 
B"},{"key":"2023012712474074700_btx684-B48","doi-asserted-by":"crossref","first-page":"385","DOI":"10.1002\/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3","article-title":"The lasso method for variable selection in the Cox model","volume":"16","author":"Tibshirani","year":"1997","journal-title":"Stat Med"},{"key":"2023012712474074700_btx684-B49","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2202\/1544-6115.1000","article-title":"Pre-validation and inference in microarrays","volume":"1","author":"Tibshirani","year":"2002","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023012712474074700_btx684-B50","doi-asserted-by":"crossref","first-page":"867","DOI":"10.1093\/genetics\/165.2.867","article-title":"Stochastic search variable selection for mapping multiple quantitative trait loci","volume":"165","author":"Yi","year":"2003","journal-title":"Genetics"},{"key":"2023012712474074700_btx684-B51","doi-asserted-by":"crossref","first-page":"284","DOI":"10.1089\/omi.2011.0118","article-title":"clusterProfiler: an R package for comparing biological themes among gene clusters","volume":"16","author":"Yu","year":"2012","journal-title":"Omics"},{"key":"2023012712474074700_btx684-B52","doi-asserted-by":"crossref","first-page":"2104","DOI":"10.1109\/TPAMI.2013.17","article-title":"Efficient methods for overlapping group lasso","volume":"35","author":"Yuan","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell"},{"key":"2023012712474074700_btx684-B53","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1111\/j.1467-9868.2005.00532.x","article-title":"Model selection and estimation in regression with grouped variables","volume":"68","author":"Yuan","year":"2006","journal-title":"J. Royal Stat. Soc. Ser. 
B"},{"key":"2023012712474074700_btx684-B54","doi-asserted-by":"crossref","first-page":"644","DOI":"10.1038\/nbt.2940","article-title":"Assessing the clinical utility of cancer genomic and proteomic data across tumor types","volume":"32","author":"Yuan","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023012712474074700_btx684-B55","doi-asserted-by":"crossref","first-page":"179","DOI":"10.4137\/CIN.S40043","article-title":"Overlapping group logistic regression with applications to genetic pathway selection","volume":"15","author":"Zeng","year":"2016","journal-title":"Cancer Informatics"},{"key":"2023012712474074700_btx684-B56","first-page":"894","article-title":"Nearly unbiased variable selection under minimax concave penalty","volume-title":"Ann. Stat.","author":"Zhang","year":"2010"},{"key":"2023012712474074700_btx684-B57","author":"Zhang","year":"2007"},{"key":"2023012712474074700_btx684-B58","doi-asserted-by":"crossref","first-page":"e1002975","DOI":"10.1371\/journal.pcbi.1002975","article-title":"Network-based survival analysis reveals subnetwork signatures for predicting outcomes of ovarian cancer treatment","volume":"9","author":"Zhang","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023012712474074700_btx684-B59","doi-asserted-by":"crossref","first-page":"3468","DOI":"10.1214\/07-AOS584","article-title":"The composite absolute penalties family for grouped and hierarchical variable selection","volume":"37","author":"Zhao","year":"2009","journal-title":"Ann. 
Stat"},{"key":"2023012712474074700_btx684-B60","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1093\/bib\/bbu003","article-title":"Combining multidimensional genomic measurements for predicting cancer prognosis: observations from TCGA","volume":"16","author":"Zhao","year":"2015","journal-title":"Brief Bioinform"},{"key":"2023012712474074700_btx684-B61","doi-asserted-by":"crossref","first-page":"e1003264.","DOI":"10.1371\/journal.pgen.1003264","article-title":"Polygenic modeling with bayesian sparse linear mixed models","volume":"9","author":"Zhou","year":"2013","journal-title":"PLoS Genet"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/6\/901\/48914629\/bioinformatics_34_6_901.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/6\/901\/48914629\/bioinformatics_34_6_901.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,28]],"date-time":"2024-06-28T00:28:25Z","timestamp":1719534505000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/6\/901\/4565593"}},"subtitle":[],"editor":[{"given":"John","family":"Hancock","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,10,25]]},"references-count":61,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2018,3,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx684","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,3,15]]},"published":{"date-parts":[[2017,10,25]]}}}