{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T07:13:47Z","timestamp":1775027627275,"version":"3.50.1"},"reference-count":38,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2020,9,11]],"date-time":"2020-09-11T00:00:00Z","timestamp":1599782400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["P01 CA196569"],"award-info":[{"award-number":["P01 CA196569"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["R01 CA140561"],"award-info":[{"award-number":["R01 CA140561"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Associated with genomic features like gene expression, methylation and genotypes, used in statistical modeling of health outcomes, there is a rich set of meta-features like functional annotations, pathway information and knowledge from previous studies, that can be used post hoc to facilitate the interpretation of a model. 
However, using this meta-feature information a priori rather than post hoc can yield improved prediction performance as well as enhanced model interpretation.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We propose a new penalized regression approach that allows a priori integration of external meta-features. The method extends LASSO regression by incorporating individualized penalty parameters for each regression coefficient. The penalty parameters are, in turn, modeled as a log-linear function of the meta-features and are estimated from the data using an approximate empirical Bayes approach. The marginal likelihood on which the empirical Bayes estimation is based is optimized using a fast and stable majorization\u2013minimization procedure. Through simulations, we show that the proposed regression with individualized penalties can outperform the standard LASSO in terms of both parameter estimation and prediction performance when the external data is informative. 
We further demonstrate our approach with applications to gene expression studies of bone density and breast cancer.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The methods have been implemented in the R package xtune freely available for download from https:\/\/cran.r-project.org\/web\/packages\/xtune\/index.html.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa776","type":"journal-article","created":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T07:12:43Z","timestamp":1598944363000},"page":"514-521","source":"Crossref","is-referenced-by-count":21,"title":["Incorporating prior knowledge into regularized regression"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5746-7881","authenticated-orcid":false,"given":"Chubing","family":"Zeng","sequence":"first","affiliation":[{"name":"Division of Biostatistics, Department of Preventive Medicine, Keck School of Medicine, University of Southern California , Los Angeles, CA 90033, USA"}]},{"given":"Duncan Campbell","family":"Thomas","sequence":"additional","affiliation":[{"name":"Division of Biostatistics, Department of Preventive Medicine, Keck School of Medicine, University of Southern California , Los Angeles, CA 90033, USA"}]},{"given":"Juan Pablo","family":"Lewinger","sequence":"additional","affiliation":[{"name":"Division of Biostatistics, Department of Preventive Medicine, Keck School of Medicine, University of Southern California , Los Angeles, CA 90033, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,9,11]]},"reference":[{"key":"2023051706070143500_btaa776-B1","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology. The gene ontology consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. 
Genet"},{"key":"2023051706070143500_btaa776-B2","doi-asserted-by":"crossref","DOI":"10.2202\/1544-6115.1703","article-title":"Weighted lasso with data integration","volume":"10","author":"Bergersen","year":"2011","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023051706070143500_btaa776-B3","doi-asserted-by":"crossref","DOI":"10.1080\/01621459.2014.960967","article-title":"Dirichlet-Laplace priors for optimal shrinkage","volume":"110","author":"Bhattacharya","year":"2015","journal-title":"J. Am. Stat. Assoc"},{"key":"2023051706070143500_btaa776-B4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1155\/2017\/7691937","article-title":"IPF-LASSO: integrative-penalized regression with penalty factors for prediction based on multi-omics data","volume":"2017","author":"Boulesteix","year":"2017","journal-title":"Comput. Math. Methods Med"},{"key":"2023051706070143500_btaa776-B5","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511804441","volume-title":"Convex Optimization","author":"Boyd","year":"2004"},{"key":"2023051706070143500_btaa776-B6","doi-asserted-by":"crossref","first-page":"e1002920","DOI":"10.1371\/journal.pcbi.1002920","article-title":"Biomolecular events in cancer revealed by attractor metagenes","volume":"9","author":"Cheng","year":"2013","journal-title":"PLoS Comput. 
Biol"},{"key":"2023051706070143500_btaa776-B7","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1038\/nature10983","article-title":"The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups","volume":"486","author":"Curtis","year":"2012","journal-title":"Nature"},{"key":"2023051706070143500_btaa776-B8","doi-asserted-by":"crossref","first-page":"D945","DOI":"10.1093\/nar\/gkq929","article-title":"Cosmic: mining complete cancer genomes in the catalogue of somatic mutations in cancer","volume":"39","author":"Forbes","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023051706070143500_btaa776-B9","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1007\/s00180-007-0033-4","article-title":"A random model approach for the LASSO","volume":"23","author":"Foster","year":"2008","journal-title":"Comput. Stat"},{"key":"2023051706070143500_btaa776-B10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. Softw"},{"key":"2023051706070143500_btaa776-B11","doi-asserted-by":"crossref","first-page":"1846","DOI":"10.1109\/TIT.2012.2227680","article-title":"How correlations influence lasso prediction","volume":"59","author":"Hebiri","year":"2013","journal-title":"IEEE Trans. Inf. 
Theory"},{"key":"2023051706070143500_btaa776-B12","article-title":"EBglmnet: a comprehensive r package for sparse generalized linear regression models","author":"Huang","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051706070143500_btaa776-B13","article-title":"Exploitation of gene expression and cancer biomarkers in paving the path to era of personalized medicine","volume":"15, 220-235","author":"Kamel","year":"2017","journal-title":"Genomics Proteomics Bioinf"},{"key":"2023051706070143500_btaa776-B14","doi-asserted-by":"crossref","DOI":"10.1007\/s10107-018-1235-y","article-title":"DC programming and DCA: thirty years of developments","volume":"169","author":"Le Thi","year":"2018","journal-title":"Math. Programm"},{"key":"2023051706070143500_btaa776-B15","doi-asserted-by":"crossref","first-page":"D1047","DOI":"10.1093\/nar\/gkr1182","article-title":"GWASdb: a database for human genetic variants identified by genome-wide association studies","volume":"40","author":"Li","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023051706070143500_btaa776-B16","doi-asserted-by":"crossref","first-page":"151","DOI":"10.1214\/10-BA506","article-title":"The Bayesian elastic net","volume":"5","author":"Li","year":"2010","journal-title":"Bayesian Anal"},{"key":"2023051706070143500_btaa776-B17","doi-asserted-by":"crossref","first-page":"369","DOI":"10.1186\/s12859-018-2401-1","article-title":"Data integration by multi-tuning parameter elastic net regression","volume":"19","author":"Liu","year":"2018","journal-title":"BMC Bioinformatics"},{"key":"2023051706070143500_btaa776-B18","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1162\/neco.1992.4.3.415","article-title":"Bayesian Interpolation","volume":"4","author":"MacKay","year":"1992","journal-title":"Neural Comput"},{"key":"2023051706070143500_btaa776-B19","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1093\/biomet\/asr066","article-title":"A direct approach to sparse discriminant analysis 
in ultra-high dimensions","volume":"99","author":"Mai","year":"2012","journal-title":"Biometrika"},{"key":"2023051706070143500_btaa776-B20","volume-title":"Bayesian Learning for Neural Networks, Volume 118","author":"Neal","year":"1995"},{"key":"2023051706070143500_btaa776-B21","doi-asserted-by":"crossref","first-page":"R62","DOI":"10.1186\/bcr1614","article-title":"Predicting a local recurrence after breast-conserving therapy by gene expression profiling","volume":"8","author":"Nuyten","year":"2006","journal-title":"Breast Cancer Res. BCR"},{"key":"2023051706070143500_btaa776-B22","doi-asserted-by":"crossref","first-page":"474","DOI":"10.1111\/j.1541-0420.2009.01296.x","article-title":"Incorporating predictor network in penalized regression with application to microarray data","volume":"66","author":"Pan","year":"2010","journal-title":"Biometrics"},{"key":"2023051706070143500_btaa776-B23","doi-asserted-by":"crossref","first-page":"681","DOI":"10.1198\/016214508000000337","article-title":"The Bayesian Lasso","volume":"103","author":"Park","year":"2008","journal-title":"J. Am. Stat. 
Assoc"},{"key":"2023051706070143500_btaa776-B24","first-page":"74, 83-89","article-title":"Diseases: text mining and data integration of disease\u2013gene associations","author":"Pletscher-Frankild","year":"2014","journal-title":"Methods (San Diego, Calif.)"},{"key":"2023051706070143500_btaa776-B25","first-page":"35","article-title":"A study of error variance estimation in lasso regression","author":"Reid","year":"2016"},{"key":"2023051706070143500_btaa776-B26","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1016\/j.bone.2009.11.007","article-title":"Eight genes are highly associated with BMD variation in postmenopausal Caucasian women","volume":"46","author":"Reppe","year":"2010","journal-title":"Bone"},{"key":"2023051706070143500_btaa776-B27","doi-asserted-by":"crossref","first-page":"baw100","DOI":"10.1093\/database\/baw100","article-title":"The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins","volume":"2016","author":"Rouillard","year":"2016","journal-title":"Database"},{"key":"2023051706070143500_btaa776-B28","doi-asserted-by":"crossref","first-page":"1775","DOI":"10.1093\/bioinformatics\/btm234","article-title":"Incorporating prior knowledge of predictors into penalized classifiers with multiple penalty terms","volume":"23","author":"Tai","year":"2007","journal-title":"Bioinformatics"},{"key":"2023051706070143500_btaa776-B29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-016-1210-7","article-title":"Tilting the lasso by knowledge-based post-processing","volume":"17","author":"Tharmaratnam","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023051706070143500_btaa776-B30","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression Shrinkage and Selection via the Lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B Stat. 
Methodol"},{"key":"2023051706070143500_btaa776-B31","first-page":"211","article-title":"Sparse Bayesian learning and the relevance vector mach","volume":"1","author":"Tipping","year":"2001","journal-title":"J. Mach. Learn. Res"},{"key":"2023051706070143500_btaa776-B32","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1002\/sim.6732","article-title":"Better prediction by use of co-data: adaptive group-regularized ridge regression","volume":"35","author":"van de Wiel","year":"2016","journal-title":"Stat. Med"},{"key":"2023051706070143500_btaa776-B33","article-title":"The NHGRI GWAS catalog, a curated resource of SNP-trait associations","volume":"42","author":"Welter","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023051706070143500_btaa776-B34","first-page":"1625","article-title":"A new view of automatic relevance determination","volume":"20","author":"Wipf","year":"2008","journal-title":"Compute"},{"key":"2023051706070143500_btaa776-B35","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1109\/JSTSP.2010.2042413","article-title":"Iterative reweighted l1 and l2 methods for finding sparse solutions","volume":"4","author":"Wipf","year":"2010","journal-title":"IEEE J. Select. Top. Signal Process"},{"key":"2023051706070143500_btaa776-B36","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1111\/j.1467-9868.2005.00532.x","article-title":"Model selection and estimation in regression with grouped variables","volume":"68","author":"Yuan","year":"2006","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol"},{"key":"2023051706070143500_btaa776-B37","author":"Zeng","year":"2019"},{"key":"2023051706070143500_btaa776-B38","doi-asserted-by":"crossref","first-page":"1418","DOI":"10.1198\/016214506000000735","article-title":"The adaptive lasso and its oracle properties","volume":"101","author":"Zou","year":"2006","journal-title":"J. Am. Stat. 
Assoc"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa776\/34841187\/btaa776.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/4\/514\/50359773\/btaa776.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/4\/514\/50359773\/btaa776.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,12]],"date-time":"2024-08-12T19:50:48Z","timestamp":1723492248000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/4\/514\/5904263"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,9,11]]},"references-count":38,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2021,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa776","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.03.04.971408","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,2,15]]},"published":{"date-parts":[[2020,9,11]]}}}