{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T14:24:23Z","timestamp":1771079063775,"version":"3.50.1"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2020,7,13]],"date-time":"2020-07-13T00:00:00Z","timestamp":1594598400000},"content-version":"vor","delay-in-days":12,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"name":"National Institute of Health","award":["U01EB023685"],"award-info":[{"award-number":["U01EB023685"]}]},{"name":"National Institute of Health","award":["R01HG010798"],"award-info":[{"award-number":["R01HG010798"]}]},{"DOI":"10.13039\/100006733","name":"Indiana University","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100006733","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Precision Health Initiative"},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS-1838083"],"award-info":[{"award-number":["CNS-1838083"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>The generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor takes random effects into account. Given its power of precisely modeling the mixed effects from multiple sources of random variations, the method has been widely used in biomedical computation, for instance in the genome-wide association studies (GWASs) that aim to detect genetic variance significantly associated with phenotypes such as human diseases. Collaborative GWAS on large cohorts of patients across multiple institutions is often impeded by the privacy concerns of sharing personal genomic and other health data. To address such concerns, we present in this paper a privacy-preserving Expectation\u2013Maximization (EM) algorithm to build GLMM collaboratively when input data are distributed to multiple participating parties and cannot be transferred to a central server. We assume that the data are horizontally partitioned among participating parties: i.e. each party holds a subset of records (including observational values of fixed effect variables and their corresponding outcome), and for all records, the outcome is regulated by the same set of known fixed effects and random effects.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Our collaborative EM algorithm is mathematically equivalent to the original EM algorithm commonly used in GLMM construction. The algorithm also runs efficiently when tested on simulated and real human genomic data, and thus can be practically used for privacy-preserving GLMM construction. We implemented the algorithm for collaborative GLMM (cGLMM) construction in R. The data communication was implemented using the rsocket package.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The software is released in open source at https:\/\/github.com\/huthvincent\/cGLMM.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa478","type":"journal-article","created":{"date-parts":[[2020,5,4]],"date-time":"2020-05-04T19:13:41Z","timestamp":1588619621000},"page":"i128-i135","source":"Crossref","is-referenced-by-count":25,"title":["Privacy-preserving construction of generalized linear mixed model for biomedical computation"],"prefix":"10.1093","volume":"36","author":[{"given":"Rui","family":"Zhu","sequence":"first","affiliation":[{"name":"Luddy School of Informatics, Computing, and Engineering, Indiana University , Bloomington, IN 47405, USA"}]},{"given":"Chao","family":"Jiang","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Software Engineering, Auburn University , Auburn, AL 36849, USA"}]},{"given":"Xiaofeng","family":"Wang","sequence":"additional","affiliation":[{"name":"Luddy School of Informatics, Computing, and Engineering, Indiana University , Bloomington, IN 47405, USA"}]},{"given":"Shuang","family":"Wang","sequence":"additional","affiliation":[{"name":"Luddy School of Informatics, Computing, and Engineering, Indiana University , Bloomington, IN 47405, USA"}]},{"given":"Hao","family":"Zheng","sequence":"additional","affiliation":[{"name":"Department of Bioinformatics, Hangzhou Nuowei Information Technology , Hangzhou 310053, China"}]},{"given":"Haixu","family":"Tang","sequence":"additional","affiliation":[{"name":"Luddy School of Informatics, Computing, and Engineering, Indiana University , Bloomington, IN 47405, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,7,13]]},"reference":[{"key":"2024021913321800900_btaa478-B1","doi-asserted-by":"crossref","first-page":"3777","DOI":"10.1093\/nar\/gkr1255","article-title":"Comprehensive literature review and statistical considerations for GWAS meta-analysis","volume":"40","author":"Begum","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2024021913321800900_btaa478-B2","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1111\/1467-9868.00176","article-title":"Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm","volume":"61","author":"Booth","year":"1999","journal-title":"J. R. Stat. Soc. B Stat. Methodol"},{"key":"2024021913321800900_btaa478-B3","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1038\/sj.bjc.6601119","article-title":"Survival analysis part II: multivariate data analysis \u2014an introduction to concepts and methods","volume":"89","author":"Bradburn","year":"2003","journal-title":"Br. J. Cancer"},{"key":"2024021913321800900_btaa478-B4","first-page":"1747","volume-title":"AMIA Annual Symposium Proceedings","author":"Chen","year":"2016"},{"key":"2024021913321800900_btaa478-B5","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1186\/s12920-017-0281-2","article-title":"Presage: privacy-preserving genetic testing via software guard extension","volume":"10","author":"Chen","year":"2017","journal-title":"BMC Med. Genomics"},{"key":"2024021913321800900_btaa478-B6","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1093\/bioinformatics\/btw758","article-title":"Princess: privacy-protecting rare disease international network collaboration via encryption through software guard extensions","volume":"33","author":"Chen","year":"2017","journal-title":"Bioinformatics"},{"key":"2024021913321800900_btaa478-B7","doi-asserted-by":"crossref","first-page":"327","DOI":"10.1080\/00031305.1995.10476177","article-title":"Understanding the Metropolis-Hastings algorithm","volume":"49","author":"Chib","year":"1995","journal-title":"Am. Stat"},{"key":"2024021913321800900_btaa478-B8","doi-asserted-by":"crossref","first-page":"e28071,","DOI":"10.1371\/journal.pone.0028071","article-title":"A systematic review of re-identification attacks on health data","volume":"6","author":"El Emam","year":"2011","journal-title":"PLoS One"},{"key":"2024021913321800900_btaa478-B9","first-page":"169","author":"Gentry","year":"2009"},{"key":"2024021913321800900_btaa478-B10","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1201\/9781315154084-27","volume-title":"Handbook of Statistical Methods for Case-Control Studies","author":"Golan","year":"2018","edition":"1st edn."},{"key":"2024021913321800900_btaa478-B11","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1038\/nrg1521","article-title":"Genome-wide association studies for common diseases and complex traits","volume":"6","author":"Hirschhorn","year":"2005","journal-title":"Nat. Rev. Genet"},{"key":"2024021913321800900_btaa478-B12","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1016\/j.datak.2007.06.013","article-title":"Privacy-preserving imputation of missing data","volume":"65","author":"Jagannathan","year":"2008","journal-title":"Data Knowl. Eng"},{"key":"2024021913321800900_btaa478-B13","doi-asserted-by":"crossref","first-page":"727","DOI":"10.1111\/j.1474-9726.2012.00871.x","article-title":"a meta-analysis of GWAS and age-associated diseases","volume":"11","author":"Jeck","year":"2012","journal-title":"Aging Cell"},{"key":"2024021913321800900_btaa478-B14","doi-asserted-by":"crossref","first-page":"3238","DOI":"10.1093\/bioinformatics\/btt559","article-title":"WebGLORE: a web service for grid logistic regression","volume":"29","author":"Jiang","year":"2013","journal-title":"Bioinformatics"},{"key":"2024021913321800900_btaa478-B15","doi-asserted-by":"crossref","first-page":"e19","DOI":"10.2196\/medinform.8805","article-title":"Secure logistic regression based on homomorphic encryption: design and evaluation","volume":"6","author":"Kim","year":"2018","journal-title":"JMIR Med. Inform"},{"key":"2024021913321800900_btaa478-B16","author":"Kone\u010dn\u1ef3","year":"2016"},{"key":"2024021913321800900_btaa478-B17","doi-asserted-by":"crossref","first-page":"570","DOI":"10.1093\/jamia\/ocv146","article-title":"VERTIcal Grid lOgistic Regression (VERTIGO)","volume":"23","author":"Li","year":"2016","journal-title":"J. Am. Med. Inform. Assoc"},{"key":"2024021913321800900_btaa478-B18","doi-asserted-by":"crossref","first-page":"1212","DOI":"10.1093\/jamia\/ocv083","article-title":"WebDISCO: a web service for distributed cox model learning without patient-level data sharing","volume":"22","author":"Lu","year":"2015","journal-title":"J. Am. Med. Inform. Assoc"},{"key":"2024021913321800900_btaa478-B19","doi-asserted-by":"crossref","first-page":"356","DOI":"10.1038\/nrg2344","article-title":"Genome-wide association studies for complex traits: consensus, uncertainty and challenges","volume":"9","author":"McCarthy","year":"2008","journal-title":"Nat. Rev. Genet"},{"key":"2024021913321800900_btaa478-B21","first-page":"i","volume-title":"NSF-CBMS Regional Conference Series in Probability and Statistics","author":"McCulloch","year":"2003"},{"key":"2024021913321800900_btaa478-B22","first-page":"1","author":"McKeen","year":"2016"},{"key":"2024021913321800900_btaa478-B23","article-title":"Survey of various homomorphic encryption algorithms and schemes","volume":"91, pp 26-32.","author":"Parmar","year":"2014","journal-title":"Int. J. Comput. Appl"},{"key":"2024021913321800900_btaa478-B24","doi-asserted-by":"crossref","first-page":"362","DOI":"10.1038\/ng.2564","article-title":"GWAS meta-analysis and replication identifies three new susceptibility loci for ovarian cancer","volume":"45","author":"Pharoah","year":"2013","journal-title":"Nat. Genet"},{"key":"2024021913321800900_btaa478-B25","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1109\/Trustcom.2015.357","volume-title":"2015 IEEE Trustcom\/BigDataSE\/ISPA","author":"Sabt","year":"2015"},{"key":"2024021913321800900_btaa478-B26","doi-asserted-by":"crossref","first-page":"074004","DOI":"10.1088\/0957-0233\/26\/7\/074004","article-title":"Collaborative framework for PIV uncertainty quantification: comparative assessment of methods","volume":"26","author":"Sciacchitano","year":"2015","journal-title":"Meas. Sci. Technol"},{"key":"2024021913321800900_btaa478-B20","article-title":"Generalized linear mixed models: modern concepts, methods and applications. CRC press, 2012","author":"Stroup"},{"key":"2024021913321800900_btaa478-B27","first-page":"639","author":"Vaidya","year":"2002"},{"key":"2024021913321800900_btaa478-B28","first-page":"206","author":"Vaidya","year":"2003"},{"key":"2024021913321800900_btaa478-B29","author":"Vaidya","year":"2004"},{"key":"2024021913321800900_btaa478-B30","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1016\/j.jbi.2013.03.008","article-title":"Expectation propagation logistic regression (explorer): distributed privacy-preserving online model learning","volume":"46","author":"Wang","year":"2013","journal-title":"J. Biomed. Inform"},{"key":"2024021913321800900_btaa478-B31","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1093\/bioinformatics\/btv563","article-title":"Healer: homomorphic computation of exact logistic regression for secure rare disease variants analysis in GWAS","volume":"32","author":"Wang","year":"2016","journal-title":"Bioinformatics"},{"key":"2024021913321800900_btaa478-B32","volume":"11(suppl 4)","author":"Wang","year":"2018"},{"key":"2024021913321800900_btaa478-B33","doi-asserted-by":"crossref","first-page":"758","DOI":"10.1136\/amiajnl-2012-000862","article-title":"Grid binary LOgistic Regression (GLORE): building shared models without sharing data","volume":"19","author":"Wu","year":"2012","journal-title":"J. Am. Med. Inform. Assoc"},{"key":"2024021913321800900_btaa478-B34","first-page":"647","author":"Yu","year":"2006"},{"key":"2024021913321800900_btaa478-B35","first-page":"1034","author":"Yu","year":"2008"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_1\/i128\/56702373\/bioinformatics_36_supplement1_i128.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/Supplement_1\/i128\/56702373\/bioinformatics_36_supplement1_i128.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,5]],"date-time":"2024-08-05T06:39:42Z","timestamp":1722839982000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/Supplement_1\/i128\/5870488"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,1]]},"references-count":35,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2020,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa478","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,7]]},"published":{"date-parts":[[2020,7,1]]}}}