{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T03:26:23Z","timestamp":1772853983680,"version":"3.50.1"},"reference-count":40,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2019,8,29]],"date-time":"2019-08-29T00:00:00Z","timestamp":1567036800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Cancer Institute at the National Institutes of Health","award":["P01 CA196569"],"award-info":[{"award-number":["P01 CA196569"]}]},{"name":"National Cancer Institute at the National Institutes of Health","award":["R01 CA140561"],"award-info":[{"award-number":["R01 CA140561"]}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Epidemiologic, clinical and translational studies are increasingly generating multiplatform omics data. Methods that can integrate across multiple high-dimensional data types while accounting for differential patterns are critical for uncovering novel associations and underlying relevant subgroups.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We propose an integrative model to estimate latent unknown clusters (LUCID) aiming to both distinguish unique genomic, exposure and informative biomarkers\/omic effects while jointly estimating subgroups relevant to the outcome of interest. Simulation studies indicate that we can obtain consistent estimates reflective of the true simulated values, accurately estimate subgroups and recapitulate subgroup-specific effects. We also demonstrate the use of the integrated model for future prediction of risk subgroups and phenotypes. We apply this approach to two real data applications to highlight the integration of genomic, exposure and metabolomic data.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and Implementation<\/jats:title><jats:p>The LUCID method is implemented through the LUCIDus R package available on CRAN (https:\/\/CRAN.R-project.org\/package=LUCIDus).<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary materials are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz667","type":"journal-article","created":{"date-parts":[[2019,8,21]],"date-time":"2019-08-21T11:42:53Z","timestamp":1566387773000},"page":"842-850","source":"Crossref","is-referenced-by-count":32,"title":["A latent unknown clustering integrating multi-omics data (LUCID) with phenotypic traits"],"prefix":"10.1093","volume":"36","author":[{"given":"Cheng","family":"Peng","sequence":"first","affiliation":[{"name":"Department of Preventive Medicine, Keck School of Medicine, Los Angeles, CA 90089, USA"}]},{"given":"Jun","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Preventive Medicine, Keck School of Medicine, Los Angeles, CA 90089, USA"}]},{"given":"Isaac","family":"Asante","sequence":"additional","affiliation":[{"name":"Department of Clinical Pharmacy, School of Pharmacy, University of Southern California , Los Angeles, CA 90089, USA"}]},{"given":"Stan","family":"Louie","sequence":"additional","affiliation":[{"name":"Department of Clinical Pharmacy, School of Pharmacy, University of Southern California , Los Angeles, CA 90089, USA"}]},{"given":"Ran","family":"Jin","sequence":"additional","affiliation":[{"name":"Department of Preventive Medicine, Keck School of Medicine, Los Angeles, CA 90089, USA"}]},{"given":"Lida","family":"Chatzi","sequence":"additional","affiliation":[{"name":"Department of Preventive Medicine, Keck School of Medicine, Los Angeles, CA 90089, USA"}]},{"given":"Graham","family":"Casey","sequence":"additional","affiliation":[{"name":"Center for Public Health Genomics, Department of Public Health Sciences , University of Virginia, Charlottesville, VA 22908, USA"}]},{"given":"Duncan C","family":"Thomas","sequence":"additional","affiliation":[{"name":"Department of Preventive Medicine, Keck School of Medicine, Los Angeles, CA 90089, USA"}]},{"given":"David V","family":"Conti","sequence":"additional","affiliation":[{"name":"Department of Preventive Medicine, Keck School of Medicine, Los Angeles, CA 90089, USA"}]}],"member":"286","published-online":{"date-parts":[[2019,8,29]]},"reference":[{"key":"2023013110024962200_btz667-B1","doi-asserted-by":"crossref","first-page":"232","DOI":"10.1214\/10-AOAS388","article-title":"Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection","volume":"5","author":"Breheny","year":"2011","journal-title":"Ann. Appl. Statist"},{"key":"2023013110024962200_btz667-B2","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1038\/nature10983","article-title":"The genomic and transcriptomic architecture of 2 000 breast tumours reveals novel subgroups","volume":"486","author":"Curtis","year":"2012","journal-title":"Nature"},{"key":"2023013110024962200_btz667-B3","first-page":"54","article-title":"Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy","volume":"1","author":"Efron","year":"1986","journal-title":"Statist. Sci"},{"key":"2023013110024962200_btz667-B4","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1111\/rssb.12001","article-title":"Tuning parameter selection in high dimensional penalized likelihood","volume":"75","author":"Fan","year":"2013","journal-title":"J. R. Statist. Soc"},{"key":"2023013110024962200_btz667-B5","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1093\/toxsci\/kfv198","article-title":"Reference standardization for mass spectrometry and high-resolution metabolomics applications to exposome research","volume":"148","author":"Go","year":"2015","journal-title":"Toxicol. Sci"},{"key":"2023013110024962200_btz667-B6","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1210\/jc.2003-031402","article-title":"Impaired glucose tolerance and reduced beta-cell function in overweight Latino children with a positive family history for type 2 diabetes","volume":"89","author":"Goran","year":"2004","journal-title":"J. Clin. Endocrinol. Metab"},{"key":"2023013110024962200_btz667-B7","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1093\/oxfordjournals.jncimonographs.a024231","article-title":"Study-design issues in the development of the University of Southern California Consortium\u2019s Colorectal Cancer Family Registry","volume":"90033","author":"Haile","year":"1999","journal-title":"J. Natl. Cancer Inst. Monogr"},{"key":"2023013110024962200_btz667-B8","volume-title":"The Elements of Statistical Learning (Springer Series in Statistics)","author":"Hastie","year":"2009"},{"key":"2023013110024962200_btz667-B9","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1002\/sim.6326","article-title":"Integrative modeling of multi-platform genomic data under the framework of mediation analysis","volume":"34","author":"Huang","year":"2015","journal-title":"Statist. Med"},{"key":"2023013110024962200_btz667-B10","doi-asserted-by":"crossref","first-page":"352","DOI":"10.1214\/13-AOAS690","article-title":"Joint analysis of SNP and gene expression data in genetic association studies of complex diseases","volume":"8","author":"Huang","year":"2014","journal-title":"Ann. Appl. Stat"},{"key":"2023013110024962200_btz667-B11","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1002\/gepi.21905","article-title":"iGWAS: integrative genome-wide association studies of genetic and genomic data for disease susceptibility using mediation analysis","volume":"39","author":"Huang","year":"2015","journal-title":"Gen. Epidemiol"},{"key":"2023013110024962200_btz667-B12","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4614-7138-7","volume-title":"An Introduction to Statistical Learning with Applications in R","author":"James","year":"2013"},{"key":"2023013110024962200_btz667-B13","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1007\/s00439-002-0857-5","article-title":"Assessment of the CTNNA3 gene encoding human alpha T-catenin regarding its involvement in dilated cardiomyopathy","volume":"112","author":"Janssens","year":"2003","journal-title":"Hum. Genet"},{"key":"2023013110024962200_btz667-B14","doi-asserted-by":"crossref","first-page":"e1003123","DOI":"10.1371\/journal.pcbi.1003123","article-title":"Predicting network activity from high throughput metabolomics","volume":"9","author":"Li","year":"2013","journal-title":"PLoS Comput. Biol"},{"key":"2023013110024962200_btz667-B15","doi-asserted-by":"crossref","DOI":"10.1002\/9781119013563","volume-title":"Statistical Analysis with Missing Data","author":"Little","year":"2002","edition":"2nd edn"},{"key":"2023013110024962200_btz667-B16","doi-asserted-by":"crossref","first-page":"899","DOI":"10.1080\/01621459.1991.10475130","article-title":"Using EM to obtain asymptotic matrices: the SEM algorithm","volume":"86","author":"Meng","year":"1991","journal-title":"J. Am. Stat. Ass"},{"key":"2023013110024962200_btz667-B17","doi-asserted-by":"crossref","first-page":"2854","DOI":"10.1093\/hmg\/ddm244","article-title":"Genetic association of CTNNA3 with late-onset Alzheimer\u2019s disease in females","volume":"16","author":"Miyashita","year":"2007","journal-title":"Hum. Mol. Gene"},{"key":"2023013110024962200_btz667-B18","doi-asserted-by":"crossref","first-page":"4245","DOI":"10.1073\/pnas.1208949110","article-title":"Pattern discovery and cancer gene identification in integrated cancer genomic data","volume":"110","author":"Mo","year":"2013","journal-title":"Proc. Nat. Acad. Sci. USA"},{"key":"2023013110024962200_btz667-B19","volume-title":"Machine Learning a Probabilistic Perspective (Adaptive Computation and Machine Learning)","author":"Murphy","year":"2012"},{"key":"2023013110024962200_btz667-B20","doi-asserted-by":"crossref","first-page":"2331","DOI":"10.1158\/1055-9965.EPI-07-0648","article-title":"Colon Cancer Family Registry: an international resource for studies of the genetic epidemiology of colon cancer","volume":"16","author":"Newcomb","year":"2007","journal-title":"Cancer Epidemiol. Biomark. Prev"},{"key":"2023013110024962200_btz667-B21","doi-asserted-by":"crossref","first-page":"89","DOI":"10.3109\/07853890.2015.1137630","article-title":"Metabolomics in diabetes, a review","volume":"48","author":"Pallares-M\u00e9ndez","year":"2016","journal-title":"Ann. Med"},{"key":"2023013110024962200_btz667-B22","doi-asserted-by":"crossref","first-page":"2653","DOI":"10.1093\/jn\/136.10.2653","article-title":"A mathematical model gives insights into nutritional and genetic aspects of folate-mediated one-carbon metabolism","volume":"136","author":"Reed","year":"2006","journal-title":"J. Nutr"},{"key":"2023013110024962200_btz667-B23","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1007\/s10654-017-0253-z","article-title":"Cancer subtypes in aetiological research","volume":"32","author":"Richiardi","year":"2017","journal-title":"Eur. J. Epidemiol"},{"key":"2023013110024962200_btz667-B24","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1038\/nrg3868","article-title":"Methods of integrating data to uncover genotype-phenotype interactions","volume":"16","author":"Ritchie","year":"2015","journal-title":"Nat. Rev. Gen"},{"key":"2023013110024962200_btz667-B25","volume-title":"Modern Epidemiology","author":"Rothman","year":"2008","edition":"3rd edn"},{"key":"2023013110024962200_btz667-B26","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1111\/j.1530-9290.2008.00004.x","article-title":"The Sankey diagram in energy and material flow management: part I: history","volume":"12","author":"Schmidt","year":"2008","journal-title":"J. Indust. Ecol"},{"key":"2023013110024962200_btz667-B27","doi-asserted-by":"crossref","first-page":"7138.","DOI":"10.1038\/ncomms8138","article-title":"Genome-wide association study of colorectal cancer identifies six new susceptibility loci","volume":"6","author":"Schumacher","year":"2015","journal-title":"Nat. Commun"},{"key":"2023013110024962200_btz667-B28","doi-asserted-by":"crossref","first-page":"2906","DOI":"10.1093\/bioinformatics\/btp543","article-title":"Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis","volume":"25","author":"Shen","year":"2009","journal-title":"Bioinformatics"},{"key":"2023013110024962200_btz667-B29","doi-asserted-by":"crossref","first-page":"S132","DOI":"10.1007\/s11306-011-0332-1","article-title":"High-performance metabolic profiling with dual chromatography-Fourier-transform mass spectrometry (DC-FTMS) for study of the exposome","volume":"9(Suppl. 1)","author":"Soltow","year":"2013","journal-title":"Metabolomics"},{"key":"2023013110024962200_btz667-B30","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1111\/biom.12964","article-title":"Regularized latent class model for joint analysis of high-dimensional longitudinal biomarkers and a time-to-event outcome","volume":"75","author":"Sun","year":"2019","journal-title":"Biometrics"},{"key":"2023013110024962200_btz667-B31","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression selection and Shrinkage via the Lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. Royal Stat. Soc. B"},{"key":"2023013110024962200_btz667-B32","doi-asserted-by":"crossref","first-page":"565","DOI":"10.1007\/s10985-007-9061-1","article-title":"Multistage sampling for latent variable models","volume":"13","author":"Thomas","year":"2007","journal-title":"Lifetime Data Anal"},{"key":"2023013110024962200_btz667-B33","doi-asserted-by":"crossref","first-page":"448.","DOI":"10.1038\/nm.2307","article-title":"Metabolite profiles and the risk of developing diabetes","volume":"17","author":"Wang","year":"2011","journal-title":"Nat. Med"},{"key":"2023013110024962200_btz667-B34","doi-asserted-by":"crossref","first-page":"2094","DOI":"10.2337\/diacare.26.7.2094","article-title":"Association between insulin sensitivity and post-glucose challenge plasma insulin values in overweight Latino youth","volume":"26","author":"Weigensberg","year":"2003","journal-title":"Diabetes Care"},{"key":"2023013110024962200_btz667-B35","doi-asserted-by":"crossref","first-page":"615","DOI":"10.1111\/j.1467-9868.2009.00699.x","article-title":"Covariance-regularized regression and classification for high dimensional problems","volume":"2","author":"Witten","year":"2009","journal-title":"J. R. Stat. Soc"},{"key":"2023013110024962200_btz667-B36","doi-asserted-by":"crossref","first-page":"4.","DOI":"10.3390\/ht8010004","article-title":"A selective review of multi-level omics data integration using variable selection","volume":"8","author":"Wu","year":"2019","journal-title":"High-Throughput"},{"key":"2023013110024962200_btz667-B37","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-21706-2","volume-title":"Modern Applied Statistics with S","author":"Venables","year":"2002","edition":"4th edn"},{"key":"2023013110024962200_btz667-B38","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/btw351","article-title":"Estimating and testing high-dimensional mediation effects in epigenetic studies","volume":"32","author":"Zhang","year":"2016","journal-title":"Bioinformatics"},{"key":"2023013110024962200_btz667-B39","doi-asserted-by":"crossref","first-page":"1418","DOI":"10.1198\/016214506000000735","article-title":"The adaptive lasso and its oracle properties","volume":"101","author":"Zou","year":"2006","journal-title":"J. Am. Stat. Assoc"},{"key":"2023013110024962200_btz667-B40","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","article-title":"Regularization and variable selection via the elastic-net","volume":"67","author":"Zou","year":"2005","journal-title":"J. R. Stat. Soc"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz667\/29224293\/btz667.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/3\/842\/48982473\/bioinformatics_36_3_842.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/3\/842\/48982473\/bioinformatics_36_3_842.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,22]],"date-time":"2024-07-22T10:02:32Z","timestamp":1721642552000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/3\/842\/5556107"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,8,29]]},"references-count":40,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2020,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz667","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,2,1]]},"published":{"date-parts":[[2019,8,29]]}}}