{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:01Z","timestamp":1772138041073,"version":"3.50.1"},"reference-count":36,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2020,9,11]],"date-time":"2020-09-11T00:00:00Z","timestamp":1599782400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"UK Medical Research Council (core funding to Stephen Burgess","award":["MC_UU_00002\/7"],"award-info":[{"award-number":["MC_UU_00002\/7"]}]},{"name":"UK Medical Research Council (core funding to Stephen Burgess","award":["MC_UU_00002\/13"],"award-info":[{"award-number":["MC_UU_00002\/13"]}]},{"name":"UK National Institute for Health Research Cambridge Biomedical Research Centre"},{"name":"Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society","award":["204623\/Z\/16\/Z"],"award-info":[{"award-number":["204623\/Z\/16\/Z"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,5,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Mendelian randomization is an epidemiological technique that uses genetic variants as instrumental variables to estimate the causal effect of a risk factor on an outcome. We consider a scenario in which causal estimates based on each variant in turn differ more strongly than expected by chance alone, but the variants can be divided into distinct clusters, such that all variants in the cluster have similar causal estimates. This scenario is likely to occur when there are several distinct causal mechanisms by which a risk factor influences an outcome with different magnitudes of causal effect. 
We have developed an algorithm MR-Clust that finds such clusters of variants, and so can identify variants that reflect distinct causal mechanisms. Two features of our clustering algorithm are that it accounts for differential uncertainty in the causal estimates, and it includes \u2018null\u2019 and \u2018junk\u2019 clusters, to provide protection against the detection of spurious clusters.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Our algorithm correctly detected the number of clusters in a simulation analysis, outperforming methods that either do not account for uncertainty or do not include null and junk clusters. In an applied example considering the effect of blood pressure on coronary artery disease risk, the method detected four clusters of genetic variants. A post hoc hypothesis-generating search suggested that variants in the cluster with a negative effect of blood pressure on coronary artery disease risk were more strongly related to trunk fat percentage and other adiposity measures than variants not in this cluster.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>MR-Clust can be downloaded from https:\/\/github.com\/cnfoley\/mrclust.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa778","type":"journal-article","created":{"date-parts":[[2020,9,1]],"date-time":"2020-09-01T15:46:28Z","timestamp":1598975188000},"page":"531-541","source":"Crossref","is-referenced-by-count":75,"title":["MR-Clust: clustering of genetic variants in Mendelian randomization with 
similar causal estimates"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0970-2610","authenticated-orcid":false,"given":"Christopher N","family":"Foley","sequence":"first","affiliation":[{"name":"MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge , Cambridge CB2 0SR, UK"}]},{"given":"Amy M","family":"Mason","sequence":"additional","affiliation":[{"name":"Department of Public Health and Primary Care, Cardiovascular Epidemiology Unit, University of Cambridge , Cambridge CB1 8RN, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5931-7489","authenticated-orcid":false,"given":"Paul D W","family":"Kirk","sequence":"additional","affiliation":[{"name":"MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge , Cambridge CB2 0SR, UK"},{"name":"Cambridge Institute of Therapeutic Immunology & Infectious Disease, University of Cambridge , Cambridge CB2 0AW, UK"}]},{"given":"Stephen","family":"Burgess","sequence":"additional","affiliation":[{"name":"MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge , Cambridge CB2 0SR, UK"},{"name":"Department of Public Health and Primary Care, Cardiovascular Epidemiology Unit, University of Cambridge , Cambridge CB1 8RN, UK"}]}],"member":"286","published-online":{"date-parts":[[2020,9,11]]},"reference":[{"key":"2023051706071825400_btaa778-B1","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1080\/01621459.1996.10476902","article-title":"Identification of causal effects using instrumental variables","volume":"91","author":"Angrist","year":"1996","journal-title":"J. Am. Stat. Assoc"},{"key":"2023051706071825400_btaa778-B2","doi-asserted-by":"crossref","first-page":"2297","DOI":"10.1002\/sim.6128","article-title":"Instrumental variable methods for causal inference","volume":"33","author":"Baiocchi","year":"2014","journal-title":"Stat. 
Med"},{"key":"2023051706071825400_btaa778-B3","doi-asserted-by":"crossref","DOI":"10.1201\/b18084","volume-title":"Mendelian Randomization: Methods for Using Genetic Variants in Causal Estimation","author":"Burgess","year":"2015"},{"key":"2023051706071825400_btaa778-B4","doi-asserted-by":"crossref","first-page":"658","DOI":"10.1002\/gepi.21758","article-title":"Mendelian randomization analysis with multiple genetic variants using summarized data","volume":"37","author":"Burgess","year":"2013","journal-title":"Genet. Epidemiol"},{"key":"2023051706071825400_btaa778-B5","doi-asserted-by":"crossref","first-page":"597","DOI":"10.1002\/gepi.21998","article-title":"Bias due to participant overlap in two-sample Mendelian randomization","volume":"40","author":"Burgess","year":"2016","journal-title":"Genet. Epidemiol"},{"key":"2023051706071825400_btaa778-B6","doi-asserted-by":"crossref","first-page":"1880","DOI":"10.1002\/sim.6835","article-title":"Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods","volume":"35","author":"Burgess","year":"2016","journal-title":"Stat. Med"},{"key":"2023051706071825400_btaa778-B7","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1534\/genetics.117.300191","article-title":"Dissecting causal pathways using Mendelian randomization with summarized genetic data: application to age at menarche and risk of breast cancer","volume":"207","author":"Burgess","year":"2017","journal-title":"Genetics"},{"key":"2023051706071825400_btaa778-B8","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/s41467-019-14156-4","article-title":"A robust and efficient method for Mendelian randomization with hundreds of genetic variants","volume":"11","author":"Burgess","year":"2020","journal-title":"Nat. 
Commun"},{"key":"2023051706071825400_btaa778-B9","doi-asserted-by":"crossref","first-page":"1010","DOI":"10.1038\/s41467-020-14452-4","article-title":"Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework","volume":"11","author":"Cho","year":"2020","journal-title":"Nat. Commun"},{"key":"2023051706071825400_btaa778-B10","doi-asserted-by":"crossref","first-page":"1638","DOI":"10.1080\/01621459.2012.734171","article-title":"Instrumental variable estimators for binary outcomes","volume":"107","author":"Clarke","year":"2012","journal-title":"J. Am. Stat. Assoc"},{"key":"2023051706071825400_btaa778-B11","doi-asserted-by":"crossref","first-page":"e1006516","DOI":"10.1371\/journal.pcbi.1006516","article-title":"A Bayesian mixture modelling approach for spatial proteomics","volume":"14","author":"Crook","year":"2018","journal-title":"PLoS Comput. Biol"},{"key":"2023051706071825400_btaa778-B12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/ije\/dyg070","article-title":"\u2018Mendelian randomization\u2019: can genetic epidemiology contribute to understanding environmental determinants of disease?","volume":"32","author":"Davey Smith","year":"2003","journal-title":"Int. J. Epidemiol"},{"key":"2023051706071825400_btaa778-B13","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1177\/0962280206077743","article-title":"Mendelian randomization as an instrumental variable approach to causal inference","volume":"16","author":"Didelez","year":"2007","journal-title":"Stat. Methods Med. Res"},{"key":"2023051706071825400_btaa778-B14","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1214\/09-STS316","article-title":"Assumptions of IV methods for observational epidemiology","volume":"25","author":"Didelez","year":"2010","journal-title":"Stat. 
Sci"},{"key":"2023051706071825400_btaa778-B15","doi-asserted-by":"crossref","first-page":"1412","DOI":"10.1038\/s41588-018-0205-x","article-title":"Genetic analysis of over one million people identifies 535 novel loci for blood pressure","volume":"50","author":"Evangelou","year":"2018","journal-title":"Nat. Genet"},{"key":"2023051706071825400_btaa778-B16","doi-asserted-by":"crossref","first-page":"722","DOI":"10.1093\/ije\/29.4.722","article-title":"An introduction to instrumental variables for epidemiologists","volume":"29","author":"Greenland","year":"2000","journal-title":"Int. J. Epidemiol"},{"key":"2023051706071825400_btaa778-B17","doi-asserted-by":"crossref","first-page":"360","DOI":"10.1097\/01.ede.0000222409.00878.37","article-title":"Instruments for causal inference: an epidemiologist\u2019s dream?","volume":"17","author":"Hern\u00e1n","year":"2006","journal-title":"Epidemiology"},{"key":"2023051706071825400_btaa778-B18","doi-asserted-by":"crossref","first-page":"467","DOI":"10.2307\/2951620","article-title":"Identification and estimation of local average treatment effects","volume":"62","author":"Imbens","year":"1994","journal-title":"Econometrica"},{"key":"2023051706071825400_btaa778-B19","author":"Johnson","year":"2013"},{"key":"2023051706071825400_btaa778-B20","doi-asserted-by":"crossref","first-page":"1133","DOI":"10.1002\/sim.3034","article-title":"Mendelian randomization: using genes as instruments for making causal inferences in epidemiology","volume":"27","author":"Lawlor","year":"2008","journal-title":"Stat. Med"},{"key":"2023051706071825400_btaa778-B21","volume-title":"Causality: Models, Reasoning, and Inference","author":"Pearl","year":"2000"},{"key":"2023051706071825400_btaa778-B22","doi-asserted-by":"crossref","first-page":"581","DOI":"10.1038\/nrd4051","article-title":"Validating therapeutic targets through human genetics","volume":"12","author":"Plenge","year":"2013","journal-title":"Nat. Rev. 
Drug Disc"},{"key":"2023051706071825400_btaa778-B23","doi-asserted-by":"crossref","first-page":"1941","DOI":"10.1038\/s41467-019-09432-2","article-title":"Mendelian randomization analysis using mixture models for robust and efficient estimation of causal effects","volume":"10","author":"Qi","year":"2019","journal-title":"Nat. Commun"},{"key":"2023051706071825400_btaa778-B24","doi-asserted-by":"crossref","first-page":"846","DOI":"10.1080\/01621459.1971.10482356","article-title":"Objective criteria for the evaluation of clustering methods","volume":"66","author":"Rand","year":"1971","journal-title":"J. Am. Stat. Assoc"},{"key":"2023051706071825400_btaa778-B25","doi-asserted-by":"crossref","first-page":"252","DOI":"10.1038\/s41591-020-0751-5","article-title":"Using human genetics to understand the disease impacts of testosterone in men and women","volume":"26","author":"Ruth","year":"2020","journal-title":"Nat. Med"},{"key":"2023051706071825400_btaa778-B26","doi-asserted-by":"crossref","first-page":"289","DOI":"10.32614\/RJ-2016-021","article-title":"mclust 5: clustering, classification and density estimation using Gaussian finite mixture models","volume":"8","author":"Scrucca","year":"2016","journal-title":"R. 
J"},{"key":"2023051706071825400_btaa778-B27","doi-asserted-by":"crossref","first-page":"3207","DOI":"10.1093\/bioinformatics\/btw373","article-title":"PhenoScanner: a database of human genotype-phenotype associations","volume":"32","author":"Staley","year":"2016","journal-title":"Bioinformatics"},{"key":"2023051706071825400_btaa778-B28","doi-asserted-by":"crossref","first-page":"370","DOI":"10.1097\/EDE.0b013e31828d0590","article-title":"Commentary: how to report instrumental variable analyses (suggestions welcome)","volume":"24","author":"Swanson","year":"2013","journal-title":"Epidemiology"},{"key":"2023051706071825400_btaa778-B29","doi-asserted-by":"crossref","first-page":"4064","DOI":"10.1038\/s41467-019-11953-9","article-title":"Components of genetic associations across 2,138 phenotypes in the UK Biobank highlight adipocyte biology","volume":"10","author":"Tanigawa","year":"2019","journal-title":"Nat. Commun"},{"key":"2023051706071825400_btaa778-B30","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1038\/nature10405","article-title":"Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk","volume":"478","year":"2011","journal-title":"Nature"},{"key":"2023051706071825400_btaa778-B31","doi-asserted-by":"crossref","first-page":"e1002654","DOI":"10.1371\/journal.pmed.1002654","article-title":"Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: a soft clustering analysis","volume":"15","author":"Udler","year":"2018","journal-title":"PLoS Med"},{"key":"2023051706071825400_btaa778-B32","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1161\/CIRCRESAHA.117.312086","article-title":"Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease","volume":"122","author":"van der Harst","year":"2018","journal-title":"Circ. 
Res"},{"key":"2023051706071825400_btaa778-B33","doi-asserted-by":"crossref","first-page":"693","DOI":"10.1038\/s41588-018-0099-7","article-title":"Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases","volume":"50","author":"Verbanck","year":"2018","journal-title":"Nat. Genet"},{"key":"2023051706071825400_btaa778-B34","doi-asserted-by":"crossref","first-page":"572","DOI":"10.1016\/S0140-6736(12)60312-2","article-title":"Plasma HDL cholesterol and risk of myocardial infarction: a mendelian randomisation study","volume":"380","author":"Voight","year":"2012","journal-title":"Lancet"},{"key":"2023051706071825400_btaa778-B35","doi-asserted-by":"crossref","first-page":"108","DOI":"10.1002\/ajmg.b.32286","article-title":"Revisiting mendelian randomization studies of the effect of body mass index on depression","volume":"168","author":"Walter","year":"2015","journal-title":"Am. J. Med. Genet. B Neuropsychiatric Genet"},{"key":"2023051706071825400_btaa778-B36","volume-title":"Introductory econometrics: A modern approach. 
Chapter 15: Instrumental Variables Estimation and Two Stage Least Squares","author":"Wooldridge","year":"2009"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa778\/34484595\/btaa778.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/4\/531\/50359801\/btaa778.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/4\/531\/50359801\/btaa778.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,17]],"date-time":"2023-05-17T02:14:45Z","timestamp":1684289685000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/4\/531\/5904264"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,9,11]]},"references-count":36,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2021,5,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa778","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2019.12.18.881326","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,2,15]]},"published":{"date-parts":[[2020,9,11]]}}}