{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T12:20:18Z","timestamp":1772626818418,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2017,2,14]],"date-time":"2017-02-14T00:00:00Z","timestamp":1487030400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001711","name":"SNSF","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"crossref"}]},{"name":"COPDGene project"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Genetic heterogeneity is the phenomenon that distinct genetic variants may give rise to the same phenotype. The recently introduced algorithm Fast Automatic Interval Search (FAIS) enables the genome-wide search of candidate regions for genetic heterogeneity in the form of any contiguous sequence of variants, and achieves high computational efficiency and statistical power. Although FAIS can test all possible genomic regions for association with a phenotype, a key limitation is its inability to correct for confounders such as gender or population structure, which may lead to numerous false-positive associations.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We propose FastCMH, a method that overcomes this problem by properly accounting for categorical confounders, while still retaining statistical power and computational efficiency. Experiments comparing FastCMH with FAIS and multiple kinds of burden tests on simulated data, as well as on human and Arabidopsis samples, demonstrate that FastCMH can drastically reduce genomic inflation and discover associations that are missed by standard burden tests.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and Implementation<\/jats:title>\n                  <jats:p>An R package fastcmh is available on CRAN and the source code can be found at: https:\/\/www.bsse.ethz.ch\/mlcb\/research\/bioinformatics-and-computational-biology\/fastcmh.html<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx071","type":"journal-article","created":{"date-parts":[[2017,2,15]],"date-time":"2017-02-15T08:58:08Z","timestamp":1487149088000},"page":"1820-1828","source":"Crossref","is-referenced-by-count":14,"title":["Genome-wide genetic heterogeneity discovery with categorical covariates"],"prefix":"10.1093","volume":"33","author":[{"given":"Felipe","family":"Llinares-L\u00f3pez","sequence":"first","affiliation":[{"name":"Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland"}]},{"given":"Laetitia","family":"Papaxanthos","sequence":"additional","affiliation":[{"name":"Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland"}]},{"given":"Dean","family":"Bodenham","sequence":"additional","affiliation":[{"name":"Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland"}]},{"given":"Damian","family":"Roqueiro","sequence":"additional","affiliation":[{"name":"Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland"}]},{"name":"COPDGene Investigators","sequence":"additional","affiliation":[]},{"given":"Karsten","family":"Borgwardt","sequence":"additional","affiliation":[{"name":"Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland"},{"name":"SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland"}]}],"member":"286","published-online":{"date-parts":[[2017,2,14]]},"reference":[{"key":"2023020205324356300_btx071-B1","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1038\/nature08800","article-title":"Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines","volume":"465","author":"Atwell","year":"2010","journal-title":"Nature"},{"key":"2023020205324356300_btx071-B2","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1038\/nature12625","article-title":"The causes and consequences of genetic heterogeneity in cancer evolution","volume":"501","author":"Burrell","year":"2013","journal-title":"Nature"},{"key":"2023020205324356300_btx071-B3","doi-asserted-by":"crossref","first-page":"200","DOI":"10.1038\/ng.535","article-title":"Variants in FAM13A are associated with chronic obstructive pulmonary disease","volume":"42","author":"Cho","year":"2010","journal-title":"Nat. Genet"},{"key":"2023020205324356300_btx071-B4","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1016\/S2213-2600(14)70002-5","article-title":"Risk loci for chronic obstructive pulmonary disease: a genome-wide association study and meta-analysis","volume":"2","author":"Cho","year":"2014","journal-title":"Lancet Respir. Med"},{"key":"2023020205324356300_btx071-B5","doi-asserted-by":"crossref","first-page":"417","DOI":"10.2307\/3001616","article-title":"Some methods for strengthening the common chi2 tests","volume":"10","author":"Cochran","year":"1954","journal-title":"Biometrics"},{"key":"2023020205324356300_btx071-B6","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1111\/j.0006-341X.1999.00997.x","article-title":"Genomic control for association studies","volume":"55","author":"Devlin","year":"1999","journal-title":"Biometrics"},{"key":"2023020205324356300_btx071-B7","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1080\/01621459.1961.10482090","article-title":"Multiple comparisons among means","volume":"56","author":"Dunn","year":"1961","journal-title":"J. Am. Stat. Assoc"},{"key":"2023020205324356300_btx071-B8","doi-asserted-by":"crossref","first-page":"87","DOI":"10.2307\/2340521","article-title":"On the interpretation of \u03c72 from contingency tables, and the calculation of P","volume":"85","author":"Fisher","year":"1922","journal-title":"J. R. Stat. Soc"},{"key":"2023020205324356300_btx071-B9","article-title":"easygwas: A cloud-based platform for comparing the results of genome-wide association studies","author":"Grimm","year":"2016","journal-title":"Plant Cell"},{"key":"2023020205324356300_btx071-B10","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/j.ajhg.2014.06.009","article-title":"Rare-variant association analysis: study designs and statistical tests","volume":"95","author":"Lee","year":"2014","journal-title":"Am. J. Hum. Genet"},{"key":"2023020205324356300_btx071-B11","doi-asserted-by":"crossref","first-page":"1526","DOI":"10.1093\/bioinformatics\/btt177","article-title":"A powerful and efficient set test for genetic markers that handles confounders","volume":"29","author":"Listgarten","year":"2013","journal-title":"Bioinformatics"},{"key":"2023020205324356300_btx071-B12","doi-asserted-by":"crossref","first-page":"i240","DOI":"10.1093\/bioinformatics\/btv263","article-title":"Genome-wide detection of intervals of genetic heterogeneity associated with complex traits","volume":"31","author":"Llinares-L\u00f3pez","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020205324356300_btx071-B13","author":"Llinares-L\u00f3pez","year":"2015"},{"key":"2023020205324356300_btx071-B14","first-page":"719.","article-title":"Statistical aspects of the analysis of data from retrospective studies of disease","volume":"22","author":"Mantel","year":"1959","journal-title":"J Natl Cancer Inst"},{"key":"2023020205324356300_btx071-B15","doi-asserted-by":"crossref","first-page":"512","DOI":"10.1038\/ng1337","article-title":"The effects of human population structure on large genetic association studies","volume":"36","author":"Marchini","year":"2004","journal-title":"Nat. Genet"},{"key":"2023020205324356300_btx071-B16","author":"Minato","year":"2014"},{"key":"2023020205324356300_btx071-B17","first-page":"2279","author":"Papaxanthos","year":"2016"},{"key":"2023020205324356300_btx071-B18","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1080\/14786440009463897","article-title":"On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can reasonable be supposed to have arisen from random sampling","volume":"50","author":"Pearson","year":"1900","journal-title":"Philos. Mag"},{"key":"2023020205324356300_btx071-B19","doi-asserted-by":"crossref","first-page":"904","DOI":"10.1038\/ng1847","article-title":"Principal components analysis corrects for stratification in genome-wide association studies","volume":"38","author":"Price","year":"2006","journal-title":"Nat. Genet"},{"key":"2023020205324356300_btx071-B20","doi-asserted-by":"crossref","first-page":"32","DOI":"10.3109\/15412550903499522","article-title":"Genetic epidemiology of COPD (COPDGene) study design","volume":"7","author":"Regan","year":"2011","journal-title":"COPD"},{"key":"2023020205324356300_btx071-B21","doi-asserted-by":"crossref","first-page":"e3746.","DOI":"10.1371\/journal.pone.0003746","article-title":"The trouble with sliding windows and the selective pressure in brca1","volume":"3","author":"Schmid","year":"2008","journal-title":"PLoS One"},{"key":"2023020205324356300_btx071-B22","author":"Sugiyama","year":"2015"},{"key":"2023020205324356300_btx071-B23","doi-asserted-by":"crossref","first-page":"515","DOI":"10.2307\/2531456","article-title":"A modified Bonferroni method for discrete data","volume":"46","author":"Tarone","year":"1990","journal-title":"Biometrics"},{"key":"2023020205324356300_btx071-B24","doi-asserted-by":"crossref","first-page":"12996","DOI":"10.1073\/pnas.1302233110","article-title":"Statistical significance of combinatorial regulations","volume":"110","author":"Terada","year":"2013","journal-title":"Proc. Natl. Acad. Sci"},{"key":"2023020205324356300_btx071-B25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/nrg3382","article-title":"The nature of confounding in genome-wide association studies","volume":"14","author":"Vilhj\u00e1lmsson","year":"2013","journal-title":"Nat. Rev. Genet"},{"key":"2023020205324356300_btx071-B26","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1038\/nature05911","article-title":"Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls","volume":"447","author":"Wellcome Trust Case Control Consortium","year":"2007","journal-title":"Nature"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/12\/1820\/49040099\/bioinformatics_33_12_1820.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/12\/1820\/49040099\/bioinformatics_33_12_1820.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T05:37:12Z","timestamp":1675316232000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/12\/1820\/2995817"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,2,14]]},"references-count":26,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2017,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx071","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,6,15]]},"published":{"date-parts":[[2017,2,14]]}}}