{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T12:31:18Z","timestamp":1775219478811,"version":"3.50.1"},"reference-count":4,"publisher":"Oxford University Press (OUP)","issue":"1","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":1795,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/3.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2012,1,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Summary: High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections.<\/jats:p>\n               <jats:p>Availability: The algorithm is written in R and is freely available at www.well.ox.ac.uk\/chris-spencer<\/jats:p>\n               <jats:p>Contact: \u00a0chris.spencer@well.ox.ac.uk<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr599","type":"journal-article","created":{"date-parts":[[2011,11,5]],"date-time":"2011-11-05T00:36:36Z","timestamp":1320453396000},"page":"134-135","source":"Crossref","is-referenced-by-count":72,"title":["A robust clustering algorithm for identifying problematic samples in genome-wide association studies"],"prefix":"10.1093","volume":"28","author":[{"given":"C\u00e9line","family":"Bellenguez","sequence":"first","affiliation":[{"name":"1 Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN and 2Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK"}]},{"given":"Amy","family":"Strange","sequence":"additional","affiliation":[{"name":"1 Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN and 2Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK"}]},{"given":"Colin","family":"Freeman","sequence":"additional","affiliation":[{"name":"1 Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN and 2Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK"}]},{"name":"Wellcome Trust Case Control Consortium \u2020","sequence":"additional","affiliation":[]},{"given":"Peter","family":"Donnelly","sequence":"additional","affiliation":[{"name":"1 Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN and 2Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK"},{"name":"1 Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN and 2Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK"}]},{"given":"Chris C.A.","family":"Spencer","sequence":"additional","affiliation":[{"name":"1 Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN and 2Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK"}]}],"member":"286","published-online":{"date-parts":[[2011,11,3]]},"reference":[{"key":"2023061011444605900_B1","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1016\/S0167-9473(97)00011-X","article-title":"Maximum trimmed likelihood estimators: a unified approach, examples, and algorithms","volume":"25","author":"Hadi","year":"1997","journal-title":"Comput. Stat. Data Anal."},{"key":"2023061011444605900_B2","doi-asserted-by":"crossref","first-page":"985","DOI":"10.1038\/ng.694","article-title":"A genome-wide association study identifies new psoriasis susceptibility loci and an interaction between HLA-C and ERAP1","volume":"42","author":"Genetic Analysis of Psoriasis Consortium & the WTCCC2","year":"2010","journal-title":"Nat. Genet."},{"key":"2023061011444605900_B3","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1038\/nature10251","article-title":"Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis","volume":"476","author":"The International Multiple Sclerosis Genetics Consortium & the WTCCC2","year":"2011","journal-title":"Nature"},{"key":"2023061011444605900_B4","doi-asserted-by":"crossref","first-page":"1330","DOI":"10.1038\/ng.483","article-title":"Genome-wide association study of ulcerative colitis identifies three new susceptibility loci, including the HNF4A region","volume":"41","author":"The UK IBD Genetics Consortium & the WTCCC2","year":"2009","journal-title":"Nat. Genet."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/1\/134\/50568250\/bioinformatics_28_1_134.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/28\/1\/134\/50568250\/bioinformatics_28_1_134.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,10]],"date-time":"2023-06-10T07:46:10Z","timestamp":1686383170000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/28\/1\/134\/219463"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,11,3]]},"references-count":4,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2012,1,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr599","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2012,1,1]]},"published":{"date-parts":[[2011,11,3]]}}}