{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T00:55:45Z","timestamp":1775004945498,"version":"3.50.1"},"reference-count":17,"publisher":"Oxford University Press (OUP)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Typical high-throughput genotyping techniques produce numerous missing calls that confound subsequent analyses, such as disease association studies. Common remedies for this problem include removing affected markers and\/or samples or, otherwise, imputing the missing data. On small marker sets imputation is frequently based on a vote of the K-nearest-neighbor (KNN) haplotypes, but this technique is neither practical nor justifiable for large datasets.<\/jats:p>\n               <jats:p>Results: We describe a data structure that supports efficient KNN queries over arbitrarily sized, sliding haplotype windows, and evaluate its use for genotype imputation. The performance of our method enables exhaustive exploration over all window sizes and known sites in large (150K, 8.3M) SNP panels. We also compare the accuracy and performance of our methods with competing imputation approaches.<\/jats:p>\n               <jats:p>Availability: A free open source software package, NPUTE, is available at http:\/\/compgen.unc.edu\/software, for non-commercial uses.<\/jats:p>\n               <jats:p>Contact: mcmillan@cs.unc.edu<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm220","type":"journal-article","created":{"date-parts":[[2007,7,23]],"date-time":"2007-07-23T16:13:46Z","timestamp":1185207226000},"page":"i401-i407","source":"Crossref","is-referenced-by-count":78,"title":["Inferring missing genotypes in large SNP panels using fast nearest-neighbor searches over sliding windows"],"prefix":"10.1093","volume":"23","author":[{"given":"Adam","family":"Roberts","sequence":"first","affiliation":[{"name":"1 Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, 2Constella Group, Durham, NC 27713, 3Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC 27599 and 4Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA"}]},{"given":"Leonard","family":"McMillan","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, 2Constella Group, Durham, NC 27713, 3Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC 27599 and 4Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA"}]},{"given":"Wei","family":"Wang","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, 2Constella Group, Durham, NC 27713, 3Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC 27599 and 4Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA"}]},{"given":"Joel","family":"Parker","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, 2Constella Group, Durham, NC 27713, 3Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC 27599 and 4Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA"}]},{"given":"Ivan","family":"Rusyn","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, 2Constella Group, Durham, NC 27713, 3Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC 27599 and 4Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA"}]},{"given":"David","family":"Threadgill","sequence":"additional","affiliation":[{"name":"1 Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, 2Constella Group, Durham, NC 27713, 3Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC 27599 and 4Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA"}]}],"member":"286","published-online":{"date-parts":[[2007,7,1]]},"reference":[{"key":"2023062708515280500_B1","doi-asserted-by":"crossref","first-page":"690","DOI":"10.1002\/gepi.20180","article-title":"Imputation methods to improve inference in SNP association studies","volume":"30","author":"Dai","year":"2006","journal-title":"Genet. Epidemiol"},{"key":"2023062708515280500_B2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1142\/S0219720003000174","article-title":"Efficient reconstruction of haplotype structure via perfect phylogeny","volume":"1","author":"Eskin","year":"2003","journal-title":"J. Bioinform. Comput. Biol"},{"key":"2023062708515280500_B3","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1186\/1471-2164-6-149","article-title":"SNiPer: improved SNP genotype calling for Affymetrix 10K GeneChip microarray data","volume":"6","author":"Huentelman","year":"2005","journal-title":"BMC Genomics"},{"key":"2023062708515280500_B4","first-page":"116","article-title":"Tradeoff between no-call reduction in genotyping error rate and loss of sample size for genetic case\/control association studies","volume":"9","author":"Kang","year":"2004","journal-title":"Pac. Symp. Biocomput"},{"key":"2023062708515280500_B5","doi-asserted-by":"crossref","first-page":"1129","DOI":"10.1086\/344347","article-title":"Haplotype inference in random population samples","volume":"71","author":"Lin","year":"2002","journal-title":"Am. J. Hum. Genet"},{"key":"2023062708515280500_B6","doi-asserted-by":"crossref","first-page":"437","DOI":"10.1086\/500808","article-title":"A comparison of phasing algorithms for trios and unrelated individuals","volume":"78","author":"Marchini","year":"2006","journal-title":"Am. J. Hum. Genet"},{"key":"2023062708515280500_B7","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1086\/338446","article-title":"Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms","volume":"70","author":"Niu","year":"2002","journal-title":"Am. J. Hum. Genet"},{"key":"2023062708515280500_B8","doi-asserted-by":"crossref","first-page":"1242","DOI":"10.1086\/344207","article-title":"Partition-ligation-expectation maximization algorithm for haplotype inference with single nucleotide polymorphisms","volume":"71","author":"Qin","year":"2002","journal-title":"Am. J. Hum. Genet"},{"key":"2023062708515280500_B9","doi-asserted-by":"crossref","first-page":"538","DOI":"10.1080\/01621459.1977.10480610","article-title":"Formalizing subjective notions about the effect of nonrespondents in sample surveys","volume":"72","author":"Rubin","year":"1977","journal-title":"J. Am. Stat. Assoc"},{"key":"2023062708515280500_B10","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1086\/502802","article-title":"A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase","volume":"78","author":"Scheet","year":"2006","journal-title":"Am. J. Hum. Genet"},{"key":"2023062708515280500_B11","doi-asserted-by":"crossref","first-page":"978","DOI":"10.1086\/319501","article-title":"A new statistical method for haplotype reconstruction from population data","volume":"68","author":"Stephens","year":"2001","journal-title":"Am. J. Hum. Genet"},{"key":"2023062708515280500_B12","doi-asserted-by":"crossref","first-page":"2001","DOI":"10.1093\/bioinformatics\/bti261","article-title":"Inference of missing SNPs and information quantity measurements for haplotype blocks","volume":"21","author":"Su","year":"2005","journal-title":"Bioinformatics"},{"key":"2023062708515280500_B13","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1007\/s00335-001-4001-y","article-title":"Genetic dissection of complex and quantitative traits: from fantasy to reality via a community effort","volume":"13","author":"Threadgill","year":"2002","journal-title":"Mamm. Genome"},{"key":"2023062708515280500_B14","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1093\/bioinformatics\/17.6.520","article-title":"Missing value estimation methods for DNA microarrays","volume":"17","author":"Troyanskaya","year":"2001","journal-title":"Bioinformatics"},{"key":"2023062708515280500_B15","doi-asserted-by":"crossref","first-page":"1175","DOI":"10.1038\/ng1666","article-title":"Genetic variation in laboratory mice","volume":"37","author":"Wade","year":"2005","journal-title":"Nat. Genet"},{"key":"2023062708515280500_B16","article-title":"Quantification and visualization of LD patterns and identification of haplotype blocks (2004)","volume-title":"U.C. Berkeley Division of Biostatistics Working Paper Series","author":"Wang","year":"2004"},{"key":"2023062708515280500_B17","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/1471-2105-6-S2-S4","article-title":"Decision forest analysis of 61 single nucleotide polymorphisms in a case-control study of esophageal cancer; a novel method","volume":"6","author":"Xie","year":"2005","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/13\/i401\/50717637\/bioinformatics_23_13_i401.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/13\/i401\/50717637\/bioinformatics_23_13_i401.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T08:58:27Z","timestamp":1687856307000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/13\/i401\/236706"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,7,1]]},"references-count":17,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2007,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm220","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,7]]},"published":{"date-parts":[[2007,7,1]]}}}