{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T02:45:17Z","timestamp":1770345917640,"version":"3.49.0"},"reference-count":20,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Recent studies have shown that the patterns of linkage disequilibrium observed in human populations have a block-like structure, and a small subset of SNPs (called tag SNPs) is sufficient to distinguish each pair of haplotype patterns in the block. In reality, some tag SNPs may be missing, and we may fail to distinguish two distinct haplotypes due to the ambiguity caused by missing data.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We show there exists a subset of SNPs (referred to as robust tag SNPs) which can still distinguish all distinct haplotypes even when some SNPs are missing. The problem of finding minimum robust tag SNPs is shown to be NP-hard. To find robust tag SNPs efficiently, we propose two greedy algorithms and one linear programming relaxation algorithm. The experimental results indicate that (1) the solutions found by these algorithms are quite close to the optimal solution; (2) the genotyping cost saved by using tag SNPs can be as high as 80%; and (3) genotyping additional tag SNPs for tolerating missing data is still cost-effective.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>Genotyping robust tag SNPs is more practical than just genotyping the minimum tag SNPs if we can not avoid the occurrence of missing data. Our theoretical analysis and experimental results show that the performance of our algorithms is not only efficient but the solution found is also close to the optimal solution.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/1471-2105-6-263","type":"journal-article","created":{"date-parts":[[2005,11,3]],"date-time":"2005-11-03T07:13:55Z","timestamp":1131002035000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["Selecting additional tag SNPs for tolerating missing data in genotyping"],"prefix":"10.1186","volume":"6","author":[{"given":"Yao-Ting","family":"Huang","sequence":"first","affiliation":[]},{"given":"Kui","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Ting","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Kun-Mao","family":"Chao","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2005,11,1]]},"reference":[{"key":"588_CR1","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1145\/640075.640078","volume-title":"Proc RECOMB'03","author":"V Bafna","year":"2003","unstructured":"Bafna V, Halld\u00f3rsson BV, Schwartz R, Clark AG, Istrail S: Haplotypes and informative SNP selection algorithms: don't block out information. Proc RECOMB'03 2003, 19\u201327."},{"key":"588_CR2","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1086\/381000","volume":"74","author":"CS Carlson","year":"2004","unstructured":"Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA: Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet 2004, 74: 106\u2013120.","journal-title":"Am J Hum Genet"},{"key":"588_CR3","volume-title":"Introduction to algorithms","author":"TH Cormen","year":"2001","unstructured":"Cormen TH, Leiserson CE, Rivest RL, Stein C: Introduction to algorithms. The MIT Press; 2001."},{"issue":"2","key":"588_CR4","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1038\/ng1001-229","volume":"29","author":"MJ Daly","year":"2001","unstructured":"Daly MJ, Rioux JD, Schaffner SF, Hudson TJ, Lander ES: High-resolution haplotype structure in the human genome. Nat Genet 2001, 29(2):229\u2013232.","journal-title":"Nat Genet"},{"key":"588_CR5","doi-asserted-by":"publisher","first-page":"525","DOI":"10.1137\/S0036144502414942","volume":"44","author":"A Forsgren","year":"2002","unstructured":"Forsgren A, Gill PE, Wright MH: Interior methods for nonlinear optimization. SIAM Rev 2002, 44: 525\u2013597.","journal-title":"SIAM Rev"},{"key":"588_CR6","volume-title":"Computers and intractability","author":"MR Garey","year":"1979","unstructured":"Garey MR, Johnson DS: Computers and intractability. Freeman, New York; 1979."},{"key":"588_CR7","first-page":"1633","volume-title":"Genome Research","author":"BV Halld\u00f3rsson","year":"2004","unstructured":"Halld\u00f3rsson BV, Bafna V, Lippert R, Schwartz R, Vega FM, Clark AG, Istrail S: Optimal haplotype block-free selection of tagging SNPs for genome-wide association studies. Genome Research 2004, 1633\u20131640."},{"key":"588_CR8","volume-title":"Bioinformatics","author":"E Halperin","year":"2004","unstructured":"Halperin E, Eskin E: Haplotype reconstruction from genotype data using imperfect phylogeny. Bioinformatics 2004."},{"key":"588_CR9","doi-asserted-by":"publisher","first-page":"1072","DOI":"10.1126\/science.1105436","volume":"307","author":"DA Hinds","year":"2005","unstructured":"Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, Ballinger DG, Frazer KA, Cox DR: Whole-genome patterns of common DNA variation in three human populations. Science 2005, 307: 1072\u20131079.","journal-title":"Science"},{"key":"588_CR10","doi-asserted-by":"publisher","first-page":"337","DOI":"10.1093\/bioinformatics\/18.2.337","volume":"18","author":"RR Hudson","year":"2002","unstructured":"Hudson RR: Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 2002, 18: 337\u2013338.","journal-title":"Bioinformatics"},{"key":"588_CR11","unstructured":"LP Solve[http:\/\/www.cs.sunysb.edu\/~algorith\/implement\/lpsolve\/implement.shtml]"},{"key":"588_CR12","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1086\/338446","volume":"70","author":"T Niu","year":"2002","unstructured":"Niu T, Qin Z, Xu X, Liu JS: Bayesian haplotype inference for multiple linked single-nucleotide polymorphisms. Am J Hum Genet 2002, 70: 157\u2013159.","journal-title":"Am J Hum Genet"},{"key":"588_CR13","doi-asserted-by":"publisher","first-page":"1719","DOI":"10.1126\/science.1065573","volume":"294","author":"N Patil","year":"2001","unstructured":"Patil N, Berno AJ, Hinds DA, Barrett WA, Doshi JM, Hacker CR, Kautzer CR, Lee DH, Marjoribanks C, McDonough DP, Nguyen BT, Norris MC, Sheehan JB, Shen N, Stern D, Stokowski RP, Thomas DJ, Trulson MO, Vyas KR, Frazer KA, Fodor SP, Cox DR: Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science 2001, 294: 1719\u20131723.","journal-title":"Science"},{"key":"588_CR14","doi-asserted-by":"publisher","first-page":"1162","DOI":"10.1086\/379378","volume":"73","author":"M Stephens","year":"2003","unstructured":"Stephens M, Donnelly P: A comparison of bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet 2003, 73: 1162\u20131169.","journal-title":"Am J Hum Genet"},{"issue":"14","key":"588_CR15","doi-asserted-by":"publisher","first-page":"1773","DOI":"10.1093\/bioinformatics\/btg239","volume":"19","author":"L Wang","year":"2003","unstructured":"Wang L, Xu Y: Haplotype inference by maximum parsimony. Bioinformatics 2003, 19(14):1773\u20131780.","journal-title":"Bioinformatics"},{"issue":"12","key":"588_CR16","doi-asserted-by":"publisher","first-page":"7225","DOI":"10.1073\/pnas.1237858100","volume":"100","author":"Y Yang","year":"2003","unstructured":"Yang Y, Zhang J, Hoh J, Matsuda F, Xu P, Lathrop M, Ott J: Efficiency of single-nucleotide polymorphism haplotype estimation from pooled DNA. Proc Nat Acad Set 2003, 100(12):7225\u20137230.","journal-title":"Proc Nat Acad Set"},{"issue":"11","key":"588_CR17","doi-asserted-by":"publisher","first-page":"7335","DOI":"10.1073\/pnas.102186799","volume":"99","author":"K Zhang","year":"2002","unstructured":"Zhang K, Deng M, Chen T, Waterman MS, Sun F: A dynamic programming algorithm for haplotype partitioning. Proc Nat Acad Sci 2002, 99(11):7335\u20137339.","journal-title":"Proc Nat Acad Sci"},{"key":"588_CR18","doi-asserted-by":"publisher","first-page":"63","DOI":"10.1086\/376437","volume":"73","author":"K Zhang","year":"2003","unstructured":"Zhang K, Sun F, Waterman MS, Chen T: Haplotype block partition with limited resources and applications to human chromosome 21 haplotype data. Am J Hum Genet 2003, 73: 63\u201373.","journal-title":"Am J Hum Genet"},{"key":"588_CR19","doi-asserted-by":"publisher","first-page":"908","DOI":"10.1101\/gr.1837404","volume":"14","author":"K Zhang","year":"2004","unstructured":"Zhang K, Qin ZS, Liu JS, Chen T, Waterman MS, Sun F: Haplotype block partition and tag SNP selection using genotype data and their applications to association studies. Genome Research 2004, 14: 908\u2013916.","journal-title":"Genome Research"},{"key":"588_CR20","doi-asserted-by":"publisher","first-page":"1694","DOI":"10.1093\/bioinformatics\/18.12.1694","volume":"18","author":"JH Zhao","year":"2002","unstructured":"Zhao JH, Lissarrague S, Essioux L, Sham PC: GENECOUNTING: haplotype analysis with missing genotypes. Bioinformatics 2002, 18: 1694\u20131695.","journal-title":"Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-6-263.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T17:52:40Z","timestamp":1706809960000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-6-263"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,11,1]]},"references-count":20,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2005,12]]}},"alternative-id":["588"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-6-263","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,11,1]]},"assertion":[{"value":"26 May 2005","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 November 2005","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 November 2005","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"263"}}