{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,5,17]],"date-time":"2024-05-17T23:28:40Z","timestamp":1715988520893},"reference-count":16,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2006,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Single nucleotide polymorphisms (SNPs) are DNA sequence variations, occurring when a single nucleotide \u2013 adenine (A), thymine (T), cytosine (C) or guanine (G) \u2013 is altered. Arguably, SNPs account for more than 90% of human genetic variation. Our laboratory has developed a highly redundant SNP genotyping assay consisting of multiple probes with signals from multiple channels for a single SNP, based on arrayed primer extension (APEX). This mini-sequencing method is a powerful combination of a highly parallel microarray with distinctive Sanger-based dideoxy terminator sequencing chemistry. Using this microarray platform, our current genotype calling system (known as SNP Chart) is capable of calling single SNP genotypes by manual inspection of the APEX data, which is time-consuming and exposed to user subjectivity bias.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>Using a set of 32 Coriell DNA samples plus three negative PCR controls as a training data set, we have developed a fully-automated genotyping algorithm based on simple linear discriminant analysis (LDA) using dynamic variable selection. The algorithm combines separate analyses based on the multiple probe sets to give a final posterior probability for each candidate genotype. We have tested our algorithm on a completely independent data set of 270 DNA samples, with validated genotypes, from patients admitted to the intensive care unit (ICU) of St. Paul's Hospital (plus one negative PCR control sample). Our method achieves a concordance rate of 98.9% with a 99.6% call rate for a set of 96 SNPs. By adjusting the threshold value for the final posterior probability of the called genotype, the call rate reduces to 94.9% with a higher concordance rate of 99.6%. We also reversed the two independent data sets in their training and testing roles, achieving a concordance rate up to 99.8%.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>The strength of this APEX chemistry-based platform is its unique redundancy having multiple probes for a single SNP. Our model-based genotype calling algorithm captures the redundancy in the system considering all the underlying probe features of a particular SNP, automatically down-weighting any 'bad data' corresponding to image artifacts on the microarray slide or failure of a specific chemistry. In this regard, our method is able to automatically select the probes which work well and reduce the effect of other so-called bad performing probes in a sample-specific manner, for any number of SNPs.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-7-521","type":"journal-article","created":{"date-parts":[[2006,11,30]],"date-time":"2006-11-30T19:16:19Z","timestamp":1164914179000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Dynamic variable selection in SNP genotype autocalling from APEX microarray data"],"prefix":"10.1186","volume":"7","author":[{"given":"Mohua","family":"Podder","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"William J","family":"Welch","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ruben H","family":"Zamar","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Scott J","family":"Tebbutt","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2006,11,30]]},"reference":[{"key":"1260_CR1","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1101\/gr.4.6.357","volume":"4","author":"KJ Livak","year":"1995","unstructured":"Livak KJ, Flood SJ, Marmaro J, Giusti W, Deetz K: Oligonucleotides with fluorescent dyes at opposite ends provide a quenched probe system useful for detecting PCR product and nucleic acid hybridization. PCR Methods Appl 1995, 4: 357\u201362.","journal-title":"PCR Methods Appl"},{"key":"1260_CR2","doi-asserted-by":"publisher","first-page":"1233","DOI":"10.1038\/nbt869","volume":"21","author":"GC Kennedy","year":"2003","unstructured":"Kennedy GC, Matsuzaki H, Dong S, Liu WM, Huang J, Liu G, Su X, Cao M, Chen W, Zhang J, et al.: Large-scale genotyping of complex DNA. Nat Biotechnol 2003, 21: 1233\u20137. 10.1038\/nbt869","journal-title":"Nat Biotechnol"},{"key":"1260_CR3","doi-asserted-by":"crossref","unstructured":"Oliphant A, Barker DL, Stuelpnagel JR, Chee MS: BeadArray technology: enabling an accurate, cost-effective approach to high-throughput genotyping. Biotechniques 2002, (Suppl):56\u20138. 60\u20131 60\u20131","DOI":"10.2144\/jun0207"},{"issue":"9","key":"1260_CR4","doi-asserted-by":"publisher","first-page":"1958","DOI":"10.1093\/bioinformatics\/bti275","volume":"21","author":"Di Xiaojun","year":"2005","unstructured":"Xiaojun Di, Matsuzaki H, Webster TA, Hubbell E, Liu G, Dong S, Bartell D, Huang J, Chiles R, et al.: Dynamic model based algorithms for screening and genotyping over 100K SNPs on oligonucleotide microarrays. Bioinformatics 2005, 21(9):1958\u20131963. 10.1093\/bioinformatics\/bti275","journal-title":"Bioinformatics"},{"key":"1260_CR5","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1016\/j.mrfmmm.2004.07.022","volume":"573","author":"R Shen","year":"2005","unstructured":"Shen R, Fan JB, Campbell D, Chang W, Chen J, Doucet D, Yeakley J, Bibikova M, Garcia EW, McBride C, Steemers F, Garcia F, Kermani BG, Gunderson K, Oliphant A: High-throughput SNP genotyping on universal bead arrays. Mutation Research 2005, 573: 70\u201382.","journal-title":"Mutation Research"},{"key":"1260_CR6","doi-asserted-by":"crossref","first-page":"977","DOI":"10.2144\/04376RR02","volume":"37","author":"SJ Tebbutt","year":"2004","unstructured":"Tebbutt SJ, He JQ, Burkett KM, Ruan J, Opushnyev IV, Tripp BW, Zeznik JA, Abara CO, Nelson CC, Walley KR: Microarray genotyping resource to determine population stratification in genetic association studies of complex disease. Biotechniques 2004, 37: 977\u201385.","journal-title":"Biotechniques"},{"key":"1260_CR7","unstructured":"Website title[http:\/\/www.asperbio.com]"},{"issue":"1","key":"1260_CR8","first-page":"45","volume":"46","author":"S Kaminski","year":"2005","unstructured":"Kaminski S, Ahman A, Rusc A, Wojcik E, Malewski T: MilkProtChip \u2013 a microarray of SNPs in candidate genes associated with milk protein biosynthesis-development and validation. J Appl Genet 2005, 46(1):45\u201358.","journal-title":"J Appl Genet"},{"key":"1260_CR9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1089\/109065700316408","volume":"4","author":"A Kurg","year":"2000","unstructured":"Kurg A, Tonisson N, Georgiou I, Shumaker J, Tollett J, Metspalu A: Arrayed primer extension: solid-phase four-color DNA resequencing and mutation detection technology. Genet Test 2000, 4: 1\u20137. 10.1089\/109065700316408","journal-title":"Genet Test"},{"key":"1260_CR10","doi-asserted-by":"crossref","first-page":"2051","DOI":"10.1093\/clinchem\/48.11.2051","volume":"48","author":"F Gemignani","year":"2002","unstructured":"Gemignani F, Perra C, Landi S, Canzian F, Kurg A, Tonisson N, Galanello R, Cao A, Metspalu A, Romeo G: Reliable detection of beta-thalassemia and G6PD mutations by a DNA microarray. Clin Chem 2002, 48: 2051\u20134.","journal-title":"Clin Chem"},{"key":"1260_CR11","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1093\/bioinformatics\/bth470","volume":"21","author":"SJ Tebbutt","year":"2005","unstructured":"Tebbutt SJ, Opushnyev IV, Tripp BW, Kassamali AM, Alexander WL, Andersen MI: SNP Chart: an integrated platform for visualization and interpretation of microarray genotyping data. Bioinformatics 2005, 21: 124\u20137. 10.1093\/bioinformatics\/bth470","journal-title":"Bioinformatics"},{"key":"1260_CR12","unstructured":"Website title[http:\/\/coriell.umdnj.edu\/]"},{"key":"1260_CR13","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1111\/j.1469-1809.1936.tb02137.x","volume":"7","author":"RA Fisher","year":"1936","unstructured":"Fisher RA: The Use of Multiple Measures in Taxonomic Problems. Ann of Eugenics 1936, 7: 179\u2013188.","journal-title":"Ann of Eugenics"},{"key":"1260_CR14","first-page":"362","volume-title":"Principles of Data Mining","author":"D Hand","year":"2001","unstructured":"Hand D, Mannila H, Smyth P: Principles of Data Mining. The MIT Press, Cambridge, MA; 2001:362\u2013363. 359\u2013360"},{"key":"1260_CR15","doi-asserted-by":"publisher","first-page":"1147","DOI":"10.1093\/bioinformatics\/btl080","volume":"22","author":"DC Walley","year":"2006","unstructured":"Walley DC, Tripp BW, Song YC, Walley KR, Tebbutt SJ: MACGT: multidimensional automated clustering genotyping tool for analysis of microarray based mini-sequencing data. Bioinformatics 2006, 22: 1147\u20139. 10.1093\/bioinformatics\/btl080","journal-title":"Bioinformatics"},{"key":"1260_CR16","doi-asserted-by":"publisher","first-page":"126","DOI":"10.1007\/978-0-387-21606-5","volume-title":"The Elements of Statistical Learning","author":"T Hastie","year":"2001","unstructured":"Hastie T, Tibshirani R, Friedman J: The Elements of Statistical Learning. Springer, New York; 2001:126\u2013127. 84\u201395"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-7-521.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T11:04:27Z","timestamp":1630494267000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-7-521"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,11,30]]},"references-count":16,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,12]]}},"alternative-id":["1260"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-7-521","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2006,11,30]]},"assertion":[{"value":"6 July 2006","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 November 2006","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 November 2006","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"521"}}