{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,23]],"date-time":"2025-10-23T11:08:17Z","timestamp":1761217697369},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Fast and accurate genotype imputation is necessary for facilitating gene-mapping studies, especially with the ever increasing numbers of both common and rare variants generated by high-throughput-sequencing experiments. However, most of the existing imputation approaches suffer from either inaccurate results or heavy computational demand.<\/jats:p>\n               <jats:p>Results: In this article, aiming to perform fast and accurate genotype-imputation analysis, we propose a novel, fast and yet accurate method to impute diploid genotypes. Specifically, we extend a hidden Markov model that is widely used to describe haplotype structures. But we model hidden states onto single reference haplotypes rather than onto pairs of haplotypes. Consequently the computational complexity is linear to size of reference haplotypes. We further develop an algorithm \u2018merge-and-recover (MAR)\u2019 to speed up the calculation. Working on compact representation of segmental reference haplotypes, the MAR algorithm always calculates an exact form of transition probabilities regardless of partition of segments. Both simulation studies and real-data analyses demonstrated that our proposed method was comparable to most of the existing popular methods in terms of imputation accuracy, but was much more efficient in terms of computation. The MAR algorithm can further speed up the calculation by several folds without loss of accuracy. The proposed method will be useful in large-scale imputation studies with a large number of reference subjects.<\/jats:p>\n               <jats:p>Availability: The implemented multi-threading software FISH is freely available for academic use at https:\/\/sites.google.com\/site\/lzhanghomepage\/FISH .<\/jats:p>\n               <jats:p>Contact: \u00a0zlbio12@gmail.com ; pyf0419@gmail.com ; hdeng2@tulane.edu<\/jats:p>\n               <jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu143","type":"journal-article","created":{"date-parts":[[2014,3,12]],"date-time":"2014-03-12T00:16:49Z","timestamp":1394583409000},"page":"1876-1883","source":"Crossref","is-referenced-by-count":22,"title":["FISH: fast and accurate diploid genotype imputation via segmental hidden Markov model"],"prefix":"10.1093","volume":"30","author":[{"given":"Lei","family":"Zhang","sequence":"first","affiliation":[{"name":"1 School of Public Health, Xi'an Jiaotong University, Shaanxi, China, 2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, USA and 3 Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China"},{"name":"1 School of Public Health, Xi'an Jiaotong University, Shaanxi, China, 2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, USA and 3 Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China"}]},{"given":"Yu-Fang","family":"Pei","sequence":"additional","affiliation":[{"name":"1 School of Public Health, Xi'an Jiaotong University, Shaanxi, China, 2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, USA and 3 Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China"},{"name":"1 School of Public Health, Xi'an Jiaotong University, Shaanxi, China, 2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, USA and 3 Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China"}]},{"given":"Xiaoying","family":"Fu","sequence":"additional","affiliation":[{"name":"1 School of Public Health, Xi'an Jiaotong University, Shaanxi, China, 2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, USA and 3 Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China"}]},{"given":"Yong","family":"Lin","sequence":"additional","affiliation":[{"name":"1 School of Public Health, Xi'an Jiaotong University, Shaanxi, China, 2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, USA and 3 Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China"}]},{"given":"Yu-Ping","family":"Wang","sequence":"additional","affiliation":[{"name":"1 School of Public Health, Xi'an Jiaotong University, Shaanxi, China, 2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, USA and 3 Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China"}]},{"given":"Hong-Wen","family":"Deng","sequence":"additional","affiliation":[{"name":"1 School of Public Health, Xi'an Jiaotong University, Shaanxi, China, 2 Department of Biostatistics and Bioinformatics, Tulane University, New Orleans, USA and 3 Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai, China"}]}],"member":"286","published-online":{"date-parts":[[2014,3,10]]},"reference":[{"key":"2023012711141731400_btu143-B1","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature11632","article-title":"An integrated map of genetic variation from 1,092 human genomes","volume":"491","author":"Abecasis","year":"2012","journal-title":"Nature"},{"key":"2023012711141731400_btu143-B2","doi-asserted-by":"crossref","first-page":"1084","DOI":"10.1086\/521987","article-title":"Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering","volume":"81","author":"Browning","year":"2007","journal-title":"Am. J. Hum. Genet."},{"key":"2023012711141731400_btu143-B3","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1101\/gr.145821.112","article-title":"Genotype imputation via matrix completion","volume":"23","author":"Chi","year":"2013","journal-title":"Genome Res"},{"key":"2023012711141731400_btu143-B4","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1038\/nmeth.1785","article-title":"A linear complexity phasing method for thousands of genomes","volume":"9","author":"Delaneau","year":"2012","journal-title":"Nat. Methods"},{"key":"2023012711141731400_btu143-B5","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1038\/nmeth.2307","article-title":"Improved whole-chromosome phasing for disease and population genetic studies","volume":"10","author":"Delaneau","year":"2013","journal-title":"Nat. Methods"},{"key":"2023012711141731400_btu143-B6","doi-asserted-by":"crossref","first-page":"2744","DOI":"10.1093\/bioinformatics\/btt477","article-title":"Imputation of coding variants in African Americans: better performance using data from the exome sequencing project","volume":"29","author":"Duan","year":"2013","journal-title":"Bioinformatics"},{"key":"2023012711141731400_btu143-B7","doi-asserted-by":"crossref","first-page":"528","DOI":"10.1093\/bioinformatics\/bts724","article-title":"A comprehensive SNP and indel imputability database","volume":"29","author":"Duan","year":"2013","journal-title":"Bioinformatics"},{"key":"2023012711141731400_btu143-B8","doi-asserted-by":"crossref","first-page":"955","DOI":"10.1038\/ng.2354","article-title":"Fast and accurate genotype imputation in genome-wide association studies through pre-phasing","volume":"44","author":"Howie","year":"2012","journal-title":"Nat. Genet."},{"key":"2023012711141731400_btu143-B9","doi-asserted-by":"crossref","first-page":"e1000529","DOI":"10.1371\/journal.pgen.1000529","article-title":"A flexible and accurate genotype imputation method for the next generation of genome-wide association studies","volume":"5","author":"Howie","year":"2009","journal-title":"PLoS Genet."},{"key":"2023012711141731400_btu143-B10","doi-asserted-by":"crossref","first-page":"2213","DOI":"10.1093\/genetics\/165.4.2213","article-title":"Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data","volume":"165","author":"Li","year":"2003","journal-title":"Genetics"},{"key":"2023012711141731400_btu143-B11","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1146\/annurev.genom.9.081307.164242","article-title":"Genotype imputation","volume":"10","author":"Li","year":"2009","journal-title":"Ann. Rev. Genom. Hum. Genet."},{"key":"2023012711141731400_btu143-B12","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1002\/gepi.20533","article-title":"MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes","volume":"34","author":"Li","year":"2010","journal-title":"Genet. Epidemiol."},{"key":"2023012711141731400_btu143-B13","doi-asserted-by":"crossref","first-page":"1129","DOI":"10.1086\/344347","article-title":"Haplotype inference in random population samples","volume":"71","author":"Lin","year":"2002","journal-title":"Am. J. Hum. Genet."},{"key":"2023012711141731400_btu143-B14","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1002\/gepi.21690","article-title":"MaCH-admix: genotype imputation for admixed populations","volume":"37","author":"Liu","year":"2013","journal-title":"Genet. Epidemiol."},{"key":"2023012711141731400_btu143-B15","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1038\/nrg2796","article-title":"Genotype imputation for genome-wide association studies","volume":"11","author":"Marchini","year":"2010","journal-title":"Nat. Rev. Genet."},{"key":"2023012711141731400_btu143-B16","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1038\/ng2088","article-title":"A new multipoint method for genome-wide association studies by imputation of genotypes","volume":"39","author":"Marchini","year":"2007","journal-title":"Nat. Genet."},{"key":"2023012711141731400_btu143-B17","article-title":"Fast and accurate 1000 Genomes imputation using summary statistics or low-coverage sequencing data","volume-title":"The 62nd American Society of Human Genetics","author":"Pasaniuc","year":"2012"},{"key":"2023012711141731400_btu143-B18","doi-asserted-by":"crossref","first-page":"e3551","DOI":"10.1371\/journal.pone.0003551","article-title":"Analyses and comparison of accuracy of different genotype imputation methods","volume":"3","author":"Pei","year":"2008","journal-title":"PLoS One"},{"key":"2023012711141731400_btu143-B19","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1086\/519795","article-title":"PLINK: a tool set for whole-genome association and population-based linkage analyses","volume":"81","author":"Purcell","year":"2007","journal-title":"Am. J. Hum. Genet."},{"key":"2023012711141731400_btu143-B20","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1109\/5.18626","article-title":"A tutorial on hidden Markov models and selected applications in speech recognition","volume":"77","author":"Rabiner","year":"1989","journal-title":"Proc. IEEE"},{"key":"2023012711141731400_btu143-B21","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1038\/nmeth.2308","article-title":"High-resolution whole-genome haplotyping using limited seed data","volume":"10","author":"Rao","year":"2013","journal-title":"Nat. Methods"},{"key":"2023012711141731400_btu143-B22","doi-asserted-by":"crossref","first-page":"199","DOI":"10.1038\/35075590","article-title":"Linkage disequilibrium in the human genome","volume":"411","author":"Reich","year":"2001","journal-title":"Nature"},{"key":"2023012711141731400_btu143-B23","doi-asserted-by":"crossref","first-page":"1576","DOI":"10.1101\/gr.3709305","article-title":"Calibrating a coalescent simulation of human genome sequence variation","volume":"15","author":"Schaffner","year":"2005","journal-title":"Genome Res."},{"key":"2023012711141731400_btu143-B24","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1086\/502802","article-title":"A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase","volume":"78","author":"Scheet","year":"2006","journal-title":"Am. J. Hum. Genet."},{"key":"2023012711141731400_btu143-B25","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1086\/428594","article-title":"Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation","volume":"76","author":"Stephens","year":"2005","journal-title":"Am. J. Hum. Genet."},{"key":"2023012711141731400_btu143-B26","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1016\/j.ajhg.2012.06.013","article-title":"Phasing of many thousands of genotyped samples","volume":"91","author":"Williams","year":"2012","journal-title":"Am. J. Hum. Genet."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/13\/1876\/48923735\/bioinformatics_30_13_1876.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/13\/1876\/48923735\/bioinformatics_30_13_1876.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T11:51:51Z","timestamp":1674820311000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/13\/1876\/2422276"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,3,10]]},"references-count":26,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2014,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu143","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,7,1]]},"published":{"date-parts":[[2014,3,10]]}}}