{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T09:00:59Z","timestamp":1770541259861,"version":"3.49.0"},"reference-count":36,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,4,7]],"date-time":"2023-04-07T00:00:00Z","timestamp":1680825600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,4,7]],"date-time":"2023-04-07T00:00:00Z","timestamp":1680825600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Innovation Technology Fund","award":["ITS\/060\/18"],"award-info":[{"award-number":["ITS\/060\/18"]}]},{"name":"Research Grants Council of the HKSAR","award":["Theme-based Research Scheme T12-710\/16-R"],"award-info":[{"award-number":["Theme-based Research Scheme T12-710\/16-R"]}]},{"name":"Mainland-Hong Kong Joint Funding Scheme","award":["MHP\/033\/20"],"award-info":[{"award-number":["MHP\/033\/20"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>For detecting genotype-phenotype association from case\u2013control single nucleotide polymorphism (SNP) data, one class of methods relies on testing each genomic variant site individually. However, this approach ignores the tendency for associated variant sites to be spatially clustered instead of uniformly distributed along the genome. Therefore, a more recent class of methods looks for blocks of influential variant sites. Unfortunately, existing such methods either assume prior knowledge of the blocks, or rely on ad hoc moving windows. 
A principled method is needed to automatically detect genomic variant blocks which are associated with the phenotype.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>In this paper, we introduce an automatic block-wise Genome-Wide Association Study (GWAS) method based on a hidden Markov model. Using case\u2013control SNP data as input, our method detects the number of blocks associated with the phenotype and the locations of the blocks. Correspondingly, the minor allele of each variant site will be classified as having negative influence, no influence or positive influence on the phenotype. We evaluated our method using both datasets simulated from our model and datasets from a block model different from ours, and compared the performance with other methods. These included both simple methods based on Fisher\u2019s exact test, applied site-by-site, as well as more complex methods built into the recent Zoom-Focus Algorithm. Across all simulations, our method consistently outperformed the comparisons.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>With its demonstrated better performance, we expect that our algorithm for detecting influential variant sites may help find more accurate signals across a wide range of case\u2013control GWAS.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12859-023-05265-5","type":"journal-article","created":{"date-parts":[[2023,4,7]],"date-time":"2023-04-07T16:02:58Z","timestamp":1680883378000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Automatic block-wise genotype-phenotype association detection based on hidden Markov 
model"],"prefix":"10.1186","volume":"24","author":[{"given":"Jin","family":"Du","sequence":"first","affiliation":[]},{"given":"Chaojie","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Lijun","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Shanjun","family":"Mao","sequence":"additional","affiliation":[]},{"given":"Bencong","family":"Zhu","sequence":"additional","affiliation":[]},{"given":"Zheng","family":"Li","sequence":"additional","affiliation":[]},{"given":"Xiaodan","family":"Fan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,4,7]]},"reference":[{"issue":"1","key":"5265_CR1","doi-asserted-by":"crossref","first-page":"289","DOI":"10.1111\/j.2517-6161.1995.tb02031.x","volume":"57","author":"Y Benjamini","year":"1995","unstructured":"Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc: Ser B (Methodol). 1995;57(1):289\u2013300.","journal-title":"J R Stat Soc: Ser B (Methodol)"},{"issue":"5","key":"5265_CR2","doi-asserted-by":"crossref","first-page":"393","DOI":"10.6026\/97320630016393","volume":"16","author":"X Cao","year":"2020","unstructured":"Cao X, Xing L, et al. Views on GWAS statistical analysis. Bioinformation. 2020;16(5):393\u20137.","journal-title":"Bioinformation"},{"issue":"5","key":"5265_CR3","doi-asserted-by":"publisher","first-page":"425","DOI":"10.1006\/pmed.1995.1069","volume":"24","author":"MC Constanza","year":"1995","unstructured":"Constanza MC. Matching. Prev Med. 1995;24(5):425\u201333.","journal-title":"Prev Med"},{"issue":"1","key":"5265_CR4","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1093\/oxfordjournals.molbev.a025575","volume":"13","author":"J Felsenstein","year":"1996","unstructured":"Felsenstein J, Churchill GA. A Hidden Markov Model approach to variation among sites in rate of evolution. Mol Biol Evol. 
1996;13(1):93\u2013104.","journal-title":"Mol Biol Evol"},{"issue":"5576","key":"5265_CR5","doi-asserted-by":"publisher","first-page":"2225","DOI":"10.1126\/science.1069424","volume":"296","author":"SB Gabriel","year":"2002","unstructured":"Gabriel SB, Schaffner SF, et al. The structure of haplotype blocks in the human genome. Science. 2002;296(5576):2225\u20139.","journal-title":"Science"},{"issue":"2","key":"5265_CR6","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1089\/cmb.1997.4.127","volume":"4","author":"J Henderson","year":"1997","unstructured":"Henderson J, Salzberg S, et al. Finding genes in DNA with a hidden Markov model. J Comput Biol. 1997;4(2):127\u201341.","journal-title":"J Comput Biol"},{"key":"5265_CR7","doi-asserted-by":"publisher","first-page":"117863101772117","DOI":"10.1177\/1178631017721178","volume":"10","author":"KHM Kuo","year":"2017","unstructured":"Kuo KHM. Multiple testing in the context of gene discovery in sickle cell disease using genome-wide association studies. Genomics Insights. 2017;10:1178631017721178.","journal-title":"Genomics Insights"},{"key":"5265_CR8","doi-asserted-by":"publisher","first-page":"16021","DOI":"10.1038\/ncomms16021","volume":"8","author":"CD Langefeld","year":"2017","unstructured":"Langefeld CD, Ainsworth HC, et al. Transancestral mapping and genetic load in systemic lupus erythematosus. Nat Commun. 2017;8:16021.","journal-title":"Nat Commun"},{"issue":"4","key":"5265_CR9","doi-asserted-by":"publisher","first-page":"762","DOI":"10.1093\/biostatistics\/kxs014","volume":"13","author":"S Lee","year":"2012","unstructured":"Lee S, Wu MC, et al. Optimal tests for rare variant effects in sequencing association studies. Biostatistics. 2012;13(4):762\u201375.","journal-title":"Biostatistics"},{"issue":"28","key":"5265_CR10","first-page":"57","volume":"11","author":"S Lewallen","year":"1998","unstructured":"Lewallen S, Courtright P. Epidemiology in practice: case-control studies. Community Eye Health. 
1998;11(28):57\u20138.","journal-title":"Community Eye Health"},{"issue":"3","key":"5265_CR11","doi-asserted-by":"publisher","first-page":"311","DOI":"10.1016\/j.ajhg.2008.06.024","volume":"83","author":"B Li","year":"2008","unstructured":"Li B, Leal SM. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am J Hum Genet. 2008;83(3):311\u201321.","journal-title":"Am J Hum Genet"},{"issue":"2","key":"5265_CR12","doi-asserted-by":"publisher","first-page":"517","DOI":"10.1109\/78.823977","volume":"48","author":"J Li","year":"2000","unstructured":"Li J, Najmi A, et al. Image classification by a two-dimensional hidden Markov model. IEEE Trans Signal Process. 2000;48(2):517\u201333.","journal-title":"IEEE Trans Signal Process"},{"issue":"2","key":"5265_CR13","volume":"11","author":"J Lin","year":"2018","unstructured":"Lin J, Musunuru K. From genotype to phenotype: a primer on the functional follow-up of genome-wide association studies in cardiovascular disease. Circ: Genomic Precis Med. 2018;11(2): e001946.","journal-title":"Circ: Genomic Precis Med"},{"key":"5265_CR14","doi-asserted-by":"publisher","first-page":"1091","DOI":"10.3389\/fgene.2019.01091","volume":"10","author":"Y Liu","year":"2019","unstructured":"Liu Y, Wang D, et al. Phenotype prediction and genome-wide association study using deep convolutional neural network of soybean. Front Genet. 2019;10:1091.","journal-title":"Front Genet"},{"key":"5265_CR15","doi-asserted-by":"publisher","first-page":"549","DOI":"10.1038\/nrg3523","volume":"14","author":"TA Manolio","year":"2013","unstructured":"Manolio TA. Bringing genome-wide association findings into clinical use. Nat Rev Genet. 2013;14:549\u201358.","journal-title":"Nat Rev Genet"},{"key":"5265_CR16","doi-asserted-by":"publisher","first-page":"793","DOI":"10.1007\/s10044-015-0508-9","volume":"19","author":"A Mesa","year":"2016","unstructured":"Mesa A, Basterrech S, et al. 
Hidden Markov models for gene sequence classification. Pattern Anal Appl. 2016;19:793\u2013805.","journal-title":"Pattern Anal Appl"},{"issue":"4","key":"5265_CR17","doi-asserted-by":"publisher","first-page":"373","DOI":"10.1038\/ng.3242","volume":"47","author":"K Michailidou","year":"2015","unstructured":"Michailidou K, Beesley J, et al. Genome-wide association analysis of more than 120,000 individuals identifies 15 new susceptibility loci for breast cancer. Nat Genet. 2015;47(4):373\u201380.","journal-title":"Nat Genet"},{"key":"5265_CR18","doi-asserted-by":"publisher","first-page":"92","DOI":"10.1038\/nature24284","volume":"551","author":"K Michailidou","year":"2017","unstructured":"Michailidou K, Lindstr\u00f6m S, et al. Association analysis identifies 65 new breast cancer risk loci. Nature. 2017;551:92\u20134.","journal-title":"Nature"},{"key":"5265_CR19","doi-asserted-by":"publisher","first-page":"1385","DOI":"10.1038\/ng.3913","volume":"49","author":"CP Nelson","year":"2017","unstructured":"Nelson CP, Goel A, et al. Association analyses based on false discovery rate implicate new loci for coronary artery disease. Nat Genet. 2017;49:1385\u201391.","journal-title":"Nat Genet"},{"key":"5265_CR20","unstructured":"Noland K, Sandler M. Key estimation using a hidden Markov model. In: Proceedings of ISMIR 2006: 7th international conference on music information retrieval (2006)."},{"issue":"5","key":"5265_CR21","doi-asserted-by":"publisher","first-page":"680","DOI":"10.1038\/ng.3826","volume":"49","author":"CM Phelan","year":"2017","unstructured":"Phelan CM, Kuchenbaecker KB, et al. Identification of 12 new susceptibility loci for different histotypes of epithelial ovarian cancer. Nat Genet. 2017;49(5):680\u201391.","journal-title":"Nat Genet"},{"issue":"2","key":"5265_CR22","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1109\/5.18626","volume":"77","author":"LR Rabiner","year":"1989","unstructured":"Rabiner LR. 
A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE. 1989;77(2):257\u201386.","journal-title":"Proc IEEE"},{"issue":"11","key":"5265_CR23","doi-asserted-by":"publisher","first-page":"2888","DOI":"10.2337\/db16-1253","volume":"66","author":"RA Scott","year":"2017","unstructured":"Scott RA, Scott LJ, et al. An expanded genome-wide association study of type 2 diabetes in Europeans. Diabetes. 2017;66(11):2888\u2013902.","journal-title":"Diabetes"},{"issue":"6","key":"5265_CR24","first-page":"1","volume":"9","author":"P Sebastiani","year":"2008","unstructured":"Sebastiani P, Zaho Z, et al. A hierarchical and modular approach to the discovery of robust associations in genome-wide association studies from pooled DNA samples. BMC Genomic Data. 2008;9(6):1\u201314.","journal-title":"BMC Genomic Data"},{"issue":"6","key":"5265_CR25","doi-asserted-by":"publisher","first-page":"561","DOI":"10.1002\/gepi.21649","volume":"36","author":"Q Sha","year":"2012","unstructured":"Sha Q, Wang X, et al. Detecting association of rare and common variants by testing an optimally weighted combination of variants. Genet Epidemiol. 2012;36(6):561\u201371.","journal-title":"Genet Epidemiol"},{"key":"5265_CR26","doi-asserted-by":"publisher","DOI":"10.7717\/peerj.127","volume":"1","author":"A Skewes","year":"2013","unstructured":"Skewes A, Welch R. A Markovian analysis of bacterial genome sequence constraints. PeerJ. 2013;1: e127.","journal-title":"PeerJ"},{"issue":"7","key":"5265_CR27","doi-asserted-by":"publisher","first-page":"591","DOI":"10.1002\/gepi.22000","volume":"40","author":"R Sun","year":"2016","unstructured":"Sun R, Weng H, et al. A W-test collapsing method for rare-variant association testing in exome sequencing data. Genet Epidemiol. 
2016;40(7):591\u20136.","journal-title":"Genet Epidemiol"},{"key":"5265_CR28","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1038\/s43586-021-00056-9","volume":"1","author":"E Uffelmann","year":"2021","unstructured":"Uffelmann E, Huang QQ, et al. Genome-wide association studies. Nat Rev Methods Prim. 2021;1:59.","journal-title":"Nat Rev Methods Prim"},{"issue":"5","key":"5265_CR29","doi-asserted-by":"publisher","first-page":"1861","DOI":"10.1016\/j.csda.2008.07.002","volume":"53","author":"N Usotskaya","year":"2009","unstructured":"Usotskaya N, Ryabko B. Applications of information-theoretic tests for analysis of DNA sequences based on Markov chain models. Comput Stat Data Anal. 2009;53(5):1861\u201372.","journal-title":"Comput Stat Data Anal"},{"issue":"2","key":"5265_CR30","doi-asserted-by":"publisher","first-page":"260","DOI":"10.1109\/TIT.1967.1054010","volume":"13","author":"A Viterbi","year":"1967","unstructured":"Viterbi A. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans Inf Theory. 1967;13(2):260\u20139.","journal-title":"IEEE Trans Inf Theory"},{"issue":"15","key":"5265_CR31","doi-asserted-by":"publisher","first-page":"2330","DOI":"10.1093\/bioinformatics\/btx130","volume":"33","author":"M Wang","year":"2017","unstructured":"Wang M, Weng H, et al. A Zoom-Focus algorithm (ZFA) to locate the optimal testing region for rare variant association tests. Bioinformatics. 2017;33(15):2330\u20136.","journal-title":"Bioinformatics"},{"issue":"1","key":"5265_CR32","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1016\/j.ajhg.2011.05.029","volume":"89","author":"MC Wu","year":"2011","unstructured":"Wu MC, Lee S, et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet. 
2011;89(1):82\u201393.","journal-title":"Am J Hum Genet"},{"issue":"12","key":"5265_CR33","doi-asserted-by":"publisher","first-page":"768","DOI":"10.15252\/msb.20145654","volume":"10","author":"B Zacher","year":"2014","unstructured":"Zacher B, Lidschreiber M, et al. Annotation of genomics data using bidirectional hidden Markov models unveils variations in Pol II transcription cycle. Mol Syst Biol. 2014;10(12):768.","journal-title":"Mol Syst Biol"},{"issue":"3","key":"5265_CR34","doi-asserted-by":"publisher","first-page":"554","DOI":"10.3390\/genes13030554","volume":"13","author":"M Zakarczemny","year":"2022","unstructured":"Zakarczemny M, Zajecka M. Note on DNA analysis and redesigning using Markov chain. Genes. 2022;13(3):554.","journal-title":"Genes"},{"key":"5265_CR35","doi-asserted-by":"publisher","first-page":"549","DOI":"10.1038\/s41588-018-0079-y","volume":"50","author":"E Zengini","year":"2018","unstructured":"Zengini E, Hatzikotoulas K, et al. Genome-wide analyses using UK Biobank data provide insights into the genetic architecture of osteoarthritis. Nat Genet. 2018;50:549\u201358.","journal-title":"Nat Genet"},{"issue":"8","key":"5265_CR36","doi-asserted-by":"publisher","first-page":"1917","DOI":"10.1029\/91WR01403","volume":"27","author":"W Zucchini","year":"1991","unstructured":"Zucchini W, Guttorp P. A hidden Markov model for space-time precipitation. Water Resour Res. 
1991;27(8):1917\u201323.","journal-title":"Water Resour Res"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05265-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-023-05265-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-023-05265-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,17]],"date-time":"2024-10-17T22:25:51Z","timestamp":1729203951000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-023-05265-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,7]]},"references-count":36,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,12]]}},"alternative-id":["5265"],"URL":"https:\/\/doi.org\/10.1186\/s12859-023-05265-5","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,4,7]]},"assertion":[{"value":"31 December 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 March 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 April 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not 
applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"138"}}