{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T15:18:20Z","timestamp":1772205500076,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2018,11,15]],"date-time":"2018-11-15T00:00:00Z","timestamp":1542240000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100003977","name":"Israel Science Foundation","doi-asserted-by":"publisher","award":["407\/17"],"award-info":[{"award-number":["407\/17"]}],"id":[{"id":"10.13039\/501100003977","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Hidden Markov models (HMMs) are powerful tools for modeling processes along the genome. In a standard genomic HMM, observations are drawn, at each genomic position, from a distribution whose parameters depend on a hidden state, and the hidden states evolve along the genome as a Markov chain. Often, the hidden state is the Cartesian product of multiple processes, each evolving independently along the genome. Inference in these so-called Factorial HMMs has a na\u00efve running time that scales as the square of the number of possible states, which by itself increases exponentially with the number of sub-chains; such a running time scaling is impractical for many applications. While faster algorithms exist, there is no available implementation suitable for developing bioinformatics applications.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We developed FactorialHMM, a Python package for fast exact inference in Factorial HMMs. Our package allows simulating either directly from the model or from the posterior distribution of states given the observations. Additionally, we allow the inference of all key quantities related to HMMs: (i) the (Viterbi) sequence of states with the highest posterior probability; (ii) the likelihood of the data and (iii) the posterior probability (given all observations) of the marginal and pairwise state probabilities. The running time and space requirement of all procedures is linearithmic in the number of possible states. Our package is highly modular, providing the user with maximal flexibility for developing downstream applications.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>https:\/\/github.com\/regevs\/factorial_hmm<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty944","type":"journal-article","created":{"date-parts":[[2018,11,13]],"date-time":"2018-11-13T15:17:29Z","timestamp":1542122249000},"page":"2162-2164","source":"Crossref","is-referenced-by-count":4,"title":["FactorialHMM: fast and exact inference in factorial hidden Markov models"],"prefix":"10.1093","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2450-8901","authenticated-orcid":false,"given":"Regev","family":"Schweiger","sequence":"first","affiliation":[{"name":"Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel"},{"name":"MyHeritage, Or Yehuda, Israel"}]},{"given":"Yaniv","family":"Erlich","sequence":"additional","affiliation":[{"name":"MyHeritage, Or Yehuda, Israel"},{"name":"Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY, USA"},{"name":"Department of Systems Biology, Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA"},{"name":"New York Genome Center, New York, NY, USA"}]},{"given":"Shai","family":"Carmi","sequence":"additional","affiliation":[{"name":"Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel"}]}],"member":"286","published-online":{"date-parts":[[2018,11,15]]},"reference":[{"key":"2023012713220056600_bty944-B1","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1038\/ng786","article-title":"Merlin\u2013rapid analysis of dense genetic maps using sparse gene flow trees","volume":"30","author":"Abecasis","year":"2002","journal-title":"Nat. Genet"},{"key":"2023012713220056600_bty944-B2","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1515\/sagmb-2012-0010","article-title":"Simultaneous inference and clustering of transcriptional dynamics in gene regulatory networks","volume":"12","author":"Asif","year":"2013","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023012713220056600_bty944-B3","doi-asserted-by":"crossref","first-page":"1359","DOI":"10.1093\/bioinformatics\/bts144","article-title":"Fast and accurate inference of local ancestry in Latino populations","volume":"28","author":"Baran","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012713220056600_bty944-B4","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1007\/978-3-642-29627-7_2","volume-title":"Research in Computational Molecular Biology","author":"Bercovici","year":"2012"},{"key":"2023012713220056600_bty944-B5","doi-asserted-by":"crossref","first-page":"i175","DOI":"10.1093\/bioinformatics\/btq204","article-title":"Estimating genome-wide IBD sharing from SNP data via an efficient hidden Markov model of LD with application to gene mapping","volume":"26","author":"Bercovici","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012713220056600_bty944-B6","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids","author":"Durbin","year":"1998"},{"key":"2023012713220056600_bty944-B7","doi-asserted-by":"crossref","first-page":"1518","DOI":"10.1109\/TIT.2002.1003838","article-title":"Hidden Markov processes","volume":"48","author":"Ephraim","year":"2002","journal-title":"IEEE Trans. Information Theory"},{"key":"2023012713220056600_bty944-B8","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1038\/nmeth.1906","article-title":"ChromHMM: automating chromatin-state discovery and characterization","volume":"9","author":"Ernst","year":"2012","journal-title":"Nat. Methods"},{"key":"2023012713220056600_bty944-B9","doi-asserted-by":"crossref","first-page":"7265","DOI":"10.1021\/ac0508853","article-title":"NovoHMM: a hidden Markov model for de novo peptide sequencing","volume":"77","author":"Fischer","year":"2005","journal-title":"Anal. Chem"},{"key":"2023012713220056600_bty944-B10","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1534\/genetics.107.078907","article-title":"Estimating meiotic gene conversion rates from population genetic data","volume":"177","author":"Gay","year":"2007","journal-title":"Genetics"},{"key":"2023012713220056600_bty944-B11","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1023\/A:1007425814087","article-title":"Factorial hidden Markov models","volume":"29","author":"Ghahramani","year":"1997","journal-title":"Machine Learn"},{"key":"2023012713220056600_bty944-B12","doi-asserted-by":"crossref","first-page":"ii166","DOI":"10.1093\/bioinformatics\/bti1127","article-title":"Discriminating between rate heterogeneity and interspecific recombination in DNA sequence alignments with phylogenetic factorial hidden Markov models","volume":"21","author":"Husmeier","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012713220056600_bty944-B13","first-page":"673","article-title":"Bayesian nonparametric hidden semi-Markov models","volume":"14","author":"Johnson","year":"2013","journal-title":"J. Machine Learn. Res"},{"key":"2023012713220056600_bty944-B14","doi-asserted-by":"crossref","first-page":"1155","DOI":"10.1089\/cmb.2007.0133","article-title":"Genotype error detection using Hidden Markov models of haplotype diversity","volume":"15","author":"Kennedy","year":"2008","journal-title":"J. Comput. Biol"},{"key":"2023012713220056600_bty944-B15","doi-asserted-by":"crossref","first-page":"i333","DOI":"10.1093\/bioinformatics\/btr243","article-title":"Reconstruction of genealogical relationships with applications to Phase III of HapMap","volume":"27","author":"Kyriazopoulou-Panagiotopoulou","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012713220056600_bty944-B16","doi-asserted-by":"crossref","first-page":"2363","DOI":"10.1073\/pnas.84.8.2363","article-title":"Construction of multilocus genetic linkage maps in humans","volume":"84","author":"Lander","year":"1987","journal-title":"PNAS"},{"key":"2023012713220056600_bty944-B17","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1111\/j.1467-9876.2008.00648.x","article-title":"Segmenting bacterial and viral DNA sequence alignments with a trans-dimensional phylogenetic factorial hidden Markov model","volume":"58","author":"Lehrach","year":"2009","journal-title":"J. R. Stat. Soc. Series C (Appl. Stat.)"},{"key":"2023012713220056600_bty944-B18","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1016\/j.cell.2014.05.034","article-title":"Expansion of biological pathways based on evolutionary inference","volume":"158","author":"Li","year":"2014","journal-title":"Cell"},{"key":"2023012713220056600_bty944-B19","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1002\/gepi.21710","article-title":"Extending admixture mapping to nuclear pedigrees: application to Sarcoidosis","volume":"37","author":"McKeigue","year":"2013","journal-title":"Genet. Epidemiol"},{"key":"2023012713220056600_bty944-B20","first-page":"308494","article-title":"Inferring the ancestry of parents and grandparents from genetic data","author":"Pei","year":"2018","journal-title":"bioRxiv"},{"key":"2023012713220056600_bty944-B21","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1109\/5.18626","article-title":"A tutorial on hidden Markov models and selected applications in speech recognition","volume":"77","author":"Rabiner","year":"1989","journal-title":"Proc. IEEE"},{"key":"2023012713220056600_bty944-B22","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1089\/cmb.2017.0101","article-title":"HetFHMM: a novel approach to infer tumor heterogeneity using factorial hidden Markov models","volume":"25","author":"Rahman","year":"2018","journal-title":"J. Comput. Biol"},{"key":"2023012713220056600_bty944-B23","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1002\/humu.22225","article-title":"Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models","volume":"34","author":"Shihab","year":"2013","journal-title":"Human Mutat"},{"key":"2023012713220056600_bty944-B24","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1089\/1066527041410472","article-title":"Combining phylogenetic and hidden Markov models in biosequence analysis","volume":"11","author":"Siepel","year":"2004","journal-title":"J. Comput. Biol"},{"key":"2023012713220056600_bty944-B25","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1186\/s12920-017-0255-4","article-title":"CLImAT-HET: detecting subclonal copy number alterations and loss of heterozygosity in heterogeneous tumor samples from whole-genome sequencing data","volume":"10","author":"Yu","year":"2017","journal-title":"BMC Med. Genom"},{"key":"2023012713220056600_bty944-B26","doi-asserted-by":"crossref","first-page":"1917","DOI":"10.1029\/91WR01403","article-title":"A hidden Markov model for space-time precipitation","volume":"27","author":"Zucchini","year":"1991","journal-title":"Water Resour. Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/12\/2162\/48934612\/bioinformatics_35_12_2162.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/12\/2162\/48934612\/bioinformatics_35_12_2162.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T09:15:45Z","timestamp":1674810945000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/12\/2162\/5184283"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,11,15]]},"references-count":26,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2019,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty944","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/383380","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,6]]},"published":{"date-parts":[[2018,11,15]]}}}