{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,7]],"date-time":"2026-05-07T09:03:40Z","timestamp":1778144620846,"version":"3.51.4"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,2,28]],"date-time":"2024-02-28T00:00:00Z","timestamp":1709078400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,28]],"date-time":"2024-02-28T00:00:00Z","timestamp":1709078400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"NHGRI Centers for Common Disease Genomics","award":["UM1-HG008853"],"award-info":[{"award-number":["UM1-HG008853"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Approximating the recent phylogeny of <jats:italic>N<\/jats:italic> phased haplotypes at a set of variants along the genome is a core problem in modern population genomics and central to performing genome-wide screens for association, selection, introgression, and other signals. 
The Li &amp; Stephens (LS) model provides a simple yet powerful hidden Markov model for inferring the recent ancestry at a given variant, represented as an <jats:inline-formula><jats:alternatives><jats:tex-math>$$N \\times N$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                    <mml:mrow>\n                      <mml:mi>N<\/mml:mi>\n                      <mml:mo>\u00d7<\/mml:mo>\n                      <mml:mi>N<\/mml:mi>\n                    <\/mml:mrow>\n                  <\/mml:math><\/jats:alternatives><\/jats:inline-formula> distance matrix based on posterior decodings.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We provide a high-performance engine to make these posterior decodings readily accessible with minimal pre-processing via an easy to use package kalis, in the statistical programming language R. kalis enables investigators to rapidly resolve the ancestry at loci of interest and developers to build a range of variant-specific ancestral inference pipelines on top. 
kalis exploits both multi-core parallelism and modern CPU vector instruction sets to enable scaling to hundreds of thousands of genomes.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>The resulting distance matrices accessible via kalis enable local ancestry, selection, and association studies in modern large scale genomic datasets.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12859-024-05688-8","type":"journal-article","created":{"date-parts":[[2024,2,28]],"date-time":"2024-02-28T10:02:37Z","timestamp":1709114557000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["kalis: a modern implementation of the Li &amp; Stephens model for local ancestry inference in R"],"prefix":"10.1186","volume":"25","author":[{"given":"Louis J. M.","family":"Aslett","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ryan R.","family":"Christ","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,2,28]]},"reference":[{"key":"5688_CR1","doi-asserted-by":"publisher","first-page":"2213","DOI":"10.1093\/genetics\/165.4.2213","volume":"165","author":"N Li","year":"2003","unstructured":"Li N, Stephens M. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics. 2003;165:2213\u201333.","journal-title":"Genetics"},{"key":"5688_CR2","doi-asserted-by":"publisher","first-page":"1321","DOI":"10.1038\/s41588-019-0484-x","volume":"51","author":"L Speidel","year":"2019","unstructured":"Speidel L, Forest M, Shi S, Myers SR. A method for genome-wide genealogy estimation for thousands of samples. Nat Genet. 
2019;51:1321\u20139.","journal-title":"Nat Genet"},{"key":"5688_CR3","doi-asserted-by":"publisher","first-page":"1005","DOI":"10.1534\/genetics.116.191817","volume":"203","author":"YS Song","year":"2016","unstructured":"Song YS. Na Li and Matthew Stephens on Modeling Linkage Disequilibrium. Genetics. 2016;203:1005\u20136.","journal-title":"Genetics"},{"key":"5688_CR4","doi-asserted-by":"publisher","first-page":"1880","DOI":"10.1016\/j.ajhg.2021.08.005","volume":"108","author":"BL Browning","year":"2021","unstructured":"Browning BL, Tian X, Zhou Y, Browning SR. Fast two-stage phasing of large-scale sequence data. Am J Hum Genet. 2021;108:1880\u201390.","journal-title":"Am J Hum Genet"},{"key":"5688_CR5","doi-asserted-by":"publisher","first-page":"e1009049","DOI":"10.1371\/journal.pgen.1009049","volume":"16","author":"S Rubinacci","year":"2020","unstructured":"Rubinacci S, Delaneau O, Marchini J. Genotype imputation using the positional burrows wheeler transform. PLoS Genet. 2020;16:e1009049.","journal-title":"PLoS Genet"},{"key":"5688_CR6","doi-asserted-by":"publisher","first-page":"1330","DOI":"10.1038\/s41588-019-0483-y","volume":"51","author":"J Kelleher","year":"2019","unstructured":"Kelleher J, et al. Inferring whole-genome histories in large population datasets. Nat Genet. 2019;51:1330\u20138.","journal-title":"Nat Genet"},{"key":"5688_CR7","doi-asserted-by":"publisher","first-page":"e1002453","DOI":"10.1371\/journal.pgen.1002453","volume":"8","author":"DJ Lawson","year":"2012","unstructured":"Lawson DJ, Hellenthal G, Myers S, Falush D. Inference of population structure using dense haplotype data. PLoS Genet. 2012;8:e1002453.","journal-title":"PLoS Genet"},{"key":"5688_CR8","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13015-019-0144-9","volume":"14","author":"YM Rosen","year":"2019","unstructured":"Rosen YM, Paten BJ. An average-case sublinear forward algorithm for the haploid Li and Stephens model. Algorithms Mol Biol. 
2019;14:1\u201312.","journal-title":"Algorithms Mol Biol"},{"key":"5688_CR9","unstructured":"R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria 2023. https:\/\/www.R-project.org\/."},{"key":"5688_CR10","first-page":"202","volume":"30","author":"H Sutter","year":"2005","unstructured":"Sutter H. The free lunch is over: a fundamental turn toward concurrency in software. Dr Dobb\u2019s J. 2005;30:202\u201310.","journal-title":"Dr. Dobb\u2019s J"},{"key":"5688_CR11","doi-asserted-by":"publisher","first-page":"42","DOI":"10.1109\/40.526924","volume":"16","author":"A Peleg","year":"1996","unstructured":"Peleg A, Weiser U. MMX technology extension to the Intel architecture. IEEE Micro. 1996;16:42\u201350.","journal-title":"IEEE Micro"},{"key":"5688_CR12","unstructured":"Intel Corporation. Intel Architecture Instruction Set Extensions and Future Features. Tech. Rep. 319433-046 (2022)."},{"key":"5688_CR13","unstructured":"ARM. NEON Programmer\u2019s Guide. Tech. Rep. DEN0018A ID071613 (2013)."},{"key":"5688_CR14","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1109\/40.216745","volume":"13","author":"D Alpert","year":"1993","unstructured":"Alpert D, Avnon D. Architecture of the Pentium microprocessor. IEEE Micro. 1993;13:11\u201321.","journal-title":"IEEE Micro"},{"key":"5688_CR15","unstructured":"ISO. ISO\/IEC 9899:2018 Information technology\u2014Programming languages\u2014C Fourth edn (BSI, 2018). https:\/\/www.iso.org\/standard\/74528.html."},{"key":"5688_CR16","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1109\/5.18626","volume":"77","author":"LR Rabiner","year":"1989","unstructured":"Rabiner LR. A tutorial on hidden Markov models and selected applications in speech recognition. Proc IEEE. 1989;77:257\u201386.","journal-title":"Proc IEEE"},{"key":"5688_CR17","doi-asserted-by":"crossref","unstructured":"Sch\u00f6ne R, Ilsche T, Bielert M, Gocht A, Hackenberg D. IEEE (ed.) 
Energy efficiency features of the Intel Skylake-SP processor and their impact on performance. (ed. IEEE) 2019 International Conference on High Performance Computing & Simulation (HPCS), 2019. pp. 399\u2013406.","DOI":"10.1109\/HPCS48598.2019.9188239"},{"key":"5688_CR18","doi-asserted-by":"crossref","unstructured":"Consortium GP, et al. A global reference for human genetic variation. Nature. 2015;526:68.","DOI":"10.1038\/nature15393"},{"key":"5688_CR19","doi-asserted-by":"publisher","first-page":"579","DOI":"10.1007\/s00439-008-0593-6","volume":"124","author":"CJ Ingram","year":"2009","unstructured":"Ingram CJ, Mulcare CA, Itan Y, Thomas MG, Swallow DM. Lactose digestion and the evolutionary genetics of lactase persistence. Hum Genet. 2009;124:579\u201391.","journal-title":"Hum Genet"},{"key":"5688_CR20","doi-asserted-by":"publisher","first-page":"496","DOI":"10.1016\/j.ajhg.2014.02.009","volume":"94","author":"A Ranciaro","year":"2014","unstructured":"Ranciaro A, et al. Genetic origins of lactase persistence and the spread of pastoralism in Africa. Am J Hum Genet. 2014;94:496\u2013510.","journal-title":"Am J Hum Genet"},{"key":"5688_CR21","doi-asserted-by":"publisher","first-page":"1111","DOI":"10.1086\/421051","volume":"74","author":"T Bersaglieri","year":"2004","unstructured":"Bersaglieri T, et al. Genetic signatures of strong recent positive selection at the lactase gene. Am J Hum Genet. 2004;74:1111\u201320.","journal-title":"Am J Hum Genet"},{"key":"5688_CR22","doi-asserted-by":"crossref","unstructured":"Busby G, et\u00a0al. Inferring adaptive gene-flow in recent African history. BioRxiv 2017;205252.","DOI":"10.1101\/205252"},{"key":"5688_CR23","first-page":"1409","volume":"38","author":"RR Sokal","year":"1958","unstructured":"Sokal RR. A statistical method for evaluating systematic relationships. Univ Kansas Sci Bull. 
1958;38:1409\u201338.","journal-title":"Univ Kansas Sci Bull"},{"key":"5688_CR24","doi-asserted-by":"publisher","first-page":"629","DOI":"10.1086\/502802","volume":"78","author":"P Scheet","year":"2006","unstructured":"Scheet P, Stephens M. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet. 2006;78:629\u201344.","journal-title":"Am J Hum Genet"},{"key":"5688_CR25","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1109\/MM.2017.35","volume":"37","author":"N Stephens","year":"2017","unstructured":"Stephens N, et al. The ARM scalable vector extension. IEEE Micro. 2017;37:26\u201339.","journal-title":"IEEE Micro"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-024-05688-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-024-05688-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-024-05688-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,28]],"date-time":"2024-02-28T10:02:46Z","timestamp":1709114566000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-024-05688-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,28]]},"references-count":25,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["5688"],"URL":"https:\/\/doi.org\/10.1186\/s12859-024-05688-8","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,28]]},"asse
rtion":[{"value":"3 August 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 February 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"86"}}