{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T21:29:57Z","timestamp":1770499797922,"version":"3.49.0"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T00:00:00Z","timestamp":1752537600000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01HG010086"],"award-info":[{"award-number":["R01HG010086"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R56HG011509"],"award-info":[{"award-number":["R56HG011509"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The availability of large genotyped cohorts brings new opportunities for revealing the high-resolution genetic structure of admixed populations via local ancestry inference (LAI), the process of identifying the ancestry of each segment of an individual haplotype. Though current methods achieve high accuracy in standard cases, LAI is still challenging when reference populations are more similar (e.g. 
intra-continental), when the reference populations are too numerous, or when the admixture events are deep in time, all of which are increasingly unavoidable in large biobanks.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>In this work, we present Recomb-Mix, a new LAI method which integrates elements from the site-based Li and Stephens model and introduces a new graph collapsing technique to simplify counting paths with the same ancestry label readout. Through comprehensive benchmarking on various simulated datasets, we show that Recomb-Mix is more accurate than existing methods in diverse sets of scenarios while being competitive in terms of resource efficiency. The scalability and robustness of Recomb-Mix are also demonstrated with real-world datasets. We expect that Recomb-Mix will be a useful method for advancing genetics studies of admixed populations.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The implementation of Recomb-Mix is available at https:\/\/github.com\/ucfcbb\/Recomb-Mix.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf227","type":"journal-article","created":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T13:02:09Z","timestamp":1752584529000},"page":"i180-i188","source":"Crossref","is-referenced-by-count":1,"title":["Recomb-Mix: fast and accurate local ancestry inference"],"prefix":"10.1093","volume":"41","author":[{"given":"Yuan","family":"Wei","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Central Florida , Orlando, FL 32816,","place":["United States"]}]},{"given":"Degui","family":"Zhi","sequence":"additional","affiliation":[{"name":"McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston , Houston, 
TX 77030,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4051-5549","authenticated-orcid":false,"given":"Shaojie","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Central Florida , Orlando, FL 32816,","place":["United States"]}]}],"member":"286","published-online":{"date-parts":[[2025,7,15]]},"reference":[{"key":"2025071509020360000_btaf227-B1","doi-asserted-by":"crossref","first-page":"e54967","DOI":"10.7554\/eLife.54967","article-title":"A community-maintained standard library of population genetic models","volume":"9","author":"Adrion","year":"2020","journal-title":"Elife"},{"key":"2025071509020360000_btaf227-B2","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1038\/s41588-020-00766-y","article-title":"Tractor uses local ancestry to enable the inclusion of admixed individuals in GWAS and to boost power","volume":"53","author":"Atkinson","year":"2021","journal-title":"Nat Genet"},{"key":"2025071509020360000_btaf227-B3","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nature15393","article-title":"A global reference for human genetic variation","volume":"526","author":"Auton","year":"2015","journal-title":"Nature"},{"key":"2025071509020360000_btaf227-B4","doi-asserted-by":"crossref","first-page":"1359","DOI":"10.1093\/bioinformatics\/bts144","article-title":"Fast and accurate inference of local ancestry in Latino populations","volume":"28","author":"Baran","year":"2012","journal-title":"Bioinformatics"},{"key":"2025071509020360000_btaf227-B5","doi-asserted-by":"crossref","first-page":"eaay5012","DOI":"10.1126\/science.aay5012","article-title":"Insights into human genetic variation and population history from 929 diverse genomes","volume":"367","author":"Bergstr\u00f6m A, McCarthy S, Hui 
R","year":"2020","journal-title":"Science"},{"key":"2025071509020360000_btaf227-B6","doi-asserted-by":"crossref","first-page":"1880","DOI":"10.1016\/j.ajhg.2021.08.005","article-title":"Fast two-stage phasing of large-scale sequence data","volume":"108","author":"Browning","year":"2021","journal-title":"Am J Hum Genet"},{"key":"2025071509020360000_btaf227-B7","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1016\/j.ajhg.2018.07.015","article-title":"A One-Penny imputed genome from next-generation reference panels","volume":"103","author":"Browning","year":"2018","journal-title":"Am J Hum Genet"},{"key":"2025071509020360000_btaf227-B8","doi-asserted-by":"crossref","first-page":"326","DOI":"10.1016\/j.ajhg.2022.12.010","article-title":"Fast, accurate local ancestry inference with FLARE","volume":"110","author":"Browning","year":"2023","journal-title":"Am J Hum Genet"},{"key":"2025071509020360000_btaf227-B9","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1038\/s41586-018-0579-z","article-title":"The UK biobank resource with deep phenotyping and genomic data","volume":"562","author":"Bycroft","year":"2018","journal-title":"Nature"},{"key":"2025071509020360000_btaf227-B10","doi-asserted-by":"crossref","first-page":"3426","DOI":"10.1016\/j.cell.2022.08.004","article-title":"High-coverage whole-genome sequencing of the expanded 1000 genomes project cohort including 602 trios","volume":"185","author":"Byrska-Bishop","year":"2022","journal-title":"Cell"},{"key":"2025071509020360000_btaf227-B11","doi-asserted-by":"crossref","first-page":"D854","DOI":"10.1093\/nar\/gkw829","article-title":"The international genome sample resource (IGSR): a worldwide collection of genome variation incorporating the 1000 genomes project data","volume":"45","author":"Clarke","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2025071509020360000_btaf227-B12","doi-asserted-by":"crossref","first-page":"2156","DOI":"10.1093\/bioinformatics\/btr330","article-title":"The 
variant call format and VCFtools","volume":"27","author":"Danecek","year":"2011","journal-title":"Bioinformatics"},{"key":"2025071509020360000_btaf227-B13","doi-asserted-by":"crossref","first-page":"2318","DOI":"10.1093\/molbev\/msy126","article-title":"Loter: a software package to infer local ancestry for a wide range of species","volume":"35","author":"Dias-Alves","year":"2018","journal-title":"Mol Biol Evol"},{"key":"2025071509020360000_btaf227-B14","doi-asserted-by":"crossref","first-page":"622","DOI":"10.1186\/1471-2164-12-622","article-title":"Comparison of measures of marker informativeness for ancestry and admixture mapping","volume":"12","author":"Ding","year":"2011","journal-title":"BMC Genomics"},{"key":"2025071509020360000_btaf227-B15","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1038\/s41586-023-06079-4","article-title":"Polygenic scoring accuracy varies across the genetic ancestry continuum","volume":"618","author":"Ding","year":"2023","journal-title":"Nature"},{"key":"2025071509020360000_btaf227-B16","doi-asserted-by":"crossref","first-page":"3328","DOI":"10.1038\/s41467-019-11112-0","article-title":"Analysis of polygenic risk score usage and performance in diverse human populations","volume":"10","author":"Duncan","year":"2019","journal-title":"Nat Commun"},{"key":"2025071509020360000_btaf227-B17","doi-asserted-by":"publisher","DOI":"10.1101\/2021.01.19.427308","article-title":"A scalable pipeline for local ancestry inference using tens of thousands of reference haplotypes","author":"Durand","year":"2021"},{"key":"2025071509020360000_btaf227-B18","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids","author":"Durbin","year":"1998"},{"key":"2025071509020360000_btaf227-B19","doi-asserted-by":"crossref","first-page":"1709","DOI":"10.1093\/bib\/bby044","article-title":"A comprehensive survey of models for dissecting local ancestry 
deconvolution in human genome","volume":"20","author":"Geza","year":"2019","journal-title":"Brief Bioinf"},{"key":"2025071509020360000_btaf227-B20","doi-asserted-by":"crossref","first-page":"552","DOI":"10.1111\/1755-0998.12968","article-title":"Tree-sequence recording in SLiM opens new horizons for forward-time simulation of whole genomes","volume":"19","author":"Haller","year":"2019","journal-title":"Mol Ecol Resour"},{"key":"2025071509020360000_btaf227-B21","doi-asserted-by":"crossref","first-page":"632","DOI":"10.1093\/molbev\/msy228","article-title":"SLiM 3: forward genetic simulations beyond the Wright-Fisher model","volume":"36","author":"Haller","year":"2019","journal-title":"Mol Biol Evol"},{"key":"2025071509020360000_btaf227-B22","doi-asserted-by":"crossref","first-page":"msad074","DOI":"10.1093\/molbev\/msad074","article-title":"Localizing post-admixture adaptive variants with object detection on ancestry-painted chromosomes","volume":"40","author":"Hamid","year":"2023","journal-title":"Mol Biol Evol"},{"key":"2025071509020360000_btaf227-B23","doi-asserted-by":"publisher","DOI":"10.1101\/2021.09.19.460980","article-title":"High resolution ancestry deconvolution for next generation genomic data","author":"Hilmarsson","year":"2021"},{"key":"2025071509020360000_btaf227-B24","doi-asserted-by":"crossref","first-page":"549","DOI":"10.1038\/s41588-023-01338-6","article-title":"Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals","volume":"55","author":"Hou","year":"2023","journal-title":"Nat Genet"},{"key":"2025071509020360000_btaf227-B25","first-page":"111","article-title":"Data preprocessing for supervised leaning","volume":"1","author":"Kotsiantis","year":"2006","journal-title":"Int J Comput Sci"},{"key":"2025071509020360000_btaf227-B26","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1038\/s41586-022-05473-8","article-title":"FinnGen provides genetic 
insights from a well-phenotyped isolated population","volume":"613","author":"Kurki","year":"2023","journal-title":"Nature"},{"key":"2025071509020360000_btaf227-B27","doi-asserted-by":"crossref","first-page":"e1002453","DOI":"10.1371\/journal.pgen.1002453","article-title":"Inference of population structure using dense haplotype data","volume":"8","author":"Lawson","year":"2012","journal-title":"PLoS Genet"},{"key":"2025071509020360000_btaf227-B28","doi-asserted-by":"crossref","first-page":"eabm4247","DOI":"10.1126\/science.abm4247","article-title":"The genetic history of the Southern Arc: a bridge between West Asia and Europe","volume":"377","author":"Lazaridis","year":"2022","journal-title":"Science"},{"key":"2025071509020360000_btaf227-B29","doi-asserted-by":"crossref","first-page":"2213","DOI":"10.1093\/genetics\/165.4.2213","article-title":"Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data","volume":"165","author":"Li","year":"2003","journal-title":"Genetics"},{"key":"2025071509020360000_btaf227-B30","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1016\/j.ajhg.2013.06.020","article-title":"RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference","volume":"93","author":"Maples","year":"2013","journal-title":"Am J Hum Genet"},{"key":"2025071509020360000_btaf227-B31","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1016\/j.ajhg.2017.03.004","article-title":"Human demographic history impacts genetic risk prediction across diverse populations","volume":"100","author":"Martin","year":"2017","journal-title":"Am J Hum Genet"},{"key":"2025071509020360000_btaf227-B32","doi-asserted-by":"crossref","first-page":"eaat7487","DOI":"10.1126\/science.aat7487","article-title":"The formation of human populations in South and Central 
Asia","volume":"365","author":"Narasimhan","year":"2019","journal-title":"Science"},{"key":"2025071509020360000_btaf227-B33","doi-asserted-by":"crossref","first-page":"ii27","DOI":"10.1093\/bioinformatics\/btac464","article-title":"SALAI-Net: species-agnostic local ancestry inference network","volume":"38","author":"Oriol Sabat","year":"2022","journal-title":"Bioinformatics"},{"key":"2025071509020360000_btaf227-B34","doi-asserted-by":"crossref","first-page":"1839","DOI":"10.1086\/302148","article-title":"Estimating African American admixture proportions by use of population-specific alleles","volume":"63","author":"Parra","year":"1998","journal-title":"Am J Hum Genet"},{"key":"2025071509020360000_btaf227-B35","doi-asserted-by":"crossref","first-page":"e1001371","DOI":"10.1371\/journal.pgen.1001371","article-title":"Enhanced statistical tests for GWAS in admixed populations: assessment using African Americans from CARe and a breast cancer consortium","volume":"7","author":"Pasaniuc","year":"2011","journal-title":"PLoS Genet"},{"key":"2025071509020360000_btaf227-B36","doi-asserted-by":"crossref","first-page":"e1000519","DOI":"10.1371\/journal.pgen.1000519","article-title":"Sensitive detection of chromosomal segments of distinct ancestry in admixed populations","volume":"5","author":"Price","year":"2009","journal-title":"PLoS Genet"},{"key":"2025071509020360000_btaf227-B37","doi-asserted-by":"crossref","first-page":"1113","DOI":"10.1038\/ng1646","article-title":"A whole-genome admixture scan finds a candidate locus for multiple sclerosis susceptibility","volume":"37","author":"Reich","year":"2005","journal-title":"Nat Genet"},{"key":"2025071509020360000_btaf227-B38","doi-asserted-by":"crossref","first-page":"869","DOI":"10.1534\/genetics.119.302139","article-title":"Fine-scale inference of ancestry segments without prior knowledge of admixing 
groups","volume":"212","author":"Salter-Townshend","year":"2019","journal-title":"Genetics"},{"key":"2025071509020360000_btaf227-B39","doi-asserted-by":"crossref","first-page":"459","DOI":"10.1186\/s12859-021-04350-x","article-title":"Ancestry inference using reference labeled clusters of haplotypes","volume":"22","author":"Wang","year":"2021","journal-title":"BMC Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/Supplement_1\/i180\/63745289\/btaf227.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/Supplement_1\/i180\/63745289\/btaf227.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,15]],"date-time":"2025-07-15T13:02:14Z","timestamp":1752584534000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/41\/Supplement_1\/i180\/8199351"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,1]]},"references-count":39,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2025,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf227","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,7,1]]}}}