{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T02:35:53Z","timestamp":1775961353873,"version":"3.50.1"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2024,6,30]],"date-time":"2024-06-30T00:00:00Z","timestamp":1719705600000},"content-version":"vor","delay-in-days":36,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"University of Oxford and the Euretta J. Kellett Fellowship from Columbia University","award":["Robertson Foundation, NIH"],"award-info":[{"award-number":["Robertson Foundation, NIH"]}]},{"name":"University of Oxford and the Euretta J. Kellett Fellowship from Columbia University","award":["HG011395"],"award-info":[{"award-number":["HG011395"]}]},{"name":"University of Oxford and the Euretta J. Kellett Fellowship from Columbia University","award":["HG012473"],"award-info":[{"award-number":["HG012473"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,6,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Summary<\/jats:title>\n                    <jats:p>Ancestral recombination graphs (ARGs) encode the ensemble of correlated genealogical trees arising from recombination in a compact and efficient structure and are of fundamental importance in population and statistical genetics. Recent breakthroughs have made it possible to simulate and infer ARGs at biobank scale, and there is now intense interest in using ARG-based methods across a broad range of applications, particularly in genome-wide association studies (GWAS). Sophisticated methods exist to simulate ARGs using population genetics models, but there is currently no software to simulate quantitative traits directly from these ARGs. To apply existing quantitative trait simulators users must export genotype data, losing important information about ancestral processes and producing prohibitively large files when applied to the biobank-scale datasets currently of interest in GWAS. We present tstrait, an open-source Python library to simulate quantitative traits on ARGs, and show how this user-friendly software can quickly simulate phenotypes for biobank-scale datasets on a laptop computer.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>tstrait is available for download on the Python Package Index. Full documentation with examples and workflow templates is available on https:\/\/tskit.dev\/tstrait\/docs\/, and the development version is maintained on GitHub (https:\/\/github.com\/tskit-dev\/tstrait).<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae334","type":"journal-article","created":{"date-parts":[[2024,5,25]],"date-time":"2024-05-25T23:27:00Z","timestamp":1716679620000},"source":"Crossref","is-referenced-by-count":10,"title":["<tt>tstrait<\/tt>\n                    : a quantitative trait simulator for ancestral recombination graphs"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0923-1070","authenticated-orcid":false,"given":"Daiki","family":"Tagami","sequence":"first","affiliation":[{"name":"Department of Statistics, University of Oxford , Oxford OX1 3LB,","place":["United Kingdom"]},{"name":"Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford , Oxford OX3 7LF,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8327-0142","authenticated-orcid":false,"given":"Gertjan","family":"Bisschop","sequence":"additional","affiliation":[{"name":"Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford , Oxford OX3 7LF,","place":["United Kingdom"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7894-5253","authenticated-orcid":false,"given":"Jerome","family":"Kelleher","sequence":"additional","affiliation":[{"name":"Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford , Oxford OX3 7LF,","place":["United Kingdom"]}]}],"member":"286","published-online":{"date-parts":[[2024,5,25]]},"reference":[{"key":"2025013116110877300_btae334-B1","doi-asserted-by":"crossref","first-page":"e54967","DOI":"10.7554\/eLife.54967","article-title":"A community-maintained standard library of population genetic models","volume":"9","author":"Adrion","year":"2020","journal-title":"Elife"},{"key":"2025013116110877300_btae334-B2","doi-asserted-by":"crossref","first-page":"849","DOI":"10.1126\/science.add5300","article-title":"On the genes, genealogies, and geographies of Quebec","volume":"380","author":"Anderson-Trocm\u00e9","year":"2023","journal-title":"Science"},{"key":"2025013116110877300_btae334-B3","doi-asserted-by":"crossref","first-page":"iyab229","DOI":"10.1093\/genetics\/iyab229","article-title":"Efficient ancestry and mutation simulation with msprime 1.0","volume":"220","author":"Baumdicker","year":"2022","journal-title":"Genetics"},{"key":"2025013116110877300_btae334-B4","doi-asserted-by":"crossref","first-page":"evae005","DOI":"10.1093\/gbe\/evae005","article-title":"The promise of inferring the past using the ancestral recombination graph","volume":"16","author":"Brandt","year":"2024","journal-title":"Genome Biol Evol"},{"key":"2025013116110877300_btae334-B5","doi-asserted-by":"crossref","first-page":"2156","DOI":"10.1093\/bioinformatics\/btr330","article-title":"The variant call format and vcftools","volume":"27","author":"Danecek","year":"2011","journal-title":"Bioinformatics"},{"key":"2025013116110877300_btae334-B6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-020-03804-y","article-title":"simplePHENOTYPES: SIMulation of pleiotropic, linked and epistatic phenotypes","volume":"21","author":"Fernandes","year":"2020","journal-title":"BMC Bioinform"},{"key":"2025013116110877300_btae334-B7","doi-asserted-by":"crossref","first-page":"jkaa017","DOI":"10.1093\/g3journal\/jkaa017","article-title":"AlphaSimR: an R package for breeding program simulations","volume":"11","author":"Gaynor","year":"2021","journal-title":"G3"},{"key":"2025013116110877300_btae334-B8","first-page":"257","volume-title":"Progress in Population Genetics and Human Evolution, IMA Volumes in Mathematics and Its Applications","author":"Griffiths","year":"1997"},{"key":"2025013116110877300_btae334-B9","doi-asserted-by":"crossref","first-page":"E127","DOI":"10.1086\/723601","article-title":"SLiM 4: multispecies eco-evolutionary modeling","volume":"201","author":"Haller","year":"2023","journal-title":"Am Nat"},{"key":"2025013116110877300_btae334-B10","doi-asserted-by":"crossref","first-page":"552","DOI":"10.1111\/1755-0998.12968","article-title":"Tree-sequence recording in SLiM opens new horizons for forward-time simulation of whole genomes","volume":"19","author":"Haller","year":"2018","journal-title":"Mol Ecol Resour"},{"key":"2025013116110877300_btae334-B11","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"Harris","year":"2020","journal-title":"Nature"},{"key":"2025013116110877300_btae334-B12","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1016\/0040-5809(83)90013-8","article-title":"Properties of a neutral allele model with intragenic recombination","volume":"23","author":"Hudson","year":"1983","journal-title":"Theor Popul Biol"},{"key":"2025013116110877300_btae334-B13","doi-asserted-by":"crossref","first-page":"e1004842","DOI":"10.1371\/journal.pcbi.1004842","article-title":"Efficient coalescent simulation and genealogical analysis for large sample sizes","volume":"12","author":"Kelleher","year":"2016","journal-title":"PLoS Comput Biol"},{"key":"2025013116110877300_btae334-B14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pcbi.1006581","article-title":"Efficient pedigree recording for fast population genetics simulation","volume":"14","author":"Kelleher","year":"2018","journal-title":"PLoS Comput Biol"},{"key":"2025013116110877300_btae334-B15","doi-asserted-by":"crossref","first-page":"1330","DOI":"10.1038\/s41588-019-0483-y","article-title":"Inferring whole-genome histories in large population datasets","volume":"51","author":"Kelleher","year":"2019","journal-title":"Nat Genet"},{"key":"2025013116110877300_btae334-B16","first-page":"1","author":"Lam","year":"2015"},{"key":"2025013116110877300_btae334-B17","doi-asserted-by":"crossref","first-page":"e1011110","DOI":"10.1371\/journal.pgen.1011110","article-title":"The era of the ARG: an introduction to ancestral recombination graphs and their significance in empirical evolutionary genomics","volume":"20","author":"Lewanski","year":"2024","journal-title":"PLoS Genet"},{"key":"2025013116110877300_btae334-B18","doi-asserted-by":"crossref","first-page":"2077","DOI":"10.1016\/j.ajhg.2023.10.017","article-title":"Tree-based QTL mapping with expected local genetic relatedness matrices","volume":"110","author":"Link","year":"2023","journal-title":"Am J Hum Genet"},{"key":"2025013116110877300_btae334-B19","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1016\/j.ajhg.2017.03.004","article-title":"Human demographic history impacts genetic risk prediction across diverse populations","volume":"100","author":"Martin","year":"2017","journal-title":"Am J Hum Genet"},{"key":"2025013116110877300_btae334-B20","doi-asserted-by":"publisher","first-page":"790","DOI":"10.1038\/s41562-023-01528-6","article-title":"Genome-wide analysis identifies genetic effects on reproductive success and ongoing natural selection at the FADS locus","volume":"7","author":"Mathieson","year":"2023","journal-title":"Nat Hum Behav"},{"key":"2025013116110877300_btae334-B21","first-page":"56","author":"McKinney","year":"2010"},{"key":"2025013116110877300_btae334-B22","doi-asserted-by":"crossref","first-page":"2951","DOI":"10.1093\/bioinformatics\/bty197","article-title":"PhenotypeSimulator: a comprehensive framework for simulating multi-trait, multi-locus genotype to phenotype relationships","volume":"34","author":"Meyer","year":"2018","journal-title":"Bioinformatics"},{"key":"2025013116110877300_btae334-B23","doi-asserted-by":"crossref","first-page":"1494","DOI":"10.1038\/s41588-023-01487-8","article-title":"Extremely sparse models of linkage disequilibrium in ancestrally diverse association studies","volume":"55","author":"Nowbandegani","year":"2023","journal-title":"Nat Genet"},{"key":"2025013116110877300_btae334-B24","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1016\/j.ajhg.2020.08.017","article-title":"Lessons learned from bugs in models of human history","volume":"107","author":"Ragsdale","year":"2020","journal-title":"Am J Hum Genet"},{"key":"2025013116110877300_btae334-B25","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1534\/genetics.120.303253","article-title":"Efficiently summarizing relationships in large samples: a general duality between statistics of genealogies and genomes","volume":"215","author":"Ralph","year":"2020","journal-title":"Genetics"},{"key":"2025013116110877300_btae334-B26","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1016\/j.ajhg.2012.10.010","article-title":"Improved heritability estimation from genome-wide SNPs","volume":"91","author":"Speed","year":"2012","journal-title":"Am J Hum Genet"},{"key":"2025013116110877300_btae334-B27","doi-asserted-by":"crossref","first-page":"2304","DOI":"10.1093\/bioinformatics\/btr341","article-title":"HAPGEN2: simulation of multiple disease SNPs","volume":"27","author":"Su","year":"2011","journal-title":"Bioinformatics"},{"key":"2025013116110877300_btae334-B28","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1038\/s10038-020-00862-1","article-title":"Practical guide for managing large-scale human genome data in research","volume":"66","author":"Tanjo","year":"2021","journal-title":"J Hum Genet"},{"key":"2025013116110877300_btae334-B29","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/s43586-021-00056-9","article-title":"Genome-wide association studies","volume":"1","author":"Uffelmann","year":"2021","journal-title":"Nat Rev Methods Primers"},{"key":"2025013116110877300_btae334-B30","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1016\/j.ajhg.2017.06.005","article-title":"10 years of GWAS discovery: biology, function, and translation","volume":"101","author":"Visscher","year":"2017","journal-title":"Am J Hum Genet"},{"key":"2025013116110877300_btae334-B31","doi-asserted-by":"crossref","first-page":"btad535","DOI":"10.1093\/bioinformatics\/btad535","article-title":"HAPNEST: efficient, large-scale generation and evaluation of synthetic datasets for genotypes and phenotypes","volume":"39","author":"Wharrie","year":"2023","journal-title":"Bioinformatics"},{"key":"2025013116110877300_btae334-B32","author":"Wong","year":"2023"},{"key":"2025013116110877300_btae334-B33","doi-asserted-by":"crossref","first-page":"704","DOI":"10.1038\/s41586-022-05275-y","article-title":"A saturated map of common genetic variants associated with human height","volume":"610","author":"Yengo","year":"2022","journal-title":"Nature"},{"key":"2025013116110877300_btae334-B34","doi-asserted-by":"crossref","first-page":"e61548","DOI":"10.7554\/eLife.61548","article-title":"Demographic history mediates the effect of stratification on polygenic scores","volume":"9","author":"Zaidi","year":"2020","journal-title":"Elife"},{"key":"2025013116110877300_btae334-B35","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1038\/s41588-023-01379-x","article-title":"Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits","volume":"55","author":"Zhang","year":"2023","journal-title":"Nat Genet"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae334\/57909401\/btae334.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/6\/btae334\/59494059\/btae334.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/6\/btae334\/59494059\/btae334.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,31]],"date-time":"2025-01-31T11:11:30Z","timestamp":1738321890000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae334\/7682379"}},"subtitle":[],"editor":[{"given":"Russell","family":"Schwartz","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,5,25]]},"references-count":35,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,6,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae334","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.03.13.584790","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,6]]},"published":{"date-parts":[[2024,5,25]]},"article-number":"btae334"}}