{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T21:29:16Z","timestamp":1775078956682,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1010419","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,9,28]],"date-time":"2022-09-28T00:00:00Z","timestamp":1664323200000}}],"reference-count":45,"publisher":"Public Library of Science (PLoS)","issue":"9","license":[{"start":{"date-parts":[[2022,9,16]],"date-time":"2022-09-16T00:00:00Z","timestamp":1663286400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000057","name":"National Institute of General Medical Sciences","doi-asserted-by":"publisher","award":["R01GM146051"],"award-info":[{"award-number":["R01GM146051"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Unraveling the complex demographic histories of natural populations is a central problem in population genetics. Understanding past demographic events is of general anthropological interest, but is also an important step in establishing accurate null models when identifying adaptive or disease-associated genetic variation. An important class of tools for inferring past population size changes from genomic sequence data are Coalescent Hidden Markov Models (CHMMs). These models make efficient use of the linkage information in population genomic datasets by using the local genealogies relating sampled individuals as latent states that evolve along the chromosome in an HMM framework. Extending these models to large sample sizes is challenging, since the number of possible latent states increases rapidly.<\/jats:p>\n                  <jats:p>\n                    Here, we present our method\n                    <jats:monospace>CHIMP<\/jats:monospace>\n                    (\n                    <jats:bold>C<\/jats:bold>\n                    HMM\n                    <jats:bold>H<\/jats:bold>\n                    istory-\n                    <jats:bold>I<\/jats:bold>\n                    nference\n                    <jats:bold>M<\/jats:bold>\n                    aximum-Likelihood\n                    <jats:bold>P<\/jats:bold>\n                    rocedure), a novel CHMM method for inferring the size history of a population. It can be applied to large samples (hundreds of haplotypes) and only requires unphased genomes as input. The two implementations of\n                    <jats:monospace>CHIMP<\/jats:monospace>\n                    that we present here use either the height of the genealogical tree (\n                    <jats:italic>T<\/jats:italic>\n                    <jats:sub>\n                      <jats:italic>MRCA<\/jats:italic>\n                    <\/jats:sub>\n                    ) or the total branch length, respectively, as the latent variable at each position in the genome. The requisite transition and emission probabilities are obtained by numerically solving certain systems of differential equations derived from the ancestral process with recombination. The parameters of the population size history are subsequently inferred using an Expectation-Maximization algorithm. In addition, we implement a composite likelihood scheme to allow the method to scale to large sample sizes.\n                  <\/jats:p>\n                  <jats:p>\n                    We demonstrate the efficiency and accuracy of our method in a variety of benchmark tests using simulated data and present comparisons to other state-of-the-art methods. Specifically, our implementation using\n                    <jats:italic>T<\/jats:italic>\n                    <jats:sub>\n                      <jats:italic>MRCA<\/jats:italic>\n                    <\/jats:sub>\n                    as the latent variable shows comparable performance and provides accurate estimates of effective population sizes in intermediate and ancient times. Our method is agnostic to the phasing of the data, which makes it a promising alternative in scenarios where high quality data is not available, and has potential applications for pseudo-haploid data.\n                  <\/jats:p>","DOI":"10.1371\/journal.pcbi.1010419","type":"journal-article","created":{"date-parts":[[2022,9,16]],"date-time":"2022-09-16T13:55:17Z","timestamp":1663336517000},"page":"e1010419","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":13,"title":["Robust inference of population size histories from genomic sequencing data"],"prefix":"10.1371","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2318-4407","authenticated-orcid":true,"given":"Gautam","family":"Upadhya","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6331-7900","authenticated-orcid":true,"given":"Matthias","family":"Steinr\u00fccken","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2022,9,16]]},"reference":[{"key":"pcbi.1010419.ref001","doi-asserted-by":"crossref","first-page":"e45380","DOI":"10.7554\/eLife.45380","article-title":"Why structure matters","volume":"8","author":"N Barton","year":"2019","journal-title":"Elife"},{"issue":"5","key":"pcbi.1010419.ref002","doi-asserted-by":"crossref","first-page":"555","DOI":"10.1038\/ng.3254","article-title":"Exploring population size changes using SNP frequency spectra","volume":"47","author":"X Liu","year":"2015","journal-title":"Nat Genet"},{"issue":"2","key":"pcbi.1010419.ref003","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1101\/gr.178756.114","article-title":"Efficient inference of population size histories and locus-specific mutation rates from large-sample genomic variation data","volume":"25","author":"A Bhaskar","year":"2015","journal-title":"Genome Res"},{"issue":"3","key":"pcbi.1010419.ref004","doi-asserted-by":"crossref","first-page":"967","DOI":"10.1534\/genetics.119.302373","article-title":"Bayesian Estimation of Population Size Changes by Sampling Tajima\u2019s Trees","volume":"213","author":"JA Palacios","year":"2019","journal-title":"Genetics"},{"issue":"6082","key":"pcbi.1010419.ref005","doi-asserted-by":"crossref","first-page":"740","DOI":"10.1126\/science.1217283","article-title":"Recent explosive human population growth has resulted in an excess of rare genetic variants","volume":"336","author":"A Keinan","year":"2012","journal-title":"Science"},{"issue":"2","key":"pcbi.1010419.ref006","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1038\/ng.3748","article-title":"Robust and scalable inference of population history from hundreds of unphased whole genomes","volume":"49","author":"J Terhorst","year":"2017","journal-title":"Nat Genet"},{"issue":"3","key":"pcbi.1010419.ref007","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1016\/j.ajhg.2015.07.012","article-title":"Accurate Non-parametric Estimation of Recent Effective Population Size from Segments of Identity by Descent","volume":"97","author":"SR Browning","year":"2015","journal-title":"Am J Hum Genet"},{"issue":"5","key":"pcbi.1010419.ref008","doi-asserted-by":"crossref","first-page":"809","DOI":"10.1016\/j.ajhg.2012.08.030","article-title":"Length distributions of identity by descent reveal fine-scale demographic history","volume":"91","author":"PF Palamara","year":"2012","journal-title":"Am J Hum Genet"},{"issue":"5","key":"pcbi.1010419.ref009","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pgen.1004342","article-title":"Genome-Wide Inference of Ancestral Recombination Graphs","volume":"10","author":"MD Rasmussen","year":"2014","journal-title":"PLoS Genet"},{"issue":"9","key":"pcbi.1010419.ref010","doi-asserted-by":"crossref","first-page":"1330","DOI":"10.1038\/s41588-019-0483-y","article-title":"Inferring whole-genome histories in large population datasets","volume":"51","author":"J Kelleher","year":"2019","journal-title":"Nat Genet"},{"issue":"9","key":"pcbi.1010419.ref011","doi-asserted-by":"crossref","first-page":"1321","DOI":"10.1038\/s41588-019-0484-x","article-title":"A method for genome-wide genealogy estimation for thousands of samples","volume":"51","author":"L Speidel","year":"2019","journal-title":"Nat Genet"},{"issue":"3","key":"pcbi.1010419.ref012","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1006\/tpbi.1998.1403","article-title":"Recombination as a point process along sequences","volume":"55","author":"C Wiuf","year":"1999","journal-title":"Theor Popul Biol"},{"issue":"1459","key":"pcbi.1010419.ref013","doi-asserted-by":"crossref","first-page":"1387","DOI":"10.1098\/rstb.2005.1673","article-title":"Approximating the coalescent with recombination","volume":"360","author":"GAT McVean","year":"2005","journal-title":"Philos Trans R Soc B"},{"issue":"7357","key":"pcbi.1010419.ref014","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1038\/nature10231","article-title":"Inference of human population history from individual whole-genome sequences","volume":"475","author":"H Li","year":"2011","journal-title":"Nature"},{"issue":"8","key":"pcbi.1010419.ref015","doi-asserted-by":"crossref","first-page":"919","DOI":"10.1038\/ng.3015","article-title":"Inferring human population size and separation history from multiple genome sequences","volume":"46","author":"S Schiffels","year":"2014","journal-title":"Nat Genet"},{"issue":"3","key":"pcbi.1010419.ref016","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pgen.1008552","article-title":"Tracking human population structure through time from whole genome sequences","volume":"16","author":"K Wang","year":"2020","journal-title":"PLoS Genet"},{"issue":"3","key":"pcbi.1010419.ref017","doi-asserted-by":"crossref","first-page":"647","DOI":"10.1534\/genetics.112.149096","article-title":"Estimating Variable Effective Population Sizes from Multiple Genomes: A Sequentially Markov Conditional Sampling Distribution Approach","volume":"194","author":"S Sheehan","year":"2013","journal-title":"Genetics"},{"issue":"34","key":"pcbi.1010419.ref018","doi-asserted-by":"crossref","first-page":"17115","DOI":"10.1073\/pnas.1905060116","article-title":"Inference of complex population histories using whole-genome sequences from multiple populations","volume":"116","author":"M Steinr\u00fccken","year":"2019","journal-title":"Proc Natl Acad Sci USA"},{"key":"pcbi.1010419.ref019","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1016\/j.gde.2018.07.002","article-title":"Inference of population history using coalescent HMMs: review and outlook","volume":"53","author":"JP Spence","year":"2018","journal-title":"Curr Opin Genet Dev"},{"issue":"7","key":"pcbi.1010419.ref020","doi-asserted-by":"crossref","first-page":"2231","DOI":"10.1111\/1755-0998.13416","article-title":"Limits and convergence properties of the sequentially Markovian coalescent","volume":"21","author":"TPP Sellinger","year":"2021","journal-title":"Mol Ecol Resour"},{"key":"pcbi.1010419.ref021","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.tpb.2017.09.002","article-title":"Computing the joint distribution of the total tree length across loci in populations with variable size","volume":"118","author":"A Miroshnikov","year":"2017","journal-title":"Theor Popul Biol"},{"issue":"3","key":"pcbi.1010419.ref022","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/0304-4149(82)90011-4","article-title":"The coalescent","volume":"13","author":"JFC Kingman","year":"1982","journal-title":"Stoch Process Their Appl"},{"key":"pcbi.1010419.ref023","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1007\/978-1-4757-2609-1_16","volume-title":"Progress in Population Genetics and Human Evolution","author":"RC Griffiths","year":"1997"},{"issue":"2","key":"pcbi.1010419.ref024","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1093\/bioinformatics\/18.2.337","article-title":"Generating samples under a Wright-Fisher neutral model of genetic variation","volume":"18","author":"RR Hudson","year":"2002","journal-title":"Bioinformatics"},{"issue":"5","key":"pcbi.1010419.ref025","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pcbi.1004842","article-title":"Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes","volume":"12","author":"J Kelleher","year":"2016","journal-title":"PLoS Comput Biol"},{"issue":"3","key":"pcbi.1010419.ref026","doi-asserted-by":"crossref","first-page":"iyab229","DOI":"10.1093\/genetics\/iyab229","article-title":"Efficient ancestry and mutation simulation with msprime 1.0","volume":"220","author":"F Baumdicker","year":"2022","journal-title":"Genetics"},{"issue":"1","key":"pcbi.1010419.ref027","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1186\/1471-2156-7-16","article-title":"Fast \u201ccoalescent\u201d simulation","volume":"7","author":"P Marjoram","year":"2006","journal-title":"BMC Genet"},{"key":"pcbi.1010419.ref028","article-title":"Exact decoding of the sequentially Markov coalescent","author":"C Ki","year":"2020","journal-title":"bioRxiv"},{"issue":"1","key":"pcbi.1010419.ref029","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1006\/tpbi.1997.1307","article-title":"A Markov chain model of coalescence with recombination","volume":"52","author":"KL Simonsen","year":"1997","journal-title":"Theor Popul Biol"},{"issue":"3","key":"pcbi.1010419.ref030","doi-asserted-by":"crossref","first-page":"307","DOI":"10.1214\/ss\/1177010378","article-title":"Ancestral Inference in Population Genetics","volume":"9","author":"RC Griffiths","year":"1994","journal-title":"Statist Sci"},{"key":"pcbi.1010419.ref031","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-78168-6","volume-title":"Probability Models for DNA Sequence Evolution","author":"R Durrett","year":"2008"},{"issue":"1","key":"pcbi.1010419.ref032","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1016\/0771-050X(80)90013-3","article-title":"A family of embedded Runge-Kutta formulae","volume":"6","author":"JR Dormand","year":"1980","journal-title":"J Comput Appl Math"},{"key":"pcbi.1010419.ref033","volume-title":"Pattern Recognition and Machine Learning","author":"C Bishop","year":"2006"},{"issue":"4","key":"pcbi.1010419.ref034","doi-asserted-by":"crossref","first-page":"308","DOI":"10.1093\/comjnl\/7.4.308","article-title":"A Simplex Method for Function Minimization","volume":"7","author":"JA Nelder","year":"1965","journal-title":"Comput J"},{"issue":"4","key":"pcbi.1010419.ref035","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1080\/00401706.1962.10490033","article-title":"Sequential Application of Simplex Designs in Optimisation and Evolutionary Operation","volume":"4","author":"W Spendley","year":"1962","journal-title":"Technometrics"},{"issue":"1","key":"pcbi.1010419.ref036","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1007\/s10589-010-9329-3","article-title":"Implementing the Nelder-Mead simplex algorithm with adaptive parameters","volume":"51","author":"F Gao","year":"2012","journal-title":"Comput Optim Appl"},{"issue":"5","key":"pcbi.1010419.ref037","doi-asserted-by":"crossref","first-page":"730","DOI":"10.1093\/sysbio\/syz008","article-title":"Robust Design for Coalescent Model Inference","volume":"68","author":"KV Parag","year":"2019","journal-title":"Syst Biol"},{"key":"pcbi.1010419.ref038","article-title":"High coverage whole genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios","author":"M Byrska-Bishop","year":"2021","journal-title":"bioRxiv"},{"key":"pcbi.1010419.ref039","doi-asserted-by":"crossref","first-page":"e54967","DOI":"10.7554\/eLife.54967","article-title":"A community-maintained standard library of population genetic models","volume":"9","author":"JR Adrion","year":"2020","journal-title":"Elife"},{"issue":"10","key":"pcbi.1010419.ref040","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pgen.1000695","article-title":"Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data","volume":"5","author":"RN Gutenkunst","year":"2009","journal-title":"PLoS Genet"},{"issue":"3","key":"pcbi.1010419.ref041","doi-asserted-by":"crossref","first-page":"1549","DOI":"10.1534\/genetics.117.200493","article-title":"Inferring the Joint Demographic History of Multiple Populations: Beyond the Diffusion Approximation","volume":"206","author":"J Jouganous","year":"2017","journal-title":"Genetics"},{"issue":"1","key":"pcbi.1010419.ref042","doi-asserted-by":"crossref","first-page":"50","DOI":"10.3390\/genes11010050","article-title":"Consensify: A Method for Generating Pseudohaploid Genome Sequences from Palaeogenomic Datasets with Reduced Error Rates","volume":"11","author":"A Barlow","year":"2020","journal-title":"Genes"},{"key":"pcbi.1010419.ref043","article-title":"Human generation times across the past 250,000 years","author":"RJ Wang","year":"2021","journal-title":"bioRxiv"},{"issue":"9","key":"pcbi.1010419.ref044","doi-asserted-by":"crossref","first-page":"3497","DOI":"10.1093\/molbev\/msab174","article-title":"Inferring Population Histories for Ancient Genomes Using Genome-Wide Genealogies","volume":"38","author":"L Speidel","year":"2021","journal-title":"Mol Biol Evol"},{"issue":"9","key":"pcbi.1010419.ref045","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pgen.1008384","article-title":"An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data","volume":"15","author":"AJ Stern","year":"2019","journal-title":"PLoS Genet"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1010419","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,9,28]],"date-time":"2022-09-28T00:00:00Z","timestamp":1664323200000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010419","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,28]],"date-time":"2022-09-28T14:36:14Z","timestamp":1664375774000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010419"}},"subtitle":[],"editor":[{"given":"Stephan","family":"Schiffels","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,9,16]]},"references-count":45,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2022,9,16]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1010419","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.05.22.445274","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,16]]}}}