{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,8]],"date-time":"2026-03-08T17:35:54Z","timestamp":1772991354244,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1010409","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,9,6]],"date-time":"2022-09-06T00:00:00Z","timestamp":1662422400000}}],"reference-count":51,"publisher":"Public Library of Science (PLoS)","issue":"8","license":[{"start":{"date-parts":[[2022,8,24]],"date-time":"2022-08-24T00:00:00Z","timestamp":1661299200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100007251","name":"HSE University","doi-asserted-by":"crossref","award":["Basic Research Program"],"award-info":[{"award-number":["Basic Research Program"]}],"id":[{"id":"10.13039\/501100007251","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100007251","name":"HSE University","doi-asserted-by":"crossref","award":["Basic Research Program"],"award-info":[{"award-number":["Basic Research Program"]}],"id":[{"id":"10.13039\/501100007251","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100007251","name":"HSE University","doi-asserted-by":"crossref","award":["Basic Research Program"],"award-info":[{"award-number":["Basic Research Program"]}],"id":[{"id":"10.13039\/501100007251","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100007251","name":"HSE University","doi-asserted-by":"crossref","award":["Project Teams framework of MIEM HSE"],"award-info":[{"award-number":["Project Teams framework of MIEM HSE"]}],"id":[{"id":"10.13039\/501100007251","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100002261","name":"\u0420\u043e\u0441\u0441\u0438\u0439\u0441\u043a\u0438\u0439 \u0424\u043e\u043d\u0434 \u0424\u0443\u043d\u0434\u0430\u043c\u0435\u043d\u0442\u0430\u043b\u044c\u043d\u044b\u0445 
\u0418\u0441\u0441\u043b\u0435\u0434\u043e\u0432\u0430\u043d\u0438\u0439","doi-asserted-by":"publisher","award":["20-04-60556"],"award-info":[{"award-number":["20-04-60556"]}],"id":[{"id":"10.13039\/501100002261","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000009","name":"Foundation for the National Institutes of Health","doi-asserted-by":"publisher","award":["R35GM128932"],"award-info":[{"award-number":["R35GM128932"]}],"id":[{"id":"10.13039\/100000009","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100013060","name":"European Molecular Biology Laboratory","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100013060","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>\n                    Accurate simulation of complex biological processes is an essential component of developing and validating new technologies and inference approaches. As an effort to help contain the COVID-19 pandemic, large numbers of SARS-CoV-2 genomes have been sequenced from most regions in the world. More than 5.5 million viral sequences are publicly available as of November 2021. Many studies estimate viral genealogies from these sequences, as these can provide valuable information about the spread of the pandemic across time and space. Additionally such data are a rich source of information about molecular evolutionary processes including natural selection, for example allowing the identification of new variants with transmissibility and immunity evasion advantages. To our knowledge, there is no framework that is both efficient and flexible enough to simulate the pandemic to approximate world-scale scenarios and generate viral genealogies of millions of samples. 
Here, we introduce a new fast simulator\n                    <jats:monospace>VGsim<\/jats:monospace>\n                    which addresses the problem of simulating genealogies under epidemiological models. The simulation process is split into two phases. During the forward run the algorithm generates a chain of population-level events reflecting the dynamics of the pandemic using a hierarchical version of the Gillespie algorithm. During the backward run a coalescent-like approach generates a tree genealogy of samples conditioning on the population-level events chain generated during the forward run. Our software can model complex population structure, epistasis and immunity escape.\n                  <\/jats:p>","DOI":"10.1371\/journal.pcbi.1010409","type":"journal-article","created":{"date-parts":[[2022,8,24]],"date-time":"2022-08-24T13:44:45Z","timestamp":1661348685000},"page":"e1010409","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":12,"title":["VGsim: Scalable viral genealogy simulator for global pandemic"],"prefix":"10.1371","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1431-5260","authenticated-orcid":true,"given":"Vladimir","family":"Shchur","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0331-8060","authenticated-orcid":true,"given":"Vadim","family":"Spirin","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2682-9867","authenticated-orcid":true,"given":"Dmitry","family":"Sirotkin","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8149-0483","authenticated-orcid":true,"given":"Evgeni","family":"Burovski","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1776-8564","authenticated-orcid":true,"given":"Nicola","family":"De 
Maio","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6535-2478","authenticated-orcid":true,"given":"Russell","family":"Corbett-Detig","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2022,8,24]]},"reference":[{"key":"pcbi.1010409.ref001","article-title":"Want to track pandemic variants faster? Fix the bioinformatics bottleneck","author":"EB Hodcroft","year":"2021","journal-title":"Nature Publishing Group"},{"issue":"6501","key":"pcbi.1010409.ref002","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1126\/science.abc1917","article-title":"Introductions and early spread of SARS-CoV-2 in the New York City area","volume":"369","author":"AS Gonzalez-Reiche","year":"2020","journal-title":"Science"},{"issue":"9","key":"pcbi.1010409.ref003","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2012008118","article-title":"The origin and early spread of SARS-CoV-2 in Europe","volume":"118","author":"SA Nadeau","year":"2021","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"5","key":"pcbi.1010409.ref004","doi-asserted-by":"crossref","DOI":"10.1128\/mBio.02107-20","article-title":"An Early Pandemic Analysis of SARS-CoV-2 Population Structure and Dynamics in Arizona","volume":"11","author":"JT Ladner","year":"2020","journal-title":"mBio"},{"issue":"1","key":"pcbi.1010409.ref005","doi-asserted-by":"crossref","first-page":"649","DOI":"10.1038\/s41467-020-20880-z","article-title":"Genomic epidemiology of the early stages of the SARS-CoV-2 outbreak in Russia","volume":"12","author":"AB Komissarov","year":"2021","journal-title":"Nature Communications"},{"key":"pcbi.1010409.ref006","article-title":"Epidemic waves of COVID-19 in Scotland: a genomic perspective on the impact of the introduction and relaxation of lockdown on SARS-CoV-2","author":"SJ 
Lycett","year":"2021","journal-title":"medRxiv"},{"issue":"3","key":"pcbi.1010409.ref007","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1038\/s41591-021-01255-3","article-title":"Sixteen novel lineages of SARS-CoV-2 in South Africa","volume":"27","author":"H Tegally","year":"2021","journal-title":"Nature Medicine"},{"key":"pcbi.1010409.ref008","article-title":"Multiple SARS-CoV-2 variants escape neutralization by vaccine-induced humoral immunity","author":"WF Garcia-Beltran","year":"2021","journal-title":"Cell"},{"issue":"4","key":"pcbi.1010409.ref009","doi-asserted-by":"crossref","first-page":"571","DOI":"10.1038\/s41591-021-01290-0","article-title":"Assessing the human immune response to SARS-CoV-2 variants","volume":"27","author":"R Burioni","year":"2021","journal-title":"Nature Medicine"},{"issue":"49","key":"pcbi.1010409.ref010","doi-asserted-by":"crossref","first-page":"31519","DOI":"10.1073\/pnas.2012331117","article-title":"Global analysis of more than 50,000 SARS-CoV-2 genomes reveals epistasis between eight viral genes","volume":"117","author":"HL Zeng","year":"2020","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"29","key":"pcbi.1010409.ref011","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2104241118","article-title":"Ongoing global and regional adaptive evolution of SARS-CoV-2","volume":"118","author":"ND Rochman","year":"2021","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"5","key":"pcbi.1010409.ref012","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pcbi.1004842","article-title":"Efficient Coalescent Simulation and Genealogical Analysis for Large Sample Sizes","volume":"12","author":"J Kelleher","year":"2016","journal-title":"PLOS Computational Biology"},{"issue":"9","key":"pcbi.1010409.ref013","doi-asserted-by":"crossref","first-page":"1266","DOI":"10.1093\/bioinformatics\/btu014","article-title":"Efficient haplotype matching and storage using the 
positional Burrows\u2013Wheeler transform (PBWT)","volume":"30","author":"R Durbin","year":"2014","journal-title":"Bioinformatics"},{"key":"pcbi.1010409.ref014","article-title":"Fast and scalable genome-wide inference of local tree topologies from large number of haplotypes based on tree consistent PBWT data structure","author":"V Shchur","year":"2019","journal-title":"bioRxiv"},{"issue":"9","key":"pcbi.1010409.ref015","doi-asserted-by":"crossref","first-page":"1330","DOI":"10.1038\/s41588-019-0483-y","article-title":"Inferring whole-genome histories in large population datasets","volume":"51","author":"J Kelleher","year":"2019","journal-title":"Nature Genetics"},{"issue":"A","key":"pcbi.1010409.ref016","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1017\/S0021900200034446","article-title":"On the genealogy of large populations","volume":"19","author":"JFC Kingman","year":"1982","journal-title":"Journal of Applied Probability"},{"issue":"594-604","key":"pcbi.1010409.ref017","first-page":"309","article-title":"On the mathematical foundations of theoretical statistics","volume":"222","author":"RA Fisher","year":"1922","journal-title":"Philosophical Transactions of the Royal Society of London Series A, Containing Papers of a Mathematical or Physical Character"},{"issue":"2","key":"pcbi.1010409.ref018","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1093\/genetics\/16.2.97","article-title":"EVOLUTION IN MENDELIAN POPULATIONS","volume":"16","author":"S Wright","year":"1931","journal-title":"Genetics"},{"issue":"4","key":"pcbi.1010409.ref019","doi-asserted-by":"crossref","first-page":"2213","DOI":"10.1093\/genetics\/165.4.2213","article-title":"Modeling Linkage Disequilibrium and Identifying Recombination Hotspots Using Single-Nucleotide Polymorphism Data","volume":"165","author":"N 
Li","year":"2003","journal-title":"Genetics"},{"issue":"6","key":"pcbi.1010409.ref020","doi-asserted-by":"crossref","first-page":"809","DOI":"10.1038\/s41588-021-00862-7","article-title":"Ultrafast Sample placement on Existing tRees (UShER) enables real-time phylogenetics for the SARS-CoV-2 pandemic","volume":"53","author":"Y Turakhia","year":"2021","journal-title":"Nature Genetics"},{"key":"pcbi.1010409.ref021","article-title":"matUtils: Tools to Interpret and Manipulate Mutation Annotated Trees","author":"J McBroome","year":"2021","journal-title":"bioRxiv"},{"issue":"3","key":"pcbi.1010409.ref022","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1016\/0304-4149(82)90011-4","article-title":"The coalescent","volume":"13","author":"JFC Kingman","year":"1982","journal-title":"Stochastic Processes and their Applications"},{"issue":"5","key":"pcbi.1010409.ref023","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1038\/nrg795","article-title":"Genealogical trees, coalescent theory and the analysis of genetic polymorphisms","volume":"3","author":"NA Rosenberg","year":"2002","journal-title":"Nature Reviews Genetics"},{"issue":"5","key":"pcbi.1010409.ref024","doi-asserted-by":"crossref","first-page":"1185","DOI":"10.1093\/molbev\/msi103","article-title":"Bayesian Coalescent Inference of Past Population Dynamics from Molecular Sequences","volume":"22","author":"AJ Drummond","year":"2005","journal-title":"Molecular Biology and Evolution"},{"issue":"1","key":"pcbi.1010409.ref025","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1534\/genetics.116.198796","article-title":"The Bacterial Sequential Markov Coalescent","volume":"206","author":"N De Maio","year":"2017","journal-title":"Genetics"},{"issue":"4","key":"pcbi.1010409.ref026","doi-asserted-by":"crossref","first-page":"1421","DOI":"10.1534\/genetics.109.106021","article-title":"Phylodynamics of Infectious Disease Epidemics","volume":"183","author":"EM 
Volz","year":"2009","journal-title":"Genetics"},{"issue":"3","key":"pcbi.1010409.ref027","first-page":"1","article-title":"Viral Phylodynamics","volume":"9","author":"EM Volz","year":"2013","journal-title":"PLOS Computational Biology"},{"key":"pcbi.1010409.ref028","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1016\/j.tpb.2013.10.002","article-title":"Birth\u2013death models and coalescent point processes: The shape and probability of reconstructed phylogenies","volume":"90","author":"A Lambert","year":"2013","journal-title":"Theoretical Population Biology"},{"issue":"1","key":"pcbi.1010409.ref029","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1016\/j.jtbi.2009.07.018","article-title":"On incomplete sampling under birth\u2013death models and connections to the sampling-based coalescent","volume":"261","author":"T Stadler","year":"2009","journal-title":"Journal of Theoretical Biology"},{"key":"pcbi.1010409.ref030","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1007\/978-3-540-78911-6_2","volume-title":"Mathematical epidemiology","author":"F Brauer","year":"2008"},{"issue":"11","key":"pcbi.1010409.ref031","article-title":"Bayesian phylodynamic inference with complex models","volume":"14","author":"EM Volz","year":"2018","journal-title":"PLOS Computational Biology"},{"key":"pcbi.1010409.ref032","article-title":"Simulating trajectories and phylogenies from population dynamics models with TiPS","author":"G Danesh","year":"2020","journal-title":"bioRxiv"},{"issue":"24","key":"pcbi.1010409.ref033","doi-asserted-by":"crossref","first-page":"3839","DOI":"10.1093\/bioinformatics\/btw556","article-title":"Discoal: flexible coalescent simulations with selection","volume":"32","author":"AD Kern","year":"2016","journal-title":"Bioinformatics"},{"issue":"16","key":"pcbi.1010409.ref034","doi-asserted-by":"crossref","first-page":"2064","DOI":"10.1093\/bioinformatics\/btq322","article-title":"MSMS: a coalescent simulation program including 
recombination, demographic structure and selection at a single locus","volume":"26","author":"G Ewing","year":"2010","journal-title":"Bioinformatics"},{"issue":"2","key":"pcbi.1010409.ref035","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pgen.1001301","article-title":"Prevalence of Epistasis in the Evolution of Influenza A Surface Proteins","volume":"7","author":"S Kryazhimskiy","year":"2011","journal-title":"PLOS Genetics"},{"issue":"43","key":"pcbi.1010409.ref036","doi-asserted-by":"crossref","first-page":"15376","DOI":"10.1073\/pnas.0404125101","article-title":"The contribution of epistasis to the architecture of fitness in an RNA virus","volume":"101","author":"R Sanju\u00e1n","year":"2004","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"pcbi.1010409.ref037","article-title":"phastSim: efficient simulation of sequence evolution for pandemic-scale datasets","author":"N De Maio","year":"2021","journal-title":"bioRxiv"},{"key":"pcbi.1010409.ref038","first-page":"700","article-title":"A contribution to the mathematical theory of epidemics","volume":"115","author":"WO Kermack","year":"1927","journal-title":"Proceedings of Royal Society A"},{"issue":"1","key":"pcbi.1010409.ref039","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1146\/annurev.physchem.58.032806.104637","article-title":"Stochastic Simulation of Chemical Kinetics","volume":"58","author":"DT Gillespie","year":"2007","journal-title":"Annual Review of Physical Chemistry"},{"issue":"13","key":"pcbi.1010409.ref040","doi-asserted-by":"crossref","first-page":"134116","DOI":"10.1063\/1.4896985","article-title":"Efficient rejection-based simulation of biochemical reactions with stochastic noise and delays","volume":"141","author":"VH Thanh","year":"2014","journal-title":"The Journal of Chemical 
Physics"},{"issue":"9","key":"pcbi.1010409.ref041","doi-asserted-by":"crossref","first-page":"4059","DOI":"10.1063\/1.1778376","article-title":"Efficient formulation of the stochastic simulation algorithm for chemically reacting systems","volume":"121","author":"Y Cao","year":"2004","journal-title":"The Journal of Chemical Physics"},{"issue":"2","key":"pcbi.1010409.ref042","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1109\/MCSE.2010.118","article-title":"Cython: The best of both worlds","volume":"13","author":"S Behnel","year":"2011","journal-title":"Computing in Science & Engineering"},{"issue":"7825","key":"pcbi.1010409.ref043","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","article-title":"Array programming with NumPy","volume":"585","author":"CR Harris","year":"2020","journal-title":"Nature"},{"key":"pcbi.1010409.ref044","unstructured":"Burovski E, Godyaev D, Gorbunova V. mc_lib: Assorted small utilities for MC simulations with Cython;."},{"issue":"6","key":"pcbi.1010409.ref045","doi-asserted-by":"crossref","first-page":"1480","DOI":"10.1093\/molbev\/mst057","article-title":"A Stochastic Simulator of Birth\u2013Death Master Equations with Application to Phylodynamics","volume":"30","author":"TG Vaughan","year":"2013","journal-title":"Molecular Biology and Evolution"},{"issue":"11","key":"pcbi.1010409.ref046","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pone.0242128","article-title":"Reproductive number of coronavirus: A systematic review and meta-analysis based on global level evidence","volume":"15","author":"MA Billah","year":"2020","journal-title":"PLOS ONE"},{"key":"pcbi.1010409.ref047","article-title":"Neuer Beweis eines Satzes \u00fcber Permutationen","author":"H Pr\u00fcfer","year":"1918","journal-title":"Arch Math Phys"},{"key":"pcbi.1010409.ref048","first-page":"012050","article-title":"HPC Resources of the Higher School of Economics","volume":"1740","author":"PS 
Kostenetskiy","year":"2021","journal-title":"Journal of Physics: Conference Series"},{"issue":"8","key":"pcbi.1010409.ref049","doi-asserted-by":"crossref","first-page":"1002","DOI":"10.1111\/2041-210X.13422","article-title":"nosoi: A stochastic agent-based transmission chain simulation framework in R","volume":"11","author":"S Lequime","year":"2020","journal-title":"Methods in Ecology and Evolution"},{"issue":"11","key":"pcbi.1010409.ref050","doi-asserted-by":"crossref","first-page":"1852","DOI":"10.1093\/bioinformatics\/bty921","article-title":"FAVITES: simultaneous simulation of transmission networks, phylogenetic trees and sequences","volume":"35","author":"N Moshiri","year":"2018","journal-title":"Bioinformatics"},{"key":"pcbi.1010409.ref051","article-title":"Pandemic-Scale Phylogenomics Reveals Elevated Recombination Rates in the SARS-CoV-2 Spike Region","author":"Y Turakhia","year":"2021","journal-title":"bioRxiv"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1010409","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,9,6]],"date-time":"2022-09-06T00:00:00Z","timestamp":1662422400000}}],"container-title":["PLOS Computational 
Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010409","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,9,6]],"date-time":"2022-09-06T13:53:22Z","timestamp":1662472402000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010409"}},"subtitle":[],"editor":[{"given":"Manja","family":"Marz","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,8,24]]},"references-count":51,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2022,8,24]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1010409","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.04.21.21255891","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,24]]}}}