{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T10:26:57Z","timestamp":1773829617201,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1009182","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,7,16]],"date-time":"2021-07-16T00:00:00Z","timestamp":1626393600000}}],"reference-count":52,"publisher":"Public Library of Science (PLoS)","issue":"7","license":[{"start":{"date-parts":[[2021,7,6]],"date-time":"2021-07-06T00:00:00Z","timestamp":1625529600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000865","name":"Bill and Melinda Gates Foundation","doi-asserted-by":"publisher","award":["OPP1195157"],"award-info":[{"award-number":["OPP1195157"]}],"id":[{"id":"10.13039\/100000865","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>\n                    Sample size calculations are an essential component of the design and evaluation of scientific studies. However, there is a lack of clear guidance for determining the sample size needed for phylogenetic studies, which are becoming an essential part of studying pathogen transmission. We introduce a statistical framework for determining the number of true infector-infectee transmission pairs identified by a phylogenetic study, given the size and population coverage of that study. We then show how characteristics of the criteria used to determine linkage and aspects of the study design can influence our ability to correctly identify transmission links, in sometimes counterintuitive ways. We test the overall approach using outbreak simulations and provide guidance for calculating the sensitivity and specificity of the linkage criteria, the key inputs to our approach. The framework is freely available as the R package\n                    <jats:italic>phylosamp<\/jats:italic>\n                    , and is broadly applicable to designing and evaluating a wide array of pathogen phylogenetic studies.\n                  <\/jats:p>","DOI":"10.1371\/journal.pcbi.1009182","type":"journal-article","created":{"date-parts":[[2021,7,6]],"date-time":"2021-07-06T13:43:16Z","timestamp":1625578996000},"page":"e1009182","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":15,"title":["Sample size calculation for phylogenetic case linkage"],"prefix":"10.1371","volume":"17","author":[{"given":"Shirlee","family":"Wohl","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0954-4093","authenticated-orcid":true,"given":"John R.","family":"Giles","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9741-8109","authenticated-orcid":true,"given":"Justin","family":"Lessler","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2021,7,6]]},"reference":[{"key":"pcbi.1009182.ref001","doi-asserted-by":"crossref","DOI":"10.1128\/JCM.00480-18","article-title":"Real-Time Analysis and Visualization of Pathogen Sequence Data","volume":"56","author":"RA Neher","year":"2018","journal-title":"J Clin Microbiol"},{"key":"pcbi.1009182.ref002","doi-asserted-by":"crossref","first-page":"228","DOI":"10.1038\/nature16996","article-title":"Real-time, portable genome sequencing for Ebola surveillance","volume":"530","author":"J Quick","year":"2016","journal-title":"Nature"},{"key":"pcbi.1009182.ref003","doi-asserted-by":"crossref","first-page":"730","DOI":"10.1056\/NEJMoa1003176","article-title":"Whole-genome sequencing and social-network analysis of a tuberculosis outbreak","volume":"364","author":"JL Gardy","year":"2011","journal-title":"N Engl J Med"},{"key":"pcbi.1009182.ref004","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1093\/cid\/ciw242","article-title":"Implementation of Nationwide Real-time Whole-genome Sequencing to Enhance Listeriosis Outbreak Detection and Investigation","volume":"63","author":"BR Jackson","year":"2016","journal-title":"Clin Infect Dis"},{"key":"pcbi.1009182.ref005","doi-asserted-by":"crossref","first-page":"346","DOI":"10.15585\/mmwr.mm6513a3","article-title":"Surveillance Systems to Track Progress Toward Polio Eradication\u2014Worldwide, 2014\u20132015","volume":"65","author":"CJ Snider","year":"2016","journal-title":"MMWR Morb Mortal Wkly Rep"},{"key":"pcbi.1009182.ref006","doi-asserted-by":"crossref","first-page":"466","DOI":"10.2174\/138920211797904052","article-title":"Prospective of Genomics in Revealing Transmission, Reassortment and Evolution of Wildlife-Borne Avian Influenza A (H5N1) Viruses","volume":"12","author":"F Lei","year":"2011","journal-title":"Curr Genomics"},{"key":"pcbi.1009182.ref007","doi-asserted-by":"crossref","first-page":"1220","DOI":"10.1371\/journal.ppat.0030131","article-title":"Phylogenetic analysis reveals the global migration of seasonal influenza A viruses","volume":"3","author":"MI Nelson","year":"2007","journal-title":"PLoS Pathog"},{"key":"pcbi.1009182.ref008","article-title":"Introductions and early spread of SARS-CoV-2 in the New York City area","author":"AS Gonzalez-Reiche","year":"2020","journal-title":"Science"},{"key":"pcbi.1009182.ref009","doi-asserted-by":"crossref","first-page":"855","DOI":"10.1016\/j.chom.2018.04.017","article-title":"Genomic Epidemiology Reconstructs the Introduction and Spread of Zika Virus in Central America and Mexico","volume":"23","author":"J Th\u00e9z\u00e9","year":"2018","journal-title":"Cell Host Microbe"},{"key":"pcbi.1009182.ref010","doi-asserted-by":"crossref","first-page":"230","DOI":"10.1038\/s41586-018-0818-3","article-title":"Genomic insights into the 2016\u20132017 cholera epidemic in Yemen","volume":"565","author":"F-X Weill","year":"2019","journal-title":"Nature"},{"key":"pcbi.1009182.ref011","doi-asserted-by":"crossref","first-page":"97","DOI":"10.1038\/nature14594","article-title":"Temporal and spatial analysis of the 2014\u20132015 Ebola virus outbreak in West Africa","volume":"524","author":"MW Carroll","year":"2015","journal-title":"Nature"},{"key":"pcbi.1009182.ref012","doi-asserted-by":"crossref","first-page":"1516","DOI":"10.1016\/j.cell.2015.06.007","article-title":"Ebola Virus Epidemiology, Transmission, and Evolution during Seven Months in Sierra Leone","volume":"161","author":"DJ Park","year":"2015","journal-title":"Cell"},{"key":"pcbi.1009182.ref013","doi-asserted-by":"crossref","first-page":"e173","DOI":"10.1016\/S2352-3018(19)30378-9","article-title":"Quantifying HIV transmission flow between high-prevalence hotspots and surrounding communities: a population-based study in Rakai, Uganda","volume":"7","author":"O Ratmann","year":"2020","journal-title":"Lancet HIV"},{"key":"pcbi.1009182.ref014","doi-asserted-by":"crossref","first-page":"9535","DOI":"10.1073\/pnas.1120621109","article-title":"Revealing the microscale spatial signature of dengue transmission and immunity in an urban population","volume":"109","author":"H Salje","year":"2012","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1009182.ref015","doi-asserted-by":"crossref","first-page":"e1003397","DOI":"10.1371\/journal.pcbi.1003397","article-title":"Inferring the source of transmission with phylogenetic data","volume":"9","author":"EM Volz","year":"2013","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1009182.ref016","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1016\/j.epidem.2014.09.001","article-title":"Eight challenges in phylodynamic inference","volume":"10","author":"SDW Frost","year":"2015","journal-title":"Epidemics"},{"key":"pcbi.1009182.ref017","doi-asserted-by":"crossref","first-page":"e8","DOI":"10.1016\/S2352-3018(16)30184-9","article-title":"Phylogenetic insights into age-disparate partnerships and HIV","author":"MK Grabowski","year":"2017","journal-title":"The lancet. HIV"},{"key":"pcbi.1009182.ref018","article-title":"Regaining perspective on SARS-CoV-2 molecular tracing and its implications","author":"C Mavian","year":"2020","journal-title":"medRxiv"},{"key":"pcbi.1009182.ref019","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1186\/s13073-014-0101-7","article-title":"A phylogeny-based sampling strategy and power calculator informs genome-wide associations study design for microbial pathogens","volume":"6","author":"MR Farhat","year":"2014","journal-title":"Genome Med"},{"key":"pcbi.1009182.ref020","doi-asserted-by":"crossref","first-page":"2461","DOI":"10.1093\/bioinformatics\/btv183","article-title":"Power and sample-size estimation for microbiome studies using pairwise distances and PERMANOVA","volume":"31","author":"BJ Kelly","year":"2015","journal-title":"Bioinformatics"},{"key":"pcbi.1009182.ref021","author":"HPT Network","year":"2013","journal-title":"HPTN 071: population effects of antiretroviral therapy to reduce HIV transmission (PopART): a cluster-randomized trial of the impact of a combination prevention package on population-level HIV incidence in Zambia and South Africa"},{"key":"pcbi.1009182.ref022","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1002\/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3","article-title":"Index for rating diagnostic tests","volume":"3","author":"WJ Youden","year":"1950","journal-title":"Cancer"},{"key":"pcbi.1009182.ref023","doi-asserted-by":"crossref","first-page":"670","DOI":"10.1093\/aje\/kwj063","article-title":"The inconsistency of \u201coptimal\u201d cutpoints obtained using two criteria based on the receiver operating characteristic curve","volume":"163","author":"NJ Perkins","year":"2006","journal-title":"Am J Epidemiol"},{"key":"pcbi.1009182.ref024","doi-asserted-by":"crossref","first-page":"2676","DOI":"10.1002\/sim.4509","article-title":"Classification accuracy and cut point selection","volume":"31","author":"X Liu","year":"2012","journal-title":"Stat Med"},{"key":"pcbi.1009182.ref025","doi-asserted-by":"crossref","first-page":"807","DOI":"10.1016\/j.acra.2013.02.004","article-title":"Optimal thresholds by maximizing or minimizing various metrics via ROC-type analysis","volume":"20","author":"KH Zou","year":"2013","journal-title":"Acad Radiol"},{"key":"pcbi.1009182.ref026","doi-asserted-by":"crossref","first-page":"e1003457","DOI":"10.1371\/journal.pcbi.1003457","article-title":"Bayesian reconstruction of disease outbreaks by combining epidemiologic and genomic data","volume":"10","author":"T Jombart","year":"2014","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1009182.ref027","unstructured":"Team RC, Others. R: A language and environment for statistical computing. 2013. Available: http:\/\/finzi.psych.upenn.edu\/R\/library\/dplR\/doc\/intro-dplR.pdf"},{"key":"pcbi.1009182.ref028","doi-asserted-by":"crossref","first-page":"749","DOI":"10.2307\/3215356","article-title":"On the distribution of distances in recursive trees","volume":"33","author":"RP Dobrow","year":"1996","journal-title":"J Appl Probab"},{"key":"pcbi.1009182.ref029","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1214\/aoap\/1042765668","article-title":"Distribution of distances in random binary search trees","volume":"13","author":"HM Mahmoud","year":"2003","journal-title":"Ann Appl Probab"},{"key":"pcbi.1009182.ref030","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1016\/j.epidem.2016.10.001","article-title":"Estimating infectious disease transmission distances using the overall distribution of cases","volume":"17","author":"H Salje","year":"2016","journal-title":"Epidemics"},{"key":"pcbi.1009182.ref031","doi-asserted-by":"crossref","first-page":"1395","DOI":"10.1534\/genetics.114.171538","article-title":"The distribution of pairwise genetic distances: a tool for investigating disease transmission","volume":"198","author":"CJ Worby","year":"2014","journal-title":"Genetics"},{"key":"pcbi.1009182.ref032","doi-asserted-by":"crossref","first-page":"e1006885","DOI":"10.1371\/journal.ppat.1006885","article-title":"When are pathogen genome sequences informative of transmission events?","volume":"14","author":"F Campbell","year":"2018","journal-title":"PLoS Pathog"},{"key":"pcbi.1009182.ref033","doi-asserted-by":"crossref","first-page":"156","DOI":"10.1007\/s00239-001-0064-3","article-title":"Rates of molecular evolution in RNA viruses: a quantitative phylogenetic analysis","volume":"54","author":"GM Jenkins","year":"2002","journal-title":"J Mol Evol"},{"key":"pcbi.1009182.ref034","first-page":"e000094","article-title":"Genome-scale rates of evolutionary change in bacteria.","volume":"2","author":"S Duch\u00eane","year":"2016","journal-title":"Microb Genom"},{"key":"pcbi.1009182.ref035","first-page":"288","article-title":"Reproduction numbers of infectious disease models","volume":"2","author":"P van den Driessche","year":"2017","journal-title":"Infect Dis Model"},{"key":"pcbi.1009182.ref036","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-319-24277-4","volume-title":"ggplot2: Elegant Graphics for Data Analysis","author":"H Wickham","year":"2016"},{"key":"pcbi.1009182.ref037","doi-asserted-by":"crossref","first-page":"e3000611","DOI":"10.1371\/journal.pbio.3000611","article-title":"Combining genomics and epidemiology to track mumps virus transmission in the United States","volume":"18","author":"S Wohl","year":"2020","journal-title":"PLoS Biol"},{"key":"pcbi.1009182.ref038","unstructured":"Genomic epidemiology of novel coronavirus\u2014Global subsampling. [cited 20 Mar 2021]. Available: https:\/\/nextstrain.org\/ncov\/global?l=clock"},{"key":"pcbi.1009182.ref039","doi-asserted-by":"crossref","first-page":"4121","DOI":"10.1093\/bioinformatics\/bty407","article-title":"Nextstrain: real-time tracking of pathogen evolution","volume":"34","author":"J Hadfield","year":"2018","journal-title":"Bioinformatics"},{"key":"pcbi.1009182.ref040","doi-asserted-by":"crossref","first-page":"vex042","DOI":"10.1093\/ve\/vex042","article-title":"TreeTime: Maximum-likelihood phylodynamic analysis","volume":"4","author":"P Sagulenko","year":"2018","journal-title":"Virus Evol"},{"key":"pcbi.1009182.ref041","article-title":"Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing","author":"L Ferretti","year":"2020","journal-title":"bioRxiv"},{"key":"pcbi.1009182.ref042","doi-asserted-by":"crossref","first-page":"865","DOI":"10.1093\/aje\/kwu209","article-title":"Serial intervals of respiratory infectious diseases: a systematic review and analysis","volume":"180","author":"MA Vink","year":"2014","journal-title":"Am J Epidemiol"},{"key":"pcbi.1009182.ref043","volume-title":"Infectious Diseases of Humans: Dynamics and Control","author":"RM Anderson","year":"1992"},{"key":"pcbi.1009182.ref044","volume-title":"An Introduction to Infectious Disease Modelling","author":"E Vynnycky","year":"2010"},{"key":"pcbi.1009182.ref045","doi-asserted-by":"crossref","first-page":"e0242128","DOI":"10.1371\/journal.pone.0242128","article-title":"Reproductive number of coronavirus: A systematic review and meta-analysis based on global level evidence","volume":"15","author":"MA Billah","year":"2020","journal-title":"PLoS One"},{"key":"pcbi.1009182.ref046","doi-asserted-by":"crossref","first-page":"e0239800","DOI":"10.1371\/journal.pone.0239800","article-title":"Global convergence of COVID-19 basic reproduction number and estimation from early-time SIR dynamics","volume":"15","author":"GG Katul","year":"2020","journal-title":"PLoS One"},{"key":"pcbi.1009182.ref047","doi-asserted-by":"crossref","first-page":"e1005495","DOI":"10.1371\/journal.pcbi.1005495","article-title":"Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks","volume":"13","author":"D Klinkenberg","year":"2017","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1009182.ref048","first-page":"444","article-title":"Unravelling transmission trees of infectious diseases by combining genetic and epidemiological data","volume":"279","author":"RJF Ypma","year":"2012","journal-title":"Proc Biol Sci"},{"key":"pcbi.1009182.ref049","doi-asserted-by":"crossref","first-page":"e1002768","DOI":"10.1371\/journal.pcbi.1002768","article-title":"A Bayesian inference framework to reconstruct transmission trees using epidemiological and genetic data","volume":"8","author":"MJ Morelli","year":"2012","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1009182.ref050","doi-asserted-by":"crossref","first-page":"1119","DOI":"10.1098\/rsif.2009.0530","article-title":"Protocols for sampling viral sequences to study epidemic dynamics","volume":"7","author":"JC Stack","year":"2010","journal-title":"J R Soc Interface"},{"key":"pcbi.1009182.ref051","doi-asserted-by":"crossref","first-page":"1797","DOI":"10.1098\/rsif.2011.0850","article-title":"Inferring pandemic growth rates from sequence data","volume":"9","author":"E de Silva","year":"2012","journal-title":"J R Soc Interface"},{"key":"pcbi.1009182.ref052","first-page":"vew003","article-title":"The effects of sampling strategy on the quality of reconstruction of viral population dynamics using Bayesian skyline family coalescent methods: A simulation study","volume":"2","author":"MD Hall","year":"2016","journal-title":"Virus Evol"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1009182","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2021,7,16]],"date-time":"2021-07-16T00:00:00Z","timestamp":1626393600000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009182","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,16]],"date-time":"2021-07-16T13:54:32Z","timestamp":1626443672000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1009182"}},"subtitle":[],"editor":[{"given":"Virginia E.","family":"Pitzer","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,7,6]]},"references-count":52,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2021,7,6]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1009182","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.07.10.20150920","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,6]]}}}