{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T07:53:06Z","timestamp":1761897186448},"reference-count":24,"publisher":"Springer Science and Business Media LLC","issue":"S20","license":[{"start":{"date-parts":[[2019,12,1]],"date-time":"2019-12-01T00:00:00Z","timestamp":1575158400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,12,17]],"date-time":"2019-12-17T00:00:00Z","timestamp":1576540800000},"content-version":"vor","delay-in-days":16,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2019,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n<jats:title>Background<\/jats:title>\n<jats:p>Bacterial pathogens exhibit an impressive amount of genomic diversity. This diversity can be informative of evolutionary adaptations, host-pathogen interactions, and disease transmission patterns. However, capturing this diversity directly from biological samples is challenging.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Results<\/jats:title>\n<jats:p>We introduce a framework for understanding the within-host diversity of a pathogen using multi-locus sequence types (MLST) from whole-genome sequencing (WGS) data. Our approach consists of two stages. First we process each sample individually by assigning it, for each locus in the MLST scheme, a set of alleles and a proportion for each allele. Next, we associate to each sample a set of strain types using the alleles and the strain proportions obtained in the first step. We achieve this by using the smallest possible number of previously unobserved strains across all samples, while using those unobserved strains which are as close to the observed ones as possible, at the same time respecting the allele proportions as closely as possible. We solve both problems using mixed integer linear programming (MILP). Our method performs accurately on simulated data and generates results on a real data set of <jats:italic>Borrelia burgdorferi<\/jats:italic> genomes suggesting a high level of diversity for this pathogen.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Conclusions<\/jats:title>\n<jats:p>Our approach can apply to any bacterial pathogen with an MLST scheme, even though we developed it with <jats:italic>Borrelia burgdorferi<\/jats:italic>, the etiological agent of Lyme disease, in mind. Our work paves the way for robust strain typing in the presence of within-host heterogeneity, overcoming an essential challenge currently not addressed by any existing methodology for pathogen genomics.<\/jats:p>\n<\/jats:sec>","DOI":"10.1186\/s12859-019-3204-8","type":"journal-article","created":{"date-parts":[[2019,12,17]],"date-time":"2019-12-17T01:02:24Z","timestamp":1576544544000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Deconvoluting the diversity of within-host pathogen strains in a multi-locus sequence typing framework"],"prefix":"10.1186","volume":"20","author":[{"given":"Guo Liang","family":"Gan","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Elijah","family":"Willie","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cedric","family":"Chauve","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Leonid","family":"Chindelevitch","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2019,12,17]]},"reference":[{"issue":"3","key":"3204_CR1","doi-asserted-by":"publisher","first-page":"150","DOI":"10.1038\/nrmicro.2015.13","volume":"14","author":"X Didelot","year":"2016","unstructured":"Didelot X, Walker AS, Peto TE, Crook DW, Wilson DJ. Within-host evolution of bacterial pathogens. Nat Rev Microbiol. 2016; 14(3):150\u201362.","journal-title":"Nat Rev Microbiol"},{"key":"3204_CR2","doi-asserted-by":"publisher","first-page":"691","DOI":"10.1038\/nri.2017.69","volume":"17","author":"AM Cadena","year":"2017","unstructured":"Cadena AM, Fortune SM, Flynn JL. Heterogeneity in tuberculosis. Nat Rev Immunol. 2017; 17:691. https:\/\/doi.org\/10.1038\/nri.2017.69.","journal-title":"Nat Rev Immunol"},{"issue":"10","key":"3204_CR3","doi-asserted-by":"publisher","first-page":"0185656","DOI":"10.1371\/journal.pone.0185656","volume":"12","author":"AD Tyler","year":"2017","unstructured":"Tyler AD, Randell E, Baikie M, Antonation K, Janella D, Christianson S, Tyrrell GJ, Graham M, Van Domselaar G, Sharma MK. Application of whole genome sequence analysis to the study of Mycobacterium tuberculosis in Nunavut, Canada. PLoS ONE. 2017; 12(10):0185656. https:\/\/doi.org\/10.1371\/journal.pone.0185656.","journal-title":"PLoS ONE"},{"issue":"4","key":"3204_CR4","doi-asserted-by":"publisher","first-page":"556","DOI":"10.1111\/ele.12076","volume":"16","author":"Samuel Alizon","year":"2013","unstructured":"Alizon S, de Roode J. C, Michalakis Y. Multiple infections and the evolution of virulence. Ecol Lett. 2013; 16(4):556\u201367. https:\/\/doi.org\/10.1111\/ele.12076.","journal-title":"Ecology Letters"},{"issue":"1675","key":"3204_CR5","doi-asserted-by":"publisher","first-page":"20140293","DOI":"10.1098\/rstb.2014.0293","volume":"370","author":"Maria Strandh","year":"2015","unstructured":"Strandh M, R\u00e5berg Lars. Within-host competition between Borrelia afzelii ospC strains in wild hosts as revealed by massively parallel amplicon sequencing. Philos Trans R Soc Lond B Biol Sci. 2015; 370(1675). https:\/\/doi.org\/10.1098\/rstb.2014.0293.","journal-title":"Philosophical Transactions of the Royal Society B: Biological Sciences"},{"issue":"8","key":"3204_CR6","doi-asserted-by":"publisher","first-page":"22926","DOI":"10.1371\/journal.pone.0022926","volume":"6","author":"D Brisson","year":"2011","unstructured":"Brisson D, Baxamusa N, Schwartz I, Wormser GP. Biodiversity of Borrelia burgdorferi strains in tissues of Lyme disease patients. PLoS ONE. 2011; 6(8):22926. https:\/\/doi.org\/10.1371\/journal.pone.0022926.","journal-title":"PLoS ONE"},{"issue":"7","key":"3204_CR7","doi-asserted-by":"publisher","first-page":"1005759","DOI":"10.1371\/journal.ppat.1005759","volume":"12","author":"KS Walter","year":"2016","unstructured":"Walter KS, Carpi G, Evans BR, Caccone A, Diuk-Wasser MA. Vectors as epidemiological sentinels: Patterns of within-tick Borrelia burgdorferi diversity. PLoS Pathog. 2016; 12(7):1005759. URL https:\/\/doi.org\/10.1371\/journal.ppat.1005759.","journal-title":"PLoS Pathog"},{"issue":"4","key":"3204_CR8","doi-asserted-by":"publisher","first-page":"881","DOI":"10.1128\/CMR.00001-16","volume":"29","author":"T Lynch","year":"2016","unstructured":"Lynch T, Petkau A, Knox N, Graham M, Domselaar GV. A primer on infectious disease bacterial genomics. Clin Microbiol Rev. 2016; 29(4):881\u2013913. https:\/\/doi.org\/10.1128\/cmr.00001-16.","journal-title":"Clin Microbiol Rev"},{"key":"3204_CR9","doi-asserted-by":"publisher","unstructured":"Carpi G, Walter KS, Bent SJ, Hoen AG, Diuk-Wasser M, Caccone A. Whole genome capture of vector-borne pathogens from mixed DNA samples: a case study of Borrelia burgdorferi. BMC Genomics. 2015; 16(1). https:\/\/doi.org\/10.1186\/s12864-015-1634-x.","DOI":"10.1186\/s12864-015-1634-x"},{"issue":"6","key":"3204_CR10","doi-asserted-by":"publisher","first-page":"3140","DOI":"10.1073\/pnas.95.6.3140","volume":"95","author":"MC Maiden","year":"1998","unstructured":"Maiden MC, Bygraves JA, Feil E, Morelli G, Russell JE, Urwin R, Zhang Q, Zhou J, Zurth K, Caugant DA, Feavers IM, Achtman M, Spratt BG. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. PNAS. 1998; 95(6):3140\u20135.","journal-title":"PNAS"},{"issue":"25","key":"3204_CR11","doi-asserted-by":"publisher","first-page":"8730","DOI":"10.1073\/pnas.0800323105","volume":"105","author":"G Margos","year":"2008","unstructured":"Margos G, Gatewood AG, Aanensen DM, Hanincova K, Terekhova D, Vollmer SA, Cornet M, Piesman J, Donaghy M, Bormane A, Hurn MA, Feil EJ, Fish D, Casjens S, Wormser GP, Schwartz I, Kurtenbach K. MLST of housekeeping genes captures geographic population structure and suggests a european origin of Borrelia burgdorferi. PNAS. 2008; 105(25):8730\u201335. https:\/\/doi.org\/10.1073\/pnas.0800323105.","journal-title":"PNAS"},{"issue":"1","key":"3204_CR12","doi-asserted-by":"publisher","first-page":"181","DOI":"10.1186\/s13059-017-1309-9","volume":"18","author":"C Quince","year":"2017","unstructured":"Quince C, Delmont TO, Raguideau S, Alneberg J, Darling AE, Collins G, Eren AM. DESMAN: a new tool for de novo extraction of strains from metagenomes. Genome Biol. 2017; 18(1):181. https:\/\/doi.org\/10.1186\/s13059-017-1309-9.","journal-title":"Genome Biol"},{"issue":"1","key":"3204_CR13","doi-asserted-by":"publisher","first-page":"2260","DOI":"10.1038\/s41467-017-02209-5","volume":"8","author":"D Albanese","year":"2017","unstructured":"Albanese D, Donati C. Strain profiling and epidemiology of bacterial species from metagenomic sequencing. Nat Commun. 2017; 8(1):2260. https:\/\/doi.org\/10.1038\/s41467-017-02209-5.","journal-title":"Nat Commun"},{"issue":"1","key":"3204_CR14","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1016\/j.gpb.2018.12.005","volume":"17","author":"J Li","year":"2019","unstructured":"Li J, Du P, Ye AY, Zhang Y, Song C, Zeng H, Chen C. GPA: A microbial genetic polymorphisms assignments tool in metagenomic analysis by bayesian estimation. Genomics Proteomics Bioinforma. 2019; 17(1):106\u201317. https:\/\/doi.org\/10.1016\/j.gpb.2018.12.005.","journal-title":"Genomics Proteomics Bioinforma"},{"issue":"2","key":"3204_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pcbi.1004475","volume":"12","author":"L Chindelevitch","year":"2016","unstructured":"Chindelevitch L, Colijn C, Moodley P, Wilson D, Cohen T, Else E. ClassTR: Classifying within-host heterogeneity based on tandem repeats with application to Mycobacterium tuberculosis infections. PLOS Comput Biol. 2016; 12(2):1\u201316. https:\/\/doi.org\/10.1371\/journal.pcbi.1004475.","journal-title":"PLOS Comput Biol"},{"key":"3204_CR16","doi-asserted-by":"publisher","first-page":"000124","DOI":"10.1099\/mgen.0.000124","volume":"3","author":"AJ Page","year":"2017","unstructured":"Page AJ, Alikhan N-F, Carleton HA, Seemann T, Keane JA, Katz LS. Comparison of Multi-Locus Sequence Typing software for Next Generation Sequencing data. Microb Genom. 2017; 3:000124. URL https:\/\/doi.org\/10.1099\/mgen.0.000124.","journal-title":"Microb Genom"},{"issue":"1","key":"3204_CR17","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1186\/s13015-015-0052-6","volume":"10","author":"V Bo\u017ea","year":"2015","unstructured":"Bo\u017ea V, Brejov\u00e1 B, Vina\u0159 T. GAML: genome assembly by maximum likelihood. Algorithm Mol Biol. 2015; 10(1):18. URL https:\/\/doi.org\/10.1186\/s13015-015-0052-6.","journal-title":"Algorithm Mol Biol"},{"issue":"4","key":"3204_CR18","doi-asserted-by":"publisher","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","volume":"28","author":"W Huang","year":"2012","unstructured":"Huang W, Li L, Myers JR, Marth GT. ART: a Next-Generation Sequencing read simulator. Bioinformatics. 2012; 28(4):593\u20134. https:\/\/doi.org\/10.1093\/bioinformatics\/btr708.","journal-title":"Bioinformatics"},{"issue":"3","key":"3204_CR19","doi-asserted-by":"publisher","first-page":"25","DOI":"10.1186\/gb-2009-10-3-r25","volume":"10","author":"B Langmead","year":"2009","unstructured":"Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009; 10(3):25. URL https:\/\/doi.org\/10.1186\/gb-2009-10-3-r25.","journal-title":"Genome Biol"},{"issue":"5","key":"3204_CR20","doi-asserted-by":"publisher","first-page":"525","DOI":"10.1038\/nbt.3519","volume":"34","author":"NL Bray","year":"2016","unstructured":"Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotech. 2016; 34(5):525\u20137. https:\/\/doi.org\/10.1038\/nbt.3519.","journal-title":"Nat Biotech"},{"key":"3204_CR21","doi-asserted-by":"publisher","unstructured":"Levin DA, Peres Y, Wilmer EL. Markov chains and mixing times. Am Math Soc. 2009. https:\/\/doi.org\/10.1090\/mbk\/058.","DOI":"10.1090\/mbk\/058"},{"issue":"7","key":"3204_CR22","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1109\/34.192468","volume":"11","author":"S Peleg","year":"1989","unstructured":"Peleg S, Werman M, Rom H. A unified approach to the change of resolution: space and gray-level. IEEE Trans Pattern Anal Mach Intell. 1989; 11(7):739\u201342. https:\/\/doi.org\/10.1109\/34.192468.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"3204_CR23","doi-asserted-by":"publisher","unstructured":"Knyazev S, Tsyvina V, Melnyk A, Artyomenko A, Malygina T, Porozov YB, Campbell E, Switzer WM, Skums P, Zelikovsky A. CliqueSNV: Scalable reconstruction of intra-host viral populations from NGS reads. bioRxiv. 2018. https:\/\/doi.org\/10.1101\/264242.","DOI":"10.1101\/264242"},{"issue":"2","key":"3204_CR24","doi-asserted-by":"publisher","first-page":"165","DOI":"10.1007\/BF01219108","volume":"14","author":"RC Falco","year":"1992","unstructured":"Falco RC, Fish D. A comparison of methods for sampling the deer tick, Ixodes dammini, in a Lyme disease endemic area. Exp Appl Acarol. 1992; 14(2):165\u201373. https:\/\/doi.org\/10.1007\/BF01219108.","journal-title":"Exp Appl Acarol"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-019-3204-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-019-3204-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-019-3204-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,12,16]],"date-time":"2020-12-16T00:15:08Z","timestamp":1608077708000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-019-3204-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12]]},"references-count":24,"journal-issue":{"issue":"S20","published-print":{"date-parts":[[2019,12]]}},"alternative-id":["3204"],"URL":"https:\/\/doi.org\/10.1186\/s12859-019-3204-8","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,12]]},"assertion":[{"value":"17 December 2019","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"637"}}