{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T01:53:32Z","timestamp":1776131612448,"version":"3.50.1"},"reference-count":14,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2021,4,6]],"date-time":"2021-04-06T00:00:00Z","timestamp":1617667200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010269","name":"Wellcome Trust","doi-asserted-by":"publisher","award":["202792\/Z\/16\/Z"],"award-info":[{"award-number":["202792\/Z\/16\/Z"]}],"id":[{"id":"10.13039\/100010269","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,10,25]]},"abstract":"<jats:title>Summary<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Routine infectious disease surveillance is increasingly based on large-scale whole-genome sequencing databases. Real-time surveillance would benefit from immediate assignments of each genome assembly to hierarchical population structures. Here we present pHierCC, a pipeline that defines a scalable clustering scheme, HierCC, based on core genome multi-locus typing that allows incremental, static, multi-level cluster assignments of genomes. We also present HCCeval, which identifies optimal thresholds for assigning genomes to cohesive HierCC clusters. 
HierCC was implemented in EnteroBase in 2018 and has since genotyped &gt;530 000 genomes from Salmonella, Escherichia\/Shigella, Streptococcus, Clostridioides, Vibrio and Yersinia.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>https:\/\/enterobase.warwick.ac.uk\/ and Source code and instructions: https:\/\/github.com\/zheminzhou\/pHierCC<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab234","type":"journal-article","created":{"date-parts":[[2021,4,6]],"date-time":"2021-04-06T07:20:23Z","timestamp":1617693623000},"page":"3645-3646","source":"Crossref","is-referenced-by-count":113,"title":["HierCC: a multi-level clustering scheme for population assignments based on core genome MLST"],"prefix":"10.1093","volume":"37","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9783-0366","authenticated-orcid":false,"given":"Zhemin","family":"Zhou","sequence":"first","affiliation":[{"name":"Warwick Medical School, University of Warwick , Coventry CV4 7AL, UK"}]},{"given":"Jane","family":"Charlesworth","sequence":"additional","affiliation":[{"name":"Warwick Medical School, University of Warwick , Coventry CV4 7AL, UK"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6815-0070","authenticated-orcid":false,"given":"Mark","family":"Achtman","sequence":"additional","affiliation":[{"name":"Warwick Medical School, University of Warwick , Coventry CV4 7AL, UK"}]}],"member":"286","published-online":{"date-parts":[[2021,4,6]]},"reference":[{"key":"2023051608570344200_btab234-B1","doi-asserted-by":"crossref","DOI":"10.1099\/mgen.0.000318","article-title":"Concordance of SNP- and allele-based typing 
workflows in the context of a large-scale international Salmonella Enteritidis outbreak investigation","author":"Coipan","year":"2020","journal-title":"Microb. Genom"},{"key":"2023051608570344200_btab234-B2","doi-asserted-by":"crossref","first-page":"3028","DOI":"10.1093\/bioinformatics\/bty212","article-title":"SnapperDB: a database solution for routine sequencing analysis of bacterial isolates","volume":"34","author":"Dallman","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051608570344200_btab234-B3","year":"2020"},{"key":"2023051608570344200_btab234-B4","first-page":"mgen000410","article-title":"A publicly accessible database for Clostridioides difficile genome sequences supports tracing of transmission chains and epidemics","volume":"6","author":"Frentrup","year":"2019","journal-title":"Microb. Genom"},{"key":"2023051608570344200_btab234-B5","doi-asserted-by":"crossref","first-page":"3046","DOI":"10.1128\/JCM.01312-12","article-title":"Resolution of a meningococcal disease outbreak from whole-genome sequence data with rapid web-based analysis methods","volume":"50","author":"Jolley","year":"2012","journal-title":"J. Clin. Microbiol"},{"key":"2023051608570344200_btab234-B6","doi-asserted-by":"crossref","first-page":"13","DOI":"10.2807\/1560-7917.ES.2019.24.13.1900161","article-title":"Outbreak of Salmonella enterica serotype Poona in infants linked to persistent Salmonella contamination in an infant formula manufacturing facility, France, August 2018 to February 2019","volume":"24","author":"Jones","year":"2019","journal-title":"Euro. 
Surveill"},{"key":"2023051608570344200_btab234-B7","doi-asserted-by":"crossref","first-page":"631","DOI":"10.3390\/e19110631","article-title":"On normalized mutual information: measure derivations and properties","volume":"19","author":"Kvalseth","year":"2017","journal-title":"Entropy"},{"key":"2023051608570344200_btab234-B8","doi-asserted-by":"crossref","first-page":"e22751","DOI":"10.1371\/journal.pone.0022751","article-title":"Prospective genomic characterization of the German enterohemorrhagic Escherichia coli O104:H4 outbreak by rapid next generation sequencing technology","volume":"6","author":"Mellmann","year":"2011","journal-title":"PLoS One"},{"key":"2023051608570344200_btab234-B9","doi-asserted-by":"crossref","first-page":"1140","DOI":"10.1038\/ng.705","article-title":"Yersinia pestis genome sequencing identifies patterns of global phylogenetic diversity","volume":"42","author":"Morelli","year":"2010","journal-title":"Nat. Genet"},{"key":"2023051608570344200_btab234-B10","doi-asserted-by":"crossref","first-page":"16185","DOI":"10.1038\/nmicrobiol.2016.185","article-title":"Whole genome-based population biology and epidemiological surveillance of Listeria monocytogenes","volume":"2","author":"Moura","year":"2016","journal-title":"Nat. Microbiol"},{"key":"2023051608570344200_btab234-B11","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","article-title":"Silhouettes: a graphical aid to the interpretation and validation of cluster analysis","volume":"20","author":"Rousseeuw","year":"1987","journal-title":"J. Comput. Appl. Math"},{"key":"2023051608570344200_btab234-B12","doi-asserted-by":"crossref","first-page":"12827","DOI":"10.1038\/ncomms12827","article-title":"An extended genotyping framework for Salmonella enterica serovar Typhi, the cause of human typhoid","volume":"7","author":"Wong","year":"2016","journal-title":"Nat. 
Commun"},{"key":"2023051608570344200_btab234-B13","doi-asserted-by":"crossref","first-page":"1395","DOI":"10.1101\/gr.232397.117","article-title":"GrapeTree: visualization of core genomic relationships among 100,000 bacterial pathogens","volume":"28","author":"Zhou","year":"2018","journal-title":"Genome Res"},{"key":"2023051608570344200_btab234-B14","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1101\/gr.251678.119","article-title":"The EnteroBase user's guide, with case studies on Salmonella transmissions, Yersinia pestis phylogeny, and Escherichia core genomic diversity","volume":"30","author":"Zhou","year":"2020","journal-title":"Genome Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab234\/37305954\/btab234.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/20\/3645\/50338210\/btab234.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/20\/3645\/50338210\/btab234.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T04:57:39Z","timestamp":1684213059000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/20\/3645\/6212647"}},"subtitle":[],"editor":[{"given":"Janet","family":"Kelso","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,4,6]]},"references-count":14,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2021,10,25]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab234","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.11.25.397539","asserted
-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,10,15]]},"published":{"date-parts":[[2021,4,6]]}}}