{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T21:12:52Z","timestamp":1773177172138,"version":"3.50.1"},"reference-count":35,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T00:00:00Z","timestamp":1741305600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,3,29]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Recommendations on the use of genomics for pathogens surveillance are evidence that high-throughput genomic sequencing plays a key role to fight global health threats. Coupled with bioinformatics and other data types (e.g., epidemiological information), genomics is used to obtain knowledge on health pathogenic threats and insights on their evolution, to monitor pathogens spread, and to evaluate the effectiveness of countermeasures. From a decision-making policy perspective, it is essential to ensure the entire process\u2019s quality before relying on analysis results as evidence. Available workflows usually offer quality assessment tools that are primarily focused on the quality of raw NGS reads but often struggle to keep pace with new technologies and threats, and fail to provide a robust consensus on results, necessitating manual evaluation of multiple tool outputs.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We present PathoSeq-QC, a bioinformatics decision support workflow developed to improve the trustworthiness of genomic surveillance analyses and conclusions. Designed for SARS-CoV-2, it is suitable for any viral threat. In the specific case of SARS-CoV-2, PathoSeq-QC: (i) evaluates the quality of the raw data; (ii) assesses whether the analysed sample is composed by single or multiple lineages; (iii) produces robust variant calling results via multi-tool comparison; (iv) reports whether the produced data are in support of a recombinant virus, a novel or an already known lineage. The tool is modular, which will allow easy functionalities extension.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>PathoSeq-QC is a command-line tool written in Python and R. The code is available at https:\/\/code.europa.eu\/dighealth\/pathoseq-qc.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf102","type":"journal-article","created":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T19:21:56Z","timestamp":1741375316000},"source":"Crossref","is-referenced-by-count":2,"title":["PathoSeq-QC: a decision support bioinformatics workflow for robust genomic surveillance"],"prefix":"10.1093","volume":"41","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4899-5284","authenticated-orcid":false,"given":"Gabriele","family":"Leoni","sequence":"first","affiliation":[{"name":"European Commission, Joint Research Centre (JRC) , Ispra, 21027,","place":["Italy"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6782-4704","authenticated-orcid":false,"given":"Mauro","family":"Petrillo","sequence":"additional","affiliation":[{"name":"Seidor Italy S.r.l. , Milan, 20122,","place":["Italy"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3991-0514","authenticated-orcid":false,"given":"Victoria","family":"Ruiz-Serra","sequence":"additional","affiliation":[{"name":"European Commission, Joint Research Centre (JRC) , Geel, 2440,","place":["Belgium"]}]},{"given":"Maddalena","family":"Querci","sequence":"additional","affiliation":[{"name":"European Commission, Joint Research Centre (JRC) , Ispra, 21027,","place":["Italy"]}]},{"given":"Sandra","family":"Coecke","sequence":"additional","affiliation":[{"name":"European Commission, Joint Research Centre (JRC) , Ispra, 21027,","place":["Italy"]}]},{"given":"Tobias","family":"Wiesenthal","sequence":"additional","affiliation":[{"name":"European Commission, Joint Research Centre (JRC) , Geel, 2440,","place":["Belgium"]}]}],"member":"286","published-online":{"date-parts":[[2025,3,7]]},"reference":[{"key":"2025041602165674300_btaf102-B1","doi-asserted-by":"crossref","first-page":"1058","DOI":"10.1016\/j.trecan.2023.08.011","article-title":"Variant allele frequency: a decision-making tool in precision oncology?","volume":"9","author":"Boscolo Bielo","year":"2023","journal-title":"Trends Cancer"},{"key":"2025041602165674300_btaf102-B2","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1016\/j.bsheal.2023.01.002","article-title":"Towards precision medicine: omics approach for COVID-19","volume":"5","author":"Cen","year":"2023","journal-title":"Biosaf Health"},{"key":"2025041602165674300_btaf102-B3","doi-asserted-by":"crossref","first-page":"i884","DOI":"10.1093\/bioinformatics\/bty560","article-title":"fastp: an ultra-fast all-in-one FASTQ preprocessor","volume":"34","author":"Chen","year":"2018","journal-title":"Bioinformatics"},{"key":"2025041602165674300_btaf102-B4","first-page":"e00182","article-title":"Profiling of SARS-CoV-2 subgenomic RNAs in clinical specimens","volume":"10","author":"Chen","year":"2022","journal-title":"Microbiol Spectr"},{"key":"2025041602165674300_btaf102-B5","doi-asserted-by":"crossref","first-page":"430","DOI":"10.3390\/v16030430","article-title":"Recommendations for uniform variant calling of SARS-CoV-2 genome sequence across bioinformatic workflows","volume":"16","author":"Connor","year":"2024","journal-title":"Viruses"},{"key":"2025041602165674300_btaf102-B6","doi-asserted-by":"crossref","first-page":"911861","DOI":"10.3389\/fmed.2022.911861","article-title":"Advances and trends in omics technology development","volume":"9","author":"Dai","year":"2022","journal-title":"Front. Med"},{"key":"2025041602165674300_btaf102-B7","doi-asserted-by":"crossref","first-page":"giab008","DOI":"10.1093\/gigascience\/giab008","article-title":"Twelve years of SAMtools and BCFtools","volume":"10","author":"Danecek","year":"2021","journal-title":"Gigascience"},{"key":"2025041602165674300_btaf102-B8","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1038\/ng.806","article-title":"A framework for variation discovery and genotyping using next-generation DNA sequencing data","volume":"43","author":"DePristo","year":"2011","journal-title":"Nat Genet"},{"key":"2025041602165674300_btaf102-B9","doi-asserted-by":"crossref","first-page":"e1006468","DOI":"10.1371\/journal.pcbi.1006468","article-title":"Wrangling distributed computing for high-throughput environmental science: an introduction to HTCondor","volume":"14","author":"Erickson","year":"2018","journal-title":"PLoS Comput Biol"},{"key":"2025041602165674300_btaf102-B10","doi-asserted-by":"crossref","first-page":"146075","DOI":"10.1109\/ACCESS.2020.3015016","article-title":"SeQual: big data tool to perform quality control and data preprocessing of large NGS datasets","volume":"8","author":"Exp\u00f3sito","year":"2020","journal-title":"IEEE Access"},{"key":"2025041602165674300_btaf102-B11","doi-asserted-by":"crossref","first-page":"185","DOI":"10.3390\/v14020185","article-title":"Assessment of inter-laboratory differences in SARS-CoV-2 consensus genome assemblies between public health laboratories in Australia","volume":"14","author":"Foster","year":"2022","journal-title":"Viruses"},{"key":"2025041602165674300_btaf102-B12","doi-asserted-by":"crossref","first-page":"giae065","DOI":"10.1093\/gigascience\/giae065","article-title":"V-pipe 3.0: a sustainable pipeline for within-sample viral genetic diversity estimation","volume":"13","author":"Fuhrmann","year":"2024","journal-title":"Gigascience"},{"key":"2025041602165674300_btaf102-B13","doi-asserted-by":"crossref","first-page":"3181","DOI":"10.1093\/bioinformatics\/btac306","article-title":"Detection of oncogenic and clinically actionable mutations in cancer genomes critically depends on variant calling tools","volume":"38","author":"Garcia-Prieto","year":"2022","journal-title":"Bioinformatics"},{"key":"2025041602165674300_btaf102-B14","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1186\/s13059-018-1618-7","article-title":"An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar","volume":"20","author":"Grubaugh","year":"2019","journal-title":"Genome Biol"},{"key":"2025041602165674300_btaf102-B15","doi-asserted-by":"crossref","first-page":"W619","DOI":"10.1093\/nar\/gkab417","article-title":"The COVID-19 data portal: accelerating SARS-CoV-2 and COVID-19 research through rapid open access data sharing","volume":"49","author":"Harrison","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2025041602165674300_btaf102-B16","doi-asserted-by":"publisher","first-page":"e0094421","DOI":"10.1128\/jcm.00944-21","article-title":"Assessment of SARS-CoV-2 genome sequencing: quality criteria and low-frequency variants","volume":"59","author":"Jacot","year":"2021","journal-title":"J Clin Microbiol"},{"key":"2025041602165674300_btaf102-B17","doi-asserted-by":"crossref","first-page":"1190133","DOI":"10.3389\/fmicb.2023.1190133","article-title":"Genomic characterization of SARS-CoV-2 in Egypt: insights into spike protein thermodynamic stability","volume":"14","author":"Jalal","year":"2023","journal-title":"Front Microbiol"},{"key":"2025041602165674300_btaf102-B18","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1038\/s41586-022-05049-6","article-title":"Wastewater sequencing reveals early cryptic SARS-CoV-2 variant transmission","volume":"609","author":"Karthikeyan","year":"2022","journal-title":"Nature"},{"key":"2025041602165674300_btaf102-B19","doi-asserted-by":"crossref","first-page":"1049","DOI":"10.46234\/ccdcw2021.255","article-title":"GISAID\u2019s role in pandemic response","volume":"3","author":"Khare","year":"2021","journal-title":"China CDC Wkly"},{"key":"2025041602165674300_btaf102-B21","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1186\/s13059-022-02609-x","article-title":"VirStrain: a strain identification tool for RNA viruses","volume":"23","author":"Liao","year":"2022","journal-title":"Genome Biol"},{"key":"2025041602165674300_btaf102-B22","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1038\/s41579-023-00878-2","article-title":"The evolution of SARS-CoV-2","volume":"21","author":"Markov","year":"2023","journal-title":"Nat Rev Microbiol"},{"key":"2025041602165674300_btaf102-B23","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1038\/s41576-021-00360-w","article-title":"Testing at scale during the COVID-19 pandemic","volume":"22","author":"Mercer","year":"2021","journal-title":"Nat Rev Genet"},{"key":"2025041602165674300_btaf102-B24","doi-asserted-by":"crossref","first-page":"e13300","DOI":"10.7717\/peerj.13300","article-title":"PipeCoV: a pipeline for SARS-CoV-2 genome assembly, annotation and variant identification","volume":"10","author":"Oliveira","year":"2022","journal-title":"PeerJ"},{"key":"2025041602165674300_btaf102-B25","doi-asserted-by":"crossref","first-page":"veab064","DOI":"10.1093\/ve\/veab064","article-title":"Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool","volume":"7","author":"O\u2019Toole","year":"2021","journal-title":"Virus Evol"},{"key":"2025041602165674300_btaf102-B26","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1101\/gr.268110.120","article-title":"Subgenomic RNA identification in SARS-CoV-2 genomic sequencing data","volume":"31","author":"Parker","year":"2021","journal-title":"Genome Res"},{"key":"2025041602165674300_btaf102-B27","doi-asserted-by":"crossref","first-page":"851","DOI":"10.3389\/fonc.2019.00851","article-title":"Standardization of sequencing coverage depth in NGS: recommendation for detection of clonal and subclonal mutations in cancer diagnostics","volume":"9","author":"Petrackova","year":"2019","journal-title":"Front Oncol"},{"key":"2025041602165674300_btaf102-B28","author":"Poplin","year":"1178"},{"key":"2025041602165674300_btaf102-B29","volume-title":"Guiding Principles for Pathogen Genome Data Sharing","author":"WHO","year":"2022"},{"key":"2025041602165674300_btaf102-B30","volume-title":"Global Genomic Surveillance Strategy for Pathogens with Pandemic and Epidemic Potential 2022\u20132032: Progress Report on the First Year of Implementation","author":"WHO","year":"2023"},{"key":"2025041602165674300_btaf102-B31","doi-asserted-by":"crossref","first-page":"11189","DOI":"10.1093\/nar\/gks918","article-title":"LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets","volume":"40","author":"Wilm","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2025041602165674300_btaf102-B32","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1038\/s41586-020-2008-3","article-title":"A new coronavirus associated with human respiratory disease in China","volume":"579","author":"Wu","year":"2020","journal-title":"Nature"},{"key":"2025041602165674300_btaf102-B33","doi-asserted-by":"crossref","first-page":"e13821","DOI":"10.7717\/peerj.13821","article-title":"Benchmark datasets for SARS-CoV-2 surveillance bioinformatics","volume":"10","author":"Xiaoli","year":"2022","journal-title":"PeerJ"},{"key":"2025041602165674300_btaf102-B34","doi-asserted-by":"crossref","first-page":"109860","DOI":"10.1016\/j.virol.2023.109860","article-title":"H5N1 highly pathogenic avian influenza clade 2.3.4.4b in wild and domestic birds: introductions into the United States and reassortments, December 2021\u2013April 2022","volume":"587","author":"Youk","year":"2023","journal-title":"Virology"},{"key":"2025041602165674300_btaf102-B35","doi-asserted-by":"crossref","first-page":"270","DOI":"10.1038\/s41586-020-2012-7","article-title":"A pneumonia outbreak associated with a new coronavirus of probable bat origin","volume":"579","author":"Zhou","year":"2020","journal-title":"Nature"},{"key":"2025041602165674300_btaf102-B36","first-page":"001146","article-title":"Bioinformatic investigation of discordant sequence data for SARS-CoV-2: insights for robust genomic analysis during pandemic surveillance","volume":"9","author":"Zufan","year":"2023","journal-title":"Microb Genom"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf102\/62340404\/btaf102.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/4\/btaf102\/62340404\/btaf102.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/4\/btaf102\/62340404\/btaf102.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,16]],"date-time":"2025-04-16T06:17:07Z","timestamp":1744784227000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf102\/8063611"}},"subtitle":[],"editor":[{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,3,7]]},"references-count":35,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,3,29]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf102","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,4]]},"published":{"date-parts":[[2025,3,7]]},"article-number":"btaf102"}}