{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,23]],"date-time":"2025-04-23T15:07:55Z","timestamp":1745420875886,"version":"3.37.3"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2020,11,23]],"date-time":"2020-11-23T00:00:00Z","timestamp":1606089600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"Deutsche Forschungsgemeinschaft"},{"DOI":"10.13039\/501100001659","name":"German Research Foundation","doi-asserted-by":"crossref","award":["398066876","GRK 2485\/1"],"award-info":[{"award-number":["398066876","GRK 2485\/1"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,5,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>High-throughput sequencing data can be affected by different technical errors, e.g. from probe preparation or false base calling. As a consequence, reproducibility of experiments can be weakened. In virus metagenomics, technical errors can result in falsely identified viruses in samples from infected hosts. We present a new resampling approach based on bootstrap sampling of sequencing reads from FASTQ-files in order to generate artificial replicates of sequencing runs which can help to judge the robustness of an analysis. 
In addition, we evaluate a mixture model on the distribution of read counts per virus to identify potentially false positive findings.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>The evaluation of our approach on an artificially generated dataset with known viral sequence content shows in general a high reproducibility of uncovering viruses in sequencing data, i.e. the original and mean bootstrap read counts were highly correlated. However, the bootstrap read counts can also indicate reduced or increased evidence for the presence of a virus in the biological sample. We also found that the mixture model fits the read counts well and, furthermore, that it provides a higher accuracy on the original or on the bootstrap read counts than on the difference between both. The usefulness of our methods is further demonstrated on two freely available real-world datasets from harbor seals.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>We provide a Python tool, called RESEQ, available from https:\/\/github.com\/babaksaremi\/RESEQ that allows efficient generation of bootstrap reads from an original FASTQ-file.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa926","type":"journal-article","created":{"date-parts":[[2020,10,20]],"date-time":"2020-10-20T19:13:55Z","timestamp":1603221235000},"page":"1068-1075","source":"Crossref","is-referenced-by-count":6,"title":["Measuring reproducibility of virus metagenomics analyses using bootstrap samples from 
FASTQ-files"],"prefix":"10.1093","volume":"37","author":[{"given":"Babak","family":"Saremi","sequence":"first","affiliation":[{"name":"Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover , Hannover D-30559, Germany"}]},{"given":"Moritz","family":"Kohls","sequence":"additional","affiliation":[{"name":"Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover , Hannover D-30559, Germany"}]},{"given":"Pamela","family":"Liebig","sequence":"additional","affiliation":[{"name":"Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover , Hannover D-30559, Germany"}]},{"given":"Ursula","family":"Siebert","sequence":"additional","affiliation":[{"name":"Institute for Terrestrial and Aquatic Wildlife Research, University of Veterinary Medicine Hannover , Hannover D-30559, Germany"}]},{"given":"Klaus","family":"Jung","sequence":"additional","affiliation":[{"name":"Institute for Animal Breeding and Genetics, University of Veterinary Medicine Hannover , Hannover D-30559, Germany"}]}],"member":"286","published-online":{"date-parts":[[2020,11,23]]},"reference":[{"key":"2023051612083181000_btaa926-B1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-019-52881-4","article-title":"Damian: an open source bioinformatics tool for fast, systematic and cohort based analysis of microorganisms in diagnostic samples","volume":"9","author":"Alawi","year":"2019","journal-title":"Sci. Rep"},{"key":"2023051612083181000_btaa926-B2","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1007\/s00253-018-9464-9","article-title":"Bioinformatics tools to assess metagenomic data for applied microbiology","volume":"103","author":"Almeida","year":"2019","journal-title":"Appl. Microbiol. 
Biotechnol"},{"key":"2023051612083181000_btaa926-B3","first-page":"e01942","article-title":"New isolates of pandoraviruses: contribution to the study of replication cycle steps","volume":"93","author":"Andrade","year":"2019","journal-title":"J. Virol"},{"key":"2023051612083181000_btaa926-B4","doi-asserted-by":"crossref","first-page":"e01180","DOI":"10.1128\/mBio.01180-15","article-title":"Discovery of a novel hepatovirus (phopivirus of seals) related to human hepatitis a virus","volume":"6","author":"Anthony","year":"2015","journal-title":"MBio"},{"key":"2023051612083181000_btaa926-B5","doi-asserted-by":"crossref","first-page":"777","DOI":"10.1093\/bioinformatics\/btg484","article-title":"Reproducibility of SELDI-TOF protein patterns in serum: comparing datasets from different experiments","volume":"20","author":"Baggerly","year":"2004","journal-title":"Bioinformatics"},{"key":"2023051612083181000_btaa926-B6","first-page":"1","article-title":"mixtools: an R package for analyzing finite mixture models","author":"Benaglia","year":"2009"},{"key":"2023051612083181000_btaa926-B7","doi-asserted-by":"crossref","first-page":"720","DOI":"10.3201\/eid2104.141675","article-title":"Avian influenza a (h10n7) virus-associated mass deaths among harbor seals","volume":"21","author":"Bodewes","year":"2015","journal-title":"Emerg. Infect. 
Dis"},{"key":"2023051612083181000_btaa926-B8","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1038\/nature07410","article-title":"The phaeodactylum genome reveals the evolutionary history of diatom genomes","volume":"456","author":"Bowler","year":"2008","journal-title":"Nature"},{"key":"2023051612083181000_btaa926-B9","doi-asserted-by":"crossref","first-page":"305","DOI":"10.2307\/3318719","article-title":"Matched-block bootstrap for dependent data","volume":"4","author":"Carlstein","year":"1998","journal-title":"Bernoulli"},{"key":"2023051612083181000_btaa926-B10","doi-asserted-by":"crossref","first-page":"e26","DOI":"10.1093\/nar\/gni025","article-title":"Reproducibility, bioinformatic analysis and power of the sage method to evaluate changes in transcriptome","volume":"33","author":"Dinel","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023051612083181000_btaa926-B11","doi-asserted-by":"crossref","first-page":"83","DOI":"10.1007\/s11002-009-9083-4","article-title":"Evaluation of structure and reproducibility of cluster solutions using the bootstrap","volume":"21","author":"Dolnicar","year":"2010","journal-title":"Market. Lett"},{"key":"2023051612083181000_btaa926-B12","doi-asserted-by":"crossref","first-page":"567","DOI":"10.1007\/s00705-013-1844-y","article-title":"A giant pseudomonas phage from Poland","volume":"159","author":"Drulis-Kawa","year":"2014","journal-title":"Arch. Virol"},{"key":"2023051612083181000_btaa926-B13","doi-asserted-by":"crossref","DOI":"10.1137\/1.9781611970319","volume-title":"The Jackknife, the Bootstrap, and Other Resampling Plans","author":"Efron","year":"1982"},{"key":"2023051612083181000_btaa926-B14","doi-asserted-by":"crossref","first-page":"341ps12","DOI":"10.1126\/scitranslmed.aaf5027","article-title":"What does research reproducibility mean?","volume":"8","author":"Goodman","year":"2016","journal-title":"Sci. Transl. 
Med"},{"volume-title":"Robust Statistics: The Approach Based on Influence Functions","year":"2011","author":"Hampel","key":"2023051612083181000_btaa926-B15"},{"key":"2023051612083181000_btaa926-B16","doi-asserted-by":"crossref","first-page":"115","DOI":"10.3354\/dao068115","article-title":"The 1988 and 2002 phocine distemper virus epidemics in European harbour seals","volume":"68","author":"H\u00e4rk\u00f6nen","year":"2006","journal-title":"Dis. Aquat. Organ"},{"key":"2023051612083181000_btaa926-B17","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","article-title":"Art: a next-generation sequencing read simulator","volume":"28","author":"Huang","year":"2012","journal-title":"Bioinformatics"},{"key":"2023051612083181000_btaa926-B18","doi-asserted-by":"crossref","first-page":"180","DOI":"10.1016\/j.meegid.2018.09.026","article-title":"Virus detection in high-throughput sequencing data without a reference genome of the host","volume":"66","author":"Kruppa","year":"2018","journal-title":"Infect. Genet. Evol"},{"key":"2023051612083181000_btaa926-B19","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. Methods"},{"key":"2023051612083181000_btaa926-B20","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1007\/s10152-007-0072-9","article-title":"Parasites in harbour seals (Phoca vitulina) from the German Wadden Sea between two phocine distemper virus epidemics","volume":"61","author":"Lehnert","year":"2007","journal-title":"Helgoland Mar. Res"},{"key":"2023051612083181000_btaa926-B21","doi-asserted-by":"crossref","first-page":"1752","DOI":"10.1214\/11-AOAS466","article-title":"Measuring reproducibility of high-throughput experiments","volume":"5","author":"Li","year":"2011","journal-title":"Ann. Appl. 
Stat"},{"key":"2023051612083181000_btaa926-B22","doi-asserted-by":"crossref","first-page":"1427","DOI":"10.1099\/vir.0.19005-0","article-title":"Genetic characterization of the unique short segment of phocid herpesvirus type 1 reveals close relationships among alphaherpesviruses of hosts of the order carnivora","volume":"84","author":"Martina","year":"2003","journal-title":"J. Gen. Virol"},{"key":"2023051612083181000_btaa926-B23","doi-asserted-by":"crossref","first-page":"e30619","DOI":"10.1371\/journal.pone.0030619","article-title":"NGS QC Toolkit: a toolkit for quality control of next generation sequencing data","volume":"7","author":"Patel","year":"2012","journal-title":"PLoS One"},{"key":"2023051612083181000_btaa926-B24","doi-asserted-by":"crossref","first-page":"e2819","DOI":"10.7717\/peerj.2819","article-title":"Brain transcriptomes of harbor seals demonstrate gene expression patterns of animals undergoing a metabolic disease and a viral infection","volume":"4","author":"Rosales","year":"2016","journal-title":"PeerJ"},{"key":"2023051612083181000_btaa926-B25","doi-asserted-by":"crossref","first-page":"69","DOI":"10.1186\/s12859-015-0503-6","article-title":"RIEMS: a software pipeline for sensitive and comprehensive taxonomic classification of reads from metagenomics datasets","volume":"16","author":"Scheuch","year":"2015","journal-title":"BMC Bioinform"},{"key":"2023051612083181000_btaa926-B26","doi-asserted-by":"crossref","first-page":"S2","DOI":"10.1186\/1471-2164-15-S8-S2","article-title":"Bootstrap-based differential gene expression analysis for RNA-seq data with and without replicates","volume":"15","author":"Seesi","year":"2014","journal-title":"BMC Genomics"},{"key":"2023051612083181000_btaa926-B27","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1016\/j.jcpa.2007.04.018","article-title":"Pathological findings in harbour seals (Phoca vitulina): 1996\u20132005","volume":"137","author":"Siebert","year":"2007","journal-title":"J. Comp. 
Pathol"},{"key":"2023051612083181000_btaa926-B28","first-page":"487","volume-title":"Nature Conservation and Biodiversity","author":"Siebert","year":"2012"},{"key":"2023051612083181000_btaa926-B29","doi-asserted-by":"crossref","first-page":"201","DOI":"10.7589\/2015-11-320","article-title":"Bacterial microbiota in harbor seals (Phoca vitulina) from the North Sea of Schleswig-Holstein, Germany, around the time of morbillivirus and influenza epidemics","volume":"53","author":"Siebert","year":"2017","journal-title":"J. Wildlife Dis"},{"key":"2023051612083181000_btaa926-B30","first-page":"1","article-title":"An introduction to the bootstrap","volume":"57","author":"Tibshirani","year":"1993","journal-title":"Monogr. Stat. Appl. Prob"},{"key":"2023051612083181000_btaa926-B31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.scitotenv.2004.09.021","article-title":"Bacteriophages\u2014potential for application in wastewater treatment processes","volume":"339","author":"Withey","year":"2005","journal-title":"Sci. 
Total Environ"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa926\/34463359\/btaa926.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/8\/1068\/50340780\/btaa926.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/8\/1068\/50340780\/btaa926.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T12:09:51Z","timestamp":1684238991000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/8\/1068\/5948992"}},"subtitle":[],"editor":[{"given":"Lenore","family":"Cowen","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,11,23]]},"references-count":31,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2021,5,23]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa926","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2021,4,15]]},"published":{"date-parts":[[2020,11,23]]}}}