{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,5]],"date-time":"2026-01-05T06:40:15Z","timestamp":1767595215225,"version":"3.48.0"},"reference-count":45,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2026,1,5]],"date-time":"2026-01-05T00:00:00Z","timestamp":1767571200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Integrating short- and long-read sequencing technologies has become a promising approach for achieving accurate and comprehensive genomic analysis. Although short-read sequencing (Illumina, etc.) offers high base accuracy and cost efficiency, it struggles with structural variant (SV) detection and complex genomic regions. In contrast, long-read sequencing (PacBio HiFi) excels in resolving large SVs and repetitive sequences but is limited by throughput, higher insertion or deletion (indel) error rates, and sequencing costs. Hybrid approaches may combine these technologies and leverage their complementary strengths and different sources of error to provide higher accuracy, more comprehensive results, and higher throughput by lowering the coverage requirement for the long reads.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>This study benchmarks the DNAscope Hybrid (DS-Hybrid) pipeline, a novel integrated alignment and variant calling framework that combines short- and long-read data sequenced from the same sample. The DNAscope Hybrid pipeline is a bioinformatics pipeline that runs on generic x86 CPUs. 
We evaluate its performance across multiple human genome reference datasets (HG002\u2013HG004) using the draft Q100 and Genome in a Bottle v4.2.1 benchmarks. The pipeline\u2019s ability to detect small variants [single-nucleotide polymorphisms (SNPs)\/indels], SVs, and copy-number variations (CNVs) is assessed using data from the Illumina and PacBio sequencing systems at varying read depths (5\u00d7\u201330\u00d7). Benchmark results are compared to those of DeepVariant.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>The DNAscope Hybrid pipeline significantly improves SNP and indel calling accuracy, particularly in complex genomic regions. At lower long-read depths (e.g., 5\u00d7\u201310\u00d7), the hybrid approach outperforms stand-alone short- or long-read pipelines at full sequencing depths (30\u00d7\u201335\u00d7), reducing variant calling errors by at least 50%. Additionally, DNAscope Hybrid outperforms leading open-source tools for SV and CNV detection and enhances variant discovery in challenging genomic regions. The pipeline also demonstrates clinical utility by identifying variants in disease-associated genes. Moreover, DNAscope Hybrid is highly efficient, achieving runtimes of less than 90 min on a single standard instance.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>The DNAscope Hybrid pipeline is a computationally efficient, highly accurate variant calling framework that leverages the advantages of both short- and long-read sequencing. 
By improving variant detection in challenging genomic regions and offering a robust solution for clinical and large-scale genomic applications, it holds significant promise for genetic disease diagnostics, population-scale studies, and personalized medicine.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.3389\/fbinf.2025.1691056","type":"journal-article","created":{"date-parts":[[2026,1,5]],"date-time":"2026-01-05T06:38:10Z","timestamp":1767595090000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["A novel and accelerated method for integrated alignment and variant calling from short and long reads"],"prefix":"10.3389","volume":"5","author":[{"given":"Jinnan","family":"Hu","sequence":"first","affiliation":[]},{"given":"Donald","family":"Freed","sequence":"additional","affiliation":[]},{"given":"Hanying","family":"Feng","sequence":"additional","affiliation":[]},{"given":"Hong","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Zhipan","family":"Li","sequence":"additional","affiliation":[]},{"given":"Haodong","family":"Chen","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2026,1,5]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"974","DOI":"10.1101\/gr.114876.110","article-title":"CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing","volume":"21","author":"Abyzov","year":"2011","journal-title":"Genome Res."},{"key":"B2","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1186\/s13059-020-1935-5","article-title":"Opportunities and challenges in long-read sequencing data analysis","volume":"21","author":"Amarasinghe","year":"2020","journal-title":"Genome Biol."},{"key":"B3","first-page":"1177","article-title":"Comprehensive genome analysis and variant detection at scale using DRAGEN","volume-title":"Nat. 
Biotechnol.","author":"Behera","year":"2024"},{"key":"B4","doi-asserted-by":"publisher","first-page":"3753","DOI":"10.1038\/s41598-021-83081-8","article-title":"Critical evaluation of short, long, and hybrid assembly for contextual analysis of antibiotic resistance genes in complex environmental metagenomes","volume":"11","author":"Brown","year":"2021","journal-title":"Sci. Rep."},{"key":"B5","doi-asserted-by":"publisher","first-page":"572","DOI":"10.1038\/s41576-021-00367-3","article-title":"Towards population\u2010scale long\u2010read sequencing","volume":"22","author":"De Coster","year":"2021","journal-title":"Nat. Rev. Genet."},{"key":"B37","doi-asserted-by":"publisher","DOI":"10.3389\/fphar.2025.1653999","article-title":"Comparative evaluation of Oxford Nanopore Technologies\u2019 adaptive sampling and the Twist long-read PGx panel for pharmacogenomic profiling","volume":"16","author":"Deserranno","year":"2025","journal-title":"Front. 
Pharmacol."},{"key":"B6","doi-asserted-by":"publisher","first-page":"271","DOI":"10.1186\/s13059-022-02840-6","article-title":"Truvari: refined structural variant comparison preserves allelic diversity","volume":"23","author":"English","year":"2022","journal-title":"Genome Biol."},{"key":"B7","doi-asserted-by":"publisher","DOI":"10.1101\/115717","article-title":"The sentieon genomics tools: a fast and accurate solution to variant calling from next-generation sequence data","author":"Freed","year":"2017","journal-title":"BioRxiv"},{"key":"B8","doi-asserted-by":"publisher","DOI":"10.1101\/2022.05.20.492556","article-title":"DNAscope: high accuracy small variant calling using machine learning","author":"Freed","year":"","journal-title":"BioRxiv"},{"key":"B9","doi-asserted-by":"publisher","DOI":"10.1101\/2022.06.01.494452","article-title":"Sentieon DNAscope LongRead \u2013 a highly accurate, fast, and efficient pipeline for germline variant calling from PacBio HiFi reads","author":"Freed","year":"","journal-title":"BioRxiv"},{"key":"B10","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1038\/nrg.2016.49","article-title":"Coming of age: ten years of next\u2010generation sequencing technologies","volume":"17","author":"Goodwin","year":"2016","journal-title":"Nat. Rev. Genet."},{"key":"B20","article-title":"A complete diploid human genome benchmark for personalized genomics","author":"Hansen","year":"2025","journal-title":"BioRxiv"},{"key":"B11","doi-asserted-by":"publisher","first-page":"11877","DOI":"10.3390\/ijms252211877","article-title":"Single laboratory evaluation of the Q20+ nanopore sequencing kit for bacterial outbreak investigations","volume":"25","author":"Hoffmann","year":"2024","journal-title":"Int. J. Mol. 
Sci."},{"key":"B12","doi-asserted-by":"publisher","first-page":"450","DOI":"10.1016\/j.ajhg.2024.12.013","article-title":"HiFi long-read genomes for difficult-to-detect, clinically relevant variants","volume":"112","author":"H\u00f6ps","year":"2025","journal-title":"Am. J. Hum. Genet."},{"key":"B13","unstructured":"Illumina press release\n          \n          \n          2023"},{"key":"B14","doi-asserted-by":"publisher","first-page":"1145285","DOI":"10.3389\/fgene.2023.1145285","article-title":"ONT long-read WGS for variant discovery and orthogonal confirmation of short-read WGS-Derived genetic variants in clinical genetic testing","volume":"14","author":"Kaplun","year":"2023","journal-title":"Front. Genet."},{"key":"B45","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1038\/s41587-019-0054-x","article-title":"Best practices for benchmarking germline small-variant calls in human genomes","volume":"37","author":"Krusche","year":"2019","journal-title":"Nat. Biotechnol."},{"key":"B15","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1303.3997","article-title":"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM","author":"Li","year":"2013","journal-title":"arXiv"},{"key":"B16","doi-asserted-by":"publisher","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"B17","doi-asserted-by":"publisher","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with burrows\u2013wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"B18","doi-asserted-by":"publisher","DOI":"10.1101\/2024.11.01.621515","article-title":"Blended length genome sequencing (blend\u2010seq): combining short reads with low\u2010coverage long reads to maximize variant 
discovery","author":"Magner","year":"2024","journal-title":"BioRxiv"},{"key":"B19","doi-asserted-by":"publisher","first-page":"246","DOI":"10.1186\/s13059-019-1828-7","article-title":"Structural variant calling: the long and the short of it","volume":"20","author":"Mahmoud","year":"2019","journal-title":"Genome Biol."},{"key":"B21","doi-asserted-by":"publisher","first-page":"122","DOI":"10.1186\/s13059-016-0974-4","article-title":"The ensembl variant effect predictor","volume":"17","author":"McLaren","year":"2016","journal-title":"Genome Biol."},{"key":"B22","unstructured":"Update: cost-Effective genomics analysis with sentieon on azure\n          \n          \n          2024"},{"key":"B23","doi-asserted-by":"publisher","first-page":"100129","DOI":"10.1016\/j.xgen.2022.100129","article-title":"PrecisionFDA truth challenge V2: calling variants from short and long reads in difficult-to-map regions","volume":"2","author":"Olson","year":"2022","journal-title":"Cell. Genomics"},{"key":"B24","unstructured":"Revio 2022Q4\n          \n          \n          2022"},{"key":"B25","unstructured":"Sequencing 101: sequencing coverage\n          \n          \n          2024"},{"key":"B26","unstructured":"Pbsv: pacbio structural variant (SV) calling and analysis tools gitHub repository"},{"key":"B27","doi-asserted-by":"publisher","first-page":"983","DOI":"10.1038\/nbt.4235","article-title":"A universal SNP and small-indel variant caller using deep neural networks","volume":"36","author":"Poplin","year":"2018","journal-title":"Nat. 
Biotechnol."},{"key":"B28","unstructured":"PrecisionFDA challenges: introduction\n          \n          \n          2025"},{"key":"B29","doi-asserted-by":"publisher","first-page":"404","DOI":"10.1186\/s12859-021-04311-4","article-title":"HELLO: improved neural network architectures and methodologies for small variant calling","volume":"22","author":"Ramachandran","year":"2021","journal-title":"BMC Bioinforma."},{"key":"B30","doi-asserted-by":"publisher","first-page":"344","DOI":"10.1038\/s41586-023-06457-y","article-title":"The complete sequence of a human Y chromosome","volume":"621","author":"Rhie","year":"2023","journal-title":"Nature"},{"key":"B31","doi-asserted-by":"publisher","first-page":"997","DOI":"10.3390\/biology12070997","article-title":"Next\u2010generation sequencing technology: current trends and advancements","volume":"12","author":"Satam","year":"2023","journal-title":"Biology"},{"key":"B32","doi-asserted-by":"publisher","first-page":"btaf136","DOI":"10.1101\/2024.08.19.608674","article-title":"Sawfish: improving long-read structural variant discovery and genotyping with local haplotype modeling","volume":"41","author":"Saunders","year":"2024","journal-title":"Bioinformatics"},{"key":"B33","unstructured":"Sentieon CLI [gitHub repository]"},{"key":"B34","unstructured":"hap-eval: a VCF comparison engine for structural variant benchmarking"},{"key":"B35","unstructured":"Salus sentieon APP notes [PDF]"},{"key":"B36","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1038\/nature15394","article-title":"An integrated map of structural variation in 2,504 human genomes","volume":"526","author":"Sudmant","year":"2015","journal-title":"Nature"},{"key":"B38","unstructured":"Services\n          \n          \n          2025"},{"key":"B39","doi-asserted-by":"publisher","first-page":"100128","DOI":"10.1016\/j.xgen.2022.100128","article-title":"Benchmarking challenging small variants with linked and long 
reads","volume":"2","author":"Wagner","year":"","journal-title":"Cell. Genomics"},{"key":"B40","doi-asserted-by":"publisher","first-page":"672","DOI":"10.1038\/s41587-021-01158-1","article-title":"Curated variation benchmarks for challenging medically relevant autosomal genes","volume":"40","author":"Wagner","year":"","journal-title":"Nat. Biotechnol."},{"key":"B41","doi-asserted-by":"publisher","first-page":"1155","DOI":"10.1038\/s41587-019-0217-9","article-title":"Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome","volume":"37","author":"Wenger","year":"2019","journal-title":"Nat. Biotechnol."},{"key":"B42","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1038\/nrg3871","article-title":"A copy number variation map of the human genome","volume":"16","author":"Zarrei","year":"2015","journal-title":"Nat. Rev. Genet."},{"key":"B43","doi-asserted-by":"publisher","first-page":"889","DOI":"10.1186\/s12864-020-07227-0","article-title":"A comprehensive evaluation of long read error correction methods","volume":"21","author":"Zhang","year":"2020","journal-title":"BMC Genomics"},{"key":"B44","doi-asserted-by":"publisher","first-page":"1347","DOI":"10.1038\/s41587-020-0687-3","article-title":"A robust benchmark for detection of germline large deletions and insertions","volume":"38","author":"Zook","year":"2020","journal-title":"Nat. 
Biotechnol."}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1691056\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,5]],"date-time":"2026-01-05T06:38:12Z","timestamp":1767595092000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1691056\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,5]]},"references-count":45,"alternative-id":["10.3389\/fbinf.2025.1691056"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2025.1691056","relation":{},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,5]]},"article-number":"1691056"}}