{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T19:08:28Z","timestamp":1776971308742,"version":"3.51.4"},"reference-count":77,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,4,23]],"date-time":"2025-04-23T00:00:00Z","timestamp":1745366400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>Artificial intelligence (AI) has revolutionized numerous fields, including genomics, where it has significantly impacted variant calling, a crucial process in genomic analysis. Variant calling involves the detection of genetic variants such as single nucleotide polymorphisms (SNPs), insertions\/deletions (InDels), and structural variants from high-throughput sequencing data. Traditionally, statistical approaches have dominated this task, but the advent of AI led to the development of sophisticated tools that promise higher accuracy, efficiency, and scalability. This review explores the state-of-the-art AI-based variant calling tools, including DeepVariant, DNAscope, DeepTrio, Clair, Clairvoyante, Medaka, and HELLO. We discuss their underlying methodologies, strengths, limitations, and performance metrics across different sequencing technologies, alongside their computational requirements, focusing primarily on SNP and InDel detection. 
By comparing these AI-driven techniques with conventional methods, we highlight the transformative advancements AI has introduced and its potential to further enhance genomic research.<\/jats:p>","DOI":"10.3389\/fbinf.2025.1574359","type":"journal-article","created":{"date-parts":[[2025,4,23]],"date-time":"2025-04-23T05:21:07Z","timestamp":1745385667000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["Artificial intelligence in variant calling: a review"],"prefix":"10.3389","volume":"5","author":[{"given":"Omar","family":"Abdelwahab","sequence":"first","affiliation":[]},{"given":"Davoud","family":"Torkamaneh","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,4,23]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"472","DOI":"10.1186\/s12859-023-05596-3","article-title":"Performance analysis of conventional and AI-based variant callers using short and long reads","volume":"24","author":"Abdelwahab","year":"2023","journal-title":"BMC Bioinforma."},{"key":"B2","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1038\/nature11632","article-title":"An integrated map of genetic variation from 1,092 human genomes","volume":"491","author":"Altshuler","year":"2012","journal-title":"Nature"},{"key":"B3","doi-asserted-by":"publisher","first-page":"1523","DOI":"10.1002\/AJMG.A.35470","article-title":"The Centers for Mendelian Genomics: a new large-scale initiative to identify the genes underlying rare Mendelian conditions","author":"Bamshad","year":"2012","journal-title":"Am. J. Med. Genet. 
A"},{"key":"B4","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1186\/s12864-022-08365-3","article-title":"Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery","volume":"23","author":"Barbitoff","year":"2022","journal-title":"BMC Genomics"},{"key":"B5","doi-asserted-by":"publisher","first-page":"559","DOI":"10.1002\/HUMU.22982","article-title":"Gene variant databases and sharing: creating a global genomic variant database for personalized medicine","volume":"37","author":"Bean","year":"2016","journal-title":"Hum. Mutat."},{"key":"B6","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1038\/gim.2015.111","article-title":"Utility of whole-genome sequencing for detection of newborn screening disorders in a population cohort of 1,696 neonates","volume":"18","author":"Bodian","year":"2015","journal-title":"Genet. Med. 2016"},{"key":"B7","doi-asserted-by":"publisher","first-page":"lqae013","DOI":"10.1093\/NARGAB\/LQAE013","article-title":"Extending DeepTrio for sensitive detection of complex de novo mutation patterns","volume":"6","author":"Brand","year":"2024","journal-title":"Nar. Genom Bioinform"},{"key":"B8","doi-asserted-by":"publisher","first-page":"3240","DOI":"10.1038\/s41467-019-11146-4","article-title":"Comprehensive evaluation and characterisation of short read general-purpose structural variant calling software","volume":"10","author":"Cameron","year":"2019","journal-title":"Nat. 
Commun."},{"key":"B9","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1186\/s12859-023-05294-0","article-title":"Improving variant calling using population data and deep learning","volume":"24","author":"Chen","year":"2023","journal-title":"BMC Bioinforma."},{"key":"B10","doi-asserted-by":"publisher","first-page":"3","DOI":"10.17605\/OSF.IO\/GV5T4","article-title":"Advantages and disadvantages of artificial intelligence and machine learning: a literature review","volume":"9","author":"Chhaya","year":"2020","journal-title":"Int. J. Libr. and Inf. Sci. (IJLIS)"},{"key":"B11","doi-asserted-by":"publisher","first-page":"1033","DOI":"10.3390\/BIOLOGY12071033","article-title":"Transformer architecture and attention mechanisms in genome data analysis: a comprehensive review","volume":"12","author":"Choi","year":"2023","journal-title":"Biology"},{"key":"B12","doi-asserted-by":"publisher","first-page":"giab008","DOI":"10.1093\/GIGASCIENCE\/GIAB008","article-title":"Twelve years of SAMtools and BCFtools","volume":"10","author":"Danecek","year":"2021","journal-title":"Gigascience"},{"key":"B13","doi-asserted-by":"publisher","first-page":"31","DOI":"10.5120\/20182-2402","article-title":"Applications of artificial intelligence in machine learning: review and prospect","volume":"115","author":"Das","year":"2015","journal-title":"Int. J. Comput. 
Appl."},{"key":"B14","unstructured":"NVIDIA docs, 2023"},{"key":"B15","unstructured":"GitHub, 2024"},{"key":"B16","unstructured":"Improved non-human variant calling using species-specific DeepVariant models, 2018"},{"key":"B17","doi-asserted-by":"publisher","first-page":"115717","DOI":"10.1101\/115717","article-title":"The Sentieon Genomics Tools - a fast and accurate solution to variant calling from next-generation sequence data","author":"Freed","year":"2017","journal-title":"bioRxiv"},{"key":"B18","doi-asserted-by":"publisher","first-page":"492556","DOI":"10.1101\/2022.05.20.492556","article-title":"DNAscope: high accuracy small variant calling using machine learning","author":"Freed","year":"","journal-title":"bioRxiv"},{"key":"B19","doi-asserted-by":"publisher","DOI":"10.1101\/2022.06.01.494452","article-title":"Sentieon DNAscope LongRead \u2013 a highly accurate, fast, and efficient pipeline for germline variant calling from PacBio HiFi reads","author":"Freed","year":"","journal-title":"bioRxiv"},{"key":"B20","unstructured":"Haplotype-based variant detection from short-read sequencing, Garrison E., Marth G., 2012"},{"key":"B21","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1038\/NBT0715-675A","article-title":"Genome in a bottle\u2014a human DNA standard","volume":"33","year":"2015","journal-title":"Nat. Biotechnol."},{"key":"B22","doi-asserted-by":"publisher","first-page":"736","DOI":"10.3389\/fgene.2019.00736","article-title":"Sentieon DNASeq variant calling workflow demonstrates strong computational performance and accuracy","volume":"10","author":"Glusman","year":"2019","journal-title":"Front. 
Genet."},{"key":"B23","unstructured":"The value of genomic analysis - Google health, 2022"},{"key":"B24","doi-asserted-by":"publisher","DOI":"10.7554\/ELIFE.98300","article-title":"Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data","volume":"13","author":"Hall","year":"2024","journal-title":"Elife"},{"key":"B25","doi-asserted-by":"publisher","first-page":"373","DOI":"10.3390\/DIAGNOSTICS13030373","article-title":"Next-generation sequencing (NGS) and third-generation sequencing (TGS) for the diagnosis of thalassemia","volume":"13","author":"Hassan","year":"2023","journal-title":"Diagnostics"},{"key":"B26","doi-asserted-by":"publisher","first-page":"6160","DOI":"10.1038\/s41598-024-56604-2","article-title":"Benchmarking long-read aligners and SV callers for structural variation detection in Oxford nanopore sequencing data","volume":"14","author":"Helal","year":"2024","journal-title":"Sci. Rep."},{"key":"B27","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1007\/s12178-020-09600-8","article-title":"Machine learning and artificial intelligence: definitions, applications, and future directions","volume":"13","author":"Helm","year":"2020","journal-title":"Curr. Rev. Musculoskelet. Med."},{"key":"B28","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2020\/7231205","article-title":"DeepVariant-on-Spark: small-scale genome analysis using a cloud-based computing framework","volume":"2020","author":"Huang","year":"2020","journal-title":"Comput. Math. 
Methods Med."},{"key":"B29","doi-asserted-by":"publisher","first-page":"318","DOI":"10.1186\/s12864-024-10239-9","article-title":"Comparison of structural variant callers for massive whole-genome sequence data","volume":"25","author":"Joe","year":"2024","journal-title":"BMC Genomics"},{"key":"B30","unstructured":"Signal-level algorithms for MinION data, 2017"},{"key":"B31","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1093\/BFGP\/ELAE003","article-title":"A comprehensive review of deep learning-based variant calling methods","volume":"23","author":"Junjun","year":"2024","journal-title":"Brief. Funct. Genomics"},{"key":"B32","doi-asserted-by":"publisher","first-page":"591","DOI":"10.1038\/s41592-018-0051-x","article-title":"Strelka2: fast and accurate calling of germline and somatic variants","volume":"15","author":"Kim","year":"2018","journal-title":"Nat. Methods"},{"key":"B33","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1186\/s13073-020-00791-w","article-title":"Best practices for variant calling in clinical sequencing","volume":"12","author":"Koboldt","year":"2020","journal-title":"Genome Med."},{"key":"B34","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1038\/nature11412","article-title":"Comprehensive molecular portraits of human breast tumours","volume":"490","author":"Koboldt","year":"2012","journal-title":"Nature"},{"key":"B35","doi-asserted-by":"publisher","DOI":"10.1101\/2021.04.05.438434","article-title":"DeepTrio: variant calling in families using deep learning","author":"Kolesnikov","year":"2021","journal-title":"bioRxiv"},{"key":"B36","doi-asserted-by":"publisher","first-page":"e3001507","DOI":"10.1371\/JOURNAL.PBIO.3001507","article-title":"DAJIN enables multiplex genotyping to simultaneously validate intended and unintended target genome editing outcomes","volume":"20","author":"Kuno","year":"2022","journal-title":"PLoS 
Biol."},{"key":"B37","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1186\/s12864-022-08775-3","article-title":"Accuracy benchmark of the GeneMind GenoLab M sequencing platform for WGS and WES analysis","volume":"23","author":"Li","year":"","journal-title":"BMC Genomics"},{"key":"B38","doi-asserted-by":"publisher","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The sequence alignment\/map format and SAMtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"B39","doi-asserted-by":"publisher","first-page":"3197","DOI":"10.1007\/s10115-022-01756-8","article-title":"Interpretable deep learning: interpretation, interpretability, trustworthiness, and beyond","volume":"64","author":"Li","year":"","journal-title":"Knowl. Inf. Syst."},{"key":"B40","doi-asserted-by":"publisher","first-page":"18","DOI":"10.3390\/E23010018","article-title":"Explainable AI: a review of machine learning interpretability methods","volume":"23","author":"Linardatos","year":"2020","journal-title":"Entropy"},{"key":"B41","doi-asserted-by":"publisher","first-page":"e75619","DOI":"10.1371\/JOURNAL.PONE.0075619","article-title":"Variant callers for next-generation sequencing data: a comparison study","volume":"8","author":"Liu","year":"2013","journal-title":"PLoS One"},{"key":"B42","doi-asserted-by":"publisher","first-page":"a008581","DOI":"10.1101\/CSHPERSPECT.A008581","article-title":"Personalized medicine and human genetic diversity","volume":"4","author":"Lu","year":"2014","journal-title":"Cold Spring Harb. Perspect. Med."},{"key":"B43","doi-asserted-by":"publisher","first-page":"998","DOI":"10.1038\/s41467-019-09025-z","article-title":"A multi-task convolutional deep neural network for variant calling in single molecule sequencing","volume":"10","author":"Luo","year":"2019","journal-title":"Nat. 
Commun."},{"key":"B44","doi-asserted-by":"publisher","first-page":"220","DOI":"10.1038\/s42256-020-0167-4","article-title":"Exploring the limit of using a deep neural network on pileup data for germline variant calling","volume":"2","author":"Luo","year":"2020","journal-title":"Nat. Mach. Intell."},{"key":"B45","doi-asserted-by":"publisher","first-page":"118","DOI":"10.1159\/000346826","article-title":"Population genetics of rare variants and complex diseases","volume":"74","author":"Maher","year":"2013","journal-title":"Hum. Hered."},{"key":"B46","doi-asserted-by":"publisher","first-page":"837","DOI":"10.1038\/s41467-024-44804-3","article-title":"Utility of long-read sequencing for all of us","volume":"15","author":"Mahmoud","year":"2024","journal-title":"Nat. Commun."},{"key":"B47","doi-asserted-by":"publisher","first-page":"1297","DOI":"10.1101\/GR.107524.110","article-title":"The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data","volume":"20","author":"McKenna","year":"2010","journal-title":"Genome Res."},{"key":"B48","doi-asserted-by":"publisher","first-page":"768","DOI":"10.3390\/ELECTRONICS8070768","article-title":"A survey on internet of things and cloud computing for healthcare","volume":"8","author":"Minh Dang","year":"2019","journal-title":"Electronics"},{"key":"B49","unstructured":"Sequence correction provided by ONT research, 2018"},{"key":"B50","doi-asserted-by":"publisher","first-page":"100129","DOI":"10.1016\/J.XGEN.2022.100129","article-title":"PrecisionFDA Truth Challenge V2: calling variants from short and long reads in difficult-to-map regions","volume":"2","author":"Olson","year":"2022","journal-title":"Cell Genomics"},{"key":"B51","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1186\/s12864-015-1219-8","article-title":"Impacts of low coverage depths and post-mortem DNA damage on variant calling: a simulation 
study","volume":"16","author":"Parks","year":"2015","journal-title":"BMC Genomics"},{"key":"B52","doi-asserted-by":"publisher","first-page":"bbaa148","DOI":"10.1093\/BIB\/BBAA148","article-title":"Benchmarking variant callers in next-generation and third-generation sequencing analysis","volume":"22","author":"Pei","year":"2021","journal-title":"Brief. Bioinform"},{"key":"B53","doi-asserted-by":"publisher","first-page":"983","DOI":"10.1038\/nbt.4235","article-title":"A universal SNP and small-indel variant caller using deep neural networks","volume":"36","author":"Poplin","year":"2018","journal-title":"Nat. Biotechnol."},{"key":"B54","doi-asserted-by":"publisher","first-page":"404","DOI":"10.1186\/s12859-021-04311-4","article-title":"HELLO: improved neural network architectures and methodologies for small variant calling","volume":"22","author":"Ramachandran","year":"2021","journal-title":"BMC Bioinforma."},{"key":"B55","doi-asserted-by":"publisher","first-page":"912","DOI":"10.1038\/ng.3036","article-title":"Integrating mapping-assembly- and haplotype-based approaches for calling variants in clinical sequencing applications","volume":"46","author":"Rimmer","year":"2014","journal-title":"Nat. 
Genet."},{"key":"B56","doi-asserted-by":"publisher","first-page":"1811","DOI":"10.1093\/bioinformatics\/bts271","article-title":"Strelka: accurate somatic small-variant calling from sequenced tumor\u2013normal sample pairs","volume":"28","author":"Saunders","year":"2012","journal-title":"Bioinformatics"},{"key":"B57","doi-asserted-by":"publisher","first-page":"e066288","DOI":"10.1136\/BMJ-2021-066288","article-title":"Use of whole genome sequencing to determine genetic basis of suspected mitochondrial disorders: cohort study","volume":"375","author":"Schon","year":"2021","journal-title":"BMJ"},{"key":"B58","unstructured":"DNAscope LongRead nanopore pipeline - Sentieon, 2025"},{"key":"B59","doi-asserted-by":"publisher","first-page":"1322","DOI":"10.1038\/s41592-021-01299-w","article-title":"Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads","volume":"18","author":"Shafin","year":"2021","journal-title":"Nat. Methods"},{"key":"B60","doi-asserted-by":"publisher","first-page":"407","DOI":"10.1038\/nmeth.4184","article-title":"Detecting DNA cytosine methylation using nanopore sequencing","volume":"14","author":"Simpson","year":"2017","journal-title":"Nat. Methods"},{"key":"B61","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1038\/nrg3642","article-title":"Sequencing depth and coverage: key considerations in genomic analyses","volume":"15","author":"Sims","year":"2014","journal-title":"Nat. Rev. Genet."},{"key":"B62","doi-asserted-by":"publisher","first-page":"a023168","DOI":"10.1101\/CSHPERSPECT.A023168","article-title":"Whole-exome sequencing and whole-genome sequencing in critically ill neonates suspected to have single-gene disorders","volume":"6","author":"Smith","year":"2016","journal-title":"Cold Spring Harb. Perspect. 
Med."},{"key":"B63","first-page":"942","article-title":"Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank","volume-title":"Nature Genetics","author":"Szustakowski","year":"2021"},{"key":"B64","doi-asserted-by":"publisher","first-page":"2059","DOI":"10.1056\/NEJMOA1301689","article-title":"Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia","volume":"368","author":"Timothy","year":"2013","journal-title":"N. Engl. J. Med."},{"key":"B66","doi-asserted-by":"publisher","DOI":"10.31838\/jcr.07.10.282","article-title":"A literature review on security issues in cloud computing: opportunities and challenges","author":"Vistro","year":"2020","journal-title":"J. Crit. Rev"},{"key":"B67","doi-asserted-by":"publisher","first-page":"100128","DOI":"10.1016\/J.XGEN.2022.100128","article-title":"Benchmarking challenging small variants with linked and long reads","volume":"2","author":"Wagner","year":"2022","journal-title":"Cell Genomics"},{"key":"B68","doi-asserted-by":"publisher","first-page":"82","DOI":"10.1038\/nature14962","article-title":"The UK10K project identifies rare variants in health and disease","volume":"526","author":"Walter","year":"2015","journal-title":"Nature"},{"key":"B69","doi-asserted-by":"publisher","first-page":"1499","DOI":"10.1007\/s00439-021-02387-9","article-title":"Interpretable machine learning for genomics","volume":"141","author":"Watson","year":"2022","journal-title":"Hum. Genet."},{"key":"B70","doi-asserted-by":"publisher","first-page":"1155","DOI":"10.1038\/s41587-019-0217-9","article-title":"Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome","volume":"37","author":"Wenger","year":"2019","journal-title":"Nat. 
Biotechnol."},{"key":"B71","doi-asserted-by":"publisher","first-page":"377","DOI":"10.1016\/S2213-2600(15)00139-3","article-title":"Whole-genome sequencing for identification of Mendelian disorders in critically ill infants: a retrospective analysis of diagnostic and clinical findings","volume":"3","author":"Willig","year":"2015","journal-title":"Lancet Respir. Med."},{"key":"B72","doi-asserted-by":"publisher","first-page":"765","DOI":"10.1038\/nrg3786","article-title":"The contribution of genetic variants to disease depends on the ruler","volume":"15","author":"Witte","year":"2014","journal-title":"Nat. Rev. Genet."},{"key":"B73","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1016\/J.CSBJ.2018.01.003","article-title":"A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data","volume":"16","author":"Xu","year":"2018","journal-title":"Comput. Struct. Biotechnol. J."},{"key":"B74","doi-asserted-by":"publisher","first-page":"308","DOI":"10.1186\/s12859-023-05434-6","article-title":"Boosting variant-calling performance with multi-platform sequencing data using Clair3-MP","volume":"24","author":"Yu","year":"2023","journal-title":"BMC Bioinforma."},{"key":"B75","doi-asserted-by":"publisher","first-page":"5582","DOI":"10.1093\/bioinformatics\/btaa1081","article-title":"Accurate, scalable cohort variant calls using DeepVariant and GLnexus","volume":"36","author":"Yun","year":"2020","journal-title":"Bioinformatics"},{"key":"B76","doi-asserted-by":"publisher","first-page":"14645","DOI":"10.3390\/IJMS241914645","article-title":"Applications for deep learning in epilepsy genetic research","volume":"24","author":"Zeibich","year":"2023","journal-title":"Int. J. Mol. 
Sci."},{"key":"B77","doi-asserted-by":"publisher","first-page":"797","DOI":"10.1038\/s43588-022-00387-x","article-title":"Symphonizing pileup and full-alignment for deep learning-based long-read variant calling","volume":"2","author":"Zheng","year":"2022","journal-title":"Nat. Comput. Sci."},{"key":"B78","doi-asserted-by":"publisher","first-page":"160025","DOI":"10.1038\/sdata.2016.25","article-title":"Extensive sequencing of seven human genomes to characterize benchmark reference materials","volume":"3","author":"Zook","year":"2016","journal-title":"Sci. Data"}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1574359\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,23]],"date-time":"2025-04-23T05:21:11Z","timestamp":1745385671000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2025.1574359\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,23]]},"references-count":77,"alternative-id":["10.3389\/fbinf.2025.1574359"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2025.1574359","relation":{},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,23]]},"article-number":"1574359"}}