{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T01:55:38Z","timestamp":1769824538198,"version":"3.49.0"},"reference-count":22,"publisher":"Walter de Gruyter GmbH","issue":"4","license":[{"start":{"date-parts":[[2021,11,16]],"date-time":"2021-11-16T00:00:00Z","timestamp":1637020800000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,12,22]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Sequencing technologies have provided the basis of most modern genome sequencing studies due to their high base-level accuracy and relatively low cost. One of the most demanding steps is mapping reads to the human reference genome. The reliance on a single reference human genome could introduce substantial biases in downstream analyses. Pangenomic graph reference representations offer an attractive approach for storing genetic variations. Moreover, it is possible to include known variants in the reference in order to make read mapping, variant calling, and genotyping variant-aware. Only recently a framework for variation graphs, <jats:italic>vg<\/jats:italic> [Garrison E, Adam MN, Siren J, et\u00a0al. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat Biotechnol 2018;36:875\u20139], has improved variation-aware alignment and variant calling in general. The major bottleneck of <jats:italic>vg<\/jats:italic> is the high cost of mapping reads to a variation graph. In this paper, we study the problem of SNP calling on a variation graph and we present a fast read alignment tool, named VG SNP-Aware. VG SNP-Aware is able to align reads exactly to a variation graph and detect SNPs based on these aligned reads. 
The results show that VG SNP-Aware can efficiently map reads to a variation graph with a speedup of 40\u00d7 with respect to <jats:italic>vg<\/jats:italic> and similar accuracy on SNP detection.<\/jats:p>","DOI":"10.1515\/jib-2021-0032","type":"journal-article","created":{"date-parts":[[2021,11,16]],"date-time":"2021-11-16T03:30:50Z","timestamp":1637033450000},"source":"Crossref","is-referenced-by-count":8,"title":["Fast alignment of reads to a variation graph with application to SNP detection"],"prefix":"10.1515","volume":"18","author":[{"given":"Maurilio","family":"Monsu","sequence":"first","affiliation":[{"name":"Department of Information Engineering , University of Padua , Padua 35100 , Italy"}]},{"given":"Matteo","family":"Comin","sequence":"additional","affiliation":[{"name":"Department of Information Engineering , University of Padua , Padua 35100 , Italy"}]}],"member":"374","published-online":{"date-parts":[[2021,11,16]]},"reference":[{"key":"2023033120115564150_j_jib-2021-0032_ref_001","doi-asserted-by":"crossref","unstructured":"The 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 2010;467:1061\u201373. https:\/\/doi.org\/10.1038\/nature09534.","DOI":"10.1038\/nature09534"},{"key":"2023033120115564150_j_jib-2021-0032_ref_002","doi-asserted-by":"crossref","unstructured":"Arita, M, Karsch-Mizrachi, I, Guy, C, INSDC. The international nucleotide sequence database collaboration. Nucleic Acids Res 2020;49:D121\u20134. https:\/\/doi.org\/10.1093\/nar\/gkaa967.","DOI":"10.1093\/nar\/gkaa967"},{"key":"2023033120115564150_j_jib-2021-0032_ref_003","doi-asserted-by":"crossref","unstructured":"Brandt, DYC, Aguiar, VRC, Bitarello, BD, Nunes, K, Goudet, J, Meyer, D. Mapping bias overestimates reference allele frequencies at the HLA genes in the 1000 genomes project phase I data. G3: Genes, Genomes, Genet 2015;5:931\u201341. 
https:\/\/doi.org\/10.1534\/g3.114.015784.","DOI":"10.1534\/g3.114.015784"},{"key":"2023033120115564150_j_jib-2021-0032_ref_004","doi-asserted-by":"crossref","unstructured":"G\u00fcnther, T, Nettelblad, C. The presence and impact of reference bias on population genomic studies of prehistoric human populations. PLoS Genet 2019;15:1\u201320. https:\/\/doi.org\/10.1371\/journal.pgen.1008302.","DOI":"10.1371\/journal.pgen.1008302"},{"key":"2023033120115564150_j_jib-2021-0032_ref_005","doi-asserted-by":"crossref","unstructured":"Salavati, M, Bush, SJ, Palma-Vera, S, McCulloch, MEB, Hume, DA, Clark, EL. Elimination of reference mapping bias reveals robust immune related allele-specific expression in crossbred sheep. Front Genet 2019;10:863. https:\/\/doi.org\/10.3389\/fgene.2019.00863.","DOI":"10.3389\/fgene.2019.00863"},{"key":"2023033120115564150_j_jib-2021-0032_ref_006","doi-asserted-by":"crossref","unstructured":"G\u00fcnther, T, Nettelblad, C. The presence and impact of reference bias on population genomic studies of prehistoric human populations. PLoS Genet 2019;15:1\u201320. https:\/\/doi.org\/10.1371\/journal.pgen.1008302.","DOI":"10.1371\/journal.pgen.1008302"},{"key":"2023033120115564150_j_jib-2021-0032_ref_007","doi-asserted-by":"crossref","unstructured":"Martiniano, R, Garrison, E, Jones, ER, et al.. Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph. Genome Biol 2020;21:250. https:\/\/doi.org\/10.1186\/s13059-020-02160-7.","DOI":"10.1186\/s13059-020-02160-7"},{"key":"2023033120115564150_j_jib-2021-0032_ref_008","doi-asserted-by":"crossref","unstructured":"Sherry, ST, Ward, MH, Kholodov, M, Baker, J, Phan, L, Smigielski, EM, et al.. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 2001;29:308\u201311. 
https:\/\/doi.org\/10.1093\/nar\/29.1.308.","DOI":"10.1093\/nar\/29.1.308"},{"key":"2023033120115564150_j_jib-2021-0032_ref_009","doi-asserted-by":"crossref","unstructured":"Paten, B, Novak, A, Eizenga, J, Garrison, E. Genome graphs and the evolution of genome inference. Genome Res 2017;27:665\u201376. https:\/\/doi.org\/10.1101\/gr.214155.116.","DOI":"10.1101\/gr.214155.116"},{"key":"2023033120115564150_j_jib-2021-0032_ref_010","doi-asserted-by":"crossref","unstructured":"Garrison, E, Adam, MN, Siren, J, et al.. Variation graph toolkit improves read mapping by representing genetic variation in the reference. Nat Biotechnol 2018;36:875\u20139. https:\/\/doi.org\/10.1038\/nbt.4227.","DOI":"10.1038\/nbt.4227"},{"key":"2023033120115564150_j_jib-2021-0032_ref_011","doi-asserted-by":"crossref","unstructured":"Rakocevic, G, Semenyuk, V, Spencer, J, Browning, J, Johnson, I, Arsenijevic, V, et al.. Fast and accurate genomic analyses using genome graphs. Nat Genet 2019;51:354\u201362. https:\/\/doi.org\/10.1038\/s41588-018-0316-4.","DOI":"10.1038\/s41588-018-0316-4"},{"key":"2023033120115564150_j_jib-2021-0032_ref_012","doi-asserted-by":"crossref","unstructured":"Altschul, SF, Gish, W, Miller, W, Myers, EW, Lipman, DJ. Basic local alignment search tool. J Mol Biol 1990;215:403\u201310. https:\/\/doi.org\/10.1016\/s0022-2836(05)80360-2.","DOI":"10.1016\/S0022-2836(05)80360-2"},{"key":"2023033120115564150_j_jib-2021-0032_ref_013","doi-asserted-by":"crossref","unstructured":"Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 2018;34:3094\u2013100. https:\/\/doi.org\/10.1093\/bioinformatics\/bty191.","DOI":"10.1093\/bioinformatics\/bty191"},{"key":"2023033120115564150_j_jib-2021-0032_ref_014","doi-asserted-by":"crossref","unstructured":"Salmela, L, Rivals, E. LoRDEC: accurate and efficient long read error correction. Bioinformatics 2014;30:3506\u201314. 
https:\/\/doi.org\/10.1093\/bioinformatics\/btu538.","DOI":"10.1093\/bioinformatics\/btu538"},{"key":"2023033120115564150_j_jib-2021-0032_ref_015","doi-asserted-by":"crossref","unstructured":"Antipov, D, Korobeynikov, A, McLean, J, Pevzner, P. HYBRIDSPADES: an algorithm for hybrid assembly of short and long reads. Bioinformatics 2015;32:btv688. https:\/\/doi.org\/10.1093\/bioinformatics\/btv688.","DOI":"10.1093\/bioinformatics\/btv688"},{"key":"2023033120115564150_j_jib-2021-0032_ref_016","doi-asserted-by":"crossref","unstructured":"Grossi, R, Vitter, JS. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM J Comput 2005;35:378\u2013407. https:\/\/doi.org\/10.1137\/S0097539702402354.","DOI":"10.1137\/S0097539702402354"},{"key":"2023033120115564150_j_jib-2021-0032_ref_017","doi-asserted-by":"crossref","unstructured":"Siren, J, Garrison, E, Novak, AM, Paten, B, Durbin, R. Haplotype-aware graph indexes. Bioinformatics 2020;36:400\u20137. https:\/\/doi.org\/10.1093\/bioinformatics\/btz575.","DOI":"10.1093\/bioinformatics\/btz575"},{"key":"2023033120115564150_j_jib-2021-0032_ref_018","doi-asserted-by":"crossref","unstructured":"Shibuya, Y, Comin, M. Better quality score compression through sequence-based quality smoothing. BMC Bioinf 2019;20:302. https:\/\/doi.org\/10.1186\/s12859-019-2883-5.","DOI":"10.1186\/s12859-019-2883-5"},{"key":"2023033120115564150_j_jib-2021-0032_ref_019","doi-asserted-by":"crossref","unstructured":"Shibuya, Y, Comin, M. Indexing k-mers in linear space for quality value compression. J Bioinf\u00a0Comput Biol 2019;17:1940011. https:\/\/doi.org\/10.1142\/S0219720019400110.","DOI":"10.1142\/S0219720019400110"},{"key":"2023033120115564150_j_jib-2021-0032_ref_020","unstructured":"Marcolin, M, Andreace, F, Comin, M. Indexing K-mers in Linear Space with Application to SNP Detection. 2021. 
to appear."},{"key":"2023033120115564150_j_jib-2021-0032_ref_021","doi-asserted-by":"crossref","unstructured":"Zook, J, McDaniel, J, Olson, N, Wagner, J, Parikh, H, Heaton, H, et al.. An open resource for accurately benchmarking small variant and reference calls. Nat Biotechnol 2019;37:561\u20136. https:\/\/doi.org\/10.1038\/s41587-019-0074-6.","DOI":"10.1038\/s41587-019-0074-6"},{"key":"2023033120115564150_j_jib-2021-0032_ref_022","doi-asserted-by":"crossref","unstructured":"Shajii, A, Yorukoglu, D, Yu, YW, Berger, B. Fast genotyping of known SNPs through approximate k-mer matching. Bioinformatics 2016;32:538\u201344. https:\/\/doi.org\/10.1093\/bioinformatics\/btw460.","DOI":"10.1093\/bioinformatics\/btw460"}],"container-title":["Journal of Integrative Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/jib-2021-0032\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/jib-2021-0032\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,1]],"date-time":"2023-04-01T08:36:15Z","timestamp":1680338175000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/jib-2021-0032\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,16]]},"references-count":22,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,12,16]]},"published-print":{"date-parts":[[2021,12,22]]}},"alternative-id":["10.1515\/jib-2021-0032"],"URL":"https:\/\/doi.org\/10.1515\/jib-2021-0032","relation":{},"ISSN":["1613-4516"],"issn-type":[{"value":"1613-4516","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,16]]},"article-number":"20210032"}}