{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T10:53:03Z","timestamp":1740135183556,"version":"3.37.3"},"reference-count":13,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,2,5]],"date-time":"2020-02-05T00:00:00Z","timestamp":1580860800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,2,5]],"date-time":"2020-02-05T00:00:00Z","timestamp":1580860800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/P010040\/1"],"award-info":[{"award-number":["EP\/P010040\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/N031768\/1"],"award-info":[{"award-number":["EP\/N031768\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n<jats:title>Background<\/jats:title>\n<jats:p>Current popular variant calling pipelines rely on the mapping coordinates of each input read to a reference genome in order to detect variants. Since reads deriving from variant loci that diverge in sequence substantially from the reference are often assigned incorrect mapping coordinates, variant calling pipelines that rely on mapping coordinates can exhibit reduced sensitivity.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Results<\/jats:title>\n<jats:p>In this work we present GeDi, a suffix array-based somatic single nucleotide variant (SNV) calling algorithm that does not rely on read mapping coordinates to detect SNVs and is therefore capable of reference-free and mapping-free SNV detection. GeDi executes with practical runtime and memory resource requirements, is capable of SNV detection at very low allele frequency (&lt;1%), and detects SNVs with high sensitivity at complex variant loci, dramatically outperforming MuTect, a well-established pipeline.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Conclusion<\/jats:title>\n<jats:p>By designing novel suffix-array based SNV calling methods, we have developed a practical SNV calling software, GeDi, that can characterise SNVs at complex variant loci and at low allele frequency thus increasing the repertoire of detectable SNVs in tumour genomes. We expect GeDi to find use cases in targeted-deep sequencing analysis, and to serve as a replacement and improvement over previous suffix-array based SNV calling methods.<\/jats:p>\n<\/jats:sec>","DOI":"10.1186\/s12859-020-3367-3","type":"journal-article","created":{"date-parts":[[2020,2,5]],"date-time":"2020-02-05T15:03:39Z","timestamp":1580915019000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes"],"prefix":"10.1186","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4697-6079","authenticated-orcid":false,"given":"Izaak","family":"Coleman","sequence":"first","affiliation":[]},{"given":"Giacomo","family":"Corleone","sequence":"additional","affiliation":[]},{"given":"James","family":"Arram","sequence":"additional","affiliation":[]},{"given":"Ho-Cheung","family":"Ng","sequence":"additional","affiliation":[]},{"given":"Luca","family":"Magnani","sequence":"additional","affiliation":[]},{"given":"Wayne","family":"Luk","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,2,5]]},"reference":[{"issue":"6","key":"3367_CR1","doi-asserted-by":"publisher","first-page":"613","DOI":"10.1089\/cmb.2018.0007","volume":"25","author":"M Boenn","year":"2018","unstructured":"Boenn M. Shrangesim: Simulation of single nucleotide polymorphism clusters in next-generation sequencing data. J Comput Biol. 2018; 25(6):613\u201322. https:\/\/doi.org\/10.1089\/cmb.2018.0007. PT: J; EA: APR; UT: WOS:000430152300001.","journal-title":"J Comput Biol"},{"issue":"24","key":"3367_CR2","doi-asserted-by":"publisher","first-page":"3207","DOI":"10.1093\/bioinformatics\/btp579","volume":"25","author":"JF Degner","year":"2009","unstructured":"Degner JF, Marioni JC, Pai AA, Pickrell JK, Nkadori E, Gilad Y, Pritchard JK. Effect of read-mapping biases on detecting allele-specific expression from rna-sequencing data. Bioinformatics. 2009; 25(24):3207\u201312. https:\/\/doi.org\/10.1093\/bioinformatics\/btp579. PT: J; UT: WOS:000272464000001.","journal-title":"Bioinformatics"},{"issue":"11","key":"3367_CR3","doi-asserted-by":"publisher","first-page":"1106","DOI":"10.1038\/nbt.3027","volume":"32","author":"V Moncunill","year":"2014","unstructured":"Moncunill V, Gonzalez S, Bea S, Andrieux LO, Salaverria I, Royo C, Martinez L, Puiggros M, Segura-Wang M, Stuetz AM, Navarro A, Royo R, Gelpi JL, Gut IG, Lopez-Otin C, Orozco M, Korbel J, Campo E, Puente XS, Torrents D. Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads. Nat Biotechnol. 2014; 32(11):1106\u201312. PT: J; TC: 9; UT: WOS:000344977000015.","journal-title":"Nat Biotechnol"},{"issue":"8","key":"3367_CR4","doi-asserted-by":"publisher","first-page":"e78","DOI":"10.1093\/nar\/gkw026","volume":"44","author":"Koichi Yamagata","year":"2016","unstructured":"Yamagata K, Yamanishi A, Kokubu C, Takeda J, Sese J. Cosmos: accurate detection of somatic structural variations through asymmetric comparison between tumor and normal samples. Nucleic Acids Res. 2016:026. https:\/\/doi.org\/10.1093\/nar\/gkw026.","journal-title":"Nucleic Acids Research"},{"key":"3367_CR5","doi-asserted-by":"publisher","first-page":"10001","DOI":"10.1038\/ncomms10001","volume":"6","author":"SA Tyler","year":"2015","unstructured":"Tyler SA, et al.A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing. Nat Commun. 2015; 6:10001. https:\/\/doi.org\/10.1038\/ncomms10001. PT: J; UT: WOS:000367579200001.","journal-title":"Nat Commun"},{"issue":"3","key":"3367_CR6","doi-asserted-by":"publisher","first-page":"213","DOI":"10.1038\/nbt.2514","volume":"31","author":"K Cibulskis","year":"2013","unstructured":"Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, Gabriel S, Meyerson M, Lander ES, Getz G. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol. 2013; 31(3):213\u20139. https:\/\/doi.org\/10.1038\/nbt.2514. PT: J; UT: WOS:000316439500014.","journal-title":"Nat Biotechnol"},{"issue":"4","key":"3367_CR7","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/nmeth.1923","volume":"9","author":"B Langmead","year":"2012","unstructured":"Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat Methods. 2012; 9(4):357\u201354. https:\/\/doi.org\/10.1038\/NMETH.1923. PT: J; UT: WOS:000302218500017.","journal-title":"Nat Methods"},{"issue":"4","key":"3367_CR8","doi-asserted-by":"publisher","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","volume":"28","author":"W Huang","year":"2012","unstructured":"Huang W, Li L, Myers JR, Marth GT. Art: a next-generation sequencing read simulator. Bioinformatics. 2012; 28(4):593\u20134. https:\/\/doi.org\/10.1093\/bioinformatics\/btr708. PT: J; UT: WOS:000300490500023.","journal-title":"Bioinformatics"},{"issue":"1","key":"3367_CR9","doi-asserted-by":"publisher","first-page":"1377","DOI":"10.1038\/s41467-017-01470-y","volume":"8","author":"H-T Shin","year":"2017","unstructured":"Shin H-T, Choi Y-L, Yun JW, Kim NKD, Kim S-Y, Jeon HJ, Nam J-Y, Lee C, Ryu D, Kim SC, Park K, Lee E, Bae JS, Son DS, Joung J-G, Lee J, Kim ST, Ahn M-J, Lee S-H, Ahn JS, Lee WY, Oh BY, Park YH, Lee JE, Lee KH, Kim HC, Kim K-M, Im Y-H, Park K, Park PJ, Park W-Y. Prevalence and detection of low-allele-fraction variants in clinical cancer samples. Nat Commun. 2017; 8(1):1377. https:\/\/doi.org\/10.1038\/s41467-017-01470-y.","journal-title":"Nat Commun"},{"issue":"1","key":"3367_CR10","doi-asserted-by":"publisher","first-page":"46","DOI":"10.1109\/99.660313","volume":"5","author":"L Dagum","year":"1998","unstructured":"Dagum L, Menon R. Openmp: An industry standard api for shared-memory programming. IEEE Comput Sci Eng. 1998; 5(1):46\u201355. https:\/\/doi.org\/10.1109\/99.660313. PT: J; UT: WOS:000072636000007.","journal-title":"IEEE Comput Sci Eng"},{"issue":"10","key":"3367_CR11","doi-asserted-by":"publisher","first-page":"983","DOI":"10.1038\/nbt.4235","volume":"36","author":"R Poplin","year":"2018","unstructured":"Poplin R, Chang P-C, Alexander D, Schwartz S, Colthurst T, Ku A, Newburger D, Dijamco J, Nguyen N, Afshar PT, Gross SS, Dorfman L, McLean CY, DePristo MA. A universal SNP and small-indel variant caller using deep neural networks. Nat Biotechnol. 2018; 36(10):983. https:\/\/doi.org\/10.1038\/nbt.423.","journal-title":"Nat Biotechnol"},{"issue":"2","key":"3367_CR12","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1038\/ng.1028","volume":"44","author":"Z Iqbal","year":"2012","unstructured":"Iqbal Z, Caccamo M, Turner I, Flicek P, McVean G. De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat Genet. 2012; 44(2):226\u201332. https:\/\/doi.org\/10.1038\/ng.1028. Accessed 27 June 2019.","journal-title":"Nat Genet"},{"key":"3367_CR13","volume-title":"Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. BCB \u201916","author":"A Bateman","year":"2016","unstructured":"Bateman A, Treangen TJ, Pop M. Limitations of current approaches for reference-free, graph-based variant detection. In: Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics. BCB \u201916. New York: ACM: 2016. p. 499\u2013500. https:\/\/doi.org\/10.1145\/2975167.2985653. event-place: Seattle, WA, USA. Accessed 9 July 2019."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-3367-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12859-020-3367-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-3367-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,2,4]],"date-time":"2021-02-04T00:12:39Z","timestamp":1612397559000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-3367-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,2,5]]},"references-count":13,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["3367"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-3367-3","relation":{},"ISSN":["1471-2105"],"issn-type":[{"type":"electronic","value":"1471-2105"}],"subject":[],"published":{"date-parts":[[2020,2,5]]},"assertion":[{"value":"25 September 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 January 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 February 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Ethics approval was obtained by the Data Access Compliance Office (DACO) of International Cancer Genome Consortium, granting us access to dataset EGAD00001001859 through application (DACO-1049545). Ethics approval was obtained by Imperial College Healthcare Tissue Bank, granting us access to datasets TSD:chr22 and TSD:chr17 (unpublished) through application R17027-3A.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"Maxeler Technologies partially funded IC for approximately 4 months. This funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"45"}}