{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T00:35:57Z","timestamp":1774312557243,"version":"3.50.1"},"reference-count":23,"publisher":"Oxford University Press (OUP)","issue":"24","license":[{"start":{"date-parts":[[2016,11,7]],"date-time":"2016-11-07T00:00:00Z","timestamp":1478476800000},"content-version":"vor","delay-in-days":816,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2014,12,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: Insertions play an important role in genome evolution. However, such variants are difficult to detect from short-read sequencing data, especially when they exceed the paired-end insert size. Many approaches have been proposed to call short insertion variants based on paired-end mapping. However, there remains a lack of practical methods to detect and assemble long variants.<\/jats:p>\n               <jats:p>Results: We propose here an original method, called MindTheGap, for the integrated detection and assembly of insertion variants from re-sequencing data. Importantly, it is designed to call insertions of any size, whether they are novel or duplicated, homozygous or heterozygous in the donor genome. MindTheGap uses an efficient k-mer-based method to detect insertion sites in a reference genome, and subsequently assemble them from the donor reads. MindTheGap showed high recall and precision on simulated datasets of various genome complexities. 
When applied to real Caenorhabditis elegans and human NA12878 datasets, MindTheGap detected and correctly assembled insertions &gt;1 kb, using at most 14 GB of memory.<\/jats:p>\n               <jats:p>Availability and implementation: \u00a0http:\/\/mindthegap.genouest.org<\/jats:p>\n               <jats:p>Contact: \u00a0guillaume.rizk@inria.fr or claire.lemaitre@inria.fr<\/jats:p>","DOI":"10.1093\/bioinformatics\/btu545","type":"journal-article","created":{"date-parts":[[2014,8,15]],"date-time":"2014-08-15T05:12:51Z","timestamp":1408079571000},"page":"3451-3457","source":"Crossref","is-referenced-by-count":52,"title":["MindTheGap: integrated detection and assembly of short and long insertions"],"prefix":"10.1093","volume":"30","author":[{"given":"Guillaume","family":"Rizk","sequence":"first","affiliation":[{"name":"1 Inria\/IRISA GenScale, Campus de Beaulieu, 35042 Rennes cedex, France, 2 INRA, UMR 1349 Institut de G\u00e9n\u00e9tique, Environnement et Protection des Plantes, Domaine de la Motte - 35653 Le Rheu Cedex, France and 3 Department of Computer Science and Engineering, Pennsylvania State University, PA, USA"}]},{"given":"Ana\u00efs","family":"Gouin","sequence":"additional","affiliation":[{"name":"1 Inria\/IRISA GenScale, Campus de Beaulieu, 35042 Rennes cedex, France, 2 INRA, UMR 1349 Institut de G\u00e9n\u00e9tique, Environnement et Protection des Plantes, Domaine de la Motte - 35653 Le Rheu Cedex, France and 3 Department of Computer Science and Engineering, Pennsylvania State University, PA, USA"}]},{"given":"Rayan","family":"Chikhi","sequence":"additional","affiliation":[{"name":"1 Inria\/IRISA GenScale, Campus de Beaulieu, 35042 Rennes cedex, France, 2 INRA, UMR 1349 Institut de G\u00e9n\u00e9tique, Environnement et Protection des Plantes, Domaine de la Motte - 35653 Le Rheu Cedex, France and 3 Department of Computer Science and Engineering, Pennsylvania State University, PA, 
USA"}]},{"given":"Claire","family":"Lemaitre","sequence":"additional","affiliation":[{"name":"1 Inria\/IRISA GenScale, Campus de Beaulieu, 35042 Rennes cedex, France, 2 INRA, UMR 1349 Institut de G\u00e9n\u00e9tique, Environnement et Protection des Plantes, Domaine de la Motte - 35653 Le Rheu Cedex, France and 3 Department of Computer Science and Engineering, Pennsylvania State University, PA, USA"}]}],"member":"286","published-online":{"date-parts":[[2014,8,14]]},"reference":[{"key":"2023012712023681700_btu545-B1","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/nature09534","article-title":"A map of human genome variation from population-scale sequencing","volume":"467","author":"1000 Genomes Project Consortium, \n              et al.","year":"2010","journal-title":"Nature"},{"key":"2023012712023681700_btu545-B2","doi-asserted-by":"crossref","first-page":"961","DOI":"10.1101\/gr.112326.110","article-title":"Dindel: Accurate indel calls from short-read data","volume":"21","author":"Albers","year":"2011","journal-title":"Genome Res."},{"key":"2023012712023681700_btu545-B3","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1038\/nrg2958","article-title":"Genome structural variation discovery and genotyping","volume":"12","author":"Alkan","year":"2011","journal-title":"Nat. Rev. Genet."},{"key":"2023012712023681700_btu545-B4","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1038\/nmeth.1363","article-title":"Breakdancer: an algorithm for high-resolution mapping of genomic structural variation","volume":"6","author":"Chen","year":"2009","journal-title":"Nat. 
Methods"},{"key":"2023012712023681700_btu545-B5","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1101\/gr.162883.113","article-title":"Tigra: a targeted iterative graph routing assembler for breakpoint assembly","volume":"24","author":"Chen","year":"2014","journal-title":"Genome Res."},{"key":"2023012712023681700_btu545-B6","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1186\/1748-7188-8-22","article-title":"Space-efficient and exact de bruijn graph representation based on a bloom filter","volume":"8","author":"Chikhi","year":"2013","journal-title":"Algorithms Mol. Biol."},{"key":"2023012712023681700_btu545-B7","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1038\/ng.806","article-title":"A framework for variation discovery and genotyping using next-generation dna sequencing data","volume":"43","author":"DePristo","year":"2011","journal-title":"Nat. Genet."},{"key":"2023012712023681700_btu545-B8","doi-asserted-by":"crossref","first-page":"985","DOI":"10.1101\/gr.114777.110","article-title":"Whole-genome resequencing allows detection of many rare line-1 insertion alleles in humans","volume":"21","author":"Ewing","year":"2011","journal-title":"Genome Res."},{"key":"2023012712023681700_btu545-B9","doi-asserted-by":"crossref","first-page":"1277","DOI":"10.1093\/bioinformatics\/btq152","article-title":"Detection and characterization of novel sequence insertions using paired-end next-generation sequencing","volume":"26","author":"Hajirasouliha","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012712023681700_btu545-B10","doi-asserted-by":"crossref","first-page":"i350","DOI":"10.1093\/bioinformatics\/btq216","article-title":"Next-generation variationhunter: combinatorial algorithms for transposon insertion discovery","volume":"26","author":"Hormozdiari","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012712023681700_btu545-B11","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1038\/ng.1028","article-title":"De 
novo assembly and genotyping of variants using colored de bruijn graphs","volume":"44","author":"Iqbal","year":"2012","journal-title":"Nat. Genet."},{"key":"2023012712023681700_btu545-B12","doi-asserted-by":"crossref","first-page":"837","DOI":"10.1016\/j.cell.2010.10.027","article-title":"A human genome structural variation sequencing resource reveals insights into mutational mechanisms","volume":"143","author":"Kidd","year":"2010","journal-title":"Cell"},{"key":"2023012712023681700_btu545-B13","doi-asserted-by":"crossref","first-page":"e128","DOI":"10.1093\/nar\/gkt339","article-title":"Reprever: resolving low-copy duplicated sequences using template driven assembly","volume":"41","author":"Kim","year":"2013","journal-title":"Nucleic Acids Res."},{"key":"2023012712023681700_btu545-B14","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The sequence alignment\/map format and samtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012712023681700_btu545-B15","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1101\/gr.132480.111","article-title":"Soapindel: efficient identification of indels from short paired reads","volume":"23","author":"Li","year":"2013","journal-title":"Genome Res."},{"key":"2023012712023681700_btu545-B16","doi-asserted-by":"crossref","first-page":"S13","DOI":"10.1038\/nmeth.1374","article-title":"Computational methods for discovering structural variation with next-generation sequencing","volume":"6","author":"Medvedev","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012712023681700_btu545-B17","doi-asserted-by":"crossref","first-page":"1033","DOI":"10.1038\/nmeth.3069","article-title":"Accurate de novo and transmitted indel detection in exome-capture data using microassembly","volume":"11","author":"Narzisi","year":"2014","journal-title":"Nat. 
Methods"},{"key":"2023012712023681700_btu545-B18","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1093\/bib\/bbs086","article-title":"A survey of tools for variant analysis of next-generation genome sequencing data","volume":"15","author":"Pabinger","year":"2013","journal-title":"Brief. Bioinform."},{"key":"2023012712023681700_btu545-B19","doi-asserted-by":"crossref","first-page":"S3","DOI":"10.1186\/1471-2105-12-S6-S3","article-title":"Assembly of non-unique insertion content using next-generation sequencing","volume":"12","author":"Parrish","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023012712023681700_btu545-B20","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1007\/978-3-642-40453-5_28","article-title":"Using cascading bloom filters to improve the memory usage for de brujin graphs","volume":"9","author":"Salikhov","year":"2013","journal-title":"Algorithms Bioinformatics"},{"key":"2023012712023681700_btu545-B21","doi-asserted-by":"crossref","first-page":"i222","DOI":"10.1093\/bioinformatics\/btp208","article-title":"A geometric approach for classification and comparison of structural variants","volume":"25","author":"Sindi","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012712023681700_btu545-B22","doi-asserted-by":"crossref","first-page":"e1002236","DOI":"10.1371\/journal.pgen.1002236","article-title":"A comprehensive map of mobile element insertion polymorphisms in humans","volume":"7","author":"Stewart","year":"2011","journal-title":"PLoS Genet."},{"key":"2023012712023681700_btu545-B23","doi-asserted-by":"crossref","first-page":"2865","DOI":"10.1093\/bioinformatics\/btp394","article-title":"Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short 
reads","volume":"25","author":"Ye","year":"2009","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/24\/3451\/48931680\/bioinformatics_30_24_3451.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/30\/24\/3451\/48931680\/bioinformatics_30_24_3451.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T12:59:57Z","timestamp":1674824397000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/30\/24\/3451\/2422179"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2014,8,14]]},"references-count":23,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2014,12,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btu545","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2014,12,15]]},"published":{"date-parts":[[2014,8,14]]}}}