{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T04:38:39Z","timestamp":1775882319610,"version":"3.50.1"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"10","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,5,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Motivation: In the past few years, human genome structural variation discovery has enjoyed increased attention from the genomics research community. Many studies were published to characterize short insertions, deletions, duplications and inversions, and associate copy number variants (CNVs) with disease. Detection of new sequence insertions requires sequence data; however, the \u2018detectable\u2019 sequence length with read-pair analysis is limited by the insert size. Thus, longer sequence insertions that contribute to our genetic makeup are not extensively researched.<\/jats:p>\n               <jats:p>Results: We present NovelSeq: a computational framework to discover the content and location of long novel sequence insertions using paired-end sequencing data generated by the next-generation sequencing platforms. Our framework can be built as part of a general sequence analysis pipeline to discover multiple types of genetic variation (SNPs, structural variation, etc.); thus, it requires significantly fewer computational resources than de novo sequence assembly. 
We apply our methods to detect novel sequence insertions in the genome of an anonymous donor and validate our results by comparing with the insertions discovered in the same genome using various sources of sequence data.<\/jats:p>\n               <jats:p>Availability: The implementation of the NovelSeq pipeline is available at http:\/\/compbio.cs.sfu.ca\/strvar.htm<\/jats:p>\n               <jats:p>Contact: \u00a0eee@gs.washington.edu; cenk@cs.sfu.ca<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq152","type":"journal-article","created":{"date-parts":[[2010,4,13]],"date-time":"2010-04-13T02:32:32Z","timestamp":1271125952000},"page":"1277-1283","source":"Crossref","is-referenced-by-count":94,"title":["Detection and characterization of novel sequence insertions using paired-end next-generation sequencing"],"prefix":"10.1093","volume":"26","author":[{"given":"Iman","family":"Hajirasouliha","sequence":"first","affiliation":[{"name":"1 Lab for Computational Biology, Simon Fraser University, Burnaby, BC, Canada, 2 Department of Genome Sciences, University of Washington, 3 Howard Hughes Medical Institute, Seattle, WA, 4 Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA and 5 BC Cancer Agency, Genome Science Center, Vancouver, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fereydoun","family":"Hormozdiari","sequence":"additional","affiliation":[{"name":"1 Lab for Computational Biology, Simon Fraser University, Burnaby, BC, Canada, 2 Department of Genome Sciences, University of Washington, 3 Howard Hughes Medical Institute, Seattle, WA, 4 Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA and 5 BC Cancer Agency, Genome Science Center, Vancouver, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[{"name":"1 Lab for Computational Biology, Simon Fraser University, Burnaby, BC, Canada, 2 Department of 
Genome Sciences, University of Washington, 3 Howard Hughes Medical Institute, Seattle, WA, 4 Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA and 5 BC Cancer Agency, Genome Science Center, Vancouver, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jeffrey M.","family":"Kidd","sequence":"additional","affiliation":[{"name":"1 Lab for Computational Biology, Simon Fraser University, Burnaby, BC, Canada, 2 Department of Genome Sciences, University of Washington, 3 Howard Hughes Medical Institute, Seattle, WA, 4 Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA and 5 BC Cancer Agency, Genome Science Center, Vancouver, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[{"name":"1 Lab for Computational Biology, Simon Fraser University, Burnaby, BC, Canada, 2 Department of Genome Sciences, University of Washington, 3 Howard Hughes Medical Institute, Seattle, WA, 4 Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA and 5 BC Cancer Agency, Genome Science Center, Vancouver, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Evan 
E.","family":"Eichler","sequence":"additional","affiliation":[{"name":"1 Lab for Computational Biology, Simon Fraser University, Burnaby, BC, Canada, 2 Department of Genome Sciences, University of Washington, 3 Howard Hughes Medical Institute, Seattle, WA, 4 Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA and 5 BC Cancer Agency, Genome Science Center, Vancouver, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"S. Cenk","family":"Sahinalp","sequence":"additional","affiliation":[{"name":"1 Lab for Computational Biology, Simon Fraser University, Burnaby, BC, Canada, 2 Department of Genome Sciences, University of Washington, 3 Howard Hughes Medical Institute, Seattle, WA, 4 Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA and 5 BC Cancer Agency, Genome Science Center, Vancouver, BC, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2010,4,12]]},"reference":[{"key":"2023012507511638500_B1","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/ng.437","article-title":"Personalized copy number and segmental duplication maps using next-generation sequencing","volume":"41","author":"Alkan","year":"2009","journal-title":"Nat. Genet."},{"key":"2023012507511638500_B2","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. 
Biol."},{"key":"2023012507511638500_B3","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1038\/nature07517","article-title":"Accurate whole human genome sequencing using reversible terminator chemistry","volume":"456","author":"Bentley","year":"2008","journal-title":"Nature"},{"key":"2023012507511638500_B4","doi-asserted-by":"crossref","first-page":"324","DOI":"10.1101\/gr.7088808","article-title":"Short read fragment assembly of bacterial genomes","volume":"18","author":"Chaisson","year":"2008","journal-title":"Genome Res."},{"key":"2023012507511638500_B5","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1038\/nmeth.1363","article-title":"Breakdancer: an algorithm for high-resolution mapping of genomic structural variation","volume":"6","author":"Chen","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012507511638500_B6","doi-asserted-by":"crossref","first-page":"186","DOI":"10.1101\/gr.8.3.186","article-title":"Base-calling of automated sequencer traces using phred.II. 
error probabilities","volume":"8","author":"Ewing","year":"1998","journal-title":"Genome Res."},{"key":"2023012507511638500_B7","author":"Green","year":"2010","journal-title":"Cross-match."},{"key":"2023012507511638500_B8","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1101\/gr.088633.108","article-title":"Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes","volume":"19","author":"Hormozdiari","year":"2009","journal-title":"Genome Res."},{"key":"2023012507511638500_B9","article-title":"Next Generation VariationHunter: combinatorial algorithms for transposon insertion discovery","author":"Hormozdiari","year":"2010","journal-title":"Proceedings of the 18th Annual Conference on Intelligent Systems for Molecular Biology (ISMB 2010)."},{"key":"2023012507511638500_B10","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1038\/nature06862","article-title":"Mapping and sequencing of structural variation from eight human genomes","volume":"453","author":"Kidd","year":"2008","journal-title":"Nature"},{"key":"2023012507511638500_B11","doi-asserted-by":"crossref","DOI":"10.1038\/nmeth.1451","article-title":"Characterization of missing human genome sequences and copy-number polymorphic insertions","author":"Kidd","year":"2010","journal-title":"Nat. Methods"},{"key":"2023012507511638500_B12","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1126\/science.1149504","article-title":"Paired-end mapping reveals extensive structural variation in the human genome","volume":"318","author":"Korbel","year":"2007","journal-title":"Science"},{"key":"2023012507511638500_B13","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1038\/nmeth.f.256","article-title":"Modil: detecting small indels from clone-end sequencing with mixtures of distributions","volume":"6","author":"Lee","year":"2009","journal-title":"Nat. 
Methods"},{"key":"2023012507511638500_B14","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nbt.1596","article-title":"Building the sequence map of the human pan-genome","volume":"28","author":"Li","year":"2009","journal-title":"Nat. Biotechnol."},{"key":"2023012507511638500_B15","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1101\/gr.097261.109","article-title":"De novo assembly of human genomes with massively parallel short read sequencing","volume":"20","author":"Li","year":"2010","journal-title":"Genome Res."},{"key":"2023012507511638500_B16","doi-asserted-by":"crossref","first-page":"e254","DOI":"10.1371\/journal.pbio.0050254","article-title":"The diploid genome sequence of an individual human","volume":"5","author":"Levy","year":"2007","journal-title":"PLoS Biol."},{"key":"2023012507511638500_B17","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1038\/nmeth.1374","article-title":"Computational methods for discovering structural variation with next-generation sequencing","volume":"6","author":"Medvedev","year":"2009","journal-title":"Nat. Methods"},{"key":"2023012507511638500_B18","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1101\/gr.089532.108","article-title":"ABySS: a parallel assembler for short read sequence data","volume":"19","author":"Simpson","year":"2009","journal-title":"Genome Res."},{"key":"2023012507511638500_B19","doi-asserted-by":"crossref","first-page":"727","DOI":"10.1038\/ng1562","article-title":"Fine-scale structural variation of the human genome","volume":"37","author":"Tuzun","year":"2005","journal-title":"Nat. Genet."},{"key":"2023012507511638500_B20","doi-asserted-by":"crossref","first-page":"7696","DOI":"10.1073\/pnas.1232418100","article-title":"End-sequence profiling: sequence-based analysis of aberrant genomes","volume":"100","author":"Volik","year":"2003","journal-title":"Proc. Natl Acad. Sci. 
USA"},{"key":"2023012507511638500_B21","volume-title":"Introduction to Graph Theory","author":"West","year":"2001","edition":"2"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/10\/1277\/48851048\/bioinformatics_26_10_1277.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/10\/1277\/48851048\/bioinformatics_26_10_1277.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,25]],"date-time":"2023-01-25T07:51:31Z","timestamp":1674633091000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/10\/1277\/194099"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,4,12]]},"references-count":21,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2010,5,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq152","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,5,15]]},"published":{"date-parts":[[2010,4,12]]}}}