{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:54Z","timestamp":1772138094266,"version":"3.50.1"},"reference-count":45,"publisher":"Oxford University Press (OUP)","issue":"20","license":[{"start":{"date-parts":[[2019,4,2]],"date-time":"2019-04-02T00:00:00Z","timestamp":1554163200000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"T\u00dcB\u0130TAK","award":["215E172"],"award-info":[{"award-number":["215E172"]}]},{"DOI":"10.13039\/100004410","name":"EMBO","doi-asserted-by":"publisher","award":["IG-2521"],"award-info":[{"award-number":["IG-2521"]}],"id":[{"id":"10.13039\/100004410","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["1528234"],"award-info":[{"award-number":["1528234"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["GM112625"],"award-info":[{"award-number":["GM112625"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,10,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Several algorithms have been developed that use high-throughput sequencing technology to characterize structural variations (SVs). Most of the existing approaches focus on detecting relatively simple types of SVs such as insertions, deletions and short inversions. In fact, complex SVs are of crucial importance and several have been associated with genomic disorders. 
To better understand the contribution of complex SVs to human disease, we need new algorithms to accurately discover and genotype such variants. Additionally, due to similar sequencing signatures, inverted duplications or gene conversion events that include inverted segmental duplications are often characterized as simple inversions, likewise, duplications and gene conversions in direct orientation may be called as simple deletions. Therefore, there is still a need for accurate algorithms to fully characterize complex SVs and thus improve calling accuracy of more simple variants.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We developed novel algorithms to accurately characterize tandem, direct and inverted interspersed segmental duplications using short read whole genome sequencing datasets. We integrated these methods to our TARDIS tool, which is now capable of detecting various types of SVs using multiple sequence signatures such as read pair, read depth and split read. We evaluated the prediction performance of our algorithms through several experiments using both simulated and real datasets. In the simulation experiments, using a 30\u00d7 coverage TARDIS achieved 96% sensitivity with only 4% false discovery rate. For experiments that involve real data, we used two haploid genomes (CHM1 and CHM13) and one human genome (NA12878) from the Illumina Platinum Genomes set. Comparison of our results with orthogonal PacBio call sets from the same genomes revealed higher accuracy for TARDIS than state-of-the-art methods. 
Furthermore, we showed a surprisingly low false discovery rate of our approach for discovery of tandem, direct and inverted interspersed segmental duplications prediction on CHM1 (&lt;5% for the top 50 predictions).<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>TARDIS source code is available at https:\/\/github.com\/BilkentCompGen\/tardis, and a corresponding Docker image is available at https:\/\/hub.docker.com\/r\/alkanlab\/tardis\/.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz237","type":"journal-article","created":{"date-parts":[[2019,3,29]],"date-time":"2019-03-29T08:10:32Z","timestamp":1553847032000},"page":"3923-3930","source":"Crossref","is-referenced-by-count":37,"title":["Discovery of tandem and interspersed segmental duplications using high-throughput sequencing"],"prefix":"10.1093","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2198-1920","authenticated-orcid":false,"given":"Arda","family":"Soylev","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, Bilkent University , Ankara"},{"name":"Department of Computer Engineering, Konya Food and Agriculture University , Konya, Turkey"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thong Minh","family":"Le","sequence":"additional","affiliation":[{"name":"UC-Davis Genome Center, University of California , Davis, CA, USA"},{"name":"Department of Computer Science, University of California , Davis, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hajar","family":"Amini","sequence":"additional","affiliation":[{"name":"Department of 
Neurology, School of Medicine, University of California , Davis, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Can","family":"Alkan","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, Bilkent University , Ankara"},{"name":"Bilkent-Hacettepe Health Sciences and Technologies Program , Ankara, Turkey"},{"name":"Department of Computer Science, ETH Z\u00fcrich , Zurich, Switzerland"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fereydoun","family":"Hormozdiari","sequence":"additional","affiliation":[{"name":"UC-Davis Genome Center, University of California , Davis, CA, USA"},{"name":"Department of Biochemistry and Molecular Medicine, University of California , Davis, CA, USA"},{"name":"MIND Institute, University of California , Davis, CA, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2019,4,1]]},"reference":[{"key":"2023020108351098700_btz237-B1","doi-asserted-by":"crossref","first-page":"974","DOI":"10.1101\/gr.114876.110","article-title":"CNVnator: an approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing","volume":"21","author":"Abyzov","year":"2011","journal-title":"Genome Res"},{"key":"2023020108351098700_btz237-B2","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1038\/ng.437","article-title":"Personalized copy number and segmental duplication maps using next-generation sequencing","volume":"41","author":"Alkan","year":"2009","journal-title":"Nat. Genet"},{"key":"2023020108351098700_btz237-B3","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1038\/nrg2958","article-title":"Genome structural variation discovery and genotyping","volume":"12","author":"Alkan","year":"2011","journal-title":"Nat. Rev. 
Genet"},{"key":"2023020108351098700_btz237-B4","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1093\/bib\/bbv028","article-title":"Robust and exact structural variation detection with paired-end and soft-clipped alignments: SoftSV compared with eight algorithms","volume":"17","author":"Bartenhagen","year":"2016","journal-title":"Brief. Bioinform"},{"key":"2023020108351098700_btz237-B5","doi-asserted-by":"crossref","first-page":"170","DOI":"10.1016\/j.ajhg.2015.05.012","article-title":"Paired-duplication signatures mark cryptic inversions and other complex structural variation","volume":"97","author":"Brand","year":"2015","journal-title":"Am. J. Hum. Genet"},{"key":"2023020108351098700_btz237-B6","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1038\/nrg3933","article-title":"Genetic variation and the de novo assembly of human genomes","volume":"16","author":"Chaisson","year":"2015","journal-title":"Nat. Rev. Genet"},{"key":"2023020108351098700_btz237-B7","doi-asserted-by":"crossref","first-page":"608","DOI":"10.1038\/nature13907","article-title":"Resolving the complexity of the human genome using single-molecule sequencing","volume":"517","author":"Chaisson","year":"2015","journal-title":"Nature"},{"key":"2023020108351098700_btz237-B8","article-title":"Multi-platform discovery of haplotype-resolved structural variation in human genomes","author":"Chaisson","year":"2018","journal-title":"bioRxiv"},{"key":"2023020108351098700_btz237-B9","doi-asserted-by":"crossref","first-page":"704","DOI":"10.1038\/nature08516","article-title":"Origins and functional impact of copy number variation in the human genome","volume":"464","author":"Conrad","year":"2010","journal-title":"Nature"},{"key":"2023020108351098700_btz237-B10","doi-asserted-by":"crossref","first-page":"1199","DOI":"10.1038\/ng.236","article-title":"Systematic assessment of copy number variant detection via genome-wide SNP 
genotyping","volume":"40","author":"Cooper","year":"2008","journal-title":"Nat. Genet"},{"key":"2023020108351098700_btz237-B11","doi-asserted-by":"crossref","first-page":"664.","DOI":"10.12688\/f1000research.11168.1","article-title":"TIDDIT, an efficient and comprehensive structural variant caller for massive parallel sequencing data","volume":"6","author":"Eisfeldt","year":"2017","journal-title":"F1000Res"},{"key":"2023020108351098700_btz237-B12","doi-asserted-by":"crossref","first-page":"1270","DOI":"10.1101\/gr.088633.108","article-title":"Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes","volume":"19","author":"Hormozdiari","year":"2009","journal-title":"Genome Res"},{"key":"2023020108351098700_btz237-B13","doi-asserted-by":"crossref","first-page":"840","DOI":"10.1101\/gr.115956.110","article-title":"Alu repeat discovery and characterization within human genomes","volume":"21","author":"Hormozdiari","year":"2011","journal-title":"Genome Res"},{"key":"2023020108351098700_btz237-B14","doi-asserted-by":"crossref","first-page":"2203","DOI":"10.1101\/gr.120501.111","article-title":"Simultaneous structural variation discovery among multiple paired-end sequenced genomes","volume":"21","author":"Hormozdiari","year":"2011","journal-title":"Genome Res"},{"key":"2023020108351098700_btz237-B15","doi-asserted-by":"crossref","first-page":"688","DOI":"10.1101\/gr.168450.113","article-title":"Reconstructing complex regions of genomes using long-read sequencing technology","volume":"24","author":"Huddleston","year":"2014","journal-title":"Genome Res"},{"key":"2023020108351098700_btz237-B16","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1101\/gr.214007.116","article-title":"Discovery and genotyping of structural variation from long-read haploid genome sequence data","volume":"27","author":"Huddleston","year":"2016","journal-title":"Genome 
Res"},{"key":"2023020108351098700_btz237-B17","doi-asserted-by":"crossref","first-page":"984","DOI":"10.1093\/bioinformatics\/btv751","article-title":"SV-Bay: structural variant detection in cancer genomes using a Bayesian approach with correction for GC-content and read mappability","volume":"32","author":"Iakovishina","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020108351098700_btz237-B18","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1126\/science.1149504","article-title":"Paired-end mapping reveals extensive structural variation in the human genome","volume":"318","author":"Korbel","year":"2007","journal-title":"Science"},{"key":"2023020108351098700_btz237-B19","doi-asserted-by":"crossref","first-page":"R84.","DOI":"10.1186\/gb-2014-15-6-r84","article-title":"LUMPY: a probabilistic framework for structural variant discovery","volume":"15","author":"Layer","year":"2014","journal-title":"Genome Biol"},{"key":"2023020108351098700_btz237-B20","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1038\/nmeth.f.256","article-title":"MoDIL: detecting small indels from clone-end sequencing with mixtures of distributions","volume":"6","author":"Lee","year":"2009","journal-title":"Nat. Methods"},{"key":"2023020108351098700_btz237-B21","first-page":"3997","article-title":"Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM","volume":"1303","author":"Li","year":"2013","journal-title":"arXiv Preprint arXiv"},{"key":"2023020108351098700_btz237-B22","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1038\/70570","article-title":"A general approach to single-nucleotide polymorphism discovery","volume":"23","author":"Marth","year":"1999","journal-title":"Nat. Genet"},{"key":"2023020108351098700_btz237-B23","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1038\/ng1696","article-title":"Common deletion polymorphisms in the human genome","volume":"38","author":"McCarroll","year":"2006","journal-title":"Nat. 
Genet"},{"key":"2023020108351098700_btz237-B24","first-page":"50","author":"Medvedev","year":"2008"},{"key":"2023020108351098700_btz237-B25","doi-asserted-by":"crossref","first-page":"S13","DOI":"10.1038\/nmeth.1374","article-title":"Computational methods for discovering structural variation with next-generation sequencing","volume":"6","author":"Medvedev","year":"2009","journal-title":"Nat. Methods"},{"key":"2023020108351098700_btz237-B26","doi-asserted-by":"crossref","first-page":"1182","DOI":"10.1101\/gr.4565806","article-title":"An initial map of insertion and deletion (INDEL) variation in the human genome","volume":"16","author":"Mills","year":"2006","journal-title":"Genome Res"},{"key":"2023020108351098700_btz237-B27","doi-asserted-by":"crossref","first-page":"1469","DOI":"10.1093\/bioinformatics\/btu828","article-title":"VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications","volume":"31","author":"Mu","year":"2015","journal-title":"Bioinformatics"},{"key":"2023020108351098700_btz237-B28","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/S0027-5107(02)00076-3","article-title":"Chromosomal aberrations: formation, identification and distribution","volume":"504","author":"Obe","year":"2002","journal-title":"Mutat. 
Res"},{"key":"2023020108351098700_btz237-B29","doi-asserted-by":"crossref","first-page":"i333","DOI":"10.1093\/bioinformatics\/bts378","article-title":"DELLY: structural variant discovery by integrated paired-end and split-read analysis","volume":"28","author":"Rausch","year":"2012","journal-title":"Bioinformatics"},{"key":"2023020108351098700_btz237-B30","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1038\/nature05329","article-title":"Global variation in copy number in the human genome","volume":"444","author":"Redon","year":"2006","journal-title":"Nature"},{"key":"2023020108351098700_btz237-B31","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1038\/243290a0","article-title":"A new consistent chromosomal abnormality in chronic myelogenous leukaemia identified by quinacrine fluorescence and giemsa staining","volume":"243","author":"Rowley","year":"1973","journal-title":"Nature"},{"key":"2023020108351098700_btz237-B32","doi-asserted-by":"crossref","first-page":"525","DOI":"10.1126\/science.1098918","article-title":"Large-scale copy number polymorphism in the human genome","volume":"305","author":"Sebat","year":"2004","journal-title":"Science"},{"key":"2023020108351098700_btz237-B33","doi-asserted-by":"crossref","first-page":"1038","DOI":"10.1038\/ng1862","article-title":"Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome","volume":"38","author":"Sharp","year":"2006","journal-title":"Nat. Genet"},{"key":"2023020108351098700_btz237-B34","doi-asserted-by":"crossref","first-page":"1135","DOI":"10.1038\/nbt1486","article-title":"Next-generation DNA sequencing","volume":"26","author":"Shendure","year":"2008","journal-title":"Nat. 
Biotechnol"},{"key":"2023020108351098700_btz237-B35","doi-asserted-by":"crossref","first-page":"i222","DOI":"10.1093\/bioinformatics\/btp208","article-title":"A geometric approach for classification and comparison of structural variants","volume":"25","author":"Sindi","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020108351098700_btz237-B36","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.ymeth.2017.05.030","article-title":"Toolkit for automated and rapid discovery of structural variants","volume":"129","author":"Soylev","year":"2017","journal-title":"Methods"},{"key":"2023020108351098700_btz237-B37","doi-asserted-by":"crossref","first-page":"2066","DOI":"10.1101\/gr.180893.114","article-title":"Single haplotype assembly of the human genome from a hydatidiform mole","volume":"24","author":"Steinberg","year":"2014","journal-title":"Genome Res"},{"key":"2023020108351098700_btz237-B38","doi-asserted-by":"crossref","first-page":"641","DOI":"10.1126\/science.1197005","article-title":"Diversity of human copy number variation and multicopy genes","volume":"330","author":"Sudmant","year":"2010","journal-title":"Science"},{"key":"2023020108351098700_btz237-B39","doi-asserted-by":"crossref","DOI":"10.1126\/science.aab3761","article-title":"Global diversity, population stratification, and selection of human copy-number variation","volume":"349","author":"Sudmant","year":"2015","journal-title":"Science"},{"key":"2023020108351098700_btz237-B40","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nature15393","article-title":"A global reference for human genetic variation","volume":"526","year":"2015","journal-title":"Nature"},{"key":"2023020108351098700_btz237-B41","doi-asserted-by":"crossref","first-page":"178","DOI":"10.1093\/bib\/bbs017","article-title":"Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration","volume":"14","author":"Thorvaldsd\u00f3ttir","year":"2013","journal-title":"Brief. 
Bioinform"},{"key":"2023020108351098700_btz237-B42","doi-asserted-by":"crossref","first-page":"605","DOI":"10.1007\/s00439-017-1777-8","article-title":"Y chromosome palindromes and gene conversion","volume":"136","author":"Trombetta","year":"2017","journal-title":"Hum. Genet"},{"key":"2023020108351098700_btz237-B43","doi-asserted-by":"crossref","first-page":"727","DOI":"10.1038\/ng1562","article-title":"Fine-scale structural variation of the human genome","volume":"37","author":"Tuzun","year":"2005","journal-title":"Nat. Genet"},{"key":"2023020108351098700_btz237-B44","doi-asserted-by":"crossref","first-page":"2865","DOI":"10.1093\/bioinformatics\/btp394","article-title":"Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads","volume":"25","author":"Ye","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020108351098700_btz237-B45","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1186\/s13059-016-0993-1","article-title":"Resolving complex structural genomic rearrangements using a randomized approach","volume":"17","author":"Zhao","year":"2016","journal-title":"Genome 
Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz237\/28533834\/btz237.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/20\/3923\/48976381\/bioinformatics_35_20_3923.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/20\/3923\/48976381\/bioinformatics_35_20_3923.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T14:43:27Z","timestamp":1675262607000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/20\/3923\/5425335"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2019,4,1]]},"references-count":45,"journal-issue":{"issue":"20","published-print":{"date-parts":[[2019,10,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz237","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/393694","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,10,15]]},"published":{"date-parts":[[2019,4,1]]}}}