{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:28Z","timestamp":1772138068778,"version":"3.50.1"},"reference-count":34,"publisher":"Oxford University Press (OUP)","issue":"Supplement_2","license":[{"start":{"date-parts":[[2024,9,4]],"date-time":"2024-09-04T00:00:00Z","timestamp":1725408000000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"ECCB2024"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Complex structural variants (SVs) are genomic rearrangements that involve multiple segments of DNA. They contribute to human diversity and have been shown to cause Mendelian disease. Nevertheless, our abilities to analyse complex SVs are very limited. As opposed to deletions and other canonical types of SVs, there are no established tools that have explicitly been designed for analysing complex SVs.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Here, we describe a new computational approach that we specifically designed for genotyping complex SVs in short-read sequenced genomes. Given a variant description, our approach computes genotype-specific probability distributions for observing aligned read pairs with a wide range of properties. Subsequently, these distributions can be used to efficiently determine the most likely genotype for any set of aligned read pairs observed in a sequenced genome. 
In addition, we use these distributions to compute a genotyping difficulty for a given variant, which predicts the amount of data needed to achieve a reliable call. Careful evaluation confirms that our approach outperforms other genotypers by making reliable genotype predictions across both simulated and real data. On up to 7829 human genomes, we achieve high concordance with population-genetic assumptions and expected inheritance patterns. On simulated data, we show that precision correlates well with our prediction of genotyping difficulty. This together with low memory and time requirements makes our approach well-suited for application in biomedical studies involving small to very large numbers of short-read sequenced genomes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Source code is available at https:\/\/github.com\/kehrlab\/Complex-SV-Genotyping.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae391","type":"journal-article","created":{"date-parts":[[2024,6,19]],"date-time":"2024-06-19T18:16:05Z","timestamp":1718820965000},"page":"ii11-ii19","source":"Crossref","is-referenced-by-count":3,"title":["GGTyper: genotyping complex structural variants using short-read sequencing data"],"prefix":"10.1093","volume":"40","author":[{"given":"Tim","family":"Mirus","sequence":"first","affiliation":[{"name":"AG Algorithmic Bioinformatics, Leibniz-Institut f\u00fcr Immuntherapie , Regensburg 93053,","place":["Germany"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6207-4695","authenticated-orcid":false,"given":"Robert","family":"Lohmayer","sequence":"additional","affiliation":[{"name":"AG Algorithmic Bioinformatics, Leibniz-Institut f\u00fcr Immuntherapie , Regensburg 
93053,","place":["Germany"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Clementine","family":"D\u00f6hring","sequence":"additional","affiliation":[{"name":"AG Algorithmic Bioinformatics, Leibniz-Institut f\u00fcr Immuntherapie , Regensburg 93053,","place":["Germany"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bjarni V","family":"Halld\u00f3rsson","sequence":"additional","affiliation":[{"name":"deCODE genetics\/Amgen Inc , Reykjavik 101,","place":["Iceland"]},{"name":"School of Technology, Reykjavik University , Reykjavik 102,","place":["Iceland"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3417-7504","authenticated-orcid":false,"given":"Birte","family":"Kehr","sequence":"additional","affiliation":[{"name":"AG Algorithmic Bioinformatics, Leibniz-Institut f\u00fcr Immuntherapie , Regensburg 93053,","place":["Germany"]},{"name":"Fakult\u00e4t f\u00fcr Informatik und Data Science, Universit\u00e4t Regensburg , Regensburg 93053,","place":["Germany"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2024,9,4]]},"reference":[{"key":"2024093018155464000_btae391-B1","doi-asserted-by":"crossref","first-page":"eabl3533","DOI":"10.1126\/science.abl3533","article-title":"A complete reference genome improves analysis of human genetic variation","volume":"376","author":"Aganezov","year":"2022","journal-title":"Science"},{"key":"2024093018155464000_btae391-B2","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1016\/j.cell.2018.12.019","article-title":"Characterizing the major structural variant alleles of the human genome","volume":"176","author":"Audano","year":"2019","journal-title":"Cell"},{"key":"2024093018155464000_btae391-B3","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1038\/s41588-021-00865-4","article-title":"Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in 
human diseases and other traits","volume":"53","author":"Beyter","year":"2021","journal-title":"Nat Genet"},{"key":"2024093018155464000_btae391-B4","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1186\/s13059-019-1909-7","article-title":"Paragraph: a graph-based structural variant genotyper for short-read sequence data","volume":"20","author":"Chen","year":"2019","journal-title":"Genome Biol"},{"key":"2024093018155464000_btae391-B5","doi-asserted-by":"crossref","first-page":"1220","DOI":"10.1093\/bioinformatics\/btv710","article-title":"Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications","volume":"32","author":"Chen","year":"2015","journal-title":"Bioinformatics"},{"key":"2024093018155464000_btae391-B6","doi-asserted-by":"crossref","first-page":"966","DOI":"10.1038\/nmeth.3505","article-title":"SpeedSeq: ultra-fast personal genome analysis and interpretation","volume":"12","author":"Chiang","year":"2015","journal-title":"Nat Methods"},{"key":"2024093018155464000_btae391-B7","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1186\/s13059-017-1158-6","article-title":"Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome","volume":"18","author":"Collins","year":"2017","journal-title":"Genome Biol"},{"key":"2024093018155464000_btae391-B8","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1038\/s41586-020-2287-8","article-title":"A structural variation reference for medical and population genetics","volume":"581","author":"Collins","year":"2020","journal-title":"Nature"},{"key":"2024093018155464000_btae391-B9","doi-asserted-by":"crossref","first-page":"eabf7117","DOI":"10.1126\/science.abf7117","article-title":"Haplotype-resolved diverse human genomes and integrated analysis of structural 
variation","volume":"372","author":"Ebert","year":"2021","journal-title":"Science"},{"key":"2024093018155464000_btae391-B10","doi-asserted-by":"crossref","first-page":"4015","DOI":"10.1093\/bioinformatics\/btx020","article-title":"Genotyping inversions and tandem duplications","volume":"33","author":"Ebler","year":"2017","journal-title":"Bioinformatics"},{"key":"2024093018155464000_btae391-B11","doi-asserted-by":"crossref","first-page":"518","DOI":"10.1038\/s41588-022-01043-w","article-title":"Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes","volume":"54","author":"Ebler","year":"2022","journal-title":"Nat Genet"},{"key":"2024093018155464000_btae391-B12","doi-asserted-by":"crossref","first-page":"5402","DOI":"10.1038\/s41467-019-13341-9","article-title":"GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs","volume":"10","author":"Eggertsson","year":"2019","journal-title":"Nat Commun"},{"key":"2024093018155464000_btae391-B13","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1038\/nbt.4227","article-title":"Variation graph toolkit improves read mapping by representing genetic variation in the reference","volume":"36","author":"Garrison","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2024093018155464000_btae391-B14","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1038\/ng.3247","article-title":"Large-scale whole-genome sequencing of the Icelandic population","volume":"47","author":"Gudbjartsson","year":"2015","journal-title":"Nat Genet"},{"key":"2024093018155464000_btae391-B15","doi-asserted-by":"crossref","first-page":"732","DOI":"10.1038\/s41586-022-04965-x","article-title":"The sequences of 150,119 genomes in the UK 
Biobank","volume":"607","author":"Halldorsson","year":"2022","journal-title":"Nature"},{"key":"2024093018155464000_btae391-B16","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1186\/s13059-020-1941-7","article-title":"Genotyping structural variants in pangenome graphs using the vg toolkit","volume":"21","author":"Hickey","year":"2020","journal-title":"Genome Biol"},{"key":"2024093018155464000_btae391-B17","doi-asserted-by":"crossref","first-page":"593","DOI":"10.1093\/bioinformatics\/btr708","article-title":"ART: a next-generation sequencing read simulator","volume":"28","author":"Huang","year":"2011","journal-title":"Bioinformatics"},{"key":"2024093018155464000_btae391-B18","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1038\/ng.3801","article-title":"Diversity in non-repetitive human sequences not found in the reference genome","volume":"49","author":"Kehr","year":"2017","journal-title":"Nat Genet"},{"key":"2024093018155464000_btae391-B19","doi-asserted-by":"crossref","first-page":"R84","DOI":"10.1186\/gb-2014-15-6-r84","article-title":"LUMPY: a probabilistic framework for structural variant discovery","volume":"15","author":"Layer","year":"2014","journal-title":"Genome Biol"},{"key":"2024093018155464000_btae391-B20","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.1093\/bioinformatics\/btp324","article-title":"Fast and accurate short read alignment with Burrows\u2013Wheeler transform","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2024093018155464000_btae391-B21","doi-asserted-by":"crossref","first-page":"112","DOI":"10.1038\/s41586-019-1913-9","article-title":"Patterns of somatic structural variation in human cancer genomes","volume":"578","author":"Li","year":"2020","journal-title":"Nature"},{"key":"2024093018155464000_btae391-B22","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1038\/s41586-023-05896-x","article-title":"A draft human pangenome 
reference","volume":"617","author":"Liao","year":"2023","journal-title":"Nature"},{"key":"2024093018155464000_btae391-B23","doi-asserted-by":"crossref","first-page":"1012","DOI":"10.1016\/j.cell.2015.04.004","article-title":"Disruptions of topological chromatin domains cause pathogenic rewiring of gene\u2013enhancer interactions","volume":"161","author":"Lupi\u00e1\u00f1ez","year":"2015","journal-title":"Cell"},{"key":"2024093018155464000_btae391-B24","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1016\/S0168-9525(98)01555-8","article-title":"Genomic disorders: structural features of the genome can lead to DNA rearrangements and human disease traits","volume":"14","author":"Lupski","year":"1998","journal-title":"Trends Genet"},{"key":"2024093018155464000_btae391-B25","doi-asserted-by":"crossref","first-page":"730","DOI":"10.1038\/s41467-020-20850-5","article-title":"PopDel identifies medium-size deletions simultaneously in tens of thousands of genomes","volume":"12","author":"Niehus","year":"2021","journal-title":"Nat Commun"},{"key":"2024093018155464000_btae391-B26","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1038\/nrg2986","article-title":"Genotype and SNP calling from next-generation sequencing data","volume":"12","author":"Nielsen","year":"2011","journal-title":"Nat Rev Genet"},{"key":"2024093018155464000_btae391-B27","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1126\/science.abj6987","article-title":"The complete sequence of a human genome","volume":"376","author":"Nurk","year":"2022","journal-title":"Science"},{"key":"2024093018155464000_btae391-B28","doi-asserted-by":"crossref","first-page":"i333","DOI":"10.1093\/bioinformatics\/bts378","article-title":"DELLY: structural variant discovery by integrated paired-end and split-read 
analysis","volume":"28","author":"Rausch","year":"2012","journal-title":"Bioinformatics"},{"key":"2024093018155464000_btae391-B29","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1186\/s13073-018-0606-6","article-title":"Complex structural variants in Mendelian disorders: identification and breakpoint resolution using short- and long-read genome sequencing","volume":"10","author":"Sanchis-Juan","year":"2018","journal-title":"Genome Med"},{"key":"2024093018155464000_btae391-B30","doi-asserted-by":"crossref","first-page":"1054","DOI":"10.1038\/s41588-018-0145-5","article-title":"Accurate genotyping across variant classes and lengths using variant graphs","volume":"50","author":"Sibbesen","year":"2018","journal-title":"Nat Genet"},{"key":"2024093018155464000_btae391-B31","doi-asserted-by":"crossref","first-page":"abg8871","DOI":"10.1126\/science.abg8871","article-title":"Pangenomics enables genotyping of known structural variants in 5202 diverse genomes","volume":"374","author":"Sir\u00e9n","year":"2021","journal-title":"Science"},{"key":"2024093018155464000_btae391-B32","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1038\/nature15394","article-title":"An integrated map of structural variation in 2,504 human genomes","volume":"526","author":"Sudmant","year":"2015","journal-title":"Nature"},{"key":"2024093018155464000_btae391-B33","doi-asserted-by":"crossref","first-page":"126","DOI":"10.1186\/s13059-016-0993-1","article-title":"Resolving complex structural genomic rearrangements using a randomized approach","volume":"17","author":"Zhao","year":"2016","journal-title":"Genome 
Biol"},{"key":"2024093018155464000_btae391-B34","author":"Zhou","year":"2023"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/Supplement_2\/ii11\/59461435\/btae391.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/Supplement_2\/ii11\/59461435\/btae391.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,30]],"date-time":"2024-09-30T14:16:13Z","timestamp":1727705773000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/40\/Supplement_2\/ii11\/7749065"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,1]]},"references-count":34,"journal-issue":{"issue":"Supplement_2","published-print":{"date-parts":[[2024,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae391","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.03.15.585230","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,9]]},"published":{"date-parts":[[2024,9,1]]}}}