{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T05:27:54Z","timestamp":1776058074090,"version":"3.50.1"},"reference-count":11,"publisher":"Oxford University Press (OUP)","issue":"14","license":[{"start":{"date-parts":[[2017,7,12]],"date-time":"2017-07-12T00:00:00Z","timestamp":1499817600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000893","name":"Simons Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000893","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Current statistical models of haplotypes are limited to panels of haplotypes whose genetic variation can be represented by arrays of values at linearly ordered bi- or multiallelic loci. These methods cannot model structural variants or variants that nest or overlap.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>A variation graph is a mathematical structure that can encode arbitrarily complex genetic variation. We present the first haplotype model that operates on a variation graph-embedded population reference cohort. We describe an algorithm to calculate the likelihood that a haplotype arose from this cohort through recombinations and demonstrate time complexity linear in haplotype length and sublinear in population size. We furthermore demonstrate a method of rapidly calculating likelihoods for related haplotypes. We describe mathematical extensions to allow modelling of mutations. This work is an important incremental step for clinical genomics and genetic epidemiology since it is the first haplotype model which can represent all sorts of variation in the population.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and Implementation<\/jats:title>\n                    <jats:p>Available on GitHub at https:\/\/github.com\/yoheirosen\/vg.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx236","type":"journal-article","created":{"date-parts":[[2017,4,20]],"date-time":"2017-04-20T03:52:13Z","timestamp":1492660333000},"page":"i118-i123","source":"Crossref","is-referenced-by-count":16,"title":["Modelling haplotypes with respect to reference cohort variation graphs"],"prefix":"10.1093","volume":"33","author":[{"given":"Yohei","family":"Rosen","sequence":"first","affiliation":[{"name":"Baskin School of Engineering, UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA"}]},{"given":"Jordan","family":"Eizenga","sequence":"additional","affiliation":[{"name":"Baskin School of Engineering, UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA"}]},{"given":"Benedict","family":"Paten","sequence":"additional","affiliation":[{"name":"Baskin School of Engineering, UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,7,12]]},"reference":[{"key":"2023051506483703000_btx236-B1","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1038\/nature15393","article-title":"A global reference for human genetic variation","volume":"526","author":"1000 Genomes Project Consortium","year":"2015","journal-title":"Nature"},{"key":"2023051506483703000_btx236-B2","doi-asserted-by":"crossref","first-page":"2156","DOI":"10.1093\/bioinformatics\/btr330","article-title":"The variant call format and VCFtools","volume":"27","author":"Danecek","year":"2011","journal-title":"Bioinformatics"},{"key":"2023051506483703000_btx236-B3","doi-asserted-by":"crossref","first-page":"682","DOI":"10.1038\/ng.3257","article-title":"Improved genome inference in the MHC using a population reference graph","volume":"47","author":"Dilthey","year":"2015","journal-title":"Nat. Genet"},{"key":"2023051506483703000_btx236-B4","doi-asserted-by":"crossref","first-page":"1266","DOI":"10.1093\/bioinformatics\/btu014","article-title":"Efficient haplotype matching and storage using the positional Burrows\u2013Wheeler transform (PBWT)","volume":"30","author":"Durbin","year":"2014","journal-title":"Bioinformatics"},{"key":"2023051506483703000_btx236-B5","author":"Garrison","year":"2016"},{"key":"2023051506483703000_btx236-B6","doi-asserted-by":"crossref","first-page":"27","DOI":"10.2307\/3213548","article-title":"On the genealogy of large populations","volume":"19(A)","author":"Kingman","year":"1982","journal-title":"J. Appl. Prob"},{"key":"2023051506483703000_btx236-B7","doi-asserted-by":"crossref","first-page":"2213","DOI":"10.1093\/genetics\/165.4.2213","article-title":"Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data","volume":"165","author":"Li","year":"2003","journal-title":"Genetics"},{"key":"2023051506483703000_btx236-B8","author":"Lunter","year":"2016"},{"key":"2023051506483703000_btx236-B9","author":"Novak","year":"2016"},{"key":"2023051506483703000_btx236-B10","author":"Paten","year":"2014"},{"key":"2023051506483703000_btx236-B11","author":"Paten","year":"2017"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/14\/i118\/50315202\/bioinformatics_33_14_i118.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/14\/i118\/50315202\/bioinformatics_33_14_i118.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,15]],"date-time":"2023-05-15T02:48:54Z","timestamp":1684118934000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/14\/i118\/3953952"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,7,12]]},"references-count":11,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2017,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx236","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/101659","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2017,7,15]]},"published":{"date-parts":[[2017,7,12]]}}}