{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T22:57:17Z","timestamp":1780527437641,"version":"3.54.1"},"reference-count":15,"publisher":"Oxford University Press (OUP)","issue":"2","license":[{"start":{"date-parts":[[2024,1,25]],"date-time":"2024-01-25T00:00:00Z","timestamp":1706140800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>In diploid organisms, phasing is the problem of assigning the alleles at heterozygous variants to one of two haplotypes. Reads from PacBio HiFi sequencing provide long, accurate observations that can be used as the basis for both calling and phasing variants. HiFi reads also excel at calling larger classes of variation, such as structural or tandem repeat variants. However, current phasing tools typically only phase small variants, leaving larger variants unphased.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We developed HiPhase, a tool that jointly phases SNVs, indels, structural, and tandem repeat variants. The main benefits of HiPhase are (i) dual mode allele assignment for detecting large variants, (ii) a novel application of the A*-algorithm to phasing, and (iii) logic allowing phase blocks to span breaks caused by alignment issues around reference gaps and homozygous deletions. In our assessment, HiPhase produced an average phase block NG50 of 480\u2009kb with 929 switchflip errors and fully phased 93.8% of genes, improving over the current state of the art. Additionally, HiPhase jointly phases SNVs, indels, structural, and tandem repeat variants and includes innate multi-threading, statistics gathering, and concurrent phased alignment output generation.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>HiPhase is available as source code and a pre-compiled Linux binary with a user guide at https:\/\/github.com\/PacificBiosciences\/HiPhase.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae042","type":"journal-article","created":{"date-parts":[[2024,1,25]],"date-time":"2024-01-25T09:46:57Z","timestamp":1706176017000},"source":"Crossref","is-referenced-by-count":40,"title":["HiPhase: jointly phasing small, structural, and tandem repeat variants from HiFi sequencing"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6411-9236","authenticated-orcid":false,"given":"James M","family":"Holt","sequence":"first","affiliation":[{"name":"Computational Biology, PacBio , 1305 O\u2019Brien Drive , Menlo Park, CA 94025, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0726-7600","authenticated-orcid":false,"given":"Christopher T","family":"Saunders","sequence":"additional","affiliation":[{"name":"Computational Biology, PacBio , 1305 O\u2019Brien Drive , Menlo Park, CA 94025, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7422-1194","authenticated-orcid":false,"given":"William J","family":"Rowell","sequence":"additional","affiliation":[{"name":"Computational Biology, PacBio , 1305 O\u2019Brien Drive , Menlo Park, CA 94025, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7627-9808","authenticated-orcid":false,"given":"Zev","family":"Kronenberg","sequence":"additional","affiliation":[{"name":"Computational Biology, PacBio , 1305 O\u2019Brien Drive , Menlo Park, CA 94025, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1183-0432","authenticated-orcid":false,"given":"Aaron M","family":"Wenger","sequence":"additional","affiliation":[{"name":"Computational Biology, PacBio , 1305 O\u2019Brien Drive , Menlo Park, CA 94025, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8965-1253","authenticated-orcid":false,"given":"Michael","family":"Eberle","sequence":"additional","affiliation":[{"name":"Computational Biology, PacBio , 1305 O\u2019Brien Drive , Menlo Park, CA 94025, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2024,1,25]]},"reference":[{"key":"2024020805380725000_btae042-B1","doi-asserted-by":"crossref","first-page":"703","DOI":"10.1038\/nrg3054","article-title":"Haplotype phasing: existing methods and new developments","volume":"12","author":"Browning","year":"2011","journal-title":"Nat Rev Genet"},{"key":"2024020805380725000_btae042-B2","doi-asserted-by":"crossref","first-page":"177","DOI":"10.2217\/pgs-2020-0155","article-title":"Potential of whole-genome sequencing-based pharmacogenetic profiling","volume":"22","author":"Caspar","year":"2021","journal-title":"Pharmacogenomics"},{"key":"2024020805380725000_btae042-B3","doi-asserted-by":"crossref","first-page":"2037","DOI":"10.1093\/bioinformatics\/btx100","article-title":"BCFtools\/csq: haplotype-aware variant consequences","volume":"33","author":"Danecek","year":"2017","journal-title":"Bioinformatics"},{"key":"2024020805380725000_btae042-B4","first-page":"1","author":"Dolzhenko","year":"2024"},{"key":"2024020805380725000_btae042-B5","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1109\/TSSC.1968.300136","article-title":"A formal basis for the heuristic determination of minimum cost paths","volume":"4","author":"Hart","year":"1968","journal-title":"IEEE Trans Syst Sci Cyber"},{"key":"2024020805380725000_btae042-B6","doi-asserted-by":"crossref","first-page":"1816","DOI":"10.1093\/bioinformatics\/btac058","article-title":"LongPhase: an ultra-fast chromosome-scale phasing algorithm for small and large variants","volume":"38","author":"Lin","year":"2022","journal-title":"Bioinformatics"},{"key":"2024020805380725000_btae042-B7","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1186\/s13059-021-02486-w","article-title":"PRINCESS: comprehensive detection of haplotype resolved SNVs, SVs, and methylation","volume":"22","author":"Mahmoud","year":"2021","journal-title":"Genome Biol"},{"key":"2024020805380725000_btae042-B8","doi-asserted-by":"crossref","first-page":"456","DOI":"10.1093\/bioinformatics\/btaa777","article-title":"Fast gap-affine pairwise alignment using the wavefront algorithm","volume":"37","author":"Marco-Sola","year":"2020","journal-title":"Bioinformatics"},{"key":"2024020805380725000_btae042-B9","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1016\/j.bbmt.2018.12.768","article-title":"Recipients receiving better HLA-matched hematopoietic cell transplantation grafts, uncovered by a novel HLA typing method, have superior survival: a retrospective study","volume":"25","author":"Mayor","year":"2019","journal-title":"Biol Blood Marrow Transplant"},{"key":"2024020805380725000_btae042-B10","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1089\/cmb.2014.0157","article-title":"WhatsHap: weighted haplotype assembly for future-generation sequencing reads","volume":"22","author":"Patterson","year":"2015","journal-title":"J Comput Biol"},{"key":"2024020805380725000_btae042-B11","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1038\/nbt.4235","article-title":"A universal SNP and small-indel variant caller using deep neural networks","volume":"36","author":"Poplin","year":"2018","journal-title":"Nat Biotechnol"},{"key":"2024020805380725000_btae042-B12","doi-asserted-by":"crossref","first-page":"344","DOI":"10.1038\/nrg3903","article-title":"Haplotype-resolved genome sequencing: experimental methods and applications","volume":"16","author":"Snyder","year":"2015","journal-title":"Nat Rev Genet"},{"key":"2024020805380725000_btae042-B13","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1038\/nrg2950","article-title":"The importance of phase information for human genomics","volume":"12","author":"Tewhey","year":"2011","journal-title":"Nat Rev Genet"},{"key":"2024020805380725000_btae042-B14","doi-asserted-by":"crossref","first-page":"100128","DOI":"10.1016\/j.xgen.2022.100128","article-title":"Benchmarking challenging small variants with linked and long reads","volume":"2","author":"Wagner","year":"2022","journal-title":"Cell Genom"},{"key":"2024020805380725000_btae042-B15","doi-asserted-by":"crossref","first-page":"1155","DOI":"10.1038\/s41587-019-0217-9","article-title":"Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome","volume":"37","author":"Wenger","year":"2019","journal-title":"Nat Biotechnol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae042\/56412556\/btae042.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/2\/btae042\/56619414\/btae042.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/2\/btae042\/56619414\/btae042.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,8]],"date-time":"2024-02-08T06:02:54Z","timestamp":1707372174000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae042\/7588891"}},"subtitle":[],"editor":[{"given":"Lenore","family":"Cowen","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"editor"}]}],"short-title":[],"issued":{"date-parts":[[2024,1,25]]},"references-count":15,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae042","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,2,1]]},"published":{"date-parts":[[2024,1,25]]},"article-number":"btae042"}}