{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T14:16:51Z","timestamp":1768313811044,"version":"3.49.0"},"reference-count":27,"publisher":"Oxford University Press (OUP)","issue":"8","license":[{"start":{"date-parts":[[2019,12,20]],"date-time":"2019-12-20T00:00:00Z","timestamp":1576800000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01HG010040"],"award-info":[{"award-number":["R01HG010040"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["U01HG010971"],"award-info":[{"award-number":["U01HG010971"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["K99HG010906"],"award-info":[{"award-number":["K99HG010906"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["RM1HG008525"],"award-info":[{"award-number":["RM1HG008525"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,4,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Reconstructing high-quality haplotype-resolved assemblies for 
related individuals has important applications in Mendelian diseases and population genomics. Through major genomics sequencing efforts such as the Personal Genome Project, the Vertebrate Genome Project (VGP) and the Genome in a Bottle project (GIAB), a variety of sequencing datasets from trios of diploid genomes are becoming available. Current trio assembly approaches are not designed to incorporate long- and short-read data from mother\u2013father\u2013child trios, and therefore require relatively high coverages of costly long-read data to produce high-quality assemblies. Thus, building a trio-aware assembler capable of producing accurate and chromosomal-scale diploid genomes of all individuals in a pedigree, while being cost-effective in terms of sequencing costs, is a pressing need of the genomics community.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We present a novel pedigree sequence graph based approach to diploid assembly using accurate Illumina data and long-read Pacific Biosciences (PacBio) data from all related individuals, thereby generalizing our previous work on single individuals. We demonstrate the effectiveness of our pedigree approach on a simulated trio of pseudo-diploid yeast genomes with different heterozygosity rates, and real data from human chromosome. We show that we require as little as 30\u00d7 coverage Illumina data and 15\u00d7 PacBio data from each individual in a trio to generate chromosomal-scale phased assemblies. 
Additionally, we show that we can detect and phase variants from generated phased assemblies.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>https:\/\/github.com\/shilpagarg\/WHdenovo.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz942","type":"journal-article","created":{"date-parts":[[2019,12,18]],"date-time":"2019-12-18T12:10:53Z","timestamp":1576671053000},"page":"2385-2392","source":"Crossref","is-referenced-by-count":26,"title":["A haplotype-aware <i>de novo<\/i> assembly of related individuals using pedigree sequence graph"],"prefix":"10.1093","volume":"36","author":[{"given":"Shilpa","family":"Garg","sequence":"first","affiliation":[{"name":"Department of Genetics, Harvard Medical School"},{"name":"Wyss Institute for Biologically Inspired Engineering, Harvard University"}]},{"given":"John","family":"Aach","sequence":"additional","affiliation":[{"name":"Department of Genetics, Harvard Medical School"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4874-2874","authenticated-orcid":false,"given":"Heng","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, Harvard Medical School, Boston"}]},{"given":"Isaac","family":"Sebenius","sequence":"additional","affiliation":[{"name":"Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9130-1006","authenticated-orcid":false,"given":"Richard","family":"Durbin","sequence":"additional","affiliation":[{"name":"Department of Genetics, University of Cambridge, Cambridge, UK"}]},{"given":"George","family":"Church","sequence":"additional","affiliation":[{"name":"Department of Genetics, Harvard Medical School"},{"name":"Wyss Institute for Biologically Inspired Engineering, Harvard 
University"}]}],"member":"286","published-online":{"date-parts":[[2019,12,20]]},"reference":[{"key":"2023013110271016500_btz942-B1","doi-asserted-by":"crossref","first-page":"1009","DOI":"10.1093\/bioinformatics\/btv688","article-title":"hybridSPAdes: an algorithm for hybrid assembly of short and long reads","volume":"32","author":"Antipov","year":"2016","journal-title":"Bioinformatics"},{"key":"2023013110271016500_btz942-B2","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1089\/cmb.2012.0021","article-title":"Spades: a new genome assembly algorithm and its applications to single-cell sequencing","volume":"19","author":"Bankevich","year":"2012","journal-title":"J. Comput. Biol"},{"key":"2023013110271016500_btz942-B3","doi-asserted-by":"crossref","first-page":"701","DOI":"10.1038\/nbt.2288","article-title":"A hybrid approach for the automated finishing of bacterial genomes","volume":"30","author":"Bashir","year":"2012","journal-title":"Nat. Biotechnol"},{"key":"2023013110271016500_btz942-B4","doi-asserted-by":"crossref","first-page":"623","DOI":"10.1038\/nbt.3238","article-title":"Assembling large genomes with single-molecule sequencing and locality-sensitive hashing","volume":"33","author":"Berlin","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023013110271016500_btz942-B5","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1038\/nmeth.2474","article-title":"Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data","volume":"10","author":"Chin","year":"2013","journal-title":"Nat. Methods"},{"key":"2023013110271016500_btz942-B6","doi-asserted-by":"crossref","first-page":"1050","DOI":"10.1038\/nmeth.4035","article-title":"Phased diploid genome assembly with single molecule real-time sequencing","volume":"13","author":"Chin","year":"2016","journal-title":"Nat. 
Methods"},{"key":"2023013110271016500_btz942-B7","author":"Deshpande","year":"2013"},{"key":"2023013110271016500_btz942-B8","author":"Garg","year":"2018"},{"key":"2023013110271016500_btz942-B9","doi-asserted-by":"crossref","first-page":"i234","DOI":"10.1093\/bioinformatics\/btw276","article-title":"Read-based phasing of related individuals","volume":"32","author":"Garg","year":"2016","journal-title":"Bioinformatics"},{"key":"2023013110271016500_btz942-B10","doi-asserted-by":"crossref","first-page":"i105","DOI":"10.1093\/bioinformatics\/bty279","article-title":"A graph-based approach to diploid genome assembly","volume":"34","author":"Garg","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013110271016500_btz942-B11","doi-asserted-by":"crossref","first-page":"875","DOI":"10.1038\/nbt.4227","article-title":"Variation graph toolkit improves read mapping by representing genetic variation in the reference","volume":"36","author":"Garrison","year":"2018","journal-title":"Nat. Biotechnol."},{"key":"2023013110271016500_btz942-B12","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1038\/s41587-019-0072-8","article-title":"Assembly of long, error-prone reads using repeat graphs","volume":"37","author":"Kolmogorov","year":"2019","journal-title":"Nat. Biotechnol"},{"key":"2023013110271016500_btz942-B13","doi-asserted-by":"crossref","first-page":"722","DOI":"10.1101\/gr.215087.116","article-title":"Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation","volume":"27","author":"Koren","year":"2017","journal-title":"Genome Res"},{"key":"2023013110271016500_btz942-B14","doi-asserted-by":"crossref","first-page":"1174","DOI":"10.1038\/nbt.4277","article-title":"De novo assembly of haplotype-resolved genomes with trio binning","volume":"36","author":"Koren","year":"2018","journal-title":"Nat. 
Biotechnol"},{"key":"2023013110271016500_btz942-B15","doi-asserted-by":"crossref","first-page":"3694","DOI":"10.1093\/bioinformatics\/btv440","article-title":"Fermikit: assembly-based variant calling for illumina resequencing data","volume":"31","author":"Li","year":"2015","journal-title":"Bioinformatics"},{"key":"2023013110271016500_btz942-B16","first-page":"051516","author":"Malinsky","year":"2016"},{"key":"2023013110271016500_btz942-B17","first-page":"649","article-title":"Superbubbles, ultrabubbles, and cacti","volume-title":"J. Comput. Biol.","author":"Paten","year":"2018"},{"key":"2023013110271016500_btz942-B18","author":"Patterson","year":"2014"},{"key":"2023013110271016500_btz942-B19","doi-asserted-by":"crossref","first-page":"3599","DOI":"10.1093\/bioinformatics\/btz162","article-title":"Bit-parallel sequence-to-graph alignment","volume":"35","author":"Rautiainen","year":"2019","journal-title":"Bioinformatics"},{"key":"2023013110271016500_btz942-B20","author":"Ruan","year":"2019"},{"key":"2023013110271016500_btz942-B21","doi-asserted-by":"crossref","first-page":"549","DOI":"10.1101\/gr.126953.111","article-title":"Efficient de novo assembly of large genomes using compressed data structures","volume":"22","author":"Simpson","year":"2012","journal-title":"Genome Res"},{"key":"2023013110271016500_btz942-B22","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1146\/annurev-genom-090314-050032","article-title":"The theory and practice of genome sequence assembly","volume":"16","author":"Simpson","year":"2015","journal-title":"Annu. Rev. Genomics Hum. Genet"},{"key":"2023013110271016500_btz942-B23","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1038\/nrg2950","article-title":"The importance of phase information for human genomics","volume":"12","author":"Tewhey","year":"2011","journal-title":"Nat. Rev. 
Genet"},{"key":"2023013110271016500_btz942-B24","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1101\/gr.214874.116","article-title":"Direct determination of diploid genome sequences","volume":"27","author":"Weisenfeld","year":"2017","journal-title":"Genome Res"},{"key":"2023013110271016500_btz942-B25","first-page":"1155","article-title":"Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome","volume-title":"Nat. Biotechnol.","author":"Wenger","year":"2019"},{"key":"2023013110271016500_btz942-B26","doi-asserted-by":"crossref","first-page":"913","DOI":"10.1038\/ng.3847","article-title":"Contrasting evolutionary genome dynamics between domesticated and wild yeasts","volume":"49","author":"Yue","year":"2017","journal-title":"Nat. Genet"},{"key":"2023013110271016500_btz942-B27","doi-asserted-by":"crossref","first-page":"160025.","DOI":"10.1038\/sdata.2016.25","article-title":"Extensive sequencing of seven human genomes to characterize benchmark reference materials","volume":"3","author":"Zook","year":"2016","journal-title":"Sci. 
Data"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz942\/32445336\/btz942.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/8\/2385\/48983614\/btz942.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/8\/2385\/48983614\/btz942.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,24]],"date-time":"2023-09-24T05:00:48Z","timestamp":1695531648000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/8\/2385\/5682413"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,12,20]]},"references-count":27,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2020,4,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz942","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,4,15]]},"published":{"date-parts":[[2019,12,20]]}}}