{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:33:56Z","timestamp":1772138036554,"version":"3.50.1"},"reference-count":14,"publisher":"Oxford University Press (OUP)","issue":"12","license":[{"start":{"date-parts":[[2020,4,20]],"date-time":"2020-04-20T00:00:00Z","timestamp":1587340800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Genome BC and Genome Canada","award":["243FOR"],"award-info":[{"award-number":["243FOR"]}]},{"name":"Genome BC and Genome Canada","award":["281ANV"],"award-info":[{"award-number":["281ANV"]}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["2R01HG007182-04A1"],"award-info":[{"award-number":["2R01HG007182-04A1"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Institutes of Health or other funding organizations"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,6,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Summary<\/jats:title>\n                    <jats:p>The ability to generate high-quality genome sequences is cornerstone to modern biological research. Even with recent advancements in sequencing technologies, many genome assemblies are still not achieving reference-grade. Here, we introduce ntJoin, a tool that leverages structural synteny between a draft assembly and reference sequence(s) to contiguate and correct the former with respect to the latter. Instead of alignments, ntJoin uses a lightweight mapping approach based on a graph data structure generated from ordered minimizer sketches. 
The tool can be used in a variety of different applications, including improving a draft assembly with a reference-grade genome, a short-read assembly with a draft long-read assembly and a draft assembly with an assembly from a closely related species. When scaffolding a human short-read assembly using the reference human genome or a long-read assembly, ntJoin improves the NGA50 length 23- and 13-fold, respectively, in under 13\u2009m, using &lt;11 GB of RAM. Compared to existing reference-guided scaffolders, ntJoin generates highly contiguous assemblies faster and using less memory.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>ntJoin is written in C++ and Python and is freely available at https:\/\/github.com\/bcgsc\/ntjoin.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaa253","type":"journal-article","created":{"date-parts":[[2020,4,14]],"date-time":"2020-04-14T15:14:52Z","timestamp":1586877292000},"page":"3885-3887","source":"Crossref","is-referenced-by-count":39,"title":["ntJoin: Fast and lightweight assembly-guided scaffolding using minimizer graphs"],"prefix":"10.1093","volume":"36","author":[{"given":"Lauren","family":"Coombe","sequence":"first","affiliation":[{"name":"Canada\u2019s Michael Smith Genome Sciences Centre , BC Cancer, Vancouver, BC V5Z 4S6, Canada"}]},{"given":"Vladimir","family":"Nikoli\u0107","sequence":"additional","affiliation":[{"name":"Canada\u2019s Michael Smith Genome Sciences Centre , BC Cancer, Vancouver, BC V5Z 4S6, Canada"}]},{"given":"Justin","family":"Chu","sequence":"additional","affiliation":[{"name":"Canada\u2019s Michael Smith 
Genome Sciences Centre , BC Cancer, Vancouver, BC V5Z 4S6, Canada"}]},{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[{"name":"Canada\u2019s Michael Smith Genome Sciences Centre , BC Cancer, Vancouver, BC V5Z 4S6, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9890-2293","authenticated-orcid":false,"given":"Ren\u00e9 L","family":"Warren","sequence":"additional","affiliation":[{"name":"Canada\u2019s Michael Smith Genome Sciences Centre , BC Cancer, Vancouver, BC V5Z 4S6, Canada"}]}],"member":"286","published-online":{"date-parts":[[2020,4,20]]},"reference":[{"key":"2023063011301364400_btaa253-B1","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1186\/s13059-019-1829-6","article-title":"RaGOO: fast and accurate reference-guided scaffolding of draft genomes","volume":"20","author":"Alonge","year":"2019","journal-title":"Genome Biol"},{"key":"2023063011301364400_btaa253-B2","first-page":"730531","author":"Armstrong","year":"2019"},{"key":"2023063011301364400_btaa253-B3","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1186\/s12915-018-0499-2","article-title":"A phylogenomic framework and timescale for comparative studies of tunicates","volume":"16","author":"Delsuc","year":"2018","journal-title":"BMC Biol"},{"key":"2023063011301364400_btaa253-B4","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1101\/gr.214346.116","article-title":"ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter","volume":"27","author":"Jackman","year":"2017","journal-title":"Genome Res"},{"key":"2023063011301364400_btaa253-B5","doi-asserted-by":"crossref","first-page":"1720","DOI":"10.1101\/gr.236273.118","article-title":"Chromosome assembly of large and complex genomes using multiple references","volume":"28","author":"Kolmogorov","year":"2018","journal-title":"Genome 
Res"},{"key":"2023063011301364400_btaa253-B6","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2023063011301364400_btaa253-B7","doi-asserted-by":"crossref","first-page":"i142","DOI":"10.1093\/bioinformatics\/bty266","article-title":"Versatile genome assembly evaluation with QUAST-LG","volume":"34","author":"Mikheenko","year":"2018","journal-title":"Bioinformatics"},{"key":"2023063011301364400_btaa253-B8","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1146\/annurev-animal-020518-115344","article-title":"New approaches for genome assembly and scaffolding","volume":"7","author":"Rice","year":"2019","journal-title":"Annu. Rev. Anim. Biosci"},{"key":"2023063011301364400_btaa253-B9","doi-asserted-by":"crossref","first-page":"3363","DOI":"10.1093\/bioinformatics\/bth408","article-title":"Reducing storage requirements for biological sequence comparison","volume":"20","author":"Roberts","year":"2004","journal-title":"Bioinformatics"},{"key":"2023063011301364400_btaa253-B10","first-page":"715722","author":"Shafin","year":"2019"},{"key":"2023063011301364400_btaa253-B11","doi-asserted-by":"crossref","first-page":"3210","DOI":"10.1093\/bioinformatics\/btv351","article-title":"BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs","volume":"31","author":"Sim\u00e3o","year":"2015","journal-title":"Bioinformatics"},{"key":"2023063011301364400_btaa253-B12","doi-asserted-by":"crossref","first-page":"4430","DOI":"10.1093\/bioinformatics\/btz400","article-title":"ntEdit: scalable genome sequence polishing","volume":"35","author":"Warren","year":"2019","journal-title":"Bioinformatics"},{"key":"2023063011301364400_btaa253-B13","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1038\/s41587-018-0004-z","article-title":"Errors in long-read 
assemblies can critically affect protein prediction","volume":"37","author":"Watson","year":"2019","journal-title":"Nat. Biotechnol"},{"key":"2023063011301364400_btaa253-B14","doi-asserted-by":"crossref","first-page":"757","DOI":"10.1101\/gr.214874.116","article-title":"Direct determination of diploid genome sequences","volume":"27","author":"Weisenfeld","year":"2017","journal-title":"Genome Res"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaa253\/33152980\/btaa253.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/12\/3885\/50747053\/bioinformatics_36_12_3885.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/12\/3885\/50747053\/bioinformatics_36_12_3885.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,30]],"date-time":"2023-06-30T07:31:02Z","timestamp":1688110262000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/12\/3885\/5822878"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,4,20]]},"references-count":14,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2020,6,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaa253","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2020.01.13.905240","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,6,15]]},"published":{"date-pa
rts":[[2020,4,20]]}}}