{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T04:11:11Z","timestamp":1772165471981,"version":"3.50.1"},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T00:00:00Z","timestamp":1741305600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T00:00:00Z","timestamp":1741305600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000024","name":"Canadian Institutes of Health Research","doi-asserted-by":"publisher","award":["PJT-183608"],"award-info":[{"award-number":["PJT-183608"]}],"id":[{"id":"10.13039\/501100000024","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000024","name":"Canadian Institutes of Health Research","doi-asserted-by":"publisher","award":["PJT-183608"],"award-info":[{"award-number":["PJT-183608"]}],"id":[{"id":"10.13039\/501100000024","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000024","name":"Canadian Institutes of Health Research","doi-asserted-by":"publisher","award":["PJT-183608"],"award-info":[{"award-number":["PJT-183608"]}],"id":[{"id":"10.13039\/501100000024","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000024","name":"Canadian Institutes of Health Research","doi-asserted-by":"publisher","award":["PJT-183608"],"award-info":[{"award-number":["PJT-183608"]}],"id":[{"id":"10.13039\/501100000024","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000024","name":"Canadian Institutes of Health 
Research","doi-asserted-by":"publisher","award":["PJT-183608"],"award-info":[{"award-number":["PJT-183608"]}],"id":[{"id":"10.13039\/501100000024","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Advanced long-read sequencing technologies, such as those from Oxford Nanopore Technologies and Pacific Biosciences, are finding a wide use in de novo genome sequencing projects. However, long reads typically have higher error rates relative to short reads. If left unaddressed, subsequent genome assemblies may exhibit high base error rates that compromise the reliability of downstream analysis. Several specialized error correction tools for genome assemblies have since emerged, employing a range of algorithms and strategies to improve base quality. However, despite these efforts, many genome assembly workflows still produce regions with elevated error rates, such as gaps filled with unpolished or ambiguous bases. To address this, we introduce GoldPolish-Target, a modular targeted sequence polishing pipeline. 
Coupled with GoldPolish, a linear-time genome assembly algorithm, GoldPolish-Target isolates and polishes user-specified assembly loci, offering a resource-efficient means for polishing targeted regions of draft genomes.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>\n                      Experiments using\n                      <jats:italic>Drosophila melanogaster<\/jats:italic>\n                      and\n                      <jats:italic>Homo sapiens<\/jats:italic>\n                      datasets demonstrate that GoldPolish-Target can reduce insertion\/deletion (indel) and mismatch errors by up to 49.2% and 55.4% respectively, achieving base accuracy values upwards of 99.9% (Phred score Q\u2009&gt;\u200930). This polishing accuracy is comparable to the current state-of-the-art, Medaka, while exhibiting up to 27-fold shorter run times and consuming 95% less memory, on average.\n                    <\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>GoldPolish-Target, in contrast to most other polishing tools, offers the ability to target specific regions of a genome assembly for polishing, providing a computationally light-weight and highly scalable solution for base error correction.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-025-06091-7","type":"journal-article","created":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T06:17:43Z","timestamp":1741328263000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["GoldPolish-target: targeted long-read genome assembly 
polishing"],"prefix":"10.1186","volume":"26","author":[{"given":"Emily","family":"Zhang","sequence":"first","affiliation":[]},{"given":"Lauren","family":"Coombe","sequence":"additional","affiliation":[]},{"given":"Johnathan","family":"Wong","sequence":"additional","affiliation":[]},{"given":"Ren\u00e9 L.","family":"Warren","sequence":"additional","affiliation":[]},{"given":"Inan\u00e7","family":"Birol","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,3,7]]},"reference":[{"issue":"7","key":"6091_CR1","doi-asserted-by":"publisher","first-page":"e636","DOI":"10.7717\/peerj-cs.636","volume":"9","author":"F Dida","year":"2021","unstructured":"Dida F, Yi G. Empirical evaluation of methods for de novo genome assembly. PeerJ Comput Sci. 2021;9(7):e636.","journal-title":"PeerJ Comput Sci"},{"issue":"1","key":"6091_CR2","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1038\/s41592-022-01730-w","volume":"20","author":"V Marx","year":"2023","unstructured":"Marx V. Method of the year: long-read sequencing. Nat Methods. 2023;20(1):6\u201311.","journal-title":"Nat Methods"},{"issue":"1","key":"6091_CR3","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1186\/s13059-020-1935-5","volume":"21","author":"SL Amarasinghe","year":"2020","unstructured":"Amarasinghe SL, Su S, Dong X, Zappia L, Ritchie ME, Gouil Q. Opportunities and challenges in long-read sequencing data analysis. Genome Biol. 2020;21(1):30.","journal-title":"Genome Biol"},{"issue":"2","key":"6091_CR4","doi-asserted-by":"publisher","first-page":"lqaa037","DOI":"10.1093\/nargab\/lqaa037","volume":"2","author":"JC Dohm","year":"2020","unstructured":"Dohm JC, Peters P, Stralis-Pavese N, Himmelbauer H. Benchmarking of long-read correction methods. NAR Genomics Bioinforma. 
2020;2(2):lqaa037.","journal-title":"NAR Genomics Bioinforma"},{"issue":"1","key":"6091_CR5","doi-asserted-by":"publisher","first-page":"399","DOI":"10.1038\/s41597-020-00743-4","volume":"7","author":"T Hon","year":"2020","unstructured":"Hon T, Mars K, Young G, Tsai YC, Karalius JW, Landolin JM, et al. Highly accurate long-read HiFi sequencing data for five complex genomes. Sci Data. 2020;7(1):399.","journal-title":"Sci Data"},{"issue":"13","key":"6091_CR6","doi-asserted-by":"publisher","first-page":"973367","DOI":"10.3389\/fmicb.2022.973367","volume":"13","author":"J Luo","year":"2022","unstructured":"Luo J, Meng Z, Xu X, Wang L, Zhao K, Zhu X, et al. Systematic benchmarking of nanopore Q20+ kit in SARS-CoV-2 whole genome sequencing. Front Microbiol. 2022;13(13):973367.","journal-title":"Front Microbiol"},{"issue":"1","key":"6091_CR7","doi-asserted-by":"publisher","first-page":"lqab019","DOI":"10.1093\/nargab\/lqab019","volume":"3","author":"N Stoler","year":"2021","unstructured":"Stoler N, Nekrutenko A. Sequencing error profiles of Illumina sequencing instruments. NAR Genomics Bioinforma. 2021;3(1):lqab019.","journal-title":"NAR Genomics Bioinforma"},{"issue":"S6","key":"6091_CR8","doi-asserted-by":"publisher","first-page":"889","DOI":"10.1186\/s12864-020-07227-0","volume":"21","author":"H Zhang","year":"2020","unstructured":"Zhang H, Jain C, Aluru S. A comprehensive evaluation of long read error correction methods. BMC Genomics. 2020;21(S6):889.","journal-title":"BMC Genomics"},{"issue":"5","key":"6091_CR9","doi-asserted-by":"publisher","first-page":"737","DOI":"10.1101\/gr.214270.116","volume":"27","author":"R Vaser","year":"2017","unstructured":"Vaser R, Sovi\u0107 I, Nagarajan N, \u0160iki\u0107 M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 2017;27(5):737\u201346.","journal-title":"Genome Res"},{"key":"6091_CR10","unstructured":"medaka: Sequence correction provided by ONT Research. [Internet]. [cited 2023 Oct 10]. 
Available from: https:\/\/github.com\/nanoporetech\/medaka"},{"issue":"1","key":"6091_CR11","doi-asserted-by":"publisher","first-page":"2906","DOI":"10.1038\/s41467-023-38716-x","volume":"14","author":"J Wong","year":"2023","unstructured":"Wong J, Coombe L, Nikoli\u0107 V, Zhang E, Nip KM, Sidhu P, et al. Linear time complexity de novo long read genome assembly with GoldRush. Nat Commun. 2023;14(1):2906.","journal-title":"Nat Commun"},{"issue":"12","key":"6091_CR12","doi-asserted-by":"publisher","first-page":"3669","DOI":"10.1093\/bioinformatics\/btaa179","volume":"36","author":"C Firtina","year":"2020","unstructured":"Firtina C, Kim JS, Alser M, Senol Cali D, Cicek AE, Alkan C, et al. Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm. Bioinformatics. 2020;36(12):3669\u201379.","journal-title":"Bioinformatics"},{"issue":"1","key":"6091_CR13","doi-asserted-by":"publisher","first-page":"20740","DOI":"10.1038\/s41598-021-00178-w","volume":"11","author":"JY Lee","year":"2021","unstructured":"Lee JY, Kong M, Oh J, Lim J, Chung SH, Kim JM, et al. Comparative evaluation of Nanopore polishing tools for microbial genome assembly and polishing strategies for downstream analysis. Sci Rep. 2021;11(1):20740.","journal-title":"Sci Rep"},{"issue":"5","key":"6091_CR14","doi-asserted-by":"publisher","first-page":"e442","DOI":"10.1002\/cpz1.442","volume":"2","author":"JX Li","year":"2022","unstructured":"Li JX, Coombe L, Wong J, Birol I, Warren RL. ntEdit+Sealer: Efficient Targeted Error Resolution and Automated Finishing of Long-Read Genome Assemblies. Curr Protoc. 2022;2(5):e442.","journal-title":"Curr Protoc"},{"issue":"3","key":"6091_CR15","doi-asserted-by":"publisher","first-page":"866","DOI":"10.1093\/bib\/bbx147","volume":"20","author":"V Jayakumar","year":"2019","unstructured":"Jayakumar V, Sakakibara Y. Comprehensive evaluation of non-hybrid genome assembly tools for third-generation PacBio long-read sequence data. 
Brief Bioinform. 2019;20(3):866\u201376.","journal-title":"Brief Bioinform"},{"issue":"5","key":"6091_CR16","doi-asserted-by":"publisher","first-page":"722","DOI":"10.1101\/gr.215087.116","volume":"27","author":"S Koren","year":"2017","unstructured":"Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k -mer weighting and repeat separation. Genome Res. 2017;27(5):722\u201336.","journal-title":"Genome Res"},{"issue":"2","key":"6091_CR17","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1038\/s41587-018-0004-z","volume":"37","author":"M Watson","year":"2019","unstructured":"Watson M, Warr A. Errors in long-read assemblies can critically affect protein prediction. Nat Biotechnol. 2019;37(2):124\u20136.","journal-title":"Nat Biotechnol"},{"issue":"4","key":"6091_CR18","doi-asserted-by":"publisher","first-page":"e733","DOI":"10.1002\/cpz1.733","volume":"3","author":"L Coombe","year":"2023","unstructured":"Coombe L, Warren RL, Wong J, Nikolic V, Birol I. ntLink: A toolkit for de novo genome assembly scaffolding and mapping using long reads. Curr Protoc. 2023;3(4):e733.","journal-title":"Curr Protoc"},{"key":"6091_CR19","doi-asserted-by":"publisher","first-page":"33","DOI":"10.12688\/f1000research.29032.2","volume":"10","author":"F M\u00f6lder","year":"2021","unstructured":"M\u00f6lder F, Jablonski KP, Letcher B, Hall MB, Tomkins-Tinch CH, Sochat V, et al. Sustainable data analysis with Snakemake. F1000Research. 2021;10:33.","journal-title":"F1000Research"},{"issue":"18","key":"6091_CR20","doi-asserted-by":"publisher","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","volume":"34","author":"H Li","year":"2018","unstructured":"Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34(18):3094\u2013100.","journal-title":"Bioinformatics"},{"key":"6091_CR21","unstructured":"PAF: a Pairwise mApping Format [Internet]. [cited 2024 Jan 4]. 
Available from: https:\/\/github.com\/lh3\/miniasm\/blob\/master\/PAF.md"},{"key":"6091_CR22","unstructured":"BED format [Internet]. [cited 2024 Jan 4]. Available from: https:\/\/genome.cse.ucsc.edu\/FAQ\/FAQformat.html#format1"},{"issue":"29","key":"6091_CR23","doi-asserted-by":"publisher","first-page":"16961","DOI":"10.1073\/pnas.1903436117","volume":"117","author":"J Chu","year":"2020","unstructured":"Chu J, Mohamadi H, Erhan E, Tse J, Chiu R, Yeo S, et al. Mismatch-tolerant, alignment-free sequence classification using multiple spaced seeds and multiindex Bloom filters. Proc Natl Acad Sci. 2020;117(29):16961\u20138.","journal-title":"Proc Natl Acad Sci"},{"issue":"30","key":"6091_CR24","doi-asserted-by":"publisher","first-page":"11920","DOI":"10.1073\/pnas.1201904109","volume":"109","author":"MP Ball","year":"2012","unstructured":"Ball MP, Thakuria JV, Zaranek AW, Clegg T, Rosenbaum AM, Wu X, et al. A public resource facilitating clinical use of genomes. Proc Natl Acad Sci. 2012;109(30):11920\u20137.","journal-title":"Proc Natl Acad Sci"},{"issue":"10","key":"6091_CR25","doi-asserted-by":"publisher","first-page":"1155","DOI":"10.1038\/s41587-019-0217-9","volume":"37","author":"AM Wenger","year":"2019","unstructured":"Wenger AM, Peluso P, Rowell WJ, Chang PC, Hall RJ, Concepcion GT, et al. Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome. Nat Biotechnol. 2019;37(10):1155\u201362.","journal-title":"Nat Biotechnol"},{"issue":"1","key":"6091_CR26","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1186\/s13059-020-02134-9","volume":"21","author":"A Rhie","year":"2020","unstructured":"Rhie A, Walenz BP, Koren S, Phillippy AM. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 
2020;21(1):245.","journal-title":"Genome Biol"},{"issue":"13","key":"6091_CR27","doi-asserted-by":"publisher","first-page":"i142","DOI":"10.1093\/bioinformatics\/bty266","volume":"34","author":"A Mikheenko","year":"2018","unstructured":"Mikheenko A, Prjibelski A, Saveliev V, Antipov D, Gurevich A. Versatile genome assembly evaluation with QUAST-LG. Bioinformatics. 2018;34(13):i142\u201350.","journal-title":"Bioinformatics"},{"issue":"12","key":"6091_CR28","doi-asserted-by":"publisher","first-page":"e323","DOI":"10.1002\/cpz1.323","volume":"1","author":"M Manni","year":"2021","unstructured":"Manni M, Berkeley MR, Seppey M, Zdobnov EM. BUSCO: assessing genomic data quality and beyond. Curr Protoc. 2021;1(12):e323.","journal-title":"Curr Protoc"},{"issue":"19","key":"6091_CR29","doi-asserted-by":"publisher","first-page":"3120","DOI":"10.1093\/bioinformatics\/btab354","volume":"37","author":"N Huang","year":"2021","unstructured":"Huang N, Nie F, Ni P, Luo F, Gao X, Wang J. NeuralPolish: a novel Nanopore polishing method based on alignment matrix construction and orthogonal Bi-GRU Networks. Bioinformatics. 2021;37(19):3120\u20137.","journal-title":"Bioinformatics"},{"issue":"1","key":"6091_CR30","doi-asserted-by":"publisher","first-page":"679","DOI":"10.1186\/s12864-024-10582-x","volume":"25","author":"T Luan","year":"2024","unstructured":"Luan T, Commichaux S, Hoffmann M, Jayeola V, Jang JH, Pop M, et al. Benchmarking short and long read polishing tools for nanopore assemblies: achieving near-perfect genomes for outbreak isolates. BMC Genom. 
2024;25(1):679.","journal-title":"BMC Genom"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-025-06091-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-025-06091-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-025-06091-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T06:17:46Z","timestamp":1741328266000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-025-06091-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,7]]},"references-count":30,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["6091"],"URL":"https:\/\/doi.org\/10.1186\/s12859-025-06091-7","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.09.27.615516","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,7]]},"assertion":[{"value":"17 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 February 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 March 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not 
applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"78"}}