{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,11]],"date-time":"2026-04-11T14:26:17Z","timestamp":1775917577162,"version":"3.50.1"},"reference-count":26,"publisher":"Oxford University Press (OUP)","issue":"19","license":[{"start":{"date-parts":[[2021,5,11]],"date-time":"2021-05-11T00:00:00Z","timestamp":1620691200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["U1909208"],"award-info":[{"award-number":["U1909208"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61772557"],"award-info":[{"award-number":["61772557"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100013314","name":"111 Project","doi-asserted-by":"publisher","award":["B18059"],"award-info":[{"award-number":["B18059"]}],"id":[{"id":"10.13039\/501100013314","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Hunan Provincial Science and Technology Program","award":["2018wk4001"],"award-info":[{"award-number":["2018wk4001"]}]},{"name":"U. S. National Institute of Food and Agriculture","award":["2017-70016-26051"],"award-info":[{"award-number":["2017-70016-26051"]}]},{"DOI":"10.13039\/100000001","name":"U.S.National Science Foundation","doi-asserted-by":"crossref","award":["ABI-1759856"],"award-info":[{"award-number":["ABI-1759856"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100004052","name":"King Abdullah University of Science and Technology","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004052","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Office of Sponsored Research","award":["FCC\/1\/1976-26-01"],"award-info":[{"award-number":["FCC\/1\/1976-26-01"]}]},{"name":"Office of Sponsored Research","award":["URF\/1\/3412-01-01"],"award-info":[{"award-number":["URF\/1\/3412-01-01"]}]},{"name":"Office of Sponsored Research","award":["URF\/1\/4098-01-01"],"award-info":[{"award-number":["URF\/1\/4098-01-01"]}]},{"name":"Office of Sponsored Research","award":["REI\/1\/4473-01-01"],"award-info":[{"award-number":["REI\/1\/4473-01-01"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,10,11]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Oxford Nanopore sequencing producing long reads at low cost has made many breakthroughs in genomics studies. However, the large number of errors in Nanopore genome assembly affect the accuracy of genome analysis. Polishing is a procedure to correct the errors in genome assembly and can improve the reliability of the downstream analysis. However, the performances of the existing polishing methods are still not satisfactory.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We developed a novel polishing method, NeuralPolish, to correct the errors in assemblies based on alignment matrix construction and orthogonal Bi-GRU networks. In this method, we designed an alignment feature matrix for representing read-to-assembly alignment. Each row of the matrix represents a read, and each column represents the aligned bases at each position of the contig. In the network architecture, a bi-directional GRU network is used to extract the sequence information inside each read by processing the alignment matrix row by row. After that, the feature matrix is processed by another bi-directional GRU network column by column to calculate the probability distribution. Finally, a CTC decoder generates a polished sequence with a greedy algorithm. We used five real datasets and three assembly tools including Wtdbg2, Flye and Canu for testing, and compared the results of different polishing methods including NeuralPolish, Racon, MarginPolish, HELEN and Medaka. Comprehensive experiments demonstrate that NeuralPolish achieves more accurate assembly with fewer errors than other polishing methods and can improve the accuracy of assembly obtained by different assemblers.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>https:\/\/github.com\/huangnengCSU\/NeuralPolish.git.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab354","type":"journal-article","created":{"date-parts":[[2021,5,6]],"date-time":"2021-05-06T13:39:37Z","timestamp":1620308377000},"page":"3120-3127","source":"Crossref","is-referenced-by-count":22,"title":["NeuralPolish: a novel Nanopore polishing method based on alignment matrix construction and orthogonal Bi-GRU Networks"],"prefix":"10.1093","volume":"37","author":[{"given":"Neng","family":"Huang","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering, Central South University , Changsha 410083, China"},{"name":"Hunan Provincial Key Lab on Bioinformatics, Central South University , Changsha 410083, China"}]},{"given":"Fan","family":"Nie","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Central South University , Changsha 410083, China"},{"name":"Hunan Provincial Key Lab on Bioinformatics, Central South University , Changsha 410083, China"}]},{"given":"Peng","family":"Ni","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Central South University , Changsha 410083, China"},{"name":"Hunan Provincial Key Lab on Bioinformatics, Central South University , Changsha 410083, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4813-2403","authenticated-orcid":false,"given":"Feng","family":"Luo","sequence":"additional","affiliation":[{"name":"School of Computing, Clemson University , Clemson, SC 29634, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7108-3574","authenticated-orcid":false,"given":"Xin","family":"Gao","sequence":"additional","affiliation":[{"name":"Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division , King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1516-0480","authenticated-orcid":false,"given":"Jianxin","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, Central South University , Changsha 410083, China"},{"name":"Hunan Provincial Key Lab on Bioinformatics, Central South University , Changsha 410083, China"}]}],"member":"286","published-online":{"date-parts":[[2021,5,11]]},"reference":[{"key":"2023051608270543200_btab354-B1","doi-asserted-by":"crossref","first-page":"623","DOI":"10.1038\/nbt.3238","article-title":"Assembling large genomes with single-molecule sequencing and locality-sensitive hashing","volume":"33","author":"Berlin","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023051608270543200_btab354-B2","first-page":"1","article-title":"Efficient assembly of nanopore reads via highly accurate and intact error correction","volume":"12","author":"Chen","year":"2021","journal-title":"Nat. Commun"},{"key":"2023051608270543200_btab354-B3","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1038\/nmeth.2474","article-title":"Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data","volume":"10","author":"Chin","year":"2013","journal-title":"Nat. Methods"},{"key":"2023051608270543200_btab354-B4","doi-asserted-by":"crossref","first-page":"1050","DOI":"10.1038\/nmeth.4035","article-title":"Phased diploid genome assembly with single-molecule real-time sequencing","volume":"13","author":"Chin","year":"2016","journal-title":"Nat. Methods"},{"key":"2023051608270543200_btab354-B5","author":"Chung","year":"2014"},{"key":"2023051608270543200_btab354-B6","doi-asserted-by":"crossref","first-page":"3669","DOI":"10.1093\/bioinformatics\/btaa179","article-title":"Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm","volume":"36","author":"Firtina","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051608270543200_btab354-B7","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1038\/nmeth.4577","article-title":"Highly parallel direct rna sequencing on an array of nanopores","volume":"15","author":"Garalde","year":"2018","journal-title":"Nat. Methods"},{"key":"2023051608270543200_btab354-B8","first-page":"369","author":"Graves","year":"2006"},{"key":"2023051608270543200_btab354-B9","doi-asserted-by":"crossref","first-page":"2253","DOI":"10.1093\/bioinformatics\/btz891","article-title":"Nextpolish: a fast and efficient genome polishing tool for long read assembly","volume":"36","author":"Hu","year":"2020","journal-title":"Bioinformatics"},{"key":"2023051608270543200_btab354-B10","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1038\/nbt.4060","article-title":"Nanopore sequencing and assembly of a human genome with ultra-long reads","volume":"36","author":"Jain","year":"2018","journal-title":"Nat. Biotechnol"},{"key":"2023051608270543200_btab354-B11","author":"Kingma","year":"2014"},{"key":"2023051608270543200_btab354-B12","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1038\/s41587-019-0072-8","article-title":"Assembly of long, error-prone reads using repeat graphs","volume":"37","author":"Kolmogorov","year":"2019","journal-title":"Nat. Biotechnol"},{"key":"2023051608270543200_btab354-B13","doi-asserted-by":"crossref","first-page":"722","DOI":"10.1101\/gr.215087.116","article-title":"CANU: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation","volume":"27","author":"Koren","year":"2017","journal-title":"Genome Res"},{"key":"2023051608270543200_btab354-B14","doi-asserted-by":"crossref","first-page":"452","DOI":"10.1093\/bioinformatics\/18.3.452","article-title":"Multiple sequence alignment using partial order graphs","volume":"18","author":"Lee","year":"2002","journal-title":"Bioinformatics"},{"key":"2023051608270543200_btab354-B15","doi-asserted-by":"crossref","first-page":"3094","DOI":"10.1093\/bioinformatics\/bty191","article-title":"Minimap2: pairwise alignment for nucleotide sequences","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2023051608270543200_btab354-B16","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The sequence alignment\/map format and Samtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023051608270543200_btab354-B17","doi-asserted-by":"crossref","first-page":"733","DOI":"10.1038\/nmeth.3444","article-title":"A complete bacterial genome assembled de novo using only nanopore sequencing data","volume":"12","author":"Loman","year":"2015","journal-title":"Nat. Methods"},{"key":"2023051608270543200_btab354-B18","doi-asserted-by":"crossref","first-page":"e1005944","DOI":"10.1371\/journal.pcbi.1005944","article-title":"Mummer4: a fast and versatile genome alignment system","volume":"14","author":"Mar\u00e7ais","year":"2018","journal-title":"PLOS Comput. Biol"},{"key":"2023051608270543200_btab354-B19","doi-asserted-by":"crossref","first-page":"4586","DOI":"10.1093\/bioinformatics\/btz276","article-title":"Deepsignal: detecting DNA methylation state from nanopore sequencing reads using deep-learning","volume":"35","author":"Ni","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051608270543200_btab354-B20","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1038\/s41592-019-0669-3","article-title":"Fast and accurate long-read assembly with wtdbg2","volume":"17","author":"Ruan","year":"2020","journal-title":"Nat. Methods"},{"key":"2023051608270543200_btab354-B21","doi-asserted-by":"crossref","first-page":"1044","DOI":"10.1038\/s41587-020-0503-6","article-title":"Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes","volume":"38","author":"Shafin","year":"2020","journal-title":"Nat. Biotechnol"},{"key":"2023051608270543200_btab354-B22","doi-asserted-by":"crossref","first-page":"737","DOI":"10.1101\/gr.214270.116","article-title":"Fast and accurate de novo genome assembly from long uncorrected reads","volume":"27","author":"Vaser","year":"2017","journal-title":"Genome Res"},{"key":"2023051608270543200_btab354-B23","doi-asserted-by":"crossref","first-page":"e112963","DOI":"10.1371\/journal.pone.0112963","article-title":"Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement","volume":"9","author":"Walker","year":"2014","journal-title":"PloS One"},{"key":"2023051608270543200_btab354-B24","doi-asserted-by":"crossref","first-page":"4430","DOI":"10.1093\/bioinformatics\/btz400","article-title":"ntEdit: scalable genome sequence polishing","volume":"35","author":"Warren","year":"2019","journal-title":"Bioinformatics"},{"key":"2023051608270543200_btab354-B25","doi-asserted-by":"crossref","first-page":"1072","DOI":"10.1038\/nmeth.4432","article-title":"Mecat: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads","volume":"14","author":"Xiao","year":"2017","journal-title":"Nat. Methods"},{"key":"2023051608270543200_btab354-B26","doi-asserted-by":"crossref","first-page":"e1007981","DOI":"10.1371\/journal.pcbi.1007981","article-title":"The genome polishing tool Polca makes fast and accurate corrections in genome assemblies","volume":"16","author":"Zimin","year":"2020","journal-title":"PLoS Comput. Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btab354\/38604430\/btab354.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/19\/3120\/50338135\/btab354.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/19\/3120\/50338135\/btab354.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,16]],"date-time":"2023-05-16T08:41:24Z","timestamp":1684226484000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/19\/3120\/6273578"}},"subtitle":[],"editor":[{"given":"Peter","family":"Robinson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,5,11]]},"references-count":26,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2021,10,11]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab354","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,10,1]]},"published":{"date-parts":[[2021,5,11]]}}}