{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:46Z","timestamp":1772138086966,"version":"3.50.1"},"reference-count":24,"publisher":"Oxford University Press (OUP)","issue":"15","license":[{"start":{"date-parts":[[2018,3,28]],"date-time":"2018-03-28T00:00:00Z","timestamp":1522195200000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["1564899"],"award-info":[{"award-number":["1564899"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["16119110"],"award-info":[{"award-number":["16119110"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["1R01EB025022-01"],"award-info":[{"award-number":["1R01EB025022-01"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"GSU Molecular Basis of Disease Fellowship"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,8,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Summary<\/jats:title>\n                    <jats:p>Genomic sequences are assembled into a variable, but large number of contigs that should be scaffolded (ordered and oriented) for facilitating comparative or functional analysis. Finding scaffolding is computationally challenging due to misassemblies, inconsistent coverage across the genome and long repeats. 
An accurate assessment of scaffolding tools should take into account multiple locations of the same contig on the reference scaffolding rather than matching a repeat to a single best location. This makes mapping of inferred scaffoldings onto the reference a computationally challenging problem. This paper formulates the repeat-aware scaffolding evaluation problem, which is to find a mapping of the inferred scaffolding onto the reference that maximizes the number of correct links, and proposes a scalable algorithm capable of handling large whole-genome datasets. Our novel scaffolding validation framework has been applied to assess most state-of-the-art scaffolding tools on a representative subset of the Genome Assembly Gold-Standard Evaluations (GAGE) datasets and on some novel simulated datasets.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The source code of this evaluation framework is available at https:\/\/github.com\/mandricigor\/repeat-aware. 
The documentation is hosted at https:\/\/mandricigor.github.io\/repeat-aware.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/bty131","type":"journal-article","created":{"date-parts":[[2018,3,9]],"date-time":"2018-03-09T15:15:43Z","timestamp":1520608543000},"page":"2530-2537","source":"Crossref","is-referenced-by-count":5,"title":["Repeat-aware evaluation of scaffolding tools"],"prefix":"10.1093","volume":"34","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4362-5169","authenticated-orcid":false,"given":"Igor","family":"Mandric","sequence":"first","affiliation":[{"name":"Department of Computer Science, Georgia State University, Atlanta, GA, USA"}]},{"given":"Sergey","family":"Knyazev","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Georgia State University, Atlanta, GA, USA"}]},{"given":"Alex","family":"Zelikovsky","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Georgia State University, Atlanta, GA, USA"},{"name":"The laboratory of bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia"}]}],"member":"286","published-online":{"date-parts":[[2018,3,14]]},"reference":[{"key":"2023012810012147000_bty131-B1","doi-asserted-by":"crossref","first-page":"272","DOI":"10.1137\/S0097539793250627","article-title":"Genome rearrangements and sorting by reversals","volume":"25","author":"Bafna","year":"1996","journal-title":"SIAM J. Comput"},{"key":"2023012810012147000_bty131-B2","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1089\/cmb.2012.0021","article-title":"Spades: a new genome assembly algorithm and its applications to single-cell sequencing","volume":"19","author":"Bankevich","year":"2012","journal-title":"J. 
Comput. Biol"},{"key":"2023012810012147000_bty131-B3","first-page":"3","volume-title":"1st Conference on Algorithms and Computational Methods for Biochemical and Evolutionary Networks (CompBioNets\u2019 04","author":"Blin","year":"2004"},{"key":"2023012810012147000_bty131-B4","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1093\/bioinformatics\/btq683","article-title":"Scaffolding pre-assembled contigs using sspace","volume":"27","author":"Boetzer","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012810012147000_bty131-B5","doi-asserted-by":"crossref","DOI":"10.1002\/0471250953.bi1003s00","article-title":"Using mummer to identify similar regions in large sequence sets","author":"Delcher","year":"2003","journal-title":"Curr. Protoc. Bioinformatics"},{"key":"2023012810012147000_bty131-B6","doi-asserted-by":"crossref","first-page":"1681","DOI":"10.1089\/cmb.2011.0170","article-title":"Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences","volume":"18","author":"Gao","year":"2011","journal-title":"J. Comput. 
Biol"},{"key":"2023012810012147000_bty131-B7","doi-asserted-by":"crossref","first-page":"102.","DOI":"10.1186\/s13059-016-0951-y","article-title":"Opera-lg: efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees","volume":"17","author":"Gao","year":"2016","journal-title":"Genome Biol"},{"key":"2023012810012147000_bty131-B8","doi-asserted-by":"crossref","first-page":"1072","DOI":"10.1093\/bioinformatics\/btt086","article-title":"Quast: quality assessment tool for genome assemblies","volume":"29","author":"Gurevich","year":"2013","journal-title":"Bioinformatics"},{"key":"2023012810012147000_bty131-B9","doi-asserted-by":"crossref","first-page":"R42.","DOI":"10.1186\/gb-2014-15-3-r42","article-title":"A comprehensive evaluation of assembly scaffolding tools","volume":"15","author":"Hunt","year":"2014","journal-title":"Genome Biol"},{"key":"2023012810012147000_bty131-B10","doi-asserted-by":"crossref","first-page":"R12.","DOI":"10.1186\/gb-2004-5-2-r12","article-title":"Versatile and open software for comparing large genomes","volume":"5","author":"Kurtz","year":"2004","journal-title":"Genome Biol"},{"key":"2023012810012147000_bty131-B11","doi-asserted-by":"crossref","first-page":"R25.","DOI":"10.1186\/gb-2009-10-3-r25","article-title":"Ultrafast and memory-efficient alignment of short dna sequences to the human genome","volume":"10","author":"Langmead","year":"2009","journal-title":"Genome Biol"},{"key":"2023012810012147000_bty131-B12","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1038\/nmeth.1923","article-title":"Fast gapped-read alignment with bowtie 2","volume":"9","author":"Langmead","year":"2012","journal-title":"Nat. 
Methods"},{"key":"2023012810012147000_bty131-B13","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1093\/bioinformatics\/btp698","article-title":"Fast and accurate long-read alignment with burrows\u2013wheeler transform","volume":"26","author":"Li","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012810012147000_bty131-B14","doi-asserted-by":"crossref","DOI":"10.1186\/1471-2105-15-S9-S9","article-title":"ILP-based maximum likelihood genome scaffolding","volume":"15","author":"Lindsay","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023012810012147000_bty131-B15","doi-asserted-by":"crossref","first-page":"18.","DOI":"10.1186\/2047-217X-1-18","article-title":"Soapdenovo2: an empirically improved memory-efficient short-read de novo assembler","volume":"1","author":"Luo","year":"2012","journal-title":"Gigascience"},{"key":"2023012810012147000_bty131-B16","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1093\/bioinformatics\/btw597","article-title":"Boss: a novel scaffolding algorithm based on an optimized scaffold graph","volume":"33","author":"Luo","year":"2016","journal-title":"Bioinformatics"},{"key":"2023012810012147000_bty131-B17","author":"Mandric","year":"2014"},{"key":"2023012810012147000_bty131-B18","doi-asserted-by":"crossref","first-page":"2632","DOI":"10.1093\/bioinformatics\/btv211","article-title":"Scaffmatch: scaffolding algorithm based on maximum weight matching","volume":"31","author":"Mandric","year":"2015","journal-title":"Bioinformatics"},{"key":"2023012810012147000_bty131-B19","doi-asserted-by":"crossref","first-page":"281.","DOI":"10.1186\/1471-2105-15-281","article-title":"Besst-efficient scaffolding of large fragmented assemblies","volume":"15","author":"Sahlin","year":"2014","journal-title":"BMC Bioinformatics"},{"key":"2023012810012147000_bty131-B20","doi-asserted-by":"crossref","first-page":"557","DOI":"10.1101\/gr.131383.111","article-title":"Gage: a critical evaluation of genome assemblies and 
assembly algorithms","volume":"22","author":"Salzberg","year":"2012","journal-title":"Genome Res"},{"key":"2023012810012147000_bty131-B21","doi-asserted-by":"crossref","first-page":"909","DOI":"10.1093\/bioinformatics\/15.11.909","article-title":"Genome rearrangement with gene families","volume":"15","author":"Sankoff","year":"1999","journal-title":"Bioinformatics"},{"key":"2023012810012147000_bty131-B22","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1038\/nrg3117","article-title":"Repetitive dna and next-generation sequencing: computational challenges and solutions","volume":"13","author":"Treangen","year":"2012","journal-title":"Nat. Rev. Genet"},{"key":"2023012810012147000_bty131-B23","doi-asserted-by":"crossref","first-page":"821","DOI":"10.1101\/gr.074492.107","article-title":"Velvet: algorithms for de novo short read assembly using de bruijn graphs","volume":"18","author":"Zerbino","year":"2008","journal-title":"Genome Res"},{"key":"2023012810012147000_bty131-B24","doi-asserted-by":"crossref","first-page":"3655","DOI":"10.1534\/g3.116.034249","article-title":"In silico whole genome sequencer and analyzer (iwgs): a computational pipeline to guide the design and analysis of de novo genome sequencing studies","volume":"6","author":"Zhou","year":"2016","journal-title":"G3 
(Bethesda)"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/15\/2530\/48935440\/bioinformatics_34_15_2530.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/15\/2530\/48935440\/bioinformatics_34_15_2530.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,28]],"date-time":"2023-01-28T05:03:03Z","timestamp":1674882183000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/15\/2530\/4934936"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2018,3,14]]},"references-count":24,"journal-issue":{"issue":"15","published-print":{"date-parts":[[2018,8,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/bty131","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/148932","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,8,1]]},"published":{"date-parts":[[2018,3,14]]}}}