{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T11:48:56Z","timestamp":1773920936778,"version":"3.50.1"},"reference-count":21,"publisher":"Oxford University Press (OUP)","issue":"14","license":[{"start":{"date-parts":[[2016,10,1]],"date-time":"2016-10-01T00:00:00Z","timestamp":1475280000000},"content-version":"vor","delay-in-days":2316,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2010,7,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Existing sequence assembly editors struggle with the volumes of data now readily available from the latest generation of DNA sequencing instruments.<\/jats:p><jats:p>Results: We describe the Gap5 software along with the data structures and algorithms used that allow it to be scalable. We demonstrate this with an assembly of 1.1 billion sequence fragments and compare the performance with several other programs. We analyse the memory, CPU, I\/O usage and file sizes used by Gap5.<\/jats:p><jats:p>Availability and Implementation: Gap5 is part of the Staden Package and is available under an Open Source licence from http:\/\/staden.sourceforge.net. It is implemented in C and Tcl\/Tk. 
Currently it works on Unix systems only.<\/jats:p><jats:p>Contact: \u00a0jkb@sanger.ac.uk<\/jats:p><jats:p>Supplementary information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btq268","type":"journal-article","created":{"date-parts":[[2010,6,1]],"date-time":"2010-06-01T00:24:30Z","timestamp":1275351870000},"page":"1699-1703","source":"Crossref","is-referenced-by-count":205,"title":["Gap5\u2014editing the billion fragment sequence assembly"],"prefix":"10.1093","volume":"26","author":[{"given":"James K.","family":"Bonfield","sequence":"first","affiliation":[{"name":"Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK"}]},{"given":"Andrew","family":"Whitwham","sequence":"additional","affiliation":[{"name":"Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK"}]}],"member":"286","published-online":{"date-parts":[[2010,5,30]]},"reference":[{"key":"2023012507583596800_B1","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1093\/bioinformatics\/btp611","article-title":"NGSView: an extensible open source editor for next-generation sequencing data","volume":"26","author":"Arner","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012507583596800_B2","doi-asserted-by":"crossref","first-page":"1554","DOI":"10.1093\/bioinformatics\/btp255","article-title":"MapView: visualization of short reads alignment on a desktop computer","volume":"25","author":"Bao","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012507583596800_B3","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1038\/nature07517","article-title":"Accurate whole human genome sequencing using reversible terminator chemistry","volume":"456","author":"Bentley","year":"2008","journal-title":"Nature"},{"key":"2023012507583596800_B4","doi-asserted-by":"crossref","first-page":"4992","DOI":"10.1093\/nar\/23.24.4992","article-title":"A new DNA sequence assembly 
program","volume":"23","author":"Bonfield","year":"1995","journal-title":"Nucleic Acids Res."},{"key":"2023012507583596800_B5","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1126\/science.1180614","article-title":"Genome project standards in a new era of sequencing","volume":"326","author":"Chain","year":"2009","journal-title":"Science"},{"key":"2023012507583596800_B6","doi-asserted-by":"crossref","first-page":"260","DOI":"10.1101\/gr.8.3.260","article-title":"Sequence assembly with CAFTOOLS","volume":"8","author":"Dear","year":"1998","journal-title":"Genome Res."},{"key":"2023012507583596800_B7","author":"Deutsch","year":"1996","journal-title":"Zlib compressed data format specification version 3.3. RFC 1950."},{"key":"2023012507583596800_B8","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1101\/gr.8.3.195","article-title":"Consed: a graphical tool for sequence finishing","volume":"8","author":"Gordon","year":"1998","journal-title":"Genome Res."},{"key":"2023012507583596800_B9","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1145\/602259.602266","article-title":"R-Trees: a dynamic index structure for spatial searching","volume-title":"SIGMOD'84, Proceedings of Annual Meeting","author":"Guttman","year":"1984"},{"key":"2023012507583596800_B10","doi-asserted-by":"crossref","first-page":"1538","DOI":"10.1101\/gr.076067.108","article-title":"EagleView: a genome assembly viewer for next-generation sequencing technologies","volume":"18","author":"Huang","year":"2008","journal-title":"Genome Res."},{"key":"2023012507583596800_B11","doi-asserted-by":"crossref","first-page":"996","DOI":"10.1101\/gr.229102","article-title":"The human genome browser at UCSC","volume":"12","author":"Kent","year":"2002","journal-title":"Genome Res."},{"key":"2023012507583596800_B12","volume-title":"The C Programming 
Language.","author":"Kernighan","year":"1988"},{"key":"2023012507583596800_B13","doi-asserted-by":"crossref","first-page":"1851","DOI":"10.1101\/gr.078212.108","article-title":"Mapping short DNA sequencing reads and calling variants using mapping quality scores","volume":"18","author":"Li","year":"2008","journal-title":"Genome Res."},{"key":"2023012507583596800_B14","doi-asserted-by":"crossref","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","article-title":"The Sequence Alignment\/Map format and SAMtools","volume":"25","author":"Li","year":"2009","journal-title":"Bioinformatics"},{"key":"2023012507583596800_B15","article-title":"Adaptive weighing of context models for lossless data compression","volume-title":"Technical Report CS-2005-16","author":"Mahoney","year":"2005"},{"key":"2023012507583596800_B16","doi-asserted-by":"crossref","first-page":"2125","DOI":"10.1101\/gr.093443.109","article-title":"LookSeq: a browser-based viewer for deep sequencing data","volume":"19","author":"Manske","year":"2009","journal-title":"Genome Res."},{"key":"2023012507583596800_B17","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature03959","article-title":"Genome sequencing in microfabricated high-density picolitre reactors","volume":"437","author":"Margulies","year":"2005","journal-title":"Nature"},{"key":"2023012507583596800_B18","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1093\/bioinformatics\/btp666","article-title":"Tablet - next generation sequence assembly visualization","volume":"26","author":"Milne","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012507583596800_B19","first-page":"133","article-title":"Tcl: An embeddable command language","volume-title":"Proceedings USENIX Winter Conference","author":"Ousterhout","year":"1990"},{"key":"2023012507583596800_B20","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1002\/9783527625130.ch3","article-title":"Applied biosystems SOLiD system: ligation-based 
sequencing","volume-title":"Next-Generation Genome Sequencing.","author":"Pandey","year":"2008"},{"key":"2023012507583596800_B21","doi-asserted-by":"crossref","first-page":"R34","DOI":"10.1186\/gb-2007-8-3-r34","article-title":"Hawkeye: an interactive visual analytics tool for genome assemblies","volume":"8","author":"Schatz","year":"2007","journal-title":"Genome Biol."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/14\/1699\/48853180\/bioinformatics_26_14_1699.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/26\/14\/1699\/48853180\/bioinformatics_26_14_1699.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T11:33:37Z","timestamp":1740137617000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/26\/14\/1699\/178142"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2010,5,30]]},"references-count":21,"journal-issue":{"issue":"14","published-print":{"date-parts":[[2010,7,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btq268","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2010,7,15]]},"published":{"date-parts":[[2010,5,30]]}}}