{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T09:18:32Z","timestamp":1780046312019,"version":"3.53.1"},"reference-count":26,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Source Code Biol Med"],"published-print":{"date-parts":[[2013,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Genome sequencing has become routine, however genome assembly still remains a challenge despite the computational advances in the last decade. In particular, the abundance of repeat elements in genomes makes it difficult to assemble them into a single complete sequence. Identical repeats shorter than the average read length can generally be assembled without issue. However, longer repeats such as ribosomal RNA operons cannot be accurately assembled using existing tools. The application <jats:italic>Scaffold_builder<\/jats:italic> was designed to generate scaffolds \u2013 super contigs of sequences joined by N-bases \u2013 based on the similarity to a closely related reference sequence. This is independent of mate-pair information and can be used complementarily for genome assembly, e.g. when mate-pairs are not available or have already been exploited. <jats:italic>Scaffold_builder<\/jats:italic> was evaluated using simulated pyrosequencing reads of the bacterial genomes <jats:italic>Escherichia coli<\/jats:italic> 042, <jats:italic>Lactobacillus salivarius<\/jats:italic> UCC118 and <jats:italic>Salmonella enterica<\/jats:italic> subsp. enterica serovar Typhi str. P-stx-12. Moreover, we sequenced two genomes from <jats:italic>Salmonella enterica<\/jats:italic> serovar Typhimurium LT2 G455 and <jats:italic>Salmonella enterica<\/jats:italic> serovar Typhimurium SDT1291 and show that <jats:italic>Scaffold_builder<\/jats:italic> decreases the number of contig sequences by 53% while more than doubling their average length. <jats:italic>Scaffold_builder<\/jats:italic> is written in Python and is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"http:\/\/edwards.sdsu.edu\/scaffold_builder\" ext-link-type=\"uri\">http:\/\/edwards.sdsu.edu\/scaffold_builder<\/jats:ext-link>. A web-based implementation is additionally provided to allow users to submit a reference genome and a set of contigs to be scaffolded.<\/jats:p>","DOI":"10.1186\/1751-0473-8-23","type":"journal-article","created":{"date-parts":[[2013,11,22]],"date-time":"2013-11-22T14:01:36Z","timestamp":1385128896000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":76,"title":["Combining de novo and reference-guided assembly with scaffold_builder"],"prefix":"10.1186","volume":"8","author":[{"given":"Genivaldo GZ","family":"Silva","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bas E","family":"Dutilh","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"T David","family":"Matthews","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Keri","family":"Elkins","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Robert","family":"Schmieder","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Elizabeth A","family":"Dinsdale","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Robert A","family":"Edwards","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2013,11,22]]},"reference":[{"key":"103_CR1","doi-asserted-by":"publisher","first-page":"609","DOI":"10.1093\/bib\/bbp039","volume":"10","author":"M Imelfort","year":"2009","unstructured":"Imelfort M, Edwards D: De novo sequencing of plant genomes using second-generation technologies. Brief Bioinform. 2009, 10: 609-618. 10.1093\/bib\/bbp039.","journal-title":"Brief Bioinform"},{"key":"103_CR2","doi-asserted-by":"publisher","first-page":"578","DOI":"10.1093\/bioinformatics\/btq683","volume":"27","author":"M Boetzer","year":"2011","unstructured":"Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W: Scaffolding pre-assembled contigs using SSPACE. Bioinformatics. 2011, 27: 578-579. 10.1093\/bioinformatics\/btq683.","journal-title":"Bioinformatics"},{"key":"103_CR3","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1016\/S0966-842X(01)02293-4","volume":"10","author":"RA Edwards","year":"2002","unstructured":"Edwards RA, Olsen GJ, Maloy SR: Comparative genomics of closely related salmonellae. Trends Microbiol. 2002, 10: 94-99. 10.1016\/S0966-842X(01)02293-4.","journal-title":"Trends Microbiol"},{"key":"103_CR4","doi-asserted-by":"publisher","first-page":"1147","DOI":"10.1101\/gr.1917404","volume":"14","author":"B Chevreux","year":"2004","unstructured":"Chevreux B, Pfisterer T, Drescher B, Driesel AJ, M\u00fcller WEG, Wetter T, Suhai S: Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome Res. 2004, 14: 1147-1159. 10.1101\/gr.1917404.","journal-title":"Genome Res"},{"key":"103_CR5","doi-asserted-by":"publisher","first-page":"237","DOI":"10.1093\/bib\/5.3.237","volume":"5","author":"M Pop","year":"2004","unstructured":"Pop M, Phillippy A, Delcher AL, Salzberg SL: Comparative genome assembly. Brief Bioinform. 2004, 5: 237-248. 10.1093\/bib\/5.3.237.","journal-title":"Brief Bioinform"},{"key":"103_CR6","doi-asserted-by":"publisher","first-page":"R88","DOI":"10.1186\/gb-2009-10-8-r88","volume":"10","author":"S Gnerre","year":"2009","unstructured":"Gnerre S, Lander ES, Lindblad-Toh K, Jaffe DB: Assisted assembly: how to improve a de novo genome assembly by using related species. Genome Biol. 2009, 10: R88-10.1186\/gb-2009-10-8-r88.","journal-title":"Genome Biol"},{"key":"103_CR7","doi-asserted-by":"publisher","first-page":"e00335\u201310","DOI":"10.1128\/mBio.00335-10","volume":"2","author":"Y Boucher","year":"2011","unstructured":"Boucher Y, Cordero OX, Takemura A, Hunt DE, Schliep K, Bapteste E, Lopez P, Tarr CL, Polz MF: Local mobile gene pools rapidly cross species boundaries to create endemicity within global Vibrio cholerae populations. MBio. 2011, 2: e00335\u201310.","journal-title":"MBio"},{"key":"103_CR8","doi-asserted-by":"publisher","first-page":"e13503","DOI":"10.1371\/journal.pone.0013503","volume":"5","author":"TD Matthews","year":"2010","unstructured":"Matthews TD, Edwards R, Maloy S: Chromosomal rearrangements formed by rrn recombination do not improve replichore balance in host-specific salmonella enterica serovars. PLoS ONE. 2010, 5: e13503-10.1371\/journal.pone.0013503.","journal-title":"PLoS ONE"},{"key":"103_CR9","doi-asserted-by":"publisher","first-page":"6086","DOI":"10.1128\/JB.00649-10","volume":"192","author":"TD Matthews","year":"2010","unstructured":"Matthews TD, Maloy S: Fitness effects of replichore imbalance in salmonella enterica. J Bacteriol. 2010, 192: 6086-6088. 10.1128\/JB.00649-10.","journal-title":"J Bacteriol"},{"key":"103_CR10","doi-asserted-by":"publisher","first-page":"1681","DOI":"10.1089\/cmb.2011.0170","volume":"18","author":"S Gao","year":"2011","unstructured":"Gao S, Sung W-K, Nagarajan N: Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences. J Comput Biol. 2011, 18: 1681-1691. 10.1089\/cmb.2011.0170.","journal-title":"J Comput Biol"},{"key":"103_CR11","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/1751-0473-7-4","volume":"7","author":"MD Barton","year":"2012","unstructured":"Barton MD, Barton HA: Scaffolder - software for manual genome scaffolding. Source Code Biol Med. 2012, 7: 4-10.1186\/1751-0473-7-4.","journal-title":"Source Code Biol Med"},{"key":"103_CR12","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1186\/1751-0473-6-11","volume":"6","author":"M Galardini","year":"2011","unstructured":"Galardini M, Biondi EG, Bazzicalupo M, Mengoni A: CONTIGuator: a bacterial genomes finishing tool for structural insights on draft genomes. Source Code Biol Med. 2011, 6: 11-10.1186\/1751-0473-6-11.","journal-title":"Source Code Biol Med"},{"issue":"Web Server issu","key":"103_CR13","doi-asserted-by":"publisher","first-page":"W560","DOI":"10.1093\/nar\/gki356","volume":"33","author":"SAFT Van Hijum","year":"2005","unstructured":"Van Hijum SAFT, Zomer AL, Kuipers OP, Kok J: Projector 2: contig mapping for efficient gap-closure of prokaryotic genome sequence assemblies. Nucleic Acids Res. 2005, 33 (Web Server issue): W560-W566.","journal-title":"Nucleic Acids Res"},{"key":"103_CR14","doi-asserted-by":"publisher","first-page":"1968","DOI":"10.1093\/bioinformatics\/btp347","volume":"25","author":"S Assefa","year":"2009","unstructured":"Assefa S, Keane TM, Otto TD, Newbold C, Berriman M: ABACAS: algorithm-based automatic contiguation of assembled sequences. Bioinformatics. 2009, 25: 1968-1969. 10.1093\/bioinformatics\/btp347.","journal-title":"Bioinformatics"},{"key":"103_CR15","doi-asserted-by":"publisher","first-page":"46","DOI":"10.14806\/ej.17.1.208","volume":"17","author":"F Vezzi","year":"2011","unstructured":"Vezzi F, Cattonaro F, Policriti A: e-RGA: enhanced reference guided assembly of complex genomes. EMBnet J. 2011, 17: 46-54.","journal-title":"EMBnet J"},{"key":"103_CR16","doi-asserted-by":"crossref","first-page":"376","DOI":"10.1038\/nature03959","volume":"437","author":"M Margulies","year":"2005","unstructured":"Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen Y-J, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Ho CH, Irzyk GP, Jando SC, Alenquer MLI, Jarvie TP, Jirage KB, Kim J-B, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, et al: Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005, 437: 376-380.","journal-title":"Nature"},{"key":"103_CR17","doi-asserted-by":"publisher","first-page":"5691","DOI":"10.1093\/nar\/gki866","volume":"33","author":"R Overbeek","year":"2005","unstructured":"Overbeek R, Begley T, Butler RM, Choudhuri JV, Chuang H-Y, Cohoon M, de Cr\u00e9cy-Lagard V, Diaz N, Disz T, Edwards R, Fonstein M, Frank ED, Gerdes S, Glass EM, Goesmann A, Hanson A, Iwata-Reuyl D, Jensen R, Jamshidi N, Krause L, Kubal M, Larsen N, Linke B, McHardy AC, Meyer F, Neuweger H, Olsen G, Olson R, Osterman A, Portnoy V, et al: The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes. Nucleic Acids Res. 2005, 33: 5691-5702. 10.1093\/nar\/gki866.","journal-title":"Nucleic Acids Res"},{"key":"103_CR18","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1186\/1471-2164-13-74","volume":"13","author":"KE McElroy","year":"2012","unstructured":"McElroy KE, Luciani F, Thomas T: GemSIM: general, error-model based simulator of next-generation sequencing data. BMC Genomics. 2012, 13: 74-10.1186\/1471-2164-13-74.","journal-title":"BMC Genomics"},{"key":"103_CR19","doi-asserted-by":"publisher","first-page":"R12","DOI":"10.1186\/gb-2004-5-2-r12","volume":"5","author":"S Kurtz","year":"2004","unstructured":"Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, Salzberg SL: Versatile and open software for comparing large genomes. Genome Biol. 2004, 5: R12-10.1186\/gb-2004-5-2-r12.","journal-title":"Genome Biol"},{"key":"103_CR20","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","volume":"48","author":"SB Needleman","year":"1970","unstructured":"Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970, 48: 443-453. 10.1016\/0022-2836(70)90057-4.","journal-title":"J Mol Biol"},{"key":"103_CR21","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","volume":"215","author":"SF Altschul","year":"1990","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.","journal-title":"J Mol Biol"},{"key":"103_CR22","doi-asserted-by":"publisher","first-page":"600","DOI":"10.1186\/1471-2164-14-600","volume":"14","author":"RA Edwards","year":"2013","unstructured":"Edwards RA, Haggerty JM, Cassman N, Busch JC, Aguinaldo K, Chinta S, Vaughn MH, Morey R, Harkins TT, Teiling C, Fredrikson K, Dinsdale EA: Microbes, metagenomes and marine mammals: enabling the next generation of scientist to enter the genomic era. BMC Genomics. 2013, 14: 600-10.1186\/1471-2164-14-600.","journal-title":"BMC Genomics"},{"key":"103_CR23","doi-asserted-by":"publisher","first-page":"R227","DOI":"10.1093\/hmg\/ddq416","volume":"19","author":"EE Schadt","year":"2010","unstructured":"Schadt EE, Turner S, Kasarskis A: A window into third-generation sequencing. Hum Mol Genet. 2010, 19: R227-R240. 10.1093\/hmg\/ddq416.","journal-title":"Hum Mol Genet"},{"key":"103_CR24","doi-asserted-by":"publisher","first-page":"e144","DOI":"10.1093\/nar\/gng144","volume":"31","author":"SAFT Van Hijum","year":"2003","unstructured":"Van Hijum SAFT, Zomer AL, Kuipers OP, Kok J: Projector: automatic contig mapping for gap closure purposes. Nucleic Acids Res. 2003, 31: e144-10.1093\/nar\/gng144.","journal-title":"Nucleic Acids Res"},{"key":"103_CR25","doi-asserted-by":"publisher","first-page":"3295","DOI":"10.1128\/AEM.67.7.3295-3298.2001","volume":"67","author":"RA Helm","year":"2001","unstructured":"Helm RA, Maloy S: Rapid approach to determine rrn arrangement in salmonella serovars. Appl Environ Microbiol. 2001, 67: 3295-3298. 10.1128\/AEM.67.7.3295-3298.2001.","journal-title":"Appl Environ Microbiol"},{"key":"103_CR26","doi-asserted-by":"publisher","first-page":"e11147","DOI":"10.1371\/journal.pone.0011147","volume":"5","author":"AE Darling","year":"2010","unstructured":"Darling AE, Mau B, Perna NT: ProgressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS ONE. 2010, 5: e11147-10.1371\/journal.pone.0011147.","journal-title":"PLoS ONE"}],"container-title":["Source Code for Biology and Medicine"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1751-0473-8-23.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,2]],"date-time":"2021-09-02T02:10:49Z","timestamp":1630548649000},"score":1,"resource":{"primary":{"URL":"https:\/\/scfbm.biomedcentral.com\/articles\/10.1186\/1751-0473-8-23"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,11,22]]},"references-count":26,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2013,12]]}},"alternative-id":["103"],"URL":"https:\/\/doi.org\/10.1186\/1751-0473-8-23","relation":{},"ISSN":["1751-0473"],"issn-type":[{"value":"1751-0473","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,11,22]]},"assertion":[{"value":"22 March 2013","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 September 2013","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 November 2013","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"23"}}