{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,1,19]],"date-time":"2024-01-19T18:58:33Z","timestamp":1705690713268},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2009,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>RSE uses SE (Seed Extension) in its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. It can be used as a standalone procedure following a regular structure-based sequence alignment or to replace the traditional iterative refinement procedures based on residue-level dynamic programming algorithm in many structure alignment programs.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-10-210","type":"journal-article","created":{"date-parts":[[2009,7,9]],"date-time":"2009-07-09T18:13:40Z","timestamp":1247163220000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Iterative refinement of structure-based sequence alignments by Seed Extension"],"prefix":"10.1186","volume":"10","author":[{"given":"Changhoon","family":"Kim","sequence":"first","affiliation":[]},{"given":"Chin-Hsien","family":"Tai","sequence":"additional","affiliation":[]},{"given":"Byungkook","family":"Lee","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2009,7,9]]},"reference":[{"issue":"3","key":"2940_CR1","doi-asserted-by":"publisher","first-page":"691","DOI":"10.1006\/jmbi.2000.3975","volume":"301","author":"AS Yang","year":"2000","unstructured":"Yang AS, Honig B: An integrated approach to the analysis and modeling of protein sequences and structures. III. A comparative study of sequence conservation in protein structural families using multiple structural alignments. J Mol Biol 2000, 301(3):691\u2013711. 10.1006\/jmbi.2000.3975","journal-title":"J Mol Biol"},{"issue":"5","key":"2940_CR2","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1089\/106652701446152","volume":"7","author":"I Eidhammer","year":"2000","unstructured":"Eidhammer I, Jonassen I, Taylor WR: Structure comparison and structure patterns. J Comput Biol 2000, 7(5):685\u2013716. 10.1089\/106652701446152","journal-title":"J Comput Biol"},{"key":"2940_CR3","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1186\/1471-2105-7-499","volume":"7","author":"S Chakrabarti","year":"2006","unstructured":"Chakrabarti S, Lanczycki CJ, Panchenko AR, Przytycka TM, Thiessen PA, Bryant SH: State of the art: refinement of multiple sequence alignments. BMC Bioinformatics 2006, 7: 499. 10.1186\/1471-2105-7-499","journal-title":"BMC Bioinformatics"},{"issue":"22","key":"2940_CR4","doi-asserted-by":"publisher","first-page":"7120","DOI":"10.1093\/nar\/gki1020","volume":"33","author":"T Lassmann","year":"2005","unstructured":"Lassmann T, Sonnhammer EL: Automatic assessment of alignment quality. Nucleic Acids Res 2005, 33(22):7120\u20137128. 10.1093\/nar\/gki1020","journal-title":"Nucleic Acids Res"},{"issue":"5","key":"2940_CR5","doi-asserted-by":"publisher","first-page":"1792","DOI":"10.1093\/nar\/gkh340","volume":"32","author":"RC Edgar","year":"2004","unstructured":"Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32(5):1792\u20131797. 10.1093\/nar\/gkh340","journal-title":"Nucleic Acids Res"},{"key":"2940_CR6","doi-asserted-by":"publisher","first-page":"4","DOI":"10.1186\/1471-2105-6-4","volume":"6","author":"JC Gelly","year":"2005","unstructured":"Gelly JC, Chiche L, Gracy J: EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments. BMC Bioinformatics 2005, 6: 4. 10.1186\/1471-2105-6-4","journal-title":"BMC Bioinformatics"},{"issue":"2","key":"2940_CR7","doi-asserted-by":"publisher","first-page":"910","DOI":"10.1002\/prot.21775","volume":"71","author":"NC Goonesekere","year":"2008","unstructured":"Goonesekere NC, Lee B: Context-specific amino acid substitution matrices and their use in the detection of protein homologs. Proteins 2008, 71(2):910\u2013919. 10.1002\/prot.21775","journal-title":"Proteins"},{"issue":"12","key":"2940_CR8","doi-asserted-by":"publisher","first-page":"1658","DOI":"10.1093\/bioinformatics\/18.12.1658","volume":"18","author":"AS Yang","year":"2002","unstructured":"Yang AS: Structure-dependent sequence alignment for remotely related proteins. Bioinformatics 2002, 18(12):1658\u20131665. 10.1093\/bioinformatics\/18.12.1658","journal-title":"Bioinformatics"},{"issue":"3","key":"2940_CR9","doi-asserted-by":"publisher","first-page":"611","DOI":"10.1002\/prot.21688","volume":"70","author":"RM Bennett-Lovsey","year":"2008","unstructured":"Bennett-Lovsey RM, Herbert AD, Sternberg MJ, Kelley LA: Exploring the extremes of sequence\/structure space with ensemble fold recognition in the program Phyre. Proteins 2008, 70(3):611\u2013625. 10.1002\/prot.21688","journal-title":"Proteins"},{"issue":"7","key":"2940_CR10","doi-asserted-by":"publisher","first-page":"1325","DOI":"10.1002\/pro.5560050711","volume":"5","author":"A Godzik","year":"1996","unstructured":"Godzik A: The structural alignment between two proteins: is there a unique answer? Protein Sci 1996, 5(7):1325\u20131338. 10.1002\/pro.5560050711","journal-title":"Protein Sci"},{"key":"2940_CR11","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1186\/1472-6807-7-50","volume":"7","author":"G Mayr","year":"2007","unstructured":"Mayr G, Domingues FS, Lackner P: Comparative analysis of protein structure alignments. BMC Struct Biol 2007, 7: 50. 10.1186\/1472-6807-7-50","journal-title":"BMC Struct Biol"},{"key":"2940_CR12","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1186\/1471-2105-8-355","volume":"8","author":"C Kim","year":"2007","unstructured":"Kim C, Lee B: Accuracy of structure-based sequence alignment of automatic methods. BMC Bioinformatics 2007, 8: 355. 10.1186\/1471-2105-8-355","journal-title":"BMC Bioinformatics"},{"issue":"6","key":"2940_CR13","doi-asserted-by":"publisher","first-page":"566","DOI":"10.1093\/bioinformatics\/16.6.566","volume":"16","author":"L Holm","year":"2000","unstructured":"Holm L, Park J: DaliLite workbench for protein structure comparison. Bioinformatics 2000, 16(6):566\u2013567. 10.1093\/bioinformatics\/16.6.566","journal-title":"Bioinformatics"},{"issue":"3","key":"2940_CR14","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1002\/prot.340230309","volume":"23","author":"T Madej","year":"1995","unstructured":"Madej T, Gibrat JF, Bryant SH: Threading a database of protein cores. Proteins 1995, 23(3):356\u2013369. 10.1002\/prot.340230309","journal-title":"Proteins"},{"key":"2940_CR15","doi-asserted-by":"crossref","unstructured":"Ye Y, Godzik A: FATCAT: a web server for flexible structure comparison and structure similarity searching. Nucleic Acids Res 2004, (32 Web Server):W582\u2013585. 10.1093\/nar\/gkh430","DOI":"10.1093\/nar\/gkh430"},{"issue":"1","key":"2940_CR16","doi-asserted-by":"publisher","first-page":"e10","DOI":"10.1371\/journal.pcbi.0040010","volume":"4","author":"M Menke","year":"2008","unstructured":"Menke M, Berger B, Cowen L: Matt: local flexibility aids protein multiple structure alignment. PLoS Comput Biol 2008, 4(1):e10. 10.1371\/journal.pcbi.0040010","journal-title":"PLoS Comput Biol"},{"issue":"3","key":"2940_CR17","doi-asserted-by":"publisher","first-page":"618","DOI":"10.1002\/prot.20331","volume":"58","author":"J Zhu","year":"2005","unstructured":"Zhu J, Weng Z: FAST: a novel protein structure alignment algorithm. Proteins 2005, 58(3):618\u2013627. 10.1002\/prot.20331","journal-title":"Proteins"},{"key":"2940_CR18","doi-asserted-by":"crossref","unstructured":"Shapiro J, Brutlag D: FoldMiner and LOCK 2: protein structure comparison and motif discovery on the web. Nucleic Acids Res 2004, (32 Web Server):W536\u2013541. 10.1093\/nar\/gkh389","DOI":"10.1093\/nar\/gkh389"},{"issue":"13","key":"2940_CR19","doi-asserted-by":"publisher","first-page":"3367","DOI":"10.1093\/nar\/gkg581","volume":"31","author":"T Kawabata","year":"2003","unstructured":"Kawabata T: MATRAS: A program for protein 3D structure comparison. Nucleic Acids Res 2003, 31(13):3367\u20133369. 10.1093\/nar\/gkg581","journal-title":"Nucleic Acids Res"},{"issue":"7","key":"2940_CR20","doi-asserted-by":"publisher","first-page":"2302","DOI":"10.1093\/nar\/gki524","volume":"33","author":"Y Zhang","year":"2005","unstructured":"Zhang Y, Skolnick J: TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 2005, 33(7):2302\u20132309. 10.1093\/nar\/gki524","journal-title":"Nucleic Acids Res"},{"issue":"Suppl 1","key":"2940_CR21","doi-asserted-by":"publisher","first-page":"S4","DOI":"10.1186\/1471-2105-10-S1-S4","volume":"10","author":"CH Tai","year":"2009","unstructured":"Tai CH, Vincent JJ, Kim C, Lee B: SE: An algorithm for deriving sequence alignment from a pair of superimposed structures. BMC Bioinformatics 2009, 10(Suppl 1):S4. 10.1186\/1471-2105-10-S1-S4","journal-title":"BMC Bioinformatics"},{"issue":"13","key":"2940_CR22","doi-asserted-by":"publisher","first-page":"1605","DOI":"10.1002\/jcc.20084","volume":"25","author":"EF Pettersen","year":"2004","unstructured":"Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE: UCSF Chimera \u2013 a visualization system for exploratory research and analysis. J Comput Chem 2004, 25(13):1605\u20131612. 10.1002\/jcc.20084","journal-title":"J Comput Chem"},{"issue":"3","key":"2940_CR23","doi-asserted-by":"publisher","first-page":"665","DOI":"10.1006\/jmbi.2000.3973","volume":"301","author":"AS Yang","year":"2000","unstructured":"Yang AS, Honig B: An integrated approach to the analysis and modeling of protein sequences and structures. I. Protein structural alignment and a quantitative measure for protein structural distance. J Mol Biol 2000, 301(3):665\u2013678. 10.1006\/jmbi.2000.3973","journal-title":"J Mol Biol"},{"key":"2940_CR24","first-page":"9","volume":"31","author":"GJ Kleywegt","year":"1994","unstructured":"Kleywegt GJ, Jones TA: A super position. CCP4\/ESF-EACBM Newsletter on Protein Crystallography 1994, 31: 9.","journal-title":"CCP4\/ESF-EACBM Newsletter on Protein Crystallography"},{"issue":"2","key":"2940_CR25","doi-asserted-by":"publisher","first-page":"309","DOI":"10.1002\/prot.340140216","volume":"14","author":"RB Russell","year":"1992","unstructured":"Russell RB, Barton GJ: Multiple protein sequence alignment from tertiary structure comparison: assignment of global and residue confidence levels. Proteins 1992, 14(2):309\u2013323. 10.1002\/prot.340140216","journal-title":"Proteins"},{"key":"2940_CR26","doi-asserted-by":"crossref","unstructured":"Marchler-Bauer A, Anderson JB, Derbyshire MK, DeWeese-Scott C, Gonzales NR, Gwadz M, Hao L, He S, Hurwitz DI, Jackson JD, et al.: CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res 2007, (35 Database):D237\u2013240. 10.1093\/nar\/gkl951","DOI":"10.1093\/nar\/gkl951"},{"key":"2940_CR27","doi-asserted-by":"publisher","first-page":"922","DOI":"10.1107\/S0567739476001873","volume":"32","author":"W Kabsch","year":"1976","unstructured":"Kabsch W: A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A 1976, 32: 922\u2013923. 10.1107\/S0567739476001873","journal-title":"Acta Crystallographica Section A"},{"key":"2940_CR28","doi-asserted-by":"publisher","first-page":"827","DOI":"10.1107\/S0567739478001680","volume":"34","author":"W Kabsch","year":"1978","unstructured":"Kabsch W: A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A 1978, 34: 827\u2013828. 10.1107\/S0567739478001680","journal-title":"Acta Crystallographica Section A"},{"issue":"2","key":"2940_CR29","doi-asserted-by":"publisher","first-page":"445","DOI":"10.1002\/pro.5560070226","volume":"7","author":"M Gerstein","year":"1998","unstructured":"Gerstein M, Levitt M: Comprehensive assessment of automatic structural alignment against a manual standard, the scop classification of proteins. Protein Sci 1998, 7(2):445\u2013456.","journal-title":"Protein Sci"},{"issue":"9","key":"2940_CR30","doi-asserted-by":"publisher","first-page":"739","DOI":"10.1093\/protein\/11.9.739","volume":"11","author":"IN Shindyalov","year":"1998","unstructured":"Shindyalov IN, Bourne PE: Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng 1998, 11(9):739\u2013747. 10.1093\/protein\/11.9.739","journal-title":"Protein Eng"},{"issue":"8","key":"2940_CR31","doi-asserted-by":"publisher","first-page":"535","DOI":"10.1093\/protein\/13.8.535","volume":"13","author":"J Jung","year":"2000","unstructured":"Jung J, Lee B: Protein structure alignment using environmental profiles. Protein Eng 2000, 13(8):535\u2013543. 10.1093\/protein\/13.8.535","journal-title":"Protein Eng"},{"issue":"3","key":"2940_CR32","doi-asserted-by":"publisher","first-page":"635","DOI":"10.1016\/0888-7543(91)90071-L","volume":"11","author":"WR Pearson","year":"1991","unstructured":"Pearson WR: Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 1991, 11(3):635\u2013650. 10.1016\/0888-7543(91)90071-L","journal-title":"Genomics"},{"issue":"4","key":"2940_CR33","doi-asserted-by":"publisher","first-page":"1071","DOI":"10.1110\/ps.03379804","volume":"13","author":"MA Marti-Renom","year":"2004","unstructured":"Marti-Renom MA, Madhusudhan MS, Sali A: Alignment of protein sequences by their profiles. Protein Sci 2004, 13(4):1071\u20131087. 10.1110\/ps.03379804","journal-title":"Protein Sci"},{"issue":"17","key":"2940_CR34","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"SF Altschul","year":"1997","unstructured":"Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389\u20133402. 10.1093\/nar\/25.17.3389","journal-title":"Nucleic Acids Res"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-10-210.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,8,31]],"date-time":"2021-08-31T21:42:09Z","timestamp":1630446129000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-10-210"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,7,9]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2009,12]]}},"alternative-id":["2940"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-10-210","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2009,7,9]]},"assertion":[{"value":"2 February 2009","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 July 2009","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 July 2009","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"210"}}