{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,8,4]],"date-time":"2024-08-04T06:02:13Z","timestamp":1722751333725},"reference-count":30,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2006,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Protein sequence alignment is one of the basic tools in bioinformatics. Correct alignments are required for a range of tasks including the derivation of phylogenetic trees and protein structure prediction. Numerous studies have shown that the incorporation of predicted secondary structure information into alignment algorithms improves their performance. Secondary structure predictors have to be trained on a set of somewhat arbitrarily defined states (e.g. helix, strand, coil), and it has been shown that the choice of these states has some effect on alignment quality. However, it is not unlikely that prediction of other structural features also could provide an improvement. In this study we use an unsupervised clustering method, the self-organizing map, to assign sequence profile windows to \"structural states\" and assess their use in sequence alignment.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>The addition of self-organizing map locations as inputs to a profile-profile scoring function improves the alignment quality of distantly related proteins slightly. The improvement is slightly smaller than that gained from the inclusion of predicted secondary structure. 
However, the information seems to be complementary as the two prediction schemes can be combined to improve the alignment quality by a further small but significant amount.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>It has been observed in many studies that predicted secondary structure significantly improves the alignments. Here we have shown that the addition of self-organizing map locations can further improve the alignments as the self-organizing map locations seem to contain some information that is not captured by the predicted secondary structure.<\/jats:p><\/jats:sec>","DOI":"10.1186\/1471-2105-7-357","type":"journal-article","created":{"date-parts":[[2006,7,25]],"date-time":"2006-07-25T18:14:51Z","timestamp":1153851291000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Improved alignment quality by combining evolutionary information, predicted secondary structure and self-organizing maps"],"prefix":"10.1186","volume":"7","author":[{"given":"Tomas","family":"Ohlson","sequence":"first","affiliation":[]},{"given":"Varun","family":"Aggarwal","sequence":"additional","affiliation":[]},{"given":"Arne","family":"Elofsson","sequence":"additional","affiliation":[]},{"given":"Robert M","family":"MacCallum","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2006,7,25]]},"reference":[{"issue":"3","key":"1096_CR1","doi-asserted-by":"publisher","first-page":"613","DOI":"10.1006\/jmbi.1999.3377","volume":"295","author":"E Lindahl","year":"2000","unstructured":"Lindahl E, Elofsson A: Identification of related proteins on family, superfamily and fold level. J Mol Biol 2000, 295(3):613\u2013625. 
10.1006\/jmbi.1999.3377","journal-title":"J Mol Biol"},{"issue":"2","key":"1096_CR2","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1002\/prot.10565","volume":"54","author":"B Wallner","year":"2004","unstructured":"Wallner B, Fang H, Ohlson T, Frey-Sk\u00f6tt J, Elofsson A: Using evolutionary information for the query and target improves fold recognition. Proteins 2004, 54(2):342\u2013350. 10.1002\/prot.10565","journal-title":"Proteins"},{"issue":"12","key":"1096_CR3","doi-asserted-by":"publisher","first-page":"1531","DOI":"10.1093\/bioinformatics\/btg185","volume":"19","author":"D Mittelman","year":"2003","unstructured":"Mittelman D, Sadreyev R, Grishin N: Probabilistic scoring measures for profile-profile comparison yield more accurate short seed alignments. Bioinformatics 2003, 19(12):1531\u20131539. 10.1093\/bioinformatics\/btg185","journal-title":"Bioinformatics"},{"issue":"6","key":"1096_CR4","doi-asserted-by":"publisher","first-page":"1612","DOI":"10.1110\/ps.03601504","volume":"13","author":"G Wang","year":"2004","unstructured":"Wang G, Dunbrack R Jr: Scoring profile-to-profile sequence alignments. Protein Sci 2004, 13(6):1612\u20131626. 10.1110\/ps.03601504","journal-title":"Protein Sci"},{"issue":"8","key":"1096_CR5","doi-asserted-by":"publisher","first-page":"1301","DOI":"10.1093\/bioinformatics\/bth090","volume":"20","author":"R Edgar","year":"2004","unstructured":"Edgar R, Sjolander K: A comparison of scoring functions for protein sequence profile alignment. Bioinformatics 2004, 20(8):1301\u20131308. 10.1093\/bioinformatics\/bth090","journal-title":"Bioinformatics"},{"issue":"4","key":"1096_CR6","doi-asserted-by":"publisher","first-page":"1071","DOI":"10.1110\/ps.03379804","volume":"13","author":"M Marti-Renom","year":"2004","unstructured":"Marti-Renom M, Madhusudhan M, Sali A: Alignment of protein sequences by their profiles. Protein Sci 2004, 13(4):1071\u20131087. 
10.1110\/ps.03379804","journal-title":"Protein Sci"},{"key":"1096_CR7","doi-asserted-by":"publisher","first-page":"188","DOI":"10.1002\/prot.20184","volume":"57","author":"T Ohlson","year":"2004","unstructured":"Ohlson T, Wallner B, Elofsson A: Profile-profile methods provide improved fold-recognition: A study of different profile-profile alignment methods. Proteins 2004, 57: 188\u2013197. 10.1002\/prot.20184","journal-title":"Proteins"},{"issue":"2","key":"1096_CR8","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1002\/prot.10565","volume":"54","author":"B Wallner","year":"2004","unstructured":"Wallner B, Fang H, Ohlson T, Frey-Sk\u00f6tt J, Elofsson A: Using evolutionary information for the query and target improves fold recognition. Proteins 2004, 54(2):342\u2013350. 10.1002\/prot.10565","journal-title":"Proteins"},{"issue":"13","key":"1096_CR9","doi-asserted-by":"publisher","first-page":"3804","DOI":"10.1093\/nar\/gkg504","volume":"31","author":"K Ginalski","year":"2003","unstructured":"Ginalski K, Pas J, Wyrwicz L, von Grotthuss M, Bujnicki J, Rychlewski L: ORFeus: Detection of distant homology using sequence profiles and predicted secondary structure. Nucleic Acids Res 2003, 31(13):3804\u20133807. 10.1093\/nar\/gkg504","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"1096_CR10","doi-asserted-by":"publisher","first-page":"504","DOI":"10.1002\/prot.10369","volume":"51","author":"R Karchin","year":"2003","unstructured":"Karchin R, Cline M, Mandel-Gutfreund Y, Karplus K: Hidden Markov models that use predicted local structure for fold recognition: alphabets of backbone geometry. Proteins 2003, 51(4):504\u2013514. 10.1002\/prot.10369","journal-title":"Proteins"},{"key":"1096_CR11","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1186\/1471-2105-5-183","volume":"5","author":"R Chung","year":"2004","unstructured":"Chung R, Yona G: Protein family comparison using statistical models and predicted structural information. 
BMC Bioinformatics 2004, 5: 183. 10.1186\/1471-2105-5-183","journal-title":"BMC Bioinformatics"},{"issue":"5","key":"1096_CR12","doi-asserted-by":"publisher","first-page":"1043","DOI":"10.1016\/j.jmb.2003.10.025","volume":"334","author":"C Tang","year":"2003","unstructured":"Tang C, Xie L, Koh I, Posy S, Alexov E, Honig B: On the role of structural information in remote homology detection and sequence alignment: new methods using hybrid sequence profiles. J Mol Biol 2003, 334(5):1043\u20131062. 10.1016\/j.jmb.2003.10.025","journal-title":"J Mol Biol"},{"issue":"3","key":"1096_CR13","doi-asserted-by":"publisher","first-page":"508","DOI":"10.1002\/prot.20008","volume":"55","author":"R Karchin","year":"2004","unstructured":"Karchin R, Cline M, Karplus K: Evaluation of local structure alphabets based on residue burial. Proteins 2004, 55(3):508\u2013518. 10.1002\/prot.20008","journal-title":"Proteins"},{"issue":"2","key":"1096_CR14","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1006\/jmbi.2000.3741","volume":"299","author":"L Kelley","year":"2000","unstructured":"Kelley L, MacCallum R, Sternberg M: Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 2000, 299(2):499\u2013520. 10.1006\/jmbi.2000.3741","journal-title":"J Mol Biol"},{"issue":"2","key":"1096_CR15","doi-asserted-by":"publisher","first-page":"232","DOI":"10.1110\/ps.9.2.232","volume":"9","author":"L Rychlewski","year":"2000","unstructured":"Rychlewski L, Jaroszewski L, Li W, Godzik A: Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci 2000, 9(2):232\u2013241.","journal-title":"Protein Sci"},{"issue":"5","key":"1096_CR16","doi-asserted-by":"publisher","first-page":"1257","DOI":"10.1006\/jmbi.2001.5293","volume":"315","author":"G Yona","year":"2002","unstructured":"Yona G, Levitt M: Within the twilight zone: a sensitive profile-profile comparison tool based on information theory. 
J Mol Biol 2002, 315(5):1257\u20131275. 10.1006\/jmbi.2001.5293","journal-title":"J Mol Biol"},{"key":"1096_CR17","doi-asserted-by":"publisher","first-page":"243","DOI":"10.1006\/jmbi.2001.4762","volume":"310","author":"J Shi","year":"2001","unstructured":"Shi J, Blundell T, Mizuguchi K: FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. J Mol Biol 2001, 310: 243\u2013257. 10.1006\/jmbi.2001.4762","journal-title":"J Mol Biol"},{"key":"1096_CR18","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1186\/1471-2105-6-253","volume":"6","author":"T Ohlson","year":"2005","unstructured":"Ohlson T, Elofsson A: ProfNet, a method to derive profile-profile alignment scoring functions that improves the alignments of distantly related proteins. BMC Bioinformatics 2005, 6: 253. 10.1186\/1471-2105-6-253","journal-title":"BMC Bioinformatics"},{"issue":"4","key":"1096_CR19","doi-asserted-by":"publisher","first-page":"797","DOI":"10.1006\/jmbi.1999.2583","volume":"287","author":"D Jones","year":"1999","unstructured":"Jones D: GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J Mol Biol 1999, 287(4):797\u2013815. 10.1006\/jmbi.1999.2583","journal-title":"J Mol Biol"},{"issue":"Suppl 1","key":"1096_CR20","doi-asserted-by":"publisher","first-page":"1224","DOI":"10.1093\/bioinformatics\/bth913","volume":"20","author":"R MacCallum","year":"2004","unstructured":"MacCallum R: Striped sheets and protein contact prediction. Bioinformatics 2004, 20(Suppl 1):1224\u20131231.","journal-title":"Bioinformatics"},{"issue":"4","key":"1096_CR21","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1093\/bioinformatics\/bti828","volume":"22","author":"YM Huang","year":"2006","unstructured":"Huang YM, Bystroff C: Improved pairwise alignments of proteins in the Twilight Zone using local structure predictions. Bioinformatics 2006, 22(4):413\u2013422. 
10.1093\/bioinformatics\/bti828","journal-title":"Bioinformatics"},{"issue":"2","key":"1096_CR22","doi-asserted-by":"publisher","first-page":"445","DOI":"10.1002\/pro.5560070226","volume":"7","author":"M Gerstein","year":"1998","unstructured":"Gerstein M, Levitt M: Comprehensive assessment of automatic structural alignment against a manual standard, the scop classification of proteins. Protein Sci 1998, 7(2):445\u2013456.","journal-title":"Protein Sci"},{"key":"1096_CR23","doi-asserted-by":"crossref","unstructured":"Cristobal S, Zemla A, Fischer D, Rychlewski L, Elofsson A: A study of quality measures for protein threading models. BMC Bioinformatics 2001., 2(5):","DOI":"10.1186\/1471-2105-2-5"},{"key":"1096_CR24","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198538493.001.0001","volume-title":"Neural Networks for Pattern Recognition","author":"C Bishop","year":"1995","unstructured":"Bishop C: Neural Networks for Pattern Recognition. Great Clarendon St, Oxford OX2 6DP, UK.: Oxford University Press; 1995."},{"key":"1096_CR25","volume-title":"NetLab: Netlab neural network software","author":"I Nabney","year":"1995","unstructured":"Nabney I, Bishop C: NetLab: Netlab neural network software. 1995. [http:\/\/www.ncrg.aston.ac.uk\/netlab\/index.php]"},{"key":"1096_CR26","volume-title":"palign","author":"A Elofsson","year":"2002","unstructured":"Elofsson A, Ohlson T: palign. 2002. [http:\/\/www.bioinfo.se\/palign\/]"},{"issue":"9","key":"1096_CR27","doi-asserted-by":"publisher","first-page":"776","DOI":"10.1093\/bioinformatics\/16.9.776","volume":"16","author":"N Siew","year":"2000","unstructured":"Siew N, Elofsson A, Rychlewski L, Fischer D: MaxSub: an automated measure for the assessment of protein structure prediction quality. Bioinformatics 2000, 16(9):776\u2013785. 
10.1093\/bioinformatics\/16.9.776","journal-title":"Bioinformatics"},{"key":"1096_CR28","volume-title":"R: A language and environment for statistical computing","author":"R Development Core Team","year":"2005","unstructured":"R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria; 2005. [http:\/\/www.R-project.org] [ISBN 3-900051-07-0]"},{"issue":"17","key":"1096_CR29","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"S Altschul","year":"1997","unstructured":"Altschul S, Madden T, Schaffer A, Zhang J, Zhang Z, Miller W, Lipman D: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389\u20133402. 10.1093\/nar\/25.17.3389","journal-title":"Nucleic Acids Res"},{"issue":"5","key":"1096_CR30","doi-asserted-by":"publisher","first-page":"423","DOI":"10.1093\/bioinformatics\/14.5.423","volume":"14","author":"L Holm","year":"1998","unstructured":"Holm L, Sander C: Removing near-neighbour redundancy from large protein sequence collections. Bioinformatics 1998, 14(5):423\u2013429. 
10.1093\/bioinformatics\/14.5.423","journal-title":"Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-7-357.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,5]],"date-time":"2024-02-05T14:52:03Z","timestamp":1707144723000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-7-357"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,7,25]]},"references-count":30,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,12]]}},"alternative-id":["1096"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-7-357","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2006,7,25]]},"assertion":[{"value":"4 April 2006","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 July 2006","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 July 2006","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"357"}}