{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,4,9]],"date-time":"2025-04-09T00:44:15Z","timestamp":1744159455873},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>The chemical property and biological function of a protein is a direct consequence of its primary structure. Several algorithms have been developed which determine alignment and similarity of primary protein sequences. However, character based similarity cannot provide insight into the structural aspects of a protein. We present a method based on spectral similarity to compare subsequences of amino acids that behave similarly but are not aligned well by considering amino acids as mere characters. This approach finds a similarity score between sequences based on any given attribute, like hydrophobicity of amino acids, on the basis of spectral information after partial conversion to the frequency domain.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>Distance matrices of various branches of the human kinome, that is the full complement of human kinases, were developed that matched the phylogenetic tree of the human kinome establishing the efficacy of the global alignment of the algorithm. PKCd and PKCe kinases share close biological properties and structural similarities but do not give high scores with character based alignments. Detailed comparison established close similarities between subsequences that do not have any significant character identity. 
We compared their known 3D structures to establish that the algorithm is able to pick subsequences that are not considered similar by character based matching algorithms but share structural similarities. Similarly many subsequences with low character identity were picked between xyna-theau and xyna-clotm F\/10 xylanases. Comparison of 3D structures of the subsequences confirmed the claim of similarity in structure.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>An algorithm is developed which is inspired by successful application of spectral similarity applied to music sequences. The method captures subsequences that do not align by traditional character based alignment tools but give rise to similar secondary and tertiary structures. The Spectral Similarity Score (SSS) is an extension to the conventional similarity methods and results indicate that it holds a strong potential for analysis of various biological sequences and structural variations in proteins.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/1471-2105-6-105","type":"journal-article","created":{"date-parts":[[2005,4,26]],"date-time":"2005-04-26T06:18:18Z","timestamp":1114496298000},"update-policy":"http:\/\/dx.doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Detailed protein sequence alignment based on Spectral Similarity Score 
(SSS)"],"prefix":"10.1186","volume":"6","author":[{"given":"Kshitiz","family":"Gupta","sequence":"first","affiliation":[]},{"given":"Dina","family":"Thomas","sequence":"additional","affiliation":[]},{"given":"SV","family":"Vidya","sequence":"additional","affiliation":[]},{"given":"KV","family":"Venkatesh","sequence":"additional","affiliation":[]},{"given":"S","family":"Ramakumar","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2005,4,23]]},"reference":[{"key":"430_CR1","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1038\/ng0294-119","volume":"6","author":"SF Altschul","year":"1994","unstructured":"Altschul SF, Boguski MS, Gish W, Wootton JC: Issues in searching molecular sequence databases. Nature Genet 1994, 6: 119\u2013129. 10.1038\/ng0294-119","journal-title":"Nature Genet"},{"key":"430_CR2","doi-asserted-by":"publisher","first-page":"505","DOI":"10.1093\/protein\/2.7.505","volume":"2","author":"WR Taylor","year":"1989","unstructured":"Taylor WR, Orengo CA: A holistic approach to protein structure alignment. Protein Eng 1989, 2: 505\u2013519.","journal-title":"Protein Eng"},{"key":"430_CR3","doi-asserted-by":"publisher","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","volume":"25","author":"SF Altschul","year":"1997","unstructured":"Altschul SF, Madden TL, Sch\u00e4ffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25: 3389\u20133402. 10.1093\/nar\/25.17.3389","journal-title":"Nucleic Acids Res"},{"key":"430_CR4","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","volume":"215","author":"SF Altschul","year":"1990","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215: 403\u2013410. 
10.1006\/jmbi.1990.9999","journal-title":"J Mol Biol"},{"key":"430_CR5","doi-asserted-by":"publisher","first-page":"W20","DOI":"10.1093\/nar\/gkh435","volume":"32","author":"S McGinnis","year":"2004","unstructured":"McGinnis S, Madden TL: BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res 2004, 32: W20-W25. 10.1093\/nar\/gnh003","journal-title":"Nucleic Acids Res"},{"issue":"8","key":"430_CR6","doi-asserted-by":"publisher","first-page":"2444","DOI":"10.1073\/pnas.85.8.2444","volume":"85","author":"W Pearson","year":"1988","unstructured":"Pearson W, Lipman DJ: Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A 1988, 85(8):2444\u20132448.","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"3","key":"430_CR7","doi-asserted-by":"publisher","first-page":"635","DOI":"10.1016\/0888-7543(91)90071-L","volume":"11","author":"WR Pearson","year":"1991","unstructured":"Pearson WR: Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 1991, 11(3):635\u2013650. 10.1016\/0888-7543(91)90071-L","journal-title":"Genomics"},{"key":"430_CR8","doi-asserted-by":"publisher","first-page":"195","DOI":"10.1016\/0022-2836(81)90087-5","volume":"147","author":"TF Smith","year":"1981","unstructured":"Smith TF, Waterman MS: Identification of Common Molecular Subsequences. J Mol Bio 1981, 147: 195\u2013197. 10.1016\/0022-2836(81)90087-5","journal-title":"J Mol Bio"},{"issue":"3","key":"430_CR9","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1016\/0022-2836(70)90057-4","volume":"48","author":"SB Needleman","year":"1970","unstructured":"Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 1970, 48(3):443\u2013453. 
10.1016\/0022-2836(70)90057-4","journal-title":"J Mol Biol"},{"key":"430_CR10","doi-asserted-by":"publisher","first-page":"887","DOI":"10.1006\/jmbi.2001.5250","volume":"315","author":"O Carugo","year":"2002","unstructured":"Carugo O, Pongor S: Protein fold similarity estimated by a probabilistic approach based on C([alpha])-C([alpha]) distance comparison. J Mol Biol 2002, 315: 887\u2013898. 10.1006\/jmbi.2001.5250","journal-title":"J Mol Biol"},{"key":"430_CR11","doi-asserted-by":"publisher","first-page":"GC33","DOI":"10.1016\/0378-1119(96)00123-0","volume":"172","author":"U Tonges","year":"1996","unstructured":"Tonges U, Perrey SW, Stoye J, Dress AW: A general method for fast multiple sequence alignment. Gene 1996, 172: GC33\u201341. 10.1016\/0378-1119(96)00123-0","journal-title":"Gene"},{"key":"430_CR12","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/S0097-8485(00)80003-0","volume":"24","author":"WR Taylor","year":"2000","unstructured":"Taylor WR, Saelensminde G, Eidhammer I: Multiple protein sequence alignment using double-dynamic programming. Comput Chem 2000, 24: 3\u201312. 10.1016\/S0097-8485(99)00043-1","journal-title":"Comput Chem"},{"key":"430_CR13","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1006\/jmbi.2000.4042","volume":"302","author":"C Notredame","year":"2000","unstructured":"Notredame C, Higgins DG, Heringa J: T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol 2000, 302: 205\u2013217. 10.1006\/jmbi.2000.4042","journal-title":"J Mol Biol"},{"key":"430_CR14","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1186\/1471-2105-5-157","volume":"5","author":"AF Neuwald","year":"2004","unstructured":"Neuwald AF, Liu JS: Gapped alignment of protein sequence motifs through Monte Carlo optimization of a hidden Markov model. BMC Bioinformatics 2004, 5: 157. 
10.1186\/1471-2105-5-157","journal-title":"BMC Bioinformatics"},{"key":"430_CR15","doi-asserted-by":"publisher","first-page":"383","DOI":"10.1016\/S0076-6879(96)66024-8","volume":"266","author":"DG Higgins","year":"1996","unstructured":"Higgins DG, Thompson JD, Gibson TJ: Using CLUSTAL for multiple sequence alignments. Methods Enzymol 1996, 266: 383\u2013402.","journal-title":"Methods Enzymol"},{"issue":"22","key":"430_CR16","doi-asserted-by":"publisher","first-page":"4673","DOI":"10.1093\/nar\/22.22.4673","volume":"22","author":"JD Thompson","year":"1994","unstructured":"Thompson JD, Higgins DG, Gibson TJ: Clustal W: improving the sensitivity of progressive multiple sequence alignments through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22(22):4673\u20134680.","journal-title":"Nucleic Acids Res"},{"key":"430_CR17","doi-asserted-by":"publisher","first-page":"11","DOI":"10.1186\/1471-2105-3-11","volume":"3","author":"A Karwath","year":"2002","unstructured":"Karwath A, King RD: Homology Induction: the use of machine learning to improve sequence similarity searches. BMC Bioinformatics 2002, 3: 11. 10.1186\/1471-2105-3-11","journal-title":"BMC Bioinformatics"},{"key":"430_CR18","doi-asserted-by":"publisher","first-page":"1145","DOI":"10.1002\/pro.5560040613","volume":"4","author":"WR Pearson","year":"1995","unstructured":"Pearson WR: Comparison of methods for searching protein sequence databases. Protein Sci 1995, 4: 1145\u20131160.","journal-title":"Protein Sci"},{"issue":"2","key":"430_CR19","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1006\/geno.1996.0614","volume":"38","author":"EG Shpaer","year":"1996","unstructured":"Shpaer EG, Robinson M, Yee D, Candlin JD, Mines RTH, T H: Sensitivity and selectivity in protein similarity searches: a comparison of Smith-Waterman in hardware to BLAST and FASTA. Genomics 1996, 38(2):179\u2013191. 
10.1006\/geno.1996.0614","journal-title":"Genomics"},{"issue":"8","key":"430_CR20","doi-asserted-by":"publisher","first-page":"749","DOI":"10.1093\/oxfordjournals.bioinformatics.a011054","volume":"14","author":"CM Pasquier","year":"1998","unstructured":"Pasquier CM, Promponas VI, Varvayannis NJ, J HS: A Web server to locate periodicities in a sequence. Bioinformatics 1998, 14(8):749\u2013750.","journal-title":"Bioinformatics"},{"issue":"3","key":"430_CR21","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1093\/protein\/15.3.193","volume":"15","author":"CH de Trad","year":"2002","unstructured":"de Trad CH, Fang Q, Cosic I: Protein sequence comparison based on wavelet transform. Protein Engineering 2002, 15(3):193\u2013202. 10.1093\/protein\/15.3.193","journal-title":"Protein Engineering"},{"key":"430_CR22","doi-asserted-by":"publisher","first-page":"299","DOI":"10.1002\/prot.10290","volume":"50","author":"AJ Shepherd","year":"2003","unstructured":"Shepherd AJ, Gorse D, Thornton JM: A Novel Approach to the Recognition of Protein Architecture from Sequence Using Fourier Analysis and Neural Networks. PROTEINS: Structure, Function, and Genetics 2003, 50: 299\u2013302. 10.1002\/prot.10290","journal-title":"PROTEINS: Structure, Function, and Genetics"},{"key":"430_CR23","first-page":"2001","volume-title":"Stanford University Database Group technical report","author":"Y Cheng","year":"2000","unstructured":"Cheng Y: Music Database Retrieval Based on Spectral Similarity. Stanford University Database Group technical report 2000, 2001\u20132014."},{"key":"430_CR24","unstructured":"AAindex: Amino Acid Index Database, Release 6.0, September 2002[http:\/\/www.genome.ad.jp\/dbget\/aaindex.html]"},{"key":"430_CR25","doi-asserted-by":"publisher","first-page":"374","DOI":"10.1093\/nar\/28.1.374","volume":"28","author":"S Kawashima","year":"2000","unstructured":"Kawashima S, Kanehisa M: AAindex: amino acid index database. Nucleic Acids Res 2000, 28: 374. 
10.1093\/nar\/28.1.374","journal-title":"Nucleic Acids Res"},{"key":"430_CR26","doi-asserted-by":"publisher","first-page":"1302","DOI":"10.1002\/pro.5560060618","volume":"6","author":"PA Karplus","year":"1997","unstructured":"Karplus PA: Hydrophobicity Regained. Protein Science 1997, 6: 1302\u20131307.","journal-title":"Protein Science"},{"key":"430_CR27","doi-asserted-by":"publisher","first-page":"1912","DOI":"10.1126\/science.1075762","volume":"298","author":"G Manning","year":"2002","unstructured":"Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S: The Protein Kinase Complement of the Human Genome. Science 2002, 298: 1912\u20131934. 10.1126\/science.1075762","journal-title":"Science"},{"issue":"12","key":"430_CR28","doi-asserted-by":"publisher","first-page":"980","DOI":"10.1038\/nsb1203-980","volume":"10","author":"H Berman","year":"2003","unstructured":"Berman H, Henrick K, Nakamura H: Announcing the Worldwide Protein Data Bank. Nature Struct Bio 2003, 10(12):980. 10.1038\/nsb1203-980","journal-title":"Nature Struct Bio"},{"key":"430_CR29","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/nar\/28.1.235","volume":"28","author":"HM Berman","year":"2000","unstructured":"Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucl Acids Res 2000, 28: 235\u2013242. 10.1093\/nar\/28.1.235","journal-title":"Nucl Acids Res"},{"key":"430_CR30","doi-asserted-by":"publisher","first-page":"2714","DOI":"10.1002\/elps.1150181505","volume":"18","author":"N Guex","year":"1997","unstructured":"Guex N, Peitsch MC: SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modeling. 
Electrophoresis 1997, 18: 2714\u20132723.","journal-title":"Electrophoresis"},{"key":"430_CR31","unstructured":"Deep View Swiss-PdbViewer[http:\/\/www.expasy.org\/spdbv\/]"},{"key":"430_CR32","doi-asserted-by":"publisher","first-page":"365","DOI":"10.1093\/nar\/gkg095","volume":"31","author":"B Boeckmann","year":"2003","unstructured":"Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O'Donovan C, Phan I, Pilbout S, Schneider M: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucl Acids Res 2003, 31: 365\u2013370. 10.1093\/nar\/gkg095","journal-title":"Nucl Acids Res"},{"key":"430_CR33","first-page":"496","volume-title":"Numerical Recipes in C","author":"WH Press","year":"2002","unstructured":"Press WH, Teukolsky SA, Vetterling WT, Flannery BP: Fast Fourier Transformation. In Numerical Recipes in C. 2nd edition. Cambridge University Press; 2002:496\u2013524.","edition":"2"},{"key":"430_CR34","volume-title":"Fast Transforms: Algorithms, Analyses, Applications","author":"DF Elliott","year":"1982","unstructured":"Elliott DF, Rao KR: Fast Transforms: Algorithms, Analyses, Applications. New York: Academic Press; 1982."},{"issue":"4","key":"430_CR35","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1109\/MASSP.1984.1162257","volume":"1","author":"MT Heideman","year":"1984","unstructured":"Heideman MT, Johnson DH, Burrus CS: Gauss and the history of fast Fourier Transform. IEEE ASSP Magazine 1984, 1(4):14\u201321.","journal-title":"IEEE ASSP Magazine"},{"key":"430_CR36","volume-title":"Introduction to Algorithm","author":"TH Cormen","year":"2000","unstructured":"Cormen TH, Leiserson CE, Rivest RL, Stein C: Dynamic Algorithms. In Introduction to Algorithm. Volume 2. 2nd edition. 
MIT Press; 2000.","edition":"2"},{"issue":"12","key":"430_CR37","doi-asserted-by":"publisher","first-page":"2577","DOI":"10.1002\/bip.360221211","volume":"22","author":"W Kabsch","year":"1983","unstructured":"Kabsch W, Sander C: Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22(12):2577\u20132637. 10.1002\/bip.360221211","journal-title":"Biopolymers"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-6-105.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,1]],"date-time":"2024-02-01T17:45:06Z","timestamp":1706809506000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-6-105"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2005,4,23]]},"references-count":37,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2005,12]]}},"alternative-id":["430"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-6-105","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2005,4,23]]},"assertion":[{"value":"1 February 2005","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 April 2005","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 April 2005","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"105"}}