{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T09:24:44Z","timestamp":1777627484520,"version":"3.51.4"},"reference-count":29,"publisher":"Springer Science and Business Media LLC","issue":"1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2006,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusion<\/jats:title>\n            <jats:p>The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum eigenvalues can be highly effective in clustering structurally or topologically similar proteins. We believe that the PCC analysis of interaction matrix is highly flexible in adopting various structural parameters for protein structure comparison.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1471-2105-7-40","type":"journal-article","created":{"date-parts":[[2006,2,4]],"date-time":"2006-02-04T19:14:13Z","timestamp":1139080453000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":22,"title":["Protein structure similarity from principle component correlation analysis"],"prefix":"10.1186","volume":"7","author":[{"given":"Xiaobo","family":"Zhou","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"James","family":"Chou","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stephen TC","family":"Wong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2006,1,25]]},"reference":[{"key":"779_CR1","doi-asserted-by":"publisher","first-page":"643","DOI":"10.1038\/1334","volume":"5","author":"S Kim","year":"1998","unstructured":"Kim S: Shining a light on structural genomics. Nat Struct Biol 1998, 5: 643\u2013645. 10.1038\/1334","journal-title":"Nat Struct Biol"},{"key":"779_CR2","first-page":"45","volume":"47","author":"PY Chou","year":"1978","unstructured":"Chou PY, Fasman GD: Prediction of the secondary structure of proteins from their amino acid sequence. Adv Enzymol Relat Areas Mol Biol 1978, 47: 45\u2013148.","journal-title":"Adv Enzymol Relat Areas Mol Biol"},{"key":"779_CR3","doi-asserted-by":"crossref","first-page":"22014","DOI":"10.1016\/S0021-9258(17)31748-9","volume":"269","author":"KC Chou","year":"1994","unstructured":"Chou KC, Zhang CT: Predicting protein folding types by distance functions that make allowances for amino acid interactions. Journal of Biological Chemistry 1994, 269: 22014\u201322020.","journal-title":"Journal of Biological Chemistry"},{"key":"779_CR4","doi-asserted-by":"publisher","first-page":"172","DOI":"10.1002\/(SICI)1097-0134(199710)29:2<172::AID-PROT5>3.0.CO;2-F","volume":"29","author":"I Bahar","year":"1997","unstructured":"Bahar I, Atilgan AR, Jernigan RL, Erman B: Understanding the recognition of protein structural classes by amino acid composition. PROTEINS: Structure, Function, and Genetics 1997, 29: 172\u2013185. Publisher Full Text 10.1002\/(SICI)1097-0134(199710)29:2<172::AID-PROT5>3.0.CO;2-F","journal-title":"PROTEINS: Structure, Function, and Genetics"},{"key":"779_CR5","doi-asserted-by":"publisher","first-page":"45765","DOI":"10.1074\/jbc.M204161200","volume":"227","author":"KC Chou","year":"2002","unstructured":"Chou KC, Cai YD: Using functional domain composition and support vector machines for prediction of protein subcellular location. Journal of Biological Chemistry 2002, 227: 45765\u201345769. 10.1074\/jbc.M204161200","journal-title":"Journal of Biological Chemistry"},{"key":"779_CR6","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1016\/S0968-0004(98)01336-X","volume":"24","author":"K Nakai","year":"1999","unstructured":"Nakai K, Horton P: PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends in Biochemical Science 1999, 24: 34\u201336. 10.1016\/S0968-0004(98)01336-X","journal-title":"Trends in Biochemical Science"},{"issue":"2","key":"779_CR7","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1021\/pr0255710","volume":"2","author":"K Chou","year":"2003","unstructured":"Chou K, Elrod DW: Prediction of enzyme family classes. J Proteome Res 2003, 2(2):183\u2013190. 10.1021\/pr0255710","journal-title":"J Proteome Res"},{"key":"779_CR8","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1016\/S0196-9781(02)00289-9","volume":"24","author":"YD Cai","year":"2003","unstructured":"Cai YD, Lin S, Chou KC: Support vector machines for prediction of protein signal sequences and their cleavage sites. Peptides 2003, 24: 159\u2013161. 10.1016\/S0196-9781(02)00289-9","journal-title":"Peptides"},{"key":"779_CR9","doi-asserted-by":"publisher","first-page":"291","DOI":"10.1007\/BF01028191","volume":"12","author":"JJ Chou","year":"1993","unstructured":"Chou JJ: Predicting cleavability of peptide sequences by HIV protease via correlation-angle approach. Journal of Protein Chemistry 1993, 12: 291\u2013302. 10.1007\/BF01028191","journal-title":"Journal of Protein Chemistry"},{"key":"779_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1006\/abio.1996.0001","volume":"233","author":"KC Chou","year":"1996","unstructured":"Chou KC: Prediction of HIV protease cleavage sites in proteins. Analytical Biochemistry 1996, 233: 1\u201314. 10.1006\/abio.1996.0001","journal-title":"Analytical Biochemistry"},{"key":"779_CR11","doi-asserted-by":"publisher","first-page":"235","DOI":"10.1093\/nar\/28.1.235","volume":"28","author":"HM Berman","year":"2000","unstructured":"Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Research 2000, 28: 235\u2013242. 10.1093\/nar\/28.1.235","journal-title":"Nucleic Acids Research"},{"key":"779_CR12","doi-asserted-by":"publisher","first-page":"685","DOI":"10.1089\/106652701446152","volume":"7","author":"I Eidhammer","year":"2000","unstructured":"Eidhammer I, Jonassen I, Taylor WR: Structure comparison and structure patterns. Journal of Computational Biology 2000, 7: 685\u2013716. 10.1089\/106652701446152","journal-title":"Journal of Computational Biology"},{"key":"779_CR13","doi-asserted-by":"publisher","first-page":"348","DOI":"10.1016\/S0959-440X(00)00214-1","volume":"11","author":"P Koehl","year":"2001","unstructured":"Koehl P: Protein structure similarities. Current Opinion in Structural Biology 2001, 11: 348\u2013353. 10.1016\/S0959-440X(00)00214-1","journal-title":"Current Opinion in Structural Biology"},{"key":"779_CR14","volume-title":"Calmodulin","author":"P Cohen","year":"1988","unstructured":"Cohen P, Klee CB: Calmodulin. New York: Elsevier; 1988."},{"issue":"3","key":"779_CR15","doi-asserted-by":"publisher","first-page":"217","DOI":"10.1023\/A:1026563923774","volume":"18","author":"JJ Chou","year":"2000","unstructured":"Chou JJ, Li SP, Bax A: Study of conformational rearrangement and refinement of structural homology models by the use of heteronuclear dipolar couplings. Journal of Biomolecular NMR 2000, 18(3):217\u2013227. 10.1023\/A:1026563923774","journal-title":"Journal of Biomolecular NMR"},{"issue":"11","key":"779_CR16","doi-asserted-by":"publisher","first-page":"2606","DOI":"10.1110\/ps.0215902","volume":"11","author":"AR Ortiz","year":"2002","unstructured":"Ortiz AR, Strauss CE, Olmea O: MAMMOTH (matching molecular models obtained from theory): an automated method for model comparison. Protein Sci 2002, 11(11):2606\u201321. 10.1110\/ps.0215902","journal-title":"Protein Sci"},{"issue":"3","key":"779_CR17","doi-asserted-by":"publisher","first-page":"487","DOI":"10.1002\/prot.20146","volume":"56","author":"DL Bostick","year":"2004","unstructured":"Bostick DL, Shen M, Vaisman II: A simple topological representation of protein structure: implications for new, fast, and robust structural classification. Proteins 2004, 56(3):487\u2013501. 10.1002\/prot.20146","journal-title":"Proteins"},{"issue":"4","key":"779_CR18","doi-asserted-by":"publisher","first-page":"887","DOI":"10.1006\/jmbi.2001.5250","volume":"315","author":"O Carugo","year":"2002","unstructured":"Carugo O, Pongor S: Protein fold similarity estimated by a probabilistic approach based on C(alpha)-C(alpha) distance comparison. J Mol Biol 2002, 315(4):887\u201398. 10.1006\/jmbi.2001.5250","journal-title":"J Mol Biol"},{"issue":"4","key":"779_CR19","doi-asserted-by":"publisher","first-page":"554","DOI":"10.1002\/(SICI)1097-0134(19991201)37:4<554::AID-PROT6>3.0.CO;2-1","volume":"37","author":"K Kedem","year":"1999","unstructured":"Kedem K, Chew LP, Elber R: Unit-vector RMS (URMS) as a tool to analyze molecular dynamics trajectories. Proteins 1999, 37(4):554\u201364. 10.1002\/(SICI)1097-0134(19991201)37:4<554::AID-PROT6>3.0.CO;2-1","journal-title":"Proteins"},{"issue":"13","key":"779_CR20","doi-asserted-by":"publisher","first-page":"3370","DOI":"10.1093\/nar\/gkg571","volume":"31","author":"A Zemla","year":"2003","unstructured":"Zemla A: LGA: A method for finding 3D similarities in protein structures. Nucleic Acids Res 2003, 31(13):3370\u20134. 10.1093\/nar\/gkg571","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"779_CR21","doi-asserted-by":"publisher","first-page":"22","DOI":"10.1002\/prot.20240","volume":"58","author":"U Bastolla","year":"2005","unstructured":"Bastolla U, et al.: Principal eigenvector of contact matrices and hydrophobicity profiles in proteins. Proteins 2005, 58(1):22\u201330. 10.1002\/prot.20240","journal-title":"Proteins"},{"issue":"1","key":"779_CR22","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1073\/pnas.2636460100","volume":"100","author":"P Rogen","year":"2003","unstructured":"Rogen P, Fain B: Automatic classification of protein structure by using Gauss integrals. Proc Natl Acad Sci USA 2003, 100(1):119\u2013124. 10.1073\/pnas.2636460100","journal-title":"Proc Natl Acad Sci USA"},{"key":"779_CR23","doi-asserted-by":"publisher","first-page":"1093","DOI":"10.1016\/S0969-2126(97)00260-8","volume":"5","author":"CA Orengo","year":"1997","unstructured":"Orengo CA, Michie AD, Jones S, Jones DT, Swindells MB, Thornton JM: CATH \u2013 A hierarchic classification of protein domain structures. Structure 1997, 5: 1093\u20131108. 10.1016\/S0969-2126(97)00260-8","journal-title":"Structure"},{"key":"779_CR24","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1093\/nar\/28.1.277","volume":"28","author":"FMG Pearl","year":"2000","unstructured":"Pearl FMG, Lee D, Bray JE, Sillitoe I, Todd AE, Harrison AP, Thornton JM, Orengo CA: Assigning genomic sequences to CATH. Nucleic Acids Research 2000, 28: 277\u2013282. 10.1093\/nar\/28.1.277","journal-title":"Nucleic Acids Research"},{"key":"779_CR25","doi-asserted-by":"crossref","first-page":"588","DOI":"10.21136\/CMJ.1961.100486","volume":"11","author":"G Calugareanu","year":"1961","unstructured":"Calugareanu G: Sur les classes d'isotopie des noeuds tridimensionnels et leurs invariants. Czechoslovak Math 1961, 11: 588\u2013625.","journal-title":"Czechoslovak Math"},{"issue":"4","key":"779_CR26","doi-asserted-by":"publisher","first-page":"815","DOI":"10.1073\/pnas.68.4.815","volume":"68","author":"FB Fuller","year":"1971","unstructured":"Fuller FB: The writhing number of a space curve. Proc Natl Acad Sci USA 1971, 68(4):815\u20139.","journal-title":"Proc Natl Acad Sci USA"},{"issue":"1","key":"779_CR27","first-page":"100","volume":"243","author":"WR Bauer","year":"1980","unstructured":"Bauer WR, Crick FH, White JH: Supercoiled DNA. Sci Am 1980, 243(1):100\u201313.","journal-title":"Sci Am"},{"key":"779_CR28","volume-title":"Proceedings of the eighth annual international conference on Computational molecular biology","author":"MA Erdmann","year":"2004","unstructured":"Erdmann MA: Protein similarity from knot theory and geometric convolution. In Proceedings of the eighth annual international conference on Computational molecular biology. San Diego, California, USA; 2004."},{"key":"779_CR29","volume-title":"Theory and Its Applications","author":"K Murasugi","year":"1996","unstructured":"Murasugi K: Theory and Its Applications. Boston, USA: Birkh\u00e4user; 1996."}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1471-2105-7-40.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T03:20:24Z","timestamp":1630466424000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/1471-2105-7-40"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2006,1,25]]},"references-count":29,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2006,12]]}},"alternative-id":["779"],"URL":"https:\/\/doi.org\/10.1186\/1471-2105-7-40","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2006,1,25]]},"assertion":[{"value":"22 April 2005","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 January 2006","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 January 2006","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"40"}}