{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T08:48:16Z","timestamp":1762505296357},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"24","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,12,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features.<\/jats:p><jats:p>Results: We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy by any method reported so far in the literature. Combination of secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is \u223c8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and Crammer and Singer method, yield similar predictions.<\/jats:p><jats:p>Availability: Dataset and stand-alone program are available upon request.<\/jats:p><jats:p>Contact: \u00a0han@cdfd.org.in<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm527","type":"journal-article","created":{"date-parts":[[2007,11,8]],"date-time":"2007-11-08T01:26:33Z","timestamp":1194485193000},"page":"3320-3327","source":"Crossref","is-referenced-by-count":83,"title":["Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs"],"prefix":"10.1093","volume":"23","author":[{"given":"Mohammad Tabrez Anwar","family":"Shamim","sequence":"first","affiliation":[{"name":"Laboratory of Computational Biology, Centre for DNA Fingerprinting and Diagnostics, Hyderabad 500 076, India"}]},{"given":"Mohammad","family":"Anwaruddin","sequence":"additional","affiliation":[{"name":"Laboratory of Computational Biology, Centre for DNA Fingerprinting and Diagnostics, Hyderabad 500 076, India"}]},{"given":"H.A.","family":"Nagarajaram","sequence":"additional","affiliation":[{"name":"Laboratory of Computational Biology, Centre for DNA Fingerprinting and Diagnostics, Hyderabad 500 076, India"}]}],"member":"286","published-online":{"date-parts":[[2007,11,7]]},"reference":[{"key":"2023041107273993700_","first-page":"113","article-title":"Reducing multi-class to binary: a unifying approach for margin classifiers","volume":"1","author":"Allwein","year":"2000","journal-title":"J. Mach. Learn. Res"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"23262","DOI":"10.1074\/jbc.M401932200","article-title":"Classification of nuclear receptors based on amino acid composition and dipeptide composition","volume":"279","author":"Bhasin","year":"2004","journal-title":"J. Biol. Chem"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"D189","DOI":"10.1093\/nar\/gkh034","article-title":"The ASTRAL compendium in 2004","volume":"32","author":"Chandonia","year":"2004","journal-title":"Nucleic Acids Res"},{"key":"2023041107273993700_","unstructured":"Chang CC \u00a0LinCJ LIBSVM: a library for support vector machines 2001 Software available at http:\/\/www.csie.ntu.edu.tw\/~cjlin\/libsvm"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"1456","DOI":"10.1093\/bioinformatics\/btl102","article-title":"A machine learning information retrieval approach to protein fold recognition","volume":"22","author":"Cheng","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"w72","DOI":"10.1093\/nar\/gki396","article-title":"SCRATCH: a protein structure and structural feature prediction server","volume":"33","author":"Cheng","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023041107273993700_","first-page":"35","article-title":"On the learnability and design of output codes for multiclass problems","author":"Crammer","year":"2000"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1093\/bioinformatics\/17.4.349","article-title":"Multi-class protein fold recognition using support vector machines and neural networks","volume":"17","author":"Ding","year":"2001","journal-title":"Bioinformatics"},{"key":"2023041107273993700_","first-page":"721","article-title":"Round robin classification","volume":"2","author":"Furnkranz","year":"2002","journal-title":"J. Mach. Learn. Res"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"14427","DOI":"10.1074\/jbc.M411789200","article-title":"Support vector machine-based method for subcellular localization of human proteins using amino acid compositions, their order, and similarity search","volume":"280","author":"Garg","year":"2005","journal-title":"J. Biol. Chem"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"1211","DOI":"10.1162\/neco.1997.9.6.1245","article-title":"Note on free lunches and cross-validation","volume":"9","author":"Goutte","year":"1997","journal-title":"Neural Comput"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"5099","DOI":"10.1002\/pmic.200600064","article-title":"GNBSL: a new integrative system to predict the subcellular location for Gram-negative bacteria proteins","volume":"6","author":"Guo","year":"2006","journal-title":"Proteomics"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1109\/72.991427","article-title":"A comparison of methods for multi-class support vector machines","volume":"13","author":"Hsu","year":"2002","journal-title":"IEEE Trans. Neural Netw"},{"key":"2023041107273993700_","first-page":"431","article-title":"Estimating the generalization performance of an SVM efficiently","author":"Joachims","year":"2000"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1038\/358086a0","article-title":"A new approach to protein fold recognition","volume":"358","author":"Jones","year":"1992","journal-title":"Nature"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1093\/bioinformatics\/18.1.147","article-title":"Classifying G-protein coupled receptors with support vector machines","volume":"18","author":"Karchin","year":"2002","journal-title":"Bioinformatics"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1006\/jmbi.2000.3741","article-title":"Enhanced genome annotation using structural profiles in the program 3D-PSSM","volume":"299","author":"Kelley","year":"2000","journal-title":"J. Mol. Biol"},{"key":"2023041107273993700_","first-page":"255","volume-title":"Pairwise classification and support vector machines. Advances in Kernel Methods- Support Vector Learning","author":"Krebel","year":"1999"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1093\/bib\/bbk007","article-title":"Machine learning in bioinformatics","volume":"7","author":"Larranaga","year":"2006","journal-title":"Brief. Bioinformatics"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1093\/bioinformatics\/16.4.404","article-title":"The PSIPRED protein structure prediction server","volume":"16","author":"McGuffin","year":"2000","journal-title":"Bioinformatics"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"536","DOI":"10.1016\/S0022-2836(05)80134-2","article-title":"SCOP: a structural classification of proteins database for the investigation of sequences and structures","volume":"247","author":"Murzin","year":"1995","journal-title":"J. Mol. Biol"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"e408","DOI":"10.1093\/bioinformatics\/btl222","article-title":"BaCelLo: a balanced subcellular localization predictor","volume":"22","author":"Pierleoni","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"584","DOI":"10.1006\/jmbi.1993.1413","article-title":"Prediction of protein secondary structure at better than 70% accuracy","volume":"232","author":"Rost","year":"1993","journal-title":"J. Mol. Biol"},{"key":"2023041107273993700_","volume-title":"Ph.D. Thesis","author":"Sali","year":"1991"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"1717","DOI":"10.1093\/bioinformatics\/btl170","article-title":"Ensemble classifier for protein fold pattern recognition","volume":"22","author":"Shen","year":"2006","journal-title":"Bioinformatics"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1006\/jmbi.2001.4762","article-title":"FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties","volume":"310","author":"Shi","year":"2001","journal-title":"J. Mol. Biol"},{"key":"2023041107273993700_","volume-title":"SSTRUC: A Program to Calculate Secondary Structural Summary","author":"Smith","year":"1989"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-2440-0","volume-title":"The Nature of Statistical Learning theory","author":"Vapnik","year":"1995"},{"key":"2023041107273993700_","volume-title":"Statistical Learning Theory","author":"Vapnik","year":"1998"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1002\/prot.21062","article-title":"Better prediction of the location of \u03b1-turns in proteins with support vector machine","volume":"65","author":"Wang","year":"2006","journal-title":"Proteins Struct. Funct. Bioinformatics"},{"key":"2023041107273993700_","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1093\/bib\/5.4.328","article-title":"Biological applications of support vector machines","volume":"5","author":"Yang","year":"2004","journal-title":"Brief. Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/24\/3320\/49823669\/bioinformatics_23_24_3320.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/24\/3320\/49823669\/bioinformatics_23_24_3320.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,14]],"date-time":"2023-05-14T17:17:56Z","timestamp":1684084676000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/24\/3320\/1745716"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,11,7]]},"references-count":32,"journal-issue":{"issue":"24","published-print":{"date-parts":[[2007,12,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm527","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,12,15]]},"published":{"date-parts":[[2007,11,7]]}}}