{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T19:10:04Z","timestamp":1762542604286,"version":"3.37.3"},"reference-count":39,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2016,12,22]],"date-time":"2016-12-22T00:00:00Z","timestamp":1482364800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["(11501306, 11501407)"],"award-info":[{"award-number":["(11501306, 11501407)"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"China National 863 High-Tech Program","award":["(2015AA020101)"],"award-info":[{"award-number":["(2015AA020101)"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2017,3,15]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds. One is through template-based fold assignment and the other is ab-initio prediction using machine learning algorithms. Combination of both solutions to improve the prediction accuracy was never explored before.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We developed two algorithms, HH-fold and SVM-fold for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. 
SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features are extracted from three complementary sequence profiles. These two algorithms are then combined, resulting to the ensemble approach TA-fold. We performed a comprehensive assessment for the proposed methods by comparing with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset that consists of proteins from 27 folds. This represents improvement of 5.4\u201311.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved &gt;0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolution information.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and Implementation<\/jats:title>\n                  <jats:p>http:\/\/yanglab.nankai.edu.cn\/TA-fold\/<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btw768","type":"journal-article","created":{"date-parts":[[2016,12,2]],"date-time":"2016-12-02T12:05:39Z","timestamp":1480680339000},"page":"863-870","source":"Crossref","is-referenced-by-count":40,"title":["An ensemble approach to protein fold classification by integration of template-based assignment and support 
vector machine classifier"],"prefix":"10.1093","volume":"33","author":[{"given":"Jiaqi","family":"Xia","sequence":"first","affiliation":[{"name":"Department of Physics, Northeast Forestry University, Harbin, China"}]},{"given":"Zhenling","family":"Peng","sequence":"additional","affiliation":[{"name":"Center for Applied Mathematics, Tianjin University, Tianjin, China"}]},{"given":"Dawei","family":"Qi","sequence":"additional","affiliation":[{"name":"Department of Physics, Northeast Forestry University, Harbin, China"}]},{"given":"Hongbo","family":"Mu","sequence":"additional","affiliation":[{"name":"Department of Physics, Northeast Forestry University, Harbin, China"}]},{"given":"Jianyi","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Mathematical Sciences, Nankai University, Tianjin, China"}]}],"member":"286","published-online":{"date-parts":[[2016,12,22]]},"reference":[{"key":"2023020204514989500_btw768-B1","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023020204514989500_btw768-B2","doi-asserted-by":"crossref","first-page":"2843","DOI":"10.1093\/bioinformatics\/btm475","article-title":"PFRES: protein fold classification by using evolutionary information and predicted secondary structure","volume":"23","author":"Chen","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B3","doi-asserted-by":"crossref","first-page":"963","DOI":"10.1007\/s00726-010-0721-1","article-title":"iFC(2): an integrated web-server for improved prediction of protein structural class, fold type, and secondary structure content","volume":"40","author":"Chen","year":"2011","journal-title":"Amino 
Acids"},{"key":"2023020204514989500_btw768-B4","doi-asserted-by":"crossref","first-page":"426","DOI":"10.1002\/jcc.24232","article-title":"Protein folds recognized by an intelligent predictor based-on evolutionary and structural information","volume":"37","author":"Cheung","year":"2016","journal-title":"J. Comput. Chem"},{"key":"2023020204514989500_btw768-B5","doi-asserted-by":"crossref","first-page":"275","DOI":"10.3109\/10409239509083488","article-title":"Prediction of protein structural classes","volume":"30","author":"Chou","year":"1995","journal-title":"Crit. Rev. Biochem. Mol. Biol"},{"key":"2023020204514989500_btw768-B6","doi-asserted-by":"crossref","first-page":"1264","DOI":"10.1093\/bioinformatics\/btn112","article-title":"Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection","volume":"24","author":"Damoulas","year":"2008","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B7","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1002\/prot.22324","article-title":"Enhanced protein fold recognition using a structural alphabet","volume":"76","author":"Deschavanne","year":"2009","journal-title":"Proteins"},{"key":"2023020204514989500_btw768-B8","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1093\/bioinformatics\/17.4.349","article-title":"Multi-class protein fold recognition using support vector machines and neural networks","volume":"17","author":"Ding","year":"2001","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B9","doi-asserted-by":"crossref","first-page":"2655","DOI":"10.1093\/bioinformatics\/btp500","article-title":"A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation","volume":"25","author":"Dong","year":"2009","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B10","doi-asserted-by":"crossref","first-page":"D304","DOI":"10.1093\/nar\/gkt1240","article-title":"SCOPe: Structural 
Classification of Proteins\u2013extended, integrating SCOP and ASTRAL data and classification of new structures","volume":"42","author":"Fox","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023020204514989500_btw768-B11","doi-asserted-by":"crossref","first-page":"659","DOI":"10.1093\/protein\/gzn045","article-title":"A novel hierarchical ensemble classifier for protein fold recognition","volume":"21","author":"Guo","year":"2008","journal-title":"Protein Eng. Des. Select. PEDS"},{"key":"2023020204514989500_btw768-B12","doi-asserted-by":"crossref","first-page":"1099","DOI":"10.1016\/S0969-2126(99)80177-4","article-title":"A systematic comparison of protein structure classifications: SCOP, CATH and FSSP","volume":"7","author":"Hadley","year":"1999","journal-title":"Structure"},{"key":"2023020204514989500_btw768-B13","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1109\/TNB.2003.820284","article-title":"Hierarchical learning architecture with automatic feature selection for multiclass protein fold classification","volume":"2","author":"Huang","year":"2003","journal-title":"IEEE Trans. Nanobiosci"},{"key":"2023020204514989500_btw768-B14","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023020204514989500_btw768-B15","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1038\/358086a0","article-title":"A new approach to protein fold recognition","volume":"358","author":"Jones","year":"1992","journal-title":"Nature"},{"key":"2023020204514989500_btw768-B16","doi-asserted-by":"crossref","first-page":"613","DOI":"10.1006\/jmbi.1999.3377","article-title":"Identification of related proteins on family, superfamily and fold level","volume":"295","author":"Lindahl","year":"2000","journal-title":"J. Mol. 
Biol"},{"key":"2023020204514989500_btw768-B17","doi-asserted-by":"crossref","first-page":"491","DOI":"10.1002\/prot.10514","article-title":"The number of protein folds and their distribution over families in nature","volume":"54","author":"Liu","year":"2004","journal-title":"Proteins"},{"key":"2023020204514989500_btw768-B18","doi-asserted-by":"crossref","first-page":"761","DOI":"10.1109\/TNB.2015.2457906","article-title":"Advancing the Accuracy of Protein Fold Recognition by Utilizing Profiles From Hidden Markov Models","volume":"14","author":"Lyons","year":"2015","journal-title":"IEEE Trans. Nanobiosci"},{"key":"2023020204514989500_btw768-B19","doi-asserted-by":"crossref","first-page":"414","DOI":"10.1186\/1471-2105-10-414","article-title":"Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences","volume":"10","author":"Mizianty","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"2023020204514989500_btw768-B20","doi-asserted-by":"crossref","first-page":"4239","DOI":"10.1093\/bioinformatics\/bti687","article-title":"Profile-based direct kernels for remote homology detection and fold recognition","volume":"21","author":"Rangwala","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B21","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2012","journal-title":"Nat. 
Methods"},{"key":"2023020204514989500_btw768-B22","doi-asserted-by":"crossref","first-page":"3320","DOI":"10.1093\/bioinformatics\/btm527","article-title":"Support Vector Machine-based classification of protein folds using the structural properties of amino acid residues and amino acid residue pairs","volume":"23","author":"Shamim","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B23","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1016\/j.jtbi.2012.12.008","article-title":"A feature extraction technique using bi-gram probabilities of position specific scoring matrix for protein fold recognition","volume":"320","author":"Sharma","year":"2013","journal-title":"J. Theor. Biol"},{"key":"2023020204514989500_btw768-B24","doi-asserted-by":"crossref","first-page":"1717","DOI":"10.1093\/bioinformatics\/btl170","article-title":"Ensemble classifier for protein fold pattern recognition","volume":"22","author":"Shen","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B25","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1016\/j.jtbi.2008.10.007","article-title":"Predicting protein fold pattern with functional domain and sequential evolution information","volume":"256","author":"Shen","year":"2009","journal-title":"J. Theor. 
Biol"},{"key":"2023020204514989500_btw768-B26","doi-asserted-by":"crossref","first-page":"D376","DOI":"10.1093\/nar\/gku947","article-title":"CATH: comprehensive structural and functional annotations for genome sequences","volume":"43","author":"Sillitoe","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023020204514989500_btw768-B27","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1093\/bioinformatics\/bti125","article-title":"Protein homology detection by HMM-HMM comparison","volume":"21","author":"Soding","year":"2005","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B28","doi-asserted-by":"crossref","first-page":"W244","DOI":"10.1093\/nar\/gki408","article-title":"The HHpred interactive server for protein homology detection and structure prediction","volume":"33","author":"Soding","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020204514989500_btw768-B29","doi-asserted-by":"crossref","first-page":"404","DOI":"10.1186\/1471-2105-8-404","article-title":"Application of amino acid occurrence for discriminating different folding types of globular proteins","volume":"8","author":"Taguchi","year":"2007","journal-title":"BMC Bioinformatics"},{"key":"2023020204514989500_btw768-B30","doi-asserted-by":"crossref","first-page":"649","DOI":"10.1109\/TNB.2015.2450233","article-title":"Enhanced protein fold prediction method through a novel feature extraction technique","volume":"14","author":"Wei","year":"2015","journal-title":"IEEE Trans. Nanobiosci"},{"key":"2023020204514989500_btw768-B31","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1016\/0003-2670(93)80437-P","article-title":"DNA and peptide sequences and chemical processes multivariately modelled by principal component analysis and partial least-squares projections to latent structures","volume":"277","author":"Wold","year":"1993","journal-title":"Anal. Chim. 
Acta"},{"key":"2023020204514989500_btw768-B32","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1093\/bioinformatics\/btt578","article-title":"FFAS-3D: improving fold recognition by including optimized structural features and template re-ranking","volume":"30","author":"Xu","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B33","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1038\/nmeth.3213","article-title":"The I-TASSER Suite: protein structure and function prediction","volume":"12","author":"Yang","year":"2015","journal-title":"Nat. Methods"},{"key":"2023020204514989500_btw768-B34","doi-asserted-by":"crossref","first-page":"2053","DOI":"10.1002\/prot.23025","article-title":"Improving taxonomy-based protein fold recognition by using global and local features","volume":"79","author":"Yang","year":"2011","journal-title":"Proteins"},{"key":"2023020204514989500_btw768-B35","doi-asserted-by":"crossref","first-page":"S9","DOI":"10.1186\/1471-2105-11-S1-S9","article-title":"Prediction of protein structural classes for low-homology sequences based on predicted secondary structure","volume":"11","author":"Yang","year":"2010","journal-title":"BMC Bioinformatics"},{"key":"2023020204514989500_btw768-B36","doi-asserted-by":"crossref","first-page":"618","DOI":"10.1016\/j.jtbi.2008.12.027","article-title":"Prediction of protein structural classes by recurrence quantification analysis based on chaos game representation","volume":"257","author":"Yang","year":"2009","journal-title":"J. Theor. 
Biol"},{"key":"2023020204514989500_btw768-B37","doi-asserted-by":"crossref","first-page":"2076","DOI":"10.1093\/bioinformatics\/btr350","article-title":"Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of query and corresponding native properties of templates","volume":"27","author":"Yang","year":"2011","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B38","doi-asserted-by":"crossref","first-page":"1850","DOI":"10.1093\/bioinformatics\/btu118","article-title":"Protein fold recognition using geometric kernel data fusion","volume":"30","author":"Zakeri","year":"2014","journal-title":"Bioinformatics"},{"key":"2023020204514989500_btw768-B39","doi-asserted-by":"crossref","first-page":"1301","DOI":"10.1006\/jmbi.1998.2282","article-title":"Estimating the number of protein folds","volume":"284","author":"Zhang","year":"1998","journal-title":"J. Mol. Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/6\/863\/49038307\/bioinformatics_33_6_863.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/33\/6\/863\/49038307\/bioinformatics_33_6_863.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,2]],"date-time":"2023-02-02T04:55:19Z","timestamp":1675313719000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/33\/6\/863\/2761562"}},"subtitle":[],"editor":[{"given":"Anna","family":"Tramontano","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2016,12,22]]},"references-count":39,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2017,3,15]]}},"URL":"https:\/\/do
i.org\/10.1093\/bioinformatics\/btw768","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2017,3,15]]},"published":{"date-parts":[[2016,12,22]]}}}