{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,26]],"date-time":"2026-03-26T12:25:46Z","timestamp":1774527946738,"version":"3.50.1"},"reference-count":28,"publisher":"Springer Science and Business Media LLC","issue":"S1","content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Syst Biol"],"published-print":{"date-parts":[[2011,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:sec>\n            <jats:title>Background<\/jats:title>\n            <jats:p>Enzymes are known as the largest class of proteins and their functions are usually annotated by the Enzyme Commission (EC), which uses a hierarchy structure, i.e., four numbers separated by periods, to classify the function of enzymes. Automatically categorizing enzyme into the EC hierarchy is crucial to understand its specific molecular mechanism.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Results<\/jats:title>\n            <jats:p>In this paper, we introduce two key improvements in predicting enzyme function within the machine learning framework. One is to introduce the efficient sequence encoding methods for representing given proteins. The second one is to develop a structure-based prediction method with low computational complexity. In particular, we propose to use the conjoint triad feature (CTF) to represent the given protein sequences by considering not only the composition of amino acids but also the neighbor relationships in the sequence. Then we develop a support vector machine (SVM)-based method, named as SVMHL (SVM for hierarchy labels), to output enzyme function by fully considering the hierarchical structure of EC. The experimental results show that our SVMHL with the CTF outperforms SVMHL with the amino acid composition (AAC) feature both in predictive accuracy and Matthew\u2019s correlation coefficient (MCC). 
In addition, SVMHL with the CTF obtains the accuracy and MCC ranging from 81% to 98% and 0.82 to 0.98 when predicting the first three EC digits on a low-homologous enzyme dataset. We further demonstrate that our method outperforms the methods which do not take account of hierarchical relationship among enzyme categories and alternative methods which incorporate prior knowledge about inter-class relationships.<\/jats:p>\n          <\/jats:sec>\n          <jats:sec>\n            <jats:title>Conclusions<\/jats:title>\n            <jats:p>Our structure-based prediction model, SVMHL with the CTF, reduces the computational complexity and outperforms the alternative approaches in enzyme function prediction. Therefore our new method will be a useful tool for enzyme function prediction community.<\/jats:p>\n          <\/jats:sec>","DOI":"10.1186\/1752-0509-5-s1-s6","type":"journal-article","created":{"date-parts":[[2011,8,3]],"date-time":"2011-08-03T07:09:11Z","timestamp":1312355351000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":33,"title":["Support vector machine prediction of enzyme function with conjoint triad feature and hierarchical context"],"prefix":"10.1186","volume":"5","author":[{"given":"Yong-Cui","family":"Wang","sequence":"first","affiliation":[]},{"given":"Yong","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Zhi-Xia","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Nai-Yang","family":"Deng","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2011,6,20]]},"reference":[{"key":"673_CR1","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511790515","volume-title":"Systems biology: properties of reconstructed networks","author":"B Palsson","year":"2006","unstructured":"Palsson B: Systems biology: properties of reconstructed networks. 
2006, Cambridge University Press New York, NY, USA"},{"key":"673_CR2","doi-asserted-by":"publisher","first-page":"304","DOI":"10.1093\/nar\/28.1.304","volume":"28","author":"A Bairoch","year":"2000","unstructured":"Bairoch A: The ENZYME database in 2000. Nucleic Acids Research. 2000, 28: 304-305. 10.1093\/nar\/28.1.304.","journal-title":"Nucleic Acids Research"},{"key":"673_CR3","doi-asserted-by":"publisher","first-page":"863","DOI":"10.1016\/j.jmb.2003.08.057","volume":"333","author":"W Tian","year":"2003","unstructured":"Tian W, Skolnick J: How well is enzyme function conserved as a function of pairwise sequence identity?. Journal of Molecular Biology. 2003, 333: 863-882. 10.1016\/j.jmb.2003.08.057.","journal-title":"Journal of Molecular Biology"},{"key":"673_CR4","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1016\/j.bbrc.2007.09.098","volume":"364","author":"HB Shen","year":"2007","unstructured":"Shen HB, Chou KC: EzyPred: a top-down approach for predicting enzyme functional classes and subclasses. Biochemical and Biophysical Research Communications. 2007, 364: 53-59. 10.1016\/j.bbrc.2007.09.098.","journal-title":"Biochemical and Biophysical Research Communications"},{"key":"673_CR5","volume-title":"Proceedings of the thirteenth ACM international conference on Information and knowledge management","author":"LJ Cai","year":"2004","unstructured":"Cai LJ, Hofmann T: Hierarchical document categorization with support vector machines. Proceedings of the thirteenth ACM international conference on Information and knowledge management. 2004, Washington, D.C., USA"},{"key":"673_CR6","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1021\/pr0255710","volume":"2","author":"KC Chou","year":"2003","unstructured":"Chou KC, Elrod DW: Prediction of enzyme family classes. Journal of Proteome Research. 2003, 2: 183-190. 
10.1021\/pr0255710.","journal-title":"Journal of Proteome Research"},{"issue":"4","key":"673_CR7","doi-asserted-by":"publisher","first-page":"771","DOI":"10.1016\/S0022-2836(03)00628-4","volume":"330","author":"PD Dobson","year":"2003","unstructured":"Dobson PD, Doig AJ: Distinguishing enzyme structures from non-enzymes without alignments. Journal of Molecular Biology. 2003, 330 (4): 771-783. 10.1016\/S0022-2836(03)00628-4.","journal-title":"Journal of Molecular Biology"},{"key":"673_CR8","unstructured":"[http:\/\/www.ebi.ac.uk\/thornton-srv\/databases\/CATRES\/]"},{"key":"673_CR9","doi-asserted-by":"publisher","first-page":"246","DOI":"10.1002\/prot.1035","volume":"43","author":"KC Chou","year":"2001","unstructured":"Chou KC: Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins: Structure Function, and Genetics. 2001, 43: 246-255. 10.1002\/prot.1035.","journal-title":"Proteins: Structure Function, and Genetics"},{"key":"673_CR10","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1093\/bioinformatics\/bth466","volume":"21","author":"KC Chou","year":"2005","unstructured":"Chou KC: Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes. Bioinformatics. 2005, 21: 10-19. 10.1093\/bioinformatics\/bth466.","journal-title":"Bioinformatics"},{"key":"673_CR11","doi-asserted-by":"publisher","first-page":"4337","DOI":"10.1073\/pnas.0607879104","volume":"104","author":"JW Shen","year":"2007","unstructured":"Shen JW, Zhang J, Luo XM, Zhu WL, Yu KQ, Chen KX, Li YX, Jiang HL: Predicting protein-protein interactions based only on sequences information. Proceedings of the National Academy of Sciences. 2007, 104: 4337-4341. 
10.1073\/pnas.0607879104.","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"11","key":"673_CR12","doi-asserted-by":"publisher","first-page":"1441","DOI":"10.2174\/0929866511009011441","volume":"17","author":"YC Wang","year":"2010","unstructured":"Wang YC, Wang XB, Yang ZX, Deng NY: Prediction of enzyme subfamily class via pseudo amino acid composition by incorporating the conjoint triad feature. Protein and Peptide Letters. 2010, 17 (11): 1441-1449.","journal-title":"Protein and Peptide Letters"},{"key":"673_CR13","volume-title":"Support vector machine with fuzzy decision-making for real-world data classification","author":"B Li","year":"2006","unstructured":"Li B, Hu J, Hirasawa K, Sun P, Marko K: Support vector machine with fuzzy decision-making for real-world data classification. 2006, IEEE World Congress on Computational Intelligence Int. Joint Conf. on Neural Networks Canada"},{"key":"673_CR14","first-page":"1601","volume":"7","author":"J Rousu","year":"2006","unstructured":"Rousu J, Saunders C, Szedmak S, Shawe-Taylor J: Kernel-based learning of hierarchical multilabel classification models. The Journal of Machine Learning Research. 2006, 7: 1601-1626.","journal-title":"The Journal of Machine Learning Research"},{"key":"673_CR15","doi-asserted-by":"publisher","DOI":"10.1145\/345508.345593","volume-title":"Hierarchical classification of web content","author":"S Dumais","year":"2000","unstructured":"Dumais S, Chen H: Hierarchical classification of web content. 2000, SIGIR"},{"key":"673_CR16","doi-asserted-by":"publisher","first-page":"S2","DOI":"10.1186\/1753-6561-2-s4-s2","volume":"2","author":"K Astikainen","year":"2008","unstructured":"Astikainen K, Holm L, Pitk\u00e4nen E, Szedmak S, Rousu J: Towards structured output prediction of enzyme function. BMC Proceedings. 2008, 2: S2-. 10.1186\/1753-6561-2-s4-s2.","journal-title":"BMC Proceedings"},{"key":"673_CR17","volume-title":"Tech. 
rep., Pascal Research Reports","author":"S Szedmak","year":"2005","unstructured":"Szedmak S, Shawe-Taylor J, Parado-Hernandez E: Learning via linear operators: maximum margin regression. Tech. rep., Pascal Research Reports. 2005"},{"key":"673_CR18","volume-title":"Proceedings of the 25th International Conference on Machine Learning","author":"S Sarawagi","year":"2008","unstructured":"Sarawagi S, Gupta R: Accurate max-margin training for structured output spaces. Proceedings of the 25th International Conference on Machine Learning. 2008, Helsinki, Finland"},{"issue":"11","key":"673_CR19","doi-asserted-by":"publisher","first-page":"707","DOI":"10.1093\/protein\/gzp055","volume":"22","author":"XB Wang","year":"2009","unstructured":"Wang XB, Wu LY, Wang YC, Deng NY: Prediction of palmitoylation sites using the composition of k-spaced amino acid pairs. Protein Engineering Design and Selection. 2009, 22 (11): 707-712. 10.1093\/protein\/gzp055.","journal-title":"Protein Engineering Design and Selection"},{"key":"673_CR20","volume-title":"The Pacific Symposium on Biocomputing","author":"GRG Lanckriet","year":"2004","unstructured":"Lanckriet GRG, Deng M, Cristianini N, Jordan MI, Noble WS: Kernel-based data fusion and its application to protein function prediction in yeast. The Pacific Symposium on Biocomputing. 2004"},{"key":"673_CR21","doi-asserted-by":"crossref","unstructured":"Guan Y, Myers C, Hess D, Barutcuoglu Z, Caudy A, Troyanskaya O: Predicting gene function in a hierarchical context with an ensemble of classifiers. Genome Biology. 2008, 9 (S3):","DOI":"10.1186\/gb-2008-9-s1-s3"},{"key":"673_CR22","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1109\/TPAMI.2006.17","volume":"28","author":"OL Mangasarian","year":"2006","unstructured":"Mangasarian OL, Wild EW: Multisurface proximal support vector machine classification via generalized eigenvalues. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2006, 28: 69-74. 
10.1109\/TPAMI.2006.17.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"673_CR23","doi-asserted-by":"publisher","first-page":"905","DOI":"10.1109\/TPAMI.2007.1068","volume":"29","author":"R Khemchandani Jayadeva","year":"2007","unstructured":"Jayadeva Khemchandani R, Chandra S: Twin support vector machines for pattern classification. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2007, 29: 905-910. 10.1109\/TPAMI.2007.1068.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"673_CR24","doi-asserted-by":"publisher","first-page":"510","DOI":"10.1016\/j.sigpro.2008.10.002","volume":"89","author":"S Ghorai","year":"2008","unstructured":"Ghorai S, Mukherjee A, Dutta PK: Nonparallel plane proximal classifier. Signal Processing. 2008, 89: 510-522. 10.1016\/j.sigpro.2008.10.002.","journal-title":"Signal Processing"},{"key":"673_CR25","doi-asserted-by":"publisher","first-page":"63","DOI":"10.4236\/ns.2009.12011","volume":"2","author":"KC Chou","year":"2009","unstructured":"Chou KC, Shen HB: Review: recent advances in developing web-servers for predicting protein attributes. Natural Science. 2009, 2: 63-92. 10.4236\/ns.2009.12011.","journal-title":"Natural Science"},{"key":"673_CR26","volume-title":"A practical guide to support vector classification","author":"CW Hsu","year":"2007","unstructured":"Hsu CW, Chang CC, Lin CJ: A practical guide to support vector classification. 2007, [http:\/\/www.csie.ntu.edu.tw\/~~cjlin]"},{"key":"673_CR27","doi-asserted-by":"publisher","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","volume":"405","author":"BW Matthews","year":"1975","unstructured":"Matthews BW: Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochimica et Biophysica Acta. 
1975, 405: 442-451.","journal-title":"Biochimica et Biophysica Acta"},{"key":"673_CR28","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1016\/j.jtbi.2007.01.016","volume":"247","author":"X Pu","year":"2007","unstructured":"Pu X, Guo J, Leunga H, Lin YL: Prediction of membrane protein types from sequences and position-specific scoring matrices. Journal of Theoretical Biology. 2007, 247: 259-265. 10.1016\/j.jtbi.2007.01.016.","journal-title":"Journal of Theoretical Biology"}],"container-title":["BMC Systems Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/1752-0509-5-S1-S6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,9,1]],"date-time":"2021-09-01T14:28:15Z","timestamp":1630506495000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcsystbiol.biomedcentral.com\/articles\/10.1186\/1752-0509-5-S1-S6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,6,20]]},"references-count":28,"journal-issue":{"issue":"S1","published-print":{"date-parts":[[2011,12]]}},"alternative-id":["673"],"URL":"https:\/\/doi.org\/10.1186\/1752-0509-5-s1-s6","relation":{},"ISSN":["1752-0509"],"issn-type":[{"value":"1752-0509","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,6,20]]},"assertion":[{"value":"20 June 2011","order":1,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"S6"}}