{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,12]],"date-time":"2025-11-12T13:56:43Z","timestamp":1762955803125,"version":"build-2065373602"},"reference-count":38,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2017,12,8]],"date-time":"2017-12-08T00:00:00Z","timestamp":1512691200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the Fundamental Research Funds for the Central Universities","award":["HIT NSRIF.20169"],"award-info":[{"award-number":["HIT NSRIF.20169"]}]},{"name":"the Heilongjiang Postdoctoral Fund","award":["LBH-Z16081"],"award-info":[{"award-number":["LBH-Z16081"]}]},{"name":"the Online Education Research Funds of Online Education Research Center of Ministry of Education (Quantong Education)","award":["2016YB132"],"award-info":[{"award-number":["2016YB132"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Gene function prediction is a complicated and challenging hierarchical multi-label classification (HMC) task, in which genes may have many functions at the same time and these functions are organized in a hierarchy. This paper proposed a novel HMC algorithm for solving this problem based on the Gene Ontology (GO), the hierarchy of which is a directed acyclic graph (DAG) and is more difficult to tackle. In the proposed algorithm, the HMC task is firstly changed into a set of binary classification tasks. Then, two measures are implemented in the algorithm to enhance the HMC performance by considering the hierarchy structure during the learning procedures. Firstly, negative instances selecting policy associated with the SMOTE approach are proposed to alleviate the imbalanced data set problem. Secondly, a nodes interaction method is introduced to combine the results of binary classifiers. 
It can guarantee that the predictions are consistent with the hierarchy constraint. The experiments on eight benchmark yeast data sets annotated by the Gene Ontology show the promising performance of the proposed algorithm compared with other state-of-the-art algorithms.<\/jats:p>","DOI":"10.3390\/a10040138","type":"journal-article","created":{"date-parts":[[2017,12,8]],"date-time":"2017-12-08T11:37:40Z","timestamp":1512733060000},"page":"138","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["A Hierarchical Multi-Label Classification Algorithm for Gene Function Prediction"],"prefix":"10.3390","volume":"10","author":[{"given":"Shou","family":"Feng","sequence":"first","affiliation":[{"name":"Department of Automatic Test and Control, Harbin Institute of Technology, Harbin 150080, China"}]},{"given":"Ping","family":"Fu","sequence":"additional","affiliation":[{"name":"Department of Automatic Test and Control, Harbin Institute of Technology, Harbin 150080, China"}]},{"given":"Wenbin","family":"Zheng","sequence":"additional","affiliation":[{"name":"Department of Automatic Test and Control, Harbin Institute of Technology, Harbin 150080, China"}]}],"member":"1968","published-online":{"date-parts":[[2017,12,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Madjarov, G., Dimitrovski, I., Gjorgjevikj, D., and D\u017eeroski, S. (2014). Evaluation of Different Data-Derived Label Hierarchies in Multi-Label Classification, Springer.","DOI":"10.1007\/978-3-319-17876-9_2"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/coin.12011","article-title":"An Extensive Evaluation of Decision Tree\u2014Based Hierarchical Multilabel Classification Methods and Performance Measures","volume":"31","author":"Cerri","year":"2013","journal-title":"Comput. Intell."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Rom\u00e3o, L.M., and Nievola, J.C. 
(2015, January 3\u20135). Hierarchical Multi-label Classification Problems: An LCS Approach. Proceedings of the 12th International Conference on Distributed Computing and Artificial Intelligence, Salamanca, Spain.","DOI":"10.1007\/978-3-319-19638-1_11"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Blockeel, H., Schietgat, L., Struyf, J., D\u017eeroski, S., and Clare, A. (2006, January 18\u201322). Decision trees for hierarchical multilabel classification: A case study in functional genomics. Proceedings of the 10th European Conference on Principle and Practice of Knowledge Discovery in Databases, Berlin, Germany.","DOI":"10.1007\/11871637_7"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2907","DOI":"10.1109\/TKDE.2015.2441707","article-title":"Bayes-Optimal Hierarchical Multilabel Classification","volume":"27","author":"Bi","year":"2015","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_6","unstructured":"Merschmann, L.H.D.C., and Freitas, A.A. (2013). An Extended Local Hierarchical Classifier for Prediction of Protein and Gene Functions, Springer."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat. Genet."},{"key":"ref_8","unstructured":"Alves, R.T., Delgado, M.R., and Freitas, A.A. (2008). Multi-Label Hierarchical Classification of Protein Functions with Artificial Immune Systems, Springer."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Santos, A., and Canuto, A. (2014, January 6\u201311). Applying semi-supervised learning in hierarchical multi-label classification. Proceedings of the 2014 International Joint Conference on Neural Networks (IJCNN), Beijing, China.","DOI":"10.1109\/IJCNN.2014.6889565"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Cerri, R., Barros, R.C., and de Carvalho, A. 
(2011, January 22\u201324). Hierarchical Multi-Label Classification for Protein Function Prediction: A Local Approach based on Neural Networks. Proceedings of the 11th International Conference on Intelligent Systems Design and Applications (ISDA), Cordoba, Spain.","DOI":"10.1109\/ISDA.2011.6121678"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Ram\u00edrez-Corona, M., Sucar, L.E., and Morales, E.F. (2014). Multi-Label Classification for Tree and Directed Acyclic Graphs Hierarchies, Springer.","DOI":"10.1007\/978-3-319-11433-0_27"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Alves, R.T., Delgado, M.R., and Freitas, A.A. (2010, January 18\u201323). Knowledge discovery with Artificial Immune Systems for hierarchical multi-label classification of protein functions. Proceedings of the 2010 IEEE International Conference on Fuzzy Systems (FUZZ), Barcelona, Spain.","DOI":"10.1109\/FUZZY.2010.5584298"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1007\/s10994-008-5077-3","article-title":"Decision trees for hierarchical multi-label classification","volume":"73","author":"Vens","year":"2008","journal-title":"Mach. Learn."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Borges, H.B., and Nievola, J.C. (2012, January 10\u201315). Multi-Label Hierarchical Classification using a Competitive Neural Network for protein function prediction. Proceedings of the International Joint Conference on Neural Networks, Brisbane, Australia.","DOI":"10.1109\/IJCNN.2012.6252736"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Chen, B., Duan, L., and Hu, J. (2012, January 10\u201315). Composite kernel based SVM for hierarchical multi-label gene function classification. 
Proceedings of the International Joint Conference on Neural Networks, Brisbane, Australia.","DOI":"10.1109\/IJCNN.2012.6252555"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"830","DOI":"10.1093\/bioinformatics\/btk048","article-title":"Hierarchical multi-label prediction of gene function","volume":"22","author":"Barutcuoglu","year":"2006","journal-title":"Bioinformatics"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"832","DOI":"10.1109\/TCBB.2010.38","article-title":"True Path Rule Hierarchical Ensembles for Genome-Wide Gene Function Prediction","volume":"8","author":"Valentini","year":"2011","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1007\/978-3-319-20248-8_2","article-title":"A Hierarchical Ensemble Method for DAG-Structured Taxonomies","volume":"Volume 9132","author":"Robinson","year":"2015","journal-title":"Lecture Notes in Computer Science"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"165","DOI":"10.1007\/s12293-010-0045-4","article-title":"A hierarchical multi-label classification ant colony algorithm for protein function prediction","volume":"2","author":"Otero","year":"2010","journal-title":"Memet. Comput."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"3955","DOI":"10.1186\/1471-2105-14-285","article-title":"Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction","volume":"14","author":"Stojanova","year":"2013","journal-title":"BMC Bioinform."},{"key":"ref_21","first-page":"64","article-title":"Pitfalls of Ascertainment Biases in Genome Annotations\u2014Computing Comparable Protein Domain Distributions in Eukarya","volume":"10","author":"Parikesit","year":"2014","journal-title":"Malays. J. Fundam. Appl. 
Sci."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic minority over-sampling technique","volume":"16","author":"Chawla","year":"2002","journal-title":"J. Artif. Intell. Res."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1007\/s10618-010-0175-9","article-title":"A survey of hierarchical classification across different application domains","volume":"Volume 22","author":"Silla","year":"2011","journal-title":"Data Mining & Knowledge Discovery"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"179","DOI":"10.1016\/j.ijar.2015.07.008","article-title":"Hierarchical multilabel classification based on path evaluation","volume":"68","author":"Sucar","year":"2016","journal-title":"Int. J. Approx. Reason."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"843","DOI":"10.3233\/IDA-2011-0499","article-title":"Irrelevant attributes and imbalanced classes in multi-label text-categorization domains","volume":"15","author":"Dendamrongvit","year":"2011","journal-title":"Intell. Data Anal."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"191","DOI":"10.1016\/j.dss.2009.07.011","article-title":"On strategies for imbalanced text classification using SVM: A comparative study","volume":"48","author":"Sun","year":"2009","journal-title":"Decis. Support Syst."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1007\/s10994-007-5018-6","article-title":"A note on Platt\u2019s probabilistic outputs for support vector machines","volume":"68","author":"Lin","year":"2007","journal-title":"Mach. Learn."},{"key":"ref_28","first-page":"1","article-title":"Hierarchical ensemble methods for protein function prediction","volume":"2014","author":"Valentini","year":"2014","journal-title":"Int. Sch. Res. 
Not."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"8348","DOI":"10.1073\/pnas.0832373100","article-title":"A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae)","volume":"100","author":"Troyanskaya","year":"2003","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Li, H., Liu, C., B\u00fcrge, L., Ko, K.D., and Southerland, W. (2012, January 4\u20137). Predicting protein-protein interactions using full Bayesian network. Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine Workshops, Philadelphia, PA, USA.","DOI":"10.1109\/BIBMW.2012.6470198"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"ii42","DOI":"10.1093\/bioinformatics\/btg1058","article-title":"Predicting gene function in Saccharomyces cerevisiae","volume":"19","author":"Clare","year":"2003","journal-title":"Bioinformatics"},{"key":"ref_32","unstructured":"Bi, W., and Kwok, J.T. (July, January 28). MultiLabel Classification on Tree- and DAG-Structured Hierarchies. Proceedings of the 28th International Conference on International Conference on Machine Learning, Bellevue, WA, USA."},{"key":"ref_33","first-page":"896","article-title":"Gene function prediction based on the Gene Ontology hierarchical structure","volume":"9","author":"Liangxi","year":"2013","journal-title":"PLoS ONE"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1038\/nmeth.2340","article-title":"A large-scale evaluation of computational protein function prediction","volume":"10","author":"Radivojac","year":"2013","journal-title":"Nat. Methods"},{"key":"ref_35","unstructured":"Aleksovski, D., Kocev, D., and Dzeroski, S. (2009, January 7). Evaluation of distance measures for hierarchical multilabel classification in functional genomics. 
Proceedings of the 1st Workshop on Learning from Multi-Label Data (MLD), Bled, Slovenia."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Chen, Y., Li, Z., Hu, X., and Liu, J. (2010, January 19\u201321). Hierarchical Classification with Dynamic-Threshold SVM Ensemble for Gene Function Prediction. Proceedings of the 6th International Conference on Advanced Data Mining and Applications (ADMA), Chongqing, China.","DOI":"10.1007\/978-3-642-17313-4_33"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"717","DOI":"10.3233\/IDA-140665","article-title":"Hierarchical multi-label classification with SVMs: A case study in gene function prediction","volume":"18","author":"Vateekul","year":"2014","journal-title":"Intell. Data Anal."},{"key":"ref_38","unstructured":"Alaydie, N., Reddy, C.K., and Fotouhi, F. (June, January 29). Exploiting Label Dependency for Hierarchical Multi-label Classification. Proceedings of the 16th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, Kuala Lumpur, Malaysia."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/10\/4\/138\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T18:53:07Z","timestamp":1760208787000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/10\/4\/138"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,12,8]]},"references-count":38,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2017,12]]}},"alternative-id":["a10040138"],"URL":"https:\/\/doi.org\/10.3390\/a10040138","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2017,12,8]]}}}