{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T08:23:20Z","timestamp":1772180600614,"version":"3.50.1"},"reference-count":51,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2024,7,29]],"date-time":"2024-07-29T00:00:00Z","timestamp":1722211200000},"content-version":"vor","delay-in-days":4,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100012547","name":"Natural Science Foundation of Guangxi Province","doi-asserted-by":"publisher","award":["2020GXNSFAA159074"],"award-info":[{"award-number":["2020GXNSFAA159074"]}],"id":[{"id":"10.13039\/100012547","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61862006"],"award-info":[{"award-number":["61862006"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62362004"],"award-info":[{"award-number":["62362004"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,7,25]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>The annotation of enzyme function is a fundamental challenge in industrial biotechnology and pathologies. Numerous computational methods have been proposed to predict enzyme function by annotating enzyme labels with Enzyme Commission number. However, the existing methods face difficulties in modelling the hierarchical structure of enzyme label in a global view. Moreover, they haven\u2019t gone entirely to leverage the mutual interactions between different levels of enzyme label. 
In this paper, we formulate the hierarchy of enzyme labels as a directed enzyme graph and propose a hierarchy-GCN (Graph Convolutional Network) encoder to globally model enzyme label dependencies on this graph. Based on this encoder, we develop an end-to-end hierarchical-aware global model named GloEC to predict enzyme function. GloEC learns hierarchical-aware enzyme label embeddings via the hierarchy-GCN encoder and conducts deductive fusion of label-aware enzyme features to predict enzyme labels. The hierarchy-GCN encoder computes bidirectionally, capturing enzyme label correlations in both bottom-up and top-down manners, which has not been explored in enzyme function prediction. Comparative experiments on three benchmark datasets show that GloEC achieves better predictive performance than existing methods. Case studies also demonstrate that GloEC can effectively predict the function of isoenzymes. 
GloEC is available at: https:\/\/github.com\/hyr0771\/GloEC.<\/jats:p>","DOI":"10.1093\/bib\/bbae365","type":"journal-article","created":{"date-parts":[[2024,7,29]],"date-time":"2024-07-29T23:31:54Z","timestamp":1722295914000},"source":"Crossref","is-referenced-by-count":10,"title":["GloEC: a hierarchical-aware global model for predicting enzyme function"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6289-8766","authenticated-orcid":false,"given":"Yiran","family":"Huang","sequence":"first","affiliation":[{"name":"Guangxi University School of Computer, Electronics and Information, , Nanning 530004, China"},{"name":"Guangxi University Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, , Nanning 530004, China"},{"name":"Guangxi University Guangxi Key Laboratory of Multimedia Communications and Network Technology, , Nanning 530004, China"}]},{"given":"Yufu","family":"Lin","sequence":"additional","affiliation":[{"name":"Guangxi University School of Computer, Electronics and Information, , Nanning 530004, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5839-7504","authenticated-orcid":false,"given":"Wei","family":"Lan","sequence":"additional","affiliation":[{"name":"Guangxi University School of Computer, Electronics and Information, , Nanning 530004, China"},{"name":"Guangxi University Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, , Nanning 530004, China"},{"name":"Guangxi University Guangxi Key Laboratory of Multimedia Communications and Network Technology, , Nanning 530004, China"}]},{"given":"Cuiyu","family":"Huang","sequence":"additional","affiliation":[{"name":"Nankai University College of Chemistry, Tianjin Key Laboratory of Biosensing and Molecular Recognition, , Tianjin 300071, China"}]},{"given":"Cheng","family":"Zhong","sequence":"additional","affiliation":[{"name":"Guangxi University School of Computer, 
Electronics and Information, , Nanning 530004, China"},{"name":"Guangxi University Key Laboratory of Parallel, Distributed and Intelligent Computing in Guangxi Universities and Colleges, , Nanning 530004, China"},{"name":"Guangxi University Guangxi Key Laboratory of Multimedia Communications and Network Technology, , Nanning 530004, China"}]}],"member":"286","published-online":{"date-parts":[[2024,7,29]]},"reference":[{"key":"2024072915225262100_ref1","doi-asserted-by":"crossref","DOI":"10.1016\/j.compbiolchem.2021.107558","article-title":"ABLE: attention based learning for enzyme classification","volume":"94","author":"Nallapareddy","year":"2021","journal-title":"Comput Biol Chem"},{"key":"2024072915225262100_ref2","doi-asserted-by":"crossref","first-page":"D54","DOI":"10.1093\/nar\/gkr854","article-title":"The sequence read archive: explosive growth of sequencing data","volume":"40","author":"Kodama","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2024072915225262100_ref3","doi-asserted-by":"crossref","first-page":"34","DOI":"10.1186\/s12859-024-05662-4","article-title":"Exploring gene-patient association to identify personalized cancer driver genes by linear neighborhood propagation","volume":"25","author":"Huang","year":"2024","journal-title":"BMC Bioinformatics"},{"key":"2024072915225262100_ref4","doi-asserted-by":"crossref","first-page":"2159","DOI":"10.1109\/TCBB.2023.3234331","article-title":"NetPro: neighborhood interaction-based drug repositioning via label propagation","volume":"20","author":"Huang","year":"2023","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2024072915225262100_ref5","doi-asserted-by":"crossref","first-page":"3173","DOI":"10.1109\/TCBB.2023.3284505","article-title":"Predicting disease-associated N7\u2013methylguanosine (m7G) sites via random walk on heterogeneous network","volume":"20","author":"Huang","year":"2023","journal-title":"IEEE\/ACM Trans Comput Biol 
Bioinform"},{"key":"2024072915225262100_ref6","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1093\/nar\/28.1.45","article-title":"The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000","volume":"28","author":"Bairoch","year":"2000","journal-title":"Nucleic Acids Res"},{"key":"2024072915225262100_ref7","doi-asserted-by":"crossref","first-page":"74","DOI":"10.1016\/j.pisc.2014.02.006","article-title":"Current IUBMB recommendations on enzyme nomenclature and kinetics","volume":"1","author":"Cornish-Bowden","year":"2014","journal-title":"Perspect Sci"},{"key":"2024072915225262100_ref8","doi-asserted-by":"crossref","first-page":"540","DOI":"10.2174\/1389450119666181002143355","article-title":"A survey for predicting enzyme family classes using machine learning methods","volume":"20","author":"Tan","year":"2019","journal-title":"Curr Drug Targets"},{"key":"2024072915225262100_ref9","doi-asserted-by":"crossref","first-page":"89802","DOI":"10.1109\/ACCESS.2020.2992468","article-title":"The classification of enzymes by deep learning","volume":"8","author":"Tao","year":"2020","journal-title":"IEEE Access"},{"key":"2024072915225262100_ref10","doi-asserted-by":"crossref","first-page":"5389","DOI":"10.3390\/ijms20215389","article-title":"Alignment-free method to predict enzyme classes and subclasses","volume":"20","author":"Concu","year":"2019","journal-title":"Int J Mol Sci"},{"key":"2024072915225262100_ref11","doi-asserted-by":"crossref","first-page":"1358","DOI":"10.1126\/science.adf2465","article-title":"Enzyme function prediction using contrastive learning","volume":"379","author":"Yu","year":"2023","journal-title":"Science"},{"key":"2024072915225262100_ref12","doi-asserted-by":"crossref","first-page":"15384","DOI":"10.3390\/ijms160715384","article-title":"An overview of practical applications of protein disorder prediction and drive for faster, more accurate predictions","volume":"16","author":"Deng","year":"2015","journal-title":"Int J Mol 
Sci"},{"key":"2024072915225262100_ref13","doi-asserted-by":"crossref","first-page":"760","DOI":"10.1093\/bioinformatics\/btx680","article-title":"DEEPre: sequence-based enzyme EC number prediction by deep learning","volume":"34","author":"Li","year":"2018","journal-title":"Bioinformatics"},{"key":"2024072915225262100_ref14","doi-asserted-by":"crossref","first-page":"13996","DOI":"10.1073\/pnas.1821905116","article-title":"Deep learning enables high-quality and high-throughput prediction of Enzyme Commission numbers","volume":"116","author":"Ryu","year":"2019","journal-title":"Proc Natl Acad Sci"},{"key":"2024072915225262100_ref15","doi-asserted-by":"crossref","first-page":"4583","DOI":"10.1093\/bioinformatics\/btaa536","article-title":"HECNet: a hierarchical approach to enzyme function classification using a Siamese Triplet Network","volume":"36","author":"Memon","year":"2020","journal-title":"Bioinformatics"},{"key":"2024072915225262100_ref16","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-319-24261-3_7","volume-title":"Deep Metric Learning Using Triplet Network","author":"Hoffer","year":"2015"},{"key":"2024072915225262100_ref17","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1007\/978-3-642-24797-2_4","article-title":"Long short-term memory","volume-title":"Supervised Sequence Labelling with Recurrent Neural Networks","author":"Graves","year":"2012"},{"key":"2024072915225262100_ref18","doi-asserted-by":"crossref","first-page":"W291","DOI":"10.1093\/nar\/gkx366","article-title":"COFACTOR: improved protein function prediction by combining structure, sequence and protein\u2013protein interaction information","volume":"45","author":"Zhang","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2024072915225262100_ref19","doi-asserted-by":"crossref","first-page":"e80942","DOI":"10.7554\/eLife.80942","article-title":"ProteInfer, deep neural networks for protein functional 
inference","volume":"12","author":"Sanderson","year":"2023","journal-title":"Elife"},{"key":"2024072915225262100_ref20","doi-asserted-by":"crossref","first-page":"2401","DOI":"10.1093\/bioinformatics\/btaa003","article-title":"UDSMProt: universal deep sequence models for protein classification","volume":"36","author":"Strodthoff","year":"2020","journal-title":"Bioinformatics"},{"key":"2024072915225262100_ref21","article-title":"An interpretable double-scale attention model for enzyme protein class prediction based on transformer encoders and multi-scale convolutions","volume":"13","author":"Lin","year":"2022","journal-title":"Front Genet"},{"key":"2024072915225262100_ref22","article-title":"ECRECer: Enzyme Commission Number Recommendation and Benchmarking based on multiagent dual-core learning","author":"Shi","year":"2022","journal-title":"arXiv:2202.03632"},{"key":"2024072915225262100_ref23","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1186\/s12859-024-05665-1","article-title":"PredictEFC: a fast and efficient multi-label classifier for predicting enzyme family classes","volume":"25","author":"Chen","year":"2024","journal-title":"BMC Bioinformatics"},{"key":"2024072915225262100_ref24","doi-asserted-by":"crossref","first-page":"1079","DOI":"10.1109\/TKDE.2010.164","article-title":"Random k-Labelsets for multilabel classification","volume":"23","author":"Tsoumakas","year":"2011","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"2024072915225262100_ref25","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"2024072915225262100_ref26","doi-asserted-by":"crossref","first-page":"726","DOI":"10.1016\/j.jmb.2015.11.006","article-title":"BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome 
sequences","volume":"428","author":"Kanehisa","year":"2016","journal-title":"J Mol Biol"},{"key":"2024072915225262100_ref27","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches","volume":"31","author":"Suzek","year":"2015","journal-title":"Bioinformatics"},{"key":"2024072915225262100_ref28","doi-asserted-by":"crossref","first-page":"W471","DOI":"10.1093\/nar\/gks372","article-title":"COFACTOR: an accurate comparative algorithm for structure-based protein function annotation","volume":"40","author":"Roy","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2024072915225262100_ref29","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1007\/978-1-4939-8672-9_2","volume-title":"Lipases and Phospholipases: Methods and Protocols","author":"Armend\u00e1riz-Ruiz","year":"2018"},{"key":"2024072915225262100_ref30","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2020.acl-main.104","volume-title":"Hierarchy-Aware Global Model for Hierarchical Text Classification","author":"Zhou","year":"2020"},{"key":"2024072915225262100_ref31","doi-asserted-by":"crossref","first-page":"e2016239118","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc Natl Acad Sci"},{"key":"2024072915225262100_ref32","article-title":"Layer normalization","author":"Lei Ba","year":"2016"},{"key":"2024072915225262100_ref33","article-title":"Attention is all you need","author":"Vaswani","year":"2017"},{"key":"2024072915225262100_ref34","article-title":"Semi-supervised classification with Graph Convolutional 
Networks","author":"Kipf","year":"2016"},{"key":"2024072915225262100_ref35","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1007\/s10479-005-5724-z","article-title":"A tutorial on the cross-entropy method","volume":"134","author":"de Boer","year":"2005","journal-title":"Ann Oper Res"},{"key":"2024072915225262100_ref36","first-page":"1063","article-title":"Large-scale hierarchical text classification with recursively regularized deep Graph-CNN","volume-title":"Proceedings of the 2018 World Wide Web Conference. 2018, International World Wide Web Conferences Steering Committee","author":"Peng","year":"2018"},{"key":"2024072915225262100_ref37","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J Mach Learn Res"},{"key":"2024072915225262100_ref38","article-title":"A method for stochastic optimization","author":"Kingma","year":"2014"},{"key":"2024072915225262100_ref39","article-title":"Macro F1 and macro F1","author":"Opitz","year":"2019"},{"key":"2024072915225262100_ref40","doi-asserted-by":"crossref","first-page":"D506","DOI":"10.1093\/nar\/gky1049","article-title":"UniProt: a worldwide hub of protein knowledge","volume":"47","author":"UniProt Consortium","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2024072915225262100_ref41","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1073\/pnas.45.5.753","article-title":"Multiple forms of enzymes: tissue, ontogenetic, and species specific patterns","volume":"45","author":"Markert","year":"1959","journal-title":"Proc Natl Acad Sci"},{"key":"2024072915225262100_ref42","doi-asserted-by":"crossref","first-page":"542","DOI":"10.1038\/s41579-022-00712-1","article-title":"Carbohydrate-active enzymes (CAZymes) in the gut microbiome","volume":"20","author":"Wardman","year":"2022","journal-title":"Nat Rev 
Microbiol"},{"key":"2024072915225262100_ref43","doi-asserted-by":"crossref","first-page":"D571","DOI":"10.1093\/nar\/gkab1045","article-title":"The carbohydrate-active enzyme database: functions and literature","volume":"50","author":"Drula","year":"2022","journal-title":"Nucleic Acids Res"},{"key":"2024072915225262100_ref44","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1093\/nar\/gkg095","article-title":"The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003","volume":"31","author":"Boeckmann","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2024072915225262100_ref45","doi-asserted-by":"crossref","DOI":"10.1109\/ICCV.2019.00936","article-title":"DeepGCNs: can GCNs go as Deep as CNNs?","volume-title":"2019 IEEE\/CVF International Conference on Computer Vision (ICCV)","author":"Li","year":"2019"},{"key":"2024072915225262100_ref46","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1111\/tpj.13415","article-title":"Araport11: a complete reannotation of the Arabidopsis thaliana reference genome","volume":"89","author":"Cheng","year":"2017","journal-title":"Plant J"},{"key":"2024072915225262100_ref47","doi-asserted-by":"crossref","first-page":"2185","DOI":"10.1126\/science.287.5461.2185","article-title":"The genome sequence of Drosophila melanogaster","volume":"287","author":"Adams","year":"2000","journal-title":"Science"},{"key":"2024072915225262100_ref48","doi-asserted-by":"crossref","first-page":"1174","DOI":"10.1016\/j.cell.2010.12.001","article-title":"A tissue-specific atlas of mouse protein phosphorylation and expression","volume":"143","author":"Huttlin","year":"2010","journal-title":"Cell"},{"key":"2024072915225262100_ref49","doi-asserted-by":"crossref","first-page":"883","DOI":"10.3390\/cells10040883","article-title":"Muscle glycogen phosphorylase and its functional partners in health and 
disease","volume":"10","author":"Migocka-Patrza\u0142ek","year":"2021","journal-title":"Cells"},{"key":"2024072915225262100_ref50","doi-asserted-by":"crossref","first-page":"10450","DOI":"10.3390\/ijms221910450","article-title":"Molecular functions and pathways of plastidial starch phosphorylase (PHO1) in starch metabolism: current and future perspectives","volume":"22","author":"Shoaib","year":"2021","journal-title":"Int J Mol Sci"},{"key":"2024072915225262100_ref51","doi-asserted-by":"crossref","first-page":"3935","DOI":"10.3390\/molecules25173935","article-title":"Fatty acid synthase: an emerging target in cancer","volume":"25","author":"Fhu","year":"2020","journal-title":"Molecules"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/5\/bbae365\/58677145\/bbae365.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/5\/bbae365\/58677145\/bbae365.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,29]],"date-time":"2024-07-29T23:32:18Z","timestamp":1722295938000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae365\/7723315"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,25]]},"references-count":51,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,7,25]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae365","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,9]]},"published":{"date-parts":[[2024,7,25]]},"article-number":"bbae365"}}