{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,15]],"date-time":"2026-03-15T05:06:52Z","timestamp":1773551212504,"version":"3.50.1"},"reference-count":49,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2025,6,26]],"date-time":"2025-06-26T00:00:00Z","timestamp":1750896000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62272094"],"award-info":[{"award-number":["62272094"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100005046","name":"Heilongjiang Provincial Natural Science Foundation of China","doi-asserted-by":"crossref","award":["LH2022F002"],"award-info":[{"award-number":["LH2022F002"]}],"id":[{"id":"10.13039\/501100005046","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Protein function prediction is important for drug development and disease treatment. Recently, deep learning methods have leveraged protein sequence and structural information, achieving remarkable progress in the field of protein function prediction. However, existing methods ignore the complex multimodal interaction information between sequence and structural features. Since protein sequence and structural information reveal the functional characteristics of proteins from different perspectives, it is challenging to effectively fuse the information from these two modalities to portray protein functions more comprehensively. 
In addition, current methods have difficulty in effectively capturing long-range dependencies and global contextual information in protein sequences during feature extraction, thus limiting the ability of the model to recognize critical functional residues.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>In this study, we propose a novel framework termed Multi-stage Attention-based Extraction and Fusion model for GO prediction (MAEF-GO) based on a multistage attention mechanism to predict protein functions. MAEF-GO innovatively integrates the graph convolutional network and the graph attention network to extract protein structural features. To address the issue of modeling long-range dependencies within protein sequences, we introduce a frequency-domain attention mechanism capable of extracting global contextual relationships. Additionally, a cross-attention module is implemented to facilitate interactive fusion between protein sequence and structural modalities. Experimental evaluations demonstrate that MAEF-GO achieves superior performance compared to several state-of-the-art baseline models across standard benchmarks. Furthermore, analysis of the cross-attention weight distributions demonstrates MAEF-GO\u2019s interpretability. 
It can effectively identify critical functional residues of proteins.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The MAEF-GO source code can be found at https:\/\/github.com\/nebstudio\/MAEF-GO, an archived snapshot of the code used in this study is also available via Zenodo at https:\/\/doi.org\/10.5281\/zenodo.15422392.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btaf374","type":"journal-article","created":{"date-parts":[[2025,6,26]],"date-time":"2025-06-26T09:56:36Z","timestamp":1750931796000},"source":"Crossref","is-referenced-by-count":2,"title":["Multistage attention-based extraction and fusion of protein sequence and structural features for protein function prediction"],"prefix":"10.1093","volume":"41","author":[{"given":"Meiling","family":"Liu","sequence":"first","affiliation":[{"name":"College of Computer and Control Engineering, Northeast Forestry University , Harbin, 150040,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-4467-9640","authenticated-orcid":false,"given":"Shuangshuang","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer and Control Engineering, Northeast Forestry University , Harbin, 150040,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6650-9975","authenticated-orcid":false,"given":"Zeyu","family":"Luo","sequence":"additional","affiliation":[{"name":"College of Computer and Control Engineering, Northeast Forestry University , Harbin, 150040,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7381-2374","authenticated-orcid":false,"given":"Guohua","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer and Control Engineering, Northeast Forestry University , Harbin, 
150040,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7219-0999","authenticated-orcid":false,"given":"Yuming","family":"Zhao","sequence":"additional","affiliation":[{"name":"College of Computer and Control Engineering, Northeast Forestry University , Harbin, 150040,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2025,6,26]]},"reference":[{"key":"2025072417170237100_btaf374-B1","doi-asserted-by":"crossref","first-page":"493","DOI":"10.1038\/s41586-024-07487-w","article-title":"Accurate structure prediction of biomolecular interactions with AlphaFold 3","volume":"630","author":"Abramson","year":"2024","journal-title":"Nature"},{"key":"2025072417170237100_btaf374-B2","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1186\/s12859-019-2932-0","article-title":"ProteinNet: a standardized data set for machine learning of protein structure","volume":"20","author":"AlQuraishi","year":"2019","journal-title":"Bmc Bioinformatics"},{"key":"2025072417170237100_btaf374-B3","doi-asserted-by":"crossref","first-page":"D506","DOI":"10.1093\/nar\/gky1049","article-title":"UniProt: a worldwide hub of protein knowledge","volume":"47","author":"Bateman","year":"2019","journal-title":"Nucl Acids Res"},{"key":"2025072417170237100_btaf374-B4","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using DIAMOND","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat Methods"},{"key":"2025072417170237100_btaf374-B5","doi-asserted-by":"crossref","first-page":"627","DOI":"10.1007\/978-1-4939-7000-1_26","article-title":"Protein data bank (PDB): the single global macromolecular structure archive","volume":"1607","author":"Burley","year":"2017","journal-title":"Meth Mol Biol"},{"key":"2025072417170237100_btaf374-B6","doi-asserted-by":"crossref","first-page":"2825","DOI":"10.1093\/bioinformatics\/btab198","article-title":"TALE: transformer-based protein function 
annotation with joint sequence-Label embedding","volume":"37","author":"Cao","year":"2021","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B7","doi-asserted-by":"crossref","DOI":"10.1126\/science.aaf1420","article-title":"A global genetic interaction network maps a wiring diagram of cellular function","volume":"353","author":"Costanzo","year":"2016","journal-title":"Science"},{"key":"2025072417170237100_btaf374-B8","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1007\/978-1-4939-7231-9_5","article-title":"Protein function prediction","volume":"1654","author":"Cruz","year":"2017","journal-title":"Methods Mol Biol"},{"key":"2025072417170237100_btaf374-B9","doi-asserted-by":"crossref","first-page":"D482","DOI":"10.1093\/nar\/gky1114","article-title":"SIFTS: updated structure integration with function, taxonomy and sequences resource allows 40-fold increase in coverage of structure-based annotations for proteins","volume":"47","author":"Dana","year":"2019","journal-title":"Nucl Acid Res"},{"key":"2025072417170237100_btaf374-B10","doi-asserted-by":"crossref","first-page":"3460","DOI":"10.1093\/bioinformatics\/btv398","article-title":"Functional classification of CATH superfamilies: a domain-based approach for protein function annotation","volume":"31","author":"Das","year":"2015","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B11","author":"Dauphin","year":"2017"},{"key":"2025072417170237100_btaf374-B12","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1038\/35015694","article-title":"Protein function in the post-genomic era","volume":"405","author":"Eisenberg","year":"2000","journal-title":"Nature"},{"key":"2025072417170237100_btaf374-B13","doi-asserted-by":"crossref","first-page":"7112","DOI":"10.1109\/TPAMI.2021.3095381","article-title":"ProtTrans: toward understanding the language of life through Self-Supervised learning","volume":"44","author":"Elnaggar","year":"2022","journal-title":"IEEE Trans Pattern 
Anal Mach Intell"},{"key":"2025072417170237100_btaf374-B14","author":"Fey","year":"2019"},{"key":"2025072417170237100_btaf374-B15","doi-asserted-by":"crossref","first-page":"3168","DOI":"10.1038\/s41467-021-23303-9","article-title":"Structure-based protein function prediction using graph convolutional networks","volume":"12","author":"Gligorijevic","year":"2021","journal-title":"Nat Commun"},{"key":"2025072417170237100_btaf374-B16","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btad410","article-title":"Hierarchical graph transformer with contrastive learning for protein function prediction","volume":"39","author":"Gu","year":"2023","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B17","doi-asserted-by":"crossref","first-page":"2308","DOI":"10.1093\/bioinformatics\/btg299","article-title":"PDB file parser and structure class implemented in Python","volume":"19","author":"Hamelryck","year":"2003","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B18","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btad637","article-title":"Struct2GO: protein function prediction based on graph pooling algorithm and AlphaFold2 structure information","volume":"39","author":"Jiao","year":"2023","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B19","author":"Kingma","year":"2014"},{"key":"2025072417170237100_btaf374-B20","author":"Kipf","year":"2017"},{"key":"2025072417170237100_btaf374-B21","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1093\/bioinformatics\/btm006","article-title":"On the relationship between sequence and structure similarities in proteomics","volume":"23","author":"Krissinel","year":"2007","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B22","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1038\/s42256-024-00795-w","article-title":"Protein function prediction as approximate semantic 
entailment","volume":"6","author":"Kulmanov","year":"2024","journal-title":"Nat Mach Intell"},{"key":"2025072417170237100_btaf374-B23","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1093\/bioinformatics\/btz595","article-title":"DeepGOPlus: improved protein function prediction from sequence","volume":"36","author":"Kulmanov","year":"2020","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B24","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1093\/bioinformatics\/btx624","article-title":"DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier","volume":"34","author":"Kulmanov","year":"2018","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B25","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab502","article-title":"Accurate protein function prediction via graph attention networks with predicted structure information","volume":"23","author":"Lai","year":"2022","journal-title":"Brief Bioinform"},{"key":"2025072417170237100_btaf374-B26","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1006\/jmbi.1996.0167","article-title":"An evolutionary trace method defines binding surfaces common to protein families","volume":"257","author":"Lichtarge","year":"1996","journal-title":"J Mol Biol"},{"key":"2025072417170237100_btaf374-B27","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbae289","article-title":"A comprehensive review and comparison of existing computational methods for protein function prediction","volume":"25","author":"Lin","year":"2024","journal-title":"Brief Bioinform"},{"key":"2025072417170237100_btaf374-B28","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language 
model","volume":"379","author":"Lin","year":"2023","journal-title":"Science"},{"key":"2025072417170237100_btaf374-B29","doi-asserted-by":"crossref","first-page":"bbad534","DOI":"10.1093\/bib\/bbad534","article-title":"Interpretable feature extraction and dimensionality reduction in ESM2 for protein localization prediction","volume":"25","author":"Luo","year":"2024","journal-title":"Brief Bioinform"},{"key":"2025072417170237100_btaf374-B30","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btae571","article-title":"TAWFN: a deep learning framework for protein function prediction","volume":"40","author":"Meng","year":"2024","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B31","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume-title":"Proc Natl Acad Sci USA","year":"2021"},{"key":"2025072417170237100_btaf374-B32","doi-asserted-by":"crossref","first-page":"D351","DOI":"10.1093\/nar\/gky1100","article-title":"InterPro in 2019: improving coverage, classification and access to protein sequence annotations","volume":"47","author":"Mitchell","year":"2019","journal-title":"Nucl Acid Res"},{"key":"2025072417170237100_btaf374-B33","doi-asserted-by":"crossref","first-page":"1759","DOI":"10.1093\/bioinformatics\/btq262","article-title":"Fast integration of heterogeneous data sources for predicting gene function with limited annotation","volume":"26","author":"Mostafavi","year":"2010","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B34","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/gb-2008-9-s1-s4","article-title":"GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function","volume":"9","author":"Mostafavi","year":"2008","journal-title":"Genome 
Biol"},{"key":"2025072417170237100_btaf374-B35","author":"Qin","year":"2021"},{"key":"2025072417170237100_btaf374-B36","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1038\/nmeth.2340","article-title":"A large-scale evaluation of computational protein function prediction","volume":"10","author":"Radivojac","year":"2013","journal-title":"Nat Methods"},{"key":"2025072417170237100_btaf374-B37","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1038\/msb4100129","article-title":"Network-based prediction of protein function","volume":"3","author":"Sharan","year":"2007","journal-title":"Mol Syst Biol"},{"key":"2025072417170237100_btaf374-B38","doi-asserted-by":"crossref","first-page":"2542","DOI":"10.1038\/s41467-018-04964-5","article-title":"Clustering huge protein sequence sets in linear time","volume":"9","author":"Steinegger","year":"2018","journal-title":"Nat Commun"},{"key":"2025072417170237100_btaf374-B39","doi-asserted-by":"crossref","first-page":"D439","DOI":"10.1093\/nar\/gkab1061","article-title":"AlphaFold protein structure database: massively expanding the structural coverage of protein-sequence space with high-accuracy models","volume":"50","author":"Varadi","year":"2022","journal-title":"Nucl Acid Res"},{"key":"2025072417170237100_btaf374-B40","author":"Vaswani","year":"2017"},{"key":"2025072417170237100_btaf374-B41","author":"Veli\u010dkovi\u0107","year":"2017"},{"key":"2025072417170237100_btaf374-B42","first-page":"690873","article-title":"Bioinformatics methods and biological interpretation for next-generation sequencing data","volume":"2015","author":"Wang","year":"2015","journal-title":"Biomed Research International"},{"key":"2025072417170237100_btaf374-B43","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btae031","article-title":"Insights into the inner workings of transformer models for protein function 
prediction","volume":"40","author":"Wenzel","year":"2024","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B44","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btad123","article-title":"CFAGO: cross-fusion of network and attributes based on attention mechanism for protein function prediction","volume":"39","author":"Wu","year":"2023","journal-title":"Bioinformatics"},{"key":"2025072417170237100_btaf374-B45","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1038\/s42256-021-00348-5","article-title":"Improved protein structure prediction by deep learning irrespective of co-evolution information","volume":"3","author":"Xu","year":"2021","journal-title":"Nat Mach Intell"},{"key":"2025072417170237100_btaf374-B46","doi-asserted-by":"crossref","first-page":"D1096","DOI":"10.1093\/nar\/gks966","article-title":"BioLiP: a semi-manually curated database for biologically relevant ligand\u2013protein interactions","volume":"41","author":"Yang","year":"2013","journal-title":"Nucl Acid Res"},{"key":"2025072417170237100_btaf374-B47","doi-asserted-by":"crossref","first-page":"1358","DOI":"10.1126\/science.adf2465","article-title":"Enzyme function prediction using contrastive learning","volume":"379","author":"Yu","year":"2023","journal-title":"Science"},{"key":"2025072417170237100_btaf374-B48","doi-asserted-by":"crossref","first-page":"W248","DOI":"10.1093\/nar\/gkae381","article-title":"GPSFun: geometry-aware protein sequence function predictions with language models","volume":"52","author":"Yuan","year":"2024","journal-title":"Nucl Acid Res"},{"key":"2025072417170237100_btaf374-B49","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbad117","article-title":"Fast and accurate protein function prediction from sequence through pretrained language model and homology-based label diffusion","volume":"24","author":"Yuan","year":"2023","journal-title":"Brief 
Bioinform"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btaf374\/63590813\/btaf374.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/7\/btaf374\/63590813\/btaf374.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/41\/7\/btaf374\/63590813\/btaf374.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T21:17:14Z","timestamp":1753391834000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btaf374\/8174967"}},"subtitle":[],"editor":[{"given":"Xin","family":"Gao","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,6,26]]},"references-count":49,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2025,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btaf374","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2025,7]]},"published":{"date-parts":[[2025,6,26]]},"article-number":"btaf374"}}