{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T08:40:20Z","timestamp":1775637620834,"version":"3.50.1"},"reference-count":32,"publisher":"Oxford University Press (OUP)","issue":"10","license":[{"start":{"date-parts":[[2023,10,17]],"date-time":"2023-10-17T00:00:00Z","timestamp":1697500800000},"content-version":"vor","delay-in-days":16,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2021YFA0910700"],"award-info":[{"award-number":["2021YFA0910700"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,10,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>In recent years, there has been a breakthrough in protein structure prediction, and the AlphaFold2 model of the DeepMind team has improved the accuracy of protein structure prediction to the atomic level. Currently, deep learning-based protein function prediction models usually extract features from protein sequences and combine them with protein\u2013protein interaction networks to achieve good results. However, for newly sequenced proteins that are not in the protein\u2013protein interaction network, such models cannot make effective predictions. To address this, this article proposes the Struct2GO model, which combines protein structure and sequence data to enhance the precision of protein function prediction and the generality of the model.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We obtain amino acid residue embeddings in protein structure through graph representation learning, utilize the graph pooling algorithm based on a self-attention mechanism to obtain the whole graph structure features, and fuse them with sequence features obtained from the protein language model. The results demonstrate that compared with the traditional protein sequence-based function prediction model, the Struct2GO model achieves better results.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The data underlying this article are available at https:\/\/github.com\/lyjps\/Struct2GO.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btad637","type":"journal-article","created":{"date-parts":[[2023,10,17]],"date-time":"2023-10-17T19:48:00Z","timestamp":1697572080000},"source":"Crossref","is-referenced-by-count":46,"title":["Struct2GO: protein function prediction based on graph pooling algorithm and AlphaFold2 structure information"],"prefix":"10.1093","volume":"39","author":[{"given":"Peishun","family":"Jiao","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen) , Shenzhen, Guang Dong 518055, China"}]},{"given":"Beibei","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen) , Shenzhen, Guang Dong 518055, China"}]},{"given":"Xuan","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen) , Shenzhen, Guang Dong 518055, China"},{"name":"Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Harbin Institute of Technology (Shenzhen) , Shenzhen, Guangdong 518055, China"}]},{"given":"Bo","family":"Liu","sequence":"additional","affiliation":[{"name":"Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology , Harbin, Heilongjiang 150001, China"},{"name":"Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology , Harbin, Heilongjiang 150001, China"}]},{"given":"Yadong","family":"Wang","sequence":"additional","affiliation":[{"name":"Center for Bioinformatics, Faculty of Computing, Harbin Institute of Technology , Harbin, Heilongjiang 150001, China"},{"name":"Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology , Harbin, Heilongjiang 150001, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8045-5264","authenticated-orcid":false,"given":"Junyi","family":"Li","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen) , Shenzhen, Guang Dong 518055, China"},{"name":"Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies, Harbin Institute of Technology (Shenzhen) , Shenzhen, Guangdong 518055, China"},{"name":"Key Laboratory of Biological Bigdata, Ministry of Education, Harbin Institute of Technology , Harbin, Heilongjiang 150001, China"}]}],"member":"286","published-online":{"date-parts":[[2023,10,17]]},"reference":[{"key":"2023102913335038300_btad637-B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J Mol Biol"},{"key":"2023102913335038300_btad637-B2","doi-asserted-by":"crossref","first-page":"167640","DOI":"10.1016\/j.jmb.2022.167640","article-title":"Inadequacy of evolutionary profiles vis-a-vis single sequences in predicting transient DNA-binding sites in proteins","volume":"434","author":"Arya","year":"2022","journal-title":"J Mol Biol"},{"key":"2023102913335038300_btad637-B3","first-page":"635","volume-title":"Methods Enzymology","author":"Brenner","year":"1996"},{"key":"2023102913335038300_btad637-B4","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using DIAMOND","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat Methods"},{"key":"2023102913335038300_btad637-B5","author":"Cangea","year":"2018"},{"key":"2023102913335038300_btad637-B6","doi-asserted-by":"crossref","first-page":"D289","DOI":"10.1093\/nar\/gkw1098","article-title":"CATH: an expanded resource to predict protein function through structure and sequence","volume":"45","author":"Dawson","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023102913335038300_btad637-B7","first-page":"3844","article-title":"Convolutional neural networks on graphs with fast localized spectral filtering","volume":"29","author":"Defferrard","year":"2016","journal-title":"Adv Neural Inform. Process. Syst."},{"key":"2023102913335038300_btad637-B8","author":"Devlin","year":"2018"},{"key":"2023102913335038300_btad637-B9","doi-asserted-by":"crossref","first-page":"3168","DOI":"10.1038\/s41467-021-23303-9","article-title":"Structure-based protein function prediction using graph convolutional networks","volume":"12","author":"Gligorijevi\u0107","year":"2021","journal-title":"Nat Commun"},{"key":"2023102913335038300_btad637-B10","first-page":"855","author":"Grover","year":"2016"},{"key":"2023102913335038300_btad637-B11","first-page":"1024","article-title":"Inductive representation learning on large graphs","volume":"30","author":"Hamilton","year":"2017","journal-title":"Advn Neural Inform Process Syst"},{"key":"2023102913335038300_btad637-B12","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1186\/s12859-019-3220-8","article-title":"Modeling aspects of the language of life through transfer-learning protein sequences","volume":"20","author":"Heinzinger","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023102913335038300_btad637-B13","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1126\/science.273.5275.595","article-title":"Mapping the protein universe","volume":"273","author":"Holm","year":"1996","journal-title":"Science"},{"key":"2023102913335038300_btad637-B14","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2023102913335038300_btad637-B15","author":"Kipf","year":"2016"},{"key":"2023102913335038300_btad637-B16","doi-asserted-by":"crossref","first-page":"717","DOI":"10.1093\/bioinformatics\/btm006","article-title":"On the relationship between sequence and structure similarities in proteomics","volume":"23","author":"Krissinel","year":"2007","journal-title":"Bioinformatics"},{"key":"2023102913335038300_btad637-B17","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1093\/bioinformatics\/btz595","article-title":"DeepGOPlus: improved protein function prediction from sequence","volume":"36","author":"Kulmanov","year":"2020","journal-title":"Bioinformatics"},{"key":"2023102913335038300_btad637-B18","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1093\/bioinformatics\/btx624","article-title":"DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier","volume":"34","author":"Kulmanov","year":"2018","journal-title":"Bioinformatics"},{"issue":"Suppl. 3","key":"2023102913335038300_btad637-B19","doi-asserted-by":"crossref","first-page":"S8","DOI":"10.1186\/1471-2105-14-S3-S8","article-title":"MS-kNN: protein function prediction by integrating multiple data sources","volume":"14","author":"Lan","year":"2013","journal-title":"BMC Bioinformatics"},{"key":"2023102913335038300_btad637-B20","first-page":"3734","volume-title":"Proceedings of the 36th International Conference on Machine Learning.","author":"Lee","year":"2019"},{"key":"2023102913335038300_btad637-B21","author":"Mikolov","year":"2013"},{"key":"2023102913335038300_btad637-B22","doi-asserted-by":"crossref","first-page":"D351","DOI":"10.1093\/nar\/gky1100","article-title":"InterPro in 2019: improving coverage, classification and access to protein sequence annotations","volume":"47","author":"Mitchell","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"2023102913335038300_btad637-B23","first-page":"701","author":"Perozzi","year":"2014"},{"key":"2023102913335038300_btad637-B24","doi-asserted-by":"crossref","first-page":"1438","DOI":"10.1093\/nar\/gks1301","article-title":"The twilight zone of cis element alignments","volume":"41","author":"Sebastian","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023102913335038300_btad637-B25","doi-asserted-by":"crossref","first-page":"D331","DOI":"10.1093\/nar\/gkw1108","article-title":"Expansion of the gene ontology knowledgebase and resources","volume":"45","author":"The Gene Ontology Consortium","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023102913335038300_btad637-B26","doi-asserted-by":"crossref","first-page":"D506","DOI":"10.1093\/nar\/gky1049","article-title":"UniProt: a worldwide hub of protein knowledge","volume":"47","author":"UniProt Consortium","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023102913335038300_btad637-B27","author":"Veli\u010dkovi\u0107","year":"2017"},{"key":"2023102913335038300_btad637-B28","doi-asserted-by":"crossref","first-page":"1713","DOI":"10.1109\/TCBB.2022.3215257","article-title":"PSPGO: Cross-species heterogeneous network propagation for protein function prediction","volume":"20","author":"Wu","year":"2023","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2023102913335038300_btad637-B29","author":"Xu","year":"2018"},{"key":"2023102913335038300_btad637-B30","first-page":"5753","article-title":"Xlnet: generalized autoregressive pretraining for language understanding","volume":"32","author":"Yang","year":"2019","journal-title":"Adv. Neural Inform. Process. Systems"},{"key":"2023102913335038300_btad637-B31","doi-asserted-by":"crossref","first-page":"i262","DOI":"10.1093\/bioinformatics\/btab270","article-title":"DeepGraphGO: graph neural network for large-scale, multispecies protein function prediction","volume":"37","author":"You","year":"2021","journal-title":"Bioinformatics"},{"key":"2023102913335038300_btad637-B32","first-page":"649","author":"Zhang","year":"2015"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btad637\/52191176\/btad637.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/10\/btad637\/52673512\/btad637.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/39\/10\/btad637\/52673512\/btad637.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,29]],"date-time":"2023-10-29T13:55:01Z","timestamp":1698587701000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btad637\/7320010"}},"subtitle":[],"editor":[{"given":"Jonathan","family":"Wren","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,10,1]]},"references-count":32,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2023,10,3]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btad637","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,10,1]]},"published":{"date-parts":[[2023,10,1]]},"article-number":"btad637"}}