{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T19:14:43Z","timestamp":1775330083446,"version":"3.50.1"},"reference-count":30,"publisher":"Oxford University Press (OUP)","issue":"7","license":[{"start":{"date-parts":[[2024,7,4]],"date-time":"2024-07-04T00:00:00Z","timestamp":1720051200000},"content-version":"vor","delay-in-days":3,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Protein\u2013protein interaction (PPI) networks are crucial for automatically annotating protein functions. As multiple PPI networks exist for the same set of proteins that capture properties from different aspects, it is a challenging task to effectively utilize these heterogeneous networks. Recently, several deep learning models have combined PPI networks from all evidence, or concatenated all graph embeddings for protein function prediction. However, the lack of a judicious selection procedure prevents the effective harness of information from different PPI networks, as these networks vary in densities, structures, and noise levels. Consequently, combining protein features indiscriminately could increase the noise level, leading to decreased model performance.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We develop DualNetGO, a dual-network model comprised of a Classifier and a Selector, to predict protein functions by effectively selecting features from different sources including graph embeddings of PPI networks, protein domain, and subcellular location information. Evaluation of DualNetGO on human and mouse datasets in comparison with other network-based models shows at least 4.5%, 6.2%, and 14.2% improvement on Fmax in BP, MF, and CC gene ontology categories, respectively, for human, and 3.3%, 10.6%, and 7.7% improvement on Fmax for mouse. We demonstrate the generalization capability of our model by training and testing on the CAFA3 data, and show its versatility by incorporating Esm2 embeddings. We further show that our model is insensitive to the choice of graph embedding method and is time- and memory-saving. These results demonstrate that combining a subset of features including PPI networks and protein attributes selected by our model is more effective in utilizing PPI network information than only using one kind of or concatenating graph embeddings from all kinds of PPI networks.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The source code of DualNetGO and some of the experiment data are available at: https:\/\/github.com\/georgedashen\/DualNetGO.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae437","type":"journal-article","created":{"date-parts":[[2024,7,4]],"date-time":"2024-07-04T09:27:11Z","timestamp":1720085231000},"source":"Crossref","is-referenced-by-count":6,"title":["DualNetGO: a dual network model for protein function prediction\n                    <i>via<\/i>\n                    effective feature selection"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5823-7698","authenticated-orcid":false,"given":"Zhuoyang","family":"Chen","sequence":"first","affiliation":[{"name":"Data Science and Analytics Thrust, Information Hub, The Hong Kong University of Science and Technology (Guangzhou) , Guangzhou, Guangdong, 511400,","place":["China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2861-9492","authenticated-orcid":false,"given":"Qiong","family":"Luo","sequence":"additional","affiliation":[{"name":"Data Science and Analytics Thrust, Information Hub, The Hong Kong University of Science and Technology (Guangzhou) , Guangzhou, Guangdong, 511400,","place":["China"]},{"name":"HKUST, Hong Kong SAR ,","place":["China"]}]}],"member":"286","published-online":{"date-parts":[[2024,7,4]]},"reference":[{"key":"2024110605394595400_btae437-B1","doi-asserted-by":"crossref","first-page":"iyad031","DOI":"10.1093\/genetics\/iyad031","article-title":"The gene ontology knowledgebase in 2023","volume":"224","author":"Aleksander","year":"2023","journal-title":"Genetics"},{"key":"2024110605394595400_btae437-B2","doi-asserted-by":"crossref","first-page":"btad662","DOI":"10.1093\/bioinformatics\/btad662","article-title":"Sslpheno: a self-supervised learning approach for gene\u2013phenotype association prediction using protein\u2013protein interactions and gene ontology data","volume":"39","author":"Bi","year":"2023","journal-title":"Bioinformatics"},{"key":"2024110605394595400_btae437-B3","doi-asserted-by":"crossref","first-page":"i318","DOI":"10.1093\/bioinformatics\/btad208","article-title":"Combining protein sequences and structures with transformers and equivariant graph neural networks to predict protein function","volume":"39","author":"Boadu","year":"2023","journal-title":"Bioinformatics"},{"key":"2024110605394595400_btae437-B4","doi-asserted-by":"crossref","first-page":"2825","DOI":"10.1093\/bioinformatics\/btab198","article-title":"TALE: transformer-based protein function annotation with joint sequence-label embedding","volume":"37","author":"Cao","year":"2021","journal-title":"Bioinformatics"},{"key":"2024110605394595400_btae437-B5","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1016\/j.cels.2016.10.017","article-title":"Compact integration of multi-network topology for functional analysis of genes","volume":"3","author":"Cho","year":"2016","journal-title":"Cell Syst"},{"key":"2024110605394595400_btae437-B6","doi-asserted-by":"crossref","first-page":"giaa081","DOI":"10.1093\/gigascience\/giaa081","article-title":"Graph2GO: a multi-modal attributed network embedding method for inferring protein functions","volume":"9","author":"Fan","year":"2020","journal-title":"Gigascience"},{"key":"2024110605394595400_btae437-B7","doi-asserted-by":"crossref","first-page":"3873","DOI":"10.1093\/bioinformatics\/bty440","article-title":"deepNF: deep network fusion for protein function prediction","volume":"34","author":"Gligorijevi\u0107","year":"2018","journal-title":"Bioinformatics"},{"key":"2024110605394595400_btae437-B8","doi-asserted-by":"crossref","first-page":"3168","DOI":"10.1038\/s41467-021-23303-9","article-title":"Structure-based protein function prediction using graph convolutional networks","volume":"12","author":"Gligorijevi\u0107","year":"2021","journal-title":"Nat Commun"},{"key":"2024110605394595400_btae437-B9","first-page":"855","article-title":"node2vec: scalable feature learning for networks","volume":"2016","author":"Grover","year":"2016","journal-title":"KDD"},{"key":"2024110605394595400_btae437-B10","author":"Hechtlinger"},{"key":"2024110605394595400_btae437-B11","doi-asserted-by":"crossref","first-page":"1103","DOI":"10.1038\/s42003-023-05476-9","article-title":"Domain-PFP allows protein function prediction using function-aware domain embedding representations","volume":"6","author":"Ibtehaz","year":"2023","journal-title":"Commun Biol"},{"key":"2024110605394595400_btae437-B12","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1186\/s13059-016-1037-6","article-title":"An expanded evaluation of protein function prediction methods shows an improvement in accuracy","volume":"17","author":"Jiang","year":"2016","journal-title":"Genome Biol"},{"key":"2024110605394595400_btae437-B13","article-title":"Variational graph auto-encoders","author":"Kipf"},{"key":"2024110605394595400_btae437-B14","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1093\/bioinformatics\/btaa763","article-title":"DeepGOPlus: improved protein function prediction from sequence","volume":"37","author":"Kulmanov","year":"2021","journal-title":"Bioinformatics"},{"key":"2024110605394595400_btae437-B15","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1093\/bioinformatics\/btx624","article-title":"DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier","volume":"34","author":"Kulmanov","year":"2018","journal-title":"Bioinformatics"},{"key":"2024110605394595400_btae437-B16","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Lin","year":"2023","journal-title":"Science"},{"key":"2024110605394595400_btae437-B17","doi-asserted-by":"crossref","first-page":"402","DOI":"10.1038\/s41586-020-2188-x","article-title":"A reference map of the human binary protein interactome","volume":"580","author":"Luck","year":"2020","journal-title":"Nature"},{"key":"2024110605394595400_btae437-B18","first-page":"4334","author":"Maurya","year":"2022"},{"key":"2024110605394595400_btae437-B19","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1049\/cit2.12166","article-title":"Feature selection: key to enhance node classification with graph neural networks","volume":"8","author":"Maurya","year":"2023","journal-title":"CAAI Trans Intell Technol"},{"key":"2024110605394595400_btae437-B20","doi-asserted-by":"crossref","first-page":"S4","DOI":"10.1186\/gb-2008-9-s1-s4","article-title":"GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function","volume":"9 Suppl 1","author":"Mostafavi","year":"2008","journal-title":"Genome Biol"},{"key":"2024110605394595400_btae437-B21","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1186\/s12859-023-05375-0","article-title":"TEMPROT: protein function annotation using transformers embeddings and homology search","volume":"24","author":"Oliveira","year":"2023","journal-title":"BMC Bioinformatics"},{"key":"2024110605394595400_btae437-B22","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1038\/nmeth.2340","article-title":"A large-scale evaluation of computational protein function prediction","volume":"10","author":"Radivojac","year":"2013","journal-title":"Nat Methods"},{"key":"2024110605394595400_btae437-B23","first-page":"82","author":"Ridnik"},{"key":"2024110605394595400_btae437-B24","doi-asserted-by":"crossref","first-page":"D638","DOI":"10.1093\/nar\/gkac1000","article-title":"The string database in 2023: protein\u2013protein association networks and functional enrichment analyses for any sequenced genome of interest","volume":"51","author":"Szklarczyk","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024110605394595400_btae437-B25","doi-asserted-by":"crossref","first-page":"D523","DOI":"10.1093\/nar\/gkac1052","article-title":"UniProt: the universal protein knowledgebase in 2023","volume":"51","author":"UniProt","year":"2023","journal-title":"Nucleic Acids Res"},{"key":"2024110605394595400_btae437-B26","author":"Vaswani"},{"key":"2024110605394595400_btae437-B27","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1016\/j.gpb.2023.04.001","article-title":"NetGO 3.0: protein language model improves large-scale functional annotations","volume":"21","author":"Wang","year":"2023","journal-title":"Genom Proteom Bioinform"},{"key":"2024110605394595400_btae437-B28","doi-asserted-by":"crossref","first-page":"btad123","DOI":"10.1093\/bioinformatics\/btad123","article-title":"CFAGO: cross-fusion of network and attributes based on attention mechanism for protein function prediction","volume":"39","author":"Wu","year":"2023","journal-title":"Bioinformatics"},{"key":"2024110605394595400_btae437-B29","doi-asserted-by":"crossref","first-page":"i262","DOI":"10.1093\/bioinformatics\/btab270","article-title":"DeepGraphGO: graph neural network for large-scale, multispecies protein function prediction","volume":"37","author":"You","year":"2021","journal-title":"Bioinformatics"},{"key":"2024110605394595400_btae437-B30","doi-asserted-by":"crossref","first-page":"244","DOI":"10.1186\/s13059-019-1835-8","article-title":"The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens","volume":"20","author":"Zhou","year":"2019","journal-title":"Genome Biol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae437\/58439042\/btae437.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/7\/btae437\/60434800\/btae437.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/7\/btae437\/60434800\/btae437.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,6]],"date-time":"2024-11-06T00:40:08Z","timestamp":1730853608000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae437\/7705979"}},"subtitle":[],"editor":[{"given":"Arne","family":"Elofsson","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,7,1]]},"references-count":30,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2024,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae437","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.11.29.569192","asserted-by":"object"}]},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,7]]},"published":{"date-parts":[[2024,7,1]]},"article-number":"btae437"}}