{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T23:56:39Z","timestamp":1772841399857,"version":"3.50.1"},"reference-count":52,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2023,5,31]],"date-time":"2023-05-31T00:00:00Z","timestamp":1685491200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/pages\/standard-publication-reuse-rights"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program of China","doi-asserted-by":"publisher","award":["2021YFE010178"],"award-info":[{"award-number":["2021YFE010178"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62176105"],"award-info":[{"award-number":["62176105"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Hong Kong Research Grants Council","award":["PolyU 152006\/19E"],"award-info":[{"award-number":["PolyU 152006\/19E"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023,7,20]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Protein is the most important component in organisms and plays an indispensable role in life activities. In recent years, a large number of intelligent methods have been proposed to predict protein function. These methods obtain different types of protein information, including sequence, structure and interaction network. Among them, protein sequences have gained significant attention where methods are investigated to extract the information from different views of features. However, how to fully exploit the views for effective protein sequence analysis remains a challenge. In this regard, we propose a multi-view, multi-scale and multi-attention deep neural model (MMSMA) for protein function prediction. First, MMSMA extracts multi-view features from protein sequences, including one-hot encoding features, evolutionary information features, deep semantic features and overlapping property features based on physiochemistry. Second, a specific multi-scale multi-attention deep network model (MSMA) is built for each view to realize the deep feature learning and preliminary classification. In MSMA, both multi-scale local patterns and long-range dependence from protein sequences can be captured. Third, a multi-view adaptive decision mechanism is developed to make a comprehensive decision based on the classification results of all the views. To further improve the prediction performance, an extended version of MMSMA, MMSMAPlus, is proposed to integrate homology-based protein prediction under the framework of multi-view deep neural model. Experimental results show that the MMSMAPlus has promising performance and is significantly superior to the state-of-the-art methods. The source code can be found at https:\/\/github.com\/wzy-2020\/MMSMAPlus.<\/jats:p>","DOI":"10.1093\/bib\/bbad201","type":"journal-article","created":{"date-parts":[[2023,6,1]],"date-time":"2023-06-01T02:29:27Z","timestamp":1685586567000},"source":"Crossref","is-referenced-by-count":14,"title":["MMSMAPlus: a multi-view multi-scale multi-attention embedding model for protein function prediction"],"prefix":"10.1093","volume":"24","author":[{"given":"Zhongyu","family":"Wang","sequence":"first","affiliation":[{"name":"School of Artificial Intelligence and Computer Science, Jiangnan University , Wuxi , China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8045-2426","authenticated-orcid":false,"given":"Zhaohong","family":"Deng","sequence":"additional","affiliation":[{"name":"School of Artificial Intelligence and Computer Science, Jiangnan University , Wuxi , China"}]},{"given":"Wei","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Artificial Intelligence and Computer Science, Jiangnan University , Wuxi , China"}]},{"given":"Qiongdan","family":"Lou","sequence":"additional","affiliation":[{"name":"School of Artificial Intelligence and Computer Science, Jiangnan University , Wuxi , China"}]},{"given":"Kup-Sze","family":"Choi","sequence":"additional","affiliation":[{"name":"Hong Kong Polytechnic University , Hongkong"}]},{"given":"Zhisheng","family":"Wei","sequence":"additional","affiliation":[{"name":"National Key Laboratory of Food Science and Resource Mining, Jiangnan University , Wuxi , China"}]},{"given":"Lei","family":"Wang","sequence":"additional","affiliation":[{"name":"National Key Laboratory of Food Science and Resource Mining, Jiangnan University , Wuxi , China"}]},{"given":"Jing","family":"Wu","sequence":"additional","affiliation":[{"name":"National Key Laboratory of Food Science and Resource Mining, Jiangnan University , Wuxi , China"}]}],"member":"286","published-online":{"date-parts":[[2023,5,31]]},"reference":[{"key":"2023072020043468000_ref1","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1093\/bib\/bbl004","article-title":"Automated protein function prediction--the genomic challenge","volume":"7","author":"Friedberg","year":"2006","journal-title":"Brief Bioinform"},{"key":"2023072020043468000_ref2","doi-asserted-by":"crossref","first-page":"1715","DOI":"10.1016\/j.str.2012.07.016","article-title":"Identification of unknown protein function using metabolite cocktail screening","volume":"20","author":"Shumilin","year":"2012","journal-title":"Structure"},{"key":"2023072020043468000_ref3","doi-asserted-by":"crossref","first-page":"2086","DOI":"10.1002\/prot.23029","article-title":"Analysis of protein function and its prediction from amino acid sequence","volume":"79","author":"Clark","year":"2011","journal-title":"Proteins"},{"key":"2023072020043468000_ref4","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023072020043468000_ref5","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2011","journal-title":"Nat Methods"},{"key":"2023072020043468000_ref6","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using DIAMOND","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat Methods"},{"key":"2023072020043468000_ref7","doi-asserted-by":"crossref","first-page":"W297","DOI":"10.1093\/nar\/gkn193","article-title":"FFPred: an integrated feature-based function prediction server for vertebrate proteomes","volume":"36","author":"Lobley","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023072020043468000_ref8","doi-asserted-by":"crossref","first-page":"31865","DOI":"10.1038\/srep31865","article-title":"FFPred 3: feature-based function prediction for all gene ontology domains","volume":"6","author":"Cozzetto","year":"2016","journal-title":"Sci Rep"},{"key":"2023072020043468000_ref9","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1093\/bioinformatics\/btx624","article-title":"DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier","volume":"34","author":"Kulmanov","year":"2018","journal-title":"Bioinformatics"},{"key":"2023072020043468000_ref10","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1016\/j.ymeth.2018.05.026","article-title":"DeepText2GO: improving large-scale protein function prediction with deep semantic text representation","volume":"145","author":"You","year":"2018","journal-title":"Methods"},{"key":"2023072020043468000_ref11","doi-asserted-by":"crossref","first-page":"3168","DOI":"10.1038\/s41467-021-23303-9","article-title":"Structure-based protein function prediction using graph convolutional networks","volume":"12","author":"Gligorijevic","year":"2021","journal-title":"Nat Commun"},{"key":"2023072020043468000_ref12","volume":"abs\/1308.0850","journal-title":"CoRR"},{"key":"2023072020043468000_ref13","doi-asserted-by":"crossref","first-page":"i262","DOI":"10.1093\/bioinformatics\/btab270","article-title":"DeepGraphGO: graph neural network for large-scale, multispecies protein function prediction","volume":"37","author":"You","year":"2021","journal-title":"Bioinformatics"},{"key":"2023072020043468000_ref14","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1016\/j.neunet.2019.11.013","article-title":"Discriminative margin-sensitive autoencoder for collective multi-view disease analysis","volume":"123","author":"Zhang","year":"2020","journal-title":"Neural Netw"},{"key":"2023072020043468000_ref15","doi-asserted-by":"crossref","first-page":"170352","DOI":"10.1109\/ACCESS.2019.2955285","article-title":"Epileptic seizure prediction with multi-view convolutional neural networks","volume":"7","author":"Liu","year":"2019","journal-title":"IEEE Access"},{"key":"2023072020043468000_ref16","doi-asserted-by":"crossref","first-page":"751","DOI":"10.1126\/science.285.5428.751","article-title":"Detecting protein function and protein-protein interactions from genome sequences","volume":"285","author":"Marcotte","year":"1999","journal-title":"Science"},{"key":"2023072020043468000_ref17","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat Genet"},{"key":"2023072020043468000_ref18","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1093\/bioinformatics\/btz595","article-title":"DeepGOPlus: improved protein function prediction from sequence","volume":"36","author":"Kulmanov","year":"2020","journal-title":"Bioinformatics"},{"key":"2023072020043468000_ref19","first-page":"21","article-title":"Predicting functions of maize proteins using graph convolutional network","author":"Zhou","year":"2020","journal-title":"BMC Bioinformatics"},{"key":"2023072020043468000_ref20","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1006\/jmbi.1999.3091","article-title":"Protein secondary structure prediction based on position-specific scoring matrices","volume":"292","author":"Jones","year":"1999","journal-title":"J Mol Biol"},{"key":"2023072020043468000_ref21","journal-title":"Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence"},{"key":"2023072020043468000_ref22","doi-asserted-by":"crossref","first-page":"e51","DOI":"10.1093\/nar\/gkab044","article-title":"GraphBind: protein structural context embedded rules learned by hierarchical graph neural networks for recognizing nucleic-acid-binding residues","volume":"49","author":"Xia","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2023072020043468000_ref23","doi-asserted-by":"crossref","first-page":"2208","DOI":"10.1109\/TCBB.2020.2968882","article-title":"A deep learning framework for gene ontology annotations with sequence- and network-based information","volume":"18","author":"Zhang","year":"2021","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"2023072020043468000_ref24","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1016\/j.aiopen.2021.08.002","article-title":"Pre-trained models: past, present and future","volume":"2","author":"Han","year":"2021","journal-title":"AI Open"},{"key":"2023072020043468000_ref25","article-title":"ProtTrans: towards cracking the language of lifes code through self-supervised deep learning and high performance computing","author":"Elnaggar","year":"2021","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2023072020043468000_ref26","journal-title":"Proceedings of NAACL-HLT"},{"key":"2023072020043468000_ref27","first-page":"20","article-title":"Modeling aspects of the language of life through transfer-learning protein sequences","author":"Heinzinger","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2023072020043468000_ref28","doi-asserted-by":"crossref","first-page":"1341","DOI":"10.1109\/JPROC.2018.2848209","article-title":"Extension of PCA to higher order data structures: an introduction to tensors, tensor decompositions, and tensor PCA","volume":"106","author":"Zare","year":"2018","journal-title":"Proc IEEE"},{"key":"2023072020043468000_ref29","doi-asserted-by":"crossref","first-page":"1459","DOI":"10.1007\/s00726-014-1711-5","article-title":"PhosphoSVM: prediction of phosphorylation sites by integrating various protein sequence attributes with a support vector machine","volume":"46","author":"Dou","year":"2014","journal-title":"Amino Acids"},{"key":"2023072020043468000_ref30","doi-asserted-by":"crossref","first-page":"4007","DOI":"10.1093\/bioinformatics\/bty451","article-title":"ACPred-FL: a sequence-based predictor using effective feature representation to improve the prediction of anti-cancer peptides","volume":"34","author":"Wei","year":"2018","journal-title":"Bioinformatics"},{"key":"2023072020043468000_ref31","doi-asserted-by":"crossref","first-page":"4668","DOI":"10.1093\/bioinformatics\/btab551","article-title":"PhosIDN: an integrated deep neural network for improving protein phosphorylation site prediction by combining sequence and protein\u2013protein interaction information","volume":"37","author":"Yang","year":"2021","journal-title":"Bioinformatics"},{"key":"2023072020043468000_ref32","journal-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing"},{"key":"2023072020043468000_ref33","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition"},{"key":"2023072020043468000_ref34","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition"},{"key":"2023072020043468000_ref35"},{"key":"2023072020043468000_ref36","first-page":"27","article-title":"Scale invariant feature transform plus hue feature, the international archives of photogrammetry, remote sensing and spatial","volume":"42","author":"Bdaneshvar","year":"2017","journal-title":"Inform Sci"},{"key":"2023072020043468000_ref37","journal-title":"Proceedings of the European conference on computer vision"},{"key":"2023072020043468000_ref38","journal-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition"},{"key":"2023072020043468000_ref39","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision"},{"key":"2023072020043468000_ref40","volume-title":"Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing"},{"key":"2023072020043468000_ref41","journal-title":"Published as a Conference Paper at the 3rd International Conference for Learning Representations"},{"key":"2023072020043468000_ref42","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1038\/nmeth.2340","article-title":"A large-scale evaluation of computational protein function prediction","volume":"10","author":"Radivojac","year":"2013","journal-title":"Nat Methods"},{"key":"2023072020043468000_ref43","doi-asserted-by":"crossref","first-page":"i53","DOI":"10.1093\/bioinformatics\/btt228","article-title":"Information-theoretic evaluation of predicted ontological annotations","volume":"29","author":"Clark","year":"2013","journal-title":"Bioinformatics"},{"key":"2023072020043468000_ref44","volume-title":"Proceedings of the 23rd International Conference on Machine Learning"},{"key":"2023072020043468000_ref45","doi-asserted-by":"crossref","first-page":"2825","DOI":"10.1093\/bioinformatics\/btab198","article-title":"TALE: transformer-based protein function annotation with joint sequence\u2013label embedding","volume":"37","author":"Cao","year":"2021","journal-title":"Bioinformatics"},{"key":"2023072020043468000_ref46","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1093\/bioinformatics\/btaa701","article-title":"Unsupervised protein embeddings outperform hand-crafted sequence and structure features at predicting molecular function","volume":"37","author":"Villegas-Morcillo","year":"2021","journal-title":"Bioinformatics"},{"key":"2023072020043468000_ref47","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-020-80786-0","article-title":"Embeddings from deep learning transfer GO annotations beyond homology","volume":"11","author":"Littmann","year":"2021","journal-title":"Sci Rep"},{"key":"2023072020043468000_ref48","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1007\/s10994-019-05853-8","article-title":"Online Bayesian max-margin subspace learning for multi-view classification and regression","volume":"109","author":"He","year":"2020","journal-title":"Mach Learn"},{"key":"2023072020043468000_ref49","volume":"998","journal-title":"Advanced Materials Research"},{"key":"2023072020043468000_ref50","doi-asserted-by":"crossref","DOI":"10.1016\/j.engappai.2020.103527","article-title":"Collaborative weighted multi-view feature extraction","volume":"90","author":"Zhang","year":"2020","journal-title":"Eng Appl Artif Intel"},{"key":"2023072020043468000_ref51","first-page":"1047","article-title":"iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data","author":"Chen","year":"2020"},{"key":"2023072020043468000_ref52","doi-asserted-by":"crossref","first-page":"e60","DOI":"10.1093\/nar\/gkab122","article-title":"iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization","volume":"49","author":"Chen","year":"2021","journal-title":"Nucleic Acids Res"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/24\/4\/bbad201\/50916669\/bbad201.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/24\/4\/bbad201\/50916669\/bbad201.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,20]],"date-time":"2023-07-20T20:05:58Z","timestamp":1689883558000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbad201\/7187109"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,31]]},"references-count":52,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2023,7,20]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbad201","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,7]]},"published":{"date-parts":[[2023,5,31]]},"article-number":"bbad201"}}