{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,16]],"date-time":"2026-06-16T04:10:03Z","timestamp":1781583003637,"version":"3.54.5"},"reference-count":59,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2024,5,3]],"date-time":"2024-05-03T00:00:00Z","timestamp":1714694400000},"content-version":"vor","delay-in-days":37,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,3,27]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Predicting protein function is crucial for understanding biological life processes, preventing diseases and developing new drug targets. In recent years, methods based on sequence, structure and biological networks for protein function annotation have been extensively researched. Although obtaining a protein in three-dimensional structure through experimental or computational methods enhances the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies presents a significant challenge. To address this issue, we introduce a deep neural network model DeepSS2GO (Secondary Structure to Gene Ontology). It is a predictor incorporating secondary structure features along with primary sequence and homology information. The algorithm expertly combines the speed of sequence-based information with the accuracy of structure-based features while streamlining the redundant data in primary sequences and bypassing the time-consuming challenges of tertiary structure analysis. The results show that the prediction performance surpasses state-of-the-art algorithms. It has the ability to predict key functions by effectively utilizing secondary structure information, rather than broadly predicting general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it highly applicable to massive sequencing data. The source code and trained models are available at https:\/\/github.com\/orca233\/DeepSS2GO.<\/jats:p>","DOI":"10.1093\/bib\/bbae196","type":"journal-article","created":{"date-parts":[[2024,4,10]],"date-time":"2024-04-10T02:22:29Z","timestamp":1712715749000},"source":"Crossref","is-referenced-by-count":22,"title":["DeepSS2GO: protein function prediction from secondary structure"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-4404-0711","authenticated-orcid":false,"given":"Fu V","family":"Song","sequence":"first","affiliation":[{"name":"Department of Chemical Biology , School of Life Sciences, , Xueyuan Avenue, 518055, Shenzhen , China"},{"name":"Southern University of Science and Technology , School of Life Sciences, , Xueyuan Avenue, 518055, Shenzhen , China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6038-0808","authenticated-orcid":false,"given":"Jiaqi","family":"Su","sequence":"additional","affiliation":[{"name":"Department of Chemical Biology , School of Life Sciences, , Xueyuan Avenue, 518055, Shenzhen , China"},{"name":"Southern University of Science and Technology , School of Life Sciences, , Xueyuan Avenue, 518055, Shenzhen , China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7172-7149","authenticated-orcid":false,"given":"Sixing","family":"Huang","sequence":"additional","affiliation":[{"name":"Gemini Data Japan , Kitaku Oujikamiya 1-11-11, 115-0043, Tokyo , Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3640-0618","authenticated-orcid":false,"given":"Neng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Electronic Engineering and Computer Science, Queen Mary University of London , Mile End Road, E1 4NS, London , UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-3865-4231","authenticated-orcid":false,"given":"Kaiyue","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Chemical Biology , School of Life Sciences, , Xueyuan Avenue, 518055, Shenzhen , China"},{"name":"Southern University of Science and Technology , School of Life Sciences, , Xueyuan Avenue, 518055, Shenzhen , China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ming","family":"Ni","sequence":"additional","affiliation":[{"name":"MGI Tech , Beishan Industrial Zone, 518083, Shenzhen , China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3481-450X","authenticated-orcid":false,"given":"Maofu","family":"Liao","sequence":"additional","affiliation":[{"name":"Department of Chemical Biology , School of Life Sciences, , Xueyuan Avenue, 518055, Shenzhen , China"},{"name":"Southern University of Science and Technology , School of Life Sciences, , Xueyuan Avenue, 518055, Shenzhen , China"},{"name":"Institute for Biological Electron Microscopy, Southern University of Science and Technology , Xueyuan Avenue, 518055, Shenzhen , China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"286","published-online":{"date-parts":[[2024,5,2]]},"reference":[{"issue":"2","key":"2024050310301065200_ref1","doi-asserted-by":"crossref","first-page":"1513","DOI":"10.1093\/bib\/bbab087","article-title":"Deep learning in bioinformatics and biomedicine","volume":"22","author":"Berrar","year":"2021","journal-title":"Brief Bioinform"},{"issue":"7","key":"2024050310301065200_ref2","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1038\/s41592-022-01454-x","article-title":"Understudied proteins: opportunities and challenges for functional proteomics","volume":"19","author":"Kustatscher","year":"2022","journal-title":"Nat Methods"},{"issue":"6","key":"2024050310301065200_ref3","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0198216","article-title":"Predicting human protein function with multi-task deep neural networks","volume":"13","author":"Fa","year":"2018","journal-title":"PloS One"},{"issue":"1","key":"2024050310301065200_ref4","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1038\/75556","article-title":"Gene ontology: tool for the unification of biology","volume":"25","author":"Ashburner","year":"2000","journal-title":"Nat Genet"},{"issue":"1","key":"2024050310301065200_ref5","doi-asserted-by":"crossref","first-page":"304","DOI":"10.1093\/nar\/28.1.304","article-title":"The enzyme database in 2000","volume":"28","author":"Bairoch","year":"2000","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2024050310301065200_ref6","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1093\/nar\/28.1.27","article-title":"Kegg: Kyoto encyclopedia of genes and genomes","volume":"28","author":"Kanehisa","year":"2000","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2024050310301065200_ref7","doi-asserted-by":"crossref","first-page":"D412","DOI":"10.1093\/nar\/gkaa913","article-title":"Pfam: the protein families database in 2021","volume":"49","author":"Mistry","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"6","key":"2024050310301065200_ref8","doi-asserted-by":"crossref","first-page":"932","DOI":"10.1038\/s41587-021-01179-w","article-title":"Using deep learning to annotate the protein universe","volume":"40","author":"Bileschi","year":"2022","journal-title":"Nat Biotechnol"},{"issue":"5","key":"2024050310301065200_ref9","first-page":"851","article-title":"Deep learning in bioinformatics","volume":"18","author":"Min","year":"2017","journal-title":"Brief Bioinform"},{"issue":"7693","key":"2024050310301065200_ref10","doi-asserted-by":"crossref","first-page":"555","DOI":"10.1038\/d41586-018-02174-z","article-title":"Deep learning for biology","volume":"554","author":"Webb","year":"2018","journal-title":"Nature"},{"issue":"W1","key":"2024050310301065200_ref11","doi-asserted-by":"crossref","first-page":"W535","DOI":"10.1093\/nar\/gkab354","article-title":"Predictprotein-predicting protein structure and function for 29 years","volume":"49","author":"Bernhofer","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"2024050310301065200_ref12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2105-10-421","article-title":"Blast+: architecture and applications","volume":"10","author":"Camacho","year":"2009","journal-title":"BMC Bioinformatics"},{"issue":"17","key":"2024050310301065200_ref13","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped blast and psi-blast: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2024050310301065200_ref14","doi-asserted-by":"crossref","first-page":"D344","DOI":"10.1093\/nar\/gkaa977","article-title":"The interpro protein families and domains database: 20 years on","volume":"49","author":"Blum","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"2024050310301065200_ref15","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1016\/j.sbi.2006.04.004","article-title":"Multiple sequence alignment","volume":"16","author":"Edgar","year":"2006","journal-title":"Curr Opin Struct Biol"},{"issue":"2","key":"2024050310301065200_ref16","doi-asserted-by":"crossref","first-page":"308","DOI":"10.1109\/TCBB.2010.93","article-title":"On position-specific scoring matrix for protein function prediction","volume":"8","author":"Jeong","year":"2010","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"issue":"23","key":"2024050310301065200_ref17","first-page":"495","article-title":"Introduction to convolutional neural networks. National key lab for novel software technology","volume":"5","author":"Jianxin","year":"2017","journal-title":"Nanjing University China"},{"issue":"9","key":"2024050310301065200_ref18","first-page":"e33","article-title":"A gentle introduction to graph neural networks","volume":"6","author":"Sanchez-Lengeling","year":"2021","journal-title":"Distill"},{"key":"2024050310301065200_ref19","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho","year":"2020","journal-title":"Advances in neural information processing systems"},{"key":"2024050310301065200_ref20","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Advances in neural information processing systems"},{"key":"2024050310301065200_ref21","article-title":"Transformer protein language models are unsupervised structure learners","author":"Rao","year":"2020","journal-title":"bioRxiv"},{"key":"2024050310301065200_ref22","doi-asserted-by":"crossref","first-page":"1099","DOI":"10.1038\/s41587-022-01618-2","article-title":"Large language models generate functional protein sequences across diverse families","volume":"41","author":"Madani","year":"2023","journal-title":"Nat Biotechnol"},{"issue":"4","key":"2024050310301065200_ref23","doi-asserted-by":"crossref","first-page":"1114","DOI":"10.1093\/bioinformatics\/btz699","article-title":"Protein\u2013protein interaction site prediction through combining local and global features with deep neural networks","volume":"36","author":"Zeng","year":"2020","journal-title":"Bioinformatics"},{"issue":"2","key":"2024050310301065200_ref24","doi-asserted-by":"crossref","first-page":"422","DOI":"10.1093\/bioinformatics\/btz595","article-title":"Deepgoplus: improved protein function prediction from sequence","volume":"36","author":"Kulmanov","year":"2020","journal-title":"Bioinformatics"},{"issue":"18","key":"2024050310301065200_ref25","doi-asserted-by":"crossref","first-page":"2825","DOI":"10.1093\/bioinformatics\/btab198","article-title":"Tale: transformer-based protein function annotation with joint sequence\u2013label embedding","volume":"37","author":"Cao","year":"2021","journal-title":"Bioinformatics"},{"issue":"8","key":"2024050310301065200_ref26","doi-asserted-by":"crossref","first-page":"giaa081","DOI":"10.1093\/gigascience\/giaa081","article-title":"Graph2go: a multi-modal attributed network embedding method for inferring protein functions","volume":"9","author":"Fan","year":"2020","journal-title":"GigaScience"},{"key":"2024050310301065200_ref27","article-title":"Structure-based protein function prediction using graph convolutional networks","volume":"12","author":"Vladimir Gligorijevi\u0107","year":"2021","journal-title":"Nat Commun"},{"issue":"W1","key":"2024050310301065200_ref28","doi-asserted-by":"crossref","first-page":"W379","DOI":"10.1093\/nar\/gkz388","article-title":"Netgo: improving large-scale protein function prediction with massive network information","volume":"47","author":"You","year":"2019","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2024050310301065200_ref29","doi-asserted-by":"crossref","first-page":"D605","DOI":"10.1093\/nar\/gkaa1074","article-title":"The string database in 2021: customizable protein\u2013protein networks, and functional characterization of user-uploaded gene\/measurement sets","volume":"49","author":"Szklarczyk","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"Supplement_1","key":"2024050310301065200_ref30","doi-asserted-by":"crossref","first-page":"i262","DOI":"10.1093\/bioinformatics\/btab270","article-title":"Deepgraphgo: graph neural network for large-scale, multispecies protein function prediction","volume":"37","author":"You","year":"2021","journal-title":"Bioinformatics"},{"issue":"4096","key":"2024050310301065200_ref31","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1126\/science.181.4096.223","article-title":"Principles that govern the folding of protein chains","volume":"181","author":"Anfinsen","year":"1973","journal-title":"Science"},{"issue":"2","key":"2024050310301065200_ref32","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1038\/nmeth.f.203","article-title":"Protein crystallization: from purified protein to diffraction-quality crystal","volume":"5","author":"Chayen","year":"2008","journal-title":"Nat Methods"},{"issue":"7832","key":"2024050310301065200_ref33","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1038\/s41586-020-2833-4","article-title":"Atomic-resolution protein structure determination by cryo-em","volume":"587","author":"Yip","year":"2020","journal-title":"Nature"},{"key":"2024050310301065200_ref34","doi-asserted-by":"crossref","first-page":"1222182","DOI":"10.3389\/fbinf.2023.1222182","article-title":"Current successes and remaining challenges in protein function prediction","volume":"3","author":"Jeffery","year":"2023","journal-title":"Front Bioinf"},{"issue":"7","key":"2024050310301065200_ref35","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1038\/nrd.2018.77","article-title":"Cryo-em in drug discovery: achievements, limitations and prospects","volume":"17","author":"Renaud","year":"2018","journal-title":"Nat Rev Drug Discov"},{"issue":"7873","key":"2024050310301065200_ref36","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with alphafold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"issue":"12","key":"2024050310301065200_ref37","doi-asserted-by":"crossref","first-page":"5634","DOI":"10.1038\/s41596-021-00628-9","article-title":"The trrosetta server for fast and accurate protein structure prediction","volume":"16","author":"Zongyang","year":"2021","journal-title":"Nat Protoc"},{"issue":"D1","key":"2024050310301065200_ref38","doi-asserted-by":"crossref","first-page":"D364","DOI":"10.1093\/nar\/gku1028","article-title":"A series of pdb-related databanks for everyday needs","volume":"43","author":"Touw","year":"2015","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"2024050310301065200_ref39","doi-asserted-by":"crossref","first-page":"2577","DOI":"10.1002\/bip.360221211","article-title":"Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features","volume":"22","author":"Kabsch","year":"1983","journal-title":"Biopolymers"},{"key":"2024050310301065200_ref40","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab502","article-title":"Accurate protein function prediction via graph attention networks with predicted structure information","volume":"23","author":"Lai","year":"2022","journal-title":"Brief Bioinform"},{"issue":"7","key":"2024050310301065200_ref41","doi-asserted-by":"crossref","first-page":"2511","DOI":"10.1021\/acs.jproteome.8b00262","article-title":"Functional annotation of proteins encoded by the minimal bacterial genome based on secondary structure element alignment","volume":"17","author":"Yang","year":"2018","journal-title":"J Proteome Res"},{"issue":"1","key":"2024050310301065200_ref42","doi-asserted-by":"crossref","first-page":"7607","DOI":"10.1038\/s41598-022-11684-w","article-title":"Reaching alignment-profile-based accuracy in predicting protein secondary and tertiary structural properties without alignment","volume":"12","author":"Singh","year":"2022","journal-title":"Sci Rep"},{"issue":"8000","key":"2024050310301065200_ref43","doi-asserted-by":"crossref","first-page":"897","DOI":"10.1038\/s41586-023-07004-5","article-title":"Conformational ensembles of the human intrinsically disordered proteome","volume":"626","author":"Tesei","year":"2024","journal-title":"Nature"},{"issue":"1","key":"2024050310301065200_ref44","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1038\/nmeth.3176","article-title":"Fast and sensitive protein alignment using diamond","volume":"12","author":"Buchfink","year":"2015","journal-title":"Nat Methods"},{"key":"2024050310301065200_ref45","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbad117","article-title":"Fast and accurate protein function prediction from sequence through pretrained language model and homology-based label diffusion","volume":"24","author":"Yuan","year":"2023","journal-title":"Brief Bioinform"},{"key":"2024050310301065200_ref46","doi-asserted-by":"crossref","first-page":"bbad201","DOI":"10.1093\/bib\/bbad201","article-title":"Mmsmaplus: a multi-view multi-scale multi-attention embedding model for protein function prediction","author":"Wang","year":"2023","journal-title":"Brief Bioinform"},{"issue":"D1","key":"2024050310301065200_ref47","doi-asserted-by":"crossref","first-page":"D480","DOI":"10.1093\/nar\/gkaa1100","article-title":"Uniprot: the universal protein knowledgebase in 2021","volume":"49","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"10","key":"2024050310301065200_ref48","doi-asserted-by":"crossref","first-page":"7112","DOI":"10.1109\/TPAMI.2021.3095381","article-title":"Prottrans: toward understanding the language of life through self-supervised learning","volume":"44","author":"Elnaggar","year":"2021","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2024050310301065200_ref49","article-title":"Pytorch: An imperative style, high-performance deep learning library","volume-title":"Advances in Neural Information Processing Systems","author":"Paszke","year":"2019"},{"key":"2024050310301065200_ref50","article-title":"Adam: a method for stochastic optimization","author":"Kingma","year":"2014","journal-title":"arXiv preprint arXiv:14126980"},{"issue":"3","key":"2024050310301065200_ref51","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1038\/nmeth.2340","article-title":"A large-scale evaluation of computational protein function prediction","volume":"10","author":"Radivojac","year":"2013","journal-title":"Nat Methods"},{"issue":"13","key":"2024050310301065200_ref52","doi-asserted-by":"crossref","first-page":"i53","DOI":"10.1093\/bioinformatics\/btt228","article-title":"Information-theoretic evaluation of predicted ontological annotations","volume":"29","author":"Clark","year":"2013","journal-title":"Bioinformatics"},{"key":"2024050310301065200_ref53","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1145\/1143844.1143874","article-title":"The relationship between precision-recall and roc curves","volume-title":"Proceedings of the 23rd international conference on Machine learning","author":"Davis","year":"2006"},{"issue":"4","key":"2024050310301065200_ref54","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/nmicrobiol.2016.9","article-title":"Slam is an outer membrane protein that is required for the surface display of lipidated virulence factors in neisseria","volume":"1","author":"Hooda","year":"2016","journal-title":"Nat Microbiol"},{"issue":"47","key":"2024050310301065200_ref55","doi-asserted-by":"crossref","first-page":"32858","DOI":"10.1074\/jbc.M114.582338","article-title":"Identification of palmitoyltransferase and thioesterase enzymes that control the subcellular localization of axon survival factor nicotinamide mononucleotide adenylyltransferase 2 (nmnat2)","volume":"289","author":"Milde","year":"2014","journal-title":"J Biol Chem"},{"key":"2024050310301065200_ref56","author":"Quickgo go:0002084","year":"2023"},{"issue":"2","key":"2024050310301065200_ref57","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1038\/s41592-019-0666-6","article-title":"Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning","volume":"17","author":"Gainza","year":"2020","journal-title":"Nat Methods"},{"issue":"34","key":"2024050310301065200_ref58","doi-asserted-by":"crossref","first-page":"15519","DOI":"10.1021\/jacs.2c03858","article-title":"Pseudo-isolated $\\alpha $-helix platform for the recognition of deep and narrow targets","volume":"144","author":"Kim","year":"2022","journal-title":"J Am Chem Soc"},{"key":"2024050310301065200_ref59","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1016\/j.ijbiomac.2022.07.103","article-title":"In pursuit of next-generation therapeutics: antimicrobial peptides against superbugs, their sources, mechanism of action, nanotechnology-based delivery, and clinical applications","volume":"218","author":"Thakur","year":"2022","journal-title":"Int J Biol Macromol"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae196\/57390436\/bbae196.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae196\/57390436\/bbae196.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,3]],"date-time":"2024-05-03T06:30:54Z","timestamp":1714717854000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae196\/7663430"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,27]]},"references-count":59,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,3,27]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae196","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2024.03.30.584129","asserted-by":"object"}]},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5,1]]},"published":{"date-parts":[[2024,3,27]]},"article-number":"bbae196"}}