{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T09:26:06Z","timestamp":1775985966385,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1010668","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,11,18]],"date-time":"2022-11-18T00:00:00Z","timestamp":1668729600000}}],"reference-count":54,"publisher":"Public Library of Science (PLoS)","issue":"10","license":[{"start":{"date-parts":[[2022,10,31]],"date-time":"2022-10-31T00:00:00Z","timestamp":1667174400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62271049"],"award-info":[{"award-number":["62271049"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Beijing Natural Science Foundation","award":["JQ19019"],"award-info":[{"award-number":["JQ19019"]}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Intrinsically disordered proteins and regions (IDP\/IDRs) are widespread in living organisms and perform various essential molecular functions. These functions are summarized as six general categories, including entropic chain, assembler, scavenger, effector, display site, and chaperone. The alteration of IDP functions is responsible for many human diseases. Therefore, identifying the function of disordered proteins is helpful for the studies of drug target discovery and rational drug design. Experimental identification of the molecular functions of IDP in the wet lab is an expensive and laborious procedure that is not applicable on a large scale. Some computational methods have been proposed and mainly focus on predicting the entropic chain function of IDRs, while the computational predictive methods for the remaining five important categories of disordered molecular functions are desired. Motivated by the growing numbers of experimental annotated functional sequences and the need to expand the coverage of disordered protein function predictors, we proposed DMFpred for disordered molecular functions prediction, covering disordered assembler, scavenger, effector, display site and chaperone. DMFpred employs the Protein Cubic Language Model (PCLM), which incorporates three protein language models for characterizing sequences, structural and functional features of proteins, and attention-based alignment for understanding the relationship among three captured features and generating a joint representation of proteins. The PCLM was pre-trained with large-scaled IDR sequences and fine-tuned with functional annotation sequences for molecular function prediction. The predictive performance evaluation on five categories of functional and multi-functional residues suggested that DMFpred provides high-quality predictions. The web-server of DMFpred can be freely accessed from<jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"http:\/\/bliulab.net\/DMFpred\/\" xlink:type=\"simple\">http:\/\/bliulab.net\/DMFpred\/<\/jats:ext-link>.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1010668","type":"journal-article","created":{"date-parts":[[2022,10,31]],"date-time":"2022-10-31T17:54:39Z","timestamp":1667238879000},"page":"e1010668","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":10,"title":["DMFpred: Predicting protein disorder molecular functions based on protein cubic language model"],"prefix":"10.1371","volume":"18","author":[{"given":"Yihe","family":"Pang","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6314-0762","authenticated-orcid":true,"given":"Bin","family":"Liu","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2022,10,31]]},"reference":[{"issue":"2","key":"pcbi.1010668.ref001","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1080\/07391102.2012.675145","article-title":"Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life","volume":"30","author":"B Xue","year":"2012","journal-title":"J Biomol Struct Dyn"},{"issue":"1","key":"pcbi.1010668.ref002","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1007\/s00018-014-1661-9","article-title":"Exceptionally abundant exceptions: comprehensive characterization of intrinsic disorder in all domains of life","volume":"72","author":"Z Peng","year":"2015","journal-title":"Cell Mol Life Sci"},{"issue":"21","key":"pcbi.1010668.ref003","doi-asserted-by":"crossref","first-page":"6573","DOI":"10.1021\/bi012159+","article-title":"Intrinsic disorder and protein function","volume":"41","author":"AK Dunker","year":"2002","journal-title":"Biochemistry"},{"issue":"13","key":"pcbi.1010668.ref004","doi-asserted-by":"crossref","first-page":"6589","DOI":"10.1021\/cr400525m","article-title":"Classification of intrinsically disordered regions and proteins","volume":"114","author":"R van der Lee","year":"2014","journal-title":"Chem Rev"},{"issue":"15","key":"pcbi.1010668.ref005","doi-asserted-by":"crossref","first-page":"3346","DOI":"10.1016\/j.febslet.2005.03.072","article-title":"The interplay between structure and function in intrinsically unstructured proteins","volume":"579","author":"P. Tompa","year":"2005","journal-title":"FEBS Lett"},{"issue":"3","key":"pcbi.1010668.ref006","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1016\/S0022-2836(02)00969-5","article-title":"Intrinsic disorder in cell-signaling and cancer-associated proteins","volume":"323","author":"LM Iakoucheva","year":"2002","journal-title":"J Mol Biol"},{"issue":"50","key":"pcbi.1010668.ref007","doi-asserted-by":"crossref","first-page":"14336","DOI":"10.1073\/pnas.1610137113","article-title":"A functional role for intrinsic disorder in the tau-tubulin complex","volume":"113","author":"AM Melo","year":"2016","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"1","key":"pcbi.1010668.ref008","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1016\/S0028-3908(03)00140-0","article-title":"Part II: alpha-synuclein and its molecular pathophysiological role in neurodegenerative disease.","volume":"45","author":"KK Dev","year":"2003","journal-title":"Neuropharmacology"},{"issue":"10","key":"pcbi.1010668.ref009","doi-asserted-by":"crossref","first-page":"435","DOI":"10.1016\/j.tibtech.2006.07.005","article-title":"Rational drug design via intrinsically disordered protein","volume":"24","author":"Y Cheng","year":"2006","journal-title":"Trends Biotechnol"},{"issue":"6","key":"pcbi.1010668.ref010","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1517\/17460441.2012.686489","article-title":"Intrinsically disordered proteins and novel strategies for drug discovery","volume":"7","author":"VN Uversky","year":"2012","journal-title":"Expert Opin Drug Discov"},{"issue":"10","key":"pcbi.1010668.ref011","doi-asserted-by":"crossref","first-page":"527","DOI":"10.1016\/S0968-0004(02)02169-2","article-title":"Intrinsically unstructured proteins","volume":"27","author":"P. Tompa","year":"2002","journal-title":"Trends Biochem Sci"},{"issue":"3","key":"pcbi.1010668.ref012","doi-asserted-by":"crossref","first-page":"277","DOI":"10.1007\/s00239-007-9011-2","article-title":"Dynamic behavior of an intrinsically unstructured linker domain is conserved in the face of negligible amino acid sequence conservation","volume":"65","author":"GW Daughdrill","year":"2007","journal-title":"J Mol Evol"},{"issue":"1","key":"pcbi.1010668.ref013","doi-asserted-by":"crossref","first-page":"e26782","DOI":"10.4161\/idp.26782","article-title":"Disorder in the lifetime of a protein","volume":"1","author":"VN Uversky","year":"2013","journal-title":"Intrinsically Disord Proteins"},{"issue":"6","key":"pcbi.1010668.ref014","doi-asserted-by":"crossref","first-page":"573","DOI":"10.1016\/0306-4522(78)90022-2","article-title":"The character of the stored molecules in chromaffin granules of the adrenal medulla: a nuclear magnetic resonance study","volume":"3","author":"AJ Daniels","year":"1978","journal-title":"Neuroscience"},{"issue":"3","key":"pcbi.1010668.ref015","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1016\/j.sbi.2013.02.010","article-title":"Unfolded phosphopolypeptides enable soft and hard tissues to coexist in the same organism with relative ease","volume":"23","author":"C. Holt","year":"2013","journal-title":"Curr Opin Struct Biol"},{"issue":"29","key":"pcbi.1010668.ref016","doi-asserted-by":"crossref","first-page":"7598","DOI":"10.1021\/bi8006803","article-title":"Regulation of cell division by intrinsically unstructured proteins: intrinsic flexibility, modularity, and signaling conduits","volume":"47","author":"CA Galea","year":"2008","journal-title":"Biochemistry"},{"key":"pcbi.1010668.ref017","doi-asserted-by":"crossref","first-page":"6580","DOI":"10.2741\/3175","article-title":"Understanding eukaryotic linear motifs and their role in cell signaling and regulation","volume":"13","author":"F Diella","year":"2008","journal-title":"Front Biosci"},{"issue":"10","key":"pcbi.1010668.ref018","doi-asserted-by":"crossref","first-page":"781","DOI":"10.1038\/nrm1492","article-title":"Pathways of chaperone-mediated protein folding in the cytosol","volume":"5","author":"JC Young","year":"2004","journal-title":"Nat Rev Mol Cell Biol"},{"issue":"5","key":"pcbi.1010668.ref019","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1038\/s41592-021-01117-3","article-title":"Critical assessment of protein intrinsic disorder prediction.","volume":"18","author":"M Necci","year":"2021","journal-title":"Nat Methods."},{"issue":"D1","key":"pcbi.1010668.ref020","doi-asserted-by":"crossref","first-page":"D361","DOI":"10.1093\/nar\/gkaa1058","article-title":"MobiDB: intrinsically disordered proteins in 2021","volume":"49","author":"D Piovesan","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"pcbi.1010668.ref021","doi-asserted-by":"crossref","first-page":"D219","DOI":"10.1093\/nar\/gkw1056","article-title":"DisProt 7.0: a major update of the database of disordered proteins","volume":"45","author":"D Piovesan","year":"2017","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"pcbi.1010668.ref022","first-page":"D269","article-title":"DisProt: intrinsic protein disorder annotation in 2020","volume":"48","author":"A Hatos","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"pcbi.1010668.ref023","doi-asserted-by":"crossref","first-page":"i341","DOI":"10.1093\/bioinformatics\/btw280","article-title":"DFLpred: High-throughput prediction of disordered flexible linker regions in protein sequences","volume":"32","author":"F Meng","year":"2016","journal-title":"Bioinformatics"},{"issue":"Suppl_2","key":"pcbi.1010668.ref024","first-page":"i754","article-title":"APOD: accurate sequence-based predictor of disordered flexible linkers","volume":"36","author":"Z Peng","year":"2020","journal-title":"Bioinformatics"},{"key":"pcbi.1010668.ref025","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/j.jtbi.2017.10.015","article-title":"MoRFPred-plus: Computational Identification of MoRFs in Protein Sequences using Physicochemical Properties and HMM profiles","volume":"437","author":"R Sharma","year":"2018","journal-title":"J Theor Biol"},{"issue":"12","key":"pcbi.1010668.ref026","doi-asserted-by":"crossref","first-page":"i75","DOI":"10.1093\/bioinformatics\/bts209","article-title":"MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins","volume":"28","author":"FM Disfani","year":"2012","journal-title":"Bioinformatics"},{"issue":"4","key":"pcbi.1010668.ref027","doi-asserted-by":"crossref","first-page":"1107","DOI":"10.1093\/bioinformatics\/btz691","article-title":"Identifying molecular recognition features in intrinsically disordered regions of proteins by transfer learning","volume":"36","author":"J Hanson","year":"2020","journal-title":"Bioinformatics"},{"issue":"5","key":"pcbi.1010668.ref028","doi-asserted-by":"crossref","first-page":"e1000376","DOI":"10.1371\/journal.pcbi.1000376","article-title":"Prediction of protein binding regions in disordered proteins","volume":"5","author":"B Meszaros","year":"2009","journal-title":"PLoS Comput Biol"},{"issue":"W1","key":"pcbi.1010668.ref029","doi-asserted-by":"crossref","first-page":"W329","DOI":"10.1093\/nar\/gky384","article-title":"IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding","volume":"46","author":"B Meszaros","year":"2018","journal-title":"Nucleic Acids Res"},{"issue":"W1","key":"pcbi.1010668.ref030","doi-asserted-by":"crossref","first-page":"W488","DOI":"10.1093\/nar\/gkw409","article-title":"MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences","volume":"44","author":"N Malhis","year":"2016","journal-title":"Nucleic Acids Res"},{"issue":"6","key":"pcbi.1010668.ref031","doi-asserted-by":"crossref","first-page":"e1800058","DOI":"10.1002\/pmic.201800058","article-title":"OPAL+: Length-Specific MoRF Prediction in Intrinsically Disordered Protein Sequences","volume":"19","author":"R Sharma","year":"2019","journal-title":"Proteomics"},{"issue":"11","key":"pcbi.1010668.ref032","doi-asserted-by":"crossref","first-page":"1850","DOI":"10.1093\/bioinformatics\/bty032","article-title":"OPAL: prediction of MoRF regions in intrinsically disordered protein sequences","volume":"34","author":"R Sharma","year":"2018","journal-title":"Bioinformatics"},{"issue":"18","key":"pcbi.1010668.ref033","doi-asserted-by":"crossref","first-page":"e121","DOI":"10.1093\/nar\/gkv585","article-title":"High-throughput prediction of RNA, DNA and protein binding regions mediated by intrinsic disorder","volume":"43","author":"Z Peng","year":"2015","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"pcbi.1010668.ref034","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbab521","article-title":"DeepDISOBind: accurate prediction of RNA-, DNA- and protein-binding intrinsically disordered residues with deep multi-task learning","volume":"23","author":"F Zhang","year":"2022","journal-title":"Brief Bioinform"},{"key":"pcbi.1010668.ref035","article-title":"DisoLipPred: Accurate prediction of disordered lipid binding residues in protein sequences with deep recurrent networks and transfer learning","author":"A Katuwawala","year":"2021","journal-title":"Bioinformatics"},{"issue":"2","key":"pcbi.1010668.ref036","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1006\/jmbi.1999.3110","article-title":"Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm","volume":"293","author":"PE Wright","year":"1999","journal-title":"J Mol Biol"},{"issue":"6912","key":"pcbi.1010668.ref037","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1038\/nature01255","article-title":"The language of genes","volume":"420","author":"DB Searls","year":"2002","journal-title":"Nature"},{"issue":"10","key":"pcbi.1010668.ref038","doi-asserted-by":"crossref","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","article-title":"A survey on transfer learning","volume":"22","author":"SJ Pan","year":"2009","journal-title":"IEEE Transactions on knowledge and data engineering"},{"issue":"17","key":"pcbi.1010668.ref039","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"SF Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"issue":"21","key":"pcbi.1010668.ref040","doi-asserted-by":"crossref","first-page":"5177","DOI":"10.1093\/bioinformatics\/btaa667","article-title":"IDP-Seq2Seq: identification of intrinsically disordered regions based on sequence to sequence learning","volume":"36","author":"YJ Tang","year":"2021","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1010668.ref041","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1186\/s12859-019-3019-7","article-title":"HH-suite3 for fast remote homology detection and deep protein annotation","volume":"20","author":"M Steinegger","year":"2019","journal-title":"BMC Bioinformatics"},{"issue":"5","key":"pcbi.1010668.ref042","doi-asserted-by":"crossref","first-page":"1733","DOI":"10.1093\/bib\/bbz098","article-title":"DeepSVM-fold: protein fold recognition by combining support vector machines and pairwise sequence similarity scores generated by deep learning networks","volume":"21","author":"B Liu","year":"2020","journal-title":"Brief Bioinform"},{"issue":"4","key":"pcbi.1010668.ref043","doi-asserted-by":"crossref","first-page":"1061","DOI":"10.1002\/prot.22934","article-title":"Learning generative models for protein fold families","volume":"79","author":"S Balakrishnan","year":"2011","journal-title":"Proteins"},{"issue":"1","key":"pcbi.1010668.ref044","doi-asserted-by":"crossref","first-page":"012707","DOI":"10.1103\/PhysRevE.87.012707","article-title":"Improved contact prediction in proteins: using pseudolikelihoods to infer Potts models","volume":"87","author":"M Ekeberg","year":"2013","journal-title":"Phys Rev E Stat Nonlin Soft Matter Phys"},{"key":"pcbi.1010668.ref045","unstructured":"Nair V, Hinton GE, editors. Rectified linear units improve restricted boltzmann machines. Proceedings of the 27th International Conference on International Conference on Machine Learning; 2010."},{"issue":"6","key":"pcbi.1010668.ref046","doi-asserted-by":"crossref","first-page":"2133","DOI":"10.1093\/bib\/bbz133","article-title":"MotifCNN-fold: protein fold recognition based on fold-specific features extracted by motif-based convolutional neural networks","volume":"21","author":"CC Li","year":"2020","journal-title":"Brief Bioinform"},{"issue":"22","key":"pcbi.1010668.ref047","doi-asserted-by":"crossref","first-page":"5860","DOI":"10.1016\/j.jmb.2020.09.008","article-title":"iDRBP_MMC: Identifying DNA-Binding Proteins and RNA-Binding Proteins Based on Multi-Label Learning Model and Motif-Based Convolutional Neural Network","volume":"432","author":"J Zhang","year":"2020","journal-title":"J Mol Biol"},{"key":"pcbi.1010668.ref048","article-title":"SelfAT-Fold: protein fold recognition based on residue-based and motif-based self-attention networks","author":"Y Pang","year":"2020","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"issue":"D1","key":"pcbi.1010668.ref049","first-page":"D296","article-title":"ELM-the eukaryotic linear motif resource in 2020","volume":"48","author":"M Kumar","year":"2020","journal-title":"Nucleic Acids Res"},{"issue":"2","key":"pcbi.1010668.ref050","first-page":"291","article-title":"The Importance of the Loss Function in Option Valuation","volume":"72","author":"P Christoffersen","year":"2003","journal-title":"CIRANO"},{"key":"pcbi.1010668.ref051","unstructured":"Kingma D, Ba J. Adam: A Method for Stochastic Optimization. International Conference on Learning Representations2015. p. 1\u201311."},{"issue":"1","key":"pcbi.1010668.ref052","doi-asserted-by":"crossref","first-page":"5407","DOI":"10.1038\/s41467-019-13395-9","article-title":"RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning","volume":"10","author":"J Singh","year":"2019","journal-title":"Nat Commun"},{"key":"pcbi.1010668.ref053","article-title":"PreRBP-TL: Prediction of Species-Specific RNA-Binding Proteins Based on Transfer Learning","author":"J Zhang","year":"2022","journal-title":"Bioinformatics"},{"issue":"8","key":"pcbi.1010668.ref054","doi-asserted-by":"crossref","first-page":"1819","DOI":"10.1109\/TKDE.2013.39","article-title":"A review on multi-label learning algorithms","volume":"26","author":"M-L Zhang","year":"2013","journal-title":"IEEE transactions on knowledge and data engineering"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1010668","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,11,18]],"date-time":"2022-11-18T00:00:00Z","timestamp":1668729600000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010668","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,10]],"date-time":"2023-03-10T20:07:49Z","timestamp":1678478869000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010668"}},"subtitle":[],"editor":[{"given":"Jeffrey","family":"Skolnick","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,10,31]]},"references-count":54,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2022,10,31]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1010668","relation":{"new_version":[{"id-type":"doi","id":"10.1371\/journal.pcbi.1010668","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,31]]}}}