{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T07:59:46Z","timestamp":1775721586611,"version":"3.50.1"},"reference-count":62,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2019,9,5]],"date-time":"2019-09-05T00:00:00Z","timestamp":1567641600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"publisher","award":["DP180102060"],"award-info":[{"award-number":["DP180102060"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000925","name":"National Health and Medical Research Council","doi-asserted-by":"publisher","award":["1121629"],"award-info":[{"award-number":["1121629"]}],"id":[{"id":"10.13039\/501100000925","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100015449","name":"Queensland Cyber Infrastructure Foundation","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100015449","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>Protein intrinsic disorder describes the tendency of sequence residues to not fold into a rigid three-dimensional shape by themselves. However, some of these disordered regions can transition from disorder to order when interacting with another molecule in segments known as molecular recognition features (MoRFs). Previous analysis has shown that these MoRF regions are indirectly encoded within the prediction of residue disorder as low-confidence predictions [i.e. in a semi-disordered state P(D)\u22480.5]. 
Thus, what has been learned for disorder prediction may be transferable to MoRF prediction. Transferring the internal characterization of protein disorder for the prediction of MoRF residues would allow us to take advantage of the large training set available for disorder prediction, enabling the training of larger analytical models than is currently feasible on the small number of currently available annotated MoRF proteins. In this paper, we propose a new method for MoRF prediction by transfer learning from the SPOT-Disorder2 ensemble models built for disorder prediction.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We confirm that directly training on the MoRF set with a randomly initialized model yields substantially poorer performance on independent test sets than by using the transfer-learning-based method SPOT-MoRF, for both deep and simple networks. Its comparison to current state-of-the-art techniques reveals its superior performance in identifying MoRF binding regions in proteins across two independent testing sets, including our new dataset of &gt;800 protein chains. These test chains share &lt;30% sequence similarity to all training and validation proteins used in SPOT-Disorder2 and SPOT-MoRF, and provide a much-needed large-scale update on the performance of current MoRF predictors. 
The method is expected to be useful in locating functional disordered regions in proteins.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>SPOT-MoRF and its data are available as a web server and as a standalone program at: http:\/\/sparks-lab.org\/jack\/server\/SPOT-MoRF\/index.php.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz691","type":"journal-article","created":{"date-parts":[[2019,9,1]],"date-time":"2019-09-01T03:40:44Z","timestamp":1567309244000},"page":"1107-1113","source":"Crossref","is-referenced-by-count":48,"title":["Identifying molecular recognition features in intrinsically disordered regions of proteins by transfer learning"],"prefix":"10.1093","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6956-6748","authenticated-orcid":false,"given":"Jack","family":"Hanson","sequence":"first","affiliation":[{"name":"Signal Processing Laboratory, Griffith University , Brisbane, QLD 4122, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thomas","family":"Litfin","sequence":"additional","affiliation":[{"name":"Institute for Glycomics, School of Information and Communication Technology , Griffith University, Southport, QLD 4222, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kuldip","family":"Paliwal","sequence":"additional","affiliation":[{"name":"Signal Processing Laboratory, Griffith University , Brisbane, QLD 4122, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9958-5699","authenticated-orcid":false,"given":"Yaoqi","family":"Zhou","sequence":"additional","affiliation":[{"name":"Institute for Glycomics, School of Information and Communication Technology , Griffith University, Southport, QLD 4222, 
Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2019,9,5]]},"reference":[{"key":"2023013110110038700_btz691-B1","article-title":"Tensorflow: large-scale machine learning on heterogeneous distributed systems","author":"Abadi","year":"2016","journal-title":"CoRR"},{"key":"2023013110110038700_btz691-B2","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"2023013110110038700_btz691-B3","first-page":"110","article-title":"Sequence-based prediction of molecular recognition features in disordered proteins","volume":"2","author":"Chun","year":"2013","journal-title":"J. Med. Bioeng"},{"key":"2023013110110038700_btz691-B4","first-page":"233","author":"Davis","year":"2006"},{"key":"2023013110110038700_btz691-B5","doi-asserted-by":"crossref","first-page":"i75","DOI":"10.1093\/bioinformatics\/bts209","article-title":"MoRFpred, a computational tool for sequence-based prediction and characterization of short disorder-to-order transitioning binding regions in proteins","volume":"28","author":"Disfani","year":"2012","journal-title":"Bioinformatics"},{"key":"2023013110110038700_btz691-B6","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/S0959-440X(02)00289-0","article-title":"Coupling of folding and binding for unstructured proteins","volume":"12","author":"Dyson","year":"2002","journal-title":"Curr. Opin. Struct. Biol"},{"key":"2023013110110038700_btz691-B7","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1038\/nrm1589","article-title":"Intrinsically unstructured proteins and their functions","volume":"6","author":"Dyson","year":"2005","journal-title":"Nat. Rev. Mol. 
Cell Biol"},{"key":"2023013110110038700_btz691-B8","first-page":"50","author":"Fang","year":"2018"},{"key":"2023013110110038700_btz691-B9","volume-title":"Deep Learning","author":"Goodfellow","year":"2016"},{"key":"2023013110110038700_btz691-B10","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1148\/radiology.143.1.7063747","article-title":"The meaning and use of the area under a receiver operating characteristic (ROC) curve","volume":"143","author":"Hanley","year":"1982","journal-title":"Radiology"},{"key":"2023013110110038700_btz691-B11","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1093\/bioinformatics\/btw678","article-title":"Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks","volume":"33","author":"Hanson","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013110110038700_btz691-B12","doi-asserted-by":"crossref","first-page":"4039","DOI":"10.1093\/bioinformatics\/bty481","article-title":"Accurate prediction of protein contact maps by coupling residual two-dimensional bidirectional long short-term memory with convolutional neural networks","volume":"34","author":"Hanson","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013110110038700_btz691-B13","doi-asserted-by":"crossref","first-page":"2369","DOI":"10.1021\/acs.jcim.8b00636","article-title":"Accurate single-sequence prediction of protein intrinsic disorder by an ensemble of deep recurrent and convolutional architectures","volume":"58","author":"Hanson","year":"2018","journal-title":"J. Chem. Inf. Model"},{"key":"2023013110110038700_btz691-B14","article-title":"Enhancing protein intrinsic disorder prediction by utilizing deep squeeze and excitation residual inception and long short-term memory networks","author":"Hanson","year":"2019","journal-title":"Genom. Proteom. 
Bioinf"},{"key":"2023013110110038700_btz691-B15","article-title":"Getting to know your neighbor: protein structure prediction comes of age with contextual machine learning","author":"Hanson","year":"2019","journal-title":"J. Comput. Biol"},{"key":"2023013110110038700_btz691-B16","doi-asserted-by":"crossref","first-page":"2403","DOI":"10.1093\/bioinformatics\/bty1006","article-title":"Improving prediction of protein secondary structure, backbone angles, solvent accessibility, and contact numbers by using predicted contact maps and an ensemble of recurrent and residual convolutional neural networks","volume":"35","author":"Hanson","year":"2019","journal-title":"Bioinformatics"},{"key":"2023013110110038700_btz691-B17","doi-asserted-by":"crossref","first-page":"e100.","DOI":"10.1371\/journal.pcbi.0020100","article-title":"Intrinsic disorder is a common feature of hub proteins from four eukaryotic interactomes","volume":"2","author":"Haynes","year":"2006","journal-title":"PLoS Comput. Biol"},{"key":"2023013110110038700_btz691-B18","first-page":"770","author":"He","year":"2016"},{"key":"2023013110110038700_btz691-B19","doi-asserted-by":"crossref","first-page":"635.","DOI":"10.3390\/e21070635","article-title":"Prediction of MoRFs in protein sequences with MLPs based on sequence properties and evolution information","volume":"21","author":"He","year":"2019","journal-title":"Entropy"},{"key":"2023013110110038700_btz691-B20","doi-asserted-by":"crossref","first-page":"2842","DOI":"10.1093\/bioinformatics\/btx218","article-title":"Capturing non-local interactions by long short term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure","volume":"33","author":"Heffernan","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013110110038700_btz691-B21","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term 
memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput"},{"key":"2023013110110038700_btz691-B22","doi-asserted-by":"crossref","first-page":"2761.","DOI":"10.3390\/ijms18122761","article-title":"Functional analysis of human hub proteins and their interactors involved in the intrinsic disorder-enriched interactions","volume":"18","author":"Hu","year":"2017","journal-title":"Int. J. Mol. Sci"},{"key":"2023013110110038700_btz691-B23","author":"Hu","year":"2017"},{"key":"2023013110110038700_btz691-B24","doi-asserted-by":"crossref","first-page":"1800243.","DOI":"10.1002\/pmic.201800243","article-title":"Taxonomic landscape of the dark proteomes: whole-proteome scale interplay between structural darkness, intrinsic disorder, and crystallization propensity","volume":"18","author":"Hu","year":"2018","journal-title":"Proteomics"},{"key":"2023013110110038700_btz691-B25","doi-asserted-by":"crossref","first-page":"857","DOI":"10.1093\/bioinformatics\/btu744","article-title":"DISOPRED3: precise disordered region predictions with annotated protein-binding activity","volume":"31","author":"Jones","year":"2015","journal-title":"Bioinformatics"},{"key":"2023013110110038700_btz691-B26","volume-title":"System Analysis by Digital Computer","author":"Kaiser","year":"1966"},{"key":"2023013110110038700_btz691-B27","author":"Keskar","year":"2017"},{"key":"2023013110110038700_btz691-B28","author":"Kingma","year":"2014"},{"key":"2023013110110038700_btz691-B29","doi-asserted-by":"crossref","first-page":"520","DOI":"10.1002\/prot.25674","article-title":"NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning","volume":"87","author":"Klausen","year":"2019","journal-title":"Proteins"},{"key":"2023013110110038700_btz691-B30","first-page":"1176935117699408","article-title":"Therapeutic interventions of cancers using intrinsically disordered proteins as drug targets: c-myc as model 
system","volume":"16","author":"Kumar","year":"2017","journal-title":"Cancer Inf"},{"key":"2023013110110038700_btz691-B31","doi-asserted-by":"crossref","first-page":"W488","DOI":"10.1093\/nar\/gkw409","article-title":"MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences","volume":"44","author":"Malhis","year":"2016","journal-title":"Nucleic Acids Res"},{"key":"2023013110110038700_btz691-B32","doi-asserted-by":"crossref","first-page":"442","DOI":"10.1016\/0005-2795(75)90109-9","article-title":"Comparison of the predicted and observed secondary structure of t4 phage lysozyme","volume":"405","author":"Matthews","year":"1975","journal-title":"Biochim. Biophys. Acta"},{"key":"2023013110110038700_btz691-B33","doi-asserted-by":"crossref","first-page":"e1000376.","DOI":"10.1371\/journal.pcbi.1000376","article-title":"Prediction of protein binding regions in disordered proteins","volume":"5","author":"M\u00e9sz\u00e1ros","year":"2009","journal-title":"PLoS Comput. Biol"},{"key":"2023013110110038700_btz691-B34","doi-asserted-by":"crossref","first-page":"W329","DOI":"10.1093\/nar\/gky384","article-title":"IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding","volume":"46","author":"M\u00e9sz\u00e1ros","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023013110110038700_btz691-B35","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1016\/j.cbpa.2010.06.169","article-title":"Intrinsically disordered proteins are potential drug targets","volume":"14","author":"Metallo","year":"2010","journal-title":"Curr. Opin. Chem. Biol"},{"key":"2023013110110038700_btz691-B36","doi-asserted-by":"crossref","first-page":"1043","DOI":"10.1016\/j.jmb.2006.07.087","article-title":"Analysis of molecular recognition features (MoRFS)","volume":"362","author":"Mohan","year":"2006","journal-title":"J. Mol. 
Biol"},{"key":"2023013110110038700_btz691-B37","first-page":"807","author":"Nair","year":"2010"},{"key":"2023013110110038700_btz691-B38","doi-asserted-by":"crossref","first-page":"1402","DOI":"10.1093\/bioinformatics\/btx015","article-title":"MobiDB-lite: fast and highly specific consensus prediction of intrinsic disorder in proteins","volume":"33","author":"Necci","year":"2017","journal-title":"Bioinformatics"},{"key":"2023013110110038700_btz691-B39","doi-asserted-by":"crossref","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","article-title":"A survey on transfer learning","volume":"22","author":"Pan","year":"2010","journal-title":"IEEE Trans. Knowl. Data Eng"},{"key":"2023013110110038700_btz691-B40","doi-asserted-by":"crossref","first-page":"e121.","DOI":"10.1093\/nar\/gkv585","article-title":"High-throughput prediction of RNA, DNA and protein binding regions mediated by intrinsic disorder","volume":"43","author":"Peng","year":"2015","journal-title":"Nucleic Acids Res"},{"key":"2023013110110038700_btz691-B41","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1002\/prot.20750","article-title":"Assessing protein disorder and induced folding","volume":"62","author":"Receveur-Br\u00e9chot","year":"2005","journal-title":"Proteins"},{"key":"2023013110110038700_btz691-B42","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2012","journal-title":"Nat. 
Methods"},{"key":"2023013110110038700_btz691-B43","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"2023013110110038700_btz691-B44","doi-asserted-by":"crossref","first-page":"2673","DOI":"10.1109\/78.650093","article-title":"Bidirectional recurrent neural networks","volume":"45","author":"Schuster","year":"1997","journal-title":"IEEE Trans. Signal Process"},{"key":"2023013110110038700_btz691-B45","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/j.jtbi.2017.10.015","article-title":"Morfpred-plus: computational identification of MoRFS in protein sequences using physicochemical properties and HMM profiles","volume":"437","author":"Sharma","year":"2018","journal-title":"J. Theor. Biol"},{"key":"2023013110110038700_btz691-B46","doi-asserted-by":"crossref","first-page":"e1800058","DOI":"10.1002\/pmic.201800058","article-title":"OPAL+: length-specific MoRF prediction in intrinsically disordered protein sequences","volume":"19","author":"Sharma","year":"2018","journal-title":"Proteomics"},{"key":"2023013110110038700_btz691-B47","doi-asserted-by":"crossref","first-page":"1850","DOI":"10.1093\/bioinformatics\/bty032","article-title":"OPAL: prediction of MoRF regions in intrinsically disordered protein sequences","volume":"34","author":"Sharma","year":"2018","journal-title":"Bioinformatics"},{"key":"2023013110110038700_btz691-B48","doi-asserted-by":"crossref","first-page":"2033","DOI":"10.1021\/acs.jcim.8b00442","article-title":"Detecting proline and non-proline cis isomers in protein structures from sequences using deep residual ensemble learning","volume":"58","author":"Singh","year":"2018","journal-title":"J. Chem. Inf. 
Model"},{"key":"2023013110110038700_btz691-B49","first-page":"1929","article-title":"Dropout: a simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res"},{"key":"2023013110110038700_btz691-B50","first-page":"12","article-title":"Inception-v4, inception-ReSnet and the impact of residual connections on learning","volume":"4","author":"Szegedy","year":"2017","journal-title":"AAAI"},{"key":"2023013110110038700_btz691-B51","doi-asserted-by":"crossref","first-page":"527","DOI":"10.1016\/S0968-0004(02)02169-2","article-title":"Intrinsically unstructured proteins","volume":"27","author":"Tompa","year":"2002","journal-title":"Trends Biochem. Sci"},{"key":"2023013110110038700_btz691-B52","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1016\/j.theochem.2003.08.047","article-title":"The functional benefits of protein disorder","volume":"666","author":"Tompa","year":"2003","journal-title":"J. Mol. Struct. Theochem"},{"key":"2023013110110038700_btz691-B53","first-page":"D204","article-title":"UniProt: a hub for protein information","volume":"43","year":"2014","journal-title":"Nucleic Acids Res"},{"key":"2023013110110038700_btz691-B54","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1046\/j.0014-2956.2001.02649.x","article-title":"What does it mean to be natively unfolded?","volume":"269","author":"Uversky","year":"2002","journal-title":"Eur. J. 
Biochem"},{"key":"2023013110110038700_btz691-B55","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1002\/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7","article-title":"Why are \u201cnatively unfolded\u201d proteins unstructured under physiologic conditions?","volume":"41","author":"Uversky","year":"2000","journal-title":"Proteins"},{"key":"2023013110110038700_btz691-B56","volume-title":"Statistical Learning Theory","author":"Vapnik","year":"1998"},{"key":"2023013110110038700_btz691-B57","doi-asserted-by":"crossref","first-page":"D483","DOI":"10.1093\/nar\/gks1258","article-title":"Sifts: structure integration with function, taxonomy and sequences resource","volume":"41","author":"Velankar","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023013110110038700_btz691-B58","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pcbi.1005324","article-title":"Accurate de novo prediction of protein contact map by ultra-deep learning model","volume":"13","author":"Wang","year":"2017","journal-title":"PLoS Comput. Biol"},{"key":"2023013110110038700_btz691-B59","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1006\/jmbi.1999.3110","article-title":"Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm","volume":"293","author":"Wright","year":"1999","journal-title":"J. Mol. Biol"},{"key":"2023013110110038700_btz691-B60","doi-asserted-by":"crossref","first-page":"697","DOI":"10.1039\/C5MB00640F","article-title":"Molecular recognition features (MoRFs) in three domains of life","volume":"12","author":"Yan","year":"2016","journal-title":"Mol. 
Biosyst"},{"key":"2023013110110038700_btz691-B61","doi-asserted-by":"crossref","first-page":"D1096","DOI":"10.1093\/nar\/gks966","article-title":"BioLiP: a semi-manually curated database for biologically relevant ligand\u2013protein interactions","volume":"41","author":"Yang","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"2023013110110038700_btz691-B62","doi-asserted-by":"crossref","first-page":"1193","DOI":"10.1007\/s12013-013-9638-0","article-title":"Intrinsically semi-disordered state and its role in induced folding and protein aggregation","volume":"67","author":"Zhang","year":"2013","journal-title":"Cell. Biochem. Biophys"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz691\/30062698\/btz691.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1107\/48983231\/bioinformatics_36_4_1107.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1107\/48983231\/bioinformatics_36_4_1107.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,19]],"date-time":"2023-09-19T18:41:38Z","timestamp":1695148898000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/4\/1107\/5560335"}},"subtitle":[],"editor":[{"given":"Jan","family":"Gorodkin","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2019,9,5]]},"references-count":62,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz691","relation":{},"ISSN":["136
7-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,2,15]]},"published":{"date-parts":[[2019,9,5]]}}}