{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,12]],"date-time":"2026-04-12T08:44:20Z","timestamp":1775983460817,"version":"3.50.1"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1010238","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,7,12]],"date-time":"2022-07-12T00:00:00Z","timestamp":1657584000000}}],"reference-count":125,"publisher":"Public Library of Science (PLoS)","issue":"6","license":[{"start":{"date-parts":[[2022,6,29]],"date-time":"2022-06-29T00:00:00Z","timestamp":1656460800000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000038","name":"natural sciences and engineering research council of canada","doi-asserted-by":"publisher","award":["RGPIN-2018-04924"],"award-info":[{"award-number":["RGPIN-2018-04924"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000024","name":"canadian institutes of health research","doi-asserted-by":"publisher","award":["PJT-148532"],"award-info":[{"award-number":["PJT-148532"]}],"id":[{"id":"10.13039\/501100000024","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001804","name":"canada research chairs","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001804","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001804","name":"canada research chairs","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001804","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada","doi-asserted-by":"publisher","award":["CGS Fellowship"],"award-info":[{"award-number":["CGS 
Fellowship"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100007065","name":"Nvidia","doi-asserted-by":"publisher","award":["GPU academic seeding grand"],"award-info":[{"award-number":["GPU academic seeding grand"]}],"id":[{"id":"10.13039\/100007065","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>A major challenge to the characterization of intrinsically disordered regions (IDRs), which are widespread in the proteome, but relatively poorly understood, is the identification of molecular features that mediate functions of these regions, such as short motifs, amino acid repeats and physicochemical properties. Here, we introduce a proteome-scale feature discovery approach for IDRs. Our approach, which we call \u201creverse homology\u201d, exploits the principle that important functional features are conserved over evolution. We use this as a contrastive learning signal for deep learning: given a set of homologous IDRs, the neural network has to correctly choose a held-out homolog from another set of IDRs sampled randomly from the proteome. We pair reverse homology with a simple architecture and standard interpretation techniques, and show that the network learns conserved features of IDRs that can be interpreted as motifs, repeats, or bulk features like charge or amino acid propensities. We also show that our model can be used to produce visualizations of what residues and regions are most important to IDR function, generating hypotheses for uncharacterized IDRs. 
Our results suggest that feature discovery using unsupervised neural networks is a promising avenue to gain systematic insight into poorly understood protein sequences.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1010238","type":"journal-article","created":{"date-parts":[[2022,6,29]],"date-time":"2022-06-29T13:40:17Z","timestamp":1656510017000},"page":"e1010238","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":49,"title":["Discovering molecular features of intrinsically disordered regions by using evolution for contrastive learning"],"prefix":"10.1371","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9568-3155","authenticated-orcid":true,"given":"Alex X.","family":"Lu","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6575-7165","authenticated-orcid":true,"given":"Amy X.","family":"Lu","sequence":"additional","affiliation":[]},{"given":"Iva","family":"Priti\u0161anac","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1253-3843","authenticated-orcid":true,"given":"Taraneh","family":"Zarin","sequence":"additional","affiliation":[]},{"given":"Julie D.","family":"Forman-Kay","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3118-3121","authenticated-orcid":true,"given":"Alan M.","family":"Moses","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2022,6,29]]},"reference":[{"key":"pcbi.1010238.ref001","doi-asserted-by":"crossref","DOI":"10.1002\/pmic.201800061","article-title":"Intrinsically Disordered Proteins: The Dark Horse of the Dark Proteome","volume":"18","author":"P Kulkarni","year":"2018","journal-title":"Proteomics"},{"key":"pcbi.1010238.ref002","doi-asserted-by":"crossref","first-page":"6589","DOI":"10.1021\/cr400525m","article-title":"Classification of intrinsically disordered regions and proteins","author":"R Van Der 
Lee","year":"2014","journal-title":"Chemical Reviews. American Chemical Society"},{"key":"pcbi.1010238.ref003","author":"K Lindorff-Larsen","year":"2021","journal-title":"On the potential of machine learning to examine the relationship between sequence, structure, dynamics and function of intrinsically disordered proteins"},{"key":"pcbi.1010238.ref004","first-page":"155","article-title":"Current Opinion in Structural Biology","author":"NE Davey","year":"2019"},{"key":"pcbi.1010238.ref005","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1038\/nrm3920","article-title":"Intrinsically disordered proteins in cellular signalling and regulation","volume":"16","author":"PE Wright","year":"2015","journal-title":"Nat Rev Mol Cell Biol"},{"key":"pcbi.1010238.ref006","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1007\/978-1-0716-0524-0_6","article-title":"Exploring Protein Intrinsic Disorder with MobiDB","volume":"2141","author":"AM Monzon","year":"2020","journal-title":"Methods Mol Biol Clifton NJ"},{"key":"pcbi.1010238.ref007","first-page":"D269","article-title":"DisProt: intrinsic protein disorder annotation in 2020","volume":"48","author":"A Hatos","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref008","doi-asserted-by":"crossref","first-page":"472","DOI":"10.1038\/s41592-021-01117-3","article-title":"Critical assessment of protein intrinsic disorder prediction","volume":"18","author":"M Necci","year":"2021","journal-title":"Nat Methods"},{"key":"pcbi.1010238.ref009","doi-asserted-by":"crossref","first-page":"857","DOI":"10.1093\/bioinformatics\/btu744","article-title":"DISOPRED3: Precise disordered region predictions with annotated protein-binding activity","volume":"31","author":"DT Jones","year":"2015","journal-title":"Bioinformatics"},{"key":"pcbi.1010238.ref010","doi-asserted-by":"crossref","first-page":"W329","DOI":"10.1093\/nar\/gky384","article-title":"IUPred2A: context-dependent prediction of protein disorder as a 
function of redox state and protein binding","volume":"46","author":"B M\u00e9sz\u00e1ros","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref011","doi-asserted-by":"crossref","first-page":"454","DOI":"10.1016\/j.csbj.2019.03.013","article-title":"Computational Prediction of MoRFs, Short Disorder-to-order Transitioning Protein Binding Regions","volume":"17","author":"A Katuwawala","year":"2019","journal-title":"Comput Struct Biotechnol J"},{"key":"pcbi.1010238.ref012","doi-asserted-by":"crossref","first-page":"4438","DOI":"10.1038\/s41467-021-24773-7","article-title":"flDPnn: Accurate intrinsic disorder prediction with putative propensities of disorder functions","volume":"12","author":"G Hu","year":"2021","journal-title":"Nat Commun"},{"key":"pcbi.1010238.ref013","doi-asserted-by":"crossref","first-page":"1","DOI":"10.7554\/eLife.60220","article-title":"Identifying molecular features that are associated with biological function of intrinsically disordered protein regions","volume":"10","author":"T Zarin","year":"2021","journal-title":"eLife"},{"key":"pcbi.1010238.ref014","doi-asserted-by":"crossref","first-page":"3389","DOI":"10.1093\/nar\/25.17.3389","article-title":"Gapped BLAST and PSI-BLAST: a new generation of protein database search programs","volume":"25","author":"SF Altschul","year":"1997","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref015","doi-asserted-by":"crossref","first-page":"D427","DOI":"10.1093\/nar\/gky995","article-title":"The Pfam protein families database in 2019","volume":"47","author":"S El-Gebali","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref016","doi-asserted-by":"crossref","first-page":"2164","DOI":"10.1002\/pro.3041","article-title":"Large-scale analysis of intrinsic disorder flavors and associated functions in the protein sequence universe","volume":"25","author":"M Necci","year":"2016","journal-title":"Protein Sci Publ Protein 
Soc"},{"key":"pcbi.1010238.ref017","doi-asserted-by":"crossref","first-page":"102","DOI":"10.1016\/j.sbi.2015.03.008","article-title":"Relating sequence encoded information to form and function of intrinsically disordered proteins","volume":"32","author":"RK Das","year":"2015","journal-title":"Curr Opin Struct Biol"},{"key":"pcbi.1010238.ref018","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.46883","article-title":"Proteome-wide signatures of function in highly diverged intrinsically disordered regions","volume":"8","author":"T Zarin","year":"2019","journal-title":"eLife"},{"key":"pcbi.1010238.ref019","first-page":"D296","article-title":"ELM-the eukaryotic linear motif resource in 2020","volume":"48","author":"M Kumar","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref020","doi-asserted-by":"crossref","first-page":"R23","DOI":"10.1186\/gb-2007-8-2-r23","article-title":"Clustering of phosphorylation site recognition motifs can be exploited to predict the targets of cyclin-dependent kinase","volume":"8","author":"AM Moses","year":"2007","journal-title":"Genome Biol"},{"key":"pcbi.1010238.ref021","doi-asserted-by":"crossref","first-page":"1039","DOI":"10.1111\/tra.12310","article-title":"Mechanisms Regulating Protein Localization","volume":"16","author":"NC Bauer","year":"2015","journal-title":"Traffic"},{"key":"pcbi.1010238.ref022","first-page":"4650","article-title":"RGG\/RG Motif Regions in RNA Binding and Phase Separation","volume-title":"Journal of Molecular Biology","author":"PA Chong","year":"2018"},{"key":"pcbi.1010238.ref023","doi-asserted-by":"crossref","first-page":"5616","DOI":"10.1073\/pnas.1516277113","article-title":"Cryptic sequence features within the disordered protein p27Kip1 regulate cell cycle signaling","volume":"113","author":"RK Das","year":"2016","journal-title":"Proc Natl Acad Sci U S 
A"},{"key":"pcbi.1010238.ref024","doi-asserted-by":"crossref","first-page":"694","DOI":"10.1126\/science.aaw8653","article-title":"Valence and patterning of aromatic residues determine the phase behavior of prion-like domains","volume":"367","author":"EW Martin","year":"2020","journal-title":"Science"},{"key":"pcbi.1010238.ref025","first-page":"1","article-title":"Phase Transition of a Disordered Nuage Protein Generates Environmentally Responsive Membraneless Organelles","volume":"57","author":"TJ Nott","year":"2015","journal-title":"Mol Cell"},{"key":"pcbi.1010238.ref026","doi-asserted-by":"crossref","first-page":"e8190","DOI":"10.15252\/msb.20188190","article-title":"High-throughput discovery of functional disordered regions: investigation of transactivation domains","volume":"14","author":"CN Ravarani","year":"2018","journal-title":"Mol Syst Biol"},{"key":"pcbi.1010238.ref027","doi-asserted-by":"crossref","first-page":"444","DOI":"10.1016\/j.cels.2018.01.015","article-title":"A High-Throughput Mutational Scan of an Intrinsically Disordered Acidic Transcriptional Activation Domain","volume":"6","author":"MV Staller","year":"2018","journal-title":"Cell Syst"},{"key":"pcbi.1010238.ref028","first-page":"1","article-title":"A survey of DNA motif finding algorithms","author":"MK Das","year":"2007","journal-title":"BMC Bioinformatics. BioMed Central"},{"key":"pcbi.1010238.ref029","article-title":"Motif Discovery in Protein Sequences. Pattern Recognition\u2014Analysis and Applications","author":"SAEH Mohamed","year":"2016","journal-title":"InTech"},{"key":"pcbi.1010238.ref030","first-page":"e58","article-title":"Comparative genomics","author":"RC Hardison","year":"2003","journal-title":"PLoS Biology. 
Public Library of Science"},{"key":"pcbi.1010238.ref031","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1038\/nature03441","article-title":"Systematic discovery of regulatory motifs in human promoters and 3\u2032 UTRs by comparison of several mammals","volume":"434","author":"X Xie","year":"2005","journal-title":"Nature"},{"key":"pcbi.1010238.ref032","doi-asserted-by":"crossref","first-page":"13933","DOI":"10.1073\/pnas.0501046102","article-title":"An evolutionary proteomics approach identifies substrates of the cAMP-dependent protein kinase","volume":"102","author":"V. Budovskaya Y","year":"2005","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1010238.ref033","doi-asserted-by":"crossref","DOI":"10.1126\/scisignal.2002515","article-title":"Proteome-wide discovery of evolutionary conserved sequences in disordered regions","volume":"5","author":"AN Nguyen Ba","year":"2012","journal-title":"Sci Signal"},{"key":"pcbi.1010238.ref034","doi-asserted-by":"crossref","first-page":"10628","DOI":"10.1093\/nar\/gks854","article-title":"SLiMPrints: Conservation-based discovery of functional motif fingerprints in intrinsically disordered protein regions","volume":"40","author":"NE Davey","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref035","article-title":"A core subunit of Polycomb repressive complex 1 is broadly conserved in function but not primary sequence","volume":"109","author":"LY Beh","year":"2012","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1010238.ref036","doi-asserted-by":"crossref","first-page":"E1450","DOI":"10.1073\/pnas.1614787114","article-title":"Selection maintains signaling function of a highly diverged intrinsically disordered region","volume":"114","author":"T Zarin","year":"2017","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1010238.ref037","doi-asserted-by":"crossref","first-page":"iyab184","DOI":"10.1093\/genetics\/iyab184","article-title":"The length scale of multivalent interactions 
is evolutionarily conserved in fungal and vertebrate phase-separating proteins","volume":"220","author":"P Dasmeh","year":"2022","journal-title":"Genetics"},{"key":"pcbi.1010238.ref038","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"Y LeCun","year":"2015","journal-title":"Nature"},{"key":"pcbi.1010238.ref039","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"B Alipanahi","year":"2015","journal-title":"Nat Biotechnol"},{"key":"pcbi.1010238.ref040","doi-asserted-by":"crossref","first-page":"990","DOI":"10.1101\/gr.200535.115","article-title":"Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks","volume":"26","author":"DR Kelley","year":"2016","journal-title":"Genome Res"},{"key":"pcbi.1010238.ref041","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/s41588-021-00782-6","article-title":"Base-resolution models of transcription-factor binding reveal soft motif syntax","volume":"53","author":"\u017d Avsec","year":"2021","journal-title":"Nat Genet"},{"key":"pcbi.1010238.ref042","doi-asserted-by":"crossref","first-page":"1196","DOI":"10.1038\/s41592-021-01252-x","article-title":"Effective gene expression prediction from sequence by integrating long-range interactions","volume":"18","author":"\u017d Avsec","year":"2021","journal-title":"Nat Methods"},{"key":"pcbi.1010238.ref043","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1101\/gr.227819.117","article-title":"Sequential regulatory activity prediction across chromosomes with convolutional neural networks","volume":"28","author":"DR Kelley","year":"2018","journal-title":"Genome 
Res"},{"key":"pcbi.1010238.ref044","doi-asserted-by":"crossref","first-page":"e1007560","DOI":"10.1371\/journal.pcbi.1007560","article-title":"Representation learning of genomic sequence motifs with convolutional neural networks","volume":"15","author":"PK Koo","year":"2019","journal-title":"PLOS Comput Biol"},{"key":"pcbi.1010238.ref045","doi-asserted-by":"crossref","first-page":"890","DOI":"10.1016\/j.molcel.2020.04.020","article-title":"A High-Throughput Screen for Transcription Activation Domains Reveals Their Sequence Features and Permits Prediction by Deep Learning","volume":"78","author":"A Erijman","year":"2020","journal-title":"Mol Cell"},{"key":"pcbi.1010238.ref046","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.68068","article-title":"Simple biochemical features underlie transcriptional activation domain diversity and dynamic, fuzzy binding to mediator","volume":"10","author":"AL Sanborn","year":"2021","journal-title":"eLife"},{"key":"pcbi.1010238.ref047","doi-asserted-by":"crossref","first-page":"D884","DOI":"10.1093\/nar\/gkaa942","article-title":"Ensembl 2021","volume":"49","author":"KL Howe","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref048","doi-asserted-by":"crossref","first-page":"D373","DOI":"10.1093\/nar\/gkaa1007","article-title":"OMA orthology in 2021: Website overhaul, conserved isoforms, ancestral gene order and more","volume":"49","author":"AM Altenhoff","year":"2021","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref049","author":"L Jing","year":"2019","journal-title":"Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey"},{"key":"pcbi.1010238.ref050","doi-asserted-by":"crossref","first-page":"723","DOI":"10.1186\/s12859-019-3220-8","article-title":"Modeling aspects of the language of life through transfer-learning protein sequences","volume":"20","author":"M Heinzinger","year":"2019","journal-title":"BMC 
Bioinformatics"},{"key":"pcbi.1010238.ref051","article-title":"Evaluating Protein Transfer Learning with TAPE","author":"R Rao","year":"2019","journal-title":"NeurIPS 2019"},{"key":"pcbi.1010238.ref052","doi-asserted-by":"crossref","first-page":"1315","DOI":"10.1038\/s41592-019-0598-1","article-title":"Unified rational protein engineering with sequence-based deep representation learning","volume":"16","author":"EC Alley","year":"2019","journal-title":"Nat Methods"},{"key":"pcbi.1010238.ref053","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"A Rives","year":"2021","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1010238.ref054","article-title":"Self-Supervised Contrastive Learning of Protein Representations By Mutual Information Maximization","author":"AX Lu","year":"2020","journal-title":"bioRxiv"},{"key":"pcbi.1010238.ref055","article-title":"MSA Transformer","author":"R Rao","year":"2021","journal-title":"bioRxiv"},{"key":"pcbi.1010238.ref056","doi-asserted-by":"crossref","first-page":"1028","DOI":"10.1016\/j.cell.2017.02.027","article-title":"Stress-Triggered Phase Separation Is an Adaptive, Evolutionarily Tuned Response","volume":"168","author":"JA Riback","year":"2017","journal-title":"Cell"},{"key":"pcbi.1010238.ref057","author":"AX Lu","year":"2020","journal-title":"Evolution Is All You Need: Phylogenetic Augmentation for Contrastive Learning"},{"key":"pcbi.1010238.ref058","article-title":"A Simple Framework for Contrastive Learning of Visual Representations","author":"T Chen","year":"2020","journal-title":"ICLR 2020"},{"key":"pcbi.1010238.ref059","author":"den Oord A van","year":"2018","journal-title":"Representation Learning with Contrastive Predictive Coding"},{"key":"pcbi.1010238.ref060","article-title":"Self-supervised Learning: Generative or Contrastive","author":"X 
Liu","year":"2020","journal-title":"arXiv"},{"key":"pcbi.1010238.ref061","article-title":"An introduction to sequence similarity (\u201chomology\u201d) searching","volume":"0 3","author":"WR Pearson","year":"2013","journal-title":"Curr Protoc Bioinforma"},{"key":"pcbi.1010238.ref062","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1186\/1471-2105-10-39","article-title":"Protein domain organisation: Adding order","volume":"10","author":"SK Kummerfeld","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1010238.ref063","first-page":"12","article-title":"Bringing order to protein disorder through comparative genomics and genetic interactions","author":"J Bellay","year":"2011","journal-title":"Genome Biol"},{"key":"pcbi.1010238.ref064","doi-asserted-by":"crossref","first-page":"879","DOI":"10.1021\/pr060048x","article-title":"Conservation of intrinsic disorder in protein domains and families: I. A database of conserved predicted disordered regions","volume":"5","author":"JW Chen","year":"2006","journal-title":"J Proteome Res"},{"key":"pcbi.1010238.ref065","doi-asserted-by":"crossref","first-page":"888","DOI":"10.1021\/pr060049p","article-title":"Conservation of intrinsic disorder in protein domains and families: II. 
Functions of conserved disorder","volume":"5","author":"JW Chen","year":"2006","journal-title":"J Proteome Res"},{"key":"pcbi.1010238.ref066","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1003030","article-title":"Distinct Types of Disorder in the Human Proteome: Functional Implications for Alternative Splicing","volume":"9","author":"R Colak","year":"2013","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1010238.ref067","doi-asserted-by":"crossref","first-page":"26918","DOI":"10.1074\/jbc.M109.028431","article-title":"Structural, functional, and bioinformatic studies demonstrate the crucial role of an extended peptide binding site for the SH3 domain of yeast Abp1p","volume":"284","author":"EJ Stollar","year":"2009","journal-title":"J Biol Chem"},{"key":"pcbi.1010238.ref068","author":"L McInnes","year":"2018","journal-title":"UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction"},{"key":"pcbi.1010238.ref069","doi-asserted-by":"crossref","first-page":"1391","DOI":"10.1038\/nbt1146","article-title":"An iterative statistical approach to the identification of protein phosphorylation motifs from large-scale data sets","volume":"23","author":"D Schwartz","year":"2005","journal-title":"Nat Biotechnol"},{"key":"pcbi.1010238.ref070","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1042\/BJ20110720","article-title":"Discovery of cellular substrates for protein kinase A using a peptide array screening protocol","volume":"438","author":"FD Smith","year":"2011","journal-title":"Biochem J"},{"key":"pcbi.1010238.ref071","doi-asserted-by":"crossref","first-page":"13392","DOI":"10.1073\/pnas.1304749110","article-title":"Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues","volume":"110","author":"RK Das","year":"2013","journal-title":"Proc Natl Acad Sci U S 
A"},{"key":"pcbi.1010238.ref072","doi-asserted-by":"crossref","DOI":"10.1063\/1.4929391","article-title":"A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins","volume":"143","author":"L Sawle","year":"2015","journal-title":"J Chem Phys"},{"key":"pcbi.1010238.ref073","author":"A Shanehsazzadeh","year":"2020","journal-title":"Is Transfer Learning Necessary for Protein Landscape Prediction?"},{"key":"pcbi.1010238.ref074","author":"T Lu","year":"2021","journal-title":"Random Embeddings and Linear Regression can Predict Protein Function"},{"key":"pcbi.1010238.ref075","first-page":"40","article-title":"Saccharomyces Genome Database: The genomics resource of budding yeast","author":"JM Cherry","year":"2012","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref076","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1471-2180-12-213","article-title":"High abundance of Serine\/Threonine-rich regions predicted to be hyper-O-glycosylated in the secretory proteins coded by eight fungal genomes","volume":"12","author":"M Gonz\u00e1lez","year":"2012","journal-title":"BMC Microbiol"},{"key":"pcbi.1010238.ref077","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1186\/1471-2105-10-48","article-title":"GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists","volume":"10","author":"E Eden","year":"2009","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1010238.ref078","doi-asserted-by":"crossref","first-page":"3104","DOI":"10.1128\/jb.177.11.3104-3110.1995","article-title":"Identification of three mannoproteins in the cell wall of Saccharomyces cerevisiae","volume":"177","author":"JM Van der Vaart","year":"1995","journal-title":"J Bacteriol"},{"key":"pcbi.1010238.ref079","doi-asserted-by":"crossref","first-page":"2881","DOI":"10.1128\/JB.183.9.2881-2887.2001","article-title":"Reciprocal regulation of anaerobic and aerobic cell wall mannoprotein gene expression in 
Saccharomyces cerevisiae","volume":"183","author":"N Abramova","year":"2001","journal-title":"J Bacteriol"},{"key":"pcbi.1010238.ref080","doi-asserted-by":"crossref","first-page":"13804","DOI":"10.1073\/pnas.94.25.13804","article-title":"A family of genes required for maintenance of cell wall integrity and for the stress response in Saccharomyces cerevisiae","volume":"94","author":"J Verna","year":"1997","journal-title":"Proc Natl Acad Sci U S A"},{"key":"pcbi.1010238.ref081","doi-asserted-by":"crossref","first-page":"4970","DOI":"10.1111\/febs.12468","article-title":"Uth1 is a mitochondrial inner membrane protein dispensable for post-log-phase and rapamycin-induced mitophagy","author":"E Welter","year":"2013","journal-title":"FEBS Journal. FEBS J"},{"key":"pcbi.1010238.ref082","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1111\/j.1567-1364.2009.00601.x","article-title":"The Saccharomyces SUN gene, UTH1, is involved in cell wall biogenesis","volume":"10","author":"JJ Ritch","year":"2010","journal-title":"FEMS Yeast Res"},{"key":"pcbi.1010238.ref083","doi-asserted-by":"crossref","first-page":"6097","DOI":"10.1093\/nar\/18.20.6097","article-title":"Sequence logos: a new way to display consensus sequences","volume":"18","author":"TD Schneider","year":"1990","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref084","first-page":"12534","article-title":"Properties and biological impact of RNA G-quadruplexes: From order to turmoil and back","volume-title":"Nucleic Acids Research","author":"P Kharel","year":"2020"},{"key":"pcbi.1010238.ref085","doi-asserted-by":"crossref","first-page":"1215","DOI":"10.1080\/15476286.2019.1621623","article-title":"RGG-motif self-association regulates eIF4G-binding translation repressor protein Scd6","volume":"16","author":"G Poornima","year":"2019","journal-title":"RNA Biol"},{"key":"pcbi.1010238.ref086","doi-asserted-by":"crossref","first-page":"592","DOI":"10.1038\/s41586-020-2097-z","article-title":"Phase separation 
directs ubiquitination of gene-body nucleosomes","volume":"579","author":"LD Gallego","year":"2020","journal-title":"Nature"},{"key":"pcbi.1010238.ref087","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.31486","article-title":"Pi-Pi contacts are an overlooked protein feature relevant to phase separation","volume":"7","author":"RM Vernon","year":"2018","journal-title":"eLife"},{"key":"pcbi.1010238.ref088","doi-asserted-by":"crossref","first-page":"e1005499","DOI":"10.1371\/journal.pcbi.1005499","article-title":"Exhaustive search of linear information encoding protein-peptide recognition","volume":"13","author":"A Kelil","year":"2017","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1010238.ref089","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1002\/pro.3978","article-title":"The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions","volume":"30","author":"R Oughtred","year":"2021","journal-title":"Protein Sci"},{"key":"pcbi.1010238.ref090","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41598-020-80357-3","article-title":"Regulation of trehalase activity by multi-site phosphorylation and 14-3-3 interaction","volume":"11","author":"L Dengler","year":"2021","journal-title":"Sci Rep"},{"key":"pcbi.1010238.ref091","doi-asserted-by":"crossref","first-page":"1141","DOI":"10.1016\/j.jmb.2007.06.029","article-title":"Alternative Conformations of the Archaeal Nop56\/58-Fibrillarin Complex Imply Flexibility in Box C\/D RNPs","volume":"371","author":"S Oruganti","year":"2007","journal-title":"J Mol Biol"},{"key":"pcbi.1010238.ref092","doi-asserted-by":"crossref","first-page":"19418","DOI":"10.1074\/jbc.M111.323253","article-title":"Structurally conserved Nop56\/58 N-terminal domain facilitates archaeal box C\/D ribonucleoprotein-guided methyltransferase activity","volume":"287","author":"KT Gagnon","year":"2012","journal-title":"J Biol 
Chem"},{"key":"pcbi.1010238.ref093","doi-asserted-by":"crossref","first-page":"e1007348","DOI":"10.1371\/journal.pcbi.1007348","article-title":"Learning unsupervised feature representations for single cell microscopy images with paired cell inpainting","volume":"15","author":"AX Lu","year":"2019","journal-title":"PLOS Comput Biol"},{"key":"pcbi.1010238.ref094","doi-asserted-by":"crossref","first-page":"eaal3321","DOI":"10.1126\/science.aal3321","article-title":"A subcellular map of the human proteome","volume":"356","author":"PJ Thul","year":"2017","journal-title":"Science"},{"key":"pcbi.1010238.ref095","doi-asserted-by":"crossref","first-page":"981","DOI":"10.1042\/BST20120092","article-title":"Cell cycle regulation by the intrinsically disordered proteins p21 and p27","author":"MK Yoon","year":"2012","journal-title":"Biochemical Society Transactions. NIH Public Access"},{"key":"pcbi.1010238.ref096","doi-asserted-by":"crossref","first-page":"3255","DOI":"10.1007\/s00018-008-8296-7","article-title":"Post-translational regulation of the tumor suppressor p27KIP1. Cellular and Molecular Life Sciences","author":"J Vervoorts","year":"2008","journal-title":"Cell Mol Life Sci"},{"key":"pcbi.1010238.ref097","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1016\/j.dnarep.2018.07.008","article-title":"Multiple functions of p27 in cell cycle, apoptosis, epigenetic modification and transcriptional regulation for the control of cell growth: A double-edged sword protein","author":"M Abbastabar","year":"2018","journal-title":"DNA Repair. 
DNA Repair (Amst)"},{"key":"pcbi.1010238.ref098","doi-asserted-by":"crossref","first-page":"1153","DOI":"10.1038\/nm761","article-title":"PKB\/Akt phosphorylates p27, impairs nuclear import of p27 and opposes p27-mediated G1 arrest","volume":"8","author":"J Liang","year":"2002","journal-title":"Nat Med"},{"key":"pcbi.1010238.ref099","doi-asserted-by":"crossref","first-page":"688","DOI":"10.1016\/j.cell.2018.06.006","article-title":"A molecular grammar governing the driving forces for phase separation of prion-like RNA binding proteins","volume":"174","author":"J Wang","year":"2018","journal-title":"Cell"},{"key":"pcbi.1010238.ref100","doi-asserted-by":"crossref","first-page":"D1062","DOI":"10.1093\/nar\/gkx1153","article-title":"ClinVar: improving access to variant interpretations and supporting evidence","volume":"46","author":"MJ Landrum","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref101","doi-asserted-by":"crossref","first-page":"196","DOI":"10.1038\/s41557-021-00840-w","article-title":"Deciphering how naturally occurring sequence features impact the phase behaviours of disordered prion-like domains","volume":"14","author":"A Bremer","year":"2022","journal-title":"Nat Chem"},{"key":"pcbi.1010238.ref102","doi-asserted-by":"crossref","first-page":"e25200","DOI":"10.4161\/rdis.25200","article-title":"Disease mutations in the prion-like domains of hnRNPA1 and hnRNPA2\/B1 introduce potent steric zippers that drive excess RNP granule assembly","volume":"1","author":"J Shorter","year":"2013","journal-title":"Rare Dis"},{"key":"pcbi.1010238.ref103","article-title":"Poly(A)-binding protein is an ataxin-2 chaperone that emulsifies biomolecular condensates","author":"S Boeynaems","year":"2021","journal-title":"Cell Biology"},{"key":"pcbi.1010238.ref104","doi-asserted-by":"crossref","first-page":"D506","DOI":"10.1093\/nar\/gky1049","article-title":"UniProt: A worldwide hub of protein knowledge","volume":"47","author":"UniProt 
Consortium","year":"2019","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref105","doi-asserted-by":"crossref","first-page":"e26","DOI":"10.1371\/journal.pcbi.0010026","article-title":"Comparative Genomics and Disorder Prediction Identify Biologically Relevant SH3 Protein Interactions","volume":"1","author":"P Beltrao","year":"2005","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1010238.ref106","doi-asserted-by":"crossref","first-page":"e1003977","DOI":"10.1371\/journal.pcbi.1003977","article-title":"Detecting functional divergence after gene duplication through evolutionary changes in posttranslational regulatory sequences","volume":"10","author":"AN Nguyen Ba","year":"2014","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1010238.ref107","doi-asserted-by":"crossref","first-page":"940","DOI":"10.1093\/molbev\/msaa258","article-title":"Natural Selection on the Phase-Separation Properties of FUS during 160 My of Mammalian Evolution","volume":"38","author":"P Dasmeh","year":"2021","journal-title":"Mol Biol Evol"},{"key":"pcbi.1010238.ref108","doi-asserted-by":"crossref","first-page":"2985","DOI":"10.1038\/s41598-021-82656-9","article-title":"Intrinsic disorder in protein domains contributes to both organism complexity and clade-specific functions","volume":"11","author":"C Gao","year":"2021","journal-title":"Sci Rep"},{"key":"pcbi.1010238.ref109","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1002709","article-title":"Disease-Associated Mutations Disrupt Functionally Important Regions of Intrinsic Protein Disorder","volume":"8","author":"V Vacic","year":"2012","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1010238.ref110","doi-asserted-by":"crossref","first-page":"662","DOI":"10.3390\/e21070662","article-title":"Entropy and information within intrinsically disordered protein regions","volume":"21","author":"I 
Priti\u0161anac","year":"2019","journal-title":"Entropy"},{"key":"pcbi.1010238.ref111","doi-asserted-by":"crossref","first-page":"1742","DOI":"10.1016\/j.cell.2020.11.050","article-title":"Phase Separation as a Missing Mechanism for Interpretation of Disease Mutations","volume":"183","author":"B Tsang","year":"2020","journal-title":"Cell"},{"key":"pcbi.1010238.ref112","first-page":"8844","article-title":"MSA Transformer. Proceedings of the 38th International Conference on Machine Learning","author":"RM Rao","year":"2021","journal-title":"PMLR"},{"key":"pcbi.1010238.ref113","unstructured":"Bryant P, Elofsson A. Studying signal peptides with attention neural networks informs cleavage site predictions.: 16."},{"key":"pcbi.1010238.ref114","article-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","author":"J Devlin","year":"2019","journal-title":"ArXiv181004805 Cs"},{"key":"pcbi.1010238.ref115","doi-asserted-by":"crossref","first-page":"2112","DOI":"10.1093\/bioinformatics\/btab083","article-title":"DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome","volume":"37","author":"Y Ji","year":"2021","journal-title":"Bioinformatics"},{"key":"pcbi.1010238.ref116","doi-asserted-by":"crossref","first-page":"3387","DOI":"10.1093\/bioinformatics\/btx431","article-title":"DeepLoc: prediction of protein subcellular localization using deep learning","volume":"33","author":"JJ Almagro Armenteros","year":"2017","journal-title":"Bioinforma Oxf Engl"},{"key":"pcbi.1010238.ref117","doi-asserted-by":"crossref","first-page":"1456","DOI":"10.1101\/gr.3672305","article-title":"The Yeast Gene Order Browser: combining curated homology and syntenic context reveals gene fate in polyploid species","volume":"15","author":"KP Byrne","year":"2005","journal-title":"Genome 
Res"},{"key":"pcbi.1010238.ref118","doi-asserted-by":"crossref","first-page":"685","DOI":"10.1093\/bioinformatics\/btw678","article-title":"Improving protein disorder prediction by deep bidirectional long short-term memory recurrent neural networks","volume":"33","author":"J Hanson","year":"2017","journal-title":"Bioinformatics"},{"key":"pcbi.1010238.ref119","doi-asserted-by":"crossref","first-page":"D344","DOI":"10.1093\/nar\/gks1067","article-title":"New and continuing developments at PROSITE","volume":"41","author":"CJA Sigrist","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref120","doi-asserted-by":"crossref","first-page":"3059","DOI":"10.1093\/nar\/gkf436","article-title":"MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform","volume":"30","author":"K Katoh","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010238.ref121","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1017\/CBO9780511819049.006","volume-title":"The Phylogenetic Handbook","author":"K Strimmer","year":"2009","edition":"2"},{"key":"pcbi.1010238.ref122","doi-asserted-by":"crossref","first-page":"368","DOI":"10.1007\/BF01734359","article-title":"Evolutionary trees from DNA sequences: A maximum likelihood approach","volume":"17","author":"J. 
Felsenstein","year":"1981","journal-title":"J Mol Evol"},{"key":"pcbi.1010238.ref123","article-title":"Adam: A Method for Stochastic Optimization","author":"DP Kingma","year":"2017","journal-title":"ArXiv14126980 Cs"},{"key":"pcbi.1010238.ref124","doi-asserted-by":"crossref","first-page":"1422","DOI":"10.1093\/bioinformatics\/btp163","article-title":"Biopython: Freely available Python tools for computational molecular biology and bioinformatics","volume":"25","author":"PJA Cock","year":"2009","journal-title":"Bioinformatics"},{"key":"pcbi.1010238.ref125","doi-asserted-by":"crossref","first-page":"2272","DOI":"10.1093\/bioinformatics\/btz921","article-title":"Logomaker: Beautiful sequence logos in Python","volume":"36","author":"A Tareen","year":"2020","journal-title":"Bioinformatics"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1010238","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2022,7,12]],"date-time":"2022-07-12T00:00:00Z","timestamp":1657584000000}}],"container-title":["PLOS Computational 
Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010238","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,9]],"date-time":"2023-02-09T20:37:52Z","timestamp":1675975072000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010238"}},"subtitle":[],"editor":[{"given":"Damiano","family":"Piovesan","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,6,29]]},"references-count":125,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2022,6,29]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1010238","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.07.29.454330","asserted-by":"object"}]},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,6,29]]}}}