{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,19]],"date-time":"2025-11-19T14:54:17Z","timestamp":1763564057600,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2019,11,16]],"date-time":"2019-11-16T00:00:00Z","timestamp":1573862400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001711","name":"Schweizerischer Nationalfonds zur F\u00f6rderung der Wissenschaftlichen Forschung","doi-asserted-by":"publisher","award":["P2ELP3_181910"],"award-info":[{"award-number":["P2ELP3_181910"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-14-ACHN-0016"],"award-info":[{"award-number":["ANR-14-ACHN-0016"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Extracting structural information from sequence co-variation has become a common computational biology practice in the recent years, mainly due to the availability of large sequence alignments of protein families. However, identifying features that are specific to sub-classes and not shared by all members of the family using sequence-based approaches has remained an elusive problem. We here present a coevolutionary-based method to differentially analyze subfamily specific structural features by a continuous sequence reweighting (SR) approach. We introduce the underlying principles and test its predictive capabilities on the Response Regulator family, whose subfamilies have been previously shown to display distinct, specific homo-dimerization patterns. Our results show that this reweighting scheme is effective in assigning structural features known a priori to subfamilies, even when sequence data is relatively scarce. Furthermore, sequence reweighting allows assessing if individual structural contacts pertain to specific subfamilies and it thus paves the way for the identification specificity-determining contacts from sequence variation data.<\/jats:p>","DOI":"10.3390\/e21111127","type":"journal-article","created":{"date-parts":[[2019,11,18]],"date-time":"2019-11-18T04:31:10Z","timestamp":1574051470000},"page":"1127","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Coevolutionary Analysis of Protein Subfamilies by Sequence Reweighting"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3946-9709","authenticated-orcid":false,"given":"Duccio","family":"Malinverni","sequence":"first","affiliation":[{"name":"Medical Research Council (MRC) Laboratory of Molecular Biology, Cambridge CB20QH, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1911-8039","authenticated-orcid":false,"given":"Alessandro","family":"Barducci","sequence":"additional","affiliation":[{"name":"Centre de Biochimie Structurale (CBS), INSERM, CNRS, Universit\u00e9 de Montpellier, 34090 Montpellier, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,11,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1073\/pnas.0805923106","article-title":"Identification of direct residue contacts in protein-protein interaction by message passing","volume":"106","author":"Weigt","year":"2009","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"E1293","DOI":"10.1073\/pnas.1111471108","article-title":"Direct-coupling analysis of residue coevolution captures native contacts across many protein families","volume":"108","author":"Morcos","year":"2011","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1093\/bioinformatics\/btr638","article-title":"PSICOV: Precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments","volume":"28","author":"Jones","year":"2012","journal-title":"Bioinformatics"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"e02030","DOI":"10.7554\/eLife.02030","article-title":"Robust and accurate prediction of residue-residue interactions across protein interfaces using evolutionary information","volume":"3","author":"Ovchinnikov","year":"2014","journal-title":"Elife"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Marks, D.S., Colwell, L.J., Sheridan, R., Hopf, T.A., Pagnani, A., Zecchina, R., and Sander, C. (2011). Protein 3D structure computed from evolutionary sequence variation. PLoS ONE, 6.","DOI":"10.1371\/journal.pone.0028766"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1002\/prot.25407","article-title":"Assessment of contact predictions in CASP12: Co-evolution and deep learning coming of age","volume":"86","author":"Schaarschmidt","year":"2018","journal-title":"Proteins Struct. Funct. Bioinform."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1607","DOI":"10.1016\/j.cell.2012.04.012","article-title":"Three-dimensional structures of membrane proteins from genomic sequencing","volume":"149","author":"Hopf","year":"2012","journal-title":"Cell"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"294","DOI":"10.1126\/science.aah4043","article-title":"Protein structure determination using metagenome sequence data","volume":"355","author":"Ovchinnikov","year":"2017","journal-title":"Science"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"E2662","DOI":"10.1073\/pnas.1615068114","article-title":"Large-scale identification of coevolution signals across homo-oligomeric protein interfaces by direct coupling analysis","volume":"114","author":"Uguzzoni","year":"2017","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Malinverni, D., Marsili, S., Barducci, A., and De Los Rios, P. (2015). Large-Scale Conformational Transitions and Dimerization Are Encoded in the Amino-Acid Sequences of Hsp70 Chaperones. PLoS Comput. Biol., 11.","DOI":"10.1371\/journal.pcbi.1004262"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.3389\/fmolb.2017.00040","article-title":"New Techniques for Ancient Proteins: Direct Coupling Analysis Applied on Proteins Involved in Iron Sulfur Cluster Biogenesis","volume":"4","author":"Fantini","year":"2017","journal-title":"Front. Mol. Biosci."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Hopf, T.A., Sch\u00e4rfe, C.P.I., Rodrigues, J.P.G.L.M., Green, A.G., Kohlbacher, O., Sander, C., Bonvin, A.M.J.J., and Marks, D.S. (2014). Sequence co-evolution gives 3D contacts and structures of protein complexes. Elife, 3.","DOI":"10.7554\/eLife.03430"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Malinverni, D., Lopez, A.J., Rios, P.D.L., Hummer, G., and Barducci, A. (2016). Modeling Hsp70\/Hsp40 interaction by multi-scale molecular simulations and co-evolutionary sequence analysis. Elife, 1\u201317.","DOI":"10.1101\/067421"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"13567","DOI":"10.1073\/pnas.1508584112","article-title":"From residue coevolution to protein conformational ensembles and functional dynamics","volume":"112","author":"Sutto","year":"2015","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"20533","DOI":"10.1073\/pnas.1315625110","article-title":"Coevolutionary signals across protein lineages help capture multiple protein conformations","volume":"110","author":"Morcos","year":"2013","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"D158","DOI":"10.1093\/nar\/gkw1099","article-title":"UniProt: The universal protein knowledgebase","volume":"45","author":"Bateman","year":"2017","journal-title":"Nucleic Acids Res."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"D211","DOI":"10.1093\/nar\/gkp985","article-title":"The Pfam protein families database","volume":"38","author":"Finn","year":"2010","journal-title":"Nucleic Acids Res."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.7554\/eLife.46754","article-title":"The role of structural pleiotropy and regulatory evolution in the retention of heteromers of paralogs","volume":"8","author":"Marchant","year":"2019","journal-title":"Elife"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1306","DOI":"10.1002\/pro.143","article-title":"Evolutionary constraints on structural similarity in orthologs and paralogs","volume":"18","author":"Peterson","year":"2009","journal-title":"Protein Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"823","DOI":"10.1002\/j.1460-2075.1986.tb04288.x","article-title":"The relation between the divergence of sequence and structure in proteins","volume":"5","author":"Chothia","year":"1986","journal-title":"Embo J."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"9122","DOI":"10.1073\/pnas.1702664114","article-title":"Origins of coevolution between residues distant in protein 3D structures","volume":"114","author":"Anishchenko","year":"2017","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"585","DOI":"10.1242\/jcs.00247","article-title":"The nuclear receptor superfamily","volume":"116","author":"Escriva","year":"2003","journal-title":"J. Cell Sci."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"829","DOI":"10.1038\/nrd.2017.178","article-title":"Trends in GPCR drug discovery: New agents, targets and indications","volume":"16","author":"Hauser","year":"2017","journal-title":"Nat. Rev. Drug Discov."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Nillegoda, N.B., Stank, A., Malinverni, D., Alberts, N., Szlachcic, A., Barducci, A., De Los Rios, P., Wade, R.C., and Bukau, B. (2017). Evolution of an intricate J-protein network driving protein disaggregation in eukaryotes. Elife, 6.","DOI":"10.7554\/eLife.24560"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Tubiana, J., Cocco, S., and Monasson, R. (2019). Learning protein constitutive motifs from sequence data. Elife, 8.","DOI":"10.7554\/eLife.39397"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Jung, K., Fabiani, F., Hoyer, E., and Lassak, J. (2018). Bacterial transmembrane signalling systems and their engineering for biosensing. Open Biol., 8.","DOI":"10.1098\/rsob.180023"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"372","DOI":"10.1016\/j.jmb.2016.08.003","article-title":"Molecular mechanisms of two-component signal transduction","volume":"428","author":"Zschiedrich","year":"2016","journal-title":"J. Mol. Biol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2542","DOI":"10.1038\/s41467-018-04964-5","article-title":"Clustering huge protein sequence sets in linear time","volume":"9","author":"Steinegger","year":"2018","journal-title":"Nat. Commun."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Chen, Y., Reilly, K.D., Sprague, A.P., and Guan, Z. (2006, January 20\u201324). Seqoptics: A protein sequence clustering method. Proceedings of the First International Multi-Symposiums on Computer and Computational Sciences (IMSCCS\u201906), Hangzhou, China.","DOI":"10.1109\/IMSCCS.2006.123"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: Accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","article-title":"A Survey on Transfer Learning","volume":"22","author":"Yang","year":"2010","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hockenberry, A.J., and Wilke, C.O. (2019). Phylogenetic weighting does little to improve the accuracy of evolutionary coupling analyses. Entropy, 21.","DOI":"10.1101\/736173"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1016\/S0022-2836(02)00587-9","article-title":"Using orthologous and paralogous proteins to identify specificity-determining residues in bacterial transcription factors","volume":"321","author":"Mirny","year":"2002","journal-title":"J. Mol. Biol."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1093\/bib\/bbt092","article-title":"A survey on prediction of specificity-determining sites in proteins","volume":"16","author":"Chakraborty","year":"2013","journal-title":"Brief. Bioinform."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1371\/journal.pone.0162579","article-title":"High-resolution identification of specificity determining positions in the LacI protein family using ensembles of sub-sampled alignments","volume":"11","author":"Sloutsky","year":"2016","journal-title":"PLoS ONE"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1038\/nmeth.1818","article-title":"HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment","volume":"9","author":"Remmert","year":"2012","journal-title":"Nat. Methods"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1103\/PhysRevE.87.012707","article-title":"Improved contact prediction in proteins: Using pseudolikelihoods to infer Potts models","volume":"87","author":"Ekeberg","year":"2013","journal-title":"Phys. Rev. E"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Hockenberry, A.J., and Wilke, C.O. (2019). Evolutionary couplings detect side-chain interactions. PeerJ, 7.","DOI":"10.7717\/peerj.7280"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Bonomi, M., and Camilloni, C. (2019). Coevolutionary Analysis of Protein Sequences for Molecular Modeling. Biomolecular Simulations: Methods and Protocols, Springer.","DOI":"10.1007\/978-1-4939-9608-7"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.jcp.2014.07.024","article-title":"Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences","volume":"276","author":"Ekeberg","year":"2014","journal-title":"J. Comput. Phys."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"9965","DOI":"10.1088\/1361-6633\/aa9965","article-title":"Inverse statistical physics of protein sequences: A key issues review","volume":"81","author":"Cocco","year":"2018","journal-title":"Rep. Prog. Phys."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Feinauer, C., Skwark, M.J., Pagnani, A., and Aurell, E. (2014). Improving Contact Prediction along Three Dimensions. PLoS Comput. Biol., 10.","DOI":"10.1371\/journal.pcbi.1003847"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1093\/bioinformatics\/btm604","article-title":"Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction","volume":"24","author":"Dunn","year":"2008","journal-title":"Bioinformatics"}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/11\/1127\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:35:08Z","timestamp":1760189708000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/21\/11\/1127"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,11,16]]},"references-count":43,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2019,11]]}},"alternative-id":["e21111127"],"URL":"https:\/\/doi.org\/10.3390\/e21111127","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2019,11,16]]}}}