{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,26]],"date-time":"2025-12-26T07:16:07Z","timestamp":1766733367675,"version":"3.35.0"},"reference-count":28,"publisher":"Oxford University Press (OUP)","issue":"19","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,10,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Protein structure ensembles provide important insight into the dynamics and function of a protein and contain information that is not captured with a single static structure. However, it is not clear a priori to what extent the variability within an ensemble is caused by internal structural changes. Additional variability results from overall translations and rotations of the molecule. And most experimental data do not provide information to relate the structures to a common reference frame. To report meaningful values of intrinsic dynamics, structural precision, conformational entropy, etc., it is therefore important to disentangle local from global conformational heterogeneity.<\/jats:p><jats:p>Results: We consider the task of disentangling local from global heterogeneity as an inference problem. We use probabilistic methods to infer from the protein ensemble missing information on reference frames and stable conformational sub-states. To this end, we model a protein ensemble as a mixture of Gaussian probability distributions of either entire conformations or structural segments. We learn these models from a protein ensemble using the expectation\u2013maximization algorithm. Our first model can be used to find multiple conformers in a structure ensemble. The second model partitions the protein chain into locally stable structural segments or core elements and less structured regions typically found in loops. 
Both models are simple to implement and contain only a single free parameter: the number of conformers or structural segments. Our models can be used to analyse experimental ensembles, molecular dynamics trajectories and conformational change in proteins.<\/jats:p><jats:p>Availability: The Python source code for protein ensemble analysis is available from the authors upon request.<\/jats:p><jats:p>Contact: michael.habeck@tuebingen.mpg.de<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn396","type":"journal-article","created":{"date-parts":[[2008,7,29]],"date-time":"2008-07-29T00:24:15Z","timestamp":1217291055000},"page":"2184-2192","source":"Crossref","is-referenced-by-count":10,"title":["Mixture models for protein structure ensembles"],"prefix":"10.1093","volume":"24","author":[{"given":"Michael","family":"Hirsch","sequence":"first","affiliation":[{"name":"1 Department of Empirical Inference, Max-Planck-Institute for Biological Cybernetics, Spemannstrasse 38 and 2 Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Spemannstrasse 35, 72076 T\u00fcbingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Habeck","sequence":"additional","affiliation":[{"name":"1 Department of Empirical Inference, Max-Planck-Institute for Biological Cybernetics, Spemannstrasse 38 and 2 Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Spemannstrasse 35, 72076 T\u00fcbingen, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2008,7,28]]},"reference":[{"key":"2023020211125140800_B1","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1002\/prot.21507","article-title":"A 
large data set comparison of protein structures determined by crystallography and NMR: statistical test for structural differences and the effect of crystal packing","volume":"69","author":"Andrec","year":"2007","journal-title":"Proteins Struct. Funct. Bioinform"},{"key":"2023020211125140800_B2","doi-asserted-by":"crossref","first-page":"1147","DOI":"10.1016\/S0006-3495(97)78147-5","article-title":"Molecular dynamics study of time-correlated protein domain motions and molecular flexibility: cytochrome P450BM-3","volume":"73","author":"Arnold","year":"1997","journal-title":"Biophys. J"},{"key":"2023020211125140800_B3","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198538493.001.0001","volume-title":"Neural Networks for Pattern Recognition.","author":"Bishop","year":"1995"},{"key":"2023020211125140800_B4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"Maximum likelihood from incomplete data via the EM algorithm (with discussion)","volume":"39","author":"Dempster","year":"1977","journal-title":"J. R. Stat. Soc. B"},{"key":"2023020211125140800_B5","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1038\/nsmb0306-184","article-title":"Is one solution good enough?","volume":"13","author":"Furnham","year":"2006","journal-title":"Nat. Struct. Biol"},{"key":"2023020211125140800_B6","doi-asserted-by":"crossref","first-page":"6739","DOI":"10.1021\/bi00188a001","article-title":"Structural mechanisms for domain movements in proteins","volume":"33","author":"Gerstein","year":"1994","journal-title":"Biochemistry"},{"key":"2023020211125140800_B7","doi-asserted-by":"crossref","first-page":"281","DOI":"10.1016\/0022-2836(85)90346-8","article-title":"An evaluation of the combined use of nuclear magnetic resonance and distance geometry for the determination of protein conformations in solution","volume":"182","author":"Havel","year":"1985","journal-title":"J. Mol. 
Biol"},{"key":"2023020211125140800_B8","first-page":"1","article-title":"Matrix nearness problems and applications","volume-title":"Applications of Matrix Theory.","author":"Higham","year":"1989"},{"key":"2023020211125140800_B9","doi-asserted-by":"crossref","first-page":"922","DOI":"10.1107\/S0567739476001873","article-title":"A solution for the best rotation to relate two sets of vectors","volume":"A32","author":"Kabsch","year":"1976","journal-title":"Acta Cryst"},{"key":"2023020211125140800_B10","doi-asserted-by":"crossref","first-page":"1063","DOI":"10.1093\/protein\/9.11.1063","article-title":"An automated approach for clustering an ensemble of NMR-derived protein structures into conformationally related subfamilies","volume":"9","author":"Kelley","year":"1996","journal-title":"Protein Eng"},{"key":"2023020211125140800_B11","doi-asserted-by":"crossref","first-page":"737","DOI":"10.1093\/protein\/10.6.737","article-title":"An automated approach for defining core atoms and domains in an ensemble of NMR-derived protein structures","volume":"10","author":"Kelley","year":"1997","journal-title":"Protein Eng"},{"key":"2023020211125140800_B12","doi-asserted-by":"crossref","first-page":"768","DOI":"10.1038\/nsb0995-768","article-title":"Solution structure of calcium-free calmodulin","volume":"2","author":"Kuboniwa","year":"1995","journal-title":"Nat. Struct. Biol"},{"volume-title":"Information Theory, Inference, and Learning Algorithms.","year":"2003","author":"MacKay","key":"2023020211125140800_B13"},{"key":"2023020211125140800_B14","doi-asserted-by":"crossref","first-page":"933","DOI":"10.1006\/jmbi.1998.1852","article-title":"Recommendations for the presentation of NMR structures of proteins and nucleic acids","volume":"280","author":"Markley","year":"1998","journal-title":"J. Mol. 
Biol"},{"key":"2023020211125140800_B15","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1107\/S0907444906005270","article-title":"Optimal description of a protein structure in terms of multiple groups undergoing TLS motion","volume":"62","author":"Painter","year":"2006","journal-title":"Acta Crystallogr. D Biol. Crystallogr"},{"key":"2023020211125140800_B16","first-page":"554","article-title":"The infinite gaussian mixture model","volume-title":"NIPS 12.","author":"Rasmussen","year":"2000"},{"key":"2023020211125140800_B17","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1126\/science.1110428","article-title":"Inferential structure determination","volume":"309","author":"Rieping","year":"2005","journal-title":"Science"},{"key":"2023020211125140800_B18","doi-asserted-by":"crossref","first-page":"673","DOI":"10.1002\/prot.20402","article-title":"Clustering algorithms for identifying core atom sets and for assessing the precision of protein structure ensembles","volume":"59","author":"Snyder","year":"2005","journal-title":"Proteins Struct. Funct. Bioinform"},{"key":"2023020211125140800_B19","doi-asserted-by":"crossref","first-page":"655","DOI":"10.1002\/prot.20499","article-title":"Assessing precision and accuracy of protein structures derived from NMR data","volume":"59","author":"Snyder","year":"2005","journal-title":"Proteins Struct. Funct. Bioinform"},{"key":"2023020211125140800_B20","doi-asserted-by":"crossref","first-page":"225","DOI":"10.1023\/A:1022819716110","article-title":"The precision of NMR structure ensembles revisited","volume":"25","author":"Spronk","year":"2003","journal-title":"J. Biomol. NMR"},{"key":"2023020211125140800_B21","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1111\/j.2517-6161.1974.tb00994.x","article-title":"Cross-validatory choice and assessment of statistical predictions","volume":"36","author":"Stone","year":"1974","journal-title":"J. R. Stat. Soc. 
B"},{"key":"2023020211125140800_B22","doi-asserted-by":"crossref","first-page":"936","DOI":"10.1002\/pro.5560020607","article-title":"Representing an ensemble of NMR-derived protein structures by a single structure","volume":"2","author":"Sutcliffe","year":"1993","journal-title":"Protein Sci"},{"key":"2023020211125140800_B23","doi-asserted-by":"crossref","first-page":"18521","DOI":"10.1073\/pnas.0508445103","volume":"103","author":"Theobald","year":"2006","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023020211125140800_B24","doi-asserted-by":"crossref","first-page":"2171","DOI":"10.1093\/bioinformatics\/btl332","volume":"22","author":"Theobald","year":"2006","journal-title":"Bioinformatics"},{"volume-title":"Statistical Analysis of Finite Mixture Distributions.","year":"1985","author":"Titterington","key":"2023020211125140800_B25"},{"key":"2023020211125140800_B26","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1016\/S0969-2126(01)00181-2","article-title":"Movie of the structural changes during a catalytic cycle of nucleoside monophosphate kinases","volume":"3","author":"Vonrhein","year":"1995","journal-title":"Structure"},{"key":"2023020211125140800_B27","doi-asserted-by":"crossref","first-page":"2042","DOI":"10.1074\/jbc.M707632200","article-title":"Conformational transitions in adenylate kinase. Allosteric communication reduces misligation","volume":"283","author":"Whitford","year":"2008","journal-title":"J. Biol. 
Chem"},{"key":"2023020211125140800_B28","doi-asserted-by":"crossref","DOI":"10.1051\/epn\/19861701011","volume-title":"NMR of Proteins and Nucleic Acids.","author":"W\u00fcthrich","year":"1986"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/19\/2184\/49052494\/bioinformatics_24_19_2184.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/19\/2184\/49052494\/bioinformatics_24_19_2184.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,31]],"date-time":"2025-01-31T06:19:00Z","timestamp":1738304340000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/19\/2184\/246315"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,7,28]]},"references-count":28,"journal-issue":{"issue":"19","published-print":{"date-parts":[[2008,10,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn396","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"type":"electronic","value":"1367-4811"},{"type":"print","value":"1367-4803"}],"subject":[],"published-other":{"date-parts":[[2008,10,1]]},"published":{"date-parts":[[2008,7,28]]}}}