{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T13:32:30Z","timestamp":1760707950652},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2016,10,2]],"date-time":"2016-10-02T00:00:00Z","timestamp":1475366400000},"content-version":"vor","delay-in-days":3015,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/2.0\/uk\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2008,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: The task of engineering a protein to perform a target biological function is known as protein design. A commonly used paradigm casts this functional design problem as a structural one, assuming a fixed backbone. In probabilistic protein design, positional amino acid probabilities are used to create a random library of sequences to be simultaneously screened for biological activity. Clearly, certain choices of probability distributions will be more successful in yielding functional sequences. However, since the number of sequences is exponential in protein length, computational optimization of the distribution is difficult.<\/jats:p><jats:p>Results: In this paper, we develop a computational framework for probabilistic protein design following the structural paradigm. We formulate the distribution of sequences for a structure using the Boltzmann distribution over their free energies. The corresponding probabilistic graphical model is constructed, and we apply belief propagation (BP) to calculate marginal amino acid probabilities. We test this method on a large structural dataset and demonstrate the superiority of BP over previous methods. Nevertheless, since the results obtained by BP are far from optimal, we thoroughly assess the paradigm using high-quality experimental data. We demonstrate that, for small scale sub-problems, BP attains identical results to those produced by exact inference on the paradigmatic model. However, quantitative analysis shows that the distributions predicted significantly differ from the experimental data. These findings, along with the excellent performance we observed using BP on the smaller problems, suggest potential shortcomings of the paradigm. We conclude with a discussion of how it may be improved in the future.<\/jats:p><jats:p>Contact: \u00a0fromer@cs.huji.ac.il<\/jats:p>","DOI":"10.1093\/bioinformatics\/btn168","type":"journal-article","created":{"date-parts":[[2008,6,27]],"date-time":"2008-06-27T07:43:13Z","timestamp":1214552593000},"page":"i214-i222","source":"Crossref","is-referenced-by-count":19,"title":["A computational framework to empower probabilistic protein design"],"prefix":"10.1093","volume":"24","author":[{"given":"Menachem","family":"Fromer","sequence":"first","affiliation":[{"name":"School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chen","family":"Yanover","sequence":"additional","affiliation":[{"name":"School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"286","published-online":{"date-parts":[[2008,7,1]]},"reference":[{"key":"2023020210383245400_B1","doi-asserted-by":"crossref","first-page":"253","DOI":"10.1038\/35051731","article-title":"Combinatorial and computational challenges for biocatalyst design","volume":"409","author":"Arnold","year":"2001","journal-title":"Nature"},{"key":"2023020210383245400_B2","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1093\/nar\/28.1.235","article-title":"The Protein Data Bank","volume":"28","author":"Berman","year":"2000","journal-title":"Nucleic Acids Res."},{"key":"2023020210383245400_B3","doi-asserted-by":"crossref","first-page":"14238","DOI":"10.1074\/jbc.M201453200","article-title":"Design of a novel peptide inhibitor of HIV fusion that disrupts the internal trimeric coiled-coil of gp41","volume":"277","author":"Bewley","year":"2002","journal-title":"J. Biol. Chem."},{"key":"2023020210383245400_B4","doi-asserted-by":"crossref","first-page":"154908","DOI":"10.1063\/1.2062047","article-title":"Statistical theory for protein ensembles with designed energy landscapes","volume":"123","author":"Biswas","year":"2005","journal-title":"J. Chem. Phys"},{"key":"2023020210383245400_B5","doi-asserted-by":"crossref","first-page":"1101","DOI":"10.1016\/j.jmb.2003.10.004","article-title":"Computational design and characterization of a monomeric helical dinuclear metalloprotein","volume":"334","author":"Calhoun","year":"2003","journal-title":"J. Mol. Biol."},{"key":"2023020210383245400_B6","doi-asserted-by":"crossref","first-page":"10153","DOI":"10.1073\/pnas.0504023102","article-title":"Computational prediction of native protein ligand-binding and enzyme active site sequences","volume":"102","author":"Chakrabarti","year":"2005","journal-title":"PNAS"},{"key":"2023020210383245400_B7","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1007\/978-94-011-5014-9_2","article-title":"Advanced inference in Bayesian networks","volume-title":"Learning in Graphical Models","author":"Cowell","year":"1998"},{"key":"2023020210383245400_B8","article-title":"The inverse protein folding problem: self consistent mean field optimisation of a structure specific mutation matrix","volume-title":"Pacific Symposium on Biocomputing","author":"Delarue"},{"key":"2023020210383245400_B9","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1093\/nar\/26.1.313","article-title":"The HSSP database of protein structure-sequence alignments and family profiles","volume":"26","author":"Dodge","year":"1998","journal-title":"Nucleic Acids Res."},{"key":"2023020210383245400_B10","first-page":"230","article-title":"Backbone-dependent rotamer library for proteins application to side-chain prediction","volume-title":"J. Mol. Biol","author":"Dunbrack","year":"1993"},{"key":"2023020210383245400_B11","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1016\/S0959-440X(99)80072-4","article-title":"Energy functions for protein design","volume":"9","author":"Gordon","year":"1999","journal-title":"Curr. Opin. Struc. Biol"},{"key":"2023020210383245400_B12","doi-asserted-by":"crossref","first-page":"1711","DOI":"10.1110\/ps.04690804","article-title":"De novo proteins from designed combinatorial libraries","volume":"13","author":"Hecht","year":"2004","journal-title":"Protein Sci."},{"key":"2023020210383245400_B13","volume-title":"Statistical mechanics","author":"Huang","year":"1987"},{"key":"2023020210383245400_B14","doi-asserted-by":"crossref","first-page":"e164","DOI":"10.1371\/journal.pcbi.0030164","article-title":"Design of multi-specificity in protein interfaces","volume":"3","author":"Humphris","year":"2007","journal-title":"PLoS Computational Biology"},{"key":"2023020210383245400_B15","doi-asserted-by":"crossref","first-page":"13554","DOI":"10.1073\/pnas.212068599","article-title":"Folding free energy function selects native-like protein sequences in the core but not on the surface","volume":"99","author":"Jaramillo","year":"2002","journal-title":"PNAS"},{"key":"2023020210383245400_B16","first-page":"366","article-title":"Free energy estimates of all-atom protein structures using generalized belief propagation","volume-title":"RECOMB","author":"Kamisetty","year":"2007"},{"key":"2023020210383245400_B17","doi-asserted-by":"crossref","first-page":"607","DOI":"10.1006\/jmbi.2000.4422","article-title":"Statistical theory for protein combinatorial libraries. packing interactions, backbone flexibility, and sequence variability of main-chain structure","volume":"306","author":"Kono","year":"2001","journal-title":"J. Mol. Biol"},{"key":"2023020210383245400_B18","doi-asserted-by":"crossref","first-page":"10383","DOI":"10.1073\/pnas.97.19.10383","article-title":"Native protein sequences are close to optimal for their structures","volume":"97","author":"Kuhlman","year":"2000","journal-title":"PNAS"},{"key":"2023020210383245400_B19","doi-asserted-by":"crossref","first-page":"1364","DOI":"10.1126\/science.1089427","article-title":"Design of a novel globular protein fold with atomic-level accuracy","volume":"302","author":"Kuhlman","year":"2003","journal-title":"Science"},{"key":"2023020210383245400_B20","doi-asserted-by":"crossref","first-page":"6883","DOI":"10.1021\/bi700215x","article-title":"Exhaustive mutagenesis of six secondary active-site residues in Escherichia coli chorismate mutase shows the importance of hydrophobic side chains and a helix n-capping position for stability and catalysis","volume":"46","author":"Lassila","year":"2007","journal-title":"Biochemistry"},{"key":"2023020210383245400_B21","doi-asserted-by":"crossref","DOI":"10.1093\/oso\/9780198522195.001.0001","volume-title":"Graphical Models","author":"Lauritzen","year":"1996"},{"key":"2023020210383245400_B22","doi-asserted-by":"crossref","first-page":"513","DOI":"10.1016\/S0959-440X(03)00104-0","article-title":"Designing proteins for therapeutic applications","volume":"13","author":"Lazar","year":"2003","journal-title":"Curr. Opin. Struc. Biol."},{"key":"2023020210383245400_B23","doi-asserted-by":"crossref","first-page":"740","DOI":"10.1089\/cmb.2005.12.740","article-title":"A novel ensemble-based scoring and search algorithm for protein redesign and its application to modify the substrate specificity of the gramicidin synthetase a phenylalanine adenylation enzyme","volume":"12","author":"Lilien","year":"2005","journal-title":"J. Com. Biol."},{"key":"2023020210383245400_B24","doi-asserted-by":"crossref","first-page":"290","DOI":"10.1145\/974614.974653","article-title":"The evolutionary capacity of protein structures","volume-title":"RECOMB","author":"Meyerguz","year":"2004"},{"key":"2023020210383245400_B25","doi-asserted-by":"crossref","first-page":"5091","DOI":"10.1073\/pnas.0831190100","article-title":"Identifying residue-residue clashes in protein hybrids by using a second-order mean-field approach","volume":"100","author":"Moore","year":"2003","journal-title":"PNAS"},{"key":"2023020210383245400_B26","doi-asserted-by":"crossref","first-page":"22378","DOI":"10.1074\/jbc.M603826200","article-title":"Comprehensive and quantitative mapping of energy landscapes for protein\u2013protein interactions by rapid combinatorial scanning","volume":"281","author":"Pal","year":"2006","journal-title":"J. Biol. Chem."},{"key":"2023020210383245400_B27","doi-asserted-by":"crossref","first-page":"487","DOI":"10.1016\/j.sbi.2004.06.002","article-title":"Advances in computational protein design","volume":"14","author":"Park","year":"2004","journal-title":"Curr. Opin. Struc. Biol."},{"key":"2023020210383245400_B28","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1016\/j.compchemeng.2004.07.037","article-title":"Progress in the development and application of computational methods for probabilistic protein design","volume":"29","author":"Park","year":"2005","journal-title":"Comput. Chem. Eng."},{"key":"2023020210383245400_B29","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1093\/protein\/gzl003","article-title":"Limitations of yeast surface display in engineering proteins of high thermostability","volume":"19","author":"Park","year":"2006","journal-title":"Protein Eng. Des. Sel."},{"key":"2023020210383245400_B30","volume-title":"Probabilistic reasoning in intelligent systems: networks of plausible inference","author":"Pearl","year":"1988"},{"key":"2023020210383245400_B31","doi-asserted-by":"crossref","first-page":"1605","DOI":"10.1002\/jcc.20084","article-title":"UCSF Chimera \u2013 a visualization system for exploratory research and analysis","volume":"25","author":"Pettersen","year":"2004","journal-title":"J. Comput. Chem."},{"key":"2023020210383245400_B32","doi-asserted-by":"crossref","first-page":"3973","DOI":"10.2174\/138161206778743655","article-title":"Computational protein design: a novel path to future protein drugs","volume":"12","author":"Rosenberg","year":"2006","journal-title":"Curr. Pharm. Des."},{"key":"2023020210383245400_B33","doi-asserted-by":"crossref","first-page":"631","DOI":"10.1016\/j.jmb.2004.11.062","article-title":"Recapitulation of protein family divergence using flexible backbone protein design","volume":"346","author":"Saunders","year":"2005","journal-title":"J. Mol. Biol"},{"key":"2023020210383245400_B34","doi-asserted-by":"crossref","first-page":"638","DOI":"10.1126\/science.1112160","article-title":"Progress in modeling of protein structures and interactions","volume":"310","author":"Schueler-Furman","year":"2005","journal-title":"Science"},{"key":"2023020210383245400_B35","doi-asserted-by":"crossref","first-page":"638","DOI":"10.1126\/science.1112160","article-title":"Progress in modeling of protein structures and interactions","volume":"310","author":"Schueler-Furman","year":"2005","journal-title":"Science"},{"key":"2023020210383245400_B36","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1016\/S0022-2836(02)00881-1","article-title":"Modulating calmodulin specificity through computational protein design","volume":"323","author":"Shifman","year":"2002","journal-title":"J. Mol. Biol."},{"key":"2023020210383245400_B37","doi-asserted-by":"crossref","first-page":"3778","DOI":"10.1073\/pnas.051614498","article-title":"Computational method to reduce the search space for directed protein evolution","volume":"98","author":"Voigt","year":"2001","journal-title":"PNAS"},{"key":"2023020210383245400_B38","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1016\/j.cplett.2004.10.153","article-title":"Computational methods for protein design and protein sequence variability: biased monte carlo and replica exchange","volume":"401","author":"Yang","year":"2005","journal-title":"Chem. Phys. Lett"},{"key":"2023020210383245400_B39","first-page":"1457","article-title":"Approximate inference and protein-folding","volume-title":"Advances in Neural Information Processing Systems15","author":"Yanover","year":"2003"},{"key":"2023020210383245400_B40","first-page":"1887","article-title":"Linear programming relaxations and belief propagation \u2013 an empirical study","volume":"7","author":"Yanover","year":"2006","journal-title":"J. Mach. Learn. Res."},{"key":"2023020210383245400_B41","first-page":"381","article-title":"Minimizing and learning energy functions for side-chain prediction","volume-title":"In RECOMB","author":"Yanover","year":"2007"},{"key":"2023020210383245400_B42","doi-asserted-by":"crossref","first-page":"2282","DOI":"10.1109\/TIT.2005.850085","article-title":"Constructing free-energy approximations and generalized belief propagation algorithms","volume":"51","author":"Yedidia","year":"2005","journal-title":"IEEE Trans. Inf. Theory"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/13\/i214\/49054163\/bioinformatics_24_13_i214.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/24\/13\/i214\/49054163\/bioinformatics_24_13_i214.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,27]],"date-time":"2024-02-27T22:21:21Z","timestamp":1709072481000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/24\/13\/i214\/232992"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2008,7,1]]},"references-count":42,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2008,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btn168","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2008,7,1]]},"published":{"date-parts":[[2008,7,1]]}}}