{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T20:59:05Z","timestamp":1776286745522,"version":"3.50.1"},"reference-count":48,"publisher":"Proceedings of the National Academy of Sciences","issue":"3","content-domain":{"domain":["www.pnas.org"],"crossmark-restriction":true},"short-container-title":["Proc. Natl. Acad. Sci. U.S.A."],"published-print":{"date-parts":[[2013,1,15]]},"abstract":"<jats:p>Knowing how protein sequence maps to function (the \u201cfitness landscape\u201d) is critical for understanding protein evolution as well as for engineering proteins with new and useful properties. We demonstrate that the protein fitness landscape can be inferred from experimental data, using Gaussian processes, a Bayesian learning technique. Gaussian process landscapes can model various protein sequence properties, including functional status, thermostability, enzyme activity, and ligand binding affinity. Trained on experimental data, these models achieve unrivaled quantitative accuracy. Furthermore, the explicit representation of model uncertainty allows for efficient searches through the vast space of possible sequences. We develop and test two protein sequence design algorithms motivated by Bayesian decision theory. The first one identifies small sets of sequences that are informative about the landscape; the second one identifies optimized sequences by iteratively improving the Gaussian process model in regions of the landscape that are predicted to be optimized. We demonstrate the ability of Gaussian processes to guide the search through protein sequence space by designing, constructing, and testing chimeric cytochrome P450s. These algorithms allowed us to engineer active P450 enzymes that are more thermostable than any previously made by chimeragenesis, rational design, or directed evolution.<\/jats:p>","DOI":"10.1073\/pnas.1215251110","type":"journal-article","created":{"date-parts":[[2012,12,31]],"date-time":"2012-12-31T22:18:40Z","timestamp":1356992320000},"update-policy":"https:\/\/doi.org\/10.1073\/pnas.cm10313","source":"Crossref","is-referenced-by-count":330,"title":["Navigating the protein fitness landscape with Gaussian processes"],"prefix":"10.1073","volume":"110","author":[{"given":"Philip A.","family":"Romero","sequence":"first","affiliation":[{"name":"Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125; and"}]},{"given":"Andreas","family":"Krause","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Swiss Federal Institute of Technology, 8092 Zurich, Switzerland"}]},{"given":"Frances H.","family":"Arnold","sequence":"additional","affiliation":[{"name":"Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125; and"}]}],"member":"341","published-online":{"date-parts":[[2012,12,31]]},"reference":[{"key":"e_1_1_2_17_10_1_2","doi-asserted-by":"publisher","DOI":"10.1038\/nrm2805"},{"key":"e_1_1_2_17_10_2_2","volume-title":"Gaussian Processes for Machine Learning","author":"Rasmussen CE","year":"2006","unstructured":"CE Rasmussen, C Williams Gaussian Processes for Machine Learning (MIT Press, Cambridge, MA, 2006)."},{"key":"e_1_1_2_17_10_3_2","doi-asserted-by":"publisher","DOI":"10.1038\/nbt1333"},{"key":"e_1_1_2_17_10_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2011.2182033"},{"key":"e_1_3_3_1_2","doi-asserted-by":"publisher","DOI":"10.1038\/nrm2805"},{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-7799(98)01188-3"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.1093\/protein\/15.10.779"},{"key":"e_1_3_3_4_2","doi-asserted-by":"publisher","DOI":"10.1038\/35070613"},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jmb.2004.06.058"},{"key":"e_1_3_3_6_2","doi-asserted-by":"publisher","DOI":"10.1002\/prot.10016"},{"key":"e_1_3_3_7_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jtbi.2005.05.001"},{"key":"e_1_3_3_8_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.278.5335.82"},{"key":"e_1_3_3_9_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.1152692"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1002\/pro.481"},{"key":"e_1_3_3_11_2","unstructured":"CKI Williams CE Rasmussen Advances in Neural Information Processing Systems eds Touretzky DS Mozer MC Hasselmo ME (MIT Press Cambridge MA) pp 514\u2013520. (1996)."},{"key":"e_1_3_3_12_2","volume-title":"Gaussian Processes for Machine Learning","author":"Rasmussen CE","year":"2006","unstructured":"CE Rasmussen, C Williams Gaussian Processes for Machine Learning (MIT Press, Cambridge, MA, 2006)."},{"key":"e_1_3_3_13_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4612-1494-6"},{"key":"e_1_3_3_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.735807"},{"key":"e_1_3_3_15_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.0040112"},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.1038\/nbt1333"},{"key":"e_1_3_3_17_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.chembiol.2004.02.018"},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.1080\/02664768700000020"},{"key":"e_1_3_3_19_2","doi-asserted-by":"crossref","unstructured":"C Guestrin A Krause AP Singh Near-optimal sensor placements in Gaussian processes. Proceedings of the 22nd International Conference on Machine Learning eds De Raedt L Wrobel S (ACM New York NY) Vol 1 pp 265\u2013272. (2005).","DOI":"10.1145\/1102351.1102385"},{"key":"e_1_3_3_20_2","doi-asserted-by":"crossref","unstructured":"A Krause E Horvitz A Kansal F Zhao Toward community sensing. Proceedings of the 7th International Conference on Information Processing in Sensor Networks (IEEE Computer Society Washington DC) pp 481\u2013492. (2008).","DOI":"10.1109\/IPSN.2008.37"},{"key":"e_1_3_3_21_2","unstructured":"A Krause C Guestrin Near-optimal observation selection using submodular functions. Proceedings of the 22nd National Conference on Artificial Intelligence (AAAI Press Palo Alto CA) Vol 22 pp 1650\u20131654. (2007)."},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/S0169-5347(97)01098-7"},{"key":"e_1_3_3_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jmb.2012.05.029"},{"key":"e_1_3_3_24_2","volume-title":"Reinforcement Learning","author":"Sutton RS","year":"1998","unstructured":"RS Sutton, AG Barto Reinforcement Learning (MIT Press, Cambridge, MA, 1998)."},{"key":"e_1_3_3_25_2","unstructured":"D Lizotte T Wang M Bowling D Schuurmans Automatic gait optimization with Gaussian process regression. Proceedings of the 20th International Joint Conference on Artificial Intelligence ed Veloso MM (AAAI Press Palo Alto CA) pp 944\u2013949. (2007)."},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1008306431147"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","DOI":"10.1137\/070693424"},{"key":"e_1_3_3_28_2","first-page":"397","article-title":"Using confidence bounds for exploitation-exploration trade-offs","volume":"3","author":"Auer P","year":"2002","unstructured":"P Auer, Using confidence bounds for exploitation-exploration trade-offs. J Mach Learn Res 3, 397\u2013422 (2002).","journal-title":"J Mach Learn Res"},{"key":"e_1_3_3_29_2","unstructured":"N Srinivas A Krause SM Kakade M Seeger Gaussian process optimization in the bandit setting: No regret and experimental design. Proceedings of the 27th International Conference on Machine learning eds Furnkranz J Joachims T (Omnipress Madison WI) pp 1015\u20131022. (2010)."},{"key":"e_1_3_3_30_2","unstructured":"T Desautels A Krause J Burdick Parallelizing exploration-exploitation tradeoffs with Gaussian process bandit optimization. Proceedings of the 29th International Conference on Machine Learning eds Langford J Pineau J (Omnipress Madison WI). (2012)."},{"key":"e_1_3_3_31_2","doi-asserted-by":"publisher","DOI":"10.1002\/cbic.200300660"},{"key":"e_1_3_3_32_2","doi-asserted-by":"publisher","DOI":"10.1038\/nbt1286"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.1186\/1472-6750-7-16"},{"key":"e_1_3_3_34_2","doi-asserted-by":"publisher","DOI":"10.1110\/ps.03348304"},{"key":"e_1_3_3_35_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.0500729102"},{"key":"e_1_3_3_36_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.chembiol.2007.01.009"},{"key":"e_1_3_3_37_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF01588971"},{"key":"e_1_3_3_38_2","doi-asserted-by":"publisher","DOI":"10.1016\/0012-365X(83)90011-0"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/BFb0006528"},{"key":"e_1_3_3_40_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.88.13.5597"},{"key":"e_1_3_3_41_2","first-page":"141","article-title":"High-throughput screen for aromatic hydroxylation","volume":"230","author":"Otey CR","year":"2003","unstructured":"CR Otey, JM Joern, High-throughput screen for aromatic hydroxylation. Methods Mol Biol 230, 141\u2013148 (2003).","journal-title":"Methods Mol Biol"},{"key":"e_1_3_3_42_2","doi-asserted-by":"publisher","DOI":"10.1038\/nbt.1609"},{"key":"e_1_3_3_43_2","first-page":"137","article-title":"High-throughput carbon monoxide binding assay for cytochromes p450","volume":"230","author":"Otey CR","year":"2003","unstructured":"CR Otey, High-throughput carbon monoxide binding assay for cytochromes p450. Methods Mol Biol 230, 137\u2013139 (2003).","journal-title":"Methods Mol Biol"},{"key":"e_1_3_3_44_2","doi-asserted-by":"publisher","DOI":"10.1093\/biomet\/80.3.603"}],"container-title":["Proceedings of the National Academy of Sciences"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/pnas.org\/doi\/pdf\/10.1073\/pnas.1215251110","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,4,12]],"date-time":"2022-04-12T21:54:04Z","timestamp":1649800444000},"score":1,"resource":{"primary":{"URL":"https:\/\/pnas.org\/doi\/full\/10.1073\/pnas.1215251110"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2012,12,31]]},"references-count":48,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2013,1,15]]}},"alternative-id":["10.1073\/pnas.1215251110"],"URL":"https:\/\/doi.org\/10.1073\/pnas.1215251110","relation":{"has-review":[{"id-type":"doi","id":"10.3410\/f.718090524.793482948","asserted-by":"object"}]},"ISSN":["0027-8424","1091-6490"],"issn-type":[{"value":"0027-8424","type":"print"},{"value":"1091-6490","type":"electronic"}],"subject":[],"published":{"date-parts":[[2012,12,31]]},"assertion":[{"value":"2012-12-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}