{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,7]],"date-time":"2026-04-07T11:20:35Z","timestamp":1775560835321,"version":"3.50.1"},"reference-count":52,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2021,5,28]],"date-time":"2021-05-28T00:00:00Z","timestamp":1622160000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-16-CE40-0028"],"award-info":[{"award-number":["ANR-16-CE40-0028"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-19-PI3A-0004"],"award-info":[{"award-number":["ANR-19-PI3A-0004"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>Proteins are the main active molecules of life. Although natural proteins play many roles, as enzymes or antibodies for example, there is a need to go beyond the repertoire of natural proteins to produce engineered proteins that precisely meet application requirements, in terms of function, stability, activity or other protein capacities. Computational Protein Design aims at designing new proteins from first principles, using full-atom molecular models. However, the size and complexity of proteins require approximations to make them amenable to energetic optimization queries. These approximations make the design process less reliable, and a provable optimal solution may fail. In practice, expensive libraries of solutions are therefore generated and tested. In this paper, we explore the idea of generating libraries of provably diverse low-energy solutions by extending cost function network algorithms with dedicated automaton-based diversity constraints on a large set of realistic full protein redesign problems. We observe that it is possible to generate provably diverse libraries in reasonable time and that the produced libraries do enhance the Native Sequence Recovery, a traditional measure of design methods reliability.<\/jats:p>","DOI":"10.3390\/a14060168","type":"journal-article","created":{"date-parts":[[2021,5,28]],"date-time":"2021-05-28T11:33:20Z","timestamp":1622201600000},"page":"168","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Guaranteed Diversity and Optimality in Cost Function Network Based Computational Protein Design Methods"],"prefix":"10.3390","volume":"14","author":[{"given":"Manon","family":"Ruffini","sequence":"first","affiliation":[{"name":"Universit\u00e9 F\u00e9d\u00e9rale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jelena","family":"Vucinic","sequence":"additional","affiliation":[{"name":"TBI, Universit\u00e9 de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Simon de","family":"de Givry","sequence":"additional","affiliation":[{"name":"Universit\u00e9 F\u00e9d\u00e9rale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3727-6698","authenticated-orcid":false,"given":"George","family":"Katsirelos","sequence":"additional","affiliation":[{"name":"MIA-Paris-Math\u00e9matiques et Informatique Appliqu\u00e9es, INRAE, 75231 Paris, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2581-5022","authenticated-orcid":false,"given":"Sophie","family":"Barbe","sequence":"additional","affiliation":[{"name":"TBI, Universit\u00e9 de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6049-3415","authenticated-orcid":false,"given":"Thomas","family":"Schiex","sequence":"additional","affiliation":[{"name":"Universit\u00e9 F\u00e9d\u00e9rale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,5,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1126\/science.181.4096.223","article-title":"Principles that govern the folding of protein chains","volume":"181","author":"Anfinsen","year":"1973","journal-title":"Science"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1093\/protein\/15.10.779","article-title":"Protein design is NP-hard","volume":"15","author":"Pierce","year":"2002","journal-title":"Protein Eng."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Van Laarhoven, P.J., and Aarts, E.H. (1987). Simulated annealing. Simulated Annealing: Theory and Applications, Springer.","DOI":"10.1007\/978-94-015-7744-1"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"545","DOI":"10.1016\/B978-0-12-381270-4.00019-6","article-title":"ROSETTA3: An object-oriented software suite for the simulation and design of macromolecules","volume":"Volume 487","author":"Tyka","year":"2011","journal-title":"Methods in Enzymology"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2129","DOI":"10.1093\/bioinformatics\/btt374","article-title":"A new framework for computational protein design through cost function network optimization","volume":"29","author":"Allouche","year":"2013","journal-title":"Bioinformatics"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.artint.2014.03.005","article-title":"Computational protein design as an optimization problem","volume":"212","author":"Allouche","year":"2014","journal-title":"Artif. Intell."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1107\/S205225251801480X","article-title":"Computational design of symmetrical eight-bladed \u03b2-propeller proteins","volume":"6","author":"Noguchi","year":"2019","journal-title":"IUCrJ"},{"key":"ref_8","first-page":"631","article-title":"Valued constraint satisfaction problems: Hard and easy problems","volume":"95","author":"Schiex","year":"1995","journal-title":"IJCAI (1)"},{"key":"ref_9","first-page":"4-1","article-title":"Graphical models: Queries, complexity, algorithms","volume":"154","author":"Cooper","year":"2020","journal-title":"Leibniz Int. Proc. Inform."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"gzab011","DOI":"10.1093\/protein\/gzab011","article-title":"Molecular flexibility in computational protein design: An algorithmic perspective","volume":"34","author":"Bouchiba","year":"2021","journal-title":"Protein Eng. Des. Sel."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"e1374","DOI":"10.1002\/wcms.1374","article-title":"Essentials of de novo protein design: Methods and applications","volume":"8","author":"Marcos","year":"2018","journal-title":"Wiley Interdiscip. Rev. Comput. Mol. Sci."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"8577","DOI":"10.1073\/pnas.1321126111","article-title":"Removing T-cell epitopes with computational protein design","volume":"111","author":"King","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_13","unstructured":"Kirillov, A., Shlezinger, D., Vetrov, D.P., Rother, C., and Savchynskyy, B. (2015, January 7\u201312). M-Best-Diverse Labelings for Submodular Energies and Beyond. Proceedings of the Twenty-Ninth Conference on Neural Information Processing Systems, Quebec, QC, Canada."},{"key":"ref_14","unstructured":"Bacchus, F., and Van Beek, P. (1998, January 26). On the conversion between non-binary and binary constraint satisfaction problems. Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI), Madison, WI, USA."},{"key":"ref_15","unstructured":"Larrosa, J., and Dechter, R. (2000, January 18). On the dual representation of non-binary semiring-based CSPs. Proceedings of the CP\u20192000 Workshop on Soft Constraints, Singapore."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"844","DOI":"10.1016\/j.str.2011.03.019","article-title":"A smoothed backbone-dependent rotamer library for proteins derived from adaptive kernel density estimates and regressions","volume":"19","author":"Shapovalov","year":"2011","journal-title":"Structure"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1002\/1097-0134(20000815)40:3<389::AID-PROT50>3.0.CO;2-2","article-title":"The penultimate rotamer library","volume":"40","author":"Lovell","year":"2000","journal-title":"Proteins Struct. Funct. Bioinform."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1668","DOI":"10.1002\/jcc.20290","article-title":"The Amber biomolecular simulation programs","volume":"26","author":"Case","year":"2005","journal-title":"J. Comput. Chem."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1545","DOI":"10.1002\/jcc.21287","article-title":"CHARMM: The biomolecular simulation program","volume":"30","author":"Brooks","year":"2009","journal-title":"J. Comput. Chem."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"3031","DOI":"10.1021\/acs.jctc.7b00125","article-title":"The Rosetta all-atom energy function for macromolecular modeling and design","volume":"13","author":"Alford","year":"2017","journal-title":"J. Chem. Theory Comput."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Samish, I. (2017). Computational Protein Design, Springer.","DOI":"10.1007\/978-1-4939-6637-0"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1016\/B978-0-12-394292-0.00005-9","article-title":"OSPREY: Protein design with ensembles, flexibility, and provable algorithms","volume":"Volume 523","author":"Gainza","year":"2013","journal-title":"Methods in Enzymology"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"999","DOI":"10.1002\/1096-987X(200008)21:11<999::AID-JCC9>3.0.CO;2-A","article-title":"Conformational splitting: A more powerful criterion for dead-end elimination","volume":"21","author":"Pierce","year":"2000","journal-title":"J. Comput. Chem."},{"key":"ref_24","unstructured":"Rossi, F., van Beek, P., and Walsh, T. (2006). Handbook of Constraint Programming, Elsevier."},{"key":"ref_25","unstructured":"Koller, D., and Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques, MIT press."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"449","DOI":"10.1016\/j.artint.2010.02.001","article-title":"Soft arc consistency revisited","volume":"174","author":"Cooper","year":"2010","journal-title":"Artif. Intell."},{"key":"ref_27","unstructured":"Cooper, M.C., De Givry, S., S\u00e1nchez-Fibla, M., Schiex, T., and Zytnicki, M. (2008, January 13\u201317). Virtual Arc Consistency for Weighted CSP. Proceedings of the Twenty-third National Conference on Artificial Intelligence (AAAI), Chicago, IL, USA."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1048","DOI":"10.1002\/jcc.24290","article-title":"Fast search algorithms for computational protein design","volume":"37","author":"Roberts","year":"2016","journal-title":"J. Comput. Chem."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Traor\u00e9, S., Allouche, D., Andr\u00e9, I., Schiex, T., and Barbe, S. (2017). Deterministic Search Methods for Computational Protein Design. Computational Protein Design, Springer.","DOI":"10.1007\/978-1-4939-6637-0_4"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"Henikoff","year":"1992","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_31","unstructured":"Hebrard, E., Hnich, B., O\u2019Sullivan, B., and Walsh, T. (2005, January 9\u201313). Finding diverse and similar solutions in constraint programming. Proceedings of the Twentieth National Conference on Artificial Intelligence (AAAI), Pittsburgh, PA, USA."},{"key":"ref_32","unstructured":"Hebrard, E., O\u2019Sullivan, B., and Walsh, T. (2007, January 6\u201312). Distance Constraints in Constraint Satisfaction. Proceedings of the 20th International Joint Conference on Artificial Intelligence, Hyderabad, India."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Had\u017ei\u0107, T., Holland, A., and O\u2019Sullivan, B. (2009). Reasoning about optimal collections of solutions. International Conference on Principles and Practice of Constraint Programming, Springer.","DOI":"10.1007\/978-3-642-04244-7_34"},{"key":"ref_34","unstructured":"Petit, T., and Trapp, A.C. (2015, January 25\u201331). Finding diverse solutions of high quality to constraint optimization problems. Proceedings of the Twenty-fourth International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Batra, D., Yadollahpour, P., Guzman-Rivera, A., and Shakhnarovich, G. (2012). Diverse M-best solutions in Markov Random Fields. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-642-33715-4_1"},{"key":"ref_36","unstructured":"Prasad, A., Jegelka, S., and Batra, D. (2014, January 8\u201313). Submodular meets structured: Finding diverse subsets in exponentially-large structured item sets. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Kirillov, A., Savchynskyy, B., Schlesinger, D., Vetrov, D., and Rother, C. (2015, January 7\u201313). Inferring M-best diverse labelings in a single one. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.211"},{"key":"ref_38","unstructured":"Chen, C., Kolmogorov, V., Zhu, Y., Metaxas, D., and Lampert, C. (May, January 29). Computing the M most probable modes of a graphical model. Proceedings of the Sixteenth International Conference on Artificial Intelligence and Statistics, Scottsdale, AZ, USA."},{"key":"ref_39","unstructured":"Chen, C., Yuan, C., Ye, Z., and Chen, C. (2018, January 11\u201314). Solving M-Modes in Loopy Graphs Using Tree Decompositions. Proceedings of the International Conference on Probabilistic Graphical Models, Prague, Czech Republic."},{"key":"ref_40","unstructured":"Chen, C., Liu, H., Metaxas, D., and Zhao, T. (2014, January 8\u201313). Mode estimation for high dimensional discrete tree graphical models. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_41","unstructured":"Chen, C., Yuan, C., and Chen, C. (2016, January 9\u201315). Solving M-Modes Using Heuristic Search. Proceedings of the Twenty-fifth International Joint Conference on Artificial Intelligence, New York, NY, USA."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Pesant, G. (2004). A regular language membership constraint for finite sequences of variables. International Conference on Principles and Practice of Constraint Programming, Springer.","DOI":"10.1007\/978-3-540-30201-8_36"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1016\/j.artint.2016.06.005","article-title":"Tractability-preserving transformations of global cost functions","volume":"238","author":"Allouche","year":"2016","journal-title":"Artif. Intell."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Ruffini, M., Vucinic, J., de Givry, S., Katsirelos, G., Barbe, S., and Schiex, T. (2019, January 4\u20136). Guaranteed Diversity & Quality for the Weighted CSP. Proceedings of the 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), Portland, OR, USA.","DOI":"10.1109\/ICTAI.2019.00012"},{"key":"ref_45","unstructured":"Allouche, D., De Givry, S., Katsirelos, G., Schiex, T., and Zytnicki, M. (September, January 31). Anytime hybrid best-first search with tree decomposition for weighted CSP. Proceedings of the International Conference on Principles and Practice of Constraint Programming, Cork, Ireland."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"5980","DOI":"10.1021\/acs.jctc.5b00594","article-title":"Guaranteed discrete energy optimization on large protein design problems","volume":"11","author":"Simoncini","year":"2015","journal-title":"J. Chem. Theory Comput."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Ollikainen, N., and Kortemme, T. (2013). Computational protein design quantifies structural constraints on amino acid covariation. PLoS Comput. Biol., 9.","DOI":"10.1371\/journal.pcbi.1003313"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"6201","DOI":"10.1021\/acs.jctc.6b00819","article-title":"Simultaneous optimization of biomolecular energy functions on features from small molecules and macromolecules","volume":"12","author":"Park","year":"2016","journal-title":"J. Chem. Theory Comput."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1016\/0004-3702(70)90007-X","article-title":"Heuristic search viewed as path finding in a graph","volume":"1","author":"Pohl","year":"1970","journal-title":"Artif. Intell."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1145\/1162349.1162350","article-title":"Fast and accurate algorithms for protein side-chain packing","volume":"53","author":"Xu","year":"2006","journal-title":"J. ACM (JACM)"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1089\/cmb.2015.0194","article-title":"BWM*: A novel, provable, ensemble-based dynamic programming algorithm for sparse approximations of computational protein design","volume":"23","author":"Jou","year":"2016","journal-title":"J. Comput. Biol."},{"key":"ref_52","unstructured":"De Givry, S., Schiex, T., and Verfaillie, G. (2006, January 16\u201320). Exploiting tree decomposition and soft local consistency in weighted CSP. Proceedings of the Twenty-First National Conference on Artificial Intelligence (AAAI), Boston, Massachusetts."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/14\/6\/168\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:09:32Z","timestamp":1760162972000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/14\/6\/168"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,5,28]]},"references-count":52,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2021,6]]}},"alternative-id":["a14060168"],"URL":"https:\/\/doi.org\/10.3390\/a14060168","relation":{"has-preprint":[{"id-type":"doi","id":"10.20944\/preprints202104.0250.v1","asserted-by":"object"}]},"ISSN":["1999-4893"],"issn-type":[{"value":"1999-4893","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,5,28]]}}}