{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T11:20:27Z","timestamp":1767180027037,"version":"build-2238731810"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1011521","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2023,11,14]],"date-time":"2023-11-14T00:00:00Z","timestamp":1699920000000}}],"reference-count":71,"publisher":"Public Library of Science (PLoS)","issue":"10","license":[{"start":{"date-parts":[[2023,10,26]],"date-time":"2023-10-26T00:00:00Z","timestamp":1698278400000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"crossref","award":["RBMPro CE30-0021-01"],"award-info":[{"award-number":["RBMPro CE30-0021-01"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001665","name":"Agence Nationale de la Recherche","doi-asserted-by":"publisher","award":["ANR-19 Decrypted CE30-0021-01"],"award-info":[{"award-number":["ANR-19 Decrypted CE30-0021-01"]}],"id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>\n                    Predicting the effects of mutations on protein function is an important issue in evolutionary biology and biomedical applications. Computational approaches, ranging from graphical models to deep-learning architectures, can capture the statistical properties of sequence data and predict the outcome of high-throughput mutagenesis experiments probing the fitness landscape around some wild-type protein. 
However, how the complexity of the models and the characteristics of the data combine to determine the predictive performance remains unclear. Here, based on a theoretical analysis of the prediction error, we propose descriptors of the sequence data, characterizing their quantity and relevance relative to the model. Our theoretical framework identifies a trade-off between these two quantities, and determines the optimal subset of data for the prediction task, showing that simple models can outperform complex ones when inferred from adequately-selected sequences. We also show how repeated subsampling of the sequence data is informative about how much epistasis in the fitness landscape is not captured by the computational model. Our approach is illustrated on several protein families, as well as on\n                    <jats:italic>in silico<\/jats:italic>\n                    solvable protein models.\n                  <\/jats:p>","DOI":"10.1371\/journal.pcbi.1011521","type":"journal-article","created":{"date-parts":[[2023,10,26]],"date-time":"2023-10-26T13:53:54Z","timestamp":1698328434000},"page":"e1011521","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":3,"title":["Infer global, predict local: Quantity-relevance trade-off in protein fitness predictions from sequence 
data"],"prefix":"10.1371","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1135-1300","authenticated-orcid":true,"given":"Lorenzo","family":"Posani","sequence":"first","affiliation":[]},{"given":"Francesca","family":"Rizzato","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4459-0204","authenticated-orcid":true,"given":"R\u00e9mi","family":"Monasson","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1852-7789","authenticated-orcid":true,"given":"Simona","family":"Cocco","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2023,10,26]]},"reference":[{"issue":"7","key":"pcbi.1011521.ref001","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1038\/nrg3744","article-title":"Empirical fitness landscapes and the predictability of evolution","volume":"15","author":"JAG De Visser","year":"2014","journal-title":"Nature Reviews Genetics"},{"issue":"8","key":"pcbi.1011521.ref002","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1038\/nrg3540","article-title":"Evolutionary biochemistry: revealing the historical and physical causes of protein properties","volume":"14","author":"MJ Harms","year":"2013","journal-title":"Nature Reviews Genetics"},{"key":"pcbi.1011521.ref003","unstructured":"Wright S, Jones DF. Proceedings of the Sixth International Congress of Genetics. In: Proceedings of the Sixth International Congress of Genetics. vol. 1; 1932. p. 
356\u2013366."},{"issue":"2","key":"pcbi.1011521.ref004","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1038\/nrg1523","article-title":"The genetic theory of adaptation: a brief history","volume":"6","author":"HA Orr","year":"2005","journal-title":"Nature Reviews Genetics"},{"issue":"5194","key":"pcbi.1011521.ref005","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1126\/science.7809610","article-title":"Experimental tests of the roles of adaptation, chance, and history in evolution","volume":"267","author":"M Travisano","year":"1995","journal-title":"Science"},{"issue":"3","key":"pcbi.1011521.ref006","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1101\/gr.212802","article-title":"Accounting for human polymorphisms predicted to affect protein function","volume":"12","author":"PC Ng","year":"2002","journal-title":"Genome research"},{"issue":"17","key":"pcbi.1011521.ref007","doi-asserted-by":"crossref","first-page":"3894","DOI":"10.1093\/nar\/gkf493","article-title":"Human non-synonymous SNPs: server and survey","volume":"30","author":"V Ramensky","year":"2002","journal-title":"Nucleic acids research"},{"issue":"5","key":"pcbi.1011521.ref008","doi-asserted-by":"crossref","first-page":"1317","DOI":"10.1093\/nar\/gkj518","article-title":"Computational approaches for predicting the biological effect of p53 missense mutations: a comparison of three sequence analysis based methods","volume":"34","author":"E Mathe","year":"2006","journal-title":"Nucleic acids research"},{"issue":"8","key":"pcbi.1011521.ref009","doi-asserted-by":"crossref","first-page":"1161","DOI":"10.1038\/s41588-018-0167-z","article-title":"Predicting the clinical impact of human mutation with deep neural networks","volume":"50","author":"L Sundaram","year":"2018","journal-title":"Nature genetics"},{"issue":"7","key":"pcbi.1011521.ref010","doi-asserted-by":"crossref","first-page":"e9380","DOI":"10.15252\/msb.20199380","article-title":"Using deep mutational scanning to 
benchmark variant effect predictors and identify disease mutations","volume":"16","author":"BJ Livesey","year":"2020","journal-title":"Molecular systems biology"},{"issue":"5747","key":"pcbi.1011521.ref011","doi-asserted-by":"crossref","first-page":"499","DOI":"10.1126\/science.1115649","article-title":"The biochemical architecture of an ancient adaptive landscape","volume":"310","author":"M Lunzer","year":"2005","journal-title":"Science"},{"issue":"6","key":"pcbi.1011521.ref012","first-page":"1165","article-title":"Perspective: sign epistasis and genetic constraint on evolutionary trajectories","volume":"59","author":"DM Weinreich","year":"2005","journal-title":"Evolution"},{"issue":"7422","key":"pcbi.1011521.ref013","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1038\/nature11500","article-title":"The spatial architecture of protein function and adaptation","volume":"491","author":"RN McLaughlin","year":"2012","journal-title":"Nature"},{"issue":"11","key":"pcbi.1011521.ref014","doi-asserted-by":"crossref","first-page":"1537","DOI":"10.1261\/rna.040709.113","article-title":"Deep mutational scanning of an RRM domain of the Saccharomyces cerevisiae poly (A)-binding protein","volume":"19","author":"D Melamed","year":"2013","journal-title":"Rna"},{"issue":"2","key":"pcbi.1011521.ref015","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pgen.1004918","article-title":"Combining natural sequence variation with high throughput mutational data to reveal protein interaction sites","volume":"11","author":"D Melamed","year":"2015","journal-title":"PLoS genetics"},{"issue":"23","key":"pcbi.1011521.ref016","doi-asserted-by":"crossref","first-page":"7159","DOI":"10.1073\/pnas.1422285112","article-title":"Dissecting enzyme function with microfluidic-based deep mutational scanning","volume":"112","author":"PA Romero","year":"2015","journal-title":"Proceedings of the National Academy of 
Sciences"},{"issue":"5","key":"pcbi.1011521.ref017","doi-asserted-by":"crossref","first-page":"882","DOI":"10.1016\/j.cell.2015.01.035","article-title":"Evolvability as a function of purifying selection in TEM-1 \u03b2-lactamase","volume":"160","author":"MA Stiffler","year":"2015","journal-title":"Cell"},{"issue":"32","key":"pcbi.1011521.ref018","doi-asserted-by":"crossref","first-page":"13067","DOI":"10.1073\/pnas.1215206110","article-title":"Capturing the mutational landscape of the beta-lactamase TEM-1","volume":"110","author":"H Jacquier","year":"2013","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"6","key":"pcbi.1011521.ref019","doi-asserted-by":"crossref","first-page":"1581","DOI":"10.1093\/molbev\/msu081","article-title":"A comprehensive, high-resolution map of a gene\u2019s fitness landscape","volume":"31","author":"E Firnberg","year":"2014","journal-title":"Molecular biology and evolution"},{"issue":"14","key":"pcbi.1011521.ref020","doi-asserted-by":"crossref","first-page":"E1263","DOI":"10.1073\/pnas.1303309110","article-title":"Activity-enhancing mutations in an E3 ubiquitin ligase identified by high-throughput mutagenesis","volume":"110","author":"LM Starita","year":"2013","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"42","key":"pcbi.1011521.ref021","doi-asserted-by":"crossref","first-page":"16858","DOI":"10.1073\/pnas.1209751109","article-title":"A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function","volume":"109","author":"CL Araya","year":"2012","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"8","key":"pcbi.1011521.ref022","doi-asserted-by":"crossref","first-page":"1363","DOI":"10.1016\/j.jmb.2013.01.032","article-title":"Analyses of the effects of all ubiquitin point mutants on yeast growth rate","volume":"425","author":"BP Roscoe","year":"2013","journal-title":"Journal of molecular 
biology"},{"issue":"3","key":"pcbi.1011521.ref023","doi-asserted-by":"crossref","first-page":"588","DOI":"10.1016\/j.celrep.2016.03.046","article-title":"Systematic mutant analyses elucidate general and client-specific aspects of Hsp90 function","volume":"15","author":"P Mishra","year":"2016","journal-title":"Cell reports"},{"issue":"2","key":"pcbi.1011521.ref024","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1534\/genetics.115.175802","article-title":"Massively parallel functional analysis of BRCA1 RING domain variants","volume":"200","author":"LM Starita","year":"2015","journal-title":"Genetics"},{"issue":"3-4","key":"pcbi.1011521.ref025","doi-asserted-by":"crossref","first-page":"150","DOI":"10.1016\/j.jmb.2012.09.014","article-title":"Deep sequencing of systematic combinatorial libraries reveals \u03b2-lactamase sequence constraints at high resolution","volume":"424","author":"Z Deng","year":"2012","journal-title":"Journal of molecular biology"},{"key":"pcbi.1011521.ref026","first-page":"gku989","article-title":"UniProt: a hub for protein information","author":"U Consortium","year":"2014","journal-title":"Nucleic Acids Research"},{"issue":"D1","key":"pcbi.1011521.ref027","doi-asserted-by":"crossref","first-page":"D279","DOI":"10.1093\/nar\/gkv1344","article-title":"The Pfam protein families database: towards a more sustainable future","volume":"44","author":"RD Finn","year":"2016","journal-title":"Nucleic acids research"},{"key":"pcbi.1011521.ref028","doi-asserted-by":"crossref","first-page":"e46688","DOI":"10.1371\/journal.pone.0046688","article-title":"Predicting the functional effect of amino acid substitutions and indels","volume":"7","author":"Y Choi","year":"2012","journal-title":"PLoS One"},{"issue":"2","key":"pcbi.1011521.ref029","article-title":"Predicting and interpreting large-scale mutagenesis data using analyses of protein stability and conservation","volume":"38","author":"MH H\u00f8ie","year":"2022","journal-title":"Cell 
reports"},{"issue":"2","key":"pcbi.1011521.ref030","doi-asserted-by":"crossref","first-page":"342","DOI":"10.1006\/jmbi.1996.0167","article-title":"An evolutionary trace method defines binding surfaces common to protein families","volume":"257","author":"O Lichtarge","year":"1996","journal-title":"Journal of molecular biology"},{"issue":"11","key":"pcbi.1011521.ref031","doi-asserted-by":"crossref","first-page":"2604","DOI":"10.1093\/molbev\/msz179","article-title":"GEMME: a simple and fast global epistatic model predicting mutational effects","volume":"36","author":"E Laine","year":"2019","journal-title":"Molecular biology and evolution"},{"issue":"49","key":"pcbi.1011521.ref032","doi-asserted-by":"crossref","first-page":"E1293","DOI":"10.1073\/pnas.1111471108","article-title":"Direct-coupling analysis of residue coevolution captures native contacts across many protein families","volume":"108","author":"F Morcos","year":"2011","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"12","key":"pcbi.1011521.ref033","doi-asserted-by":"crossref","first-page":"e28766","DOI":"10.1371\/journal.pone.0028766","article-title":"Protein 3D structure computed from evolutionary sequence variation","volume":"6","author":"DS Marks","year":"2011","journal-title":"PloS one"},{"issue":"3","key":"pcbi.1011521.ref034","doi-asserted-by":"crossref","first-page":"032601","DOI":"10.1088\/1361-6633\/aa9965","article-title":"Inverse statistical physics of protein sequences: a key issues review","volume":"81","author":"S Cocco","year":"2018","journal-title":"Reports on Progress in Physics"},{"issue":"8","key":"pcbi.1011521.ref035","doi-asserted-by":"crossref","first-page":"e1003776","DOI":"10.1371\/journal.pcbi.1003776","article-title":"The fitness landscape of HIV-1 gag: advanced modeling approaches and validation of model predictions by in vitro testing","volume":"10","author":"JK Mann","year":"2014","journal-title":"PLoS Comput 
Biol"},{"issue":"1","key":"pcbi.1011521.ref036","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1093\/molbev\/msv211","article-title":"Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1","volume":"33","author":"M Figliuzzi","year":"2015","journal-title":"Molecular biology and evolution"},{"issue":"2","key":"pcbi.1011521.ref037","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1038\/nbt.3769","article-title":"Mutation effects predicted from sequence co-variation","volume":"35","author":"TA Hopf","year":"2017","journal-title":"Nature biotechnology"},{"key":"pcbi.1011521.ref038","first-page":"2023","article-title":"Minimal epistatic networks from integrated sequence and mutational protein data","author":"S Cocco","year":"2023","journal-title":"bioRxiv"},{"key":"pcbi.1011521.ref039","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1038\/s41592-018-0138-4","article-title":"Deep generative models of genetic variation capture the effects of mutations","volume":"15","author":"AJ Riesselman","year":"2018","journal-title":"Nat Methods"},{"issue":"7873","key":"pcbi.1011521.ref040","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"J Jumper","year":"2021","journal-title":"Nature"},{"key":"pcbi.1011521.ref041","doi-asserted-by":"crossref","unstructured":"Rao R, Meier J, Sercu T, Ovchinnikov S, Rives A. Transformer protein language models are unsupervised structure learners. 
In: International Conference on Learning Representations; 2020.","DOI":"10.1101\/2020.12.15.422761"},{"issue":"15","key":"pcbi.1011521.ref042","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"A Rives","year":"2021","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"7","key":"pcbi.1011521.ref043","doi-asserted-by":"crossref","first-page":"1114","DOI":"10.1038\/s41587-021-01146-5","article-title":"Learning protein fitness models from evolutionary and assay-labeled data","volume":"40","author":"C Hsu","year":"2022","journal-title":"Nature biotechnology"},{"issue":"1","key":"pcbi.1011521.ref044","doi-asserted-by":"crossref","first-page":"268","DOI":"10.1093\/molbev\/msv211","article-title":"Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1","volume":"33","author":"M Figliuzzi","year":"2016","journal-title":"Molecular biology and evolution"},{"key":"pcbi.1011521.ref045","doi-asserted-by":"crossref","unstructured":"Yusim K, Korber BT, Brander C, Barouch D, de Boer R, Haynes BF, et al. HIV molecular immunology 2015. 
Los Alamos National Lab.(LANL), Los Alamos, NM (United States); 2016.","DOI":"10.2172\/1248095"},{"issue":"6","key":"pcbi.1011521.ref046","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1002\/humu.21490","article-title":"Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed","volume":"32","author":"S Hicks","year":"2011","journal-title":"Human mutation"},{"issue":"5275","key":"pcbi.1011521.ref047","doi-asserted-by":"crossref","first-page":"666","DOI":"10.1126\/science.273.5275.666","article-title":"Emergence of preferred structures in a simple model of protein folding","volume":"273","author":"H Li","year":"1996","journal-title":"Science"},{"issue":"10","key":"pcbi.1011521.ref048","doi-asserted-by":"crossref","first-page":"3986","DOI":"10.1021\/ma00200a030","article-title":"A lattice statistical mechanics model of the conformational and sequence spaces of proteins","volume":"22","author":"KF Lau","year":"1989","journal-title":"Macromolecules"},{"issue":"8","key":"pcbi.1011521.ref049","doi-asserted-by":"crossref","first-page":"5967","DOI":"10.1063\/1.459480","article-title":"Enumeration of all compact conformations of copolymers with random sequence of links","volume":"93","author":"E Shakhnovich","year":"1990","journal-title":"The Journal of Chemical Physics"},{"issue":"12","key":"pcbi.1011521.ref050","doi-asserted-by":"crossref","first-page":"e1004889","DOI":"10.1371\/journal.pcbi.1004889","article-title":"Benchmarking inverse statistical approaches for protein structure and design with exactly solvable models","volume":"12","author":"H Jacquin","year":"2016","journal-title":"PLoS Comput Biol"},{"issue":"1","key":"pcbi.1011521.ref051","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1073\/pnas.0805923106","article-title":"Identification of direct residue contacts in protein\u2013protein interaction by message passing","volume":"106","author":"M Weigt","year":"2009","journal-title":"Proceedings 
of the National Academy of Sciences"},{"issue":"1","key":"pcbi.1011521.ref052","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1162\/neco.1992.4.1.1","article-title":"Neural networks and the bias\/variance dilemma","volume":"4","author":"S Geman","year":"1992","journal-title":"Neural computation"},{"issue":"1","key":"pcbi.1011521.ref053","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1023\/A:1009778005914","article-title":"On bias, variance, 0\/1-loss, and the curse-of-dimensionality","volume":"1","author":"JH Friedman","year":"1997","journal-title":"Data mining and knowledge discovery"},{"issue":"20","key":"pcbi.1011521.ref054","doi-asserted-by":"crossref","first-page":"3089","DOI":"10.1093\/bioinformatics\/btw328","article-title":"ACE: adaptive cluster expansion for maximum entropy graphical model inference","volume":"32","author":"JP Barton","year":"2016","journal-title":"Bioinformatics"},{"issue":"6477","key":"pcbi.1011521.ref055","doi-asserted-by":"crossref","first-page":"248","DOI":"10.1038\/369248a0","article-title":"How does a protein fold?","volume":"369","author":"E Shakhnovich","year":"1994","journal-title":"Nature"},{"issue":"5","key":"pcbi.1011521.ref056","doi-asserted-by":"crossref","first-page":"1267","DOI":"10.1007\/s10955-015-1441-4","article-title":"On the entropy of protein families","volume":"162","author":"JP Barton","year":"2016","journal-title":"Journal of Statistical Physics"},{"key":"pcbi.1011521.ref057","doi-asserted-by":"crossref","first-page":"341","DOI":"10.1016\/j.jcp.2014.07.024","article-title":"Fast pseudolikelihood maximization for direct-coupling analysis of protein structure from many homologous amino-acid sequences","volume":"276","author":"M Ekeberg","year":"2014","journal-title":"Journal of Computational Physics"},{"issue":"6","key":"pcbi.1011521.ref058","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"UniRef clusters: a comprehensive and scalable 
alternative for improving sequence similarity searches","volume":"31","author":"BE Suzek","year":"2015","journal-title":"Bioinformatics"},{"issue":"10","key":"pcbi.1011521.ref059","doi-asserted-by":"crossref","first-page":"e1002195","DOI":"10.1371\/journal.pcbi.1002195","article-title":"Accelerated profile HMM searches","volume":"7","author":"SR Eddy","year":"2011","journal-title":"PLoS computational biology"},{"issue":"3","key":"pcbi.1011521.ref060","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1038\/nmeth.3223","article-title":"Massively parallel single-amino-acid mutagenesis","volume":"12","author":"JO Kitzman","year":"2015","journal-title":"Nature methods"},{"issue":"1","key":"pcbi.1011521.ref061","doi-asserted-by":"crossref","first-page":"012309","DOI":"10.1103\/PhysRevE.101.012309","article-title":"Inference of compressed Potts graphical models","volume":"101","author":"F Rizzato","year":"2020","journal-title":"Physical Review E"},{"issue":"3","key":"pcbi.1011521.ref062","doi-asserted-by":"crossref","first-page":"534","DOI":"10.1021\/ma00145a039","article-title":"Estimation of effective interresidue contact energies from protein crystal structures: quasi-chemical approximation","volume":"18","author":"S Miyazawa","year":"1985","journal-title":"Macromolecules"},{"issue":"3","key":"pcbi.1011521.ref063","doi-asserted-by":"crossref","first-page":"623","DOI":"10.1006\/jmbi.1996.0114","article-title":"Residue\u2013residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading","volume":"256","author":"S Miyazawa","year":"1996","journal-title":"Journal of molecular biology"},{"issue":"2","key":"pcbi.1011521.ref064","doi-asserted-by":"crossref","first-page":"024407","DOI":"10.1103\/PhysRevE.104.024407","article-title":"Sparse generative modeling via parameter reduction of Boltzmann machines: application to protein-sequence families","volume":"104","author":"P 
Barrat-Charlaix","year":"2021","journal-title":"Physical Review E"},{"issue":"4","key":"pcbi.1011521.ref065","doi-asserted-by":"crossref","first-page":"msac070","DOI":"10.1093\/molbev\/msac070","article-title":"Multiple profile models extract features from protein sequence data and resolve functional diversity of very different protein families","volume":"39","author":"R Vicedomini","year":"2022","journal-title":"Molecular biology and evolution"},{"key":"pcbi.1011521.ref066","volume-title":"Inferring Phylogenies","author":"J Felsenstein","year":"2003"},{"key":"pcbi.1011521.ref067","doi-asserted-by":"crossref","first-page":"102594","DOI":"10.1016\/j.sbi.2023.102594","article-title":"Progress at protein structure prediction, as seen in CASP15","volume":"80","author":"A Elofsson","year":"2023","journal-title":"Current Opinion in Structural Biology"},{"issue":"3","key":"pcbi.1011521.ref068","doi-asserted-by":"crossref","first-page":"1287","DOI":"10.1214\/09-AOS691","article-title":"High-dimensional Ising model selection using l1-regularized logistic regression","volume":"38","author":"P Ravikumar","year":"2010","journal-title":"The Annals of Statistics"},{"issue":"6","key":"pcbi.1011521.ref069","doi-asserted-by":"crossref","first-page":"063406","DOI":"10.1088\/1742-5468\/aa727d","article-title":"A statistical physics approach to learning curves for the inverse Ising problem","volume":"2017","author":"L Bachschmid-Romano","year":"2017","journal-title":"Journal of Statistical Mechanics: Theory and Experiment"},{"issue":"7","key":"pcbi.1011521.ref070","doi-asserted-by":"crossref","first-page":"073402","DOI":"10.1088\/1742-5468\/ab8c3a","article-title":"Learning performance in inverse Ising problems with sparse teacher couplings","volume":"2020","author":"A Abbara","year":"2020","journal-title":"Journal of Statistical Mechanics: Theory and 
Experiment"},{"issue":"5","key":"pcbi.1011521.ref071","doi-asserted-by":"crossref","first-page":"498","DOI":"10.1093\/bib\/bbq080","article-title":"Missing value imputation for gene expression data: computational techniques to recover missing data from available information","volume":"12","author":"AWC Liew","year":"2010","journal-title":"Briefings in Bioinformatics"}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1011521","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2023,11,14]],"date-time":"2023-11-14T00:00:00Z","timestamp":1699920000000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011521","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,14]],"date-time":"2023-11-14T13:35:09Z","timestamp":1699968909000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1011521"}},"subtitle":[],"editor":[{"given":"Rachel","family":"Kolodny","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2023,10,26]]},"references-count":71,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2023,10,26]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1011521","relation":{},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,26]]}}}