{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T07:58:49Z","timestamp":1773734329171,"version":"3.50.1"},"reference-count":89,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2024,5,15]],"date-time":"2024-05-15T00:00:00Z","timestamp":1715731200000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001711","name":"Swiss National Science Foundation","doi-asserted-by":"publisher","award":["310030_208174"],"award-info":[{"award-number":["310030_208174"]}],"id":[{"id":"10.13039\/501100001711","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,5,2]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Experimental characterization of fitness landscapes, which map genotypes onto fitness, is important for both evolutionary biology and protein engineering. It faces a fundamental obstacle in the astronomical number of genotypes whose fitness needs to be measured for any one protein. Deep learning may help to predict the fitness of many genotypes from a smaller neural network training sample of genotypes with experimentally measured fitness. Here I use a recently published experimentally mapped fitness landscape of more than 260\u2009000 protein genotypes to ask how such sampling is best performed.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>I show that multilayer perceptrons, recurrent neural networks, convolutional networks, and transformers, can explain more than 90% of fitness variance in the data. 
In addition, 90% of this performance is reached with a training sample comprising merely \u224810<jats:sup>3<\/jats:sup> sequences. Generalization to unseen test data is best when training data is sampled randomly and uniformly, or sampled to minimize the number of synonymous sequences. In contrast, sampling to maximize sequence diversity or codon usage bias reduces performance substantially. These observations hold for more than one network architecture. Simple sampling strategies may perform best when training deep learning neural networks to map fitness landscapes from experimental data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>The fitness landscape data analyzed here is publicly available as described previously (Papkou et al. 2023). All code used to analyze this landscape is publicly available at https:\/\/github.com\/andreas-wagner-uzh\/fitness_landscape_sampling<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btae317","type":"journal-article","created":{"date-parts":[[2024,5,14]],"date-time":"2024-05-14T13:30:59Z","timestamp":1715693459000},"source":"Crossref","is-referenced-by-count":4,"title":["Genotype sampling for deep-learning assisted experimental mapping of a combinatorially complete fitness landscape"],"prefix":"10.1093","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4299-3840","authenticated-orcid":false,"given":"Andreas","family":"Wagner","sequence":"first","affiliation":[{"name":"Department of Evolutionary Biology and Environmental Studies, University of Zurich, 8057 Zurich, Switzerland"},{"name":"Swiss Institute of Bioinformatics, Quartier Sorge-Batiment Genopode, 1015 Lausanne, Switzerland"},{"name":"The Santa Fe Institute, Santa Fe, 87501 NM, United 
States"}]}],"member":"286","published-online":{"date-parts":[[2024,5,15]]},"reference":[{"key":"2024052823533469600_btae317-B1","doi-asserted-by":"crossref","first-page":"1790","DOI":"10.1093\/molbev\/msaa038","article-title":"Predicting the landscape of recombination using deep learning","volume":"37","author":"Adrion","year":"2020","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B2","first-page":"0045","article-title":"1000 Empirical adaptive landscapes and their navigability","volume":"1","author":"Aguilar-Rodriguez","year":"2017","journal-title":"Nat Ecol Evol"},{"key":"2024052823533469600_btae317-B3","doi-asserted-by":"crossref","first-page":"831","DOI":"10.1038\/nbt.3300","article-title":"Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning","volume":"33","author":"Alipanahi","year":"2015","journal-title":"Nat Biotechnol"},{"key":"2024052823533469600_btae317-B4","doi-asserted-by":"crossref","first-page":"1315","DOI":"10.1038\/s41592-019-0598-1","article-title":"Unified rational protein engineering with sequence-based deep representation learning","volume":"16","author":"Alley","year":"2019","journal-title":"Nat Methods"},{"key":"2024052823533469600_btae317-B5","first-page":"13","article-title":"DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning","volume":"18","author":"Angermueller","year":"2017","journal-title":"Genome Biol"},{"key":"2024052823533469600_btae317-B6","doi-asserted-by":"crossref","first-page":"e0141287","DOI":"10.1371\/journal.pone.0141287","article-title":"Continuous distributed representation of biological sequences for deep proteomics and genomics","volume":"10","author":"Asgari","year":"2015","journal-title":"PLoS One"},{"key":"2024052823533469600_btae317-B7","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1038\/s41588-021-00782-6","article-title":"Base-resolution models of transcription-factor binding reveal soft motif 
syntax","volume":"53","author":"Avsec","year":"2021","journal-title":"Nat Genet"},{"key":"2024052823533469600_btae317-B8","doi-asserted-by":"crossref","first-page":"evab141","DOI":"10.1093\/gbe\/evab141","article-title":"Effects of synonymous mutations beyond codon bias: the evidence for adaptive synonymous substitutions from microbial evolution experiments","volume":"13","author":"Bailey","year":"2021","journal-title":"Genome Biol Evol"},{"key":"2024052823533469600_btae317-B10","doi-asserted-by":"crossref","first-page":"e3000300","DOI":"10.1371\/journal.pbio.3000300","article-title":"Genotype network intersections promote evolutionary innovation","volume":"17","author":"Bendixsen","year":"2019","journal-title":"PLoS Biol"},{"key":"2024052823533469600_btae317-B11","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1016\/j.celrep.2015.03.051","article-title":"Systems-level response to point mutations in a core metabolic enzyme modulates genotype-phenotype relationship","volume":"11","author":"Bershtein","year":"2015","journal-title":"Cell Rep"},{"key":"2024052823533469600_btae317-B12","doi-asserted-by":"crossref","first-page":"807","DOI":"10.1137\/S1052623494268522","article-title":"Incremental least squares methods and the extended Kalman filter","volume":"6","author":"Bertsekas","year":"1996","journal-title":"SIAM J Optim"},{"key":"2024052823533469600_btae317-B13","doi-asserted-by":"crossref","first-page":"e82593","DOI":"10.7554\/eLife.82593","article-title":"Rapid protein stability prediction using deep learning representations","volume":"12","author":"Blaabjerg","year":"2023","journal-title":"Elife"},{"key":"2024052823533469600_btae317-B14","doi-asserted-by":"crossref","first-page":"1005","DOI":"10.1038\/nbt.4238","article-title":"Evaluation of 244,000 synthetic sequences reveals design principles to optimize translation in Escherichia coli","volume":"36","author":"Cambray","year":"2018","journal-title":"Nat 
Biotechnol"},{"key":"2024052823533469600_btae317-B15","doi-asserted-by":"crossref","first-page":"eadg7492","DOI":"10.1126\/science.adg7492","article-title":"Accurate proteome-wide missense variant effect prediction with AlphaMissense","volume":"381","author":"Cheng","year":"2023","journal-title":"Science"},{"key":"2024052823533469600_btae317-B16","volume-title":"Deep Learning with Python","author":"Chollet","year":"2021"},{"key":"2024052823533469600_btae317-B17","doi-asserted-by":"crossref","first-page":"1190","DOI":"10.1126\/science.1203799","article-title":"Diminishing returns epistasis among beneficial mutations decelerates adaptation","volume":"332","author":"Chou","year":"2011","journal-title":"Science"},{"key":"2024052823533469600_btae317-B19","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1093\/molbev\/msr179","article-title":"The fitness effects of synonymous mutations in DNA and RNA viruses","volume":"29","author":"Cuevas","year":"2011","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B20","doi-asserted-by":"crossref","first-page":"e2209373119","DOI":"10.1073\/pnas.2209373119","article-title":"Unpredictable repeatability in molecular evolution","volume":"119","author":"Das","year":"2022","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2024052823533469600_btae317-B21","doi-asserted-by":"crossref","first-page":"480","DOI":"10.1038\/nrg3744","article-title":"Empirical fitness landscapes and the predictability of evolution","volume":"15","author":"de Visser","year":"2014","journal-title":"Nat Rev Genet"},{"key":"2024052823533469600_btae317-B22","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.32472","article-title":"The genetic landscape of a physical interaction","volume":"7","author":"Diss","year":"2018","journal-title":"Elife"},{"key":"2024052823533469600_btae317-B23","doi-asserted-by":"crossref","first-page":"117","DOI":"10.1038\/s41586-018-0170-7","article-title":"Pairwise and higher-order genetic interactions during 
the evolution of a tRNA","volume":"558","author":"Domingo","year":"2018","journal-title":"Nature"},{"key":"2024052823533469600_btae317-B24","doi-asserted-by":"crossref","first-page":"2454","DOI":"10.1093\/molbev\/msw097","article-title":"How good are statistical models at approximating complex fitness landscapes?","volume":"33","author":"Du Plessis","year":"2016","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B25","doi-asserted-by":"crossref","first-page":"235","DOI":"10.1186\/s12859-020-03546-x","article-title":"Amino acid encoding for deep learning applications","volume":"21","author":"ElAbd","year":"2020","journal-title":"BMC Bioinformatics"},{"key":"2024052823533469600_btae317-B26","doi-asserted-by":"crossref","first-page":"7112","DOI":"10.1109\/TPAMI.2021.3095381","article-title":"ProtTrans: toward understanding the language of life through self-supervised learning","volume":"44","author":"Elnaggar","year":"2021","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2024052823533469600_btae317-B27","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1093\/molbev\/msaa204","article-title":"Unsupervised inference of protein fitness landscape from deep mutational scan","volume":"38","author":"Fernandez-de-Cossio-Diaz","year":"2021","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B28","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1093\/molbev\/msy224","article-title":"The unreasonable effectiveness of convolutional neural networks in population genetic inference","volume":"36","author":"Flagel","year":"2019","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B29","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1007\/PL00006381","article-title":"The genetic code is one in a million","volume":"47","author":"Freeland","year":"1998","journal-title":"J Mol 
Evol"},{"key":"2024052823533469600_btae317-B31","doi-asserted-by":"crossref","first-page":"703","DOI":"10.1089\/cmb.2008.0173","article-title":"Interpretable numerical descriptors of amino acid space","volume":"16","author":"Georgiev","year":"2009","journal-title":"J Comput Biol"},{"key":"2024052823533469600_btae317-B32","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1021\/sb500242x","article-title":"Mapping of amino acid substitutions conferring herbicide resistance in wheat glutathione transferase","volume":"4","author":"Govindarajan","year":"2015","journal-title":"ACS Synth Biol"},{"key":"2024052823533469600_btae317-B33","doi-asserted-by":"crossref","DOI":"10.4324\/9780203451519","volume-title":"An Introduction to Neural Networks","author":"Gurney","year":"1997"},{"issue":"Suppl 1","key":"2024052823533469600_btae317-B34","doi-asserted-by":"crossref","first-page":"S75","DOI":"10.1093\/jhered\/esq007","article-title":"Fitness epistasis among 6 biosynthetic loci in the budding yeast Saccharomyces cerevisiae","volume":"101","author":"Hall","year":"2010","journal-title":"J Hered"},{"key":"2024052823533469600_btae317-B36","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1146\/annurev.genet.42.110807.091442","article-title":"Selection on codon bias","volume":"42","author":"Hershberg","year":"2008","journal-title":"Annu Rev Genet"},{"key":"2024052823533469600_btae317-B37","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput"},{"key":"2024052823533469600_btae317-B38","doi-asserted-by":"crossref","first-page":"15","DOI":"10.1038\/s41467-020-17222-4","article-title":"Large-scale DNA-based phenotypic recording and deep learning enable highly accurate sequence-function mapping","volume":"11","author":"H\u00f6llerer","year":"2020","journal-title":"Nat 
Commun"},{"key":"2024052823533469600_btae317-B39","doi-asserted-by":"crossref","first-page":"26065","DOI":"10.1021\/acsomega.1c02995","article-title":"Effects of distal mutations on ligand-binding affinity in E. coli dihydrofolate reductase","volume":"6","author":"Huang","year":"2021","journal-title":"ACS Omega"},{"key":"2024052823533469600_btae317-B40","first-page":"13","article-title":"Codon usage and tRNA content in unicellular and multicellular organisms","volume":"2","author":"Ikemura","year":"1985","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B42","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1007\/s00239-021-10027-z","article-title":"Codon usage bias: an endless tale","volume":"89","author":"Iriarte","year":"2021","journal-title":"J Mol Evol"},{"key":"2024052823533469600_btae317-B43","doi-asserted-by":"crossref","first-page":"3198","DOI":"10.1016\/j.csbj.2021.05.039","article-title":"Representation learning applications in biological sequence analysis","volume":"19","author":"Iuchi","year":"2021","journal-title":"Comput Struct Biotechnol J"},{"key":"2024052823533469600_btae317-B44","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1038\/s41586-021-03819-2","article-title":"Highly accurate protein structure prediction with AlphaFold","volume":"596","author":"Jumper","year":"2021","journal-title":"Nature"},{"key":"2024052823533469600_btae317-B45","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1016\/S0022-5193(87)80029-2","article-title":"Towards a general theory of adaptive walks on rugged landscapes","volume":"128","author":"Kauffman","year":"1987","journal-title":"J Theor Biol"},{"key":"2024052823533469600_btae317-B46","doi-asserted-by":"crossref","first-page":"R77","DOI":"10.1093\/hmg\/ddw207","article-title":"The Yin and Yang of codon usage","volume":"25","author":"Komar","year":"2016","journal-title":"Hum Mol 
Genet"},{"key":"2024052823533469600_btae317-B47","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"LeCun","year":"2015","journal-title":"Nature"},{"key":"2024052823533469600_btae317-B48","doi-asserted-by":"crossref","first-page":"837","DOI":"10.1126\/science.aae0568","article-title":"The fitness landscape of a tRNA gene","volume":"352","author":"Li","year":"2016","journal-title":"Science"},{"key":"2024052823533469600_btae317-B49","doi-asserted-by":"crossref","first-page":"1025","DOI":"10.1038\/s41559-018-0549-8","article-title":"Multi-environment fitness landscapes of a tRNA gene","volume":"2","author":"Li","year":"2018","journal-title":"Nat Ecol Evol"},{"key":"2024052823533469600_btae317-B50","doi-asserted-by":"crossref","first-page":"2377","DOI":"10.1002\/adsc.201900149","article-title":"Can machine learning revolutionize directed evolution of selective enzymes?","volume":"361","author":"Li","year":"2019","journal-title":"Adv Synth Catal"},{"key":"2024052823533469600_btae317-B51","first-page":"6765","article-title":"Hyperband: a novel bandit-based approach to hyperparameter optimization","volume":"18","author":"Li","year":"2017","journal-title":"J Machine Learning Res"},{"key":"2024052823533469600_btae317-B52","doi-asserted-by":"crossref","first-page":"3886","DOI":"10.1038\/s41467-019-11735-3","article-title":"Changes in gene expression predictably shift and switch genetic interactions","volume":"10","author":"Li","year":"2019","journal-title":"Nat Commun"},{"key":"2024052823533469600_btae317-B53","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.60924","article-title":"Uncovering the basis of protein-protein interaction specificity with a combinatorially complete 
library","volume":"9","author":"Lite","year":"2020","journal-title":"Elife"},{"key":"2024052823533469600_btae317-B54","doi-asserted-by":"crossref","first-page":"e68346","DOI":"10.7554\/eLife.68346","article-title":"Structurally distributed surface sites tune allosteric regulation","volume":"10","author":"McCormick","year":"2021","journal-title":"Elife"},{"key":"2024052823533469600_btae317-B55","doi-asserted-by":"crossref","first-page":"652","DOI":"10.1038\/351652a0","article-title":"Adaptive protein evolution at the Adh locus in Drosophila","volume":"351","author":"McDonald","year":"1991","journal-title":"Nature"},{"key":"2024052823533469600_btae317-B56","doi-asserted-by":"crossref","first-page":"1537","DOI":"10.1261\/rna.040709.113","article-title":"Deep mutational scanning of an RRM domain of the Saccharomyces cerevisiae poly(A)-binding protein","volume":"19","author":"Melamed","year":"2013","journal-title":"RNA"},{"key":"2024052823533469600_btae317-B57","doi-asserted-by":"crossref","first-page":"2707","DOI":"10.1093\/molbev\/msv146","article-title":"Adaptive landscapes of resistance genes change as antibiotic concentrations change","volume":"32","author":"Mira","year":"2015","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B59","doi-asserted-by":"crossref","first-page":"7755","DOI":"10.1038\/s41467-022-34902-5","article-title":"Accuracy and data efficiency in deep learning models of protein expression","volume":"13","author":"Nikolados","year":"2022","journal-title":"Nat Commun"},{"key":"2024052823533469600_btae317-B61","doi-asserted-by":"crossref","first-page":"2643","DOI":"10.1016\/j.cub.2014.09.072","article-title":"A comprehensive biophysical description of pairwise epistasis throughout an entire protein domain","volume":"24","author":"Olson","year":"2014","journal-title":"Curr Biol"},{"key":"2024052823533469600_btae317-B62","doi-asserted-by":"crossref","first-page":"7385","DOI":"10.1038\/ncomms8385","article-title":"Delayed commitment to 
evolutionary fate in antibiotic resistance fitness landscapes","volume":"6","author":"Palmer","year":"2015","journal-title":"Nat Commun"},{"key":"2024052823533469600_btae317-B63","doi-asserted-by":"crossref","first-page":"911","DOI":"10.3390\/genes12060911","article-title":"A deep-learning sequence-based method to predict protein stability changes upon genetic variations","volume":"12","author":"Pancotti","year":"2021","journal-title":"Genes (Basel)"},{"key":"2024052823533469600_btae317-B64","doi-asserted-by":"crossref","first-page":"eadh3860","DOI":"10.1126\/science.adh3860","article-title":"A rugged yet easily navigable fitness landscape of antibiotic resistance","volume":"382","author":"Papkou","year":"2023","journal-title":"Science"},{"key":"2024052823533469600_btae317-B65","doi-asserted-by":"crossref","first-page":"604","DOI":"10.1021\/sb500282v","article-title":"Codon compression algorithms for saturation mutagenesis","volume":"4","author":"Pines","year":"2015","journal-title":"ACS Synth Biol"},{"key":"2024052823533469600_btae317-B66","doi-asserted-by":"crossref","first-page":"4213","DOI":"10.1038\/s41467-019-12130-8","article-title":"Learning the pattern of epistasis linking genotype and phenotype in a protein","volume":"10","author":"Poelwijk","year":"2019","journal-title":"Nat Commun"},{"key":"2024052823533469600_btae317-B67","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/j.jtbi.2010.12.015","article-title":"Reciprocal sign epistasis is a necessary condition for multi-peaked fitness landscapes","volume":"272","author":"Poelwijk","year":"2011","journal-title":"J Theor Biol"},{"key":"2024052823533469600_btae317-B68","doi-asserted-by":"crossref","first-page":"e1008079","DOI":"10.1371\/journal.pgen.1008079","article-title":"An experimental assay of the interactions of amino acids from orthologous sequences shaping a complex fitness landscape","volume":"15","author":"Pokusaeva","year":"2019","journal-title":"PLoS 
Genet"},{"key":"2024052823533469600_btae317-B69","doi-asserted-by":"crossref","first-page":"16932","DOI":"10.1038\/s41598-019-53324-w","article-title":"Exploring the limitations of biophysical propensity scales coupled with machine learning for protein sequence analysis","volume":"9","author":"Raimondi","year":"2019","journal-title":"Sci Rep"},{"key":"2024052823533469600_btae317-B71","first-page":"9689","article-title":"Evaluating protein transfer learning with TAPE","volume":"32","author":"Rao","year":"2019","journal-title":"Adv Neural Inf Process Syst"},{"key":"2024052823533469600_btae317-B72","first-page":"8844","author":"Rao","year":"2021"},{"key":"2024052823533469600_btae317-B73","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1038\/s41592-018-0138-4","article-title":"Deep generative models of genetic variation capture the effects of mutations","volume":"15","author":"Riesselman","year":"2018","journal-title":"Nat Methods"},{"key":"2024052823533469600_btae317-B74","doi-asserted-by":"crossref","first-page":"e2016239118","DOI":"10.1073\/pnas.2016239118","article-title":"Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences","volume":"118","author":"Rives","year":"2021","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024052823533469600_btae317-B75","doi-asserted-by":"crossref","first-page":"E1470","DOI":"10.1073\/pnas.1601441113","article-title":"Biophysical principles predict fitness landscapes of drug resistance","volume":"113","author":"Rodrigues","year":"2016","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024052823533469600_btae317-B76","doi-asserted-by":"crossref","first-page":"E193","DOI":"10.1073\/pnas.1215251110","article-title":"Navigating the protein fitness landscape with Gaussian processes","volume":"110","author":"Romero","year":"2013","journal-title":"Proc Natl Acad Sci 
USA"},{"key":"2024052823533469600_btae317-B77","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1037\/h0042519","article-title":"The perceptron: a probabilistic model for information storage and organization in the brain","volume":"65","author":"Rosenblatt","year":"1958","journal-title":"Psychol Rev"},{"key":"2024052823533469600_btae317-B78","doi-asserted-by":"crossref","first-page":"397","DOI":"10.1038\/nature17995","article-title":"Local fitness landscape of the green fluorescent protein","volume":"533","author":"Sarkisyan","year":"2016","journal-title":"Nature"},{"key":"2024052823533469600_btae317-B79","doi-asserted-by":"crossref","first-page":"1533","DOI":"10.1093\/molbev\/msz086","article-title":"High-order epistasis in catalytic power of dihydrofolate reductase gives rise to a rugged fitness landscape in the presence of trimethoprim selection","volume":"36","author":"Tamer","year":"2019","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B80","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1186\/s13059-022-02661-7","article-title":"MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect","volume":"23","author":"Tareen","year":"2022","journal-title":"Genome Biol"},{"key":"2024052823533469600_btae317-B81","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1038\/s41586-022-04506-6","article-title":"The evolution, evolvability and engineering of gene regulatory DNA","volume":"603","author":"Vaishnav","year":"2022","journal-title":"Nature"},{"key":"2024052823533469600_btae317-B82","first-page":"5998","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv Neural Inf Processing Syst"},{"key":"2024052823533469600_btae317-B83","doi-asserted-by":"crossref","first-page":"5542","DOI":"10.1073\/pnas.1814551116","article-title":"Evolutionarily informed deep learning methods for predicting relative transcript abundance from DNA 
sequence","volume":"116","author":"Washburn","year":"2019","journal-title":"Proc Natl Acad Sci U S A"},{"key":"2024052823533469600_btae317-B84","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1126\/science.1123539","article-title":"Darwinian evolution can follow only very few mutational paths to fitter proteins","volume":"312","author":"Weinreich","year":"2006","journal-title":"Science"},{"key":"2024052823533469600_btae317-B85","doi-asserted-by":"crossref","first-page":"208","DOI":"10.1007\/s10955-018-1975-3","article-title":"The influence of higher-order epistasis on biological fitness landscape topography","volume":"172","author":"Weinreich","year":"2018","journal-title":"J Stat Phys"},{"key":"2024052823533469600_btae317-B86","first-page":"1165","article-title":"Perspective: sign epistasis and genetic constraint on evolutionary trajectories","volume":"59","author":"Weinreich","year":"2005","journal-title":"Evolution"},{"key":"2024052823533469600_btae317-B87","doi-asserted-by":"crossref","first-page":"700","DOI":"10.1016\/j.gde.2013.10.007","article-title":"Should evolutionary geneticists worry about higher-order epistasis?","volume":"23","author":"Weinreich","year":"2013","journal-title":"Curr Opin Genetics Dev"},{"key":"2024052823533469600_btae317-B89","doi-asserted-by":"crossref","first-page":"1026","DOI":"10.1016\/j.cels.2021.07.008","article-title":"Informed training set design enables efficient machine learning-assisted directed protein evolution","volume":"12","author":"Wittmann","year":"2021","journal-title":"Cell Syst"},{"key":"2024052823533469600_btae317-B90","first-page":"356","volume-title":"Proceedings of the Sixth International Congress on Genetics","author":"Wright","year":"1932"},{"key":"2024052823533469600_btae317-B91","doi-asserted-by":"crossref","first-page":"e16965","DOI":"10.7554\/eLife.16965","article-title":"Adaptation in protein fitness landscapes is facilitated by indirect 
paths","volume":"5","author":"Wu","year":"2016","journal-title":"Elife"},{"key":"2024052823533469600_btae317-B92","doi-asserted-by":"crossref","first-page":"8852","DOI":"10.1073\/pnas.1901979116","article-title":"Machine learning-assisted directed protein evolution with combinatorial libraries","volume":"116","author":"Wu","year":"2019","journal-title":"Proc Natl Acad Sci USA"},{"key":"2024052823533469600_btae317-B94","doi-asserted-by":"crossref","first-page":"2773","DOI":"10.1021\/acs.jcim.0c00073","article-title":"Deep dive into machine learning models for protein engineering","volume":"60","author":"Xu","year":"2020","journal-title":"J Chem Inf Model"},{"key":"2024052823533469600_btae317-B95","doi-asserted-by":"crossref","first-page":"1168","DOI":"10.1093\/molbev\/msaa259","article-title":"Discovery of ongoing selective sweeps within Anopheles mosquito populations using deep learning","volume":"38","author":"Xue","year":"2021","journal-title":"Mol Biol Evol"},{"key":"2024052823533469600_btae317-B96","doi-asserted-by":"crossref","first-page":"1120","DOI":"10.1038\/s41589-019-0386-3","article-title":"Higher-order epistasis shapes the fitness landscape of a xenobiotic-degrading enzyme","volume":"15","author":"Yang","year":"2019","journal-title":"Nat Chem Biol"},{"key":"2024052823533469600_btae317-B98","first-page":"187","author":"Zar\u0119ba","year":"2015"},{"key":"2024052823533469600_btae317-B99","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1126\/science.aax1837","article-title":"Cryptic genetic variation accelerates evolution by opening access to diverse adaptive peaks","volume":"365","author":"Zheng","year":"2019","journal-title":"Science"},{"key":"2024052823533469600_btae317-B100","doi-asserted-by":"crossref","first-page":"e2206069119","DOI":"10.1073\/pnas.2206069119","article-title":"Deep learning predicts DNA methylation regulatory variants in the human brain and elucidates the genetics of psychiatric 
disorders","volume":"119","author":"Zhou","year":"2022","journal-title":"Proc Natl Acad Sci USA"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btae317\/57668648\/btae317.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/5\/btae317\/57955179\/btae317.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/40\/5\/btae317\/57955179\/btae317.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T23:54:30Z","timestamp":1716940470000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/doi\/10.1093\/bioinformatics\/btae317\/7673132"}},"subtitle":[],"editor":[{"given":"Christina","family":"Kendziorski","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,5,1]]},"references-count":89,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,5,2]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btae317","relation":{},"ISSN":["1367-4811"],"issn-type":[{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5,1]]},"published":{"date-parts":[[2024,5,1]]},"article-number":"btae317"}}