{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,31]],"date-time":"2025-12-31T10:55:19Z","timestamp":1767178519118,"version":"build-2238731810"},"update-to":[{"DOI":"10.1371\/journal.pcbi.1013758","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2025,12,8]],"date-time":"2025-12-08T00:00:00Z","timestamp":1765152000000}}],"reference-count":63,"publisher":"Public Library of Science (PLoS)","issue":"12","license":[{"start":{"date-parts":[[2025,12,1]],"date-time":"2025-12-01T00:00:00Z","timestamp":1764547200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01-AI146028"],"award-info":[{"award-number":["R01-AI146028"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R56-HG013117"],"award-info":[{"award-number":["R56-HG013117"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01-HG013117"],"award-info":[{"award-number":["R01-HG013117"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["S10OD028685"],"award-info":[{"award-number":["S10OD028685"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000011","name":"Howard Hughes Medical Institute","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000011","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>Antibodies play a crucial role in adaptive immunity. They develop as B cell receptors (BCRs): membrane-bound forms of antibodies that are expressed on the surfaces of B cells. BCRs are refined through affinity maturation, a process of somatic hypermutation (SHM) and natural selection, to improve binding to an antigen. Computational models of affinity maturation have developed from two main perspectives: molecular evolution and language modeling. The molecular evolution perspective focuses on nucleotide sequence context to describe mutation and selection; the language modeling perspective involves learning patterns from large data sets of protein sequences. In this paper, we compared models from both perspectives on their ability to predict the course of antibody affinity maturation along phylogenetic trees of BCR sequences. This included models of SHM, models of SHM combined with an estimate of selection, and protein language models. We evaluated these models for large human BCR repertoire data sets, as well as an antigen-specific mouse experiment with a pre-rearranged cognate naive antibody. We demonstrated that precise modeling of SHM, which requires the nucleotide context, provides a substantial amount of predictive power for predicting the course of affinity maturation. Notably, a simple nucleotide-based convolutional neural network modeling SHM outperformed state-of-the-art protein language models, including one trained exclusively on antibody sequences. Furthermore, incorporating estimates of selection based on a custom deep mutational scanning experiment brought only modest improvement in predictive power. To support further research, we introduce EPAM (Evaluating Predictions of Affinity Maturation), a benchmarking framework to integrate evolutionary principles with advances in language modeling, offering a road map for understanding antibody evolution and improving predictive models.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1013758","type":"journal-article","created":{"date-parts":[[2025,12,1]],"date-time":"2025-12-01T19:08:21Z","timestamp":1764616101000},"page":"e1013758","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":1,"title":["Nucleotide context models outperform protein language models for predicting antibody affinity maturation"],"prefix":"10.1371","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3915-2023","authenticated-orcid":true,"given":"Mackenzie M.","family":"Johnson","sequence":"first","affiliation":[]},{"given":"Kevin","family":"Sung","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8324-8324","authenticated-orcid":true,"given":"Hugh K.","family":"Haddox","sequence":"additional","affiliation":[]},{"given":"Ashni A.","family":"Vora","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6203-7250","authenticated-orcid":true,"given":"Tatsuya","family":"Araki","sequence":"additional","affiliation":[]},{"given":"Gabriel D.","family":"Victora","sequence":"additional","affiliation":[]},{"given":"Yun S.","family":"Song","sequence":"additional","affiliation":[]},{"given":"Julia","family":"Fukuyama","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0607-6025","authenticated-orcid":true,"given":"Frederick A.","family":"Matsen IV","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2025,12,1]]},"reference":[{"issue":"8","key":"pcbi.1013758.ref001","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pone.0160853","article-title":"A public database of memory and naive B-cell receptor sequences","volume":"11","author":"WS DeWitt","year":"2016","journal-title":"PLoS One."},{"issue":"1","key":"pcbi.1013758.ref002","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1111\/imr.12666","article-title":"iReceptor: a platform for querying and analyzing antibody\/B-cell and T-cell receptor repertoire data across federated repositories","volume":"284","author":"BD Corrie","year":"2018","journal-title":"Immunol Rev."},{"issue":"8","key":"pcbi.1013758.ref003","doi-asserted-by":"crossref","first-page":"2502","DOI":"10.4049\/jimmunol.1800708","article-title":"Observed antibody space: a resource for data mining next-generation sequencing of antibody repertoires","volume":"201","author":"A Kovaltsuk","year":"2018","journal-title":"J Immunol."},{"key":"pcbi.1013758.ref004","doi-asserted-by":"crossref","first-page":"2365","DOI":"10.3389\/fimmu.2019.02365","article-title":"cAb-Rep: a database of curated antibody repertoires for exploring antibody diversity and predicting antibody prevalence","volume":"10","author":"Y Guo","year":"2019","journal-title":"Front Immunol."},{"issue":"1","key":"pcbi.1013758.ref005","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1002\/pro.4205","article-title":"Observed antibody space: a diverse database of cleaned, annotated, and translated unpaired and paired antibody sequences","volume":"31","author":"TH Olsen","year":"2022","journal-title":"Protein Sci."},{"issue":"3","key":"pcbi.1013758.ref006","doi-asserted-by":"crossref","first-page":"542","DOI":"10.1016\/j.immuni.2016.02.010","article-title":"Complex antigens drive permissive clonal selection in germinal centers","volume":"44","author":"M Kuraoka","year":"2016","journal-title":"Immunity."},{"issue":"8","key":"pcbi.1013758.ref007","doi-asserted-by":"crossref","DOI":"10.1371\/journal.ppat.1011603","article-title":"Germline-encoded specificities and the predictability of the B cell response","volume":"19","author":"MC Vieira","year":"2023","journal-title":"PLoS Pathog."},{"issue":"1676","key":"pcbi.1013758.ref008","doi-asserted-by":"crossref","first-page":"20140243","DOI":"10.1098\/rstb.2014.0243","article-title":"Inferring processes underlying B-cell repertoire diversity","volume":"370","author":"Y Elhanati","year":"2015","journal-title":"Philos Trans R Soc Lond B Biol Sci."},{"issue":"7","key":"pcbi.1013758.ref009","doi-asserted-by":"crossref","first-page":"1467","DOI":"10.1016\/j.celrep.2017.04.054","article-title":"Systems analysis reveals high genetic and antigen-driven predetermination of antibody repertoires throughout B cell development","volume":"19","author":"V Greiff","year":"2017","journal-title":"Cell Rep."},{"key":"pcbi.1013758.ref010","doi-asserted-by":"crossref","first-page":"1433","DOI":"10.3389\/fimmu.2017.01433","article-title":"Antibody heavy chain variable domains of different germline gene origins diversify through different paths","volume":"8","author":"U Kirik","year":"2017","journal-title":"Front Immunol."},{"key":"pcbi.1013758.ref011","doi-asserted-by":"crossref","first-page":"537","DOI":"10.3389\/fimmu.2017.00537","article-title":"Gene-specific substitution profiles describe the types and frequencies of amino acid changes during antibody somatic hypermutation","volume":"8","author":"Z Sheng","year":"2017","journal-title":"Front Immunol."},{"issue":"5","key":"pcbi.1013758.ref012","doi-asserted-by":"crossref","first-page":"2360","DOI":"10.4049\/jimmunol.160.5.2360","article-title":"Base-specific sequences that bias somatic hypermutation deduced by analysis of out-of-frame human IgVH genes","volume":"160","author":"DK Dunn-Walters","year":"1998","journal-title":"J Immunol."},{"key":"pcbi.1013758.ref013","doi-asserted-by":"crossref","first-page":"358","DOI":"10.3389\/fimmu.2013.00358","article-title":"Models of somatic hypermutation targeting and substitution based on synonymous mutations from high-throughput immunoglobulin sequencing data","volume":"4","author":"G Yaari","year":"2013","journal-title":"Front Immunol."},{"issue":"19","key":"pcbi.1013758.ref014","doi-asserted-by":"crossref","first-page":"10702","DOI":"10.1093\/nar\/gkaa825","article-title":"Learning the heterogeneous hypermutation landscape of immunoglobulins from high-throughput repertoire data","volume":"48","author":"N Spisak","year":"2020","journal-title":"Nucleic Acids Res."},{"issue":"1","key":"pcbi.1013758.ref015","doi-asserted-by":"crossref","first-page":"103668","DOI":"10.1016\/j.isci.2021.103668","article-title":"Deep learning model of somatic hypermutation reveals importance of sequence context beyond hotspot targeting","volume":"25","author":"C Tang","year":"2021","journal-title":"iScience."},{"key":"pcbi.1013758.ref016","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.105471","article-title":"Thrifty wide-context models of B cell receptor somatic hypermutation","volume":"14","author":"K Sung","year":"2025","journal-title":"Elife."},{"issue":"1","key":"pcbi.1013758.ref017","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1534\/genetics.116.196303","article-title":"A phylogenetic codon substitution model for antibody lineages","volume":"206","author":"KB Hoehn","year":"2017","journal-title":"Genetics."},{"issue":"45","key":"pcbi.1013758.ref018","doi-asserted-by":"crossref","first-page":"22664","DOI":"10.1073\/pnas.1906020116","article-title":"Repertoire-wide phylogenetic models of B cell molecular evolution reveal evolutionary signatures of aging and vaccination","volume":"116","author":"KB Hoehn","year":"2019","journal-title":"Proc Natl Acad Sci U S A."},{"issue":"1676","key":"pcbi.1013758.ref019","doi-asserted-by":"crossref","first-page":"20140244","DOI":"10.1098\/rstb.2014.0244","article-title":"Quantifying evolutionary constraints on B-cell affinity maturation","volume":"370","author":"CO McCoy","year":"2015","journal-title":"Philos Trans R Soc Lond B Biol Sci."},{"key":"pcbi.1013758.ref020","article-title":"Deciphering antibody affinity maturation with language models and weakly supervised learning","author":"JA Ruffolo","year":"2021","journal-title":"arXiv preprint"},{"issue":"1","key":"pcbi.1013758.ref021","doi-asserted-by":"crossref","DOI":"10.1093\/bioadv\/vbac046","article-title":"AbLang: an antibody language model for completing antibody sequences","volume":"2","author":"TH Olsen","year":"2022","journal-title":"Bioinform Adv."},{"key":"pcbi.1013758.ref022","article-title":"On pre-trained language models for antibody","author":"D Wang","year":"2023","journal-title":"arXiv preprint"},{"issue":"4","key":"pcbi.1013758.ref023","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbae245","article-title":"Accurate prediction of antibody function and structure using bio-inspired antibody language model","volume":"25","author":"H Jing","year":"2024","journal-title":"Brief Bioinform."},{"issue":"12","key":"pcbi.1013758.ref024","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1012646","article-title":"Large scale paired antibody language models","volume":"20","author":"H Kenlay","year":"2024","journal-title":"PLoS Comput Biol."},{"key":"pcbi.1013758.ref025","doi-asserted-by":"crossref","DOI":"10.1101\/2024.05.22.594943","volume-title":"A generative foundation model for antibody sequence understanding","author":"J Barton","year":"2024"},{"issue":"6","key":"pcbi.1013758.ref026","doi-asserted-by":"crossref","first-page":"101239","DOI":"10.1016\/j.patter.2025.101239","article-title":"Focused learning by antibody language models using preferential masking of non-templated regions","volume":"6","author":"K Ng","year":"2025","journal-title":"Patterns (N Y)."},{"issue":"11","key":"pcbi.1013758.ref027","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btae659","article-title":"p-IgGen: a paired antibody generative language model","volume":"40","author":"OM Turnbull","year":"2024","journal-title":"Bioinformatics."},{"issue":"11","key":"pcbi.1013758.ref028","doi-asserted-by":"crossref","DOI":"10.1093\/bioinformatics\/btae618","article-title":"Addressing the antibody germline bias and its effect on language models for improved antibody design","volume":"40","author":"TH Olsen","year":"2024","journal-title":"Bioinformatics."},{"issue":"10","key":"pcbi.1013758.ref029","doi-asserted-by":"crossref","first-page":"1607","DOI":"10.4049\/jimmunol.2200825","article-title":"FLAIRR-Seq: a method for single-molecule resolution of near full-length antibody H chain repertoires","volume":"210","author":"EE Ford","year":"2023","journal-title":"J Immunol."},{"issue":"1","key":"pcbi.1013758.ref030","doi-asserted-by":"crossref","first-page":"4419","DOI":"10.1038\/s41467-023-40070-x","article-title":"Genetic variation in the immunoglobulin heavy chain locus shapes the human antibody repertoire","volume":"14","author":"OL Rodriguez","year":"2023","journal-title":"Nat Commun."},{"issue":"7935","key":"pcbi.1013758.ref031","doi-asserted-by":"crossref","first-page":"352","DOI":"10.1038\/s41586-022-05371-z","article-title":"Functional antibodies exhibit light chain coherence","volume":"611","author":"DB Jaffe","year":"2022","journal-title":"Nature."},{"key":"pcbi.1013758.ref032","article-title":"Replaying germinal center evolution on a quantified affinity landscape","author":"WS DeWitt","year":"2025","journal-title":"bioRxiv."},{"key":"pcbi.1013758.ref033","first-page":"132","article-title":"The IMGT unique numbering for immunoglobulins, T-cell receptors, and Ig-like domains","volume":"7","author":"MP Lefranc","year":"1999","journal-title":"Immunologist."},{"key":"pcbi.1013758.ref034","doi-asserted-by":"crossref","first-page":"438","DOI":"10.3389\/fimmu.2019.00438","article-title":"Mutating for good: DNA damage responses during somatic hypermutation","volume":"10","author":"B Pilzecker","year":"2019","journal-title":"Front Immunol."},{"key":"pcbi.1013758.ref035","first-page":"29287","article-title":"Language models enable zero-shot prediction of the effects of mutations on protein function","volume":"34","author":"J Meier","year":"2021","journal-title":"Adv Neural Inf Process Syst."},{"issue":"2","key":"pcbi.1013758.ref036","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1038\/s41587-023-01763-2","article-title":"Efficient evolution of human antibodies from general protein language models","volume":"42","author":"BL Hie","year":"2024","journal-title":"Nat Biotechnol."},{"key":"pcbi.1013758.ref037","volume-title":"FLAb: Benchmarking deep learning methods for antibody fitness prediction","author":"M Chungyoun","year":"2024"},{"issue":"4","key":"pcbi.1013758.ref038","doi-asserted-by":"crossref","DOI":"10.1093\/bib\/bbaf418","article-title":"Protein language model pseudolikelihoods capture features of in vivo B cell selection and evolution","volume":"26","author":"D van Ginneken","year":"2025","journal-title":"Brief Bioinform."},{"issue":"6","key":"pcbi.1013758.ref039","doi-asserted-by":"crossref","DOI":"10.1016\/j.chom.2018.04.018","article-title":"Functional relevance of improbable antibody mutations for HIV broadly neutralizing antibody development","volume":"23","author":"K Wiehe","year":"2018","journal-title":"Cell Host Microbe."},{"key":"pcbi.1013758.ref040","unstructured":"Notin P, Kollasch A, Ritter D, van Niekerk L, Paul S, Spinner H, et al. ProteinGym: large-scale benchmarks for protein fitness prediction and design. In: Adv. Neural Inf. Process. Syst. 2023. 64331\u201379. https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2023\/file\/cac723e5ff29f65e3fcbb0739ae91bee-Paper-Datasets_and_Benchmarks.pdf"},{"key":"pcbi.1013758.ref041","doi-asserted-by":"crossref","unstructured":"Rao R, Meier J, Sercu T, Ovchinnikov S, Rives A. Transformer protein language models are unsupervised structure learners. 2021. https:\/\/openreview.net\/forum?id=fylclEqgvgd","DOI":"10.1101\/2020.12.15.422761"},{"issue":"6637","key":"pcbi.1013758.ref042","doi-asserted-by":"crossref","first-page":"1123","DOI":"10.1126\/science.ade2574","article-title":"Evolutionary-scale prediction of atomic-level protein structure with a language model","volume":"379","author":"Z Lin","year":"2023","journal-title":"Science."},{"issue":"1","key":"pcbi.1013758.ref043","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1004409","article-title":"Consistency of VDJ rearrangement and substitution parameters enables accurate B cell receptor sequence annotation","volume":"12","author":"DK Ralph","year":"2016","journal-title":"PLoS Comput Biol."},{"issue":"10","key":"pcbi.1013758.ref044","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1005086","article-title":"Likelihood-based inference of B cell clonal families","volume":"12","author":"DK Ralph","year":"2016","journal-title":"PLoS Comput Biol."},{"issue":"7","key":"pcbi.1013758.ref045","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1007133","article-title":"Per-sample immunoglobulin germline inference from B cell receptor deep sequencing data","volume":"15","author":"DK Ralph","year":"2019","journal-title":"PLoS Comput Biol."},{"issue":"11","key":"pcbi.1013758.ref046","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1010723","article-title":"Inference of B cell clonal families using heavy\/light chain pairing information","volume":"18","author":"DK Ralph","year":"2022","journal-title":"PLoS Comput Biol."},{"key":"pcbi.1013758.ref047","unstructured":"Lees W. IG receptor germline set for species: human set name: IGH VDJ. Version 7. https:\/\/doi.org\/10.5281\/zenodo.8305006"},{"key":"pcbi.1013758.ref048","unstructured":"Lees W. IG receptor germline set for species: human set name: IGKappa VJ. Version 2. https:\/\/doi.org\/10.5281\/zenodo.8305008"},{"key":"pcbi.1013758.ref049","unstructured":"Lees W. IG receptor germline set for species: human set name: IGLambda VJ. Version 1. https:\/\/doi.org\/10.5281\/zenodo.8305013"},{"issue":"5","key":"pcbi.1013758.ref050","doi-asserted-by":"crossref","first-page":"1530","DOI":"10.1093\/molbev\/msaa015","article-title":"IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era","volume":"37","author":"BQ Minh","year":"2020","journal-title":"Mol Biol Evol."},{"issue":"6","key":"pcbi.1013758.ref051","doi-asserted-by":"crossref","first-page":"997","DOI":"10.1093\/sysbio\/syw037","article-title":"Terrace aware data structure for phylogenomic inference from supermatrices","volume":"65","author":"O Chernomor","year":"2016","journal-title":"Syst Biol."},{"issue":"10","key":"pcbi.1013758.ref052","doi-asserted-by":"crossref","first-page":"1579","DOI":"10.4049\/jimmunol.2300851","article-title":"Inferring B cell phylogenies from paired H and L chain BCR sequences with Dowser","volume":"212","author":"CG Jensen","year":"2024","journal-title":"J Immunol."},{"issue":"11","key":"pcbi.1013758.ref053","doi-asserted-by":"crossref","first-page":"1422","DOI":"10.1093\/bioinformatics\/btp163","article-title":"Biopython: freely available Python tools for computational molecular biology and bioinformatics","volume":"25","author":"PJA Cock","year":"2009","journal-title":"Bioinformatics."},{"issue":"2","key":"pcbi.1013758.ref054","doi-asserted-by":"crossref","first-page":"298","DOI":"10.1093\/bioinformatics\/btv552","article-title":"ANARCI: antigen receptor numbering and receptor classification","volume":"32","author":"J Dunbar","year":"2016","journal-title":"Bioinformatics."},{"key":"pcbi.1013758.ref055","doi-asserted-by":"crossref","first-page":"788","DOI":"10.3389\/fimmu.2020.00788","article-title":"AID Overlapping and Pol\u03b7 Hotspots Are Key Features of Evolutionary Variation Within the Human Antibody Heavy Chain (IGHV) genes","volume":"11","author":"C Tang","year":"2020","journal-title":"Front Immunol."},{"key":"pcbi.1013758.ref056","doi-asserted-by":"crossref","first-page":"1157","DOI":"10.3389\/fimmu.2017.01157","article-title":"Novel method for high-throughput full-length IGHV-D-J sequencing of the immune repertoire from bulk B-Cells with single-cell resolution","volume":"8","author":"S Vergani","year":"2017","journal-title":"Front Immunol."},{"issue":"13","key":"pcbi.1013758.ref057","doi-asserted-by":"crossref","first-page":"1930","DOI":"10.1093\/bioinformatics\/btu138","article-title":"pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires","volume":"30","author":"JA Vander Heiden","year":"2014","journal-title":"Bioinformatics."},{"issue":"5","key":"pcbi.1013758.ref058","doi-asserted-by":"crossref","first-page":"1253","DOI":"10.1093\/molbev\/msy020","article-title":"Using genotype abundance to improve phylogenetic inference","volume":"35","author":"WS 3rd DeWitt","year":"2018","journal-title":"Mol Biol Evol."},{"issue":"1919","key":"pcbi.1013758.ref059","doi-asserted-by":"crossref","first-page":"20230315","DOI":"10.1098\/rstb.2023.0315","article-title":"Leveraging DAGs to improve context-sensitive and abundance-aware tree estimation","volume":"380","author":"W Dumm","year":"2025","journal-title":"Philos Trans R Soc Lond B Biol Sci."},{"issue":"8","key":"pcbi.1013758.ref060","doi-asserted-by":"crossref","first-page":"802","DOI":"10.1089\/cmb.2021.0644","article-title":"Enabling inference for context-dependent models of mutation by bounding the propagation of dependency","volume":"29","author":"FA 4th Matsen","year":"2022","journal-title":"J Comput Biol."},{"issue":"22","key":"pcbi.1013758.ref061","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"S Henikoff","year":"1992","journal-title":"Proc Natl Acad Sci U S A."},{"issue":"6","key":"pcbi.1013758.ref062","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1093\/bioinformatics\/btu739","article-title":"UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches","volume":"31","author":"BE Suzek","year":"2015","journal-title":"Bioinformatics."},{"issue":"1","key":"pcbi.1013758.ref063","doi-asserted-by":"crossref","DOI":"10.1073\/pnas.2418918121","article-title":"Learning the language of antibody hypervariability","volume":"122","author":"R Singh","year":"2025","journal-title":"Proc Natl Acad Sci U S A."}],"updated-by":[{"DOI":"10.1371\/journal.pcbi.1013758","type":"new_version","label":"New version","source":"publisher","updated":{"date-parts":[[2025,12,8]],"date-time":"2025-12-08T00:00:00Z","timestamp":1765152000000}}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013758","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,8]],"date-time":"2025-12-08T18:41:42Z","timestamp":1765219302000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1013758"}},"subtitle":[],"editor":[{"given":"Alexey","family":"Onufriev","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2025,12,1]]},"references-count":63,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2025,12,1]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1013758","relation":{},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,1]]}}}