{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,14]],"date-time":"2026-02-14T06:51:18Z","timestamp":1771051878542,"version":"3.50.1"},"reference-count":49,"publisher":"Oxford University Press (OUP)","issue":"6","license":[{"start":{"date-parts":[[2024,10,22]],"date-time":"2024-10-22T00:00:00Z","timestamp":1729555200000},"content-version":"vor","delay-in-days":29,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,9,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Adaptive immune receptors, such as antibodies and T-cell receptors, recognize foreign threats with exquisite specificity. A major challenge in adaptive immunology is discovering the rules governing immune receptor\u2013antigen binding in order to predict the antigen binding status of previously unseen immune receptors. Many studies assume that the antigen binding status of an immune receptor may be determined by the presence of a short motif in the complementarity determining region 3 (CDR3), disregarding other amino acids. To test this assumption, we present a method to discover short motifs which show high precision in predicting antigen binding and generalize well to unseen simulated and experimental data. Our analysis of a mutagenesis-based antibody dataset reveals 11 336 position-specific, mostly gapped motifs of 3\u20135 amino acids that retain high precision on independently generated experimental data. Using a subset of only 178 motifs, a simple classifier was made that on the independently generated dataset outperformed a deep learning model proposed specifically for such datasets. In conclusion, our findings support the notion that for some antibodies, antigen binding may be largely determined by a short CDR3 motif. 
As more experimental data emerge, our methodology could serve as a foundation for in-depth investigations into antigen binding signals.<\/jats:p>","DOI":"10.1093\/bib\/bbae537","type":"journal-article","created":{"date-parts":[[2024,10,22]],"date-time":"2024-10-22T23:16:01Z","timestamp":1729638961000},"source":"Crossref","is-referenced-by-count":2,"title":["Predictability of antigen binding based on short motifs in the antibody CDRH3"],"prefix":"10.1093","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8900-075X","authenticated-orcid":false,"given":"Lonneke","family":"Scheffer","sequence":"first","affiliation":[{"name":"Department of Informatics, University of Oslo , Gaustadall\u00e9en 23B, 0373 Oslo ,","place":["Norway"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-7234-9215","authenticated-orcid":false,"given":"Eric Emanuel","family":"Reber","sequence":"additional","affiliation":[{"name":"Department of Informatics, University of Oslo , Gaustadall\u00e9en 23B, 0373 Oslo ,","place":["Norway"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8501-7076","authenticated-orcid":false,"given":"Brij Bhushan","family":"Mehta","sequence":"additional","affiliation":[{"name":"Department of Immunology, University of Oslo , Sognsvannsveien 20, Rikshospitalet, 0372 Oslo ,","place":["Norway"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2484-3868","authenticated-orcid":false,"given":"Milena","family":"Pavlovi\u0107","sequence":"additional","affiliation":[{"name":"Department of Informatics, University of Oslo , Gaustadall\u00e9en 23B, 0373 Oslo ,","place":["Norway"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1507-4171","authenticated-orcid":false,"given":"Maria","family":"Chernigovskaya","sequence":"additional","affiliation":[{"name":"Department of Immunology, University of Oslo , Sognsvannsveien 20, Rikshospitalet, 0372 Oslo 
,","place":["Norway"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5499-6283","authenticated-orcid":false,"given":"Eve","family":"Richardson","sequence":"additional","affiliation":[{"name":"La Jolla Institute for Immunology , 9420 Athena Cir, La Jolla, CA ,","place":["United States"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6692-0876","authenticated-orcid":false,"given":"Rahmad","family":"Akbar","sequence":"additional","affiliation":[{"name":"Department of Immunology, University of Oslo , Sognsvannsveien 20, Rikshospitalet, 0372 Oslo ,","place":["Norway"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2445-1258","authenticated-orcid":false,"given":"Fridtjof","family":"Lund-Johansen","sequence":"additional","affiliation":[{"name":"Department of Immunology, University of Oslo , Sognsvannsveien 20, Rikshospitalet, 0372 Oslo ,","place":["Norway"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2622-5032","authenticated-orcid":false,"given":"Victor","family":"Greiff","sequence":"additional","affiliation":[{"name":"Department of Immunology, University of Oslo , Sognsvannsveien 20, Rikshospitalet, 0372 Oslo ,","place":["Norway"]}]},{"given":"Ingrid Hob\u00e6k","family":"Haff","sequence":"additional","affiliation":[{"name":"Department of Mathematics, University of Oslo , Niels Henrik Abels hus, Moltke Moes vei 35, 0851 Oslo ,","place":["Norway"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4959-1409","authenticated-orcid":false,"given":"Geir Kjetil","family":"Sandve","sequence":"additional","affiliation":[{"name":"Department of Informatics, University of Oslo , Gaustadall\u00e9en 23B, 0373 Oslo ,","place":["Norway"]}]}],"member":"286","published-online":{"date-parts":[[2024,10,22]]},"reference":[{"key":"2024102414300487900_ref1","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1016\/S1074-7613(00)00006-6","article-title":"Diversity in the CDR3 region of VH is sufficient for most antibody 
specificities","volume":"13","author":"Xu","year":"2000","journal-title":"Immunity"},{"key":"2024102414300487900_ref2","doi-asserted-by":"publisher","first-page":"395","DOI":"10.1038\/334395a0","article-title":"T-cell antigen receptor genes and T-cell recognition","volume":"334","author":"Davis","year":"1988","journal-title":"Nature"},{"key":"2024102414300487900_ref3","doi-asserted-by":"publisher","first-page":"108856","DOI":"10.1016\/j.celrep.2021.108856","article-title":"A compact vocabulary of paratope-epitope interactions enables predictability of antibody-antigen binding","volume":"34","author":"Akbar","year":"2021","journal-title":"Cell Rep"},{"key":"2024102414300487900_ref4","doi-asserted-by":"publisher","first-page":"659","DOI":"10.1038\/ng.3822","article-title":"Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire","volume":"49","author":"Emerson","year":"2017","journal-title":"Nat Genet"},{"key":"2024102414300487900_ref5","doi-asserted-by":"publisher","DOI":"10.3389\/fimmu.2021.640725","article-title":"TCRMatch: predicting T-cell receptor specificity based on sequence similarity to previously characterized receptors","volume":"12","author":"Chronister","year":"2021","journal-title":"Front Immunol"},{"key":"2024102414300487900_ref6","doi-asserted-by":"publisher","first-page":"1359","DOI":"10.1158\/1078-0432.CCR-19-3249","article-title":"Investigation of antigen-specific T-cell receptor clusters in human cancers","volume":"26","author":"Zhang","year":"2020","journal-title":"Clin Cancer Res"},{"key":"2024102414300487900_ref7","doi-asserted-by":"publisher","first-page":"e9416","DOI":"10.15252\/msb.20199416","article-title":"Predicting antigen specificity of single T cells based on TCR CDR3 regions","volume":"16","author":"Fischer","year":"2020","journal-title":"Mol Syst Biol"},{"key":"2024102414300487900_ref8","first-page":"18832","article-title":"Modern Hopfield networks and attention 
for immune repertoire classification","volume":"33","author":"Widrich","year":"2020","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2024102414300487900_ref9","doi-asserted-by":"publisher","first-page":"2008790","DOI":"10.1080\/19420862.2021.2008790","article-title":"Progress and challenges for the machine learning-based design of fit-for-purpose monoclonal antibodies","volume":"14","author":"Akbar","year":"2022","journal-title":"MAbs"},{"key":"2024102414300487900_ref10","doi-asserted-by":"publisher","first-page":"701","DOI":"10.1039\/C9ME00071B","article-title":"Augmenting adaptive immunity: progress and challenges in the quantitative engineering and analysis of adaptive immune receptor repertoires","volume":"4","author":"Brown","year":"2019","journal-title":"Mol Syst Des Eng"},{"key":"2024102414300487900_ref11","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1016\/j.coisb.2020.10.010","article-title":"Mining adaptive immune receptor repertoires for biological and clinical information using machine learning","volume":"24","author":"Greiff","year":"2020","journal-title":"Current Opinion in Systems Biology"},{"key":"2024102414300487900_ref12","doi-asserted-by":"publisher","DOI":"10.1101\/2023.10.20.562936","article-title":"Simulation of adaptive immune receptors and repertoires with complex immune information to guide the development and benchmarking of AIRR machine learning","volume-title":"bioRxiv","author":"Chernigovskaya","year":"2023"},{"key":"2024102414300487900_ref13","doi-asserted-by":"publisher","first-page":"1194","DOI":"10.1038\/s41587-020-0505-4","article-title":"Analyzing the mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening","volume":"38","author":"Huang","year":"2020","journal-title":"Nat 
Biotechnol"},{"key":"2024102414300487900_ref14","doi-asserted-by":"publisher","first-page":"3181","DOI":"10.1093\/bioinformatics\/btu523","article-title":"Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence","volume":"30","author":"Thomas","year":"2014","journal-title":"Bioinformatics"},{"key":"2024102414300487900_ref15","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1038\/nature22976","article-title":"Identifying specificity groups in the T cell receptor repertoire","volume":"547","author":"Glanville","year":"2017","journal-title":"Nature"},{"key":"2024102414300487900_ref16","doi-asserted-by":"publisher","DOI":"10.1158\/0008-5472.CAN-18-2292","article-title":"Biophysicochemical motifs in T cell receptor sequences distinguish repertoires from tumor-infiltrating lymphocytes and adjacent healthy tissue","volume":"79","author":"Ostmeyer","year":"2019","journal-title":"Cancer Res"},{"key":"2024102414300487900_ref17","doi-asserted-by":"publisher","first-page":"e0229569","DOI":"10.1371\/journal.pone.0229569","article-title":"Biophysicochemical motifs in T cell receptor sequences as a potential biomarker for high-grade serous ovarian carcinoma","volume":"15","author":"Ostmeyer","year":"2020","journal-title":"PloS One"},{"key":"2024102414300487900_ref18","doi-asserted-by":"publisher","DOI":"10.1016\/j.immuno.2023.100027","article-title":"Interpretable deep learning to uncover the molecular binding patterns determining TCR\u2013epitope interactions","volume-title":"Immunoinformatics","author":"Dens"},{"key":"2024102414300487900_ref19","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1186\/s13073-015-0169-8","article-title":"A bioinformatic framework for immune repertoire diversity profiling enables detection of immunological status","volume":"7","author":"Greiff","year":"2015","journal-title":"Genome 
Med"},{"key":"2024102414300487900_ref20","doi-asserted-by":"publisher","first-page":"936","DOI":"10.1038\/s42256-021-00413-z","article-title":"The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires","volume":"3","author":"Pavlovi\u0107","year":"2021","journal-title":"Nat Mach Intell"},{"key":"2024102414300487900_ref21","doi-asserted-by":"publisher","DOI":"10.3389\/fimmu.2022.858057","article-title":"Machine learning approaches to TCR repertoire analysis","volume":"13","author":"Katayama","year":"2022","journal-title":"Front Immunol"},{"key":"2024102414300487900_ref22","doi-asserted-by":"publisher","first-page":"4994","DOI":"10.1093\/bioinformatics\/btac612","article-title":"Access to ground truth at unconstrained size makes simulated data as indispensable as experimental data for bioinformatics methods development and benchmarking","volume":"38","author":"Sandve","year":"2022","journal-title":"Bioinformatics"},{"key":"2024102414300487900_ref23","doi-asserted-by":"publisher","first-page":"e68605","DOI":"10.7554\/eLife.68605","article-title":"TCR meta-clonotypes for biomarker discovery with tcrdist3 enabled identification of public, HLA-restricted clusters of SARS-CoV-2 TCRs","volume":"10","author":"Mayer-Blackwell","year":"2021","journal-title":"eLife"},{"key":"2024102414300487900_ref24","doi-asserted-by":"publisher","first-page":"e3000314","DOI":"10.1371\/journal.pbio.3000314","article-title":"Detecting T cell receptors involved in immune responses from single repertoire snapshots","volume":"17","author":"Pogorelyy","year":"2019","journal-title":"PLoS Biol"},{"key":"2024102414300487900_ref25","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1038\/nature22383","article-title":"Quantifiable predictive features define epitope-specific T cell receptor 
repertoires","volume":"547","author":"Dash","year":"2017","journal-title":"Nature"},{"key":"2024102414300487900_ref26","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1038\/s42003-023-04447-4","article-title":"Machine learning identifies T cell receptor repertoire signatures associated with COVID-19 severity","volume":"6","author":"Park","year":"2023","journal-title":"Commun Biol"},{"key":"2024102414300487900_ref27","doi-asserted-by":"publisher","first-page":"559","DOI":"10.3389\/fimmu.2020.00559","article-title":"The TCR repertoire reconstitution in multiple sclerosis: comparing one-shot and continuous immunosuppressive therapies","volume":"11","author":"Amoriello","year":"2020","journal-title":"Front Immunol"},{"key":"2024102414300487900_ref28","doi-asserted-by":"publisher","first-page":"100269","DOI":"10.1016\/j.crmeth.2022.100269","article-title":"Reference-based comparison of adaptive immune receptor repertoires","volume":"2","author":"Weber","year":"2022","journal-title":"Cell Rep Methods"},{"key":"2024102414300487900_ref29","doi-asserted-by":"publisher","first-page":"3594","DOI":"10.1093\/bioinformatics\/btaa158","article-title":"immuneSIM: Tunable multi-feature simulation of B- and T-cell receptor repertoires for immunoinformatics benchmarking","volume":"36","author":"Weber","year":"2020","journal-title":"Bioinformatics"},{"key":"2024102414300487900_ref30","doi-asserted-by":"publisher","first-page":"giac046","DOI":"10.1093\/gigascience\/giac046","article-title":"Profiling the baseline performance and limits of machine learning models for adaptive immune receptor repertoire classification","volume":"11","author":"Kanduri","year":"2022","journal-title":"Gigascience"},{"key":"2024102414300487900_ref31","doi-asserted-by":"publisher","first-page":"giad074","DOI":"10.1093\/gigascience\/giad074","article-title":"simAIRR: simulation of adaptive immune repertoires with realistic receptor sequence sharing for benchmarking of immune state prediction 
methods","volume":"12","author":"Kanduri","year":"2023","journal-title":"GigaScience"},{"key":"2024102414300487900_ref32","doi-asserted-by":"publisher","first-page":"600","DOI":"10.1038\/s41551-021-00699-9","article-title":"Optimization of therapeutic antibodies by predicting antigen specificity from antibody sequence via deep learning","volume":"5","author":"Mason","year":"2021","journal-title":"Nat Biomed Eng"},{"key":"2024102414300487900_ref33","doi-asserted-by":"publisher","first-page":"1605","DOI":"10.1038\/s41467-021-21879-w","article-title":"DeepTCR is a deep learning framework for revealing sequence concepts within T-cell repertoires","volume":"12","author":"Sidhom","year":"2021","journal-title":"Nat Commun"},{"key":"2024102414300487900_ref34","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1186\/s12859-019-2853-y","article-title":"Capturing the differences between humoral immunity in the normal and tumor environments from repertoire-seq of B-cell receptors using supervised machine learning","volume":"20","author":"Konishi","year":"2019","journal-title":"BMC Bioinformatics"},{"key":"2024102414300487900_ref35","doi-asserted-by":"publisher","first-page":"btac788","DOI":"10.1093\/bioinformatics\/btac788","article-title":"TCRconv: predicting recognition between T cell receptors and epitopes using contextualized motifs","volume":"39","author":"Jokinen","year":"2023","journal-title":"Bioinformatics"},{"key":"2024102414300487900_ref36","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s42003-021-02610-3","article-title":"NetTCR-2.0 enables accurate prediction of TCR-peptide binding by using paired TCR\u03b1 and \u03b2 sequence data","volume":"4","author":"Montemurro","year":"2021","journal-title":"Commun Biol"},{"key":"2024102414300487900_ref37","doi-asserted-by":"publisher","first-page":"19840","DOI":"10.1038\/s41598-019-56154-y","article-title":"Effects of a remote mutation from the contact paratope on the structure of CDR-H3 in the 
anti-HIV neutralizing antibody PG16","volume":"9","author":"Kondo","year":"2019","journal-title":"Sci Rep"},{"key":"2024102414300487900_ref38","doi-asserted-by":"publisher","first-page":"958","DOI":"10.1006\/jmbi.1993.1650","article-title":"The contribution of contact and non-contact residues of antibody in the affinity of binding to antigen: the interaction of mutant D1.3 antibodies with lysozyme","volume":"234","author":"Hawkins","year":"1993","journal-title":"J Mol Biol"},{"key":"2024102414300487900_ref39","doi-asserted-by":"publisher","first-page":"4505","DOI":"10.4049\/jimmunol.165.8.4505","article-title":"Changing the antigen binding specificity by single point mutations of an anti-p24 (HIV-1) antibody","volume":"165","author":"Winkler","year":"2000","journal-title":"The Journal of Immunology"},{"key":"2024102414300487900_ref40","doi-asserted-by":"publisher","first-page":"2166","DOI":"10.1016\/j.csbj.2020.06.041","article-title":"T cell receptor sequence clustering and antigen specificity","volume":"18","author":"Vujovic","year":"2020","journal-title":"Comput Struct Biotechnol J"},{"key":"2024102414300487900_ref41","doi-asserted-by":"publisher","first-page":"eadc9498","DOI":"10.1126\/science.adc9498","article-title":"Germline-encoded amino acid\u2013binding motifs drive immunodominant public antibody responses","volume":"380","author":"Shrock","year":"2023","journal-title":"Science"},{"key":"2024102414300487900_ref42","doi-asserted-by":"publisher","DOI":"10.1101\/2024.03.26.586756","article-title":"Baselining the buzz Trastuzumab-HER2 affinity, and beyond","volume-title":"bioRxiv","author":"Chinery","year":"2024"},{"key":"2024102414300487900_ref43","first-page":"487","article-title":"Fast algorithms for mining association rules. Proc. 20th int. conf. 
Very large data bases","volume":"1215","author":"Agrawal","year":"1994","journal-title":"VLDB"},{"key":"2024102414300487900_ref44","first-page":"2825","article-title":"Scikit-learn: machine learning in python","volume":"12","author":"Pedregosa","year":"2011","journal-title":"Journal of Machine Learning Research"},{"key":"2024102414300487900_ref45","doi-asserted-by":"publisher","first-page":"13","DOI":"10.3389\/fimmu.2022.1055151","article-title":"NetTCR-2.1: lessons and guidance on how to develop models for TCR specificity predictions","volume":"13","author":"Montemurro","year":"2022","journal-title":"Front Immunol"},{"key":"2024102414300487900_ref46","doi-asserted-by":"publisher","first-page":"100024","DOI":"10.1016\/j.immuno.2023.100024","article-title":"Benchmarking solutions to the T-cell receptor epitope prediction problem: IMMREP22 workshop report","volume":"9","author":"Meysman","year":"2023","journal-title":"ImmunoInformatics"},{"key":"2024102414300487900_ref47","doi-asserted-by":"publisher","first-page":"4230","DOI":"10.1093\/bioinformatics\/btac505","article-title":"CompAIRR: ultra-fast comparison of adaptive immune receptor repertoires by exact and approximate sequence matching","volume":"38","author":"Rognes","year":"2022","journal-title":"Bioinformatics"},{"key":"2024102414300487900_ref48","doi-asserted-by":"publisher","first-page":"756","DOI":"10.1038\/nature01392","article-title":"Structure of the extracellular region of HER2 alone and in complex with the Herceptin fab","volume":"421","author":"Cho","year":"2003","journal-title":"Nature"},{"key":"2024102414300487900_ref49","doi-asserted-by":"publisher","first-page":"246","DOI":"10.1111\/j.1747-0285.2009.00855.x","article-title":"Design, synthesis, and docking studies of Peptidomimetics based on HER2-Herceptin binding site with potential Antiproliferative activity against breast cancer cell lines","volume":"74","author":"Satyanarayanajois","year":"2009","journal-title":"Chem Biol Drug 
Des"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/6\/bbae537\/60016408\/bbae537.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/6\/bbae537\/60016408\/bbae537.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T14:30:24Z","timestamp":1729780224000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae537\/7831256"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,23]]},"references-count":49,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,9,23]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae537","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,11]]},"published":{"date-parts":[[2024,9,23]]},"article-number":"bbae537"}}