{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,11]],"date-time":"2026-05-11T19:57:06Z","timestamp":1778529426704,"version":"3.51.4"},"reference-count":43,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,12,18]],"date-time":"2023-12-18T00:00:00Z","timestamp":1702857600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Bioinform."],"abstract":"<jats:p>Understanding how a T-cell receptor (TCR) recognizes its specific ligand peptide is crucial for gaining an insight into biological functions and disease mechanisms. Despite its importance, experimentally determining TCR\u2013peptide\u2013major histocompatibility complex (TCR\u2013pMHC) interactions is expensive and time-consuming. To address this challenge, computational methods have been proposed, but they are typically evaluated by internal retrospective validation only, and few researchers have incorporated and tested an attention layer from language models into structural information. Therefore, in this study, we developed a machine learning model based on a modified version of Transformer, a source\u2013target attention neural network, to predict the TCR\u2013pMHC interaction solely from the amino acid sequences of the TCR complementarity-determining region (CDR) 3 and the peptide. This model achieved competitive performance on a benchmark dataset of the TCR\u2013pMHC interaction, as well as on a truly new external dataset. Additionally, by analyzing the results of binding predictions, we associated the neural network weights with protein structural properties. By classifying the residues into large- and small-attention groups, we identified statistically significant properties associated with the largely attended residues such as hydrogen bonds within CDR3. 
The dataset that we created and the ability of our model to provide an interpretable prediction of TCR\u2013peptide binding should increase our knowledge about molecular recognition and pave the way for designing new therapeutics.<\/jats:p>","DOI":"10.3389\/fbinf.2023.1274599","type":"journal-article","created":{"date-parts":[[2023,12,18]],"date-time":"2023-12-18T01:37:42Z","timestamp":1702863462000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":14,"title":["Attention network for predicting T-cell receptor\u2013peptide binding can associate attention with interpretable protein structural properties"],"prefix":"10.3389","volume":"3","author":[{"given":"Kyohei","family":"Koyama","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kosuke","family":"Hashimoto","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chioko","family":"Nagao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kenji","family":"Mizuguchi","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2023,12,18]]},"reference":[{"key":"B1","doi-asserted-by":"crossref","first-page":"2623","DOI":"10.1145\/3292500.3330701","article-title":"Optuna: a next-generation hyperparameter optimization framework","volume-title":"Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining","author":"Akiba","year":"2019"},{"key":"B2","doi-asserted-by":"publisher","first-page":"1429","DOI":"10.1016\/j.csbj.2019.10.005","article-title":"Coevolutive, evolutive and stochastic information in protein-protein interactions","volume":"17","author":"Andrade","year":"2019","journal-title":"Comput. Struct. Biotechnol. 
J."},{"key":"B3","doi-asserted-by":"publisher","first-page":"980","DOI":"10.1038\/nsb1203-980","article-title":"Announcing the worldwide protein data bank","volume":"10","author":"Berman","year":"2003","journal-title":"Nat. Struct. Mol. Biol."},{"key":"B4","doi-asserted-by":"publisher","first-page":"15","DOI":"10.1145\/360262.360268","article-title":"Biopython: Python tools for computational biology","volume":"20","author":"Chapman","year":"2000","journal-title":"ACM Sigbio Newsl."},{"key":"B5","volume-title":"Dipair: fast and accurate distillation for trillion-scale text matching and pair modeling","author":"Chen","year":"2020"},{"key":"B6","doi-asserted-by":"publisher","first-page":"168","DOI":"10.3389\/fimmu.2013.00168","article-title":"Increased peptide contacts govern high affinity binding of a modified tcr whilst maintaining a native pmhc docking mode","volume":"4","author":"Cole","year":"2013","journal-title":"Front. Immunol."},{"key":"B7","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1038\/nature22383","article-title":"Quantifiable predictive features define epitope-specific T cell receptor repertoires","volume":"547","author":"Dash","year":"2017","journal-title":"Nature"},{"key":"B8","volume-title":"Bert: pre-training of deep bidirectional transformers for language understanding","author":"Devlin","year":"2018"},{"key":"B9","doi-asserted-by":"publisher","first-page":"298","DOI":"10.1093\/bioinformatics\/btv552","article-title":"Anarci: antigen receptor numbering and receptor classification","volume":"32","author":"Dunbar","year":"2016","journal-title":"Bioinformatics"},{"key":"B10","volume-title":"T-cell receptor specific protein language model for prediction and interpretation of epitope binding (protlm. 
tcr)","author":"Essaghir","year":"2022"},{"key":"B11","doi-asserted-by":"publisher","first-page":"236","DOI":"10.1038\/s42256-023-00619-3","article-title":"Pan-peptide meta learning for t-cell receptor\u2013antigen binding recognition","volume":"5","author":"Gao","year":"2023","journal-title":"Nat. Mach. Intell."},{"key":"B12","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1126\/science.274.5285.209","article-title":"An \u03b1\u03b2 t cell receptor structure at 2.5 \u00e5 and its orientation in the tcr-mhc complex","volume":"274","author":"Garcia","year":"1996","journal-title":"Science"},{"key":"B14","doi-asserted-by":"crossref","first-page":"1754","DOI":"10.18653\/v1\/2021.emnlp-main.132","article-title":"Cross-attention is all you need: adapting pretrained Transformers for machine translation","volume-title":"Proceedings of the 2021 conference on empirical methods in natural language processing","author":"Gheini","year":"2021"},{"key":"B15","doi-asserted-by":"publisher","first-page":"5323","DOI":"10.1093\/bioinformatics\/btz517","article-title":"Tcr3d: the t cell receptor structural repertoire database","volume":"35","author":"Gowthaman","year":"2019","journal-title":"Bioinformatics"},{"key":"B16","doi-asserted-by":"publisher","first-page":"12963","DOI":"10.1609\/aaai.v35i14.17533","article-title":"Self-attention attribution: interpreting information interactions inside transformer","volume":"35","author":"Hao","year":"2021","journal-title":"Proc. AAAI Conf. Artif. 
Intell."},{"key":"B17","article-title":"Cross attentive antibody-antigen interaction prediction with multi-task learning","volume-title":"ICML 2020 workshop on computational biology (WCB)","author":"Honda","year":"2020"},{"key":"B18","article-title":"Cross attention dti: drug-target interaction prediction with cross attention module in the blind evaluation setup","author":"Koyama","year":"2020","journal-title":"BIOKDD2020"},{"key":"B19","first-page":"201","article-title":"Stacked cross attention for image-text matching","volume-title":"Proceedings of the European conference on computer vision","author":"Lee","year":"2018"},{"key":"B20","doi-asserted-by":"publisher","first-page":"864","DOI":"10.1038\/s42256-021-00383-2","article-title":"Deep learning-based prediction of the t cell receptor\u2013antigen binding specificity","volume":"3","author":"Lu","year":"","journal-title":"Nat. Mach. Intell."},{"key":"B21","doi-asserted-by":"publisher","first-page":"e20211327","DOI":"10.1084\/jem.20211327","article-title":"Identification of conserved SARS-CoV-2 spike epitopes that expand public cTfh clonotypes in mild COVID-19 patients","volume":"218","author":"Lu","year":"","journal-title":"J. Exp. Med."},{"key":"B22","doi-asserted-by":"publisher","first-page":"490","DOI":"10.1186\/s12859-019-3109-6","article-title":"Benchmark datasets of immune receptor-epitope structural complexes","volume":"20","author":"Mahajan","year":"2019","journal-title":"BMC Bioinforma."},{"key":"B23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s42003-021-02610-3","article-title":"NetTCR-2.0 enables accurate prediction of TCR-peptide binding by using paired TCR\u03b1 and \u03b2 sequence data","volume":"4","author":"Montemurro","year":"2021","journal-title":"Commun. 
Biol."},{"key":"B24","doi-asserted-by":"publisher","first-page":"bbaa318","DOI":"10.1093\/bib\/bbaa318","article-title":"Current challenges for unseen-epitope tcr interaction prediction and a new perspective derived from image classification","volume":"22","author":"Moris","year":"2021","journal-title":"Briefings Bioinforma."},{"key":"B25","doi-asserted-by":"crossref","first-page":"636","DOI":"10.1109\/SLT48900.2021.9383573","article-title":"Detecting expressions with multimodal transformers","volume-title":"2021 IEEE Spoken Language Technology Workshop (SLT)","author":"Parthasarathy","year":"2021"},{"key":"B26","doi-asserted-by":"publisher","first-page":"654","DOI":"10.2478\/s11696-009-0068-9","article-title":"A graph theoretical approach to the effect of mutation on the flexibility of the dna binding domain of p53 protein","volume":"63","author":"Rauf","year":"2009","journal-title":"Chem. Pap."},{"key":"B27","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1073\/pnas.0407280102","article-title":"The modular architecture of protein-protein binding interfaces","volume":"102","author":"Reichmann","year":"2005","journal-title":"Proc. Natl. Acad. Sci."},{"key":"B28","doi-asserted-by":"publisher","first-page":"842","DOI":"10.1162\/tacl_a_00349","article-title":"A primer in bertology: what we know about how bert works","volume":"8","author":"Rogers","year":"2020","journal-title":"Trans. Assoc. Comput. 
Linguistics"},{"key":"B29","volume-title":"Pymol","author":"Schr\u00f6dinger","year":"2020"},{"key":"B30","doi-asserted-by":"publisher","first-page":"D419","DOI":"10.1093\/nar\/gkx760","article-title":"Vdjdb: a curated database of t-cell receptor sequences with known antigen specificity","volume":"46","author":"Shugay","year":"2018","journal-title":"Nucleic acids Res."},{"key":"B31","doi-asserted-by":"publisher","first-page":"1605","DOI":"10.1038\/s41467-021-21879-w","article-title":"DeepTCR is a deep learning framework for revealing sequence concepts within T-cell repertoires","volume":"12","author":"Sidhom","year":"2021","journal-title":"Nat. Commun."},{"key":"B32","doi-asserted-by":"publisher","first-page":"539","DOI":"10.1038\/msb.2011.75","article-title":"Fast, scalable generation of high-quality protein multiple sequence alignments using clustal omega","volume":"7","author":"Sievers","year":"2011","journal-title":"Mol. Syst. Biol."},{"key":"B33","doi-asserted-by":"publisher","first-page":"1803","DOI":"10.3389\/fimmu.2020.01803","article-title":"Prediction of specific TCR-peptide binding from large dictionaries of TCR-peptide pairs","volume":"11","author":"Springer","year":"2020","journal-title":"Front. Immunol."},{"key":"B34","doi-asserted-by":"publisher","first-page":"664514","DOI":"10.3389\/fimmu.2021.664514","article-title":"Contribution of T cell receptor alpha and beta CDR3, MHC typing, V and J genes to peptide binding prediction","volume":"12","author":"Springer","year":"2021","journal-title":"Front. 
Immunol."},{"key":"B35","doi-asserted-by":"publisher","first-page":"2924","DOI":"10.1093\/bioinformatics\/btx286","article-title":"Mcpas-tcr: a manually curated catalogue of pathology-associated t cell receptor sequences","volume":"33","author":"Tickotsky","year":"2017","journal-title":"Bioinformatics"},{"key":"B36","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1706.03762","article-title":"Attention is all you need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. neural Inf. Process. Syst."},{"key":"B37","doi-asserted-by":"crossref","first-page":"5797","DOI":"10.18653\/v1\/P19-1580","article-title":"Analyzing multi-head self-attention: specialized heads do the heavy lifting, the rest can be pruned","volume-title":"Proceedings of the 57th annual meeting of the association for computational linguistics","author":"Voita","year":"2019"},{"key":"B38","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1093\/protein\/8.2.127","article-title":"Ligplot: a program to generate schematic diagrams of protein-ligand interactions","volume":"8","author":"Wallace","year":"1995","journal-title":"Protein Eng. Des. Sel."},{"key":"B39","doi-asserted-by":"publisher","first-page":"i237","DOI":"10.1093\/bioinformatics\/btab294","article-title":"Titan: T-cell receptor specificity prediction with bimodal attention networks","volume":"37","author":"Weber","year":"2021","journal-title":"Bioinformatics"},{"key":"B40","article-title":"TCR-BERT: learning the grammar of t-cell receptors for flexible antigen-binding analyses","author":"Wu","year":"2021","journal-title":"bioRxiv"},{"key":"B13","unstructured":"A new way of exploring immunity\u2013linking highly multiplexed antigen recognition to immune repertoire and phenotype. Tech. 
Rep. 2019"},{"key":"B41","doi-asserted-by":"publisher","first-page":"942491","DOI":"10.3389\/fgene.2022.942491","article-title":"AttnTAP: a dual-input framework incorporating the attention mechanism for accurately predicting TCR-peptide binding","volume":"13","author":"Xu","year":"2022","journal-title":"Front. Genet."},{"key":"B42","doi-asserted-by":"publisher","first-page":"bbab335","DOI":"10.1093\/bib\/bbab335","article-title":"Dlptcr: an ensemble deep learning framework for predicting immunogenic peptide recognized by t cell receptor","volume":"22","author":"Xu","year":"2021","journal-title":"Briefings Bioinforma."},{"key":"B43","doi-asserted-by":"publisher","first-page":"18618","DOI":"10.1074\/jbc.M117.810382","article-title":"Structural basis for clonal diversity of the human T-cell response to a dominant influenza virus epitope","volume":"292","author":"Yang","year":"2017","journal-title":"J. Biol. Chem."}],"container-title":["Frontiers in Bioinformatics"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2023.1274599\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,18]],"date-time":"2023-12-18T01:37:55Z","timestamp":1702863475000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/fbinf.2023.1274599\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,18]]},"references-count":43,"alternative-id":["10.3389\/fbinf.2023.1274599"],"URL":"https:\/\/doi.org\/10.3389\/fbinf.2023.1274599","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2023.02.16.528799","asserted-by":"object"}]},"ISSN":["2673-7647"],"issn-type":[{"value":"2673-7647","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,18]]},"article-number":"1274599"}}