{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,18]],"date-time":"2026-04-18T03:27:27Z","timestamp":1776482847330,"version":"3.51.2"},"reference-count":47,"publisher":"Oxford University Press (OUP)","issue":"17","license":[{"start":{"date-parts":[[2019,1,18]],"date-time":"2019-01-18T00:00:00Z","timestamp":1547769600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,9,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability of generation by V(D)J recombination of T- and B-cell receptors of any specific nucleotide sequence. These generation probabilities are very non-homogeneous, ranging over 20 orders of magnitude in real repertoires. Since the function of a receptor really depends on its protein sequence, it is important to be able to predict this probability of generation at the amino acid level. However, brute-force summation over all the nucleotide sequences with the correct amino acid translation is computationally intractable. The purpose of this paper is to present a solution to this problem.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>We use dynamic programming to construct an efficient and flexible algorithm, called OLGA (Optimized Likelihood estimate of immunoGlobulin Amino-acid sequences), for calculating the probability of generating a given CDR3 amino acid sequence or motif, with or without V\/J restriction, as a result of V(D)J recombination in B or T cells. We apply it to databases of epitope-specific T-cell receptors to evaluate the probability that a typical human subject will possess T cells responsive to specific disease-associated epitopes. The model prediction shows an excellent agreement with published data. We suggest that OLGA may be a useful tool to guide vaccine design.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>Source code is available at https:\/\/github.com\/zsethna\/OLGA.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz035","type":"journal-article","created":{"date-parts":[[2019,1,13]],"date-time":"2019-01-13T15:06:42Z","timestamp":1547392002000},"page":"2974-2981","source":"Crossref","is-referenced-by-count":225,"title":["OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs"],"prefix":"10.1093","volume":"35","author":[{"given":"Zachary","family":"Sethna","sequence":"first","affiliation":[{"name":"Princeton University Joseph Henry Laboratories, , Princeton, NJ, USA"}]},{"given":"Yuval","family":"Elhanati","sequence":"additional","affiliation":[{"name":"Princeton University Joseph Henry Laboratories, , Princeton, NJ, USA"}]},{"suffix":"Jr","given":"Curtis G","family":"Callan","sequence":"additional","affiliation":[{"name":"Princeton University Joseph Henry Laboratories, , Princeton, NJ, USA"},{"name":"Centre national de la recherche scientifique, Sorbonne University, University Paris-Diderot Laboratoire de physique de l'Ecole normale sup\u00e9rieure (PSL University), , Paris, France"}]},{"given":"Aleksandra M","family":"Walczak","sequence":"additional","affiliation":[{"name":"Centre national de la recherche scientifique, Sorbonne University, University Paris-Diderot Laboratoire de physique de l'Ecole normale sup\u00e9rieure (PSL University), , Paris, France"}]},{"given":"Thierry","family":"Mora","sequence":"additional","affiliation":[{"name":"Centre national de la recherche scientifique, Sorbonne University, University Paris-Diderot Laboratoire de physique de l'Ecole normale sup\u00e9rieure (PSL University), , Paris, France"}]}],"member":"286","published-online":{"date-parts":[[2019,1,18]]},"reference":[{"key":"2023062711305563200_btz035-B1","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1126\/science.1260668","article-title":"Functional heterogeneity of human memory cd4+ t cell clones primed by pathogens or vaccines","volume":"347","author":"Becattini","year":"2015","journal-title":"Science"},{"key":"2023062711305563200_btz035-B2","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1038\/nature22383","article-title":"Quantifiable predictive features define epitope-specific T cell receptor repertoires","volume":"547","author":"Dash","year":"2017","journal-title":"Nature"},{"key":"2023062711305563200_btz035-B3","doi-asserted-by":"crossref","first-page":"e0160853.","DOI":"10.1371\/journal.pone.0160853","article-title":"A public database of memory and naive B-cell receptor sequences","volume":"11","author":"DeWitt","year":"2016","journal-title":"PLoS One"},{"key":"2023062711305563200_btz035-B4","author":"DeWitt","year":"2018"},{"key":"2023062711305563200_btz035-B5","author":"Dupic","year":"2018"},{"key":"2023062711305563200_btz035-B6","doi-asserted-by":"crossref","first-page":"20140243.","DOI":"10.1098\/rstb.2014.0243","article-title":"Inferring processes underlying B-cell repertoire diversity","volume":"370","author":"Elhanati","year":"2015","journal-title":"Philos. Trans. R Soc. Lond. B Biol. Sci"},{"key":"2023062711305563200_btz035-B7","doi-asserted-by":"crossref","first-page":"1943","DOI":"10.1093\/bioinformatics\/btw112","article-title":"repgenhmm: a dynamic programming tool to infer the rules of immune receptor generation from sequence data","volume":"32","author":"Elhanati","year":"2016","journal-title":"Bioinformatics"},{"key":"2023062711305563200_btz035-B8","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1111\/imr.12665","article-title":"Predicting the spectrum of TCR repertoire sharing with a data-driven model of recombination","volume":"284","author":"Elhanati","year":"2018","journal-title":"Immunol. Rev"},{"key":"2023062711305563200_btz035-B9","doi-asserted-by":"crossref","first-page":"659","DOI":"10.1038\/ng.3822","article-title":"Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire","volume":"49","author":"Emerson","year":"2017","journal-title":"Nat. Genet"},{"key":"2023062711305563200_btz035-B10","doi-asserted-by":"crossref","first-page":"774","DOI":"10.1002\/art.40028","article-title":"Discovery of T cell receptor \u03b2 motifs specific to HLA-B27-positive ankylosing spondylitis by deep repertoire sequence analysis","volume":"69","author":"Faham","year":"2017","journal-title":"Arthritis Rheumatol"},{"key":"2023062711305563200_btz035-B11","doi-asserted-by":"crossref","first-page":"1817","DOI":"10.1101\/gr.092924.109","article-title":"Profiling the T-cell receptor beta-chain repertoire by massively parallel sequencing","volume":"19","author":"Freeman","year":"2009","journal-title":"Genome Res"},{"key":"2023062711305563200_btz035-B12","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/srep44661","article-title":"CD8+T cells specific for the islet autoantigen IGRP are restricted in their T cell receptor chain usage","volume":"7","author":"Fuchs","year":"2017","journal-title":"Sci. Rep"},{"key":"2023062711305563200_btz035-B13","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/j.immuni.2015.12.005","article-title":"Diversity of T cells restricted by the MHC class I-related molecule MR1 facilitates differential antigen recognition","volume":"44","author":"Gherardin","year":"2016","journal-title":"Immunity"},{"key":"2023062711305563200_btz035-B14","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1038\/nature22976","article-title":"Identifying specificity groups in the T cell receptor repertoire","volume":"547","author":"Glanville","year":"2017","journal-title":"Nature"},{"key":"2023062711305563200_btz035-B15","author":"Grigaityte","year":"2017"},{"key":"2023062711305563200_btz035-B16","first-page":"554","article-title":"High-throughput sequencing of the T-cell receptor repertoire: pitfalls and opportunities","volume":"19","author":"Heather","year":"2017","journal-title":"Brief. Bioinform"},{"key":"2023062711305563200_btz035-B17","author":"Horns","year":"2017"},{"key":"2023062711305563200_btz035-B18","doi-asserted-by":"crossref","first-page":"301ra131.","DOI":"10.1126\/scitranslmed.aac5624","article-title":"High-throughput pairing of T cell receptor a and b sequences","volume":"7","author":"Howie","year":"2015","journal-title":"Sci. Transl. Med"},{"key":"2023062711305563200_btz035-B19","doi-asserted-by":"crossref","first-page":"171ra19.","DOI":"10.1126\/scitranslmed.3004794","article-title":"Lineage structure of the human antibody repertoire in response to influenza vaccination","volume":"5","author":"Jiang","year":"2013","journal-title":"Sci. Transl. Med"},{"key":"2023062711305563200_btz035-B20","doi-asserted-by":"crossref","first-page":"1097","DOI":"10.1093\/rheumatology\/kex517","article-title":"CD8+ T cells with characteristic TCR beta motif are detected in blood and expanded in synovial fluid of ankylosing spondylitis patients","volume":"57","author":"Komech","year":"2018","journal-title":"Rheumatology (Oxford, England)"},{"key":"2023062711305563200_btz035-B21","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1016\/j.coisb.2016.12.009","article-title":"Advances and applications of immune receptor sequencing in systems immunology","volume":"1","author":"Lindau","year":"2017","journal-title":"Curr. Opin. Syst. Biol"},{"key":"2023062711305563200_btz035-B22","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1016\/j.jtbi.2015.10.016","article-title":"How many TCR clonotypes does a body maintain?","volume":"389","author":"Lythe","year":"2016","journal-title":"J. Theor. Biol"},{"key":"2023062711305563200_btz035-B23","doi-asserted-by":"crossref","first-page":"1603","DOI":"10.1101\/gr.170753.113","article-title":"T-cell receptor repertoires share a restricted set of public and abundant CDR3 sequences that are associated with self-related immunity","volume":"24","author":"Madi","year":"2014","journal-title":"Genome Res"},{"key":"2023062711305563200_btz035-B24","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.22057","article-title":"T cell receptor repertoires of mice and humans are clustered in similarity networks around conserved public CDR3 sequences","volume":"6","author":"Madi","year":"2017","journal-title":"eLife"},{"key":"2023062711305563200_btz035-B25","doi-asserted-by":"crossref","first-page":"561.","DOI":"10.1038\/s41467-018-02832-w","article-title":"High-throughput immune repertoire analysis with IGoR","volume":"9","author":"Marcou","year":"2018","journal-title":"Nat. Commun"},{"key":"2023062711305563200_btz035-B26","first-page":"185","volume-title":"Systems Immunology: An Introduction to Modeling Methods for Scientists","author":"Mora","year":"2018"},{"key":"2023062711305563200_btz035-B27","doi-asserted-by":"crossref","first-page":"16161","DOI":"10.1073\/pnas.1212755109","article-title":"Statistical inference of the generation probability of T-cell receptors from sequence repertoires","volume":"109","author":"Murugan","year":"2012","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062711305563200_btz035-B28","doi-asserted-by":"crossref","first-page":"e1005572","DOI":"10.1371\/journal.pcbi.1005572","article-title":"Persisting fetal clonotypes influence the structure and overlap of adult human T cell receptor repertoires","volume":"13","author":"Pogorelyy","year":"2017","journal-title":"PLoS Comput. Biol"},{"key":"2023062711305563200_btz035-B29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.7554\/eLife.33050","article-title":"Method for identification of condition-associated public antigen receptor sequences","volume":"7","author":"Pogorelyy","year":"2018","journal-title":"Elife"},{"key":"2023062711305563200_btz035-B30","first-page":"12704","article-title":"Precise tracking of vaccine-responding T-cell clones reveals convergent and personalized response in identical twins","volume-title":"Proc. Natl Acad. Sci","author":"Pogorelyy","year":"2018"},{"key":"2023062711305563200_btz035-B31","doi-asserted-by":"crossref","first-page":"13139","DOI":"10.1073\/pnas.1409155111","article-title":"Diversity and clonal selection in the human T-cell repertoire","volume":"111","author":"Qi","year":"2014","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062711305563200_btz035-B32","doi-asserted-by":"crossref","first-page":"4099","DOI":"10.1182\/blood-2009-04-217604","article-title":"Comprehensive assessment of T-cell receptor beta-chain diversity in alphabeta T cells","volume":"114","author":"Robins","year":"2009","journal-title":"Blood"},{"key":"2023062711305563200_btz035-B33","doi-asserted-by":"crossref","first-page":"47ra64.","DOI":"10.1126\/scitranslmed.3001442","article-title":"Overlap and effective size of the human CD8+ T cell receptor repertoire","volume":"2","author":"Robins","year":"2010","journal-title":"Sci. Transl. Med"},{"key":"2023062711305563200_btz035-B34","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1172\/jci.insight.88242","article-title":"Tissue distribution and clonal diversity of the T and B cell repertoire in type 1 diabetes","volume":"1","author":"Seay","year":"2016","journal-title":"JCI Insight"},{"key":"2023062711305563200_btz035-B35","doi-asserted-by":"crossref","first-page":"2253","DOI":"10.1073\/pnas.1700241114","article-title":"Insights into immune system development and function from mouse T-cell repertoires","volume":"114","author":"Sethna","year":"2017","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062711305563200_btz035-B36","doi-asserted-by":"crossref","first-page":"D419","DOI":"10.1093\/nar\/gkx760","article-title":"VDJdb: a curated database of T-cell receptor sequences with known antigen specificity","volume":"46","author":"Shugay","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"2023062711305563200_btz035-B37","doi-asserted-by":"crossref","first-page":"E3529","DOI":"10.1073\/pnas.1601012113","article-title":"Diversity and divergence of the glioma-infiltrating t-cell receptor repertoire","volume":"113","author":"Sims","year":"2016","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062711305563200_btz035-B38","doi-asserted-by":"crossref","first-page":"413.","DOI":"10.3389\/fimmu.2013.00413","article-title":"The past, present and future of immune repertoire biology \u2013 the rise of next-generation repertoire analysis","volume":"4","author":"Six","year":"2013","journal-title":"Front. Immunol"},{"key":"2023062711305563200_btz035-B39","doi-asserted-by":"crossref","first-page":"1307.","DOI":"10.3389\/fimmu.2018.01307","article-title":"Evidence for shaping of light chain repertoire by structural selection","volume":"9","author":"Toledano","year":"2018","journal-title":"Front. Immunol"},{"key":"2023062711305563200_btz035-B40","doi-asserted-by":"crossref","first-page":"2597","DOI":"10.4049\/jimmunol.181.4.2597","article-title":"The role of production frequency in the sharing of simian immunodeficiency virus-specific CD8+ TCRs between macaques","volume":"181","author":"Venturi","year":"2008","journal-title":"J. Immunol"},{"key":"2023062711305563200_btz035-B41","doi-asserted-by":"crossref","first-page":"639","DOI":"10.1016\/j.coi.2013.07.001","article-title":"Specificity, promiscuity, and precursor frequency in immunoreceptors","volume":"25","author":"Venturi","year":"2013","journal-title":"Curr. Opin. Immunol"},{"key":"2023062711305563200_btz035-B42","doi-asserted-by":"crossref","first-page":"13463","DOI":"10.1073\/pnas.1312146110","article-title":"Genetic measurement of memory B-cell recall using antibody repertoire sequencing","volume":"110","author":"Vollmers","year":"2013","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062711305563200_btz035-B43","doi-asserted-by":"crossref","first-page":"1518","DOI":"10.1073\/pnas.0913939107","article-title":"High throughput sequencing reveals a complex pattern of dynamic interrelationships among human T cell subsets","volume":"107","author":"Wang","year":"2010","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062711305563200_btz035-B44","doi-asserted-by":"crossref","first-page":"807","DOI":"10.1126\/science.1170020","article-title":"High-throughput sequencing of the zebrafish antibody repertoire","volume":"324","author":"Weinstein","year":"2009","journal-title":"Science"},{"key":"2023062711305563200_btz035-B45","doi-asserted-by":"crossref","first-page":"98.","DOI":"10.1186\/gm502","article-title":"Sequence analysis of T-cell repertoires in health and disease","volume":"5","author":"Woodsworth","year":"2013","journal-title":"Genome Med"},{"key":"2023062711305563200_btz035-B46","doi-asserted-by":"crossref","first-page":"1487","DOI":"10.1038\/s41385-018-0046-z","article-title":"Expanded tcr\u00dfcdr3 clonotypes distinguish Crohn\u2019s disease and ulcerative colitis patients","volume":"11","author":"Wu","year":"2018","journal-title":"Mucosal Immunol"},{"key":"2023062711305563200_btz035-B47","doi-asserted-by":"crossref","first-page":"4905","DOI":"10.4049\/jimmunol.1501029","article-title":"Preferential use of public TCR during autoimmune encephalomyelitis","volume":"196","author":"Zhao","year":"2016","journal-title":"J. Immunol"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/17\/2974\/50719771\/bioinformatics_35_17_2974.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/35\/17\/2974\/50719771\/bioinformatics_35_17_2974.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,27]],"date-time":"2023-06-27T07:34:06Z","timestamp":1687851246000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/35\/17\/2974\/5292315"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,1,18]]},"references-count":47,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2019,9,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz035","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/367904","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,9,1]]},"published":{"date-parts":[[2019,1,18]]}}}