{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,26]],"date-time":"2026-02-26T20:34:39Z","timestamp":1772138079777,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2016,10,12]],"date-time":"2016-10-12T00:00:00Z","timestamp":1476230400000},"content-version":"vor","delay-in-days":219,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2016,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Motivation: The diversity of the immune repertoire is initially generated by random rearrangements of the receptor gene during early T and B cell development. Rearrangement scenarios are composed of random events\u2014choices of gene templates, base pair deletions and insertions\u2014described by probability distributions. Not all scenarios are equally likely, and the same receptor sequence may be obtained in several different ways. Quantifying the distribution of these rearrangements is an essential baseline for studying the immune system diversity. Inferring the properties of the distributions from receptor sequences is a computationally hard problem, requiring enumerating every possible scenario for every sampled receptor sequence.<\/jats:p>\n                  <jats:p>Results: We present a Hidden Markov model, which accounts for all plausible scenarios that can generate the receptor sequences. We developed and implemented a method based on the Baum\u2013Welch algorithm that can efficiently infer the parameters for the different events of the rearrangement process. We tested our software tool on sequence data for both the alpha and beta chains of the T cell receptor. 
To test the validity of our algorithm, we also generated synthetic sequences produced by a known model, and confirmed that its parameters could be accurately inferred back from the sequences. The inferred model can be used to generate synthetic sequences, to calculate the probability of generation of any receptor sequence, as well as the theoretical diversity of the repertoire. We estimate this diversity to be \u224810<jats:sup>23<\/jats:sup> for human T cells. The model gives a baseline to investigate the selection and dynamics of immune repertoires.<\/jats:p>\n                  <jats:p>Availability and implementation: Source code and sample sequence files are available at https:\/\/bitbucket.org\/yuvalel\/repgenhmm\/downloads.<\/jats:p>\n                  <jats:p>Contact: \u00a0elhanati@lpt.ens.fr or tmora@lps.ens.fr or awalczak@lpt.ens.fr<\/jats:p>","DOI":"10.1093\/bioinformatics\/btw112","type":"journal-article","created":{"date-parts":[[2016,2,26]],"date-time":"2016-02-26T21:10:41Z","timestamp":1456521041000},"page":"1943-1951","source":"Crossref","is-referenced-by-count":36,"title":["repgenHMM: a dynamic programming tool to infer the rules of immune receptor generation from sequence data"],"prefix":"10.1093","volume":"32","author":[{"given":"Yuval","family":"Elhanati","sequence":"first","affiliation":[{"name":"1 Laboratoire de physique th\u00e9orique, CNRS, UPMC and Ecole normale sup\u00e9rieure, Paris, France"}]},{"given":"Quentin","family":"Marcou","sequence":"additional","affiliation":[{"name":"1 Laboratoire de physique th\u00e9orique, CNRS, UPMC and Ecole normale sup\u00e9rieure, Paris, France"}]},{"given":"Thierry","family":"Mora","sequence":"additional","affiliation":[{"name":"2 Laboratoire de physique statistique, CNRS, UPMC and Ecole normale sup\u00e9rieure, Paris, France"}]},{"given":"Aleksandra M.","family":"Walczak","sequence":"additional","affiliation":[{"name":"1 Laboratoire de physique th\u00e9orique, CNRS, UPMC and Ecole normale sup\u00e9rieure, Paris, 
France"}]}],"member":"286","published-online":{"date-parts":[[2016,3,7]]},"reference":[{"key":"2023020112335656500_btw112-B1","volume-title":"Pattern Recognition and Machine Learning","author":"Bishop","year":"2006"},{"key":"2023020112335656500_btw112-B2","doi-asserted-by":"crossref","first-page":"3073","DOI":"10.1002\/eji.201242517","article-title":"Next generation sequencing for TCR repertoire profiling: platform-specific features and correction algorithms","volume":"42","author":"Bolotin","year":"2012","journal-title":"Eur. J. Immunol"},{"key":"2023020112335656500_btw112-B3","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1038\/nmeth.3364","article-title":"MiXCR: software for comprehensive adaptive immunity profiling","volume":"12","author":"Bolotin","year":"2015","journal-title":"Nat. Methods"},{"key":"2023020112335656500_btw112-B4","first-page":"44","volume-title":"Research in Computational Molecular Biology SE - 7, volume 9029 of Lecture Notes in Computer Science","author":"Bonissone","year":"2015"},{"key":"2023020112335656500_btw112-B5","doi-asserted-by":"crossref","first-page":"503","DOI":"10.1093\/nar\/gkn316","article-title":"IMGT\/V-QUEST: the highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis","volume":"36","author":"Brochet","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023020112335656500_btw112-B6","doi-asserted-by":"crossref","first-page":"1971","DOI":"10.4049\/jimmunol.164.4.1971","article-title":"The nucleotide-replacement spectrum under somatic hypermutation exhibits microsequence dependence that is strand-symmetric and distinct from that under germline mutation","volume":"164","author":"Cowell","year":"2000","journal-title":"J. 
Immunol"},{"key":"2023020112335656500_btw112-B7","doi-asserted-by":"crossref","first-page":"2360","DOI":"10.4049\/jimmunol.160.5.2360","article-title":"Base-specific sequences that bias somatic hypermutation deduced by analysis of out-of-frame human IgVH genes","volume":"160","author":"Dunn-Walters","year":"1998","journal-title":"J. Immunol. (Baltimore, MD.: 1950)"},{"key":"2023020112335656500_btw112-B8","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511790492","volume-title":"Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids","author":"Durbin","year":"1998"},{"key":"2023020112335656500_btw112-B9","doi-asserted-by":"crossref","first-page":"20140243.","DOI":"10.1098\/rstb.2014.0243","article-title":"Inferring processes underlying B-cell repertoire diversity","volume":"370","author":"Elhanati","year":"2015","journal-title":"Philos. Trans. R. Soc. Lond. Ser. B: Biol. Sci"},{"key":"2023020112335656500_btw112-B10","doi-asserted-by":"crossref","first-page":"20140240.","DOI":"10.1098\/rstb.2014.0240","article-title":"Assigning and visualizing germline genes in antibody repertoires","volume":"370","author":"Frost","year":"2015","journal-title":"Philos. Trans. R. Soc. Lond. Ser. B: Biol. Sci"},{"key":"2023020112335656500_btw112-B11","doi-asserted-by":"crossref","first-page":"E862","DOI":"10.1073\/pnas.1417683112","article-title":"Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles","volume":"112","author":"Gadala-Maria","year":"2015","journal-title":"Proc. Natl. Acad. Sci. U. S. 
A"},{"key":"2023020112335656500_btw112-B12","doi-asserted-by":"crossref","first-page":"1580","DOI":"10.1093\/bioinformatics\/btm147","article-title":"iHMMune-align: hidden Markov model-based alignment and identification of germline genes in rearranged immunoglobulin gene sequences","volume":"23","author":"Ga\u00ebta","year":"2007","journal-title":"Bioinformatics"},{"key":"2023020112335656500_btw112-B13","doi-asserted-by":"crossref","first-page":"4475","DOI":"10.4049\/jimmunol.1400119","article-title":"Check MAIT","volume":"192","author":"Gapin","year":"2014","journal-title":"J. Immunol. (Baltimore, MD.: 1950)"},{"key":"2023020112335656500_btw112-B14","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1038\/nbt.2782","article-title":"The promise and challenge of high-throughput sequencing of the antibody repertoire","volume":"32","author":"Georgiou","year":"2014","journal-title":"Nat. Biotechnol"},{"key":"2023020112335656500_btw112-B15","doi-asserted-by":"crossref","first-page":"D256","DOI":"10.1093\/nar\/gki010","article-title":"IMGT\/GENE-DB: a comprehensive database for human and mouse immunoglobulin and T cell receptor genes","volume":"33","author":"Giudicelli","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020112335656500_btw112-B16","doi-asserted-by":"crossref","first-page":"1126","DOI":"10.15252\/embj.201489643","article-title":"Structural basis for a novel mechanism of DNA bridging and alignment in eukaryotic DSB DNA repair","volume":"34","author":"Gouge","year":"2015","journal-title":"EMBO J"},{"key":"2023020112335656500_btw112-B17","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1016\/j.imbio.2012.04.003","article-title":"NKT and MAIT invariant TCR\u03b1 sequences can be produced efficiently by VJ gene 
recombination","volume":"218","author":"Greenaway","year":"2013","journal-title":"Immunobiology"},{"key":"2023020112335656500_btw112-B18","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1073\/pnas.0608248104","article-title":"Role for rearranged variable gene segments in directing secondary T cell receptor alpha recombination","volume":"104","author":"Hawwari","year":"2007","journal-title":"Proc. Natl. Acad. Sci. U. S. A"},{"key":"2023020112335656500_btw112-B19","doi-asserted-by":"crossref","first-page":"2597","DOI":"10.4049\/jimmunol.166.4.2597","article-title":"Ordered and coordinated rearrangement of the TCR locus: role of secondary rearrangement in thymic selection","volume":"166","author":"Huang","year":"2001","journal-title":"J. Immunol"},{"key":"2023020112335656500_btw112-B20","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1016\/S0092-8674(04)00039-X","article-title":"Unraveling V(D)J recombination; insights into gene regulation","volume":"116","author":"Jung","year":"2004","journal-title":"Cell"},{"key":"2023020112335656500_btw112-B21","doi-asserted-by":"crossref","first-page":"317","DOI":"10.3109\/08830189609061755","article-title":"Repertoires of antigen receptors in Tdt congenitally deficient mice","volume":"13","author":"Komori","year":"1996","journal-title":"Int. Rev. 
Immunol"},{"key":"2023020112335656500_btw112-B22","volume-title":"The T Cell Receptor FactsBook","author":"Lefranc","year":"2001"},{"key":"2023020112335656500_btw112-B23","doi-asserted-by":"crossref","first-page":"867","DOI":"10.1093\/bioinformatics\/btq056","article-title":"SoDA2: a Hidden Markov Model approach for identification of immunoglobulin rearrangements","volume":"26","author":"Munshaw","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020112335656500_btw112-B24","doi-asserted-by":"crossref","first-page":"16161","DOI":"10.1073\/pnas.1212755109","article-title":"Statistical inference of the generation probability of T-cell receptors from sequence repertoires","volume":"109","author":"Murugan","year":"2012","journal-title":"Proc. Natl. Acad. Sci. U. S. A"},{"key":"2023020112335656500_btw112-B25","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1111\/j.1365-2567.2006.02431.x","article-title":"No evidence for the use of DIR, D-D fusions, chromosome 15 open reading frames or VHreplacement in the peripheral repertoire was found on application of an improved algorithm, JointML, to 6329 human immunoglobulin H rearrangements","volume":"119","author":"Ohm-Laursen","year":"2006","journal-title":"Immunology"},{"key":"2023020112335656500_btw112-B26","doi-asserted-by":"crossref","first-page":"892","DOI":"10.4049\/jimmunol.166.2.892","article-title":"The targeting of somatic hypermutation closely resembles that of meiotic mutation","volume":"166","author":"Oprea","year":"2001","journal-title":"J. 
Immunol"},{"key":"2023020112335656500_btw112-B27","doi-asserted-by":"crossref","first-page":"e0118192.","DOI":"10.1371\/journal.pone.0118192","article-title":"VDJSeq-Solver: in silico V(D)J recombination detection tool","volume":"10","author":"Paciello","year":"2015","journal-title":"Plos One"},{"key":"2023020112335656500_btw112-B28","first-page":"e1004409","article-title":"Consistency of VDJ rearrangement and substitution parameters enables accurate B cell receptor sequence annotation","volume-title":"PLoS computational biology","author":"Ralph","year":"2015"},{"key":"2023020112335656500_btw112-B29","doi-asserted-by":"crossref","first-page":"646","DOI":"10.1016\/j.coi.2013.09.017","article-title":"Immunosequencing: applications of immune repertoire deep sequencing","volume":"25","author":"Robins","year":"2013","journal-title":"Curr. Opin. Immunol"},{"key":"2023020112335656500_btw112-B30","doi-asserted-by":"crossref","first-page":"4099","DOI":"10.1182\/blood-2009-04-217604","article-title":"Comprehensive assessment of T-cell receptor beta-chain diversity in alphabeta T cells","volume":"114","author":"Robins","year":"2009","journal-title":"Blood"},{"key":"2023020112335656500_btw112-B31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12859-015-0589-x","article-title":"HTJoinSolver: Human immunoglobulin VDJ partitioning using approximate dynamic programming constrained by conserved motifs","volume":"16","author":"Russ","year":"2015","journal-title":"BMC Bioinf"},{"key":"2023020112335656500_btw112-B32","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1038\/nri2941","article-title":"Recombination centres and the orchestration of V(D)J recombination","volume":"11","author":"Schatz","year":"2011","journal-title":"Nat. Rev. 
Immunol"},{"key":"2023020112335656500_btw112-B33","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1146\/annurev-genet-110410-132552","article-title":"V(D)J recombination: mechanisms of initiation","volume":"45","author":"Schatz","year":"2011","journal-title":"Annu. Rev. Genet"},{"key":"2023020112335656500_btw112-B34","doi-asserted-by":"crossref","first-page":"466.","DOI":"10.3389\/fimmu.2013.00466","article-title":"Huge overlap of individual TCR beta repertoires","volume":"4","author":"Shugay","year":"2013","journal-title":"Front. Immunol"},{"key":"2023020112335656500_btw112-B35","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1038\/nmeth.2960","article-title":"Towards error-free profiling of immune repertoires","volume":"11","author":"Shugay","year":"2014","journal-title":"Nat. Methods"},{"key":"2023020112335656500_btw112-B36","doi-asserted-by":"crossref","first-page":"6790","DOI":"10.4049\/jimmunol.172.11.6790","article-title":"Characterization of the human Ig heavy chain antigen binding complementarity determining region 3 using a newly developed software algorithm, JOINSOLVER","volume":"172","author":"Souto-Carneiro","year":"2004","journal-title":"J. Immunol. (Baltimore, MD.: 1950)"},{"key":"2023020112335656500_btw112-B37","doi-asserted-by":"crossref","first-page":"5170","DOI":"10.4049\/jimmunol.175.8.5170","article-title":"Hypermutation at A-T base pairs: the a nucleotide replacement spectrum is affected by adjacent nucleotides and there is no reverse complementarity of sequences flanking mutated A and T nucleotides","volume":"175","author":"Spencer","year":"2005","journal-title":"J. 
Immunol"},{"key":"2023020112335656500_btw112-B38","doi-asserted-by":"crossref","first-page":"542","DOI":"10.1093\/bioinformatics\/btt004","article-title":"Decombinator: a tool for fast, efficient gene assignment in T-cell receptor sequences using a finite state machine","volume":"29","author":"Thomas","year":"2013","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023020112335656500_btw112-B39","doi-asserted-by":"crossref","first-page":"4285","DOI":"10.4049\/jimmunol.1003898","article-title":"A mechanism for TCR sharing between T cell subsets and individuals revealed by pyrosequencing","volume":"186","author":"Venturi","year":"2011","journal-title":"J. Immunol. (Baltimore, MD.: 1950)"},{"key":"2023020112335656500_btw112-B40","first-page":"438","article-title":"SoDA: implementation of a 3D alignment algorithm for inference of antigen receptor recombinations","volume":"22","author":"Volpe","year":"2006","journal-title":"Bioinformatics (Oxford, England)"},{"key":"2023020112335656500_btw112-B41","doi-asserted-by":"crossref","first-page":"S20.","DOI":"10.1186\/1471-2105-9-S12-S20","article-title":"Ab-origin: an enhanced tool to identify the sourcing gene segments in germline for rearranged antibodies","volume":"9","author":"Wang","year":"2008","journal-title":"BMC Bioinf"},{"key":"2023020112335656500_btw112-B42","doi-asserted-by":"crossref","first-page":"3857","DOI":"10.4049\/jimmunol.177.6.3857","article-title":"A model for TCR gene segment use","volume":"177","author":"Warmflash","year":"2006","journal-title":"J. 
Immunol"},{"key":"2023020112335656500_btw112-B43","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/nar\/gkt382","article-title":"IgBLAST: an immunoglobulin variable domain sequence analysis tool","volume":"41","author":"Ye","year":"2013","journal-title":"Nucleic Acids Res"},{"key":"2023020112335656500_btw112-B44","doi-asserted-by":"crossref","first-page":"5980","DOI":"10.1073\/pnas.1319389111","article-title":"Distinctive properties of identical twins\u2019 TCR repertoires revealed by high-throughput sequencing","volume":"111","author":"Zvyagin","year":"2014","journal-title":"Proc. Natl. Acad. Sci. U. S. A"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/13\/1943\/49020058\/bioinformatics_32_13_1943.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/32\/13\/1943\/49020058\/bioinformatics_32_13_1943.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,1]],"date-time":"2023-02-01T17:45:55Z","timestamp":1675273555000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/32\/13\/1943\/1743638"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,3,7]]},"references-count":44,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2016,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btw112","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/030403","asserted-by":"object"}]},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2016,7,1]]},"published":{"date-parts":[[2016,3,7]]}}}