{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,21]],"date-time":"2026-03-21T21:07:52Z","timestamp":1774127272695,"version":"3.50.1"},"reference-count":46,"publisher":"Public Library of Science (PLoS)","issue":"12","license":[{"start":{"date-parts":[[2022,12,7]],"date-time":"2022-12-07T00:00:00Z","timestamp":1670371200000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 AI146028"],"award-info":[{"award-number":["R01 AI146028"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["U19 AI117891"],"award-info":[{"award-number":["U19 AI117891"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 AI121832"],"award-info":[{"award-number":["R01 AI121832"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R01 AI136514"],"award-info":[{"award-number":["R01 AI136514"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["U01 AI150747"],"award-info":[{"award-number":["U01 AI150747"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"American Lebanese Syrian Associated Charities at St. Jude"},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"crossref","award":["R35 GM141457, R01 AI146028"],"award-info":[{"award-number":["R35 GM141457, R01 AI146028"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"crossref","award":["R01 AI146028, U19 AI117891"],"award-info":[{"award-number":["R01 AI146028, U19 AI117891"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100000011","name":"Howard Hughes Medical Institute","doi-asserted-by":"publisher","award":["Faculty Scholar grant"],"award-info":[{"award-number":["Faculty Scholar grant"]}],"id":[{"id":"10.13039\/100000011","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000893","name":"Simons Foundation","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000893","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["www.ploscompbiol.org"],"crossmark-restriction":false},"short-container-title":["PLoS Comput Biol"],"abstract":"<jats:p>The complexity of entire T cell receptor (TCR) repertoires makes their comparison a difficult but important task. Current methods of TCR repertoire comparison can incur a high loss of distributional information by considering overly simplistic sequence- or repertoire-level characteristics. Optimal transport methods form a suitable approach for such comparison given some distance or metric between values in the sample space, with appealing theoretical and computational properties. In this paper we introduce a nonparametric approach to comparing empirical TCR repertoires that applies the Sinkhorn distance, a fast, contemporary optimal transport method, and a recently-created distance between TCRs called TCRdist. We show that our methods identify meaningful differences between samples from distinct TCR distributions for several case studies, and compete with more complicated methods despite minimal modeling assumptions and a simpler pipeline.<\/jats:p>","DOI":"10.1371\/journal.pcbi.1010681","type":"journal-article","created":{"date-parts":[[2022,12,7]],"date-time":"2022-12-07T18:32:16Z","timestamp":1670437936000},"page":"e1010681","update-policy":"https:\/\/doi.org\/10.1371\/journal.pcbi.corrections_policy","source":"Crossref","is-referenced-by-count":8,"title":["Comparing T cell receptor repertoires using optimal transport"],"prefix":"10.1371","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1951-8822","authenticated-orcid":true,"given":"Branden J.","family":"Olson","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6860-5079","authenticated-orcid":true,"given":"Stefan A.","family":"Schattgen","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7955-0256","authenticated-orcid":true,"given":"Paul G.","family":"Thomas","sequence":"additional","affiliation":[]},{"given":"Philip","family":"Bradley","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0607-6025","authenticated-orcid":true,"given":"Frederick A.","family":"Matsen IV","sequence":"additional","affiliation":[]}],"member":"340","published-online":{"date-parts":[[2022,12,7]]},"reference":[{"key":"pcbi.1010681.ref001","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1146\/annurev.immunol.21.120601.141107","article-title":"Positive and negative selection of T cells","volume":"21","author":"TK Starr","year":"2003","journal-title":"Annu Rev Immunol"},{"key":"pcbi.1010681.ref002","doi-asserted-by":"crossref","first-page":"33843","DOI":"10.1038\/srep33843","article-title":"Immune Repertoire Diversity Correlated with Mortality in Avian Influenza A (H7N9) Virus Infected Patients","volume":"6","author":"D Hou","year":"2016","journal-title":"Sci Rep"},{"issue":"1676","key":"pcbi.1010681.ref003","doi-asserted-by":"crossref","DOI":"10.1098\/rstb.2014.0237","article-title":"Ageing of the B-cell repertoire","volume":"370","author":"V Martin","year":"2015","journal-title":"Philos Trans R Soc Lond B Biol Sci"},{"key":"pcbi.1010681.ref004","doi-asserted-by":"crossref","first-page":"13642","DOI":"10.1038\/ncomms13642","article-title":"Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity","volume":"7","author":"M Corcoran","year":"2016","journal-title":"Nat Commun"},{"issue":"8","key":"pcbi.1010681.ref005","doi-asserted-by":"crossref","first-page":"E862","DOI":"10.1073\/pnas.1417683112","article-title":"Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles","volume":"112","author":"D Gadala-Maria","year":"2015","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"12","key":"pcbi.1010681.ref006","doi-asserted-by":"crossref","first-page":"6986","DOI":"10.4049\/jimmunol.1000445","article-title":"Individual variation in the germline Ig gene repertoire inferred from variable region gene rearrangements","volume":"184","author":"S Boyd","year":"2010","journal-title":"J Immunol"},{"issue":"1","key":"pcbi.1010681.ref007","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1186\/s12859-017-1556-5","article-title":"The repertoire dissimilarity index as a method to compare lymphocyte receptor repertoires","volume":"18","author":"C Bolen","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"pcbi.1010681.ref008","doi-asserted-by":"crossref","first-page":"2533","DOI":"10.3389\/fimmu.2019.02533","article-title":"sumrep: A Summary Statistic Framework for Immune Receptor Repertoire Comparison and Model Validation","volume":"10","author":"BJ Olson","year":"2019","journal-title":"Frontiers in Immunology"},{"key":"pcbi.1010681.ref009","doi-asserted-by":"crossref","first-page":"2209","DOI":"10.1101\/gr.275373.121","article-title":"Individualized VDJ recombination predisposes the available Ig sequence space","volume":"31","author":"A Slabodkin","year":"2021","journal-title":"Genome Res"},{"key":"pcbi.1010681.ref010","doi-asserted-by":"crossref","first-page":"100269","DOI":"10.1016\/j.crmeth.2022.100269","article-title":"Reference-based comparison of adaptive immune receptor repertoires","volume":"2","author":"CR Weber","year":"2022","journal-title":"Cell Rep Methods"},{"issue":"6","key":"pcbi.1010681.ref011","doi-asserted-by":"crossref","first-page":"1057","DOI":"10.1016\/j.molimm.2006.06.026","article-title":"Statistical analysis of CDR3 length distributions for the assessment of T and B cell repertoire biases","volume":"44","author":"P Miqueu","year":"2007","journal-title":"Mol Immunol"},{"issue":"6","key":"pcbi.1010681.ref012","doi-asserted-by":"crossref","first-page":"3221","DOI":"10.4049\/jimmunol.1201303","article-title":"Shaping of human germline IgH repertoires revealed by deep sequencing","volume":"189","author":"K Larimore","year":"2012","journal-title":"J Immunol"},{"issue":"22","key":"pcbi.1010681.ref013","doi-asserted-by":"crossref","first-page":"3181","DOI":"10.1093\/bioinformatics\/btu523","article-title":"Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence","volume":"30","author":"N Thomas","year":"2014","journal-title":"Bioinformatics"},{"issue":"1","key":"pcbi.1010681.ref014","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1186\/s12859-017-1814-6","article-title":"Statistical classifiers for diagnosing disease from immune repertoires: a case study using multiple sclerosis","volume":"18","author":"J Ostmeyer","year":"2017","journal-title":"BMC Bioinformatics"},{"issue":"7","key":"pcbi.1010681.ref015","doi-asserted-by":"crossref","first-page":"951","DOI":"10.1093\/bioinformatics\/btw771","article-title":"Feature selection using a one dimensional na\u00efve Bayes\u2019 classifier increases the accuracy of support vector machine classification of CDR3 repertoires","volume":"33","author":"M Cinelli","year":"2017","journal-title":"Bioinformatics"},{"key":"pcbi.1010681.ref016","doi-asserted-by":"crossref","first-page":"1500","DOI":"10.3389\/fimmu.2017.01500","article-title":"Quantification of Inter-Sample Differences in T-Cell Receptor Repertoires Using Sequence-Based Information","volume":"8","author":"R Yokota","year":"2017","journal-title":"Front Immunol"},{"issue":"4","key":"pcbi.1010681.ref017","doi-asserted-by":"crossref","first-page":"e1007873","DOI":"10.1371\/journal.pcbi.1007873","article-title":"Inferring the immune response from repertoire sequencing","volume":"16","author":"Puelma Touzel M","year":"2020","journal-title":"PLoS Comput Biol"},{"key":"pcbi.1010681.ref018","doi-asserted-by":"crossref","first-page":"642673","DOI":"10.3389\/fimmu.2021.642673","article-title":"Using Domain Based Latent Personal Analysis of B Cell Clone Diversity Patterns to Identify Novel Relationships Between the B Cell Clone Populations in Different Tissues","volume":"12","author":"U Alon","year":"2021","journal-title":"Front Immunol"},{"issue":"1","key":"pcbi.1010681.ref019","doi-asserted-by":"crossref","first-page":"e1009301","DOI":"10.1371\/journal.pgen.1009301","article-title":"Immune fingerprinting through repertoire similarity","volume":"17","author":"T Dupic","year":"2021","journal-title":"PLoS Genet"},{"issue":"50","key":"pcbi.1010681.ref020","doi-asserted-by":"crossref","first-page":"12704","DOI":"10.1073\/pnas.1809642115","article-title":"Precise tracking of vaccine-responding T cell clones reveals convergent and personalized response in identical twins","volume":"115","author":"MV Pogorelyy","year":"2018","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"pcbi.1010681.ref021","doi-asserted-by":"crossref","first-page":"e3000314","DOI":"10.1371\/journal.pbio.3000314","article-title":"Detecting T-cell receptors involved in immune responses from single repertoire snapshots","volume":"17","author":"MV Pogorelyy","year":"2019","journal-title":"PLoS Biol"},{"key":"pcbi.1010681.ref022","doi-asserted-by":"crossref","first-page":"2820","DOI":"10.3389\/fimmu.2019.02820","article-title":"Detection of enriched T cell epitope specificity in full T cell receptor sequence repertoires","volume":"10","author":"S Gielis","year":"2019","journal-title":"Frontiers in Immunology"},{"key":"pcbi.1010681.ref023","doi-asserted-by":"crossref","unstructured":"Jurtz VI, Jessen LE, Bentzen AK, Jespersen MC, Mahajan S, Vita R, et al. NetTCR: sequence-based prediction of TCR binding to peptide-MHC complexes using convolutional neural networks. bioRxiv. 2018;Available from: https:\/\/www.biorxiv.org\/content\/early\/2018\/10\/02\/433706.","DOI":"10.1101\/433706"},{"key":"pcbi.1010681.ref024","doi-asserted-by":"crossref","unstructured":"Jokinen E, Huuhtanen J, Mustjoki S, Heinonen M, L\u00e4hdesm\u00e4ki H. Determining epitope specificity of T cell receptors with TCRGP. bioRxiv. 2019;Available from: https:\/\/www.biorxiv.org\/content\/early\/2019\/08\/21\/542332.","DOI":"10.1101\/542332"},{"issue":"7661","key":"pcbi.1010681.ref025","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1038\/nature22383","article-title":"Quantifiable predictive features define epitope-specific T cell receptor repertoires","volume":"547","author":"P Dash","year":"2017","journal-title":"Nature"},{"issue":"7661","key":"pcbi.1010681.ref026","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1038\/nature22976","article-title":"Identifying specificity groups in the T cell receptor repertoire","volume":"547","author":"J Glanville","year":"2017","journal-title":"Nature"},{"key":"pcbi.1010681.ref027","doi-asserted-by":"crossref","DOI":"10.1038\/s41587-020-0505-4","article-title":"Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening","author":"H Huang","year":"2020","journal-title":"Nat Biotechnol"},{"key":"pcbi.1010681.ref028","doi-asserted-by":"crossref","DOI":"10.7554\/eLife.68605","article-title":"TCR meta-clonotypes for biomarker discovery with tcrdist3 enabled identification of public, HLA-restricted clusters of SARS-CoV-2 TCRs","volume":"10","author":"K Mayer-Blackwell","year":"2021","journal-title":"Elife"},{"key":"pcbi.1010681.ref029","unstructured":"Cuturi M. Sinkhorn distances: Lightspeed computation of optimal transport. In: Advances in neural information processing systems; 2013. p. 2292\u20132300."},{"issue":"7661","key":"pcbi.1010681.ref030","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1038\/nature22383","article-title":"Quantifiable predictive features define epitope specific T cell receptor repertoires","volume":"547","author":"P Dash","year":"2017","journal-title":"Nature"},{"key":"pcbi.1010681.ref031","doi-asserted-by":"crossref","unstructured":"Vershik AM. Long History of the Monge-Kantorovich Transportation Problem. Math Intelligencer. 2013 Dec;35(4):1\u20139. Available from: https:\/\/doi.org\/10.1007\/s00283-013-9380-x.","DOI":"10.1007\/s00283-013-9380-x"},{"key":"pcbi.1010681.ref032","volume-title":"Lectures on the Coupling Method","author":"T Lindvall","year":"1992"},{"key":"pcbi.1010681.ref033","volume-title":"Harmonic analysis and applications","author":"J Benedetto","year":"1997"},{"issue":"22","key":"pcbi.1010681.ref034","doi-asserted-by":"crossref","first-page":"10915","DOI":"10.1073\/pnas.89.22.10915","article-title":"Amino acid substitution matrices from protein blocks","volume":"89","author":"S Henikoff","year":"1992","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"1","key":"pcbi.1010681.ref035","first-page":"20","article-title":"segmented: an R Package to Fit Regression Models with Broken-Line Relationships","volume":"8","author":"VMR Muggeo","year":"2008","journal-title":"R News"},{"issue":"9","key":"pcbi.1010681.ref036","doi-asserted-by":"crossref","first-page":"755","DOI":"10.1093\/bioinformatics\/14.9.755","article-title":"Profile hidden Markov models","volume":"14","author":"SR Eddy","year":"1998","journal-title":"Bioinformatics"},{"issue":"7","key":"pcbi.1010681.ref037","article-title":"HMM Logos for visualization of protein families","volume":"5","author":"B Schuster-B\u00f6ckler","year":"2004","journal-title":"BMC Bioinformatics"},{"issue":"1","key":"pcbi.1010681.ref038","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/1471-2105-15-7","article-title":"Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models","volume":"15","author":"TJ Wheeler","year":"2014","journal-title":"BMC Bioinformatics"},{"issue":"14","key":"pcbi.1010681.ref039","doi-asserted-by":"crossref","first-page":"3059","DOI":"10.1093\/nar\/gkf436","article-title":"MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform","volume":"30","author":"K Katoh","year":"2002","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010681.ref040","article-title":"Intestinal Intraepithelial Lymphocyte Repertoires are Imprinted Clonal Structures Selected for MHC Reactivity","author":"SA Schattgen","year":"2019","journal-title":"Sneak Peek"},{"issue":"6","key":"pcbi.1010681.ref041","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1038\/nmeth.2960","article-title":"Towards error-free profiling of immune repertoires","volume":"11","author":"M Shugay","year":"2014","journal-title":"Nat Methods"},{"key":"pcbi.1010681.ref042","unstructured":"10XGenomics. A new way of exploring immunity: linking highly multiplexed antigen recognition to immune repertoire and phenotype; 2020. Retrieved from the 10X Genomics website: https:\/\/pages.10xgenomics.com\/rs\/446-PBO-704\/images\/10x_AN047_IP_A_New_Way_of_Exploring_Immunity_Digital.pdf (2022\/08\/30)."},{"key":"pcbi.1010681.ref043","first-page":"1","article-title":"Integrating T cell receptor sequences and transcriptional profiles by clonotype neighbor graph analysis (CoNGA)","author":"SA Schattgen","year":"2021","journal-title":"Nat Biotechnol"},{"issue":"78","key":"pcbi.1010681.ref044","first-page":"1","article-title":"POT: Python Optimal Transport","volume":"22","author":"R Flamary","year":"2021","journal-title":"Journal of Machine Learning Research"},{"issue":"D1","key":"pcbi.1010681.ref045","doi-asserted-by":"crossref","first-page":"D419","DOI":"10.1093\/nar\/gkx760","article-title":"VDJdb: a curated database of T-cell receptor sequences with known antigen specificity","volume":"46","author":"M Shugay","year":"2018","journal-title":"Nucleic Acids Res"},{"key":"pcbi.1010681.ref046","doi-asserted-by":"crossref","first-page":"e46935","DOI":"10.7554\/eLife.46935","article-title":"Deep generative models for T cell receptor protein sequences","volume":"8","author":"K Davidsen","year":"2019","journal-title":"eLife"}],"container-title":["PLOS Computational Biology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010681","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,3,16]],"date-time":"2023-03-16T11:54:32Z","timestamp":1678967672000},"score":1,"resource":{"primary":{"URL":"https:\/\/dx.plos.org\/10.1371\/journal.pcbi.1010681"}},"subtitle":[],"editor":[{"given":"Andrew J.","family":"Yates","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2022,12,7]]},"references-count":46,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2022,12,7]]}},"URL":"https:\/\/doi.org\/10.1371\/journal.pcbi.1010681","relation":{},"ISSN":["1553-7358"],"issn-type":[{"value":"1553-7358","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,7]]}}}