{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,21]],"date-time":"2026-05-21T02:15:20Z","timestamp":1779329720329,"version":"3.51.4"},"reference-count":18,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2020,7,16]],"date-time":"2020-07-16T00:00:00Z","timestamp":1594857600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,7,16]],"date-time":"2020-07-16T00:00:00Z","timestamp":1594857600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000060","name":"National Institute of Allergy and Infectious Diseases","doi-asserted-by":"publisher","award":["U19 AI117905"],"award-info":[{"award-number":["U19 AI117905"]}],"id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000060","name":"National Institute of Allergy and Infectious Diseases","doi-asserted-by":"publisher","award":["HHSN272201400024C"],"award-info":[{"award-number":["HHSN272201400024C"]}],"id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100018183","name":"Human Vaccines Project","doi-asserted-by":"crossref","award":["NA"],"award-info":[{"award-number":["NA"]}],"id":[{"id":"10.13039\/100018183","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2020,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n<jats:title>Background<\/jats:title>\n<jats:p>Recent advances in DNA sequencing technologies have enabled significant leaps in capacity to generate large volumes of DNA sequence data, which has spurred a rapid growth in the use of bioinformatics as a means of interrogating antibody variable gene repertoires. Common tools used for annotation of antibody sequences are often limited in functionality, modularity and usability.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Results<\/jats:title>\n<jats:p>We have developed PyIR, a Python wrapper and library for IgBLAST, which offers a minimal setup CLI and API, FASTQ support, file chunking for large sequence files, JSON and Python dictionary output, and built-in sequence filtering.<\/jats:p>\n<\/jats:sec><jats:sec>\n<jats:title>Conclusions<\/jats:title>\n<jats:p>PyIR offers improved processing speed over multithreaded IgBLAST (version 1.14) when spawning more than 16 processes on a single computer system. Its customizable filtering and data encapsulation allow it to be adapted to a wide range of computing environments. The API allows for IgBLAST to be used in customized bioinformatics workflows.<\/jats:p>\n<\/jats:sec>","DOI":"10.1186\/s12859-020-03649-5","type":"journal-article","created":{"date-parts":[[2020,7,16]],"date-time":"2020-07-16T10:03:28Z","timestamp":1594893808000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":31,"title":["PyIR: a scalable wrapper for processing billions of immunoglobulin and T cell receptor sequences using IgBLAST"],"prefix":"10.1186","volume":"21","author":[{"given":"Cinque","family":"Soto","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jessica A.","family":"Finn","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jordan R.","family":"Willis","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Samuel B.","family":"Day","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Robert S.","family":"Sinkovits","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Taylor","family":"Jones","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Samuel","family":"Schmitz","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jens","family":"Meiler","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andre","family":"Branchizio","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0049-1079","authenticated-orcid":false,"suffix":"Jr","given":"James E.","family":"Crowe","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,7,16]]},"reference":[{"issue":"7744","key":"3649_CR1","doi-asserted-by":"publisher","first-page":"398","DOI":"10.1038\/s41586-019-0934-8","volume":"566","author":"C Soto","year":"2019","unstructured":"Soto C, Bombardi RG, Branchizio A, Kose N, Matta P, Sevy AM, Sinkovits RS, Gilchuk P, Finn JA, Crowe JE Jr. High frequency of shared clonotypes in human B cell receptor repertoires. Nature. 2019;566(7744):398\u2013402.","journal-title":"Nature"},{"issue":"7744","key":"3649_CR2","doi-asserted-by":"publisher","first-page":"393","DOI":"10.1038\/s41586-019-0879-y","volume":"566","author":"B Briney","year":"2019","unstructured":"Briney B, Inderbitzin A, Joyce C, Burton DR. Commonality despite exceptional diversity in the baseline human antibody repertoire. Nature. 2019;566(7744):393\u20137.","journal-title":"Nature"},{"issue":"5928","key":"3649_CR3","doi-asserted-by":"publisher","first-page":"807","DOI":"10.1126\/science.1170020","volume":"324","author":"JA Weinstein","year":"2009","unstructured":"Weinstein JA, Jiang N, White RA 3rd, Fisher DS, Quake SR. High-throughput sequencing of the zebrafish antibody repertoire. Science. 2009;324(5928):807\u201310.","journal-title":"Science"},{"issue":"7","key":"3649_CR4","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1038\/gene.2012.28","volume":"13","author":"BS Briney","year":"2012","unstructured":"Briney BS, Willis JR, Crowe JE Jr. Location and length distribution of somatic hypermutation-associated DNA insertions and deletions reveals regions of antibody structural plasticity. Genes Immun. 2012;13(7):523\u20139.","journal-title":"Genes Immun"},{"issue":"16","key":"3649_CR5","doi-asserted-by":"publisher","first-page":"6470","DOI":"10.1073\/pnas.1219320110","volume":"110","author":"J Zhu","year":"2013","unstructured":"Zhu J, Ofek G, Yang Y, Zhang B, Louder MK, Lu G, McKee K, Pancera M, Skinner J, Zhang Z, et al. Mining the antibodyome for HIV-1-neutralizing antibodies with next-generation sequencing and phylogenetic pairing of heavy\/light chains. Proc Natl Acad Sci U S A. 2013;110(16):6470\u20135.","journal-title":"Proc Natl Acad Sci U S A"},{"key":"3649_CR6","doi-asserted-by":"crossref","unstructured":"Smakaj E, Babrak L, Ohlin M, Shugay M, Briney B, Tosoni D, Galli C, Grobelsek V, D'Angelo I, Olson B, et al. Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences. Bioinformatics.\u00a02020;36(6):1731-39.","DOI":"10.1093\/bioinformatics\/btz845"},{"issue":"Web Server issu","key":"3649_CR7","doi-asserted-by":"publisher","first-page":"W34","DOI":"10.1093\/nar\/gkt382","volume":"41","author":"J Ye","year":"2013","unstructured":"Ye J, Ma N, Madden TL, Ostell JM. IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res. 2013;41(Web Server issue):W34\u201340.","journal-title":"Nucleic Acids Res"},{"key":"3649_CR8","doi-asserted-by":"publisher","first-page":"569","DOI":"10.1007\/978-1-61779-842-9_32","volume":"882","author":"E Alamyar","year":"2012","unstructured":"Alamyar E, Duroux P, Lefranc MP, Giudicelli V. IMGT((R)) tools for the nucleotide analysis of immunoglobulin (IG) and T cell receptor (TR) V-(D)-J repertoires, polymorphisms, and IG mutations: IMGT\/V-QUEST and IMGT\/HighV-QUEST for NGS. Methods Mol Biol. 2012;882:569\u2013604.","journal-title":"Methods Mol Biol"},{"key":"3649_CR9","doi-asserted-by":"publisher","first-page":"23901","DOI":"10.1038\/srep23901","volume":"6","author":"B Briney","year":"2016","unstructured":"Briney B, Le K, Zhu J, Burton DR. Clonify: unseeded antibody lineage assignment from next-generation sequencing data. Sci Rep. 2016;6:23901.","journal-title":"Sci Rep"},{"issue":"13","key":"3649_CR10","doi-asserted-by":"publisher","first-page":"1930","DOI":"10.1093\/bioinformatics\/btu138","volume":"30","author":"JA Vander Heiden","year":"2014","unstructured":"Vander Heiden JA, Yaari G, Uduman M, Stern JN, O'Connor KC, Hafler DA, Vigneault F, Kleinstein SH. pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics. 2014;30(13):1930\u20132.","journal-title":"Bioinformatics"},{"issue":"1","key":"3649_CR11","doi-asserted-by":"publisher","first-page":"108","DOI":"10.1111\/imr.12480","volume":"275","author":"PD Kwong","year":"2017","unstructured":"Kwong PD, Chuang GY, DeKosky BJ, Gindin T, Georgiev IS, Lemmin T, Schramm CA, Sheng Z, Soto C, Yang AS, et al. Antibodyomics: bioinformatics technologies for understanding B-cell immunity to HIV-1. Immunol Rev. 2017;275(1):108\u201328.","journal-title":"Immunol Rev"},{"key":"3649_CR12","doi-asserted-by":"publisher","first-page":"13642","DOI":"10.1038\/ncomms13642","volume":"7","author":"MM Corcoran","year":"2016","unstructured":"Corcoran MM, Phad GE, Vazquez Bernat N, Stahl-Hennig C, Sumida N, Persson MA, Martin M, Karlsson Hedestam GB. Production of individualized V gene databases reveals high levels of immunoglobulin genetic diversity. Nat Commun. 2016;7:13642.","journal-title":"Nat Commun"},{"issue":"Database issue","key":"3649_CR13","doi-asserted-by":"publisher","first-page":"D413","DOI":"10.1093\/nar\/gku1056","volume":"43","author":"MP Lefranc","year":"2015","unstructured":"Lefranc MP, Giudicelli V, Duroux P, Jabado-Michaloud J, Folch G, Aouinti S, Carillon E, Duvergey H, Houles A, Paysan-Lafosse T, et al. IMGT(R), the international ImMunoGeneTics information system(R) 25 years on. Nucleic Acids Res. 2015;43(Database issue):D413\u201322.","journal-title":"Nucleic Acids Res"},{"issue":"3","key":"3649_CR14","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","volume":"215","author":"SF Altschul","year":"1990","unstructured":"Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403\u201310.","journal-title":"J Mol Biol"},{"key":"3649_CR15","unstructured":"Adaptive Immune Receptor Repertoire (AIRR) Community of the Antibody Society. http:\/\/docs.airr-community.org\/en\/latest\/. Accessed 27 Jan 2020."},{"key":"3649_CR16","doi-asserted-by":"publisher","first-page":"2365","DOI":"10.3389\/fimmu.2019.02365","volume":"10","author":"Y Guo","year":"2019","unstructured":"Guo Y, Chen K, Kwong PD, Shapiro L, Sheng Z. cAb-rep: a database of curated antibody repertoires for exploring antibody diversity and predicting antibody prevalence. Front Immunol. 2019;10:2365.","journal-title":"Front Immunol"},{"key":"3649_CR17","doi-asserted-by":"publisher","first-page":"899","DOI":"10.3389\/fimmu.2019.00899","volume":"10","author":"L L\u00f3pez-Santib\u00e1\u00f1ez-J\u00e1come","year":"2019","unstructured":"L\u00f3pez-Santib\u00e1\u00f1ez-J\u00e1come L, Er\u00e9ndira Avenda\u00f1o-V\u00e1zquez S, Flores-Jasso CF. The pipeline repertoire for Ig-Seq analysis. Front Immunol. 2019;10:899.","journal-title":"Front Immunol"},{"issue":"6","key":"3649_CR18","doi-asserted-by":"publisher","first-page":"1731","DOI":"10.1093\/bioinformatics\/btz845","volume":"36","author":"E Smakaj","year":"2020","unstructured":"Smakaj E, Babrak L, Ohlin M, Shugay M, Briney B, Tosoni D, Galli C, Grobelsek V, D\u2019Angelo I, Olson B, et al. Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences. Bioinformatics. 2020;36(6):1731\u20139.","journal-title":"Bioinformatics"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03649-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-020-03649-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-020-03649-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,15]],"date-time":"2021-07-15T23:07:13Z","timestamp":1626390433000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-020-03649-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,7,16]]},"references-count":18,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2020,12]]}},"alternative-id":["3649"],"URL":"https:\/\/doi.org\/10.1186\/s12859-020-03649-5","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,7,16]]},"assertion":[{"value":"23 February 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 July 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 July 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Not applicable.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"314"}}