{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T04:30:21Z","timestamp":1776054621029,"version":"3.50.1"},"reference-count":49,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T00:00:00Z","timestamp":1715817600000},"content-version":"vor","delay-in-days":50,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100012579","name":"Iowa Science Foundation","doi-asserted-by":"publisher","award":["2940\/21"],"award-info":[{"award-number":["2940\/21"]}],"id":[{"id":"10.13039\/100012579","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000060","name":"National Institute of Allergy and Infectious Diseases","doi-asserted-by":"publisher","award":["U24AI177622"],"award-info":[{"award-number":["U24AI177622"]}],"id":[{"id":"10.13039\/100000060","id-type":"DOI","asserted-by":"publisher"}]},{"name":"European Union\u2019s Horizon 2020 research and innovation program","award":["825821"],"award-info":[{"award-number":["825821"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,3,27]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Enhancing the reproducibility and comprehension of adaptive immune receptor repertoire sequencing (AIRR-seq) data analysis is critical for scientific progress. This study presents guidelines for reproducible AIRR-seq data analysis, and a collection of ready-to-use pipelines with comprehensive documentation. To this end, ten common pipelines were implemented using ViaFoundry, a user-friendly interface for pipeline management and automation. This is accompanied by versioned containers, documentation and archiving capabilities. The automation of pre-processing analysis steps and the ability to modify pipeline parameters according to specific research needs are emphasized. AIRR-seq data analysis is highly sensitive to varying parameters and setups; using the guidelines presented here, the ability to reproduce previously published results is demonstrated. This work promotes transparency, reproducibility, and collaboration in AIRR-seq data analysis, serving as a model for handling and documenting bioinformatics pipelines in other research domains.<\/jats:p>","DOI":"10.1093\/bib\/bbae221","type":"journal-article","created":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T12:03:45Z","timestamp":1715861025000},"source":"Crossref","is-referenced-by-count":6,"title":["Guidelines for reproducible analysis of adaptive immune receptor repertoire sequencing data"],"prefix":"10.1093","volume":"25","author":[{"given":"Ayelet","family":"Peres","sequence":"first","affiliation":[{"name":"Faculty of Engineering, Bar Ilan University , 5290002 Ramat Gan , Israel"},{"name":"Bar Ilan institute of nanotechnology and advanced materials, Bar Ilan university , 5290002 Ramat Gan , Israel"}]},{"given":"Vered","family":"Klein","sequence":"additional","affiliation":[{"name":"Faculty of Engineering, Bar Ilan University , 5290002 Ramat Gan , Israel"},{"name":"Bar Ilan institute of nanotechnology and advanced materials, Bar Ilan university , 5290002 Ramat Gan , Israel"}]},{"given":"Boaz","family":"Frankel","sequence":"additional","affiliation":[{"name":"Faculty of Engineering, Bar Ilan University , 5290002 Ramat Gan , Israel"},{"name":"Bar Ilan institute of nanotechnology and advanced materials, Bar Ilan university , 5290002 Ramat Gan , Israel"}]},{"given":"William","family":"Lees","sequence":"additional","affiliation":[{"name":"Institute of Structural and Molecular Biology, Birkbeck College , University of London, London , United Kingdom"},{"name":"INESC TEC \u2013 Institute for Systems and Computer Engineering, Technology and Science Porto , Portugal"}]},{"given":"Pazit","family":"Polak","sequence":"additional","affiliation":[{"name":"Faculty of Engineering, Bar Ilan University , 5290002 Ramat Gan , Israel"},{"name":"Bar Ilan institute of nanotechnology and advanced materials, Bar Ilan university , 5290002 Ramat Gan , Israel"}]},{"given":"Mark","family":"Meehan","sequence":"additional","affiliation":[{"name":"INESC TEC \u2013 Institute for Systems and Computer Engineering, Technology and Science Porto , Portugal"}]},{"given":"Artur","family":"Rocha","sequence":"additional","affiliation":[{"name":"INESC TEC \u2013 Institute for Systems and Computer Engineering, Technology and Science Porto , Portugal"}]},{"given":"Jo\u00e3o","family":"Correia Lopes","sequence":"additional","affiliation":[{"name":"INESC TEC \u2013 Institute for Systems and Computer Engineering, Technology and Science Porto , Portugal"}]},{"given":"Gur","family":"Yaari","sequence":"additional","affiliation":[{"name":"Faculty of Engineering, Bar Ilan University , 5290002 Ramat Gan , Israel"},{"name":"Bar Ilan institute of nanotechnology and advanced materials, Bar Ilan university , 5290002 Ramat Gan , Israel"}]}],"member":"286","published-online":{"date-parts":[[2024,5,15]]},"reference":[{"issue":"1","key":"2024051612033125800_ref1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/sdata.2016.18","article-title":"The fair guiding principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Scientific data"},{"issue":"10","key":"2024051612033125800_ref2","doi-asserted-by":"crossref","first-page":"e1003285","DOI":"10.1371\/journal.pcbi.1003285","article-title":"Ten simple rules for reproducible computational research","volume":"9","author":"Sandve","year":"2013","journal-title":"PLoS Comput Biol"},{"issue":"6060","key":"2024051612033125800_ref3","doi-asserted-by":"crossref","first-page":"1226","DOI":"10.1126\/science.1213847","article-title":"Reproducible research in computational science","volume":"334","author":"Peng","year":"2011","journal-title":"Science"},{"issue":"10","key":"2024051612033125800_ref4","doi-asserted-by":"crossref","first-page":"1161","DOI":"10.1038\/s41592-021-01254-9","article-title":"Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers","volume":"18","author":"Wratten","year":"2021","journal-title":"Nat Methods"},{"issue":"19","key":"2024051612033125800_ref5","doi-asserted-by":"crossref","first-page":"2520","DOI":"10.1093\/bioinformatics\/bts480","article-title":"Snakemake - a scalable bioinformatics workflow engine","volume":"28","author":"K\u00f6ster","year":"2012","journal-title":"Bioinformatics"},{"issue":"1381","key":"2024051612033125800_ref6","article-title":"Full-stack genomics pipelining with gatk4 + wdl + Cromwell [version 1; not peer reviewed]","volume":"6","author":"Voss","year":"2017","journal-title":"ISCB Comm J"},{"issue":"W1","key":"2024051612033125800_ref7","doi-asserted-by":"crossref","first-page":"W537","DOI":"10.1093\/nar\/gky379","article-title":"The galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update","volume":"46","author":"Afgan","year":"2018","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"2024051612033125800_ref8","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1038\/nbt.3772","article-title":"Toil enables reproducible, open source, big biomedical data analyses","volume":"35","author":"Vivian","year":"2017","journal-title":"Nat Biotechnol"},{"issue":"4","key":"2024051612033125800_ref9","doi-asserted-by":"crossref","first-page":"316","DOI":"10.1038\/nbt.3820","article-title":"Nextflow enables reproducible computational workflows","volume":"35","author":"Di Tommaso","year":"2017","journal-title":"Nat Biotechnol"},{"key":"2024051612033125800_ref10","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12864-020-6714-x","article-title":"Dolphinnext: a distributed data processing platform for high throughput genomics","volume":"21","author":"Yukselen","year":"2020","journal-title":"BMC Genomics"},{"issue":"2","key":"2024051612033125800_ref11","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1038\/ng.295","article-title":"Repeatability of published microarray gene expression analyses","volume":"41","author":"Ioannidis","year":"2009","journal-title":"Nat Genet"},{"issue":"7","key":"2024051612033125800_ref12","doi-asserted-by":"crossref","first-page":"giy077","DOI":"10.1093\/gigascience\/giy077","article-title":"Experimenting with reproducibility: a case study of robustness in bioinformatics","volume":"7","author":"Kim","year":"2018","journal-title":"GigaScience"},{"key":"2024051612033125800_ref13","doi-asserted-by":"crossref","first-page":"1418","DOI":"10.3389\/fimmu.2017.01418","article-title":"Reproducibility and reuse of adaptive immune receptor repertoire data","volume":"8","author":"Breden","year":"2017","journal-title":"Front Immunol"},{"key":"2024051612033125800_ref14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13073-015-0243-2","article-title":"Practical guidelines for b-cell receptor repertoire sequencing analysis","volume":"7","author":"Yaari","year":"2015","journal-title":"Genome Med"},{"issue":"13","key":"2024051612033125800_ref15","doi-asserted-by":"crossref","first-page":"1930","DOI":"10.1093\/bioinformatics\/btu138","article-title":"Presto: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires","volume":"30","author":"Vander","year":"2014","journal-title":"Bioinformatics"},{"issue":"1","key":"2024051612033125800_ref16","first-page":"13642","article-title":"Production of individualized v gene databases reveals high levels of immunoglobulin genetic diversity. Nature","volume":"7","author":"Corcoran","year":"2016","journal-title":"Communications"},{"issue":"5","key":"2024051612033125800_ref17","doi-asserted-by":"crossref","first-page":"380","DOI":"10.1038\/nmeth.3364","article-title":"Mixcr: software for comprehensive adaptive immunity profiling","volume":"12","author":"Bolotin","year":"2015","journal-title":"Nat Methods"},{"issue":"W1","key":"2024051612033125800_ref18","doi-asserted-by":"crossref","first-page":"W34","DOI":"10.1093\/nar\/gkt382","article-title":"Igblast: an immunoglobulin variable domain sequence analysis tool","volume":"41","author":"Ye","year":"2013","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2024051612033125800_ref19","doi-asserted-by":"crossref","DOI":"10.1371\/journal.pcbi.1004409","article-title":"Consistency of vdj rearrangement and substitution parameters enables accurate b cell receptor sequence annotation","volume":"12","author":"Ralph","year":"2016","journal-title":"PLoS Comput Biol"},{"issue":"20","key":"2024051612033125800_ref20","doi-asserted-by":"crossref","first-page":"3356","DOI":"10.1093\/bioinformatics\/btv359","article-title":"Change-o: a toolkit for analyzing large-scale b cell immunoglobulin repertoire sequencing data","volume":"31","author":"Gupta","year":"2015","journal-title":"Bioinformatics"},{"issue":"13","key":"2024051612033125800_ref21","doi-asserted-by":"crossref","first-page":"i341","DOI":"10.1093\/bioinformatics\/bty235","article-title":"A spectral clustering-based method for identifying clones from high-throughput b cell repertoire sequencing data","volume":"34","author":"Nouri","year":"2018","journal-title":"Bioinformatics"},{"issue":"4","key":"2024051612033125800_ref22","doi-asserted-by":"crossref","first-page":"e21","DOI":"10.1093\/nar\/gkaa1160","article-title":"Alignment free identification of clones in b cell receptor repertoires","volume":"49","author":"Lindenbaum","year":"2021","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2024051612033125800_ref23","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1093\/nar\/27.1.209","article-title":"Imgt, the international immunogenetics database","volume":"27","author":"Lefranc","year":"1999","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2024051612033125800_ref24","doi-asserted-by":"crossref","first-page":"D964","DOI":"10.1093\/nar\/gkz822","article-title":"Ogrdb: a reference database of inferred immune receptor genes","volume":"48","author":"Lees","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2024051612033125800_ref25","doi-asserted-by":"crossref","DOI":"10.1016\/j.immuno.2023.100025","article-title":"Airr community curation and standardised representation for immunoglobulin and t cell receptor germline sets","volume":"10","author":"Lees","year":"2023","journal-title":"ImmunoInformatics"},{"issue":"16","key":"2024051612033125800_ref26","doi-asserted-by":"crossref","first-page":"e86","DOI":"10.1093\/nar\/gkad603","article-title":"IGHV allele similarity clustering improves genotype inference from adaptive immune receptor repertoire sequencing data","volume":"51","author":"Peres","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"2024051612033125800_ref27","doi-asserted-by":"crossref","first-page":"D1051","DOI":"10.1093\/nar\/gkz872","article-title":"Vdjbase: an adaptive immune receptor genotype and haplotype database","volume":"48","author":"Omer","year":"2020","journal-title":"Nucleic Acids Res"},{"key":"2024051612033125800_ref28","first-page":"109","article-title":"Mining adaptive immune receptor repertoires for biological and clinical information using machine learning. Current opinion","volume":"24","author":"Greiff","year":"2020","journal-title":"Syst Biol"},{"issue":"11","key":"2024051612033125800_ref29","first-page":"936","article-title":"The immuneml ecosystem for machine learning analysis of adaptive immune receptor repertoires. Nature","volume":"3","author":"Pavlovi\u0107","year":"2021","journal-title":"Machine Intelligence"},{"key":"2024051612033125800_ref30","doi-asserted-by":"crossref","first-page":"2206","DOI":"10.3389\/fimmu.2018.02206","article-title":"Airr community standardized representations for annotated immune repertoires","volume":"9","author":"Heiden","year":"2018","journal-title":"Front Immunol"},{"issue":"12","key":"2024051612033125800_ref31","doi-asserted-by":"crossref","first-page":"1274","DOI":"10.1038\/ni.3873","article-title":"Adaptive immune receptor repertoire community recommendations for sharing immune-repertoire sequencing data","volume":"18","author":"Rubelt","year":"2017","journal-title":"Nat Immunol"},{"issue":"239","key":"2024051612033125800_ref32","first-page":"2","article-title":"Docker: lightweight linux containers for consistent development and deployment","volume":"2014","author":"Merkel","year":"2014","journal-title":"Linux journal"},{"issue":"5","key":"2024051612033125800_ref33","doi-asserted-by":"crossref","first-page":"e0177459","DOI":"10.1371\/journal.pone.0177459","article-title":"Singularity: scientific containers for mobility of compute","volume":"12","author":"Kurtzer","year":"2017","journal-title":"PloS One"},{"key":"2024051612033125800_ref34","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4842-0076-6","volume-title":"Pro git","author":"Chacon","year":"2014"},{"key":"2024051612033125800_ref35","volume-title":"European Organization For Nuclear Research and OpenAIRE","year":"2013"},{"issue":"248","key":"2024051612033125800_ref36","doi-asserted-by":"crossref","first-page":"248ra107","DOI":"10.1126\/scitranslmed.3008879","article-title":"B cells populating the multiple sclerosis brain mature in the draining cervical lymph nodes","volume":"6","author":"Stern","year":"2014","journal-title":"Sci Transl Med"},{"key":"2024051612033125800_ref37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12865-014-0040-5","article-title":"Quantitative assessment of the robustness of next-generation sequencing of antibody variable gene repertoires from immunized mice","volume":"15","author":"Greiff","year":"2014","journal-title":"BMC Immunol"},{"issue":"1","key":"2024051612033125800_ref38","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41467-019-08489-3","article-title":"Mosaic deletion patterns of the human antibody heavy chain gene locus shown by bayesian haplotyping","volume":"10","author":"Gidoni","year":"2019","journal-title":"Nat Commun"},{"key":"2024051612033125800_ref39","doi-asserted-by":"crossref","first-page":"3004","DOI":"10.3389\/fimmu.2018.03004","article-title":"Antibody repertoire analysis of hepatitis c virus infections identifies immune signatures associated with spontaneous clearance","volume":"9","author":"Eliyahu","year":"2018","journal-title":"Front Immunol"},{"key":"2024051612033125800_ref40","doi-asserted-by":"crossref","first-page":"605170","DOI":"10.3389\/fimmu.2020.605170","article-title":"Deep sequencing of b cell receptor repertoires from covid-19 patients reveals strong convergent immune signatures","volume":"11","author":"Galson","year":"2020","journal-title":"Front Immunol"},{"key":"2024051612033125800_ref41","doi-asserted-by":"crossref","first-page":"1031914","DOI":"10.3389\/fimmu.2023.1031914","article-title":"Altered somatic hypermutation patterns in covid-19 patients classifies disease severity","volume":"14","author":"Safra","year":"2023","journal-title":"Front Immunol"},{"issue":"171","key":"2024051612033125800_ref42","doi-asserted-by":"crossref","first-page":"171ra19","DOI":"10.1126\/scitranslmed.3004794","article-title":"Lineage structure of the human antibody repertoire in response to influenza vaccination","volume":"5","author":"Jiang","year":"2013","journal-title":"Sci Transl Med"},{"key":"2024051612033125800_ref43","doi-asserted-by":"crossref","DOI":"10.1016\/j.immuno.2022.100012","article-title":"A nextflow pipeline for t-cell receptor repertoire reconstruction and analysis from rna sequencing data","volume":"6","author":"Rubio","year":"2022","journal-title":"ImmunoInformatics"},{"issue":"1","key":"2024051612033125800_ref44","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1101\/gr.276683.122","article-title":"A somatic hypermutation\u2013based machine learning model stratifies individuals with crohn\u2019s disease and controls","volume":"33","author":"Safra","year":"2023","journal-title":"Genome Res"},{"issue":"suppl_1","key":"2024051612033125800_ref45","first-page":"D28","article-title":"The european nucleotide archive","volume":"39","author":"Leinonen","year":"2010","journal-title":"Nucleic Acids Res"},{"issue":"suppl_1","key":"2024051612033125800_ref46","doi-asserted-by":"crossref","first-page":"D5","DOI":"10.1093\/nar\/gkl1031","article-title":"Database resources of the national center for biotechnology information","volume":"35","author":"Wheeler","year":"2007","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"2024051612033125800_ref47","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1111\/imr.12666","article-title":"Ireceptor: a platform for querying and analyzing antibody\/b-cell and t-cell receptor repertoire data across federated repositories","volume":"284","author":"Corrie","year":"2018","journal-title":"Immunol Rev"},{"issue":"1","key":"2024051612033125800_ref48","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1002\/pro.4205","article-title":"Observed antibody space: a diverse database of cleaned, annotated, and translated unpaired and paired antibody sequences","volume":"31","author":"Olsen","year":"2022","journal-title":"Protein Sci"},{"key":"2024051612033125800_ref49","doi-asserted-by":"crossref","DOI":"10.1101\/2023.09.01.555348","article-title":"Airr-c human ig reference sets: curated sets of immunoglobulin heavy and light chain germline genes","author":"Collins","year":"2023"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae221\/57678713\/bbae221.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bib\/article-pdf\/25\/3\/bbae221\/57678713\/bbae221.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T12:04:22Z","timestamp":1715861062000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/doi\/10.1093\/bib\/bbae221\/7674246"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,27]]},"references-count":49,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,3,27]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bbae221","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5]]},"published":{"date-parts":[[2024,3,27]]},"article-number":"bbae221"}}