{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T07:51:50Z","timestamp":1773733910081,"version":"3.50.1"},"reference-count":46,"publisher":"Oxford University Press (OUP)","issue":"5","license":[{"start":{"date-parts":[[2017,10,23]],"date-time":"2017-10-23T00:00:00Z","timestamp":1508716800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["R01 HG008354 and R21 HG009255"],"award-info":[{"award-number":["R01 HG008354 and R21 HG009255"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","award":["R21 CA205172"],"award-info":[{"award-number":["R21 CA205172"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,3,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>Barcode sequencing (bar-seq) is a high-throughput, and cost effective method to assay large numbers of cell lineages or genotypes in complex cell pools. Because of its advantages, applications for bar-seq are quickly growing\u2014from using neutral random barcodes to study the evolution of microbes or cancer, to using pseudo-barcodes, such as shRNAs or sgRNAs to simultaneously screen large numbers of cell perturbations. However, the computational pipelines for bar-seq clustering are not well developed. Available methods often yield a high frequency of under-clustering artifacts that result in spurious barcodes, or over-clustering artifacts that group distinct barcodes together. 
Here, we developed Bartender, an accurate clustering algorithm to detect barcodes and their abundances from raw next-generation sequencing data.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>In contrast with existing methods that cluster based on sequence similarity alone, Bartender uses a modified two-sample proportion test that also considers cluster size. This modification results in higher accuracy and lower rates of under- and over-clustering artifacts. Additionally, Bartender includes unique molecular identifier handling and a \u2018multiple time point\u2019 mode that matches barcode clusters between different clustering runs for seamless handling of time course data. Bartender is a set of simple-to-use command line tools that can be performed on a laptop at comparable run times to existing methods.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Bartender is available at no charge for non-commercial use at https:\/\/github.com\/LaoZZZZZ\/bartender-1.1.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx655","type":"journal-article","created":{"date-parts":[[2017,10,18]],"date-time":"2017-10-18T19:10:45Z","timestamp":1508353845000},"page":"739-747","source":"Crossref","is-referenced-by-count":100,"title":["Bartender: a fast and accurate clustering algorithm to count barcode reads"],"prefix":"10.1093","volume":"34","author":[{"given":"Lu","family":"Zhao","sequence":"first","affiliation":[{"name":"Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, 
USA"}]},{"given":"Zhimin","family":"Liu","sequence":"additional","affiliation":[{"name":"Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA"},{"name":"Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, NY, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0923-1636","authenticated-orcid":false,"given":"Sasha F","family":"Levy","sequence":"additional","affiliation":[{"name":"Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, USA"},{"name":"Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, NY, USA"}]},{"given":"Song","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA"}]}],"member":"286","published-online":{"date-parts":[[2017,10,23]]},"reference":[{"key":"2023012712394341300_btx655-B1","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1016\/S0022-2836(05)80360-2","article-title":"Basic local alignment search tool","volume":"215","author":"Altschul","year":"1990","journal-title":"J. Mol. Biol"},{"key":"2023012712394341300_btx655-B2","doi-asserted-by":"crossref","first-page":"2502","DOI":"10.1093\/bioinformatics\/btr447","article-title":"SEED: efficient clustering of next-generation sequences","volume":"27","author":"Bao","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012712394341300_btx655-B3","doi-asserted-by":"crossref","first-page":"443","DOI":"10.1038\/nmeth.1330","article-title":"Rapid creation and quantitative monitoring of high coverage shRNA libraries","volume":"6","author":"Bassik","year":"2009","journal-title":"Nat. 
Methods"},{"key":"2023012712394341300_btx655-B4","doi-asserted-by":"crossref","first-page":"440","DOI":"10.1038\/nm.3841","article-title":"Studying clonal dynamics in response to cancer therapy using high-complexity barcoding","volume":"21","author":"Bhang","year":"2015","journal-title":"Nat. Med"},{"key":"2023012712394341300_btx655-B5","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1016\/j.ygeno.2014.09.005","article-title":"Beyond genome sequencing: lineage tracking with barcodes to study the dynamics of evolution, infection, and cancer","volume":"104","author":"Blundell","year":"2014","journal-title":"Genomics"},{"key":"2023012712394341300_btx655-B6","doi-asserted-by":"crossref","first-page":"581","DOI":"10.1038\/nmeth.3869","article-title":"DADA2: High-resolution sample inference from Illumina amplicon data","volume":"13","author":"Callahan","year":"2016","journal-title":"Nat. Methods"},{"key":"2023012712394341300_btx655-B7","doi-asserted-by":"crossref","first-page":"2732","DOI":"10.1093\/bioinformatics\/bts482","article-title":"Rainbow: an integrated tool for efficient clustering and assembling RAD-seq reads","volume":"28","author":"Chong","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012712394341300_btx655-B8","doi-asserted-by":"crossref","first-page":"2460","DOI":"10.1093\/bioinformatics\/btq461","article-title":"Search and clustering orders of magnitude faster than BLAST","volume":"26","author":"Edgar","year":"2010","journal-title":"Bioinformatics"},{"key":"2023012712394341300_btx655-B9","doi-asserted-by":"crossref","first-page":"968","DOI":"10.1038\/ismej.2014.195","article-title":"Minimum entropy decomposition: Unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences","volume":"9","author":"Eren","year":"2014","journal-title":"ISME J"},{"key":"2023012712394341300_btx655-B10","doi-asserted-by":"crossref","first-page":"3150","DOI":"10.1093\/bioinformatics\/bts565","article-title":"CD-HIT: 
accelerated for clustering the next-generation sequencing data","volume":"28","author":"Fu","year":"2012","journal-title":"Bioinformatics"},{"key":"2023012712394341300_btx655-B11","doi-asserted-by":"crossref","first-page":"387","DOI":"10.1038\/nature00935","article-title":"Functional profiling of the Saccharomyces cerevisiae genome","volume":"418","author":"Giaever","year":"2002","journal-title":"Nature"},{"key":"2023012712394341300_btx655-B12","doi-asserted-by":"crossref","first-page":"E4393","DOI":"10.1073\/pnas.1318100110","article-title":"Yeast metabolic and signaling genes are required for heat-shock survival and have little overlap with the heat-induced genes","volume":"110","author":"Gibney","year":"2013","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012712394341300_btx655-B13","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1126\/science.1241934","article-title":"Causes and effects of N-terminal codon bias in bacterial genes","volume":"342","author":"Goodman","year":"2013","journal-title":"Science"},{"key":"2023012712394341300_btx655-B14","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1038\/nmeth.1404","article-title":"Chromatin profiling by directly sequencing small quantities of immunoprecipitated DNA","volume":"7","author":"Goren","year":"2010","journal-title":"Nat. Methods"},{"key":"2023012712394341300_btx655-B15","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1534\/genetics.110.120766","article-title":"System-level analysis of genes and functions affecting survival during nutrient starvation in Saccharomyces cerevisiae","volume":"187","author":"Gresham","year":"2011","journal-title":"Genetics"},{"key":"2023012712394341300_btx655-B16","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.mrfmmm.2011.10.001","article-title":"Direct mutation analysis by high-throughput sequencing: from germline to low-abundant, somatic variants","volume":"729","author":"Gundry","year":"2012","journal-title":"Mutat. 
Res"},{"key":"2023012712394341300_btx655-B17","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1002\/j.1538-7305.1950.tb00463.x","article-title":"Error detecting and error correcting codes","volume":"29","author":"Hamming","year":"1950","journal-title":"Bell Syst. Technical J"},{"key":"2023012712394341300_btx655-B18","doi-asserted-by":"crossref","first-page":"R60","DOI":"10.1186\/gb-2010-11-6-r60","article-title":"Global fitness profiling of fission yeast deletion strains by barcode sequencing","volume":"11","author":"Han","year":"2010","journal-title":"Genome Biol"},{"key":"2023012712394341300_btx655-B19","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1128\/JB.00873-09","article-title":"Small RNAs and small proteins involved in resistance to cell envelope stress and acid shock in Escherichia coli: analysis of a bar-coded mutant collection","volume":"192","author":"Hobbs","year":"2010","journal-title":"J. Bacteriol"},{"key":"2023012712394341300_btx655-B20","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1534\/g3.116.034207","article-title":"iSeq: a new double-barcode method for detecting dynamic genetic interactions in yeast","volume":"7","author":"Jaffe","year":"2017","journal-title":"G3"},{"key":"2023012712394341300_btx655-B21","doi-asserted-by":"crossref","first-page":"72","DOI":"10.1038\/nmeth.1778","article-title":"Counting absolute numbers of molecules using unique molecular identifiers","volume":"9","author":"Kivioja","year":"2011","journal-title":"Nat. Methods"},{"key":"2023012712394341300_btx655-B22","doi-asserted-by":"crossref","first-page":"14024","DOI":"10.1073\/pnas.1301301110","article-title":"Composability of regulatory sequences controlling transcription and translation in Escherichia coli","volume":"110","author":"Kosuri","year":"2013","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023012712394341300_btx655-B23","first-page":"707","article-title":"Binary codes capable of correcting deletion, insertions and reversals","volume":"10","author":"Levenshtein","year":"1966","journal-title":"Soviet Phys. Doklady"},{"key":"2023012712394341300_btx655-B24","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1038\/nature14279","article-title":"Quantitative evolutionary dynamics using high-resolution lineage tracking","volume":"519","author":"Levy","year":"2015","journal-title":"Nature"},{"key":"2023012712394341300_btx655-B25","doi-asserted-by":"crossref","first-page":"928","DOI":"10.1038\/nbt.1977","article-title":"Tracking single hematopoietic stem cells in vivo using high-throughput sequencing in conjunction with viral genetic barcoding","volume":"29","author":"Lu","year":"2011","journal-title":"Nat. Biotechnol"},{"key":"2023012712394341300_btx655-B26","doi-asserted-by":"crossref","first-page":"aaf7907","DOI":"10.1126\/science.aaf7907","article-title":"Whole organism lineage tracing by combinatorial and cumulative genome editing","volume":"353","author":"McKenna","year":"2016","journal-title":"Science"},{"key":"2023012712394341300_btx655-B27","doi-asserted-by":"crossref","first-page":"1687","DOI":"10.1093\/nar\/18.7.1687","article-title":"DNA recombination during PCR","volume":"18","author":"Meyerhans","year":"1990","journal-title":"Nucleic Acids Res"},{"key":"2023012712394341300_btx655-B28","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1038\/nature15742","article-title":"Barcoding reveals complex clonal dynamics of de novo transformed human mammary cells","volume":"528","author":"Nguyen","year":"2015","journal-title":"Nature"},{"key":"2023012712394341300_btx655-B29","first-page":"CD006605","article-title":"Long-term opioid management for chronic noncancer pain","author":"Noble","year":"2010","journal-title":"Cochrane Database Syst. 
Rev"},{"key":"2023012712394341300_btx655-B30","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1186\/1471-2105-12-38","article-title":"Removing noise from pyrosequenced amplicons","volume":"12","author":"Quince","year":"2011","journal-title":"BMC Bioinformatics"},{"key":"2023012712394341300_btx655-B31","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1186\/1471-2105-13-283","article-title":"Denoising PCR-amplified metagenome data","volume":"13","author":"Rosen","year":"2012","journal-title":"BMC Bioinformatics"},{"key":"2023012712394341300_btx655-B32","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1186\/s12859-016-0976-y","article-title":"Illumina error profiles: resolving fine-scale variation in metagenomic sequencing data","volume":"17","author":"Schirmer","year":"2016","journal-title":"BMC Bioinformatics"},{"key":"2023012712394341300_btx655-B33","doi-asserted-by":"crossref","first-page":"620","DOI":"10.1126\/science.1149200","article-title":"Cancer proliferation gene discovery through functional genomics","volume":"319","author":"Schlabach","year":"2008","journal-title":"Science"},{"key":"2023012712394341300_btx655-B34","doi-asserted-by":"crossref","first-page":"15586","DOI":"10.1038\/ncomms15586","article-title":"A scalable double-barcode sequencing platform for characterization of dynamic protein\u2013protein interactions","volume":"8","author":"Schlecht","year":"2017","journal-title":"Nat. Commun"},{"key":"2023012712394341300_btx655-B35","doi-asserted-by":"crossref","first-page":"14508","DOI":"10.1073\/pnas.1208715109","article-title":"Detection of ultra-rare mutations by next-generation sequencing","volume":"109","author":"Schmitt","year":"2012","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"2023012712394341300_btx655-B36","doi-asserted-by":"crossref","first-page":"e1004211","DOI":"10.1371\/journal.ppat.1004211","article-title":"Systematic phenotyping of a large-scale Candida glabrata deletion collection reveals novel antifungal tolerance genes","volume":"10","author":"Schwarzmuller","year":"2014","journal-title":"PLoS Pathog"},{"key":"2023012712394341300_btx655-B37","doi-asserted-by":"crossref","first-page":"464","DOI":"10.1093\/bioinformatics\/btq677","article-title":"SlideSort: all pairs similarity search for short reads","volume":"27","author":"Shimizu","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012712394341300_btx655-B38","doi-asserted-by":"crossref","first-page":"4820","DOI":"10.1073\/pnas.0712136105","article-title":"X-chromosome inactivation and epigenetic fluidity in human embryonic stem cells","volume":"105","author":"Silva","year":"2008","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023012712394341300_btx655-B39","doi-asserted-by":"crossref","first-page":"1117","DOI":"10.1101\/gr.089532.108","article-title":"ABySS: a parallel assembler for short read sequence data","volume":"19","author":"Simpson","year":"2009","journal-title":"Genome Res"},{"key":"2023012712394341300_btx655-B40","doi-asserted-by":"crossref","first-page":"R104","DOI":"10.1186\/gb-2011-12-10-r104","article-title":"High-throughput RNA interference screening using pooled shRNA libraries and next generation sequencing","volume":"12","author":"Sims","year":"2011","journal-title":"Genome Biol"},{"key":"2023012712394341300_btx655-B41","doi-asserted-by":"crossref","first-page":"1122","DOI":"10.1038\/nature08182","article-title":"Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic","volume":"459","author":"Smith","year":"2009","journal-title":"Nature"},{"key":"2023012712394341300_btx655-B42","doi-asserted-by":"crossref","first-page":"e76","DOI":"10.1093\/nar\/gkp285","article-title":"ESPRIT: estimating species 
richness using large collections of 16S rRNA pyrosequences","volume":"37","author":"Sun","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023012712394341300_btx655-B43","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1126\/science.1246981","article-title":"Genetic screens in human cells using the CRISPR-Cas9 system","volume":"343","author":"Wang","year":"2014","journal-title":"Science"},{"key":"2023012712394341300_btx655-B44","doi-asserted-by":"crossref","first-page":"901","DOI":"10.1126\/science.285.5429.901","article-title":"Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis","volume":"285","author":"Winzeler","year":"1999","journal-title":"Science"},{"key":"2023012712394341300_btx655-B45","doi-asserted-by":"crossref","first-page":"952","DOI":"10.1038\/nbt.3326","article-title":"Massively parallel high-order combinatorial genetics in human cells","volume":"33","author":"Wong","year":"2015","journal-title":"Nat. Biotechnol"},{"key":"2023012712394341300_btx655-B46","doi-asserted-by":"crossref","first-page":"1913","DOI":"10.1093\/bioinformatics\/btv053","article-title":"Starcode: sequence clustering based on all-pairs 
search","volume":"31","author":"Zorita","year":"2015","journal-title":"Bioinformatics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/5\/739\/48914361\/bioinformatics_34_5_739.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/5\/739\/48914361\/bioinformatics_34_5_739.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,27]],"date-time":"2023-01-27T13:31:48Z","timestamp":1674826308000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/5\/739\/4562326"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,10,23]]},"references-count":46,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2018,3,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx655","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,3,1]]},"published":{"date-parts":[[2017,10,23]]}}}