{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,28]],"date-time":"2026-03-28T04:39:05Z","timestamp":1774672745522,"version":"3.50.1"},"reference-count":30,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2018,2,22]],"date-time":"2018-02-22T00:00:00Z","timestamp":1519257600000},"content-version":"vor","delay-in-days":1,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["1U01CA220378"],"award-info":[{"award-number":["1U01CA220378"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["2P30CA015083"],"award-info":[{"award-number":["2P30CA015083"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["1U54CA210180"],"award-info":[{"award-number":["1U54CA210180"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,7,19]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Traditional RNA sequencing (RNA-seq) allows the detection of gene expression variations between two or more cell populations through differentially expressed gene (DEG) analysis. However, genes that contribute to cell-to-cell differences are not discoverable with RNA-seq because RNA-seq samples are obtained from a mixture of cells. Single-cell RNA-seq (scRNA-seq) allows the detection of gene expression in each cell. With scRNA-seq, highly variable gene (HVG) discovery allows the detection of genes that contribute strongly to cell-to-cell variation within a homogeneous cell population, such as a population of embryonic stem cells. This analysis is implemented in many software packages. In this study, we compare seven HVG methods from six software packages, including BASiCS, Brennecke, scLVM, scran, scVEGs and Seurat. Our results demonstrate that reproducibility in HVG analysis requires a larger sample size than DEG analysis. Discrepancies between methods and potential issues in these tools are discussed and recommendations are made.<\/jats:p>","DOI":"10.1093\/bib\/bby011","type":"journal-article","created":{"date-parts":[[2018,1,30]],"date-time":"2018-01-30T15:10:40Z","timestamp":1517325040000},"page":"1583-1589","source":"Crossref","is-referenced-by-count":189,"title":["Evaluation of tools for highly variable gene discovery from single-cell RNA-seq data"],"prefix":"10.1093","volume":"20","author":[{"given":"Shun H","family":"Yip","sequence":"first","affiliation":[]},{"given":"Pak Chung","family":"Sham","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4432-4707","authenticated-orcid":false,"given":"Junwen","family":"Wang","sequence":"additional","affiliation":[]}],"member":"286","published-online":{"date-parts":[[2018,2,21]]},"reference":[{"issue":"7","key":"2019100807490162900_bby011-B1","doi-asserted-by":"crossref","first-page":"1160","DOI":"10.1101\/gr.110882.110","article-title":"Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq","volume":"21","author":"Islam","year":"2011","journal-title":"Genome Res"},{"issue":"5","key":"2019100807490162900_bby011-B2","doi-asserted-by":"crossref","first-page":"377","DOI":"10.1038\/nmeth.1315","article-title":"mRNA-Seq whole-transcriptome analysis of a single cell","volume":"6","author":"Tang","year":"2009","journal-title":"Nat Methods"},{"issue":"1","key":"2019100807490162900_bby011-B3","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1038\/nrg2484","article-title":"RNA-Seq: a revolutionary tool for transcriptomics","volume":"10","author":"Wang","year":"2009","journal-title":"Nat Rev Genet"},{"issue":"6","key":"2019100807490162900_bby011-B4","doi-asserted-by":"crossref","first-page":"e1004333.","DOI":"10.1371\/journal.pcbi.1004333","article-title":"BASiCS: Bayesian analysis of single-cell sequencing data","volume":"11","author":"Vallejos","year":"2015","journal-title":"PLoS Comput Biol"},{"issue":"11","key":"2019100807490162900_bby011-B5","doi-asserted-by":"crossref","first-page":"1093","DOI":"10.1038\/nmeth.2645","article-title":"Accounting for technical noise in single-cell RNA-seq experiments","volume":"10","author":"Brennecke","year":"2013","journal-title":"Nat Methods"},{"issue":"2","key":"2019100807490162900_bby011-B6","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1038\/nbt.3102","article-title":"Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells","volume":"33","author":"Buettner","year":"2015","journal-title":"Nat Biotechnol"},{"key":"2019100807490162900_bby011-B7","first-page":"2122.","article-title":"A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor","volume":"5","author":"Lun","year":"2016","journal-title":"F1000Res"},{"issue":"S7","key":"2019100807490162900_bby011-B8","doi-asserted-by":"crossref","first-page":"508","DOI":"10.1186\/s12864-016-2897-6","article-title":"Detection of high variability in gene expression from single-cell RNA-seq profiling","volume":"17","author":"Chen","year":"2016","journal-title":"BMC Genomics"},{"issue":"5","key":"2019100807490162900_bby011-B9","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1038\/nbt.3192","article-title":"Spatial reconstruction of single-cell gene expression data","volume":"33","author":"Satija","year":"2015","journal-title":"Nat Biotechnol"},{"issue":"1","key":"2019100807490162900_bby011-B10","doi-asserted-by":"crossref","first-page":"278","DOI":"10.1186\/s13059-015-0844-5","article-title":"MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data","volume":"16","author":"Finak","year":"2015","journal-title":"Genome Biol"},{"issue":"22","key":"2019100807490162900_bby011-B11","doi-asserted-by":"crossref","DOI":"10.1093\/nar\/gkx1189","article-title":"Linnorm: improved statistical analysis for single cell RNA-seq expression data","volume":"45","author":"Yip","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2019100807490162900_bby011-B12","doi-asserted-by":"crossref","first-page":"R29","DOI":"10.1186\/gb-2014-15-2-r29","article-title":"voom: precision weights unlock linear model analysis tools for RNA-seq read counts","volume":"15","author":"Law","year":"2014","journal-title":"Genome Biol"},{"issue":"1","key":"2019100807490162900_bby011-B13","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1186\/s13059-016-1077-y","article-title":"A statistical approach for identifying differential distributions in single-cell RNA-seq experiments","volume":"17","author":"Korthauer","year":"2016","journal-title":"Genome Biol"},{"key":"2019100807490162900_bby011-B14","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1186\/s13059-016-0947-7","article-title":"Pooling across cells to normalize single-cell RNA sequencing data with many zero counts","volume":"17","author":"Lun","year":"2016","journal-title":"Genome Biol"},{"issue":"4","key":"2019100807490162900_bby011-B15","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1038\/nmeth.4179","article-title":"Seq-well: portable, low-cost RNA sequencing of single cells at high throughput","volume":"14","author":"Gierahn","year":"2017","journal-title":"Nat Methods"},{"issue":"5","key":"2019100807490162900_bby011-B16","doi-asserted-by":"crossref","first-page":"1187","DOI":"10.1016\/j.cell.2015.04.044","article-title":"Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells","volume":"161","author":"Klein","year":"2015","journal-title":"Cell"},{"issue":"6167","key":"2019100807490162900_bby011-B17","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1126\/science.1245316","article-title":"Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells","volume":"343","author":"Deng","year":"2014","journal-title":"Science"},{"issue":"9","key":"2019100807490162900_bby011-B18","doi-asserted-by":"crossref","first-page":"1131","DOI":"10.1038\/nsmb.2660","article-title":"Single-cell RNA-seq profiling of human preimplantation embryos and embryonic stem cells","volume":"20","author":"Yan","year":"2013","journal-title":"Nat Struct Mol Biol"},{"issue":"15","key":"2019100807490162900_bby011-B19","doi-asserted-by":"crossref","first-page":"2114","DOI":"10.1093\/bioinformatics\/btu170","article-title":"Trimmomatic: a flexible trimmer for Illumina sequence data","volume":"30","author":"Bolger","year":"2014","journal-title":"Bioinformatics"},{"issue":"8","key":"2019100807490162900_bby011-B20","doi-asserted-by":"crossref","DOI":"10.1038\/nbt0816-888d","article-title":"Near-optimal probabilistic RNA-seq quantification","volume":"34","author":"Bray","year":"2016","journal-title":"Nat Biotechnol"},{"issue":"D1","key":"2019100807490162900_bby011-B21","doi-asserted-by":"crossref","first-page":"D710","DOI":"10.1093\/nar\/gkv1157","article-title":"Ensembl 2016","volume":"44","author":"Yates","year":"2016","journal-title":"Nucleic Acids Res"},{"issue":"4","key":"2019100807490162900_bby011-B22","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1093\/bib\/bbu033","article-title":"Rediscovery rate estimation for assessing the validation of significant findings in high-throughput studies","volume":"16","author":"Ganna","year":"2015","journal-title":"Brief Bioinform"},{"issue":"6","key":"2019100807490162900_bby011-B23","doi-asserted-by":"crossref","first-page":"591","DOI":"10.1038\/nbt.3498","article-title":"The contribution of cell cycle to heterogeneity in single-cell RNA-seq data","volume":"34","author":"McDavid","year":"2016","journal-title":"Nat Biotechnol"},{"issue":"16","key":"2019100807490162900_bby011-B24","doi-asserted-by":"crossref","first-page":"2614","DOI":"10.1093\/bioinformatics\/btv193","article-title":"EBSeq-HMM: a Bayesian approach for identifying gene-expression changes in ordered RNA-seq experiments","volume":"31","author":"Leng","year":"2015","journal-title":"Bioinformatics"},{"issue":"11","key":"2019100807490162900_bby011-B25","doi-asserted-by":"crossref","first-page":"e91.","DOI":"10.1093\/nar\/gku310","article-title":"Robustly detecting differential expression in RNA sequencing data using observation weights","volume":"42","author":"Zhou","year":"2014","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"2019100807490162900_bby011-B26","doi-asserted-by":"crossref","first-page":"550.","DOI":"10.1186\/s13059-014-0550-8","article-title":"Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2","volume":"15","author":"Love","year":"2014","journal-title":"Genome Biol"},{"issue":"1","key":"2019100807490162900_bby011-B27","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1093\/bioinformatics\/btp616","article-title":"edgeR: a bioconductor package for differential expression analysis of digital gene expression data","volume":"26","author":"Robinson","year":"2010","journal-title":"Bioinformatics"},{"issue":"1","key":"2019100807490162900_bby011-B28","doi-asserted-by":"crossref","first-page":"422.","DOI":"10.1186\/1471-2105-11-422","article-title":"baySeq: empirical Bayesian methods for identifying differential expression in sequence count data","volume":"11","author":"Hardcastle","year":"2010","journal-title":"BMC Bioinformatics"},{"issue":"10","key":"2019100807490162900_bby011-B29","doi-asserted-by":"crossref","first-page":"R106.","DOI":"10.1186\/gb-2010-11-10-r106","article-title":"Differential expression analysis for sequence count data","volume":"11","author":"Anders","year":"2010","journal-title":"Genome Biol"},{"issue":"5","key":"2019100807490162900_bby011-B30","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1038\/nmeth.4236","article-title":"SC3: consensus clustering of single-cell RNA-seq data","volume":"14","author":"Kiselev","year":"2017","journal-title":"Nat Methods"}],"container-title":["Briefings in Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bib\/article-pdf\/20\/4\/1583\/30119512\/bby011.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"http:\/\/academic.oup.com\/bib\/article-pdf\/20\/4\/1583\/30119512\/bby011.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2019,10,8]],"date-time":"2019-10-08T07:53:17Z","timestamp":1570521197000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bib\/article\/20\/4\/1583\/4898116"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,2,21]]},"references-count":30,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2018,2,21]]},"published-print":{"date-parts":[[2019,7,19]]}},"URL":"https:\/\/doi.org\/10.1093\/bib\/bby011","relation":{},"ISSN":["1467-5463","1477-4054"],"issn-type":[{"value":"1467-5463","type":"print"},{"value":"1477-4054","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2019,7]]},"published":{"date-parts":[[2018,2,21]]}}}