{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T14:10:48Z","timestamp":1774447848399,"version":"3.50.1"},"reference-count":37,"publisher":"Oxford University Press (OUP)","issue":"4","license":[{"start":{"date-parts":[[2019,9,10]],"date-time":"2019-09-10T00:00:00Z","timestamp":1568073600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"name":"National Institute of General Medical Sciences of the National Institutes of Health","award":["#1R01GM131399-01"],"award-info":[{"award-number":["#1R01GM131399-01"]}]},{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["#ACI-1548562"],"award-info":[{"award-number":["#ACI-1548562"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"National Science Foundation and the National Institutes of Health"},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>The biclustering of large-scale gene expression data holds promising potential for detecting condition-specific functional gene modules (i.e. biclusters). 
However, existing methods do not adequately address a comprehensive detection of all significant bicluster structures and have limited power when applied to expression data generated by RNA-Sequencing (RNA-Seq), especially single-cell RNA-Seq (scRNA-Seq) data, where massive zero and low expression values are observed.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We present a new biclustering algorithm, QUalitative BIClustering algorithm Version 2 (QUBIC2), which is empowered by: (i) a novel left-truncated mixture of Gaussian model for an accurate assessment of multimodality in zero-enriched expression data, (ii) a fast and efficient dropouts-saving expansion strategy for functional gene modules optimization using information divergency and (iii) a rigorous statistical test for the significance of all the identified biclusters in any organism, including those without substantial functional annotations. QUBIC2 demonstrated considerably improved performance in detecting biclusters compared to five other widely used algorithms on various benchmark datasets from E. coli, human and simulated data. 
QUBIC2 also showcased robust and superior performance on gene expression data generated by microarray, bulk RNA-Seq and scRNA-Seq.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>The source code of QUBIC2 is freely available at https:\/\/github.com\/OSU-BMBL\/QUBIC2.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz692","type":"journal-article","created":{"date-parts":[[2019,9,6]],"date-time":"2019-09-06T11:40:54Z","timestamp":1567770054000},"page":"1143-1149","source":"Crossref","is-referenced-by-count":60,"title":["QUBIC2: a novel and robust biclustering algorithm for analyses and interpretation of large-scale RNA-Seq data"],"prefix":"10.1093","volume":"36","author":[{"given":"Juan","family":"Xie","sequence":"first","affiliation":[{"name":"Department of Biomedical Informatics, College of Medicine, The Ohio State University , Columbus, OH 43210, USA"}]},{"given":"Anjun","family":"Ma","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, College of Medicine, The Ohio State University , Columbus, OH 43210, USA"}]},{"given":"Yu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Colleges of Computer Science and Technology, Jilin University , Changchun 130012, China"}]},{"given":"Bingqiang","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Mathematics, Shandong University , Jinan 250100, China"}]},{"given":"Sha","family":"Cao","sequence":"additional","affiliation":[{"name":"Department of Biostatistics, Indiana University, School of Medicine , Indianapolis, IN 46202, USA"}]},{"given":"Cankun","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, College of Medicine, The Ohio State University , Columbus, OH 43210, 
USA"}]},{"given":"Jennifer","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, College of Medicine, The Ohio State University , Columbus, OH 43210, USA"},{"name":"Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill , Chapel Hill, NC 27599, USA"}]},{"given":"Chi","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Medical & Molecular Genetics, Indiana University, School of Medicine , Indianapolis, IN 46202, USA"}]},{"given":"Qin","family":"Ma","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics, College of Medicine, The Ohio State University , Columbus, OH 43210, USA"}]}],"member":"286","published-online":{"date-parts":[[2019,9,10]]},"reference":[{"key":"2023020108351985600_btz692-B1","doi-asserted-by":"crossref","first-page":"63.","DOI":"10.1186\/s13059-016-0927-y","article-title":"Design and computational analysis of single-cell RNA-sequencing experiments","volume":"17","author":"Bacher","year":"2016","journal-title":"Genome Biol"},{"key":"2023020108351985600_btz692-B2","doi-asserted-by":"crossref","first-page":"1388","DOI":"10.1101\/gr.3820805","article-title":"Gene expression profiling in single cells from the pancreatic islets of Langerhans reveals lognormal distribution of mRNA levels","volume":"15","author":"Bengtsson","year":"2005","journal-title":"Genome Res"},{"key":"2023020108351985600_btz692-B3","doi-asserted-by":"crossref","first-page":"031902.","DOI":"10.1103\/PhysRevE.67.031902","article-title":"Iterative signature algorithm for the analysis of large-scale gene expression data","volume":"67","author":"Bergmann","year":"2003","journal-title":"Phys. Re. 
E"},{"key":"2023020108351985600_btz692-B4","author":"Cao","year":"2018"},{"key":"2023020108351985600_btz692-B5","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1007\/s12155-015-9674-2","article-title":"Genome-scale identification of cell-wall-related genes in switchgrass through comparative genomics and computational analyses of transcriptomic data","volume":"9","author":"Chen","year":"2016","journal-title":"BioEnergy Res"},{"key":"2023020108351985600_btz692-B6","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1080\/00401706.1959.10489859","article-title":"Simplified estimators for the normal distribution when samples are singly censored or truncated","volume":"1","author":"Cohen","year":"1959","journal-title":"Technometrics"},{"key":"2023020108351985600_btz692-B7","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1093\/bib\/bbs032","article-title":"A comparative analysis of biclustering algorithms for gene expression data","volume":"14","author":"Eren","year":"2013","journal-title":"Brief. 
Bioinf"},{"key":"2023020108351985600_btz692-B8","doi-asserted-by":"crossref","first-page":"D866","DOI":"10.1093\/nar\/gkm815","article-title":"Many Microbe Microarrays Database: uniformly normalized Affymetrix compendia with structured experimental metadata","volume":"36","author":"Faith","year":"2008","journal-title":"Nucleic Acids Res"},{"key":"2023020108351985600_btz692-B9","doi-asserted-by":"crossref","first-page":"1297","DOI":"10.12688\/f1000research.15809.1","article-title":"Comparison of clustering tools in R for medium-sized 10x Genomics single-cell RNA-sequencing data [version 1; referees: 1 approved, 2 approved with reservations]","volume":"7","author":"Freytag","year":"2018","journal-title":"F1000Research"},{"key":"2023020108351985600_btz692-B10","doi-asserted-by":"crossref","first-page":"e1004791","DOI":"10.1371\/journal.pcbi.1004791","article-title":"Context specific and differential gene co-expression networks via Bayesian biclustering","volume":"12","author":"Gao","year":"2016","journal-title":"PLoS Comput. Biol"},{"key":"2023020108351985600_btz692-B11","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1038\/nmeth.4179","article-title":"Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput","volume":"14","author":"Gierahn","year":"2017","journal-title":"Nat. Methods"},{"key":"2023020108351985600_btz692-B12","doi-asserted-by":"crossref","first-page":"333.","DOI":"10.1038\/nrg.2016.49","article-title":"Coming of age: ten years of next-generation sequencing technologies","volume":"17","author":"Goodwin","year":"2016","journal-title":"Nat. Rev. Genet"},{"key":"2023020108351985600_btz692-B13","doi-asserted-by":"crossref","first-page":"497","DOI":"10.1038\/msb.2011.28","article-title":"RNA sequencing reveals two major classes of gene expression levels in metazoan cells","volume":"7","author":"Hebenstreit","year":"2011","journal-title":"Mol. Syst. 
Biol"},{"key":"2023020108351985600_btz692-B14","doi-asserted-by":"crossref","first-page":"1520","DOI":"10.1093\/bioinformatics\/btq227","article-title":"FABIA: factor analysis for bicluster acquisition","volume":"26","author":"Hochreiter","year":"2010","journal-title":"Bioinformatics"},{"key":"2023020108351985600_btz692-B15","doi-asserted-by":"crossref","first-page":"96","DOI":"10.1038\/s12276-018-0071-8","article-title":"Single-cell RNA sequencing technologies and bioinformatics pipelines","volume":"50","author":"Hwang","year":"2018","journal-title":"Exp. Mol. Med"},{"key":"2023020108351985600_btz692-B16","doi-asserted-by":"crossref","first-page":"D543","DOI":"10.1093\/nar\/gkw1003","article-title":"The EcoCyc database: reflecting new knowledge about Escherichia coli K-12","volume":"45","author":"Keseler","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023020108351985600_btz692-B17","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1038\/nmeth.4236","article-title":"SC3: consensus clustering of single-cell RNA-seq data","volume":"14","author":"Kiselev","year":"2017","journal-title":"Nat. Methods"},{"key":"2023020108351985600_btz692-B18","first-page":"61","article-title":"Plaid models for gene expression data","author":"Lazzeroni","year":"2002","journal-title":"Stat. 
Sin"},{"key":"2023020108351985600_btz692-B19","doi-asserted-by":"crossref","first-page":"D19","DOI":"10.1093\/nar\/gkq1019","article-title":"The sequence read archive","volume":"39","author":"Leinonen","year":"2011","journal-title":"Nucleic Acids Res"},{"key":"2023020108351985600_btz692-B20","doi-asserted-by":"crossref","first-page":"e101","DOI":"10.1093\/nar\/gkp491","article-title":"QUBIC: a qualitative biclustering algorithm for analyses of gene expression data","volume":"37","author":"Li","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023020108351985600_btz692-B21","doi-asserted-by":"crossref","first-page":"75.","DOI":"10.1186\/s13059-016-0947-7","article-title":"Pooling across cells to normalize single-cell RNA sequencing data with many zero counts","volume":"17","author":"Lun","year":"2016","journal-title":"Genome Biol"},{"key":"2023020108351985600_btz692-B22","doi-asserted-by":"crossref","first-page":"D658","DOI":"10.1093\/nar\/gkw983","article-title":"Cistrome Data Browser: a data portal for ChIP-Seq and chromatin accessibility data in human and mouse","volume":"45","author":"Mei","year":"2017","journal-title":"Nucleic Acids Res"},{"key":"2023020108351985600_btz692-B23","first-page":"e1006792","volume-title":"PLoS Comput. Biol","author":"Monier","year":"2019"},{"key":"2023020108351985600_btz692-B24","doi-asserted-by":"crossref","first-page":"447","DOI":"10.1038\/nbt.2870","article-title":"Optimizing genome-scale network reconstructions","volume":"32","author":"Monk","year":"2014","journal-title":"Nat. 
Biotechnol"},{"key":"2023020108351985600_btz692-B25","doi-asserted-by":"crossref","first-page":"3719","DOI":"10.1093\/bioinformatics\/bty401","article-title":"EBIC: an evolutionary-based parallel biclustering algorithm for pattern discovery","volume":"34","author":"Orzechowski","year":"2018","journal-title":"Bioinformatics"},{"key":"2023020108351985600_btz692-B26","doi-asserted-by":"crossref","first-page":"5691","DOI":"10.1093\/nar\/gki866","article-title":"The subsystems approach to genome annotation and its use in the project to annotate 1000 genomes","volume":"33","author":"Overbeek","year":"2005","journal-title":"Nucleic Acids Res"},{"key":"2023020108351985600_btz692-B27","doi-asserted-by":"crossref","first-page":"1122","DOI":"10.1093\/bioinformatics\/btl060","article-title":"A systematic comparison and evaluation of biclustering methods for gene expression data","volume":"22","author":"Preli\u0107","year":"2006","journal-title":"Bioinformatics"},{"key":"2023020108351985600_btz692-B28","author":"Qiu","year":"2018"},{"key":"2023020108351985600_btz692-B29","doi-asserted-by":"crossref","first-page":"1090.","DOI":"10.1038\/s41467-018-03424-4","article-title":"A comprehensive evaluation of module detection methods for gene expression data","volume":"9","author":"Saelens","year":"2018","journal-title":"Nat. Commun"},{"key":"2023020108351985600_btz692-B30","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1038\/nrg3833","article-title":"Computational and analytical challenges in single-cell transcriptomics","volume":"16","author":"Stegle","year":"2015","journal-title":"Nat. Rev. Genet"},{"key":"2023020108351985600_btz692-B31","first-page":"2431","article-title":"On the size and recovery of submatrices of ones in a random binary matrix","volume":"9","author":"Sun","year":"2008","journal-title":"J. Mach. Learn. 
Res"},{"key":"2023020108351985600_btz692-B32","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1038\/nprot.2009.230","article-title":"Expander: from expression microarrays to networks and functions","volume":"5","author":"Ulitsky","year":"2010","journal-title":"Nat. Protoc"},{"key":"2023020108351985600_btz692-B33","author":"Wan","year":"2019"},{"key":"2023020108351985600_btz692-B34","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1186\/1471-2229-12-138","article-title":"Genome-scale identification of cell-wall related genes in Arabidopsis based on co-expression network analysis","volume":"12","author":"Wang","year":"2012","journal-title":"BMC Plant Biol"},{"key":"2023020108351985600_btz692-B35","article-title":"It is time to apply biclustering: a comprehensive review of biclustering applications in biological and biomedical data","author":"Xie","year":"2018","journal-title":"Brief. Bioinf."},{"key":"2023020108351985600_btz692-B36","doi-asserted-by":"crossref","first-page":"450","DOI":"10.1093\/bioinformatics\/btw635","article-title":"QUBIC: a bioconductor package for qualitative biclustering analysis of gene co-expression data","volume":"33","author":"Zhang","year":"2016","journal-title":"Bioinformatics"},{"key":"2023020108351985600_btz692-B37","doi-asserted-by":"crossref","first-page":"e32660","DOI":"10.1371\/journal.pone.0032660","article-title":"QServer: a biclustering server for prediction and assessment of co-expressed gene clusters","volume":"7","author":"Zhou","year":"2012","journal-title":"PLoS 
One"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz692\/30062687\/btz692.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1143\/48984273\/btz692.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/4\/1143\/48984273\/btz692.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,19]],"date-time":"2023-09-19T22:55:10Z","timestamp":1695164110000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/4\/1143\/5567116"}},"subtitle":[],"editor":[{"given":"Inanc","family":"Birol","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2019,9,10]]},"references-count":37,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,2,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz692","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2020,2,15]]},"published":{"date-parts":[[2019,9,10]]}}}