{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T06:58:54Z","timestamp":1760597934816,"version":"3.37.3"},"reference-count":29,"publisher":"Oxford University Press (OUP)","issue":"13","license":[{"start":{"date-parts":[[2020,1,8]],"date-time":"2020-01-08T00:00:00Z","timestamp":1578441600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/open_access\/funder_policies\/chorus\/standard_publication_model"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["R00LM011673","U54HG008540"],"award-info":[{"award-number":["R00LM011673","U54HG008540"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Hillman Cancer Bioinformatics Services"},{"name":"UPMC Hillman Cancer Center Developmental"},{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["P30CA047904"],"award-info":[{"award-number":["P30CA047904"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,7,1]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:sec>\n                  <jats:title>Motivation<\/jats:title>\n                  <jats:p>The matrix factorization is an important way to analyze coregulation patterns in transcriptomic data, which can reveal the tumor signal perturbation status and subtype classification. However, current matrix factorization methods do not provide clear bicluster structure. Furthermore, these algorithms are based on the assumption of linear combination, which may not be sufficient to capture the coregulation patterns.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Results<\/jats:title>\n                  <jats:p>We presented a new algorithm for Boolean matrix factorization (BMF) via expectation maximization (BEM). BEM is more aligned with the molecular mechanism of transcriptomic coregulation and can scale to matrix with over 100 million data points. Synthetic experiments showed that BEM outperformed other BMF methods in terms of reconstruction error. Real-world application demonstrated that BEM is applicable to all kinds of transcriptomic data, including bulk RNA-seq, single-cell RNA-seq and spatial transcriptomic datasets. Given appropriate binarization, BEM was able to extract coregulation patterns consistent with disease subtypes, cell types or spatial anatomy.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Availability and implementation<\/jats:title>\n                  <jats:p>Python source code of BEM is available on https:\/\/github.com\/LifanLiang\/EM_BMF.<\/jats:p>\n               <\/jats:sec>\n               <jats:sec>\n                  <jats:title>Supplementary information<\/jats:title>\n                  <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n               <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btz977","type":"journal-article","created":{"date-parts":[[2020,1,2]],"date-time":"2020-01-02T20:09:17Z","timestamp":1577995757000},"page":"4030-4037","source":"Crossref","is-referenced-by-count":13,"title":["BEM: Mining Coregulation Patterns in Transcriptomics via Boolean Matrix Factorization"],"prefix":"10.1093","volume":"36","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2495-4779","authenticated-orcid":false,"given":"Lifan","family":"Liang","sequence":"first","affiliation":[{"name":"Department of Biomedical Informatics , University of Pittsburgh, Pittsburgh, PA 15206-3701, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3649-209X","authenticated-orcid":false,"given":"Kunju","family":"Zhu","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics , University of Pittsburgh, Pittsburgh, PA 15206-3701, USA"},{"name":"Department of Central Lab. , Clinical Medicine Research Institute, Jinan University, Guangzhou, Guangdong 51063, China"}]},{"given":"Songjian","family":"Lu","sequence":"additional","affiliation":[{"name":"Department of Biomedical Informatics , University of Pittsburgh, Pittsburgh, PA 15206-3701, USA"}]}],"member":"286","published-online":{"date-parts":[[2020,1,8]]},"reference":[{"key":"2023062300444035400_btz977-B1","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1016\/j.pathol.2016.05.002","article-title":"Aberrant differential expression of EZH1 and EZH2 in Polycomb repressive complex 2 among B- and T\/NK-cell neoplasms","volume":"48","author":"Abdalkader","year":"2016","journal-title":"Pathology"},{"key":"2023062300444035400_btz977-B2","doi-asserted-by":"crossref","first-page":"2419","DOI":"10.1038\/s41467-018-04724-5","article-title":"Spatial maps of prostate cancer transcriptomes reveal an unexplored landscape of heterogeneity","volume":"9","author":"Berglund","year":"2018","journal-title":"Nat. Commun"},{"key":"2023062300444035400_btz977-B3","doi-asserted-by":"crossref","first-page":"031902","DOI":"10.1103\/PhysRevE.67.031902","article-title":"Iterative signature algorithm for the analysis of large-scale gene expression data","volume":"67","author":"Bergmann","year":"2003","journal-title":"Phys. Rev. E Stat. Nonlin. Soft. Matter. Phys"},{"key":"2023062300444035400_btz977-B4","doi-asserted-by":"crossref","first-page":"4164","DOI":"10.1073\/pnas.0308531101","article-title":"Metagenes and molecular pattern discovery using matrix factorization","volume":"101","author":"Brunet","year":"2004","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062300444035400_btz977-B5","first-page":"93","article-title":"Biclustering of expression data","volume":"8","author":"Cheng","year":"2000","journal-title":"Proc. Int. Conf. Intell. Syst. Mol. Biol"},{"key":"2023062300444035400_btz977-B6","doi-asserted-by":"crossref","first-page":"1520","DOI":"10.1093\/bioinformatics\/btq227","article-title":"FABIA: factor analysis for bicluster acquisition","volume":"26","author":"Hochreiter","year":"2010","journal-title":"Bioinformatics"},{"key":"2023062300444035400_btz977-B7","first-page":"170","article-title":"The expression of FOXP3 and its role in human cancers","volume-title":"Biochim. Biophys. Acta","author":"Jia","year":"2019"},{"key":"2023062300444035400_btz977-B8","doi-asserted-by":"crossref","first-page":"168","DOI":"10.1038\/nature05453","article-title":"Genome-wide atlas of gene expression in the adult mouse brain","volume":"445","author":"Lein","year":"2007","journal-title":"Nature"},{"key":"2023062300444035400_btz977-B9","doi-asserted-by":"crossref","first-page":"e101","DOI":"10.1093\/nar\/gkp491","article-title":"QUBIC: a qualitative biclustering algorithm for analyses of gene expression data","volume":"37","author":"Li","year":"2009","journal-title":"Nucleic Acids Res"},{"key":"2023062300444035400_btz977-B10","first-page":"3867","article-title":"Bipartite stochastic block models with tiny clusters","volume":"31","author":"Neumann","year":"2018","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023062300444035400_btz977-B11","doi-asserted-by":"crossref","first-page":"e0178087","DOI":"10.1371\/journal.pone.0178087","article-title":"Genome-scale investigation of olfactory system spatial heterogeneity","volume":"12","author":"Noto","year":"2017","journal-title":"PLoS One"},{"key":"2023062300444035400_btz977-B12","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1186\/s12859-017-1487-1","article-title":"A systematic comparative evaluation of biclustering techniques","volume":"18","author":"Padilha","year":"2017","journal-title":"BMC Bioinformatics"},{"key":"2023062300444035400_btz977-B13","doi-asserted-by":"crossref","first-page":"1396","DOI":"10.1126\/science.1254257","article-title":"Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma","volume":"344","author":"Patel","year":"2014","journal-title":"Science"},{"key":"2023062300444035400_btz977-B14","doi-asserted-by":"crossref","first-page":"1663","DOI":"10.1016\/j.cell.2015.11.013","article-title":"Transcriptional heterogeneity and lineage commitment in myeloid progenitors","volume":"163","author":"Paul","year":"2015","journal-title":"Cell"},{"key":"2023062300444035400_btz977-B15","first-page":"945","article-title":"Boolean matrix factorization and noisy completion via message passing","volume":"69","author":"Ravanbakhsh","year":"2016","journal-title":"ICML"},{"key":"2023062300444035400_btz977-B16","first-page":"2969","article-title":"Bayesian Boolean matrix factorisation","volume-title":"Proceedings of the 34th International Conference on Machine Learning.","author":"Rukat","year":"2017"},{"key":"2023062300444035400_btz977-B17","doi-asserted-by":"crossref","first-page":"998","DOI":"10.1016\/j.cell.2018.10.038","article-title":"Defining T cell states associated with response to checkpoint immunotherapy in melanoma","volume":"175","author":"Sade-Feldman","year":"2018","journal-title":"Cell"},{"key":"2023062300444035400_btz977-B18","doi-asserted-by":"crossref","first-page":"1090","DOI":"10.1038\/s41467-018-03424-4","article-title":"A comprehensive evaluation of module detection methods for gene expression data","volume":"9","author":"Saelens","year":"2018","journal-title":"Nat. Commun"},{"key":"2023062300444035400_btz977-B19","doi-asserted-by":"crossref","first-page":"e24522","DOI":"10.4161\/onci.24522","article-title":"IL-21 in cancer immunotherapy: at the right place at the right time","volume":"2","author":"Santegoets","year":"2013","journal-title":"Oncoimmunology"},{"key":"2023062300444035400_btz977-B20","doi-asserted-by":"crossref","first-page":"10869","DOI":"10.1073\/pnas.191367098","article-title":"Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications","volume":"98","author":"S\u00f8rlie","year":"2001","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062300444035400_btz977-B21","doi-asserted-by":"crossref","first-page":"790","DOI":"10.1016\/j.tig.2018.07.003","article-title":"Enter the matrix: factorization uncovers knowledge from omics","volume":"34","author":"Stein-O\u2019Brien","year":"2018","journal-title":"Trends Genet"},{"key":"2023062300444035400_btz977-B22","doi-asserted-by":"crossref","first-page":"616","DOI":"10.1186\/s12864-018-4999-9","article-title":"Gene co-expression network analysis reveals coordinated regulation of three characteristic secondary biosynthetic pathways in tea plant (Camellia sinensis)","volume":"19","author":"Tai","year":"2018","journal-title":"BMC Genomics"},{"key":"2023062300444035400_btz977-B23","doi-asserted-by":"crossref","first-page":"S136","DOI":"10.1093\/bioinformatics\/18.suppl_1.S136","article-title":"Discovering statistically significant biclusters in gene expression data","volume":"18","author":"Tanay","year":"2002","journal-title":"Bioinformatics"},{"key":"2023062300444035400_btz977-B24","first-page":"68","article-title":"The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge","volume":"1A","author":"Tomczak","year":"2015","journal-title":"Contemp. Oncol. (Pozn.)"},{"key":"2023062300444035400_btz977-B25","first-page":"575","article-title":"Gene co-expression analysis for functional classification and gene-disease predictions","volume":"19","author":"van Dam","year":"2018","journal-title":"Brief. Bioinform"},{"key":"2023062300444035400_btz977-B26","doi-asserted-by":"crossref","first-page":"e201900443","DOI":"10.26508\/lsa.201900443","article-title":"De novo prediction of cell-type complexity in single-cell RNA-seq and tumor microenvironments","volume":"2","author":"Woo","year":"2019","journal-title":"Life Sci. Alliance"},{"key":"2023062300444035400_btz977-B27","doi-asserted-by":"crossref","first-page":"1449","DOI":"10.1093\/bib\/bby014","article-title":"It is time to apply biclustering: a comprehensive review of biclustering applications in biological and biomedical data","volume":"20","author":"Xie","year":"2019","journal-title":"Brief. Bioinform"},{"key":"2023062300444035400_btz977-B28","doi-asserted-by":"crossref","first-page":"28","DOI":"10.1007\/s10618-009-0145-2","article-title":"Binary matrix factorization for analyzing gene expression data","volume":"20","author":"Zhang","year":"2010","journal-title":"Data Min. Knowl. Discov"},{"key":"2023062300444035400_btz977-B29","doi-asserted-by":"crossref","first-page":"e2888","DOI":"10.7717\/peerj.2888","article-title":"Detecting heterogeneity in single-cell RNA-seq data by non-negative matrix factorization","volume":"5","author":"Zhu","year":"2017","journal-title":"PeerJ"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/academic.oup.com\/bioinformatics\/advance-article-pdf\/doi\/10.1093\/bioinformatics\/btz977\/33294361\/btz977.pdf","content-type":"application\/pdf","content-version":"am","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/13\/4030\/50671393\/bioinformatics_36_13_4030.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/36\/13\/4030\/50671393\/bioinformatics_36_13_4030.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,24]],"date-time":"2023-06-24T18:35:59Z","timestamp":1687631759000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/36\/13\/4030\/5698267"}},"subtitle":[],"editor":[{"given":"Alfonso","family":"Valencia","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,1,8]]},"references-count":29,"journal-issue":{"issue":"13","published-print":{"date-parts":[[2020,7,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btz977","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"type":"print","value":"1367-4803"},{"type":"electronic","value":"1367-4811"}],"subject":[],"published-other":{"date-parts":[[2020,7]]},"published":{"date-parts":[[2020,1,8]]}}}