{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T02:46:21Z","timestamp":1774925181598,"version":"3.50.1"},"reference-count":44,"publisher":"Oxford University Press (OUP)","issue":"Supplement_1","license":[{"start":{"date-parts":[[2021,7,12]],"date-time":"2021-07-12T00:00:00Z","timestamp":1626048000000},"content-version":"vor","delay-in-days":11,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["DBI-1846216"],"award-info":[{"award-number":["DBI-1846216"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000057","name":"NIGMS","doi-asserted-by":"publisher","award":["R01GM120507"],"award-info":[{"award-number":["R01GM120507"]}],"id":[{"id":"10.13039\/100000057","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Johnson & Johnson WiSTEM2D Award"},{"name":"Sloan Research Fellowship"},{"name":"UCLA David Geffen School of Medicine W.M. Keck Foundation Junior Faculty Award"},{"DOI":"10.13039\/100000002","name":"NIH","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000065","name":"NINDS","doi-asserted-by":"publisher","award":["R01NS117148"],"award-info":[{"award-number":["R01NS117148"]}],"id":[{"id":"10.13039\/100000065","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2021,8,4]]},"abstract":"<jats:title>ABSTRACT:<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Motivation<\/jats:title>\n                    <jats:p>Single-cell RNA sequencing (scRNA-seq) captures whole transcriptome information of individual cells. While scRNA-seq measures thousands of genes, researchers are often interested in only dozens to hundreds of genes for a closer study. Then, a question is how to select those informative genes from scRNA-seq data. Moreover, single-cell targeted gene profiling technologies are gaining popularity for their low costs, high sensitivity and extra (e.g. spatial) information; however, they typically can only measure up to a few hundred genes. Then another challenging question is how to select genes for targeted gene profiling based on existing scRNA-seq data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Here, we develop the single-cell Projective Non-negative Matrix Factorization (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Compared with existing gene selection methods, scPNMF has two advantages. First, its selected informative genes can better distinguish cell types. Second, it enables the alignment of new targeted gene profiling data with reference data in a low-dimensional space to facilitate the prediction of cell types in the new data. Technically, scPNMF modifies the PNMF algorithm for gene selection by changing the initialization and adding a basis selection step, which selects informative bases to distinguish cell types. We demonstrate that scPNMF outperforms the state-of-the-art gene selection methods on diverse scRNA-seq datasets. Moreover, we show that scPNMF can guide the design of targeted gene profiling experiments and the cell-type annotation on targeted gene profiling data.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Availability and implementation<\/jats:title>\n                    <jats:p>The R package is open-access and available at https:\/\/github.com\/JSB-UCLA\/scPNMF. The data used in this work are available at Zenodo: https:\/\/doi.org\/10.5281\/zenodo.4797997.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Supplementary information<\/jats:title>\n                    <jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1093\/bioinformatics\/btab273","type":"journal-article","created":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T09:07:48Z","timestamp":1619168868000},"page":"i358-i366","source":"Crossref","is-referenced-by-count":26,"title":["scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling"],"prefix":"10.1093","volume":"37","author":[{"given":"Dongyuan","family":"Song","sequence":"first","affiliation":[{"name":"Bioinformatics Interdepartmental Ph.D. Program, University of California , Los Angeles, CA 90095-7246, USA"}]},{"given":"Kexin","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of California , Los Angeles, CA 90095-1554, USA"}]},{"given":"Zachary","family":"Hemminger","sequence":"additional","affiliation":[{"name":"Institute for Quantitative and Computational Biosciences, University of California , Los Angeles, CA 90095, USA"},{"name":"Department of Integrative Biology and Physiology, University of California , Los Angeles, CA 90095-7239, USA"}]},{"given":"Roy","family":"Wollman","sequence":"additional","affiliation":[{"name":"Institute for Quantitative and Computational Biosciences, University of California , Los Angeles, CA 90095, USA"},{"name":"Department of Integrative Biology and Physiology, University of California , Los Angeles, CA 90095-7239, USA"},{"name":"Department of Chemistry and Biochemistry, University of California , Los Angeles, CA 90095-1569, USA"}]},{"given":"Jingyi Jessica","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Statistics, University of California , Los Angeles, CA 90095-1554, USA"},{"name":"Department of Human Genetics, University of California , Los Angeles, CA 90095-7088, USA"},{"name":"Department of Computational Medicine, University of California , Los Angeles, CA 90095-1766, USA"},{"name":"Department of Biostatistics, University of California Los Angeles , CA 90095-1772, USA"}]}],"member":"286","published-online":{"date-parts":[[2021,7,12]]},"reference":[{"key":"2023062410163839800_btab273-B1","doi-asserted-by":"crossref","first-page":"900","DOI":"10.1007\/s11749-018-0611-5","article-title":"Mode testing, critical bandwidth and excess mass","volume":"28","author":"Ameijeiras-Alonso","year":"2019","journal-title":"Test"},{"key":"2023062410163839800_btab273-B2","doi-asserted-by":"crossref","first-page":"2865","DOI":"10.1093\/bioinformatics\/bty1044","article-title":"M3drop: dropout-based feature selection for scRNAseq","volume":"35","author":"Andrews","year":"2019","journal-title":"Bioinformatics"},{"key":"2023062410163839800_btab273-B3","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1152\/physiolgenomics.00025.2005","article-title":"GAPDH as a housekeeping gene: analysis of GAPDH mRNA expression in a panel of 72 human tissues","volume":"21","author":"Barber","year":"2005","journal-title":"Physiol. Genomics"},{"key":"2023062410163839800_btab273-B4","doi-asserted-by":"crossref","first-page":"346","DOI":"10.1016\/j.cels.2016.08.011","article-title":"A single-cell transcriptomic map of the human and mouse pancreas reveals inter-and intra-cell population structure","volume":"3","author":"Baron","year":"2016","journal-title":"Cell Syst"},{"key":"2023062410163839800_btab273-B5","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1146\/annurev-genet-120417-031247","article-title":"Power in numbers: single-cell RNA-seq strategies to dissect complex tissues","volume":"52","author":"Birnbaum","year":"2018","journal-title":"Annu. Rev. Genetics"},{"key":"2023062410163839800_btab273-B6","doi-asserted-by":"crossref","first-page":"1693","DOI":"10.1038\/ng.3990","article-title":"Evolution and clinical impact of co-occurring genetic alterations in advanced-stage egfr-mutant lung cancers","volume":"49","author":"Blakely","year":"2017","journal-title":"Nat. Genetics"},{"key":"2023062410163839800_btab273-B7","doi-asserted-by":"crossref","first-page":"3422","DOI":"10.1093\/bioinformatics\/btaa176","article-title":"Exploring high-dimensional biological data with sparse contrastive principal component analysis","volume":"36","author":"Boileau","year":"2020","journal-title":"Bioinformatics"},{"key":"2023062410163839800_btab273-B8","first-page":"144","author":"Boser","year":"1992"},{"key":"2023062410163839800_btab273-B9","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1023\/A:1010933404324","article-title":"Random forests","volume":"45","author":"Breiman","year":"2001","journal-title":"Mach. Learn"},{"key":"2023062410163839800_btab273-B10","doi-asserted-by":"crossref","first-page":"4164","DOI":"10.1073\/pnas.0308531101","article-title":"Metagenes and molecular pattern discovery using matrix factorization","volume":"101","author":"Brunet","year":"2004","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062410163839800_btab273-B11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-017-1334-8","article-title":"f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq","volume":"18","author":"Buettner","year":"2017","journal-title":"Genome Biol"},{"key":"2023062410163839800_btab273-B12","doi-asserted-by":"crossref","first-page":"23020","DOI":"10.18632\/oncotarget.15479","article-title":"Efficacy of continuous EGFR-inhibition and role of hedgehog in egfr acquired resistance in human lung cancer cells with activating mutation of EGFR","volume":"8","author":"Della Corte","year":"2017","journal-title":"Oncotarget"},{"key":"2023062410163839800_btab273-B13","doi-asserted-by":"crossref","first-page":"7723","DOI":"10.1073\/pnas.1805681115","article-title":"Integrative analysis of single-cell genomics data by coupled nonnegative matrix factorizations","volume":"115","author":"Duren","year":"2018","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062410163839800_btab273-B14","doi-asserted-by":"crossref","first-page":"4011","DOI":"10.1093\/bioinformatics\/btz177","article-title":"Probabilistic count matrix factorization for single cell expression data analysis","volume":"35","author":"Durif","year":"2019","journal-title":"Bioinformatics"},{"key":"2023062410163839800_btab273-B15","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1016\/j.tig.2013.05.010","article-title":"Human housekeeping genes, revisited","volume":"29","author":"Eisenberg","year":"2013","journal-title":"Trends Genetics"},{"key":"2023062410163839800_btab273-B16","doi-asserted-by":"crossref","first-page":"1297","DOI":"10.12688\/f1000research.15809.1","article-title":"Comparison of clustering tools in r for medium-sized 10x genomics single-cell RNA-sequencing data","volume":"7","author":"Freytag","year":"2018","journal-title":"F1000Research"},{"key":"2023062410163839800_btab273-B17","first-page":"248","author":"Gao","year":"2020"},{"key":"2023062410163839800_btab273-B18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-019-1874-1","article-title":"Normalization and variance stabilization of single-cell rna-seq data using regularized negative binomial regression","volume":"20","author":"Hafemeister","year":"2019","journal-title":"Genome Biol"},{"key":"2023062410163839800_btab273-B19","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1186\/s13059-016-1010-4","article-title":"Giniclust: detecting rare cell types from single-cell gene expression data with gini index","volume":"17","author":"Jiang","year":"2016","journal-title":"Genome Biol"},{"key":"2023062410163839800_btab273-B20","doi-asserted-by":"crossref","first-page":"1289","DOI":"10.1038\/s41592-019-0619-0","article-title":"Fast, sensitive and accurate integration of single-cell data with harmony","volume":"16","author":"Korsunsky","year":"2019","journal-title":"Nat. Methods"},{"key":"2023062410163839800_btab273-B21","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"2023062410163839800_btab273-B22","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1186\/s13059-016-0947-7","article-title":"Pooling across cells to normalize single-cell RNA sequencing data with many zero counts","volume":"17","author":"Lun","year":"2016","journal-title":"Genome Biol"},{"key":"2023062410163839800_btab273-B23","doi-asserted-by":"crossref","first-page":"1202","DOI":"10.1016\/j.cell.2015.05.002","article-title":"Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets","volume":"161","author":"Macosko","year":"2015","journal-title":"Cell"},{"key":"2023062410163839800_btab273-B24","doi-asserted-by":"crossref","first-page":"33404","DOI":"10.1073\/pnas.2010738117","article-title":"HyPR-seq: single-cell quantification of chosen RNAs via hybridization and sequencing of DNA probes","volume":"117","author":"Marshall","year":"2020","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062410163839800_btab273-B25","doi-asserted-by":"crossref","first-page":"11046","DOI":"10.1073\/pnas.1612826113","article-title":"High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization","volume":"113","author":"Moffitt","year":"2016","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"2023062410163839800_btab273-B26","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1038\/nrclinonc.2016.26","article-title":"Treating cancer with selective cdk4\/6 inhibitors","volume":"13","author":"O'Leary","year":"2016","journal-title":"Nat. Rev. Clin. Oncol"},{"key":"2023062410163839800_btab273-B27","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-015-0737-7","article-title":"Single-cell ATAC-seq: strength in numbers","volume":"16","author":"Pott","year":"2015","journal-title":"Genome Biol"},{"key":"2023062410163839800_btab273-B28","doi-asserted-by":"crossref","first-page":"479","DOI":"10.1038\/s41581-018-0021-7","article-title":"Single-cell RNA sequencing for the study of development, physiology and disease","volume":"14","author":"Potter","year":"2018","journal-title":"Nat. Rev. Nephrol"},{"key":"2023062410163839800_btab273-B29","doi-asserted-by":"crossref","first-page":"877","DOI":"10.1038\/nmeth.1253","article-title":"Imaging individual mrna molecules using multiple singly labeled probes","volume":"5","author":"Raj","year":"2008","journal-title":"Nat. Methods"},{"key":"2023062410163839800_btab273-B30","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1186\/1471-2199-7-33","article-title":"Selection of housekeeping genes for gene expression studies in human reticulocytes using real-time PCR","volume":"7","author":"Silver","year":"2006","journal-title":"BMC Mol. Biol"},{"key":"2023062410163839800_btab273-B31","doi-asserted-by":"crossref","first-page":"1888","DOI":"10.1016\/j.cell.2019.05.031","article-title":"Comprehensive integration of single-cell data","volume":"177","author":"Stuart","year":"2019","journal-title":"Cell"},{"key":"2023062410163839800_btab273-B32","doi-asserted-by":"crossref","first-page":"1437","DOI":"10.1016\/j.cell.2017.10.049","article-title":"A next generation connectivity map: L1000 platform and the first 1,000,000 profiles","volume":"171","author":"Subramanian","year":"2017","journal-title":"Cell"},{"key":"2023062410163839800_btab273-B33","author":"Sun","year":"2021"},{"key":"2023062410163839800_btab273-B34","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1016\/S0168-1656(99)00163-7","article-title":"Housekeeping genes as internal standards: use and limits","volume":"75","author":"Thellin","year":"1999","journal-title":"J. Biotechnol"},{"key":"2023062410163839800_btab273-B35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s13059-019-1748-6","article-title":"Bart-seq: cost-effective massively parallelized targeted sequencing for genomics, transcriptomics, and single-cell analysis","volume":"20","author":"Uzbas","year":"2019","journal-title":"Genome Biol"},{"key":"2023062410163839800_btab273-B36","doi-asserted-by":"crossref","first-page":"e1007445","DOI":"10.1371\/journal.pcbi.1007445","article-title":"Scmarker: ab initio marker selection for single cell transcriptome profiling","volume":"15","author":"Wang","year":"2019","journal-title":"PLoS Comput. Biol"},{"key":"2023062410163839800_btab273-B37","doi-asserted-by":"crossref","first-page":"1873","DOI":"10.1016\/j.cell.2019.05.006","article-title":"Single-cell multi-omic integration compares and contrasts features of brain cell identity","volume":"177","author":"Welch","year":"2019","journal-title":"Cell"},{"key":"2023062410163839800_btab273-B38","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1093\/bioinformatics\/btv544","article-title":"A non-negative matrix factorization method for detecting modules in heterogeneous omics multi-modal data","volume":"32","author":"Yang","year":"2016","journal-title":"Bioinformatics"},{"key":"2023062410163839800_btab273-B39","doi-asserted-by":"crossref","first-page":"734","DOI":"10.1109\/TNN.2010.2041361","article-title":"Linear and nonlinear projective nonnegative matrix factorization","volume":"21","author":"Yang","year":"2010","journal-title":"IEEE Trans. Neural Netw"},{"key":"2023062410163839800_btab273-B50"},{"key":"2023062410163839800_btab273-B40","first-page":"11","article-title":"Projective nonnegative matrix factorization: sparseness","author":"Yuan","year":"2009","journal-title":"Neural Process. Lett"},{"key":"2023062410163839800_btab273-B41","doi-asserted-by":"crossref","first-page":"lqaa064","DOI":"10.1093\/nargab\/lqaa064","article-title":"Dimensionality reduction for single cell RNA sequencing data using constrained robust non-negative matrix factorization","volume":"2","author":"Zhang","year":"2020","journal-title":"NAR Genomics Bioinformatics"},{"key":"2023062410163839800_btab273-B42","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1038\/s41592-019-0691-5","article-title":"Single-cell multimodal omics: the power of many","volume":"17","author":"Zhu","year":"2020","journal-title":"Nat. Methods"},{"key":"2023062410163839800_btab273-B43","doi-asserted-by":"crossref","first-page":"e2888","DOI":"10.7717\/peerj.2888","article-title":"Detecting heterogeneity in single-cell RNA-seq data by non-negative matrix factorization","volume":"5","author":"Zhu","year":"2017","journal-title":"PeerJ"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/Supplement_1\/i358\/50694011\/btab273.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/37\/Supplement_1\/i358\/50694011\/btab273.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,6,24]],"date-time":"2023-06-24T20:15:05Z","timestamp":1687637705000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/37\/Supplement_1\/i358\/6319662"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,1]]},"references-count":44,"journal-issue":{"issue":"Supplement_1","published-print":{"date-parts":[[2021,8,4]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btab273","relation":{"has-preprint":[{"id-type":"doi","id":"10.1101\/2021.02.09.430550","asserted-by":"object"}]},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2021,7,1]]},"published":{"date-parts":[[2021,7,1]]}}}