{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T23:12:11Z","timestamp":1770333131273,"version":"3.49.0"},"reference-count":31,"publisher":"Oxford University Press (OUP)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2007,6,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Many practical pattern recognition problems require non-negativity constraints. For example, pixels in digital images and chemical concentrations in bioinformatics are non-negative. Sparse non-negative matrix factorizations (NMFs) are useful when the degree of sparseness in the non-negative basis matrix or the non-negative coefficient matrix in an NMF needs to be controlled in approximating high-dimensional data in a lower dimensional space.<\/jats:p><jats:p>Results: In this article, we introduce a novel formulation of sparse NMF and show how the new formulation leads to a convergent sparse NMF algorithm via alternating non-negativity-constrained least squares. We apply our sparse NMF algorithm to cancer-class discovery and gene expression data analysis and offer biological analysis of the results obtained. Our experimental results illustrate that the proposed sparse NMF algorithm often achieves better clustering performance with shorter computing time compared to other existing NMF algorithms.<\/jats:p><jats:p>Availability: The software is available as supplementary material.<\/jats:p><jats:p>Contact: \u00a0hskim@cc.gatech.edu, hpark@acc.gatech.edu<\/jats:p><jats:p>Supplementary information: Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btm134","type":"journal-article","created":{"date-parts":[[2007,5,6]],"date-time":"2007-05-06T00:28:57Z","timestamp":1178411337000},"page":"1495-1502","source":"Crossref","is-referenced-by-count":693,"title":["Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis"],"prefix":"10.1093","volume":"23","author":[{"given":"Hyunsoo","family":"Kim","sequence":"first","affiliation":[{"name":"College of Computing, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332, USA"}]},{"given":"Haesun","family":"Park","sequence":"additional","affiliation":[{"name":"College of Computing, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA 30332, USA"}]}],"member":"286","published-online":{"date-parts":[[2007,5,5]]},"reference":[{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"335","DOI":"10.1137\/S0036144598347035","article-title":"Matrices, vector spaces, and information retrieval","volume":"41","author":"Berry","year":"1999","journal-title":"SIAM Rev"},{"key":"2023041105083190500_","article-title":"Algorithms and applications for approximate nonnegative matrix factorization","author":"Berry","year":"2006","journal-title":"Comput. Stat. Data Anal"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"393","DOI":"10.1002\/(SICI)1099-128X(199709\/10)11:5<393::AID-CEM483>3.0.CO;2-L","article-title":"A fast non-negativity-constrained least squares algorithm","volume":"11","author":"Bro","year":"1997","journal-title":"J. Chemometrics"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"4164","DOI":"10.1073\/pnas.0308531101","article-title":"Metagenes and molecular pattern discovery using matrix factorization","volume":"101","author":"Brunet","year":"2004","journal-title":"Proc. Natl Acad. Sci. USA"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1186\/1471-2105-7-78","article-title":"Biclustering of gene expression data by non-smooth non-negative matrix factorization","volume":"7","author":"Carmona-Saez","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1016\/j.ccr.2006.03.019","article-title":"High-resolution genomic profiles define distinct clinico-pathogenetic subgroups of multiple myeloma patients","volume":"9","author":"Carrasco","year":"2006","journal-title":"Cancer Cell"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"41","DOI":"10.1186\/1471-2105-7-41","article-title":"Discovering semantic features in the literature: a foundation for building functional associations","volume":"7","author":"Chagoyen","year":"2006","journal-title":"BMC Bioinformatics"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","DOI":"10.2172\/807420","article-title":"Adaptive dimension reduction for clustering high dimensional data","author":"Ding","year":"2002"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"3775","DOI":"10.1093\/nar\/gkg624","article-title":"Onto-tools, the toolkit of the modern biologist: Onto-Express, Onto-Compare, Onto-Design and Onto-Translate","volume":"31","author":"Draghici","year":"2003","journal-title":"Nucleic Acids Res"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"i144","DOI":"10.1093\/bioinformatics\/bti1041","article-title":"Multi-way clustering of microarray data using probabilistic sparse matrix factorization","volume":"21","author":"Dueck","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"3970","DOI":"10.1093\/bioinformatics\/bti653","article-title":"Improving molecular cancer class discovery through sparse non-negative matrix factorization","volume":"21","author":"Gao","year":"2005","journal-title":"Bioinformatics"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1126\/science.286.5439.531","article-title":"Molecular classification of cancer: class discovery and class prediction by gene expression monitoring","volume":"286","author":"Golub","year":"1999","journal-title":"Science"},{"key":"2023041105083190500_","article-title":"Accelerating the Lee-Seung algorithm for non-negative matrix factorization","author":"Gonzales","year":"2005","journal-title":"Technical report"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/S0167-6377(99)00074-7","article-title":"On the convergence of the block nonlinear Gauss-Seidel method under convex constraints","volume":"26","author":"Grippo","year":"2000","journal-title":"Operations Res. Lett"},{"key":"2023041105083190500_","first-page":"1457","article-title":"Non-negative matrix factorization with sparseness constraints","volume":"5","author":"Hoyer","year":"2004","journal-title":"J. Machine Learning Res"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"266","DOI":"10.1006\/geno.2002.6698","article-title":"Profiling gene expression using onto-express","volume":"79","author":"Khatri","year":"2002","journal-title":"Genomics"},{"key":"2023041105083190500_","first-page":"37","article-title":"Dimension reduction in text classification with support vector machines","volume":"6","author":"Kim","year":"2005","journal-title":"J. Machine Learning Res"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"1706","DOI":"10.1101\/gr.903503","article-title":"Subsystem identification through dimensionality reduction of large-scale gene expression data","volume":"13","author":"Kim","year":"2003","journal-title":"Genome Res"},{"key":"2023041105083190500_","author":"Lawson","year":"1974","journal-title":"Solving Least Squares Problems"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"2023041105083190500_","first-page":"556","article-title":"Algorithms for non-negative matrix factorization","author":"Lee","year":"2000"},{"key":"2023041105083190500_","first-page":"207","article-title":"Learning spatially localized parts-based representations","author":"Li","year":"2001"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"11502","DOI":"10.1158\/0008-5472.CAN-06-2072","article-title":"Marked genomic differences characterize primary and secondary glioblastoma subtypes and identify two distinct molecular and clinical secondary glioblastoma entities","volume":"66","author":"Maher","year":"2006","journal-title":"Cancer Res"},{"key":"2023041105083190500_","volume-title":"User's Guide.","author":"MATLAB","year":"1992"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"403","DOI":"10.1109\/TPAMI.2006.60","article-title":"Nonsmooth nonnegative matrix factorization (nsNMF)","volume":"28","author":"Pascual-Montano","year":"2006","journal-title":"IEEE, Trans. Pattern Anal. Machine Intell"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","DOI":"10.1137\/1.9781611972740.45","article-title":"Text mining using non-negative matrix factorizations","author":"Pauca","year":"2004"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","DOI":"10.1016\/j.laa.2005.06.025","article-title":"Nonnegative matrix factorization for spectral data analysis","author":"Pauca","year":"2006","journal-title":"Linear Algebra and Applications"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1186\/1471-2105-6-162","article-title":"Theme discovery from gene lists for identification and viewing of multiple functional groups","volume":"6","author":"Pehkonen","year":"2005","journal-title":"BMC Bioinformatics"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/415436a","article-title":"Prediction of central nervous system embryonal tumour outcome based on gene expression","volume":"415","author":"Pomeroy","year":"2002","journal-title":"Nature"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via LASSO","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. Roy. Statist. Soc. B"},{"key":"2023041105083190500_","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1002\/cem.889","article-title":"Fast algorithm for the solution of large-scale non-negativity-constrained least squares problems","volume":"18","author":"van Benthem","year":"2004","journal-title":"J. Chemometrics"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/12\/1495\/49814038\/bioinformatics_23_12_1495.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/23\/12\/1495\/49814038\/bioinformatics_23_12_1495.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,16]],"date-time":"2025-01-16T02:19:21Z","timestamp":1736993961000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/23\/12\/1495\/225472"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2007,5,5]]},"references-count":31,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2007,6,15]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btm134","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2007,6,15]]},"published":{"date-parts":[[2007,5,5]]}}}