{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T08:37:57Z","timestamp":1774946277091,"version":"3.50.1"},"reference-count":33,"publisher":"Oxford University Press (OUP)","issue":"21","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,11,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Motivation: Nuclear magnetic resonance (NMR) spectroscopy has been used to study mixtures of metabolites in biological samples. This technology produces a spectrum for each sample depicting the chemical shifts at which an unknown number of latent metabolites resonate. The interpretation of this data with common multivariate exploratory methods such as principal components analysis (PCA) is limited due to high-dimensionality, non-negativity of the underlying spectra and dependencies at adjacent chemical shifts.<\/jats:p><jats:p>Results: We develop a novel modification of PCA that is appropriate for analysis of NMR data, entitled Sparse Non-Negative Generalized PCA. This method yields interpretable principal components and loading vectors that select important features and directly account for both the non-negativity of the underlying spectra and dependencies at adjacent chemical shifts. Through the reanalysis of experimental NMR data on five purified neural cell types, we demonstrate the utility of our methods for dimension reduction, pattern recognition, sample exploration and feature selection. 
Our methods lead to the identification of novel metabolites that reflect the differences between these cell types.<\/jats:p><jats:p>Availability: \u00a0www.stat.rice.edu\/~gallen\/software.html<\/jats:p><jats:p>Contact: \u00a0gallen@rice.edu<\/jats:p><jats:p>Supplementary Information: \u00a0Supplementary data are available at Bioinformatics online.<\/jats:p>","DOI":"10.1093\/bioinformatics\/btr522","type":"journal-article","created":{"date-parts":[[2011,9,20]],"date-time":"2011-09-20T07:44:13Z","timestamp":1316504653000},"page":"3029-3035","source":"Crossref","is-referenced-by-count":45,"title":["Sparse non-negative generalized PCA with applications to metabolomics"],"prefix":"10.1093","volume":"27","author":[{"given":"Genevera I.","family":"Allen","sequence":"first","affiliation":[{"name":"1 Department of Pediatrics-Neurology, Baylor College of Medicine, Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, 1250 Moursund St. Suite 1365, Houston, TX 77030 and 2Department of Statistics, Rice University, 6100 Main St. MS-138, Houston, TX 77005, USA"}]},{"given":"Mirjana","family":"Maleti\u0107-Savati\u0107","sequence":"additional","affiliation":[{"name":"1 Department of Pediatrics-Neurology, Baylor College of Medicine, Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, 1250 Moursund St. Suite 1365, Houston, TX 77030 and 2Department of Statistics, Rice University, 6100 Main St. 
MS-138, Houston, TX 77005, USA"}]}],"member":"286","published-online":{"date-parts":[[2011,9,19]]},"reference":[{"key":"2023012511342248100_B1","volume-title":"A generalized least squares matrix decomposition.","author":"Allen","year":"2011"},{"key":"2023012511342248100_B2","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1002\/nbm.935","article-title":"NMR-based metabonomic approaches for evaluating physiological influences on biofluid composition","volume":"18","author":"Bollard","year":"2005","journal-title":"NMR Biomed."},{"key":"2023012511342248100_B3","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1021\/tx700335d","article-title":"NMR-based metabolic profiling and metabonomic approaches to problems in molecular toxicology","volume":"21","author":"Coen","year":"2008","journal-title":"Chem. Res. Toxicol."},{"key":"2023012511342248100_B4","doi-asserted-by":"crossref","first-page":"4556","DOI":"10.1021\/ac0503456","article-title":"Curve-fitting method for direct quantitation of compounds in complex biological mixtures using 1H NMR: application in metabonomic toxicology studies","volume":"77","author":"Crockford","year":"2005","journal-title":"Anal. 
Chem."},{"key":"2023012511342248100_B5","doi-asserted-by":"crossref","DOI":"10.1002\/9780470512968","volume-title":"In Vivo NMR Spectroscopy: Principles and Techniques.","author":"De Graaf","year":"2007"},{"key":"2023012511342248100_B6","doi-asserted-by":"crossref","first-page":"606","DOI":"10.1039\/b418288j","article-title":"Measuring the metabolome: current analytical technologies","volume":"130","author":"Dunn","year":"2005","journal-title":"Analyst"},{"key":"2023012511342248100_B7","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1016\/j.pnmrs.2009.07.003","article-title":"Bioinformatic methods in NMR-based metabolic profiling","volume":"55","author":"Ebbels","year":"2009","journal-title":"Progress in Nuclear Magnetic Resonance Spectroscopy"},{"key":"2023012511342248100_B8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. 
Softw."},{"key":"2023012511342248100_B9","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1016\/j.tibtech.2004.03.007","article-title":"Metabolomics by numbers: acquiring and understanding global metabolite data","volume":"22","author":"Goodacre","year":"2004","journal-title":"Trends Biotechnol."},{"key":"2023012511342248100_B10","doi-asserted-by":"crossref","first-page":"4716","DOI":"10.1002\/pmic.200600106","article-title":"Metabolomics: current technologies and future trends","volume":"6","author":"Hollywood","year":"2006","journal-title":"Proteomics"},{"key":"2023012511342248100_B11","doi-asserted-by":"crossref","first-page":"714","DOI":"10.1016\/j.cell.2008.08.026","article-title":"Metabolic phenotyping in health and disease","volume":"134","author":"Holmes","year":"2008","journal-title":"Cell"},{"key":"2023012511342248100_B12","first-page":"1457","article-title":"Non-negative matrix factorization with sparseness constraints","volume":"5","author":"Hoyer","year":"2004","journal-title":"J. Mach. Learn. Res."},{"key":"2023012511342248100_B13","doi-asserted-by":"crossref","first-page":"682","DOI":"10.1198\/jasa.2009.0121","article-title":"On consistency and sparsity for principal components analysis in high dimensions","volume":"104","author":"Johnstone","year":"2009","journal-title":"J. Am. Stat. Assoc."},{"key":"2023012511342248100_B14","doi-asserted-by":"crossref","first-page":"531","DOI":"10.1198\/1061860032148","article-title":"A modified principal component technique based on the LASSO","volume":"12","author":"Jolliffe","year":"2003","journal-title":"J. Comput. Graph. Stat."},{"key":"2023012511342248100_B15","first-page":"517","article-title":"Generalized power method for sparse principal component analysis","volume":"11","author":"Journ\u00e9e","year":"2010","journal-title":"J. Mach. Learn. 
Res."},{"key":"2023012511342248100_B16","doi-asserted-by":"crossref","first-page":"1495","DOI":"10.1093\/bioinformatics\/btm134","article-title":"Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis","volume":"23","author":"Kim","year":"2007","journal-title":"Bioinformatics"},{"key":"2023012511342248100_B17","doi-asserted-by":"crossref","first-page":"788","DOI":"10.1038\/44565","article-title":"Learning the parts of objects by non-negative matrix factorization","volume":"401","author":"Lee","year":"1999","journal-title":"Nature"},{"key":"2023012511342248100_B18","doi-asserted-by":"crossref","first-page":"1087","DOI":"10.1111\/j.1541-0420.2010.01392.x","article-title":"Biclustering via sparse singular value decomposition","volume":"66","author":"Lee","year":"2010","journal-title":"Biometrics"},{"key":"2023012511342248100_B19","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1101\/sqb.2008.73.021","article-title":"Metabolomics of neural progenitor cells: a novel approach to biomarker discovery","volume":"73","author":"Maleti\u0107-Savati\u0107","year":"2008","journal-title":"Cold Spring Harb. Symp. Quant. 
Biol."},{"key":"2023012511342248100_B20","doi-asserted-by":"crossref","first-page":"980","DOI":"10.1126\/science.1147851","article-title":"Magnetic resonance spectroscopy identifies neural progenitor cells in the live human brain","volume":"318","author":"Manganas","year":"2007","journal-title":"Science"},{"key":"2023012511342248100_B21","doi-asserted-by":"crossref","first-page":"1054","DOI":"10.1038\/4551054a","article-title":"Systems biology: metabonomics","volume":"455","author":"Nicholson","year":"2008","journal-title":"Nature"},{"key":"2023012511342248100_B22","doi-asserted-by":"crossref","first-page":"355","DOI":"10.1186\/1471-2105-9-355","article-title":"NITPICK: peak identification for mass spectrometry data","volume":"9","author":"Renard","year":"2008","journal-title":"BMC Bioinformatics"},{"key":"2023012511342248100_B23","doi-asserted-by":"crossref","first-page":"1453","DOI":"10.1109\/TMI.2004.834626","article-title":"Nonnegative matrix factorization for rapid recovery of constituent spectra in magnetic resonance chemical shift imaging of the brain","volume":"23","author":"Sajda","year":"2004","journal-title":"Med. Imag. IEEE Trans."},{"key":"2023012511342248100_B24","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1016\/j.jmva.2007.06.007","article-title":"Sparse principal component analysis via regularized low rank matrix approximation","volume":"99","author":"Shen","year":"2008","journal-title":"J. Multivar. Anal."},{"key":"2023012511342248100_B25","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B"},{"key":"2023012511342248100_B26","doi-asserted-by":"crossref","first-page":"1335","DOI":"10.1214\/11-AOS878","article-title":"The solution path of the generalized lasso","volume":"39","author":"Tibshirani","year":"2011","journal-title":"Ann. 
Stat."},{"issue":"Suppl. 1","key":"2023012511342248100_B27","first-page":"D402","article-title":"BioMagResBank","volume":"36","author":"Ulrich","year":"2008","journal-title":"Nucleic Acids Res."},{"key":"2023012511342248100_B28","doi-asserted-by":"crossref","first-page":"1551","DOI":"10.1016\/S1359-6446(05)03609-3","article-title":"Metabolomics: from pattern recognition to biological interpretation","volume":"10","author":"Weckwerth","year":"2005","journal-title":"Drug Discov. Today"},{"key":"2023012511342248100_B29","doi-asserted-by":"crossref","first-page":"4430","DOI":"10.1021\/ac060209g","article-title":"Targeted profiling: quantitative analysis of 1H NMR metabolomics data","volume":"78","author":"Weljie","year":"2006","journal-title":"Anal. Chem."},{"key":"2023012511342248100_B30","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1093\/biostatistics\/kxp008","article-title":"A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis","volume":"10","author":"Witten","year":"2009","journal-title":"Biostatistics"},{"key":"2023012511342248100_B31","doi-asserted-by":"crossref","first-page":"1561","DOI":"10.7551\/mitpress\/7503.003.0200","article-title":"Nonnegative sparse PCA","volume":"19","author":"Zass","year":"2007","journal-title":"Adv. Neural Informat. Process. Syst."},{"key":"2023012511342248100_B32","doi-asserted-by":"crossref","first-page":"1637","DOI":"10.1093\/bioinformatics\/btr118","article-title":"Identification and quantification of metabolites in 1H NMR spectra by Bayesian model selection","volume":"27","author":"Zheng","year":"2011","journal-title":"Bioinformatics"},{"key":"2023012511342248100_B33","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1198\/106186006X113430","article-title":"Sparse principal component analysis","volume":"15","author":"Zou","year":"2006","journal-title":"J. Comput. Graph. 
Stat."}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/21\/3029\/48863304\/bioinformatics_27_21_3029.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/27\/21\/3029\/48863304\/bioinformatics_27_21_3029.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,11]],"date-time":"2025-03-11T17:59:26Z","timestamp":1741715966000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/27\/21\/3029\/218921"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,9,19]]},"references-count":33,"journal-issue":{"issue":"21","published-print":{"date-parts":[[2011,11,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btr522","relation":{},"ISSN":["1367-4811","1367-4803"],"issn-type":[{"value":"1367-4811","type":"electronic"},{"value":"1367-4803","type":"print"}],"subject":[],"published-other":{"date-parts":[[2011,11,1]]},"published":{"date-parts":[[2011,9,19]]}}}