{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T06:45:22Z","timestamp":1768977922194,"version":"3.49.0"},"reference-count":42,"publisher":"Oxford University Press (OUP)","issue":"3","license":[{"start":{"date-parts":[[2017,9,13]],"date-time":"2017-09-13T00:00:00Z","timestamp":1505260800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/academic.oup.com\/journals\/pages\/about_us\/legal\/notices"}],"funder":[{"DOI":"10.13039\/501100001665","name":"ANR","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100001665","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2018,2,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Motivation<\/jats:title><jats:p>The high dimensionality of genomic data calls for the development of specific classification methodologies, especially to prevent over-optimistic predictions. This challenge can be tackled by compression and variable selection, which combined constitute a powerful framework for classification, as well as data visualization and interpretation. However, current proposed combinations lead to unstable and non convergent methods due to inappropriate computational frameworks. We hereby propose a computationally stable and convergent approach for classification in high dimensional based on sparse Partial Least Squares (sparse PLS).<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We start by proposing a new solution for the sparse PLS problem that is based on proximal operators for the case of univariate responses. Then we develop an adaptive version of the sparse PLS for classification, called logit-SPLS, which combines iterative optimization of logistic regression and sparse PLS to ensure computational convergence and stability. Our results are confirmed on synthetic and experimental data. In particular, we show how crucial convergence and stability can be when cross-validation is involved for calibration purposes. Using gene expression data, we explore the prediction of breast cancer relapse. We also propose a multicategorial version of our method, used to predict cell-types based on single-cell expression data.<\/jats:p><\/jats:sec><jats:sec><jats:title>Availability and implementation<\/jats:title><jats:p>Our approach is implemented in the plsgenomics R-package.<\/jats:p><\/jats:sec><jats:sec><jats:title>Supplementary information<\/jats:title><jats:p>Supplementary data are available at Bioinformatics online.<\/jats:p><\/jats:sec>","DOI":"10.1093\/bioinformatics\/btx571","type":"journal-article","created":{"date-parts":[[2017,9,12]],"date-time":"2017-09-12T11:09:45Z","timestamp":1505214585000},"page":"485-493","source":"Crossref","is-referenced-by-count":25,"title":["High dimensional classification with combined adaptive sparse PLS and logistic regression"],"prefix":"10.1093","volume":"34","author":[{"given":"Ghislain","family":"Durif","sequence":"first","affiliation":[{"name":"LBBE, UMR CNRS 5558, Universit\u00e9 Lyon 1, Villeurbanne, France"},{"name":"Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France"}]},{"given":"Laurent","family":"Modolo","sequence":"additional","affiliation":[{"name":"LBBE, UMR CNRS 5558, Universit\u00e9 Lyon 1, Villeurbanne, France"},{"name":"LBMC UMR 5239 CNRS\/ENS Lyon, Lyon, France"},{"name":"Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden"}]},{"given":"Jakob","family":"Michaelsson","sequence":"additional","affiliation":[{"name":"Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden"}]},{"given":"Jeff E","family":"Mold","sequence":"additional","affiliation":[{"name":"Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden"}]},{"given":"Sophie","family":"Lambert-Lacroix","sequence":"additional","affiliation":[{"name":"UMR 5525 Universit\u00e9 Grenoble Alpes\/CNRS\/TIMC-IMAG, Grenoble, France"}]},{"given":"Franck","family":"Picard","sequence":"additional","affiliation":[{"name":"LBBE, UMR CNRS 5558, Universit\u00e9 Lyon 1, Villeurbanne, France"}]}],"member":"286","published-online":{"date-parts":[[2017,9,13]]},"reference":[{"key":"2023012712321693600_btx571-B1","author":"Aggarwal","year":"2001"},{"key":"2023012712321693600_btx571-B2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/2200000015","article-title":"Optimization with sparsity-inducing penalties","volume":"4","author":"Bach","year":"2012","journal-title":"Found. Trends Mach. Learn"},{"key":"2023012712321693600_btx571-B3","article-title":"Classification using LS-PLS with logistic regression based on both clinical and gene expression variables","author":"Bazzoli","year":"2016","journal-title":"Preprint"},{"key":"2023012712321693600_btx571-B4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.2202\/1544-6115.1075","article-title":"PLS dimension reduction for classification with microarray data","volume":"3","author":"Boulesteix","year":"2004","journal-title":"Statist. Appl. Genet. Mol. Biol"},{"key":"2023012712321693600_btx571-B5","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1093\/bib\/bbl016","article-title":"Partial least squares: a versatile tool for the analysis of high-dimensional genomic data","volume":"8","author":"Boulesteix","year":"2007","journal-title":"Brief. Bioinform"},{"key":"2023012712321693600_btx571-B6","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1016\/j.chemolab.2004.12.011","article-title":"Performance of some variable selection methods when multicollinearity is present","volume":"78","author":"Chong","year":"2005","journal-title":"Chem. Intel. Lab. Syst"},{"key":"2023012712321693600_btx571-B7","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1111\/j.1467-9868.2009.00723.x","article-title":"Sparse partial least squares regression for simultaneous dimension reduction and variable selection","volume":"72","author":"Chun","year":"2010","journal-title":"J. R. Stat. Soc. Ser. B (Stat. Methodol.)"},{"key":"2023012712321693600_btx571-B8","doi-asserted-by":"crossref","DOI":"10.2202\/1544-6115.1492","article-title":"Sparse partial least squares classification for high dimensional data","volume":"9","author":"Chung","year":"2010","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023012712321693600_btx571-B9","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1016\/0169-7439(93)85002-X","article-title":"SIMPLS: an alternative approach to partial least squares regression","volume":"18","author":"De Jong","year":"1993","journal-title":"Chem. Intel. Lab. Syst"},{"key":"2023012712321693600_btx571-B10","doi-asserted-by":"crossref","first-page":"280","DOI":"10.1198\/106186005X47697","article-title":"Classification using generalized partial least squares","volume":"14","author":"Ding","year":"2005","journal-title":"J. Comput. Graph. Stat"},{"key":"2023012712321693600_btx571-B11","first-page":"1","article-title":"High-dimensional data analysis: the curses and blessings of dimensionality","author":"Donoho","year":"2000","journal-title":"AMS Math Challenges Lecture"},{"key":"2023012712321693600_btx571-B12","author":"Eilers","year":"2001"},{"key":"2023012712321693600_btx571-B13","doi-asserted-by":"crossref","first-page":"1104","DOI":"10.1093\/bioinformatics\/bti114","article-title":"Classification using partial least squares with penalized logistic regression","volume":"21","author":"Fort","year":"2005","journal-title":"Bioinformatics"},{"key":"2023012712321693600_btx571-B14","doi-asserted-by":"crossref","first-page":"1.","DOI":"10.18637\/jss.v033.i01","article-title":"Regularization paths for generalized linear models via coordinate descent","volume":"33","author":"Friedman","year":"2010","journal-title":"J. Stat. Software"},{"key":"2023012712321693600_btx571-B15","doi-asserted-by":"crossref","first-page":"1290","DOI":"10.1038\/nm.2446","article-title":"A human memory T cell subset with stem cell-like properties","volume":"17","author":"Gattinoni","year":"2011","journal-title":"Nat. Med"},{"key":"2023012712321693600_btx571-B16","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1038\/nrg.2015.16","article-title":"Single-cell genome sequencing: current state of the science","volume":"17","author":"Gawad","year":"2016","journal-title":"Nat. Rev. Genet"},{"key":"2023012712321693600_btx571-B17","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1111\/j.2517-6161.1984.tb01288.x","article-title":"Iteratively reweighted least squares for maximum likelihood estimation, and some robust and resistant alternatives","author":"Green","year":"1984","journal-title":"J. R. Stat. Soc. Ser. B (Methodol.)"},{"key":"2023012712321693600_btx571-B18","doi-asserted-by":"crossref","first-page":"1196","DOI":"10.1038\/onc.2011.301","article-title":"A refined molecular taxonomy of breast cancer","volume":"31","author":"Guedj","year":"2012","journal-title":"Oncogene"},{"key":"2023012712321693600_btx571-B19","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The elements of statistical learning","author":"Hastie","year":"2009","edition":"2nd edn"},{"key":"2023012712321693600_btx571-B20","doi-asserted-by":"crossref","DOI":"10.2202\/1544-6115.1390","article-title":"A sparse PLS for variable selection when integrating omics data","volume":"7","author":"L\u00ea Cao","year":"2008","journal-title":"Stat. Appl. Genet. Mol. Biol"},{"key":"2023012712321693600_btx571-B21","doi-asserted-by":"crossref","first-page":"253.","DOI":"10.1186\/1471-2105-12-253","article-title":"Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems","volume":"12","author":"L\u00ea Cao","year":"2011","journal-title":"BMC Bioinform"},{"key":"2023012712321693600_btx571-B22","doi-asserted-by":"crossref","first-page":"191","DOI":"10.2307\/2347628","article-title":"Ridge estimators in logistic regression","volume":"41","author":"Le Cessie","year":"1992","journal-title":"Appl. Stat"},{"key":"2023012712321693600_btx571-B23","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1093\/imamat\/24.1.59","article-title":"Nearest neighbour searches and the curse of dimensionality","volume":"24","author":"Marimont","year":"1979","journal-title":"IMA J. Appl. Math"},{"key":"2023012712321693600_btx571-B24","doi-asserted-by":"crossref","first-page":"374","DOI":"10.1080\/00401706.1996.10484549","article-title":"Iteratively reweighted partial least squares estimation for generalized linear regression","volume":"38","author":"Marx","year":"1996","journal-title":"Technometrics"},{"key":"2023012712321693600_btx571-B25","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4899-3242-6","volume-title":"Generalized Linear Models","author":"McCullagh","year":"1989","edition":"2nd edn"},{"key":"2023012712321693600_btx571-B26","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1111\/j.1467-9868.2010.00740.x","article-title":"Stability selection","volume":"72","author":"Meinshausen","year":"2010","journal-title":"J. R. Stat. Soc. Ser. B (Stat. Methodol.)"},{"key":"2023012712321693600_btx571-B27","doi-asserted-by":"crossref","first-page":"142","DOI":"10.1016\/j.immuni.2012.01.002","article-title":"Cytometry by time-of-flight shows combinatorial cytokine expression and virus-specific cell niches within a continuum of CD8+ T cell phenotypes","volume":"36","author":"Newell","year":"2012","journal-title":"Immunity"},{"key":"2023012712321693600_btx571-B28","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1093\/bioinformatics\/18.1.39","article-title":"Tumor classification by partial least squares using microarray gene expression data","volume":"18","author":"Nguyen","year":"2002","journal-title":"Bioinformatics"},{"key":"2023012712321693600_btx571-B29","doi-asserted-by":"crossref","first-page":"708","DOI":"10.1038\/44385","article-title":"Two subsets of memory T lymphocytes with distinct homing potentials and effector functions","volume":"401","author":"Sallusto","year":"1999","journal-title":"Nature"},{"key":"2023012712321693600_btx571-B30","doi-asserted-by":"crossref","first-page":"1015","DOI":"10.1016\/j.jmva.2007.06.007","article-title":"Sparse principal component analysis via regularized low rank matrix approximation","volume":"99","author":"Shen","year":"2008","journal-title":"J. Multivariate Anal"},{"key":"2023012712321693600_btx571-B31","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1038\/nrg3833","article-title":"Computational and analytical challenges in single-cell transcriptomics","volume":"16","author":"Stegle","year":"2015","journal-title":"Nat. Rev. Genet"},{"key":"2023012712321693600_btx571-B32","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1093\/biostatistics\/kxu001","article-title":"Variable selection for generalized canonical correlation analysis","volume":"15","author":"Tenenhaus","year":"2014","journal-title":"Biostatistics"},{"key":"2023012712321693600_btx571-B33","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc. Ser. B (Methodological)"},{"key":"2023012712321693600_btx571-B34","doi-asserted-by":"crossref","first-page":"471","DOI":"10.1111\/j.1751-1097.1999.tb03314.x","article-title":"A probability-based multivariate statistical algorithm for autofluorescence spectroscopic identification of oral carcinogenesis","volume":"69","author":"Wang","year":"1999","journal-title":"Photochem. Photobiol"},{"key":"2023012712321693600_btx571-B35","doi-asserted-by":"crossref","first-page":"5895","DOI":"10.4049\/jimmunol.175.9.5895","article-title":"Molecular signatures distinguish human central memory from effector memory CD8 T cell subsets","volume":"175","author":"Willinger","year":"2005","journal-title":"J. Immunol"},{"key":"2023012712321693600_btx571-B36","doi-asserted-by":"crossref","first-page":"515","DOI":"10.1093\/biostatistics\/kxp008","article-title":"A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis","volume":"10","author":"Witten","year":"2009","journal-title":"Biostatistics"},{"key":"2023012712321693600_btx571-B37","article-title":"Soft modeling by latent variables; the nonlinear iterative partial least squares approach","author":"Wold","year":"1975","journal-title":"Perspectives in Probability and Statistics. Papers in Honour of M. S. Bartlett"},{"key":"2023012712321693600_btx571-B38","doi-asserted-by":"crossref","first-page":"286","DOI":"10.1007\/BFb0062108","volume-title":"Matrix Pencils","author":"Wold","year":"1983"},{"key":"2023012712321693600_btx571-B39","first-page":"91","article-title":"On decomposing the proximal map","author":"Yu","year":"2013","journal-title":"Advances in Neural Information Processing Systems"},{"key":"2023012712321693600_btx571-B40","doi-asserted-by":"crossref","first-page":"1418","DOI":"10.1198\/016214506000000735","article-title":"The adaptive lasso and its oracle properties","volume":"101","author":"Zou","year":"2006","journal-title":"J. Am. Stat. Assoc"},{"key":"2023012712321693600_btx571-B41","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1198\/106186006X113430","article-title":"Sparse principal component analysis","volume":"15","author":"Zou","year":"2006","journal-title":"J. Comput. Graph. Stat"},{"key":"2023012712321693600_btx571-B42","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","article-title":"Regularization and variable selection via the elastic net","volume":"67","author":"Zou","year":"2005","journal-title":"J. R. Stat. Soc. Ser. B (Stat. Methodol.)"}],"container-title":["Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/3\/485\/48913207\/bioinformatics_34_3_485.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article-pdf\/34\/3\/485\/48913207\/bioinformatics_34_3_485.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,26]],"date-time":"2024-06-26T21:51:02Z","timestamp":1719438662000},"score":1,"resource":{"primary":{"URL":"https:\/\/academic.oup.com\/bioinformatics\/article\/34\/3\/485\/4157444"}},"subtitle":[],"editor":[{"given":"Bonnie","family":"Berger","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2017,9,13]]},"references-count":42,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2018,2,1]]}},"URL":"https:\/\/doi.org\/10.1093\/bioinformatics\/btx571","relation":{},"ISSN":["1367-4803","1367-4811"],"issn-type":[{"value":"1367-4803","type":"print"},{"value":"1367-4811","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2018,2,1]]},"published":{"date-parts":[[2017,9,13]]}}}