{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,24]],"date-time":"2025-10-24T18:37:51Z","timestamp":1761331071920,"version":"build-2065373602"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"8","license":[{"start":{"date-parts":[[2025,5,25]],"date-time":"2025-05-25T00:00:00Z","timestamp":1748131200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,5,25]],"date-time":"2025-05-25T00:00:00Z","timestamp":1748131200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"National Cancer Institute"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Comput Stat"],"published-print":{"date-parts":[[2025,11]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Penalized regression methods that shrink model coefficients are popular approaches to improve prediction and for variable selection in high-dimensional settings. We present a penalized (or regularized) regression approach for multinomial logistic models for categorical outcomes with a novel adaptive L1-type penalty term, that incorporates weights based on intra- and inter-outcome category distances of each predictor. A predictor that has large between- and small within-outcome category distances is penalized less and has a higher likelihood to be selected for the final model. We propose and study three measures for weight calculation: an analysis of variance (ANOVA)-based measure and two indices used in clustering approaches. Our novel approach, that we term the\n                    <jats:italic>discriminative power lasso<\/jats:italic>\n                    (DP-lasso), thus combines elements of marginal screening with regularized regression methods. We studied the performance of DP-lasso and other published methods in simulations with varying numbers of outcome categories, numbers of predictors, strengths of associations and predictor correlation structures. For correlated predictors, the DP-lasso approach with ANOVA based weights (DPan) resulted in much sparser models than other regularization approaches, especially in high-dimensional settings. When the number\n                    <jats:italic>p<\/jats:italic>\n                    of (correlated) predictors was much larger than the available sample size\n                    <jats:italic>N<\/jats:italic>\n                    , DPan had the highest true positive rate while maintaining low false positive rates for all simulation settings. Similarly, when\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$${p&lt;N}$$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mrow>\n                            <mml:mi>p<\/mml:mi>\n                            <mml:mo>&lt;<\/mml:mo>\n                            <mml:mi>N<\/mml:mi>\n                          <\/mml:mrow>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    , DPan had high true positive rates and the lowest false positive rates of all methods studied. Thus we recommend DPan for analysing categorical outcomes in relation to high-dimensional predictors. We further illustrate all approaches in ultra high-dimensional settings, using several single-cell RNA-sequencing datasets.\n                  <\/jats:p>","DOI":"10.1007\/s00180-025-01635-0","type":"journal-article","created":{"date-parts":[[2025,5,25]],"date-time":"2025-05-25T08:04:04Z","timestamp":1748160244000},"page":"4565-4587","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A powerful penalized multinomial logistic regression approach"],"prefix":"10.1007","volume":"40","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8354-6660","authenticated-orcid":false,"given":"Cornelia","family":"Fuetterer","sequence":"first","affiliation":[]},{"given":"Malte","family":"Nalenz","sequence":"additional","affiliation":[]},{"given":"Thomas","family":"Augustin","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7791-2698","authenticated-orcid":false,"given":"Ruth M.","family":"Pfeiffer","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,5,25]]},"reference":[{"issue":"1","key":"1635_CR1","doi-asserted-by":"publisher","first-page":"232","DOI":"10.1214\/10-AOAS388","volume":"5","author":"P Breheny","year":"2011","unstructured":"Breheny P, Huang J (2011) Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann Appl Stat 5(1):232\u2013253. https:\/\/doi.org\/10.1214\/10-AOAS388","journal-title":"Ann Appl Stat"},{"key":"1635_CR2","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1007\/s11634-021-00489-w","volume":"17","author":"X Bry","year":"2022","unstructured":"Bry X, Niang N, Verron T et al (2022) Clusterwise elastic-net regression based on a combined information criterion. Adv Data Anal Classif 17:75\u2013107. https:\/\/doi.org\/10.1007\/s11634-021-00489-w","journal-title":"Adv Data Anal Classif"},{"issue":"2","key":"1635_CR3","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1038\/nbt.3102","volume":"33","author":"F Buettner","year":"2015","unstructured":"Buettner F, Natarajan KN, Casale FP et al (2015) Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nat Biotechnol 33(2):155\u2013160. https:\/\/doi.org\/10.1038\/nbt.3102","journal-title":"Nat Biotechnol"},{"key":"1635_CR4","doi-asserted-by":"publisher","first-page":"224","DOI":"10.1109\/TPAMI.1979.4766909","volume":"2","author":"DL Davies","year":"1979","unstructured":"Davies DL, Bouldin DW (1979) A cluster separation measure. IEEE Trans Pattern Anal Mach Intell 2:224\u2013227. https:\/\/doi.org\/10.1109\/TPAMI.1979.4766909","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"6167","key":"1635_CR5","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1126\/science.1245316","volume":"343","author":"Q Deng","year":"2014","unstructured":"Deng Q, Ramsk\u00f6ld D, Reinius B et al (2014) Single-cell RNA-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science 343(6167):193\u2013196. https:\/\/doi.org\/10.1126\/science.1245316","journal-title":"Science"},{"issue":"6","key":"1635_CR6","doi-asserted-by":"publisher","first-page":"728","DOI":"10.1038\/ni.3437","volume":"17","author":"I Engel","year":"2016","unstructured":"Engel I, Seumois G, Chavez L et al (2016) Innate-like functions of natural killer T cell subsets result from highly divergent gene programs. Nat Immunol 17(6):728\u2013739. https:\/\/doi.org\/10.1038\/ni.3437","journal-title":"Nat Immunol"},{"issue":"456","key":"1635_CR7","doi-asserted-by":"publisher","first-page":"1348","DOI":"10.1198\/016214501753382273","volume":"96","author":"J Fan","year":"2001","unstructured":"Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348\u20131360. https:\/\/doi.org\/10.1198\/016214501753382273","journal-title":"J Am Stat Assoc"},{"issue":"5","key":"1635_CR8","doi-asserted-by":"publisher","first-page":"849","DOI":"10.48550\/arXiv.math\/0612857","volume":"70","author":"J Fan","year":"2008","unstructured":"Fan J, Lv J (2008) Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc Ser B Stat Method 70(5):849\u2013911. https:\/\/doi.org\/10.48550\/arXiv.math\/0612857","journal-title":"J R Stat Soc Ser B Stat Method"},{"key":"1635_CR9","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1007\/978-1-4612-4380-9_6","volume-title":"Breakthroughs in statistics","author":"RA Fisher","year":"1992","unstructured":"Fisher RA (1992) Statistical methods for research workers. Breakthroughs in statistics. Springer, New York, pp 66\u201370"},{"issue":"1","key":"1635_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v033.i01","volume":"33","author":"J Friedman","year":"2010","unstructured":"Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33(1):1\u201322. https:\/\/doi.org\/10.18637\/jss.v033.i01","journal-title":"J Stat Softw"},{"issue":"4","key":"1635_CR11","doi-asserted-by":"publisher","first-page":"2150","DOI":"10.1214\/10-AOAS355","volume":"4","author":"J Gertheiss","year":"2010","unstructured":"Gertheiss J, Tutz G (2010) Sparse modeling of categorial explanatory variables. Ann Appl Stat 4(4):2150\u20132180. https:\/\/doi.org\/10.1214\/10-AOAS355","journal-title":"Ann Appl Stat"},{"key":"1635_CR12","first-page":"63","volume-title":"The elements of statistical learning: data mining, inference, and prediction","author":"T Hastie","year":"2017","unstructured":"Hastie T, Tibshirani R, Friedman JH (2017) The elements of statistical learning: data mining, inference, and prediction. Springer, New York, pp 63\u201364"},{"issue":"1","key":"1635_CR13","doi-asserted-by":"publisher","first-page":"69","DOI":"10.2307\/1267352","volume":"12","author":"AE Hoerl","year":"1970","unstructured":"Hoerl AE, Kennard RW (1970) Ridge regression: applications to nonorthogonal problems. Technometrics 12(1):69\u201382. https:\/\/doi.org\/10.2307\/1267352","journal-title":"Technometrics"},{"issue":"6","key":"1635_CR14","doi-asserted-by":"publisher","first-page":"957","DOI":"10.1109\/TPAMI.2005.127","volume":"27","author":"B Krishnapuram","year":"2005","unstructured":"Krishnapuram B, Carin L, Figueiredo MA et al (2005) Sparse multinomial logistic regression: fast algorithms and generalization bounds. IEEE Trans Pattern Anal Mach Intell 27(6):957\u2013968. https:\/\/doi.org\/10.1109\/TPAMI.2005.127","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"499","key":"1635_CR15","doi-asserted-by":"publisher","first-page":"1129","DOI":"10.1080\/01621459.2012.695654","volume":"107","author":"R Li","year":"2012","unstructured":"Li R, Zhong W, Zhu L (2012) Feature screening via distance correlation learning. J Am Stat Assoc 107(499):1129\u20131139. https:\/\/doi.org\/10.1080\/01621459.2012.695654","journal-title":"J Am Stat Assoc"},{"key":"1635_CR16","unstructured":"Maechler M, Rousseeuw P, Struyf A et\u00a0al (2022) Cluster: cluster analysis basics and extensions. r package version 2.1.4\u2014For new features, see the \u2018Changelog\u2019 file (in the package source). https:\/\/CRAN.R-project.org\/package=cluster"},{"issue":"1","key":"1635_CR17","doi-asserted-by":"publisher","first-page":"374","DOI":"10.1016\/j.csda.2006.12.019","volume":"52","author":"N Meinshausen","year":"2007","unstructured":"Meinshausen N (2007) Relaxed lasso. Comput Stat Data Anal 52(1):374\u2013393. https:\/\/doi.org\/10.1016\/j.csda.2006.12.019","journal-title":"Comput Stat Data Anal"},{"key":"1635_CR18","doi-asserted-by":"publisher","DOI":"10.1016\/j.csda.2021.107414","volume":"169","author":"D Nibbering","year":"2022","unstructured":"Nibbering D, Hastie TJ (2022) Multiclass-penalized logistic regression. Comput Stat Data Anal 169:107414. https:\/\/doi.org\/10.1016\/j.csda.2021.107414","journal-title":"Comput Stat Data Anal"},{"key":"1635_CR19","unstructured":"Oelker M (2013) gvcm. cat: regularized categorial effects\/categorial effect modifiers in glms. R package version 1"},{"issue":"1","key":"1635_CR20","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1109\/TNN.2003.809398","volume":"15","author":"V Roth","year":"2004","unstructured":"Roth V (2004) The generalized lasso. IEEE Trans Neural Netw 15(1):16\u201328. https:\/\/doi.org\/10.1109\/TNN.2003.809398","journal-title":"IEEE Trans Neural Netw"},{"key":"1635_CR21","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","volume":"20","author":"PJ Rousseeuw","year":"1987","unstructured":"Rousseeuw PJ (1987) Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math 20:53\u201365. https:\/\/doi.org\/10.1016\/0377-0427(87)90125-7","journal-title":"J Comput Appl Math"},{"issue":"7505","key":"1635_CR22","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1038\/nature13437","volume":"510","author":"AK Shalek","year":"2014","unstructured":"Shalek AK, Satija R, Shuga J et al (2014) Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature 510(7505):363\u2013369. https:\/\/doi.org\/10.1038\/nature13437","journal-title":"Nature"},{"issue":"17","key":"1635_CR23","doi-asserted-by":"publisher","first-page":"2246","DOI":"10.1093\/bioinformatics\/btg308","volume":"19","author":"SK Shevade","year":"2003","unstructured":"Shevade SK, Keerthi SS (2003) A simple and efficient algorithm for gene selection using sparse logistic regression. Bioinformatics 19(17):2246\u20132253. https:\/\/doi.org\/10.1093\/bioinformatics\/btg308","journal-title":"Bioinformatics"},{"issue":"4","key":"1635_CR24","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1038\/nmeth.4612","volume":"15","author":"C Soneson","year":"2018","unstructured":"Soneson C, Robinson MD (2018) Bias, robustness and scalability in single-cell differential expression analysis. Nat Methods 15(4):255\u2013261. https:\/\/doi.org\/10.1038\/nmeth.4612","journal-title":"Nat Methods"},{"issue":"5","key":"1635_CR25","doi-asserted-by":"publisher","first-page":"377","DOI":"10.1038\/nmeth.1315","volume":"6","author":"F Tang","year":"2009","unstructured":"Tang F, Barbacioru C, Wang Y et al (2009) mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods 6(5):377\u2013382. https:\/\/doi.org\/10.1038\/nmeth.1315","journal-title":"Nat Methods"},{"issue":"1","key":"1635_CR26","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","volume":"58","author":"R Tibshirani","year":"1996","unstructured":"Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B Stat Methodol 58(1):267\u2013288. https:\/\/doi.org\/10.1111\/j.2517-6161.1996.tb02080.x","journal-title":"J R Stat Soc B Stat Methodol"},{"issue":"3","key":"1635_CR27","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1007\/s11222-008-9088-5","volume":"19","author":"G Tutz","year":"2009","unstructured":"Tutz G, Ulbricht J (2009) Penalized regression with correlation-based penalty. Stat Comput 19(3):239\u2013253. https:\/\/doi.org\/10.1007\/s11222-008-9088-5","journal-title":"Stat Comput"},{"key":"1635_CR28","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-21706-2","volume-title":"Modern applied statistics with S","author":"WN Venables","year":"2002","unstructured":"Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer, New York","edition":"4"},{"key":"#cr-split#-1635_CR29.1","unstructured":"Walesiak M, Dudek A (2020) The choice of variable normalization method in cluster analysis. In: Soliman KS"},{"key":"#cr-split#-1635_CR29.2","unstructured":"(ed) Education excellence and innovation management: a 2025 vision to sustain economic development during global challenges. International Business Information Management Association (IBIMA), pp 325-340"},{"issue":"1","key":"1635_CR30","doi-asserted-by":"publisher","first-page":"52","DOI":"10.3934\/mfc.2022048","volume":"7","author":"X Wang","year":"2024","unstructured":"Wang X (2024) One-step sparse ridge estimation with folded concave penalty. Math Found Comput 7(1):52\u201369. https:\/\/doi.org\/10.3934\/mfc.2022048","journal-title":"Math Found Comput"},{"issue":"5","key":"1635_CR31","doi-asserted-by":"publisher","first-page":"796","DOI":"10.1080\/02664763.2015.1078300","volume":"43","author":"X Wang","year":"2016","unstructured":"Wang X, Wang M (2016) Variable selection for high-dimensional generalized linear models with the weighted elastic-net procedure. J Appl Stat 43(5):796\u2013809. https:\/\/doi.org\/10.1080\/02664763.2015.1078300","journal-title":"J Appl Stat"},{"issue":"1","key":"1635_CR32","doi-asserted-by":"publisher","first-page":"112","DOI":"10.1080\/00401706.2013.810174","volume":"56","author":"DM Witten","year":"2014","unstructured":"Witten DM, Shojaie A, Zhang F (2014) The cluster elastic net for high-dimensional regression with unknown variable grouping. Technometrics 56(1):112\u2013122. https:\/\/doi.org\/10.1080\/00401706.2013.810174","journal-title":"Technometrics"},{"issue":"2","key":"1635_CR33","doi-asserted-by":"publisher","first-page":"673","DOI":"10.1214\/07-AOS580","volume":"37","author":"H Xie","year":"2009","unstructured":"Xie H, Huang J (2009) SCAD-penalized regression in high-dimensional partially linear models. Ann Stat 37(2):673\u2013696. https:\/\/doi.org\/10.1214\/07-AOS580","journal-title":"Ann Stat"},{"issue":"2","key":"1635_CR34","doi-asserted-by":"publisher","first-page":"894","DOI":"10.1214\/09-AOS729","volume":"38","author":"CH Zhang","year":"2010","unstructured":"Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38(2):894\u2013942. https:\/\/doi.org\/10.1214\/09-AOS729","journal-title":"Ann Stat"},{"issue":"476","key":"1635_CR35","doi-asserted-by":"publisher","first-page":"1418","DOI":"10.1198\/016214506000000735","volume":"101","author":"H Zou","year":"2006","unstructured":"Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101(476):1418\u20131429. https:\/\/doi.org\/10.1198\/016214506000000735","journal-title":"J Am Stat Assoc"},{"issue":"2","key":"1635_CR36","doi-asserted-by":"publisher","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","volume":"67","author":"H Zou","year":"2005","unstructured":"Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Method 67(2):301\u2013320. https:\/\/doi.org\/10.1111\/j.1467-9868.2005.00503.x","journal-title":"J R Stat Soc Ser B Stat Method"}],"container-title":["Computational Statistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00180-025-01635-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00180-025-01635-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00180-025-01635-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,24]],"date-time":"2025-10-24T18:34:32Z","timestamp":1761330872000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00180-025-01635-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,25]]},"references-count":37,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,11]]}},"alternative-id":["1635"],"URL":"https:\/\/doi.org\/10.1007\/s00180-025-01635-0","relation":{},"ISSN":["0943-4062","1613-9658"],"issn-type":[{"type":"print","value":"0943-4062"},{"type":"electronic","value":"1613-9658"}],"subject":[],"published":{"date-parts":[[2025,5,25]]},"assertion":[{"value":"11 October 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 April 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 May 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no conflict of interest to declare that are relevant to the content of this article.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}