{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,2]],"date-time":"2025-11-02T04:40:38Z","timestamp":1762058438124,"version":"build-2065373602"},"reference-count":31,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2022,7,9]],"date-time":"2022-07-09T00:00:00Z","timestamp":1657324800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Fundamental Research Funds for the Central Universities of China","award":["2662020LXQD002","12001217","2022003","HBAM 202004"],"award-info":[{"award-number":["2662020LXQD002","12001217","2022003","HBAM 202004"]}]},{"DOI":"10.13039\/501100001809","name":"the Natural Science Foundation of China","doi-asserted-by":"publisher","award":["2662020LXQD002","12001217","2022003","HBAM 202004"],"award-info":[{"award-number":["2662020LXQD002","12001217","2022003","HBAM 202004"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"the Key Laboratory of Biomedical Engineering of Hainan Province","award":["2662020LXQD002","12001217","2022003","HBAM 202004"],"award-info":[{"award-number":["2662020LXQD002","12001217","2022003","HBAM 202004"]}]},{"name":"the Hubei Key Laboratory of Applied Mathematics","award":["2662020LXQD002","12001217","2022003","HBAM 202004"],"award-info":[{"award-number":["2662020LXQD002","12001217","2022003","HBAM 202004"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Gradient Learning (GL), aiming to estimate the gradient of target function, has attracted much attention in variable selection problems due to its mild structure requirements and wide applicability. Despite rapid progress, the majority of the existing GL works are based on the empirical risk minimization (ERM) principle, which may face the degraded performance under complex data environment, e.g., non-Gaussian noise. To alleviate this sensitiveness, we propose a new GL model with the help of the tilted ERM criterion, and establish its theoretical support from the function approximation viewpoint. Specifically, the operator approximation technique plays the crucial role in our analysis. To solve the proposed learning objective, a gradient descent method is proposed, and the convergence analysis is provided. Finally, simulated experimental results validate the effectiveness of our approach when the input variables are correlated.<\/jats:p>","DOI":"10.3390\/e24070956","type":"journal-article","created":{"date-parts":[[2022,7,10]],"date-time":"2022-07-10T21:19:28Z","timestamp":1657487968000},"page":"956","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Gradient Learning under Tilted Empirical Risk Minimization"],"prefix":"10.3390","volume":"24","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4046-9650","authenticated-orcid":false,"given":"Liyuan","family":"Liu","sequence":"first","affiliation":[{"name":"College of Science, Huazhong Agricultural University, Wuhan 430062, China"}]},{"given":"Biqin","family":"Song","sequence":"additional","affiliation":[{"name":"College of Science, Huazhong Agricultural University, Wuhan 430062, China"}]},{"given":"Zhibin","family":"Pan","sequence":"additional","affiliation":[{"name":"College of Science, Huazhong Agricultural University, Wuhan 430062, China"},{"name":"Hubei Key Laboratory of Applied Mathematics, Hubei University, Wuhan 430062, China"}]},{"given":"Chuanwu","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China"}]},{"given":"Chi","family":"Xiao","sequence":"additional","affiliation":[{"name":"Key Laboratory of Biomedical Engineering of Hainan Province, School of Biomedical Engineering, Hainan University, Haikou 570228, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8444-9782","authenticated-orcid":false,"given":"Weifu","family":"Li","sequence":"additional","affiliation":[{"name":"College of Science, Huazhong Agricultural University, Wuhan 430062, China"},{"name":"Hubei Key Laboratory of Applied Mathematics, Hubei University, Wuhan 430062, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,7,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1214\/aos\/1176344136","article-title":"Estimating the Dimension of a Model","volume":"6","author":"Schwarz","year":"1978","journal-title":"Ann. Stat."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1348","DOI":"10.1198\/016214501753382273","article-title":"Variable selection via nonconcave penalized likelihood and its oracle properties","volume":"96","author":"Fan","year":"2001","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1214\/009053604000000067","article-title":"Least angle regression","volume":"32","author":"Efron","year":"2004","journal-title":"Ann. Stat."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"509","DOI":"10.1142\/S0219530520400011","article-title":"Sparse additive machine with ramp loss","volume":"19","author":"Chen","year":"2021","journal-title":"Anal. Appl."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2373","DOI":"10.1109\/TNNLS.2020.3005144","article-title":"Sparse Modal Additive Model","volume":"32","author":"Chen","year":"2021","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Deng, H., Chen, J., Song, B., and Pan, Z. (2021). Error bound of mode-based additive models. Entropy, 23.","DOI":"10.3390\/e23060651"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"310","DOI":"10.1080\/01621459.1986.10478274","article-title":"Semiparametric Estimates of the Relation Between Weather and Electricity Sales","volume":"81","author":"Engle","year":"1986","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1099","DOI":"10.1198\/jasa.2011.tm10281","article-title":"Linear or Nonlinear? Automatic Structure Discovery for Partially Linear Models","volume":"106","author":"Zhang","year":"2011","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_9","first-page":"1403","article-title":"Semiparametric Regression Pursuit","volume":"22","author":"Huang","year":"2012","journal-title":"Stat. Sin."},{"key":"ref_10","first-page":"519","article-title":"Learning Coordinate Covariances via Gradients","volume":"7","author":"Mukherjee","year":"2006","journal-title":"J. Mach. Learn. Res."},{"key":"ref_11","first-page":"2481","article-title":"Estimation of Gradients and Coordinate Covariation in Classification","volume":"7","author":"Mukherjee","year":"2006","journal-title":"J. Mach. Learn. Res."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"674","DOI":"10.1016\/j.jat.2008.12.002","article-title":"Gradient learning in a classification setting by gradient descent","volume":"161","author":"Jia","year":"2009","journal-title":"J. Approx. Theory"},{"key":"ref_13","first-page":"2075","article-title":"Variable selection for classification with derivative-induced regularization","volume":"30","author":"He","year":"2020","journal-title":"Stat. Sin."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1018","DOI":"10.1016\/j.jmaa.2007.10.044","article-title":"Learning gradients by a gradient descent algorithm","volume":"341","author":"Dong","year":"2008","journal-title":"J. Math. Anal. Appl."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"181","DOI":"10.3150\/09-BEJ206","article-title":"Learning gradients on manifolds","volume":"16","author":"Mukherjee","year":"2010","journal-title":"Bernoulli"},{"key":"ref_16","first-page":"161:1","article-title":"Gradient Estimation with Simultaneous Perturbation and Compressive Sensing","volume":"18","author":"Borkar","year":"2017","journal-title":"J. Mach. Learn. Res."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s10994-012-5284-9","article-title":"Learning sparse gradients for variable selection and dimension reduction","volume":"87","author":"Ye","year":"2012","journal-title":"Mach. Learn."},{"key":"ref_18","unstructured":"He, X., Wang, J., and Lv, S. (2018). Efficient kernel-based variable selection with sparsistency. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1007\/s10994-010-5217-4","article-title":"Estimating variable structure and dependence in multitask learning via gradients","volume":"83","author":"Guinney","year":"2011","journal-title":"Mach. Learn."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"822","DOI":"10.1109\/TNNLS.2015.2425215","article-title":"Robust Gradient Learning with Applications","volume":"27","author":"Feng","year":"2016","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_21","unstructured":"Li, T., Beirami, A., Sanjabi, M., and Smith, V. (2021). On tilted losses in machine learning: Theory and applications. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1090\/S0273-0979-01-00923-5","article-title":"On the mathematical foundations of learning","volume":"39","author":"Cucker","year":"2002","journal-title":"Bull. Am. Math. Soc."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1016\/j.acha.2016.04.004","article-title":"Kernel-based sparse regression with the correntropy-induced loss","volume":"44","author":"Chen","year":"2018","journal-title":"Appl. Comput. Harmon. Anal."},{"key":"ref_24","first-page":"1","article-title":"A Statistical Learning Approach to Modal Regression","volume":"21","author":"Feng","year":"2020","journal-title":"J. Mach. Learn. Res."},{"key":"ref_25","first-page":"2885","article-title":"Model-free variable selection in reproducing kernel hilbert space","volume":"17","author":"Yang","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1743","DOI":"10.1109\/TIT.2003.813564","article-title":"Capacity of reproducing kernel spaces in learning theory","volume":"49","author":"Zhou","year":"2003","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_27","first-page":"2399","article-title":"Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples","volume":"7","author":"Belkin","year":"2006","journal-title":"J. Mach. Learn. Res."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Sch\u00f6lkopf, B., Smola, A.J., and Bach, F. (2002). Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press.","DOI":"10.7551\/mitpress\/4175.001.0001"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Karimi, H., Nutini, J., and Schmidt, M. (2016, January 19\u201323). Linear convergence of gradient and proximal-gradient methods under the polyak-\u0142ojasiewicz condition. Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Riva del Garda, Italy.","DOI":"10.1007\/978-3-319-46128-1_50"},{"key":"ref_30","unstructured":"Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., and Weinberger, K. (2014). Scalable Kernel Methods via Doubly Stochastic Gradients. Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1950047","DOI":"10.1142\/S0219691319500474","article-title":"Regularized modal regression with data-dependent hypothesis spaces","volume":"17","author":"Wang","year":"2019","journal-title":"Int. J. Wavelets Multiresolution Inf. Process."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/7\/956\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:47:26Z","timestamp":1760140046000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/24\/7\/956"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,7,9]]},"references-count":31,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2022,7]]}},"alternative-id":["e24070956"],"URL":"https:\/\/doi.org\/10.3390\/e24070956","relation":{},"ISSN":["1099-4300"],"issn-type":[{"type":"electronic","value":"1099-4300"}],"subject":[],"published":{"date-parts":[[2022,7,9]]}}}