{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,3,26]],"date-time":"2024-03-26T00:35:51Z","timestamp":1711413351133},"reference-count":41,"publisher":"MIT Press","issue":"4","content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,3,21]]},"abstract":"<jats:title>Abstract<\/jats:title>\n<jats:p>Combining information-theoretic learning with deep learning has gained significant attention in recent years, as it offers a promising approach to tackle the challenges posed by big data. However, the theoretical understanding of convolutional structures, which are vital to many structured deep learning models, remains incomplete. To partially bridge this gap, this letter aims to develop generalization analysis for deep convolutional neural network (CNN) algorithms using learning theory. Specifically, we focus on investigating robust regression using correntropy-induced loss functions derived from information-theoretic learning. Our analysis demonstrates an explicit convergence rate for deep CNN-based robust regression algorithms when the target function resides in the Korobov space. 
This study sheds light on the theoretical underpinnings of CNNs and provides a framework for understanding their performance and limitations.<\/jats:p>","DOI":"10.1162\/neco_a_01650","type":"journal-article","created":{"date-parts":[[2024,3,8]],"date-time":"2024-03-08T20:57:14Z","timestamp":1709931434000},"page":"718-743","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":0,"title":["Learning Korobov Functions by Correntropy and Convolutional Neural Networks"],"prefix":"10.1162","volume":"36","author":[{"given":"Zhiying","family":"Fang","sequence":"first","affiliation":[{"name":"Institute of Applied Mathematics, Shenzhen Polytechnic University, Shenzhen, Guangdong, China fangzhiying@szpu.edu.cn"}]},{"given":"Tong","family":"Mao","sequence":"additional","affiliation":[{"name":"Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 4700, Kingdom of Saudi Arabia tong.mao@kaust.edu.sa"}]},{"given":"Jun","family":"Fan","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong junfan@hkbu.edu.hk"}]}],"member":"281","published-online":{"date-parts":[[2024,3,21]]},"reference":[{"issue":"1","key":"2024032521535846700_bib1","first-page":"2285","article-title":"Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks","volume":"20","author":"Bartlett","year":"2019","journal-title":"Journal of Machine Learning Research"},{"key":"2024032521535846700_bib2","article-title":"Shallow and deep networks are near-optimal approximators of Korobov functions","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Blanchard","year":"2021"},{"key":"2024032521535846700_bib3","doi-asserted-by":"publisher","first-page":"147","DOI":"10.1017\/S0962492904000182","article-title":"Sparse 
grids","volume":"13","author":"Bungartz","year":"2004","journal-title":"Acta Numerica"},{"issue":"1","key":"2024032521535846700_bib4","doi-asserted-by":"publisher","first-page":"233","DOI":"10.1007\/BF02124745","article-title":"Limitations of the approximation capabilities of neural networks with one hidden layer","volume":"5","author":"Chui","year":"1996","journal-title":"Advances in Computational Mathematics"},{"key":"2024032521535846700_bib5","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511618796","volume-title":"Learning theory: An approximation theory viewpoint","author":"Cucker","year":"2007"},{"key":"2024032521535846700_bib6","article-title":"Optimal convergence rates of deep convolutional neural networks: Additive ridge functions","volume":"1","author":"Fang","year":"2022","journal-title":"Transactions on Machine Learning Research"},{"key":"2024032521535846700_bib7","doi-asserted-by":"publisher","first-page":"154","DOI":"10.1016\/j.neunet.2020.07.029","article-title":"Theory of deep convolutional neural networks II: Spherical analysis","volume":"131","author":"Fang","year":"2020","journal-title":"Neural Networks"},{"key":"2024032521535846700_bib8","doi-asserted-by":"publisher","first-page":"101426","DOI":"10.1016\/j.jco.2019.101426","article-title":"Optimal learning rates for distribution regression","volume":"56","author":"Fang","year":"2020","journal-title":"Journal of Complexity"},{"issue":"4","key":"2024032521535846700_bib9","doi-asserted-by":"publisher","DOI":"10.3934\/mfc.2022021","article-title":"CNN models for readability of Chinese texts","volume":"5","author":"Feng","year":"2022","journal-title":"Mathematical Foundations of Computing"},{"issue":"9","key":"2024032521535846700_bib10","doi-asserted-by":"crossref","DOI":"10.1109\/TNNLS.2021.3134675","article-title":"Generalization analysis of CNNs for classification on spheres","volume":"34","author":"Feng","year":"2023","journal-title":"IEEE Transactions on Neural Networks and Learning 
Systems"},{"issue":"1","key":"2024032521535846700_bib11","first-page":"25","article-title":"A statistical learning approach to modal regression","volume":"21","author":"Feng","year":"2020","journal-title":"Journal of Machine Learning Research"},{"issue":"30","key":"2024032521535846700_bib12","first-page":"993","article-title":"Learning with the maximum correntropy criterion induced losses for regression","volume":"16","author":"Feng","year":"2015","journal-title":"Journal of Machine Learning Research"},{"issue":"2","key":"2024032521535846700_bib13","doi-asserted-by":"publisher","first-page":"495","DOI":"10.1016\/j.acha.2020.05.009","article-title":"Learning under (1 + \u03f5)-moment conditions","volume":"49","author":"Feng","year":"2020","journal-title":"Applied and Computational Harmonic Analysis"},{"issue":"2","key":"2024032521535846700_bib14","doi-asserted-by":"publisher","first-page":"795","DOI":"10.1016\/j.acha.2019.09.001","article-title":"Learning with correntropy-induced losses for regression with mixture of symmetric stable noise","volume":"48","author":"Feng","year":"2020","journal-title":"Applied and Computational Harmonic Analysis"},{"key":"2024032521535846700_bib15","doi-asserted-by":"publisher","DOI":"10.1007\/s10208-023-09616-9","article-title":"Optimality of robust online learning","author":"Guo","year":"2023","journal-title":"Foundations of Computational Mathematics"},{"key":"2024032521535846700_bib16","first-page":"377","article-title":"Learning theory approach to minimum error entropy criterion","volume":"14","author":"Hu","year":"2013","journal-title":"Journal of Machine Learning Research"},{"issue":"6","key":"2024032521535846700_bib17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1162\/neco_a_01548","article-title":"Generalization analysis of pairwise learning for ranking with deep neural networks","volume":"35","author":"Huang","year":"2023","journal-title":"Neural 
Computation"},{"key":"2024032521535846700_bib18","author":"Lei","year":"2023","journal-title":"Solving PDEs on spheres with physics-informed convolutional neural networks."},{"key":"2024032521535846700_bib19","doi-asserted-by":"publisher","first-page":"46","DOI":"10.3389\/fams.2019.00046","article-title":"Deep net tree structure for balance of capacity and approximation ability","volume":"5","author":"Lin","year":"2019","journal-title":"Frontiers in Applied Mathematics and Statistics"},{"issue":"7","key":"2024032521535846700_bib20","doi-asserted-by":"publisher","first-page":"4610","DOI":"10.1109\/TIT.2022.3151753","article-title":"Universal consistency of deep convolutional neural networks","volume":"68","author":"Lin","year":"2022","journal-title":"IEEE Transactions on Information Theory"},{"key":"2024032521535846700_bib21","first-page":"34","article-title":"Robust representations in deep learning","volume-title":"Proceedings of the 15th International Conference on Advances in Databases, Knowledge, and Data Application","author":"Liu","year":"2023"},{"issue":"1","key":"2024032521535846700_bib22","doi-asserted-by":"publisher","first-page":"107","DOI":"10.1142\/S0219530519410124","article-title":"Optimal learning with gaussians and correntropy loss","volume":"19","author":"Lv","year":"2021","journal-title":"Analysis and Applications"},{"key":"2024032521535846700_bib23","doi-asserted-by":"publisher","first-page":"778","DOI":"10.1016\/j.neunet.2021.09.027","article-title":"Theory of deep convolutional neural networks III: Approximating radial functions","volume":"144","author":"Mao","year":"2021","journal-title":"Neural Networks"},{"issue":"1","key":"2024032521535846700_bib24","doi-asserted-by":"publisher","first-page":"93","DOI":"10.1142\/S0219530522400085","article-title":"Approximating functions with multi-features by deep convolutional neural networks","volume":"21","author":"Mao","year":"2023","journal-title":"Analysis and 
Applications"},{"issue":"6","key":"2024032521535846700_bib25","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1007\/s10444-022-09991-x","article-title":"Approximation of functions from Korobov spaces by deep convolutional neural networks","volume":"48","author":"Mao","year":"2022","journal-title":"Advances in Computational Mathematics"},{"issue":"1","key":"2024032521535846700_bib26","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1162\/neco.1996.8.1.164","article-title":"Neural networks for optimal approximation of smooth and analytic functions","volume":"8","author":"Mhaskar","year":"1996","journal-title":"Neural Computation"},{"issue":"1","key":"2024032521535846700_bib27","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1137\/18M1189336","article-title":"New error bounds for deep ReLU networks using sparse grids","volume":"1","author":"Montanelli","year":"2019","journal-title":"SIAM Journal on Mathematics of Data Science"},{"key":"2024032521535846700_bib28","first-page":"4922","article-title":"Approximation and non-parametric estimation of ResNet-type convolutional neural networks","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Oono","year":"2019"},{"key":"2024032521535846700_bib29","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4419-1570-2","volume-title":"Information theoretic learning: Renyi\u2019s entropy and kernel perspectives","author":"Principe","year":"2010"},{"key":"2024032521535846700_bib30","first-page":"2876","article-title":"Approximation with CNNs in Sobolev space: With applications to classification","volume-title":"Advances in neural information processing systems","author":"Shen","year":"2022"},{"key":"2024032521535846700_bib31","first-page":"1042","article-title":"Quadrature and interpolation formulas for tensor products of certain classes of functions","volume-title":"Doklady Akademii 
Nauk","author":"Smolyak","year":"1963"},{"key":"2024032521535846700_bib32","doi-asserted-by":"publisher","first-page":"50","DOI":"10.1007\/s00041-023-10027-1","article-title":"Approximation of nonlinear functionals using ReLU networks","volume":"29","author":"Song","year":"2023","journal-title":"Journal of Fourier Analysis and Applications"},{"key":"2024032521535846700_bib33","doi-asserted-by":"publisher","first-page":"424","DOI":"10.1016\/j.neunet.2023.07.012","article-title":"Approximation of smooth functionals using ReLU networks","volume":"166","author":"Song","year":"2023","journal-title":"Neural Networks"},{"key":"2024032521535846700_bib34","article-title":"Approximation and non-parametric estimation of functions over high-dimensional spheres via deep ReLU networks","volume-title":"Proceedings of the Eleventh International Conference on Learning Representations","author":"Suh","year":"2022"},{"key":"2024032521535846700_bib35","article-title":"Adaptivity of deep ReLU network for learning in Besov and mixed smooth Besov spaces: Optimal rate and curse of dimensionality","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Suzuki","year":"2019"},{"key":"2024032521535846700_bib36","doi-asserted-by":"publisher","first-page":"103","DOI":"10.1016\/j.neunet.2017.07.002","article-title":"Error bounds for approximations with deep ReLU networks","volume":"94","author":"Yarotsky","year":"2017","journal-title":"Neural Networks"},{"key":"2024032521535846700_bib37","first-page":"4669","article-title":"Information-theoretic methods in deep neural networks: Recent advances and emerging opportunities","volume-title":"Proceedings of IJCAI","author":"Yu","year":"2021"},{"key":"2024032521535846700_bib38","author":"Zhang","year":"2023"},{"key":"2024032521535846700_bib39","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1016\/j.neunet.2020.01.018","article-title":"Theory of deep convolutional neural networks: 
Downsampling","volume":"124","author":"Zhou","year":"2020","journal-title":"Neural Networks"},{"issue":"2","key":"2024032521535846700_bib40","doi-asserted-by":"publisher","first-page":"787","DOI":"10.1016\/j.acha.2019.06.004","article-title":"Universality of deep convolutional neural networks","volume":"48","author":"Zhou","year":"2020","journal-title":"Applied and Computational Harmonic Analysis"},{"key":"2024032521535846700_bib41","doi-asserted-by":"publisher","first-page":"101582","DOI":"10.1016\/j.acha.2023.101582","article-title":"Learning ability of interpolating convolutional neural networks","volume":"68","author":"Zhou","year":"2024","journal-title":"Applied and Computational Harmonic Analysis"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/36\/4\/718\/2351739\/neco_a_01650.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/36\/4\/718\/2351739\/neco_a_01650.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,25]],"date-time":"2024-03-25T21:54:31Z","timestamp":1711403671000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/36\/4\/718\/119781\/Learning-Korobov-Functions-by-Correntropy-and"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,21]]},"references-count":41,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,3,21]]},"published-print":{"date-parts":[[2024,3,21]]}},"URL":"https:\/\/doi.org\/10.1162\/neco_a_01650","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,4]]},"published":{"date-parts":[[2024,3,21]]}}}