{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T12:41:29Z","timestamp":1771332089329,"version":"3.50.1"},"reference-count":28,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2020,1,2]],"date-time":"2020-01-02T00:00:00Z","timestamp":1577923200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>An artificial neural network (ANN) is an automatic way of capturing linear and nonlinear correlations, as well as spatial and other structural dependence, among features. Such a model performs well in many application areas, such as classification and prediction from magnetic resonance imaging, spatial data, and computer vision tasks. Most commonly used ANNs assume that the training sample is large relative to the dimension of the feature vector. However, in modern applications, as mentioned above, the training sample sizes are often low, and may be even lower than the dimension of the feature vector. In this paper, we consider a single-layer ANN classification model that is suitable for analyzing high-dimensional low sample-size (HDLSS) data. We investigate the theoretical properties of the sparse group lasso regularized neural network and show that, under mild conditions, the classification risk converges to the optimal Bayes classifier\u2019s risk (universal consistency). Moreover, we propose a variation on the regularization term.
A few examples in popular research fields are also provided to illustrate the theory and methods.<\/jats:p>","DOI":"10.3390\/make2010001","type":"journal-article","created":{"date-parts":[[2020,1,3]],"date-time":"2020-01-03T04:43:03Z","timestamp":1578026583000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Statistical Aspects of High-Dimensional Sparse Artificial Neural Network Models"],"prefix":"10.3390","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8971-0257","authenticated-orcid":false,"given":"Kaixu","family":"Yang","sequence":"first","affiliation":[{"name":"Department of Statistics and Probability, Michigan State University, 619 Red Cedar Rd, East Lansing, MI 48824, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9362-4984","authenticated-orcid":false,"given":"Tapabrata","family":"Maiti","sequence":"additional","affiliation":[{"name":"Department of Statistics and Probability, Michigan State University, 619 Red Cedar Rd, East Lansing, MI 48824, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,1,2]]},"reference":[{"key":"ref_1","unstructured":"Anthony, M., and Bartlett, P.L. (2009). Neural Network Learning: Theoretical Foundations, Cambridge University Press."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"930","DOI":"10.1109\/18.256500","article-title":"Universal approximation bounds for superpositions of a sigmoidal function","volume":"39","author":"Barron","year":"1993","journal-title":"IEEE Trans. Inf. 
Theory"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"989","DOI":"10.3150\/bj\/1106314847","article-title":"Some theory for Fisher\u2019s linear discriminant function, naive Bayes\u2019, and some alternatives when there are many more variables than observations","volume":"10","author":"Bickel","year":"2004","journal-title":"Bernoulli"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"B\u00fchlmann, P., and Van De Geer, S. (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications, Springer Science & Business Media.","DOI":"10.1007\/978-3-642-20192-9"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Chen, T., and Guestrin, C. (2016, August 13\u201317). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939785"},{"key":"ref_6","unstructured":"Chollet, F., and others (2015, January 01). Keras. Available online: https:\/\/keras.io."},{"key":"ref_7","unstructured":"Chaudhuri, K., and Dasgupta, S. (2014, December 8\u201313). Rates of convergence for nearest neighbor classification. Proceedings of the Advances in Neural Information Processing Systems 27 (NIPS 2014), Montr\u00e9al, QC, Canada."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/BF02551274","article-title":"Approximation by superpositions of a sigmoidal function","volume":"2","author":"Cybenko","year":"1989","journal-title":"Math. Control. Signals Syst."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"407","DOI":"10.1214\/009053604000000067","article-title":"Least angle regression","volume":"32","author":"Efron","year":"2004","journal-title":"Ann.
Stat."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2605","DOI":"10.1214\/07-AOS504","article-title":"High dimensional classification using features annealed independence rules","volume":"36","author":"Fan","year":"2008","journal-title":"Ann. Stat."},{"key":"ref_11","unstructured":"Feng, J., and Simon, N. (2017). Sparse-Input Neural Networks for High-dimensional Nonparametric Regression and Classification. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, June 16\u201321). Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"2282","DOI":"10.1214\/09-AOS781","article-title":"Variable selection in nonparametric additive models","volume":"38","author":"Huang","year":"2010","journal-title":"Ann. Stat."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Khuri, A. (2009). Linear Model Methodology, Chapman and Hall\/CRC.","DOI":"10.1201\/9781420010442"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1016\/j.acha.2009.05.006","article-title":"Sparse regression using mixed norms","volume":"27","author":"Kowalski","year":"2009","journal-title":"Appl. Comput. Harmon. Anal."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"861","DOI":"10.1016\/S0893-6080(05)80131-5","article-title":"Multilayer feedforward networks with a nonpolynomial activation function can approximate any function","volume":"6","author":"Leshno","year":"1993","journal-title":"Neural Networks"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, B., Wei, Y., Zhang, Y., and Yang, Q. (2017). Deep Neural Networks for High Dimension, Low Sample Size Data.
IJCAI, 2287\u20132293.","DOI":"10.24963\/ijcai.2017\/318"},{"key":"ref_18","unstructured":"Mazumder, R., Radchenko, P., and Dedieu, A. (2017). Subset selection with shrinkage: Sparse linear modeling when the SNR is low. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"3779","DOI":"10.1214\/09-AOS692","article-title":"High-dimensional additive modeling","volume":"37","author":"Meier","year":"2009","journal-title":"Ann. Stat."},{"key":"ref_20","unstructured":"Siegel, J.W., and Xu, J. (2019). On the Approximation Properties of Neural Networks. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"231","DOI":"10.1080\/10618600.2012.681250","article-title":"A sparse-group lasso","volume":"22","author":"Simon","year":"2013","journal-title":"J. Comput. Graph. Stat."},{"key":"ref_22","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv Preprint."},{"key":"ref_23","first-page":"1929","article-title":"Dropout: A simple way to prevent neural networks from overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"209","DOI":"10.1007\/s11749-010-0197-z","article-title":"\u21131-penalization for mixture regression models","volume":"19","year":"2010","journal-title":"Test"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"245","DOI":"10.1007\/BF02124746","article-title":"Critical points for least-squares problems involving certain analytic functions, with applications to sigmoidal nets","volume":"5","author":"Sontag","year":"1996","journal-title":"Adv. Comput. Math."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","article-title":"Regression shrinkage and selection via the lasso","volume":"58","author":"Tibshirani","year":"1996","journal-title":"J. R. Stat. Soc.
Ser."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1111\/j.1467-9868.2005.00532.x","article-title":"Model selection and estimation in regression with grouped variables","volume":"68","author":"Yuan","year":"2006","journal-title":"J. R. Stat. Soc. Ser."},{"key":"ref_28","unstructured":"Li, Y., and Maiti, T. (2018). High Dimensional Discriminant Analysis for Spatially Dependent Data, Department of Statistics and Probability, Michigan State University. Technical Report."}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/2\/1\/1\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:42:11Z","timestamp":1760362931000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/2\/1\/1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,1,2]]},"references-count":28,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2020,3]]}},"alternative-id":["make2010001"],"URL":"https:\/\/doi.org\/10.3390\/make2010001","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,1,2]]}}}