{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:23:45Z","timestamp":1760149425612,"version":"build-2065373602"},"reference-count":31,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2023,8,1]],"date-time":"2023-08-01T00:00:00Z","timestamp":1690848000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computation"],"abstract":"<jats:p>We present a new regularization method called Weights Reset, which periodically resets a random portion of layer weights during training using predefined probability distributions. The technique was applied and tested on several popular classification datasets: Caltech-101, CIFAR-100, and Imagenette. We compare the results with those of traditional regularization methods. The tests demonstrate that the Weights Reset method is competitive, achieving the best performance on the Imagenette dataset and on the challenging, unbalanced Caltech-101 dataset. The method also shows potential for preventing vanishing and exploding gradients. However, this analysis is preliminary, and further comprehensive studies are needed to gain a deep understanding of the capabilities and limitations of the Weights Reset method. 
The observed results suggest that the Weights Reset method can be regarded as an effective extension of traditional regularization methods and can help improve model performance and generalization.<\/jats:p>","DOI":"10.3390\/computation11080148","type":"journal-article","created":{"date-parts":[[2023,8,1]],"date-time":"2023-08-01T09:06:44Z","timestamp":1690880804000},"page":"148","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["The Weights Reset Technique for Deep Neural Networks Implicit Regularization"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0942-6134","authenticated-orcid":false,"given":"Grigoriy","family":"Plusch","sequence":"first","affiliation":[{"name":"Department of Applied Mathematics and Computer Modeling, National University of Oil and Gas \u201cGubkin University\u201d, 65, Leninsky Prospekt, 119991 Moscow, Russia"}]},{"given":"Sergey","family":"Arsenyev-Obraztsov","sequence":"additional","affiliation":[{"name":"Department of Applied Mathematics and Computer Modeling, National University of Oil and Gas \u201cGubkin University\u201d, 65, Leninsky Prospekt, 119991 Moscow, Russia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7919-4746","authenticated-orcid":false,"given":"Olga","family":"Kochueva","sequence":"additional","affiliation":[{"name":"Department of Applied Mathematics and Computer Modeling, National University of Oil and Gas \u201cGubkin University\u201d, 65, Leninsky Prospekt, 119991 Moscow, Russia"}]}],"member":"1968","published-online":{"date-parts":[[2023,8,1]]},"reference":[{"key":"ref_1","unstructured":"Tikhonov, A., and Arsenin, V. (1977). 
Solution of Ill-Posed Problems, Winston & Sons."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"533","DOI":"10.1038\/323533a0","article-title":"Learning representations by back-propagating errors","volume":"323","author":"Rumelhart","year":"1986","journal-title":"Nature"},{"key":"ref_3","unstructured":"Touretzky, D. (1989, January 27\u201330). Optimal Brain Damage. Proceedings of the Advances in Neural Information Processing Systems, Denver, CO, USA."},{"key":"ref_4","unstructured":"Prechelt, L. (1996). Neural Networks: Tricks of the Trade, Springer."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Caruana, R., Lawrence, S., and Giles, L. (December, January 27). Overfitting in neural nets: Backpropagation, conjugate gradient, and early stopping. Proceedings of the Advances in Neural Information Processing Systems 13\u2014Proceedings of the 2000 14th Annual Conference on Neural Information Processing Systems, NIPS 2000, Denver, CO, USA.","DOI":"10.1109\/IJCNN.2000.857823"},{"key":"ref_6","first-page":"1929","article-title":"Dropout: A Simple Way to Prevent Neural Networks from Overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_7","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 6\u201311). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the 32nd International Conference on Machine Learning, JMLR.org, ICML\u201915, Lille, France."},{"key":"ref_8","unstructured":"Pereira, F., Burges, C., Bottou, L., and Weinberger, K. (2012, January 3\u20136). ImageNet Classification with Deep Convolutional Neural Networks. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_9","first-page":"2822","article-title":"The Implicit Bias of Gradient Descent on Separable Data","volume":"19","author":"Soudry","year":"2017","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref_10","unstructured":"Razin, N., and Cohen, N. (2020, January 6\u201312). Implicit Regularization in Deep Learning May Not Be Explainable by Norms. Proceedings of the 34th International Conference on Neural Information Processing Systems NIPS\u201920, Red Hook, NY, USA."},{"key":"ref_11","unstructured":"Zhang, L., Xu, Z.Q.J., Luo, T., and Zhang, Y. (2022). Limitation of characterizing implicit regularization by data-independent functions. arXiv."},{"key":"ref_12","unstructured":"Behdin, K., and Mazumder, R. (2023). Sharpness-Aware Minimization: An Implicit Regularization Perspective. arXiv."},{"key":"ref_13","unstructured":"Gulcehre, C., Srinivasan, S., Sygnowski, J., Ostrovski, G., Farajtabar, M., Hoffman, M., Pascanu, R., and Doucet, A. (2022). An Empirical Study of Implicit Regularization in Deep Offline RL. arXiv."},{"key":"ref_14","unstructured":"Krizhevsky, A., and Hinton, G. (2009). Learning Multiple Layers of Features from Tiny Images, University of Toronto. Technical Report."},{"key":"ref_15","unstructured":"Li, F.F., Andreeto, M., Ranzato, M., and Perona, P. (2022). Caltech 101. CaltechDATA."},{"key":"ref_16","unstructured":"Howard, J. (2023, July 27). Imagewang. Available online: https:\/\/github.com\/fastai\/imagenette\/."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"15849","DOI":"10.1073\/pnas.1903070116","article-title":"Reconciling modern machine-learning practice and the classical bias\u2013variance trade-off","volume":"116","author":"Belkin","year":"2019","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_18","unstructured":"Glorot, X., and Bengio, Y. (2010, January 13\u201315). Understanding the difficulty of training deep feedforward neural networks. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, Sardinia, Italy."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. 
(2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv.","DOI":"10.1109\/ICCV.2015.123"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Li, F.-F. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_21","unstructured":"Grigoriy, P. (2023, July 27). Weights Reset Implicit Regularization. Available online: https:\/\/github.com\/amcircle\/weights-reset\/."},{"key":"ref_22","unstructured":"(2023, July 27). TensorFlow Datasets. A Collection of Ready-to-Use Datasets. Available online: https:\/\/www.tensorflow.org\/datasets."},{"key":"ref_23","unstructured":"Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., and Devin, M. (2023, July 27). TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Available online: tensorflow.org."},{"key":"ref_24","unstructured":"Keras (2023, July 27). Available online: https:\/\/keras.io."},{"key":"ref_25","unstructured":"Kingma, D.P., and Ba, J. (2017). Adam: A Method for Stochastic Optimization. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1111\/j.1467-9868.2005.00503.x","article-title":"Regularization and variable selection via the elastic net","volume":"67","author":"Zou","year":"2005","journal-title":"J. R. Stat. Soc. Ser. (Stat. Methodol.)"},{"key":"ref_27","first-page":"3371","article-title":"Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion","volume":"11","author":"Vincent","year":"2010","journal-title":"J. Mach. Learn. Res."},{"key":"ref_28","unstructured":"Li, H., Xu, Z., Taylor, G., Studer, C., and Goldstein, T. (2018). Visualizing the Loss Landscape of Neural Nets. 
arXiv."},{"key":"ref_29","unstructured":"Nakkiran, P., Kaplun, G., Bansal, Y., Yang, T., Barak, B., and Sutskever, I. (2019). Deep Double Descent: Where Bigger Models and More Data Hurt. arXiv."},{"key":"ref_30","unstructured":"Advani, M.S., and Saxe, A.M. (2017). High-dimensional dynamics of generalization error in neural networks. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"012115","DOI":"10.1103\/PhysRevE.100.012115","article-title":"Jamming transition as a paradigm to understand the loss landscape of deep neural networks","volume":"100","author":"Geiger","year":"2019","journal-title":"Phys. Rev. E"}],"container-title":["Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2079-3197\/11\/8\/148\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:23:44Z","timestamp":1760127824000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2079-3197\/11\/8\/148"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,1]]},"references-count":31,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2023,8]]}},"alternative-id":["computation11080148"],"URL":"https:\/\/doi.org\/10.3390\/computation11080148","relation":{},"ISSN":["2079-3197"],"issn-type":[{"type":"electronic","value":"2079-3197"}],"subject":[],"published":{"date-parts":[[2023,8,1]]}}}