{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T02:15:24Z","timestamp":1774404924738,"version":"3.50.1"},"reference-count":23,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,10,4]],"date-time":"2025-10-04T00:00:00Z","timestamp":1759536000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Fujian Province Natural Science Foundation","award":["2022J011103"],"award-info":[{"award-number":["2022J011103"]}]},{"name":"Quanzhou City\u2019s Major Science and Technology Special Project","award":["2024QZGZ8"],"award-info":[{"award-number":["2024QZGZ8"]}]}],"content-domain":{"domain":["www.mdpi.com"],"crossmark-restriction":true},"short-container-title":["Symmetry"],"abstract":"<jats:p>Neural network training often suffers from spectral asymmetry, where gradient energy is disproportionately allocated to high-frequency components, leading to suboptimal convergence and reduced efficiency. This paper introduces Gradient Spectral Normalization (GSN), a novel optimization technique designed to restore spectral symmetry by dynamically reshaping gradient distributions in the frequency domain. GSN transforms gradients using FFT, applies layer-specific energy redistribution to enforce a symmetric balance between low- and high-frequency components, and reconstructs the gradients for parameter updates. By tailoring normalization schedules for attention and MLP layers, GSN enhances inference performance and improves model accuracy with minimal overhead. Our approach leverages the principle of symmetry to create more stable and efficient neural systems, offering a practical solution for resource-constrained environments. This frequency-domain paradigm, grounded in symmetry restoration, opens new directions for neural network optimization with broad implications for large-scale AI systems.<\/jats:p>","DOI":"10.3390\/sym17101648","type":"journal-article","created":{"date-parts":[[2025,10,6]],"date-time":"2025-10-06T08:10:51Z","timestamp":1759738251000},"page":"1648","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Restoring Spectral Symmetry in Gradients: A Normalization Approach for Efficient Neural Network Training"],"prefix":"10.3390","volume":"17","author":[{"given":"Zhigao","family":"Huang","sequence":"first","affiliation":[{"name":"College of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China"}]},{"given":"Nana","family":"Gong","sequence":"additional","affiliation":[{"name":"College of Artificial Intelligence, Hebei Oriental University, Langfang 065000, China"}]},{"given":"Quanfa","family":"Li","sequence":"additional","affiliation":[{"name":"College of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China"}]},{"given":"Tianying","family":"Wu","sequence":"additional","affiliation":[{"name":"College of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China"}]},{"given":"Shiyan","family":"Zheng","sequence":"additional","affiliation":[{"name":"College of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China"}]},{"given":"Miao","family":"Pan","sequence":"additional","affiliation":[{"name":"College of Physics and Information Engineering, Quanzhou Normal University, Quanzhou 362000, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,4]]},"reference":[{"key":"ref_1","unstructured":"Rahaman, N., Baratin, A., Arpit, D., Draxler, F., Lin, M., Hamprecht, F.A., Bengio, Y., and Courville, A. (2019). On the spectral bias of neural networks. arXiv."},{"key":"ref_2","unstructured":"Cao, Y., Fang, Z., Wu, Y., Zhou, D.X., and Gu, Q. (2019). Towards understanding the spectral bias of deep learning. arXiv."},{"key":"ref_3","unstructured":"Miyato, T., Kataoka, T., Koyama, M., and Yoshida, Y. (2018). Spectral normalization for generative adversarial networks. arXiv, Published at ICLR 2018."},{"key":"ref_4","unstructured":"Jiang, K., Malik, D., and Li, Y. (2022). How Does Adaptive Optimization Impact Local Neural Network Geometry?. arXiv."},{"key":"ref_5","first-page":"8571","article-title":"Neural tangent kernel: Convergence and generalization in neural networks","volume":"31","author":"Jacot","year":"2018","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_6","unstructured":"Bietti, A., and Mairal, J. (2019). On the inductive bias of neural tangent kernels. arXiv."},{"key":"ref_7","unstructured":"Ioffe, S., and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv."},{"key":"ref_8","unstructured":"Ba, J.L., Kiros, J.R., and Hinton, G.E. (2016). Layer normalization. arXiv."},{"key":"ref_9","first-page":"7537","article-title":"Fourier features let networks learn high frequency functions in low dimensional domains","volume":"33","author":"Tancik","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_10","unstructured":"Yi, K., Zhang, Q., Wang, S., He, H., Long, G., and Niu, Z. (2023). Neural Time Series Analysis with Fourier Transform: A Survey. arXiv."},{"key":"ref_11","unstructured":"Qin, S., Lyu, F., Peng, W., Geng, D., Wang, J., Gao, N., Liu, X., and Wang, L. (2024). Toward a Better Understanding of Fourier Neural Operators: Analysis and Improvement from a Spectral Perspective. arXiv."},{"key":"ref_12","unstructured":"Farhani, G., Kazachek, A., and Wang, B. (2022). Momentum Diminishes the Effect of Spectral Bias in Physics-Informed Neural Networks. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Deshpande, M., Agarwal, S., Snigdha, V., and Bhattacharya, A.K. (2023). Investigations on convergence behaviour of Physics Informed Neural Networks across spectral ranges and derivative orders. arXiv.","DOI":"10.1109\/SSCI51031.2022.10022020"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Seroussi, I., Miron, A., and Ringel, Z. (2023). Spectral-Bias and Kernel-Task Alignment in Physically Informed Neural Networks. arXiv.","DOI":"10.1088\/2632-2153\/ad652d"},{"key":"ref_15","unstructured":"Pascanu, R., Mikolov, T., and Bengio, Y. (2013, January 17\u201319). On the difficulty of training recurrent neural networks. Proceedings of the International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_16","unstructured":"Foret, P., Kleiner, A., Mobahi, H., and Neyshabur, B. (2020). Sharpness-aware minimization for efficiently improving generalization. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Bottou, L. (2010, January 22\u201327). Large-scale machine learning with stochastic gradient descent. Proceedings of the COMPSTAT\u20192010, Paris, France.","DOI":"10.1007\/978-3-7908-2604-3_16"},{"key":"ref_18","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_19","first-page":"2121","article-title":"Adaptive subgradient methods for online learning and stochastic optimization","volume":"12","author":"Duchi","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"ref_20","first-page":"1135","article-title":"Learning both weights and connections for efficient neural network","volume":"28","author":"Han","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Jacob, B., Kligys, S., Chen, B., Zhu, M., Tang, M., Howard, A., Adam, H., and Kalenichenko, D. (2018, January 18\u201322). Quantization and training of neural networks for efficient integer-arithmetic-only inference. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00286"},{"key":"ref_22","unstructured":"Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the knowledge in a neural network. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zou, J., Deng, X., and Sun, T. (2024). Sharpness-Aware Minimization with Adaptive Regularization for Training Deep Neural Networks. arXiv.","DOI":"10.1109\/ICASSP49660.2025.10890114"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/10\/1648\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,6]],"date-time":"2025-10-06T08:19:10Z","timestamp":1759738750000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/17\/10\/1648"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,4]]},"references-count":23,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["sym17101648"],"URL":"https:\/\/doi.org\/10.3390\/sym17101648","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,4]]}}}