{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,30]],"date-time":"2025-07-30T14:31:51Z","timestamp":1753885911361,"version":"3.41.2"},"reference-count":38,"publisher":"World Scientific Pub Co Pte Ltd","issue":"01","funder":[{"DOI":"10.13039\/501100001381","name":"National Research Foundation Singapore","doi-asserted-by":"publisher","award":["ACADEMIC RESEARCH FUND (AcRF) R-144-000-415-114"],"award-info":[{"award-number":["ACADEMIC RESEARCH FUND (AcRF) R-144-000-415-114"]}],"id":[{"id":"10.13039\/501100001381","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Artif. Intell. Robot. Res."],"published-print":{"date-parts":[[2024,3]]},"abstract":"<jats:p> It has been recently demonstrated that optimal neural networks operate near the asymptotic edge of chaos for state-of-the-art feed-forward neural networks, where its generalization power is maximal due to the highest number of asymptotic metastable states. However, how to leverage this principle to improve the model training process remains open. Here, by mapping the model evolution during training to the phase diagram in the classic analytic result of Sherrington\u2013Kirkpatrick model in spin glasses, we illustrate on a simple neural network model that one can provide principled training of the network without manually tuning the training hyper-parameters. In particular, we provide a semi-analytical method to set the optimal weight decay strength, such that the model will converge toward the edge of chaos during training. Consequently, such hyper-parameter setting leads the model to achieve the highest test accuracy. Another benefit for restricting the model at the edge of chaos is its robustness against the common practical problem of label noise, as we find that it automatically avoids fitting the shuffled labels in the training samples while maintaining good fitting to the correct labels, providing simple means of achieving good performance on noisy labels without any additional treatment. <\/jats:p>","DOI":"10.1142\/s2972335323500011","type":"journal-article","created":{"date-parts":[[2023,9,13]],"date-time":"2023-09-13T06:40:30Z","timestamp":1694587230000},"source":"Crossref","is-referenced-by-count":0,"title":["Asymptotic Edge of Chaos as Guiding Principle for Neural Network Training"],"prefix":"10.1142","volume":"01","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-4398-3430","authenticated-orcid":false,"given":"Lin","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Physics, Faculty of Science, National University of Singapore, Singapore 117551, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6578-2558","authenticated-orcid":false,"given":"Ling","family":"Feng","sequence":"additional","affiliation":[{"name":"Department of Physics, Faculty of Science, National University of Singapore, Singapore 117551, Singapore"},{"name":"IHPC, Agency for Science, Technology and Research, Singapore, Singapore 138632, Singapore"}]},{"given":"Kan","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Mathematics, Faculty of Science, National University of Singapore, Singapore 119076, Singapore"},{"name":"Risk Management Institute, National University of Singapore, Singapore 119613, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3339-669X","authenticated-orcid":false,"given":"Choy Heng","family":"Lai","sequence":"additional","affiliation":[{"name":"Department of Physics, Faculty of Science, National University of Singapore, Singapore 117551, Singapore"}]}],"member":"219","published-online":{"date-parts":[[2023,10,27]]},"reference":[{"key":"S2972335323500011BIB001","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"key":"S2972335323500011BIB002","first-page":"1097","volume-title":"Advances in Neural Information Processing Systems","volume":"25","author":"Krizhevsky A.","year":"2012"},{"key":"S2972335323500011BIB003","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2205597"},{"key":"S2972335323500011BIB004","first-page":"2493","volume":"12","author":"Collobert R.","year":"2011","journal-title":"J. Mach. Learn. Res."},{"key":"S2972335323500011BIB005","doi-asserted-by":"publisher","DOI":"10.1038\/nature16961"},{"key":"S2972335323500011BIB007","first-page":"314","volume-title":"Proc. 35th Int. Conf. Machine Learning","author":"Baity-Jesi M.","year":"2018"},{"key":"S2972335323500011BIB008","first-page":"192","volume-title":"Proc. 18th Int. Conf. Artificial Intelligence and Statistics","author":"Choromanska A.","year":"2015"},{"key":"S2972335323500011BIB009","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.100.012115"},{"key":"S2972335323500011BIB011","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2020.08.022"},{"key":"S2972335323500011BIB012","first-page":"4284","volume-title":"Proc. 36th Int. Conf. Machine Learning","author":"Mahoney M.","year":"2019"},{"key":"S2972335323500011BIB013","doi-asserted-by":"publisher","DOI":"10.1016\/0167-2789(90)90064-V"},{"key":"S2972335323500011BIB014","first-page":"293","volume-title":"Dynamic Patterns in Complex Systems: Proceedings of the Conference in Honor of Hermann Haken on the Occasion of His 60th Birthday","volume":"212","author":"Packard N. H.","year":"1988"},{"key":"S2972335323500011BIB015","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.61.259"},{"key":"S2972335323500011BIB016","doi-asserted-by":"publisher","DOI":"10.1088\/0305-4470\/20\/11\/009"},{"key":"S2972335323500011BIB017","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.47.359"},{"key":"S2972335323500011BIB018","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevB.25.6860"},{"issue":"4","key":"S2972335323500011BIB019","first-page":"041030","volume":"5","author":"Kadmon J.","year":"2015","journal-title":"Phys. Rev. X"},{"key":"S2972335323500011BIB020","doi-asserted-by":"publisher","DOI":"10.1007\/s12064-011-0146-8"},{"key":"S2972335323500011BIB021","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevE.100.062312"},{"key":"S2972335323500011BIB022","first-page":"3360","volume-title":"Advances in Neural Information Processing Systems","volume":"29","author":"Poole B.","year":"2016"},{"key":"S2972335323500011BIB024","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.35.1792"},{"key":"S2972335323500011BIB025","doi-asserted-by":"publisher","DOI":"10.1088\/0305-4470\/11\/5\/028"},{"key":"S2972335323500011BIB026","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevLett.69.3717"},{"key":"S2972335323500011BIB027","series-title":"World Scientific Lecture Notes in Physics","volume-title":"Spin Glass Theory and Beyond","volume":"9","author":"M\u00e9zard M.","year":"1987"},{"key":"S2972335323500011BIB028","doi-asserted-by":"publisher","DOI":"10.1093\/acprof:oso\/9780198509417.001.0001"},{"key":"S2972335323500011BIB030","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2007.903150"},{"key":"S2972335323500011BIB031","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2017.2708039"},{"key":"S2972335323500011BIB033","first-page":"10678","volume-title":"Proc. 33rd Int. Conf. Neural Information Processing Systems","volume":"32","author":"Golatkar A. S.","year":"2019"},{"key":"S2972335323500011BIB034","first-page":"950","volume-title":"Advances in Neural Information Processing Systems","volume":"4","author":"Krogh A.","year":"1992"},{"volume-title":"Proc. Seventh Int. Conf. Learning Representations","year":"2019","author":"Zhang G.","key":"S2972335323500011BIB035"},{"issue":"2","key":"S2972335323500011BIB036","first-page":"281","volume":"13","author":"Bergstra J.","year":"2012","journal-title":"J. Mach. Learn. Res."},{"volume-title":"Proc. 34th Int. Conf. Machine Learning","year":"2017","author":"Arpit D.","key":"S2972335323500011BIB037"},{"volume-title":"Proc. Eighth Int. Conf. Learning Representations","year":"2020","author":"Menon A. K.","key":"S2972335323500011BIB042"},{"key":"S2972335323500011BIB043","first-page":"708","volume-title":"Proc. 33rd Int. Conf. Machine Learning","author":"Patrini G.","year":"2017"},{"volume-title":"Proc. 23rd Int. Conf. Artificial Intelligence and Statistics","year":"2019","author":"Li M.","key":"S2972335323500011BIB044"},{"key":"S2972335323500011BIB045","first-page":"2101","volume-title":"Proc. 34th Int. Conf. Machine Learning","author":"Li Q.","year":"2017"},{"volume-title":"Proc. NIPS Workshop Optimization for Machine Learning","year":"2015","author":"Mandt S.","key":"S2972335323500011BIB046"},{"volume-title":"Proc. Sixth Int. Conf. Learning Representations","year":"2018","author":"Smith S. L.","key":"S2972335323500011BIB047"}],"container-title":["International Journal of Artificial Intelligence and Robotics Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S2972335323500011","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,4]],"date-time":"2024-03-04T06:18:45Z","timestamp":1709533125000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S2972335323500011"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,27]]},"references-count":38,"journal-issue":{"issue":"01","published-print":{"date-parts":[[2024,3]]}},"alternative-id":["10.1142\/S2972335323500011"],"URL":"https:\/\/doi.org\/10.1142\/s2972335323500011","relation":{},"ISSN":["2972-3353","2972-3361"],"issn-type":[{"type":"print","value":"2972-3353"},{"type":"electronic","value":"2972-3361"}],"subject":[],"published":{"date-parts":[[2023,10,27]]},"article-number":"2350001"}}