{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2023,8,7]],"date-time":"2023-08-07T16:10:24Z","timestamp":1691424624076},"reference-count":26,"publisher":"MIT Press","issue":"9","content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2023,8,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Mirror descent is an elegant optimization technique that leverages a dual space of parametric models to perform gradient descent. While originally developed for convex optimization, it has increasingly been applied in the field of machine learning. In this study, we propose a novel approach for using mirror descent to initialize the parameters of neural networks. Specifically, we demonstrate that by using the Hopfield model as a prototype for neural networks, mirror descent can effectively train the model with significantly improved performance compared to traditional gradient descent methods that rely on random parameter initialization. Our findings highlight the potential of mirror descent as a promising initialization technique for enhancing the optimization of machine learning models.<\/jats:p>","DOI":"10.1162\/neco_a_01602","type":"journal-article","created":{"date-parts":[[2023,7,12]],"date-time":"2023-07-12T19:41:26Z","timestamp":1689190886000},"page":"1529-1542","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":0,"title":["Mirror Descent of Hopfield Model"],"prefix":"10.1162","volume":"35","author":[{"given":"Hyungjoon","family":"Soh","sequence":"first","affiliation":[{"name":"Department of Physics Education, Seoul National University, Seoul 08826, Korea hjsoh88@gmail.com"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dongyeob","family":"Kim","sequence":"additional","affiliation":[{"name":"Department of Physics and Astronomy, Seoul National University, Seoul 08826, Korea ktfa159@snu.ac.kr"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Juno","family":"Hwang","sequence":"additional","affiliation":[{"name":"Department of Physics Education, Seoul National University, Seoul 08826, Korea wnsdh10@snu.ac.kr"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Junghyo","family":"Jo","sequence":"additional","affiliation":[{"name":"Department of Physics Education, Department of Physics and Astronomy, and Center for Theoretical Physics and Artificial Intelligence Institute"},{"name":"Seoul National University, Seoul 08826, Korea"},{"name":"School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455 jojunghyo@snu.ac.kr"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","published-online":{"date-parts":[[2023,8,7]]},"reference":[{"key":"2023080715475822600_bib1","doi-asserted-by":"publisher","first-page":"147","DOI":"10.1207\/s15516709cog0901_7","article-title":"A learning algorithm for Boltzmann machines","volume-title":"Cognitive Science","author":"Ackley","year":"1985"},{"key":"2023080715475822600_bib2","article-title":"Stochastic mirror descent on overparameterized nonlinear models","volume-title":"IEEE Transactions on Neural Networks and Learning Systems","author":"Azizan","year":"2021"},{"key":"2023080715475822600_bib3","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1016\/S0167-6377(02)00231-6","article-title":"Mirror descent and nonlinear projected subgradient methods for convex optimization","volume-title":"Operations Research Letters","author":"Beck","year":"2003"},{"key":"2023080715475822600_bib4","first-page":"773","article-title":"Fast convergence of competitive spiking neural networks with sample-based weight initialization","volume-title":"Proceedings of Information Processing and Management of Uncertainty in Knowledge-Based Systems, 18th International Conference","author":"Cachi","year":"2020"},{"key":"2023080715475822600_bib5","doi-asserted-by":"publisher","first-page":"5017","DOI":"10.1109\/TIP.2015.2475625","article-title":"PCANet: A simple deep learning baseline for image classification?","volume-title":"IEEE Transactions on Image Processing","author":"Chan","year":"2015"},{"key":"2023080715475822600_bib6","doi-asserted-by":"publisher","first-page":"220","DOI":"10.1016\/j.neunet.2021.11.020","article-title":"Feedforward neural networks initialization based on discriminant learning","volume-title":"Neural Networks","author":"Chumachenko","year":"2022"},{"key":"2023080715475822600_bib7","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-642-35289-8_30","article-title":"Learning feature representations with K-means","author":"Coates","year":"2012"},{"key":"2023080715475822600_bib8","volume-title":"Data-driven weight initialization with Sylvester solvers","author":"Das","year":"2021"},{"key":"2023080715475822600_bib9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1137\/0801001","article-title":"Variable metric method for minimization","volume-title":"SIAM Journal on Optimization","author":"Davidon","year":"1991"},{"key":"2023080715475822600_bib10","article-title":"Fisher-Legendre (FishLeg) optimization of deep neural networks","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Garcia","year":"2023"},{"key":"2023080715475822600_bib11","first-page":"265","article-title":"The robustness of the p-norm algorithms","volume-title":"Machine Learning, 53","author":"Gentile","year":"2003"},{"key":"2023080715475822600_bib12","first-page":"249","article-title":"Understanding the difficulty of training deep feedforward neural networks","volume-title":"Journal of Machine Learning Research\u2014Proceedings Track","author":"Glorot","year":"2010"},{"key":"2023080715475822600_bib13","first-page":"1026","article-title":"Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"He","year":"2015"},{"key":"2023080715475822600_bib14","doi-asserted-by":"crossref","DOI":"10.4324\/9781410612403","volume-title":"The organization of behavior: A neuropsychological theory","author":"Hebb","year":"2005"},{"key":"2023080715475822600_bib15","doi-asserted-by":"publisher","first-page":"2554","DOI":"10.1073\/pnas.79.8.2554","article-title":"Neural networks and physical systems with emergent collective computational abilities","volume-title":"Proceedings of the National Academy of Sciences","author":"Hopfield","year":"1982"},{"key":"2023080715475822600_bib16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s10107-022-01816-5","article-title":"Policy mirror descent for reinforcement learning: Linear convergence, new sampling complexity, and generalized problem classes","volume-title":"Mathematical Programming","author":"Lan","year":"2023"},{"key":"2023080715475822600_bib17","first-page":"2408","article-title":"Optimizing neural networks with Kronecker-factored approximate curvature","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Martens","year":"2015"},{"key":"2023080715475822600_bib18","volume-title":"Linear discriminant initialization for feed-forward neural networks","author":"Masden","year":"2020"},{"key":"2023080715475822600_bib19","doi-asserted-by":"publisher","first-page":"291","DOI":"10.1007\/s10462-021-10033-z","article-title":"A review on weight initialization strategies for neural networks","volume-title":"Artificial Intelligence Review","author":"Narkhede","year":"2022"},{"key":"2023080715475822600_bib20","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-319-91578-4","volume-title":"Lectures on convex optimization","author":"Nesterov","year":"2018"},{"key":"2023080715475822600_bib21","volume-title":"Numerical optimization","author":"Nocedal","year":"2006","edition":"2nd"},{"key":"2023080715475822600_bib22","doi-asserted-by":"publisher","first-page":"1451","DOI":"10.1109\/TIT.2015.2388583","article-title":"The information geometry of mirror descent","volume-title":"IEEE Transactions on Information Theory","author":"Raskutti","year":"2015"},{"key":"2023080715475822600_bib23","first-page":"877","article-title":"PCA-initialized deep neural networks applied to document image analysis","volume-title":"Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition","author":"Seuret","year":"2017"},{"key":"2023080715475822600_bib24","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/P19-1355","volume-title":"Energy and policy considerations for deep learning in NLP","author":"Strubell","year":"2019"},{"key":"2023080715475822600_bib25","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/S0925-2312(01)00674-9","article-title":"An independent component analysis based weight initialization method for multilayer perceptrons","volume-title":"Neurocomputing","author":"Yam","year":"2002"},{"key":"2023080715475822600_bib26","volume-title":"Policy mirror descent for regularized reinforcement learning: A generalized framework with linear convergence.","author":"Zhan","year":"2021"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/35\/9\/1529\/2152739\/neco_a_01602.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/35\/9\/1529\/2152739\/neco_a_01602.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,8,7]],"date-time":"2023-08-07T15:48:16Z","timestamp":1691423296000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/35\/9\/1529\/116704\/Mirror-Descent-of-Hopfield-Model"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,7]]},"references-count":26,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2023,8,7]]},"published-print":{"date-parts":[[2023,8,7]]}},"URL":"https:\/\/doi.org\/10.1162\/neco_a_01602","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2023,9]]},"published":{"date-parts":[[2023,8,7]]}}}