{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T16:51:00Z","timestamp":1773939060370,"version":"3.50.1"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"1-2","license":[{"start":{"date-parts":[[2021,7,6]],"date-time":"2021-07-06T00:00:00Z","timestamp":1625529600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,7,6]],"date-time":"2021-07-06T00:00:00Z","timestamp":1625529600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CCF-1763314"],"award-info":[{"award-number":["CCF-1763314"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006754","name":"Army Research Laboratory","doi-asserted-by":"publisher","award":["W911NF-17-1-0304"],"award-info":[{"award-number":["W911NF-17-1-0304"]}],"id":[{"id":"10.13039\/100006754","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Math. Program."],"published-print":{"date-parts":[[2022,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms\u2014Nesterov\u2019s accelerated gradient method for strongly convex functions (NAG-SC) and Polyak\u2019s heavy-ball method\u2014we study an alternative limiting process that yields <jats:italic>high-resolution ODEs<\/jats:italic>. 
We show that these ODEs permit a general Lyapunov function framework for the analysis of convergence in both continuous and discrete time. We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and Polyak\u2019s heavy-ball method, but they allow the identification of a term that we refer to as \u201cgradient correction\u201d that is present in NAG-SC but not in the heavy-ball method and is responsible for the qualitative difference in convergence of the two methods. We also use the high-resolution ODE framework to study Nesterov\u2019s accelerated gradient method for (non-strongly) convex functions (NAG-C), uncovering a hitherto unknown result\u2014that NAG-C minimizes the squared gradient norm at an inverse cubic rate. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the accelerated convergence rates of NAG-C for smooth convex functions.<\/jats:p>","DOI":"10.1007\/s10107-021-01681-8","type":"journal-article","created":{"date-parts":[[2021,7,6]],"date-time":"2021-07-06T14:04:42Z","timestamp":1625580282000},"page":"79-148","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":104,"title":["Understanding the acceleration phenomenon via high-resolution differential equations"],"prefix":"10.1007","volume":"195","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2378-5333","authenticated-orcid":false,"given":"Bin","family":"Shi","sequence":"first","affiliation":[]},{"given":"Simon S.","family":"Du","sequence":"additional","affiliation":[]},{"given":"Michael I.","family":"Jordan","sequence":"additional","affiliation":[]},{"given":"Weijie 
J.","family":"Su","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,7,6]]},"reference":[{"issue":"4","key":"1681_CR1","doi-asserted-by":"publisher","first-page":"1102","DOI":"10.1137\/S0363012998335802","volume":"38","author":"F Alvarez","year":"2000","unstructured":"Alvarez, F.: On the minimizing property of a second order dissipative system in Hilbert spaces. SIAM J. Control Optim. 38(4), 1102\u20131119 (2000)","journal-title":"SIAM J. Control Optim."},{"issue":"8","key":"1681_CR2","doi-asserted-by":"publisher","first-page":"747","DOI":"10.1016\/S0021-7824(01)01253-3","volume":"81","author":"F Alvarez","year":"2002","unstructured":"Alvarez, F., Attouch, H., Bolte, J., Redont, P.: A second-order gradient-like dissipative dynamical system with Hessian-driven damping: application to optimization and mechanics. J. Math. Pures Appl. 81(8), 747\u2013779 (2002)","journal-title":"J. Math. Pures Appl."},{"key":"1681_CR3","volume-title":"Mathematical Methods of Classical Mechanics","author":"VI Arnold","year":"2013","unstructured":"Arnold, V.I.: Mathematical Methods of Classical Mechanics, vol. 60. Springer Science and Business Media, Berlin (2013)"},{"issue":"1","key":"1681_CR4","doi-asserted-by":"publisher","first-page":"849","DOI":"10.1137\/17M1114739","volume":"28","author":"H Attouch","year":"2018","unstructured":"Attouch, H., Cabot, A.: Convergence rates of inertial forward-backward algorithms. SIAM J. Optim. 28(1), 849\u2013874 (2018)","journal-title":"SIAM J. Optim."},{"issue":"1\u20132","key":"1681_CR5","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1007\/s10107-016-0992-8","volume":"168","author":"H Attouch","year":"2018","unstructured":"Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Progr. 168(1\u20132), 123\u2013175 (2018)","journal-title":"Math. 
Progr."},{"key":"1681_CR6","unstructured":"Attouch, H., Chbani, Z., Riahi, H.: Rate of convergence of the Nesterov accelerated gradient method in the subcritical case $$\\alpha \\le 3$$. arXiv preprint arXiv:1706.05671 (2017)"},{"issue":"1","key":"1681_CR7","first-page":"27","volume":"4","author":"H Attouch","year":"2012","unstructured":"Attouch, H., Maing\u00e9, P.-E., Redont, P.: A second-order differential system with Hessian-driven damping; application to non-elastic shock laws. Differ. Equ. Appl. 4(1), 27\u201365 (2012)","journal-title":"Differ. Equ. Appl."},{"issue":"3","key":"1681_CR8","doi-asserted-by":"publisher","first-page":"1824","DOI":"10.1137\/15M1046095","volume":"26","author":"H Attouch","year":"2016","unstructured":"Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov\u2019s accelerated forward-backward method is actually faster than $$1\/k^{2}$$. SIAM J. Optim. 26(3), 1824\u20131834 (2016)","journal-title":"SIAM J. Optim."},{"issue":"1","key":"1681_CR9","doi-asserted-by":"publisher","first-page":"232","DOI":"10.1137\/130910294","volume":"24","author":"H Attouch","year":"2014","unstructured":"Attouch, H., Peypouquet, J., Redont, P.: A dynamical approach to an inertial forward-backward algorithm for convex minimization. SIAM J. Optim. 24(1), 232\u2013256 (2014)","journal-title":"SIAM J. Optim."},{"issue":"10","key":"1681_CR10","doi-asserted-by":"publisher","first-page":"5734","DOI":"10.1016\/j.jde.2016.08.020","volume":"261","author":"H Attouch","year":"2016","unstructured":"Attouch, H., Peypouquet, J., Redont, P.: Fast convex optimization via inertial dynamics with Hessian driven damping. J. Differ. Equ. 261(10), 5734\u20135783 (2016)","journal-title":"J. Differ. Equ."},{"key":"1681_CR11","doi-asserted-by":"crossref","unstructured":"Badithela, A., Seiler P.: Analysis of the heavy-ball algorithm using integral quadratic constraints. In: 2019 American Control Conference (ACC), pp. 
4081\u20134085, IEEE, (2019)","DOI":"10.23919\/ACC.2019.8814459"},{"issue":"1","key":"1681_CR12","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1137\/080716542","volume":"2","author":"A Beck","year":"2009","unstructured":"Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2(1), 183\u2013202 (2009)","journal-title":"SIAM J. Imag. Sci."},{"key":"1681_CR13","unstructured":"Betancourt, M., Jordan, M.I., Wilson, A.C.: On symplectic optimization. arXiv preprint arXiv:1802.03653, (2018)"},{"issue":"3\u20134","key":"1681_CR14","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1561\/2200000050","volume":"8","author":"S Bubeck","year":"2015","unstructured":"Bubeck, S.: Convex optimization: algorithms and complexity. Found. Trends Mach. Learn. 8(3\u20134), 231\u2013357 (2015)","journal-title":"Found. Trends Mach. Learn."},{"key":"1681_CR15","unstructured":"Bubeck, S., Lee, Y.T., Singh, M.: A geometric alternative to Nesterov\u2019s accelerated gradient descent. arXiv preprint arXiv:1506.08187, (2015)"},{"key":"1681_CR16","unstructured":"Carmon, Y., Duchi, J.C., Hinder, O., Sidford, A.: Lower bounds for finding stationary points I. arXiv preprint arXiv:1710.11606, (2017)"},{"issue":"3","key":"1681_CR17","doi-asserted-by":"publisher","first-page":"968","DOI":"10.1007\/s10957-015-0746-4","volume":"166","author":"A Chambolle","year":"2015","unstructured":"Chambolle, A., Dossal, C.: On the convergence of the iterates of the \u201cfast iterative shrinkage\/thresholding algorithm\u201d. J. Optim. Theory Appl. 166(3), 968\u2013982 (2015)","journal-title":"J. Optim. Theory Appl."},{"key":"1681_CR18","unstructured":"Diakonikolas, J., Orecchia, L.: The approximate duality gap technique: a unified theory of first-order methods. 
arXiv preprint arXiv:1712.02485, (2017)"},{"issue":"1","key":"1681_CR19","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1137\/16M1072528","volume":"28","author":"D Drusvyatskiy","year":"2018","unstructured":"Drusvyatskiy, D., Fazel, M., Roy, S.: An optimal first order method based on optimal quadratic averaging. SIAM J. Optim. 28(1), 251\u2013271 (2018)","journal-title":"SIAM J. Optim."},{"key":"1681_CR20","unstructured":"Du, S.S., Jin, C., Lee, J.D., Jordan, M.I., Singh, A., Poczos, B.: Gradient descent can take exponential time to escape saddle points. In: Advances in Neural Information Processing Systems, pp. 1067\u20131077, (2017)"},{"key":"1681_CR21","unstructured":"Flammarion, N., Bach, F.: From averaging to acceleration, there is only a step-size. In: Conference on Learning Theory, pp. 658\u2013695, (2015)"},{"key":"1681_CR22","first-page":"103","volume":"17","author":"IM Gelfand","year":"1962","unstructured":"Gelfand, I.M., Tsetlin, M.L.: The method of \u201cravines\u201d (Russian). Uspekhi Matematicheskikh Nauk (Progress in Mathematics) 17, 103\u2013131 (1962)","journal-title":"Uspekhi Matematicheskikh Nauk (Progress in Mathematics)"},{"issue":"1\u20132","key":"1681_CR23","doi-asserted-by":"publisher","first-page":"59","DOI":"10.1007\/s10107-015-0871-8","volume":"156","author":"S Ghadimi","year":"2016","unstructured":"Ghadimi, S., Lan, G.: Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math. Progr. 156(1\u20132), 59\u201399 (2016)","journal-title":"Math. Progr."},{"key":"1681_CR24","doi-asserted-by":"crossref","unstructured":"Hu, B., Lessard, L.: Control interpretations for first-order optimization methods. In: 2017 American Control Conference (ACC), pp. 3114\u20133119. IEEE, (2017)","DOI":"10.23919\/ACC.2017.7963426"},{"key":"1681_CR25","unstructured":"Hu, B., Lessard, L.: Dissipativity theory for Nesterov\u2019s accelerated method. 
arXiv preprint arXiv:1706.04381, (2017)"},{"key":"1681_CR26","unstructured":"Jin, C., Ge, R., Netrapalli, P., Kakade, S.M., Jordan, M.I.: How to escape saddle points efficiently. In: International Conference on Machine Learning (ICML), New York, ACM Press, (2017)"},{"key":"1681_CR27","unstructured":"Kim, D., Fessler, J.A.: Optimizing the efficiency of first-order methods for decreasing the gradient of smooth convex functions. arXiv preprint arXiv:1803.06600, (2018)"},{"key":"1681_CR28","unstructured":"Krichene, W., Bartlett, P.L.: Acceleration and averaging in stochastic descent dynamics. In: Advances in Neural Information Processing Systems, pp. 6796\u20136806, (2017)"},{"key":"1681_CR29","unstructured":"Krichene, W., Bayen, A., Bartlett, P.L.: Accelerated mirror descent in continuous and discrete time. In: Advances in Neural Information Processing Systems, pp. 2845\u20132853, (2015)"},{"key":"1681_CR30","unstructured":"Krichene, W., Bayen, A., Bartlett, P.L.: Adaptive averaging in accelerated descent dynamics. In: Advances in Neural Information Processing Systems, pp. 2991\u20132999, (2016)"},{"issue":"1","key":"1681_CR31","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1137\/15M1009597","volume":"26","author":"L Lessard","year":"2016","unstructured":"Lessard, L., Recht, B., Packard, A.: Analysis and design of optimization algorithms via integral quadratic constraints. SIAM J. Optim. 26(1), 57\u201395 (2016)","journal-title":"SIAM J. Optim."},{"issue":"212","key":"1681_CR32","first-page":"1","volume":"18","author":"H Lin","year":"2018","unstructured":"Lin, H., Mairal, J., Harchaoui, Z.: Catalyst acceleration for first-order convex optimization: from theory to practice. J. Mach. Learn. Res. 18(212), 1\u201354 (2018)","journal-title":"J. Mach. Learn. 
Res."},{"issue":"2","key":"1681_CR33","first-page":"264","volume":"27","author":"AS Nemirovsky","year":"1983","unstructured":"Nemirovsky, A.S., Yudin, D.B.: Problem complexity and method efficiency in optimization. SIAM Rev. 27(2), 264\u2013265 (1983)","journal-title":"SIAM Rev."},{"issue":"2","key":"1681_CR34","first-page":"372","volume":"27","author":"Y Nesterov","year":"1983","unstructured":"Nesterov, Y.: A method of solving a convex programming problem with convergence rate $$O(1\/k^2)$$. Sov. Math. Doklady 27(2), 372\u2013376 (1983)","journal-title":"Sov. Math. Doklady"},{"key":"1681_CR35","first-page":"10","volume":"88","author":"Y Nesterov","year":"2012","unstructured":"Nesterov, Y.: How to make the gradients small. Optima 88, 10\u201311 (2012)","journal-title":"Optima"},{"key":"1681_CR36","volume-title":"Introductory Lectures on Convex Optimization: A Basic Course","author":"Y Nesterov","year":"2013","unstructured":"Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course, vol. 87. Springer Science and Business Media, Berlin (2013)"},{"issue":"3","key":"1681_CR37","doi-asserted-by":"publisher","first-page":"715","DOI":"10.1007\/s10208-013-9150-3","volume":"15","author":"B O\u2019Donoghue","year":"2015","unstructured":"O\u2019Donoghue, B., Cand\u00e8s, E.J.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715\u2013732 (2015)","journal-title":"Found. Comput. Math."},{"key":"1681_CR38","volume-title":"Geophysical Fluid Dynamics","author":"J Pedlosky","year":"2013","unstructured":"Pedlosky, J.: Geophysical Fluid Dynamics. Springer Science and Business Media, Berlin (2013)"},{"issue":"5","key":"1681_CR39","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/0041-5553(64)90137-5","volume":"4","author":"BT Polyak","year":"1964","unstructured":"Polyak, B.T.: Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 4(5), 1\u201317 (1964)","journal-title":"USSR Comput. 
Math. Math. Phys."},{"key":"1681_CR40","unstructured":"Shi, B., Du, S.S., Su, W.J., Jordan, M.I.: Acceleration via symplectic discretization of high-resolution differential equations. In: Advances in Neural Information Processing Systems, pp. 5745\u20135753, (2019)"},{"issue":"153","key":"1681_CR41","first-page":"1","volume":"17","author":"W Su","year":"2016","unstructured":"Su, W., Boyd, S., Cand\u00e8s, E.J.: A differential equation for modeling Nesterov\u2019s accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17(153), 1\u201343 (2016)","journal-title":"J. Mach. Learn. Res."},{"issue":"1","key":"1681_CR42","doi-asserted-by":"publisher","first-page":"551","DOI":"10.1137\/17M1128642","volume":"28","author":"A Vassilis","year":"2018","unstructured":"Vassilis, A., Jean-Fran\u00e7ois, A., Charles, D.: The differential inclusion modeling FISTA algorithm and optimality of convergence rate in the case $$b < 3$$. SIAM J. Optim. 28(1), 551\u2013574 (2018)","journal-title":"SIAM J. Optim."},{"issue":"47","key":"1681_CR43","doi-asserted-by":"publisher","first-page":"E7351","DOI":"10.1073\/pnas.1614734113","volume":"113","author":"A Wibisono","year":"2016","unstructured":"Wibisono, A., Wilson, A.C., Jordan, M.I.: A variational perspective on accelerated methods in optimization. Proc. Natl. Acad. Sci. 113(47), E7351\u2013E7358 (2016)","journal-title":"Proc. Natl. Acad. Sci."},{"key":"1681_CR44","first-page":"1","volume":"22","author":"AC Wilson","year":"2021","unstructured":"Wilson, A.C., Recht, B., Jordan, M.I.: A Lyapunov analysis of momentum methods in optimization. J. Mach. Learn. Res. 22, 1\u201334 (2021)","journal-title":"J. Mach. Learn. Res."},{"key":"1681_CR45","unstructured":"Zhang, J., Mokhtari, A., Sra, S., Jadbabaie, A.: Direct Runge\u2013Kutta discretization achieves acceleration. 
arXiv preprint arXiv:1805.00521, (2018)"}],"container-title":["Mathematical Programming"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10107-021-01681-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10107-021-01681-8\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10107-021-01681-8.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,21]],"date-time":"2022-10-21T21:23:45Z","timestamp":1666387425000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10107-021-01681-8"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,6]]},"references-count":45,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2022,9]]}},"alternative-id":["1681"],"URL":"https:\/\/doi.org\/10.1007\/s10107-021-01681-8","relation":{},"ISSN":["0025-5610","1436-4646"],"issn-type":[{"value":"0025-5610","type":"print"},{"value":"1436-4646","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,6]]},"assertion":[{"value":"27 October 2018","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 July 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}