{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,13]],"date-time":"2026-01-13T21:18:45Z","timestamp":1768339125561,"version":"3.49.0"},"reference-count":18,"publisher":"Springer Science and Business Media LLC","issue":"1-2","license":[{"start":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T00:00:00Z","timestamp":1732492800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T00:00:00Z","timestamp":1732492800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["Graduate Fellowship"],"award-info":[{"award-number":["Graduate Fellowship"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"TwoSigma","award":["PhD Fellowship"],"award-info":[{"award-number":["PhD Fellowship"]}]},{"name":"NYU","award":["Faculty Fellowship"],"award-info":[{"award-number":["Faculty Fellowship"]}]},{"name":"NYU","award":["Grant 6933939"],"award-info":[{"award-number":["Grant 6933939"]}]},{"DOI":"10.13039\/100006978","name":"University of California Berkeley","doi-asserted-by":"publisher","award":["Grant 6934982"],"award-info":[{"award-number":["Grant 6934982"]}],"id":[{"id":"10.13039\/100006978","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100006831","name":"U.S. Air Force","doi-asserted-by":"publisher","award":["Grant 6943130"],"award-info":[{"award-number":["Grant 6943130"]}],"id":[{"id":"10.13039\/100006831","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Math. 
Program."],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>We provide a concise, self-contained proof that the Silver Stepsize Schedule proposed in our companion paper directly applies to smooth (non-strongly) convex optimization.  Specifically, we show that with these stepsizes, gradient descent computes an <jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$\\varepsilon $$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>\u03b5<\/mml:mi>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula>-minimizer in <jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$O(\\varepsilon ^{-\\log _{\\rho } 2}) = O(\\varepsilon ^{-0.7864})$$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mrow>\n                      <mml:mo>(<\/mml:mo>\n                      <mml:msup>\n                        <mml:mi>\u03b5<\/mml:mi>\n                        <mml:mrow>\n                          <mml:mo>-<\/mml:mo>\n                          <mml:msub>\n                            <mml:mo>log<\/mml:mo>\n                            <mml:mi>\u03c1<\/mml:mi>\n                          <\/mml:msub>\n                          <mml:mn>2<\/mml:mn>\n                        <\/mml:mrow>\n                      <\/mml:msup>\n                      <mml:mo>)<\/mml:mo>\n                    <\/mml:mrow>\n                    <mml:mo>=<\/mml:mo>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mrow>\n                      <mml:mo>(<\/mml:mo>\n                      <mml:msup>\n                        <mml:mi>\u03b5<\/mml:mi>\n                        <mml:mrow>\n                          
<mml:mo>-<\/mml:mo>\n                          <mml:mn>0.7864<\/mml:mn>\n                        <\/mml:mrow>\n                      <\/mml:msup>\n                      <mml:mo>)<\/mml:mo>\n                    <\/mml:mrow>\n                  <\/mml:mrow>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula> iterations, where <jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$\\rho = 1+\\sqrt{2}$$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>\u03c1<\/mml:mi>\n                    <mml:mo>=<\/mml:mo>\n                    <mml:mn>1<\/mml:mn>\n                    <mml:mo>+<\/mml:mo>\n                    <mml:msqrt>\n                      <mml:mn>2<\/mml:mn>\n                    <\/mml:msqrt>\n                  <\/mml:mrow>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula> is the silver ratio. 
This is intermediate between the textbook unaccelerated rate <jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$O(\\varepsilon ^{-1})$$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:msup>\n                      <mml:mi>\u03b5<\/mml:mi>\n                      <mml:mrow>\n                        <mml:mo>-<\/mml:mo>\n                        <mml:mn>1<\/mml:mn>\n                      <\/mml:mrow>\n                    <\/mml:msup>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula> and the accelerated rate <jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$O(\\varepsilon ^{-1\/2})$$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:msup>\n                      <mml:mi>\u03b5<\/mml:mi>\n                      <mml:mrow>\n                        <mml:mo>-<\/mml:mo>\n                        <mml:mn>1<\/mml:mn>\n                        <mml:mo>\/<\/mml:mo>\n                        <mml:mn>2<\/mml:mn>\n                      <\/mml:mrow>\n                    <\/mml:msup>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula> due to Nesterov in 1983. 
The Silver Stepsize Schedule is a simple explicit fractal: the <jats:italic>i<\/jats:italic>-th stepsize is <jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$1 + \\rho ^{\\nu (i)-1}$$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mn>1<\/mml:mn>\n                    <mml:mo>+<\/mml:mo>\n                    <mml:msup>\n                      <mml:mi>\u03c1<\/mml:mi>\n                      <mml:mrow>\n                        <mml:mi>\u03bd<\/mml:mi>\n                        <mml:mo>(<\/mml:mo>\n                        <mml:mi>i<\/mml:mi>\n                        <mml:mo>)<\/mml:mo>\n                        <mml:mo>-<\/mml:mo>\n                        <mml:mn>1<\/mml:mn>\n                      <\/mml:mrow>\n                    <\/mml:msup>\n                  <\/mml:mrow>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula> where <jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$\\nu (i)$$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>\u03bd<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:mi>i<\/mml:mi>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula> is the 2-adic valuation of\u00a0<jats:italic>i<\/jats:italic>. 
The design and analysis are conceptually identical to the strongly convex setting in our companion paper, but simplify remarkably in this specific setting.<\/jats:p>","DOI":"10.1007\/s10107-024-02164-2","type":"journal-article","created":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T06:05:29Z","timestamp":1732514729000},"page":"1105-1118","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Acceleration by stepsize hedging: Silver Stepsize Schedule for smooth convex optimization"],"prefix":"10.1007","volume":"213","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7367-0097","authenticated-orcid":false,"given":"Jason M.","family":"Altschuler","sequence":"first","affiliation":[]},{"given":"Pablo A.","family":"Parrilo","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,11,25]]},"reference":[{"key":"2164_CR1","volume-title":"Nonlinear programming","author":"DP Bertsekas","year":"1999","unstructured":"Bertsekas, D.P.: Nonlinear programming. Athena Scientific, Belmont, MA (1999)"},{"key":"2164_CR2","doi-asserted-by":"publisher","DOI":"10.1007\/b98874","volume-title":"Numerical optimization","author":"J Nocedal","year":"1999","unstructured":"Nocedal, J., Wright, S.J.: Numerical optimization. Springer, New York, NY (1999)"},{"key":"2164_CR3","volume-title":"Introductory lectures on convex optimization: a basic course","author":"Y Nesterov","year":"1998","unstructured":"Nesterov, Y.: Introductory lectures on convex optimization: a basic course, vol. 87. Springer, Boston, MA (1998)"},{"issue":"3\u20134","key":"2164_CR4","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1561\/2200000050","volume":"8","author":"S Bubeck","year":"2015","unstructured":"Bubeck, S.: Convex optimization: algorithms and complexity. Found. Trends Mach. Learn. 8(3\u20134), 231\u2013357 (2015)","journal-title":"Found. Trends Mach. 
Learn."},{"key":"2164_CR5","first-page":"534","volume-title":"Convex analysis and optimization","author":"DP Bertsekas","year":"2003","unstructured":"Bertsekas, D.P., Nedi\u0107, A., Ozdaglar, A.E.: Convex analysis and optimization, p. 534. Athena Scientific, Belmont, MA (2003)"},{"key":"2164_CR6","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-39568-1","volume-title":"First-order and stochastic optimization methods for machine learning","author":"G Lan","year":"2020","unstructured":"Lan, G.: First-order and stochastic optimization methods for machine learning, vol. 1. Springer, New York, NY (2020)"},{"issue":"1\u20132","key":"2164_CR7","doi-asserted-by":"publisher","first-page":"451","DOI":"10.1007\/s10107-013-0653-0","volume":"145","author":"Y Drori","year":"2014","unstructured":"Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program. 145(1\u20132), 451\u2013482 (2014)","journal-title":"Math. Program."},{"key":"2164_CR8","unstructured":"Altschuler, J.M., Parrilo, P.A.: Acceleration by stepsize hedging I: multi-step descent and the Silver Stepsize Schedule. arXiv preprint arXiv:2309.07879 (2023)"},{"issue":"2","key":"2164_CR9","first-page":"372","volume":"27","author":"YE Nesterov","year":"1983","unstructured":"Nesterov, Y.E.: A method of solving a convex programming problem with convergence rate $${O}(1\/k^2)$$. Soviet Math. Dokl. 27(2), 372\u2013376 (1983)","journal-title":"Soviet Math. Dokl."},{"issue":"1\u20132","key":"2164_CR10","first-page":"1","volume":"5","author":"A d\u2019Aspremont","year":"2021","unstructured":"d\u2019Aspremont, A., Scieur, D., Taylor, A.: Acceleration methods. Found. Trends Optim. 5(1\u20132), 1\u2013245 (2021)","journal-title":"Found. 
Trends Optim."},{"key":"2164_CR11","first-page":"1","volume":"204","author":"S Das Gupta","year":"2023","unstructured":"Das Gupta, S., Van Parys, B.P., Ryu, E.K.: Branch-and-bound performance estimation programming: a unified methodology for constructing optimal optimization methods. Math. Program. 204, 1\u201373 (2023)","journal-title":"Math. Program."},{"key":"2164_CR12","doi-asserted-by":"crossref","unstructured":"Grimmer, B.: Provably faster gradient descent via long steps. arXiv preprint arXiv:2307.06324 (2023)","DOI":"10.1137\/23M1588408"},{"key":"2164_CR13","unstructured":"Grimmer, B., Shu, K., Wang, A.L.: Accelerated gradient descent via long steps. arXiv preprint arXiv:2309.09961 (2023)"},{"key":"2164_CR14","unstructured":"Altschuler, J.M.: Greed, hedging, and acceleration in convex optimization. Master\u2019s thesis, Massachusetts Institute of Technology (2018)"},{"issue":"1\u20132","key":"2164_CR15","doi-asserted-by":"publisher","first-page":"307","DOI":"10.1007\/s10107-016-1009-3","volume":"161","author":"AB Taylor","year":"2017","unstructured":"Taylor, A.B., Hendrickx, J.M., Glineur, F.: Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Math. Program. 161(1\u20132), 307\u2013345 (2017)","journal-title":"Math. Program."},{"key":"2164_CR16","unstructured":"Daccache, A.: Performance estimation of the gradient method with fixed arbitrary step sizes. Master\u2019s thesis, Universit\u00e9 Catholique de Louvain (2019)"},{"key":"2164_CR17","unstructured":"Eloi, D.: Worst-case functions for the gradient method with fixed variable step sizes. Master\u2019s thesis, Universit\u00e9 Catholique de Louvain (2022)"},{"key":"2164_CR18","unstructured":"Acceleration by Stepsize Hedging \u2013 verification of identities computer algebra script. 
https:\/\/jasonaltschuler.github.io\/AccelerationByStepsizeHedging"}],"container-title":["Mathematical Programming"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10107-024-02164-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10107-024-02164-2\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10107-024-02164-2.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,6]],"date-time":"2025-09-06T01:24:24Z","timestamp":1757121864000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10107-024-02164-2"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,25]]},"references-count":18,"journal-issue":{"issue":"1-2","published-print":{"date-parts":[[2025,9]]}},"alternative-id":["2164"],"URL":"https:\/\/doi.org\/10.1007\/s10107-024-02164-2","relation":{},"ISSN":["0025-5610","1436-4646"],"issn-type":[{"value":"0025-5610","type":"print"},{"value":"1436-4646","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,25]]},"assertion":[{"value":"5 April 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 September 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 November 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no financial or non-financial Conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}