{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T07:49:58Z","timestamp":1781077798805,"version":"3.54.1"},"reference-count":78,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,11,8]],"date-time":"2024-11-08T00:00:00Z","timestamp":1731024000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"NSF","award":["AF-2341890, CCF-1704417, CCF-1813049"],"award-info":[{"award-number":["AF-2341890, CCF-1704417, CCF-1813049"]}]},{"name":"Naval Research Award YIP","award":["N00014-19-2288"],"award-info":[{"award-number":["N00014-19-2288"]}]},{"name":"NSF Award HDR","award":["1934578"],"award-info":[{"award-number":["1934578"]}]},{"name":"NSF CAREER","award":["CCF-2239265"],"award-info":[{"award-number":["CCF-2239265"]}]},{"name":"Microsoft Research Faculty Fellowship, NSF CAREER","award":["CCF-1844855"],"award-info":[{"award-number":["CCF-1844855"]}]},{"name":"NSF","award":["CCF-1955039"],"award-info":[{"award-number":["CCF-1955039"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. ACM"],"published-print":{"date-parts":[[2024,12,31]]},"abstract":"<jats:p>We show that any memory-constrained, first-order algorithm which minimizes<jats:italic>d<\/jats:italic>-dimensional, 1-Lipschitz convex functions over the unit ball to 1\/poly(<jats:italic>d<\/jats:italic>) accuracy using at most d<jats:sup>1.25 - \u03b4<\/jats:sup>bits of memory must make at least<jats:inline-formula content-type=\"math\/tex\"><jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(\\tilde{\\Omega }(d^{1 + (4\/3)\\delta })\\)<\/jats:tex-math><\/jats:inline-formula>first-order queries (for any constant<jats:inline-formula content-type=\"math\/tex\"><jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\(\\delta \\in [0, 1\/4]\\)<\/jats:tex-math><\/jats:inline-formula>). Consequently, the performance of such memory-constrained algorithms are at least a polynomial factor worse than the optimal \u00d5(<jats:italic>d<\/jats:italic>) query bound for this problem obtained by cutting plane methods that use \u00d5(d<jats:sup>2<\/jats:sup>) memory. This resolves one of the open problems in the COLT 2019 open problem publication of Woodworth and Srebro.<\/jats:p>","DOI":"10.1145\/3689208","type":"journal-article","created":{"date-parts":[[2024,9,6]],"date-time":"2024-09-06T13:52:35Z","timestamp":1725630755000},"page":"1-37","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Efficient Convex Optimization Requires Superlinear Memory"],"prefix":"10.1145","volume":"71","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-0169-7102","authenticated-orcid":false,"given":"Annie","family":"Marsden","sequence":"first","affiliation":[{"name":"Stanford University, Stanford, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-1280-5623","authenticated-orcid":false,"given":"Vatsal","family":"Sharan","sequence":"additional","affiliation":[{"name":"University of Southern California, Los Angeles, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2675-7610","authenticated-orcid":false,"given":"Aaron","family":"Sidford","sequence":"additional","affiliation":[{"name":"Stanford University, Stanford, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2211-1073","authenticated-orcid":false,"given":"Gregory","family":"Valiant","sequence":"additional","affiliation":[{"name":"Stanford University, Stanford, United States"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,11,8]]},"reference":[{"key":"e_1_3_4_2_2","article-title":"The convergence of sparsified gradient methods","volume":"31","author":"Alistarh Dan","year":"2018","unstructured":"Dan Alistarh, Torsten Hoefler, Mikael Johansson, Nikola Konstantinov, Sarit Khirirat, and C\u00e9dric Renggli. 2018. The convergence of sparsified gradient methods. Advances in Neural Information Processing Systems 31 (2018).","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"1","key":"e_1_3_4_3_2","doi-asserted-by":"crossref","first-page":"137","DOI":"10.1006\/jcss.1997.1545","article-title":"The space complexity of approximating the frequency moments","volume":"58","author":"Alon Noga","year":"1999","unstructured":"Noga Alon, Yossi Matias, and Mario Szegedy. 1999. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences 58, 1 (1999), 137\u2013147.","journal-title":"Journal of Computer and System Sciences"},{"issue":"3","key":"e_1_3_4_4_2","doi-asserted-by":"crossref","first-page":"365","DOI":"10.1287\/moor.25.3.365.12212","article-title":"The volumetric barrier for semidefinite programming","volume":"25","author":"Anstreicher Kurt M.","year":"2000","unstructured":"Kurt M. Anstreicher. 2000. The volumetric barrier for semidefinite programming. Mathematics of Operations Research 25, 3 (2000), 365\u2013380.","journal-title":"Mathematics of Operations Research"},{"key":"e_1_3_4_5_2","article-title":"Communication complexity of distributed convex learning and optimization","volume":"28","author":"Arjevani Yossi","year":"2015","unstructured":"Yossi Arjevani and Ohad Shamir. 2015. Communication complexity of distributed convex learning and optimization. Advances in Neural Information Processing Systems 28 (2015).","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"1","key":"e_1_3_4_6_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/BF01585551","article-title":"A cutting plane algorithm for convex programming that uses analytic centers","volume":"69","author":"Atkinson David S.","year":"1995","unstructured":"David S. Atkinson and Pravin M. Vaidya. 1995. A cutting plane algorithm for convex programming that uses analytic centers. Mathematical Programming 69, 1 (1995), 1\u201343.","journal-title":"Mathematical Programming"},{"key":"e_1_3_4_7_2","volume-title":"Proceedings of the Conference on Learning Theory","author":"Balcan Maria Florina","year":"2012","unstructured":"Maria Florina Balcan, Avrim Blum, Shai Fine, and Yishay Mansour. 2012. Distributed learning, communication complexity and privacy. In Proceedings of the Conference on Learning Theory. JMLR Workshop and Conference Proceedings, 26\u20131."},{"key":"e_1_3_4_8_2","unstructured":"Eric Balkanski and Yaron Singer. 2018. Parallelization does not accelerate convex optimization: Adaptivity lower bounds for non-smooth convex minimization. arXiv:1808.03880. Retrieved from https:\/\/arxiv.org\/abs\/1808.03880"},{"issue":"4","key":"e_1_3_4_9_2","doi-asserted-by":"crossref","first-page":"702","DOI":"10.1016\/j.jcss.2003.11.006","article-title":"An information statistics approach to data stream and communication complexity","volume":"68","author":"Bar-Yossef Ziv","year":"2004","unstructured":"Ziv Bar-Yossef, Thathachar S. Jayram, Ravi Kumar, and D. Sivakumar. 2004. An information statistics approach to data stream and communication complexity. Journal of Computer and System Sciences 68, 4 (2004), 702\u2013732.","journal-title":"Journal of Computer and System Sciences"},{"key":"e_1_3_4_10_2","first-page":"843","volume-title":"Proceedings of the Conference On Learning Theory","author":"Beame Paul","year":"2018","unstructured":"Paul Beame, Shayan Oveis Gharan, and Xin Yang. 2018. Time-space tradeoffs for learning finite functions from random evaluations, with applications to polynomials. In Proceedings of the Conference On Learning Theory. PMLR, 843\u2013856."},{"issue":"4","key":"e_1_3_4_11_2","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1145\/1008731.1008733","article-title":"Solving convex programs by random walks","volume":"51","author":"Bertsimas Dimitris","year":"2004","unstructured":"Dimitris Bertsimas and Santosh Vempala. 2004. Solving convex programs by random walks. Journal of the ACM 51, 4 (2004), 540\u2013556.","journal-title":"Journal of the ACM"},{"key":"e_1_3_4_12_2","unstructured":"Moise Blanchard. 2024. Gradient descent is pareto-optimal in the oracle complexity and memory tradeoff for feasibility problems. arXiv:2404.06720. Retrieved from https:\/\/arxiv.org\/abs\/2404.06720"},{"key":"e_1_3_4_13_2","article-title":"Memory-constrained algorithms for convex optimization via recursive cutting-planes","author":"Blanchard Moise","year":"2023","unstructured":"Moise Blanchard, Junhui Zhang, and Patrick Jaillet. 2023. Memory-constrained algorithms for convex optimization via recursive cutting-planes. Advances in Neural Information Processing Systems (2023).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_14_2","doi-asserted-by":"crossref","unstructured":"Mo\u00efse Blanchard Junhui Zhang and Patrick Jaillet. 2023. Quadratic memory is necessary for optimal query complexity in convex optimization: Center-of-mass is pareto-optimal. arXiv:2302.04963. Retrieved from https:\/\/arxiv.org\/abs\/2302.04963","DOI":"10.1287\/moor.2023.0208"},{"issue":"7","key":"e_1_3_4_15_2","doi-asserted-by":"crossref","first-page":"4709","DOI":"10.1109\/TIT.2017.2701343","article-title":"Lower bounds on the oracle complexity of nonsmooth convex optimization via information theory","volume":"63","author":"Braun G\u00e1bor","year":"2017","unstructured":"G\u00e1bor Braun, Crist\u00f3bal Guzm\u00e1n, and Sebastian Pokutta. 2017. Lower bounds on the oracle complexity of nonsmooth convex optimization via information theory. IEEE Transactions on Information Theory 63, 7 (2017), 4709\u20134724.","journal-title":"IEEE Transactions on Information Theory"},{"key":"e_1_3_4_16_2","first-page":"1011","volume-title":"Proceedings of the 48th Annual ACM Symposium on Theory of Computing","author":"Braverman Mark","year":"2016","unstructured":"Mark Braverman, Ankit Garg, Tengyu Ma, Huy L. Nguyen, and David P. Woodruff. 2016. Communication lower bounds for statistical estimation problems via a distributed data processing inequality. In Proceedings of the 48th Annual ACM Symposium on Theory of Computing. 1011\u20131020."},{"key":"e_1_3_4_17_2","first-page":"627","volume-title":"Proceedings of the Conference on Learning Theory","author":"Braverman Mark","year":"2020","unstructured":"Mark Braverman, Elad Hazan, Max Simchowitz, and Blake Woodworth. 2020. The gradient complexity of linear regression. In Proceedings of the Conference on Learning Theory. PMLR, 627\u2013647."},{"key":"e_1_3_4_18_2","doi-asserted-by":"crossref","unstructured":"S\u00e9bastien Bubeck. 2014. Convex optimization: Algorithms and complexity. arXiv:1405.4980. Retrieved from https:\/\/arxiv.org\/abs\/1405.4980","DOI":"10.1561\/9781601988614"},{"key":"e_1_3_4_19_2","article-title":"Complexity of highly parallel non-smooth convex optimization","volume":"32","author":"Bubeck S\u00e9bastien","year":"2019","unstructured":"S\u00e9bastien Bubeck, Qijia Jiang, Yin-Tat Lee, Yuanzhi Li, and Aaron Sidford. 2019. Complexity of highly parallel non-smooth convex optimization. Advances in Neural Information Processing Systems 32 (2019).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_20_2","first-page":"866","volume-title":"Proceedings of the Conference on Learning Theory","author":"Carmon Yair","year":"2021","unstructured":"Yair Carmon, Arun Jambulapati, Yujia Jin, and Aaron Sidford. 2021. Thinking inside the ball: Near-optimal minimization of the maximal loss. In Proceedings of the Conference on Learning Theory. PMLR, 866\u2013882."},{"key":"e_1_3_4_21_2","unstructured":"Xi Chen and Binghui Peng. 2023. Memory-query tradeoffs for randomized convex optimization. arXiv:2306.12534. Retrieved from https:\/\/arxiv.org\/abs\/2306.12534"},{"key":"e_1_3_4_22_2","first-page":"205","volume-title":"Proceedings of the 41st Annual ACM Symposium on Theory of Computing","author":"Clarkson Kenneth L.","year":"2009","unstructured":"Kenneth L. Clarkson and David P. Woodruff. 2009. Numerical linear algebra in the streaming model. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing. 205\u2013214."},{"key":"e_1_3_4_23_2","volume-title":"Elements of Information Theory","author":"Cover T. M.","year":"1991","unstructured":"T. M. Cover and J. A. Thomas. 1991. Elements of Information Theory. Wiley."},{"key":"e_1_3_4_24_2","first-page":"929","volume-title":"Proceedings of the Conference on Learning Theory","author":"Dagan Yuval","year":"2019","unstructured":"Yuval Dagan, Gil Kur, and Ohad Shamir. 2019. Space lower bounds for linear prediction in the streaming model. In Proceedings of the Conference on Learning Theory. PMLR, 929\u2013954."},{"key":"e_1_3_4_25_2","first-page":"1145","volume-title":"Proceedings of the Conference On Learning Theory","author":"Dagan Yuval","year":"2018","unstructured":"Yuval Dagan and Ohad Shamir. 2018. Detecting correlations with little memory and communication. In Proceedings of the Conference On Learning Theory. PMLR, 1145\u20131198."},{"key":"e_1_3_4_26_2","first-page":"1132","volume-title":"Proceedings of the Conference on Learning Theory","author":"Diakonikolas Jelena","year":"2019","unstructured":"Jelena Diakonikolas and Crist\u00f3bal Guzm\u00e1n. 2019. Lower bounds for parallel and randomized convex optimization. In Proceedings of the Conference on Learning Theory. PMLR, 1132\u20131157."},{"key":"e_1_3_4_27_2","doi-asserted-by":"crossref","first-page":"429","DOI":"10.1109\/FOCS.2013.53","volume-title":"Proceedings of the 2013 IEEE 54th Annual Symposium on Foundations of Computer Science","author":"Duchi John C.","year":"2013","unstructured":"John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. 2013. Local privacy and statistical minimax rates. In Proceedings of the 2013 IEEE 54th Annual Symposium on Foundations of Computer Science. IEEE, 429\u2013438."},{"issue":"2","key":"e_1_3_4_28_2","doi-asserted-by":"crossref","DOI":"10.1093\/comjnl\/7.2.149","article-title":"Function minimization by conjugate gradients","volume":"7","author":"Fletcher R.","year":"1964","unstructured":"R. Fletcher and C. M. Reeves. 1964. Function minimization by conjugate gradients. The Computer Journal 7, 2 (1964).","journal-title":"The Computer Journal"},{"key":"e_1_3_4_29_2","article-title":"On communication cost of distributed statistical estimation and dimensionality","volume":"27","author":"Garg Ankit","year":"2014","unstructured":"Ankit Garg, Tengyu Ma, and Huy Nguyen. 2014. On communication cost of distributed statistical estimation and dimensionality. Advances in Neural Information Processing Systems 27 (2014).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_30_2","doi-asserted-by":"crossref","first-page":"990","DOI":"10.1145\/3188745.3188962","volume-title":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","author":"Garg Sumegha","year":"2018","unstructured":"Sumegha Garg, Ran Raz, and Avishay Tal. 2018. Extractor-based time-space lower bounds for learning. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. 990\u20131002."},{"key":"e_1_3_4_31_2","volume-title":"Geometric Algorithms and Combinatorial Optimization","author":"Gr\u00f6tschel Martin","year":"2012","unstructured":"Martin Gr\u00f6tschel, L\u00e1szl\u00f3 Lov\u00e1sz, and Alexander Schrijver. 2012. Geometric Algorithms and Combinatorial Optimization. Vol. 2. Springer Science and Business Media."},{"issue":"1","key":"e_1_3_4_32_2","article-title":"A survey of nonlinear conjugate gradient methods","volume":"2","author":"Hager William W.","year":"2006","unstructured":"William W. Hager and Hongchao Zhang. 2006. A survey of nonlinear conjugate gradient methods. Pacific Journal of Optimization 2, 1 (2006).","journal-title":"Pacific Journal of Optimization"},{"issue":"3","key":"e_1_3_4_33_2","doi-asserted-by":"crossref","first-page":"1079","DOI":"10.1214\/aoms\/1177693335","article-title":"A bound on tail probabilities for quadratic forms in independent random variables","volume":"42","author":"Hanson David Lee","year":"1971","unstructured":"David Lee Hanson and Farroll Tim Wright. 1971. A bound on tail probabilities for quadratic forms in independent random variables. The Annals of Mathematical Statistics 42, 3 (1971), 1079\u20131083.","journal-title":"The Annals of Mathematical Statistics"},{"issue":"6","key":"e_1_3_4_34_2","doi-asserted-by":"crossref","DOI":"10.6028\/jres.049.044","article-title":"Methods of conjugate gradients for solving linear systems","volume":"49","author":"Hestenes M.R.","year":"1952","unstructured":"M.R. Hestenes and E. Stiefel. 1952. Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards 49, 6 (1952).","journal-title":"Journal of Research of the National Bureau of Standards"},{"key":"e_1_3_4_35_2","article-title":"Communication-efficient distributed dual coordinate ascent","volume":"27","author":"Jaggi Martin","year":"2014","unstructured":"Martin Jaggi, Virginia Smith, Martin Tak\u00e1c, Jonathan Terhorst, Sanjay Krishnan, Thomas Hofmann, and Michael I Jordan. 2014. Communication-efficient distributed dual coordinate ascent. Advances in Neural Information Processing Systems 27 (2014).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_36_2","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1145\/1993574.1993593","volume-title":"Proceedings of the 12th ACM Conference on Electronic Commerce","author":"Jiang Albert Xin","year":"2011","unstructured":"Albert Xin Jiang and Kevin Leyton-Brown. 2011. Polynomial-time computation of exact correlated equilibrium in compact games. In Proceedings of the 12th ACM Conference on Electronic Commerce. 119\u2013126."},{"key":"e_1_3_4_37_2","doi-asserted-by":"crossref","first-page":"976","DOI":"10.1137\/1.9781611976465.61","volume-title":"Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA)","author":"Jiang Haotian","year":"2021","unstructured":"Haotian Jiang. 2021. Minimizing convex functions with integral minimizers. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM, 976\u2013985."},{"key":"e_1_3_4_38_2","doi-asserted-by":"crossref","first-page":"944","DOI":"10.1145\/3357713.3384284","volume-title":"Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing","author":"Jiang Haotian","year":"2020","unstructured":"Haotian Jiang, Yin Tat Lee, Zhao Song, and Sam Chiu-wai Wong. 2020. An improved cutting plane method for convex optimization, convex-concave games, and its applications. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing. 944\u2013953."},{"key":"e_1_3_4_39_2","article-title":"Communication-efficient distributed statistical inference","author":"Jordan Michael I.","year":"2018","unstructured":"Michael I. Jordan, Jason D. Lee, and Yun Yang. 2018. Communication-efficient distributed statistical inference. Journal of the American Statistical Association (2018).","journal-title":"Journal of the American Statistical Association"},{"key":"e_1_3_4_40_2","unstructured":"Leonid G. Khachiyan Sergei Pavlovich Tarasov and II Erlikh. 1988. The method of inscribed ellipsoids. SovietMathematics-Doklady 37 1 (1988) 226\u2013230."},{"key":"e_1_3_4_41_2","first-page":"1067","volume-title":"Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing","author":"Kol Gillat","year":"2017","unstructured":"Gillat Kol, Ran Raz, and Avishay Tal. 2017. Time-space hardness of learning sparse parities. In Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing. 1067\u20131080."},{"key":"e_1_3_4_42_2","first-page":"1049","volume-title":"Proceedings of the 2015 IEEE 56th Annual Symposium on Foundations of Computer Science","author":"Lee Yin Tat","year":"2015","unstructured":"Yin Tat Lee, Aaron Sidford, and Sam Chiu-wai Wong. 2015. A faster cutting plane method and its implications for combinatorial and convex optimization. In Proceedings of the 2015 IEEE 56th Annual Symposium on Foundations of Computer Science. IEEE, 1049\u20131065."},{"key":"e_1_3_4_43_2","first-page":"1244","volume-title":"Proceedings of the Doklady Akademii Nauk","volume":"160","author":"Levin Anatoly Yur\u2019evich","year":"1965","unstructured":"Anatoly Yur\u2019evich Levin. 1965. An algorithm for minimizing convex functions. In Proceedings of the Doklady Akademii Nauk, Vol. 160. Russian Academy of Sciences, 1244\u20131247."},{"issue":"1","key":"e_1_3_4_44_2","article-title":"On the limited memory BFGS method for large scale optimization","volume":"45","author":"Liu Dong C.","year":"1989","unstructured":"Dong C. Liu and Jorge Nocedal. 1989. On the limited memory BFGS method for large scale optimization. Mathematical Programming 45, 1-3 (1989).","journal-title":"Mathematical Programming"},{"key":"e_1_3_4_45_2","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1016\/S0927-0507(05)12007-6","article-title":"Submodular function minimization","volume":"12","author":"McCormick S. Thomas","year":"2005","unstructured":"S. Thomas McCormick. 2005. Submodular function minimization. Handbooks in Operations Research and Management Science 12 (2005), 321\u2013391.","journal-title":"Handbooks in Operations Research and Management Science"},{"key":"e_1_3_4_46_2","first-page":"1516","volume-title":"Proceedings of the Conference on Learning Theory","author":"Moshkovitz Dana","year":"2017","unstructured":"Dana Moshkovitz and Michal Moshkovitz. 2017. Mixing implies lower bounds for space bounded learning. In Proceedings of the Conference on Learning Theory. PMLR, 1516\u20131566."},{"key":"e_1_3_4_47_2","volume-title":"Proceedings of the 9th Innovations in Theoretical Computer Science Conference","author":"Moshkovitz Dana","year":"2018","unstructured":"Dana Moshkovitz and Michal Moshkovitz. 2018. Entropy samplers and strong generic lower bounds for space bounded learning. In Proceedings of the 9th Innovations in Theoretical Computer Science Conference. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik."},{"issue":"4","key":"e_1_3_4_48_2","doi-asserted-by":"crossref","first-page":"451","DOI":"10.1006\/jcom.1994.1025","article-title":"On parallel complexity of nonsmooth convex optimization","volume":"10","author":"Nemirovski Arkadi","year":"1994","unstructured":"Arkadi Nemirovski. 1994. On parallel complexity of nonsmooth convex optimization. Journal of Complexity 10, 4 (1994), 451\u2013463.","journal-title":"Journal of Complexity"},{"key":"e_1_3_4_49_2","unstructured":"Arkadij Nemirovski and David Yudin. 1983. Problem complexity and method efficiency in optimization. (1983)."},{"key":"e_1_3_4_50_2","article-title":"Self-concordant functions and polynomial-time methods in convex programming","author":"Nesterov Ju E.","year":"1989","unstructured":"Ju E. Nesterov. 1989. Self-concordant functions and polynomial-time methods in convex programming. Report, Central Economic and Mathematic Institute, USSR Acad. Sci (1989).","journal-title":"Report, Central Economic and Mathematic Institute, USSR Acad. Sci"},{"key":"e_1_3_4_51_2","volume-title":"Introductory Lectures on Convex Optimization: A Basic Course","author":"Nesterov Yurii","year":"2003","unstructured":"Yurii Nesterov. 2003. Introductory Lectures on Convex Optimization: A Basic Course. Vol. 87. Springer Science and Business Media."},{"issue":"151","key":"e_1_3_4_52_2","article-title":"Updating quasi-newton matrices with limited storage","volume":"35","author":"Nocedal Jorge","year":"1980","unstructured":"Jorge Nocedal. 1980. Updating quasi-newton matrices with limited storage. Mathematics of Computation 35, 151 (1980).","journal-title":"Mathematics of Computation"},{"issue":"3","key":"e_1_3_4_53_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1379759.1379762","article-title":"Computing correlated equilibria in multi-player games","volume":"55","author":"Papadimitriou Christos H.","year":"2008","unstructured":"Christos H. Papadimitriou and Tim Roughgarden. 2008. Computing correlated equilibria in multi-player games. Journal of the ACM 55, 3 (2008), 1\u201329.","journal-title":"Journal of the ACM"},{"issue":"1","key":"e_1_3_4_54_2","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1137\/15M1021106","article-title":"Newton sketch: A near linear-time optimization algorithm with linear-quadratic convergence","volume":"27","author":"Pilanci Mert","year":"2017","unstructured":"Mert Pilanci and Martin J. Wainwright. 2017. Newton sketch: A near linear-time optimization algorithm with linear-quadratic convergence. SIAM Journal on Optimization 27, 1 (2017), 205\u2013245.","journal-title":"SIAM Journal on Optimization"},{"key":"e_1_3_4_55_2","doi-asserted-by":"crossref","first-page":"732","DOI":"10.1109\/FOCS.2017.73","volume-title":"Proceedings of the 2017 IEEE 58th Annual Symposium on Foundations of Computer Science","author":"Raz Ran","year":"2017","unstructured":"Ran Raz. 2017. A time-space lower bound for a large class of learning problems. In Proceedings of the 2017 IEEE 58th Annual Symposium on Foundations of Computer Science. IEEE, 732\u2013742."},{"issue":"1","key":"e_1_3_4_56_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3186563","article-title":"Fast learning requires good memory: A time-space lower bound for parity learning","volume":"66","author":"Raz Ran","year":"2018","unstructured":"Ran Raz. 2018. Fast learning requires good memory: A time-space lower bound for parity learning. Journal of the ACM 66, 1 (2018), 1\u201318.","journal-title":"Journal of the ACM"},{"issue":"1","key":"e_1_3_4_57_2","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1007\/s10107-018-1346-5","article-title":"Sub-sampled newton methods","volume":"174","author":"Roosta-Khorasani Farbod","year":"2019","unstructured":"Farbod Roosta-Khorasani and Michael W. Mahoney. 2019. Sub-sampled newton methods. Mathematical Programming 174, 1 (2019), 293\u2013326.","journal-title":"Mathematical Programming"},{"key":"e_1_3_4_58_2","first-page":"1","article-title":"Hanson-wright inequality and sub-gaussian concentration","volume":"18","author":"Rudelson Mark","year":"2013","unstructured":"Mark Rudelson and Roman Vershynin. 2013. Hanson-wright inequality and sub-gaussian concentration. Electronic Communications in Probability 18 (2013), 1\u20139.","journal-title":"Electronic Communications in Probability"},{"key":"e_1_3_4_59_2","article-title":"Fundamental limits of online and distributed algorithms for statistical learning and estimation","volume":"27","author":"Shamir Ohad","year":"2014","unstructured":"Ohad Shamir. 2014. Fundamental limits of online and distributed algorithms for statistical learning and estimation. Advances in Neural Information Processing Systems 27 (2014).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_60_2","first-page":"1000","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Shamir Ohad","year":"2014","unstructured":"Ohad Shamir, Nati Srebro, and Tong Zhang. 2014. Communication-efficient distributed optimization using an approximate newton-type method. In Proceedings of the International Conference on Machine Learning. PMLR, 1000\u20131008."},{"key":"e_1_3_4_61_2","doi-asserted-by":"crossref","first-page":"890","DOI":"10.1145\/3313276.3316403","volume-title":"Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing","author":"Sharan Vatsal","year":"2019","unstructured":"Vatsal Sharan, Aaron Sidford, and Gregory Valiant. 2019. Memory-sample tradeoffs for linear regression with small error. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing. 890\u2013901."},{"issue":"1","key":"e_1_3_4_62_2","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1007\/BF01071394","article-title":"Cut-off method with space extension in convex programming problems","volume":"13","author":"Shor Naum Z.","year":"1977","unstructured":"Naum Z. Shor. 1977. Cut-off method with space extension in convex programming problems. Cybernetics 13, 1 (1977), 94\u201396.","journal-title":"Cybernetics"},{"key":"e_1_3_4_63_2","unstructured":"Max Simchowitz. 2018. On the randomized complexity of minimizing a convex quadratic function. arXiv:1807.09386. Retrieved from https:\/\/arxiv.org\/abs\/1807.09386"},{"key":"e_1_3_4_64_2","doi-asserted-by":"crossref","first-page":"1249","DOI":"10.1145\/3188745.3188796","volume-title":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","author":"Simchowitz Max","year":"2018","unstructured":"Max Simchowitz, Ahmed El Alaoui, and Benjamin Recht. 2018. Tight query complexity lower bounds for PCA via finite sample deformed wigner law. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing. 1249\u20131259."},{"key":"e_1_3_4_65_2","first-page":"779","article-title":"Properties of a cutting plane method for semidefinite programming","volume":"8","author":"Sivaramakrishnan Kartik","year":"2012","unstructured":"Kartik Sivaramakrishnan and John Mitchell. 2012. Properties of a cutting plane method for semidefinite programming. Pacific Journal of Optimization 8 (2012), 779\u2013802.","journal-title":"Pacific Journal of Optimization"},{"key":"e_1_3_4_66_2","first-page":"1564","volume-title":"Proceedings of the Conference on Learning Theory","author":"Steinhardt Jacob","year":"2015","unstructured":"Jacob Steinhardt and John Duchi. 2015. Minimax rates for memory-bounded sparse linear regression. In Proceedings of the Conference on Learning Theory. PMLR, 1564\u20131587."},{"key":"e_1_3_4_67_2","first-page":"1490","volume-title":"Proceedings of the Conference on Learning Theory","author":"Steinhardt Jacob","year":"2016","unstructured":"Jacob Steinhardt, Gregory Valiant, and Stefan Wager. 2016. Memory, communication, and statistical queries. In Proceedings of the Conference on Learning Theory. PMLR, 1490\u20131516."},{"issue":"4","key":"e_1_3_4_68_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3470566","article-title":"Querying a matrix through matrix-vector products","volume":"17","author":"Sun Xiaoming","year":"2021","unstructured":"Xiaoming Sun, David P. Woodruff, Guang Yang, and Jialin Zhang. 2021. Querying a matrix through matrix-vector products. ACM Transactions on Algorithms 17, 4 (2021), 1\u201319.","journal-title":"ACM Transactions on Algorithms"},{"key":"e_1_3_4_69_2","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1109\/SFCS.1989.63500","volume-title":"Proceedings of the 30th Annual Symposium on Foundations of Computer Science","author":"Vaidya Pravin M.","year":"1989","unstructured":"Pravin M. Vaidya. 1989. A new algorithm for minimizing convex functions over convex sets. In Proceedings of the 30th Annual Symposium on Foundations of Computer Science. IEEE Computer Society, 338\u2013343."},{"key":"e_1_3_4_70_2","volume-title":"High-dimensional Probability: An Introduction with Applications in Data Science","author":"Vershynin Roman","year":"2018","unstructured":"Roman Vershynin. 2018. High-dimensional Probability: An Introduction with Applications in Data Science. Vol. 47. Cambridge University Press."},{"key":"e_1_3_4_71_2","volume-title":"High-dimensional Statistics: A Non-asymptotic Viewpoint","author":"Wainwright Martin J.","year":"2019","unstructured":"Martin J. Wainwright. 2019. High-dimensional Statistics: A Non-asymptotic Viewpoint. Vol. 48. Cambridge University Press."},{"key":"e_1_3_4_72_2","unstructured":"Blake Woodworth and Nathan Srebro. 2017. Lower bound for randomized first order convex optimization. arXiv:1709.03594. Retrieved from https:\/\/arxiv.org\/abs\/1709.03594"},{"key":"e_1_3_4_73_2","first-page":"3202","volume-title":"Proceedings of the Conference on Learning Theory","author":"Woodworth Blake","year":"2019","unstructured":"Blake Woodworth and Nathan Srebro. 2019. Open problem: The oracle complexity of convex optimization with limited memory. In Proceedings of the Conference on Learning Theory. PMLR, 3202\u20133210."},{"key":"e_1_3_4_74_2","first-page":"4386","volume-title":"Proceedings of the Conference on Learning Theory","author":"Woodworth Blake E.","year":"2021","unstructured":"Blake E. Woodworth, Brian Bullins, Ohad Shamir, and Nathan Srebro. 2021. The min-max complexity of distributed stochastic convex optimization with intermittent communication. In Proceedings of the Conference on Learning Theory. PMLR, 4386\u20134437."},{"key":"e_1_3_4_75_2","article-title":"Tight complexity bounds for optimizing composite objectives","volume":"29","author":"Woodworth Blake E.","year":"2016","unstructured":"Blake E. Woodworth and Nati Srebro. 2016. Tight complexity bounds for optimizing composite objectives. Advances in Neural Information Processing Systems 29 (2016).","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"1","key":"e_1_3_4_76_2","first-page":"35","article-title":"Newton-type methods for non-convex optimization under inexact hessian information","volume":"184","author":"Xu Peng","year":"2020","unstructured":"Peng Xu, Fred Roosta, and Michael W. Mahoney. 2020. Newton-type methods for non-convex optimization under inexact hessian information. Mathematical Programming 184, 1 (2020), 35\u201370.","journal-title":"Mathematical Programming"},{"key":"e_1_3_4_77_2","first-page":"5610","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Ye Min","year":"2018","unstructured":"Min Ye and Emmanuel Abbe. 2018. Communication-computation efficient gradient coding. In Proceedings of the International Conference on Machine Learning. PMLR, 5610\u20135619."},{"issue":"2","key":"e_1_3_4_78_2","first-page":"22","article-title":"Informational complexity and efficient methods for the solution of convex extremal problems","volume":"13","author":"Yudin D. B.","year":"1976","unstructured":"D. B. Yudin and Arkadi S. Nemirovskii. 1976. Informational complexity and efficient methods for the solution of convex extremal problems. Matekon 13, 2 (1976), 22\u201345.","journal-title":"Matekon"},{"key":"e_1_3_4_79_2","first-page":"2328","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Zhang Yuchen","year":"2013","unstructured":"Yuchen Zhang, John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. 2013. Information-theoretic lower bounds for distributed statistical estimation with communication constraints.. In Proceedings of the Conference on Neural Information Processing Systems. Citeseer, 2328\u20132336."}],"container-title":["Journal of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3689208","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3689208","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T00:57:55Z","timestamp":1750294675000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3689208"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,8]]},"references-count":78,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,12,31]]}},"alternative-id":["10.1145\/3689208"],"URL":"https:\/\/doi.org\/10.1145\/3689208","relation":{},"ISSN":["0004-5411","1557-735X"],"issn-type":[{"value":"0004-5411","type":"print"},{"value":"1557-735X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,8]]},"assertion":[{"value":"2023-07-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-08-04","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-11-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}