{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,4,25]],"date-time":"2024-04-25T00:07:05Z","timestamp":1714003625849},"reference-count":29,"publisher":"MIT Press","issue":"5","content-domain":{"domain":["direct.mit.edu"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,4,23]]},"abstract":"<jats:title>Abstract<\/jats:title>\n               <jats:p>Zeroth-order (ZO) optimization is one key technique for machine learning problems where gradient calculation is expensive or impossible. Several variance, reduced ZO proximal algorithms have been proposed to speed up ZO optimization for nonsmooth problems, and all of them opted for the coordinated ZO estimator against the random ZO estimator when approximating the true gradient, since the former is more accurate. While the random ZO estimator introduces a larger error and makes convergence analysis more challenging compared to coordinated ZO estimator, it requires only O(1) computation, which is significantly less than O(d) computation of the coordinated ZO estimator, with d being dimension of the problem space. To take advantage of the computationally efficient nature of the random ZO estimator, we first propose a ZO objective decrease (ZOOD) property that can incorporate two different types of errors in the upper bound of convergence rate. Next, we propose two generic reduction frameworks for ZO optimization, which can automatically derive the convergence results for convex and nonconvex problems, respectively, as long as the convergence rate for the inner solver satisfies the ZOOD property. 
By applying the two reduction frameworks to our proposed ZOR-ProxSVRG and ZOR-ProxSAGA, two variance-reduced ZO proximal algorithms with fully random ZO estimators, we improve the state-of-the-art function query complexities from O(min{dn^{1\/2}\/\u03b5^2, d\/\u03b5^3}) to O\u02dc(n+d\/\u03b5^2) under d &amp;gt; n^{1\/2} for nonconvex problems, and from O(d\/\u03b5^2) to O\u02dc(n log(1\/\u03b5) + d\/\u03b5) for convex problems. Finally, we conduct experiments to verify the superiority of our proposed methods.<\/jats:p>","DOI":"10.1162\/neco_a_01636","type":"journal-article","created":{"date-parts":[[2024,3,8]],"date-time":"2024-03-08T20:56:43Z","timestamp":1709931403000},"page":"897-935","update-policy":"http:\/\/dx.doi.org\/10.1162\/mitpressjournals.corrections.policy","source":"Crossref","is-referenced-by-count":0,"title":["Obtaining Lower Query Complexities Through Lightweight Zeroth-Order Proximal Gradient Algorithms"],"prefix":"10.1162","volume":"36","author":[{"given":"Bin","family":"Gu","sequence":"first","affiliation":[{"name":"School of Artificial Intelligence, Jilin University, Changchun, 130012 Jilin, China"},{"name":"Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, U.A.E. jsgubin@gmail.com"}]},{"given":"Xiyuan","family":"Wei","sequence":"additional","affiliation":[{"name":"Texas A&M University, College Station, TX 77843, U.S.A. xiyuan@tamu.edu"}]},{"given":"Hualin","family":"Zhang","sequence":"additional","affiliation":[{"name":"Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, U.A.E. zhanghualin98@gmail.com"}]},{"given":"Yi","family":"Chang","sequence":"additional","affiliation":[{"name":"School of Artificial Intelligence, Jilin University, Changchun, 130012 Jilin, China yichang@jlu.edu.cn"}]},{"given":"Heng","family":"Huang","sequence":"additional","affiliation":[{"name":"University of Maryland, College Park, MD 20742, U.S.A. 
henghuanghh@gmail.com"}]}],"member":"281","published-online":{"date-parts":[[2024,4,23]]},"reference":[{"key":"2024042423252924000_bib1","first-page":"1614","article-title":"Optimal black-box reductions between optimization objectives","volume-title":"Advances in neural information processing systems","author":"Allen-Zhu","year":"2016"},{"issue":"3","key":"2024042423252924000_bib2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/1961189.1961199","article-title":"Libsvm: A library for support vector machines","volume":"2","author":"Chang","year":"2011","journal-title":"ACM Transactions on Intelligent Systems and Technology"},{"key":"2024042423252924000_bib3","first-page":"7202","article-title":"Zo-AdaMM, Zeroth-order adaptive momentum method for black-box optimization","volume-title":"Advances in neural information processing systems","author":"Chen","year":"2019"},{"key":"2024042423252924000_bib4","first-page":"1681","article-title":"An accelerated DFO algorithm for finite-sum convex functions","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Chen","year":"2020"},{"key":"2024042423252924000_bib5","volume-title":"Universal stagewise learning for non-convex problems with convergence on averaged solutions.","author":"Chen","year":"2018"},{"key":"2024042423252924000_bib6","volume-title":"Structured evolution with compact architectures for scalable policy optimization.","author":"Choromanski","year":"2018"},{"key":"2024042423252924000_bib7","first-page":"1646","article-title":"SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives","volume-title":"Advances in neural information processing systems","author":"Defazio","year":"2014"},{"key":"2024042423252924000_bib8","volume-title":"An accelerated method for derivative-free smooth stochastic convex 
optimization.","author":"Dvurechensky","year":"2018"},{"key":"2024042423252924000_bib9","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-030-58542-6_3","article-title":"Sparse adversarial attack via perturbation factorization","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Fan","year":"2020"},{"key":"2024042423252924000_bib10","first-page":"689","article-title":"SPIDER: Near-optimal non- convex optimization via stochastic path-integrated differential estimator","volume-title":"Advances in neural information processing systems","author":"Fang","year":"2018"},{"key":"2024042423252924000_bib11","first-page":"385","article-title":"Online convex optimization in the bandit setting, gradient descent without a gradient","volume-title":"Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms","author":"Flaxman","year":"2005"},{"issue":"1","key":"2024042423252924000_bib12","doi-asserted-by":"publisher","first-page":"327","DOI":"10.1007\/s10915-017-0621-6","article-title":"On the information-adaptive variants of the ADMM: An iteration complexity perspective","volume":"76","author":"Gao","year":"2018","journal-title":"Journal of Scientific Computing"},{"issue":"4","key":"2024042423252924000_bib13","doi-asserted-by":"publisher","first-page":"2341","DOI":"10.1137\/120880811","article-title":"Stochastic first-and zeroth-order methods for nonconvex stochastic programming","volume":"23","author":"Ghadimi","year":"2013","journal-title":"SIAM Journal on Optimization"},{"issue":"1\u20132","key":"2024042423252924000_bib14","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1007\/s10107-014-0846-1","article-title":"Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization","volume":"155","author":"Ghadimi","year":"2016","journal-title":"Mathematical Programming"},{"key":"2024042423252924000_bib15","first-page":"1812","article-title":"Faster derivative-free stochastic algorithm for 
shared memory machines","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Gu","year":"2018"},{"key":"2024042423252924000_bib16","doi-asserted-by":"crossref","DOI":"10.1609\/aaai.v32i1.11802","article-title":"Inexact proximal gradient methods for non-convex and non-smooth optimization","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Gu","year":"2018"},{"key":"2024042423252924000_bib17","doi-asserted-by":"publisher","first-page":"1503","DOI":"10.1609\/aaai.v33i01.33011503","article-title":"Faster gradient-free proximal stochastic methods for nonconvex nonsmooth optimization","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Huang","year":"2019"},{"key":"2024042423252924000_bib18","first-page":"3100","article-title":"Improved zeroth-order variance reduced algorithms and analysis for nonconvex optimization","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Ji","year":"2019"},{"key":"2024042423252924000_bib19","first-page":"1519","article-title":"An interior-point method for large-scale l1-regularized logistic regression","volume":"8","author":"Koh","year":"2007","journal-title":"Journal of Machine Learning Research"},{"issue":"3","key":"2024042423252924000_bib20","doi-asserted-by":"publisher","first-page":"731","DOI":"10.1007\/s10589-021-00313-3","article-title":"A zeroth order method for stochastic weakly convex optimization","volume":"80","author":"Kungurtsev","year":"2021","journal-title":"Computational Optimization and Applications"},{"key":"2024042423252924000_bib21","volume-title":"Adversarial machine learning at scale.","author":"Kurakin","year":"2016"},{"key":"2024042423252924000_bib22","first-page":"3054","article-title":"A comprehensive linear speedup analysis for asynchronous stochastic parallel optimization from zeroth-order to first-order","volume-title":"Advances in neural information processing 
systems","author":"Lian","year":"2016"},{"key":"2024042423252924000_bib23","first-page":"3384","article-title":"A universal catalyst for first-order optimization","volume-title":"Advances in neural information processing systems","author":"Lin","year":"2015"},{"key":"2024042423252924000_bib24","first-page":"3727","article-title":"Zeroth-order stochastic variance reduction for nonconvex optimization","volume-title":"Advances in neural information processing systems","author":"Liu","year":"2018"},{"key":"2024042423252924000_bib25","volume-title":"Towards deep learning models resistant to adversarial attacks.","author":"Madry","year":"2017"},{"issue":"2","key":"2024042423252924000_bib26","doi-asserted-by":"publisher","first-page":"527","DOI":"10.1007\/s10208-015-9296-2","article-title":"Random gradient-free minimization of convex functions","volume":"17","author":"Nesterov","year":"2017","journal-title":"Foundations of Computational Mathematics"},{"key":"2024042423252924000_bib27","first-page":"31525","article-title":"Black-box generalization: Stability of zeroth-order learning","volume-title":"Advances in neural information processing systems","author":"Nikolakakis","year":"2022"},{"key":"2024042423252924000_bib28","doi-asserted-by":"crossref","first-page":"506","DOI":"10.1145\/3052973.3053009","article-title":"Practical black-box attacks against machine learning","volume-title":"Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security","author":"Papernot","year":"2017"},{"issue":"4","key":"2024042423252924000_bib29","doi-asserted-by":"publisher","first-page":"2057","DOI":"10.1137\/140961791","article-title":"A proximal stochastic gradient method with progressive variance reduction","volume":"24","author":"Xiao","year":"2014","journal-title":"SIAM Journal on Optimization"}],"container-title":["Neural 
Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/36\/5\/897\/2366819\/neco_a_01636.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/direct.mit.edu\/neco\/article-pdf\/36\/5\/897\/2366819\/neco_a_01636.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,4,24]],"date-time":"2024-04-24T23:25:55Z","timestamp":1714001155000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/36\/5\/897\/119786\/Obtaining-Lower-Query-Complexities-Through"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,23]]},"references-count":29,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,4,23]]},"published-print":{"date-parts":[[2024,4,23]]}},"URL":"https:\/\/doi.org\/10.1162\/neco_a_01636","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"value":"0899-7667","type":"print"},{"value":"1530-888X","type":"electronic"}],"subject":[],"published-other":{"date-parts":[[2024,5]]},"published":{"date-parts":[[2024,4,23]]}}}