{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:07:48Z","timestamp":1760710068739,"version":"3.41.0"},"reference-count":103,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2019,12,6]],"date-time":"2019-12-06T00:00:00Z","timestamp":1575590400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["AI Matters"],"published-print":{"date-parts":[[2019,12,6]]},"abstract":"<jats:p>\n            As data gets more complex and applications of machine learning (ML) algorithms for decision-making broaden and diversify, traditional ML methods by minimizing an unconstrained or simply constrained convex objective are becoming increasingly unsatisfactory. To address this new challenge, recent ML research has sparked a\n            <jats:italic>paradigm shift<\/jats:italic>\n            in learning predictive models into non-convex learning and heavily constrained learning. Non-Convex Learning (NCL) refers to a family of learning methods that involve optimizing non-convex objectives. Heavily Constrained Learning (HCL) refers to a family of learning methods that involve constraints that are much more complicated than a simple norm constraint (e.g., data-dependent functional constraints, non-convex constraints), as in conventional learning. 
This paradigm shift has already created many promising outcomes: (i) non-convex deep learning has brought breakthroughs for learning representations from\n            <jats:italic>large-scale structured data<\/jats:italic>\n            (e.g., images, speech) (LeCun, Bengio, &amp; Hinton, 2015; Krizhevsky, Sutskever, &amp; Hinton, 2012; Amodei et al., 2016; Deng &amp; Liu, 2018); (ii) non-convex regularizers (e.g., for enforcing sparsity or low-rank) could be more effective than their convex counterparts for learning\n            <jats:italic>high-dimensional structured models<\/jats:italic>\n            (C.-H. Zhang &amp; Zhang, 2012; J. Fan &amp; Li, 2001; C.-H. Zhang, 2010; T. Zhang, 2010); (iii) constrained learning is being used to learn predictive models that satisfy various constraints to\n            <jats:italic>respect social norms<\/jats:italic>\n            (e.g., fairness) (B. E. Woodworth, Gunasekar, Ohannessian, &amp; Srebro, 2017; Hardt, Price, Srebro, et al., 2016; Zafar, Valera, Gomez Rodriguez, &amp; Gummadi, 2017; A. Agarwal, Beygelzimer, Dud\u00edk, Langford, &amp; Wallach, 2018), to\n            <jats:italic>improve the interpretability<\/jats:italic>\n            (Gupta et al., 2016; Canini, Cotter, Gupta, Fard, &amp; Pfeifer, 2016; You, Ding, Canini, Pfeifer, &amp; Gupta, 2017), to\n            <jats:italic>enhance the robustness<\/jats:italic>\n            (Globerson &amp; Roweis, 2006a; Sra, Nowozin, &amp; Wright, 2011; T. Yang, Mahdavi, Jin, Zhang, &amp; Zhou, 2012), etc. 
In spite of great promises brought by these new learning paradigms, they also bring emerging challenges to the design of computationally efficient algorithms for\n            <jats:italic>big data<\/jats:italic>\n            and the analysis of their statistical properties.\n          <\/jats:p>","DOI":"10.1145\/3362077.3362085","type":"journal-article","created":{"date-parts":[[2019,12,9]],"date-time":"2019-12-09T13:35:27Z","timestamp":1575898527000},"page":"29-39","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Advancing non-convex and constrained learning"],"prefix":"10.1145","volume":"5","author":[{"given":"Tianbao","family":"Yang","sequence":"first","affiliation":[{"name":"The University of Iowa"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2019,12,6]]},"reference":[{"volume-title":"Proceedings of the 35th international conference on machine learning (icml) (pp.-).","year":"2018","author":"Agarwal A.","key":"e_1_2_1_1_1"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3055399.3055464"},{"key":"e_1_2_1_3_1","unstructured":"Allen-Zhu Z. Li Y. & Song Z. (2018). A convergence theory for deep learning via over-parameterization. CoRR abs\/1811.03962."},{"key":"e_1_2_1_4_1","unstructured":"Allen-Zhu Z. (2017). Natasha 2: Faster non-convex optimization than sgd. CoRR \/abs\/1708.08694\/v4."},{"volume-title":"Proceedings of the 33rd international conference on international conference on machine learning (icml) (pp. 
173--182)","year":"2016","author":"Amodei D.","key":"e_1_2_1_5_1"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1080\/02331934.2016.1253694"},{"volume-title":"International conference on machine learning (pp. 214--223)","year":"2017","author":"Arjovsky M.","key":"e_1_2_1_7_1"},{"key":"e_1_2_1_8_1","unstructured":"Arora S. Cohen N. & Hazan E. (2018). On the optimization of deep networks: Implicit acceleration by overparameterization. arXiv preprint arXiv:1802.06509."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10107-011-0484-9"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.324"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10107-013-0701-9"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s13675-015-0045-8"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00041-008-9045-x"},{"volume-title":"Proceedings of the 30th international conference on neural information processing systems (nips) (pp. 2927--2935)","year":"2016","author":"Canini K.","key":"e_1_2_1_14_1"},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Carlini N. & Wagner D. (2017). Towards evaluating the robustness of neural networks. In 2017 ieee symposium on security and privacy (sp) (pp. 39--57).","DOI":"10.1109\/SP.2017.49"},{"key":"e_1_2_1_16_1","unstructured":"Carmon Y. Duchi J. C. Hinder O. & Sidford A. (2016). Accelerated methods for non-convex optimization. 
CoRR abs\/1611.00756."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10107-009-0337-y"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.5555\/1966622.1966624"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2012.2208955"},{"volume-title":"Splitting methods in communication, imaging, science, and engineering (pp. 237--249)","year":"2016","author":"Chartrand R.","key":"e_1_2_1_20_1"},{"key":"e_1_2_1_21_1","unstructured":"Chen J. & Gu Q. (2018). Closing the generalization gap of adaptive gradient methods in training deep neural networks. arXiv preprint arXiv:1806.06763."},{"volume-title":"7th international conference on learning representations, ICLR 2019","year":"2019","author":"Chen Z.","key":"e_1_2_1_22_1"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1137\/15M1026924"},{"volume-title":"Proceedings of the 34th international conference on machine learning-volume 70 (pp. 854--863)","year":"2017","author":"Cisse M.","key":"e_1_2_1_24_1"},{"key":"e_1_2_1_25_1","unstructured":"Daskalakis C. Ilyas A. Syrgkanis V. & Zeng H. (2017). Training gans with optimism. CoRR abs\/1711.00141."},{"key":"e_1_2_1_26_1","unstructured":"Davis D. & Drusvyatskiy D. (2018). Stochastic subgradient method converges at the rate o(k-1\/4) on weakly convex functions. arXiv preprint arXiv:1802.02988."},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Deng L. & Liu Y. (2018). 
Deep learning in natural language processing. Springer.","DOI":"10.1007\/978-981-10-5209-5"},{"key":"e_1_2_1_28_1","unstructured":"Du S. S. Zhai X. Poczos B. & Singh A. (2018). Gradient descent provably optimizes over-parameterized neural networks. arXiv preprint arXiv:1810.02054."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2021068"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1198\/016214501753382273"},{"key":"e_1_2_1_31_1","first-page":"497","volume-title":"Advances in neural information processing systems 30: Annual conference on neural information processing systems","author":"Fan Y.","year":"2017"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1137\/120880811"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143889"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/1143844.1143889"},{"volume-title":"Advances in neural information processing systems (pp. 2672--2680).","year":"2014","author":"Goodfellow I.","key":"e_1_2_1_35_1"},{"key":"e_1_2_1_36_1","unstructured":"Gouk H. Frank E. Pfahringer B. & Cree M. (2018). Regularisation of neural networks by enforcing lipschitz continuity. arXiv preprint arXiv:1804.04368."},{"key":"e_1_2_1_37_1","unstructured":"Grnarova P. Levy K. Y. Lucchi A. Hofmann T. & Krause A. (2017). An online learning approach to generative adversarial networks. 
CoRR abs\/1706.03269."},{"key":"e_1_2_1_38_1","article-title":"Monotonic calibrated interpolated look-up tables","volume":"17","author":"Gupta M. R.","year":"2016","journal-title":"Journal of Machine Learning Research (JMLR)"},{"volume-title":"Advances in neural information processing systems (pp. 3315--3323).","year":"2016","author":"Hardt M.","key":"e_1_2_1_39_1"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"volume-title":"Advances in neural information processing systems 30 nips) (pp. 6629--6640).","year":"2017","author":"Heusel M.","key":"e_1_2_1_41_1"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2512329"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11590-016-1031-7"},{"volume-title":"3rd international conference on learning representations, ICLR 2015, san diego, ca, usa, may 7--9, 2015, conference track proceedings.","year":"2015","author":"Kingma D. P.","key":"e_1_2_1_44_1"},{"volume-title":"Advances in neural information processing systems 30 (pp. 1675--1685).","year":"2017","author":"Kiryo R.","key":"e_1_2_1_45_1"},{"volume-title":"Proceedings of the international conference on machine learning (icml) (pp. 1895--1904)","year":"2017","author":"Kohler J. M.","key":"e_1_2_1_46_1"},{"volume-title":"Advances in neural information processing systems (nips) (pp. 1106--1114).","year":"2012","author":"Krizhevsky A.","key":"e_1_2_1_47_1"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/s40595-013-0010-5"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10100-013-0324-5"},{"key":"e_1_2_1_51_1","first-page":"379","volume-title":"Proceedings of the 28th international conference on neural information processing systems -","volume":"1","author":"Li H.","year":"2015"},{"key":"e_1_2_1_52_1","unstructured":"Li X. & Orabona F. (2018). 
On the convergence of stochastic gradient descent with adaptive stepsizes. arXiv preprint arXiv:1805.08114."},{"volume-title":"Advances in neural information processing systems (neurips) (pp. 8157--8166).","year":"2018","author":"Li Y.","key":"e_1_2_1_53_1"},{"key":"e_1_2_1_54_1","unstructured":"Lin Q. Liu M. Rafique H. & Yang T. (2018). Solving weakly-convex-weakly-concave saddle-point problems as weakly-monotone variational inequality. arXiv preprint arXiv:1810.10207."},{"key":"e_1_2_1_55_1","doi-asserted-by":"crossref","unstructured":"Lin Q. Nadarajah S. Soheili N. & Yang T. (2019). A data efficient and feasible level set method for stochastic convex optimization with expectation constraints. CoRR abs\/1908.03077.","DOI":"10.2139\/ssrn.3433280"},{"key":"e_1_2_1_56_1","unstructured":"Liu M. & Yang T. (2017a). On noisy negative curvature descent: Competing with gradient descent for faster non-convex optimization. CoRR abs\/1709.08571."},{"key":"e_1_2_1_57_1","unstructured":"Liu M. & Yang T. (2017b). Stochastic non-convex optimization with strong high probability second-order convergence. CoRR abs\/1710.09447."},{"key":"e_1_2_1_58_1","doi-asserted-by":"crossref","unstructured":"Liu T. 
Pong T. K. & Takeda A. (2018 Sep 08). A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems. Mathematical Programming.","DOI":"10.1007\/s10107-018-1327-8"},{"key":"e_1_2_1_59_1","unstructured":"Luo L. Xiong Y. Liu Y. & Sun X. (2019). Adaptive gradient methods with dynamic bound of learning rate. arXiv preprint arXiv:1902.09843."},{"key":"e_1_2_1_60_1","unstructured":"Ma R. Lin Q. & Yang T. (2019). Proximally constrained methods for weakly convex optimization with weakly convex constraints. arXiv preprint arXiv:1908.01871."},{"volume-title":"Advances in neural information processing systems (nips) (p. 503--511).","year":"2012","author":"Mahdavi M.","key":"e_1_2_1_61_1"},{"volume-title":"Advances in neural information processing systems 30 (nips) (pp. 5591--5600).","year":"2017","author":"Nagarajan V.","key":"e_1_2_1_62_1"},{"volume-title":"Advances in neural information processing systems (pp. 2208--2216).","year":"2016","author":"Namkoong H.","key":"e_1_2_1_63_1"},{"volume-title":"Advances in neural information processing systems (pp. 2971--2980).","year":"2017","author":"Namkoong H.","key":"e_1_2_1_64_1"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.5555\/3112681.3113165"},{"volume-title":"Artificial intelligence and statistics (pp. 470--478).","year":"2017","author":"Nitanda A.","key":"e_1_2_1_66_1"},{"key":"e_1_2_1_67_1","unstructured":"Radford A. Metz L. & Chintala S. (2015). 
Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434."},{"key":"e_1_2_1_68_1","unstructured":"Rafique H. Liu M. Lin Q. & Yang T. (2018). Non-convex min-max optimization: Provable algorithms and applications in machine learning. CoRR abs\/1810.02060."},{"key":"e_1_2_1_69_1","unstructured":"Ravi S. N. Dinh T. Lokhande V. S. R. & Singh V. (2018). Constrained deep learning using conditional gradient and applications in computer vision. arXiv preprint arXiv:1803.06453."},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33014780"},{"key":"e_1_2_1_71_1","unstructured":"Reddi S. J. Zaheer M. Sra S. Poczos B. Bach F. Salakhutdinov R. & Smola A. J. (2017). A generic approach for escaping saddle points. arXiv preprint arXiv:1709.01434."},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.5555\/1953048.2078196"},{"key":"e_1_2_1_73_1","unstructured":"Royer C. W. & Wright S. J. (2017). Complexity analysis of second-order line-search algorithms for smooth nonconvex optimization. 
CoRR abs\/1706.03131."},{"key":"e_1_2_1_74_1","doi-asserted-by":"crossref","unstructured":"Sra S. Nowozin S. & Wright S. J. (2011). Optimization for machine learning. The MIT Press.","DOI":"10.7551\/mitpress\/8996.001.0001"},{"key":"e_1_2_1_75_1","unstructured":"Tan M. & Le Q. V. (2019). Efficient-net: Rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:1905.11946."},{"volume-title":"Proceedings of the 34th international conference on machine learning-volume 70 (pp. 3394--3403)","year":"2017","author":"Thi H. A. L.","key":"e_1_2_1_76_1"},{"key":"e_1_2_1_77_1","doi-asserted-by":"publisher","DOI":"10.1145\/3180155.3180220"},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2880454"},{"key":"e_1_2_1_79_1","unstructured":"Woodworth B. Gunasekar S. Ohannessian M. I. & Srebro N. (2017). Learning non-discriminatory predictors. arXiv preprint arXiv:1702.06081."},{"volume-title":"Proceedings of the 30th conference on learning theory, COLT 2017, amsterdam, the netherlands, 7--10 july 2017 (pp. 1920--1953)","year":"2017","author":"Woodworth B. E.","key":"e_1_2_1_80_1"},{"key":"e_1_2_1_81_1","doi-asserted-by":"publisher","DOI":"10.1198\/016214507000000617"},{"key":"e_1_2_1_82_1","unstructured":"Xu P. Roosta-Khorasani F. & Mahoney M. W. (2017). Newton-type methods for non-convex optimization under inexact hessian information. CoRR abs\/1708.07164."},{"key":"e_1_2_1_83_1","unstructured":"Xu Y. Jin R. & Yang T. 
(2019). Stochastic proximal gradient methods for non-smooth non-convex regularized problems. arXiv preprint arXiv:1902.07672."},{"volume-title":"Proceedings of the 34th international conference on machine learning-volume 70 (pp. 3821--3830)","year":"2017","author":"Xu Y.","key":"e_1_2_1_84_1"},{"volume-title":"Proceedings of the 36th international conference on machine learning, ICML 2019, 9--15 june","year":"2019","author":"Xu Y.","key":"e_1_2_1_85_1"},{"volume-title":"Advances in neural information processing systems (neurips) (pp. 5530--5540).","year":"2018","author":"Xu Y.","key":"e_1_2_1_86_1"},{"volume-title":"Proceedings of the thirty-fifth conference on uncertainty in artificial intelligence, UAI 2019","year":"2019","author":"Xu Y.","key":"e_1_2_1_87_1"},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2018\/410"},{"key":"e_1_2_1_89_1","unstructured":"Yang L. (2018). Proximal gradient method with extrapolation and line search for a class of nonconvex and nonsmooth problems. CoRR abs\/1711.06831."},{"key":"e_1_2_1_90_1","doi-asserted-by":"publisher","DOI":"10.5555\/3305890.3306084"},{"volume-title":"Proceedings of the international conference on machine learning (icml) (pp. 233--240)","year":"2012","author":"Yang T.","key":"e_1_2_1_91_1"},{"volume-title":"Advances in neural information processing systems 30 (nips) (pp. 
2985--2993).","year":"2017","author":"You S.","key":"e_1_2_1_92_1"},{"volume-title":"The 17th international conference on artificial intelligence and statistics (AISTATS).","year":"2015","author":"Yu Y.","key":"e_1_2_1_93_1"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052660"},{"key":"e_1_2_1_95_1","first-page":"9793","volume-title":"S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, & R","author":"Zaheer M.","year":"2018"},{"key":"e_1_2_1_96_1","doi-asserted-by":"publisher","DOI":"10.1214\/09-AOS729"},{"key":"e_1_2_1_97_1","doi-asserted-by":"publisher","DOI":"10.1214\/12-STS399"},{"key":"e_1_2_1_98_1","unstructured":"Zhang S. & Xin J. (2014). Minimization of transformed I_1 penalty: Theory difference of convex function algorithm and robust application in compressed sensing. CoRR abs\/1411.5735."},{"key":"e_1_2_1_99_1","doi-asserted-by":"publisher","DOI":"10.5555\/1756006.1756041"},{"key":"e_1_2_1_100_1","doi-asserted-by":"publisher","DOI":"10.5555\/2892753.2892858"},{"key":"e_1_2_1_101_1","unstructured":"Zhou D. Tang Y. Yang Z. Cao Y. & Gu Q. (2018). On the convergence of adaptive gradient methods for nonconvex optimization. arXiv preprint arXiv:1808.05671."},{"volume-title":"The 22nd international conference on artificial intelligence and statistics (pp. 517--526)","year":"2019","author":"Zhu D.","key":"e_1_2_1_102_1"},{"key":"e_1_2_1_103_1","unstructured":"Zou D. Cao Y. Zhou D. & Gu Q. (2018). 
Stochastic gradient descent optimizes over-parameterized deep relu networks. CoRR abs\/1811.08888."}],"container-title":["AI Matters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3362077.3362085","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3362077.3362085","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:44:54Z","timestamp":1750203894000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3362077.3362085"}},"subtitle":["challenges and opportunities"],"short-title":[],"issued":{"date-parts":[[2019,12,6]]},"references-count":103,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2019,12,6]]}},"alternative-id":["10.1145\/3362077.3362085"],"URL":"https:\/\/doi.org\/10.1145\/3362077.3362085","relation":{},"ISSN":["2372-3483"],"issn-type":[{"type":"electronic","value":"2372-3483"}],"subject":[],"published":{"date-parts":[[2019,12,6]]},"assertion":[{"value":"2019-12-06","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}