{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T02:10:51Z","timestamp":1773281451748,"version":"3.50.1"},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T00:00:00Z","timestamp":1761868800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T00:00:00Z","timestamp":1761868800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100007105","name":"Universit\u00e4t Mannheim","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100007105","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Comput Optim Appl"],"published-print":{"date-parts":[[2026,4]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Inverse problems are key issues in several scientific areas, including signal processing and medical imaging. Data-driven approaches for inverse problems aim for learning model and regularization parameters from observed data samples, and investigate their generalization properties when confronted with unseen data. This approach dictates a statistical approach to inverse problems, calling for stochastic optimization methods. In order to learn model and regularisation parameters simultaneously, we develop in this paper a stochastic bilevel optimization approach in which the lower level problem represents a variational reconstruction method formulated as a convex non-smooth optimization problem, depending on the observed sample. The upper level problem represents the learning task of the regularisation parameters. 
Combining the lower level and the upper level problem leads to a stochastic non-smooth and non-convex optimization problem, for which standard gradient-based methods are not straightforward to implement. Instead, we develop a unified and flexible methodology, building on a derivative-free approach, which allows us to solve the bilevel optimization problem only with samples of the objective function values. We perform a complete complexity analysis of this scheme. Numerical results on signal denoising and experimental design demonstrate the computational efficiency and the generalization properties of our method.<\/jats:p>","DOI":"10.1007\/s10589-025-00745-1","type":"journal-article","created":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T15:39:27Z","timestamp":1761925167000},"page":"967-1020","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Derivative-free stochastic bilevel optimization for inverse problems"],"prefix":"10.1007","volume":"93","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2481-0019","authenticated-orcid":false,"given":"Mathias","family":"Staudigl","sequence":"first","affiliation":[]},{"given":"Simon","family":"Weissmann","sequence":"additional","affiliation":[]},{"given":"Tristan","family":"van Leeuwen","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,10,31]]},"reference":[{"key":"745_CR1","unstructured":"Agarwal, A., Dekel, O., Xiao, L.: Optimal algorithms for online convex optimization with multi-point bandit feedback. In 23rd Conference on Learning Theory, pages 28\u201340, (2010)"},{"key":"745_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1017\/S0962492919000059","volume":"28","author":"S Arridge","year":"2019","unstructured":"Arridge, S., Maass, P., \u00d6ktem, O., Sch\u00f6nlieb, C.-B.: Solving inverse problems using data-driven models. Acta Numer. 
28, 1\u2013174 (2019)","journal-title":"Acta Numer."},{"issue":"1","key":"745_CR3","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1007\/s10208-021-09499-8","volume":"22","author":"K Balasubramanian","year":"2022","unstructured":"Balasubramanian, K., Ghadimi, S.: Zeroth-order nonconvex stochastic optimization: Handling constraints, high dimensionality, and saddle points. Found. Comput. Math. 22(1), 35\u201376 (2022)","journal-title":"Found. Comput. Math."},{"key":"745_CR4","doi-asserted-by":"crossref","unstructured":"Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer - CMS Books in Mathematics, (2016)","DOI":"10.1007\/978-3-319-48311-5"},{"key":"745_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1017\/S0962492918000016","volume":"27","author":"M Benning","year":"2018","unstructured":"Benning, M., Burger, M.: Modern regularization methods for inverse problems. Acta Numer. 27, 1\u2013111 (2018)","journal-title":"Acta Numer."},{"issue":"2","key":"745_CR6","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1007\/s10208-021-09513-z","volume":"22","author":"AS Berahas","year":"2022","unstructured":"Berahas, A.S., Cao, L., Choromanski, K., Scheinberg, K.: A theoretical and empirical comparison of gradient approximations in derivative-free optimization. Found. Comput. Math. 22(2), 507\u2013560 (2022)","journal-title":"Found. Comput. Math."},{"issue":"2","key":"745_CR7","doi-asserted-by":"publisher","first-page":"223","DOI":"10.1137\/16M1080173","volume":"60","author":"L Bottou","year":"2018","unstructured":"Bottou, L., Curtis, F., Nocedal, J.: Optimization methods for large-scale machine learning. SIAM Rev. 
60(2), 223\u2013311 (2018)","journal-title":"SIAM Rev."},{"key":"745_CR8","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511546921","volume-title":"Prediction, learning, and games","author":"N Cesa-Bianchi","year":"2006","unstructured":"Cesa-Bianchi, N., Lugosi, G.: Prediction, learning, and games. Cambridge University Press, Cambridge (2006)"},{"key":"745_CR9","doi-asserted-by":"crossref","unstructured":"Clarke, F.H.: Optimization and nonsmooth analysis. SIAM, (1990)","DOI":"10.1137\/1.9781611971309"},{"key":"745_CR10","doi-asserted-by":"crossref","unstructured":"Conn, A.R., Scheinberg, K., Vicente, L.N.: Introduction to derivative-free optimization. SIAM, (2009)","DOI":"10.1137\/1.9780898718768"},{"key":"745_CR11","unstructured":"Cui, S., Shanbhag, U.V., Staudigl, M.: A regularized variance-reduced modified extragradient method for stochastic hierarchical games. arXiv preprint arXiv:2302.06497, (2023)"},{"key":"745_CR12","doi-asserted-by":"crossref","unstructured":"Cui, S., Shanbhag, U.V., Yousefian, F.: Complexity guarantees for an implicit smoothing-enabled method for stochastic mpecs. Mathematical Programming, (2022)","DOI":"10.1007\/s10107-022-01893-6"},{"issue":"1","key":"745_CR13","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1137\/18M1178244","volume":"29","author":"D Davis","year":"2019","unstructured":"Davis, D., Drusvyatskiy, D.: Stochastic model-based minimization of weakly convex functions. SIAM J. Optim. 29(1), 207\u2013239 (2019)","journal-title":"SIAM J. Optim."},{"issue":"3","key":"745_CR14","doi-asserted-by":"publisher","first-page":"1908","DOI":"10.1137\/17M1151031","volume":"29","author":"D Davis","year":"2019","unstructured":"Davis, D., Grimmer, B.: Proximally guided stochastic subgradient method for nonsmooth, nonconvex problems. SIAM J. Optim. 29(3), 1908\u20131930 (2019)","journal-title":"SIAM J. 
Optim."},{"key":"745_CR15","doi-asserted-by":"crossref","unstructured":"Dontchev, A.L., Rockafellar, R.T.: Implicit functions and solution mappings: A view from variational analysis, volume\u00a011. Springer, (2009)","DOI":"10.1007\/978-0-387-87821-8"},{"key":"745_CR16","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1007\/s10107-018-1311-3","volume":"178","author":"D Drusvyatskiy","year":"2019","unstructured":"Drusvyatskiy, D., Paquette, C.: Efficiency of minimizing compositions of convex functions and smooth maps. Math. Program. 178, 503\u2013558 (2019)","journal-title":"Math. Program."},{"issue":"5","key":"745_CR17","doi-asserted-by":"publisher","first-page":"2788","DOI":"10.1109\/TIT.2015.2409256","volume":"61","author":"JC Duchi","year":"2015","unstructured":"Duchi, J.C., Jordan, M.I., Wainwright, M.J., Wibisono, A.: Optimal rates for zero-order convex optimization: The power of two function evaluations. IEEE Trans. Inf. Theory 61(5), 2788\u20132806 (2015)","journal-title":"IEEE Trans. Inf. Theory"},{"key":"745_CR18","unstructured":"Duvocelle, B., Mertikopoulos, P., Staudigl, M., Vermeulen, D.: Multiagent online learning in time-varying games. Mathematics of Operations Research (2022)"},{"issue":"5","key":"745_CR19","doi-asserted-by":"publisher","first-page":"580","DOI":"10.1007\/s10851-021-01020-8","volume":"63","author":"MJ Ehrhardt","year":"2021","unstructured":"Ehrhardt, M.J., Roberts, L.: Inexact derivative-free optimization for bilevel learning. J. Math. Imaging Vis. 63(5), 580\u2013600 (2021)","journal-title":"J. Math. Imaging Vis."},{"issue":"1","key":"745_CR20","doi-asserted-by":"publisher","first-page":"254","DOI":"10.1093\/imamat\/hxad035","volume":"89","author":"MJ Ehrhardt","year":"2024","unstructured":"Ehrhardt, M.J., Roberts, L.: Analyzing inexact hypergradients for bilevel learning. IMA J. Appl. Math. 89(1), 254\u2013278 (2024)","journal-title":"IMA J. Appl. 
Math."},{"key":"745_CR21","volume-title":"Regularization of inverse problems. Mathematics and Its Applications","author":"HW Engl","year":"1996","unstructured":"Engl, H.W., Hanke, M., Neubauer, G.: Regularization of inverse problems. Mathematics and Its Applications. Springer, Netherlands (1996)"},{"key":"745_CR22","unstructured":"Franceschi, L., Frasconi, P., Salzo, S., Grazzi, R., Pontil, M.: Bilevel programming for hyperparameter optimization and meta-learning. In International conference on machine learning, pages 1568\u20131577. PMLR, (2018)"},{"key":"745_CR23","doi-asserted-by":"publisher","DOI":"10.1017\/9781108348973","volume-title":"Bayesian optimization","author":"R Garnett","year":"2023","unstructured":"Garnett, R.: Bayesian optimization. Cambridge University Press, Cambridge (2023)"},{"issue":"4","key":"745_CR24","doi-asserted-by":"publisher","first-page":"2341","DOI":"10.1137\/120880811","volume":"23","author":"Saeed Ghadimi","year":"2013","unstructured":"Ghadimi, Saeed, Lan, Guanghui: Stochastic first- and zeroth-order methods for nonconvex stochastic programming. SIAM J. Optim. 23(4), 2341\u20132368 (2013)","journal-title":"SIAM J. Optim."},{"issue":"1\u20132","key":"745_CR25","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1007\/s10107-014-0846-1","volume":"155","author":"S Ghadimi","year":"2016","unstructured":"Ghadimi, S., Lan, G., Zhang, H.: Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization. Math. Program. 155(1\u20132), 267\u2013305 (2016)","journal-title":"Math. Program."},{"key":"745_CR26","unstructured":"Ghadimi, S., Wang, M.: Approximation methods for bilevel programming. arXiv preprint arXiv:1802.02246, (2018)"},{"key":"745_CR27","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1017\/S096249291500001X","volume":"24","author":"MB Giles","year":"2015","unstructured":"Giles, M.B.: Multilevel Monte Carlo methods. Acta Numer. 
24, 259\u2013328 (2015)","journal-title":"Acta Numer."},{"key":"745_CR28","doi-asserted-by":"publisher","first-page":"14","DOI":"10.1007\/BF01584320","volume":"13","author":"AA Goldstein","year":"1977","unstructured":"Goldstein, A.A.: Optimization of Lipschitz continuous functions. Math. Program. 13, 14\u201322 (1977)","journal-title":"Math. Program."},{"key":"745_CR29","unstructured":"Grazzi, R., Franceschi, L., Pontil, M., Salzo, S.: On the iteration complexity of hypergradient computation. International Conference on Machine Learning, pages 3748\u20133758, (2020)"},{"issue":"3","key":"745_CR30","doi-asserted-by":"publisher","first-page":"611","DOI":"10.1088\/0266-5611\/19\/3\/309","volume":"19","author":"E Haber","year":"2003","unstructured":"Haber, E., Tenorio, L.: Learning regularization functionals\u2013a supervised training approach. Inverse Prob. 19(3), 611 (2003)","journal-title":"Inverse Prob."},{"key":"745_CR31","doi-asserted-by":"crossref","unstructured":"Hansen, P.C., J\u00f8rgensen, J., Lionheart, W.R.B.: Computed tomography: algorithms, insight, and just enough theory. Society for Industrial and Applied Mathematics, Philadelphia, PA, (2021)","DOI":"10.1137\/1.9781611976670"},{"issue":"11","key":"745_CR32","doi-asserted-by":"publisher","DOI":"10.1088\/1361-6420\/aade77","volume":"34","author":"G Holler","year":"2018","unstructured":"Holler, G., Kunisch, K., Barnard, R.C.: A bilevel approach for parameter learning in inverse problems. Inverse Prob. 34(11), 115012 (2018)","journal-title":"Inverse Prob."},{"issue":"1","key":"745_CR33","doi-asserted-by":"publisher","first-page":"147","DOI":"10.1137\/20M1387341","volume":"33","author":"M Hong","year":"2023","unstructured":"Hong, M., Wai, H.-T., Wang, Z., Yang, Z.: A two-timescale stochastic algorithm framework for bilevel optimization: complexity analysis and application to actor-critic. SIAM J. Optim. 33(1), 147\u2013180 (2023)","journal-title":"SIAM J. 
Optim."},{"key":"745_CR34","doi-asserted-by":"crossref","unstructured":"Kozak, D., Molinari, C., Rosasco, L., Tenorio, L., Villa, S.: Zeroth-order optimization with orthogonal random directions. Mathematical Programming, (2022)","DOI":"10.1007\/s10107-022-01866-9"},{"issue":"2","key":"745_CR35","doi-asserted-by":"publisher","first-page":"938","DOI":"10.1137\/120882706","volume":"6","author":"K Kunisch","year":"2013","unstructured":"Kunisch, K., Pock, T.: A bilevel optimization approach for parameter learning in variational models. SIAM. J. Imaging. Sci. 6(2), 938\u2013983 (2013)","journal-title":"SIAM. J. Imaging. Sci."},{"key":"745_CR36","unstructured":"Kwon, J., Kwon, D., Wright, S., Nowak, R.D.: A fully first-order method for stochastic bilevel optimization. PMLR, (2023)"},{"key":"745_CR37","doi-asserted-by":"crossref","unstructured":"Lan, G.: First-order and Stochastic Optimization Methods for Machine Learning. Springer Series in the Data Sciences, Springer Nature (2020)","DOI":"10.1007\/978-3-030-39568-1"},{"key":"745_CR38","doi-asserted-by":"crossref","unstructured":"Lei, M., Pong, T.K., Sun, S., Yue, M.-C.: Subdifferentially polynomially bounded functions and gaussian smoothing-based zeroth-order optimization. arXiv preprint arXiv:2405.04150, (2024)","DOI":"10.1137\/24M1659911"},{"issue":"12","key":"745_CR39","doi-asserted-by":"publisher","first-page":"10045","DOI":"10.1109\/TPAMI.2021.3132674","volume":"44","author":"R Liu","year":"2022","unstructured":"Liu, R., Gao, J., Zhang, J., Meng, D., Lin, Z.: Investigating bi-level optimization for learning and vision from a unified perspective: a survey and beyond. IEEE Trans. Pattern Anal. Mach. Intell. 44(12), 10045\u201310067 (2022)","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"issue":"2","key":"745_CR40","doi-asserted-by":"publisher","first-page":"1937","DOI":"10.1137\/23M1566753","volume":"34","author":"Z Lu","year":"2024","unstructured":"Lu, Z., Mei, S.: First-order penalty methods for bilevel optimization. SIAM J. Optim. 34(2), 1937\u20131969 (2024)","journal-title":"SIAM J. Optim."},{"key":"745_CR41","doi-asserted-by":"crossref","unstructured":"Nesterov, Y.: Lectures on Convex Optimization, volume 137 of Springer optimization and its applications. Springer International Publishing, (2018)","DOI":"10.1007\/978-3-319-91578-4_2"},{"issue":"2","key":"745_CR42","doi-asserted-by":"publisher","first-page":"527","DOI":"10.1007\/s10208-015-9296-2","volume":"17","author":"Y Nesterov","year":"2017","unstructured":"Nesterov, Y., Spokoiny, V.: Random gradient-free minimization of convex functions. Found. Comput. Math. 17(2), 527\u2013566 (2017)","journal-title":"Found. Comput. Math."},{"key":"745_CR43","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1007\/s10851-016-0663-7","volume":"56","author":"P Ochs","year":"2016","unstructured":"Ochs, P., Ranftl, R., Brox, T., Pock, T.: Techniques for gradient-based bilevel optimization with non-smooth lower level problems. J. Math. Imaging Vision 56, 175\u2013194 (2016)","journal-title":"J. Math. Imaging Vision"},{"issue":"5","key":"745_CR44","doi-asserted-by":"publisher","first-page":"A2679","DOI":"10.1137\/22M1494270","volume":"45","author":"S Pougkakiotis","year":"2023","unstructured":"Pougkakiotis, S., Kalogerias, D.: A zeroth-order proximal stochastic gradient method for weakly convex stochastic optimization. SIAM J. Sci. Comput. 45(5), A2679\u2013A2702 (2023)","journal-title":"SIAM J. Sci. Comput."},{"key":"745_CR45","unstructured":"Rajeswaran, A., Finn, C., Kakade, S.M., Levine, S.: Meta-learning with implicit gradients. Adv. Neural Inf. Process. Syst. 
32, (2019)"},{"key":"745_CR46","doi-asserted-by":"crossref","unstructured":"Shapiro, A., Dentcheva, D., Ruszczy\u0144ski, A.: Lectures on stochastic programming. Society for Industrial and Applied Mathematics (2009)","DOI":"10.1137\/1.9780898718751"},{"issue":"6","key":"745_CR47","doi-asserted-by":"publisher","first-page":"1383","DOI":"10.1007\/s11590-023-02057-x","volume":"18","author":"A Sinha","year":"2024","unstructured":"Sinha, A., Khandait, T., Mohanty, R.: A gradient-based bilevel optimization approach for tuning regularization hyperparameters. Optim. Lett. 18(6), 1383\u20131404 (2024)","journal-title":"Optim. Lett."},{"key":"745_CR48","unstructured":"Spall, J.C.: Introduction to stochastic search and optimization: estimation, simulation, and control. John Wiley & Sons, (2005)"},{"key":"745_CR49","doi-asserted-by":"publisher","first-page":"953","DOI":"10.1109\/TCI.2024.3414273","volume":"10","author":"T Wang","year":"2024","unstructured":"Wang, T., Lucka, F., van Leeuwen, T.: Sequential experimental design for x-ray CT using deep reinforcement learning. IEEE Trans. Comput. Imaging 10, 953\u2013968 (2024)","journal-title":"IEEE Trans. Comput. Imaging"},{"key":"745_CR50","unstructured":"Zhang, J., Lin, H., Jegelka, S., Sra, S., Jadbabaie, A.: Complexity of finding stationary points of nonconvex nonsmooth functions. 
International conference on machine learning, pages 11173\u201311182, (2020)"}],"container-title":["Computational Optimization and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10589-025-00745-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10589-025-00745-1","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10589-025-00745-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T11:18:07Z","timestamp":1773227887000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10589-025-00745-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,31]]},"references-count":50,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,4]]}},"alternative-id":["745"],"URL":"https:\/\/doi.org\/10.1007\/s10589-025-00745-1","relation":{},"ISSN":["0926-6003","1573-2894"],"issn-type":[{"value":"0926-6003","type":"print"},{"value":"1573-2894","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,31]]},"assertion":[{"value":"27 November 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 October 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 October 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}