{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,18]],"date-time":"2026-02-18T23:05:36Z","timestamp":1771455936863,"version":"3.50.1"},"reference-count":52,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T00:00:00Z","timestamp":1721174400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T00:00:00Z","timestamp":1721174400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Found Comput Math"],"published-print":{"date-parts":[[2025,8]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>The Stein Variational Gradient Descent (SVGD) algorithm is a deterministic particle method for sampling. However, a mean-field analysis reveals that the gradient flow corresponding to the SVGD algorithm (i.e., the Stein Variational Gradient Flow) only provides a constant-order approximation to the Wasserstein gradient flow corresponding to the KL-divergence minimization. In this work, we propose the Regularized Stein Variational Gradient Flow, which interpolates between the Stein Variational Gradient Flow and the Wasserstein gradient flow. We establish various theoretical properties of the Regularized Stein Variational Gradient Flow (and its time-discretization) including convergence to equilibrium, existence and uniqueness of weak solutions, and stability of the solutions. 
We provide preliminary numerical evidence of the improved performance offered by the regularization.\n<\/jats:p>","DOI":"10.1007\/s10208-024-09663-w","type":"journal-article","created":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T17:01:36Z","timestamp":1721235696000},"page":"1199-1257","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Regularized Stein Variational Gradient Flow"],"prefix":"10.1007","volume":"25","author":[{"given":"Ye","family":"He","sequence":"first","affiliation":[]},{"given":"Krishnakumar","family":"Balasubramanian","sequence":"additional","affiliation":[]},{"given":"Bharath K.","family":"Sriperumbudur","sequence":"additional","affiliation":[]},{"given":"Jianfeng","family":"Lu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,7,17]]},"reference":[{"key":"9663_CR1","unstructured":"Luigi Ambrosio, Nicola Gigli, and Giuseppe Savare. Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Springer Science & Business Media, 2005."},{"key":"9663_CR2","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1016\/j.jat.2013.10.002","volume":"177","author":"Douglas Azevedo","year":"2014","unstructured":"Douglas Azevedo and Valdir\u00a0Antonio Menegatto. Sharp estimates for eigenvalues of integral operators generated by dot product kernels on the sphere. Journal of Approximation Theory, 177:57\u201368, 2014.","journal-title":"Journal of Approximation Theory"},{"key":"9663_CR3","doi-asserted-by":"crossref","unstructured":"Dominique Bakry, Ivan Gentil, and Michel Ledoux. Analysis and Geometry of Markov Diffusion Operators, volume 103. Springer, 2014.","DOI":"10.1007\/978-3-319-00227-9"},{"key":"9663_CR4","unstructured":"Krishnakumar Balasubramanian, Sinho Chewi, Murat\u00a0A Erdogdu, Adil Salim, and Shunshi Zhang. 
Towards a theory of non-log-concave sampling: First-order stationarity guarantees for Langevin Monte Carlo. In Conference on Learning Theory, pages 2896\u20132923. PMLR, 2022."},{"key":"9663_CR5","unstructured":"Krishnakumar Balasubramanian, Tong Li, and Ming Yuan. On the optimality of kernel-embedding based goodness-of-fit tests. Journal of Machine Learning Research, 22(1), 2021."},{"key":"9663_CR6","unstructured":"Alain Berlinet and Christine Thomas-Agnan. Reproducing Kernel Hilbert Spaces in Probability and Statistics. Springer Science & Business Media, 2011."},{"issue":"2","key":"9663_CR7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s00526-019-1486-3","volume":"58","author":"Jos\u00e9 Antonio Carrillo","year":"2019","unstructured":"Jos\u00e9\u00a0Antonio Carrillo, Katy Craig, and Francesco\u00a0S Patacchini. A blob method for diffusion. Calculus of Variations and Partial Differential Equations, 58(2):1\u201353, 2019.","journal-title":"Calculus of Variations and Partial Differential Equations"},{"key":"9663_CR8","unstructured":"Lin Chen and Sheng Xu. Deep neural tangent kernel and Laplace kernel have the same RKHS. In International Conference on Learning Representations, 2020."},{"key":"9663_CR9","first-page":"2984","volume":"178","author":"Yongxin Chen","year":"2022","unstructured":"Yongxin Chen, Sinho Chewi, Adil Salim, and Andre Wibisono. Improved analysis for a proximal algorithm for sampling. In Po-Ling Loh and Maxim Raginsky, editors, Proceedings of Thirty Fifth Conference on Learning Theory, volume 178, pages 2984\u20133014, 2022.","journal-title":"In Po-Ling Loh and Maxim Raginsky, editors, Proceedings of Thirty Fifth Conference on Learning Theory"},{"key":"9663_CR10","doi-asserted-by":"crossref","unstructured":"Alina Chertock. A practical guide to deterministic particle methods. In Handbook of Numerical Analysis, volume\u00a018, pages 177\u2013202. 
Elsevier, 2017.","DOI":"10.1016\/bs.hna.2016.11.004"},{"key":"9663_CR11","unstructured":"Sinho Chewi, Murat\u00a0A Erdogdu, Mufan Li, Ruoqi Shen, and Shunshi Zhang. Analysis of Langevin Monte Carlo from Poincar\u00e9 to Log-Sobolev. In Conference on Learning Theory, pages 1\u20132. PMLR, 2022."},{"key":"9663_CR12","first-page":"2098","volume":"33","author":"Sinho Chewi","year":"2020","unstructured":"Sinho Chewi, Thibaut Le\u00a0Gouic, Chen Lu, Tyler Maunu, and Philippe Rigollet. SVGD as a kernelized Wasserstein gradient flow of the chi-squared divergence. Advances in Neural Information Processing Systems, 33:2098\u20132109, 2020.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"9663_CR13","unstructured":"Kacper Chwialkowski, Heiko Strathmann, and Arthur Gretton. A kernel test of goodness of fit. In International Conference on Machine Learning, pages 2606\u20132615. PMLR, 2016."},{"issue":"300","key":"9663_CR14","doi-asserted-by":"publisher","first-page":"1681","DOI":"10.1090\/mcom3033","volume":"85","author":"Katy Craig","year":"2016","unstructured":"Katy Craig and Andrea Bertozzi. A blob method for the aggregation equation. Mathematics of Computation, 85(300):1681\u20131717, 2016.","journal-title":"Mathematics of Computation"},{"key":"9663_CR15","doi-asserted-by":"crossref","unstructured":"Felipe Cucker and Ding-Xuan Zhou. Learning Theory: An Approximation Theory Viewpoint, volume\u00a024. Cambridge University Press, 2007.","DOI":"10.1017\/CBO9780511618796"},{"issue":"2","key":"9663_CR16","doi-asserted-by":"publisher","first-page":"293","DOI":"10.1137\/0911018","volume":"11","author":"Pierre Degond","year":"1990","unstructured":"Pierre Degond and Francisco-Jos\u00e9 Mustieles. A deterministic approximation of diffusion equations using particles. 
SIAM Journal on Scientific and Statistical Computing, 11(2):293\u2013310, 1990.","journal-title":"SIAM Journal on Scientific and Statistical Computing"},{"key":"9663_CR17","unstructured":"Mateo D\u00edaz, Ethan\u00a0N Epperly, Zachary Frangella, Joel\u00a0A Tropp, and Robert\u00a0J Webber. Robust, randomized preconditioning for kernel ridge regression. arXiv preprint arXiv:2304.12465, 2023."},{"issue":"2","key":"9663_CR18","doi-asserted-by":"publisher","first-page":"115","DOI":"10.1007\/BF01077243","volume":"13","author":"Roland Dobrushin","year":"1979","unstructured":"Roland Dobrushin. Vlasov equations. Functional Analysis and Its Applications, 13(2):115\u2013123, 1979.","journal-title":"Functional Analysis and Its Applications"},{"key":"9663_CR19","first-page":"1","volume":"24","author":"Andrew Duncan","year":"2023","unstructured":"Andrew Duncan, Nikolas N\u00fcsken, and Lukasz Szpruch. On the geometry of Stein variational gradient descent. Journal of Machine Learning Research, 24:1\u201339, 2023.","journal-title":"Journal of Machine Learning Research"},{"issue":"3\u20134","key":"9663_CR20","doi-asserted-by":"publisher","first-page":"707","DOI":"10.1007\/s00440-014-0583-7","volume":"162","author":"Nicolas Fournier","year":"2015","unstructured":"Nicolas Fournier and Arnaud Guillin. On the rate of convergence in Wasserstein distance of the empirical measure. Probability theory and related fields, 162(3-4):707\u2013738, 2015.","journal-title":"Probability theory and related fields"},{"issue":"5","key":"9663_CR21","doi-asserted-by":"publisher","first-page":"2884","DOI":"10.1214\/19-AAP1467","volume":"29","author":"Jackson Gorham","year":"2019","unstructured":"Jackson Gorham, Andrew\u00a0B Duncan, Sebastian\u00a0J Vollmer, and Lester Mackey. Measuring sample quality with diffusions. The Annals of Applied Probability, 29(5):2884\u20132928, 2019.","journal-title":"The Annals of Applied Probability"},{"key":"9663_CR22","unstructured":"Jackson Gorham and Lester Mackey. 
Measuring sample quality with kernels. In International Conference on Machine Learning, pages 1292\u20131301. PMLR, 2017."},{"issue":"1","key":"9663_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1137\/S0036141096303359","volume":"29","author":"Richard Jordan","year":"1998","unstructured":"Richard Jordan, David Kinderlehrer, and Felix Otto. The variational formulation of the Fokker\u2013Planck equation. SIAM Journal on Mathematical Analysis, 29(1):1\u201317, 1998.","journal-title":"SIAM Journal on Mathematical Analysis"},{"key":"9663_CR24","unstructured":"Anna Korba, Adil Salim, Michael Arbel, Giulia Luise, and Arthur Gretton. A non-asymptotic analysis for Stein Variational Gradient Descent. Advances in Neural Information Processing Systems, 33, 2020."},{"key":"9663_CR25","unstructured":"Qiang Liu. Stein Variational Gradient Descent as gradient flow. Advances in Neural Information Processing Systems, 30, 2017."},{"key":"9663_CR26","unstructured":"Qiang Liu, Jason Lee, and Michael Jordan. A kernelized Stein discrepancy for goodness-of-fit tests. In International Conference on Machine Learning, pages 276\u2013284. PMLR, 2016."},{"key":"9663_CR27","unstructured":"Qiang Liu and Dilin Wang. Stein Variational Gradient Descent: A general purpose Bayesian inference algorithm. Advances in Neural Information Processing Systems, 29, 2016."},{"key":"9663_CR28","unstructured":"Tianle Liu, Promit Ghosal, Krishnakumar Balasubramanian, and Natesh Pillai. Towards understanding the dynamics of Gaussian-Stein variational gradient descent. Advances in Neural Information Processing Systems, 36, 2024."},{"key":"9663_CR29","unstructured":"Yang Liu, Prajit Ramachandran, Qiang Liu, and Jian Peng. Stein variational policy gradient. 
In 33rd Conference on Uncertainty in Artificial Intelligence, UAI 2017, 2017."},{"issue":"2","key":"9663_CR30","doi-asserted-by":"publisher","first-page":"648","DOI":"10.1137\/18M1187611","volume":"51","author":"Lu Jianfeng","year":"2019","unstructured":"Jianfeng Lu, Yulong Lu, and James Nolen. Scaling limit of the Stein Variational Gradient Descent: The mean field regime. SIAM Journal on Mathematical Analysis, 51(2):648\u2013671, 2019.","journal-title":"SIAM Journal on Mathematical Analysis"},{"key":"9663_CR31","doi-asserted-by":"crossref","unstructured":"Ha\u00a0Quang Minh, Partha Niyogi, and Yuan Yao. Mercer\u2019s theorem, feature maps, and smoothing. In International Conference on Computational Learning Theory, pages 154\u2013168. Springer, 2006.","DOI":"10.1007\/11776420_14"},{"key":"9663_CR32","doi-asserted-by":"crossref","unstructured":"Adrian Muntean, Jens Rademacher, and Antonios Zagaris. Macroscopic and Large Scale Phenomena: Coarse Graining, Mean Field Limits and Ergodicity. Springer, 2016.","DOI":"10.1007\/978-3-319-26883-5"},{"key":"9663_CR33","doi-asserted-by":"crossref","unstructured":"Vern Paulsen and Mrinal Raghupathi. An Introduction to the Theory of Reproducing Kernel Hilbert Spaces, volume 152. Cambridge University Press, 2016.","DOI":"10.1017\/CBO9781316219232"},{"key":"9663_CR34","doi-asserted-by":"crossref","unstructured":"Pierre-Arnaud Raviart. An analysis of particle methods. In Numerical Methods in Fluid Dynamics, pages 243\u2013324. Springer, 1985.","DOI":"10.1007\/BFb0074532"},{"key":"9663_CR35","unstructured":"Alessandro Rudi and Lorenzo Rosasco. Generalization properties of learning with random features. Advances in neural information processing systems, 30, 2017."},{"issue":"6","key":"9663_CR36","doi-asserted-by":"publisher","first-page":"697","DOI":"10.1002\/cpa.3160430602","volume":"43","author":"Giovanni Russo","year":"1990","unstructured":"Giovanni Russo. Deterministic diffusion of particles. 
Communications on Pure and Applied Mathematics, 43(6):697\u2013733, 1990.","journal-title":"Communications on Pure and Applied Mathematics"},{"key":"9663_CR37","unstructured":"Adil Salim, Lukang Sun, and Peter Richtarik. A convergence theory for SVGD in the population limit under Talagrand\u2019s inequality $$T_1$$. In International Conference on Machine Learning, pages 19139\u201319152. PMLR, 2022."},{"issue":"1","key":"9663_CR38","doi-asserted-by":"publisher","first-page":"87","DOI":"10.1007\/s13373-017-0101-1","volume":"7","author":"Filippo Santambrogio","year":"2017","unstructured":"Filippo Santambrogio. $$\\{$$Euclidean, metric, and Wasserstein$$\\}$$ gradient flows: An overview. Bulletin of Mathematical Sciences, 7(1):87\u2013154, 2017.","journal-title":"Bulletin of Mathematical Sciences"},{"key":"9663_CR39","unstructured":"Meyer Scetbon and Zaid Harchaoui. A spectral analysis of dot-product kernels. In International conference on Artificial Intelligence and Statistics, pages 3394\u20133402. PMLR, 2021."},{"issue":"184","key":"9663_CR40","first-page":"1","volume":"24","author":"Carl-Johann Simon-Gabriel","year":"2023","unstructured":"Carl-Johann Simon-Gabriel, Alessandro Barp, Bernhard Sch\u00f6lkopf, and Lester Mackey. Metrizing weak convergence with maximum mean discrepancies. Journal of Machine Learning Research, 24(184):1\u201320, 2023.","journal-title":"Journal of Machine Learning Research"},{"key":"9663_CR41","first-page":"1517","volume":"11","author":"Bharath Sriperumbudur","year":"2010","unstructured":"Bharath Sriperumbudur, Arthur Gretton, Kenji Fukumizu, Bernhard Sch\u00f6lkopf, and Gert Lanckriet. Hilbert space embeddings and metrics on probability measures. Journal of Machine Learning Research, 11, 1517\u20131561, 2010.","journal-title":"Journal of Machine Learning Research"},{"key":"9663_CR42","doi-asserted-by":"crossref","unstructured":"Ingo Steinwart and Andreas Christmann. Support Vector Machines. 
Springer Science & Business Media, 2008.","DOI":"10.1007\/978-0-387-77242-4"},{"key":"9663_CR43","unstructured":"Lukang Sun and Peter Richt\u00e1rik. A note on the convergence of mirrored Stein Variational Gradient Descent under $$(L_0, L_1)$$-smoothness condition. arXiv preprint arXiv:2206.09709, 2022."},{"issue":"1","key":"9663_CR44","first-page":"29","volume":"15","author":"Nicolas Garcia Trillos","year":"2020","unstructured":"Nicolas\u00a0Garcia Trillos and Daniel Sanz-Alonso. The Bayesian update: Variational formulations and gradient flows. Bayesian Analysis, 15(1):29\u201356, 2020.","journal-title":"Bayesian Analysis"},{"key":"9663_CR45","unstructured":"Santosh Vempala and Andre Wibisono. Rapid convergence of the Unadjusted Langevin Algorithm: Isoperimetry suffices. Advances in neural information processing systems, 32, 2019."},{"key":"9663_CR46","unstructured":"C\u00e9dric Villani. Topics in Optimal Transportation, volume\u00a058. American Mathematical Soc., 2021."},{"key":"9663_CR47","unstructured":"Dilin Wang, Ziyang Tang, Chandrajit Bajaj, and Qiang Liu. Stein Variational Gradient Descent with matrix-valued kernels. Advances in Neural Information Processing Systems, 32, 2019."},{"key":"9663_CR48","unstructured":"Dilin Wang, Zhe Zeng, and Qiang Liu. Stein variational message passing for continuous graphical models. In International Conference on Machine Learning, pages 5219\u20135227. PMLR, 2018."},{"issue":"4A","key":"9663_CR49","doi-asserted-by":"publisher","first-page":"2620","DOI":"10.3150\/18-BEJ1065","volume":"25","author":"Jonathan Weed","year":"2019","unstructured":"Jonathan Weed and Francis Bach. Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance. Bernoulli, 25(4A):2620\u20132648, 2019.","journal-title":"Bernoulli"},{"key":"9663_CR50","unstructured":"Andre Wibisono. Sampling as optimization in the space of measures: The Langevin dynamics as a composite optimization problem. 
In Conference on Learning Theory, pages 2093\u20133027. PMLR, 2018."},{"key":"9663_CR51","unstructured":"Lantian Xu, Anna Korba, and Dejan Slepc\u0306ev. Accurate quantization of measures via interacting particle-based optimization. In International Conference on Machine Learning, pages 24576\u201324595. PMLR, 2022."},{"key":"9663_CR52","doi-asserted-by":"crossref","unstructured":"Yun Yang, Mert Pilanci, and Martin\u00a0J Wainwright. Randomized sketches for kernels: Fast and optimal nonparametric regression. Annals of Statistics, pages 991\u20131023, 2017.","DOI":"10.1214\/16-AOS1472"}],"container-title":["Foundations of Computational Mathematics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10208-024-09663-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10208-024-09663-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10208-024-09663-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T20:54:59Z","timestamp":1757105699000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10208-024-09663-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,17]]},"references-count":52,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["9663"],"URL":"https:\/\/doi.org\/10.1007\/s10208-024-09663-w","relation":{},"ISSN":["1615-3375","1615-3383"],"issn-type":[{"value":"1615-3375","type":"print"},{"value":"1615-3383","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,17]]},"assertion":[{"value":"15 November 
2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 March 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 May 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 July 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}