{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,13]],"date-time":"2026-03-13T08:30:36Z","timestamp":1773390636547,"version":"3.50.1"},"reference-count":58,"publisher":"Springer Science and Business Media LLC","issue":"6","license":[{"start":{"date-parts":[[2024,9,21]],"date-time":"2024-09-21T00:00:00Z","timestamp":1726876800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,9,21]],"date-time":"2024-09-21T00:00:00Z","timestamp":1726876800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP200101414"],"award-info":[{"award-number":["DP200101414"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DE230100029"],"award-info":[{"award-number":["DE230100029"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Stat Comput"],"published-print":{"date-parts":[[2024,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Stochastic models with global parameters and latent variables are common, and for which variational inference (VI) is popular. However, existing methods are often either slow or inaccurate in high dimensions. We suggest a fast and accurate VI method for this case that employs a well-defined natural gradient variational optimization that targets the joint posterior of the global parameters and latent variables. It is a hybrid method, where at each step the global parameters are updated using the natural gradient and the latent variables are generated from their conditional posterior. A fast to compute expression for the Tikhonov damped Fisher information matrix is used, along with the re-parameterization trick, to provide a stable natural gradient. We apply the approach to deep mixed models, which are an emerging class of Bayesian neural networks with random output layer coefficients to allow for heterogeneity. A range of simulations show that using the natural gradient is substantially more efficient than using the ordinary gradient, and that the approach is faster and more accurate than two cutting-edge natural gradient VI methods. In a financial application we show that accounting for industry level heterogeneity using the deep mixed model improves the accuracy of asset pricing models. MATLAB code to implement the method and replicate the results can be found at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/WeibenZhang07\/NG-HVI\">https:\/\/github.com\/WeibenZhang07\/NG-HVI<\/jats:ext-link><\/jats:p>","DOI":"10.1007\/s11222-024-10488-4","type":"journal-article","created":{"date-parts":[[2024,9,21]],"date-time":"2024-09-21T09:01:41Z","timestamp":1726909301000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Natural gradient hybrid variational inference with application to deep mixed models"],"prefix":"10.1007","volume":"34","author":[{"given":"Weiben","family":"Zhang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Michael","family":"Smith","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Worapree","family":"Maneesoonthorn","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Rub\u00e9n","family":"Loaiza-Maya","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,9,21]]},"reference":[{"issue":"2","key":"10488_CR1","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1162\/089976698300017746","volume":"10","author":"S-I Amari","year":"1998","unstructured":"Amari, S.-I.: Natural gradient works efficiently in learning. Neural Comput. 10(2), 251\u2013276 (1998)","journal-title":"Neural Comput."},{"key":"10488_CR2","first-page":"993","volume":"3","author":"DM Blei","year":"2003","unstructured":"Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993\u20131022 (2003)","journal-title":"J. Mach. Learn. Res."},{"issue":"518","key":"10488_CR3","doi-asserted-by":"publisher","first-page":"859","DOI":"10.1080\/01621459.2017.1285773","volume":"112","author":"DM Blei","year":"2017","unstructured":"Blei, D.M., Kucukelbir, A., McAuliffe, J.D.: Variational inference: a review for statisticians. J. Am. Stat. Assoc. 112(518), 859\u2013877 (2017)","journal-title":"J. Am. Stat. Assoc."},{"key":"10488_CR4","doi-asserted-by":"crossref","unstructured":"Bottou, L.: Large-scale machine learning with stochastic gradient descent. In: Lechevallier, Y., Saporta, G. (eds.) Proceedings of COMPSTAT\u20192010, pp. 177\u2013186. Physica-Verlag HD (2010)","DOI":"10.1007\/978-3-7908-2604-3_16"},{"issue":"3","key":"10488_CR5","doi-asserted-by":"publisher","first-page":"445","DOI":"10.1177\/0022243720910104","volume":"57","author":"PJ Danaher","year":"2020","unstructured":"Danaher, P.J., Danaher, T.S., Smith, M.S., Loaiza-Maya, R.: Advertising effectiveness for multiple retailer-brands in a multimedia and multichannel environment. J. Mark. Res. 57(3), 445\u2013467 (2020)","journal-title":"J. Mark. Res."},{"issue":"3\/4","key":"10488_CR6","doi-asserted-by":"publisher","first-page":"345","DOI":"10.2307\/2332581","volume":"38","author":"WL Deemer","year":"1951","unstructured":"Deemer, W.L., Olkin, I.: The jacobians of certain matrix transformations useful in multivariate analysis: Based on lectures of PL Hsu at the University of North Carolina, 1947. Biometrika 38(3\/4), 345\u2013367 (1951)","journal-title":"Biometrika"},{"key":"10488_CR7","doi-asserted-by":"crossref","unstructured":"Diallo, B., Bagudu, A., Zhang, Q.: A machine learning approach to the Fama-French three- and five-factor models. Available at SSRN 3440840 (2019)","DOI":"10.2139\/ssrn.3440840"},{"issue":"1","key":"10488_CR8","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/0304-405X(93)90023-5","volume":"33","author":"EF Fama","year":"1993","unstructured":"Fama, E.F., French, K.R.: Common risk factors in the returns on stocks and bonds. J. Financ. Econ. 33(1), 3\u201356 (1993)","journal-title":"J. Financ. Econ."},{"issue":"1","key":"10488_CR9","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.jfineco.2014.10.010","volume":"116","author":"EF Fama","year":"2015","unstructured":"Fama, E.F., French, K.R.: A five-factor asset pricing model. J. Financ. Econ. 116(1), 1\u201322 (2015)","journal-title":"J. Financ. Econ."},{"key":"10488_CR10","doi-asserted-by":"publisher","first-page":"109919","DOI":"10.1016\/j.econlet.2021.109919","volume":"204","author":"M Fang","year":"2021","unstructured":"Fang, M., Taylor, S.: A machine learning based asset pricing factor model comparison anomaly portfolios. Econ. Lett. 204, 109919 (2021)","journal-title":"Econ. Lett."},{"key":"10488_CR11","doi-asserted-by":"crossref","unstructured":"Feng, G., Polson, N.G., Xu, J.: Deep Learning in Characteristics-Sorted Factor Models (2022). arXiv preprint arXiv:1805.01104","DOI":"10.1017\/S0022109023000893"},{"key":"10488_CR12","volume-title":"Data Analysis Using Regression and Multilevel\/Hierarchical Models. Analytical Methods for Social Research","author":"A Gelman","year":"2006","unstructured":"Gelman, A., Hill, J.: Data Analysis Using Regression and Multilevel\/Hierarchical Models. Analytical Methods for Social Research. Cambridge University Press, Cambridge (2006)"},{"issue":"3","key":"10488_CR13","doi-asserted-by":"publisher","first-page":"249","DOI":"10.1016\/S0304-3800(02)00257-0","volume":"160","author":"M Gevrey","year":"2003","unstructured":"Gevrey, M., Dimopoulos, I., Lek, S.: Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecol. Model. 160(3), 249\u2013264 (2003)","journal-title":"Ecol. Model."},{"issue":"5","key":"10488_CR14","doi-asserted-by":"publisher","first-page":"2223","DOI":"10.1093\/rfs\/hhaa009","volume":"33","author":"S Gu","year":"2020","unstructured":"Gu, S., Kelly, B., Xiu, D.: Empirical Asset Pricing via Machine Learning. Rev. Financ. Stud. 33(5), 2223\u20132273 (2020)","journal-title":"Rev. Financ. Stud."},{"issue":"1, Part B","key":"10488_CR15","doi-asserted-by":"publisher","first-page":"429","DOI":"10.1016\/j.jeconom.2020.07.009","volume":"222","author":"S Gu","year":"2021","unstructured":"Gu, S., Kelly, B., Xiu, D.: Autoencoder asset pricing models. J. Econom. 222(1, Part B), 429\u2013450 (2021)","journal-title":"J. Econom."},{"key":"10488_CR16","unstructured":"Gunawan, D., Tran, M.-N., Kohn, R.: Fast Inference for Intractable Likelihood Problems using Variational Bayes (2017). arXiv preprint arXiv:1705.06679"},{"key":"10488_CR17","unstructured":"Han, S., Liao, X., Dunson, D., Carin, L.: Variational Gaussian copula inference. In: Gretton, A., Robert, C.C. (eds.) Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, volume\u00a051 of Proceedings of Machine Learning Research, pp. 829\u2013838. Cadiz, Spain. PMLR (2016)"},{"key":"10488_CR18","unstructured":"Hoffman, M., Blei, D.: Stochastic structured variational inference. In: Lebanon, G., Vishwanathan, S.V.N. (eds.) Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, volume\u00a038 of Proceedings of Machine Learning Research, pp. 361\u2013369. San Diego, California, USA. PMLR (2015)"},{"issue":"1","key":"10488_CR19","first-page":"1303","volume":"14","author":"MD Hoffman","year":"2013","unstructured":"Hoffman, M.D., Blei, D.M., Wang, C., Paisley, J.: Stochastic variational inference. J. Mach. Learn. Res. 14(1), 1303\u20131347 (2013)","journal-title":"J. Mach. Learn. Res."},{"key":"10488_CR20","first-page":"3235","volume":"11","author":"A Honkela","year":"2010","unstructured":"Honkela, A., Raiko, T., Kuusela, M., Tornio, M., Karhunen, J.: Approximate Riemannian conjugate gradient learning for fixed-form variational Bayes. J. Mach. Learn. Res. 11, 3235\u20133268 (2010)","journal-title":"J. Mach. Learn. Res."},{"key":"10488_CR21","unstructured":"Ji, G., Sujono, D., and Sudderth, E.B.: Marginalized stochastic natural gradients for black-box variational inference. In: Proceedings of the 38th International Conference on Machine Learning, pp. 4870\u20134881. PMLR. ISSN: 2640-3498 (2021)"},{"key":"10488_CR22","doi-asserted-by":"crossref","unstructured":"Jospin, L.V., Laga, H., Boussaid, F., Buntine, W., Bennamoun, M.: Hands-on Bayesian neural networks\u2014a tutorial for deep learning users. IEEE Comput. Intell. Mag. 17(2), 29\u201348. Conference Name: IEEE Computational Intelligence Magazine (2022)","DOI":"10.1109\/MCI.2022.3155327"},{"key":"10488_CR23","unstructured":"Khan, M., Lin, W.: Conjugate-computation variational inference: converting variational inference in non-conjugate models to inferences in conjugate models. In: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, pp. 878\u2013887. PMLR. ISSN: 2640\u20133498 (2017)"},{"key":"10488_CR24","doi-asserted-by":"crossref","unstructured":"Khan, M.E., Nielsen, D.: Fast yet simple natural-gradient descent for variational inference in complex models. In: 2018 International Symposium on Information Theory and Its Applications (ISITA), pp. 31\u201335. IEEE, Singapore (2018)","DOI":"10.23919\/ISITA.2018.8664326"},{"key":"10488_CR25","unstructured":"Kingma, D.P., Welling, M.: Auto-Encoding Variational Bayes (2014). arXiv preprint arXiv:1312.6114"},{"key":"10488_CR26","unstructured":"Lin, W., Khan, M.E., Schmidt, M.: Fast and simple natural-gradient variational inference with mixture of exponential-family approximations. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th International Conference on Machine Learning, volume\u00a097 of Proceedings of Machine Learning Research, pp. 3992\u20134002. PMLR (2019)"},{"key":"10488_CR27","unstructured":"Lin, W., Nielsen, F., Emtiyaz, K.M., Schmidt, M.: Tractable structured natural-gradient descent using local parameterizations. In: Proceedings of the 38th International Conference on Machine Learning, pp. 6680\u20136691. PMLR. ISSN: 2640-3498 (2021)"},{"issue":"3","key":"10488_CR28","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1080\/10618600.2018.1562936","volume":"28","author":"R Loaiza-Maya","year":"2019","unstructured":"Loaiza-Maya, R., Smith, M.S.: Variational bayes estimation of discrete-margined copula models with application to time series. J. Comput. Graph. Stat. 28(3), 523\u2013539 (2019)","journal-title":"J. Comput. Graph. Stat."},{"issue":"2","key":"10488_CR29","doi-asserted-by":"publisher","first-page":"339","DOI":"10.1016\/j.jeconom.2021.05.002","volume":"230","author":"R Loaiza-Maya","year":"2022","unstructured":"Loaiza-Maya, R., Smith, M.S., Nott, D.J., Danaher, P.J.: Fast and accurate variational inference for models with many latent variables. J. Econom. 230(2), 339\u2013362 (2022)","journal-title":"J. Econom."},{"key":"10488_CR30","unstructured":"Martens, J., Grosse, R.: Optimizing neural networks with Kronecker-factored approximate curvature. In: Bach, F., Blei, D. (eds.) Proceedings of the 32nd International Conference on Machine Learning, volume\u00a037 of Proceedings of Machine Learning Research, pp. 2408\u20132417, Lille, France. PMLR (2015)"},{"key":"10488_CR31","doi-asserted-by":"crossref","unstructured":"Martens, J., Sutskever, I.: Training deep and recurrent networks with hessian-free optimization. In: Montavon, G., Orr, G.B., M\u00fcller, K.-R. (eds.) Neural Networks: Tricks of the Trade, vol. 7700, pp. 479\u2013535. Lecture Notes in Computer Science. Springer, Berlin. Series Title (2012)","DOI":"10.1007\/978-3-642-35289-8_27"},{"issue":"146","key":"10488_CR32","first-page":"1","volume":"21","author":"J Martens","year":"2020","unstructured":"Martens, J.: New insights and perspectives on the natural gradient method. J. Mach. Learn. Res. 21(146), 1\u201376 (2020)","journal-title":"J. Mach. Learn. Res."},{"key":"10488_CR33","volume-title":"Generalized, Linear, and Mixed Models","author":"CE McCulloch","year":"2004","unstructured":"McCulloch, C.E., Searle, S.R.: Generalized, Linear, and Mixed Models. Wiley, New York (2004)"},{"key":"10488_CR34","unstructured":"Miller, A.C., Foti, N.J., Adams, R.P.: Variational boosting: iteratively refining posterior approximations. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning, volume\u00a070 of Proceedings of Machine Learning Research, pp. 2420\u20132429. PMLR (2017)"},{"key":"10488_CR35","unstructured":"Mishkin, A., Kunstner, F., Nielsen, D., Schmidt, M., Khan, M.E.: SLANG: Fast structured covariance approximations for Bayesian deep learning with natural gradient. In: Advances in Neural Information Processing Systems, vol.\u00a031 (2018)"},{"key":"10488_CR36","doi-asserted-by":"crossref","unstructured":"Ong, V.M.-H., Nott, D.J., Smith, M.S.: Gaussian variational approximation with a factor covariance structure. J. Comput. Graph. Stat. 27(3), 465\u2013478 (2018a)","DOI":"10.1080\/10618600.2017.1390472"},{"key":"10488_CR37","doi-asserted-by":"crossref","unstructured":"Ong, V.M.-H., Nott, D.J., Tran, M.-N., Sisson, S.A., Drovandi, C.C.: Likelihood-free inference in high dimensions with synthetic likelihood. Comput. Stat. Data Anal. 128, 271\u2013291 (2018b)","DOI":"10.1016\/j.csda.2018.07.008"},{"issue":"2","key":"10488_CR38","doi-asserted-by":"publisher","first-page":"140","DOI":"10.1198\/tast.2010.09058","volume":"64","author":"JT Ormerod","year":"2010","unstructured":"Ormerod, J.T., Wand, M.P.: Explaining variational approximations. Am. Stat. 64(2), 140\u2013153 (2010)","journal-title":"Am. Stat."},{"key":"10488_CR39","doi-asserted-by":"crossref","unstructured":"Osawa, K., Tsuji, Y., Ueno, Y., Naruse, A., Yokota, R., Matsuoka, S.: Large-scale distributed second-order optimization using Kronecker-factored approximate curvature for deep convolutional neural networks. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 12359\u201312367 (2019)","DOI":"10.1109\/CVPR.2019.01264"},{"key":"10488_CR40","unstructured":"Paisley, J., Blei, D., and Jordan, M.: Variational Bayesian inference with stochastic search. arXiv preprint arXiv:1206.6430 (2012)"},{"key":"10488_CR41","unstructured":"Ranganath, R., Gerrish, S., Blei, D.: Black Box variational inference. In: Kaski, S., Corander, J. (eds.), Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, volume\u00a033 of Proceedings of Machine Learning Research, pp. 814\u2013822, Reykjavik, Iceland. PMLR (2014)"},{"issue":"24","key":"10488_CR42","doi-asserted-by":"publisher","first-page":"5461","DOI":"10.1103\/PhysRevLett.81.5461","volume":"81","author":"M Rattray","year":"1998","unstructured":"Rattray, M., Saad, D., Amari, S.-I.: Natural gradient descent for on-line learning. Phys. Rev. Lett. 81(24), 5461 (1998)","journal-title":"Phys. Rev. Lett."},{"key":"10488_CR43","unstructured":"Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. In: Xing, E.P., Jebara, T. (eds.), Proceedings of the 31st International Conference on Machine Learning, volume\u00a032 of Proceedings of Machine Learning Research, pp. 1278\u20131286, Bejing, China. PMLR (2014)"},{"issue":"3","key":"10488_CR44","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1214\/aoms\/1177729586","volume":"22","author":"H Robbins","year":"1951","unstructured":"Robbins, H., Monro, S.: A stochastic approximation method. Ann. Math. Stat. 22(3), 400\u2013407 (1951)","journal-title":"Ann. Math. Stat."},{"issue":"6088","key":"10488_CR45","doi-asserted-by":"publisher","first-page":"533","DOI":"10.1038\/323533a0","volume":"323","author":"DE Rumelhart","year":"1986","unstructured":"Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533\u2013536 (1986)","journal-title":"Nature"},{"key":"10488_CR46","first-page":"25111","volume":"34","author":"G Simchoni","year":"2021","unstructured":"Simchoni, G., Rosset, S.: Using random effects to account for high-cardinality categorical features and repeated measures in deep neural networks. Adv. Neural. Inf. Process. Syst. 34, 25111\u201325122 (2021)","journal-title":"Adv. Neural. Inf. Process. Syst."},{"issue":"4","key":"10488_CR47","doi-asserted-by":"publisher","first-page":"729","DOI":"10.1080\/10618600.2020.1740097","volume":"29","author":"MS Smith","year":"2020","unstructured":"Smith, M.S., Loaiza-Maya, R., Nott, D.J.: High-dimensional copula variational approximation through transformation. J. Comput. Graph. Stat. 29(4), 729\u2013743 (2020)","journal-title":"J. Comput. Graph. Stat."},{"key":"10488_CR48","unstructured":"Tan, L.S.L.: Analytic natural gradient updates for Cholesky factor in Gaussian variational approximation. arXiv preprint arXiv:2109.00375 (2022)"},{"issue":"1","key":"10488_CR49","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1111\/rssb.12399","volume":"83","author":"LSL Tan","year":"2021","unstructured":"Tan, L.S.L.: Use of model reparametrization to improve variational bayes. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 83(1), 30\u201357 (2021)","journal-title":"J. R. Stat. Soc. Ser. B (Stat. Methodol.)"},{"issue":"4","key":"10488_CR50","doi-asserted-by":"publisher","first-page":"873","DOI":"10.1080\/10618600.2017.1330205","volume":"26","author":"M-N Tran","year":"2017","unstructured":"Tran, M.-N., Nott, D.J., Kohn, R.: Variational Bayes with intractable likelihood. J. Comput. Graph. Stat. 26(4), 873\u2013882 (2017)","journal-title":"J. Comput. Graph. Stat."},{"key":"10488_CR51","unstructured":"Tran, D., Vafa, K., Agrawal, K., Dinh, L., Poole, B. : Discrete flows: Invertible generative models of discrete data. In: Advances in Neural Information Processing Systems, vol.\u00a032 (2019)"},{"issue":"1","key":"10488_CR52","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1080\/10618600.2019.1637747","volume":"29","author":"M-N Tran","year":"2020","unstructured":"Tran, M.-N., Nguyen, N., Nott, D., Kohn, R.: Bayesian deep net GLM and GLMM. J. Comput. Graph. Stat. 29(1), 97\u2013113 (2020)","journal-title":"J. Comput. Graph. Stat."},{"key":"10488_CR53","unstructured":"Tran, B.-H., Rossi, S., Milios, D., Filippone, M.: All you need is a good functional prior for Bayesian deep learning. J. Mach. Learn. Res. 23(74), 1\u201356 (2022)"},{"key":"10488_CR54","unstructured":"Wang, H., Bhattacharya, A., Pati, D., Yang, Y.: Structured variational inference in bayesian state-space models. In: Camps-Valls, G., Ruiz, F.J.R., Valera, I. (eds.), Proceedings of the 25th International Conference on Artificial Intelligence and Statistics, volume 151 of Proceedings of Machine Learning Research, pp. 8884\u20138905. PMLR (2022)"},{"issue":"2","key":"10488_CR55","doi-asserted-by":"publisher","first-page":"175","DOI":"10.1007\/s13253-019-00361-7","volume":"24","author":"CK Wikle","year":"2019","unstructured":"Wikle, C.K.: Comparison of deep neural networks and deep hierarchical models for spatio-temporal data. J. Agric. Biol. Environ. Stat. 24(2), 175\u2013203 (2019)","journal-title":"J. Agric. Biol. Environ. Stat."},{"key":"10488_CR56","unstructured":"Zhang, G., Sun, S., Duvenaud, D., Grosse, R.: Noisy natural gradient as variational inference. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning, volume\u00a080 of Proceedings of Machine Learning Research, pp. 5852\u20135861. PMLR (2018)"},{"key":"10488_CR57","doi-asserted-by":"crossref","unstructured":"Zhang, C., Butepage, J., Kjellstrom, H., Mandt, S.: Advances in variational inference. IEEE Trans. Pattern Anal. Mach. Intell. 41(8), 2008\u20132026 (2019a)","DOI":"10.1109\/TPAMI.2018.2889774"},{"key":"10488_CR58","unstructured":"Zhang, G., Martens, J., Grosse, R.B.: Fast convergence of natural gradient descent for over-parameterized neural networks. In: Advances in Neural Information Processing Systems, vol.\u00a032. Curran Associates, Inc (2019b)"}],"container-title":["Statistics and Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-024-10488-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11222-024-10488-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-024-10488-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,25]],"date-time":"2024-11-25T12:11:22Z","timestamp":1732536682000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11222-024-10488-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,9,21]]},"references-count":58,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,12]]}},"alternative-id":["10488"],"URL":"https:\/\/doi.org\/10.1007\/s11222-024-10488-4","relation":{},"ISSN":["0960-3174","1573-1375"],"issn-type":[{"value":"0960-3174","type":"print"},{"value":"1573-1375","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,9,21]]},"assertion":[{"value":"22 April 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 August 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 September 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"185"}}