{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T18:11:45Z","timestamp":1776449505811,"version":"3.51.2"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,12,17]],"date-time":"2021-12-17T00:00:00Z","timestamp":1639699200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,12,17]],"date-time":"2021-12-17T00:00:00Z","timestamp":1639699200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["RTG 2126"],"award-info":[{"award-number":["RTG 2126"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["RTG 2126"],"award-info":[{"award-number":["RTG 2126"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["RTG 2126"],"award-info":[{"award-number":["RTG 2126"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Stat Comput"],"published-print":{"date-parts":[[2022,2,15]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Gaussian Mixture Models are a powerful tool in Data Science and Statistics that are mainly used for clustering and density approximation. 
The task of estimating the model parameters is in practice often solved by the expectation maximization (EM) algorithm which has its benefits in its simplicity and low per-iteration costs. However, the EM converges slowly if there is a large share of hidden information or overlapping clusters. Recent advances in Manifold Optimization for Gaussian Mixture Models have gained increasing interest. We introduce an explicit formula for the Riemannian Hessian for Gaussian Mixture Models. On top, we propose a new Riemannian Newton Trust-Region method which outperforms current approaches both in terms of runtime and number of iterations. We apply our method on clustering problems and density approximation tasks. Our method is very powerful for data with a large share of hidden information compared to existing methods.<\/jats:p>","DOI":"10.1007\/s11222-021-10071-1","type":"journal-article","created":{"date-parts":[[2021,12,17]],"date-time":"2021-12-17T18:21:47Z","timestamp":1639765307000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A Riemannian Newton trust-region method for fitting Gaussian mixture models"],"prefix":"10.1007","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0828-0046","authenticated-orcid":false,"given":"Lena","family":"Sembach","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5771-6179","authenticated-orcid":false,"given":"Jan Pablo","family":"Burgard","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7665-130X","authenticated-orcid":false,"given":"Volker","family":"Schulz","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,12,17]]},"reference":[{"key":"10071_CR1","doi-asserted-by":"crossref","unstructured":"Absil, P.-A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. 
Princeton University Press, Princeton (2008) ISBN 978-0-691-13298-3","DOI":"10.1515\/9781400830244"},{"key":"10071_CR2","doi-asserted-by":"publisher","unstructured":"Alf\u00f2, M., Nieddu, L., Vicari, D.: A finite mixture model for image segmentation. Stat Comput, 18(2):137\u2013150. https:\/\/doi.org\/10.1007\/s11222-007-9044-9","DOI":"10.1007\/s11222-007-9044-9"},{"key":"10071_CR3","doi-asserted-by":"publisher","first-page":"160","DOI":"10.1016\/j.csda.2018.05.015","volume":"127","author":"JL Andrews","year":"2018","unstructured":"Andrews, J.L.: Addressing overfitting and underfitting in Gaussian model-based clustering. Comput Stat Data Anal 127, 160\u2013171 (2018)","journal-title":"Comput Stat Data Anal"},{"key":"10071_CR4","unstructured":"Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: SODA \u201907: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027\u20131035, Philadelphia (2007). Society for Industrial and Applied Mathematics. ISBN 978-0-898716-24-5"},{"key":"10071_CR5","unstructured":"Articus, C., Burgard, J.P.: A finite mixture fay herriot-type model for estimating regional rental prices in Germany. Research Papers in Economics 2014-14, University of Trier, Department of Economics, 2014. https:\/\/ideas.repec.org\/p\/trr\/wpaper\/201414.html"},{"key":"10071_CR6","series-title":"Princeton Series in Applied Mathematics","volume-title":"Positive Definite Matrices","author":"R Bhatia","year":"2007","unstructured":"Bhatia, R.: Positive Definite Matrices. Princeton Series in Applied Mathematics, Princeton University Press, Princeton (2007)"},{"key":"10071_CR7","volume-title":"Pattern Recognition and Machine Learning (Information Science and Statistics)","author":"CM Bishop","year":"2006","unstructured":"Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). 
Springer, Berlin, Heidelberg (2006)"},{"issue":"2","key":"10071_CR8","doi-asserted-by":"publisher","first-page":"511","DOI":"10.1016\/j.nima.2003.08.157","volume":"516","author":"R Bock","year":"2004","unstructured":"Bock, R., Chilingarian, A., Gaug, M., Hakl, F., Hengstebeck, T., Ji\u0159ina, M., Klaschka, J., Kotr\u010d, E., Savick\u00fd, P., Towers, S., Vaiciulis, A., Wittek, W.: Methods for multidimensional event classification: a case study using images from a cherenkov gamma-ray telescope. Nucl. Instrum. Methods Phys. Res. Sect. A Acceler. Spectrom. Detect. Assoc. Equip. 516(2), 511\u2013528 (2004). https:\/\/doi.org\/10.1016\/j.nima.2003.08.157","journal-title":"Nucl. Instrum. Methods Phys. Res. Sect. A Acceler. Spectrom. Detect. Assoc. Equip."},{"key":"10071_CR9","unstructured":"Boumal, N.: An introduction to optimization on smooth manifolds (2020). http:\/\/www.nicolasboumal.net\/book"},{"key":"10071_CR10","doi-asserted-by":"crossref","unstructured":"Carmo, M.P.d. (1992) Riemannian Geometry. Mathematics Theory and Applications. Birkh\u00e4user, Boston. ISBN 0-8176-3490-8","DOI":"10.1007\/978-1-4757-2201-7"},{"key":"10071_CR11","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1111\/ectj.12068","volume":"19(3), C95\u2013C127","author":"G Compiani","year":"2016","unstructured":"Compiani, G., Kitamura, Y.: Using mixtures in econometric models: a brief review and some new results. Econom. J. 19(3), C95\u2013C127, 10 (2016). https:\/\/doi.org\/10.1111\/ectj.12068","journal-title":"Econom. J."},{"key":"10071_CR12","doi-asserted-by":"crossref","unstructured":"Conn, A.R., Gould, N.I., Toint, P.L.: Trust-region methods. MPS-SIAM series on optimization. SIAM Society for Industrial and Applied Mathematics, Philadelphia (2000). ISBN 0898714605","DOI":"10.1137\/1.9780898719857"},{"key":"10071_CR13","doi-asserted-by":"crossref","unstructured":"Coretto, P.: Estimation and computations for gaussian mixtures with uniform noise under separation constraints. Stat. 
Methods Appl., 07 2021. 10.1007\/s10260-021-00578-2","DOI":"10.1007\/s10260-021-00578-2"},{"key":"10071_CR14","doi-asserted-by":"publisher","unstructured":"Cortez, P., Cerdeira, A., Almeida, F., Matos, T., Reis, J.: Modeling wine preferences by data mining from physicochemical properties. Decis. Support Syst., 47(4):547\u2013553 (2009). https:\/\/doi.org\/10.1016\/j.dss.2009.05.016. Smart Business Networks: Concepts and Empirical Evidence","DOI":"10.1016\/j.dss.2009.05.016"},{"key":"10071_CR15","unstructured":"Dasgupta, S.: Learning mixtures of gaussians. In: Proceedings of the 40th Annual Symposium on Foundations of Computer Science, FOCS \u201999, pp. 634 (1999). IEEE Computer Society. ISBN 0769504094"},{"issue":"4","key":"10071_CR16","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1007\/s11222-020-09921-1","volume":"30","author":"D Dresvyanskiy","year":"2020","unstructured":"Dresvyanskiy, D., Karaseva, T., Makogin, V., Mitrofanov, S., Redenbach, C., Spodarev, E.: Detecting anomalies in fibre systems using 3-dimensional image data. Stat. Comput. 30(4), 817\u2013837 (2020)","journal-title":"Stat. Comput."},{"key":"10071_CR17","unstructured":"Dua, D., Graff, C.: UCI machine learning repository. http:\/\/archive.ics.uci.edu\/ml (2017)"},{"key":"10071_CR18","unstructured":"Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press. http:\/\/www.deeplearningbook.org (2016)"},{"key":"10071_CR19","doi-asserted-by":"publisher","unstructured":"Gould, N., Orban, D., Sartenaer, A., Toint, P.: Sensitivity of trust-region algorithms to their parameters. 4OR Q J Belgian French Italian Oper. Res. Soc. 3, 227\u2013241 (2005). 
https:\/\/doi.org\/10.1007\/s10288-005-0065-y","DOI":"10.1007\/s10288-005-0065-y"},{"key":"10071_CR20","unstructured":"Gross, M., Rendtel, U., Schmid, T., Schmon, S., Tzavidis, N.: Estimating the Density of Ethnic Minorities and Aged People in Berlin (2015)"},{"key":"10071_CR21","doi-asserted-by":"publisher","first-page":"6","DOI":"10.1002\/nla.2175","volume":"25","author":"G Heidel","year":"2018","unstructured":"Heidel, G., Schulz, V.: A Riemannian trust-region method for low-rank tensor completion. Numer. Linear Algebra Appl. 25, 6 (2018)","journal-title":"Numer. Linear Algebra Appl."},{"key":"10071_CR22","unstructured":"Hosseini, R., Mash\u2019al, M.: Mixest: An estimation toolbox for mixture models. CoRR, abs\/1507.06065. http:\/\/arxiv.org\/abs\/1507.06065 (2015)"},{"key":"10071_CR23","unstructured":"Hosseini, R., Sra, S.: Matrix manifold optimization for gaussian mixtures. In: Advances in Neural Information Processing Systems, vol.\u00a028. Curran Associates, Inc. (2015)"},{"issue":"1","key":"10071_CR24","doi-asserted-by":"publisher","first-page":"187","DOI":"10.1007\/s10107-019-01381-4","volume":"181","author":"R Hosseini","year":"2020","unstructured":"Hosseini, R., Sra, S.: An alternative to EM for gaussian mixture models: batch and stochastic Riemannian optimization. Math. Program. 181(1), 187\u2013223 (2020)","journal-title":"Math. Program."},{"issue":"1","key":"10071_CR25","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1007\/BF01908075","volume":"2","author":"L Hubert","year":"1985","unstructured":"Hubert, L., Arabie, P.: Comparing partitions. J Classif 2(1), 193\u2013218 (1985)","journal-title":"J Classif"},{"key":"10071_CR26","first-page":"379","volume":"39","author":"B Jeuris","year":"2012","unstructured":"Jeuris, B., Vandebril, R., Vandereycken, B.: A survey and comparison of contemporary algorithms for computing the matrix geometric mean. 
ETNA 39, 379\u2013402 (2012)","journal-title":"ETNA"},{"key":"10071_CR27","unstructured":"Kaya, H., Tufekci, P.: Local and global learning methods for predicting power of a combined gas and steam turbine (2012)"},{"key":"10071_CR28","doi-asserted-by":"publisher","first-page":"4783","DOI":"10.3906\/elk-1807-87","volume":"27","author":"H Kaya","year":"2019","unstructured":"Kaya, H., T\u00fcfekci, P., Uzun, E.: Predicting co and nox emissions from gas turbines: novel data and a benchmark pems. Turk. J. Electr. Eng. Comput. Sci. 27, 4783\u20134796 (2019)","journal-title":"Turk. J. Electr. Eng. Comput. Sci."},{"key":"10071_CR29","doi-asserted-by":"publisher","first-page":"09","DOI":"10.1007\/s11634-013-0132-8","volume":"7","author":"S Lee","year":"2013","unstructured":"Lee, S., Mclachlan, G.: On mixtures of skew normal and skew [tex equation: t]-distributions. Adv. Data Anal. Classif. 7, 09 (2013). https:\/\/doi.org\/10.1007\/s11634-013-0132-8","journal-title":"Adv. Data Anal. Classif."},{"key":"10071_CR30","doi-asserted-by":"publisher","unstructured":"Li, Y., Li, L.: A novel split and merge em algorithm for gaussian mixture model. In: 2009 Fifth International Conference on Natural Computation, vol.\u00a06, pp. 479\u2013483 (2009). https:\/\/doi.org\/10.1109\/ICNC.2009.625","DOI":"10.1109\/ICNC.2009.625"},{"key":"10071_CR31","doi-asserted-by":"publisher","first-page":"2881","DOI":"10.1162\/089976600300014764","volume":"12","author":"J Ma","year":"2000","unstructured":"Ma, J., Xu, L., Jordan, M.: Asymptotic convergence rate of the em algorithm for gaussian mixtures. Neural Comput. 12, 2881\u20132907 (2000)","journal-title":"Neural Comput."},{"issue":"1","key":"10071_CR32","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1146\/annurev-statistics-031017-100325","volume":"6","author":"GJ McLachlan","year":"2019","unstructured":"McLachlan, G.J., Lee, S.X., Rathnayake, S.I.: Finite mixture models. Ann. Rev. Stat. Appl. 6(1), 355\u2013378 (2019). 
https:\/\/doi.org\/10.1146\/annurev-statistics-031017-100325","journal-title":"Ann. Rev. Stat. Appl."},{"key":"10071_CR33","doi-asserted-by":"publisher","first-page":"1079","DOI":"10.1137\/S1052623497327854","volume":"10","author":"J Morales","year":"2000","unstructured":"Morales, J., Nocedal, J.: Automatic preconditioning by limited memory quasi-newton updating. SIAM J. Optim. 10, 1079\u20131096 (2000)","journal-title":"SIAM J. Optim."},{"key":"10071_CR34","volume-title":"Machine Learning: A Probabilistic Perspective","author":"KP Murphy","year":"2013","unstructured":"Murphy, K.P.: Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge (2013)"},{"key":"10071_CR35","unstructured":"Naim, I., Gildea, D.: Convergence of the em algorithm for gaussian mixtures with unbalanced mixing coefficients. In: Proceedings of the 29th International Conference on Machine Learning, ICML\u201912, Madison, pp. 1427\u20131431 (2012). Omnipress. ISBN 9781450312851"},{"key":"10071_CR36","unstructured":"Ormoneit, D., Tresp, V.: Improved gaussian mixture density estimates using bayesian penalty terms and network averaging. In: Proceedings of the 8th International Conference on Neural Information Processing Systems, NIPS\u201995, pp. 542\u2013548. MIT Press, Cambridge (1995)"},{"key":"10071_CR37","unstructured":"Salakhutdinov, R., Roweis, S., Ghahramani, Z.: Optimization with em and expectation-conjugate-gradient. In: Proceedings of the Twentieth International Conference on Machine Learning, ICML\u201903, pp. 672\u2013679. AAAI Press (2003). ISBN 1577351894"},{"issue":"6","key":"10071_CR38","doi-asserted-by":"publisher","first-page":"1788","DOI":"10.1137\/S1064827595286955","volume":"18","author":"A Sartenaer","year":"1997","unstructured":"Sartenaer, A.: Automatic determination of an initial trust region in nonlinear programming. SIAM J. Sci. Comput. 18(6), 1788\u20131803 (1997)","journal-title":"SIAM J. Sci. 
Comput."},{"key":"10071_CR39","doi-asserted-by":"publisher","DOI":"10.1002\/9780470316849","volume-title":"Multivariate Density Estimation. Theory, Practice, and Visualization. Wiley Series in Probability and Mathematical Statistics","author":"D Scott","year":"1992","unstructured":"Scott, D.: Multivariate Density Estimation. Theory, Practice, and Visualization. Wiley Series in Probability and Mathematical Statistics. Wiley, New York, London (1992)"},{"key":"10071_CR40","doi-asserted-by":"crossref","unstructured":"Sembach, L., Burgard, J.P., Schulz, V.H.: A Riemannian newton trust-region method for fitting gaussian mixture models (2021)","DOI":"10.1007\/s11222-021-10071-1"},{"issue":"1","key":"10071_CR41","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1063\/1.1477037","volume":"617","author":"H Snoussi","year":"2002","unstructured":"Snoussi, H., Mohammad-Djafari, A.: Penalized maximum likelihood for multivariate gaussian mixture. AIP Conf. Proc. 617(1), 36\u201346 (2002)","journal-title":"AIP Conf. Proc."},{"key":"10071_CR42","doi-asserted-by":"publisher","first-page":"713","DOI":"10.1137\/140978168","volume":"25","author":"S Sra","year":"2015","unstructured":"Sra, S., Hosseini, R.: Conic geometric optimization on the manifold of positive definite matrices. SIAM J. Optim. 25, 713\u2013739 (2015)","journal-title":"SIAM J. Optim."},{"issue":"137","key":"10071_CR43","first-page":"1","volume":"17","author":"J Townsend","year":"2016","unstructured":"Townsend, J., Koep, N., Weichwald, S.: Pymanopt: a python toolbox for optimization on manifolds using automatic differentiation. J. Mach. Learn. Res. 17(137), 1\u20135 (2016)","journal-title":"J. Mach. Learn. 
Res."},{"issue":"1","key":"10071_CR44","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1023\/A:1008677427361","volume":"13","author":"RJ Vanderbei","year":"1999","unstructured":"Vanderbei, R.J., Benson, H.Y.: On formulating semidefinite programming problems as smooth convex nonlinear optimization problems. Comput. Optim. Appl. 13(1), 231\u2013252 (1999)","journal-title":"Comput. Optim. Appl."},{"key":"10071_CR45","doi-asserted-by":"publisher","unstructured":"Wu, B., Mcgrory, C.A., Pettitt, A.N.: A new variational Bayesian algorithm with application to human mobility pattern modeling. Stat Comput 22(1), 185\u2013203 (2012). https:\/\/doi.org\/10.1007\/s11222-010-9217-9","DOI":"10.1007\/s11222-010-9217-9"},{"issue":"1","key":"10071_CR46","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1162\/neco.1996.8.1.129","volume":"8","author":"L Xu","year":"1996","unstructured":"Xu, L., Jordan, M.I.: On convergence properties of the em algorithm for gaussian mixtures. Neural Comput. 8(1), 129\u2013151 (1996)","journal-title":"Neural Comput."},{"key":"10071_CR47","first-page":"1736","volume":"3","author":"D Zoran","year":"2012","unstructured":"Zoran, D., Weiss, Y.: Natural images, Gaussian mixtures and dead leaves. Adv. Neural Inf. Process. Syst. 3, 1736\u20131744 (2012)","journal-title":"Adv. Neural Inf. Process. 
Syst."}],"container-title":["Statistics and Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-021-10071-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11222-021-10071-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11222-021-10071-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,3]],"date-time":"2022-03-03T06:04:03Z","timestamp":1646287443000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11222-021-10071-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,17]]},"references-count":47,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,2,15]]}},"alternative-id":["10071"],"URL":"https:\/\/doi.org\/10.1007\/s11222-021-10071-1","relation":{},"ISSN":["0960-3174","1573-1375"],"issn-type":[{"value":"0960-3174","type":"print"},{"value":"1573-1375","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12,17]]},"assertion":[{"value":"3 May 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 November 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 December 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that there is no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"8"}}