{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T03:24:03Z","timestamp":1740108243009,"version":"3.37.3"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2019,12,3]],"date-time":"2019-12-03T00:00:00Z","timestamp":1575331200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2019,12,3]],"date-time":"2019-12-03T00:00:00Z","timestamp":1575331200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100004440","name":"Wellcome Trust","doi-asserted-by":"publisher","award":["102423\/Z\/13\/Z","090532\/Z\/09\/Z"],"award-info":[{"award-number":["102423\/Z\/13\/Z","090532\/Z\/09\/Z"]}],"id":[{"id":"10.13039\/100004440","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000265","name":"Medical Research Council","doi-asserted-by":"publisher","award":["MC_UU_12025"],"award-info":[{"award-number":["MC_UU_12025"]}],"id":[{"id":"10.13039\/501100000265","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Comput Stat"],"published-print":{"date-parts":[[2020,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Expectation maximization (EM) is a technique for estimating maximum-likelihood parameters of a latent variable model given observed data by alternating between taking expectations of sufficient statistics, and maximizing the expected log likelihood. For situations where sufficient statistics are intractable, stochastic approximation EM (SAEM) is often used, which uses Monte Carlo techniques to approximate the expected log likelihood. 
Two common implementations of SAEM, Batch EM (BEM) and online EM (OEM), are parameterized by a \u201clearning rate\u201d, and their efficiency depends strongly on this parameter. We propose an extension to the OEM algorithm, termed Introspective Online Expectation Maximization (IOEM), which removes the need for specifying this parameter by adapting the learning rate to trends in the parameter updates. We show that our algorithm matches the efficiency of the optimal BEM and OEM algorithms in multiple models, and that the efficiency of IOEM can exceed that of BEM\/OEM methods with optimal learning rates when the model has many parameters. Finally we use IOEM to fit two models to a financial time series. A Python implementation is available at <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/luntergroup\/IOEM.git\">https:\/\/github.com\/luntergroup\/IOEM.git<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s00180-019-00937-4","type":"journal-article","created":{"date-parts":[[2019,12,3]],"date-time":"2019-12-03T03:02:23Z","timestamp":1575342143000},"page":"1319-1344","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Efficient inference in state-space models through adaptive learning in online Monte Carlo expectation maximization"],"prefix":"10.1007","volume":"35","author":[{"given":"Donna","family":"Henderson","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3798-2058","authenticated-orcid":false,"given":"Gerton","family":"Lunter","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2019,12,3]]},"reference":[{"key":"937_CR1","first-page":"1","volume":"3","author":"LE Baum","year":"1972","unstructured":"Baum LE (1972) An equality and associated maximization 
technique in statistical estimation for probabilistic functions of Markov processes. Inequalities 3:1\u20138","journal-title":"Inequalities"},{"key":"937_CR2","doi-asserted-by":"crossref","unstructured":"Bottou L (2012) Stochastic gradient descent tricks. In: Montavon G, Orr GB, M\u00fcller KR (eds) Neural networks: tricks of the trade. Lecture notes in computer science, , vol 7700, pp 421\u2013436","DOI":"10.1007\/978-3-642-35289-8_25"},{"key":"937_CR3","doi-asserted-by":"crossref","unstructured":"Capp\u00e9 O (2009) Online sequential Monte Carlo EM algorithm. In: IEEE\/SP 15th Workshop on statistical signal processing, 2009. SSP\u201909, pp\u00a037\u201340. IEEE","DOI":"10.1109\/SSP.2009.5278646"},{"issue":"5","key":"937_CR4","doi-asserted-by":"publisher","first-page":"899","DOI":"10.1109\/JPROC.2007.893250","volume":"95","author":"O Capp\u00e9","year":"2007","unstructured":"Capp\u00e9 O, Godsill SJ, Moulines E (2007) An overview of existing methods and recent advances in sequential monte carlo. Proc IEEE 95(5):899\u2013924","journal-title":"Proc IEEE"},{"key":"937_CR5","unstructured":"Capp\u00e9 O, Moulines E (2005) On the use of particle filtering for maximum likelihood parameter estimation. In: 13th European signal processing conference, pp 1\u20134. IEEE"},{"issue":"3","key":"937_CR6","doi-asserted-by":"publisher","first-page":"593","DOI":"10.1111\/j.1467-9868.2009.00698.x","volume":"71","author":"O Capp\u00e9","year":"2009","unstructured":"Capp\u00e9 O, Moulines E (2009) On-line expectation\u2013maximization algorithm for latent data models. J R Stat Soc Ser B 71(3):593\u2013613","journal-title":"J R Stat Soc Ser B"},{"key":"937_CR7","unstructured":"Celeux G, Chaveaux D, Diebolt J (1995) On stochastic versions of the EM algorithm. 
Technical Report 2514, INRIA"},{"key":"937_CR8","first-page":"73","volume":"2","author":"G Celeux","year":"1985","unstructured":"Celeux G, Diebolt J (1985) The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem. Comput Stat Q 2:73\u201382","journal-title":"Comput Stat Q"},{"issue":"41","key":"937_CR9","doi-asserted-by":"publisher","first-page":"119","DOI":"10.1080\/17442509208833797","volume":"2","author":"G Celeux","year":"1992","unstructured":"Celeux G, Diebolt J (1992) A stochastic approximation type EM algorithm for the mixture problem. Stoch Stoch Rep 2(41):119\u2013132","journal-title":"Stoch Stoch Rep"},{"issue":"8","key":"937_CR10","doi-asserted-by":"publisher","first-page":"934","DOI":"10.1016\/j.jprocont.2010.06.008","volume":"20","author":"SB Chitralekha","year":"2010","unstructured":"Chitralekha SB, Prakash J, Raghavan H, Gopaluni R, Shah SL (2010) A comparison of simultaneous state and parameter estimation schemes for a continuous fermentor reactor. J Process Control 20(8):934\u2013943","journal-title":"J Process Control"},{"key":"937_CR11","doi-asserted-by":"publisher","first-page":"843","DOI":"10.1016\/0378-4266(95)00084-T","volume":"19","author":"MM Cornett","year":"1995","unstructured":"Cornett MM, Schwarz TV, Szakmary AC (1995) Seasonalities and intraday return patterns in the foreign currency futures market. J Bank Finance 19:843\u2013869","journal-title":"J Bank Finance"},{"issue":"1","key":"937_CR12","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1214\/aos\/1018031103","volume":"27","author":"B Delyon","year":"1999","unstructured":"Delyon B, Lavielle M, Moulines E (1999) Convergence of a stochastic approximation version of the EM algorithm. Ann Stat 27(1):94\u2013128","journal-title":"Ann Stat"},{"key":"937_CR13","first-page":"1","volume-title":"J R Stat Soc Ser B","author":"AP Dempster","year":"1977","unstructured":"Dempster AP, Laird NM, Rubin DB (1977) J R Stat Soc Ser B. 
Maximum likelihood from incomplete data via the EM algorithm, Soc., pp 1\u201338"},{"key":"937_CR14","doi-asserted-by":"crossref","unstructured":"Douc R, Capp\u00e9 O (2005) Comparison of resampling schemes for particle filtering. In: Proceedings of the 4th international symposium on image and signal processing and analysis, pp\u00a064\u201369. IEEE","DOI":"10.1109\/ISPA.2005.195385"},{"key":"937_CR15","doi-asserted-by":"crossref","unstructured":"Doucet A, de\u00a0Freitas N, Gordon N (eds) (2001) Sequential Monte Carlo methods in practice. Springer","DOI":"10.1007\/978-1-4757-3437-9"},{"key":"937_CR16","first-page":"656","volume":"12","author":"A Doucet","year":"2009","unstructured":"Doucet A, Johansen AM (2009) A tutorial on particle filtering and smoothing: fifteen years later. Handb Nonlinear Filter 12:656\u2013704","journal-title":"Handb Nonlinear Filter"},{"key":"937_CR17","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1007\/s11222-006-8450-8","volume":"16","author":"P Fearnhead","year":"2006","unstructured":"Fearnhead P (2006) Efficient and exact bayesian inference for multiple changepoint problems. Stat Comput 16:203\u2013213","journal-title":"Stat Comput"},{"key":"937_CR18","doi-asserted-by":"publisher","first-page":"132","DOI":"10.1198\/jasa.2009.0009","volume":"104","author":"P Fearnhead","year":"2009","unstructured":"Fearnhead P, Vasileiou D (2009) Bayesian analysis of isochores. J Am Stat Assoc 104:132\u2013141","journal-title":"J Am Stat Assoc"},{"issue":"421","key":"937_CR19","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1080\/01621459.1993.10594313","volume":"88","author":"M Jamshidian","year":"1993","unstructured":"Jamshidian M, Jennrich RI (1993) Acceleration of the EM algorithm by using Quasi-Newton methods. J Am Stat Assoc 88(421):221\u2013228","journal-title":"J Am Stat Assoc"},{"key":"937_CR20","unstructured":"Jordan MI, Jacobs RA (1993) Hierarchical mixtures of experts and the EM algorithm. 
In: Proceedings of the 1993 international joint conference on neural networks, pp\u00a01339\u20131344"},{"key":"937_CR21","doi-asserted-by":"crossref","unstructured":"Kantas N, Doucet A, Singh SS, Maciejowski JM (2009) An overview of sequential Monte Carlo methods for parameter estimation in general state-space models. In: 15th IFAC symposium on system identification (SYSID), vol 102, p\u00a0117","DOI":"10.3182\/20090706-3-FR-2004.00129"},{"key":"937_CR22","volume-title":"Smarter trading: improving performance in changing markets","author":"P Kaufman","year":"1995","unstructured":"Kaufman P (1995) Smarter trading: improving performance in changing markets. McGraw-Hill, New York"},{"key":"937_CR23","unstructured":"Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: 3rd International conference on learning representations, ICLR 2015"},{"key":"937_CR24","first-page":"1","volume":"5","author":"K Lange","year":"1995","unstructured":"Lange K (1995) A quasi Newton acceleration of the EM algorithm. Stat Sin 5:1\u201318","journal-title":"Stat Sin"},{"key":"937_CR25","doi-asserted-by":"publisher","first-page":"763","DOI":"10.1214\/13-EJS789","volume":"7","author":"S Le Corff","year":"2013","unstructured":"Le Corff S, Fort G (2013) Online expectation maximization based algorithms for inference in hidden Markov models. Electron J Stat 7:763\u2013792","journal-title":"Electron J Stat"},{"issue":"1","key":"937_CR26","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1214\/12-STS401","volume":"28","author":"M Lin","year":"2013","unstructured":"Lin M, Chen R, Liu JS et al (2013) Lookahead strategies for sequential Monte Carlo. Stat Sci 28(1):69\u201394","journal-title":"Stat Sci"},{"issue":"1","key":"937_CR27","doi-asserted-by":"publisher","first-page":"168","DOI":"10.1002\/for.1195","volume":"30","author":"HF Lopes","year":"2010","unstructured":"Lopes HF, Tsay RS (2010) Particle filters and bayesian inference in financial econometrics. 
J Forecast 30(1):168\u2013209","journal-title":"J Forecast"},{"key":"937_CR28","unstructured":"Mandt S, Hoffman MD, Blei DM (2016) A variational analysis of stochastic gradient algorithms. In: Proceedings of the 33rd international conference on machine learning, vol\u00a048"},{"issue":"7","key":"937_CR29","doi-asserted-by":"publisher","first-page":"1706","DOI":"10.1162\/neco.2008.10-06-351","volume":"20","author":"G Mongillo","year":"2008","unstructured":"Mongillo G, Den\u00e8ve S (2008) Online learning with hidden Markov models. Neural Comput 20(7):1706\u20131716","journal-title":"Neural Comput"},{"key":"937_CR30","unstructured":"Nowlan S (1991) Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures, Ph.D. thesis, School of Computer Science. Cargegie Mellon University"},{"issue":"1","key":"937_CR31","doi-asserted-by":"publisher","first-page":"155","DOI":"10.3150\/07-BEJ6150","volume":"14","author":"J Olsson","year":"2008","unstructured":"Olsson J, Capp\u00e9 O, Douc R, Moulines E et al (2008) Sequential monte carlo smoothing with application to parameter estimation in nonlinear state space models. Bernoulli 14(1):155\u2013179","journal-title":"Bernoulli"},{"issue":"446","key":"937_CR32","doi-asserted-by":"publisher","first-page":"590","DOI":"10.1080\/01621459.1999.10474153","volume":"94","author":"MK Pitt","year":"1999","unstructured":"Pitt MK, Shephard N (1999) Filtering via simulation: auxiliary particle filters. J Am Stat Assoc 94(446):590\u2013599","journal-title":"J Am Stat Assoc"},{"key":"937_CR33","first-page":"98","volume":"7","author":"BT Polyak","year":"1990","unstructured":"Polyak BT (1990) A new method of stochastic approximation type. Avtomatika i telemekhanika 7:98\u2013107","journal-title":"Avtomatika i telemekhanika"},{"key":"937_CR34","unstructured":"Reddi SJ, Kale S, Kumar S (2018) On the convergence of Adam and beyond. 
In: Proceedings of ICLR"},{"issue":"4","key":"937_CR35","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1111\/j.1467-9892.1982.tb00349.x","volume":"3","author":"RH Shumway","year":"1982","unstructured":"Shumway RH, Stoffer DS (1982) An approach to time series smoothing and forecasting using the EM algorithm. J Time Ser Anal 3(4):253\u2013264","journal-title":"J Time Ser Anal"},{"issue":"2","key":"937_CR36","doi-asserted-by":"publisher","first-page":"335","DOI":"10.1111\/j.1467-9469.2007.00585.x","volume":"35","author":"R Varadhan","year":"2008","unstructured":"Varadhan R, Roland C (2008) Simple and globally convergent methods for accelerating the convergence of any EM algorithm. Scand J Stat 35(2):335\u2013353","journal-title":"Scand J Stat"},{"issue":"411","key":"937_CR37","doi-asserted-by":"publisher","first-page":"699","DOI":"10.1080\/01621459.1990.10474930","volume":"85","author":"GCG Wei","year":"1990","unstructured":"Wei GCG, Tanner MA (1990) A Monte Carlo implementation of the EM algorithm and the poor man\u2019s data augmentation algorithms. J Am Stat Assoc 85(411):699\u2013704","journal-title":"J Am Stat Assoc"},{"issue":"1","key":"937_CR38","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1162\/neco.1996.8.1.129","volume":"8","author":"L Xu","year":"1996","unstructured":"Xu L, Jordan M (1996) On convergence properties of the EM algorithm for gaussian mixtures. Neural Comput 8(1):129\u2013151","journal-title":"Neural Comput"},{"issue":"4","key":"937_CR39","doi-asserted-by":"publisher","first-page":"906","DOI":"10.1080\/10618600.2012.674653","volume":"22","author":"S Yildirim","year":"2013","unstructured":"Yildirim S, Singh SS, Doucet A (2013) An online expectation-maximization algorithm for changepoint models. J Comput Graph Stat 22(4):906\u2013926","journal-title":"J Comput Graph Stat"},{"key":"937_CR40","unstructured":"Zeiler MD (2012) ADADELTA: an adaptive learning rate method. 
arXiv:1212.5701"}],"container-title":["Computational Statistics"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00180-019-00937-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s00180-019-00937-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00180-019-00937-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,28]],"date-time":"2024-07-28T01:29:19Z","timestamp":1722130159000},"score":1,"resource":{"primary":{"URL":"http:\/\/link.springer.com\/10.1007\/s00180-019-00937-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12,3]]},"references-count":40,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2020,9]]}},"alternative-id":["937"],"URL":"https:\/\/doi.org\/10.1007\/s00180-019-00937-4","relation":{},"ISSN":["0943-4062","1613-9658"],"issn-type":[{"type":"print","value":"0943-4062"},{"type":"electronic","value":"1613-9658"}],"subject":[],"published":{"date-parts":[[2019,12,3]]},"assertion":[{"value":"3 June 2018","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 November 2019","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 December 2019","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}