{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,3]],"date-time":"2026-05-03T03:02:45Z","timestamp":1777777365065,"version":"3.51.4"},"reference-count":65,"publisher":"Emerald","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2011,4,15]]},"abstract":"<jats:p>This introduction to the expectation\u2013maximization (EM) algorithm provides an intuitive and mathematically rigorous understanding of EM. Two of the most popular applications of EM are described in detail: estimating Gaussian mixture models (GMMs), and estimating hidden Markov models (HMMs). EM solutions are also derived for learning an optimal mixture of fixed models, for estimating the parameters of a compound Dirichlet distribution, and for dis-entangling superimposed signals. Practical issues that arise in the use of EM are discussed, as well as variants of the algorithm that help deal with these challenges.<\/jats:p>","DOI":"10.1561\/2000000034","type":"journal-article","created":{"date-parts":[[2011,4,15]],"date-time":"2011-04-15T12:43:46Z","timestamp":1302871426000},"page":"223-296","source":"Crossref","is-referenced-by-count":226,"title":["Theory and Use of the EM Algorithm"],"prefix":"10.1108","volume":"4","author":[{"given":"Maya R.","family":"Gupta","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering, University of Washington , Seattle, WA 98195,","place":["USA"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yihua","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering, University of Washington , Seattle, WA 98195,","place":["USA"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"140","published-online":{"date-parts":[[2011,4,15]]},"reference":[{"issue":"4","key":"2026040901215894900_ref001","doi-asserted-by":"crossref","first-page":"635","DOI":"10.1007\/s10898-004-9972-2","article-title":"\u201cA numerical evaluation of several stochastic algorithms on selected continuous global optimization test problems,\u201d","volume":"31","author":"Ali","year":"2005","journal-title":"Journal of Global Optimization,"},{"key":"2026040901215894900_ref002","first-page":"51","article-title":"\u201cUnsupervised learning of multiple motifs in biopolymers using expectation maximization,\u201d","volume-title":"Machine Learning,","author":"Bailey","year":"1995"},{"key":"2026040901215894900_ref003","first-page":"2664","article-title":"\u201cOn the optimality of conditional expectation as a Bregman predictor,\u201d","volume-title":"IEEE Transactions on Information Theory","author":"Banerjee","year":"2005"},{"key":"2026040901215894900_ref004","first-page":"164","article-title":"\u201cA maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains,\u201d","volume-title":"The Annals of Mathematical Statistics,","author":"Baum","year":"1970"},{"key":"2026040901215894900_ref005","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511804441","volume-title":"Convex Optimization.","author":"Boyd","year":"2004"},{"issue":"1","key":"2026040901215894900_ref006","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1111\/j.2517-6161.1983.tb01229.x","article-title":"\u201cOn the convergence of the EM algorithm,\u201d","volume":"45","author":"Boyles","year":"1983","journal-title":"Journal of the Royal Statistical Society, Series B (Methodological),"},{"issue":"2","key":"2026040901215894900_ref007","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1093\/biomet\/65.2.273","article-title":"\u201cAsymptotic behavior of classification maximum likelihood estimates,\u201d","volume":"65","author":"Bryant","year":"1978","journal-title":"Biometrika,"},{"key":"2026040901215894900_ref008","first-page":"73","article-title":"\u201cThe SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem,\u201d","volume":"2","author":"Celeux","year":"1985","journal-title":"Computational Statistics Quaterly,"},{"key":"2026040901215894900_ref009","doi-asserted-by":"crossref","first-page":"315","DOI":"10.1016\/0167-9473(92)90042-E","article-title":"\u201cA classification EM algorithm for clustering and two stochastic versions,\u201d","volume":"14","author":"Celeux","year":"1992","journal-title":"Computational Statistics Data Analysis,"},{"key":"2026040901215894900_ref010","article-title":"\u201cProbabilistic modeling of traffic lanes from GPS traces,\u201d","author":"Chen","year":"2010","journal-title":"Proceedings of 18th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems,"},{"issue":"1","key":"2026040901215894900_ref011","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1111\/j.2517-6161.1977.tb01600.x","article-title":"\u201cMaximum likelihood from incomplete data via the EM algorithm,\u201d","volume":"39","author":"Dempster","year":"1977","journal-title":"Journal of the Royal Statistical Society, Series B (Methodological),"},{"issue":"4","key":"2026040901215894900_ref012","doi-asserted-by":"crossref","first-page":"477","DOI":"10.1109\/29.1552","article-title":"\u201cParameter estimation of superimposed signals using the EM algorithm,\u201d","volume":"36","author":"Feder","year":"1988","journal-title":"IEEE Transactions on Acoustics, Speech and Signal Processing,"},{"key":"2026040901215894900_ref013","volume-title":"Real Analysis: Modern Techniques and Their Applications.","author":"Folland","year":"1999"},{"key":"2026040901215894900_ref014","author":"Frigyik","year":"2010"},{"issue":"11","key":"2026040901215894900_ref015","doi-asserted-by":"crossref","first-page":"5130","DOI":"10.1109\/TIT.2008.929943","article-title":"\u201cFunctional Bregman divergence and Bayesian estimation of distributions,\u201d","volume":"54","author":"Frigyik","year":"2008","journal-title":"IEEE Transactions on Information Theory,"},{"issue":"3","key":"2026040901215894900_ref016","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1561\/2000000004","article-title":"\u201cThe application of hidden Markov models in speech recognition,\u201d","volume":"1","author":"Gales","year":"2008","journal-title":"Foundations and Trends in Signal Processing"},{"key":"2026040901215894900_ref017","volume-title":"Vector Quantization and Signal Compression.","author":"Gersho","year":"1991"},{"key":"2026040901215894900_ref018","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511841224","volume-title":"Wireless Communications.","author":"Goldsmith","year":"2005"},{"issue":"4","key":"2026040901215894900_ref019","doi-asserted-by":"crossref","first-page":"424","DOI":"10.1109\/34.277597","article-title":"\u201cOn a parameter estimation method for Gibbs- Markov random fields,\u201d","volume":"16","author":"Gurelli","year":"1994","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence,"},{"issue":"2","key":"2026040901215894900_ref020","doi-asserted-by":"crossref","first-page":"174","DOI":"10.2307\/2527783","article-title":"\u201cMaximum likelihood estimation from incomplete data,\u201d","volume":"14","author":"Hartley","year":"1958","journal-title":"Biometrics,"},{"key":"2026040901215894900_ref021","doi-asserted-by":"crossref","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The Elements of Statistical Learning: Data Mining, Inference, and Prediction","author":"Hastie","year":"2009"},{"issue":"2","key":"2026040901215894900_ref022","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1109\/JSAC.2004.839380","article-title":"\u201cCognitive radio: Brain-empowered wireless communications,\u201d","volume":"23","author":"Haykin","year":"2005","journal-title":"IEEE Journal on Selected Areas in Communications,"},{"key":"2026040901215894900_ref023","doi-asserted-by":"crossref","first-page":"3013","DOI":"10.1109\/CEC.2006.1688689","article-title":"\u201cA multiresolutional estimated gradient architecture for global optimization,\u201d","author":"Hazen","year":"2006","journal-title":"Proceedings of the IEEE Congress on Evolu-tionary Computation,"},{"key":"2026040901215894900_ref024","first-page":"1841","article-title":"\u201cGradient estimation in global optimization algo-rithms,\u201d","author":"Hazen","year":"2009","journal-title":"Proceedings of the IEEE Congress on Evolutionary Computation"},{"issue":"2","key":"2026040901215894900_ref025","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1109\/42.24868","article-title":"\u201cA generalized EM algorithm for 3-D Bayesian recon-struction from Poisson data using Gibbs priors,\u201d","volume":"8","author":"Hebert","year":"1989","journal-title":"IEEE Transactions on Medical Imaging,"},{"issue":"8","key":"2026040901215894900_ref026","doi-asserted-by":"crossref","first-page":"1084","DOI":"10.1109\/83.403415","article-title":"\u201cExpectation-maximization algorithms, null spaces, and MAP image restoration,\u201d","volume":"4","author":"Hebert","year":"1995","journal-title":"IEEE Transactions on Image Processing,"},{"issue":"3","key":"2026040901215894900_ref027","doi-asserted-by":"crossref","first-page":"569","DOI":"10.1111\/1467-9868.00083","article-title":"\u201cAcceleration of the EM algorithm by using quasi-Newton methods,\u201d","volume":"59","author":"Jamshidian","year":"1997","journal-title":"Journal of the Royal Statistical Society, Series B (Methodological),"},{"issue":"4","key":"2026040901215894900_ref028","doi-asserted-by":"crossref","first-page":"345","DOI":"10.1023\/A:1012771025575","article-title":"\u201cA taxonomy of global optimization methods based on response surfaces,\u201d","volume":"21","author":"Jones","year":"2001","journal-title":"Journal of Global Optimization,"},{"issue":"4","key":"2026040901215894900_ref029","doi-asserted-by":"crossref","first-page":"439","DOI":"10.1109\/42.61759","article-title":"\u201cConvergence of EM image reconstruction algorithms with Gibbs smoothing,\u201d","volume":"9","author":"Lange","year":"1990","journal-title":"IEEE Transactions on Medical Imaging,"},{"issue":"2","key":"2026040901215894900_ref030","doi-asserted-by":"crossref","first-page":"425","DOI":"10.1111\/j.2517-6161.1995.tb02037.x","article-title":"\u201cA gradient algorithm locally equivalent to the EM algorithm,\u201d","volume":"57","author":"Lange","year":"1995","journal-title":"Journal of the Royal Statistical Society, Series B (Methodological"},{"key":"2026040901215894900_ref031","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4615-4497-5","volume-title":"Image Segmentation and Compression Using Hidden Markov Models.","author":"Li","year":"2000"},{"issue":"2","key":"2026040901215894900_ref032","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1109\/TIT.1982.1056489","article-title":"\u201cLeast squares quantization in PCM,\u201d","volume":"28","author":"Lloyd","year":"1982","journal-title":"IEEE Transactions on Information Theory,"},{"issue":"6","key":"2026040901215894900_ref033","doi-asserted-by":"crossref","first-page":"745","DOI":"10.1086\/111605","article-title":"\u201cAn iterative technique for the rectification of observed distribu-tions,\u201d","volume":"79","author":"Lucy","year":"1974","journal-title":"Astronomical Journal,"},{"key":"2026040901215894900_ref034","volume-title":"Information Theory, Inference, and Learning Algorithms.","author":"MacKay","year":"2003"},{"key":"2026040901215894900_ref035","first-page":"281","article-title":"\u201cSome methods for classification and analysis of multivariate observations,\u201d","author":"MacQueen","year":"1967","journal-title":"Proceedings of the fifth Berkeley Symposium on Mathematical Statistics and Probability"},{"key":"2026040901215894900_ref036","doi-asserted-by":"crossref","DOI":"10.1002\/9780470191613","volume-title":"The EM Algorithm and Extensions","author":"McLachlan","year":"2008"},{"key":"2026040901215894900_ref037","doi-asserted-by":"crossref","DOI":"10.1002\/0471721182","volume-title":"Finite Mixture Models","author":"McLachlan","year":"2000"},{"issue":"3","key":"2026040901215894900_ref038","doi-asserted-by":"crossref","first-page":"204","DOI":"10.1109\/TEVC.2004.826074","article-title":"\u201cThe fully informed particle swarm: sim-pler, maybe better,\u201d","volume":"8","author":"Mendes","journal-title":"IEEE Transactions on Evolutionary Computation,"},{"key":"2026040901215894900_ref039","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1016\/0024-3795(94)90363-8","article-title":"\u201cOn the global and componentwise rates of con-vergence of the EM algorithm,\u201d","volume":"199","author":"Meng","year":"1994","journal-title":"Linear Algebra and its Applications,"},{"key":"2026040901215894900_ref040","doi-asserted-by":"crossref","first-page":"511","DOI":"10.1111\/1467-9868.00082","article-title":"\u201cThe EM algorithm \u2014 an old folk-song sung to a fast new tune,\u201d","volume":"59","author":"Meng","year":"1997","journal-title":"Journal of the Royal Statistical Society, Series B (Methodological),"},{"key":"2026040901215894900_ref041","volume-title":"Learning in Graphical Models,","author":"Neal","year":"1998"},{"key":"2026040901215894900_ref042","first-page":"610","article-title":"\u201cAn EM technique for multiple transmitter localization,\u201d","author":"Nelson","year":"2007","journal-title":"Proceedings of the 41st Annual Conference on Information Sciences and Systems,"},{"issue":"5","key":"2026040901215894900_ref043","doi-asserted-by":"crossref","first-page":"354","DOI":"10.1109\/LSP.2009.2016003","article-title":"\u201cA quasi EM method for estimating multiple transmitter locations,\u201d","volume":"16","author":"Nelson","year":"2009","journal-title":"IEEE Signal Processing Letters,"},{"issue":"4","key":"2026040901215894900_ref044","doi-asserted-by":"crossref","first-page":"343","DOI":"10.2307\/2369392","article-title":"\u201cA generalized theory of the combination of observations so as to obtain the best result,\u201d","volume":"8","author":"Newcomb","journal-title":"American Journal of Mathematics,"},{"key":"2026040901215894900_ref045","volume-title":"Numerical Optimization.","author":"Nocedal","year":"2006"},{"key":"2026040901215894900_ref046","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-5362-2","volume-title":"Handbook of Global Optimization.","author":"Pardalos","year":"2002"},{"key":"2026040901215894900_ref047","author":"Petersen","year":"2008","journal-title":"The Matrix Cookbook."},{"key":"2026040901215894900_ref048","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1080\/00949659208811365","article-title":"\u201cStochastic relaxations and EM algorithms for Markov random fields,\u201d","volume":"40","author":"Qian","year":"1992","journal-title":"Journal of Statistical Computation and Simulation,"},{"issue":"2","key":"2026040901215894900_ref049","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1109\/5.18626","article-title":"\u201cA tutorial on hidden Markov models and selected applications in speech recognition,\u201d","volume":"77","author":"Rabiner","year":"1989","journal-title":"Proceedings of the IEEE,"},{"issue":"2","key":"2026040901215894900_ref050","doi-asserted-by":"crossref","first-page":"195","DOI":"10.1137\/1026034","article-title":"\u201cMixture densities, maximum likelihood and the EM algorithm,\u201d","volume":"26","author":"Redner","year":"1984","journal-title":"SIAM Review,"},{"issue":"1","key":"2026040901215894900_ref051","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1364\/JOSA.62.000055","article-title":"\u201cBayesian-based iterative method of image restoration,\u201d","volume":"62","author":"Richardson","journal-title":"Journal of Optical Society of America,"},{"key":"2026040901215894900_ref052","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4757-4145-2","volume-title":"Monte Carlo Statistical Methods.","author":"Robert","year":"2004"},{"key":"2026040901215894900_ref053","unstructured":"A.\n              Roche\n            \n          , \u201cEM algorithm and variants: An informal tutorial,\u201d Unpublished (available online at ftp:\/\/ftp.cea.fr\/pub\/dsv\/madic\/publis\/Roche_em. pdf), 2003."},{"issue":"4","key":"2026040901215894900_ref054","doi-asserted-by":"crossref","first-page":"215","DOI":"10.1080\/00949658908811178","article-title":"\u201cMaximum Likelihood estimation of Dirichlet distributions,\u201d","volume":"32","author":"Ronning","year":"1989","journal-title":"Journal of Statistical Computation and Simulation,"},{"key":"2026040901215894900_ref055","volume-title":"Vector Space Projections: A Numerical Approach to Signal and Image Processing, Neural Nets, and Optics","author":"Stark","year":"1998"},{"issue":"463","key":"2026040901215894900_ref056","doi-asserted-by":"crossref","first-page":"750","DOI":"10.1198\/016214503000000666","article-title":"\u201cFinding the number of clusters in a dataset: An information-theoretic approach,\u201d","volume":"98","author":"Sugar","year":"2003","journal-title":"Journal of the American Statistical Association"},{"issue":"398","key":"2026040901215894900_ref057","doi-asserted-by":"crossref","first-page":"528","DOI":"10.1080\/01621459.1987.10478458","article-title":"\u201cThe calculation of posterior distributions by data augmentation,\u201d","volume":"82","author":"Tanner","year":"1987","journal-title":"Journal of the American Statistical Association,"},{"issue":"1","key":"2026040901215894900_ref058","first-page":"1","article-title":"\u201cThe art of data augmentation,\u201d","volume":"10","author":"van","year":"2001","journal-title":"Journal of Computational and Graphical Statistics,"},{"issue":"10","key":"2026040901215894900_ref059","doi-asserted-by":"crossref","first-page":"3884","DOI":"10.1109\/TSP.2006.880209","article-title":"\u201cMaximum likelihood estimation of compound-Gaussian clutter and target parameters,\u201d","volume":"54","author":"Wang","year":"2006","journal-title":"IEEE Transactions on Signal Processing"},{"issue":"411","key":"2026040901215894900_ref060","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1080\/01621459.1990.10474930","article-title":"\u201cA Monte Carlo implementation of the EM algorithm and the poor man\u2019s data augmentation algorithms,\u201d","volume":"85","author":"Wei","year":"1990","journal-title":"Journal of the American Statistical Association,"},{"issue":"4","key":"2026040901215894900_ref061","first-page":"1","article-title":"\u201cHidden Markov Models and the Baum-Welch Algorithm,\u201d","volume":"53","author":"Welch","year":"2003","journal-title":"IEEE Information Theory Society Newsletter,"},{"issue":"1","key":"2026040901215894900_ref062","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1214\/aos\/1176346060","article-title":"\u201cOn the convergence properties of the EM algorithm,\u201d","volume":"11","author":"Wu","year":"1983","journal-title":"The Annals of Statistics,"},{"issue":"1","key":"2026040901215894900_ref063","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1162\/neco.1996.8.1.129","article-title":"\u201cOn convergence properties of the EM algorithm for Gaussian mixtures,\u201d","volume":"8","author":"Xu","year":"1996","journal-title":"Neural Computation,"},{"key":"2026040901215894900_ref064","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4419-8608-5","volume-title":"A First Course in Information Theory.","author":"Yeung","year":"2002"},{"issue":"10","key":"2026040901215894900_ref065","doi-asserted-by":"crossref","first-page":"2570","DOI":"10.1109\/78.157297","article-title":"\u201cThe mean field theory in EM procedures for Markov random fields,\u201d","volume":"40","author":"Zhang","year":"1992","journal-title":"IEEE Transactions on Signal Processing,"}],"container-title":["Foundations and Trends\u00ae in Signal Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/ftsig\/article-pdf\/4\/3\/223\/11095552\/2000000034en.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/www.emerald.com\/ftsig\/article-pdf\/4\/3\/223\/11095552\/2000000034en.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T18:55:20Z","timestamp":1777488920000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.emerald.com\/ftsig\/article\/4\/3\/223\/1330198\/Theory-and-Use-of-the-EM-Algorithm"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2011,4,15]]},"references-count":65,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2011,4,15]]}},"URL":"https:\/\/doi.org\/10.1561\/2000000034","relation":{},"ISSN":["1932-8346","1932-8354"],"issn-type":[{"value":"1932-8346","type":"print"},{"value":"1932-8354","type":"electronic"}],"subject":[],"published":{"date-parts":[[2011,4,15]]}}}