{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T18:00:35Z","timestamp":1770746435843,"version":"3.49.0"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2023,5,23]],"date-time":"2023-05-23T00:00:00Z","timestamp":1684800000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Microsoft Trustworthy AI Grant","award":["NSF CAREER CCF-1453261, NSF Large CCF1565235"],"award-info":[{"award-number":["NSF CAREER CCF-1453261, NSF Large CCF1565235"]}]},{"name":"David and Lucile Packard Fellowship and an ONR Young Investigator Award"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. ACM"],"published-print":{"date-parts":[[2023,6,30]]},"abstract":"<jats:p>This work represents a natural coalescence of two important lines of work \u2014 learning mixtures of Gaussians and algorithmic robust statistics. In particular, we give the first provably robust algorithm for learning mixtures of any constant number of Gaussians. We require only mild assumptions on the mixing weights and that the total variation distance between components is bounded away from zero. At the heart of our algorithm is a new method for proving a type of dimension-independent polynomial identifiability \u2014 which we call robust identifiability \u2014 through applying a carefully chosen sequence of differential operations to certain generating functions that not only encode the parameters we would like to learn but also the system of polynomial equations we would like to solve. We show how the symbolic identities we derive can be directly used to analyze a natural sum-of-squares relaxation.<\/jats:p>","DOI":"10.1145\/3583680","type":"journal-article","created":{"date-parts":[[2023,2,15]],"date-time":"2023-02-15T23:10:21Z","timestamp":1676502621000},"page":"1-53","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Robustly Learning General Mixtures of Gaussians"],"prefix":"10.1145","volume":"70","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7987-5755","authenticated-orcid":false,"given":"Allen","family":"Liu","sequence":"first","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7047-0495","authenticated-orcid":false,"given":"Ankur","family":"Moitra","sequence":"additional","affiliation":[{"name":"Massachusetts Institute of Technology, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,5,23]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"crossref","first-page":"458","DOI":"10.1007\/11503415_31","volume-title":"Learning Theory","author":"Achlioptas Dimitris","year":"2005","unstructured":"Dimitris Achlioptas and Frank McSherry. 2005. On spectral learning of mixtures of distributions. In Learning Theory. Springer, 458\u2013469."},{"key":"e_1_3_2_3_2","volume-title":"Neural Network Learning: Theoretical Foundations","author":"Anthony Martin","year":"2009","unstructured":"Martin Anthony and Peter L. Bartlett. 2009. Neural Network Learning: Theoretical Foundations. Cambridge University Press."},{"key":"e_1_3_2_4_2","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1145\/380752.380808","volume-title":"Proceedings of the Thirty-third Annual ACM Symposium on Theory of Computing","author":"Arora Sanjeev","year":"2001","unstructured":"Sanjeev Arora and Ravi Kannan. 2001. Learning mixtures of arbitrary Gaussians. In Proceedings of the Thirty-third Annual ACM Symposium on Theory of Computing. 247\u2013257."},{"key":"e_1_3_2_5_2","article-title":"Robustly learning mixtures of k arbitrary Gaussians","author":"Bakshi Ainesh","year":"2020","unstructured":"Ainesh Bakshi, Ilias Diakonikolas, He Jia, Daniel M. Kane, Pravesh K. Kothari, and Santosh S. Vempala. 2020. Robustly learning mixtures of k arbitrary Gaussians. arXiv preprint arXiv:2012.02119 (2020). This version may be found at https:\/\/arxiv.org\/abs\/2012.02119v2.","journal-title":"arXiv preprint arXiv:2012.02119"},{"key":"e_1_3_2_6_2","article-title":"Robustly learning mixtures of k arbitrary Gaussians","author":"Bakshi Ainesh","year":"2020","unstructured":"Ainesh Bakshi, Ilias Diakonikolas, He Jia, Daniel M. Kane, Pravesh K. Kothari, and Santosh S. Vempala. 2020. Robustly learning mixtures of k arbitrary Gaussians. arXiv preprint arXiv:2012.02119 (2020). This version may be found at https:\/\/arxiv.org\/abs\/2012.02119v3.","journal-title":"arXiv preprint arXiv:2012.02119"},{"key":"e_1_3_2_7_2","article-title":"Outlier-robust clustering of non-spherical mixtures","author":"Bakshi Ainesh","year":"2020","unstructured":"Ainesh Bakshi and Pravesh Kothari. 2020. Outlier-robust clustering of non-spherical mixtures. arXiv preprint arXiv:2005.02970 (2020).","journal-title":"arXiv preprint arXiv:2005.02970"},{"key":"e_1_3_2_8_2","article-title":"Robust linear regression: Optimal rates in polynomial time","author":"Bakshi Ainesh","year":"2020","unstructured":"Ainesh Bakshi and Adarsh Prasad. 2020. Robust linear regression: Optimal rates in polynomial time. arXiv preprint arXiv:2007.01394 (2020).","journal-title":"arXiv preprint arXiv:2007.01394"},{"key":"e_1_3_2_9_2","first-page":"169","volume-title":"Conference on Learning Theory","author":"Balakrishnan Sivaraman","year":"2017","unstructured":"Sivaraman Balakrishnan, Simon S. Du, Jerry Li, and Aarti Singh. 2017. Computationally efficient robust sparse estimation in high dimensions. In Conference on Learning Theory. 169\u2013212."},{"key":"e_1_3_2_10_2","unstructured":"Boaz Barak. [n.d.]. Proofs beliefs and algorithms through the lens of sum-of-squares. ([n. d.])."},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/2591796.2591886"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/2746539.2746605"},{"key":"e_1_3_2_13_2","first-page":"417","volume-title":"Conference on Learning Theory","author":"Barak Boaz","year":"2016","unstructured":"Boaz Barak and Ankur Moitra. 2016. Noisy tensor completion via the sum-of-squares hierarchy. In Conference on Learning Theory. 417\u2013445."},{"key":"e_1_3_2_14_2","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1109\/FOCS.2010.16","volume-title":"Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on","author":"Belkin Mikhail","year":"2010","unstructured":"Mikhail Belkin and Kaushik Sinha. 2010. Polynomial learning of distribution families. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on. IEEE, 103\u2013112."},{"key":"e_1_3_2_15_2","volume-title":"Robust Estimators are Hard to Compute","author":"Bernholt Thorsten","year":"2006","unstructured":"Thorsten Bernholt. 2006. Robust Estimators are Hard to Compute. Technical Report."},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/2591796.2591881"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-85221-6_8"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3055399.3055491"},{"key":"e_1_3_2_19_2","article-title":"Online and distribution-free robustness: Regression and contextual bandits with huber contamination","author":"Chen Sitan","year":"2020","unstructured":"Sitan Chen, Frederic Koehler, Ankur Moitra, and Morris Yau. 2020. Online and distribution-free robustness: Regression and contextual bandits with huber contamination. arXiv preprint arXiv:2010.04157 (2020).","journal-title":"arXiv preprint arXiv:2010.04157"},{"key":"e_1_3_2_20_2","first-page":"634","volume-title":"Foundations of Computer Science, 1999. 40th Annual Symposium on","author":"Dasgupta Sanjoy","year":"1999","unstructured":"Sanjoy Dasgupta. 1999. Learning mixtures of Gaussians. In Foundations of Computer Science, 1999. 40th Annual Symposium on. IEEE, 634\u2013644."},{"key":"e_1_3_2_21_2","article-title":"A two-round variant of EM for Gaussian mixtures","author":"Dasgupta Sanjoy","year":"2013","unstructured":"Sanjoy Dasgupta and Leonard Schulman. 2013. A two-round variant of EM for Gaussian mixtures. arXiv preprint arXiv:1301.3850 (2013).","journal-title":"arXiv preprint arXiv:1301.3850"},{"key":"e_1_3_2_22_2","article-title":"Robustly learning any clusterable mixture of Gaussians","author":"Diakonikolas Ilias","year":"2020","unstructured":"Ilias Diakonikolas, Samuel B. Hopkins, Daniel Kane, and Sushrut Karmalkar. 2020. Robustly learning any clusterable mixture of Gaussians. arXiv preprint arXiv:2005.06417 (2020).","journal-title":"arXiv preprint arXiv:2005.06417"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1137\/17M1126680"},{"key":"e_1_3_2_24_2","first-page":"1596","volume-title":"International Conference on Machine Learning","author":"Diakonikolas Ilias","year":"2019","unstructured":"Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Jacob Steinhardt, and Alistair Stewart. 2019. Sever: A robust meta-algorithm for stochastic optimization. In International Conference on Machine Learning. 1596\u20131606."},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.5555\/3305381.3305485"},{"key":"e_1_3_2_26_2","article-title":"Recent advances in algorithmic high-dimensional robust statistics","author":"Diakonikolas Ilias","year":"2019","unstructured":"Ilias Diakonikolas and Daniel M. Kane. 2019. Recent advances in algorithmic high-dimensional robust statistics. arXiv preprint arXiv:1911.05911 (2019).","journal-title":"arXiv preprint arXiv:1911.05911"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/2746539.2746616"},{"key":"e_1_3_2_28_2","volume-title":"Robust Statistics: The Approach based on Influence Functions","author":"Hampel Frank R.","year":"2011","unstructured":"Frank R. Hampel, Elvezio M. Ronchetti, Peter J. Rousseeuw, and Werner A. Stahel. 2011. Robust Statistics: The Approach based on Influence Functions. Vol. 196. John Wiley & Sons."},{"key":"e_1_3_2_29_2","first-page":"354","volume-title":"Conference on Learning Theory","author":"Hardt Moritz","year":"2013","unstructured":"Moritz Hardt and Ankur Moitra. 2013. Algorithms and hardness for robust subspace recovery. In Conference on Learning Theory. 354\u2013375."},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.1404.4997"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/FOCS.2017.72"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3188745.3188748"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/2897518.2897529"},{"key":"e_1_3_2_34_2","first-page":"956","volume-title":"Conference on Learning Theory","author":"Hopkins Samuel B.","year":"2015","unstructured":"Samuel B. Hopkins, Jonathan Shi, and David Steurer. 2015. Tensor principal component analysis via sum-of-square proofs. In Conference on Learning Theory. 956\u20131006."},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/2422436.2422439"},{"key":"e_1_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177703732"},{"key":"e_1_3_2_37_2","volume-title":"Robust Statistics","author":"Huber Peter J.","year":"2004","unstructured":"Peter J. Huber. 2004. Robust Statistics. Vol. 523. John Wiley & Sons."},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1016\/0304-3975(78)90006-3"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/1806689.1806765"},{"key":"e_1_3_2_40_2","article-title":"Robust learning of mixtures of Gaussians","author":"Kane Daniel M.","year":"2020","unstructured":"Daniel M. Kane. 2020. Robust learning of mixtures of Gaussians. arXiv preprint arXiv:2007.05912 (2020).","journal-title":"arXiv preprint arXiv:2007.05912"},{"key":"e_1_3_2_41_2","first-page":"1420","volume-title":"Conference on Learning Theory","author":"Klivans Adam","year":"2018","unstructured":"Adam Klivans, Pravesh K. Kothari, and Raghu Meka. 2018. Efficient algorithms for outlier-robust regression. In Conference on Learning Theory. 1420\u20131430."},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3188745.3188970"},{"key":"e_1_3_2_43_2","doi-asserted-by":"crossref","first-page":"299","DOI":"10.1109\/FOCS.2010.35","volume-title":"Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on","author":"Kumar Amit","year":"2010","unstructured":"Amit Kumar and Ravindran Kannan. 2010. Clustering with spectral norm and the k-means algorithm. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on. IEEE, 299\u2013308."},{"key":"e_1_3_2_44_2","first-page":"665","volume-title":"2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS\u201916)","author":"Lai Kevin A.","year":"2016","unstructured":"Kevin A. Lai, Anup B. Rao, and Santosh Vempala. 2016. Agnostic estimation of mean and covariance. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS\u201916). IEEE, 665\u2013674."},{"key":"e_1_3_2_45_2","volume-title":"Principled Approaches to Robust Machine Learning and Beyond","author":"Li Jerry Zheng","year":"2018","unstructured":"Jerry Zheng Li. 2018. Principled Approaches to Robust Machine Learning and Beyond. Ph.D. Dissertation. Massachusetts Institute of Technology."},{"key":"e_1_3_2_46_2","article-title":"Learning GMMs with nearly optimal robustness guarantees","author":"Liu Allen","year":"2021","unstructured":"Allen Liu and Ankur Moitra. 2021. Learning GMMs with nearly optimal robustness guarantees. arXiv preprint arXiv:2104.09665 (2021).","journal-title":"arXiv preprint arXiv:2104.09665"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1017\/9781316882177"},{"key":"e_1_3_2_48_2","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1109\/FOCS.2010.15","volume-title":"Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on","author":"Moitra Ankur","year":"2010","unstructured":"Ankur Moitra and Gregory Valiant. 2010. Settling the polynomial learnability of mixtures of Gaussians. In Foundations of Computer Science (FOCS), 2010 51st Annual IEEE Symposium on. IEEE, 93\u2013102."},{"key":"e_1_3_2_49_2","volume-title":"Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization","author":"Parrilo Pablo A.","year":"2000","unstructured":"Pablo A. Parrilo. 2000. Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. Ph.D. Dissertation. California Institute of Technology."},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1098\/rsta.1894.0003"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.5555\/AAI28115249"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1214\/aoms\/1177705155"},{"key":"e_1_3_2_53_2","first-page":"448","article-title":"A survey of sampling from contaminated distributions","author":"Tukey John W.","year":"1960","unstructured":"John W. Tukey. 1960. A survey of sampling from contaminated distributions. Contributions to Probability and Statistics (1960), 448\u2013485.","journal-title":"Contributions to Probability and Statistics"},{"key":"e_1_3_2_54_2","first-page":"523","volume-title":"Proceedings of the International Congress of Mathematicians, Vancouver, 1975","volume":"2","author":"Tukey John W.","year":"1975","unstructured":"John W. Tukey. 1975. Mathematics and the picturing of data. In Proceedings of the International Congress of Mathematicians, Vancouver, 1975, Vol. 2. 523\u2013531."},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-21852-6_3"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2003.11.008"}],"container-title":["Journal of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3583680","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3583680","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:46:27Z","timestamp":1750178787000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3583680"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,23]]},"references-count":55,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,6,30]]}},"alternative-id":["10.1145\/3583680"],"URL":"https:\/\/doi.org\/10.1145\/3583680","relation":{},"ISSN":["0004-5411","1557-735X"],"issn-type":[{"value":"0004-5411","type":"print"},{"value":"1557-735X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,23]]},"assertion":[{"value":"2021-07-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-01-19","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-05-23","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}