{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,29]],"date-time":"2026-05-29T11:28:32Z","timestamp":1780054112151,"version":"3.54.0"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,3,21]],"date-time":"2023-03-21T00:00:00Z","timestamp":1679356800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"NSF","award":["CCF-2006489"],"award-info":[{"award-number":["CCF-2006489"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. ACM"],"published-print":{"date-parts":[[2023,4,30]]},"abstract":"<jats:p>\n            Classically, data interpolation with a parametrized model class is possible as long as the number of parameters is larger than the number of equations to be satisfied. A puzzling phenomenon in deep learning is that models are trained with many more parameters than what this classical theory would suggest. We propose a partial theoretical explanation for this phenomenon. We prove that for a broad class of data distributions and model classes, overparametrization is\n            <jats:italic>necessary<\/jats:italic>\n            if one wants to interpolate the data\n            <jats:italic>smoothly<\/jats:italic>\n            . Namely we show that\n            <jats:italic>smooth<\/jats:italic>\n            interpolation requires\n            <jats:italic>d<\/jats:italic>\n            times more parameters than mere interpolation, where\n            <jats:italic>d<\/jats:italic>\n            is the ambient data dimension. We prove this universal law of robustness for any smoothly parametrized function class with polynomial size weights, and any covariate distribution verifying isoperimetry (or a mixture thereof). In the case of two-layer neural networks and Gaussian covariates, this law was conjectured in prior work by Bubeck, Li, and Nagaraj. We also give an interpretation of our result as an improved generalization bound for model classes consisting of smooth functions.\n          <\/jats:p>","DOI":"10.1145\/3578580","type":"journal-article","created":{"date-parts":[[2022,12,29]],"date-time":"2022-12-29T12:03:44Z","timestamp":1672315424000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":35,"title":["A Universal Law of Robustness via Isoperimetry"],"prefix":"10.1145","volume":"70","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9122-4410","authenticated-orcid":false,"given":"S\u00e9bastien","family":"Bubeck","sequence":"first","affiliation":[{"name":"Microsoft Research, Redmond, WA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9166-8185","authenticated-orcid":false,"given":"Mark","family":"Sellke","sequence":"additional","affiliation":[{"name":"Stanford University, Stanford, CA, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2023,3,21]]},"reference":[{"key":"e_1_3_4_2_2","first-page":"1120","volume-title":"International Conference on Machine Learning","author":"Arjovsky Martin","year":"2016","unstructured":"Martin Arjovsky, Amar Shah, and Yoshua Bengio. 2016. Unitary evolution recurrent neural networks. In International Conference on Machine Learning. PMLR, 1120\u20131128."},{"key":"e_1_3_4_3_2","volume-title":"Advances in Neural Information Processing Systems","author":"Bartlett Peter L.","year":"2017","unstructured":"Peter L. Bartlett, Dylan J. Foster, and Matus J. Telgarsky. 2017. Spectrally-normalized margin bounds for neural networks. In Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), Vol. 30. Curran Associates, Inc.https:\/\/proceedings.neurips.cc\/paper\/2017\/file\/b22b257ad0519d4500539da3c8bcf4dd-Paper.pdf."},{"issue":"48","key":"e_1_3_4_4_2","doi-asserted-by":"crossref","first-page":"30063","DOI":"10.1073\/pnas.1907378117","article-title":"Benign overfitting in linear regression","volume":"117","author":"Bartlett Peter L.","year":"2020","unstructured":"Peter L. Bartlett, Philip M. Long, G\u00e1bor Lugosi, and Alexander Tsigler. 2020. Benign overfitting in linear regression. Proceedings of the National Academy of Sciences 117, 48 (2020), 30063\u201330070.","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"32","key":"e_1_3_4_5_2","doi-asserted-by":"crossref","first-page":"15849","DOI":"10.1073\/pnas.1903070116","article-title":"Reconciling modern machine-learning practice and the classical bias\u2013variance trade-off","volume":"116","author":"Belkin Mikhail","year":"2019","unstructured":"Mikhail Belkin, Daniel Hsu, Siyuan Ma, and Soumik Mandal. 2019. Reconciling modern machine-learning practice and the classical bias\u2013variance trade-off. Proceedings of the National Academy of Sciences 116, 32 (2019), 15849\u201315854.","journal-title":"Proceedings of the National Academy of Sciences"},{"issue":"3","key":"e_1_3_4_6_2","doi-asserted-by":"crossref","first-page":"383","DOI":"10.1007\/s004400050090","article-title":"Poincar\u00e9\u2019s inequalities and Talagrand\u2019s concentration phenomenon for the exponential distribution","volume":"107","author":"Bobkov Sergey","year":"1997","unstructured":"Sergey Bobkov and Michel Ledoux. 1997. Poincar\u00e9\u2019s inequalities and Talagrand\u2019s concentration phenomenon for the exponential distribution. Probability Theory and Related Fields 107, 3 (1997), 383\u2013400.","journal-title":"Probability Theory and Related Fields"},{"issue":"5","key":"e_1_3_4_7_2","doi-asserted-by":"crossref","first-page":"1028","DOI":"10.1007\/PL00001645","article-title":"From Brunn-Minkowski to Brascamp-Lieb and to logarithmic Sobolev inequalities","volume":"10","author":"Bobkov Sergey G.","year":"2000","unstructured":"Sergey G. Bobkov and Michel Ledoux. 2000. From Brunn-Minkowski to Brascamp-Lieb and to logarithmic Sobolev inequalities. Geometric & Functional Analysis GAFA 10, 5 (2000), 1028\u20131052.","journal-title":"Geometric & Functional Analysis GAFA"},{"key":"e_1_3_4_8_2","first-page":"4977","volume-title":"Advances in Neural Information Processing Systems","author":"Bubeck Sebastien","year":"2020","unstructured":"Sebastien Bubeck, Ronen Eldan, Yin Tat Lee, and Dan Mikulincer. 2020. Network size and size of the weights in memorization with two-layers neural networks. In Advances in Neural Information Processing Systems, Vol. 33. 4977\u20134986."},{"key":"e_1_3_4_9_2","first-page":"804","volume-title":"Conference on Learning Theory","author":"Bubeck S\u00e9bastien","year":"2021","unstructured":"S\u00e9bastien Bubeck, Yuanzhi Li, and Dheeraj M. Nagaraj. 2021. A law of robustness for two-layers neural networks. In Conference on Learning Theory. PMLR, 804\u2013820."},{"key":"e_1_3_4_10_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jfa.2021.109236"},{"key":"e_1_3_4_11_2","first-page":"854","volume-title":"International Conference on Machine Learning","author":"Cisse Moustapha","year":"2017","unstructured":"Moustapha Cisse, Piotr Bojanowski, Edouard Grave, Yann Dauphin, and Nicolas Usunier. 2017. Parseval networks: Improving robustness to adversarial examples. In International Conference on Machine Learning. PMLR, 854\u2013863."},{"issue":"1","key":"e_1_3_4_12_2","first-page":"1","article-title":"Estimating the intrinsic dimension of datasets by a minimal neighborhood information","volume":"7","author":"Facco Elena","year":"2017","unstructured":"Elena Facco, Maria d\u2019Errico, Alex Rodriguez, and Alessandro Laio. 2017. Estimating the intrinsic dimension of datasets by a minimal neighborhood information. Scientific Reports 7, 1 (2017), 1\u20138.","journal-title":"Scientific Reports"},{"issue":"4","key":"e_1_3_4_13_2","doi-asserted-by":"crossref","first-page":"983","DOI":"10.1090\/jams\/852","article-title":"Testing the manifold hypothesis","volume":"29","author":"Fefferman Charles","year":"2016","unstructured":"Charles Fefferman, Sanjoy Mitter, and Hariharan Narayanan. 2016. Testing the manifold hypothesis. Journal of the American Mathematical Society 29, 4 (2016), 983\u20131049.","journal-title":"Journal of the American Mathematical Society"},{"key":"e_1_3_4_14_2","first-page":"13029","article-title":"Convergence of adversarial training in overparametrized neural networks","volume":"32","author":"Gao Ruiqi","year":"2019","unstructured":"Ruiqi Gao, Tianle Cai, Haochuan Li, Cho-Jui Hsieh, Liwei Wang, and Jason D. Lee. 2019. Convergence of adversarial training in overparametrized neural networks. Advances in Neural Information Processing Systems 32 (2019), 13029\u201313040.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_4_15_2","article-title":"Adversarial spheres","author":"Gilmer Justin","year":"2018","unstructured":"Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S. Schoenholz, Maithra Raghu, Martin Wattenberg, and Ian Goodfellow. 2018. Adversarial spheres. arXiv preprint arXiv:1801.02774 (2018).","journal-title":"arXiv preprint arXiv:1801.02774"},{"key":"e_1_3_4_16_2","article-title":"Uncovering the limits of adversarial training against norm-bounded adversarial examples","author":"Gowal Sven","year":"2020","unstructured":"Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy Mann, and Pushmeet Kohli. 2020. Uncovering the limits of adversarial training against norm-bounded adversarial examples. arXiv preprint arXiv:2010.03593 (2020).","journal-title":"arXiv preprint arXiv:2010.03593"},{"key":"e_1_3_4_17_2","first-page":"114","volume-title":"Asymptotic Theory of Finite Dimensional Spaces","author":"Gromov Mikhael","year":"1986","unstructured":"Mikhael Gromov. 1986. Isoperimetric inequalities in Riemannian manifolds. In Asymptotic Theory of Finite Dimensional Spaces. Vol. 1200. Springer Berlin, 114\u2013129."},{"issue":"406","key":"e_1_3_4_18_2","doi-asserted-by":"crossref","first-page":"502","DOI":"10.1080\/01621459.1989.10478797","article-title":"Principal curves","volume":"84","author":"Hastie Trevor","year":"1989","unstructured":"Trevor Hastie and Werner Stuetzle. 1989. Principal curves. J. Amer. Statist. Assoc. 84, 406 (1989), 502\u2013516.","journal-title":"J. Amer. Statist. Assoc."},{"key":"e_1_3_4_19_2","article-title":"Distilling the knowledge in a neural network","author":"Hinton Geoffrey","year":"2015","unstructured":"Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2015. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015).","journal-title":"arXiv preprint arXiv:1503.02531"},{"key":"e_1_3_4_20_2","article-title":"On computation and generalization of GANs with spectrum control","author":"Jiang Haoming","year":"2019","unstructured":"Haoming Jiang, Zhehui Chen, Minshuo Chen, Feng Liu, Dingding Wang, and Tuo Zhao. 2019. On computation and generalization of GANs with spectrum control. Proc. of International Conference on Learning Representation (ICLR\u201919) (2019).","journal-title":"Proc. of International Conference on Learning Representation (ICLR\u201919)"},{"key":"e_1_3_4_21_2","doi-asserted-by":"crossref","first-page":"1213","DOI":"10.1109\/ICNN.1993.298730","volume-title":"IEEE International Conference on Neural Networks","author":"Kambhatla Nanda","year":"1993","unstructured":"Nanda Kambhatla and Todd K. Leen. 1993. Fast nonlinear dimension reduction. In IEEE International Conference on Neural Networks. IEEE, 1213\u20131218."},{"key":"e_1_3_4_22_2","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1007\/BFb0096511","volume-title":"Seminaire de probabilites XXXIII","author":"Ledoux Michel","year":"1999","unstructured":"Michel Ledoux. 1999. Concentration of measure and logarithmic Sobolev inequalities. In Seminaire de probabilites XXXIII. Springer, 120\u2013216."},{"key":"e_1_3_4_23_2","volume-title":"Mathematical Surveys and Monographs","author":"Ledoux M.","year":"2001","unstructured":"M. Ledoux. 2001. The concentration of measure phenomenon. In Mathematical Surveys and Monographs. Vol. 89. American Mathematical Society, Providence, RI."},{"key":"e_1_3_4_24_2","article-title":"Towards deep learning models resistant to adversarial attacks","author":"Madry Aleksander","year":"2018","unstructured":"Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. 2018. Towards deep learning models resistant to adversarial attacks. Proc. of International Conference on Learning Representation (ICLR) (2018).","journal-title":"Proc. of International Conference on Learning Representation (ICLR)"},{"key":"e_1_3_4_25_2","article-title":"The generalization error of random features regression: Precise asymptotics and the double descent curve","author":"Mei Song","year":"2019","unstructured":"Song Mei and Andrea Montanari. 2019. The generalization error of random features regression: Precise asymptotics and the double descent curve. Communications on Pure and Applied Mathematics (2019).","journal-title":"Communications on Pure and Applied Mathematics"},{"key":"e_1_3_4_26_2","first-page":"2401","volume-title":"International Conference on Machine Learning","author":"Mhammedi Zakaria","year":"2017","unstructured":"Zakaria Mhammedi, Andrew Hellicar, Ashfaqur Rahman, and James Bailey. 2017. Efficient orthogonal parametrisation of recurrent neural networks using householder reflections. In International Conference on Machine Learning. PMLR, 2401\u20132409."},{"key":"e_1_3_4_27_2","article-title":"Spectral normalization for generative adversarial networks","author":"Miyato Takeru","year":"2018","unstructured":"Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. 2018. Spectral normalization for generative adversarial networks. Proc. of International Conference on Learning Representation (ICLR) (2018).","journal-title":"Proc. of International Conference on Learning Representation (ICLR)"},{"key":"e_1_3_4_28_2","volume-title":"Foundations of Machine Learning","author":"Mohri Mehryar","year":"2018","unstructured":"Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. 2018. Foundations of Machine Learning. MIT Press."},{"key":"e_1_3_4_29_2","volume-title":"International Conference on Learning Representations","author":"Nakkiran Preetum","year":"2020","unstructured":"Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, and Ilya Sutskever. 2020. Deep double descent: Where bigger models and more data hurt. In International Conference on Learning Representations."},{"key":"e_1_3_4_30_2","first-page":"1786","volume-title":"Proceedings of the 23rd International Conference on Neural Information Processing Systems-Volume 2","author":"Narayanan Hariharan","year":"2010","unstructured":"Hariharan Narayanan and Sanjoy Mitter. 2010. Sample complexity of testing the manifold hypothesis. In Proceedings of the 23rd International Conference on Neural Information Processing Systems-Volume 2. 1786\u20131794."},{"key":"e_1_3_4_31_2","volume-title":"International Conference on Learning Representations","author":"Novak Roman","year":"2018","unstructured":"Roman Novak, Yasaman Bahri, Daniel A. Abolafia, Jeffrey Pennington, and Jascha Sohl-Dickstein. 2018. Sensitivity and generalization in neural networks: An empirical study. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=HJC2SzZCW."},{"key":"e_1_3_4_32_2","volume-title":"International Conference on Learning Representations","author":"Pope Phil","year":"2021","unstructured":"Phil Pope, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, and Tom Goldstein. 2021. The intrinsic dimension of images and its impact on learning. In International Conference on Learning Representations."},{"key":"e_1_3_4_33_2","first-page":"8093","volume-title":"Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research)","volume":"119","author":"Rice Leslie","year":"2020","unstructured":"Leslie Rice, Eric Wong, and Zico Kolter. 2020. Overfitting in adversarially robust deep learning. In Proceedings of the 37th International Conference on Machine Learning (Proceedings of Machine Learning Research), Vol. 119. PMLR, 8093\u20138104."},{"issue":"5500","key":"e_1_3_4_34_2","doi-asserted-by":"crossref","first-page":"2323","DOI":"10.1126\/science.290.5500.2323","article-title":"Nonlinear dimensionality reduction by locally linear embedding","volume":"290","author":"Roweis Sam T.","year":"2000","unstructured":"Sam T. Roweis and Lawrence K. Saul. 2000. Nonlinear dimensionality reduction by locally linear embedding. Science 290, 5500 (2000), 2323\u20132326.","journal-title":"Science"},{"key":"e_1_3_4_35_2","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9781107298019","volume-title":"Understanding Machine Learning: From Theory to Algorithms","author":"Shalev-Shwartz Shai","year":"2014","unstructured":"Shai Shalev-Shwartz and Shai Ben-David. 2014. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press."},{"issue":"5500","key":"e_1_3_4_36_2","doi-asserted-by":"crossref","first-page":"2319","DOI":"10.1126\/science.290.5500.2319","article-title":"A global geometric framework for nonlinear dimensionality reduction","volume":"290","author":"Tenenbaum Joshua B.","year":"2000","unstructured":"Joshua B. Tenenbaum, Vin De Silva, and John C. Langford. 2000. A global geometric framework for nonlinear dimensionality reduction. Science 290, 5500 (2000), 2319\u20132323.","journal-title":"Science"},{"key":"e_1_3_4_37_2","doi-asserted-by":"crossref","DOI":"10.21236\/ADA623999","volume-title":"Probability in High Dimension","author":"Handel Ramon van","year":"2014","unstructured":"Ramon van Handel. 2014. Probability in High Dimension. Technical Report. Princeton University."},{"key":"e_1_3_4_38_2","volume-title":"High-dimensional Probability: An Introduction with Applications in Data Science","author":"Vershynin Roman","year":"2018","unstructured":"Roman Vershynin. 2018. High-dimensional Probability: An Introduction with Applications in Data Science. Vol. 47. Cambridge University Press."},{"key":"e_1_3_4_39_2","volume-title":"International Conference on Learning Representations","author":"Xie Cihang","year":"2020","unstructured":"Cihang Xie and Alan Yuille. 2020. Intriguing properties of adversarial training at scale. In International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=HyxJhCEFDS."},{"key":"e_1_3_4_40_2","first-page":"7085","volume-title":"International Conference on Machine Learning","author":"Yin Dong","year":"2019","unstructured":"Dong Yin, Ramchandran Kannan, and Peter Bartlett. 2019. Rademacher complexity for adversarially robust generalization. In International Conference on Machine Learning. PMLR, 7085\u20137094."},{"key":"e_1_3_4_41_2","article-title":"Spectral norm regularization for improving the generalizability of deep learning","author":"Yoshida Yuichi","year":"2017","unstructured":"Yuichi Yoshida and Takeru Miyato. 2017. Spectral norm regularization for improving the generalizability of deep learning. arXiv preprint arXiv:1705.10941 (2017).","journal-title":"arXiv preprint arXiv:1705.10941"},{"key":"e_1_3_4_42_2","first-page":"15532","volume-title":"Advances in Neural Information Processing Systems","author":"Yun Chulhee","year":"2019","unstructured":"Chulhee Yun, Suvrit Sra, and Ali Jadbabaie. 2019. Small ReLU networks are powerful memorizers: A tight analysis of memorization capacity. In Advances in Neural Information Processing Systems. 15532\u201315543."}],"container-title":["Journal of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3578580","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3578580","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:08:38Z","timestamp":1750183718000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3578580"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,21]]},"references-count":41,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,4,30]]}},"alternative-id":["10.1145\/3578580"],"URL":"https:\/\/doi.org\/10.1145\/3578580","relation":{},"ISSN":["0004-5411","1557-735X"],"issn-type":[{"value":"0004-5411","type":"print"},{"value":"1557-735X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,21]]},"assertion":[{"value":"2022-04-13","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-12-23","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-03-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}