{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,11]],"date-time":"2026-05-11T10:44:08Z","timestamp":1778496248595,"version":"3.51.4"},"reference-count":53,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2023,7,12]],"date-time":"2023-07-12T00:00:00Z","timestamp":1689120000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,7,12]],"date-time":"2023-07-12T00:00:00Z","timestamp":1689120000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003065","name":"University of Vienna","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100003065","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Found Comput Math"],"published-print":{"date-parts":[[2024,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We study the computational complexity of (deterministic or randomized) algorithms based on point samples for approximating or integrating functions that can be well approximated by neural networks. Such algorithms (most prominently stochastic gradient descent and its variants) are used extensively in the field of deep learning. One of the most important problems in this field concerns the question of whether it is possible to realize theoretically provable neural network approximation rates by such algorithms. We answer this question in the negative by proving hardness results for the problems of approximation and integration on a novel class of neural network approximation spaces. In particular, our results confirm a conjectured and empirically observed theory-to-practice gap in deep learning. 
We complement our hardness results by showing that error bounds of a comparable order of convergence are (at least theoretically) achievable.<\/jats:p>","DOI":"10.1007\/s10208-023-09607-w","type":"journal-article","created":{"date-parts":[[2023,7,12]],"date-time":"2023-07-12T14:02:38Z","timestamp":1689170558000},"page":"1085-1143","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":25,"title":["Proof of the Theory-to-Practice Gap in Deep Learning via Sampling Complexity Bounds for Neural Network Approximation Spaces"],"prefix":"10.1007","volume":"24","author":[{"given":"Philipp","family":"Grohs","sequence":"first","affiliation":[]},{"given":"Felix","family":"Voigtlaender","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,7,12]]},"reference":[{"key":"9607_CR1","doi-asserted-by":"crossref","unstructured":"B.\u00a0Adcock and N.\u00a0Dexter. The gap between theory and practice in function approximation with deep neural networks. SIAM Journal on Mathematics of Data Science, 3(2):624\u2013655, 2021.","DOI":"10.1137\/20M131309X"},{"key":"9607_CR2","volume-title":"Infinite dimensional analysis","author":"CD Aliprantis","year":"2006","unstructured":"C.\u00a0D. Aliprantis and K.\u00a0C. Border. Infinite dimensional analysis. Springer, Berlin, third edition, 2006."},{"key":"9607_CR3","doi-asserted-by":"crossref","unstructured":"V.\u00a0Antun, M.\u00a0J. Colbrook, and A.\u00a0C. Hansen. The difficulty of computing stable and accurate neural networks: On the barriers of deep learning and Smale\u2019s 18th problem. Proceedings of the National Academy of Sciences, 119(12):e2107151119, 2022.","DOI":"10.1073\/pnas.2107151119"},{"key":"9607_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1017\/S0962492919000059","volume":"28","author":"S Arridge","year":"2019","unstructured":"S.\u00a0Arridge, P.\u00a0Maass, O.\u00a0\u00d6ktem, and C.-B. Sch\u00f6nlieb. 
Solving inverse problems using data-driven models. Acta Numerica, 28:1\u2013174, 2019.","journal-title":"Acta Numerica"},{"issue":"1","key":"9607_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/ncomms5308","volume":"5","author":"P Baldi","year":"2014","unstructured":"P.\u00a0Baldi, P.\u00a0Sadowski, and D.\u00a0Whiteson. Searching for exotic particles in high-energy physics with deep learning. Nature communications, 5(1):1\u20139, 2014.","journal-title":"Nature Communications"},{"issue":"63","key":"9607_CR6","first-page":"1","volume":"20","author":"PL Bartlett","year":"2019","unstructured":"P.\u00a0L. Bartlett, N.\u00a0Harvey, C.\u00a0Liaw, and A.\u00a0Mehrabian. Nearly-tight VC-dimension and Pseudodimension Bounds for Piecewise Linear Neural Networks. Journal of Machine Learning Research, 20(63):1\u201317, 2019.","journal-title":"Journal of Machine Learning Research"},{"key":"9607_CR7","unstructured":"P.\u00a0Beneventano, P.\u00a0Cheridito, A.\u00a0Jentzen, and P.\u00a0von Wurstemberger. High-dimensional approximation spaces of artificial neural networks and applications to partial differential equations. arXiv preprint arXiv:2012.04326, 2020."},{"issue":"3","key":"9607_CR8","doi-asserted-by":"publisher","first-page":"631","DOI":"10.1137\/19M125649X","volume":"2","author":"J Berner","year":"2020","unstructured":"J.\u00a0Berner, P.\u00a0Grohs, and A.\u00a0Jentzen. Analysis of the Generalization Error: Empirical Risk Minimization over Deep Artificial Neural Networks Overcomes the Curse of Dimensionality in the Numerical Approximation of Black\u2013Scholes Partial Differential Equations. SIAM Journal on Mathematics of Data Science, 2(3):631\u2013657, 2020.","journal-title":"SIAM Journal on Mathematics of Data Science"},{"key":"9607_CR9","unstructured":"A.\u00a0Blum and R.\u00a0L. Rivest. Training a 3-node neural network is NP-complete. 
In Advances in neural information processing systems, pages 494\u2013501, 1989."},{"key":"9607_CR10","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1137\/18M118709X","volume":"1","author":"H B\u00f6lcskei","year":"2019","unstructured":"H.\u00a0B\u00f6lcskei, P.\u00a0Grohs, G.\u00a0Kutyniok, and P.\u00a0C. Petersen. Optimal approximation with sparsely connected deep neural networks. SIAM J. Math. Data Sci., 1:8\u201345, 2019.","journal-title":"SIAM J. Math. Data Sci."},{"key":"9607_CR11","unstructured":"A.\u00a0Caragea, P.\u00a0Petersen, and F.\u00a0Voigtlaender. Neural network approximation and estimation of classifiers with classification boundary in a Barron class. arXiv preprint arXiv:2011.09363, 2020."},{"key":"9607_CR12","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4612-1950-7","volume-title":"Probability theory","author":"YS Chow","year":"1997","unstructured":"Y.\u00a0S. Chow and H.\u00a0Teicher. Probability theory. Springer Texts in Statistics. Springer-Verlag, New York, third edition, 1997.","edition":"3"},{"key":"9607_CR13","doi-asserted-by":"crossref","unstructured":"F.\u00a0Cucker and S.\u00a0Smale. On the mathematical foundations of learning. Bull. Amer. Math. Soc. (N.S.), 39(1):1\u201349, 2002.","DOI":"10.1090\/S0273-0979-01-00923-5"},{"key":"9607_CR14","doi-asserted-by":"crossref","unstructured":"R.\u00a0A. DeVore and G.\u00a0G. Lorentz. Constructive approximation, volume 303 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, Berlin, 1993.","DOI":"10.1007\/978-3-662-02888-9_10"},{"key":"9607_CR15","doi-asserted-by":"crossref","unstructured":"R.\u00a0DeVore, B.\u00a0Hanin, and G.\u00a0Petrova. Neural network approximation. Acta Numerica, 30:327\u2013444, 2021.","DOI":"10.1017\/S0962492921000052"},{"key":"9607_CR16","unstructured":"Z.\u00a0Ditzian and V.\u00a0Totik. Moduli of smoothness, volume\u00a09. 
Springer Science & Business Media, 2012."},{"key":"9607_CR17","doi-asserted-by":"crossref","unstructured":"W.\u00a0E and B.\u00a0Yu. The Deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics, 6(1):1\u201312, 2018.","DOI":"10.1007\/s40304-018-0127-z"},{"issue":"11","key":"9607_CR18","doi-asserted-by":"publisher","first-page":"5255","DOI":"10.1021\/acs.jctc.7b00577","volume":"13","author":"FA Faber","year":"2017","unstructured":"F.\u00a0A. Faber, L.\u00a0Hutchison, B.\u00a0Huang, J.\u00a0Gilmer, S.\u00a0S. Schoenholz, G.\u00a0E. Dahl, O.\u00a0Vinyals, S.\u00a0Kearnes, P.\u00a0F. Riley, and O.\u00a0A. von Lilienfeld. Prediction errors of molecular machine learning models lower than hybrid DFT error. Journal of chemical theory and computation, 13(11):5255\u20135264, 2017.","journal-title":"Journal of chemical theory and computation"},{"key":"9607_CR19","unstructured":"G.\u00a0B. Folland. Real analysis. Pure and Applied Mathematics (New York). John Wiley & Sons, Inc., New York, second edition, 1999."},{"key":"9607_CR20","doi-asserted-by":"crossref","unstructured":"R.\u00a0Gribonval, G.\u00a0Kutyniok, M.\u00a0Nielsen, and F.\u00a0Voigtlaender. Approximation spaces of deep neural networks. Constructive Approximation, 55:259\u2013367, 2022.","DOI":"10.1007\/s00365-021-09543-4"},{"key":"9607_CR21","unstructured":"P.\u00a0Grohs, F.\u00a0Hornung, A.\u00a0Jentzen, and P.\u00a0von Wurstemberger. A proof that artificial neural networks overcome the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations. Memoirs of the American Mathematical Society, 2020."},{"key":"9607_CR22","doi-asserted-by":"crossref","unstructured":"P.\u00a0Grohs, D.\u00a0Perekrestenko, D.\u00a0Elbr\u00e4chter, and H.\u00a0B\u00f6lcskei. Deep neural network approximation theory. 
IEEE Transactions on Information Theory, 67(5):2581\u20132623, 2021.","DOI":"10.1109\/TIT.2021.3062161"},{"issue":"6","key":"9607_CR23","doi-asserted-by":"publisher","first-page":"1127","DOI":"10.1016\/S0893-6080(98)00046-X","volume":"11","author":"A Gupta","year":"1998","unstructured":"A.\u00a0Gupta and S.\u00a0M. Lam. Weight decay backpropagation for noisy data. Neural Networks, 11(6):1127\u20131138, 1998.","journal-title":"Neural Networks"},{"key":"9607_CR24","doi-asserted-by":"crossref","unstructured":"K.\u00a0He, X.\u00a0Zhang, S.\u00a0Ren, and J.\u00a0Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770\u2013778, 2016.","DOI":"10.1109\/CVPR.2016.90"},{"key":"9607_CR25","unstructured":"S.\u00a0Heinrich. Random approximation in numerical analysis. In Functional analysis (Essen, 1991), volume 150 of Lecture Notes in Pure and Appl. Math., pages 123\u2013171. Dekker, New York, 1994."},{"issue":"10","key":"9607_CR26","doi-asserted-by":"publisher","first-page":"891","DOI":"10.1038\/s41557-020-0544-y","volume":"12","author":"J Hermann","year":"2020","unstructured":"J.\u00a0Hermann, Z.\u00a0Sch\u00e4tzle, and F.\u00a0No\u00e9. Deep-neural-network solution of the electronic Schr\u00f6dinger equation. Nature Chemistry, 12(10):891\u2013897, 2020.","journal-title":"Nature Chemistry"},{"issue":"2","key":"9607_CR27","first-page":"1","volume":"1","author":"M Hutzenthaler","year":"2020","unstructured":"M.\u00a0Hutzenthaler, A.\u00a0Jentzen, T.\u00a0Kruse, and T.\u00a0A. Nguyen. A proof that rectified deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear heat equations. SN Partial Differential Equations and Applications, 1(2):1\u201334, 2020.","journal-title":"SN Partial Differential Equations and Applications"},{"key":"9607_CR28","unstructured":"D.\u00a0P. Kingma and J.\u00a0Ba. Adam: A method for stochastic optimization. 
arXiv preprint arXiv:1412.6980, 2014."},{"key":"9607_CR29","unstructured":"A.\u00a0Krizhevsky, I.\u00a0Sutskever, and G.\u00a0E. Hinton. ImageNet classification with deep convolutional neural networks. In F.\u00a0Pereira, C.\u00a0J.\u00a0C. Burges, L.\u00a0Bottou, and K.\u00a0Q. Weinberger, editors, Advances in Neural Information Processing Systems, volume\u00a025, pages 1097\u20131105. Curran Associates, Inc., 2012."},{"key":"9607_CR30","unstructured":"G.\u00a0Kutyniok, P.\u00a0Petersen, M.\u00a0Raslan, and R.\u00a0Schneider. A theoretical analysis of deep neural networks and parametric PDEs. arXiv preprint arXiv:1904.00377, 2019."},{"key":"9607_CR31","unstructured":"G.\u00a0Lample and F.\u00a0Charton. Deep learning for symbolic mathematics. In International Conference on Learning Representations, 2019."},{"issue":"11","key":"9607_CR32","doi-asserted-by":"publisher","first-page":"2278","DOI":"10.1109\/5.726791","volume":"86","author":"Y LeCun","year":"1998","unstructured":"Y.\u00a0LeCun, L.\u00a0Bottou, Y.\u00a0Bengio, and P.\u00a0Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278\u20132324, 1998.","journal-title":"Proceedings of the IEEE"},{"issue":"2","key":"9607_CR33","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1021\/ci500747n","volume":"55","author":"J Ma","year":"2015","unstructured":"J.\u00a0Ma, R.\u00a0P. Sheridan, A.\u00a0Liaw, G.\u00a0E. Dahl, and V.\u00a0Svetnik. Deep neural nets as a method for quantitative structure\u2013activity relationships. Journal of chemical information and modeling, 55(2):263\u2013274, 2015.","journal-title":"Journal of chemical information and modeling"},{"key":"9607_CR34","unstructured":"V.\u00a0Mnih, K.\u00a0Kavukcuoglu, D.\u00a0Silver, A.\u00a0Graves, I.\u00a0Antonoglou, D.\u00a0Wierstra, and M.\u00a0Riedmiller. Playing Atari with deep reinforcement learning. 
arXiv preprint arXiv:1312.5602, 2013."},{"key":"9607_CR35","volume-title":"Foundations of machine learning","author":"M Mohri","year":"2018","unstructured":"M.\u00a0Mohri, A.\u00a0Rostamizadeh, and A.\u00a0Talwalkar. Foundations of machine learning. MIT Press, Cambridge, MA, 2018."},{"key":"9607_CR36","doi-asserted-by":"publisher","first-page":"296","DOI":"10.1016\/j.neunet.2018.08.019","volume":"108","author":"P Petersen","year":"2018","unstructured":"P.\u00a0Petersen and F.\u00a0Voigtlaender. Optimal approximation of piecewise smooth functions using deep ReLU neural networks. Neural Networks, 108:296\u2013330, 2018.","journal-title":"Neural Networks"},{"issue":"3","key":"9607_CR37","doi-asserted-by":"publisher","DOI":"10.1103\/PhysRevResearch.2.033429","volume":"2","author":"D Pfau","year":"2020","unstructured":"D.\u00a0Pfau, J.\u00a0S. Spencer, A.\u00a0G. Matthews, and W.\u00a0M.\u00a0C. Foulkes. Ab initio solution of the many-electron Schr\u00f6dinger equation with deep neural networks. Physical Review Research, 2(3):033429, 2020.","journal-title":"Physical Review Research"},{"key":"9607_CR38","unstructured":"A.\u00a0Pietsch. Eigenvalues and s-numbers. Cambridge University Press, 1986."},{"key":"9607_CR39","unstructured":"A.\u00a0Pinkus. N-widths in Approximation Theory, volume\u00a07. Springer Science & Business Media, 2012."},{"key":"9607_CR40","doi-asserted-by":"publisher","first-page":"686","DOI":"10.1016\/j.jcp.2018.10.045","volume":"378","author":"M Raissi","year":"2019","unstructured":"M.\u00a0Raissi, P.\u00a0Perdikaris, and G.\u00a0E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686\u2013707, 2019.","journal-title":"Journal of Computational Physics"},{"key":"9607_CR41","unstructured":"M.\u00a0M. Rao and Z.\u00a0D. Ren. 
Theory of Orlicz spaces, volume 146 of Monographs and Textbooks in Pure and Applied Mathematics. Marcel Dekker, Inc., New York, 1991."},{"key":"9607_CR42","unstructured":"D.\u00a0Saxton, E.\u00a0Grefenstette, F.\u00a0Hill, and P.\u00a0Kohli. Analysing mathematical reasoning abilities of neural models. In International Conference on Learning Representations, 2018."},{"issue":"7792","key":"9607_CR43","doi-asserted-by":"publisher","first-page":"706","DOI":"10.1038\/s41586-019-1923-7","volume":"577","author":"AW Senior","year":"2020","unstructured":"A.\u00a0W. Senior, R.\u00a0Evans, J.\u00a0Jumper, J.\u00a0Kirkpatrick, L.\u00a0Sifre, T.\u00a0Green, C.\u00a0Qin, A.\u00a0\u017d\u00eddek, A.\u00a0W. Nelson, and A.\u00a0Bridgland. Improved protein structure prediction using potentials from deep learning. Nature, 577(7792):706\u2013710, 2020.","journal-title":"Nature"},{"key":"9607_CR44","doi-asserted-by":"crossref","unstructured":"S.\u00a0Shalev-Shwartz and S.\u00a0Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.","DOI":"10.1017\/CBO9781107298019"},{"issue":"7587","key":"9607_CR45","doi-asserted-by":"publisher","first-page":"484","DOI":"10.1038\/nature16961","volume":"529","author":"D Silver","year":"2016","unstructured":"D.\u00a0Silver, A.\u00a0Huang, C.\u00a0J. Maddison, A.\u00a0Guez, L.\u00a0Sifre, G.\u00a0Van Den Driessche, J.\u00a0Schrittwieser, I.\u00a0Antonoglou, V.\u00a0Panneershelvam, and M.\u00a0Lanctot. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484\u2013489, 2016.","journal-title":"Nature"},{"issue":"7676","key":"9607_CR46","doi-asserted-by":"publisher","first-page":"354","DOI":"10.1038\/nature24270","volume":"550","author":"D Silver","year":"2017","unstructured":"D.\u00a0Silver, J.\u00a0Schrittwieser, K.\u00a0Simonyan, I.\u00a0Antonoglou, A.\u00a0Huang, A.\u00a0Guez, T.\u00a0Hubert, L.\u00a0Baker, M.\u00a0Lai, and A.\u00a0Bolton. 
Mastering the game of Go without human knowledge. Nature, 550(7676):354\u2013359, 2017.","journal-title":"Nature"},{"key":"9607_CR47","doi-asserted-by":"crossref","unstructured":"C.\u00a0Szegedy, W.\u00a0Liu, Y.\u00a0Jia, P.\u00a0Sermanet, S.\u00a0Reed, D.\u00a0Anguelov, D.\u00a0Erhan, V.\u00a0Vanhoucke, and A.\u00a0Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1\u20139, 2015.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"9607_CR48","unstructured":"M.\u00a0Telgarsky. Benefits of depth in neural networks. In Conference on learning theory, pages 1517\u20131539. PMLR, 2016."},{"key":"9607_CR49","unstructured":"A.\u00a0F. Timan. Theory of approximation of functions of a real variable. Elsevier, 2014."},{"key":"9607_CR50","unstructured":"R.\u00a0Vershynin. High-dimensional probability, volume\u00a047 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 2018."},{"issue":"7782","key":"9607_CR51","doi-asserted-by":"publisher","first-page":"350","DOI":"10.1038\/s41586-019-1724-z","volume":"575","author":"O Vinyals","year":"2019","unstructured":"O.\u00a0Vinyals, I.\u00a0Babuschkin, W.\u00a0M. Czarnecki, M.\u00a0Mathieu, A.\u00a0Dudzik, J.\u00a0Chung, D.\u00a0H. Choi, R.\u00a0Powell, T.\u00a0Ewalds, and P.\u00a0Georgiev. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782):350\u2013354, 2019.","journal-title":"Nature"},{"key":"9607_CR52","unstructured":"D.\u00a0Yarotsky. Optimal approximation of continuous functions by very deep ReLU networks. In Conference on Learning Theory, pages 639\u2013649. PMLR, 2018."},{"issue":"3","key":"9607_CR53","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1109\/MCI.2018.2840738","volume":"13","author":"T Young","year":"2018","unstructured":"T.\u00a0Young, D.\u00a0Hazarika, S.\u00a0Poria, and E.\u00a0Cambria. 
Recent trends in deep learning based natural language processing. IEEE Computational Intelligence Magazine, 13(3):55\u201375, 2018.","journal-title":"IEEE Computational Intelligence Magazine"}],"container-title":["Foundations of Computational Mathematics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10208-023-09607-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10208-023-09607-w\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10208-023-09607-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,8]],"date-time":"2024-08-08T19:05:12Z","timestamp":1723143912000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10208-023-09607-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,12]]},"references-count":53,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,8]]}},"alternative-id":["9607"],"URL":"https:\/\/doi.org\/10.1007\/s10208-023-09607-w","relation":{},"ISSN":["1615-3375","1615-3383"],"issn-type":[{"value":"1615-3375","type":"print"},{"value":"1615-3383","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,12]]},"assertion":[{"value":"29 October 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 September 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 September 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 July 2023","order":4,"name":"first_online","label":"First 
Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}