{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T13:02:06Z","timestamp":1770814926291,"version":"3.50.1"},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,11,16]],"date-time":"2021-11-16T00:00:00Z","timestamp":1637020800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,11,16]],"date-time":"2021-11-16T00:00:00Z","timestamp":1637020800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100005713","name":"Technische Universit\u00e4t M\u00fcnchen","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100005713","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Found Comput Math"],"published-print":{"date-parts":[[2023,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Rate distortion theory is concerned with optimally encoding signals from a given signal class <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {S}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>S<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> using a budget of <jats:italic>R<\/jats:italic> bits, as <jats:inline-formula><jats:alternatives><jats:tex-math>$$R \\rightarrow \\infty $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>R<\/mml:mi>\n                    <mml:mo>\u2192<\/mml:mo>\n                    <mml:mi>\u221e<\/mml:mi>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>. 
We say that <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {S}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>S<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula><jats:italic>can be compressed at rate<\/jats:italic><jats:italic>s<\/jats:italic> if we can achieve an error of at most <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {O}(R^{-s})$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:msup>\n                      <mml:mi>R<\/mml:mi>\n                      <mml:mrow>\n                        <mml:mo>-<\/mml:mo>\n                        <mml:mi>s<\/mml:mi>\n                      <\/mml:mrow>\n                    <\/mml:msup>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> for encoding the given signal class; the supremal compression rate is denoted by <jats:inline-formula><jats:alternatives><jats:tex-math>$$s^*(\\mathcal {S})$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:msup>\n                      <mml:mi>s<\/mml:mi>\n                      <mml:mo>\u2217<\/mml:mo>\n                    <\/mml:msup>\n                    <mml:mrow>\n                      <mml:mo>(<\/mml:mo>\n                      <mml:mi>S<\/mml:mi>\n                      <mml:mo>)<\/mml:mo>\n                    <\/mml:mrow>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>. 
Given a fixed coding scheme, there usually are <jats:italic>some<\/jats:italic> elements of <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {S}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>S<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> that are compressed at a higher rate than <jats:inline-formula><jats:alternatives><jats:tex-math>$$s^*(\\mathcal {S})$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:msup>\n                      <mml:mi>s<\/mml:mi>\n                      <mml:mo>\u2217<\/mml:mo>\n                    <\/mml:msup>\n                    <mml:mrow>\n                      <mml:mo>(<\/mml:mo>\n                      <mml:mi>S<\/mml:mi>\n                      <mml:mo>)<\/mml:mo>\n                    <\/mml:mrow>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> by the given coding scheme; in this paper, we study the size of this set of signals. 
We show that for certain \u201cnice\u201d signal classes <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {S}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>S<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>, a <jats:italic>phase transition<\/jats:italic> occurs: We construct a probability measure <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathbb {P}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>P<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> on <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {S}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>S<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> such that for <jats:italic>every<\/jats:italic> coding scheme <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {C}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>C<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> and any <jats:inline-formula><jats:alternatives><jats:tex-math>$$s &gt; s^*(\\mathcal {S})$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>s<\/mml:mi>\n                    <mml:mo>&gt;<\/mml:mo>\n                    <mml:msup>\n                      <mml:mi>s<\/mml:mi>\n                      <mml:mo>\u2217<\/mml:mo>\n                    <\/mml:msup>\n                    <mml:mrow>\n                      <mml:mo>(<\/mml:mo>\n                      <mml:mi>S<\/mml:mi>\n                      <mml:mo>)<\/mml:mo>\n                    <\/mml:mrow>\n                  <\/mml:mrow>\n                
<\/mml:math><\/jats:alternatives><\/jats:inline-formula>, the set of signals encoded with error <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {O}(R^{-s})$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>O<\/mml:mi>\n                    <mml:mo>(<\/mml:mo>\n                    <mml:msup>\n                      <mml:mi>R<\/mml:mi>\n                      <mml:mrow>\n                        <mml:mo>-<\/mml:mo>\n                        <mml:mi>s<\/mml:mi>\n                      <\/mml:mrow>\n                    <\/mml:msup>\n                    <mml:mo>)<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> by <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathcal {C}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>C<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> forms a <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\mathbb {P}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>P<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>-null-set. 
In particular, our results apply to all unit balls in Besov and Sobolev spaces that embed compactly into <jats:inline-formula><jats:alternatives><jats:tex-math>$$L^2 (\\varOmega )$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:msup>\n                      <mml:mi>L<\/mml:mi>\n                      <mml:mn>2<\/mml:mn>\n                    <\/mml:msup>\n                    <mml:mrow>\n                      <mml:mo>(<\/mml:mo>\n                      <mml:mi>\u03a9<\/mml:mi>\n                      <mml:mo>)<\/mml:mo>\n                    <\/mml:mrow>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> for a bounded Lipschitz domain <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\varOmega $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>\u03a9<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>. As an application, we show that several existing sharpness results concerning function approximation using deep neural networks are in fact <jats:italic>generically sharp<\/jats:italic>. 
In addition, we provide quantitative and non-asymptotic bounds on the probability that a random <jats:inline-formula><jats:alternatives><jats:tex-math>$$f\\in \\mathcal {S}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>f<\/mml:mi>\n                    <mml:mo>\u2208<\/mml:mo>\n                    <mml:mi>S<\/mml:mi>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> can be encoded to within accuracy <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\varepsilon $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>\u03b5<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> using <jats:italic>R<\/jats:italic> bits. This result is subsequently applied to the problem of approximately representing <jats:inline-formula><jats:alternatives><jats:tex-math>$$f\\in \\mathcal {S}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>f<\/mml:mi>\n                    <mml:mo>\u2208<\/mml:mo>\n                    <mml:mi>S<\/mml:mi>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> to within accuracy <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\varepsilon $$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mi>\u03b5<\/mml:mi>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> by a (quantized) neural network with at most <jats:italic>W<\/jats:italic> nonzero weights. 
We show that for any <jats:inline-formula><jats:alternatives><jats:tex-math>$$s &gt; s^*(\\mathcal {S})$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mi>s<\/mml:mi>\n                    <mml:mo>&gt;<\/mml:mo>\n                    <mml:msup>\n                      <mml:mi>s<\/mml:mi>\n                      <mml:mo>\u2217<\/mml:mo>\n                    <\/mml:msup>\n                    <mml:mrow>\n                      <mml:mo>(<\/mml:mo>\n                      <mml:mi>S<\/mml:mi>\n                      <mml:mo>)<\/mml:mo>\n                    <\/mml:mrow>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> there are constants <jats:italic>c<\/jats:italic>,\u00a0<jats:italic>C<\/jats:italic> such that, no matter what kind of \u201clearning\u201d procedure is used to produce such a network, the probability of success is bounded from above by <jats:inline-formula><jats:alternatives><jats:tex-math>$$\\min \\big \\{1, 2^{C\\cdot W \\lceil \\log _2 (1+W) \\rceil ^2 - c\\cdot \\varepsilon ^{-1\/s}} \\big \\}$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mo>min<\/mml:mo>\n                    <mml:mrow>\n                      <mml:mo>{<\/mml:mo>\n                    <\/mml:mrow>\n                    <mml:mn>1<\/mml:mn>\n                    <mml:mo>,<\/mml:mo>\n                    <mml:msup>\n                      <mml:mn>2<\/mml:mn>\n                      <mml:mrow>\n                        <mml:mi>C<\/mml:mi>\n                        <mml:mo>\u00b7<\/mml:mo>\n                        <mml:mi>W<\/mml:mi>\n                        <mml:msup>\n                          <mml:mrow>\n                            <mml:mo>\u2308<\/mml:mo>\n                            <mml:msub>\n                              <mml:mo>log<\/mml:mo>\n                   
           <mml:mn>2<\/mml:mn>\n                            <\/mml:msub>\n                            <mml:mrow>\n                              <mml:mo>(<\/mml:mo>\n                              <mml:mn>1<\/mml:mn>\n                              <mml:mo>+<\/mml:mo>\n                              <mml:mi>W<\/mml:mi>\n                              <mml:mo>)<\/mml:mo>\n                            <\/mml:mrow>\n                            <mml:mo>\u2309<\/mml:mo>\n                          <\/mml:mrow>\n                          <mml:mn>2<\/mml:mn>\n                        <\/mml:msup>\n                        <mml:mo>-<\/mml:mo>\n                        <mml:mi>c<\/mml:mi>\n                        <mml:mo>\u00b7<\/mml:mo>\n                        <mml:msup>\n                          <mml:mi>\u03b5<\/mml:mi>\n                          <mml:mrow>\n                            <mml:mo>-<\/mml:mo>\n                            <mml:mn>1<\/mml:mn>\n                            <mml:mo>\/<\/mml:mo>\n                            <mml:mi>s<\/mml:mi>\n                          <\/mml:mrow>\n                        <\/mml:msup>\n                      <\/mml:mrow>\n                    <\/mml:msup>\n                    <mml:mrow>\n                      <mml:mo>}<\/mml:mo>\n                    <\/mml:mrow>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>.<\/jats:p>","DOI":"10.1007\/s10208-021-09546-4","type":"journal-article","created":{"date-parts":[[2021,11,16]],"date-time":"2021-11-16T15:02:56Z","timestamp":1637074976000},"page":"329-392","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Phase Transitions in Rate Distortion Theory and Deep 
Learning"],"prefix":"10.1007","volume":"23","author":[{"given":"Philipp","family":"Grohs","sequence":"first","affiliation":[]},{"given":"Andreas","family":"Klotz","sequence":"additional","affiliation":[]},{"given":"Felix","family":"Voigtlaender","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,11,16]]},"reference":[{"key":"9546_CR1","unstructured":"Adams, R., Fournier, J.: Sobolev spaces, Pure and Applied Mathematics (Amsterdam), vol. 140, second edn. Elsevier\/Academic Press, Amsterdam (2003)"},{"key":"9546_CR2","doi-asserted-by":"publisher","unstructured":"Alt, H.W.: Linear functional analysis. Universitext. Springer-Verlag London, Ltd., London (2016). https:\/\/doi.org\/10.1007\/978-1-4471-7280-2.","DOI":"10.1007\/978-1-4471-7280-2"},{"key":"9546_CR3","doi-asserted-by":"crossref","unstructured":"Berger, T.: Rate-distortion theory. Wiley Encyclopedia of Telecommunications (2003)","DOI":"10.1002\/0471219282.eot142"},{"key":"9546_CR4","doi-asserted-by":"publisher","first-page":"8","DOI":"10.1137\/18M118709X","volume":"1","author":"H B\u00f6lcskei","year":"2019","unstructured":"B\u00f6lcskei, H., Grohs, P., Kutyniok, G., Petersen, P.C.: Optimal approximation with sparsely connected deep neural networks. SIAM J. Math. Data Sci. 1, 8\u201345 (2019)","journal-title":"SIAM J. Math. Data Sci."},{"key":"9546_CR5","doi-asserted-by":"publisher","unstructured":"Carl, B., Stephani, I.: Entropy, compactness and the approximation of operators, Cambridge Tracts in Mathematics, vol.\u00a098. Cambridge University Press, Cambridge (1990). https:\/\/doi.org\/10.1017\/CBO9780511897467.","DOI":"10.1017\/CBO9780511897467"},{"key":"9546_CR6","unstructured":"Conway, J.B.: A course in functional analysis, Graduate Texts in Mathematics, vol.\u00a096, second edn. 
Springer-Verlag, New York (1990)"},{"issue":"7","key":"9546_CR7","doi-asserted-by":"publisher","first-page":"909","DOI":"10.1002\/cpa.3160410705","volume":"41","author":"I Daubechies","year":"1988","unstructured":"Daubechies, I.: Orthonormal bases of compactly supported wavelets. Comm. Pure Appl. Math. 41(7), 909\u2013996 (1988). https:\/\/doi.org\/10.1002\/cpa.3160410705.","journal-title":"Comm. Pure Appl. Math."},{"key":"9546_CR8","doi-asserted-by":"publisher","unstructured":"Daubechies, I.: Ten lectures on wavelets, CBMS-NSF Regional Conference Series in Applied Mathematics, vol.\u00a061. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (1992). https:\/\/doi.org\/10.1137\/1.9781611970104.","DOI":"10.1137\/1.9781611970104"},{"key":"9546_CR9","doi-asserted-by":"publisher","unstructured":"DeVore, R.A.: Nonlinear approximation. In: Acta numerica, 1998, Acta Numer., vol.\u00a07, pp. 51\u2013150. Cambridge Univ. Press, Cambridge (1998). https:\/\/doi.org\/10.1017\/S0962492900002816.","DOI":"10.1017\/S0962492900002816"},{"key":"9546_CR10","doi-asserted-by":"publisher","unstructured":"DeVore, R.A., Lorentz, G.G.: Constructive approximation, Grundlehren der Mathematischen Wissenschaften, vol. 303. Springer-Verlag, Berlin (1993). https:\/\/doi.org\/10.1007\/978-3-662-02888-9.","DOI":"10.1007\/978-3-662-02888-9"},{"key":"9546_CR11","doi-asserted-by":"publisher","unstructured":"Dudley, R.M.: Real analysis and probability, Cambridge Studies in Advanced Mathematics, vol.\u00a074. Cambridge University Press, Cambridge (2002). https:\/\/doi.org\/10.1017\/CBO9780511755347.","DOI":"10.1017\/CBO9780511755347"},{"key":"9546_CR12","doi-asserted-by":"publisher","unstructured":"Edmunds, D.E., Triebel, H.: Function spaces, entropy numbers, differential operators, Cambridge Tracts in Mathematics, vol. 120. Cambridge University Press, Cambridge (1996). 
https:\/\/doi.org\/10.1017\/CBO9780511662201.","DOI":"10.1017\/CBO9780511662201"},{"issue":"5","key":"9546_CR13","doi-asserted-by":"publisher","first-page":"2581","DOI":"10.1109\/TIT.2021.3062161","volume":"67","author":"D Elbr\u00e4chter","year":"2021","unstructured":"Elbr\u00e4chter, D., Perekrestenko, D., Grohs, P., B\u00f6lcskei, H.: Deep Neural Network Approximation Theory. IEEE Transactions on Information Theory 67(5), 2581\u20132623 (2021). https:\/\/doi.org\/10.1109\/TIT.2021.3062161","journal-title":"IEEE Trans. Inf. Theory"},{"key":"9546_CR14","unstructured":"Folland, G.: Real analysis, second edn. Pure and Applied Mathematics (New York). John Wiley & Sons, Inc., New York (1999)"},{"key":"9546_CR15","doi-asserted-by":"crossref","unstructured":"Grohs, P.: Optimally sparse data representations. In: Harmonic and Applied Analysis, pp. 199\u2013248. Springer (2015)","DOI":"10.1007\/978-3-319-18863-8_5"},{"issue":"2","key":"9546_CR16","doi-asserted-by":"publisher","first-page":"723","DOI":"10.1016\/j.jat.2008.12.004","volume":"161","author":"DD Haroske","year":"2009","unstructured":"Haroske, D.D., Schneider, C.: Besov spaces with positive smoothness on $${\\mathbb{R}}^n$$, embeddings and growth envelopes. J. Approx. Theory 161(2), 723\u2013747 (2009). https:\/\/doi.org\/10.1016\/j.jat.2008.12.004.","journal-title":"J. Approx. Theory"},{"key":"9546_CR17","doi-asserted-by":"publisher","unstructured":"Hunt, B., Sauer, T., Yorke, J.: Prevalence: a translation-invariant \u201calmost every\u201d on infinite-dimensional spaces. Bull. Amer. Math. Soc. (N.S.) 27(2), 217\u2013238 (1992). https:\/\/doi.org\/10.1090\/S0273-0979-1992-00328-2.","DOI":"10.1090\/S0273-0979-1992-00328-2"},{"key":"9546_CR18","unstructured":"Kossaczk\u00e1, M., Vyb\u00edral, J.: Entropy numbers of finite-dimensional embeddings. 
arXiv preprint arXiv:1802.00572 (2018)"},{"key":"9546_CR19","doi-asserted-by":"crossref","unstructured":"Leoni, G.: A first course in Sobolev spaces, Graduate Studies in Mathematics, vol. 181, second edn. American Mathematical Society, Providence, RI (2017)","DOI":"10.1090\/gsm\/181"},{"issue":"4","key":"9546_CR20","doi-asserted-by":"publisher","first-page":"731","DOI":"10.1515\/GMJ.2000.731","volume":"7","author":"H Leopold","year":"2000","unstructured":"Leopold, H.: Embeddings and entropy numbers for general weighted sequence spaces: the non-limiting case. Georgian Math. J. 7(4), 731\u2013743 (2000)","journal-title":"Georgian Math. J."},{"issue":"3","key":"9546_CR21","doi-asserted-by":"publisher","first-page":"1556","DOI":"10.1214\/aop\/1022677459","volume":"27","author":"WV Li","year":"1999","unstructured":"Li, W.V., Linde, W.: Approximation, metric entropy and small ball estimates for Gaussian measures. Ann. Probab. 27(3), 1556\u20131578 (1999). https:\/\/doi.org\/10.1214\/aop\/1022677459.","journal-title":"Ann. Probab."},{"key":"9546_CR22","doi-asserted-by":"publisher","unstructured":"Li, W.V., Shao, Q.M.: Gaussian processes: inequalities, small ball probabilities and applications. In: Stochastic processes: theory and methods, Handbook of Statist., vol.\u00a019, pp. 533\u2013597. North-Holland, Amsterdam (2001). https:\/\/doi.org\/10.1016\/S0169-7161(01)19019-X.","DOI":"10.1016\/S0169-7161(01)19019-X"},{"key":"9546_CR23","doi-asserted-by":"publisher","unstructured":"Lin, S.: Limitations of shallow nets approximation. Neural Netw. 94, 96\u2013102 (2017). https:\/\/doi.org\/10.1016\/j.neunet.2017.06.016. 
URL http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0893608017301521","DOI":"10.1016\/j.neunet.2017.06.016"},{"issue":"1","key":"9546_CR24","doi-asserted-by":"publisher","first-page":"95","DOI":"10.1006\/jath.1998.3305","volume":"99","author":"V Maiorov","year":"1999","unstructured":"Maiorov, V., Meir, R., Ratsaby, J.: On the approximation of functional classes equipped with a uniform measure using ridge functions. J. Approx. Theory 99(1), 95\u2013111 (1999). https:\/\/doi.org\/10.1006\/jath.1998.3305.","journal-title":"J. Approx. Theory"},{"key":"9546_CR25","doi-asserted-by":"publisher","unstructured":"Maiorov, V., Pinkus, A.: Lower bounds for approximation by MLP neural networks. Neurocomputing 25(1), 81 \u2013 91 (1999). https:\/\/doi.org\/10.1016\/S0925-2312(98)00111-8. URL http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0925231298001118","DOI":"10.1016\/S0925-2312(98)00111-8"},{"key":"9546_CR26","volume-title":"A wavelet tour of signal processing","author":"S Mallat","year":"2009","unstructured":"Mallat, S.: A wavelet tour of signal processing, third edn. Elsevier\/Academic Press, Amsterdam (2009)","edition":"3"},{"issue":"1","key":"9546_CR27","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1162\/neco.1996.8.1.164","volume":"8","author":"HN Mhaskar","year":"1996","unstructured":"Mhaskar, H.N.: Neural networks for optimal approximation of smooth and analytic functions. Neural Computation 8(1), 164\u2013177 (1996). https:\/\/doi.org\/10.1162\/neco.1996.8.1.164","journal-title":"Neural Comput."},{"key":"9546_CR28","doi-asserted-by":"crossref","unstructured":"Oxtoby, J.C.: Measure and category, Graduate Texts in Mathematics, vol.\u00a02, second edn. 
Springer-Verlag, New York-Berlin (1980)","DOI":"10.1007\/978-1-4684-9339-9"},{"key":"9546_CR29","doi-asserted-by":"publisher","first-page":"296","DOI":"10.1016\/j.neunet.2018.08.019","volume":"108","author":"P Petersen","year":"2018","unstructured":"Petersen, P., Voigtlaender, F.: Optimal approximation of piecewise smooth functions using deep ReLU neural networks. Neural Netw. 108, 296\u2013330 (2018)","journal-title":"Neural Netw."},{"key":"9546_CR30","unstructured":"Rudin, W.: Functional analysis, second edn. International Series in Pure and Applied Mathematics. McGraw-Hill, Inc., New York (1991)"},{"key":"9546_CR31","unstructured":"Safran, I., Shamir, O.: Depth-width tradeoffs in approximating natural functions with neural networks. arXiv preprint arXiv:1610.09887 (2016)"},{"key":"9546_CR32","unstructured":"Safran, I., Shamir, O.: Depth-width tradeoffs in approximating natural functions with neural networks. In: International Conference on Machine Learning, pp. 2979\u20132987. PMLR (2017)"},{"key":"9546_CR33","unstructured":"Stein, E.: Singular integrals and differentiability properties of functions. Princeton Mathematical Series, No. 30. Princeton University Press, Princeton, N.J. (1970)"},{"key":"9546_CR34","unstructured":"Suzuki, T.: Adaptivity of deep ReLU network for learning in Besov and mixed smooth Besov spaces: optimal rate and curse of dimensionality. In: International Conference on Learning Representations (2019). https:\/\/openreview.net\/forum?id=H1ebTsActm"},{"key":"9546_CR35","unstructured":"Triebel, H.: Theory of function spaces. III, Monographs in Mathematics, vol. 100. Birkh\u00e4user Verlag, Basel (2006)"},{"key":"9546_CR36","unstructured":"Triebel, H.: Theory of function spaces. Modern Birkh\u00e4user Classics. 
Birkh\u00e4user\/Springer Basel AG, Basel (2010)"},{"key":"9546_CR37","doi-asserted-by":"publisher","unstructured":"Vershynin, R.: High-dimensional probability, Cambridge Series in Statistical and Probabilistic Mathematics, vol.\u00a047. Cambridge University Press, Cambridge (2018). https:\/\/doi.org\/10.1017\/9781108231596.","DOI":"10.1017\/9781108231596"},{"key":"9546_CR38","unstructured":"Voigtlaender, F.: Embeddings of Decomposition Spaces into Sobolev and BV Spaces. arXiv preprint arXiv:1601.02201 (2016)"},{"key":"9546_CR39","doi-asserted-by":"publisher","unstructured":"Wojtaszczyk, P.: A mathematical introduction to wavelets, London Mathematical Society Student Texts, vol.\u00a037. Cambridge University Press, Cambridge (1997). https:\/\/doi.org\/10.1017\/CBO9780511623790.","DOI":"10.1017\/CBO9780511623790"},{"key":"9546_CR40","unstructured":"Yarotsky, D.: Elementary superexpressive activations. arXiv preprint arXiv:2102.10911 (2021)"},{"key":"9546_CR41","unstructured":"Yarotsky, D., Zhevnerchuk, A.: The phase diagram of approximation rates for deep neural networks. 
arXiv preprint arXiv:1906.09477 (2019)"}],"container-title":["Foundations of Computational Mathematics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10208-021-09546-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10208-021-09546-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10208-021-09546-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,9]],"date-time":"2023-02-09T05:07:27Z","timestamp":1675919247000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10208-021-09546-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,16]]},"references-count":41,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,2]]}},"alternative-id":["9546"],"URL":"https:\/\/doi.org\/10.1007\/s10208-021-09546-4","relation":{},"ISSN":["1615-3375","1615-3383"],"issn-type":[{"value":"1615-3375","type":"print"},{"value":"1615-3383","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,16]]},"assertion":[{"value":"29 September 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 July 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 September 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 November 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The 
authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}