{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:48:03Z","timestamp":1760240883924,"version":"build-2065373602"},"reference-count":30,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2019,9,26]],"date-time":"2019-09-26T00:00:00Z","timestamp":1569456000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001821","name":"Vienna Science and Technology Fund","doi-asserted-by":"publisher","award":["MA14-018"],"award-info":[{"award-number":["MA14-018"]}],"id":[{"id":"10.13039\/501100001821","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001823","name":"Ministerstvo \u0160kolstv\u00ed, Ml\u00e1de\u017ee a T\u011blov\u00fdchovy","doi-asserted-by":"publisher","award":["CZ.02.2.69\/0.0\/0.0\/16 027\/0008371","LO1401"],"award-info":[{"award-number":["CZ.02.2.69\/0.0\/0.0\/16 027\/0008371","LO1401"]}],"id":[{"id":"10.13039\/501100001823","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003065","name":"Universit\u00e4t Wien","doi-asserted-by":"publisher","award":["Uni:docs Fellowship Programme"],"award-info":[{"award-number":["Uni:docs Fellowship Programme"]}],"id":[{"id":"10.13039\/501100003065","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Axioms"],"abstract":"<jats:p>This paper introduces Gabor scattering, a feature extractor based on Gabor frames and Mallat\u2019s scattering transform. By using a simple signal model for audio signals, specific properties of Gabor scattering are studied. It is shown that, for each layer, specific invariances to certain signal characteristics occur. Furthermore, deformation stability of the coefficient vector generated by the feature extractor is derived by using a decoupling technique which exploits the contractivity of general scattering networks. Deformations are introduced as changes in spectral shape and frequency modulation. The theoretical results are illustrated by numerical examples and experiments. Numerical evidence is given by evaluation on a synthetic and a \u201creal\u201d dataset, that the invariances encoded by the Gabor scattering transform lead to higher performance in comparison with just using Gabor transform, especially when few training samples are available.<\/jats:p>","DOI":"10.3390\/axioms8040106","type":"journal-article","created":{"date-parts":[[2019,9,27]],"date-time":"2019-09-27T03:03:15Z","timestamp":1569553395000},"page":"106","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Gabor Frames and Deep Scattering Networks in Audio Processing"],"prefix":"10.3390","volume":"8","author":[{"given":"Roswitha","family":"Bammer","sequence":"first","affiliation":[{"name":"NuHAG, Faculty of Mathematics, University of Vienna, 1090 Wien, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6139-630X","authenticated-orcid":false,"given":"Monika","family":"D\u00f6rfler","sequence":"additional","affiliation":[{"name":"NuHAG, Faculty of Mathematics, University of Vienna, 1090 Wien, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5206-1794","authenticated-orcid":false,"given":"Pavol","family":"Harar","sequence":"additional","affiliation":[{"name":"NuHAG, Faculty of Mathematics, University of Vienna, 1090 Wien, Austria"},{"name":"Department of Telecommunications, Brno University of Technology, 60190 Brno, Czech Republic"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,9,26]]},"reference":[{"key":"ref_1","unstructured":"Pereira, F., Burges, C.J.C., Bottou, L., and Weinberger, K.Q. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems 25, Curran Associates, Inc."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Grill, T., and Schl\u00fcter, J. (2015, January 26\u201330). Music Boundary Detection Using Neural Networks on Combined Features and Two-Level Annotations. Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015), Malaga, Spain.","DOI":"10.1109\/EUSIPCO.2015.7362593"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1331","DOI":"10.1002\/cpa.21413","article-title":"Group Invariant Scattering","volume":"65","author":"Mallat","year":"2012","journal-title":"Comm. Pure Appl. Math."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1845","DOI":"10.1109\/TIT.2017.2776228","article-title":"A Mathematical Theory of Deep Convolutional Neural Networks for Feature Extraction","volume":"64","author":"Wiatowski","year":"2017","journal-title":"IEEE Trans. Inf. Theory"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Wiatowski, T., and B\u00f6lcskei, H. (2015, January 14\u201319). Deep Convolutional Neural Networks Based on Semi-Discrete Frames. Proceedings of the IEEE International Symposium on Information Theory (ISIT), Hong Kong, China.","DOI":"10.1109\/ISIT.2015.7282648"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"4114","DOI":"10.1109\/TSP.2014.2326991","article-title":"Deep Scattering Spectrum","volume":"62","author":"Mallat","year":"2014","journal-title":"IEEE Trans. Signal Process."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"And\u00e9n, J., Lostanlen, V., and Mallat, S. (2015, January 17\u201320). Joint time-frequency scattering for audio classification. Proceedings of the 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP), Boston, MA, USA.","DOI":"10.1109\/MLSP.2015.7324385"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Grohs, P., Wiatowski, T., and B\u00f6lcskei, H. (2016, January 10\u201315). Deep convolutional neural networks on cartoon functions. Proceedings of the 2016 IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain.","DOI":"10.1109\/ISIT.2016.7541482"},{"key":"ref_9","unstructured":"Romani Picas, O., Parra Rodriguez, H., Dabiri, D., Tokuda, H., Hariya, W., Oishi, K., and Serra, X. (2015). A real-time system for measuring sound goodness in instrumental sounds. Audio Engineering Society Convention 138, Audio Engineering Society."},{"key":"ref_10","unstructured":"Zhang, C., Bengio, S., Hardt, M., Recht, B., and Vinyals, O. (2016). Understanding deep learning requires rethinking generalization. arXiv."},{"key":"ref_11","unstructured":"Kawaguchi, K., Kaelbling, L.P., and Bengio, Y. (2017). Generalization in Deep Learning. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1171","DOI":"10.1214\/009053607000000677","article-title":"Kernel Methods in Machine Learning","volume":"36","author":"Hofmann","year":"2008","journal-title":"Ann. Stat."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Mallat, S. (2016). Understanding deep convolutional networks. Philos. Trans. R. Soc. Lond. A Math. Phys. Eng. Sci., 374.","DOI":"10.1098\/rsta.2015.0203"},{"key":"ref_14","unstructured":"Wiatowski, T., Tschannen, M., Stanic, A., Grohs, P., and B\u00f6lcskei, H. (2016, January 19\u201324). Discrete deep feature extraction: A theory and new architectures. Proceedings of the International Conference on Machine Learning (ICML), New York, NY, USA."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Gr\u00f6chenig, K. (2001). Foundations of Time-Frequency Analysis, Birkh\u00e4user. Applied and Numerical Harmonic Analysis.","DOI":"10.1007\/978-1-4612-0003-1"},{"key":"ref_16","unstructured":"Harar, P., and Bammer, R. (2019, June 20). gs-gt. Available online: https:\/\/gitlab.com\/hararticles\/gs-gt."},{"key":"ref_17","unstructured":"Harar, P. (2019, June 20). Gabor Scattering v0.0.4. Available online: https:\/\/gitlab.com\/paloha\/gabor-scattering."},{"key":"ref_18","unstructured":"Jones, E., Oliphant, T., and Peterson, P. (2019, February 01). SciPy: Open Source Scientific Tools for Python. Available online: http:\/\/www.scipy.org\/."},{"key":"ref_19","unstructured":"Oppenheim, A.V. (1999). Discrete-Time Signal Processing, Pearson Education India."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1109\/TASSP.1984.1164317","article-title":"Signal estimation from modified short-time Fourier transform","volume":"32","author":"Griffin","year":"1984","journal-title":"IEEE Trans. Acoust. Speech Signal Process."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kirkland, E.J. (2010). Bilinear interpolation. Advanced Computing in Electron Microscopy, Springer.","DOI":"10.1007\/978-1-4419-6533-2"},{"key":"ref_22","first-page":"249","article-title":"Understanding the difficulty of training deep feedforward neural networks","volume":"9","author":"Glorot","year":"2010","journal-title":"Aistats"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015, January 7\u201313). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.123"},{"key":"ref_24","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press."},{"key":"ref_25","unstructured":"Kingma, D., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_26","unstructured":"Chollet, F. (2019, August 19). Keras. Available online: https:\/\/keras.io."},{"key":"ref_27","unstructured":"Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, C., Davis, A., Dean, J., and Devin, M. (2019, February 01). TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Available online: https:\/\/www.tensorflow.org."},{"key":"ref_28","unstructured":"Bagwell, C. (2018, October 31). SoX\u2014Sound Exchange the Swiss Army Knife of Sound Processing. Available online: https:\/\/launchpad.net\/ubuntu\/+source\/sox\/14.4.1-5."},{"key":"ref_29","unstructured":"Navarrete, J. (2018, October 31). The SoX of Silence Tutorial. Available online: https:\/\/digitalcardboard.com\/blog\/2009\/08\/25\/the-sox-of-silence."},{"key":"ref_30","unstructured":"Bammer, R., Breger, A., D\u00f6rfler, M., Harar, P., and Sm\u00e9kal, Z. (2019). Machines listening to music: The role of signal representations in learning from music. arXiv."}],"container-title":["Axioms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2075-1680\/8\/4\/106\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:24:48Z","timestamp":1760189088000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2075-1680\/8\/4\/106"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,9,26]]},"references-count":30,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2019,12]]}},"alternative-id":["axioms8040106"],"URL":"https:\/\/doi.org\/10.3390\/axioms8040106","relation":{},"ISSN":["2075-1680"],"issn-type":[{"type":"electronic","value":"2075-1680"}],"subject":[],"published":{"date-parts":[[2019,9,26]]}}}