{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,24]],"date-time":"2025-12-24T12:20:36Z","timestamp":1766578836191,"version":"3.37.3"},"reference-count":50,"publisher":"Springer Science and Business Media LLC","issue":"34","license":[{"start":{"date-parts":[[2024,3,8]],"date-time":"2024-03-08T00:00:00Z","timestamp":1709856000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2024,3,8]],"date-time":"2024-03-08T00:00:00Z","timestamp":1709856000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62071484"],"award-info":[{"award-number":["62071484"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"DOI":"10.1007\/s11042-024-18318-5","type":"journal-article","created":{"date-parts":[[2024,3,8]],"date-time":"2024-03-08T06:29:26Z","timestamp":1709879366000},"page":"80847-80872","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["An efficient low-perceptual environmental sound classification adversarial method based on GAN"],"prefix":"10.1007","volume":"83","author":[{"given":"Qiang","family":"Zhang","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1072-5422","authenticated-orcid":false,"given":"Jibin","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Xiongwei","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Tieyong","family":"Cao","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,3,8]]},"reference":[{"key":"18318_CR1","doi-asserted-by":"publisher","unstructured":"Salamon J, Bello JP (2015) Unsupervised feature learning for urban sound classification. In: IEEE International conference on acoustics, speech and signal processing. South Brisbane, QLD, Australia. pp 171\u2013175. https:\/\/doi.org\/10.1109\/ICASSP.2015.7177954","DOI":"10.1109\/ICASSP.2015.7177954"},{"key":"18318_CR2","unstructured":"Zeghidour N, Teboul O, Quitry F de C, Tagliasacchi M (2021) LEAF: a learnable frontend for audio classification. In: The 9th international conference on learning representations. Virtual Event, Austria. https:\/\/openreview.net\/forum?id=jM76BCb6F9m"},{"key":"18318_CR3","doi-asserted-by":"publisher","first-page":"252","DOI":"10.1016\/j.eswa.2019.06.040","volume":"136","author":"S Abdoli","year":"2019","unstructured":"Abdoli S, Cardinal P, LameirasKoerich A (2019) End-to-end environmental sound classification using a 1D convolutional neural network. Expert Syst Appl 136:252\u2013263. https:\/\/doi.org\/10.1016\/j.eswa.2019.06.040","journal-title":"Expert Syst Appl"},{"key":"18318_CR4","unstructured":"Yuji Tokozume, Yoshitaka Ushiku TH (2018) Learning from between-class examples for deep sound recognition. In: The 6th international conference on learning representations. Vancouver, BC, Canada"},{"key":"18318_CR5","first-page":"892","volume":"29","author":"Y Aytar","year":"2016","unstructured":"Aytar Y, Vondrick C, Torralba A (2016) SoundNet: Learning sound representations from unlabeled video. Adv Neural Inf Process Syst 29:892\u2013900","journal-title":"Adv Neural Inf Process Syst"},{"key":"18318_CR6","doi-asserted-by":"publisher","unstructured":"Chen K, Du X, Zhu B et al (2022) HTS-AT: a hierarchical token-semantic audio transformer for sound classification and detection. In: IEEE international Conference on Acoustics, Speech and Signal Processing. Singapore, Singapore. pp 646\u2013650. https:\/\/doi.org\/10.1109\/ICASSP43922.2022.9746312","DOI":"10.1109\/ICASSP43922.2022.9746312"},{"key":"18318_CR7","unstructured":"Szegedy C, Zaremba W, Sutskever I et al (2014) Intriguing properties of neural networks. In: The 2nd International conference on learning representations. Banff, AB, Canada"},{"key":"18318_CR8","doi-asserted-by":"publisher","unstructured":"Du T, Ji S, Li J et al (2020) SirenAttack: generating adversarial audio for end-to-end acoustic systems. In: ACM Asia Conference on Computer and Communications Security. pp 357\u2013369. https:\/\/doi.org\/10.1145\/3320269.3384733","DOI":"10.1145\/3320269.3384733"},{"key":"18318_CR9","unstructured":"Abdoli S, Hafemann LG, Rony J et al (2019) Universal adversarial audio perturbations. arXiv: 1908.03173. https:\/\/arxiv.org\/pdf\/1908.03173v5"},{"key":"18318_CR10","doi-asserted-by":"publisher","first-page":"2147","DOI":"10.1109\/TIFS.2019.2956591","volume":"15","author":"M Esmaeilpour","year":"2019","unstructured":"Esmaeilpour M, Cardinal P, Koerich AL (2019) A robust approach for securing audio classification against adversarial attacks. IEEE Trans Inf Forensics Secur. 15. pp 2147\u20132159. https:\/\/doi.org\/10.1109\/TIFS.2019.2956591","journal-title":"IEEE Trans Inf Forensics Secur"},{"key":"18318_CR11","doi-asserted-by":"publisher","unstructured":"Tripathi AM, Mishra A (2022) Adv-ESC: adversarial attack datasets for an environmental sound classification. Appl Acoust 185:108437. https:\/\/doi.org\/10.1016\/j.apacoust.2021.108437","DOI":"10.1016\/j.apacoust.2021.108437"},{"key":"18318_CR12","doi-asserted-by":"publisher","unstructured":"Esmaeilpour M, Cardinal P, Koerich AL (2022) From environmental sound representation to robustness of 2D CNN models against adversarial attacks. Appl Acoust 195:108817. https:\/\/doi.org\/10.1016\/j.apacoust.2022.108817","DOI":"10.1016\/j.apacoust.2022.108817"},{"key":"18318_CR13","unstructured":"Goodfellow IJ, Shlens J, Szegedy C (2015) Explaining and harnessing adversarial examples. In: The 3rd international conference on learning representations. San Diego, CA, USA"},{"key":"18318_CR14","doi-asserted-by":"crossref","unstructured":"Kurakin A, Goodfellow IJ, Bengio S (2017) Adversarial examples in the physical world. In: The 5th International Conference on Learning Representations. Toulon, France","DOI":"10.1201\/9781351251389-8"},{"key":"18318_CR15","unstructured":"Madry A, Makelov A, Schmidt L et al (2018) Towards deep learning models resistant to adversarial attacks. In: The 6th International Conference on Learning Representations. Vancouver, BC, Canada"},{"key":"18318_CR16","doi-asserted-by":"publisher","unstructured":"Carlini N, Wagner D (2017) Towards evaluating the robustness of neural networks. In: IEEE symposium on security and privacy. San Jose, CA, USA, pp 39\u201357. https:\/\/doi.org\/10.1109\/SP.2017.49","DOI":"10.1109\/SP.2017.49"},{"key":"18318_CR17","doi-asserted-by":"crossref","unstructured":"Eberhart R, Kennedy J (1995) A new optimizer using particle swarm theory. In: The 6th International symposium on micro machine and human science. pp 39\u201343","DOI":"10.1109\/MHS.1995.494215"},{"key":"18318_CR18","doi-asserted-by":"publisher","unstructured":"Xie Y, Li Z, Shi C et al (2021) Enabling fast and universal audio adversarial attack using generative model. In: The 35th Conference on Artificial Intelligence. pp 14129\u201314137. https:\/\/doi.org\/10.1609\/aaai.v35i16.17663","DOI":"10.1609\/aaai.v35i16.17663"},{"key":"18318_CR19","doi-asserted-by":"publisher","first-page":"124503","DOI":"10.1109\/ACCESS.2020.3006130","volume":"8","author":"D Wang","year":"2020","unstructured":"Wang D, Dong L, Wang R et al (2020) Targeted speech adversarial example generation with generative adversarial network. IEEE Access 8:124503\u2013124513. https:\/\/doi.org\/10.1109\/ACCESS.2020.3006130","journal-title":"IEEE Access"},{"key":"18318_CR20","doi-asserted-by":"publisher","unstructured":"Zhang Q, Yang J, Zhang X, Cao T (2022) Generating adversarial examples in audio classification with generative adversarial network. In: The 7th international conference on image, vision and computing, Xi\u2019an, China, IEEE. pp 848\u2013853. https:\/\/doi.org\/10.1109\/ICIVC55077.2022.9886154","DOI":"10.1109\/ICIVC55077.2022.9886154"},{"key":"18318_CR21","doi-asserted-by":"publisher","unstructured":"Xiao C, Li B, Zhu JY et al (2018) Generating adversarial examples with adversarial networks. In: The 27th International Joint Conference on Artificial Intelligence. pp 3905\u20133911. https:\/\/doi.org\/10.24963\/ijcai.2018\/543","DOI":"10.24963\/ijcai.2018\/543"},{"key":"18318_CR22","doi-asserted-by":"publisher","unstructured":"Jandial S, Mangla P, Varshney S, Balasubramanian VN (2019) AdvGAN++: harnessing latent layers for adversary generation. In: International conference on computer vision workshops. IEEE, pp 2045\u20132048. https:\/\/doi.org\/10.1109\/ICCVW.2019.00257","DOI":"10.1109\/ICCVW.2019.00257"},{"key":"18318_CR23","doi-asserted-by":"publisher","unstructured":"Deb D, Zhang J, Jain AK (2020) AdvFaces:adversarial face synthesis. In: International joint conference on biometrics. IEEE, pp 1\u201310. https:\/\/doi.org\/10.1109\/IJCB48548.2020.9304898","DOI":"10.1109\/IJCB48548.2020.9304898"},{"key":"18318_CR24","doi-asserted-by":"publisher","unstructured":"Liu X, Wan K, Ding Y (2020) Towards weighted-sampling audio adversarial example attack. In: AAAI Conference on Artificial Intelligence. pp 4908\u20134915. https:\/\/doi.org\/10.48550\/arXiv.1901.10300","DOI":"10.48550\/arXiv.1901.10300"},{"key":"18318_CR25","doi-asserted-by":"publisher","unstructured":"Salamon J, Jacoby C, Bello JP (2014) A dataset and taxonomy for urban sound research. In: ACM Conference on Multimedia. Orlando, FL, USA, pp 1041\u20131044. https:\/\/doi.org\/10.1145\/2647868.2655045","DOI":"10.1145\/2647868.2655045"},{"key":"18318_CR26","unstructured":"Stoller D, Ewert S, Dixon S (2018) Wave-U-Net:a multi-scale neural network for end-to-end audio source separation. In: The 19th International Society for Music Information Retrieval Conference. pp 334\u2013340"},{"key":"18318_CR27","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1109\/MSP.2017.2765202","volume":"35","author":"I Goodfellow","year":"2018","unstructured":"Goodfellow I, Pouget-Abadie J, Mirza M et al (2018) Generative adversarial networks. IEEE Signal Process Mag 35:53\u201365","journal-title":"IEEE Signal Process Mag"},{"key":"18318_CR28","doi-asserted-by":"publisher","unstructured":"Kashani HB, Jodeiri A, Goodarzi MM, Rezaei IS (2019) Speech enhancement via deep spectrum image translation network. In: International Iranian conference on biomedical engineering. Tehran, Iran. pp 145\u2013151. https:\/\/doi.org\/10.1109\/ICBME49163.2019.9030421","DOI":"10.1109\/ICBME49163.2019.9030421"},{"key":"18318_CR29","doi-asserted-by":"publisher","unstructured":"Pascual S, Bonafonte A, Serra J (2017) SEGAN: speech enhancement generative adversarial network. Proc Interspeech pp 3642\u20133646. https:\/\/doi.org\/10.21437\/Interspeech.2017-1428","DOI":"10.21437\/Interspeech.2017-1428"},{"key":"18318_CR30","doi-asserted-by":"publisher","unstructured":"Yuji Tokozume TH (2017) Learning environmental sounds with end-to-end convolutional neural network. In: IEEE international conference on acoustics, speech and signal processing. pp 2721\u20132725. https:\/\/doi.org\/10.1109\/ICASSP.2017.7952651","DOI":"10.1109\/ICASSP.2017.7952651"},{"key":"18318_CR31","doi-asserted-by":"publisher","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: IEEE computer society conference on computer vision and pattern recognition. Las Vegas, NV, USA, pp 770\u2013778. https:\/\/doi.org\/10.1109\/CVPR.2016.90","DOI":"10.1109\/CVPR.2016.90"},{"key":"18318_CR32","doi-asserted-by":"publisher","first-page":"730","DOI":"10.1080\/17512786.2015.1058180","volume":"10","author":"S Joseph","year":"2016","unstructured":"Joseph S (2016) Batch normalization: Accelerating deep network training by reducing internal covariate shift. J Pract 10:730\u2013743. https:\/\/doi.org\/10.1080\/17512786.2015.1058180","journal-title":"J Pract"},{"key":"18318_CR33","doi-asserted-by":"publisher","unstructured":"He K, Zhang X, Ren S, Sun J\u00a0(2014) Delving deep into rectifiers: surpassing human-level performance on ImageNet Classification. In: International conference on computer vision. pp 1026\u20131034. https:\/\/doi.org\/10.1109\/ICCV.2015.123","DOI":"10.1109\/ICCV.2015.123"},{"key":"18318_CR34","doi-asserted-by":"publisher","first-page":"537","DOI":"10.1049\/cvi2.12138","volume":"17","author":"G Wu","year":"2023","unstructured":"Wu G, He F, Zhou Y et al (2023) ACGAN: Age-compensated makeup transfer based on homologous continuity generative adversarial network model. IET Comput Vis 17:537\u2013548. https:\/\/doi.org\/10.1049\/cvi2.12138","journal-title":"IET Comput Vis"},{"key":"18318_CR35","unstructured":"Kingma DP, Ba JL (2015) Adam: a method for stochastic optimization. In: The 3rd International Conference on Learning Representations. pp 1\u201315"},{"key":"18318_CR36","first-page":"18795","volume":"33","author":"J Zhuang","year":"2020","unstructured":"Zhuang J, Tang T, Ding Y et al (2020) AdaBelief optimizer: Adapting stepsizes by the belief in observed gradient. Adv Neural Inf Process Syst 33:18795\u201318806","journal-title":"Adv Neural Inf Process Syst"},{"key":"18318_CR37","first-page":"8026","volume":"32","author":"A Paszke","year":"2019","unstructured":"Paszke A, Gross S, Massa F et al (2019) PyTorch: An imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32:8026\u20138037","journal-title":"Adv Neural Inf Process Syst"},{"key":"18318_CR38","doi-asserted-by":"publisher","unstructured":"Piczak KJ (2015) ESC: dataset for environmental sound classification. In: 23th ACM multimedia conference. Brisbane, Australia, pp 1015\u20131018. https:\/\/doi.org\/10.1145\/2733373.2806390","DOI":"10.1145\/2733373.2806390"},{"key":"18318_CR39","doi-asserted-by":"publisher","first-page":"102495","DOI":"10.1016\/j.cose.2021.102495","volume":"112","author":"J Vadillo","year":"2022","unstructured":"Vadillo J, Santana R (2022) On the human evaluation of universal audio adversarial perturbations. Comput Secur 112:102495. https:\/\/doi.org\/10.1016\/j.cose.2021.102495","journal-title":"Comput Secur"},{"key":"18318_CR40","unstructured":"Salimans T, Goodfellow I, Zaremba W et al (2016) Improved techniques for training GANs. In: The 30th international conference on neural information processing systems. Barcelona, Spain, pp 2234\u20132242"},{"key":"18318_CR41","unstructured":"U. D (2018) Keep calm and train a GAN. Pitfalls and tips on training generative adversarial networks. Medium. https:\/\/medium.com\/@utk.is.here\/keep-calm-and-train-a-gan-pitfalls-and-tips-on-training-generative-adversarial-networks-edd529764aa9"},{"key":"18318_CR42","doi-asserted-by":"publisher","first-page":"19843","DOI":"10.1007\/s10489-023-04532-5","volume":"53","author":"L Schwinn","year":"2023","unstructured":"Schwinn L, Raab R, Nguyen A et al (2023) Exploring misclassifications of robust neural networks to enhance adversarial attacks. Appl Intell 53:19843\u201319859. https:\/\/doi.org\/10.1007\/s10489-023-04532-5","journal-title":"Appl Intell"},{"key":"18318_CR43","doi-asserted-by":"publisher","first-page":"1097","DOI":"10.1201\/9781420010749","volume":"25","author":"A Krizhevsky","year":"2012","unstructured":"Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25:1097\u20131105. https:\/\/doi.org\/10.1201\/9781420010749","journal-title":"Adv Neural Inf Process Syst"},{"key":"18318_CR44","doi-asserted-by":"publisher","unstructured":"Szegedy C, Liu W, Jia Y et al (2015) Going deeper with convolutions. In: IEEE computer society conference on computer vision and pattern recognition. Boston, MA, USA, pp 1\u20139. https:\/\/doi.org\/10.1109\/CVPR.2015.7298594","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"18318_CR45","doi-asserted-by":"publisher","unstructured":"Hershey S, Chaudhuri S, Ellis DPW et al (2016) CNN architectures for large-scale audio classification. IEEE Int Conf Acoust Speech Signal Process. pp 131\u2013135. https:\/\/doi.org\/10.1109\/ICASSP.2017.7952132","DOI":"10.1109\/ICASSP.2017.7952132"},{"key":"18318_CR46","doi-asserted-by":"publisher","unstructured":"Carlini N, Wagner D (2018) Audio adversarial examples: targeted attacks on speech-to-text. In: IEEE symposium on security and privacy workshops. pp 1\u20137. https:\/\/doi.org\/10.1109\/SPW.2018.00009","DOI":"10.1109\/SPW.2018.00009"},{"key":"18318_CR47","doi-asserted-by":"publisher","first-page":"3154","DOI":"10.1109\/TPAMI.2020.2978474","volume":"43","author":"A Mustafa","year":"2021","unstructured":"Mustafa A, Khan SH, Hayat M et al (2021) Deeply supervised discriminative learning for adversarial defense. IEEE Trans Pattern Anal Mach Intell 43:3154\u20133166. https:\/\/doi.org\/10.1109\/TPAMI.2020.2978474","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"18318_CR48","unstructured":"Tram\u00e8r F, Papernot N, Goodfellow I et al (2017) The space of transferable adversarial examples. arXiv Prepr arXiv170403453 1\u201315"},{"key":"18318_CR49","unstructured":"Naseer M, Khan S, Khan MH et al (2019) Cross-domain transferability of adversarial perturbations. In: Annual conference on neural information processing systems. pp 12885\u201312895"},{"key":"18318_CR50","doi-asserted-by":"publisher","unstructured":"Hssayni E Houssaine, Joudar N-E, Ettaouil M (2023) Localization and reduction of redundancy in CNN using L1-sparsity induction. J Ambient Intell Humaniz Comput 14:13715\u201313727. https:\/\/doi.org\/10.1007\/s12652-022-04025-2","DOI":"10.1007\/s12652-022-04025-2"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-024-18318-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-024-18318-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-024-18318-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,9]],"date-time":"2024-10-09T12:11:13Z","timestamp":1728475873000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-024-18318-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,8]]},"references-count":50,"journal-issue":{"issue":"34","published-online":{"date-parts":[[2024,10]]}},"alternative-id":["18318"],"URL":"https:\/\/doi.org\/10.1007\/s11042-024-18318-5","relation":{},"ISSN":["1573-7721"],"issn-type":[{"type":"electronic","value":"1573-7721"}],"subject":[],"published":{"date-parts":[[2024,3,8]]},"assertion":[{"value":"8 August 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 December 2023","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 January 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 March 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}