{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T17:00:57Z","timestamp":1769706057140,"version":"3.49.0"},"reference-count":29,"publisher":"SAGE Publications","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IFS"],"published-print":{"date-parts":[[2023,8,1]]},"abstract":"<jats:p>The advent of smart devices has enabled the emergence of many applications that facilitate user interaction through speech. However, speech reveals private and sensitive information about the user\u2019s identity, posing several security risks. For example, a speaker\u2019s speech can be acquired and used in speech synthesis systems to generate fake speech recordings that can be used to attack that speaker\u2019s verification system. One solution is to anonymize the speaker\u2019s identity from speech before using it. Existing anonymization schemes rely on using a pool of real speakers\u2019 identities for anonymization, which may result in associating a speaker\u2019s speech with an existing speaker. Hence, this paper investigates the use of Generative Adversarial Networks (GAN) to generate a pool of fake identities that are used for anonymization. 
Several GAN types were considered for this purpose, and the Conditional Tabular GAN (CTGAN) showed the best performance among all GAN types according to different metrics that measure the naturalness of the anonymized speech and its linguistic content.<\/jats:p>","DOI":"10.3233\/jifs-223642","type":"journal-article","created":{"date-parts":[[2023,6,13]],"date-time":"2023-06-13T10:11:54Z","timestamp":1686651114000},"page":"3345-3359","source":"Crossref","is-referenced-by-count":3,"title":["Speaker anonymization using generative adversarial networks"],"prefix":"10.1177","volume":"45","author":[{"given":"Aya","family":"Jafari","sequence":"first","affiliation":[{"name":"Computer Engineering Department, Princess Sumaya University for Technology, Amman, Jordan"}]},{"given":"Amjed","family":"Al-Mousa","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, Princess Sumaya University for Technology, Amman, Jordan"}]},{"given":"Iyad","family":"Jafar","sequence":"additional","affiliation":[{"name":"Computer Engineering Department, University of Jordan, Amman, Jordan"}]}],"member":"179","reference":[{"key":"10.3233\/JIFS-223642_ref1","doi-asserted-by":"crossref","first-page":"749","DOI":"10.1109\/ICASSP.2014.6853696","article-title":"Evasion and obfuscation in automatic speaker verification","author":"Alegre","year":"2014","journal-title":"2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)"},{"key":"10.3233\/JIFS-223642_ref2","first-page":"214","article-title":"Wasserstein generative adversarial networks","author":"Arjovsky","year":"2017","journal-title":"International conference on machine learning"},{"key":"10.3233\/JIFS-223642_ref3","doi-asserted-by":"crossref","first-page":"255","DOI":"10.21437\/Odyssey.2018-36","article-title":"Convolutional neural network based speaker 
deidentification","author":"Bahmaninezhad","year":"2018","journal-title":"Odyssey"},{"issue":"2006","key":"10.3233\/JIFS-223642_ref4","first-page":"230","article-title":"Application-independent evaluation of speaker detection","volume":"20","author":"Br\u00fcmmer","journal-title":"Computer Speech & Language"},{"key":"10.3233\/JIFS-223642_ref5","unstructured":"Champion Pierre , Jouvet Denis and Larcher Anthony , A study of f0 modification for x-vector based speech pseudonymization across gender, arXiv preprint arXiv:2101.08478, 2021."},{"key":"10.3233\/JIFS-223642_ref6","doi-asserted-by":"crossref","unstructured":"Fang Fuming , Wang Xin , Yamagishi Junichi , Echizen Isao , Todisco Massimiliano , Evans Nicholas and Bonastre Jean-Francois , Speaker anonymization using x-vector and neural waveform models, arXiv preprint arXiv:1905.13561, 2019.","DOI":"10.21437\/SSW.2019-28"},{"key":"10.3233\/JIFS-223642_ref7","unstructured":"Goodfellow Ian J , Pouget-Abadie Jean , Mirza Mehdi , Xu Bing , Warde-Farley David , Ozair Sherjil , Courville Aaron and Bengio Yoshua , Generative adversarial networks, arXiv preprint arXiv:1406.2661, 2014."},{"key":"10.3233\/JIFS-223642_ref8","unstructured":"Hannun Awni , Case Carl , Casper Jared , Catanzaro Bryan , Diamos Greg , Elsen Erich , Prenger Ryan , Satheesh Sanjeev , Sengupta Shubho , Coates Adam , et al. Deep speech: Scaling up end-to-end speech recognition, arXiv preprint arXiv:1412.5567, 2014."},{"key":"10.3233\/JIFS-223642_ref9","doi-asserted-by":"crossref","unstructured":"Hashimoto Kei , Yamagishi Junichi and Echizen Isao , Privacy preserving sound to degrade automatic speaker verification performance. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 5500\u20135504. 
IEEE 2016.","DOI":"10.1109\/ICASSP.2016.7472729"},{"key":"10.3233\/JIFS-223642_ref10","first-page":"531","article-title":"Probabilistic linear discriminant analysis","volume":"3954","author":"Ioffe","year":"2006","journal-title":"Computer Vision\u2013ECCV"},{"key":"10.3233\/JIFS-223642_ref11","doi-asserted-by":"crossref","unstructured":"Jin Qin , Toth Arthur R , Schultz Tanja and Black Alan W , Speaker de-identification via voice transformation. In 2009 IEEE Workshop on Automatic Speech Recognition & Understanding pages 529\u2013533. IEEE, 2009.","DOI":"10.1109\/ASRU.2009.5373356"},{"key":"10.3233\/JIFS-223642_ref12","first-page":"1","article-title":"Speaker de-identification using diphone recognition and speech synthesis","volume":"4","author":"Justin","year":"2015","journal-title":"2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG)"},{"key":"10.3233\/JIFS-223642_ref13","unstructured":"Knote Robin , Janson Andreas , Eigenbrod Laura and S\u00f6llner Matthias , The what and how of smart personal assistants: principles and application domains for IS research, Multikonferenz Wirtschaftsinformatik (MKWI), 2018."},{"key":"10.3233\/JIFS-223642_ref14","unstructured":"Google LLC. voice-search-mobile-use-statistics. https:\/\/www.thinkwithgoogle.com\/marketing-strategies\/search\/voice-search-mobile-use-statistics\/, May 2021."},{"key":"10.3233\/JIFS-223642_ref15","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.csl.2017.05.001","article-title":"Reversible speaker de-identification using pre-trained transformation functions","volume":"46","author":"Magarinos","year":"2017","journal-title":"Computer Speech & Language"},{"key":"10.3233\/JIFS-223642_ref16","unstructured":"Mirza Mehdi and Osindero Simon , Conditional generative adversarial nets, arXiv preprint arXiv:1411.1784, 2014."},{"key":"10.3233\/JIFS-223642_ref17","unstructured":"Persaud Mark , Where is voice tech going? 
https:\/\/techcrunch.com\/2020\/07\/29\/voice-tech-in-2020\/, July, 2020."},{"key":"10.3233\/JIFS-223642_ref18","doi-asserted-by":"crossref","first-page":"1264","DOI":"10.1109\/MIPRO.2014.6859761","article-title":"Online speaker de-identification using voice transformation","author":"Pobar","year":"2014","journal-title":"2014 37th International convention on information and communication technology, electronics and microelectronics (mipro)"},{"key":"10.3233\/JIFS-223642_ref19","unstructured":"Povey Daniel , Ghoshal Arnab , Boulianne Gilles , Burget Lukas , Glembek Ondrej , Goel Nagendra , Hannemann Mirko , Motlicek Petr , Qian Yanmin , Schwarz Petr , et al. The kaldi speech recognition toolkit. In IEEE 2011 workshop on automatic speech recognition and understanding. IEEE Signal Processing Society, 2011."},{"key":"10.3233\/JIFS-223642_ref20","first-page":"1","article-title":"Probabilistic linear discriminant analysis for inferences about identity","author":"Prince","year":"2007","journal-title":"2007 IEEE 11th International Conference on Computer Vision"},{"key":"10.3233\/JIFS-223642_ref21","unstructured":"Qian Jianwei , Du Haohua , Hou Jiahui , Chen Linlin , Jung Taeho , Li Xiang-Yang , Wang Yu and Deng Yanbo , Voicemask: Anonymize and sanitize voice input on mobile devices, arXiv preprint arXiv:1711.11460, 2017."},{"key":"10.3233\/JIFS-223642_ref22","unstructured":"Radford Alec , Metz Luke and Chintala Soumith , Unsupervised representation learning with deep convolutional generative adversarial networks, arXiv preprint arXiv:1511.06434, 2015."},{"key":"10.3233\/JIFS-223642_ref23","doi-asserted-by":"crossref","first-page":"5329","DOI":"10.1109\/ICASSP.2018.8461375","article-title":"X-vectors: Robust dnn embeddings for speaker recognition","author":"Snyder","year":"2018","journal-title":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)"},{"key":"10.3233\/JIFS-223642_ref24","unstructured":"Tomashenko Natalia , Srivastava Brij Mohan Lal 
, Wang Xin , Vincent Emmanuel , Nautsch Andreas , Yamagishi Junichi , Evans Nicholas , Patino Jose , Bonastre Jean-Francois , No\u00e9 Paul-Gauthier , et al. The voiceprivacy 2020 challenge plan, 2020."},{"key":"10.3233\/JIFS-223642_ref25","unstructured":"Turner Henry , Lovisotto Giulio and Martinovic Ivan , Speaker anonymization with distribution-preserving x-vector generation for the voiceprivacy challenge 2020, arXiv preprint arXiv:2010.13457, 2020."},{"key":"10.3233\/JIFS-223642_ref26","first-page":"3152676","article-title":"The eu general data protection regulation (gdpr)","volume":"10","author":"Voigt","year":"2017","journal-title":"A Practical Guide, 1st Ed., Cham: Springer International Publishing"},{"key":"10.3233\/JIFS-223642_ref27","doi-asserted-by":"crossref","unstructured":"Wang Xin and Yamagishi Junichi , Neural harmonic-plus-noise waveform model with trainable maximum voice frequency for text-to-speech synthesis, arXiv preprint arXiv:1908.10256, 2019.","DOI":"10.21437\/SSW.2019-1"},{"key":"10.3233\/JIFS-223642_ref28","unstructured":"Xu Lei , Skoularidou Maria , Cuesta-Infante Alfredo and Veeramachaneni Kalyan , Modeling tabular data using conditional gan, arXiv preprint arXiv:1907.00503, 2019."},{"key":"10.3233\/JIFS-223642_ref29","doi-asserted-by":"crossref","unstructured":"Zeng Chang , Wang Xin , Cooper Erica , Miao Xiaoxiao and Yamagishi Junichi , Attention back-end for automatic speaker verification with multiple enrollment utterances, ICASSP 2022 \u2013 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.","DOI":"10.1109\/ICASSP43922.2022.9746688"}],"container-title":["Journal of Intelligent & Fuzzy 
Systems"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/JIFS-223642","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T05:49:19Z","timestamp":1769665759000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/JIFS-223642"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,8,1]]},"references-count":29,"journal-issue":{"issue":"2"},"URL":"https:\/\/doi.org\/10.3233\/jifs-223642","relation":{},"ISSN":["1064-1246","1875-8967"],"issn-type":[{"value":"1064-1246","type":"print"},{"value":"1875-8967","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,8,1]]}}}