{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T09:08:35Z","timestamp":1765357715944,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":44,"publisher":"ACM","license":[{"start":{"date-parts":[[2024,10,28]],"date-time":"2024-10-28T00:00:00Z","timestamp":1730073600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/https:\/\/doi.org\/10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. 62322120, No.U21B2010, No. 62306316, No. 62206278"],"award-info":[{"award-number":["No. 62322120, No.U21B2010, No. 62306316, No. 62206278"]}],"id":[{"id":"10.13039\/https:\/\/doi.org\/10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2024,10,28]]},"DOI":"10.1145\/3664647.3681602","type":"proceedings-article","created":{"date-parts":[[2024,10,26]],"date-time":"2024-10-26T06:59:41Z","timestamp":1729925981000},"page":"1961-1970","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Utilizing Speaker Profiles for Impersonation Audio Detection"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-6044-8239","authenticated-orcid":false,"given":"Hao","family":"Gu","sequence":"first","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2422-4618","authenticated-orcid":false,"given":"Jiangyan","family":"Yi","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5785-7027","authenticated-orcid":false,"given":"Chenglong","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-9015-000X","authenticated-orcid":false,"given":"Yong","family":"Ren","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9344-6428","authenticated-orcid":false,"given":"Jianhua","family":"Tao","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6499-0272","authenticated-orcid":false,"given":"Xinrui","family":"Yan","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-4252-8862","authenticated-orcid":false,"given":"Yujie","family":"Chen","sequence":"additional","affiliation":[{"name":"Anhui University, Hefei, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9949-5415","authenticated-orcid":false,"given":"Xiaohui","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2024,10,28]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/BTAS.2013.6712706"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.21437\/INTERSPEECH.2019--3174"},{"key":"e_1_3_2_1_3_1","volume-title":"Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020","author":"Baevski Alexei","year":"2020","unstructured":"Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, and Michael Auli. 2020. wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations. 
In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6--12, 2020, virtual, Hugo Larochelle, Marc'Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin (Eds.). https:\/\/proceedings.neurips.cc\/paper\/2020\/hash\/92d1e1eb1cd6f9fba3227870bb6d7f07-Abstract.html"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.1995.479543"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2022.3188113"},{"key":"e_1_3_2_1_6_1","first-page":"119","article-title":"Automatic speaker recognition as a measurement of voice imitation and conversion. The Intenational Journal of Speech","volume":"1","author":"Cabeceran Mireia Farr\u00fas","year":"2010","unstructured":"Mireia Farr\u00fas Cabeceran, Michael Wagner, Daniel Erro Eslava, and Francisco Javier Hernando Peric\u00e1s. 2010. Automatic speaker recognition as a measurement of voice imitation and conversion. The Intenational Journal of Speech. Language and the Law, Vol. 1, 17 (2010), 119--142.","journal-title":"Language and the Law"},{"key":"e_1_3_2_1_7_1","volume-title":"Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021","author":"Frank Joel","year":"2021","unstructured":"Joel Frank and Lea Sch\u00f6nherr. 2021. WaveFake: A Data Set to Facilitate Audio Deepfake Detection. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021, December 2021, virtual, Joaquin Vanschoren and Sai-Kit Yeung (Eds.). https:\/\/datasets-benchmarks-proceedings.neurips.cc\/paper\/2021\/hash\/c74d97b01eae257e44aa9d5bade97baf-Abstract-round2.html"},{"key":"e_1_3_2_1_8_1","volume-title":"Raw differentiable architecture search for speech deepfake and spoofing detection. 
arXiv preprint arXiv:2107.12212","author":"Ge Wanying","year":"2021","unstructured":"Wanying Ge, Jose Patino, Massimiliano Todisco, and Nicholas Evans. 2021. Raw differentiable architecture search for speech deepfake and spoofing detection. arXiv preprint arXiv:2107.12212 (2021)."},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.21437\/INTERSPEECH.2013--289"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2021.3122291"},{"key":"e_1_3_2_1_12_1","volume-title":"Squeeze-and-Excitation Networks. CoRR","author":"Hu Jie","year":"2017","unstructured":"Jie Hu, Li Shen, and Gang Sun. 2017. Squeeze-and-Excitation Networks. CoRR, Vol. abs\/1709.01507 (2017). http:\/\/arxiv.org\/abs\/1709.01507"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP43922.2022.9747766"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.21437\/INTERSPEECH.2020--1011"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.21437\/INTERSPEECH.2022--126"},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021","author":"Khalid Hasam","year":"2021","unstructured":"Hasam Khalid, Shahroz Tariq, Minha Kim, and Simon S. Woo. 2021. FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021, December 2021, virtual, Joaquin Vanschoren and Sai-Kit Yeung (Eds.). 
https:\/\/datasets-benchmarks-proceedings.neurips.cc\/paper\/2021\/hash\/d9d4f495e875a2e075a1a4a6e1b9770f-Abstract-round2.html"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.21437\/INTERSPEECH.2017--1111"},{"key":"e_1_3_2_1_18_1","volume-title":"Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing","author":"Lau Yee Wah","year":"2004","unstructured":"Yee Wah Lau, Michael Wagner, and Dat Tran. 2004. Vulnerability of speaker verification to voice mimicking. In Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004. IEEE, 145--148."},{"key":"e_1_3_2_1_19_1","volume-title":"Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28 -","author":"De Leon Phillip L.","year":"2010","unstructured":"Phillip L. De Leon, Michael Pucher, and Junichi Yamagishi. 2010. Evaluation of the Vulnerability of Speaker Verification to Synthetic Speech. In Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28 - July 1, 2010. ISCA, 28. http:\/\/www.isca-speech.org\/archive_open\/odyssey_2010\/od10_028.html"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.21437\/INTERSPEECH.2022--108"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TBIOM.2021.3059479"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1609\/AAAI.V34I04.5985"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/SPED.2019.8906599"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00009"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1007\/978--3--319--92627--8_15"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9414234"},{"key":"e_1_3_2_1_27_1","volume-title":"Deep Graph Infomax. CoRR","author":"Velickovic Petar","year":"2018","unstructured":"Petar Velickovic, William Fedus, William L. 
Hamilton, Pietro Li\u00f2, Yoshua Bengio, and R. Devon Hjelm. 2018. Deep Graph Infomax. CoRR, Vol. abs\/1809.10341 (2018). http:\/\/arxiv.org\/abs\/1809.10341"},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2305.13701"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9414427"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.21437\/ODYSSEY.2022--14"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.CSL.2020.101114"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2018.2833032"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.SPECOM.2014.10.005"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2015.7178810"},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.21437\/INTERSPEECH.2015--462"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2022.3233005"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2304.08401"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.21437\/INTERSPEECH.2021--930"},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP43922.2022.9746939"},{"key":"e_1_3_2_1_41_1","volume-title":"Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, and Haizhou Li.","author":"Yi Jiangyan","year":"2023","unstructured":"Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, and Haizhou Li. 2023. ADD 2023: the Second Audio Deepfake Detection Challenge. 
In Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32th International Joint Conference on Artificial Intelligence (IJCAI 2023), Macao, China, August 19, 2023 (CEUR Workshop Proceedings, Vol. 3597), Jianhua Tao, Haizhou Li, Jiangyan Yi, and Cunhang Fan (Eds.). CEUR-WS.org, 125--130. https:\/\/ceur-ws.org\/Vol-3597\/paper21.pdf"},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2308.14970"},{"key":"e_1_3_2_1_43_1","volume-title":"LEAF: A Learnable Frontend for Audio Classification. In 9th International Conference on Learning Representations, ICLR 2021","author":"Zeghidour Neil","year":"2021","unstructured":"Neil Zeghidour, Olivier Teboul, F\u00e9lix de Chaumont Quitry, and Marco Tagliasacchi. 2021. LEAF: A Learnable Frontend for Audio Classification. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3--7, 2021. OpenReview.net. https:\/\/openreview.net\/forum?id=jM76BCb6F9m"},{"key":"e_1_3_2_1_44_1","volume-title":"Tomi Kinnunen, Zhen-Hua Ling, and Tomoki Toda.","author":"Zhao Yi","year":"2020","unstructured":"Yi Zhao, Wen-Chin Huang, Xiaohai Tian, Junichi Yamagishi, Rohan Kumar Das, Tomi Kinnunen, Zhen-Hua Ling, and Tomoki Toda. 2020. Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion. CoRR, Vol. abs\/2008.12527 (2020). 
https:\/\/arxiv.org\/abs\/2008.12527"}],"event":{"name":"MM '24: The 32nd ACM International Conference on Multimedia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Melbourne VIC Australia","acronym":"MM '24"},"container-title":["Proceedings of the 32nd ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3664647.3681602","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3664647.3681602","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:17:49Z","timestamp":1750295869000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3664647.3681602"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,28]]},"references-count":44,"alternative-id":["10.1145\/3664647.3681602","10.1145\/3664647"],"URL":"https:\/\/doi.org\/10.1145\/3664647.3681602","relation":{},"subject":[],"published":{"date-parts":[[2024,10,28]]},"assertion":[{"value":"2024-10-28","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}