{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,9]],"date-time":"2025-12-09T19:55:52Z","timestamp":1765310152142,"version":"3.46.0"},"publisher-location":"New York, NY, USA","reference-count":40,"publisher":"ACM","funder":[{"name":"Stable Supporting Fund of Acoustic Science and Technology Laboratory","award":["JCKYS2024604SSJS006"],"award-info":[{"award-number":["JCKYS2024604SSJS006"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,10,27]]},"DOI":"10.1145\/3746027.3755682","type":"proceedings-article","created":{"date-parts":[[2025,10,25]],"date-time":"2025-10-25T07:26:38Z","timestamp":1761377198000},"page":"660-668","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Adaspeaker: Learning Discriminative Speaker Representations with Gradient-Aware Adaptive Scaling"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-4354-2946","authenticated-orcid":false,"given":"Jinghan","family":"Liu","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0281-0336","authenticated-orcid":false,"given":"Xingmei","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1395-3962","authenticated-orcid":false,"given":"Jiaxiang","family":"Meng","sequence":"additional","affiliation":[{"name":"College of Software, Taiyuan University of Technology, Taiyuan, China"}]}],"member":"320","published-online":{"date-parts":[[2025,10,27]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2021.03.004"},{"key":"e_1_3_2_1_2_1","volume-title":"Arsha Nagrani, Daniel Garcia-Romero, and Andrew Zisserman.","author":"Brown Andrew","year":"2022","unstructured":"Andrew Brown, Jaesung Huh, Joon Son Chung, Arsha Nagrani, Daniel Garcia-Romero, and Andrew Zisserman. 2022. Voxsrc 2021: The third voxceleb speaker recognition challenge. arXiv:2201.04583"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2020-1064"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2020-1064"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2018-1929"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2011-64"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.23919\/APSIPA.2018.8659567"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00482"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2020-2650"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49357.2023.10094954"},{"key":"e_1_3_2_1_11_1","volume-title":"Kushal Lakhotia","author":"Hsu Wei-Ning","year":"2021","unstructured":"Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, and Abdelrahman Mohamed. 2021. Hubert: Self-supervised speech representation learning by masked prediction of hidden units. IEEE\/ACM transactions on audio, speech, and language processing, Vol. 29 (2021), 3451-3460."},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00594"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2024.3444456"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01819"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2009.08.009"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.122384"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2023.3314371"},{"key":"e_1_3_2_1_18_1","volume-title":"Kong Aik Lee, and Haizhou Li.","author":"Liu Tianchi","year":"2022","unstructured":"Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, and Haizhou Li. 2022. MFA: TDNN with Multi-Scale Frequency-Channel Attention for Text-Independent Speaker Verification with Short Utterances. In IEEE ICASSP. Institute of Electrical and Electronics Engineers Inc., Virtual, Online, Singapore, 7517-7521."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2024.3385277"},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.713"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2017-950"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8461375"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00643"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9414600"},{"key":"e_1_3_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2021-1570"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00552"},{"key":"e_1_3_2_1_28_1","first-page":"5301","article-title":"CAM: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking. In Interspeech 2023. Dublin","author":"Wang Hui","year":"2023","unstructured":"Hui Wang, Siqi Zheng, Yafeng Chen, Luyao Cheng, and Qian Chen. 2023b. CAM: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking. In Interspeech 2023. Dublin, Ireland, 5301-5305.","journal-title":"Ireland"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2024.103104"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.6906"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49357.2023.10095066"},{"key":"e_1_3_2_1_32_1","unstructured":"Yandong Wen Weiyang Liu Adrian Weller Bhiksha Raj and Rita Singh. 2022. Sphereface2: Binary classification is all you need for deep face recognition. In ICLR."},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/APSIPAASC47483.2019.9023039"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2022.3184415"},{"key":"e_1_3_2_1_35_1","first-page":"4618","article-title":"Adaptive Margin Circle Loss for Speaker Verification. In Interspeech. Brno","author":"Xiao Runqiu","year":"2021","unstructured":"Runqiu Xiao, Xiaoxiao Miao, Wenchao Wang, Pengyuan Zhang, Bin Cai, and Liuping Luo. 2021. Adaptive Margin Circle Loss for Speaker Verification. In Interspeech. Brno, Czechia, 4618-4622.","journal-title":"Czechia"},{"key":"e_1_3_2_1_36_1","first-page":"406","article-title":"Speaker Augmentation and Bandwidth Extension for Deep Speaker Embedding.. In Interspeech. Graz","author":"Yamamoto Hitoshi","year":"2019","unstructured":"Hitoshi Yamamoto, Kong Aik Lee, Koji Okabe, and Takafumi Koshinaka. 2019. Speaker Augmentation and Bandwidth Extension for Deep Speaker Embedding.. In Interspeech. Graz, Austria, 406-410.","journal-title":"Austria"},{"key":"e_1_3_2_1_37_1","first-page":"6717","article-title":"Attention Back-End for Automatic Speaker Verification with Multiple Enrollment Utterances. In IEEE ICASSP. Institute of Electrical and Electronics Engineers Inc., Virtual, Online","author":"Zeng Chang","year":"2022","unstructured":"Chang Zeng, Xin Wang, Erica Cooper, Xiaoxiao Miao, and Junichi Yamagishi. 2022. Attention Back-End for Automatic Speaker Verification with Multiple Enrollment Utterances. In IEEE ICASSP. Institute of Electrical and Electronics Engineers Inc., Virtual, Online, Singapore, 6717-6721.","journal-title":"Singapore"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01108"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP43922.2022.9746178"},{"key":"e_1_3_2_1_40_1","first-page":"3800","article-title":"Dynamic Margin Softmax Loss for Speaker Verification.. In INTERSPEECH. Shanghai","author":"Zhou Dao","year":"2020","unstructured":"Dao Zhou, Longbiao Wang, Kong Aik Lee, Yibo Wu, Meng Liu, Jianwu Dang, and Jianguo Wei. 2020. Dynamic Margin Softmax Loss for Speaker Verification.. In INTERSPEECH. Shanghai, China, 3800-3804.","journal-title":"China"}],"event":{"name":"MM '25: The 33rd ACM International Conference on Multimedia","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Dublin Ireland","acronym":"MM '25"},"container-title":["Proceedings of the 33rd ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3746027.3755682","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,9]],"date-time":"2025-12-09T19:52:04Z","timestamp":1765309924000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3746027.3755682"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,27]]},"references-count":40,"alternative-id":["10.1145\/3746027.3755682","10.1145\/3746027"],"URL":"https:\/\/doi.org\/10.1145\/3746027.3755682","relation":{},"subject":[],"published":{"date-parts":[[2025,10,27]]},"assertion":[{"value":"2025-10-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}