{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T00:46:03Z","timestamp":1772844363924,"version":"3.50.1"},"reference-count":79,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2023,7,12]],"date-time":"2023-07-12T00:00:00Z","timestamp":1689120000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,11,30]]},"abstract":"<jats:p>With the recent development of deep convolutional neural networks and large-scale datasets, deep face recognition has made remarkable progress and been widely used in various applications. However, unlike the existing public face datasets, in many real-world scenarios of face recognition, the depth of the training dataset is shallow, which means that only two face images are available for each ID. With the non-uniform increase of samples, such issue is converted to a more general case, known as long-tail face learning, which suffers from data imbalance and intra-class diversity dearth simultaneously. These adverse conditions damage the training and result in the decline of model performance. Based on Semi-Siamese Training, we introduce an advanced solution, named <jats:italic>Multi-Agent Semi-Siamese Training<\/jats:italic> (MASST), to address these problems. MASST includes a probe network and multiple gallery agents\u2014the former aims to encode the probe features, and the latter constitutes a stack of networks that encode the prototypes (gallery features). For each training iteration, the gallery network, which is sequentially rotated from the stack, and the probe network form a pair of Semi-Siamese networks. 
We give the theoretical and empirical analysis that, given the long-tail (or shallow) data and training loss, MASST smooths the loss landscape and satisfies the Lipschitz continuity with the help of multiple agents and the updating gallery queue. The proposed method is out of extra-dependency, and thus can be easily integrated with the existing loss functions and network architectures. It is worth noting that although multiple gallery agents are employed for training, only the probe network is needed for inference, without increasing the inference cost. Extensive experiments and comparisons demonstrate the advantages of MASST for long-tail and shallow face learning.<\/jats:p>","DOI":"10.1145\/3594669","type":"journal-article","created":{"date-parts":[[2023,4,26]],"date-time":"2023-04-26T11:46:25Z","timestamp":1682509585000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Multi-Agent Semi-Siamese Training for Long-Tail and Shallow Face Learning"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3656-2593","authenticated-orcid":false,"given":"Yichun","family":"Tai","sequence":"first","affiliation":[{"name":"Shanghai University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3603-2683","authenticated-orcid":false,"given":"Hailin","family":"Shi","sequence":"additional","affiliation":[{"name":"JD AI Research"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1300-1769","authenticated-orcid":false,"given":"Dan","family":"Zeng","sequence":"additional","affiliation":[{"name":"Shanghai University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1066-4399","authenticated-orcid":false,"given":"Hang","family":"Du","sequence":"additional","affiliation":[{"name":"Shanghai University"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-7762-6986","authenticated-orcid":false,"given":"Yibo","family":"Hu","sequence":"additional","affiliation":[{"name":"JD AI 
Research"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9553-7358","authenticated-orcid":false,"given":"Zicheng","family":"Zhang","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7699-0747","authenticated-orcid":false,"given":"Zhijiang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Shanghai University"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5990-7307","authenticated-orcid":false,"given":"Tao","family":"Mei","sequence":"additional","affiliation":[{"name":"JD AI Research"}]}],"member":"320","published-online":{"date-parts":[[2023,7,12]]},"reference":[{"key":"e_1_3_1_2_2","volume-title":"Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201922)","author":"Alshammari Shaden","year":"2022","unstructured":"Shaden Alshammari, Yuxiong Wang, Deva Ramanan, and Shu Kong. 2022. Long-tailed recognition via weight balancing. In Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201922). 6887\u20136897."},{"key":"e_1_3_1_3_2","doi-asserted-by":"crossref","first-page":"249","DOI":"10.1016\/j.neunet.2018.07.011","article-title":"A systematic study of the class imbalance problem in convolutional neural networks","volume":"106","author":"Buda Mateusz","year":"2018","unstructured":"Mateusz Buda, Atsuto Maki, and Maciej A. Mazurowski. 2018. A systematic study of the class imbalance problem in convolutional neural networks. Neural Networks 106 (2018), 249\u2013259.","journal-title":"Neural Networks"},{"key":"e_1_3_1_4_2","first-page":"67","volume-title":"Proceedings of the 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG\u201918)","author":"Cao Qiong","year":"2018","unstructured":"Qiong Cao, Li Shen, Weidi Xie, Omkar M. Parkhi, and Andrew Zisserman. 2018. VGGFace2: A dataset for recognising faces across pose and age. 
In Proceedings of the 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG\u201918). IEEE, Los Alamitos, CA, 67\u201374."},{"key":"e_1_3_1_5_2","doi-asserted-by":"crossref","first-page":"321","DOI":"10.1613\/jair.953","article-title":"SMOTE: Synthetic minority over-sampling technique","volume":"16","author":"Chawla Nitesh V.","year":"2002","unstructured":"Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research 16 (2002), 321\u2013357.","journal-title":"Journal of Artificial Intelligence Research"},{"key":"e_1_3_1_6_2","doi-asserted-by":"crossref","first-page":"428","DOI":"10.1007\/978-3-319-97909-0_46","volume-title":"Proceedings of the Chinese Conference on Biometric Recognition","author":"Chen Sheng","year":"2018","unstructured":"Sheng Chen, Yang Liu, Xiang Gao, and Zhen Han. 2018. MobileFaceNets: Efficient CNNs for accurate real-time face verification on mobile devices. In Proceedings of the Chinese Conference on Biometric Recognition. 428\u2013438."},{"key":"e_1_3_1_7_2","first-page":"1924","volume-title":"Proceedings of the IEEE International Conference on Computer Vision Workshops","author":"Cheng Yu","year":"2017","unstructured":"Yu Cheng, Jian Zhao, Zhecan Wang, Yan Xu, Karlekar Jayashree, Shengmei Shen, and Jiashi Feng. 2017. Know you at one glance: A compact vector representation for low-shot learning. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 1924\u20131932."},{"key":"e_1_3_1_8_2","article-title":"Surveillance face recognition challenge","author":"Cheng Zhiyi","year":"2018","unstructured":"Zhiyi Cheng, Xiatian Zhu, and Shaogang Gong. 2018. Surveillance face recognition challenge. 
arXiv preprint arXiv:1804.09691 (2018).","journal-title":"arXiv preprint arXiv:1804.09691"},{"key":"e_1_3_1_9_2","first-page":"715","volume-title":"Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (CVPR\u201921)","author":"Cui Jiequan","year":"2021","unstructured":"Jiequan Cui, Zhisheng Zhong, Shu Liu, Bei Yu, and Jiaya Jia. 2021. Parametric contrastive learning. In Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (CVPR\u201921). 715\u2013724."},{"key":"e_1_3_1_10_2","unstructured":"DeepGlint. 2018. Challenge 3: Face Feature Test\/Trillion Pairs. Retrieved May 2 2023 from http:\/\/trillionpairs.deepglint.com\/overview."},{"key":"e_1_3_1_11_2","first-page":"4690","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Deng Jiankang","year":"2019","unstructured":"Jiankang Deng, Jia Guo, Niannan Xue, and Stefanos Zafeiriou. 2019. ArcFace: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4690\u20134699."},{"key":"e_1_3_1_12_2","article-title":"Generative one-shot face recognition","author":"Ding Zhengming","year":"2019","unstructured":"Zhengming Ding, Yandong Guo, Lei Zhang, and Yun Fu. 2019. Generative one-shot face recognition. arXiv preprint arXiv:1910.04860 (2019).","journal-title":"arXiv preprint arXiv:1910.04860"},{"key":"e_1_3_1_13_2","first-page":"7455","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Dixit Mandar","year":"2017","unstructured":"Mandar Dixit, Roland Kwitt, Marc Niethammer, and Nuno Vasconcelos. 2017. AGA: Attribute-guided augmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
7455\u20137463."},{"key":"e_1_3_1_14_2","first-page":"1","volume-title":"Proceedings of the Workshop on Learning from Imbalanced Datasets II","volume":"11","year":"2003","unstructured":"Chris Drummond and Robert C. Holte. 2003. C4.5, class imbalance, and cost sensitivity: Why under-sampling beats over-sampling. In Proceedings of the Workshop on Learning from Imbalanced Datasets II, Vol. 11. 1\u20138."},{"key":"e_1_3_1_15_2","first-page":"36","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Du Hang","year":"2020","unstructured":"Hang Du, Hailin Shi, Yuchi Liu, Jun Wang, Zhen Lei, Dan Zeng, and Tao Mei. 2020. Semi-Siamese training for shallow face learning. In Proceedings of the European Conference on Computer Vision. 36\u201353."},{"key":"e_1_3_1_16_2","first-page":"2235","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Feng Zhen-Hua","year":"2018","unstructured":"Zhen-Hua Feng, Josef Kittler, Muhammad Awais, Patrik Huber, and Xiao-Jun Wu. 2018. Wing loss for robust facial landmark localisation with convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2235\u20132245."},{"key":"e_1_3_1_17_2","article-title":"One-shot face recognition by promoting underrepresented classes","author":"Guo Yandong","year":"2017","unstructured":"Yandong Guo and Lei Zhang. 2017. One-shot face recognition by promoting underrepresented classes. arXiv preprint arXiv:1707.05574 (2017).","journal-title":"arXiv preprint arXiv:1707.05574"},{"key":"e_1_3_1_18_2","first-page":"87","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Guo Yandong","year":"2016","unstructured":"Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, and Jianfeng Gao. 2016. MS-Celeb-1M: A dataset and benchmark for large-scale face recognition. In Proceedings of the European Conference on Computer Vision. 
87\u2013102."},{"key":"e_1_3_1_19_2","first-page":"3018","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Hariharan Bharath","year":"2017","unstructured":"Bharath Hariharan and Ross Girshick. 2017. Low-shot visual recognition by shrinking and hallucinating features. In Proceedings of the IEEE International Conference on Computer Vision. 3018\u20133027."},{"key":"e_1_3_1_20_2","article-title":"Revisiting local descriptor for improved few-shot classification","author":"He Jun","year":"2022","unstructured":"Jun He, Richang Hong, Xueliang Liu, Mingliang Xu, and Qianru Sun. 2022. Revisiting local descriptor for improved few-shot classification. ACM Transactions on Multimedia Computing, Communications, and Applications 18, 2s (2022), Article 127, 23 pages.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_21_2","first-page":"9729","volume-title":"Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201920)","author":"He Kaiming","year":"2020","unstructured":"Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. 2020. Momentum contrast for unsupervised visual representation learning. In Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201920). 9729\u20139738."},{"key":"e_1_3_1_22_2","first-page":"770","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
770\u2013778."},{"issue":"9","key":"e_1_3_1_23_2","doi-asserted-by":"crossref","first-page":"1263","DOI":"10.1109\/TPDS.2008.39","article-title":"Provably efficient online nonclairvoyant adaptive scheduling","volume":"19","author":"He Yuxiong","year":"2008","unstructured":"Yuxiong He, Wen-Jing Hsu, and Charles E. Leiserson. 2008. Provably efficient online nonclairvoyant adaptive scheduling. IEEE Transactions on Parallel and Distributed Systems 19, 9 (2008), 1263\u20131279.","journal-title":"IEEE Transactions on Parallel and Distributed Systems"},{"key":"e_1_3_1_24_2","first-page":"7132","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Hu Jie","year":"2018","unstructured":"Jie Hu, Li Shen, and Gang Sun. 2018. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7132\u20137141."},{"key":"e_1_3_1_25_2","first-page":"5375","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Huang Chen","year":"2016","unstructured":"Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang. 2016. Learning deep representation for imbalanced classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5375\u20135384."},{"issue":"11","key":"e_1_3_1_26_2","doi-asserted-by":"crossref","first-page":"2781","DOI":"10.1109\/TPAMI.2019.2914680","article-title":"Deep imbalanced learning for face recognition and attribute prediction","volume":"42","author":"Huang Chen","year":"2019","unstructured":"Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang. 2019. Deep imbalanced learning for face recognition and attribute prediction. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 42, 11 (2019), 2781\u20132794.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_27_2","volume-title":"Proceedings of the Workshop on Faces in \u2018Real-Life\u2019 Images: Detection, Alignment, and Recognition","author":"Huang Gary B.","year":"2008","unstructured":"Gary B. Huang, Marwan Mattar, Tamara Berg, and Eric Learned-Miller. 2008. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In Proceedings of the Workshop on Faces in \u2018Real-Life\u2019 Images: Detection, Alignment, and Recognition."},{"key":"e_1_3_1_28_2","first-page":"4873","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Kemelmacher-Shlizerman Ira","year":"2016","unstructured":"Ira Kemelmacher-Shlizerman, Steven M. Seitz, Daniel Miller, and Evan Brossard. 2016. The MegaFace benchmark: 1 million faces for recognition at scale. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4873\u20134882."},{"issue":"8","key":"e_1_3_1_29_2","doi-asserted-by":"crossref","first-page":"3573","DOI":"10.1109\/TNNLS.2017.2732482","article-title":"Cost-sensitive learning of deep feature representations from imbalanced data","volume":"29","author":"Khan Salman H.","year":"2017","unstructured":"Salman H. Khan, Munawar Hayat, Mohammed Bennamoun, Ferdous A. Sohel, and Roberto Togneri. 2017. Cost-sensitive learning of deep feature representations from imbalanced data. 
IEEE Transactions on Neural Networks and Learning Systems 29, 8 (2017), 3573\u20133587.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_3_1_30_2","first-page":"3763","volume-title":"Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201921)","author":"Li Bi","year":"2021","unstructured":"Bi Li, Teng Xi, Gang Zhang, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, and Wenyu Liu. 2021. Dynamic class queue for large scale face recognition in the wild. In Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201921). 3763\u20133772."},{"key":"e_1_3_1_31_2","article-title":"Markov-Lipschitz deep learning","author":"Li Stan Z.","year":"2020","unstructured":"Stan Z. Li, Zelin Zhang, and Lirong Wu. 2020. Markov-Lipschitz deep learning. arXiv preprint arXiv:2006.08256 (2020).","journal-title":"arXiv preprint arXiv:2006.08256"},{"key":"e_1_3_1_32_2","first-page":"9572","volume-title":"Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201919)","author":"Li Yi","year":"2019","unstructured":"Yi Li and Nuno Vasconcelos. 2019. Repair: Removing representation bias by dataset resampling. In Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201919). 9572\u20139581."},{"key":"e_1_3_1_33_2","first-page":"1","volume-title":"Proceedings of the IEEE International Joint Conference on Biometrics","author":"Liao Shengcai","year":"2014","unstructured":"Shengcai Liao, Zhen Lei, Dong Yi, and Stan Z. Li. 2014. A benchmark study of large-scale unconstrained face recognition. In Proceedings of the IEEE International Joint Conference on Biometrics. 
IEEE, Los Alamitos, CA, 1\u20138."},{"key":"e_1_3_1_34_2","first-page":"2980","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Lin Tsung-Yi","year":"2017","unstructured":"Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Doll\u00e1r. 2017. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision. 2980\u20132988."},{"key":"e_1_3_1_35_2","first-page":"10052","volume-title":"Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (CVPR\u201919)","author":"Liu Bingyu","year":"2019","unstructured":"Bingyu Liu, Weihong Deng, Yaoyao Zhong, Mei Wang, Jiani Hu, Xunqiang Tao, and Yaohai Huang. 2019. Fair loss: Margin-aware reinforcement learning for deep face recognition. In Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (CVPR\u201919). 10052\u201310061."},{"key":"e_1_3_1_36_2","first-page":"212","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Liu Weiyang","year":"2017","unstructured":"Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj, and Le Song. 2017. SphereFace: Deep hypersphere embedding for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 212\u2013220."},{"issue":"2","key":"e_1_3_1_37_2","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1109\/TSMCB.2008.2007853","article-title":"Exploratory undersampling for class-imbalance learning","volume":"39","author":"Liu Xu-Ying","year":"2008","unstructured":"Xu-Ying Liu, Jianxin Wu, and Zhi-Hua Zhou. 2008. Exploratory undersampling for class-imbalance learning. 
IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 39, 2 (2008), 539\u2013550.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)"},{"key":"e_1_3_1_38_2","first-page":"2401","volume-title":"Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP\u201918)","author":"Ma Yuhao","year":"2018","unstructured":"Yuhao Ma, Meina Kan, Shiguang Shan, and Xilin Chen. 2018. Hierarchical training for large scale face recognition with few samples per subject. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP\u201918). IEEE, Los Alamitos, CA, 2401\u20132405."},{"key":"e_1_3_1_39_2","doi-asserted-by":"crossref","first-page":"48","DOI":"10.1016\/j.patrec.2020.02.007","article-title":"Learning deep face representation with long-tail data: An aggregate-and-disperse approach","volume":"133","author":"Ma Yuhao","year":"2020","unstructured":"Yuhao Ma, Meina Kan, Shiguang Shan, and Xilin Chen. 2020. Learning deep face representation with long-tail data: An aggregate-and-disperse approach. Pattern Recognition Letters 133 (2020), 48\u201354.","journal-title":"Pattern Recognition Letters"},{"key":"e_1_3_1_40_2","doi-asserted-by":"crossref","first-page":"158","DOI":"10.1109\/ICB2018.2018.00033","volume-title":"Proceedings of the 2018 International Conference on Biometrics (ICB\u201918)","author":"Maze Brianna","year":"2018","unstructured":"Brianna Maze, Jocelyn Adams, James A. Duncan, Nathan Kalka, Tim Miller, Charles Otto, Anil K. Jain, et\u00a0al. 2018. IARPA Janus benchmark - C: Face dataset and protocol. In Proceedings of the 2018 International Conference on Biometrics (ICB\u201918). 
IEEE, Los Alamitos, CA, 158\u2013165."},{"key":"e_1_3_1_41_2","first-page":"51","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops","author":"Moschoglou Stylianos","year":"2017","unstructured":"Stylianos Moschoglou, Athanasios Papaioannou, Christos Sagonas, Jiankang Deng, Irene Kotsia, and Stefanos Zafeiriou. 2017. AgeDB: The first manually collected, in-the-wild age database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 51\u201359."},{"key":"e_1_3_1_42_2","volume-title":"Lectures on Convex Optimization","year":"2018","unstructured":"Yurii Nesterov. 2018. Lectures on Convex Optimization, Vol. 137. Springer."},{"key":"e_1_3_1_43_2","first-page":"13593","article-title":"BAW: Learning from class imbalance and noisy labels with batch adaptation weighted loss","author":"Pan Siyuan","year":"2022","unstructured":"Siyuan Pan, Bin Sheng, Gaoqi He, Huating Li, and Guangtao Xue. 2022. BAW: Learning from class imbalance and noisy labels with batch adaptation weighted loss. Multimedia Tools and Applications 81 (2022), 13593\u201313610.","journal-title":"Multimedia Tools and Applications"},{"key":"e_1_3_1_44_2","first-page":"815","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Schroff Florian","year":"2015","unstructured":"Florian Schroff, Dmitry Kalenichenko, and James Philbin. 2015. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 815\u2013823."},{"key":"e_1_3_1_45_2","article-title":"Delta-encoder: An effective sample synthesis method for few-shot object recognition","volume":"31","author":"Schwartz Eli","year":"2018","unstructured":"Eli Schwartz, Leonid Karlinsky, Joseph Shtok, Sivan Harary, Mattias Marder, Abhishek Kumar, Rogerio Feris, Raja Giryes, and Alex Bronstein. 2018. 
Delta-encoder: An effective sample synthesis method for few-shot object recognition. Advances in Neural Information Processing Systems 31 (2018), 1\u201311.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_46_2","first-page":"1","volume-title":"Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision (WACV\u201916)","author":"Sengupta Soumyadip","year":"2016","unstructured":"Soumyadip Sengupta, Jun-Cheng Chen, Carlos Castillo, Vishal M. Patel, Rama Chellappa, and David W. Jacobs. 2016. Frontal to profile face verification in the wild. In Proceedings of the 2016 IEEE Winter Conference on Applications of Computer Vision (WACV\u201916). IEEE, Los Alamitos, CA, 1\u20139."},{"key":"e_1_3_1_47_2","first-page":"4136","volume-title":"Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201920)","author":"Simon Christian","year":"2020","unstructured":"Christian Simon, Piotr Koniusz, Richard Nock, and Mehrtash Harandi. 2020. Adaptive subspaces for few-shot learning. In Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201920). 4136\u20134145."},{"key":"e_1_3_1_48_2","article-title":"Very deep convolutional networks for large-scale image recognition","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).","journal-title":"arXiv preprint arXiv:1409.1556"},{"key":"e_1_3_1_49_2","first-page":"1857","article-title":"Improved deep metric learning with multi-class N-pair loss objective","volume":"29","author":"Sohn Kihyuk","year":"2016","unstructured":"Kihyuk Sohn. 2016. Improved deep metric learning with multi-class N-pair loss objective. 
Advances in Neural Information Processing Systems 29 (2016), 1857\u20131865.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_50_2","first-page":"1988","article-title":"Deep learning face representation by joint identification-verification","volume":"27","author":"Sun Yi","year":"2014","unstructured":"Yi Sun, Yuheng Chen, Xiaogang Wang, and Xiaoou Tang. 2014. Deep learning face representation by joint identification-verification. Advances in Neural Information Processing Systems 27 (2014), 1988\u20131996.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_51_2","article-title":"Cross-batch hard example mining with pseudo large batch for ID vs. Spot face recognition","author":"Tan Zichang","year":"2022","unstructured":"Zichang Tan, Ajian Liu, Jun Wan, Hao Liu, Zhen Lei, Guodong Guo, and Stan Z. Li. 2022. Cross-batch hard example mining with pseudo large batch for ID vs. Spot face recognition. IEEE Transactions on Image Processing 31 (2022), 3224\u20133235.","journal-title":"IEEE Transactions on Image Processing"},{"key":"e_1_3_1_52_2","volume-title":"Proceedings of the 17th International Conference on Machine Learning","author":"Ting Kai Ming","year":"2000","unstructured":"Kai Ming Ting. 2000. A comparative study of cost-sensitive boosting algorithms. In Proceedings of the 17th International Conference on Machine Learning."},{"issue":"7","key":"e_1_3_1_53_2","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1109\/LSP.2018.2822810","article-title":"Additive margin softmax for face verification","volume":"25","author":"Wang Feng","year":"2018","unstructured":"Feng Wang, Jian Cheng, Weiyang Liu, and Haijun Liu. 2018. Additive margin softmax for face verification. 
IEEE Signal Processing Letters 25, 7 (2018), 926\u2013930.","journal-title":"IEEE Signal Processing Letters"},{"key":"e_1_3_1_54_2","first-page":"3156","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Wang Fei","year":"2017","unstructured":"Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Cheng Li, Honggang Zhang, Xiaogang Wang, and Xiaoou Tang. 2017. Residual attention network for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3156\u20133164."},{"key":"e_1_3_1_55_2","first-page":"2386","volume-title":"Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP\u201918)","author":"Wang Lingxiao","year":"2018","unstructured":"Lingxiao Wang, Yali Li, and Shengjin Wang. 2018. Feature learning for one-shot face recognition. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP\u201918). IEEE, Los Alamitos, CA, 2386\u20132390."},{"key":"e_1_3_1_56_2","first-page":"9358","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Wang Xiaobo","year":"2019","unstructured":"Xiaobo Wang, Shuo Wang, Jun Wang, Hailin Shi, and Tao Mei. 2019. Co-mining: Deep face recognition with noisy labels. In Proceedings of the IEEE International Conference on Computer Vision. 9358\u20139367."},{"key":"e_1_3_1_57_2","first-page":"12241","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"34","author":"Wang Xiaobo","year":"2020","unstructured":"Xiaobo Wang, Shifeng Zhang, Shuo Wang, Tianyu Fu, Hailin Shi, and Tao Mei. 2020. Mis-classified vector guided softmax loss for face recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 
12241\u201312248."},{"key":"e_1_3_1_58_2","first-page":"499","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Wen Yandong","year":"2016","unstructured":"Yandong Wen, Kaipeng Zhang, Zhifeng Li, and Yu Qiao. 2016. A discriminative feature learning approach for deep face recognition. In Proceedings of the European Conference on Computer Vision. 499\u2013515."},{"key":"e_1_3_1_59_2","first-page":"90","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops","author":"Whitelam Cameron","year":"2017","unstructured":"Cameron Whitelam, Emma Taborsky, Austin Blanton, Brianna Maze, Jocelyn Adams, Tim Miller, Nathan Kalka, et\u00a0al. 2017. IARPA Janus benchmark-B face dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 90\u201398."},{"key":"e_1_3_1_60_2","article-title":"ForestDet: Large-vocabulary long-tailed object detection and instance segmentation","author":"Wu Jialian","year":"2021","unstructured":"Jialian Wu, Liangchen Song, Qian Zhang, Ming Yang, and Junsong Yuan. 2021. ForestDet: Large-vocabulary long-tailed object detection and instance segmentation. IEEE Transactions on Multimedia 24 (2021), 3693\u20133705.","journal-title":"IEEE Transactions on Multimedia"},{"key":"e_1_3_1_61_2","first-page":"1933","volume-title":"Proceedings of the IEEE International Conference on Computer Vision Workshops","author":"Wu Yue","year":"2017","unstructured":"Yue Wu, Hongfu Liu, and Yun Fu. 2017. Low-shot face recognition with hybrid classifiers. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 1933\u20131939."},{"key":"e_1_3_1_62_2","doi-asserted-by":"crossref","first-page":"408","DOI":"10.1145\/3126686.3126693","volume-title":"Proceedings of the on Thematic Workshops of ACM Multimedia 2017","author":"Wu Yue","year":"2017","unstructured":"Yue Wu, Hongfu Liu, Jun Li, and Yun Fu. 2017. 
Deep face recognition with center invariant loss. In Proceedings of the on Thematic Workshops of ACM Multimedia 2017. 408\u2013414."},{"issue":"1","key":"e_1_3_1_63_2","doi-asserted-by":"crossref","first-page":"56","DOI":"10.1049\/iet-bmt.2017.0193","article-title":"Scarce face recognition via two-layer collaborative representation","volume":"7","author":"Xia Zhaoqiang","year":"2018","unstructured":"Zhaoqiang Xia, Xianlin Peng, Xiaoyi Feng, and Abdenour Hadid. 2018. Scarce face recognition via two-layer collaborative representation. IET Biometrics 7, 1 (2018), 56\u201362.","journal-title":"IET Biometrics"},{"issue":"1","key":"e_1_3_1_64_2","first-page":"1","article-title":"Age-invariant face recognition by multi-feature fusion and decomposition with self-attention","volume":"18","author":"Yan Chenggang","year":"2022","unstructured":"Chenggang Yan, Lixuan Meng, Liang Li, Jiehua Zhang, Zhan Wang, Jian Yin, Jiyong Zhang, Yaoqi Sun, and Bolun Zheng. 2022. Age-invariant face recognition by multi-feature fusion and decomposition with self-attention. ACM Transactions on Multimedia Computing, Communications, and Applications 18, 1s (2022), 1\u201318.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_65_2","article-title":"Large batch size training of neural networks with adversarial training and second-order information","author":"Yao Zhewei","year":"2018","unstructured":"Zhewei Yao, Amir Gholami, Daiyaan Arfeen, Richard Liaw, Joseph Gonzalez, Kurt Keutzer, and Michael Mahoney. 2018. Large batch size training of neural networks with adversarial training and second-order information. arXiv preprint arXiv:1810.01021 (2018).","journal-title":"arXiv preprint arXiv:1810.01021"},{"key":"e_1_3_1_66_2","article-title":"Feature transfer learning for deep face recognition with under-represented data","author":"Yin Xi","year":"2018","unstructured":"Xi Yin, Xiang Yu, Kihyuk Sohn, Xiaoming Liu, and Manmohan Chandraker. 2018. 
Feature transfer learning for deep face recognition with under-represented data. arXiv preprint arXiv:1803.09014 (2018).","journal-title":"arXiv preprint arXiv:1803.09014"},{"key":"e_1_3_1_67_2","first-page":"5704","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Yin Xi","year":"2019","unstructured":"Xi Yin, Xiang Yu, Kihyuk Sohn, Xiaoming Liu, and Manmohan Chandraker. 2019. Feature transfer learning for face recognition with under-represented data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5704\u20135713."},{"key":"e_1_3_1_68_2","first-page":"2659","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Yoo Donggeun","year":"2015","unstructured":"Donggeun Yoo, Sunggyun Park, Joon-Young Lee, Anthony S. Paek, and In So Kweon. 2015. AttentionNet: Aggregating weak directions for accurate object detection. In Proceedings of the IEEE International Conference on Computer Vision. 2659\u20132667."},{"key":"e_1_3_1_69_2","doi-asserted-by":"crossref","first-page":"2309","DOI":"10.1109\/TIP.2022.3154938","article-title":"Sample-centric feature generation for semi-supervised few-shot learning","volume":"31","author":"Zhang Bo","year":"2022","unstructured":"Bo Zhang, Hancheng Ye, Gang Yu, Bin Wang, Yike Wu, Jiayuan Fan, and Tao Chen. 2022. Sample-centric feature generation for semi-supervised few-shot learning. IEEE Transactions on Image Processing 31 (2022), 2309\u20132320.","journal-title":"IEEE Transactions on Image Processing"},{"key":"e_1_3_1_70_2","first-page":"1","volume-title":"Proceedings of the 2017 IEEE International Joint Conference on Biometrics (IJCB\u201917)","author":"Zhang Shifeng","year":"2017","unstructured":"Shifeng Zhang, Xiangyu Zhu, Zhen Lei, Hailin Shi, Xiaobo Wang, and Stan Z. Li. 2017. FaceBoxes: A CPU real-time face detector with high accuracy. 
In Proceedings of the 2017 IEEE International Joint Conference on Biometrics (IJCB\u201917). IEEE, Los Alamitos, CA, 1\u20139."},{"key":"e_1_3_1_71_2","first-page":"5409","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Zhang Xiao","year":"2017","unstructured":"Xiao Zhang, Zhiyuan Fang, Yandong Wen, Zhifeng Li, and Yu Qiao. 2017. Range loss for deep face recognition with long-tailed training data. In Proceedings of the IEEE International Conference on Computer Vision. 5409\u20135418."},{"key":"e_1_3_1_72_2","first-page":"10823","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zhang Xiao","year":"2019","unstructured":"Xiao Zhang, Rui Zhao, Yu Qiao, Xiaogang Wang, and Hongsheng Li. 2019. AdaCos: Adaptively scaling cosine logits for effectively learning deep face representations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 10823\u201310832."},{"key":"e_1_3_1_73_2","first-page":"824","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Zhang Yaobin","year":"2020","unstructured":"Yaobin Zhang and Weihong Deng. 2020. Class-balanced training for deep face recognition. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 824\u2013825."},{"key":"e_1_3_1_74_2","unstructured":"Tianyue Zheng and Weihong Deng. 2018. Cross-Pose LFW: A Database for Studying Cross-Pose Face Recognition in Unconstrained Environments. Technical Report. Beijing University of Posts and Telecommunications."},{"key":"e_1_3_1_75_2","article-title":"Cross-age LFW: A database for studying cross-age face recognition in unconstrained environments","author":"Zheng Tianyue","year":"2017","unstructured":"Tianyue Zheng, Weihong Deng, and Jiani Hu. 2017. Cross-age LFW: A database for studying cross-age face recognition in unconstrained environments. 
arXiv preprint arXiv:1708.08197 (2017).","journal-title":"arXiv preprint arXiv:1708.08197"},{"key":"e_1_3_1_76_2","first-page":"5089","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zheng Yutong","year":"2018","unstructured":"Yutong Zheng, Dipan K. Pal, and Marios Savvides. 2018. Ring loss: Convex feature normalization for face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 5089\u20135097."},{"key":"e_1_3_1_77_2","article-title":"Graph complemented latent representation for few-shot image classification","author":"Zhong Xian","year":"2022","unstructured":"Xian Zhong, Cheng Gu, Mang Ye, Wenxin Huang, and Chia-Wen Lin. 2022. Graph complemented latent representation for few-shot image classification. IEEE Transactions on Multimedia. Early access, January 11, 2022.","journal-title":"IEEE Transactions on Multimedia"},{"key":"e_1_3_1_78_2","first-page":"7812","volume-title":"Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201919)","author":"Zhong Yaoyao","year":"2019","unstructured":"Yaoyao Zhong, Weihong Deng, Mei Wang, Jiani Hu, Jianteng Peng, Xunqiang Tao, and Yaohai Huang. 2019. Unequal-training for deep face recognition with long-tailed noisy data. In Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201919). 7812\u20137821."},{"issue":"6","key":"e_1_3_1_79_2","first-page":"684","article-title":"Large-scale bisample learning on id versus spot face recognition","volume":"127","author":"Zhu Xiangyu","year":"2019","unstructured":"Xiangyu Zhu, Hao Liu, Zhen Lei, Hailin Shi, Fan Yang, Dong Yi, Guojun Qi, and Stan Z. Li. 2019. Large-scale bisample learning on id versus spot face recognition. 
International Journal of Computer Vision 127, 6-7 (2019), 684\u2013700.","journal-title":"International Journal of Computer Vision"},{"key":"e_1_3_1_80_2","first-page":"10492","volume-title":"Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201921)","author":"Zhu Zheng","year":"2021","unstructured":"Zheng Zhu, Guan Huang, Jiankang Deng, Yun Ye, Junjie Huang, Xinze Chen, Jiagang Zhu, et\u00a0al. 2021. WebFace260M: A benchmark unveiling the power of million-scale deep face recognition. In Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201921). 10492\u201310502."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3594669","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3594669","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:09:08Z","timestamp":1750183748000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3594669"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,12]]},"references-count":79,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,11,30]]}},"alternative-id":["10.1145\/3594669"],"URL":"https:\/\/doi.org\/10.1145\/3594669","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,12]]},"assertion":[{"value":"2022-11-28","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2023-04-20","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}