{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,19]],"date-time":"2026-03-19T23:27:40Z","timestamp":1773962860553,"version":"3.50.1"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"1s","license":[{"start":{"date-parts":[[2022,1,25]],"date-time":"2022-01-25T00:00:00Z","timestamp":1643068800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Key-Area Research and Development Program of Guangdong Province","award":["2020B010165001"],"award-info":[{"award-number":["2020B010165001"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61772527, 62002356, 61925303"],"award-info":[{"award-number":["61772527, 62002356, 61925303"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Open Project of Key Laboratory of Ministry of Public Security for Road Traffic Safety","award":["2020ZDSYSKFKT04"],"award-info":[{"award-number":["2020ZDSYSKFKT04"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2022,2,28]]},"abstract":"<jats:p>Visible-infrared person re-identification (Re-ID) has received increasing research attention for its great practical value in night-time surveillance scenarios. Due to the large variations in person pose, viewpoint, and occlusion in the same modality, as well as the domain gap brought by heterogeneous modality, this hybrid modality person matching task is quite challenging. 
Different from the metric learning methods for visible person re-ID, which only pose similarity constraints on class level, an efficient metric learning approach for visible-infrared person Re-ID should take both the class-level and modality-level similarity constraints into full consideration to learn sufficiently discriminative and robust features. In this article, the hybrid modality is divided into two types, within modality and cross modality. We first fully explore the variations that hinder the ranking results of visible-infrared person re-ID and roughly summarize them into three types: within-modality variation, cross-modality modality-related variation, and cross-modality modality-unrelated variation. Then, we propose a comprehensive metric learning framework based on four kinds of paired-based similarity constraints to address all the variations within and cross modality. This framework focuses on both class-level and modality-level similarity relationships between person images. Furthermore, we demonstrate the compatibility of our framework with any paired-based loss functions by giving detailed implementation of combining it with triplet loss and contrastive loss separately. 
Finally, extensive experiments of our approach on SYSU-MM01 and RegDB demonstrate the effectiveness and superiority of our proposed metric learning framework for visible-infrared person Re-ID.<\/jats:p>","DOI":"10.1145\/3473341","type":"journal-article","created":{"date-parts":[[2022,1,25]],"date-time":"2022-01-25T15:06:00Z","timestamp":1643123160000},"page":"1-15","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":18,"title":["Hybrid Modality Metric Learning for Visible-Infrared Person Re-Identification"],"prefix":"10.1145","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2377-6907","authenticated-orcid":false,"given":"La","family":"Zhang","sequence":"first","affiliation":[{"name":"Beijing Institute of Technology, Beijing, Haidian Qu, Beijing Shi, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haiyun","family":"Guo","sequence":"additional","affiliation":[{"name":"Institute of Automation Chinese Academy of Sciences, Haidian Qu, Beijing Shi, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kuan","family":"Zhu","sequence":"additional","affiliation":[{"name":"Institute of Automation Chinese Academy of Sciences, Haidian Qu, Beijing Shi, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Honglin","family":"Qiao","sequence":"additional","affiliation":[{"name":"Alibaba Cloud, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gaopan","family":"Huang","sequence":"additional","affiliation":[{"name":"Alibaba Cloud, Nanjing, Jiangsu, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sen","family":"Zhang","sequence":"additional","affiliation":[{"name":"Traffic Management Research Institute of the Ministry of Public Security, Wuxi, Jiangsu Sheng, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huichen","family":"Zhang","sequence":"additional","affiliation":[{"name":"Traffic Management Research Institute of the Ministry of Public Security, Wuxi, Jiangsu Sheng, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jian","family":"Sun","sequence":"additional","affiliation":[{"name":"Beijing Institute of Technology, Beijing, Haidian Qu, Beijing Shi, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jinqiao","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Automation Chinese Academy of Sciences, Haidian Qu, Beijing Shi, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,1,25]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2019.12.100"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.145"},{"key":"e_1_3_1_4_2","volume-title":"Computer Vision and Pattern Recognition","author":"Cheng De","year":"2016","unstructured":"De Cheng, Yihong Gong, Sanping Zhou, Jinjun Wang, and Nanning Zheng. 2016. Person re-identification by multi-channel parts-based CNN with improved triplet loss function. In Computer Vision and Pattern Recognition."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01027"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304415.3304512"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2005.177"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.3390\/s17030605"},{"key":"e_1_3_1_9_2","doi-asserted-by":"crossref","unstructured":"Weijian Deng Liang Zheng Guoliang Kang Yi Yang Qixiang Ye and Jianbin Jiao. 2018. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In IEEE\/CVF Conference on Computer Vision and Pattern Recognition . 
IEEE.","DOI":"10.1109\/CVPR.2018.00110"},{"key":"e_1_3_1_10_2","doi-asserted-by":"crossref","unstructured":"Zhangxiang Feng Jianhuang Lai and Xiaohua Xie. 2019. Learning modality-specific representations for visible-infrared person re-identification. IEEE Transactions on Image Processing 29 (2019) 579\u2013590.","DOI":"10.1109\/TIP.2019.2928126"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.5555\/2969033.2969125"},{"key":"e_1_3_1_12_2","doi-asserted-by":"crossref","unstructured":"Wang Guan\u2019An Zhang Tianzhu Cheng Jian Liu Si Yang Yang and Hou Zengguang. 2020. RGB-infrared cross-modality person re-identification via joint pixel and feature alignment. In IEEE\/CVF International Conference on Computer Vision (ICCV\u201919) . IEEE.","DOI":"10.1109\/ICCV.2019.00372"},{"issue":"11","key":"e_1_3_1_13_2","first-page":"2032","article-title":"A survey on deep learning based person re-identification","volume":"45","author":"Hao Lup","year":"2019","unstructured":"Lup Hao, Jiang Wei, Fan Xing, and Zhang Sipeng. 2019. A survey on deep learning based person re-identification. Acta Automat. Sin. 45, 11 (2019), 2032\u20132049. DOI:https:\/\/doi.org\/10.16383\/j.aas.c180154","journal-title":"Acta Automat. Sin."},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018385"},{"key":"e_1_3_1_15_2","article-title":"In defense of the triplet loss for person re-identification","volume":"1703","author":"Hermans Alexander","year":"2017","unstructured":"Alexander Hermans, Lucas Beyer, and Bastian Leibe. 2017. In defense of the triplet loss for person re-identification. CoRR abs\/1703.07737 (2017). DOI:https:\/\/doi.org\/1703.07737","journal-title":"CoRR"},{"key":"e_1_3_1_16_2","doi-asserted-by":"crossref","unstructured":"Yan Huang Jingsong Xu Qiang Wu Zhedong Zheng Zhaoxiang Zhang and Jian Zhang. 2018. Multi-pseudo regularized label for generated data in person re-identification. IEEE Transactions on Image Processing PP (2018) 1\u20131. 
https:\/\/doi.org\/10.1109\/TIP.2018.2874715","DOI":"10.1109\/TIP.2018.2874715"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.5555\/3491440.3491583"},{"issue":"4","key":"e_1_3_1_18_2","first-page":"4610","article-title":"Infrared-visible cross-modal person re-identification with an X modality","volume":"34","author":"Li Diangang","year":"2020","unstructured":"Diangang Li, Xing Wei, Xiaopeng Hong, and Yihong Gong. 2020. Infrared-visible cross-modal person re-identification with an X modality. Proc. AAAI Conf. Artif. Intell. 34, 4 (2020), 4610\u20134617.","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"e_1_3_1_19_2","doi-asserted-by":"crossref","unstructured":"Shengcai Liao Yang Hu Xiangyu Zhu and Stan Z. Li. 2015. Person re-identification by Local Maximal Occurrence representation and metric learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201915) . IEEE.","DOI":"10.1109\/CVPR.2015.7298832"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2020.01.089"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2017.2700762"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01339"},{"key":"e_1_3_1_23_2","unstructured":"Ye Mang Shen Jianbing Lin Gaojie Xiang Tao Shao Ling and Steven C. H. Hoi. 2021. Deep learning for person re-identification: a survey and outlook. IEEE Transactions on Pattern Analysis and Machine Intelligence PP 99 (2021) 1\u20131."},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3351043"},{"key":"e_1_3_1_25_2","doi-asserted-by":"crossref","unstructured":"Xuelin Qian Yanwei Fu Wenxuan Wang Tao Xiang Yang Wu Yu-Gang Jiang and Xiangyang Xue. 2018. Pose-normalized image generation for person re-identification. In Proceedings of the 15th European Conference Munich Germany . 
Springer Cham.","DOI":"10.1007\/978-3-030-01240-3_40"},{"key":"e_1_3_1_26_2","doi-asserted-by":"crossref","unstructured":"Ergys Ristani and Carlo Tomasi. 2018. Features for multi-target multi-camera tracking and re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . 6036\u20136046.","DOI":"10.1109\/CVPR.2018.00632"},{"key":"e_1_3_1_27_2","doi-asserted-by":"crossref","unstructured":"Hailin Shi Yang Yang Xiangyu Zhu Shengcai Liao Zhen Lei and Stan Z. Li. 2016. Embedding deep metric for person re-identification: A study against large variations. In European Conference on Computer Vision . Springer Cham 732\u2013748.","DOI":"10.1007\/978-3-319-46448-0_44"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.5555\/3157096.3157304"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_48"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46478-7_9"},{"key":"e_1_3_1_31_2","doi-asserted-by":"crossref","unstructured":"Wang Guan\u2019An Zhang Tianzhu Yang Yang Cheng Jian Chang Jianlong Liang Xu and Hou Zengguang. 2020. Cross-modality paired-images generation for RGB-infrared person re-identification. Proceedings of the AAAI Conference on Artificial Intelligence 34 7 (2020) 12144\u201312151.","DOI":"10.1609\/aaai.v34i07.6894"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00159"},{"key":"e_1_3_1_33_2","unstructured":"Longhui Wei Shiliang Zhang Wen Gao and Qi Tian. 2018. Person transfer gan to bridge domain gap for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . 79\u201388."},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.575"},{"key":"e_1_3_1_35_2","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Ye Mang","year":"2018","unstructured":"Mang Ye, Xiangyuan Lan, Jiawei Li, and Pong Yuen. 2018. 
Hierarchical discriminative learning for visible thermal person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence. AAAI."},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304415.3304570"},{"key":"e_1_3_1_37_2","doi-asserted-by":"crossref","unstructured":"Shizhou Zhang Yifei Yang Peng Wang Xiuwei Zhang and Yanning Zhang. 2021. Attend to the difference: Cross-modality person re-identification via contrastive correlation. IEEE Transactions on Image Processing 30 (2021) 8861\u20138872.","DOI":"10.1109\/TIP.2021.3120881"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1049\/iet-ipr.2019.0699"},{"key":"e_1_3_1_39_2","unstructured":"Liang Zheng Yi Yang and Alexander G. Hauptmann. 2016. Person re-identification: Past present and future."},{"key":"e_1_3_1_40_2","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201919)","author":"Zhixiang Wang","year":"2019","unstructured":"Wang Zhixiang, Wang Zheng, Zheng Yinqiang, Chuang Yung Yu, and Satoh Shin\u2019Ich. 2019. Learning to reduce dual-level discrepancy for infrared-visible person re-identification. 
In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR\u201919)."},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00541"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3473341","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3473341","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:28:15Z","timestamp":1750195695000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3473341"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,25]]},"references-count":40,"journal-issue":{"issue":"1s","published-print":{"date-parts":[[2022,2,28]]}},"alternative-id":["10.1145\/3473341"],"URL":"https:\/\/doi.org\/10.1145\/3473341","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,25]]},"assertion":[{"value":"2020-01-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-06-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-01-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}