{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T09:30:44Z","timestamp":1768987844161,"version":"3.49.0"},"reference-count":51,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2023,7,12]],"date-time":"2023-07-12T00:00:00Z","timestamp":1689120000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62271484"],"award-info":[{"award-number":["62271484"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Science Fund for Distinguished Young Scholars under","award":["61925112"],"award-info":[{"award-number":["61925112"]}]},{"DOI":"10.13039\/501100015401","name":"Key Research and Development Program of Shaanxi","doi-asserted-by":"crossref","award":["2023-YBGY-225"],"award-info":[{"award-number":["2023-YBGY-225"]}],"id":[{"id":"10.13039\/501100015401","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"crossref","award":["XJSJ23007"],"award-info":[{"award-number":["XJSJ23007"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,11,30]]},"abstract":"<jats:p>Visible-infrared person re-identification (VI-ReID) task aims to retrieve persons from different spectrum cameras (i.e., visible and infrared images). The biggest challenge of VI-ReID is the huge cross-modal discrepancy caused by different imaging mechanisms. 
Many VI-ReID methods embed person images from different modalities into a shared feature space to narrow the cross-modal discrepancy. However, these methods do not purify the identity features, so the identity features retain modality-specific information and fail to align well. In this article, an identity feature disentanglement method is proposed to separate the identity features from identity-irrelevant information, such as pose and modality. Specifically, images of different modalities are first processed to extract shared features, which preliminarily reduces the cross-modal discrepancy. Then the extracted feature of each image is disentangled into a latent identity variable and an identity-irrelevant variable. To force the latent identity variable to contain as much identity information and as little identity-irrelevant information as possible, an ID-discriminative loss and an ID-swapping reconstruction process are additionally designed. 
Extensive quantitative and qualitative experiments on two popular public VI-ReID datasets, RegDB and SYSU-MM01, demonstrate the efficacy and superiority of the proposed method.<\/jats:p>","DOI":"10.1145\/3595183","type":"journal-article","created":{"date-parts":[[2023,4,28]],"date-time":"2023-04-28T11:57:51Z","timestamp":1682683071000},"page":"1-20","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Identity Feature Disentanglement for Visible-Infrared Person Re-Identification"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0610-990X","authenticated-orcid":false,"given":"Xiumei","family":"Chen","sequence":"first","affiliation":[{"name":"Hangzhou Institute of Technology, Xidian University, China, School of Computer Science and Technology, Xidian University, China, and Key Laboratory of Spectral Imaging Technology CAS, Xi\u2019an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8398-6324","authenticated-orcid":false,"given":"Xiangtao","family":"Zheng","sequence":"additional","affiliation":[{"name":"College of Physics and Information Engineering, Fuzhou University, China and Key Laboratory of Spectral Imaging Technology CAS, Xi\u2019an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7037-5188","authenticated-orcid":false,"given":"Xiaoqiang","family":"Lu","sequence":"additional","affiliation":[{"name":"College of Physics and Information Engineering, Fuzhou University, China and Key Laboratory of Spectral Imaging Technology CAS, Xi\u2019an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,7,12]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00046"},{"key":"e_1_3_1_3_2","first-page":"1320","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Chen Weihua","year":"2017","unstructured":"Weihua Chen, Xiaotang Chen, Jianguo Zhang, and Kaiqi Huang. 2017. Beyond triplet loss: A deep quadruplet network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1320\u20131329."},{"key":"e_1_3_1_4_2","first-page":"3300","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Chen Xuesong","year":"2020","unstructured":"Xuesong Chen, Canmiao Fu, Yong Zhao, Feng Zheng, Jingkuan Song, Rongrong Ji, and Yi Yang. 2020. Salience-guided cascaded suppression network for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3300\u20133310."},{"key":"e_1_3_1_5_2","first-page":"10257","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Choi Seokeon","year":"2020","unstructured":"Seokeon Choi, Sumin Lee, Youngeun Kim, Taekyung Kim, and Changick Kim. 2020. Hi-CMD: Hierarchical cross-modality disentanglement for visible-infrared person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 10257\u201310266."},{"key":"e_1_3_1_6_2","first-page":"677","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Dai Pingyang","year":"2018","unstructured":"Pingyang Dai, Rongrong Ji, Haibin Wang, Qiong Wu, and Yuyu Huang. 2018. Cross-modality person re-identification with generative adversarial training. 
In Proceedings of the International Joint Conference on Artificial Intelligence. 677\u2013683."},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3243316"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2019.2928126"},{"key":"e_1_3_1_9_2","first-page":"2670","volume-title":"Advances in Neural Information Processing Systems","author":"Fu Chaoyou","year":"2019","unstructured":"Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, and Ran He. 2019. Dual variational generation for low shot heterogeneous face recognition. In Advances in Neural Information Processing Systems. 2670\u20132679."},{"key":"e_1_3_1_10_2","first-page":"1222","volume-title":"Advances in Neural Information Processing Systems","author":"Ge Yixiao","year":"2018","unstructured":"Yixiao Ge, Zhuowan Li, Haiyu Zhao, Guojun Yin, Shuai Yi, and Xiaogang Wang. 2018. FD-GAN: Pose-guided feature distilling gan for robust person re-identification. In Advances in Neural Information Processing Systems. 1222\u20131233."},{"key":"e_1_3_1_11_2","first-page":"8385","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Hao Yi","year":"2019","unstructured":"Yi Hao, Nannan Wang, Jie Li, and Xinbo Gao. 2019. HSME: Hypersphere manifold embedding for visible thermal person reidentification. In Proceedings of the AAAI Conference on Artificial Intelligence. 8385\u20138392."},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_13_2","first-page":"1","volume-title":"Proceedings of International Conference on Learning Representations","author":"Higgins Irina","year":"2017","unstructured":"Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. 2017. BETA-VAE: Learning basic visual concepts with a constrained variational framework. In Proceedings of International Conference on Learning Representations. 
1\u201322."},{"key":"e_1_3_1_14_2","first-page":"172","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Huang Xun","year":"2018","unstructured":"Xun Huang, Ming Yu Liu, Serge Belongie, and Jan Kautz. 2018. Multimodal unsupervised image-to-image translation. In Proceedings of the European Conference on Computer Vision. 172\u2013189."},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2020.3014167"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2021.3067760"},{"key":"e_1_3_1_17_2","first-page":"448","volume-title":"Proceedings of the 7th IEEE International Conference on Advanced Video and Signal Based Surveillance","author":"Jungling Kai","year":"2010","unstructured":"Kai Jungling and Michael Arens. 2010. Local feature based person reidentification in infrared image sequences. In Proceedings of the 7th IEEE International Conference on Advanced Video and Signal Based Surveillance. 448\u2013455."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2019.2963721"},{"key":"e_1_3_1_19_2","first-page":"1","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Kingma Diederik P.","year":"2014","unstructured":"Diederik P. Kingma and Max Welling. 2014. Auto-encoding variational bayes. In Proceedings of the International Conference on Learning Representations. 1\u201314."},{"key":"e_1_3_1_20_2","first-page":"35","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Lee Hsin Ying","year":"2018","unstructured":"Hsin Ying Lee, Hung Yu Tseng, Jia Bin Huang, Maneesh Kumar Singh, and Ming Hsuan Yang. 2018. Diverse image-to-image translation via disentangled representations. In Proceedings of the European Conference on Computer Vision. 
35\u201351."},{"key":"e_1_3_1_21_2","first-page":"4610","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Li Diangang","year":"2020","unstructured":"Diangang Li, Xing Wei, Xiaopeng Hong, and Yihong Gong. 2020. Infrared-visible cross-modal person re-identification with an x modality. In Proceedings of the AAAI Conference on Artificial Intelligence. 4610\u20134617."},{"issue":"4","key":"e_1_3_1_22_2","article-title":"Part-based structured representation learning for person re-identification","volume":"16","author":"Li Yaoyu","year":"2020","unstructured":"Yaoyu Li, Hantao Yao, Tianzhu Zhang, and Changsheng Xu. 2020. Part-based structured representation learning for person re-identification. ACM Transactions on Multimedia Computing, Communications, and Applications 16, 4 (2020).","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"issue":"1","key":"e_1_3_1_23_2","article-title":"Spatial preserved graph convolution networks for person re-identification","volume":"16","author":"Li Zhaoju","year":"2020","unstructured":"Zhaoju Li, Zongwei Zhou, Nan Jiang, Zhenjun Han, Junliang Xing, and Jianbin Jiao. 2020. Spatial preserved graph convolution networks for person re-identification. ACM Transactions on Multimedia Computing, Communications, and Applications 16, 1s (2020).","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_24_2","first-page":"6887","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Liu Chong","year":"2020","unstructured":"Chong Liu, Xiaojun Chang, and Yi-Dong Shen. 2020. Unity style transfer for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
6887\u20136896."},{"key":"e_1_3_1_25_2","first-page":"2080","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Liu Yu","year":"2018","unstructured":"Yu Liu, Fangyin Wei, Jing Shao, Lu Sheng, Junjie Yan, and Xiaogang Wang. 2018. Exploring disentangled feature representation beyond face identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2080\u20132089."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2019.2958756"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.3390\/s17030605"},{"key":"e_1_3_1_28_2","doi-asserted-by":"crossref","unstructured":"Bo Pang, Deming Zhai, Junjun Jiang, and Xianming Liu. 2022. Fully unsupervised person re-identification via selective contrastive learning. ACM Transactions on Multimedia Computing, Communications, and Applications 18, 2 (2022).","DOI":"10.1145\/3485061"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413673"},{"key":"e_1_3_1_30_2","doi-asserted-by":"crossref","unstructured":"Zengming Tang and Jun Huang. 2022. Harmonious multi-branch network for person re-identification with harder triplet loss. ACM Transactions on Multimedia Computing, Communications, and Applications 18, 4 (2022).","DOI":"10.1145\/3501405"},{"key":"e_1_3_1_31_2","first-page":"1283","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Tran Luan","year":"2017","unstructured":"Luan Tran, Xi Yin, and Xiaoming Liu. 2017. Disentangled representation learning gan for pose-invariant face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
1283\u20131292."},{"key":"e_1_3_1_32_2","first-page":"3623","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Wang Guan\u2019an","year":"2019","unstructured":"Guan\u2019an Wang, Tianzhu Zhang, Jian Cheng, Si Liu, Yang Yang, and Zengguang Hou. 2019. RGB-infrared cross-modality person re-identification via joint pixel and feature alignment. In Proceedings of the IEEE International Conference on Computer Vision. 3623\u20133632."},{"key":"e_1_3_1_33_2","first-page":"12144","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Wang Guanan","year":"2020","unstructured":"Guanan Wang, Tianzhu Zhang, Yang Yang, Jian Cheng, Jianlong Chang, and Zengguang Hou. 2020. Cross-modality paired-images generation for rgb-infrared person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence. 12144\u201312151."},{"key":"e_1_3_1_34_2","first-page":"618","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Wang Zhixiang","year":"2019","unstructured":"Zhixiang Wang, Zheng Wang, Yinqiang Zheng, Yungyu Chuang, and Shinich Satoh. 2019. Learning to reduce dual-level discrepancy for infrared-visible person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 618\u2013626."},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2021.3059713"},{"key":"e_1_3_1_36_2","first-page":"5095","article-title":"Adversarial decoupling and modality-invariant representation learning for visible-infrared person re-identification","author":"Zeng Haitang","year":"2022","unstructured":"Haitang Zeng, Yanke Hou, Weipeng Hu, Bohong Liu, and Haifeng Hu. 2022. Adversarial decoupling and modality-invariant representation learning for visible-infrared person re-identification. 
IEEE Transactions on Circuits and Systems for Video Technology (2022), 5095\u20135109.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.575"},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33019005"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.2998275"},{"key":"e_1_3_1_40_2","first-page":"7501","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Ye Mang","year":"2018","unstructured":"Mang Ye, Xiangyuan Lan, Jiawei Li, and Pong C. Yuen. 2018. Hierarchical discriminative learning for visible thermal person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence. 7501\u20137508."},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2020.3001665"},{"key":"e_1_3_1_42_2","first-page":"1092","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Ye Mang","year":"2018","unstructured":"Mang Ye, Zheng Wang, Xiangyuan Lan, and Pong C. Yuen. 2018. Visible-thermal person re-identification via dual-constrained top-ranking. In Proceedings of the International Joint Conference on Artificial Intelligence. 1092\u20131099."},{"key":"e_1_3_1_43_2","first-page":"188","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Yu Rui","year":"2018","unstructured":"Rui Yu, Zhiyong Dou, Song Bai, Zhaoxiang Zhang, Yongchao Xu, and Xiang Bai. 2018. Hard-aware point-to-set deep metric for person re-identification. In Proceedings of the European Conference on Computer Vision. 188\u2013204."},{"key":"e_1_3_1_44_2","first-page":"13657","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zeng Kaiwei","year":"2020","unstructured":"Kaiwei Zeng, Munan Ning, Yaohua Wang, and Yang Guo. 2020. 
Hierarchical clustering with hard-batch triplet loss for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 13657\u201313665."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2020.2978669"},{"issue":"1","key":"e_1_3_1_46_2","article-title":"Hybrid modality metric learning for visible-infrared person re-identification","volume":"18","author":"Zhang La","year":"2022","unstructured":"La Zhang, Haiyun Guo, Kuan Zhu, Honglin Qiao, Gaopan Huang, Sen Zhang, Huichen Zhang, Jian Sun, and Jinqiao Wang. 2022. Hybrid modality metric learning for visible-infrared person re-identification. ACM Transactions on Multimedia Computing, Communications, and Applications 18, 1s (2022).","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2019.2939564"},{"issue":"1","key":"e_1_3_1_48_2","article-title":"JoT-GAN: A framework for jointly training GAN and person re-identification model","volume":"18","author":"Zhao Zhongwei","year":"2022","unstructured":"Zhongwei Zhao, Ran Song, Qian Zhang, Peng Duan, and Youmei Zhang. 2022. JoT-GAN: A framework for jointly training GAN and person re-identification model. ACM Transactions on Multimedia Computing, Communications, and Applications 18, 1s (2022).","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_49_2","first-page":"8514","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Zheng Feng","year":"2019","unstructured":"Feng Zheng, Xing Sun, Xinyang Jiang, Xiaowei Guo, Zongqiao Yu, Feiyue Huang, Cheng Deng, and Rongrong Ji. 2019. Pyramidal person re-identification via multi-loss dynamic training. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
8514\u20138522."},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.2972104"},{"issue":"4","key":"e_1_3_1_51_2","doi-asserted-by":"crossref","DOI":"10.1145\/3501404","article-title":"Clustering matters: Sphere feature for fully unsupervised person re-identification","volume":"18","author":"Zheng Yi","year":"2022","unstructured":"Yi Zheng, Yong Zhou, Jiaqi Zhao, Ying Chen, Rui Yao, Bing Liu, and Abdulmotaleb El Saddik. 2022. Clustering matters: Sphere feature for fully unsupervised person re-identification. ACM Transactions on Multimedia Computing, Communications, and Applications 18, 4 (2022).","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2018.2873599"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3595183","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3595183","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:08Z","timestamp":1750182548000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3595183"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,12]]},"references-count":51,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,11,30]]}},"alternative-id":["10.1145\/3595183"],"URL":"https:\/\/doi.org\/10.1145\/3595183","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,12]]},"assertion":[{"value":"2023-01-28","order":0,"name":"received","label":"Received","group":{"name":"publication
_history","label":"Publication History"}},{"value":"2023-04-20","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}