{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,16]],"date-time":"2026-06-16T05:21:40Z","timestamp":1781587300229,"version":"3.54.5"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2024,3,8]],"date-time":"2024-03-08T00:00:00Z","timestamp":1709856000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62302338 and 62006174"],"award-info":[{"award-number":["62302338 and 62006174"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100007129","name":"Shandong Provincial Natural Science Foundation","doi-asserted-by":"crossref","award":["ZR2022QF046 and ZR2023MF033"],"award-info":[{"award-number":["ZR2022QF046 and ZR2023MF033"]}],"id":[{"id":"10.13039\/501100007129","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100005230","name":"Natural Science Foundation of Chongqing","doi-asserted-by":"crossref","award":["CSTB2023NSCQ-MSX0407"],"award-info":[{"award-number":["CSTB2023NSCQ-MSX0407"]}],"id":[{"id":"10.13039\/501100005230","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Science and Technology Research Program of Chongqing Municipal Education Commission","award":["KJQN202200551"],"award-info":[{"award-number":["KJQN202200551"]}]},{"name":"Chongqing Normal University Foundation","award":["21XLB026"],"award-info":[{"award-number":["21XLB026"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. 
Appl."],"published-print":{"date-parts":[[2024,6,30]]},"abstract":"<jats:p>\n            Cross-modal retrieval methods based on hashing have gained significant attention in both academic and industrial research. Deep learning techniques have played a crucial role in advancing supervised cross-modal hashing methods, leading to significant practical improvements. Despite these achievements, current deep cross-modal hashing still encounters some underexplored limitations. Specifically, most of the available deep hashing usually utilizes pair-wise or triplet-wise strategies to promote the separation of the inter-classes by calculating the relative similarities between samples, weakening the compactness of intra-class data from different modalities, which could generate ambiguous neighborhoods. In this article, the Deep Neighborhood-aware Proxy Hashing (DNPH) framework is proposed to learn a discriminative embedding space with the original neighborhood relation preserved. By introducing learnable shared category proxies, the neighborhood-aware proxy loss is proposed to project the heterogeneous data into a unified common embedding, in which the sample is pulled closer to the corresponding category proxy and is pushed away from other proxies, capturing small within-class scatter and big between-class scatter. To enhance the quality of the obtained binary codes, the uniform distribution constraint is developed to make each hash bit independently obey the discrete uniform distribution. In addition, the discrimination loss is designed to preserve modality-specific semantic information of samples. Extensive experiments are performed on three benchmark datasets to prove that our proposed DNPH framework achieves comparable or even better performance compared with the state-of-the-art cross-modal retrieval applications. 
The corresponding code implementation of our DNPH framework is as follows:\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" xlink:href=\"https:\/\/github.com\/QinLab-WFU\/OUR-DNPH\">https:\/\/github.com\/QinLab-WFU\/OUR-DNPH<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3643639","type":"journal-article","created":{"date-parts":[[2024,1,27]],"date-time":"2024-01-27T12:51:15Z","timestamp":1706359875000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":51,"title":["Deep Neighborhood-aware Proxy Hashing with Uniform Distribution Constraint for Cross-modal Retrieval"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-8805-8958","authenticated-orcid":false,"given":"Yadong","family":"Huo","sequence":"first","affiliation":[{"name":"Qufu Normal University, Rizhao, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7976-318X","authenticated-orcid":false,"given":"Qin","family":"Qibing","sequence":"additional","affiliation":[{"name":"Weifang University, Ocean University of China, Weifang, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1695-4629","authenticated-orcid":false,"given":"Jiangyan","family":"Dai","sequence":"additional","affiliation":[{"name":"Weifang University, Weifang, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7459-2510","authenticated-orcid":false,"given":"Wenfeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Chongqing Normal University, Chongqing, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4087-3677","authenticated-orcid":false,"given":"Lei","family":"Huang","sequence":"additional","affiliation":[{"name":"Ocean University of China, Qingdao, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-1131-1407","authenticated-orcid":false,"given":"Chengduan","family":"Wang","sequence":"additional","affiliation":[{"name":"Weifang University, Weifang, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2024,3,8]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"214","volume-title":"Proceedings of the International Conference on Machine Learning","volume":"70","author":"Arjovsky Mart\u00edn","year":"2017","unstructured":"Mart\u00edn Arjovsky, Soumith Chintala, and L\u00e9on Bottou. 2017. Wasserstein generative adversarial networks. In Proceedings of the International Conference on Machine Learning, Vol. 70. 214\u2013223."},{"key":"e_1_3_2_3_2","first-page":"525","volume-title":"Proceedings of the ACM SIGMM International Conference on Multimedia Information Retrieval","author":"Bai Cong","year":"2020","unstructured":"Cong Bai, Chao Zeng, Qing Ma, Jinglin Zhang, and Shengyong Chen. 2020. Deep adversarial discrete hashing for cross-modal retrieval. In Proceedings of the ACM SIGMM International Conference on Multimedia Information Retrieval. 525\u2013531."},{"key":"e_1_3_2_4_2","first-page":"207","volume-title":"Proceedings of the European Conference on Computer Vision","volume":"11205","author":"Cao Yue","year":"2018","unstructured":"Yue Cao, Bin Liu, Mingsheng Long, and Jianmin Wang. 2018. Cross-modal hamming hashing. In Proceedings of the European Conference on Computer Vision, Vol. 11205. 207\u2013223."},{"key":"e_1_3_2_5_2","first-page":"1287","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Cao Yue","year":"2018","unstructured":"Yue Cao, Bin Liu, Mingsheng Long, and Jianmin Wang. 2018. HashGAN: Deep learning to hash with pair conditional Wasserstein GAN. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
1287\u20131296."},{"key":"e_1_3_2_6_2","doi-asserted-by":"crossref","first-page":"3291","DOI":"10.1145\/3503161.3548195","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Chen Dapeng","year":"2022","unstructured":"Dapeng Chen, Min Wang, Haobin Chen, Lin Wu, Jing Qin, and Wei Peng. 2022. Cross-modal retrieval with heterogeneous graph embedding. In Proceedings of the ACM International Conference on Multimedia. 3291\u20133300."},{"issue":"6","key":"e_1_3_2_7_2","doi-asserted-by":"crossref","first-page":"7270","DOI":"10.1109\/TPAMI.2022.3218591","article-title":"Deep learning for instance retrieval: A survey","volume":"45","author":"Chen Wei","year":"2023","unstructured":"Wei Chen, Yu Liu, Weiping Wang, Erwin M. Bakker, Theodoros Georgiou, Paul W. Fieguth, Li Liu, and Michael S. Lew. 2023. Deep learning for instance retrieval: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 45, 6 (2023), 7270\u20137292.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"issue":"8","key":"e_1_3_2_8_2","doi-asserted-by":"crossref","first-page":"3893","DOI":"10.1109\/TIP.2018.2821921","article-title":"Triplet-based deep hashing network for cross-modal retrieval","volume":"27","author":"Deng Cheng","year":"2018","unstructured":"Cheng Deng, Zhaojia Chen, Xianglong Liu, Xinbo Gao, and Dacheng Tao. 2018. Triplet-based deep hashing network for cross-modal retrieval. IEEE Trans. Image Process. 27, 8 (2018), 3893\u20133903.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_9_2","first-page":"4171","volume-title":"Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. 
In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics. 4171\u20134186."},{"key":"e_1_3_2_10_2","first-page":"9437","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Doan Khoa D.","year":"2022","unstructured":"Khoa D. Doan, Peng Yang, and Ping Li. 2022. One loss for quantization: Deep hashing with discrete Wasserstein distributional matching. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9437\u20139447."},{"key":"e_1_3_2_11_2","first-page":"1","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An image is worth 16 \\(\\times\\) 16 words: Transformers for image recognition at scale. In Proceedings of the International Conference on Learning Representations. 1\u201322."},{"key":"e_1_3_2_12_2","doi-asserted-by":"crossref","first-page":"6535","DOI":"10.1109\/TIP.2020.2991510","article-title":"Semantic Neighborhood-aware Deep Facial Expression Recognition","volume":"29","author":"Fu Yongjian","year":"2020","unstructured":"Yongjian Fu, Xintian Wu, Xi Li, Zhijie Pan, and Daxin Luo. 2020. Semantic Neighborhood-aware Deep Facial Expression Recognition. IEEE Trans. Image Process. 29 (2020), 6535\u20136548.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_13_2","first-page":"159","volume-title":"Proceedings of the ACM SIGMM International Conference on Multimedia Information Retrieval","author":"Gu Wen","year":"2019","unstructured":"Wen Gu, Xiaoyan Gu, Jingzi Gu, Bo Li, Zhi Xiong, and Weiping Wang. 2019. Adversary guided asymmetric hashing for cross-modal retrieval. 
In Proceedings of the ACM SIGMM International Conference on Multimedia Information Retrieval. 159\u2013167."},{"key":"e_1_3_2_14_2","first-page":"5767","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Gulrajani Ishaan","year":"2017","unstructured":"Ishaan Gulrajani, Faruk Ahmed, Mart\u00edn Arjovsky, Vincent Dumoulin, and Aaron C. Courville. 2017. Improved training of Wasserstein GANs. In Proceedings of the Conference on Neural Information Processing Systems. 5767\u20135777."},{"key":"e_1_3_2_15_2","first-page":"770","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 770\u2013778."},{"issue":"3","key":"e_1_3_2_16_2","first-page":"3877","article-title":"Unsupervised contrastive cross-modal hashing","volume":"45","author":"Hu Peng","year":"2023","unstructured":"Peng Hu, Hongyuan Zhu, Jie Lin, Dezhong Peng, Yin-Ping Zhao, and Xi Peng. 2023. Unsupervised contrastive cross-modal hashing. IEEE Trans. Pattern Anal. Mach. Intell. 45, 3 (2023), 3877\u20133889.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_3_2_17_2","doi-asserted-by":"crossref","unstructured":"Yadong Huo Qibing Qin Jiangyan Dai Lei Wang Wenfeng Zhang Lei Huang and Chengduan Wang. 2024. Deep semantic-aware proxy hashing for multi-label cross-modal retrieval. IEEE Trans. Circ. Syst. Video Technol. 34 1 (2024) 576\u2013589.","DOI":"10.1109\/TCSVT.2023.3285266"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.348"},{"key":"e_1_3_2_19_2","first-page":"1","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. 
Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations. 1\u201314."},{"issue":"6","key":"e_1_3_2_20_2","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky Alex","year":"2017","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2017. ImageNet classification with deep convolutional neural networks. Commun. ACM 60, 6 (2017), 84\u201390.","journal-title":"Commun. ACM"},{"issue":"2605","key":"e_1_3_2_21_2","first-page":"2579","article-title":"Visualizing data using t-SNE","volume":"9","author":"Laurens Van Der Maaten","year":"2008","unstructured":"Van Der Maaten Laurens and Geoffrey Hinton. 2008. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2605 (2008), 2579\u20132605.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_22_2","first-page":"4242","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Li Chao","year":"2018","unstructured":"Chao Li, Cheng Deng, Ning Li, Wei Liu, Xinbo Gao, and Dacheng Tao. 2018. Self-supervised adversarial hashing networks for cross-modal retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4242\u20134251."},{"key":"e_1_3_2_23_2","first-page":"10275","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Li Tieying","year":"2022","unstructured":"Tieying Li, Xiaochun Yang, Bin Wang, Chong Xi, Hanzhong Zheng, and Xiangmin Zhou. 2022. Bi-CMR: Bidirectional reinforcement guided hashing for effective cross-modal retrieval. In Proceedings of the AAAI Conference on Artificial Intelligence. 
10275\u201310282."},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i3.16296"},{"issue":"2","key":"e_1_3_2_25_2","doi-asserted-by":"crossref","first-page":"920","DOI":"10.1109\/TCSVT.2022.3203247","article-title":"Deep supervised dual cycle adversarial network for cross-modal retrieval","volume":"33","author":"Liao Lei","year":"2023","unstructured":"Lei Liao, Meng Yang, and Bob Zhang. 2023. Deep supervised dual cycle adversarial network for cross-modal retrieval. IEEE Trans. Circ. Syst. Video Technol. 33, 2 (2023), 920\u2013934.","journal-title":"IEEE Trans. Circ. Syst. Video Technol."},{"key":"e_1_3_2_26_2","first-page":"11515","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Lin Kaiyi","year":"2020","unstructured":"Kaiyi Lin, Xing Xu, Lianli Gao, Zheng Wang, and Heng Tao Shen. 2020. Learning cross-aligned latent embeddings for zero-shot cross-modal retrieval. In Proceedings of the AAAI Conference on Artificial Intelligence. 11515\u201311522."},{"key":"e_1_3_2_27_2","first-page":"1129","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Lu Xu","year":"2019","unstructured":"Xu Lu, Lei Zhu, Zhiyong Cheng, Jingjing Li, Xiushan Nie, and Huaxiang Zhang. 2019. Flexible online multi-modal hashing for large-scale multimedia retrieval. In Proceedings of the ACM International Conference on Multimedia. 1129\u20131137."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331217"},{"issue":"1","key":"e_1_3_2_29_2","first-page":"15:1\u201315:50","article-title":"A survey on deep hashing methods","volume":"17","author":"Luo Xiao","year":"2023","unstructured":"Xiao Luo, Haixin Wang, Daqing Wu, Chong Chen, Minghua Deng, Jianqiang Huang, and Xian-Sheng Hua. 2023. A survey on deep hashing methods. ACM Trans. Knowl. Discov. Data 17, 1 (2023), 15:1\u201315:50.","journal-title":"ACM Trans. Knowl. Discov. 
Data"},{"key":"e_1_3_2_30_2","doi-asserted-by":"crossref","first-page":"986","DOI":"10.1109\/TIP.2020.3038365","article-title":"Asymmetric supervised consistent and specific hashing for cross-modal retrieval","volume":"30","author":"Meng Min","year":"2021","unstructured":"Min Meng, Haitao Wang, Jun Yu, Hui Chen, and Jigang Wu. 2021. Asymmetric supervised consistent and specific hashing for cross-modal retrieval. IEEE Trans. Image Process. 30 (2021), 986\u20131000.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_31_2","first-page":"8024","volume-title":"Proceedings of the Annual Conference on Neural Information Processing Systems","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas K\u00f6pf, Edward Z. Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An imperative style, high-performance deep learning library. In Proceedings of the Annual Conference on Neural Information Processing Systems. 8024\u20138035."},{"issue":"7","key":"e_1_3_2_32_2","doi-asserted-by":"crossref","first-page":"2852","DOI":"10.1109\/TCSVT.2020.3032402","article-title":"Unsupervised deep multi-similarity hashing with semantic structure for image retrieval","volume":"31","author":"Qin Qibing","year":"2021","unstructured":"Qibing Qin, Lei Huang, Zhiqiang Wei, Kezhen Xie, and Wenfeng Zhang. 2021. Unsupervised deep multi-similarity hashing with semantic structure for image retrieval. IEEE Trans. Circ. Syst. Video Technol. 31, 7 (2021), 2852\u20132865.","journal-title":"IEEE Trans. Circ. Syst. Video Technol."},{"key":"e_1_3_2_33_2","doi-asserted-by":"crossref","unstructured":"Qibing Qin Lei Huang Kezhen Xie Zhiqiang Wei Chengduan Wang and Wenfeng Zhang. 2023. 
Deep adaptive quadruplet hashing with probability sampling for large-scale image retrieval. IEEE Trans. Circ. Syst. Video Technol. 33 12 (2023) 7914\u20137927.","DOI":"10.1109\/TCSVT.2023.3281868"},{"key":"e_1_3_2_34_2","doi-asserted-by":"crossref","unstructured":"Qibing Qin Kezhen Xie Wenfeng Zhang Chengduan Wang and Lei Huang. 2024. Deep neighborhood structure-preserving hashing for large-scale image retrieval. IEEE Trans. Multimedia. 26 (2024) 1881\u20131893.","DOI":"10.1109\/TMM.2023.3289765"},{"issue":"2","key":"e_1_3_2_35_2","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1007\/s11263-019-01228-7","article-title":"Grad-CAM: Visual explanations from deep networks via gradient-based localization","volume":"128","author":"Selvaraju Ramprasaath R.","year":"2020","unstructured":"Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. 2020. Grad-CAM: Visual explanations from deep networks via gradient-based localization. Int. J. Comput. Vision 128, 2 (2020), 336\u2013359.","journal-title":"Int. J. Comput. Vision"},{"key":"e_1_3_2_36_2","volume-title":"Proceedings of the Annual Meeting of the Association for Computational Linguistics","author":"Sennrich Rico","year":"2016","unstructured":"Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural machine translation of rare words with subword units. In Proceedings of the Annual Meeting of the Association for Computational Linguistics."},{"key":"e_1_3_2_37_2","first-page":"4937","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops","author":"Sun Changchang","year":"2022","unstructured":"Changchang Sun, Hugo Latapie, Gaowen Liu, and Yan Yan. 2022. Deep normalized cross-modal hashing with bi-direction relation reasoning. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 
4937\u20134945."},{"key":"e_1_3_2_38_2","first-page":"499","volume-title":"Proceedings of the ACM International Conference on Multimedia Retrieval","author":"Sun Lina","year":"2023","unstructured":"Lina Sun, Yewen Li, and Yumin Dong. 2023. Learning from expert: Vision-language knowledge distillation for unsupervised cross-modal hashing retrieval. In Proceedings of the ACM International Conference on Multimedia Retrieval. 499\u2013507."},{"key":"e_1_3_2_39_2","first-page":"453","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Tu Junfeng","year":"2022","unstructured":"Junfeng Tu, Xueliang Liu, Zongxiang Lin, Richang Hong, and Meng Wang. 2022. Differentiable cross-modal hashing via multimodal transformers. In Proceedings of the ACM International Conference on Multimedia. 453\u2013461."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2022.3187023"},{"key":"e_1_3_2_41_2","doi-asserted-by":"crossref","unstructured":"Cedric Villani and Cedric Villani. 2009. The Wasserstein distances. Optimal Transport Old and New. 338 (2009) 93\u2013111.","DOI":"10.1007\/978-3-540-71050-9_6"},{"key":"e_1_3_2_42_2","first-page":"6165","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Chia-Hui","year":"2023","unstructured":"Chia-Hui Wang, Yu-Chee Tseng, Ting-Hui Chiang, and Yan-Ann Chen. 2023. Learning multi-scale representations with single-stream network for video retrieval. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 
6165\u20136175."},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.48"},{"key":"e_1_3_2_44_2","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1016\/j.neucom.2020.03.019","article-title":"Self-constraining and attention-based hashing network for bit-scalable cross-modal retrieval","volume":"400","author":"Wang Xinzhi","year":"2020","unstructured":"Xinzhi Wang, Xitao Zou, Erwin M. Bakker, and Song Wu. 2020. Self-constraining and attention-based hashing network for bit-scalable cross-modal retrieval. Neurocomputing 400 (2020), 255\u2013271.","journal-title":"Neurocomputing"},{"key":"e_1_3_2_45_2","first-page":"871","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Wang Yongxin","year":"2020","unstructured":"Yongxin Wang, Xin Luo, and Xin-Shun Xu. 2020. Label embedding online hashing for cross-modal retrieval. In Proceedings of the ACM International Conference on Multimedia. 871\u2013879."},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","unstructured":"Yuxin Wei Ligang Zheng Guoping Qiu and Guocan Cai. 2023. Cross-modal retrieval based on shared proxies. Retrieved from https:\/\/www.researchsquare.com\/article\/rs-2667484\/v1. DOI:10.21203\/rs.3.rs-2667484\/v1","DOI":"10.21203\/rs.3.rs-2667484\/v1"},{"key":"e_1_3_2_47_2","first-page":"2743","volume-title":"Proceedings of the IEEE International Conference on Image Processing","author":"Wu Fei","year":"2021","unstructured":"Fei Wu, Xiaokai Luo, Qinghua Huang, Pengfei Wei, Ying Sun, Xiwei Dong, and Zhiyong Wu. 2021. Semantic preserving generative adversarial network for cross-modal hashing. In Proceedings of the IEEE International Conference on Image Processing. 
2743\u20132747."},{"key":"e_1_3_2_48_2","doi-asserted-by":"crossref","first-page":"2215","DOI":"10.1109\/TIP.2023.3265261","article-title":"Neighbor-guided consistent and contrastive learning for semi-supervised action recognition","volume":"32","author":"Wu Jianlong","year":"2023","unstructured":"Jianlong Wu, Wei Sun, Tian Gan, Ning Ding, Feijun Jiang, Jialie Shen, and Liqiang Nie. 2023. Neighbor-guided consistent and contrastive learning for semi-supervised action recognition. IEEE Trans. Image Process. 32 (2023), 2215\u20132227.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_49_2","first-page":"3173","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Xu Chengyin","year":"2022","unstructured":"Chengyin Xu, Zenghao Chai, Zhengzhuo Xu, Chun Yuan, Yanbo Fan, and Jue Wang. 2022. HyP2 Loss: Beyond hypersphere metric space for multi-label image retrieval. In Proceedings of the ACM International Conference on Multimedia. 3173\u20133184."},{"key":"e_1_3_2_50_2","first-page":"982","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Xu Ruiqing","year":"2019","unstructured":"Ruiqing Xu, Chao Li, Junchi Yan, Cheng Deng, and Xianglong Liu. 2019. Graph convolutional network hashing for cross-modal retrieval. In Proceedings of the International Joint Conference on Artificial Intelligence. 982\u2013988."},{"issue":"4","key":"e_1_3_2_51_2","first-page":"105:1\u2013105:20","article-title":"Sequential cross-modal hashing learning via multi-scale correlation mining","volume":"15","author":"Ye Zhaoda","year":"2020","unstructured":"Zhaoda Ye and Yuxin Peng. 2020. Sequential cross-modal hashing learning via multi-scale correlation mining. ACM Trans. Multimedia Comput. Commun. Appl. 15, 4 (2020), 105:1\u2013105:20.","journal-title":"ACM Trans. Multimedia Comput. Commun. 
Appl."},{"key":"e_1_3_2_52_2","first-page":"4626","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Yu Jun","year":"2021","unstructured":"Jun Yu, Hao Zhou, Yibing Zhan, and Dacheng Tao. 2021. Deep graph-neighbor coherence preserving network for unsupervised cross-modal hashing. In Proceedings of the AAAI Conference on Artificial Intelligence. 4626\u20134634."},{"key":"e_1_3_2_53_2","doi-asserted-by":"crossref","first-page":"108262","DOI":"10.1016\/j.patcog.2021.108262","article-title":"Discrete online cross-modal hashing","volume":"122","author":"Zhan Yu-Wei","year":"2022","unstructured":"Yu-Wei Zhan, Yongxin Wang, Yu Sun, Xiao-Ming Wu, Xin Luo, and Xin-Shun Xu. 2022. Discrete online cross-modal hashing. Pattern Recogn. 122 (2022), 108262.","journal-title":"Pattern Recogn."},{"issue":"1","key":"e_1_3_2_54_2","first-page":"2:1\u20132:22","article-title":"HCMSL: Hybrid cross-modal similarity learning for cross-modal retrieval","volume":"17","author":"Zhang Chengyuan","year":"2021","unstructured":"Chengyuan Zhang, Jiayu Song, Xiaofeng Zhu, Lei Zhu, and Shichao Zhang. 2021. HCMSL: Hybrid cross-modal similarity learning for cross-modal retrieval. ACM Trans. Multimedia Comput. Commun. Appl. 17, 1s (2021), 2:1\u20132:22.","journal-title":"ACM Trans. Multimedia Comput. Commun. Appl."},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2021.3053766"},{"key":"e_1_3_2_56_2","first-page":"1651","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Zhang Qi","year":"2022","unstructured":"Qi Zhang, Liang Hu, Longbing Cao, Chongyang Shi, Shoujin Wang, and Dora D. Liu. 2022. A probabilistic code balance constraint with compactness and informativeness enhancement for deep supervised hashing. In Proceedings of the International Joint Conference on Artificial Intelligence. 
1651\u20131657."},{"issue":"5","key":"e_1_3_2_57_2","first-page":"5091","article-title":"Modality-invariant asymmetric networks for cross-modal hashing","volume":"35","author":"Zhang Zheng","year":"2023","unstructured":"Zheng Zhang, Haoyang Luo, Lei Zhu, Guangming Lu, and Heng Tao Shen. 2023. Modality-invariant asymmetric networks for cross-modal hashing. IEEE Trans. Knowl. Data Eng. 35, 5 (2023), 5091\u20135104.","journal-title":"IEEE Trans. Knowl. Data Eng."},{"issue":"2","key":"e_1_3_2_58_2","first-page":"114:1\u2013114:21","article-title":"Discriminative visual similarity search with semantically cycle-consistent hashing networks","volume":"18","author":"Zhang Zheng","year":"2022","unstructured":"Zheng Zhang, Jianning Wang, Lei Zhu, and Guangming Lu. 2022. Discriminative visual similarity search with semantically cycle-consistent hashing networks. ACM Trans. Multimedia Comput. Commun. Appl. 18, 2s (2022), 114:1\u2013114:21.","journal-title":"ACM Trans. Multimedia Comput. Commun. Appl."},{"key":"e_1_3_2_59_2","doi-asserted-by":"crossref","first-page":"120913","DOI":"10.1016\/j.eswa.2023.120913","article-title":"One for more: Structured multi-modal hashing for multiple multimedia retrieval tasks","volume":"233","author":"Zheng Chaoqun","year":"2023","unstructured":"Chaoqun Zheng, Fengling Li, Lei Zhu, Zheng Zhang, and Wenpeng Lu. 2023. One for more: Structured multi-modal hashing for multiple multimedia retrieval tasks. Expert Syst. Appl. 233 (2023), 120913.","journal-title":"Expert Syst. Appl."},{"key":"e_1_3_2_60_2","first-page":"9204","volume-title":"Proceedings of the International Conference on Computer Vision","author":"Zhong Huasong","year":"2021","unstructured":"Huasong Zhong, Jianlong Wu, Chong Chen, Jianqiang Huang, Minghua Deng, Liqiang Nie, Zhouchen Lin, and Xian-Sheng Hua. 2021. Graph contrastive clustering. In Proceedings of the International Conference on Computer Vision. 
9204\u20139213."},{"key":"e_1_3_2_61_2","doi-asserted-by":"crossref","first-page":"4643","DOI":"10.1109\/TIP.2020.2974065","article-title":"Deep collaborative multi-view hashing for large-scale image search","volume":"29","author":"Zhu Lei","year":"2020","unstructured":"Lei Zhu, Xu Lu, Zhiyong Cheng, Jingjing Li, and Huaxiang Zhang. 2020. Deep collaborative multi-view hashing for large-scale image search. IEEE Trans. Image Process. 29 (2020), 4643\u20134655.","journal-title":"IEEE Trans. Image Process."},{"key":"e_1_3_2_62_2","doi-asserted-by":"crossref","unstructured":"Lei Zhu Chaoqun Zheng Weili Guan Jingjing Li Yang Yang and Heng Tao Shen. 2024. Multi-modal hashing for efficient multimedia retrieval: A survey. IEEE Trans. Knowl. Data Eng. 36 1 (2024) 239\u2013260.","DOI":"10.1109\/TKDE.2023.3282921"},{"key":"e_1_3_2_63_2","first-page":"1943","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Zhu Qiannan","year":"2019","unstructured":"Qiannan Zhu, Xiaofei Zhou, Jia Wu, and Jianlong Tan. 2019. Neighborhood-aware attentional representation for multilingual knowledge graphs. In Proceedings of the International Joint Conference on Artificial Intelligence. 
1943\u20131949."},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.09.053"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3643639","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3643639","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:50:28Z","timestamp":1750287028000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3643639"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,8]]},"references-count":63,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2024,6,30]]}},"alternative-id":["10.1145\/3643639"],"URL":"https:\/\/doi.org\/10.1145\/3643639","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,8]]},"assertion":[{"value":"2023-09-02","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-01-24","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-03-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}