{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T16:29:51Z","timestamp":1776443391498,"version":"3.51.2"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T00:00:00Z","timestamp":1712016000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T00:00:00Z","timestamp":1712016000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"the Key-Area Research and Development Program of Guangdong Province","award":["2020B0101100001"],"award-info":[{"award-number":["2020B0101100001"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Data Sci. Eng."],"published-print":{"date-parts":[[2024,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Unsupervised hashing for cross-modal retrieval has received much attention in the data mining area. Recent methods rely on image-text paired data to conduct unsupervised cross-modal hashing in batch samples. There are two main limitations for existing models: (1) learning of cross-modal representations is restricted to batches; (2) semantically similar samples may be wrongly treated as negative. In this paper, we propose a novel category-level contrastive learning for unsupervised cross-modal hashing, which alleviates the above problems and improves cross-modal query accuracy. To break the limitation of learning in small batches, a selected memory module is first proposed to take global relations into account. Then, we obtain pseudo labels through clustering and combine the labels with the Hadamard Matrix for category-centered learning. To reduce wrong negatives, we further propose a memory bank to store clusters of samples and construct negatives by selecting samples from different categories for contrastive learning. Extensive experiments show the significant superiority of our approach over the state-of-the-art models on MIRFLICKR-25K and NUS-WIDE datasets.<\/jats:p>","DOI":"10.1007\/s41019-024-00248-9","type":"journal-article","created":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T16:01:46Z","timestamp":1712073706000},"page":"251-263","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Category-Level Contrastive Learning for Unsupervised Hashing in Cross-Modal Retrieval"],"prefix":"10.1007","volume":"9","author":[{"given":"Mengying","family":"Xu","sequence":"first","affiliation":[]},{"given":"Linyin","family":"Luo","sequence":"additional","affiliation":[]},{"given":"Hanjiang","family":"Lai","sequence":"additional","affiliation":[]},{"given":"Jian","family":"Yin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,4,2]]},"reference":[{"key":"248_CR1","doi-asserted-by":"crossref","unstructured":"Bronstein MM, Bronstein AM, Michel F et\u00a0al (2010) Data fusion through cross-modality metric learning using similarity-sensitive hashing. In: Computer vision pattern recognition","DOI":"10.1109\/CVPR.2010.5539928"},{"key":"248_CR2","doi-asserted-by":"crossref","unstructured":"Cao Y, Liu B, Long M et\u00a0al (2018) Cross-modal hamming hashing. In: European conference on computer vision","DOI":"10.1007\/978-3-030-01246-5_13"},{"issue":"5","key":"248_CR3","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1109\/3477.623240","volume":"27","author":"D Chaudhuri","year":"1997","unstructured":"Chaudhuri D, Chaudhuri B (1997) A novel multiseed nonhierarchical data clustering technique. IEEE Trans Syst Man Cybern Part B (Cybernetics) 27(5):871\u2013876","journal-title":"IEEE Trans Syst Man Cybern Part B (Cybernetics)"},{"key":"248_CR4","unstructured":"Chen T, Kornblith S, Norouzi M et\u00a0al (2021) A simple framework for contrastive learning of visual representations. In: International conference on machine learning"},{"key":"248_CR5","doi-asserted-by":"crossref","unstructured":"Chua TS, Tang J, Hong R, et\u00a0al (2009) NUS-WIDE: a real-world web image database from National University of Singapore. In: ACM international conference on image and video retrieval","DOI":"10.1145\/1646396.1646452"},{"issue":"11","key":"248_CR6","doi-asserted-by":"publisher","first-page":"5427","DOI":"10.1109\/TIP.2016.2607421","volume":"25","author":"G Ding","year":"2016","unstructured":"Ding G, Guo Y, Zhou J et al (2016) Large-scale cross-modality search via collective matrix factorization hashing. IEEE Trans Image Process 25(11):5427\u20135440","journal-title":"IEEE Trans Image Process"},{"key":"248_CR7","first-page":"3","volume":"32","author":"SP Harter","year":"1997","unstructured":"Harter SP, Hert CA (1997) Evaluation of information retrieval systems: approaches, issues, and methods. Ann Rev Inf Sci Technol (ARIST) 32:3\u201394","journal-title":"Ann Rev Inf Sci Technol (ARIST)"},{"key":"248_CR8","doi-asserted-by":"crossref","unstructured":"He K, Fan H, Wu Y et\u00a0al (2020) Momentum contrast for unsupervised visual representation learning. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 9729\u20139738","DOI":"10.1109\/CVPR42600.2020.00975"},{"issue":"3","key":"248_CR9","first-page":"3877","volume":"45","author":"P Hu","year":"2022","unstructured":"Hu P, Zhu H, Lin J et al (2022) Unsupervised contrastive cross-modal hashing. IEEE Trans Pattern Anal Mach Intell 45(3):3877\u20133889","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"248_CR10","doi-asserted-by":"crossref","unstructured":"Huiskes MJ, Lew MS (2008) The MIR flickr retrieval evaluation. In: Proceedings of the 1st ACM international conference on multimedia information retrieval. Association for Computing Machinery, New York, pp 39\u201343","DOI":"10.1145\/1460096.1460104"},{"key":"248_CR11","doi-asserted-by":"crossref","unstructured":"Jang YK, Cho NI (2021) Self-supervised product quantization for deep unsupervised image retrieval. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 12085\u201312094","DOI":"10.1109\/ICCV48922.2021.01187"},{"key":"248_CR12","doi-asserted-by":"crossref","unstructured":"Jiang QY, Li WJ (2016) Deep cross-modal hashing. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 3270\u20133278","DOI":"10.1109\/CVPR.2017.348"},{"key":"248_CR13","unstructured":"Kim W, Son B, Kim I (2021) ViLT: vision-and-language transformer without convolution or region supervision. In: International conference on machine learning, PMLR, pp 5583\u20135594"},{"key":"248_CR14","unstructured":"Kumar S, Udupa R (2011) Learning hash functions for cross-view similarity search. In: Proceedings of the twenty-second international joint conference on artificial intelligence, IJCAI 11, pp 1360\u20131365"},{"key":"248_CR15","doi-asserted-by":"publisher","first-page":"165034","DOI":"10.1109\/ACCESS.2020.3022672","volume":"8","author":"Y Li","year":"2020","unstructured":"Li Y, Wang Y, Miao Z et al (2020) Contrastive self-supervised hashing with dual pseudo agreement. IEEE Access 8:165034\u2013165043","journal-title":"IEEE Access"},{"key":"248_CR16","doi-asserted-by":"crossref","unstructured":"Lin Z, Ding G, Hu M et\u00a0al (2015) Semantics-preserving hashing for cross-view retrieval. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3864\u20133872","DOI":"10.1109\/CVPR.2015.7299011"},{"key":"248_CR17","doi-asserted-by":"crossref","unstructured":"Liu S, Qian S, Guan Y et\u00a0al (2020) Joint-modal distribution-based similarity hashing for large-scale unsupervised deep cross-modal retrieval. In: Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval, pp 1379\u20131388","DOI":"10.1145\/3397271.3401086"},{"key":"248_CR18","unstructured":"Oord Avd, Li Y, Vinyals O (2018) Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748"},{"key":"248_CR19","doi-asserted-by":"crossref","unstructured":"Qiu Z, Su Q, Ou Z et\u00a0al (2021) Unsupervised hashing with contrastive information bottleneck. arXiv preprint arXiv:2105.06138","DOI":"10.24963\/ijcai.2021\/133"},{"key":"248_CR20","unstructured":"Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556"},{"key":"248_CR21","doi-asserted-by":"crossref","unstructured":"Song J, Yang Y, Yang Y et\u00a0al (2013) Inter-media hashing for large-scale retrieval from heterogeneous data sources. In: Proceedings of the 2013 ACM SIGMOD international conference on management of data, New York, SIGMOD 13, pp 785\u2013796","DOI":"10.1145\/2463676.2465274"},{"key":"248_CR22","doi-asserted-by":"crossref","unstructured":"Su S, Zhong Z, Zhang C (2019) Deep joint-semantics reconstructing hashing for large-scale unsupervised cross-modal retrieval. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 3027\u20133035","DOI":"10.1109\/ICCV.2019.00312"},{"issue":"5","key":"248_CR23","first-page":"739","volume":"117","author":"RR Varshamov","year":"1957","unstructured":"Varshamov RR (1957) Estimate of the number of signals in error correcting codes. Doklady Akademia Nauk Sssr 117(5):739\u2013741","journal-title":"Doklady Akademia Nauk Sssr"},{"key":"248_CR24","unstructured":"Wang H, Xiao R, Li Y et\u00a0al (2017) PiCO: contrastive label disambiguation for partial label learning. In: International conference on learning representations"},{"key":"248_CR25","doi-asserted-by":"crossref","unstructured":"Wang L, Pan Y, Lai H et\u00a0al (2022) Image retrieval with well-separated semantic hash centers. In: Asian conference on computer vision","DOI":"10.1007\/978-3-031-26351-4_43"},{"key":"248_CR26","doi-asserted-by":"crossref","unstructured":"Wang L, Pan Y, Liu C et\u00a0al (2023) Deep hashing with minimal-distance-separated hash centers. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 23455\u201323464","DOI":"10.1109\/CVPR52729.2023.02246"},{"key":"248_CR27","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1016\/j.neucom.2020.03.019","volume":"400","author":"X Wang","year":"2020","unstructured":"Wang X, Zou X, Bakker EM et al (2020) Self-constraining and attention-based hashing network for bit-scalable cross-modal retrieval. Neurocomputing 400:255\u2013271","journal-title":"Neurocomputing"},{"key":"248_CR28","doi-asserted-by":"publisher","first-page":"328","DOI":"10.3390\/jimaging8120328","volume":"8","author":"M Williams-Lekuona","year":"2022","unstructured":"Williams-Lekuona M, Cosma G, Phillips I (2022) A framework for enabling unpaired multi-modal learning for deep cross-modal hashing retrieval. J Imaging 8:328","journal-title":"J Imaging"},{"issue":"7","key":"248_CR29","doi-asserted-by":"publisher","first-page":"1695","DOI":"10.1109\/TPAMI.2018.2845842","volume":"41","author":"B Wu","year":"2018","unstructured":"Wu B, Ghanem B (2018) $$\\ell _p$$-Box ADMM: a versatile framework for integer programming. IEEE Trans Pattern Anal Mach Intell 41(7):1695\u20131708","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"248_CR30","doi-asserted-by":"crossref","unstructured":"Wu G, Lin Z, Han J et\u00a0al (2018a) Unsupervised deep hashing via binary latent factor models for large-scale cross-modal retrieval. In: Proceedings of the twenty-seventh international joint conference on artificial intelligence, IJCAI-18, pp 2854\u20132860","DOI":"10.24963\/ijcai.2018\/396"},{"key":"248_CR31","doi-asserted-by":"crossref","unstructured":"Wu Z, Xiong Y, Yu SX et\u00a0al (2018b) Unsupervised feature learning via non-parametric instance discrimination. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3733\u20133742","DOI":"10.1109\/CVPR.2018.00393"},{"key":"248_CR32","doi-asserted-by":"crossref","unstructured":"Yang D, Wu D, Zhang W et\u00a0al (2020) Deep semantic-alignment hashing for unsupervised cross-modal retrieval. In: Proceedings of the 2020 international conference on multimedia retrieval, pp 44\u201352","DOI":"10.1145\/3372278.3390673"},{"key":"248_CR33","doi-asserted-by":"crossref","unstructured":"Yu J, Zhou H, Zhan Y et\u00a0al (2021) Deep graph-neighbor coherence preserving network for unsupervised cross-modal hashing. In: Proceedings of the AAAI conference on artificial intelligence, pp 4626\u20134634","DOI":"10.1609\/aaai.v35i5.16592"},{"key":"248_CR34","doi-asserted-by":"crossref","unstructured":"Zhang D, Li WJ (2014) Large-scale supervised multimodal hashing with semantic correlation maximization. In: AAAI conference on artificial intelligence","DOI":"10.1609\/aaai.v28i1.8995"},{"key":"248_CR35","doi-asserted-by":"crossref","unstructured":"Zhang J, Peng Y, Yuan M (2018a) Unsupervised generative adversarial cross-modal hashing. In: Proceedings of the AAAI conference on artificial intelligence","DOI":"10.1609\/aaai.v32i1.11263"},{"key":"248_CR36","doi-asserted-by":"crossref","unstructured":"Zhang Q, Lei Z, Zhang Z et\u00a0al (2020) Context-aware attention network for image-text retrieval. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 3536\u20133545","DOI":"10.1109\/CVPR42600.2020.00359"},{"key":"248_CR37","doi-asserted-by":"crossref","unstructured":"Zhang X, Lai H, Feng J (2017) Attention-aware deep adversarial hashing for cross-modal retrieval. In: European conference on computer vision","DOI":"10.1007\/978-3-030-01267-0_36"},{"key":"248_CR38","doi-asserted-by":"crossref","unstructured":"Zhang X, Lai H, Feng J (2018b) Attention-aware deep adversarial hashing for cross-modal retrieval. In: Proceedings of the European conference on computer vision (ECCV), pp 591\u2013606","DOI":"10.1007\/978-3-030-01267-0_36"},{"key":"248_CR39","doi-asserted-by":"crossref","unstructured":"Zhen L, Hu P, Wang X et\u00a0al (2019) Deep supervised cross-modal retrieval. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 10394\u201310403","DOI":"10.1109\/CVPR.2019.01064"},{"key":"248_CR40","doi-asserted-by":"crossref","unstructured":"Zhou J, Ding G, Guo Y (2014) Latent semantic sparse hashing for cross-modal similarity search. In: Proceedings of the 37th international ACM SIGIR conference on research development in information retrieval, New York, SIGIR \u201914, pp 415\u2013424","DOI":"10.1145\/2600428.2609610"}],"container-title":["Data Science and Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-024-00248-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s41019-024-00248-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s41019-024-00248-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,9]],"date-time":"2024-09-09T02:24:39Z","timestamp":1725848679000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s41019-024-00248-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,2]]},"references-count":40,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,9]]}},"alternative-id":["248"],"URL":"https:\/\/doi.org\/10.1007\/s41019-024-00248-9","relation":{},"ISSN":["2364-1185","2364-1541"],"issn-type":[{"value":"2364-1185","type":"print"},{"value":"2364-1541","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,4,2]]},"assertion":[{"value":"28 November 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 February 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 February 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 April 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}