{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,15]],"date-time":"2026-03-15T04:48:15Z","timestamp":1773550095043,"version":"3.50.1"},"reference-count":53,"publisher":"Association for Computing Machinery (ACM)","issue":"3","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62202272, 62172256, and 62202278"],"award-info":[{"award-number":["62202272, 62172256, and 62202278"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100007129","name":"Natural Science Foundation of Shandong Province","doi-asserted-by":"crossref","award":["ZR2019ZD06"],"award-info":[{"award-number":["ZR2019ZD06"]}],"id":[{"id":"10.13039\/501100007129","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2026,3,31]]},"abstract":"<jats:p>Unsupervised fine-grained image retrieval aims to retrieve specific subcategory images from large-scale unlabeled databases. The small inter-class and large intra-class variances inherent in fine-grained images present significant challenges for unsupervised model training and feature recognition. Without the guidance of supervised information, existing methods often fail to focus on fine-grained details, and multi-region features struggle to embed effectively into hash codes. In this article, we propose Fine-Grained Augmentation and Progressive Feature Integration for unsupervised fine-grained hashing, named FAPI. Specifically, from the perspective of unsupervised contrastive learning, we design fine-grained feature augmentation and cross-contrastive learning modules to enhance the capture of critical discriminative details. 
Additionally, from a feature extraction standpoint, we propose a progressive granularity feature integration module to extract and fuse multi-layer, multi-granularity features, ensuring effective fine-grained feature extraction and hash code embedding. Extensive experiments on five widely recognized fine-grained datasets demonstrate that FAPI significantly outperforms existing unsupervised methods, achieving state-of-the-art performance.<\/jats:p>","DOI":"10.1145\/3786797","type":"journal-article","created":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T14:47:45Z","timestamp":1769179665000},"page":"1-19","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Fine-Grained Augmentation and Progressive Feature Integration for Unsupervised Fine-Grained Hashing"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-6280-3170","authenticated-orcid":false,"given":"Yun-Cong","family":"Liu","sequence":"first","affiliation":[{"name":"School of Software, Shandong University, Jinan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3481-4892","authenticated-orcid":false,"given":"Zhen-Duo","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Software, Shandong University, Jinan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-1594-6716","authenticated-orcid":false,"given":"Qing-Ze","family":"Bai","sequence":"additional","affiliation":[{"name":"School of Software, Shandong University, Jinan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-4430-3101","authenticated-orcid":false,"given":"Xiao-Dong","family":"Xie","sequence":"additional","affiliation":[{"name":"School of Software, Shandong University, Jinan, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-5333-9705","authenticated-orcid":false,"given":"Hao","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Software, Shandong University, Jinan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6901-5476","authenticated-orcid":false,"given":"Xin","family":"Luo","sequence":"additional","affiliation":[{"name":"School of Software, Shandong University, Jinan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9972-7370","authenticated-orcid":false,"given":"Xin-Shun","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Software, Shandong University, Jinan, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,2,27]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2022.3174970"},{"key":"e_1_3_1_3_2","first-page":"446","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Bossard Lukas","year":"2014","unstructured":"Lukas Bossard, Matthieu Guillaumin, and Luc Van Gool. 2014. Food-101\u2014Mining discriminative components with random forests. In Proceedings of the European Conference on Computer Vision, 446\u2013461."},{"key":"e_1_3_1_4_2","first-page":"9912","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Caron Mathilde","year":"2020","unstructured":"Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin. 2020. Unsupervised learning of visual features by contrasting cluster assignments. 
In Proceedings of the Conference on Neural Information Processing Systems, 9912\u20139924."},{"key":"e_1_3_1_5_2","first-page":"1597","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey E. Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of the International Conference on Machine Learning, 1597\u20131607."},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2022.3145159"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.01635"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58580-8_12"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.01671"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01187"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2024.3393512"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1145\/3664647.3680763"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.2971105"},{"key":"e_1_3_1_14_2","first-page":"2","volume-title":"Proceedings of the CVPR Workshop on Fine-Grained Visual Categorization","author":"Khosla Aditya","year":"2011","unstructured":"Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao, and Li Fei-Fei. 2011. Novel dataset for fine-grained image categorization. In Proceedings of the CVPR Workshop on Fine-Grained Visual Categorization, 2."},{"key":"e_1_3_1_15_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. 
In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2013.77"},{"key":"e_1_3_1_17_2","first-page":"5639","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Laskin Michael","year":"2020","unstructured":"Michael Laskin, Aravind Srinivas, and Pieter Abbeel. 2020. CURL: Contrastive unsupervised representations for reinforcement learning. In Proceedings of the International Conference on Machine Learning, 5639\u20135650."},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i3.16296"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.106"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581783.3612043"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2019.03.022"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1631\/FITEE.2200297"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3664647.3680593"},{"key":"e_1_3_1_24_2","first-page":"53","volume-title":"Proceedings of the British Machine Vision Conference","author":"Ng Kam Woh","year":"2023","unstructured":"Kam Woh Ng, Xiatian Zhu, Jiun Tian Hoe, Chee Seng Chan, Tianyu Zhang, Yi-Zhe Song, and Tao Xiang. 2023. Unsupervised hashing with similarity distribution calibration. 
In Proceedings of the British Machine Vision Conference, 53\u201369."},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3323873.3325017"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICVGIP.2008.47"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15561-1_11"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2021\/133"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2018.2789887"},{"key":"e_1_3_1_30_2","first-page":"1218","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Shen Xiaobo","year":"2024","unstructured":"Xiaobo Shen, Haoyu Cai, Xiuwen Gong, and Yuhui Zheng. 2024. Contrastive transformer masked image hashing for degraded image retrieval. In Proceedings of the International Joint Conference on Artificial Intelligence, 1218\u20131226."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2024.3421583"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2024.3419414"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2023.3326994"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00289"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19781-9_31"},{"key":"e_1_3_1_36_2","volume-title":"Proceedings of the International Conference on Learning Representations","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. In Proceedings of the International Conference on Learning Representations."},{"key":"e_1_3_1_37_2","first-page":"806","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Su Shupeng","year":"2018","unstructured":"Shupeng Su, Chao Zhang, Kai Han, and Yonghong Tian. 2018. 
Greedy hash: Towards fast optimization for accurate hash coding in CNN. In Proceedings of the Conference on Neural Information Processing Systems, 806\u2013815."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2020\/479"},{"key":"e_1_3_1_39_2","unstructured":"Catherine Wah, Steve Branson, Peter Welinder, Pietro Perona, and Serge Belongie. 2011. The Caltech-UCSD birds-200-2011 dataset (2011)."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581783.3612303"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v36i3.20147"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2022.3203574"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2017.2688133"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2023.3299563"},{"key":"e_1_3_1_45_2","first-page":"5720","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Wei Xiu-Shen","year":"2021","unstructured":"Xiu-Shen Wei, Yang Shen, Xuhao Sun, Han-Jia Ye, and Jian Yang. 2021. \\(A^{2}\\) -NET: Learning attribute-aware hash codes for large-scale fine-grained image retrieval. 
In Proceedings of the Conference on Neural Information Processing Systems, 5720\u20135730."},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3126648"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3131042"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2011.170"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/2578726.2578736"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3414028"},{"key":"e_1_3_1_51_2","first-page":"1","article-title":"S3mix: Same category same semantics mixing for augmenting fine-grained images","volume":"20","author":"Zhang Zi-Chao","year":"2023","unstructured":"Zi-Chao Zhang, Zhen-Duo Chen, Zhen-Yu Xie, Xin Luo, and Xin-Shun Xu. 2023. S3mix: Same category same semantics mixing for augmenting fine-grained images. ACM Transactions on Multimedia Computing, Communications, and Applications 20 (2023), 1\u201316.","journal-title":"ACM Transactions on Multimedia Computing, Communications, and Applications"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.319"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2023.109543"},{"key":"e_1_3_1_54_2","first-page":"3612","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Zieba Maciej","year":"2018","unstructured":"Maciej Zieba, Piotr Semberecki, Tarek El-Gaaly, and Tomasz Trzcinski. 2018. BinGAN: Learning compact binary descriptors with a regularized GAN. 
In Proceedings of the Conference on Neural Information Processing Systems, 3612\u20133622."}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3786797","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,15]],"date-time":"2026-03-15T03:48:21Z","timestamp":1773546501000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3786797"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,27]]},"references-count":53,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,3,31]]}},"alternative-id":["10.1145\/3786797"],"URL":"https:\/\/doi.org\/10.1145\/3786797","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,27]]},"assertion":[{"value":"2025-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-12-13","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-02-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}