{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T04:27:02Z","timestamp":1777696022816,"version":"3.51.4"},"reference-count":37,"publisher":"SAGE Publications","issue":"3","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IDA"],"published-print":{"date-parts":[[2021,4,20]]},"abstract":"<jats:p>Due to the emergence of the era of big data, cross-modal learning have been applied to many research fields. As an efficient retrieval method, hash learning is widely used frequently in many cross-modal retrieval scenarios. However, most of existing hashing methods use fixed-length hash codes, which increase the computational costs for large-size datasets. Furthermore, learning hash functions is an NP hard problem. To address these problems, we initially propose a novel method named Cross-modal Variable-length Hashing Based on Hierarchy (CVHH), which can learn the hash functions more accurately to improve retrieval performance, and also reduce the computational costs and training time. The main contributions of CVHH are: (1) We propose a variable-length hashing algorithm to improve the algorithm performance; (2) We apply the hierarchical architecture to effectively reduce the computational costs and training time. To validate the effectiveness of CVHH, our extensive experimental results show the superior performance compared with recent state-of-the-art cross-modal methods on three benchmark datasets, WIKI, NUS-WIDE and MIRFlickr.<\/jats:p>","DOI":"10.3233\/ida-205162","type":"journal-article","created":{"date-parts":[[2021,4,23]],"date-time":"2021-04-23T14:47:26Z","timestamp":1619189246000},"page":"669-685","source":"Crossref","is-referenced-by-count":1,"title":["Cross-modal variable-length hashing based on hierarchy"],"prefix":"10.1177","volume":"25","author":[{"given":"Xiaojun","family":"Qi","sequence":"first","affiliation":[{"name":"Chongqing Key Laboratory of Image Cognition, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xianhua","family":"Zeng","sequence":"additional","affiliation":[{"name":"Chongqing Key Laboratory of Image Cognition, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shumin","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yicai","family":"Xie","sequence":"additional","affiliation":[{"name":"Chongqing Key Laboratory of Image Cognition, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China"},{"name":"Gannan Normal University, Jiangxi, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Liming","family":"Xu","sequence":"additional","affiliation":[{"name":"Chongqing Key Laboratory of Image Cognition, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","reference":[{"key":"10.3233\/IDA-205162_ref1","doi-asserted-by":"crossref","first-page":"259","DOI":"10.1016\/j.neuropsychologia.2014.06.027","article-title":"Improving the efficiency of multisensory integration in older adults: audio-visual temporal discrimination training reduces susceptibility to the sound-induced flash illusion","volume":"61","author":"Setti","year":"2014","journal-title":"Neuropsychologia"},{"key":"10.3233\/IDA-205162_ref2","doi-asserted-by":"crossref","unstructured":"J. Deng, W. Dong, R. socher et al., Magenet: A large-scale hierarchical image database, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2014, in: 248\u2013255.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"10.3233\/IDA-205162_ref3","doi-asserted-by":"crossref","unstructured":"F. Feng, X. Wang and R. Li, Cross-modal retrieval with correspondence autoencoder, in: Proceedings of the 22nd ACM International Conference on Multimedia, ACM, 2014, pp. 7\u201316.","DOI":"10.1145\/2647868.2654902"},{"issue":"8","key":"10.3233\/IDA-205162_ref4","doi-asserted-by":"crossref","first-page":"3826","DOI":"10.1109\/TIP.2016.2577885","article-title":"Generalized coupled dictionary learning approach with applications to cross-modal matching","volume":"25","author":"Mandal","year":"2016","journal-title":"IEEE Transactions on Image Processing"},{"key":"10.3233\/IDA-205162_ref5","doi-asserted-by":"crossref","unstructured":"J. Tang, K. Wang and L. Shao, Supervised matrix factorization hashing for cross-modal retrieval, IEEE Transactions on Image Processing 25(7) (2016), 3157\u20133166.","DOI":"10.1109\/TIP.2016.2564638"},{"key":"10.3233\/IDA-205162_ref6","doi-asserted-by":"crossref","unstructured":"Q.Y. Jiang and L.W. Jun, Deep cross-modal hashing, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 3232\u20133240.","DOI":"10.1109\/CVPR.2017.348"},{"key":"10.3233\/IDA-205162_ref7","unstructured":"K. Wang, Q. Yin, W. Wang et al., A comprehensive survey on cross-modal retrieval, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, arXiv preprint arXiv:1607.06215, 2016."},{"issue":"9","key":"10.3233\/IDA-205162_ref8","doi-asserted-by":"crossref","first-page":"2372","DOI":"10.1109\/TCSVT.2017.2705068","article-title":"An overview of cross-media retrieval: concepts, methodologies, benchmarks, and challenges","volume":"28","author":"Peng","year":"2017","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"10.3233\/IDA-205162_ref9","doi-asserted-by":"crossref","unstructured":"N. Rasiwasia, J.C. Pereira, E. Coviello et al., A new approach to cross-modal multimedia retrieva, in: Proceedings of the 18th ACM International Conference on Multimedia, ACM, 2010, pp. 251\u2013260.","DOI":"10.1145\/1873951.1873987"},{"issue":"2","key":"10.3233\/IDA-205162_ref10","doi-asserted-by":"crossref","first-page":"134","DOI":"10.1007\/s11263-011-0494-3","article-title":"Learning the relative importance of objects from tagged images for retrieval and cross-modal search","volume":"100","author":"Hwang","year":"2012","journal-title":"International Journal of Computer Vision"},{"issue":"12","key":"10.3233\/IDA-205162_ref11","doi-asserted-by":"crossref","first-page":"2639","DOI":"10.1162\/0899766042321814","article-title":"Canonical correlation analysis: an overview with application to learning methods","volume":"16","author":"Hardoon","year":"2004","journal-title":"Neural Computation"},{"key":"10.3233\/IDA-205162_ref12","doi-asserted-by":"crossref","unstructured":"A. Sharma, A. Kumar, H. Daume et al., Generalized multiview analysis: a discriminative latent space, in: 2012 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2012, pp. 2160\u20132167.","DOI":"10.1109\/CVPR.2012.6247923"},{"issue":"7","key":"10.3233\/IDA-205162_ref13","doi-asserted-by":"crossref","first-page":"390","DOI":"10.1016\/j.tics.2012.05.003","article-title":"Cortical oscillations and sensory predictions","volume":"16","author":"Amal","year":"2012","journal-title":"Trends in Cognitive Sciences"},{"key":"10.3233\/IDA-205162_ref14","doi-asserted-by":"crossref","unstructured":"C. Wang, H. Yang and C. Meinel, Deep semantic mapping for cross-modal retrieval, in: 2015 IEEE 27th International Conference on Tools with Artificial Intelligence (ICTAI), IEEE, 2015, pp. 234\u2013241.","DOI":"10.1109\/ICTAI.2015.45"},{"key":"10.3233\/IDA-205162_ref15","unstructured":"G. Andrew, R. Arora, J. Bilmes et al., Deep canonical correlation analysis, in: International Conference on Machine Learning, 2013, pp. 1247\u20131255."},{"issue":"2","key":"10.3233\/IDA-205162_ref16","first-page":"449","article-title":"Cross-modal retrieval with CNN visual features: a new baseline","volume":"47","author":"Wei","year":"2016","journal-title":"IEEE Transactions on Cybernetics"},{"key":"10.3233\/IDA-205162_ref17","unstructured":"G.S. Manku, M. Bawa and P. Raghavan, Symphony: Distributed Hashing in a Small World, in: USENIX Symposium on Internet Technologies and Systems, Vol. 10, 2003."},{"key":"10.3233\/IDA-205162_ref18","doi-asserted-by":"crossref","unstructured":"M. Datar, N. Immorlica, P. Indyk et al., Locality-sensitive hashing scheme based on p-stable distributions, in: Proceedings of the Twentieth Annual Symposium on Computational Geometry, ACM, 2004, pp. 253\u2013262.","DOI":"10.1145\/997817.997857"},{"issue":"1\u20133","key":"10.3233\/IDA-205162_ref20","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/0169-7439(87)80084-9","article-title":"Principal component analysis","volume":"2","author":"Wold","year":"1987","journal-title":"Chemometrics and Intelligent Laboratory Systems"},{"issue":"12","key":"10.3233\/IDA-205162_ref23","doi-asserted-by":"crossref","first-page":"2393","DOI":"10.1109\/TPAMI.2012.48","article-title":"Semi-supervised hashing for large-scale search","volume":"34","author":"Liu","year":"2012","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"10.3233\/IDA-205162_ref24","doi-asserted-by":"crossref","unstructured":"S. Kim and S. Choi, Semi-supervised discriminant hashing, in: 2011 IEEE 11th International Conference on Data Mining, IEEE, 2011, pp. 1122\u20131127.","DOI":"10.1109\/ICDM.2011.128"},{"key":"10.3233\/IDA-205162_ref25","doi-asserted-by":"crossref","unstructured":"W. Liu, J. Wang, R. Ji et al., Supervised hashing with kernels, in: 2012 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2012, pp. 2074\u20132081.","DOI":"10.1109\/CVPR.2012.6247912"},{"issue":"4","key":"10.3233\/IDA-205162_ref26","doi-asserted-by":"crossref","first-page":"181","DOI":"10.1145\/1090191.1080114","article-title":"Fast hash table lookup using extended bloom filter: an aid to network processing","volume":"35","author":"Song","year":"2005","journal-title":"ACM SIGCOMM Computer Communication Review. ACM"},{"key":"10.3233\/IDA-205162_ref27","doi-asserted-by":"crossref","unstructured":"F.M. Shen, C.H. Shen, W. Liu et al., Supervised discrete hashing, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 37\u201345.","DOI":"10.1109\/CVPR.2015.7298598"},{"key":"10.3233\/IDA-205162_ref28","unstructured":"S. Kumar and R. Udupa, Learning hash functions for cross-view similarity search, in: Twenty-Second International Joint Conference on Artificial Intelligence, 2011."},{"key":"10.3233\/IDA-205162_ref29","doi-asserted-by":"crossref","unstructured":"J. Song, Y. Yang and Y. Yang, Inter-media hashing for large-scale retrieval from heterogeneous data sources, in: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, ACM, 2013, pp. 785\u2013796.","DOI":"10.1145\/2463676.2465274"},{"issue":"5","key":"10.3233\/IDA-205162_ref30","doi-asserted-by":"crossref","first-page":"473","DOI":"10.1093\/bib\/bbq015","article-title":"A survey of sequence alignment algorithms for next-generation sequencing","volume":"11","author":"Lin","year":"2010","journal-title":"Briefings in Bioinformatics"},{"key":"10.3233\/IDA-205162_ref31","doi-asserted-by":"crossref","unstructured":"G. Ding, Y. Guo and J. Zhou, Collective matrix factorization hashing for multimodal data, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 2075\u20132082.","DOI":"10.1109\/CVPR.2014.267"},{"key":"10.3233\/IDA-205162_ref32","doi-asserted-by":"crossref","unstructured":"N. Rasiwasia, J.C Pereira, E. Coviello et al., A New Approach to Cross-Modal Multimedia Retrieval, in: International Conference on Multimedia. ACM, 2010.","DOI":"10.1145\/1873951.1873987"},{"key":"10.3233\/IDA-205162_ref33","doi-asserted-by":"crossref","unstructured":"D. Zhang and W.J. Li, Large-scale supervised multimodal hashing with semantic correlation maximization, in: Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014.","DOI":"10.1609\/aaai.v28i1.8995"},{"key":"10.3233\/IDA-205162_ref34","doi-asserted-by":"crossref","unstructured":"Z. Lin, G. Ding, M. Hu et al., Semantics-preserving hashing for cross-view retrieval, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3864\u20133872.","DOI":"10.1109\/CVPR.2015.7299011"},{"key":"10.3233\/IDA-205162_ref35","doi-asserted-by":"crossref","unstructured":"D. Mandal, K.N. Chaudhury and S. Biswas, Generalized semantic preserving hashing for n-label cross-modal retrieval, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4076\u20134084.","DOI":"10.1109\/CVPR.2017.282"},{"key":"10.3233\/IDA-205162_ref36","doi-asserted-by":"crossref","unstructured":"E. Sharon, A. Brandt and R. Basri, Fast multiscale image segmentation, in: Proceedings IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2000 (Cat. No. PR00662), pp. 70\u201377.","DOI":"10.1109\/CVPR.2000.855801"},{"key":"10.3233\/IDA-205162_ref37","doi-asserted-by":"crossref","unstructured":"F. Yang, B. Matei and L.S. Davis, Re-ranking by multi-feature fusion with diffusion for image retrieval, in: 2015 IEEE Winter Conference on Applications of Computer Vision, IEEE, 2015, pp. 572\u2013579.","DOI":"10.1109\/WACV.2015.82"},{"issue":"3","key":"10.3233\/IDA-205162_ref38","doi-asserted-by":"crossref","first-page":"521","DOI":"10.1109\/TPAMI.2013.142","article-title":"On the role of correlation and abstraction in cross-modal multimedia retrieval","volume":"36","author":"Pereira","year":"2013","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"10.3233\/IDA-205162_ref39","doi-asserted-by":"crossref","unstructured":"M.J. Huiskes, B. Thomee and M.S. Lew, New trends and ideas in visual concept detection: the MIR flickr retrieval evaluation initiative, in: Proceedings of the International Conference on Multimedia Information Retrieval, ACM, 2010, pp. 527\u2013536.","DOI":"10.1145\/1743384.1743475"},{"key":"10.3233\/IDA-205162_ref40","doi-asserted-by":"crossref","unstructured":"T.S. Chua, J. Tang, R. Hong et al., NUS-WIDE: a real-world web image database from National University of Singapore, in: Proceedings of the ACM International Conference on Image and Video Retrieval, ACM, Vol. 48, 2009.","DOI":"10.1145\/1646396.1646452"}],"container-title":["Intelligent Data Analysis"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/IDA-205162","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T09:19:04Z","timestamp":1777454344000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/IDA-205162"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,20]]},"references-count":37,"journal-issue":{"issue":"3"},"URL":"https:\/\/doi.org\/10.3233\/ida-205162","relation":{},"ISSN":["1088-467X","1571-4128"],"issn-type":[{"value":"1088-467X","type":"print"},{"value":"1571-4128","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,20]]}}}