{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T14:40:43Z","timestamp":1760366443927,"version":"build-2065373602"},"reference-count":41,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2021,4,8]],"date-time":"2021-04-08T00:00:00Z","timestamp":1617840000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the Nature Science Foundation of China","award":["61841602"],"award-info":[{"award-number":["61841602"]}]},{"DOI":"10.13039\/100007847","name":"Jilin Provincial Natural Science Foundation","doi-asserted-by":"publisher","award":["20200201283JC"],"award-info":[{"award-number":["20200201283JC"]}],"id":[{"id":"10.13039\/100007847","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Foundation of Jilin Educational Committee","award":["JJKH20200994KJ"],"award-info":[{"award-number":["JJKH20200994KJ"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"<jats:p>For full description of images\u2019 semantic information, image retrieval tasks are increasingly using deep convolution features trained by neural networks. However, to form a compact feature representation, the obtained convolutional features must be further aggregated in image retrieval. The quality of aggregation affects retrieval performance. In order to obtain better image descriptors for image retrieval, we propose two modules in our method. The first module is named generalized regional maximum activation of convolutions (GR-MAC), which pays more attention to global information at multiple scales. The second module is called saliency joint weighting, which uses nonparametric saliency weighting and channel weighting to focus feature maps more on the salient region without discarding overall information. Finally, we fuse the two modules to obtain more representative image feature descriptors that not only consider the global information of the feature map but also highlight the salient region. We conducted experiments on multiple widely used retrieval data sets such as roxford5k to verify the effectiveness of our method. The experimental results prove that our method is more accurate than the state-of-the-art methods.<\/jats:p>","DOI":"10.3390\/ijgi10040249","type":"journal-article","created":{"date-parts":[[2021,4,8]],"date-time":"2021-04-08T11:58:45Z","timestamp":1617883125000},"page":"249","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Toward Improving Image Retrieval via Global Saliency Weighted Feature"],"prefix":"10.3390","volume":"10","author":[{"given":"Hongwei","family":"Zhao","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2108-5248","authenticated-orcid":false,"given":"Jiaxin","family":"Wu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Danyang","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0196-7913","authenticated-orcid":false,"given":"Pingping","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Jilin University, Changchun 130012, China"},{"name":"Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China"},{"name":"School of Mechanical Science and Engineering, Jilin University, Changchun 130025, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,4,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1016\/j.neucom.2017.03.072","article-title":"Content-Based Image Retrieval with Compact Deep Convolutional Features","volume":"249","author":"Amira","year":"2017","journal-title":"Neurocomputing"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1349","DOI":"10.1109\/34.895972","article-title":"Content-based image retrieval at the end of the early","volume":"22","author":"Smeulders","year":"2001","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_3","unstructured":"Yue-Hei Ng, J., Yang, F., and Davis, L.S. (2015, January 7\u201312). Exploiting local features from deep networks for image retrieval. Proceedings of 2015 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, New York, NY, USA."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1224","DOI":"10.1109\/TPAMI.2017.2709749","article-title":"SIFT Meets CNN: A Decade Survey of Instance Retrieval","volume":"40","author":"Zheng","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Chum, O., Philbin, J., Sivic, J., Isard, M., and Zisserman, A. (2007, January 14\u201321). Total recall: Automatic query expansion with a generative feature model for object retrieval. Proceedings of the 2007 IEEE 11th International Conference on Computer Vision, Rio de Janeiro, Brazil.","DOI":"10.1109\/ICCV.2007.4408891"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Chum, O., Mikulik, A., Perdoch, M., and Matas, J. (2011, January 20\u201325). Total recall II: Query expansion revisited. Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, CO, USA.","DOI":"10.1109\/CVPR.2011.5995601"},{"key":"ref_7","first-page":"1655","article-title":"Fine-tuning CNN Image Retrieval with No Human Annotation","volume":"41","author":"Filip","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"J\u00e9gou, H., and Chum, O. (2012, January 7\u201313). Negative evidences and co-occurrences in image retrieval: The benefit of PCA and whitening. Proceedings of the 2012 European Conference on Computer Vision, Florence, Italy.","DOI":"10.1007\/978-3-642-33709-3_55"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1023\/A:1011139631724","article-title":"Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope","volume":"42","author":"Oliva","year":"2001","journal-title":"Int. J. Comput. Vision"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Oliva, A., and Torralba, A. (2002, January 22\u201324). Scene-centered description from spatial envelope properties. Proceedings of the 2002 International Workshop on Biologically Motivated Computer Vision, T\u00fcbingen, Germany.","DOI":"10.1007\/3-540-36181-2_26"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1233","DOI":"10.1016\/0031-3203(95)00160-3","article-title":"Image retrieval using color and shape","volume":"29","author":"Jain","year":"1996","journal-title":"Pattern Recognit."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vision"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Tolias, G., Furon, T., and J\u00e9gou, H. (2014, January 6\u201312). Orientation covariant aggregation of local descriptors with embeddings. Proceedings of the 2014 European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10599-4_25"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Bay, H., Tuytelaars, T., and Van Gool, L. (2006, January 7\u201313). Surf: Speeded up robust features. Proceedings of the 2006 European Conference on Computer Vision, Graz, Austria.","DOI":"10.1007\/11744023_32"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Sivic, J., and Zisserman, A. (2003, January 13\u201316). Video Google: A text retrieval approach to object matching in videos. Proceedings of the 2003 Computer Vision, IEEE International Conference on, Nice, France.","DOI":"10.1109\/ICCV.2003.1238663"},{"key":"ref_16","first-page":"1704","article-title":"Aggregating local image descriptors into compact codes","volume":"34","author":"Perronnin","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Perronnin, F., and Dance, C. (2007, January 17\u201322). Fisher kernels on visual vocabularies for image categorization. Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA.","DOI":"10.1109\/CVPR.2007.383266"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Perronnin, F., S\u00e1nchez, J., and Mensink, T. (2010, January 5\u201311). Improving the fisher kernel for large-scale image classification. Proceedings of the 2010 European Conference on Computer Vision, Heraklion, Crete, Greece.","DOI":"10.1007\/978-3-642-15561-1_11"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"J\u00e9gou, H., and Zisserman, A. (2014, January 23\u201328). Triangulation embedding and democratic aggregation for image search. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2014.417"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Cimpoi, M., Maji, S., and Vedaldi, A. (2015, January 7\u201312). Deep filter banks for texture recognition and segmentation. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299007"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"115","DOI":"10.1007\/s11263-017-1006-x","article-title":"DeepProposals: Hunting Objects and Actions by Cascading Deep Convolutional Layers","volume":"124","author":"Ghodrati","year":"2017","journal-title":"Int. J. Comput. Vision"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Babenko, A., Slesarev, A., Chigorin, A., and Lempitsky, V. (2014, January 6\u201312). Neural codes for image retrieval. Proceedings of 2014 European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10590-1_38"},{"key":"ref_23","unstructured":"Babenko, A., and Lempitsky, V. (2015, January 7\u201313). Aggregating local deep features for image retrieval. Proceedings of the 2015 IEEE International Conference on Computer Vision, NW Washington, DC, USA."},{"key":"ref_24","first-page":"251","article-title":"Visual instance retrieval with deep convolutional networks","volume":"4","author":"Razavian","year":"2016","journal-title":"Trans. Media Technol. Appl."},{"key":"ref_25","unstructured":"Tolias, G., Sicre, R., and J\u00e9gou, H. (2015). Particular object retrieval with integral max-pooling of CNN activations. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Kalantidis, Y., Mellina, C., and Osindero, S. (2016, January 11\u201314). Cross-dimensional weighting for aggregated deep convolutional features. Proceedings of the 2016 European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46604-0_48"},{"key":"ref_27","unstructured":"Hao, J., Wang, W., Dong, J., and Tan, T. (2018, January 23\u201327). MFC: A multi-scale fully convolutional approach for visual instance retrieval. Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), San Diego, CA, USA."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Sharif Razavian, A., Azizpour, H., Sullivan, J., and Carlsson, S. (2014, January 24\u201327). CNN features off-the-shelf: An astounding baseline for recognition. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA.","DOI":"10.1109\/CVPRW.2014.131"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"601","DOI":"10.1109\/TIP.2018.2867104","article-title":"Unsupervised Semantic-Based Aggregation of Deep Convolutional Features","volume":"28","author":"Xu","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Gordo, A., and Larlus, D. (2017, January 21\u201326). Beyond instance-level image retrieval: Leveraging captions to learn a global visual representation for semantic retrieval. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2017.560"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Mikolajczyk, K., and Matas, J. (2007, January 14\u201321). Improving descriptors for fast tree matching by optimal linear projection. Proceedings of the 2007 IEEE 11th International Conference on Computer Vision, Rio de Janeiro, Brazil.","DOI":"10.1109\/ICCV.2007.4408871"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. (2007, January 17\u201322). Object retrieval with large vocabularies and fast spatial matching. Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA.","DOI":"10.1109\/CVPR.2007.383172"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Zhai, Y., and Shah, M. (2006, January 23\u201327). Visual attention detection in video sequences using spatiotemporal cues. Proceedings of the 14th ACM International Conference on Multimedia, Santa Barbara, CA, USA.","DOI":"10.1145\/1180639.1180824"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. (2008, January 23\u201328). Lost in quantization: Improving particular object retrieval in large scale image databases. Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA.","DOI":"10.1109\/CVPR.2008.4587635"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Jegou, H., Douze, M., and Schmid, C. (2008, January 12\u201318). Hamming embedding and weak geometric consistency for large scale image search. Proceedings of the 2008 European Conference on Computer Vision, Marseille, France.","DOI":"10.1007\/978-3-540-88682-2_24"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Radenovi\u0107, F., Iscen, A., Tolias, G., Avrithis, Y., and Chum, O. (2018, January 18\u201322). Revisiting oxford and paris: Large-scale image retrieval benchmarking. Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00598"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Liu, P., Gou, G., Guo, H., Zhang, D., and Zhou, Q. (2019). Fusing Feature Distribution Entropy with R-MAC Features in Image Retrieval. Entropy, 21.","DOI":"10.3390\/e21111037"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Mohedano, E., McGuinness, K., O\u2019Connor, N.E., Salvador, A., Marques, F., and Gir\u00f3-i-Nieto, X. (2016, January 6\u20139). Bags of local convolutional features for scalable instance search. Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, Vancouver, BC, Canada.","DOI":"10.1145\/2911996.2912061"},{"key":"ref_41","first-page":"3221","article-title":"Accelerating t-SNE using tree-based algorithms","volume":"15","author":"Maaten","year":"2014","journal-title":"J. Mach. Learn. Res."}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/10\/4\/249\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T13:58:55Z","timestamp":1760363935000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/10\/4\/249"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,8]]},"references-count":41,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2021,4]]}},"alternative-id":["ijgi10040249"],"URL":"https:\/\/doi.org\/10.3390\/ijgi10040249","relation":{},"ISSN":["2220-9964"],"issn-type":[{"type":"electronic","value":"2220-9964"}],"subject":[],"published":{"date-parts":[[2021,4,8]]}}}