{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T05:25:01Z","timestamp":1771565101101,"version":"3.50.1"},"reference-count":52,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2020,12,4]],"date-time":"2020-12-04T00:00:00Z","timestamp":1607040000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Key S&T Special Projects of China","award":["2017YFB0503704"],"award-info":[{"award-number":["2017YFB0503704"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Street view image retrieval aims to estimate the image locations by querying the nearest neighbor images with the same scene from a large-scale reference dataset. Query images usually have no location information and are represented by features to search for similar results. The deep local features (DELF) method shows great performance in the landmark retrieval task, but the method extracts many features so that the feature file is too large to load into memory when training the features index. The memory size is limited, and removing the part of features simply causes a great retrieval precision loss. Therefore, this paper proposes a grid feature-point selection method (GFS) to reduce the number of feature points in each image and minimize the precision loss. Convolutional Neural Networks (CNNs) are constructed to extract dense features, and an attention module is embedded into the network to score features. GFS divides the image into a grid and selects features with local region high scores. Product quantization and an inverted index are used to index the image features to improve retrieval efficiency. 
The retrieval performance of the method is tested on a large-scale Hong Kong street view dataset, and the results show that the GFS reduces feature points by 32.27\u201377.09% compared with the raw feature. In addition, GFS has a 5.27\u201323.59% higher precision than other methods.<\/jats:p>","DOI":"10.3390\/rs12233978","type":"journal-article","created":{"date-parts":[[2020,12,4]],"date-time":"2020-12-04T11:59:00Z","timestamp":1607083140000},"page":"3978","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["A Grid Feature-Point Selection Method for Large-Scale Street View Image Retrieval Based on Deep Local Features"],"prefix":"10.3390","volume":"12","author":[{"given":"Tianyou","family":"Chu","sequence":"first","affiliation":[{"name":"School of Resource and Environment Science, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yumin","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Resource and Environment Science, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Liheng","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Resource and Environment Science, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhiqiang","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Resource and Environment Science, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Huangyuan","family":"Tan","sequence":"additional","affiliation":[{"name":"School of Resource and Environment Science, Wuhan University, Wuhan 430079, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2020,12,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1080\/10095020.2020.1805367","article-title":"Local color and morphological image feature based vegetation identification and its application to human environment street view vegetation mapping, or how green is our county?","volume":"23","author":"Lauko","year":"2020","journal-title":"Geo Spat. Inf. Sci."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.ecolind.2020.106342","article-title":"Fusing street level photographs and satellite remote sensing to map leaf area index","volume":"115","author":"Richards","year":"2020","journal-title":"Ecol. Indic."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Chang, S.Z., Wang, Z.M., Mao, D.H., Guan, K.H., Jia, M.M., and Chen, C.Q. (2020). Mapping the Essential Urban Land Use in Changchun by Applying Random Forest and Multi-Source Geospatial Data. Remote Sens., 12.","DOI":"10.3390\/rs12152488"},{"key":"ref_4","first-page":"9","article-title":"An efficient urban localization method based on speed humps","volume":"24","author":"Chen","year":"2019","journal-title":"Sust. Comput."},{"key":"ref_5","unstructured":"Ozaki, K., and Yokoo, S. (2019). Large-scale Landmark Retrieval\/Recognition under a Noisy and Diverse Dataset. arXiv."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Arandjelovic, R., Gronat, P., Torii, A., Pajdla, T., and Sivic, J. (2016, January 27\u201330). NetVLAD: CNN architecture for weakly supervised place recognition. 
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.572"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Chen, D.M., Baatz, G., K\u00f6ser, K., Tsai, S.S., Vedantham, R., Pylv\u00e4n\u00e4inen, T., Roimela, K., Chen, X., Bach, J., and Pollefeys, M. (2011, January 20\u201325). City-scale landmark identification on mobile devices. Proceedings of the CVPR 2011, Providence, RI, USA.","DOI":"10.1109\/CVPR.2011.5995610"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zhu, Y.Y., Wang, J., Xie, L.X., and Zheng, L. (2018, January 22\u201326). Attention-based Pyramid Aggregation Network for Visual Place Recognition. Proceedings of the 26th ACM international conference on Multimedia, Seoul, Korea.","DOI":"10.1145\/3240508.3240525"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Weng, L., Gouet-Brunet, V., and Soheilian, B. (2020). Semantic signatures for large-scale visual localization. Multimed. Tools Appl.","DOI":"10.1007\/s11042-020-08992-6"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Bay, H., Tuytelaars, T., and Van Gool, L. (2006). Surf: Speeded up robust features. European Conference on Computer Vision, Springer.","DOI":"10.1007\/11744023_32"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Perd\u2019och, M., Chum, O., and Matas, J. (2009, January 20\u201325). Efficient Representation of Local Geometry for Large Scale Object Retrieval. 
Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206529"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"101","DOI":"10.4316\/AECE.2020.02012","article-title":"Generation of Visual Patterns from BoVW for Image Retrieval using modified Similarity Score Fusion","volume":"20","author":"Arulmozhi","year":"2020","journal-title":"Adv. Electr. Comput. Eng."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"6561","DOI":"10.1007\/s11227-019-02890-x","article-title":"Feature mining simulation of video image information in multimedia learning environment based on BOW algorithm","volume":"76","author":"Zhang","year":"2020","journal-title":"J. Supercomput."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1016\/j.dsp.2020.102765","article-title":"Content-based remote sensing image retrieval using multi-scale local ternary pattern","volume":"104","author":"Sukhia","year":"2020","journal-title":"Digit. Signal Process."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1783","DOI":"10.1007\/s00371-018-1573-z","article-title":"Weighted two-step aggregated VLAD for image retrieval","volume":"35","author":"Liu","year":"2019","journal-title":"Vis. Comput."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"257","DOI":"10.1109\/TPAMI.2017.2667665","article-title":"24\/7 Place Recognition by View Synthesis","volume":"40","author":"Torii","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"748","DOI":"10.1007\/978-3-642-15549-9_54","article-title":"Avoiding Confusing Features in Place Recognition","volume":"Volume 6311","author":"Daniilidis","year":"2010","journal-title":"Computer Vision-Eccv 2010, Pt I"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2346","DOI":"10.1109\/TPAMI.2015.2409868","article-title":"Visual Place Recognition with Repetitive Structures","volume":"37","author":"Torii","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1007\/978-3-642-15561-1_19","article-title":"Accurate Image Localization Based on Google Maps Street View","volume":"Volume 6314","author":"Daniilidis","year":"2010","journal-title":"Computer Vision-Eccv 2010, Pt Iv"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Noh, H., Araujo, A., Sim, J., Weyand, T., and Han, B. (2017, January 22\u201329). Large-Scale Image Retrieval with Attentive Deep Local Features. Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.374"},{"key":"ref_22","first-page":"1655","article-title":"Fine-tuning CNN image retrieval with no human annotation","volume":"41","author":"Tolias","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_23","unstructured":"Yang, T.-Y., Nguyen, D.-K., Heijnen, H., and Balntas, V. (2020). Ur2kid: Unifying retrieval, keypoint detection, and keypoint description without local correspondence supervision. arXiv."},{"key":"ref_24","unstructured":"Tian, Y., Balntas, V., Ng, T., Barroso-Laguna, A., Demiris, Y., and Mikolajczyk, K. (2020). D2D: Keypoint Extraction with Describe to Detect Approach. 
arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1224","DOI":"10.1109\/TPAMI.2017.2709749","article-title":"SIFT meets CNN: A decade survey of instance retrieval","volume":"40","author":"Zheng","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_26","first-page":"251","article-title":"Visual instance retrieval with deep convolutional networks","volume":"4","author":"Razavian","year":"2016","journal-title":"ITE Trans. Media Technol. Appl."},{"key":"ref_27","unstructured":"Babenko, A., and Lempitsky, V. (2015, January 11\u201318). Aggregating Deep Convolutional Features for Image Retrieval. Proceedings of the 2015 IEEE International Conference on Computer Vision, Las Condes, Chile."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Kalantidis, Y., Mellina, C., and Osindero, S. (2016). Cross-dimensional weighting for aggregated deep convolutional features. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46604-0_48"},{"key":"ref_29","unstructured":"Tolias, G., Sicre, R., and J\u00e9gou, H. (2015). Particular Object Retrieval with Integral Max-Pooling of CNN Activations. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"188","DOI":"10.1016\/j.neucom.2017.12.069","article-title":"E(2)BoWs: An end-to-end Bag-of-Words model via deep convolutional neural network for image retrieval","volume":"395","author":"Liu","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1016\/j.neucom.2018.11.089","article-title":"Bidirectional image-sentence retrieval by local and global deep matching","volume":"345","author":"Ma","year":"2019","journal-title":"Neurocomputing"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Imbriaco, R., Sebastian, C., Bondarev, E., and de With, P.H.N. (2019). Aggregated Deep Local Features for Remote Sensing Image Retrieval. 
Remote Sens., 11.","DOI":"10.3390\/rs11050493"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Xiong, W., Lv, Y.F., Cui, Y.Q., Zhang, X.H., and Gu, X.Q. (2019). A Discriminative Feature Learning Approach for Remote Sensing Image Retrieval. Remote Sens., 11.","DOI":"10.3390\/rs11030281"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Morere, O., Lin, J., Veillard, A., Duan, L.-Y., Chandrasekhar, V., and Poggio, T. (2017, January 6). Nested invariance pooling and RBM hashing for image instance retrieval. Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, Bucharest, Romania.","DOI":"10.1145\/3078971.3078987"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Razavian, A.S., Azizpour, H., Sullivan, J., and Carlsson, S. (2014, January 24\u201327). CNN Features off-the-shelf: An Astounding Baseline for Recognition. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus, OH, USA.","DOI":"10.1109\/CVPRW.2014.131"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"16421","DOI":"10.1007\/s11042-019-7438-2","article-title":"An adaptive image feature matching method using mixed Vocabulary-KD tree","volume":"79","author":"Zhang","year":"2020","journal-title":"Multimed. Tools Appl."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Shan, X., Liu, P., Gou, G., Zhou, Q., and Wang, Z. (2020). Deep Hash Remote Sensing Image Retrieval with Hard Probability Sampling. Remote Sens., 12.","DOI":"10.3390\/rs12172789"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1016\/j.neucom.2020.04.026","article-title":"Mean-removed product quantization for large-scale image retrieval","volume":"406","author":"Yang","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Sivic, J., and Zisserman, A. (2003, January 13\u201316). 
Video Google: A Text Retrieval Approach to Object Matching in Videos. Proceedings of the Ninth IEEE International Conference on Computer Vision, Nice, France.","DOI":"10.1109\/ICCV.2003.1238663"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Arandjelovic, R., and Zisserman, A. (2012, January 16\u201321). Three things everyone should know to improve object retrieval. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248018"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"1675","DOI":"10.1109\/TIP.2018.2881829","article-title":"On-Device Scalable Image-Based Localization via Prioritized Cascade Search and Fast One-Many RANSAC","volume":"28","author":"Tran","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"57796","DOI":"10.1109\/ACCESS.2020.2982560","article-title":"Large Scale Category-Structured Image Retrieval for Object Identification Through Supervised Learning of CNN and SURF-Based Matching","volume":"8","author":"Li","year":"2020","journal-title":"IEEE Access"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"21524","DOI":"10.1109\/ACCESS.2020.2969287","article-title":"A Method of Hierarchical Image Retrieval for Real-Time Photogrammetry Based on Multiple Features","volume":"8","author":"Zhan","year":"2020","journal-title":"IEEE Access"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"5288","DOI":"10.1109\/TIP.2018.2845136","article-title":"Dynamic Match Kernel with Deep Convolutional Features for Image Retrieval","volume":"27","author":"Yang","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Cao, B., Araujo, A., and Sim, J. (2020). Unifying Deep Local and Global Features for Image Search. 
arXiv.","DOI":"10.1007\/978-3-030-58565-5_43"},{"key":"ref_46","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_47","unstructured":"Johnson, J., Douze, M., and J\u00e9gou, H. (2017). Billion-scale similarity search with GPUs. arXiv."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Lin, C.Y., Chiu, Y.C., Ng, H.F., Shih, T.K., and Lin, K.H. (2020). Global-and-Local Context Network for Semantic Segmentation of Street View Images. Sensors, 20.","DOI":"10.3390\/s20102907"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1016\/j.neucom.2019.11.118","article-title":"A Brief Survey on Semantic Segmentation with Deep Learning","volume":"406","author":"Hao","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"5395","DOI":"10.1109\/TIM.2019.2958580","article-title":"Detecting Trees in Street Images via Deep Learning with Attention Module","volume":"69","author":"Xie","year":"2020","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_51","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Doulamis, A., Voulodimos, A., Protopapadakis, E., Doulamis, N., and Makantasis, K. (2020). Automatic 3D Modeling and Reconstruction of Cultural Heritage Sites from Twitter Images. 
Sustainability, 12.","DOI":"10.3390\/su12104223"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/12\/23\/3978\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:41:41Z","timestamp":1760179301000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/12\/23\/3978"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,12,4]]},"references-count":52,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2020,12]]}},"alternative-id":["rs12233978"],"URL":"https:\/\/doi.org\/10.3390\/rs12233978","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,12,4]]}}}