{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T05:31:12Z","timestamp":1775021472055,"version":"3.50.1"},"reference-count":57,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,2,7]],"date-time":"2023-02-07T00:00:00Z","timestamp":1675728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004731","name":"Natural Science Foundation of Zhejiang Province","doi-asserted-by":"publisher","award":["LGG22F030001"],"award-info":[{"award-number":["LGG22F030001"]}],"id":[{"id":"10.13039\/501100004731","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>UAV localization in denial environments is a hot research topic in the field of cross-view geo-localization. The previous methods tried to find the corresponding position directly in the satellite image through the UAV image, but they lacked the consideration of spatial information and multi-scale information. Based on the method of finding points with an image, we propose a novel architecture\u2014a Weight-Adaptive Multi-Feature fusion network for UAV localization (WAMF-FPI). We treat this positioning as a low-level task and achieve more accurate localization by restoring the feature map to the resolution of the original satellite image. Then, in order to enhance the ability of the model to solve multi-scale problems, we propose a Weight-Adaptive Multi-Feature fusion module (WAMF), which introduces a weighting mechanism to fuse different features. Finally, since all positive samples are treated in the same way in the existing methods, which is very disadvantageous for accurate localization tasks, we introduce Hanning loss to allow the model to pay more attention to the central area of the target. 
Our model achieves competitive results on the UL14 dataset. When using RDS as the evaluation metric, the performance of the model improves from 57.22 to 65.33 compared to Finding Point with Image (FPI). In addition, we calculate the actual distance errors (meters) to evaluate the model performance, and the localization accuracy at the 20 m level improves from 57.67% to 69.73%, showing the powerful performance of the model. Although the model shows better performance, much remains to be done before it can be applied.<\/jats:p>","DOI":"10.3390\/rs15040910","type":"journal-article","created":{"date-parts":[[2023,2,8]],"date-time":"2023-02-08T05:37:31Z","timestamp":1675834651000},"page":"910","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["WAMF-FPI: A Weight-Adaptive Multi-Feature Fusion Network for UAV Localization"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2353-6241","authenticated-orcid":false,"given":"Guirong","family":"Wang","sequence":"first","affiliation":[{"name":"School of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2148-2865","authenticated-orcid":false,"given":"Jiahao","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China"}]},{"given":"Ming","family":"Dai","sequence":"additional","affiliation":[{"name":"School of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1358-7846","authenticated-orcid":false,"given":"Enhui","family":"Zheng","sequence":"additional","affiliation":[{"name":"School of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1016\/j.ajem.2017.09.025","article-title":"The potential use of unmanned aircraft systems (drones) in mountain search and rescue operations","volume":"36","author":"Karaca","year":"2018","journal-title":"Am. J. Emerg. Med."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Li, Y.C., Ye, D.M., Ding, X.B., Teng, C.S., Wang, G.H., and Li, T.H. (2011, January 9\u201311). UAV Aerial Photography Technology in Island Topographic Mapping. Proceedings of the 2011 International Symposium on Image and Data Fusion, Tengchong, China.","DOI":"10.1109\/ISIDF.2011.6024228"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"107148","DOI":"10.1016\/j.comnet.2020.107148","article-title":"A compilation of UAV applications for precision agriculture","volume":"172","author":"Sarigiannidis","year":"2020","journal-title":"Comput. Netw."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Ding, L., Zhou, J., Meng, L., and Long, Z. (2020). A Practical Cross-View Image Matching Method between UAV and Satellite for UAV-Based Geo-Localization. Remote Sens., 13.","DOI":"10.3390\/rs13010047"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Dannenberg, M., Wang, X., Yan, D., and Smith, W. (2020). Phenological characteristics of global ecosystems based on optical, fluorescence, and microwave remote sensing. Remote Sens., 12.","DOI":"10.3390\/rs12040671"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"112437","DOI":"10.1016\/j.rse.2021.112437","article-title":"A practical reanalysis data and thermal infrared remote sensing data merging (RTM) method for reconstruction of a 1-km all-weather land surface temperature","volume":"260","author":"Zhang","year":"2021","journal-title":"Remote Sens. 
Environ."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"58","DOI":"10.7474\/TUS.2017.27.1.058","article-title":"A Study on the roughness measurement for joints in rock mass using LIDAR","volume":"27","author":"Lee","year":"2017","journal-title":"Tunn. Undergr. Space"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"150","DOI":"10.1016\/j.ijrmms.2012.06.003","article-title":"Automated mapping of rock discontinuities in 3D lidar and photogrammetry models","volume":"54","author":"Lato","year":"2012","journal-title":"Int. J. Rock Mech. Min. Sci."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/j.enggeo.2018.05.007","article-title":"Automated measurements of discontinuity geometric properties from a 3D-point cloud based on a modified region growing algorithm","volume":"242","author":"Ge","year":"2018","journal-title":"Eng. Geol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"112299","DOI":"10.1016\/j.rse.2021.112299","article-title":"Quality control and crop characterization framework for multi-temporal UAV LiDAR data over mechanized agricultural fields","volume":"256","author":"Lin","year":"2021","journal-title":"Remote Sens. Environ."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Opromolla, R., Fasano, G., Rufino, G., Grassi, M., and Savvaris, A. (2016, January 7\u201310). LIDAR-inertial integration for UAV localization and mapping in complex environments. Proceedings of the 2016 International Conference on Unmanned Aircraft Systems (ICUAS), Arlington, VA, USA.","DOI":"10.1109\/ICUAS.2016.7502580"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Pritzl, V., Vrba, M., \u0160t\u0115p\u00e1n, P., and Saska, M. (2022, January 21\u201324). Cooperative navigation and guidance of a micro-scale aerial vehicle by an accompanying UAV using 3D LiDAR relative localization. 
Proceedings of the 2022 International Conference on Unmanned Aircraft Systems (ICUAS), Dubrovnik, Croatia.","DOI":"10.1109\/ICUAS54217.2022.9836116"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Meng, F., and Yang, D. (2020, January 6\u20138). Research of UAV Location Control System Based on SINS, GPS and Optical Flow. Proceedings of the 2020 IEEE International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA), Chongqing, China.","DOI":"10.1109\/ICIBA50161.2020.9276977"},{"key":"ref_14","unstructured":"Dai, M., Chen, J., Lu, Y., Hao, W., and Zheng, E. (2022). Finding Point with Image: An End-to-End Benchmark for Vision-based UAV Localization. arXiv."},{"key":"ref_15","first-page":"9355","article-title":"Twins: Revisiting the design of spatial attention in vision transformers","volume":"34","author":"Chu","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_16","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and J\u00e9gou, H. (2021, January 18\u201324). Training data-efficient image transformers & distillation through attention. Proceedings of the International Conference on Machine Learning PMLR, Virtual."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Arandjelovic, R., Gronat, P., Torii, A., Pajdla, T., and Sivic, J. (July, January 26). NetVLAD: CNN Architecture for Weakly Supervised Place Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.572"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Kim, H.J., Dunn, E., and Frahm, J.M. (2017, January 21\u201326). Learned Contextual Feature Reweighting for Image Geo-Localization. 
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.346"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Kim, H.J., Dunn, E., and Frahm, J.M. (2015, January 7\u201313). Predicting Good Features for Image Geo-Localization Using Per-Bundle VLAD. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.139"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1007\/s11263-015-0830-0","article-title":"Image Based Geo-localization in the Alps","volume":"116","author":"Saurer","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Hays, J., and Efros, A.A. (2008, January 23\u201328). IM2GPS: Estimating geographic information from a single image. Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA.","DOI":"10.1109\/CVPR.2008.4587784"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Vo, N., Jacobs, N., and Hays, J. (2017, January 22\u201329). Revisiting IM2GPS in the Deep Learning Era. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.286"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Sattler, T., Havlena, M., Schindler, K., and Pollefeys, M. (July, January 26). Large-Scale Location Recognition and the Geometric Burstiness Problem. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.175"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1546","DOI":"10.1109\/TPAMI.2014.2299799","article-title":"Image Geo-Localization Based on Multiple Nearest Neighbor Feature Matching Using Generalized Graphs","volume":"36","author":"Zamir","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Tian, Y., Chen, C., and Shah, M. (2017, January 21\u201326). Cross-View Image Matching for Geo-Localization in Urban Environments. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.216"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Belongie, S., and Hays, J. (2013, January 23\u201328). Cross-View Image Geolocalization. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA.","DOI":"10.1109\/CVPR.2013.120"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Hu, S., Feng, M., Nguyen, R.M.H., and Lee, G.H. (2018, January 18\u201323). CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00758"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhai, M., Bessinger, Z., Workman, S., and Jacobs, N. (2017, January 21\u201326). Predicting Ground-Level Scene Layout from Aerial Imagery. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.440"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Workman, S., Souvenir, R., and Jacobs, N. (2015, January 7\u201313). Wide-Area Image Geolocalization with Aerial Reference Imagery. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Washington, DC, USA.","DOI":"10.1109\/ICCV.2015.451"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Cui, Y., Belongie, S., and Hays, J. (2015, January 7\u201312). Learning deep representations for ground-to-aerial geolocalization. 
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299135"},{"key":"ref_31","first-page":"10090","article-title":"Spatial-aware feature aggregation for image based cross-view geo-localization","volume":"32","author":"Shi","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Toker, A., Zhou, Q., Maximov, M., and Leal-Taixe, L. (2021, January 20\u201325). Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00642"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zeng, Z., Wang, Z., Yang, F., and Satoh, S. (2022). Geo-Localization via Ground-to-Satellite Cross-View Image Retrieval. IEEE Trans. Multimed.","DOI":"10.1109\/TMM.2022.3144066"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Wei, Y., and Yang, Y. (2020, January 12\u201316). University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization. Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA.","DOI":"10.1145\/3394171.3413896"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"867","DOI":"10.1109\/TCSVT.2021.3061265","article-title":"Each Part Matters: Local Patterns Facilitate Cross-View Geo-Localization","volume":"32","author":"Wang","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_36","unstructured":"Dai, M., Huang, J., Zhuang, J., Lan, W., Cai, Y., and Zheng, E. (2022). Vision-Based UAV Localization System in Denial Environments. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zhu, S., Yang, T., and Chen, C. (2021, January 20\u201325). VIGOR: Cross-View Image Geo-localization beyond One-to-one Retrieval. 
Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00364"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"2278","DOI":"10.1109\/5.726791","article-title":"Gradient-based learning applied to document recognition","volume":"86","author":"Lecun","year":"1998","journal-title":"Proc. IEEE"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_41","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep Residual Learning for Image Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_43","unstructured":"Yu, F., and Koltun, V. (2015). Multi-scale context aggregation by dilated convolutions. arXiv."},{"key":"ref_44","first-page":"993","article-title":"Deformable convolutional networks based Mask R-CNN","volume":"31","author":"Kim","year":"2020","journal-title":"J. Korean Data Inf. Sci. Soc."},{"key":"ref_45","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020). 
Computer Vision\u2014ECCV 2020, Springer International Publishing."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., and Torr, P.H. (2021, January 20\u201325). Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_47","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable detr: Deformable transformers for end-to-end object detection. arXiv."},{"key":"ref_48","first-page":"14745","article-title":"Transgan: Two pure transformers can make one strong gan, and that can scale up","volume":"34","author":"Jiang","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Yang, F., Yang, H., Fu, J., Lu, H., and Guo, B. (2020, January 13\u201319). Learning Texture Transformer Network for Image Super-Resolution. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00583"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"He, S., Luo, H., Wang, P., Wang, F., Li, H., and Jiang, W. (2021, January 10\u201317). TransReID: Transformer-based Object Re-Identification. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.01474"},{"key":"ref_51","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. 
arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Fan, D.P., Song, K., Liang, D., Lu, T., Luo, P., and Shao, L. (2021, January 10\u201317). Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"ref_53","first-page":"15908","article-title":"Transformer in transformer","volume":"34","author":"Han","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_54","first-page":"15475","article-title":"ResT: An efficient transformer for visual recognition","volume":"34","author":"Zhang","year":"2021","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Wu, H., Xiao, B., Codella, N., Liu, M., Dai, X., Yuan, L., and Zhang, L. (2021, January 10\u201317). CvT: Introducing Convolutions to Vision Transformers. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00009"},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Ren, S., Zhou, D., He, S., Feng, J., and Wang, X. (2022, January 19\u201320). Shunted self-attention via multi-scale token aggregation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01058"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/4\/910\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:26:30Z","timestamp":1760120790000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/4\/910"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,7]]},"references-count":57,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["rs15040910"],"URL":"https:\/\/doi.org\/10.3390\/rs15040910","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,7]]}}}