{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T06:05:49Z","timestamp":1772690749442,"version":"3.50.1"},"reference-count":51,"publisher":"MDPI AG","issue":"22","license":[{"start":{"date-parts":[[2021,11,16]],"date-time":"2021-11-16T00:00:00Z","timestamp":1637020800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"The Qian Xuesen Youth Innovation Foundation from China Aerospace Science and Technology Corporation","award":["2019JY39"],"award-info":[{"award-number":["2019JY39"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Local features extraction is a crucial technology for image matching navigation of an unmanned aerial vehicle (UAV), where it aims to accurately and robustly match a real-time image and a geo-referenced image to obtain the position update information of the UAV. However, it is a challenging task due to the inconsistent image capture conditions, which will lead to extreme appearance changes, especially the different imaging principle between an infrared image and RGB image. In addition, the sparsity and labeling complexity of existing public datasets hinder the development of learning-based methods in this research area. This paper proposes a novel learning local features extraction method, which uses local features extracted by deep neural network to find the correspondence features on the satellite RGB reference image and real-time infrared image. First, we propose a single convolution neural network that simultaneously extracts dense local features and their corresponding descriptors. This network combines the advantages of a high repeatability local feature detector and high reliability local feature descriptors to match the reference image and real-time image with extreme appearance changes. Second, to make full use of the sparse dataset, an iterative training scheme is proposed to automatically generate the high-quality corresponding features for algorithm training. During the scheme, the dense correspondences are automatically extracted, and the geometric constraints are added to continuously improve the quality of them. With these improvements, the proposed method achieves state-of-the-art performance for infrared aerial (UAV captured) image and satellite reference image, which shows 4\u20136% performance improvements in precision, recall, and F1-score, compared to the other methods. Moreover, the applied experiment results show its potential and effectiveness on localization for UAVs navigation and trajectory reconstruction application.<\/jats:p>","DOI":"10.3390\/rs13224618","type":"journal-article","created":{"date-parts":[[2021,11,17]],"date-time":"2021-11-17T02:42:28Z","timestamp":1637116948000},"page":"4618","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["LLFE: A Novel Learning Local Features Extraction for UAV Navigation Based on Infrared Aerial Image and Satellite Reference Image Matching"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0080-6306","authenticated-orcid":false,"given":"Xupei","family":"Zhang","sequence":"first","affiliation":[{"name":"Xi\u2019an Microelectronics Technology Institute, Xi\u2019an 710065, China"}]},{"given":"Zhanzhuang","family":"He","sequence":"additional","affiliation":[{"name":"Xi\u2019an Microelectronics Technology Institute, Xi\u2019an 710065, China"}]},{"given":"Zhong","family":"Ma","sequence":"additional","affiliation":[{"name":"Xi\u2019an Microelectronics Technology Institute, Xi\u2019an 710065, China"}]},{"given":"Zhongxi","family":"Wang","sequence":"additional","affiliation":[{"name":"Xi\u2019an Microelectronics Technology Institute, Xi\u2019an 710065, China"}]},{"given":"Li","family":"Wang","sequence":"additional","affiliation":[{"name":"Xi\u2019an Microelectronics Technology Institute, Xi\u2019an 710065, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,11,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Wang, C., Peng, T., Hu, L., and Liu, G. (2020). Improved UAV scene matching algorithm based on censure features and FREAK descriptor. International Conference on Computer Engineering and Networks, Springer.","DOI":"10.1007\/978-981-15-8462-6_18"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Kaniewski, P., and Grzywacz, W. (2017, January 12\u201314). Visual-based navigation system for unmanned aerial vehicles. Proceedings of the 2017 Signal Processing Symposium (SPSympo), Jachranka Village, Poland.","DOI":"10.1109\/SPS.2017.8053686"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Zhuo, X., Koch, T., Kurz, F., Fraundorfer, F., and Reinartz, P. (2017). Automatic UAV image geo-registration by matching UAV images to georeferenced image data. Remote Sens., 9.","DOI":"10.3390\/rs9040376"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Ebadi, K., and Wood, S. (2018, January 28\u201331). Scene matching-based localization of unmanned aerial vehicles in unstructured environments. Proceedings of the IEEE 2018 52nd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA.","DOI":"10.1109\/ACSSC.2018.8645277"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Liu, J.S., and Liu, H.C. (2020, January 3\u20135). Visual Navigation for UAVs Landing on Accessory Building Floor. Proceedings of the IEEE 2020 International Conference on Pervasive Artificial Intelligence (ICPAI), Taipei, Taiwan.","DOI":"10.1109\/ICPAI51961.2020.00037"},{"key":"ref_6","unstructured":"Andrew, A.M., and Hartley, R. (2001). Multiple View Geometry in Computer Vision, Cambridge University Press."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Conte, G., and Doherty, P. (2008, January 1\u20138). An integrated UAV navigation system based on aerial image matching. Proceedings of the 2008 IEEE Aerospace Conference, Big Sky, MT, USA.","DOI":"10.1109\/AERO.2008.4526556"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Balamurugan, G., Valarmathi, J., and Naidu, V. (2016, January 3\u20135). Survey on UAV navigation in GPS denied environments. Proceedings of the IEEE 2016 International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES), Odisha, India.","DOI":"10.1109\/SCOPES.2016.7955787"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"198","DOI":"10.1016\/j.isprsjprs.2016.05.016","article-title":"Illumination-invariant image matching for autonomous UAV localisation based on optical sensing","volume":"119","author":"Wan","year":"2016","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"36","DOI":"10.1117\/12.959130","article-title":"Digital scene matching area correlator (DSMAC)","volume":"Volume 238","author":"Carr","year":"1980","journal-title":"Image Processing For Missile Guidance"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Rublee, E., Rabaud, V., Konolige, K., and Bradski, G. (2011, January 6\u201313). ORB: An efficient alternative to SIFT or SURF. Proceedings of the IEEE 2011 International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Bay, H., Tuytelaars, T., and Van Gool, L. (2006). Surf: Speeded up robust features. European Conference on Computer Vision, Springer.","DOI":"10.1007\/11744023_32"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s11263-020-01297-z","article-title":"Is there anything new to say about SIFT matching?","volume":"128","author":"Bellavia","year":"2020","journal-title":"Int. J. Comput. Vis."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Held, D., Thrun, S., and Savarese, S. (2015). Deep learning for single-view instance recognition. arXiv.","DOI":"10.1109\/ICRA.2016.7487365"},{"key":"ref_16","unstructured":"Han, X., Leung, T., Jia, Y., Sukthankar, R., and Berg, A.C. (2015, January 7\u201312). Matchnet: Unifying feature and metric learning for patch-based matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1016\/j.cviu.2017.10.007","article-title":"Deep compare: A study on using convolutional neural networks to compare image patches","volume":"164","author":"Zagoruyko","year":"2017","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Tian, Y., Fan, B., and Wu, F. (2017, January 21\u201326). L2-net: Deep learning of discriminative patch descriptor in euclidean space. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.649"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1494","DOI":"10.1109\/TGRS.2007.892599","article-title":"Robust multispectral image registration using mutual-information models","volume":"45","author":"Kern","year":"2007","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_20","first-page":"248","article-title":"Registration algorithm of infrared and visible images based on improved gradient normalized mutual information and particle swarm optimization","volume":"41","author":"Jing","year":"2012","journal-title":"Infrared Laser Eng."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"578","DOI":"10.1016\/j.patcog.2012.07.026","article-title":"Local self-similarity-based registration of human ROIs in pairs of stereo thermal-visible videos","volume":"46","author":"Torabi","year":"2013","journal-title":"Pattern Recognit."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Bhuiyan, M.A.E., Witharana, C., and Liljedahl, A.K. (2020). Use of Very High Spatial Resolution Commercial Satellite Imagery and Deep Learning to Automatically Map Ice-Wedge Polygons across Tundra Vegetation Types. J. Imaging, 6.","DOI":"10.3390\/jimaging6120137"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhang, W., Liljedahl, A.K., Kanevskiy, M., Epstein, H.E., Jones, B.M., Jorgenson, M.T., and Kent, K. (2020). Transferability of the deep learning mask R-CNN model for automated mapping of ice-wedge polygons in high-resolution satellite and UAV images. Remote Sens., 12.","DOI":"10.3390\/rs12071085"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Szeliski, R. (2010). Computer Vision: Algorithms and Applications, Springer Science & Business Media.","DOI":"10.1007\/978-1-84882-935-0"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"381","DOI":"10.1145\/358669.358692","article-title":"Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography","volume":"24","author":"Fischler","year":"1981","journal-title":"Commun. ACM"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Yi, K.M., Trulls, E., Lepetit, V., and Fua, P. (2016). Lift: Learned invariant feature transform. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46466-4_28"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Noh, H., Araujo, A., Sim, J., Weyand, T., and Han, B. (2017, January 22\u201329). Large-scale image retrieval with attentive deep local features. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.374"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"DeTone, D., Malisiewicz, T., and Rabinovich, A. (2018, January 18\u201322). Superpoint: Self-supervised interest point detection and description. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00060"},{"key":"ref_29","unstructured":"Ono, Y., Trulls, E., Fua, P., and Yi, K.M. (2018). LF-Net: Learning local features from images. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Dusmanu, M., Rocco, I., Pajdla, T., Pollefeys, M., Sivic, J., Torii, A., and Sattler, T. (2019). D2-net: A trainable cnn for joint detection and description of local features. arXiv.","DOI":"10.1109\/CVPR.2019.00828"},{"key":"ref_31","unstructured":"Revaud, J., Weinzaepfel, P., De Souza, C., Pion, N., Csurka, G., Cabon, Y., and Humenberger, M. (2019). R2D2: Repeatable and reliable detector and descriptor. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Savinov, N., Seki, A., Ladicky, L., Sattler, T., and Pollefeys, M. (2017, January 21\u201326). Quad-networks: Unsupervised learning to rank for interest point detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.418"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhang, L., and Rusinkiewicz, S. (2018, January 18\u201323). Learning to detect features in texture images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00662"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Simo-Serra, E., Trulls, E., Ferraz, L., Kokkinos, I., Fua, P., and Moreno-Noguer, F. (2015, January 7\u201313). Discriminative learning of deep convolutional feature point descriptors. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.22"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1573","DOI":"10.1109\/TPAMI.2014.2301163","article-title":"Learning local feature descriptors using convex optimisation","volume":"36","author":"Simonyan","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_36","unstructured":"Mishchuk, A., Mishkin, D., Radenovic, F., and Matas, J. (2017). Working hard to know your neighbor\u2019s margins: Local descriptor learning loss. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Schonberger, J.L., and Frahm, J.M. (2016, January 27\u201330). Structure-from-motion revisited. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.445"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"103755","DOI":"10.1016\/j.robot.2021.103755","article-title":"ColMap: A memory-efficient occupancy grid mapping framework","volume":"142","author":"Fisher","year":"2021","journal-title":"Robot. Auton. Syst."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1007\/s11263-020-01359-2","article-title":"Image matching from handcrafted to deep features: A survey","volume":"129","author":"Ma","year":"2021","journal-title":"Int. J. Comput. Vis."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Revaud, J., Almaz\u00e1n, J., Rezende, R.S., and Souza, C.R.D. (2019, January 27\u201328). Learning with average precision: Training image retrieval with a listwise loss. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00521"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Burges, C., Shaked, T., Renshaw, E., Lazier, A., Deeds, M., Hamilton, N., and Hullender, G. (2005, January 7\u201311). Learning to rank using gradient descent. Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany.","DOI":"10.1145\/1102351.1102363"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Cao, Z., Qin, T., Liu, T.Y., Tsai, M.F., and Li, H. (2007, January 20\u201324). Learning to rank: From pairwise approach to listwise approach. Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, USA.","DOI":"10.1145\/1273496.1273513"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"He, K., Lu, Y., and Sclaroff, S. (2018, January 18\u201323). Local descriptors optimized for average precision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00069"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Cakir, F., He, K., Xia, X., Kulis, B., and Sclaroff, S. (2019, January 15\u201320). Deep metric learning to rank. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00196"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014). Microsoft coco: Common objects in context. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Li, Z., and Snavely, N. (2018, January 18\u201323). Megadepth: Learning single-view depth prediction from internet photos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00218"},{"key":"ref_47","first-page":"20","article-title":"Impact of training set batch size on the performance of convolutional neural networks for diverse datasets","volume":"20","author":"Radiuk","year":"2017","journal-title":"Inf. Technol. Manag. Sci."},{"key":"ref_48","unstructured":"You, K., Long, M., Wang, J., and Jordan, M.I. (2019). How does learning rate decay help modern neural networks?. arXiv."},{"key":"ref_49","unstructured":"Ge, R., Kakade, S.M., Kidambi, R., and Netrapalli, P. (2019). The step decay schedule: A near optimal, geometrically decaying learning rate procedure for least squares. arXiv."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Mishra, P. (2019). Supervised Learning Using PyTorch. PyTorch Recipes, Springer.","DOI":"10.1007\/978-1-4842-4258-2"},{"key":"ref_51","unstructured":"Lewkowycz, A. (2021). How to decay your learning rate. arXiv."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/22\/4618\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:31:19Z","timestamp":1760167879000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/22\/4618"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,16]]},"references-count":51,"journal-issue":{"issue":"22","published-online":{"date-parts":[[2021,11]]}},"alternative-id":["rs13224618"],"URL":"https:\/\/doi.org\/10.3390\/rs13224618","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,16]]}}}