{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T16:59:20Z","timestamp":1775667560609,"version":"3.50.1"},"reference-count":96,"publisher":"MDPI AG","issue":"16","license":[{"start":{"date-parts":[[2021,8,17]],"date-time":"2021-08-17T00:00:00Z","timestamp":1629158400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Strong geometric and radiometric distortions often exist in optical wide-baseline stereo images, and some local regions can include surface discontinuities and occlusions. Digital photogrammetry and computer vision researchers have focused on automatic matching for such images. Deep convolutional neural networks, which can express high-level features and their correlation, have received increasing attention for the task of wide-baseline image matching, and learning-based methods have the potential to surpass methods based on handcrafted features. Therefore, we focus on the dynamic study of wide-baseline image matching and review the main approaches of learning-based feature detection, description, and end-to-end image matching. Moreover, we summarize the current representative research using stepwise inspection and dissection. We present the results of comprehensive experiments on actual wide-baseline stereo images, which we use to contrast and discuss the advantages and disadvantages of several state-of-the-art deep-learning algorithms. 
Finally, we conclude with a description of the state-of-the-art methods and forecast developing trends with unresolved challenges, providing a guide for future work.<\/jats:p>","DOI":"10.3390\/rs13163247","type":"journal-article","created":{"date-parts":[[2021,8,17]],"date-time":"2021-08-17T21:17:06Z","timestamp":1629235026000},"page":"3247","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Review of Wide-Baseline Stereo Image Matching Based on Deep Learning"],"prefix":"10.3390","volume":"13","author":[{"given":"Guobiao","family":"Yao","sequence":"first","affiliation":[{"name":"School of Surveying and Geo-Informatics, Shandong Jianzhu University, No. 1000 Fengming Road, Jinan 250101, China"},{"name":"Photogrammetric Computer Vision Lab, The Ohio State University, Columbus, OH 43210, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0755-2628","authenticated-orcid":false,"given":"Alper","family":"Yilmaz","sequence":"additional","affiliation":[{"name":"Photogrammetric Computer Vision Lab, The Ohio State University, Columbus, OH 43210, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0192-3870","authenticated-orcid":false,"given":"Fei","family":"Meng","sequence":"additional","affiliation":[{"name":"School of Surveying and Geo-Informatics, Shandong Jianzhu University, No. 1000 Fengming Road, Jinan 250101, China"}]},{"given":"Li","family":"Zhang","sequence":"additional","affiliation":[{"name":"Chinese Academy of Surveying & Mapping, No. 28 Lianhuachi West Road, Beijing 100830, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,8,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"2274","DOI":"10.1002\/cta.2997","article-title":"Stable image matching for 3D reconstruction in outdoor","volume":"49","author":"Cao","year":"2021","journal-title":"Int. J. 
Circuit Theory Appl."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"106475","DOI":"10.1016\/j.optlaseng.2020.106475","article-title":"Total variation and block-matching 3D filtering-based image reconstruction for single-shot compressed ultrafast photography","volume":"139","author":"Yao","year":"2020","journal-title":"Opt. Lasers Eng."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Park, S.-W., Yoon, R., Lee, H., Lee, H.-J., Choi, Y.-D., and Lee, D.-H. (2020). Impacts of Thresholds of Gray Value for Cone-Beam Computed Tomography 3D Reconstruction on the Accuracy of Image Matching with Optical Scan. Int. J. Environ. Res. Public Health, 17.","DOI":"10.3390\/ijerph17176375"},{"key":"ref_4","first-page":"1","article-title":"Generalized photogrammetry of spaceborne, airborne and terrestrial multi-source remote sensing datasets","volume":"50","author":"Zhang","year":"2021","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_5","first-page":"1129","article-title":"Structure adaptive feature point matching for urban area wide-baseline images with viewpoint variation","volume":"48","author":"Chen","year":"2019","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_6","first-page":"554","article-title":"Automatic tie-point extraction based on multiple-image matching and bundle adjustment of large block of oblique aerial images","volume":"46","author":"Zhang","year":"2017","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_7","first-page":"843","article-title":"An algorithm of automatic quasi-dense matching and three-dimensional reconstruction for oblique stereo images","volume":"39","author":"Yao","year":"2014","journal-title":"Geomat. Informat. Sci. Wuhan Univ."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"517","DOI":"10.1007\/s11263-020-01385-0","article-title":"Image Matching across Wide Baselines: From Paper to Practice","volume":"129","author":"Jin","year":"2020","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Sarlin, P.-E., DeTone, D., Malisiewicz, T., and Rabinovich, A. (2020, January 14\u201319). SuperGlue: Learning Feature Matching with Graph Neural Networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00499"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/s11263-005-3848-x","article-title":"A Comparison of Affine Region Detectors","volume":"65","author":"Mikolajczyk","year":"2005","journal-title":"Int. J. Comput. Vis."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"101752","DOI":"10.1016\/j.cose.2020.101752","article-title":"A deep learning method with wrapper based feature extraction for wireless intrusion detection system","volume":"92","author":"Kasongo","year":"2020","journal-title":"Comput. Secur."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1007\/s11263-020-01359-2","article-title":"Image Matching from Handcrafted to Deep Features: A Survey","volume":"129","author":"Ma","year":"2020","journal-title":"Int. J. Comput. Vis."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1080\/10095020.2020.1843376","article-title":"Feature detection and description for image matching: From hand-crafted design to deep learning","volume":"24","author":"Chen","year":"2020","journal-title":"Geo-Spat. Inf. Sci."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"559","DOI":"10.1109\/TGRS.2017.2751567","article-title":"Robust Harris Corner Matching Based on the Quasi-Homography Transform and Self-Adaptive Window for Wide-Baseline Stereo Images","volume":"56","author":"Yao","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"63","DOI":"10.1023\/B:VISI.0000027790.02288.f2","article-title":"Scale & Affine Invariant Interest Point Detectors","volume":"60","author":"Mikolajczyk","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"761","DOI":"10.1016\/j.imavis.2004.02.006","article-title":"Robust wide-baseline stereo from maximally stable extremal regions","volume":"22","author":"Matas","year":"2004","journal-title":"Image Vis. Comput."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"438","DOI":"10.1137\/080732730","article-title":"ASIFT: A New Framework for Fully Affine Invariant Image Comparison","volume":"2","author":"Morel","year":"2009","journal-title":"SIAM J. Imaging Sci."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1600","DOI":"10.1109\/LGRS.2019.2905350","article-title":"A Multiple Feature Fully Convolutional Network for Road Extraction from High-Resolution Remote Sensing Image Over Mountainous Areas","volume":"16","author":"Zhang","year":"2019","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_20","unstructured":"Han, X., Leung, T., Jia, Y., Sukthankar, R., and Berg, A.C. (2015, January 7\u201312). MatchNet: Unifying feature and metric learning for patch-based matching. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA."},{"key":"ref_21","first-page":"23949","article-title":"An effective analysis of deep learning based approaches for audio based feature extraction and its visualization","volume":"78","author":"Sangwan","year":"2018","journal-title":"Multimedia Tools Appl."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1109\/TGRS.2019.2937830","article-title":"Attention GANs: Unsupervised Deep Feature Learning for Aerial Scene Classification","volume":"58","author":"Yu","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Alshaikhli, T., Liu, W., and Maruyama, Y. (2019). Automated Method of Road Extraction from Aerial Images Using a Deep Convolutional Neural Network. Appl. Sci., 9.","DOI":"10.3390\/app9224825"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"947","DOI":"10.1080\/13658816.2019.1696968","article-title":"Automatic extraction of road intersection points from USGS historical map series using deep convolutional neural networks","volume":"34","author":"Saeedimoghaddam","year":"2019","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1016\/j.inffus.2021.02.012","article-title":"A review of multimodal image matching: Methods and applications","volume":"73","author":"Jiang","year":"2021","journal-title":"Inf. Fusion"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"131","DOI":"10.1016\/j.bja.2019.10.017","article-title":"Deep learning for risk assessment: All about automatic feature extraction","volume":"124","author":"Cosgriff","year":"2020","journal-title":"Br. J. 
Anaesth."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"24","DOI":"10.1016\/j.jprocont.2019.08.006","article-title":"DeepVM: A Deep Learning-based approach with automatic feature extraction for 2D input data Virtual Metrology","volume":"84","author":"Maggipinto","year":"2019","journal-title":"J. Process. Control"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1109\/TEVC.2018.2808689","article-title":"Evolving Unsupervised Deep Neural Networks for Learning Meaningful Representations","volume":"23","author":"Sun","year":"2018","journal-title":"IEEE Trans. Evol. Comput."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1007","DOI":"10.1016\/j.petrol.2018.07.070","article-title":"Feature extraction using a deep learning algorithm for uncertainty quantification of channelized reservoirs","volume":"171","author":"Lee","year":"2018","journal-title":"J. Pet. Sci. Eng."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Verdie, Y., Yi, K.M., Fua, P., and Lepetit, V. (2015, January 7\u201312). TILDE: A Temporally Invariant Learned DEtector. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299165"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Shukla, S., and Arac, A. (2020). A Step-by-Step Implementation of DeepBehavior, Deep Learning Toolbox for Automated Behavior Analysis. J. Vis. Exp., e60763.","DOI":"10.3791\/60763-v"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"138810","DOI":"10.1109\/ACCESS.2020.3012695","article-title":"An End-to-End Deep Learning Network for 3D Object Detection From RGB-D Data Based on Hough Voting","volume":"8","author":"Yan","year":"2020","journal-title":"IEEE Access"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Laguna, A.B., Riba, E., Ponsa, D., and Mikolajczyk, K. (2019, January 27\u201329). Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00593"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Balntas, V., Riba, E., Ponsa, D., and Mikolajczyk, K. (2016, January 19\u201322). Learning local feature descriptors with triplets and shallow convolutional neural networks. Proceedings of the British Machine Vision Conference, York, UK.","DOI":"10.5244\/C.30.119"},{"key":"ref_35","first-page":"1042","article-title":"Power tower detection in remote sensing imagery based on deformable network and transfer learning","volume":"49","author":"Zheng","year":"2020","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Yao, Y., and Park, H.S. (2020, January 1\u20135). Multiview co-segmentation for wide baseline images using cross-view supervision. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass, CO, USA.","DOI":"10.1109\/WACV45572.2020.9093497"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"5573","DOI":"10.1080\/01431161.2020.1734251","article-title":"A deep residual learning serial segmentation network for extracting buildings from remote sensing imagery","volume":"41","author":"Liu","year":"2020","journal-title":"Int. J. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhu, Y., Zhou, Z., Liao, G., and Yuan, K. New loss functions for medical image registration based on VoxelMorph. 
Image Processing of Medical Imaging, Proceedings of the SPIE Medical Imaging, Houston, TX, USA, 15\u201320 February 2020.","DOI":"10.1117\/12.2550030"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"8888","DOI":"10.1109\/TGRS.2020.2991545","article-title":"DML-GANR: Deep Metric Learning with Generative Adversarial Network Regularization for High Spatial Resolution Remote Sensing Image Retrieval","volume":"58","author":"Cao","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"4867","DOI":"10.3233\/JIFS-201679","article-title":"Quantitative analysis of the generalization ability of deep feedforward neural networks","volume":"40","author":"Yang","year":"2021","journal-title":"J. Intell. Fuzzy Syst."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Wang, L., Qian, Y., and Kong, X. (2021). Line and point matching based on the maximum number of consecutive matching edge segment pairs for large viewpoint changing images. Signal Image Video Process., 1\u20138.","DOI":"10.1007\/s11760-021-01959-6"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"2261","DOI":"10.1007\/s10064-020-02011-6","article-title":"Characterization of discontinuity surface morphology based on 3D fractal dimension by integrating laser scanning with ArcGIS","volume":"80","author":"Zheng","year":"2021","journal-title":"Bull. Int. Assoc. Eng. Geol."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Ma, Y., Peng, S., Jia, Y., and Liu, S. (2020). Prediction of terrain occlusion in Chang\u2019e-4 mission. Measurement, 152.","DOI":"10.1016\/j.measurement.2019.107368"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"014503","DOI":"10.1117\/1.JRS.14.014503","article-title":"Efficient and de-shadowing approach for multiple vehicle tracking in aerial video via image segmentation and local region matching","volume":"14","author":"Zhang","year":"2020","journal-title":"J. Appl. 
Remote Sens."},{"key":"ref_45","first-page":"1542","article-title":"Research developments and prospects on dense image matching in photogrammetry","volume":"48","author":"Yuan","year":"2019","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_46","first-page":"1141","article-title":"Deep learning based dense matching for aerial remote sensing images","volume":"48","author":"Liu","year":"2019","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_47","first-page":"1595","article-title":"Progress and future of image matching in low-altitude photogrammetry","volume":"48","author":"Chen","year":"2019","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"567","DOI":"10.14358\/PERS.83.8.567","article-title":"Unsupervised Deep Feature Learning for Urban Village Detection from High-Resolution Remote Sensing Images","volume":"83","author":"Li","year":"2017","journal-title":"Photogramm. Eng. Remote Sens."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"152483","DOI":"10.1109\/ACCESS.2019.2948062","article-title":"Salient Object Detection: Integrate Salient Features in the Deep Learning Framework","volume":"7","author":"Chen","year":"2019","journal-title":"IEEE Access"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Xu, D., and Wu, Y. (2021). FE-YOLO: A Feature Enhancement Network for Remote Sensing Target Detection. Remote Sens., 13.","DOI":"10.3390\/rs13071311"},{"key":"ref_51","unstructured":"Lenc, K., and Vedaldi, A. (September, January 31). Learning Covariant Feature Detectors. Proceedings of the ECCV Workshop on Geometry Meets Deep Learning, Amsterdam, The Netherlands."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Zhang, X., Yu, F.X., Karaman, S., and Chang, S.-F. (2017, January 21\u201326). Learning Discriminative and Transformation Covariant Local Feature Detectors. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.523"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Doiphode, N., Mitra, R., Ahmed, S., and Jain, A. (2019, January 2\u20136). An Improved Learning Framework for Covariant Local Feature Detection. Proceedings of the Asian Conference on Computer Vision (ACCV), Perth, Australia.","DOI":"10.1007\/978-3-030-20876-9_17"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Hoffer, E., and Ailon, N. (2015, January 7\u20139). Deep Metric Learning Using Triplet Network. Proceedings of the International Conference on Learning Representations, San Diego, CA, USA.","DOI":"10.1007\/978-3-319-24261-3_7"},{"key":"ref_55","unstructured":"Yi, K.M., Verdie, Y., Fua, P., and Lepetit, V. (July, January 26). Learning to Assign Orientations to Feature Points. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Zitnick, C.L., and Ramnath, K. (2011, January 6\u201313). Edge foci interest points. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126263"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Mishkin, D., Radenovi\u0107, F., and Matas, J. (2018). Repeatability Is Not Enough: Learning Affine Regions via Discriminability. European Conference on Computer Vision (ECCV), Springer.","DOI":"10.1007\/978-3-030-01240-3_18"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Savinov, N., Seki, A., Ladicky, L., Sattler, T., and Pollefeys, M. (2017, January 21\u201326). Quad-networks: Unsupervised learning to rank for interest point detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.418"},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"128","DOI":"10.1016\/j.media.2018.11.010","article-title":"A deep learning framework for unsupervised affine and deformable image registration","volume":"52","author":"Berendsen","year":"2019","journal-title":"Med. Image Anal."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Abdullah, T., Bazi, Y., Al Rahhal, M.M., Mekhalfi, M.L., Rangarajan, L., and Zuair, M. (2020). TextRS: Deep Bidirectional Triplet Network for Matching Text to Remote Sensing Images. Remote Sens., 12.","DOI":"10.3390\/rs12030405"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Wei, X., Zhang, Y., Gong, Y., and Zheng, N. (2018, January 18\u201323). Kernelized subspace pooling for deep local descriptors. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00200"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Simo-Serra, E., Trulls, E., Ferraz, L., Kokkinos, I., Fua, P., and Moreno-Noguer, F. (2015, January 7\u201313). Discriminative Learning of Deep Convolutional Feature Point Descriptors. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.22"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Zagoruyko, S., and Komodakis, N. (2015, January 7\u201312). Learning to compare image patches via convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299064"},{"key":"ref_64","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. 
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23\u201328 June 2014. IEEE Trans. Pattern Anal. Mach. Intell., 346\u2013361."},{"key":"ref_65","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2014","journal-title":"IEEE T. Pattern. Anal."},{"key":"ref_66","first-page":"844","article-title":"Satellite image matching method based on deep convolution neural network","volume":"47","author":"Fan","year":"2018","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Tian, Y., Fan, B., and Wu, F. (2017, January 21\u201326). L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.649"},{"key":"ref_68","first-page":"43","article-title":"Discriminative Learning of Local Image Descriptors","volume":"33","author":"Hua","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Balntas, V., Lenc, K., Vedaldi, A., and Mikolajczyk, K. (2017, January 21\u201326). HPatches: A Benchmark and Evaluation of Handcrafted and Learned Local Descriptors. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.410"},{"key":"ref_70","unstructured":"Mishchuk, A., Mishkin, D., and Radenovic, F. (2017, January 4\u20139). Working hard to know your neighbor\u2019s margins: Local descriptor learning loss. Proceedings of the Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Ebel, P., Mishchuk, A., Yi, K.M., Fua, P., and Trulls, E. (2019, January 27\u201328). 
Beyond cartesian representations for local descriptors. Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00034"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Tian, Y., Yu, X., Fan, B., Wu, F., Heijnen, H., and Balntas, V. (2019, January 16\u201320). SOSNet: Second Order Similarity Regularization for Local Descriptor Learning. Proceedings of the Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01127"},{"key":"ref_73","doi-asserted-by":"crossref","unstructured":"Luo, Z., Shen, T., Zhou, L., Zhu, S., Zhang, R., Yao, Y., Fang, T., and Quan, L. (2018, January 8\u201314). GeoDesc: Learning Local Descriptors by Integrating Geometry Constraints. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01240-3_11"},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Luo, Z., Shen, T., Zhou, L., Zhang, J., Yao, Y., Li, S., Fang, T., and Quan, L. (2019, January 16\u201320). ContextDesc: Local Descriptor Augmentation with Cross-Modality Context. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00263"},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Yao, G., Yilmaz, A., Zhang, L., Meng, F., Ai, H., and Jin, F. (2021). Matching Large Baseline Oblique Stereo Images Using an End-To-End Convolutional Neural Network. Remote Sens., 13.","DOI":"10.3390\/rs13020274"},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"107109","DOI":"10.1016\/j.patcog.2019.107109","article-title":"Training data independent image registration using generative adversarial networks and domain adaptation","volume":"100","author":"Mahapatra","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Yi, K.M., Trulls, E., Lepetit, V., and Fua, P. (2016, January 11\u201314). 
LIFT: Learned Invariant Feature Transform. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46466-4_28"},{"key":"ref_78","first-page":"2017","article-title":"Spatial transformer networks","volume":"28","author":"Jaderberg","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_79","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1007\/s10791-009-9110-3","article-title":"Gradient descent optimization of smoothed information retrieval metrics","volume":"13","author":"Chapelle","year":"2009","journal-title":"Inf. Retr."},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Zhu, S., Zhang, R., Zhou, L., Shen, T., Fang, T., Tan, P., and Quan, L. (2018, January 18\u201323). Very Large-Scale Global SfM by Distributed Motion Averaging. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00480"},{"key":"ref_81","doi-asserted-by":"crossref","unstructured":"DeTone, D., Malisiewicz, T., and Rabinovich, A. (2018, January 18\u201322). SuperPoint: Self-Supervised Interest Point Detection and Description. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPRW.2018.00060"},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Li, H., and Li, F. (2013, January 2\u20133). Image Encode Method Based on IFS with Probabilities Applying in Image Retrieval. Proceedings of the Fourth Global Congress on Intelligent Systems (GCIS), Hong Kong, China.","DOI":"10.1109\/GCIS.2013.53"},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"982","DOI":"10.1109\/TCSVT.2006.879119","article-title":"Video Error Concealment by Integrating Greedy Suboptimization and Kalman Filtering Techniques","volume":"16","author":"Lie","year":"2006","journal-title":"IEEE Trans. Circuits Syst. 
Video Technol."},{"key":"ref_84","unstructured":"Revaud, J., Weinzaepfel, P., and De, S. (2019). R2D2: Repeatable and reliable detector and descriptor. arXiv."},{"key":"ref_85","unstructured":"Ono, Y., Trulls, E., Fua, P., and Yi, K.M. (2018, January 3\u20138). LF-Net: Learning local features from images. Proceedings of the Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Dusmanu, M., Rocco, I., Pajdla, T., Pollefeys, M., Sivic, J., Torii, A., and Sattler, T. (2019, January 16\u201320). D2-Net: A Trainable CNN for Joint Description and Detection of Local Features. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00828"},{"key":"ref_87","doi-asserted-by":"crossref","first-page":"12168","DOI":"10.1109\/ACCESS.2019.2963211","article-title":"Research on Inception Module Incorporated Siamese Convolutional Neural Networks to Realize Face Recognition","volume":"8","author":"Xu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_88","doi-asserted-by":"crossref","unstructured":"Li, J., Xie, Y., Li, C., Dai, Y., Ma, J., Dong, Z., and Yang, T. (2021). UAV-Assisted Wide Area Multi-Camera Space Alignment Based on Spatiotemporal Feature Map. Remote Sens., 13.","DOI":"10.3390\/rs13061117"},{"key":"ref_89","doi-asserted-by":"crossref","unstructured":"Hasheminasab, S.M., Zhou, T., and Habib, A. (2020). GNSS\/INS-Assisted Structure from Motion Strategies for UAV-Based Imagery over Mechanized Agricultural Fields. Remote Sens., 12.","DOI":"10.3390\/rs12030351"},{"key":"ref_90","doi-asserted-by":"crossref","unstructured":"Lee, S.-H., Yoo, J., Park, M., Kim, J., and Kwon, S. (2021). Robust Extrinsic Calibration of Multiple RGB-D Cameras with Body Tracking and Feature Matching. 
Sensors, 21.","DOI":"10.3390\/s21031013"},{"key":"ref_91","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft COCO: Common Objects in Context. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_92","doi-asserted-by":"crossref","unstructured":"Li, Z., and Snavely, N. (2018, January 18\u201323). MegaDepth: Learning Single-View Depth Prediction from Internet Photos. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00218"},{"key":"ref_93","doi-asserted-by":"crossref","unstructured":"Shen, T., Luo, Z., Zhou, L., Zhang, R., Zhu, S., Fang, T., and Quan, L. (2019). Matchable Image Retrieval by Learning from Surface Reconstruction. Computer Vision\u2013ACCV, Proceedings of the 14th Asian Conference on Computer Vision, Perth, Australia, 2\u20136 December 2019, Springer.","DOI":"10.1007\/978-3-030-20887-5_26"},{"key":"ref_94","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1007\/s11263-016-0902-9","article-title":"Large-Scale Data for Multiple-View Stereopsis","volume":"120","author":"Jensen","year":"2016","journal-title":"Int. J. Comput. Vis."},{"key":"ref_95","first-page":"869","article-title":"An automated registration method with high accuracy for oblique stereo images based on complementary affine invariant features","volume":"42","author":"Yao","year":"2013","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_96","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1109\/LGRS.2005.861735","article-title":"Seed Point Selection Method for Triangle Constrained Image Matching Propagation","volume":"3","author":"Zhu","year":"2006","journal-title":"IEEE Geosci. Remote Sens. 
Lett."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/16\/3247\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:45:41Z","timestamp":1760165141000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/16\/3247"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,17]]},"references-count":96,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2021,8]]}},"alternative-id":["rs13163247"],"URL":"https:\/\/doi.org\/10.3390\/rs13163247","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,8,17]]}}}