{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T06:54:08Z","timestamp":1762325648960,"version":"build-2065373602"},"reference-count":27,"publisher":"MDPI AG","issue":"14","license":[{"start":{"date-parts":[[2023,7,9]],"date-time":"2023-07-09T00:00:00Z","timestamp":1688860800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Academy of Integrated Circuit Innovation of Xidian University in ChongQing Foundation","award":["CQIRI-CXYHT-2021-06","KX202061","20199236855"],"award-info":[{"award-number":["CQIRI-CXYHT-2021-06","KX202061","20199236855"]}]},{"name":"Guangxi Key Laboratory of Trusted Software","award":["CQIRI-CXYHT-2021-06","KX202061","20199236855"],"award-info":[{"award-number":["CQIRI-CXYHT-2021-06","KX202061","20199236855"]}]},{"name":"Fundamental Research Funds for the Central Universities","award":["CQIRI-CXYHT-2021-06","KX202061","20199236855"],"award-info":[{"award-number":["CQIRI-CXYHT-2021-06","KX202061","20199236855"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>How to recover geometric transformations is one of the most challenging issues in image registration. To alleviate the effect of large geometric distortion in multimodal remote sensing image registration, a scale and rotate transform prediction net is proposed in this paper. First, to reduce the scale between the reference and sensed images, the image scale regression module is constructed via CNN feature extraction and FFT correlation, and the scale of sensed image can be recovered roughly. Second, the rotation estimate module is developed for predicting the rotation angles between the reference and the scale-recovered images. Finally, to obtain the accurate registration results, LoFTR is employed to match the geometric-recovered images. The proposed registration network was evaluated on GoogleEarth, HRMS, VIS-NIR and UAV datasets with contrast differences and geometric distortions. The experimental results show that the number of correct matches of our model reached 74.6%, and the RMSE of the registration results achieved 1.236, which is superior to the related methods.<\/jats:p>","DOI":"10.3390\/rs15143469","type":"journal-article","created":{"date-parts":[[2023,7,10]],"date-time":"2023-07-10T00:47:35Z","timestamp":1688950055000},"page":"3469","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["SRTPN: Scale and Rotation Transform Prediction Net for Multimodal Remote Sensing Image Registration"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2751-6096","authenticated-orcid":false,"given":"Xiangzeng","family":"Liu","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0306-6163","authenticated-orcid":false,"given":"Xueling","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Xiaodong","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2872-388X","authenticated-orcid":false,"given":"Qiguang","family":"Miao","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Lei","family":"Wang","sequence":"additional","affiliation":[{"name":"NavInfo Co., Ltd., Beijing 100028, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7262-4707","authenticated-orcid":false,"given":"Liang","family":"Chang","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China"}]},{"given":"Ruyi","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xidian University, Xi\u2019an 710071, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,7,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"977","DOI":"10.1016\/S0262-8856(03)00137-9","article-title":"Image Registration Methods: A Survey","volume":"21","author":"Flusser","year":"2003","journal-title":"Image Vis. Comput."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1016\/j.inffus.2021.02.012","article-title":"A Review of Multimodal Image Matching: Methods and Applications","volume":"73","author":"Jiang","year":"2021","journal-title":"Inf. Fusion"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Zhao, Z., Bai, H., Zhang, J., Zhang, Y., Xu, S., Lin, Z., Timofte, R., and Gool, L.V. (2023, January 18\u201322). CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00572"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Bai, X., Hu, Z., Zhu, X., Huang, Q., Chen, Y., Fu, H., and Tai, C.-L. (2022, January 18\u201324). TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00116"},{"key":"ref_5","unstructured":"Gedara Chaminda Bandara, W., Gopalakrishnan Nair, N., and Patel, V.M. (2022). DDPM-CD: Remote Sensing Change Detection Using Denoising Diffusion Probabilistic Models. arXiv."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1109\/TIP.2022.3226418","article-title":"Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images","volume":"32","author":"Lin","year":"2023","journal-title":"IEEE Trans. Image Process."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Jiao, Y., Jie, Z., Chen, S., Chen, J., Ma, L., and Jiang, Y.-G. (2023, January 18\u201322). MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.02073"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Gupta, A., Narayan, S., Joseph, K.J., Khan, S., Khan, F.S., and Shah, M. (2022, January 18\u201324). OW-DETR: Open-World Detection Transformer. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00902"},{"key":"ref_9","unstructured":"Wang, C.-Y., Bochkovskiy, A., and Liao, H.-Y.M. (2022). YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Liu, X., Xue, J., Xu, X., Lu, Z., Liu, R., Zhao, B., Li, Y., and Miao, Q. (2022). Robust Multimodal Remote Sensing Image Registration Based on Local Statistical Frequency Information. Remote Sens., 14.","DOI":"10.3390\/rs14041051"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_12","first-page":"698","article-title":"Key. Net: Keypoint Detection by Handcrafted and Learned Cnn Filters Revisited","volume":"45","author":"Mikolajczyk","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Rocco, I., Arandjelovi\u0107, R., and Sivic, J. (2020, January 23\u201328). Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions. Proceedings of the Computer Vision\u2014ECCV 2020: 16th European Conference, Glasgow, UK. Proceedings, Part IX 16.","DOI":"10.1007\/978-3-030-58545-7_35"},{"key":"ref_14","unstructured":"Rocco, I., Cimpoi, M., Arandjelovi\u0107, R., Torii, A., Pajdla, T., and Sivic, J. (2018). Neighbourhood Consensus Networks. Adv. Neural Inf. Process. Syst., 31."},{"key":"ref_15","first-page":"17346","article-title":"Dual-Resolution Correspondence Networks","volume":"33","author":"Li","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Sun, J., Shen, Z., Wang, Y., Bao, H., and Zhou, X. (2021, January 19\u201325). LoFTR: Detector-Free Local Feature Matching with Transformers. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Online.","DOI":"10.1109\/CVPR46437.2021.00881"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Sarlin, P.-E., DeTone, D., Malisiewicz, T., and Rabinovich, A. (2020, January 13\u201319). SuperGlue: Learning Feature Matching with Graph Neural Networks. Proceedings of the 2020 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00499"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Bokman, G., and Kahl, F. (2022, January 18\u201324). A Case for Using Rotation Invariant Features in State of the Art Feature Matchers. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), New Orleans, LA, USA.","DOI":"10.1109\/CVPRW56347.2022.00559"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1335","DOI":"10.1109\/TCSVT.2022.3210602","article-title":"Scale-Net: Learning to Reduce Scale Differences for Large-Scale Invariant Image Matching","volume":"33","author":"Fu","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Rocco, I., Arandjelovic, R., and Sivic, J. (2017, January 21\u201326). Convolutional Neural Network Architecture for Geometric Matching. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.12"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Kim, D.-G., Nam, W.-J., and Lee, S.-W. (2019, January 6\u20139). A Robust Matching Network for Gradually Estimating Geometric Transformation on Remote Sensing Imagery. Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy.","DOI":"10.1109\/SMC.2019.8913881"},{"key":"ref_23","first-page":"1","article-title":"SAR-Optical Image Matching by Integrating Siamese U-Net With FFT Correlation","volume":"19","author":"Fang","year":"2022","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Barath, D., Noskova, J., Ivashechkin, M., and Matas, J. (2020, January 13\u201319). MAGSAC++, a Fast, Reliable and Accurate Robust Estimator. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00138"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1007\/978-3-030-01240-3_18","article-title":"Repeatability Is Not Enough: Learning Affine Regions via Discriminability","volume":"Volume 11213","author":"Ferrari","year":"2018","journal-title":"Computer Vision\u2014ECCV 2018"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Park, J.-H., Nam, W.-J., and Lee, S.-W. (2020). A Two-Stream Symmetric Network with Bidirectional Ensemble for Aerial Image Matching. Remote Sens., 12.","DOI":"10.3390\/rs12030465"},{"key":"ref_27","first-page":"18433","article-title":"CoMIR: Contrastive Multimodal Image Representation for Registration","volume":"Volume 33","author":"Pielawski","year":"2020","journal-title":"Proceedings of the Advances in Neural Information Processing Systems"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/14\/3469\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:09:32Z","timestamp":1760126972000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/14\/3469"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,9]]},"references-count":27,"journal-issue":{"issue":"14","published-online":{"date-parts":[[2023,7]]}},"alternative-id":["rs15143469"],"URL":"https:\/\/doi.org\/10.3390\/rs15143469","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2023,7,9]]}}}