{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T09:08:29Z","timestamp":1765357709175,"version":"build-2065373602"},"reference-count":40,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2024,10,4]],"date-time":"2024-10-04T00:00:00Z","timestamp":1728000000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["62476207","CSTB2023NSCQ-LZX0085","2020ZDLGY05-01"],"award-info":[{"award-number":["62476207","CSTB2023NSCQ-LZX0085","2020ZDLGY05-01"]}]},{"name":"Chongqing Natural Science Foundation Innovation and Development Joint Fund Project","award":["62476207","CSTB2023NSCQ-LZX0085","2020ZDLGY05-01"],"award-info":[{"award-number":["62476207","CSTB2023NSCQ-LZX0085","2020ZDLGY05-01"]}]},{"name":"Key Industrial Innovation Chain Project in Industrial Domain of Shaanxi Province","award":["62476207","CSTB2023NSCQ-LZX0085","2020ZDLGY05-01"],"award-info":[{"award-number":["62476207","CSTB2023NSCQ-LZX0085","2020ZDLGY05-01"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Remote-sensing images typically feature large dimensions and contain repeated texture patterns. To effectively capture finer details and encode comprehensive information, feature-extraction networks with larger receptive fields are essential for remote-sensing image super-resolution tasks. However, mainstream methods based on stacked Transformer modules suffer from limited receptive fields due to fixed window sizes, impairing long-range dependency capture and fine-grained texture reconstruction. In this paper, we propose a spatial-frequency joint attention network based on multi-window fusion (MWSFA). Specifically, our approach introduces a multi-window fusion strategy, which merges windows with similar textures to allow self-attention mechanisms to capture long-range dependencies effectively, therefore expanding the receptive field of the feature extractor. Additionally, we incorporate a frequency-domain self-attention branch in parallel with the original Transformer architecture. This branch leverages the global characteristics of the frequency domain to further extend the receptive field, enabling more comprehensive self-attention calculations across different frequency bands and better utilization of consistent frequency information. Extensive experiments on both synthetic and real remote-sensing datasets demonstrate that our method achieves superior visual reconstruction effects and higher evaluation metrics compared to other super-resolution methods.<\/jats:p>","DOI":"10.3390\/rs16193695","type":"journal-article","created":{"date-parts":[[2024,10,4]],"date-time":"2024-10-04T03:26:36Z","timestamp":1728012396000},"page":"3695","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Multi-Window Fusion Spatial-Frequency Joint Self-Attention for Remote-Sensing Image Super-Resolution"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-7720-9812","authenticated-orcid":false,"given":"Ziang","family":"Li","sequence":"first","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Wen","family":"Lu","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-6412-5799","authenticated-orcid":false,"given":"Zhaoyang","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Jian","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Zeming","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Lihuo","family":"He","sequence":"additional","affiliation":[{"name":"School of Electronic Engineering, Xidian University, Xi\u2019an 710071, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,10,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3365","DOI":"10.1109\/TPAMI.2020.2982166","article-title":"Deep Learning for Image Super-Resolution: A Survey","volume":"43","author":"Wang","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1915","DOI":"10.5194\/isprs-archives-XLII-3-1915-2018","article-title":"Rapid Target Detection in High-Resolution Remote Sensing Images Using YOLO Model","volume":"42","author":"Wu","year":"2018","journal-title":"Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci."},{"key":"ref_3","unstructured":"Gupta, M., Almomani, O., Khasawneh, A.M., and Darabkh, K.A. (2022). Smart Remote Sensing Network for Early Warning of Disaster Risks. Nanotechnology-Based Smart Remote Sensing Networks for Disaster Prevention, Elsevier. [2nd ed.]."},{"key":"ref_4","first-page":"301","article-title":"Military Reconnaissance Application of High-Resolution Optical Satellite Remote Sensing","volume":"9299","author":"Wang","year":"2014","journal-title":"Proc. SPIE"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1016\/j.neucom.2019.03.106","article-title":"Ultra-Dense GAN for Satellite Imagery Super-Resolution","volume":"398","author":"Wang","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"397","DOI":"10.7848\/ksgpc.2015.33.5.397","article-title":"Digital Map Updates with UAV Photogrammetric Methods","volume":"33","author":"Lim","year":"2015","journal-title":"J. Korean Soc. Surv. Geod. Photogramm. Cartogr."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Guo, M., Liu, H., Xu, Y., and Huang, Y. (2020). Building Extraction Based on U-Net with an Attention Block and Multiple Losses. Remote Sens., 12.","DOI":"10.3390\/rs12091400"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"109","DOI":"10.1109\/LGRS.2011.2161569","article-title":"Automatic Target Detection in High-Resolution Remote Sensing Images Using Spatial Sparse Coding Bag-of-Words Model","volume":"9","author":"Sun","year":"2011","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Liang, X., and Gan, Z. (2011, January 12\u201315). Improved Non-Local Iterative Back-Projection Method for Image Super-Resolution. Proceedings of the 2011 Sixth International Conference on Image and Graphics, Hefei, China.","DOI":"10.1109\/ICIG.2011.108"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2861","DOI":"10.1109\/TIP.2010.2050625","article-title":"Image Super-Resolution Via Sparse Representation","volume":"19","author":"Yang","year":"2010","journal-title":"IEEE Trans. Image Process."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Gu, S., Zuo, W., Xie, Q., Meng, D., Feng, X., and Zhang, L. (2015, January 7\u201313). Convolutional Sparse Coding for Image Super-Resolution. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.212"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"301","DOI":"10.1109\/TPAMI.2016.2542816","article-title":"Graphical Representation for Heterogeneous Face Recognition","volume":"39","author":"Peng","year":"2016","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1109\/TPAMI.2015.2439281","article-title":"Image Super-Resolution Using Deep Convolutional Networks","volume":"38","author":"Dong","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Kim, J., Lee, J.K., and Lee, K.M. (2016, January 27\u201330). Deeply-Recursive Convolutional Network for Image Super-Resolution. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.181"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Tian, Y., Kong, Y., Zhong, B., and Fu, Y. (2018, January 18\u201323). Residual Dense Network for Image Super-Resolution. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00262"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ledig, C., Theis, L., Husz\u00e1r, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A.P., Tejani, A., Totz, J., and Wang, Z. (2017, January 21\u201326). Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.19"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Haris, M., Shakhnarovich, G., and Ukita, N. (2018, January 18\u201323). Deep Back-Projection Networks for Super-Resolution. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00179"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Shocher, A., Cohen, N., and Irani, M. (2018, January 18\u201323). \u201cZero-Shot\u201d Super-Resolution Using Deep Internal Learning. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00329"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., and Fu, Y. (2018, January 2\u201314). Image Super-Resolution Using Very Deep Residual Channel Attention Networks. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_18"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lim, B., Son, S., Kim, H., Nah, S., and Lee, K.M. (2017, January 21\u201326). Enhanced Deep Residual Networks for Single Image Super-Resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.151"},{"key":"ref_21","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yang, F., Yang, H., Fu, J., Lu, H., and Guo, B. (2020, January 13\u201319). Learning Texture Transformer Network for Image Super-Resolution. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00583"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., and Yang, M.-H. (2022, January 18\u201324). Restormer: Efficient Transformer for High-Resolution Image Restoration. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00564"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., and Timofte, R. (2021, January 11\u201317). SwinIR: Image Restoration Using Swin Transformer. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00210"},{"key":"ref_25","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-Net: Convolutional Networks for Biomedical Image Segmentation. Proceedings of the 18th International Conference, Munich, Germany. Part III, 18."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Choi, H., Lee, J., and Yang, J. (2023, January 17\u201324). N-Gram in Swin Transformers for Efficient Lightweight Image Super-Resolution. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00206"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Chen, X., Wang, X., Zhou, J., Qiao, Y., and Chao, D. (2023, January 17\u201324). Activating More Pixels in Image Super-Resolution Transformer. Proceedings of the 2023 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.02142"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1480307","DOI":"10.1155\/2016\/1480307","article-title":"Algorithm for Automated Mapping of Land Surface Temperature Using LANDSAT 8 Satellite Data","volume":"2016","author":"Avdan","year":"2016","journal-title":"J. Sensors"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Gu, J., and Dong, C. (2021, January 20\u201325). Interpreting Super-Resolution Networks with Local Attribution Maps. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00908"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Deng, X., Yang, R., Xu, M., and Dragotti, P.L. (November, January 27). Wavelet Domain Style Transfer for an Effective Perception-Distortion Tradeoff in Single Image Super-Resolution. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00317"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"297","DOI":"10.1090\/S0025-5718-1965-0178586-1","article-title":"An Algorithm for the Machine Calculation of Complex Fourier Series","volume":"19","author":"Cooley","year":"1965","journal-title":"Math. Comput."},{"key":"ref_32","unstructured":"Simonyan, K. (2014). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Guo, M., Zhang, Z., Liu, H., and Huang, Y. (2022). NDSRGAN: A Novel Dense Generative Adversarial Network for Real Aerial Imagery Super-Resolution Reconstruction. Remote Sens., 14.","DOI":"10.3390\/rs14071574"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Liang, J., Zeng, H., and Zhang, L. (2022, January 18\u201324). Details or Artifacts: A Locally Discriminative Learning Approach to Realistic Image Super-Resolution. Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00557"},{"key":"ref_35","unstructured":"Diederik, P.K. (2014). Adam: A Method for Stochastic Optimization. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"9277","DOI":"10.1109\/TGRS.2019.2924818","article-title":"Remote Sensing Image Super Resolution Using Deep Residual Channel Attention","volume":"57","author":"Haut","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1618","DOI":"10.1109\/TGRS.2020.2994253","article-title":"Remote Sensing Image Super-Resolution Using Novel Dense-Sampling Networks","volume":"59","author":"Dong","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"18524","DOI":"10.1007\/s11227-022-04617-x","article-title":"Remote Sensing Image Reconstruction Using an Asymmetric Multi-Scale Super-Resolution Network","volume":"78","author":"Huan","year":"2022","journal-title":"J. Supercomput."},{"key":"ref_39","first-page":"1","article-title":"Single Remote Sensing Image Super-Resolution Via a Generative Adversarial Network with Stratified Dense Sampling and Chain Training","volume":"62","author":"Meng","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_40","unstructured":"Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., Qiao, Y., and Loy, C.C. (2018, January 8\u201314). ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. Proceedings of the 15th European Conference on Computer Vision, ECCV 2018, Munich, Germany."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/19\/3695\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:10:18Z","timestamp":1760112618000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/19\/3695"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,4]]},"references-count":40,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2024,10]]}},"alternative-id":["rs16193695"],"URL":"https:\/\/doi.org\/10.3390\/rs16193695","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2024,10,4]]}}}