{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,28]],"date-time":"2026-01-28T20:58:06Z","timestamp":1769633886560,"version":"3.49.0"},"reference-count":54,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2024,5,24]],"date-time":"2024-05-24T00:00:00Z","timestamp":1716508800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Joint Scientific Research Project Fund","award":["0051\/2022\/AFJ"],"award-info":[{"award-number":["0051\/2022\/AFJ"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Transformers have recently achieved significant breakthroughs in various visual tasks. However, these methods often overlook the optimization of interactions between convolution and transformer blocks. Although the basic attention module strengthens the feature selection ability, it is still weak in generating superior quality output. In order to address this challenge, we propose the integration of sub-pixel space and the application of sparse coding theory in the calculation of self-attention. This approach aims to enhance the network\u2019s generation capability, leading to the development of a sparse-activated sub-pixel transformer network (SSTNet). The experimental results show that compared with several state-of-the-art methods, our proposed network can obtain better generation results, improving the sharpness of object edges and the richness of detail texture information in super-resolution generated images.<\/jats:p>","DOI":"10.3390\/rs16111895","type":"journal-article","created":{"date-parts":[[2024,5,24]],"date-time":"2024-05-24T11:17:52Z","timestamp":1716549472000},"page":"1895","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Activated Sparsely Sub-Pixel Transformer for Remote Sensing Image Super-Resolution"],"prefix":"10.3390","volume":"16","author":[{"given":"Yongde","family":"Guo","sequence":"first","affiliation":[{"name":"Faculty of Data Science, City University of Macau, Macau SAR, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-4401-2507","authenticated-orcid":false,"given":"Chengying","family":"Gong","sequence":"additional","affiliation":[{"name":"Faculty of Data Science, City University of Macau, Macau SAR, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2099-7162","authenticated-orcid":false,"given":"Jun","family":"Yan","sequence":"additional","affiliation":[{"name":"School of Data Science, Qingdao University of Science and Technology, Qingdao 266000, China"},{"name":"Zhuhai Aerospace Microchips Science & Technology Co., Ltd., Zhuhai 519000, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1016\/j.sigpro.2016.05.002","article-title":"Image super-resolution: The techniques, applications, and future","volume":"128","author":"Yue","year":"2016","journal-title":"Signal Process."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2312","DOI":"10.1109\/TGRS.2017.2778191","article-title":"Adaptive super-resolution for remote sensing images based on sparse representation with global joint dictionary model","volume":"56","author":"Hou","year":"2017","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"7918","DOI":"10.1109\/TGRS.2019.2917427","article-title":"Super-resolution of single remote sensing image based on residual dense backprojection networks","volume":"57","author":"Pan","year":"2019","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"3633","DOI":"10.1109\/TGRS.2019.2959020","article-title":"Coupled adversarial training for remote sensing image super-resolution","volume":"58","author":"Lei","year":"2019","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_5","first-page":"5401410","article-title":"Hybrid-scale self-similarity exploitation for remote sensing image super-resolution","volume":"60","author":"Lei","year":"2021","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1243","DOI":"10.1109\/LGRS.2017.2704122","article-title":"Super-resolution for remote sensing images via local\u2013global combined network","volume":"14","author":"Lei","year":"2017","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Ledig, C., Theis, L., Husz\u00e1r, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., and Wang, Z. (2017, January 21\u201326). Photo-realistic single image super-resolution using a generative adversarial network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.19"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lim, B., Son, S., Kim, H., Nah, S., and Mu Lee, K. (2017, January 21\u201326). Enhanced deep residual networks for single image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.151"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"3911","DOI":"10.1109\/TCSVT.2019.2915238","article-title":"Channel-wise and spatial feature modulation network for single image super-resolution","volume":"30","author":"Hu","year":"2019","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2547","DOI":"10.1109\/TCSVT.2020.3027732","article-title":"MDCN: Multi-scale dense cross network for image super-resolution","volume":"31","author":"Li","year":"2020","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Tong, T., Li, G., Liu, X., and Gao, Q. (2017, January 22\u201329). Image super-resolution using dense skip connections. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.514"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Tian, Y., Kong, Y., Zhong, B., and Fu, Y. (2018, January 18\u201323). Residual dense network for image super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00262"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., and Fu, Y. (2018, January 23\u201327). Image super-resolution using very deep residual channel attention networks. Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel.","DOI":"10.1007\/978-3-030-01234-2_18"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Dai, T., Cai, J., Zhang, Y., Xia, S.T., and Zhang, L. (2019, January 15\u201320). Second-order attention network for single image super-resolution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01132"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Liu, J., Zhang, W., Tang, Y., Tang, J., and Wu, G. (2020, January 13\u201319). Residual feature aggregation network for image super-resolution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00243"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Chen, Z., Zhang, Y., Gu, J., Kong, L., Yang, X., and Yu, F. (2023, January 2\u20136). Dual aggregation transformer for image super-resolution. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.01131"},{"key":"ref_17","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16 \u00d7 16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., and Timofte, R. (2021, January 11\u201317). Swinir: Image restoration using swin transformer. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00210"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Chen, X., Wang, X., Zhou, J., Qiao, Y., and Dong, C. (2023, January 17\u201324). Activating more pixels in image super-resolution transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.02142"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Li, Z., Guo, C.L., Bai, S., Cheng, M.M., and Hou, Q. (2023, January 2\u20136). Srformer: Permuted self-attention for single image super-resolution. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.01174"},{"key":"ref_21","first-page":"5615611","article-title":"Transformer-based multistage enhancement for remote sensing image super-resolution","volume":"60","author":"Lei","year":"2021","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_22","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017): 31st Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"295","DOI":"10.1109\/TPAMI.2015.2439281","article-title":"Image super-resolution using deep convolutional networks","volume":"38","author":"Dong","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Kim, J., Lee, J.K., and Lee, K.M. (2016, January 27\u201330). Accurate image super-resolution using very deep convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.182"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Shi, W., Caballero, J., Husz\u00e1r, F., Totz, J., Aitken, A.P., Bishop, R., Rueckert, D., and Wang, Z. (2016, January 27\u201330). Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.207"},{"key":"ref_26","unstructured":"Yu, J., Fan, Y., Yang, J., Xu, N., Wang, Z., Wang, X., and Huang, T. (2018). Wide activation for efficient and accurate image super-resolution. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Lai, W.S., Huang, J.B., Ahuja, N., and Yang, M.H. (2017, January 21\u201326). Deep laplacian pyramid networks for fast and accurate super-resolution. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.618"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Li, J., Fang, F., Mei, K., and Zhang, G. (2018, January 8\u201314). Multi-scale residual network for image super-resolution. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01237-3_32"},{"key":"ref_29","first-page":"5000905","article-title":"Remote sensing image super-resolution via multiscale enhancement network","volume":"20","author":"Wang","year":"2023","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., Qiao, Y., and Change Loy, C. (2018, January 8\u201314). Esrgan: Enhanced super-resolution generative adversarial networks. Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany.","DOI":"10.1007\/978-3-030-11021-5_5"},{"key":"ref_31","unstructured":"Zhang, W., Liu, Y., Dong, C., and Qiao, Y. (November, January 27). Ranksrgan: Generative adversarial networks with ranker for image super-resolution. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wang, X., Xie, L., Dong, C., and Shan, Y. (2021, January 11\u201317). Real-esrgan: Training real-world blind super-resolution with pure synthetic data. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Virtual.","DOI":"10.1109\/ICCVW54120.2021.00217"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Park, J., Son, S., and Lee, K.M. (2023, January 2\u20136). Content-aware local gan for photo-realistic super-resolution. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00971"},{"key":"ref_34","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014, January 8\u201313). Generative adversarial nets. Proceedings of the Advances in Neural Information Processing Systems 27 (NIPS 2014): 28st Annual Conference on Neural Information Processing Systems, Montreal QC, Canada."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Mei, Y., Fan, Y., Zhou, Y., Huang, L., Huang, T.S., and Shi, H. (2020, January 13\u201319). Image super-resolution with cross-scale non-local attention and exhaustive self-exemplars mining. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00573"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"5624715","DOI":"10.1109\/TGRS.2022.3180068","article-title":"Multiattention generative adversarial network for remote sensing image super-resolution","volume":"60","author":"Jia","year":"2022","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Xu, Y., Luo, W., Hu, A., Xie, Z., Xie, X., and Tao, L. (2022). TE-SAGAN: An improved generative adversarial network for remote sensing super-resolution images. Remote. Sens., 14.","DOI":"10.3390\/rs14102425"},{"key":"ref_38","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_39","unstructured":"Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving language understanding by generative pre-training. arXiv."},{"key":"ref_40","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford","year":"2019","journal-title":"Openai Blog"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Chen, H., Wang, Y., Guo, T., Xu, C., Deng, Y., Liu, Z., Ma, S., Xu, C., Xu, C., and Gao, W. (2021, January 20\u201325). Pre-trained image processing transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01212"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zeng, H., Guo, S., and Zhang, L. (2022, January 23\u201327). Efficient long-range attention network for image super-resolution. Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-19790-1_39"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Liu, Z., Feng, R., Wang, L., Zhong, Y., Zhang, L., and Zeng, T. (2022, January 17\u201322). Remote Sensing Image Super-Resolution via Dilated Convolution Network with Gradient Prior. Proceedings of the IGARSS 2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia.","DOI":"10.1109\/IGARSS46834.2022.9883673"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"769","DOI":"10.1109\/LGRS.2018.2810893","article-title":"Aerial image super resolution via wavelet multiscale convolutional neural networks","volume":"15","author":"Wang","year":"2018","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"3512","DOI":"10.1109\/TGRS.2018.2885506","article-title":"Achieving super-resolution remote sensing images via the wavelet transform combined with the recursive res-net","volume":"57","author":"Ma","year":"2019","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"4764","DOI":"10.1109\/TGRS.2020.2966805","article-title":"Scene-adaptive remote sensing image super-resolution using a multiscale attention network","volume":"58","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_47","first-page":"1","article-title":"Sparse autoencoder","volume":"72","author":"Ng","year":"2011","journal-title":"Cs294a Lect. Notes"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Chen, X., Liu, Z., Tang, H., Yi, L., Zhao, H., and Han, S. (2023, January 2\u20136). Sparsevit: Revisiting activation sparsity for efficient high-resolution vision transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Paris, France.","DOI":"10.1109\/CVPR52729.2023.00205"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 2\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"3965","DOI":"10.1109\/TGRS.2017.2685945","article-title":"AID: A benchmark data set for performance evaluation of aerial scene classification","volume":"55","author":"Xia","year":"2017","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. Image Process."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Dong, C., Loy, C.C., and Tang, X. (2016, January 11\u201314). Accelerating the super-resolution convolutional neural network. Proceedings of the Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands. Proceedings, Part II 14.","DOI":"10.1007\/978-3-319-46475-6_25"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"1432","DOI":"10.1109\/LGRS.2019.2899576","article-title":"Remote sensing single-image superresolution based on a deep compendium model","volume":"16","author":"Haut","year":"2019","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_54","first-page":"5615313","article-title":"Contextual transformation network for lightweight remote-sensing image super-resolution","volume":"60","author":"Wang","year":"2021","journal-title":"IEEE Trans. Geosci. Remote. Sens."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/11\/1895\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:48:23Z","timestamp":1760107703000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/11\/1895"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,24]]},"references-count":54,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2024,6]]}},"alternative-id":["rs16111895"],"URL":"https:\/\/doi.org\/10.3390\/rs16111895","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,24]]}}}