{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T16:03:01Z","timestamp":1776441781476,"version":"3.51.2"},"reference-count":38,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2022,1,27]],"date-time":"2022-01-27T00:00:00Z","timestamp":1643241600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61771470"],"award-info":[{"award-number":["61771470"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Strategic Priority Research Program of the Chinese Academy of Sciences","award":["XDA19010401, XDA19060103"],"award-info":[{"award-number":["XDA19010401, XDA19060103"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>The majority of existing deep learning pan-sharpening methods often use simulated degraded reference data due to the missing of real fusion labels which affects the fusion performance. The normally used convolutional neural network (CNN) can only extract the local detail information well which may cause the loss of important global contextual characteristics with long-range dependencies in fusion. To address these issues and to fuse spatial and spectral information with high quality information from the original panchromatic (PAN) and multispectral (MS) images, this paper presents a novel pan-sharpening method by designing the CNN+ pyramid Transformer network with no-reference loss (CPT-noRef). 
Specifically, the Transformer serves as the main fusion architecture and supplies the global features, local features from a shallow CNN are combined with them, and multi-scale features from the pyramid structure added to the Transformer encoder are learned simultaneously. Our loss function directly learns the spatial information extracted from the PAN image and the spectral information from the MS image, which is consistent with the theory of pan-sharpening and lets the network control the spatial and spectral losses simultaneously. Both training and testing are based on real data, so simulated degraded reference data is no longer needed, unlike most existing deep learning fusion methods. The proposed CPT-noRef network can effectively alleviate the Transformer's need for a huge amount of training data and extract abundant image features for fusion. To assess the effectiveness and universality of the fusion model, we trained and evaluated it on WorldView-2 (WV-2) and Gaofen-1 (GF-1) data and compared it with other typical deep learning pan-sharpening methods in terms of both subjective visual effect and objective index evaluation. The results show that the proposed CPT-noRef network outperforms existing state-of-the-art methods in both qualitative and quantitative evaluations. In addition, our method shows the strongest generalization capability when Pleiades and WV-2 images are tested on the network trained with GF-1 data. 
The no-reference loss function proposed in this paper can greatly enhance the spatial and spectral information of the fusion image with good performance and robustness.<\/jats:p>","DOI":"10.3390\/rs14030624","type":"journal-article","created":{"date-parts":[[2022,1,27]],"date-time":"2022-01-27T22:01:57Z","timestamp":1643320917000},"page":"624","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":30,"title":["Pan-Sharpening Based on CNN+ Pyramid Transformer by Using No-Reference Loss"],"prefix":"10.3390","volume":"14","author":[{"given":"Sijia","family":"Li","sequence":"first","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"},{"name":"School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5954-2179","authenticated-orcid":false,"given":"Qing","family":"Guo","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"}]},{"given":"An","family":"Li","sequence":"additional","affiliation":[{"name":"Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,1,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1016\/j.isprsjprs.2020.10.010","article-title":"Understanding the synergies of deep learning and data fusion of multispectral and panchromatic high resolution commercial satellite imagery for automated ice-wedge polygon detection","volume":"170","author":"Witharana","year":"2020","journal-title":"J. Photogramm. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Siok, K., Ewiak, I., and Jenerowicz, A. (2020). 
Multi-Sensor Fusion: A Simulation Approach to Pansharpening Aerial and Satellite Images. Sensors, 20.","DOI":"10.3390\/s20247100"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Zhang, W., Liljedahl, A.K., Kanevskiy, M., Epstein, H.E., Jones, B.M., Jorgenson, M.T., and Kent, K. (2020). Transferability of the Deep Learning Mask R-CNN Model for Automated Mapping of Ice-Wedge Polygons in High-Resolution Satellite and UAV Images. Remote Sens., 12.","DOI":"10.3390\/rs12071085"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Gkioxari, G., Girshick, R., and Malik, J. (2015, January 7\u201313). Actions and Attributes from Wholes and Parts. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.284"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"318","DOI":"10.1109\/TPAMI.2018.2858826","article-title":"Focal Loss for Dense Object Detection","volume":"42","author":"Lin","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_7","unstructured":"Dai, J.F., Li, Y., He, K.M., and Sun, J. (2016, January 5\u201310). R-FCN: Object Detection via Region-based Fully Convolutional Networks. Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Masi, G., Cozzolino, D., Verdoliva, L., and Scarpa, G. (2016). Pansharpening by Convolutional Neural Networks. 
Remote Sens., 8.","DOI":"10.3390\/rs8070594"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1795","DOI":"10.1109\/LGRS.2017.2736020","article-title":"Boosting the Accuracy of Multispectral Image Pansharpening by Learning a Deep Residual Network","volume":"14","author":"Wei","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Rao, Y., He, L., and Zhu, J. (2017, January 18\u201321). A residual convolutional neural network for pan-shaprening. Proceedings of the 2017 International Workshop on Remote Sensing with Intelligent Processing (RSIP), Shanghai, China.","DOI":"10.1109\/RSIP.2017.7958807"},{"key":"ref_11","first-page":"1244","article-title":"Pan-sharpening by deep recursive residual network","volume":"25","author":"Wang","year":"2021","journal-title":"J. Remote Sens."},{"key":"ref_12","first-page":"1270","article-title":"Pan-sharpening by residual network with dense convolution for remote sensing images","volume":"25","author":"Chen","year":"2021","journal-title":"J. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Wu, Y., Huang, M., Li, Y., Feng, S., and Wu, D. (2021). A Distributed Fusion Framework of Multispectral and Panchromatic Images Based on Residual Network. Remote Sens., 13.","DOI":"10.3390\/rs13132556"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Vitale, S., and Scarpa, G. (2020). A Detail-Preserving Cross-Scale Learning Strategy for CNN-Based Pansharpening. Remote Sens., 12.","DOI":"10.3390\/rs12030348"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Wang, W., Zhou, Z., Liu, H., and Xie, G. (2021). MSDRN: Pansharpening of Multispectral Images via Multi-Scale Deep Residual Network. Remote Sens., 13.","DOI":"10.3390\/rs13061200"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Naushad, R., Kaur, T., and Ghaderpour, E. (2021). 
Deep Transfer Learning for Land Use and Land Cover Classification: A Comparative Study. Sensors, 21.","DOI":"10.3390\/s21238083"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Xu, H., Le, Z., Huang, J., and Ma, J. (2021). A Cross-Direction and Progressive Network for Pan-Sharpening. Remote Sens., 13.","DOI":"10.3390\/rs13153045"},{"key":"ref_18","unstructured":"Tuli, S., Dasgupta, I., Grant, E., and Griffiths, T.L. (2021). Are Convolutional Neural Networks or Transformers more like human vision?. arXiv."},{"key":"ref_19","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_20","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An Image is Worth 16 \u00d7 16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_21","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., and Torr, P.H. (2020). Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers. arXiv.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_23","unstructured":"Fu, Y., Xu, T., Wu, X., and Kittler, J. (2021). PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion. arXiv."},{"key":"ref_24","first-page":"691","article-title":"Fusion of satellite images of different spatial resolutions: Assessing the quality of resulting images","volume":"63","author":"Wald","year":"1997","journal-title":"Photogramm. Eng. 
Remote Sens."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"897","DOI":"10.1109\/JSTARS.2020.3038057","article-title":"Pan-Sharpening Based on Convolutional Neural Network by Using the Loss Function With No-Reference","volume":"14","author":"Xiong","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Li, Z., and Cheng, C. (2019). A CNN-Based Pan-Sharpening Method for Integrating Panchromatic and Multispectral Images Using Landsat 8. Remote Sens., 11.","DOI":"10.3390\/rs11222606"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"5443","DOI":"10.1109\/TGRS.2018.2817393","article-title":"Target-Adaptive CNN-Based Pansharpening","volume":"56","author":"Scarpa","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_28","first-page":"1","article-title":"Pan-Sharpening Based on Panchromatic Image Spectral Learning Using WorldView-2","volume":"19","author":"Xiong","year":"2021","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"115523","DOI":"10.1109\/ACCESS.2021.3104321","article-title":"Pan-Sharpening Based on Panchromatic Colorization Using WorldView-2","volume":"9","author":"Xiong","year":"2021","journal-title":"IEEE Access"},{"key":"ref_30","unstructured":"Podlubny, I. (1997). The Laplace transform method for linear differential equations of the fractional order. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"193","DOI":"10.14358\/PERS.74.2.193","article-title":"Multispectral and Panchromatic Data Fusion Assessment Without Reference","volume":"74","author":"Alparone","year":"2008","journal-title":"Photogramm. Eng. Remote Sens."},{"key":"ref_32","unstructured":"Kurt, K. (2020). ADAHESSIAN: An adaptive second order optimizer for machine learning. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wei, Y., and Yuan, Q. 
(2017, January 18\u201321). Deep residual learning for remote sensed imagery pansharpening. Proceedings of the 2017 International Workshop on Remote Sensing with Intelligent Processing (RSIP), Shanghai, China.","DOI":"10.1109\/RSIP.2017.7958794"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Yang, J., Fu, X., Hu, Y., Huang, Y., Ding, X., and Paisley, J. (2017, January 22\u201329). PanNet: A Deep Network Architecture for Pan-Sharpening. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.193"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.inffus.2019.07.010","article-title":"Remote sensing image fusion based on two-stream fusion network","volume":"55","author":"Liu","year":"2020","journal-title":"Inf. Fusion"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1016\/j.inffus.2020.04.006","article-title":"Pan-GAN: An unsupervised pan-sharpening method for remote sensing image fusion","volume":"62","author":"Ma","year":"2020","journal-title":"Inf. Fusion"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2565","DOI":"10.1109\/TGRS.2014.2361734","article-title":"A Critical Comparison Among Pansharpening Algorithms","volume":"53","author":"Vivone","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"3012","DOI":"10.1109\/TGRS.2007.904923","article-title":"Comparison of Pansharpening Algorithms: Outcome of the 2006 GRS-S Data-Fusion Contest","volume":"45","author":"Alparone","year":"2007","journal-title":"IEEE Trans. Geosci. 
Remote Sens."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/3\/624\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:09:19Z","timestamp":1760134159000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/3\/624"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,27]]},"references-count":38,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,2]]}},"alternative-id":["rs14030624"],"URL":"https:\/\/doi.org\/10.3390\/rs14030624","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,27]]}}}