{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T22:01:07Z","timestamp":1773180067826,"version":"3.50.1"},"reference-count":48,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2022,9,16]],"date-time":"2022-09-16T00:00:00Z","timestamp":1663286400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61971426"],"award-info":[{"award-number":["61971426"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>SAR-optical images from different sensors can provide consistent information for scene classification. However, the utilization of unlabeled SAR-optical images in deep learning-based remote sensing image interpretation remains an open issue. In recent years, contrastive self-supervised learning (CSSL) methods have shown great potential for obtaining meaningful feature representations from massive amounts of unlabeled data. This paper investigates the effectiveness of CSSL-based pretraining models for SAR-optical remote-sensing classification. Firstly, we analyze the contrastive strategies of single-source and multi-source SAR-optical data augmentation under different CSSL architectures. We find that the CSSL framework without explicit negative sample selection naturally fits the multi-source learning problem. Secondly, we find that the registered SAR-optical images can guide the Siamese self-supervised network without negative samples to learn shared features, which is also the reason why the CSSL framework outperforms the CSSL framework with negative samples. 
Finally, we apply the CSSL-pretrained network without negative samples, which learns the shared features of SAR-optical images, to the downstream domain adaptation task of transferring from optical to SAR images. We find that the choice of pretrained network is important for downstream tasks.<\/jats:p>","DOI":"10.3390\/rs14184632","type":"journal-article","created":{"date-parts":[[2022,9,19]],"date-time":"2022-09-19T04:49:22Z","timestamp":1663562962000},"page":"4632","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":17,"title":["Multi-Source Remote Sensing Pretraining Based on Contrastive Self-Supervised Learning"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2838-1306","authenticated-orcid":false,"given":"Chenfang","family":"Liu","sequence":"first","affiliation":[{"name":"State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Hao","family":"Sun","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Yanjie","family":"Xu","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, National University of Defense Technology, Changsha 410073, China"}]},{"given":"Gangyao","family":"Kuang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, National University of Defense Technology, Changsha 410073, 
China"}]}],"member":"1968","published-online":{"date-parts":[[2022,9,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3185","DOI":"10.1109\/JSTARS.2021.3063849","article-title":"Global land-cover mapping with weak supervision: Outcome of the 2020 IEEE GRSS data fusion contest","volume":"14","author":"Robinson","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2207","DOI":"10.1109\/JPROC.2016.2598228","article-title":"Big data for remote sensing: Challenges and opportunities","volume":"104","author":"Chi","year":"2016","journal-title":"Proc. IEEE"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/MGRS.2018.2890023","article-title":"Multisource and multitemporal data fusion in remote sensing: A comprehensive review of the state of the art","volume":"7","author":"Ghamisi","year":"2019","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1011","DOI":"10.1109\/JSTARS.2020.2975252","article-title":"Multimodal Bilinear Fusion Network With Second-Order Attention-Based Channel Selection for Land Cover Classification","volume":"13","author":"Li","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"7708","DOI":"10.1109\/TGRS.2014.2317499","article-title":"Semisupervised Manifold Alignment of Multimodal Remote Sensing Images","volume":"52","author":"Tuia","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Penatti, O., Nogueira, K., and dos Santos, J.A. (2015, January 7\u201312). Do deep features generalize from everyday objects to remote sensing and aerial scenes domains?. 
Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Boston, MA, USA.","DOI":"10.1109\/CVPRW.2015.7301382"},{"key":"ref_7","first-page":"76","article-title":"Fusion of GF-3 SAR and optical images based on the nonsubsampled contourlet transform","volume":"38","author":"Yi","year":"2018","journal-title":"Acta Opt. Sin."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Feng, Q., Yang, J., Zhu, D., Liu, J., Guo, H., Bayartungalag, B., and Li, B. (2019). Integrating Multitemporal Sentinel-1\/2 Data for Coastal Land Cover Classification Using a Multibranch Convolutional Neural Network: A Case of the Yellow River Delta. Remote Sens., 11.","DOI":"10.3390\/rs11091006"},{"key":"ref_9","first-page":"1","article-title":"Fully contextual network for hyperspectral scene parsing","volume":"60","author":"Wang","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Kim, S., Song, W.-J., and Kim, S.-H. (2018). Double Weight-Based SAR and Infrared Sensor Fusion for Automatic Ground Target Recognition with Deep Learning. Remote Sens., 10.","DOI":"10.3390\/rs10010072"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1109\/JPROC.2020.3004555","article-title":"A comprehensive survey on transfer learning","volume":"109","author":"Zhuang","year":"2020","journal-title":"Proc. IEEE"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1865","DOI":"10.1109\/JPROC.2017.2675998","article-title":"Remote sensing image scene classification: Benchmark and state of the art","volume":"105","author":"Cheng","year":"2017","journal-title":"Proc. IEEE"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhou, H.Y., Yu, S., Bian, C., Hu, Y., Ma, K., and Zheng, Y. (2020, January 4\u20138). Comparing to learn: Surpassing imagenet pretraining on radiographs by comparing image representations. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), Lima, Peru.","DOI":"10.1007\/978-3-030-59710-8_39"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Wang, D., Zhang, J., Du, B., Xia, G.S., and Tao, D. (2022). An Empirical Study of Remote Sensing Pretraining. arXiv.","DOI":"10.1109\/TGRS.2022.3176603"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Stojnic, V., and Risojevic, V. (2021, January 19\u201325). Self-supervised learning of remote sensing scene representations using contrastive multiview coding. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPRW53098.2021.00129"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1146\/annurev-vision-082114-035447","article-title":"Deep neural networks: A new framework for modelling biological vision and brain information processing","volume":"1","author":"Kriegeskorte","year":"2015","journal-title":"Annu. Rev. Vis. Sci."},{"key":"ref_19","unstructured":"Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A., and Brendel, W. (2018). ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. arXiv."},{"key":"ref_20","unstructured":"Albuquerque, I., Naik, N., Li, J., Keskar, N., and Socher, R. (2020). 
Improving out-of-distribution generalization via multi-task self-supervised pretraining. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Scheibenreif, L., Hanna, J., Mommert, M., and Borth, D. (2022, January 19\u201320). Self-Supervised Vision Transformers for Land-Cover Segmentation and Classification. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPRW56347.2022.00148"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Stojnic, V., and Risojevic, V. (2018, January 16\u201319). Evaluation of Split-Brain Autoencoders for High-Resolution Remote Sensing Scene Classification. Proceedings of the 2018 International Symposium ELMAR, Zadar, Croatia.","DOI":"10.23919\/ELMAR.2018.8534634"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1560","DOI":"10.1109\/JPROC.2015.2449668","article-title":"Multimodal classification of remote sensing images: A review and future directions","volume":"103","author":"Tuia","year":"2015","journal-title":"Proc. IEEE"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"7799","DOI":"10.1109\/JSTARS.2021.3099483","article-title":"An anchor-free detection method for ship targets in high-resolution SAR images","volume":"14","author":"Sun","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Goyal, P., Mahajan, D., Gupta, A., and Misra, I. (2019, January 20\u201326). Scaling and benchmarking self-supervised visual representation learning. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00649"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Liu, X., Zhang, F., Hou, Z., Mian, L., Wang, Z., Zhang, J., and Tang, J. (2021). Self-supervised learning: Generative or contrastive. IEEE Trans. Knowl. 
Data Eng.","DOI":"10.1109\/TKDE.2021.3090866"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Jaiswal, A., Babu, A.R., Zadeh, M.Z., Banerjee, D., and Makedon, F. (2020). A survey on contrastive self-supervised learning. Technologies, 9.","DOI":"10.3390\/technologies9010002"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Manas, O., Lacoste, A., Gir\u00f3-i-Nieto, X., Vazquez, D., and Rodriguez, P. (2021, January 10\u201317). Seasonal contrast: Unsupervised pre-training from uncurated remote sensing data. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00928"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"He, K., Fan, H., Wu, Y., Xie, S., and Girshick, R. (2020, January 13\u201319). Momentum contrast for unsupervised visual representation learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"ref_30","unstructured":"Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. (2020, January 13\u201319). A simple framework for contrastive learning of visual representations. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Tian, Y., Krishnan, D., and Isola, P. (2019). Contrastive Multiview Coding. arXiv.","DOI":"10.1007\/978-3-030-58621-8_45"},{"key":"ref_32","first-page":"21271","article-title":"Bootstrap your own latent-a new approach to self-supervised learning","volume":"33","author":"Grill","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Chen, X., and He, K. (2021, January 20\u201325). Exploring simple siamese representation learning. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01549"},{"key":"ref_34","unstructured":"Bachman, P., Hjelm, R.D., and Buchwalter, W. (2019, January 8\u201314). Learning representations by maximizing mutual information across views. Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_35","first-page":"2096-2030","article-title":"Domain-adversarial training of neural networks","volume":"17","author":"Ganin","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"3071","DOI":"10.1109\/TPAMI.2018.2868685","article-title":"Transferable representation learning with deep adaptation networks","volume":"41","author":"Long","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_37","unstructured":"Long, M., Cao, Z., Wang, J., and Jordan, M.I. (2018, January 3\u20138). Conditional adversarial domain adaptation. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"705","DOI":"10.5194\/isprs-annals-V-3-2022-705-2022","article-title":"Contrastive self-supervised data fusion for satellite imagery","volume":"3","author":"Scheibenreif","year":"2022","journal-title":"ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci."},{"key":"ref_39","unstructured":"Ioffe, S., and Szegedy, C. (2015, January 7\u20139). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the International Conference on Machine Learning (PMLR), Lille, France."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Bottou, L. (2010, January 22\u201327). Large-scale machine learning with stochastic gradient descent. 
Proceedings of the COMPSTAT\u20192010, Paris, France.","DOI":"10.1007\/978-3-7908-2604-3_16"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Deng, W., Zhao, L., Kuang, G., Hu, D., Pietik\u00e4inen, M., and Liu, L. (2021). Deep Ladder-Suppression Network for Unsupervised Domain Adaptation. IEEE Trans. Cybern., 1\u201315.","DOI":"10.1016\/j.patrec.2021.10.009"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1109\/MGRS.2020.2964708","article-title":"So2Sat LCZ42: A benchmark data set for the classification of global local climate zones [Software and Data Sets]","volume":"8","author":"Zhu","year":"2020","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Schmitt, M., Hughes, L.H., and Zhu, X.X. (2018). The SEN1-2 dataset for deep learning in SAR-optical data fusion. arXiv.","DOI":"10.5194\/isprs-annals-IV-1-141-2018"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Huang, M., Xu, Y., Qian, L., Shi, W., Zhang, Y., Bao, W., Wang, N., Liu, X.J., and Xiang, X. (2021). The QXS-SAROPT dataset for deep learning in SAR-optical data fusion. arXiv.","DOI":"10.34133\/2021\/9841456"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 2\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"3965","DOI":"10.1109\/TGRS.2017.2685945","article-title":"AID: A benchmark data set for performance evaluation of aerial scene classification","volume":"55","author":"Xia","year":"2017","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"187","DOI":"10.1109\/JSTARS.2019.2954850","article-title":"OpenSARUrban: A Sentinel-1 SAR image dataset for urban interpretation","volume":"13","author":"Zhao","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_48","first-page":"1","article-title":"Solo-learn: A Library of Self-supervised Methods for Visual Representation Learning","volume":"23","author":"Fini","year":"2022","journal-title":"J. Mach. Learn. Res."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/18\/4632\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:32:59Z","timestamp":1760142779000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/18\/4632"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,16]]},"references-count":48,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2022,9]]}},"alternative-id":["rs14184632"],"URL":"https:\/\/doi.org\/10.3390\/rs14184632","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,16]]}}}