{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T15:48:38Z","timestamp":1780501718801,"version":"3.54.1"},"reference-count":40,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,2,7]],"date-time":"2023-02-07T00:00:00Z","timestamp":1675728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"high-end foreign experts introduction program","award":["G2022012010L"],"award-info":[{"award-number":["G2022012010L"]}]},{"name":"Reserved Leaders of Heilongjiang Provincial Leading Talent Echelon","award":["G2022012010L"],"award-info":[{"award-number":["G2022012010L"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Deep-learning-based multi-sensor hyperspectral image classification algorithms can automatically acquire the advanced features of multiple sensor images, enabling the classification model to better characterize the data and improve the classification accuracy. However, the currently available classification methods for feature representation in multi-sensor remote sensing data in their respective domains do not focus on the existence of bottlenecks in heterogeneous feature fusion due to different sensors. This problem directly limits the final collaborative classification performance. In this paper, to address the bottleneck problem of joint classification due to the difference in heterogeneous features, we innovatively combine self-supervised comparative learning while designing a robust and discriminative feature extraction network for multi-sensor data, using spectral\u2013spatial information from hyperspectral images (HSIs) and elevation information from LiDAR. The advantages of multi-sensor data are realized. 
The dual encoders of the hyperspectral encoder by the ConvNeXt network (ConvNeXt-HSI) and the LiDAR encoder by Octave Convolution (OctaveConv-LiDAR) are also used. The adequate feature representation of spectral\u2013spatial features and depth information obtained from different sensors is performed for the joint classification of hyperspectral images and LiDAR data. The multi-sensor joint classification performance of both HSI and LiDAR sensors is greatly improved. Finally, on the Houston2013 dataset and the Trento dataset, we demonstrate through a series of experiments that the dual-encoder model for hyperspectral and LiDAR joint classification via contrastive learning achieves state-of-the-art classification performance.<\/jats:p>","DOI":"10.3390\/rs15040924","type":"journal-article","created":{"date-parts":[[2023,2,8]],"date-time":"2023-02-08T05:37:31Z","timestamp":1675834651000},"page":"924","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["A Novel Dual-Encoder Model for Hyperspectral and LiDAR Joint Classification via Contrastive Learning"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2453-3691","authenticated-orcid":false,"given":"Haibin","family":"Wu","sequence":"first","affiliation":[{"name":"Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin 150080, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Shiyu","family":"Dai","sequence":"additional","affiliation":[{"name":"Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin 150080, China"},{"name":"Artificial Intelligence R&D Center, Nuctech Jiang Su Company Limited, Changzhou 213000, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Chengyang","family":"Liu","sequence":"additional","affiliation":[{"name":"Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin 150080, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9118-230X","authenticated-orcid":false,"given":"Aili","family":"Wang","sequence":"additional","affiliation":[{"name":"Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin 150080, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1016-1636","authenticated-orcid":false,"given":"Yuji","family":"Iwahori","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Chubu University, Aichi 487-8501, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"S123","DOI":"10.1016\/j.rse.2009.03.001","article-title":"Earth system science related imaging spectroscopy\u2014An assessment","volume":"113","author":"Schaepman","year":"2009","journal-title":"Remote Sens. Environ."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"4349","DOI":"10.1109\/TGRS.2018.2890705","article-title":"CoSpace: Common subspace learning from hyperspectral-multispectral correspondences","volume":"57","author":"Hong","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Shah, C., Du, Q., and Xu, Y. (2022). Enhanced TabNet: Attentive Interpretable Tabular Learning for Hyperspectral Image Classification. Remote Sens., 14.","DOI":"10.3390\/rs14030716"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Zhao, R., and Du, S. (2022). 
An Encoder\u2013Decoder with a Residual Network for Fusing Hyperspectral and Panchromatic Remote Sensing Images. Remote Sens., 14.","DOI":"10.3390\/rs14091981"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1087","DOI":"10.1109\/36.312897","article-title":"The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon","volume":"32","author":"Shahshahani","year":"1994","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"258","DOI":"10.1016\/j.rse.2012.03.013","article-title":"Tree species classification in the Southern Alps based on the fusion of very high geometrical resolution multispectral\/hyperspectral images and LiDAR data","volume":"123","author":"Dalponte","year":"2012","journal-title":"Remote Sens. Environ."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Kuras, A., Brell, M., Rizzi, J., and Burud, I. (2021). Hyperspectral and lidar data applied to the urban land cover machine learning and neural-network-based classification: A review. Remote Sens., 13.","DOI":"10.3390\/rs13173393"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"112322","DOI":"10.1016\/j.rse.2021.112322","article-title":"Tree species classification from airborne hyperspectral and LiDAR data using 3D convolutional neural networks","volume":"256","author":"Kivinen","year":"2021","journal-title":"Remote Sens. Environ."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1416","DOI":"10.1109\/TGRS.2008.916480","article-title":"Fusion of hyperspectral and LIDAR remote sensing data for classification of complex forest areas","volume":"46","author":"Dalponte","year":"2008","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"247","DOI":"10.1016\/j.isprsjprs.2010.11.001","article-title":"Support vector machines in remote sensing: A review","volume":"66","author":"Mountrakis","year":"2011","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/MGRS.2018.2890023","article-title":"Multisource and multitemporal data fusion in remote sensing: A comprehensive review of the state of the art","volume":"7","author":"Ghamisi","year":"2019","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2405","DOI":"10.1109\/JSTARS.2014.2305441","article-title":"Hyperspectral and LiDAR data fusion: Outcome of the 2013 GRSS data fusion contest","volume":"7","author":"Debes","year":"2014","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5377","DOI":"10.1109\/TGRS.2020.2964679","article-title":"Transfer learning for SAR image classification via deep joint distribution adaptation networks","volume":"58","author":"Geng","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Feng, Q., Zhu, D., Yang, J., and Li, B. (2019). Multisource hyperspectral and LiDAR data fusion for urban land-use mapping based on a modified two-branch convolutional neural network. ISPRS Int. J. Geo-Inf., 8.","DOI":"10.3390\/ijgi8010028"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"676","DOI":"10.1109\/JPROC.2012.2229082","article-title":"Feature mining for hyperspectral image classification","volume":"101","author":"Jia","year":"2013","journal-title":"Proc. 
IEEE"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"652","DOI":"10.1109\/TPAMI.2019.2938758","article-title":"Res2net: A new multi-scale backbone architecture","volume":"43","author":"Gao","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_18","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A.A. (2017, January 4\u20139). Inception-v4, inception-resnet and the impact of residual connections on learning. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Doll\u00e1r, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated residual transformations for deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1452","DOI":"10.1109\/TPAMI.2017.2723009","article-title":"Places: A 10 million image database for scene recognition","volume":"40","author":"Zhou","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_26","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Advances in Neural Information Processing Systems 28, MIT Press."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Bazi, Y., Bashmal, L., Rahhal, M.M.A., Dayil, R.A., and Ajlan, N.A. (2021). Vision transformers for remote sensing image classification. Remote Sens., 13.","DOI":"10.3390\/rs13030516"},{"key":"ref_28","first-page":"1","article-title":"Semi-Supervised Remote-Sensing Image Scene Classification Using Representation Consistency Siamese Network","volume":"60","author":"Miao","year":"2022","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"937","DOI":"10.1109\/TGRS.2017.2756851","article-title":"Multisource remote sensing data classification based on convolutional neural network","volume":"56","author":"Xu","year":"2017","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_30","first-page":"5500205","article-title":"Deep encoder-decoder networks for classification of hyperspectral and LiDAR data","volume":"19","author":"Hong","year":"2020","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"4340","DOI":"10.1109\/TGRS.2020.3016820","article-title":"More diverse means better: Multimodal deep learning meets remote-sensing imagery classification","volume":"59","author":"Hong","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_32","unstructured":"Bachman, P., Hjelm, R.D., and Buchwalter, W. (2019). Advances in Neural Information Processing Systems 32, MIT Press."},{"key":"ref_33","unstructured":"Oord, A.v.d., Li, Y., and Vinyals, O. (2018). Representation learning with contrastive predictive coding. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"He, K., Fan, H., Wu, Y., Xie, S., and Girshick, R. (2020, January 13\u201319). Momentum contrast for unsupervised visual representation learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"ref_35","unstructured":"Chen, T., Kornblith, S., Norouzi, M., and Hinton, G. (2020, January 12\u201318). A simple framework for contrastive learning of visual representations. Proceedings of the International Conference on Machine Learning, Vienna, Austria."},{"key":"ref_36","unstructured":"Chen, X., Fan, H., Girshick, R., and He, K. (2020). Improved baselines with momentum contrastive learning. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Liu, Z., Mao, H., Wu, C.-Y., Feichtenhofer, C., Darrell, T., and Xie, S. (2022, January 19\u201324). A convnet for the 2020s. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01167"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., and Cottrell, G. (2018, January 12\u201315). Understanding convolution for semantic segmentation. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00163"},{"key":"ref_40","unstructured":"Chen, Y., Fan, H., Xu, B., Yan, Z., Kalantidis, Y., Rohrbach, M., Yan, S., and Feng, J. (2019, October 27\u2013November 2). Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/4\/924\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:27:27Z","timestamp":1760120847000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/4\/924"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,7]]},"references-count":40,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["rs15040924"],"URL":"https:\/\/doi.org\/10.3390\/rs15040924","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,7]]}}}