{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,5]],"date-time":"2026-03-05T16:04:49Z","timestamp":1772726689857,"version":"3.50.1"},"reference-count":37,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2023,1,7]],"date-time":"2023-01-07T00:00:00Z","timestamp":1673049600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"publisher","award":["31971789"],"award-info":[{"award-number":["31971789"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"publisher","award":["2022AH010005"],"award-info":[{"award-number":["2022AH010005"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"publisher","award":["2017YFB050420"],"award-info":[{"award-number":["2017YFB050420"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Excellent Scientific Research and Innovation Team","award":["31971789"],"award-info":[{"award-number":["31971789"]}]},{"name":"Excellent Scientific Research and Innovation Team","award":["2022AH010005"],"award-info":[{"award-number":["2022AH010005"]}]},{"name":"Excellent Scientific Research and Innovation Team","award":["2017YFB050420"],"award-info":[{"award-number":["2017YFB050420"]}]},{"name":"National Key Research and Development Project","award":["31971789"],"award-info":[{"award-number":["31971789"]}]},{"name":"National Key Research and Development Project","award":["2022AH010005"],"award-info":[{"award-number":["2022AH010005"]}]},{"name":"National Key Research and Development 
Project","award":["2017YFB050420"],"award-info":[{"award-number":["2017YFB050420"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Remote image semantic segmentation technology is one of the core research elements in the field of computer vision and has a wide range of applications in production life. Most remote image semantic segmentation methods are based on CNN. Recently, Transformer provided a view of long-distance dependencies in images. In this paper, we propose RCCT-ASPPNet, which includes the dual-encoder structure of Residual Multiscale Channel Cross-Fusion with Transformer (RCCT) and Atrous Spatial Pyramid Pooling (ASPP). RCCT uses Transformer to cross fuse global multiscale semantic information; the residual structure is then used to connect the inputs and outputs. ASPP based on CNN extracts contextual information of high-level semantics from different perspectives and uses Convolutional Block Attention Module (CBAM) to extract spatial and channel information, which will further improve the model segmentation ability. 
The experimental results show that the mIoU of our method is 94.14% and 61.30% on the datasets Farmland and AeroScapes, respectively, and that the mPA is 97.12% and 84.36%, respectively, both outperforming DeepLabV3+ and UCTransNet.<\/jats:p>","DOI":"10.3390\/rs15020379","type":"journal-article","created":{"date-parts":[[2023,1,9]],"date-time":"2023-01-09T04:47:08Z","timestamp":1673239628000},"page":"379","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":31,"title":["RCCT-ASPPNet: Dual-Encoder Remote Image Segmentation Based on Transformer and ASPP"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2812-8917","authenticated-orcid":false,"given":"Yazhou","family":"Li","sequence":"first","affiliation":[{"name":"National Engineering Research Center for Analysis and Application of Agro-Ecological Big Data, Anhui University, Hefei 230601, China"},{"name":"School of Internet, Anhui University, Hefei 230039, China"}]},{"given":"Zhiyou","family":"Cheng","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Analysis and Application of Agro-Ecological Big Data, Anhui University, Hefei 230601, China"},{"name":"School of Internet, Anhui University, Hefei 230039, China"}]},{"given":"Chuanjian","family":"Wang","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Analysis and Application of Agro-Ecological Big Data, Anhui University, Hefei 230601, China"},{"name":"School of Internet, Anhui University, Hefei 230039, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8352-7689","authenticated-orcid":false,"given":"Jinling","family":"Zhao","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Analysis and Application of Agro-Ecological Big Data, Anhui University, Hefei 230601, China"},{"name":"School of Internet, Anhui University, Hefei 230039, 
China"}]},{"given":"Linsheng","family":"Huang","sequence":"additional","affiliation":[{"name":"National Engineering Research Center for Analysis and Application of Agro-Ecological Big Data, Anhui University, Hefei 230601, China"},{"name":"School of Internet, Anhui University, Hefei 230039, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,1,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., and Garcia-Rodriguez, J. (2017). A Review on Deep Learning Techniques Applied to Semantic Segmentation. arXiv.","DOI":"10.1016\/j.asoc.2018.05.018"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Liu, H., Ye, Q., Wang, H., Chen, L., and Yang, J. (2019). A Precise and Robust Segmentation-Based Lidar Localization System for Automated Urban Driving. Remote. Sens., 11.","DOI":"10.3390\/rs11111348"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Lai, C., Yang, Q., Guo, Y., Bai, F., and Sun, H. (2022). Semantic Segmentation of Panoramic Images for Real-Time Parking Slot Detection. Remote Sens., 14.","DOI":"10.3390\/rs14163874"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Mekyska, J., Espinosa-Duro, V., and Faundez-Zanuy, M. (2010, January 5\u20138). Face segmentation: A comparison between visible and thermal images. Proceedings of the 44th Annual 2010 IEEE International Carnahan Conference on Security Technology, San Jose, CA, USA.","DOI":"10.1109\/CCST.2010.5678709"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"58683","DOI":"10.1109\/ACCESS.2020.2982970","article-title":"Face Segmentation: A Journey from Classical to Deep Learning Paradigm, Approaches, Trends, and Directions","volume":"8","author":"Khan","year":"2020","journal-title":"IEEE Access"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Masi, I., Mathai, J., and AbdAlmageed, W. (2020, January 13\u201319). 
Towards Learning Structure via Consensus for Face Segmentation and Parsing. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00555"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, Y., Dong, M., Shen, J., Wu, Y., Cheng, S., and Pantic, M. (2020, January 13\u201319). Dynamic Face Video Segmentation via Reinforcement Learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00699"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Abdelrahman, A., and Viriri, S. (2022). Kidney Tumor Semantic Segmentation Using Deep Learning: A Survey of State-of-the-Art. J. Imaging, 8.","DOI":"10.3390\/jimaging8030055"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Arbabshirani, M.R., Dallal, A.H., Agarwal, C., Patel, A., and Moore, G. (2017, January 11\u201316). Accurate Segmentation of Lung Fields on Chest Radiographs Using Deep Convolutional Networks. Proceedings of the Medical Imaging: Image Processing, Orlando, FL, USA.","DOI":"10.1117\/12.2254526"},{"key":"ref_10","unstructured":"Dai, P., Dong, L., Zhang, R., Zhu, H., Wu, J., and Yuan, K. (2022). Soft-CP: A Credible and Effective Data Augmentation for Semantic Segmentation of Medical Lesions. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Wang, J., and Valaee, S. (2019, January 9\u201313). From Whole to Parts: Medical Imaging Semantic Segmentation with Very Imbalanced Data. Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA.","DOI":"10.1109\/GLOBECOM38437.2019.9014112"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Neupane, B., Horanont, T., and Aryal, J. (2021). Deep Learning-Based Semantic Segmentation of Urban Features in Satellite Images: A Review and Meta-Analysis. Remote. 
Sens., 13.","DOI":"10.3390\/rs13040808"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Peng, B., Zhang, W., Hu, Y., Chu, Q., and Li, Q. (2022). LRFFNet: Large Receptive Field Feature Fusion Network for Semantic Segmentation of SAR Images in Building Areas. Remote. Sens., 14.","DOI":"10.3390\/rs14246291"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Li, Y., Si, Y., Tong, Z., He, L., Zhang, J., Luo, S., and Gong, Y. (2022). MQANet: Multi-Task Quadruple Attention Network of Multi-Object Semantic Segmentation from Remote Sensing Images. Remote. Sens., 14.","DOI":"10.3390\/rs14246256"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1109\/TPAMI.2016.2572683","article-title":"Fully convolutional networks for semantic segmentation","volume":"39","author":"Shelhamer","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","first-page":"234","article-title":"U-Net: Convolutional Networks for Biomedical Image Segmentation","volume":"Volume 9351","author":"Navab","year":"2015","journal-title":"Medical Image Computing and Computer-Assisted Intervention\u2014MICCAI 2015"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"Segnet: A deep convolutional encoder-decoder architecture for image segmentation","volume":"39","author":"Badrinarayanan","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_18","unstructured":"Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2016). Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. 
Mach. Intell."},{"key":"ref_20","unstructured":"Chen, L.-C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"833","DOI":"10.1007\/978-3-030-01234-2_49","article-title":"Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation","volume":"Volume 11211","author":"Ferrari","year":"2018","journal-title":"Computer Vision\u2013ECCV 2018"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid Scene Parsing Network. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., and Torr, P.H. (2021, January 20\u201325). Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_24","first-page":"2441","article-title":"UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-Wise Perspective with Transformer","volume":"36","author":"Wang","year":"2022","journal-title":"Proc. Conf. AAAI Artif. Intell."},{"key":"ref_25","unstructured":"Dumoulin, V., and Visin, F. (2018). A Guide to Convolution Arithmetic for Deep Learning. arXiv."},{"key":"ref_26","unstructured":"Simonyan, K., and Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv."},{"key":"ref_27","unstructured":"Yu, F., and Koltun, V. (2016). Multi-Scale Context Aggregation by Dilated Convolutions. 
arXiv."},{"key":"ref_28","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2017). Attention Is All You Need. Adv. Neural Inf. Process. Syst. arXiv."},{"key":"ref_29","unstructured":"Cao, H., Wang, Y., Chen, J., Jiang, D., Zhang, X., Tian, Q., and Wang, M. (2021). Swin-Unet: Unet-like Pure Transformer for Medical Image Segmentation. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep Residual Learning for Image Recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional block attention module. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Nigam, I., Huang, C., and Ramanan, D. (2018, January 12\u201315). Ensemble Knowledge Transfer for Semantic Segmentation. Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00168"},{"key":"ref_33","unstructured":"Liu, W., Rabinovich, A., and Berg, A.C. (2015). ParseNet: Looking Wider to See Better. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"400","DOI":"10.1214\/aoms\/1177729586","article-title":"A Stochastic Approximation Method","volume":"22","author":"Robbins","year":"1951","journal-title":"Ann. Math. Stat."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Li, F.-F. (2009, January 20\u201325). ImageNet: A Large-Scale Hierarchical Image Database. 
Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Berman, M., Triki, A.R., and Blaschko, M.B. (2018, January 18\u201322). The Lov\u00e1sz-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-over-Union Measure in Neural Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00464"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1111\/j.1469-8137.1912.tb05611.x","article-title":"The Distribution of The Flora in The Alpine Zone","volume":"11","author":"Jaccard","year":"1912","journal-title":"New Phytol."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/2\/379\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:03:05Z","timestamp":1760119385000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/2\/379"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,7]]},"references-count":37,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2023,1]]}},"alternative-id":["rs15020379"],"URL":"https:\/\/doi.org\/10.3390\/rs15020379","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,1,7]]}}}