{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T01:27:41Z","timestamp":1760232461932,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2022,11,1]],"date-time":"2022-11-01T00:00:00Z","timestamp":1667260800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61502429","61672337","61972357","LY18F020012","LY17F020011","2019C03135"],"award-info":[{"award-number":["61502429","61672337","61972357","LY18F020012","LY17F020011","2019C03135"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Zhejiang Provincial Natural Science Foundation of China","award":["61502429","61672337","61972357","LY18F020012","LY17F020011","2019C03135"],"award-info":[{"award-number":["61502429","61672337","61972357","LY18F020012","LY17F020011","2019C03135"]}]},{"name":"Zhejiang Key R & D Program","award":["61502429","61672337","61972357","LY18F020012","LY17F020011","2019C03135"],"award-info":[{"award-number":["61502429","61672337","61972357","LY18F020012","LY17F020011","2019C03135"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>The success of deep learning and the segmentation of remote sensing images (RSIs) has improved semantic segmentation in recent years. However, existing RSI segmentation methods have two inherent problems: (1) detecting objects of various scales in RSIs of complex scenes is challenging, and (2) feature reconstruction for accurate segmentation is difficult. To solve these problems, we propose a deep-separation-guided progressive reconstruction network that achieves accurate RSI segmentation. 
First, we design a decoder comprising progressive reconstruction blocks capturing detailed features at various resolutions through multi-scale features obtained from various receptive fields to preserve accuracy during reconstruction. Subsequently, we propose a deep separation module that distinguishes various classes based on semantic features to use deep features to detect objects of different scales. Moreover, adjacent middle features are complemented during decoding to improve the segmentation performance. Extensive experimental results on two optical RSI datasets show that the proposed network outperforms 11 state-of-the-art methods.<\/jats:p>","DOI":"10.3390\/rs14215510","type":"journal-article","created":{"date-parts":[[2022,11,2]],"date-time":"2022-11-02T03:36:44Z","timestamp":1667360204000},"page":"5510","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Deep-Separation Guided Progressive Reconstruction Network for Semantic Segmentation of Remote Sensing Images"],"prefix":"10.3390","volume":"14","author":[{"given":"Jiabao","family":"Ma","sequence":"first","affiliation":[{"name":"School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China"}]},{"given":"Wujie","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China"},{"name":"College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China"}]},{"given":"Xiaohong","family":"Qian","sequence":"additional","affiliation":[{"name":"School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China"}]},{"given":"Lu","family":"Yu","sequence":"additional","affiliation":[{"name":"College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 
310027, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,11,1]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Neupane, B., Horanont, T., and Aryal, J. (2021). Deep learning-based semantic segmentation of urban features in satellite images: A review and meta-analysis. Remote Sens., 13.","DOI":"10.3390\/rs13040808"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"7790","DOI":"10.1109\/TIP.2021.3109518","article-title":"GMNet: Graded-feature multilabel-Learning network for RGB-Thermal urban scene semantic segmentation","volume":"30","author":"Zhou","year":"2021","journal-title":"IEEE Trans. Image Process."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Chen, Z., Li, D., Fan, W., Guan, H., Wang, C., and Li, J. (2021). Self-attention in reconstruction bias U-Net for semantic segmentation of building rooftops in optical remote sensing images. Remote Sens., 13.","DOI":"10.3390\/rs13132524"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wang, L., Li, R., Wang, D., Duan, C., Wang, T., and Meng, X. (2021). Transformer meets convolution: A bilateral awareness network for semantic segmentation of very fine resolution urban scene images. Remote Sens., 13.","DOI":"10.3390\/rs13163065"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Zhou, W., Yang, E., Lei, J., Wan, J., and Yu, L. (2022). PGDENet: Progressive Guided Fusion and Depth Enhancement Network for RGB-D Indoor Scene Parsing. IEEE Trans. Multimed.","DOI":"10.1109\/TMM.2022.3161852"},{"key":"ref_6","unstructured":"Zhou, W., Liu, W., Lei, J., Luo, T., and Yu, L. (2021). Deep binocular fixation prediction using hierarchical multimodal fusion network. IEEE Trans. Cogn. Dev. 
Syst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"107766","DOI":"10.1016\/j.sigpro.2020.107766","article-title":"Multiscale multilevel context and multimodal fusion for RGB-D salient object detection","volume":"178","author":"Wu","year":"2021","journal-title":"Signal Process."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"666","DOI":"10.1109\/JSTSP.2022.3159032","article-title":"CIMFNet: Cross-Layer Interaction and Multiscale Fusion Network for Semantic Segmentation of High-Resolution Remote Sensing Images","volume":"16","author":"Zhou","year":"2022","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"461","DOI":"10.1109\/TGRS.2017.2750220","article-title":"Deep multiple instance learning-based spatial\u2013spectral classification for PAN and MS imagery","volume":"56","author":"Liu","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Mou, L., Hua, Y., and Zhu, X.X. (2018, January 18\u201323). A relation-augmented fully convolutional network for semantic segmentation in aerial scenes. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2019.01270"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zhou, W., Dong, S., Lei, J., and Yu, L. (2022). MTANet: Multitask-Aware Network with Hierarchical Multimodal Fusion for RGB-T Urban Scene Understanding. IEEE Trans. Intell. Veh.","DOI":"10.1109\/TIV.2022.3164899"},{"key":"ref_12","unstructured":"Zhou, W., Guo, Q., Lei, J., Yu, L., and Hwang, J.-N. (2021). IRFR-Net: Interactive recursive feature-reshaping network for detecting salient objects in RGB-D images. IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). 
Fully convolutional networks for semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation","volume":"39","author":"Vijay","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","unstructured":"Jiang, J., Zheng, L., Luo, F., and Zhang, Z. (2018). Rednet: Residual encoder-decoder network for indoor rgb-d semantic segmentation. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chen, X., Lin, K.Y., Wang, J., Wu, W., Qian, C., Li, H., and Zeng, G. (2020). Bi-directional cross-modality feature propagation with separation-and-aggregation gate for RGB-D semantic segmentation. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-030-58621-8_33"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"3051","DOI":"10.1007\/s11263-021-01515-2","article-title":"Bisenet v2: Bilateral network with guided aggregation for real-time semantic segmentation","volume":"129","author":"Yu","year":"2021","journal-title":"Int. 
J. Comput. Vis."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Xu, Z., Zhang, W., Zhang, T., and Li, J. (2020). HRCNet: High-resolution context extraction network for semantic segmentation of remote sensing images. Remote Sens., 13.","DOI":"10.3390\/rs13010071"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1109\/MIS.2020.2999462","article-title":"TSNet: Three-stream self-attention network for RGB-D indoor semantic segmentation","volume":"36","author":"Zhou","year":"2020","journal-title":"IEEE Intell. Syst."},{"key":"ref_22","unstructured":"Seichter, D., K\u00f6hler, M., Lewandowski, B., Wengefeld, T., and Gross, H.M. (June, January 30). Efficient rgb-d semantic segmentation for indoor scene analysis. Proceedings of the IEEE International Conference on Robotics and Automation, Xi\u2019an, China."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Hu, X., Yang, K., Fei, L., and Wang, K. (2019, January 22\u201325). Acnet: Attention based network to exploit complementary features for rgbd semantic segmentation. Proceedings of the IEEE International Conference on Image Processing, Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803025"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"2086","DOI":"10.1109\/TIP.2018.2794207","article-title":"Local and global feature learning for blind quality evaluation of screen content and natural scene images","volume":"27","author":"Zhou","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"677","DOI":"10.1109\/JSTSP.2022.3174338","article-title":"FRNet: Feature Reconstruction Network for RGB-D Indoor Scene Parsing","volume":"16","author":"Zhou","year":"2022","journal-title":"IEEE J. Sel. Top. 
Signal Process."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2526","DOI":"10.1109\/TMM.2021.3086618","article-title":"MFFENet: Multiscale feature fusion and enhancement network for RGB\u2013Thermal urban road scene parsing","volume":"24","author":"Zhou","year":"2022","journal-title":"IEEE Trans. Multimed."},{"key":"ref_27","first-page":"1","article-title":"A Gather-to-Guide Network for Remote Sensing Semantic Segmentation of RGB and Auxiliary Image","volume":"60","author":"Zheng","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"3463","DOI":"10.1109\/JSTARS.2022.3165005","article-title":"A Crossmodal Multiscale Fusion Network for Semantic Segmentation of Remote Sensing Data","volume":"15","author":"Ma","year":"2022","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_29","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1016\/j.neucom.2021.11.100","article-title":"HFNet: Hierarchical feedback network with multilevel atrous spatial pyramid pooling for RGB-D saliency detection","volume":"490","author":"Zhou","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"3641","DOI":"10.1109\/TSMC.2019.2957386","article-title":"Global and Local-Contrast Guides Content-Aware Fusion for RGB-D Saliency Prediction","volume":"51","author":"Zhou","year":"2021","journal-title":"IEEE Trans. Syst. Man Cybern. 
Syst."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1224","DOI":"10.1109\/TCSVT.2021.3077058","article-title":"ECFFNet: Effective and consistent feature fusion network for RGB-T salient object detection","volume":"32","author":"Zhou","year":"2022","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Li, G., Liu, Z., Zeng, D., Lin, W., and Ling, H. (2022). Adjacent context coordination network for salient object detection in optical remote sensing images. IEEE Trans. Cybern.","DOI":"10.1109\/TGRS.2021.3131221"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"160107","DOI":"10.1007\/s11432-020-3337-9","article-title":"RLLNet: A lightweight remaking learning network for saliency redetection on RGB-D images","volume":"65","author":"Zhou","year":"2022","journal-title":"Sci. China Inf. Sci."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"105510","DOI":"10.1016\/j.engappai.2022.105510","article-title":"Global contextually guided lightweight network for RGB-thermal urban scene understanding","volume":"117","author":"Gong","year":"2023","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"3388","DOI":"10.1109\/TMM.2020.3025166","article-title":"Salient object detection in stereoscopic 3D images using a deep convolutional residual autoencoder","volume":"23","author":"Zhou","year":"2021","journal-title":"IEEE Trans. 
Multimed."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"2192","DOI":"10.1109\/TMM.2021.3077767","article-title":"CCAFNet: Crossflow and cross-scale adaptive fusion network for detecting salient objects in RGB-D images","volume":"24","author":"Zhou","year":"2022","journal-title":"IEEE Trans. Multimed."},{"key":"ref_40","unstructured":"International Society for Photogrammetry and Remote Sensing (2020, January 01). 2D Semantic Labeling Contest-Potsdam. Available online: https:\/\/www.isprs.org\/education\/benchmarks\/UrbanSemLab\/2d-sem-label-potsdam.aspx."},{"key":"ref_41","unstructured":"International Society for Photogrammetry and Remote Sensing (2020, January 01). 2D Semantic Labeling Contest-Vaihingen. Available online: https:\/\/www.isprs.org\/education\/benchmarks\/UrbanSemLab\/2d-sem-label-vaihingen.aspx."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_43","first-page":"1","article-title":"A novel transformer based semantic segmentation scheme for fine-resolution remote sensing images","volume":"19","author":"Wang","year":"2022","journal-title":"IEEE Geosci. Remote Sens. 
Lett."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/21\/5510\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:09:01Z","timestamp":1760144941000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/21\/5510"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,1]]},"references-count":43,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["rs14215510"],"URL":"https:\/\/doi.org\/10.3390\/rs14215510","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2022,11,1]]}}}