{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,10]],"date-time":"2026-07-10T03:10:35Z","timestamp":1783653035762,"version":"3.55.0"},"reference-count":37,"publisher":"National Library of Serbia","issue":"3","license":[{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["ComSIS","COMPUT SCI INF SYST","COMPUT SCI INFORM SY","COMPUTER SCI INFORM","COMSIS J"],"published-print":{"date-parts":[[2025]]},"abstract":"<jats:p>Semantic segmentation of remote sensing images remains challenging due to complex object structures and varying scales. This paper proposes a novel hybrid segmentation model that combines Segformer for global context extraction with Dynamic Snake Convolution to better capture fine-grained, boundary-aware features. An auxiliary semantic branch is introduced to improve feature alignment across scales. Experiments on three benchmark datasets (LoveDA, Potsdam, and Vaihingen) demonstrate that the proposed approach achieves consistent improvements in mIoU over baseline models, particularly in segmenting irregular and linear structures. 
This framework offers a promising solution for high-resolution land cover mapping and urban scene understanding.<\/jats:p>","DOI":"10.2298\/csis250312054x","type":"journal-article","created":{"date-parts":[[2025,5,23]],"date-time":"2025-05-23T08:07:11Z","timestamp":1747987631000},"page":"991-1010","source":"Crossref","is-referenced-by-count":4,"title":["Boundary-aware semantic segmentation of remote sensing images via Segformer and Snake Convolution"],"prefix":"10.2298","volume":"22","author":[{"given":"Yanting","family":"Xia","sequence":"first","affiliation":[{"name":"Geely University of China, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Lin","family":"Zhang","sequence":"additional","affiliation":[{"name":"Geely University of China, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ting","family":"Guo","sequence":"additional","affiliation":[{"name":"Geely University of China, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Qi","family":"Jin","sequence":"additional","affiliation":[{"name":"Geely University of China, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1078","reference":[{"key":"ref1","doi-asserted-by":"crossref","unstructured":"A., R.M., Y, W.: Optimizing intersection-over-union in deep neural networks for image segmentation. In: International Symposium on Visual Computing. pp. 234-244. Springer, Cham (2016)","DOI":"10.1007\/978-3-319-50835-1_22"},{"key":"ref2","doi-asserted-by":"crossref","unstructured":"A, S., Y, K.: Semantic segmentation of remote-sensing imagery using heterogeneous big data: International society for photogrammetry and remote sensing potsdam and cityscape datasets. 
ISPRS International Journal of Geo-Information 9(10), 601 (2020)","DOI":"10.3390\/ijgi9100601"},{"key":"ref3","doi-asserted-by":"crossref","unstructured":"Borgwardt, K.M., Gretton, A., Rasch, M.J., et al.: Integrating structured biological data by kernel maximum mean discrepancy. Bioinformatics 22(14), e49-e57 (2006)","DOI":"10.1093\/bioinformatics\/btl242"},{"key":"ref4","doi-asserted-by":"crossref","unstructured":"Ding, L., Lin, D., Lin, S., Zhang, J., Cui, X., Wang, Y., Tang, H., Bruzzone, L.: Looking outside the window: Wide-context transformer for the semantic segmentation of high-resolution remote sensing images. IEEE Transactions on Geoscience and Remote Sensing 60, 1-13 (2022)","DOI":"10.1109\/TGRS.2022.3168697"},{"key":"ref5","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020)"},{"key":"ref6","doi-asserted-by":"crossref","unstructured":"G, Z., T, L., Y, C., et al.: A dual-path and light-weight convolutional neural network for high-resolution aerial image segmentation. ISPRS International Journal of Geo-Information 8(12), 582 (2019)","DOI":"10.3390\/ijgi8120582"},{"key":"ref7","doi-asserted-by":"crossref","unstructured":"Guo, Y., Liu, Y., Georgiou, T., et al.: A review of semantic segmentation using deep neural networks. International Journal of Multimedia Information Retrieval 7, 87-93 (2018)","DOI":"10.1007\/s13735-017-0141-z"},{"key":"ref8","unstructured":"H, Z., J, S., X, Q., et al.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2881-2890 (2017)"},{"key":"ref9","unstructured":"J, D., H, Q., Y, X., et al.: Deformable convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). pp. 
2-3 (2017)"},{"key":"ref10","doi-asserted-by":"crossref","unstructured":"J, G., S, S., W, J.: Dual-path feature aware network for remote sensing image semantic segmentation. IEEE Transactions on Circuits and Systems for Video Technology 34(5), 3674-3686 (2023)","DOI":"10.1109\/TCSVT.2023.3317937"},{"key":"ref11","unstructured":"J, L., E, S., T, D.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3431-3440 (2015)"},{"key":"ref12","unstructured":"J, W., Z, Z., A, M., et al.: Loveda: A remote sensing land-cover dataset for domain adaptive semantic segmentation. arXiv preprint arXiv:2110.08733 (2021)"},{"key":"ref13","doi-asserted-by":"crossref","unstructured":"L, W., K, P., M, X., et al.: Sgformer: A local and global features coupling network for semantic segmentation of land cover. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 16, 6812-6824 (2023)","DOI":"10.1109\/JSTARS.2023.3295729"},{"key":"ref14","doi-asserted-by":"crossref","unstructured":"LC, C., G, P., I, K., et al.: Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(4), 834-848 (2017)","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"ref15","doi-asserted-by":"crossref","unstructured":"Li, Z., Liu, F., Yang, W., Peng, S., Zhou, J.: A survey of convolutional neural networks: Analysis, applications, and prospects. IEEE Transactions on Neural Networks and Learning Systems 33(12), 6999-7019 (2021)","DOI":"10.1109\/TNNLS.2021.3084827"},{"key":"ref16","doi-asserted-by":"crossref","unstructured":"Lian, X., Pang, Y., Han, J., Pan, J.: Cascaded hierarchical atrous spatial pyramid pooling module for semantic segmentation. 
Pattern Recognition 110, 107622 (2021)","DOI":"10.1016\/j.patcog.2020.107622"},{"key":"ref17","unstructured":"Markus Gerke, I.: Use of the stair vision library within the isprs 2d semantic labeling benchmark (vaihingen). Use of the stair vision library within the isprs 2d semantic labeling benchmark (vaihingen) (2014)"},{"key":"ref18","doi-asserted-by":"crossref","unstructured":"O, R., P, F., T, B.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 234-241. Springer, Cham (2015)","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref19","doi-asserted-by":"crossref","unstructured":"P, S.J., P, E., R, K.S.: Unveiling the secrets of brain tumors: A fuzzy c-means and u-net convolution approach for enhanced segmentation. INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL 19(2) (2024)","DOI":"10.15837\/ijccc.2024.2.5732"},{"key":"ref20","doi-asserted-by":"crossref","unstructured":"Pinheiro, P.O., Collobert, R.: From image-level to pixel-level labeling with convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1713-1721 (2015)","DOI":"10.1109\/CVPR.2015.7298780"},{"key":"ref21","doi-asserted-by":"crossref","unstructured":"Rodrigues, C.M., Pereira, L., Rocha, A., et al.: Image semantic representation for event understanding. In: 2019 IEEE International Workshop on Information Forensics and Security (WIFS). pp. 1-6. IEEE (2019)","DOI":"10.1109\/WIFS47025.2019.9035102"},{"key":"ref22","doi-asserted-by":"crossref","unstructured":"S, J., J, L., Z, H.: Dpcfn: Dual path cross fusion network for medical image segmentation. Engineering Applications of Artificial Intelligence 116, 105420 (2022)","DOI":"10.1016\/j.engappai.2022.105420"},{"key":"ref23","unstructured":"S, Y., L, W., L, T.: Threshold segmentation based on information fusion for object shadow detection in remote sensing images. 
Computer Science and Information Systems 00, 23-23 (2024)"},{"key":"ref24","unstructured":"S, Z., J, L., H, Z., et al.: Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Computer Vision and Pattern Recognition. pp. 6877-6886. IEEE (2021)"},{"key":"ref25","unstructured":"S, Z., J, L., H, Z., et al.: Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. pp. 6881-6890 (2021)"},{"key":"ref26","doi-asserted-by":"crossref","unstructured":"V, B., A, K., R, C.: Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(12), 2481-2495 (2017)","DOI":"10.1109\/TPAMI.2016.2644615"},{"key":"ref27","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., et al.: Attention is all you need. Advances in Neural Information Processing Systems 30, 5998-6008 (2017)"},{"key":"ref28","doi-asserted-by":"crossref","unstructured":"Voulodimos, A., Doulamis, N., Doulamis, A., et al.: Deep learning for computer vision: A brief review. Computational Intelligence and Neuroscience 2018(1), 7068349 (2018)","DOI":"10.1155\/2018\/7068349"},{"key":"ref29","doi-asserted-by":"crossref","unstructured":"Wang, W., Zhou, C., He, H., Ma, C.: Advancing uav image semantic segmentation with an improved multiscale diffusion model. Tehni\u010dki vjesnik 31(6), 1859-1865 (2024)","DOI":"10.17559\/TV-20231023001051"},{"key":"ref30","doi-asserted-by":"crossref","unstructured":"X, H., Y, Z., J, Z., et al.: Swin transformer embedding unet for remote sensing image semantic segmentation. 
IEEE Transactions on Geoscience and Remote Sensing 60, 1-15 (2022)","DOI":"10.1109\/TGRS.2022.3144165"},{"key":"ref31","doi-asserted-by":"crossref","unstructured":"Xu, H., Zhang, X., Li, H., et al.: Seed the views: Hierarchical semantic alignment for contrastive representation learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 45(3), 3753-3767 (2022)","DOI":"10.1109\/TPAMI.2022.3176690"},{"key":"ref32","unstructured":"Y, M.: Research review of image semantic segmentation methods in high-resolution remote sensing image interpretation. Journal of Frontiers of Computer Science and Technology 17(7), 1526-1548 (2023)"},{"key":"ref33","unstructured":"Yin, S., Wang, L., Teng, L.: Threshold segmentation based on information fusion for object shadow detection in remote sensing images. Computer Science and Information Systems (00), 23-23 (2024)"},{"key":"ref34","doi-asserted-by":"crossref","unstructured":"Z, G., C, G., Z, F., et al.: Integrating masked generative distillation and network compression to identify the severity of wheat fusarium head blight. Computers and Electronics in Agriculture 227, 109647 (2024)","DOI":"10.1016\/j.compag.2024.109647"},{"key":"ref35","unstructured":"Z, L., Y, L., Y, C., et al.: Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision. pp. 10012-10022 (2021)"},{"key":"ref36","doi-asserted-by":"crossref","unstructured":"Zeng, F., Yang, B., Zhao, M., Xing, Y., Ma, Y.: Masanet: Multi-angle self-attention network for semantic segmentation of remote sensing images. Tehni\u010dki vjesnik 29(5), 1567-1575 (2022)","DOI":"10.17559\/TV-20220421142959"},{"key":"ref37","unstructured":"Zhong, L., Ruijun, B., Jun, H., et al.: Aircraft detection algorithm for remote sensing images based on adaptive feature fusion and multi-scale output. 
Microelectronics and Computer 38(4), 40-45, 51 (2021)"}],"container-title":["Computer Science and Information Systems"],"original-title":[],"language":"en","deposited":{"date-parts":[[2025,7,18]],"date-time":"2025-07-18T09:19:11Z","timestamp":1752830351000},"score":1,"resource":{"primary":{"URL":"https:\/\/doiserbia.nb.rs\/Article.aspx?ID=1820-02142500054X"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025]]},"references-count":37,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025]]}},"URL":"https:\/\/doi.org\/10.2298\/csis250312054x","relation":{},"ISSN":["1820-0214","2406-1018"],"issn-type":[{"value":"1820-0214","type":"print"},{"value":"2406-1018","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025]]}}}