{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,2]],"date-time":"2025-08-02T14:54:39Z","timestamp":1754146479116,"version":"3.41.2"},"reference-count":37,"publisher":"National Library of Serbia","issue":"3","license":[{"start":{"date-parts":[[2025,1,1]],"date-time":"2025-01-01T00:00:00Z","timestamp":1735689600000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by-nc-nd\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["ComSIS","COMPUT SCI INF SYST","COMPUT SCI INFORM SY","COMPUTER SCI INFORM","COMSIS J"],"published-print":{"date-parts":[[2025]]},"abstract":"<jats:p>Traditional semantic segmentation methods suffer from poor multi-scale feature extraction, weak feature extraction in lightweight backbone networks, and a lack of effective fusion of context information, resulting in edge segmentation errors and feature discontinuity. In this paper, a novel semantic segmentation model based on multi-layer information fusion and a dual convolutional attention mechanism is proposed. In this method, the SegFormer network is used as the backbone network, and the multi-scale features of the encoder output are fused with overlapping features. The feature extraction subnetwork is optimized by constructing an object region enhancement module, and the intermediate feature map is refined adaptively in each convolutional block of the deep network, so as to strengthen the fine extraction of multi-dimensional feature information from complex images. A dual convolutional attention module is used to fuse high-level semantic information, avoiding the loss of feature information caused by the up-sampling operation and the influence of introduced noise, and refining the segmentation of target edges. 
At the same time, the feature pyramid grid is proposed to process the overlapping features, obtain the context information of different scales, and enhance the semantic expression of features. Finally, the features processed by the feature pyramid grid module are combined to improve the segmentation effect. The experimental results on the public data set show that the proposed method has better performance than the existing methods, and has better segmentation effect on the object edge in the scene.<\/jats:p>","DOI":"10.2298\/csis240713051t","type":"journal-article","created":{"date-parts":[[2025,5,23]],"date-time":"2025-05-23T07:14:31Z","timestamp":1747984471000},"page":"907-926","source":"Crossref","is-referenced-by-count":0,"title":["Image semantic segmentation based on multi-layer feature information fusion and dual convolutional attention mechanism"],"prefix":"10.2298","volume":"22","author":[{"given":"Lin","family":"Teng","sequence":"first","affiliation":[{"name":"School of Information and Communication Engineering, Harbin Engineering University, Nangang District, Harbin, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yulong","family":"Qiao","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, Harbin Engineering University, Nangang District, Harbin, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jinfeng","family":"Wang","sequence":"additional","affiliation":[{"name":"Weifang Vocational College, Weifang, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1946-0384","authenticated-orcid":false,"given":"Mirjana","family":"Ivanovic","sequence":"additional","affiliation":[{"name":"Faculty of Sciences, University of Novi Sad, Serbia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shoulin","family":"Yin","sequence":"additional","affiliation":[{"name":"School of Information and Communication Engineering, 
Harbin Engineering University, Nangang District, Harbin, China + Software College, Shenyang Normal University, Shenyang, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1078","reference":[{"key":"ref1","unstructured":"Luo Z, Yang W, Yuan Y, et al. Semantic segmentation of agricultural images: A survey[J]. Information Processing in Agriculture, 2023."},{"key":"ref2","doi-asserted-by":"crossref","unstructured":"Mo Y, Wu Y, Yang X, et al. Review the state-of-the-art technologies of semantic segmentation based on deep learning[J]. Neurocomputing, 2022, 493: 626-646.","DOI":"10.1016\/j.neucom.2022.01.005"},{"key":"ref3","doi-asserted-by":"crossref","unstructured":"Yin S, Wang L, Teng L. Threshold segmentation based on information fusion for object shadow detection in remote sensing images[J]. Computer Science and Information Systems, 2024. doi: 10.2298\/CSIS231230023Y.","DOI":"10.2298\/CSIS231230023Y"},{"key":"ref4","doi-asserted-by":"crossref","unstructured":"Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]\/\/Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 3431-3440.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref5","doi-asserted-by":"crossref","unstructured":"Badrinarayanan V, Kendall A, Cipolla R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(12): 2481-2495.","DOI":"10.1109\/TPAMI.2016.2644615"},{"key":"ref6","doi-asserted-by":"crossref","unstructured":"Badrinarayanan V, Kendall A, Cipolla R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation[J]. IEEE transactions on pattern analysis and machine intelligence, 2017, 39(12): 2481-2495.","DOI":"10.1109\/TPAMI.2016.2644615"},{"key":"ref7","doi-asserted-by":"crossref","unstructured":"Yuan H, Zhu J, Wang Q, et al. 
An improved DeepLab v3+ deep learning network applied to the segmentation of grape leaf black rot spots[J]. Frontiers in plant science, 2022, 13: 795410.","DOI":"10.3389\/fpls.2022.795410"},{"key":"ref8","doi-asserted-by":"crossref","unstructured":"Lian X, Pang Y, Han J, et al. Cascaded hierarchical atrous spatial pyramid pooling module for semantic segmentation[J]. Pattern Recognition, 2021, 110: 107622.","DOI":"10.1016\/j.patcog.2020.107622"},{"key":"ref9","doi-asserted-by":"crossref","unstructured":"Li X, Li M, Yan P, et al. Deep learning attention mechanism in medical image analysis: Basics and beyonds[J]. International Journal of Network Dynamics and Intelligence, 2023: 93-116.","DOI":"10.53941\/ijndi0201006"},{"key":"ref10","doi-asserted-by":"crossref","unstructured":"Zhao H, Zhang Y, Liu S, et al. Psanet: Point-wise spatial attention network for scene parsing[C]\/\/Proceedings of the European conference on computer vision (ECCV). 2018: 267-283.","DOI":"10.1007\/978-3-030-01240-3_17"},{"key":"ref11","doi-asserted-by":"crossref","unstructured":"Li X, Zhong Z, Wu J, et al. Expectation-maximization attention networks for semantic segmentation[C]\/\/Proceedings of the IEEE\/CVF international conference on computer vision. 2019: 9167-9176.","DOI":"10.1109\/ICCV.2019.00926"},{"key":"ref12","doi-asserted-by":"crossref","unstructured":"Asadi A, Safabakhsh R. The encoder-decoder framework and its applications[J]. Deep learning: Concepts and architectures, 2020: 133-167.","DOI":"10.1007\/978-3-030-31756-0_5"},{"key":"ref13","doi-asserted-by":"crossref","unstructured":"Qamar S, Ahmad P, Shen L. Dense encoder-decoder-based architecture for skin lesion segmentation[J]. Cognitive Computation, 2021, 13(2): 583-594.","DOI":"10.1007\/s12559-020-09805-6"},{"key":"ref14","doi-asserted-by":"crossref","unstructured":"S. Yin, H. Li, Y. Sun, M. Ibrar, and L. Teng. Data Visualization Analysis Based on Explainable Artificial Intelligence: A Survey[J]. 
IJLAI Transactions on Science and Engineering, vol. 2, no. 2, pp. 13-20, 2024.","DOI":"10.1007\/978-3-662-68313-2_2"},{"key":"ref15","doi-asserted-by":"crossref","unstructured":"Li X, Chen H, Qi X, et al. H-DenseUNet: hybrid densely connected UNet for liver and tumor segmentation from CT volumes[J]. IEEE transactions on medical imaging, 2018, 37(12): 2663-2674.","DOI":"10.1109\/TMI.2018.2845918"},{"key":"ref16","doi-asserted-by":"crossref","unstructured":"Chollet F. Xception: Deep learning with depthwise separable convolutions[C]\/\/Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1251-1258.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref17","doi-asserted-by":"crossref","unstructured":"Chollet F. Xception: Deep learning with depthwise separable convolutions[C]\/\/Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 1251-1258.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref18","doi-asserted-by":"crossref","unstructured":"Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]\/\/Proceedings of the IEEE conference on computer vision and pattern recognition. 2018: 7132-7141.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref19","doi-asserted-by":"crossref","unstructured":"Nam H, Ha J W, Kim J. Dual attention networks for multimodal reasoning and matching[C]\/\/Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 299-307.","DOI":"10.1109\/CVPR.2017.232"},{"key":"ref20","doi-asserted-by":"crossref","unstructured":"Huang Z, Wang X, Huang L, et al. Ccnet: Criss-cross attention for semantic segmentation[C]\/\/Proceedings of the IEEE\/CVF international conference on computer vision. 2019: 603-612.","DOI":"10.1109\/ICCV.2019.00069"},{"key":"ref21","doi-asserted-by":"crossref","unstructured":"Li X, Zhong Z, Wu J, et al. Expectation-maximization attention networks for semantic segmentation[C]\/\/Proceedings of the IEEE\/CVF international conference on computer vision. 
2019: 9167-9176.","DOI":"10.1109\/ICCV.2019.00926"},{"key":"ref22","unstructured":"Katafuchi R, Tokunaga T. LEA-Net: layer-wise external attention network for efficient color anomaly detection[J]. arXiv preprint arXiv:2109.05493, 2021."},{"key":"ref23","doi-asserted-by":"crossref","unstructured":"Agac S, Durmaz Incel O. On the use of a convolutional block attention module in deep learning-based human activity recognition with motion sensors[J]. Diagnostics, 2023, 13(11): 1861.","DOI":"10.3390\/diagnostics13111861"},{"key":"ref24","doi-asserted-by":"crossref","unstructured":"Yang Y, Jiao L, Liu X, et al. Dual wavelet attention networks for image classification[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 33(4): 1899-1910.","DOI":"10.1109\/TCSVT.2022.3218735"},{"key":"ref25","doi-asserted-by":"crossref","unstructured":"Maaz M, Shaker A, Cholakkal H, et al. Edgenext: efficiently amalgamated cnn-transformer architecture for mobile vision applications[C]\/\/European conference on computer vision. Cham: Springer Nature Switzerland, 2022: 3-20.","DOI":"10.1007\/978-3-031-25082-8_1"},{"key":"ref26","doi-asserted-by":"crossref","unstructured":"Zhao S, Dong Y, Chang E I, et al. Recursive cascaded networks for unsupervised medical image registration[C]\/\/Proceedings of the IEEE\/CVF international conference on computer vision. 2019: 10600-10610.","DOI":"10.1109\/ICCV.2019.01070"},{"key":"ref27","doi-asserted-by":"crossref","unstructured":"Sinha D, El-Sharkawy M. Thin mobilenet: An enhanced mobilenet architecture[C]\/\/2019 IEEE 10th annual ubiquitous computing, electronics & mobile communication conference (UEMCON). IEEE, 2019: 0280-0285.","DOI":"10.1109\/UEMCON47517.2019.8993089"},{"key":"ref28","doi-asserted-by":"crossref","unstructured":"Qin Z, Zhang Z, Chen X, et al. Fd-mobilenet: Improved mobilenet with a fast downsampling strategy[C]\/\/2018 25th IEEE International Conference on Image Processing (ICIP). 
IEEE, 2018: 1363-1367.","DOI":"10.1109\/ICIP.2018.8451355"},{"key":"ref29","doi-asserted-by":"crossref","unstructured":"Qin Z, Zhang Z, Chen X, et al. Fd-mobilenet: Improved mobilenet with a fast downsampling strategy[C]\/\/2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, 2018: 1363-1367.","DOI":"10.1109\/ICIP.2018.8451355"},{"key":"ref30","doi-asserted-by":"crossref","unstructured":"Yin S, Li H, Laghari A A, et al. An anomaly detection model based on deep auto-encoder and capsule graph convolution via sparrow search algorithm in 6G internet-of-everything[J]. IEEE Internet of Things Journal, 2024.","DOI":"10.1109\/JIOT.2024.3353337"},{"key":"ref31","doi-asserted-by":"crossref","unstructured":"Zhang K, Cheng K, Li J, et al. A channel pruning algorithm based on depth-wise separable convolution unit[J]. IEEE Access, 2019, 7: 173294-173309.","DOI":"10.1109\/ACCESS.2019.2956976"},{"key":"ref32","doi-asserted-by":"crossref","unstructured":"Dang L, Pang P, Lee J. Depth-wise separable convolution neural network with residual connection for hyperspectral image classification[J]. Remote Sensing, 2020, 12(20): 3408.","DOI":"10.3390\/rs12203408"},{"key":"ref33","doi-asserted-by":"crossref","unstructured":"Li X, Yu L, Chang D, et al. Dual cross-entropy loss for small-sample fine-grained vehicle classification[J]. IEEE Transactions on Vehicular Technology, 2019, 68(5): 4204-4212.","DOI":"10.1109\/TVT.2019.2895651"},{"key":"ref34","doi-asserted-by":"crossref","unstructured":"Yu C, Wang J, Peng C, et al. Bisenet: Bilateral segmentation network for real-time semantic segmentation[C]\/\/Proceedings of the European conference on computer vision (ECCV). 2018: 325-341.","DOI":"10.1007\/978-3-030-01261-8_20"},{"key":"ref35","doi-asserted-by":"crossref","unstructured":"Zhou J, Hao M, Zhang D, et al. Fusion PSPnet image segmentation based method for multi-focus image fusion[J]. 
IEEE Photonics Journal, 2019, 11(6): 1-12.","DOI":"10.1109\/JPHOT.2019.2950949"},{"key":"ref36","doi-asserted-by":"crossref","unstructured":"Seong S, Choi J. Semantic segmentation of urban buildings using a high-resolution network (HRNet) with channel and spatial attention gates[J]. Remote Sensing, 2021, 13(16): 3087.","DOI":"10.3390\/rs13163087"},{"key":"ref37","doi-asserted-by":"crossref","unstructured":"Fang K, Li W J. DMNet: difference minimization network for semi-supervised segmentation in medical images[C]\/\/International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer International Publishing, 2020: 532-541.","DOI":"10.1007\/978-3-030-59710-8_52"}],"container-title":["Computer Science and Information Systems"],"original-title":[],"language":"en","deposited":{"date-parts":[[2025,7,18]],"date-time":"2025-07-18T09:18:12Z","timestamp":1752830292000},"score":1,"resource":{"primary":{"URL":"https:\/\/doiserbia.nb.rs\/Article.aspx?ID=1820-02142500051T"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025]]},"references-count":37,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025]]}},"URL":"https:\/\/doi.org\/10.2298\/csis240713051t","relation":{},"ISSN":["1820-0214","2406-1018"],"issn-type":[{"type":"print","value":"1820-0214"},{"type":"electronic","value":"2406-1018"}],"subject":[],"published":{"date-parts":[[2025]]}}}