{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T20:23:05Z","timestamp":1771618985574,"version":"3.50.1"},"reference-count":52,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2023,3,8]],"date-time":"2023-03-08T00:00:00Z","timestamp":1678233600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61976225"],"award-info":[{"award-number":["61976225"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Few-shot semantic segmentation has attracted much attention because it requires only a few labeled samples to achieve good segmentation performance. However, existing methods still suffer from insufficient contextual information and unsatisfactory edge segmentation results. To overcome these two issues, this paper proposes a multi-scale context enhancement and edge-assisted network (called MCEENet) for few-shot semantic segmentation. First, rich support and query image features are extracted by two weight-shared feature extraction networks, each consisting of a ResNet and a Vision Transformer. Subsequently, a multi-scale context enhancement (MCE) module is proposed to fuse the ResNet and Vision Transformer features and to further mine the contextual information of the image through cross-scale feature fusion and multi-scale dilated convolutions. Furthermore, we design an Edge-Assisted Segmentation (EAS) module, which fuses the shallow ResNet features of the query image with the edge features computed by the Sobel operator to assist the final segmentation task. 
We conducted experiments on the PASCAL-5i dataset to demonstrate the effectiveness of MCEENet; the 1-shot and 5-shot results on PASCAL-5i are 63.5% and 64.7%, surpassing the state-of-the-art results by 1.4% and 0.6%, respectively.<\/jats:p>","DOI":"10.3390\/s23062922","type":"journal-article","created":{"date-parts":[[2023,3,8]],"date-time":"2023-03-08T03:59:32Z","timestamp":1678247972000},"page":"2922","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["MCEENet: Multi-Scale Context Enhancement and Edge-Assisted Network for Few-Shot Semantic Segmentation"],"prefix":"10.3390","volume":"23","author":[{"given":"Hongjie","family":"Zhou","sequence":"first","affiliation":[{"name":"School of Automation, Central South University, Changsha 410083, China"}]},{"given":"Rufei","family":"Zhang","sequence":"additional","affiliation":[{"name":"Beijing Institute of Control and Electronic Technology, Beijing 100038, China"}]},{"given":"Xiaoyu","family":"He","sequence":"additional","affiliation":[{"name":"School of Automation, Central South University, Changsha 410083, China"}]},{"given":"Nannan","family":"Li","sequence":"additional","affiliation":[{"name":"Beijing Institute of Control and Electronic Technology, Beijing 100038, China"}]},{"given":"Yong","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Automation, Central South University, Changsha 410083, China"}]},{"given":"Sheng","family":"Shen","sequence":"additional","affiliation":[{"name":"Beijing Institute of Control and Electronic Technology, Beijing 100038, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Voulodimos, A., Protopapadakis, E., Katsamenis, I., Doulamis, A., and Doulamis, N. (2021). A few-shot U-net deep learning model for COVID-19 infected area segmentation in CT images. 
Sensors, 21.","DOI":"10.3390\/s21062215"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Bello, S.A., Yu, S., Wang, C., Adam, J.M., and Li, J. (2020). Deep learning on 3D point clouds. Remote Sens., 12.","DOI":"10.3390\/rs12111729"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"He, M., Jiang, P., and Deng, F. (2022, January 22\u201324). A study of microseismic first arrival pickup based on image semantic segmentation. Proceedings of the 2022 3rd International Conference on Geology, Mapping and Remote Sensing (ICGMRS), Zhoushan, China.","DOI":"10.1109\/ICGMRS55602.2022.9849339"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"6149","DOI":"10.1007\/s00521-021-06802-0","article-title":"Multi-scale strip pooling feature aggregation network for cloud and cloud shadow segmentation","volume":"34","author":"Lu","year":"2022","journal-title":"Neural Comput. Appl."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"104940","DOI":"10.1016\/j.cageo.2021.104940","article-title":"Strip pooling channel spatial attention network for the segmentation of cloud and cloud shadow","volume":"157","author":"Qu","year":"2021","journal-title":"Comput. Geosci."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"5874","DOI":"10.1080\/01431161.2022.2073795","article-title":"MANet: A multi-level aggregation network for semantic segmentation of high-resolution remote sensing images","volume":"43","author":"Chen","year":"2022","journal-title":"Int. J. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"016513","DOI":"10.1117\/1.JRS.16.016513","article-title":"MLNet: Multichannel feature fusion lozenge network for land segmentation","volume":"16","author":"Gao","year":"2022","journal-title":"J. Appl. 
Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"5940","DOI":"10.1080\/01431161.2021.2014077","article-title":"Cloud\/shadow segmentation based on multi-level feature enhanced network for remote sensing imagery","volume":"43","author":"Miao","year":"2022","journal-title":"Int. J. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1109\/JSTARS.2022.3224081","article-title":"Axial Cross Attention Meets CNN: Bibranch Fusion Network for Change Detection","volume":"16","author":"Song","year":"2022","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"3258","DOI":"10.1109\/TITS.2020.2980426","article-title":"Real-time high-performance semantic image segmentation of urban street scenes","volume":"22","author":"Dong","year":"2020","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015, January 5\u20139). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_12","unstructured":"Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2014). Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Shaban, A., Bansal, S., Liu, Z., Essa, I., and Boots, B. (2017). One-shot learning for semantic segmentation. arXiv.","DOI":"10.5244\/C.31.167"},{"key":"ref_14","unstructured":"Wang, K., Liew, J.H., Zou, Y., Zhou, D., and Feng, J. (November, January 27). Panet: Few-shot image semantic segmentation with prototype alignment. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_15","unstructured":"Nguyen, K., and Todorovic, S. (November, January 27). Feature weighting and boosting for few-shot segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Yang, B., Liu, C., Li, B., Jiao, J., and Ye, Q. (2020, January 23\u201328). Prototype mixture models for few-shot semantic segmentation. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58598-3_45"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Liu, Y., Zhang, X., Zhang, S., and He, X. (2020, January 23\u201328). Part-aware prototype network for few-shot semantic segmentation. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58545-7_9"},{"key":"ref_18","first-page":"4080","article-title":"Prototypical networks for few-shot learning","volume":"30","author":"Snell","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Fan, Q., Pei, W., Tai, Y.W., and Tang, C.K. (2022, January 23\u201327). Self-support few-shot semantic segmentation. Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-19800-7_41"},{"key":"ref_20","unstructured":"Zhang, C., Lin, G., Liu, F., Guo, J., Wu, Q., and Yao, R. (November, January 27). Pyramid graph networks with connection attentions for region-based one-shot semantic segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2017, January 21\u201326). Pyramid scene parsing network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_23","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. (2018, January 8\u201314). Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"106965","DOI":"10.1016\/j.patcog.2019.106965","article-title":"A deep one-shot network for query-based logo retrieval","volume":"96","author":"Bhunia","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Tian, P., Wu, Z., Qi, L., Wang, L., Shi, Y., and Gao, Y. (2020, January 7\u201312). Differentiable meta-learning model for few-shot semantic segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.6887"},{"key":"ref_27","unstructured":"Dong, N., and Xing, E.P. (2018, January 2\u20136). Few-shot semantic segmentation with prototype learning. Proceedings of the British Machine Vision Conference, Northumbria University, Newcastle, UK."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yang, Y., Meng, F., Li, H., Wu, Q., Xu, X., and Chen, S. (2020, January 5\u20138). A new local transformation module for few-shot segmentation. 
Proceedings of the International Conference on Multimedia Modeling, Daejeon, Republic of Korea.","DOI":"10.1007\/978-3-030-37734-2_7"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Gairola, S., Hemani, M., Chopra, A., and Krishnamurthy, B. (2020). Simpropnet: Improved similarity propagation for few-shot image segmentation. arXiv.","DOI":"10.24963\/ijcai.2020\/80"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"3855","DOI":"10.1109\/TCYB.2020.2992433","article-title":"SG-One: Similarity guidance network for one-shot semantic segmentation","volume":"50","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Cybern."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Li, G., Jampani, V., Sevilla-Lara, L., Sun, D., Kim, J., and Kim, J. (2021, January 20\u201325). Adaptive prototype learning and allocation for few-shot segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00823"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Yang, L., Zhuo, W., Qi, L., Shi, Y., and Gao, Y. (2021, January 10\u201317). Mining latent classes for few-shot segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00860"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, C., Fu, Y., Xu, C., Yang, S., Li, J., Wang, C., and Zhang, L. (2021, January 2\u20139). Learning a few-shot embedding model with contrastive learning. Proceedings of the AAAI Conference on Artificial Intelligence, held virtually.","DOI":"10.1609\/aaai.v35i10.17047"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Xie, G.S., Liu, J., Xiong, H., and Shao, L. (2021, January 20\u201325). Scale-aware graph neural network for few-shot semantic segmentation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00543"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Lu, Z., He, S., Zhu, X., Zhang, L., Song, Y.Z., and Xiang, T. (2021, January 10\u201317). Simpler is better: Few-shot semantic segmentation with classifier weight transformer. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00862"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Siam, M., Doraiswamy, N., Oreshkin, B.N., Yao, H., and Jagersand, M. (2020). Weakly supervised few-shot object segmentation using co-attention with visual and semantic embeddings. arXiv.","DOI":"10.24963\/ijcai.2020\/120"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Liu, L., Cao, J., Liu, M., Guo, Y., Chen, Q., and Tan, M. (2020, January 12\u201316). Dynamic extension nets for few-shot semantic segmentation. Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA.","DOI":"10.1145\/3394171.3413915"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhang, C., Lin, G., Liu, F., Yao, R., and Shen, C. (2019, January 15\u201320). Canet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00536"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1050","DOI":"10.1109\/TPAMI.2020.3013717","article-title":"Prior guided feature enrichment network for few-shot segmentation","volume":"44","author":"Tian","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Zhang, B., Xiao, J., and Qin, T. (2021, January 20\u201325). Self-guided and cross-guided learning for few-shot segmentation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00821"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Liu, W., Zhang, C., Lin, G., and Liu, F. (2020, January 13\u201319). Crnet: Cross-reference networks for few-shot segmentation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00422"},{"key":"ref_42","unstructured":"Yang, X., Wang, B., Chen, K., Zhou, X., Yi, S., Ouyang, W., and Zhou, L. (2020). Brinet: Towards bridging the intra-class and inter-class gaps in one-shot segmentation. arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Xie, G.S., Xiong, H., Liu, J., Yao, Y., and Shao, L. (2021, January 10\u201317). Few-shot semantic segmentation with cyclic memory network. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00720"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_46","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_48","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (VOC) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Hariharan, B., Arbel\u00e1ez, P., Bourdev, L., Maji, S., and Malik, J. (2011, January 6\u201313). Semantic contours from inverse detectors. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126343"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Min, J., Kang, D., and Cho, M. (2021, January 10\u201317). Hypercorrelation squeeze for few-shot segmentation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00686"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Li, X., Wei, T., Chen, Y.P., Tai, Y.W., and Tang, C.K. (2020, January 13\u201319). FSS-1000: A 1000-class dataset for few-shot segmentation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00294"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/6\/2922\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:50:46Z","timestamp":1760122246000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/6\/2922"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,8]]},"references-count":52,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2023,3]]}},"alternative-id":["s23062922"],"URL":"https:\/\/doi.org\/10.3390\/s23062922","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,8]]}}}