{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T20:23:05Z","timestamp":1771618985631,"version":"3.50.1"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2024,2,17]],"date-time":"2024-02-17T00:00:00Z","timestamp":1708128000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,17]],"date-time":"2024-02-17T00:00:00Z","timestamp":1708128000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Process Lett"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Few-shot semantic segmentation aims to recognize novel classes with only very few labelled data. This challenging task requires mining of the correlation between the query image and the support images. Previous works have typically regarded it as a pixel-wise classification problem. Therefore, various models have been designed to explore the correlation of pixels between the query image and the support images. However, they focus only on pixel-wise correspondence and ignore the overall correlation of objects. In this paper, we introduce a mask-based classification method for addressing this problem. The mask aggregation network, which is a simple mask classification model, is proposed to simultaneously generate a fixed number of masks and their probabilities of being targets. Then, the final segmentation result is obtained by aggregating all the masks according to their locations. 
Experiments on both the PASCAL-<jats:inline-formula><jats:alternatives><jats:tex-math>$$5^i$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:msup>\n                    <mml:mn>5<\/mml:mn>\n                    <mml:mi>i<\/mml:mi>\n                  <\/mml:msup>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> and COCO-<jats:inline-formula><jats:alternatives><jats:tex-math>$$20^i$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:msup>\n                    <mml:mn>20<\/mml:mn>\n                    <mml:mi>i<\/mml:mi>\n                  <\/mml:msup>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> datasets show that our method performs comparably to the state-of-the-art pixel-based methods. This competitive performance demonstrates the potential of mask classification as an alternative baseline method for few-shot semantic segmentation.<\/jats:p>","DOI":"10.1007\/s11063-024-11511-5","type":"journal-article","created":{"date-parts":[[2024,2,17]],"date-time":"2024-02-17T17:02:31Z","timestamp":1708189351000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Few-Shot Semantic Segmentation via Mask Aggregation"],"prefix":"10.1007","volume":"56","author":[{"given":"Wei","family":"Ao","sequence":"first","affiliation":[]},{"given":"Shunyi","family":"Zheng","sequence":"additional","affiliation":[]},{"given":"Yan","family":"Meng","sequence":"additional","affiliation":[]},{"given":"Yang","family":"Yang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,2,17]]},"reference":[{"issue":"11","key":"11511_CR1","doi-asserted-by":"publisher","first-page":"2278","DOI":"10.1109\/5.726791","volume":"86","author":"Y Lecun","year":"1998","unstructured":"Lecun Y, Bottou L (1998) Gradient-based learning 
applied to document recognition. Proc IEEE 86(11):2278\u20132324","journal-title":"Proc IEEE"},{"issue":"4","key":"11511_CR2","first-page":"640","volume":"39","author":"J Long","year":"2015","unstructured":"Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell 39(4):640\u2013651","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11511_CR3","doi-asserted-by":"crossref","unstructured":"Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention, pp 234\u2013241. Springer","DOI":"10.1007\/978-3-319-24574-4_28"},{"issue":"12","key":"11511_CR4","doi-asserted-by":"publisher","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","volume":"39","author":"V Badrinarayanan","year":"2017","unstructured":"Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481\u20132495","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11511_CR5","doi-asserted-by":"crossref","unstructured":"Zhao H, Shi J, Qi X, Wang X, Jia J (2017) Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2881\u20132890","DOI":"10.1109\/CVPR.2017.660"},{"key":"11511_CR6","doi-asserted-by":"crossref","unstructured":"Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H (2019) Dual attention network for scene segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 3146\u20133154","DOI":"10.1109\/CVPR.2019.00326"},{"key":"11511_CR7","doi-asserted-by":"crossref","unstructured":"Shaban A, Bansal S, Liu Z, Essa I, Boots B (2017) One-shot learning for semantic segmentation. 
In: Proceedings of the British machine vision conference, pp 6230\u20136239","DOI":"10.5244\/C.31.167"},{"key":"11511_CR8","doi-asserted-by":"crossref","unstructured":"Zhang C, Lin G, Liu F, Yao R, Shen C (2019) Canet: Class-agnostic segmentation networks with iterative refinement and attentive few-shot learning. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 5217\u20135226","DOI":"10.1109\/CVPR.2019.00536"},{"key":"11511_CR9","doi-asserted-by":"crossref","unstructured":"Wang K, Liew J.H, Zou Y, Zhou D, Feng J (2019) Panet: few-shot image semantic segmentation with prototype alignment. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 9197\u20139206","DOI":"10.1109\/ICCV.2019.00929"},{"key":"11511_CR10","unstructured":"Tian Z, Zhao H, Shu M, Yang Z, Li R, Jia J (2020) Prior guided feature enrichment network for few-shot segmentation. IEEE Trans Pattern Anal Mach Intell"},{"key":"11511_CR11","doi-asserted-by":"crossref","unstructured":"Yang Y, Meng F, Li H, Wu Q, Xu X, Chen S (2020) A new local transformation module for few-shot segmentation. In: International conference on multimedia modeling. Springer, pp 76\u201387","DOI":"10.1007\/978-3-030-37734-2_7"},{"key":"11511_CR12","doi-asserted-by":"crossref","unstructured":"Liu W, Zhang C, Lin G, Liu F (2020) Crnet: cross-reference networks for few-shot segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 4165\u20134173","DOI":"10.1109\/CVPR42600.2020.00422"},{"key":"11511_CR13","doi-asserted-by":"crossref","unstructured":"Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2921\u20132929","DOI":"10.1109\/CVPR.2016.319"},{"key":"11511_CR14","doi-asserted-by":"crossref","unstructured":"He K, Gkioxari G, Doll\u00e1r P, Girshick R (2017) Mask r-cnn. 
In: Proceedings of the IEEE international conference on computer vision, pp 2961\u20132969","DOI":"10.1109\/ICCV.2017.322"},{"key":"11511_CR15","doi-asserted-by":"crossref","unstructured":"Hariharan B, Arbel\u00e1ez P, Girshick R, Malik J (2014) Simultaneous detection and segmentation. In: European conference on computer vision, pp 297\u2013312. Springer","DOI":"10.1007\/978-3-319-10584-0_20"},{"key":"11511_CR16","doi-asserted-by":"crossref","unstructured":"Kirillov A, He K, Girshick R, Rother C, Doll\u00e1r P (2019) Panoptic segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 9404\u20139413","DOI":"10.1109\/CVPR.2019.00963"},{"key":"11511_CR17","unstructured":"Cheng B, Schwing A, Kirillov A (2021) Per-pixel classification is not all you need for semantic segmentation. Advances in neural information processing systems, 34"},{"key":"11511_CR18","doi-asserted-by":"crossref","unstructured":"Chen L.-C, Zhu Y, Papandreou G, Schroff F, Adam H (2018) Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Proceedings of the European conference on computer vision (ECCV), pp 801\u2013818","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"11511_CR19","doi-asserted-by":"crossref","unstructured":"Noh H, Hong S, Han B (2015) Learning deconvolution network for semantic segmentation. In: Proceedings of the IEEE international conference on computer vision, pp 1520\u20131528","DOI":"10.1109\/ICCV.2015.178"},{"key":"11511_CR20","unstructured":"Dong N, Xing EP (2018) Few-shot semantic segmentation with prototype learning. In: Proceedings of the British machine vision conference, vol 3"},{"key":"11511_CR21","doi-asserted-by":"crossref","unstructured":"Wang X, Kong T, Shen C, Jiang Y, Li L (2020) Solo: segmenting objects by locations. In: European conference on computer vision, pp 649\u2013665. 
Springer","DOI":"10.1007\/978-3-030-58523-5_38"},{"key":"11511_CR22","doi-asserted-by":"crossref","unstructured":"Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-end object detection with transformers. In: European conference on computer vision. Springer, pp 213\u2013229","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"11511_CR23","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"11511_CR24","first-page":"1097","volume":"25","author":"A Krizhevsky","year":"2012","unstructured":"Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25:1097\u20131105","journal-title":"Adv Neural Inf Process Syst"},{"key":"11511_CR25","doi-asserted-by":"crossref","unstructured":"Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779\u2013788","DOI":"10.1109\/CVPR.2016.91"},{"key":"11511_CR26","doi-asserted-by":"crossref","unstructured":"Liu R, Lehman J, Molino P, Petroski\u00a0Such F, Frank E, Sergeev A, Yosinski J (2018) An intriguing failing of convolutional neural networks and the coordconv solution. Advances in neural information processing systems, 31","DOI":"10.1007\/978-3-030-04212-7_1"},{"issue":"2","key":"11511_CR27","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","volume":"88","author":"M Everingham","year":"2010","unstructured":"Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (voc) challenge. 
Int J Comput Vis 88(2):303\u2013338","journal-title":"Int J Comput Vis"},{"key":"11511_CR28","doi-asserted-by":"crossref","unstructured":"Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Doll\u00e1r P, Zitnick CL (2014) Microsoft coco: common objects in context. In: European conference on computer vision. Springer, pp 740\u2013755","DOI":"10.1007\/978-3-319-10602-1_48"},{"issue":"4","key":"11511_CR29","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"L-C Chen","year":"2017","unstructured":"Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40(4):834\u2013848","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11511_CR30","doi-asserted-by":"crossref","unstructured":"Boudiaf M, Kervadec H, Masud Z.I, Piantanida P, Ben\u00a0Ayed I, Dolz J (2021) Few-shot segmentation without meta-learning: a good transductive inference is all you need? In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 13979\u201313988","DOI":"10.1109\/CVPR46437.2021.01376"},{"key":"11511_CR31","doi-asserted-by":"crossref","unstructured":"Zhang B, Xiao J, Qin T (2021) Self-guided and cross-guided learning for few-shot segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 8312\u20138321","DOI":"10.1109\/CVPR46437.2021.00821"},{"key":"11511_CR32","doi-asserted-by":"crossref","unstructured":"Min J, Kang D, Cho M (2021) Hypercorrelation squeeze for few-shot segmentation. 
In: Proceedings of the IEEE\/CVF International conference on computer vision (ICCV), pp 6941\u20136952","DOI":"10.1109\/ICCV48922.2021.00686"},{"key":"11511_CR33","doi-asserted-by":"crossref","unstructured":"Wang H, Zhang X, Hu Y, Yang Y, Cao X, Zhen X (2020) Few-shot semantic segmentation with democratic attention networks. In: European conference on computer vision. Springer, pp 730\u2013746","DOI":"10.1007\/978-3-030-58601-0_43"},{"key":"11511_CR34","doi-asserted-by":"crossref","unstructured":"Liu B, Jiao J, Ye Q (2021) Harmonic feature activation for few-shot semantic segmentation. IEEE Trans Image Process 30:3142\u20133153","DOI":"10.1109\/TIP.2021.3058512"},{"key":"11511_CR35","doi-asserted-by":"crossref","unstructured":"Yang B, Liu C, Li B, Jiao J, Ye Q (2020) Prototype mixture models for few-shot semantic segmentation. In: European conference on computer vision. Springer, pp 763\u2013778","DOI":"10.1007\/978-3-030-58598-3_45"},{"key":"11511_CR36","doi-asserted-by":"crossref","unstructured":"Liu Y, Zhang X, Zhang S, He X (2020) Part-aware prototype network for few-shot semantic segmentation. In: European conference on computer vision. Springer, pp 142\u2013158","DOI":"10.1007\/978-3-030-58545-7_9"},{"key":"11511_CR37","doi-asserted-by":"crossref","unstructured":"Nguyen K, Todorovic S (2019) Feature weighting and boosting for few-shot segmentation. 
In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 622\u2013631","DOI":"10.1109\/ICCV.2019.00071"}],"container-title":["Neural Processing Letters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-024-11511-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11063-024-11511-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-024-11511-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T20:23:16Z","timestamp":1715890996000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11063-024-11511-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,17]]},"references-count":37,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,4]]}},"alternative-id":["11511"],"URL":"https:\/\/doi.org\/10.1007\/s11063-024-11511-5","relation":{},"ISSN":["1573-773X"],"issn-type":[{"value":"1573-773X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,17]]},"assertion":[{"value":"25 November 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 February 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors have no relevant financial or non-financial interests to disclose.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"56"}}