{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T13:38:50Z","timestamp":1768916330761,"version":"3.49.0"},"reference-count":98,"publisher":"Springer Science and Business Media LLC","issue":"11","license":[{"start":{"date-parts":[[2022,9,3]],"date-time":"2022-09-03T00:00:00Z","timestamp":1662163200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,3]],"date-time":"2022-09-03T00:00:00Z","timestamp":1662163200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100003500","name":"Universit\u00e0 degli Studi di Padova","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100003500","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Comput Vis"],"published-print":{"date-parts":[[2022,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Semantic segmentation of parts of objects is a marginally explored and challenging task in which multiple instances of objects and multiple parts within those objects must be recognized in an image. We introduce a novel approach (GMENet) for this task combining object-level context conditioning, part-level spatial relationships, and shape contour information. The first target is achieved by introducing a class-conditioning module that enforces class-level semantics when learning the part-level ones. Thus, intermediate-level features carry object-level prior to the decoding stage. To tackle part-level ambiguity and spatial relationships among parts we exploit an adjacency graph-based module that aims at matching the spatial relationships between parts in the ground truth and predicted maps. Last, we introduce an additional module to further leverage edges localization. Besides testing our framework on the already used Pascal-Part-58 and Pascal-Person-Part benchmarks, we further introduce two novel benchmarks for large-scale part parsing, i.e., a more challenging version of Pascal-Part with 108 classes and the ADE20K-Part benchmark with 544 parts. GMENet achieves state-of-the-art results in all the considered tasks and furthermore allows to improve object-level segmentation accuracy.<\/jats:p>","DOI":"10.1007\/s11263-022-01671-z","type":"journal-article","created":{"date-parts":[[2022,9,3]],"date-time":"2022-09-03T18:04:47Z","timestamp":1662228287000},"page":"2797-2821","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["Edge-Aware Graph Matching Network for Part-Based Semantic Segmentation"],"prefix":"10.1007","volume":"130","author":[{"given":"Umberto","family":"Michieli","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9502-2389","authenticated-orcid":false,"given":"Pietro","family":"Zanuttigh","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,9,3]]},"reference":[{"key":"1671_CR1","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D. G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Wicke, M., Yu, Y., ...\u00a0Zheng, X. (2016). Tensorflow: A system for large-scale machine learning. In 12th USENIX symposium on operating systems design and implementation (OSDI) (pp. 265\u2013283)."},{"key":"1671_CR2","doi-asserted-by":"crossref","unstructured":"Azizpour, H., & Laptev, I. (2012). Object detection using strongly-supervised deformable part models. In Proceedings of European conference on computer vision (ECCV) (pp. 836\u2013849). Springer.","DOI":"10.1007\/978-3-642-33718-5_60"},{"issue":"12","key":"1671_CR3","doi-asserted-by":"publisher","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","volume":"39","author":"V Badrinarayanan","year":"2017","unstructured":"Badrinarayanan, V., Kendall, A., & Cipolla, R. (2017). Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 39(12), 2481\u20132495.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"1671_CR4","unstructured":"Cao, K., Wei, C., Gaidon, A., Arechiga, N., & Ma, T. (2019). Learning imbalanced datasets with label-distribution-aware margin loss. In Neural information processing systems (NeurIPS) (pp. 1567\u20131578)."},{"key":"1671_CR5","doi-asserted-by":"crossref","unstructured":"Cermelli, F., Mancini, M., Bulo, S. R., Ricci, E., & Caputo, B. (2020). Modeling the background for incremental learning in semantic segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 9233\u20139242).","DOI":"10.1109\/CVPR42600.2020.00925"},{"key":"1671_CR6","doi-asserted-by":"crossref","unstructured":"Chang, W. L., Wang, H. P., Peng, W. H., & Chiu, W. C. (2019). All about structure: Adapting structural information across domains for boosting semantic segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1900\u20131909).","DOI":"10.1109\/CVPR.2019.00200"},{"key":"1671_CR7","unstructured":"Chen, L. C. (2020). DeepLab official TensorFlow implementation. https:\/\/github.com\/tensorflow\/models\/tree\/master\/research\/deeplab. Accessed 2020-03-01."},{"key":"1671_CR8","doi-asserted-by":"crossref","unstructured":"Chen, L. C., Yang, Y., Wang, J., Xu, W., & Yuille, A. L. (2016) Attention to scale: Scale-aware semantic image segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3640\u20133649).","DOI":"10.1109\/CVPR.2016.396"},{"key":"1671_CR9","unstructured":"Chen, L. C., Papandreou, G., Schroff, F., & Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587."},{"issue":"4","key":"1671_CR10","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"LC Chen","year":"2018","unstructured":"Chen, L. C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2018). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 40(4), 834\u2013848.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"1671_CR11","doi-asserted-by":"crossref","unstructured":"Chen, X., Mottaghi, R., Liu, X., Fidler, S., Urtasun, R., & Yuille, A. (2014). Detect what you can: Detecting and representing objects using holistic models and body parts. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1971\u20131978).","DOI":"10.1109\/CVPR.2014.254"},{"key":"1671_CR12","first-page":"9355","volume":"34","author":"X Chu","year":"2021","unstructured":"Chu, X., Tian, Z., Wang, Y., Zhang, B., Ren, H., Wei, X., Xia, H., & Shen, C. (2021). Twins: Revisiting the design of spatial attention in vision transformers. Neural Information Processing Systems (NeurIPS), 34, 9355\u20139366.","journal-title":"Neural Information Processing Systems (NeurIPS)"},{"key":"1671_CR13","doi-asserted-by":"crossref","unstructured":"Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., & Schiele, B. (2016). The cityscapes dataset for semantic urban scene understanding. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR).","DOI":"10.1109\/CVPR.2016.350"},{"key":"1671_CR14","doi-asserted-by":"crossref","unstructured":"Csurka, G., Larlus, D., Perronnin, F., & Meylan, F. (2013). What is a good evaluation measure for semantic segmentation? In Proceedings of British machine vision conference (BMVC) (p. 2013).","DOI":"10.5244\/C.27.32"},{"key":"1671_CR15","doi-asserted-by":"crossref","unstructured":"Das, D., & Lee, C. G. (2018). Unsupervised domain adaptation using regularized hyper-graph matching. In Proceedings of IEEE international conference on image processing (ICIP) (pp. 3758\u20133762). IEEE.","DOI":"10.1109\/ICIP.2018.8451152"},{"key":"1671_CR16","doi-asserted-by":"crossref","unstructured":"de Geus, D., Meletis, P., Lu, C., Wen, X., & Dubbelman, G. (2021). Part-aware panoptic segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 5485\u20135494).","DOI":"10.1109\/CVPR46437.2021.00544"},{"key":"1671_CR17","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 248\u2013255). IEEE.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"1671_CR18","doi-asserted-by":"crossref","unstructured":"Dhar, P., Singh, R. V., Peng, K. C., Wu, Z., & Chellappa, R. (2019). Learning without memorizing. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 5138\u20135146).","DOI":"10.1109\/CVPR.2019.00528"},{"key":"1671_CR19","doi-asserted-by":"crossref","unstructured":"Dong, J., Chen, Q., Shen, X., Yang, J. & Yan, S. (2014). Towards unified human parsing and pose estimation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 843\u2013850).","DOI":"10.1109\/CVPR.2014.113"},{"key":"1671_CR20","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., & Houlsby, N. (2021). An image is worth 16x16 words: Transformers for image recognition at scale. In International conference on learning representations (ICLR)."},{"key":"1671_CR21","doi-asserted-by":"crossref","unstructured":"Douillard, A., Chen, Y., Dapogny, A., & Cord, M. (2021). Plop: Learning without forgetting for continual semantic segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 4040\u20134050).","DOI":"10.1109\/CVPR46437.2021.00403"},{"key":"1671_CR22","doi-asserted-by":"publisher","first-page":"180","DOI":"10.1016\/j.ins.2016.01.074","volume":"346","author":"F Emmert-Streib","year":"2016","unstructured":"Emmert-Streib, F., Dehmer, M., & Shi, Y. (2016). Fifty years of graph matching, network alignment and network comparison. Information Sciences, 346, 180\u2013197.","journal-title":"Information Sciences"},{"key":"1671_CR23","unstructured":"Eslami, S,. & Williams, C. (2012). A generative model for parts-based object segmentation. In Neural information processing systems (NeurIPS) (pp. 100\u2013107)."},{"issue":"2","key":"1671_CR24","doi-asserted-by":"publisher","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","volume":"88","author":"M Everingham","year":"2010","unstructured":"Everingham, M., Van Gool, L., Williams, C. K., Winn, J., & Zisserman, A. (2010). The pascal visual object classes (voc) challenge. International Journal of Computer Vision (IJCV), 88(2), 303\u2013338.","journal-title":"International Journal of Computer Vision (IJCV)"},{"key":"1671_CR25","doi-asserted-by":"crossref","unstructured":"Fang, H. S., Lu, G., Fang, X., Xie, J., Tai, Y. W., & Lu, C. (2018). Weakly and semi supervised human body part parsing via pose-guided knowledge transfer. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR).","DOI":"10.1109\/CVPR.2018.00015"},{"key":"1671_CR26","doi-asserted-by":"crossref","unstructured":"Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., & Lu, H. (2019). Dual attention network for scene segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3146\u20133154).","DOI":"10.1109\/CVPR.2019.00326"},{"key":"1671_CR27","doi-asserted-by":"crossref","unstructured":"Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M. W., & Keutzer, K. (2021). A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630","DOI":"10.1201\/9781003162810-13"},{"issue":"5","key":"1671_CR28","doi-asserted-by":"publisher","first-page":"476","DOI":"10.1007\/s11263-017-1048-0","volume":"126","author":"A Gonzalez-Garcia","year":"2018","unstructured":"Gonzalez-Garcia, A., Modolo, D., & Ferrari, V. (2018). Do semantic parts emerge in convolutional neural networks? International Journal of Computer Vision (IJCV), 126(5), 476\u2013494.","journal-title":"International Journal of Computer Vision (IJCV)"},{"issue":"2","key":"1671_CR29","doi-asserted-by":"publisher","first-page":"87","DOI":"10.1007\/s13735-017-0141-z","volume":"7","author":"Y Guo","year":"2018","unstructured":"Guo, Y., Liu, Y., Georgiou, T., & Lew, M. S. (2018). A review of semantic segmentation using deep neural networks. International Journal of Multimedia Information Retrieval, 7(2), 87\u201393.","journal-title":"International Journal of Multimedia Information Retrieval"},{"key":"1671_CR30","doi-asserted-by":"crossref","unstructured":"Haggag, H., Abobakr, A., Hossny, M., & Nahavandi, S. (2016). Semantic body parts segmentation for quadrupedal animals. In 2016 IEEE international conference on systems, man, and cybernetics (SMC) (pp. 000855\u2013000860).","DOI":"10.1109\/SMC.2016.7844347"},{"issue":"2","key":"1671_CR31","doi-asserted-by":"publisher","first-page":"1041","DOI":"10.1109\/TITS.2019.2962094","volume":"22","author":"HY Han","year":"2020","unstructured":"Han, H. Y., Chen, Y. C., Hsiao, P. Y., & Fu, L. C. (2020). Using channel-wise attention for deep CNN based real-time semantic segmentation with class-aware edge information. IEEE Transactions on Intelligent Transportation Systems, 22(2), 1041\u20131051.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"1671_CR32","doi-asserted-by":"crossref","unstructured":"Hariharan, B., Arbel\u00e1ez, P., Girshick, R., & Malik, J. (2015). Hypercolumns for object segmentation and fine-grained localization. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 447\u2013456).","DOI":"10.1109\/CVPR.2015.7298642"},{"key":"1671_CR33","unstructured":"He, H., Zhang, J., Zhuang, B., Cai, J., & Tao, D. (2021a). End-to-end one-shot human parsing. arXiv preprint arXiv:2105.01241."},{"key":"1671_CR34","doi-asserted-by":"crossref","unstructured":"He, J., Yang, S., Yang, S., Kortylewski, A., Yuan, X., Chen, J. N., Liu, S., Yang, C. & Yuille, A. (2021b). Partimagenet: A large, high-quality dataset of parts. arXiv preprint arXiv:2112.00933.","DOI":"10.1007\/978-3-031-20074-8_8"},{"key":"1671_CR35","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J.(2016). Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 770\u2013778).","DOI":"10.1109\/CVPR.2016.90"},{"key":"1671_CR36","doi-asserted-by":"crossref","unstructured":"Huang, Z., Wang, X., Huang, L., Huang, C., Wei, Y., & Liu, W. (2019). Ccnet: Criss-cross attention for semantic segmentation. In Proceedings of international conference on computer vision (ICCV) (pp. 603\u2013612).","DOI":"10.1109\/ICCV.2019.00069"},{"key":"1671_CR37","doi-asserted-by":"crossref","unstructured":"Huang, Z., Wang, X., Wei, Y., et\u00a0al. (2020). Ccnet: Criss-cross attention for semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).","DOI":"10.1109\/TPAMI.2020.3007032"},{"key":"1671_CR38","doi-asserted-by":"crossref","unstructured":"Jiang, H., Sun, D., Jampani, V., Lv, Z., Learned-Miller, E., & Kautz, J. (2019). SENSE: A shared encoder network for scene-flow estimation. In Proceedings of international conference on computer vision (ICCV) (pp. 3195\u20133204).","DOI":"10.1109\/ICCV.2019.00329"},{"key":"1671_CR39","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1016\/j.patrec.2021.04.024","volume":"148","author":"Y Jin","year":"2021","unstructured":"Jin, Y., Han, D., & Ko, H. (2021). Trseg: Transformer for semantic segmentation. Pattern Recognition Letters, 148, 29\u201335.","journal-title":"Pattern Recognition Letters"},{"key":"1671_CR40","unstructured":"Kang, B., Xie, S., Rohrbach, M., Yan, Z., Gordo, A., Feng, J., & Kalantidis, Y. (2019). Decoupling representation and classifier for long-tailed recognition. In International Conference on Learning Representations (ICLR)."},{"key":"1671_CR41","doi-asserted-by":"crossref","unstructured":"Krause, J., Jin, H., Yang, J., & Fei-Fei, L. (2015). Fine-grained recognition without part annotations. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 5546\u20135555).","DOI":"10.1109\/CVPR.2015.7299194"},{"key":"1671_CR42","unstructured":"Li, J., Zhao, J., Wei, Y., Lang, C., Li, Y., Sim, T., Yan, S. & Feng, J. (2017). Multiple-human parsing in the wild. arXiv preprint arXiv:1705.07206."},{"key":"1671_CR43","unstructured":"Li, P., Xu, Y., Wei, Y., & Yang, Y. (2020a). Self-correction for human parsing. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)."},{"key":"1671_CR44","doi-asserted-by":"crossref","unstructured":"Li, X., Li, X., Zhang, L., Cheng, G., Shi, J., Lin, Z., Tan, S. & Tong, Y. (2020b). Improving semantic segmentation via decoupled body and edge supervision. In Proceedings of European conference on computer vision (ECCV) (pp. 435\u2013452). Springer.","DOI":"10.1007\/978-3-030-58520-4_26"},{"issue":"12","key":"1671_CR45","doi-asserted-by":"publisher","first-page":"2935","DOI":"10.1109\/TPAMI.2017.2773081","volume":"40","author":"Z Li","year":"2018","unstructured":"Li, Z., & Hoiem, D. (2018). Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 40(12), 2935\u20132947.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"1671_CR46","doi-asserted-by":"publisher","first-page":"370","DOI":"10.1016\/j.neucom.2021.07.045","volume":"461","author":"T Liang","year":"2021","unstructured":"Liang, T., Glossner, J., Wang, L., Shi, S., & Zhang, X. (2021). Pruning and quantization for deep neural network acceleration: A survey. Neurocomputing, 461, 370\u2013403.","journal-title":"Neurocomputing"},{"issue":"12","key":"1671_CR47","doi-asserted-by":"publisher","first-page":"2402","DOI":"10.1109\/TPAMI.2015.2408360","volume":"37","author":"X Liang","year":"2015","unstructured":"Liang, X., Liu, S., Shen, X., Yang, J., Liu, L., Dong, J., Lin, L., & Yan, S. (2015). Deep human parsing with active template regression. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 37(12), 2402\u20132414.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"1671_CR48","doi-asserted-by":"crossref","unstructured":"Liang, X., Shen, X., Feng, J., Lin, L., & Yan, S. (2016). Semantic object parsing with graph lstm. In Proceedings of European Conference on Computer Vision (ECCV) (pp. 125\u2013143). Springer.","DOI":"10.1007\/978-3-319-46448-0_8"},{"key":"1671_CR49","doi-asserted-by":"crossref","unstructured":"Liang, X., Lin, L., Shen, X., Feng, J., Yan, S., & Xing, E. P. (2017). Interpretable structure-evolving lstm. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1010\u20131019).","DOI":"10.1109\/CVPR.2017.234"},{"issue":"4","key":"1671_CR50","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1109\/TPAMI.2018.2820063","volume":"41","author":"X Liang","year":"2018","unstructured":"Liang, X., Gong, K., Shen, X., & Lin, L. (2018). Look into person: Joint body parsing & pose estimation network and a new benchmark. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 41(4), 871\u2013885.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)"},{"key":"1671_CR51","doi-asserted-by":"crossref","unstructured":"Liu, X., Deng, Z., & Yang, Y. (2019a). Recent progress in semantic image segmentation. Artificial Intelligence Review, 52(2), 1089\u20131106.","DOI":"10.1007\/s10462-018-9641-3"},{"key":"1671_CR52","doi-asserted-by":"crossref","unstructured":"Liu, Z., Miao, Z., Zhan, X., Wang, J., Gong, B., & Yu, S. X. (2019b). Large-scale long-tailed recognition in an open world. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2537\u20132546).","DOI":"10.1109\/CVPR.2019.00264"},{"key":"1671_CR53","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., & Guo, B. (2021). Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of international conference on computer vision (ICCV) (pp. 10012\u201310022).","DOI":"10.1109\/ICCV48922.2021.00986"},{"issue":"3","key":"1671_CR54","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1007\/s10044-012-0284-8","volume":"16","author":"L Livi","year":"2013","unstructured":"Livi, L., & Rizzi, A. (2013). The graph matching problem. Pattern Analysis and Applications, 16(3), 253\u2013283.","journal-title":"Pattern Analysis and Applications"},{"key":"1671_CR55","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3431\u20133440).","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"1671_CR56","doi-asserted-by":"crossref","unstructured":"Lu, W., Lian, X., & Yuille, A. (2014). Parsing semantic parts of cars using graphical models and segment appearance consistency. In Proceedings of British Machine Vision Conference (BMVC).","DOI":"10.5244\/C.28.118"},{"key":"1671_CR57","doi-asserted-by":"crossref","unstructured":"Maracani, A., Michieli, U., Toldo, M., & Zanuttigh, P. (2021). Recall: Replay-based continual learning in semantic segmentation. In Proceedings of International Conference on Computer Vision (ICCV) (pp. 7026\u20137035).","DOI":"10.1109\/ICCV48922.2021.00694"},{"issue":"1","key":"1671_CR58","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3390\/technologies8010001","volume":"8","author":"M Mel","year":"2020","unstructured":"Mel, M., Michieli, U., & Zanuttigh, P. (2020). Incremental and multi-task learning strategies for coarse-to-fine semantic segmentation. Technologies, 8(1), 1.","journal-title":"Technologies"},{"key":"1671_CR59","unstructured":"Michieli, U., & Ozay, M. (2021). Prototype guided federated learning of visual feature representations. arXiv preprint arXiv:2105.08982."},{"key":"1671_CR60","doi-asserted-by":"crossref","unstructured":"Michieli, U., & Zanuttigh, P. (2019). Incremental learning techniques for semantic segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition workshops (CVPRW).","DOI":"10.1109\/ICCVW.2019.00400"},{"key":"1671_CR61","doi-asserted-by":"crossref","unstructured":"Michieli, U., & Zanuttigh, P. (2021a). Knowledge distillation for incremental learning in semantic segmentation. Computer Vision and Image Understanding, 205, 103167.","DOI":"10.1016\/j.cviu.2021.103167"},{"key":"1671_CR62","doi-asserted-by":"crossref","unstructured":"Michieli, U., & Zanuttigh, P. (2021b). Continual semantic segmentation via repulsion-attraction of sparse and disentangled latent representations. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1114\u20131124).","DOI":"10.1109\/CVPR46437.2021.00117"},{"key":"1671_CR63","doi-asserted-by":"crossref","unstructured":"Michieli, U., Borsato, E., Rossi, L., & Zanuttigh, P. (2020). Gmnet: Graph matching network for large scale part semantic segmentation in the wild. In Proceedings of European conference on computer vision (ECCV) (pp. 397\u2013414). Springer.","DOI":"10.1007\/978-3-030-58598-3_24"},{"key":"1671_CR64","doi-asserted-by":"crossref","unstructured":"Nie, X., Feng, J., & Yan, S. (2018). Mutual learning to adapt for joint human parsing and pose estimation. In Proceedings of European conference on computer vision (ECCV) (pp. 502\u2013517).","DOI":"10.1007\/978-3-030-01228-1_31"},{"key":"1671_CR65","doi-asserted-by":"crossref","unstructured":"Rebuffi, S. A., Kolesnikov, A., Sperl, G., & Lampert, C. H. (2017). icarl: Incremental classifier and representation learning. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2001\u20132010).","DOI":"10.1109\/CVPR.2017.587"},{"key":"1671_CR66","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In International conference on medical image computing and computer-assisted intervention (pp. 234\u2013241). Springer.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"1671_CR67","doi-asserted-by":"crossref","unstructured":"Ruan, T., Liu, T., Huang, Z., Wei, Y., Wei, S., & Zhao, Y. (2019). Devil in the details: Towards accurate single and multiple human parsing. In Proceedings of the AAAI conference on artificial intelligence (AAAI) (pp. 4814\u20134821).","DOI":"10.1609\/aaai.v33i01.33014814"},{"key":"1671_CR68","doi-asserted-by":"crossref","unstructured":"Shmelkov, K., Schmid, C., & Alahari, K. (2017). Incremental learning of object detectors without catastrophic forgetting. In Proceedings of international conference on computer vision (ICCV) (pp. 3400\u20133409).","DOI":"10.1109\/ICCV.2017.368"},{"key":"1671_CR69","doi-asserted-by":"crossref","unstructured":"Song, Y., Chen, X., Li, J., & Zhao, Q. (2017). Embedding 3d geometric features for rigid object part segmentation. In Proceedings of international conference on computer vision (ICCV) (pp. 580\u2013588).","DOI":"10.1109\/ICCV.2017.70"},{"key":"1671_CR70","doi-asserted-by":"crossref","unstructured":"Strudel, R., Garcia, R., Laptev, I., & Schmid, C. (2021). Segmenter: Transformer for semantic segmentation. In Proceedings of international conference on computer vision (ICCV) (pp. 7262\u20137272).","DOI":"10.1109\/ICCV48922.2021.00717"},{"key":"1671_CR71","doi-asserted-by":"crossref","unstructured":"Sun, J., & Ponce, J. (2013). Learning discriminative part detectors for image classification and cosegmentation. In Proceedings of international conference on computer vision (ICCV) (pp. 3400\u20133407).","DOI":"10.1109\/ICCV.2013.422"},{"key":"1671_CR72","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. Neural Information Processing Systems (NeurIPS) 30"},{"key":"1671_CR73","doi-asserted-by":"crossref","unstructured":"Vu, T. H., Jain, H., Bucher, M., Cord, M., & P\u00e9rez, P. (2019). Advent: Adversarial entropy minimization for domain adaptation in semantic segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2517\u20132526).","DOI":"10.1109\/CVPR.2019.00262"},{"key":"1671_CR74","doi-asserted-by":"crossref","unstructured":"Wan, W., Chen, J., Li, T., Huang, Y., Tian, J., Yu, C., & Xue, Y. (2019). Information entropy based feature pooling for convolutional neural networks. In Proceedings of international conference on computer vision (ICCV) (pp. 3405\u20133414).","DOI":"10.1109\/ICCV.2019.00350"},{"key":"1671_CR75","doi-asserted-by":"crossref","unstructured":"Wang, J., & Yuille, A. L. (2015). Semantic part segmentation using compositional model combining shape and appearance. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1788\u20131797).","DOI":"10.1109\/CVPR.2015.7298788"},{"key":"1671_CR76","doi-asserted-by":"crossref","unstructured":"Wang, P., Shen, X., Lin, Z., Cohen, S., Price, B. & Yuille, A. L. (2015). Joint object and part segmentation using deep learned potentials. In Proceedings of international conference on computer vision (ICCV) (pp. 1573\u20131581).","DOI":"10.1109\/ICCV.2015.184"},{"key":"1671_CR77","first-page":"3075","volume":"13","author":"Y Wang","year":"2012","unstructured":"Wang, Y., Tran, D., Liao, Z., & Forsyth, D. (2012). Discriminative hierarchical part-based models for human parsing and action recognition. Journal of Machine Learning Research, 13, 3075\u20133102.","journal-title":"Journal of Machine Learning Research"},{"key":"1671_CR78","unstructured":"Xia, F., Zhu, J., Wang, P., & Yuille, A.(2015). Pose-guided human parsing with deep learned features. arXiv preprint arXiv:1508.03881."},{"key":"1671_CR79","doi-asserted-by":"crossref","unstructured":"Xia, F., Wang, P., Chen, L. C., & Yuille, A. L. (2016). Zoom better to see clearer: Human and object parsing with hierarchical auto-zoom net. In Proceedings of European conference on computer vision (ECCV) (pp. 648\u2013663). Springer.","DOI":"10.1007\/978-3-319-46454-1_39"},{"key":"1671_CR80","doi-asserted-by":"crossref","unstructured":"Xia, F., Wang, P., Chen, X., & Yuille, A. L. (2017). Joint multi-person pose estimation and semantic part segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 6769\u20136778).","DOI":"10.1109\/CVPR.2017.644"},{"key":"1671_CR81","unstructured":"Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J. M., & Luo, P. (2021). Segformer: Simple and efficient design for semantic segmentation with transformers. In Neural information processing systems (NeurIPS)."},{"key":"1671_CR82","doi-asserted-by":"crossref","unstructured":"Yamaguchi, K., Kiapour, M. H., Ortiz, L. E., & Berg, T. L. (2012). Parsing clothing in fashion photographs. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 3570\u20133577).","DOI":"10.1109\/CVPR.2012.6248101"},{"key":"1671_CR83","doi-asserted-by":"crossref","unstructured":"Yang, Y., & Ramanan, D. (2011). Articulated pose estimation with flexible mixtures-of-parts. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 1385\u20131392).","DOI":"10.1109\/CVPR.2011.5995741"},{"key":"1671_CR84","doi-asserted-by":"crossref","unstructured":"Yin, J., Liu, W., Xing, W., & Xiao, Y. (2021). Class-level aware network for human parsing. In International conference on computing, networks and internet of things (pp. 1\u20136).","DOI":"10.1145\/3468691.3468733"},{"key":"1671_CR85","doi-asserted-by":"crossref","unstructured":"Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., & Sang, N. (2018). Bisenet: Bilateral segmentation network for real-time semantic segmentation. In Proceedings of European conference on computer vision (ECCV) (pp. 325\u2013341).","DOI":"10.1007\/978-3-030-01261-8_20"},{"issue":"11","key":"1671_CR86","doi-asserted-by":"publisher","first-page":"3051","DOI":"10.1007\/s11263-021-01515-2","volume":"129","author":"C Yu","year":"2021","unstructured":"Yu, C., Gao, C., Wang, J., Yu, G., Shen, C., & Sang, N. (2021). Bisenet v2: Bilateral network with guided aggregation for real-time semantic segmentation. International Journal of Computer Vision (IJCV), 129(11), 3051\u20133068.","journal-title":"International Journal of Computer Vision (IJCV)"},{"key":"1671_CR87","doi-asserted-by":"crossref","unstructured":"Yu, F., Koltun, V., & Funkhouser, T. (2017). Dilated residual networks. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR).","DOI":"10.1109\/CVPR.2017.75"},{"key":"1671_CR88","unstructured":"Yuan, Y., Huang, L., Guo, J., Zhang, C., Chen, X., & Wang, J. (2018). Ocnet: Object context network for scene parsing. arXiv preprint arXiv:1809.00916."},{"key":"1671_CR89","doi-asserted-by":"crossref","unstructured":"Zhang, N., Donahue, J., Girshick, R., & Darrell, T. (2014). Part-based r-cnns for fine-grained category detection. In Proceedings of European Conference on Computer Vision (ECCV) (pp. 834\u2013849). Springer.","DOI":"10.1007\/978-3-319-10590-1_54"},{"key":"1671_CR90","doi-asserted-by":"crossref","unstructured":"Zhang, W., Huang, Z., Luo, G., Chen, T., Wang, X., Liu, W., Yu, G., & Shen, C. (2022) Topformer: Token pyramid transformer for mobile semantic segmentation. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR).","DOI":"10.1109\/CVPR52688.2022.01177"},{"issue":"2","key":"1671_CR91","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s11432-019-2718-7","volume":"63","author":"Z Zhang","year":"2020","unstructured":"Zhang, Z., & Pang, Y. (2020). Cgnet: Cross-guidance network for semantic segmentation. Science China Information Sciences, 63(2), 1\u201316.","journal-title":"Science China Information Sciences"},{"key":"1671_CR92","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Fu, H., Dai, H., Shen, J., Pang, Y., & Shao, L. (2019). Et-net: A generic edge-attention guidance network for medical image segmentation. In International conference on medical image computing and computer-assisted intervention (pp. 442\u2013450). Springer.","DOI":"10.1007\/978-3-030-32239-7_49"},{"key":"1671_CR93","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., & Jia, J. (2017a). Pyramid scene parsing network. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2881\u20132890).","DOI":"10.1109\/CVPR.2017.660"},{"key":"1671_CR94","doi-asserted-by":"crossref","unstructured":"Zhao, J., Li, J., Nie, X., Zhao, F., Chen, Y., Wang, Z., Feng, J. & Yan, S. (2017b). Self-supervised neural aggregation networks for human parsing. In Proceedings of IEEE conference on computer vision and pattern recognition workshops (CVPRW) (pp. 7\u201315).","DOI":"10.1109\/CVPRW.2017.204"},{"key":"1671_CR95","doi-asserted-by":"crossref","unstructured":"Zhao, Y., Li, J., Zhang, Y., & Tian, Y. (2019). Multi-class part parsing with joint boundary-semantic awareness. In Proceedings of international conference on computer vision (ICCV) (pp. 9177\u20139186).","DOI":"10.1109\/ICCV.2019.00927"},{"key":"1671_CR96","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., Torr, P. H., & Zhang, L. (2021) Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 6881\u20136890).","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"1671_CR97","doi-asserted-by":"crossref","unstructured":"Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., & Torralba, A. (2017). Scene parsing through ade20k dataset. In Proceedings of IEEE conference on computer vision and pattern recognition (CVPR) (pp. 633\u2013641).","DOI":"10.1109\/CVPR.2017.544"},{"issue":"1","key":"1671_CR98","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s11263-010-0375-1","volume":"93","author":"LL Zhu","year":"2011","unstructured":"Zhu, L. L., Chen, Y., Lin, C., & Yuille, A. (2011). Max margin learning of hierarchical configural deformable templates (hcdts) for efficient object parsing and pose estimation. International Journal of Computer Vision (IJCV), 93(1), 1\u201321.","journal-title":"International Journal of Computer Vision (IJCV)"}],"container-title":["International Journal of Computer Vision"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-022-01671-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11263-022-01671-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11263-022-01671-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,2,18]],"date-time":"2023-02-18T01:22:36Z","timestamp":1676683356000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11263-022-01671-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,3]]},"references-count":98,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2022,11]]}},"alternative-id":["1671"],"URL":"https:\/\/doi.org\/10.1007\/s11263-022-01671-z","relation":{},"ISSN":["0920-5691","1573-1405"],"issn-type":[{"value":"0920-5691","type":"print"},{"value":"1573-1405","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,3]]},"assertion":[{"value":"23 December 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 August 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 September 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}