{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,17]],"date-time":"2026-06-17T01:48:30Z","timestamp":1781660910011,"version":"3.54.5"},"reference-count":24,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2020,3,6]],"date-time":"2020-03-06T00:00:00Z","timestamp":1583452800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004739","name":"Youth Innovation Promotion Association of the Chinese Academy of Sciences","doi-asserted-by":"publisher","award":["2016336"],"award-info":[{"award-number":["2016336"]}],"id":[{"id":"10.13039\/501100004739","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Semantic image segmentation, as one of the most popular tasks in computer vision, has been widely used in autonomous driving, robotics and other fields. Currently, deep convolutional neural networks (DCNNs) are driving major advances in semantic segmentation due to their powerful feature representation. However, DCNNs extract high-level feature representations by strided convolution, which makes it impossible to segment foreground objects precisely, especially when locating object boundaries. This paper presents a novel semantic segmentation algorithm with DeepLab v3+ and super-pixel segmentation algorithm-quick shift. DeepLab v3+ is employed to generate a class-indexed score map for the input image. Quick shift is applied to segment the input image into superpixels. Outputs of them are then fed into a class voting module to refine the semantic segmentation results. 
Extensive experiments on the proposed semantic image segmentation method are performed on the PASCAL VOC 2012 dataset, and the results show that the proposed method can provide a more efficient solution.<\/jats:p>","DOI":"10.3390\/sym12030427","type":"journal-article","created":{"date-parts":[[2020,3,6]],"date-time":"2020-03-06T09:26:41Z","timestamp":1583486801000},"page":"427","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":36,"title":["Semantic Image Segmentation with Deep Convolutional Neural Networks and Quick Shift"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7800-0751","authenticated-orcid":false,"given":"Sanxing","family":"Zhang","sequence":"first","affiliation":[{"name":"Institute of Optics and Electronics Chinese Academy of Science, Chengdu 610209, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhenhuan","family":"Ma","sequence":"additional","affiliation":[{"name":"Institute of Optics and Electronics Chinese Academy of Science, Chengdu 610209, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6730-045X","authenticated-orcid":false,"given":"Gang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Optics and Electronics Chinese Academy of Science, Chengdu 610209, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0900-1582","authenticated-orcid":false,"given":"Tao","family":"Lei","sequence":"additional","affiliation":[{"name":"Institute of Optics and Electronics Chinese Academy of Science, Chengdu 610209, 
China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6889-1247","authenticated-orcid":false,"given":"Rui","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Optics and Electronics Chinese Academy of Science, Chengdu 610209, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yi","family":"Cui","sequence":"additional","affiliation":[{"name":"Institute of Optics and Electronics Chinese Academy of Science, Chengdu 610209, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2020,3,6]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., and Garcia-Rodriguez, J. (2017). A review on deep learning techniques applied to semantic segmentation. arXiv.","DOI":"10.1016\/j.asoc.2018.05.018"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Gu, Y., Wang, Y., and Li, Y. (2019). A Survey on Deep Learning-Driven Remote Sensing Image Scene Understanding: Scene Classification, Scene Retrieval and Scene-Guided Object Detection. Appl. Sci., 9.","DOI":"10.3390\/app9102110"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ma, H., Liu, Y., Ren, Y., and Yu, J. (2020). Detection of Collapsed Buildings in Post-Earthquake Remote Sensing Images Based on the Improved YOLOv3. Remote Sens., 12.","DOI":"10.3390\/rs12010044"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Lu, Z., and Chen, D. (2020). Weakly Supervised and Semi-Supervised Semantic Segmentation for Optic Disc of Fundus Image. Symmetry, 12.","DOI":"10.3390\/sym12010145"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_6","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the Advances in Neural Information Processing Systems 25, Lake Tahoe, CA, USA."},{"key":"ref_7","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_10","unstructured":"Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2014). Semantic image segmentation with deep convolutional nets and fully connected crfs. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_12","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Adam, H. (2017). Rethinking atrous convolution for semantic image segmentation. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., and Adam, H. 
(2018, January 8\u201314). Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Wang, P., Chen, P., Yuan, Y., Liu, D., Huang, Z., Hou, X., and Cottrell, G. (2018, January 12\u201315). Understanding convolution for semantic segmentation. Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), Lake Tahoe, NV, USA.","DOI":"10.1109\/WACV.2018.00163"},{"key":"ref_15","unstructured":"Kr\u00e4henb\u00fchl, P., and Koltun, V. (2011, January 12\u201314). Efficient inference in fully connected crfs with gaussian edge potentials. Proceedings of the Advances in Neural Information Processing Systems 24, Granada, Spain."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 21\u201326). Xception: Deep learning with depthwise separable convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_17","unstructured":"Sifre, L., and Mallat, S. (2014). Rigid-Motion Scattering for Image Classification. [Ph.D. Thesis, Ecole Normale Superieure]."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ren, X., and Malik, J. (2003, January 13\u201316). Learning a classification model for segmentation. Proceedings of the IEEE International Conference on Computer Vision, Nice, France.","DOI":"10.1109\/ICCV.2003.1238308"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Vedaldi, A., and Soatto, S. (2008, January 12\u201318). Quick shift and kernel methods for mode seeking. Proceedings of the European Conference on Computer Vision, Marseille, France.","DOI":"10.1007\/978-3-540-88693-8_52"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Sheikh, Y.A., Khan, E.A., and Kanade, T. 
(2007, January 14\u201321). Mode-seeking by medoidshifts. Proceedings of the IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil.","DOI":"10.1109\/ICCV.2007.4408978"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"5312","DOI":"10.1109\/TIP.2017.2728185","article-title":"Nonparametric joint shape and feature priors for image segmentation","volume":"26","author":"Erdil","year":"2017","journal-title":"IEEE Trans. Image Process."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2274","DOI":"10.1109\/TPAMI.2012.120","article-title":"SLIC superpixels compared to state-of-the-art superpixel methods","volume":"34","author":"Achanta","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1023\/B:VISI.0000022288.19776.77","article-title":"Efficient graph-based image segmentation","volume":"59","author":"Felzenszwalb","year":"2004","journal-title":"Int. J. Comput. Vision."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Hariharan, B., Arbel\u00e1ez, P., Bourdev, L., Maji, S., and Malik, J. (2011, January 6\u201313). Semantic contours from inverse detectors. 
Proceedings of the IEEE International Conference on Computer Vision, Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126343"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/12\/3\/427\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:04:53Z","timestamp":1760173493000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/12\/3\/427"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,3,6]]},"references-count":24,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2020,3]]}},"alternative-id":["sym12030427"],"URL":"https:\/\/doi.org\/10.3390\/sym12030427","relation":{},"ISSN":["2073-8994"],"issn-type":[{"value":"2073-8994","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,3,6]]}}}