{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,10]],"date-time":"2026-06-10T03:17:14Z","timestamp":1781061434381,"version":"3.54.1"},"reference-count":46,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2019,12,12]],"date-time":"2019-12-12T00:00:00Z","timestamp":1576108800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004739","name":"Youth Innovation Promotion Association, Chinese Academy of Sciences","doi-asserted-by":"publisher","award":["2016336"],"award-info":[{"award-number":["2016336"]}],"id":[{"id":"10.13039\/501100004739","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"<jats:p>Semantic segmentation on high-resolution aerial images plays a significant role in many remote sensing applications. Although the Deep Convolutional Neural Network (DCNN) has shown great performance in this task, it still faces the following two challenges: intra-class heterogeneity and inter-class homogeneity. To overcome these two problems, a novel dual-path DCNN, which contains a spatial path and an edge path, is proposed for high-resolution aerial image segmentation. The spatial path, which combines the multi-level and global context features to encode the local and global information, is used to address the intra-class heterogeneity challenge. For inter-class homogeneity problem, a Holistically-nested Edge Detection (HED)-like edge path is employed to detect the semantic boundaries for the guidance of feature learning. Furthermore, we improve the computational efficiency of the network by employing the backbone of MobileNetV2. We enhance the performance of MobileNetV2 with two modifications: (1) replacing the standard convolution in the last four Bottleneck Residual Blocks (BRBs) with atrous convolution; and (2) removing the convolution stride of 2 in the first layer of BRBs 4 and 6. Experimental results on the ISPRS Vaihingen and Potsdam 2D labeling dataset show that the proposed DCNN achieved real-time inference speed on a single GPU card with better performance, compared with the state-of-the-art baselines.<\/jats:p>","DOI":"10.3390\/ijgi8120582","type":"journal-article","created":{"date-parts":[[2019,12,12]],"date-time":"2019-12-12T11:06:41Z","timestamp":1576148801000},"page":"582","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":27,"title":["A Dual-Path and Lightweight Convolutional Neural Network for High-Resolution Aerial Image Segmentation"],"prefix":"10.3390","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6730-045X","authenticated-orcid":false,"given":"Gang","family":"Zhang","sequence":"first","affiliation":[{"name":"Institute of Optics and Electronics, Chinese Academy of Sciences, P.O. Box 350, No.1 Guangdian Avenue, Chengdu 610209, China"},{"name":"University of Chinese Academy of Sciences, No. 19 (A) Yuquan Road, Beijing 100049, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0900-1582","authenticated-orcid":false,"given":"Tao","family":"Lei","sequence":"additional","affiliation":[{"name":"Institute of Optics and Electronics, Chinese Academy of Sciences, P.O. Box 350, No.1 Guangdian Avenue, Chengdu 610209, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yi","family":"Cui","sequence":"additional","affiliation":[{"name":"Institute of Optics and Electronics, Chinese Academy of Sciences, P.O. Box 350, No.1 Guangdian Avenue, Chengdu 610209, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ping","family":"Jiang","sequence":"additional","affiliation":[{"name":"Institute of Optics and Electronics, Chinese Academy of Sciences, P.O. Box 350, No.1 Guangdian Avenue, Chengdu 610209, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2019,12,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/1047-3203(90)90014-M","article-title":"Morphological segmentation","volume":"1","author":"Meyer","year":"1990","journal-title":"J. Vis. Commun. Image R."},{"key":"ref_2","unstructured":"Boykov, Y.Y., and Jolly, M.P. (2001, January 7\u201314). Interactive graph cuts for optimal boundary and region segmentation of objects in ND images. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Vancouver, BC, Canada."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1080\/01431160412331269698","article-title":"Random forest classifier for remote sensing classification","volume":"26","author":"Pal","year":"2005","journal-title":"Int. J. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Long, J., Shelhamer, E., and Darrell, T. (2015, January 7\u201312). Fully convolutional networks for semantic segmentation. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2481","DOI":"10.1109\/TPAMI.2016.2644615","article-title":"SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation","volume":"39","author":"Vijay","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ronneberger, O., Fischer, P., and Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv.","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"ref_7","unstructured":"Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2016). DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. arXiv."},{"key":"ref_8","unstructured":"Chen, L.C., Papandreou, G., Schroff, F., and Hartwig, A. (2017). Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Peng, C., Zhang, X., Yu, G., Luo, G., and Sun, J. (2017). Large Kernel Matters\u2014Improve Semantic Segmentation by Global Convolutional Network. arXiv.","DOI":"10.1109\/CVPR.2017.189"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018). MobileNetV2: Inverted residuals and linear bottlenecks. arXiv.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"2279","DOI":"10.1016\/S0031-3203(01)00178-9","article-title":"Image processing with neural networks\u2014A review","volume":"35","author":"Handels","year":"2002","journal-title":"Pattern Recogn."},{"key":"ref_13","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., and Garcia-Rodriguez, J. (2017). A review on deep learning techniques applied to semantic segmentation. arXiv.","DOI":"10.1016\/j.asoc.2018.05.018"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Zhao, H., Shi, J., Qi, X., Wang, X., and Jia, J. (2016). Pyramid Scene Parsing Network. arXiv.","DOI":"10.1109\/CVPR.2017.660"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Xie, S., and Tu, Z. (2015, January 7\u201313). Holistically-nested edge detection. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.164"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1109\/MGRS.2017.2762307","article-title":"Deep learning in remote sensing: A comprehensive review and list of resources","volume":"5","author":"Zhu","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Kampffmeyer, M., Salberg, A.B., and Jenssen, R. (July, January 26). Semantic Segmentation of Small Objects and Modeling of Uncertainty in Urban Remote Sensing Images Using Deep Convolutional Neural Networks. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, USA.","DOI":"10.1109\/CVPRW.2016.90"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Guo, R., Liu, J., Li, N., Liu, S., Chen, F., Cheng, B., and Ma, C. (2018). Pixel-wise classification method for high resolution remote sensing imagery using deep neural networks. ISPRS Int. J. Geo-Inf., 7.","DOI":"10.3390\/ijgi7030110"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Chen, G., Li, C., Wei, W., Jing, W., Wo\u017aniak, M., Bla\u017eauskas, T., and Dama\u0161evi\u010dius, R. (2019). Fully Convolutional Neural Network with Augmented Atrous Spatial Pyramid Pool and Fully Connected Fusion Path for High Resolution Remote Sensing Image Segmentation. Appl. Sci., 9.","DOI":"10.3390\/app9091816"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Liu, W., Cheng, D., Yin, P., Yang, M., Li, E., Xie, M., and Zhang, L. (2019). Small Manhole Cover Detection in Remote Sensing Imagery with Deep Convolutional Neural Networks. ISPRS Int. J. Geo-Inf., 8.","DOI":"10.3390\/ijgi8010049"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Schuegraf, P., and Bittner, K. (2019). Automatic Building Footprint Extraction from Multi-Resolution Remote Sensing Images Using a Hybrid FCN. ISPRS Int. J. Geo-Inf., 8.","DOI":"10.3390\/ijgi8040191"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Panboonyuen, T., Jitkajornwanich, K., Lawawirojwong, S., Srestasathiern, P., and Vateekul, P. (2019). Semantic Segmentation on Remotely Sensed Images Using an Enhanced Global Convolutional Network with Channel Attention and Domain Specific Transfer Learning. Remote Sens., 11.","DOI":"10.20944\/preprints201812.0090.v3"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Liu, P., Liu, X., Liu, M., Shi, Q., Yang, J., Xu, X., and Zhang, Y. (2019). Building Footprint Extraction from High-Resolution Images via Spatial Residual Inception Convolutional Neural Network. Remote Sens., 11.","DOI":"10.3390\/rs11070830"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Pan, X., Yang, F., Gao, L., Chen, Z., Zhang, B., Fan, H., and Ren, J. (2019). Building Extraction from High-Resolution Aerial Imagery Using a Generative Adversarial Network with Spatial and Channel Attention Mechanisms. Remote Sens., 11.","DOI":"10.3390\/rs11080917"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Benjdira, B., Bazi, Y., Koubaa, A., and Ouni, K. (2019). Unsupervised Domain Adaptation Using Generative Adversarial Networks for Semantic Segmentation of Aerial Images. Remote Sens., 11.","DOI":"10.3390\/rs11111369"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Pan, X., Gao, L., Zhang, B., Yang, F., and Liao, W. (2018). High-Resolution Aerial Imagery Semantic Labeling with Dense Pyramid Network. Sensors, 18.","DOI":"10.3390\/s18113774"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Yao, X., Yang, H., Wu, Y., Wu, P., Wang, B., Zhou, X., and Wang, S. (2019). Land Use Classification of the Deep Convolutional Neural Network Method Reducing the Loss of Spatial Features. Sensors, 19.","DOI":"10.3390\/s19122792"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1016\/j.isprsjprs.2017.12.007","article-title":"Semantic labeling in very high resolution images via a self-cascaded convolutional neural network","volume":"145","author":"Liu","year":"2018","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Wu, G., Guo, Y., Song, X., Guo, Z., Zhang, H., Shi, X., Shibasaki, R., and Shao, X. (2019). A Stacked Fully Convolutional Networks with Feature Alignment Framework for Multi-Label Land-cover Segmentation. Remote Sens., 11.","DOI":"10.3390\/rs11091051"},{"key":"ref_31","unstructured":"Marmanis, D., Schindler, K., Wegner, J.D., Galliani, S., Datcu, M., and Stilla, U. (2016). Classification with an edge: Improving semantic image segmentation with boundary detection. arXiv."},{"key":"ref_32","first-page":"1","article-title":"Convolutional deep belief networks on cifar-10","volume":"40","author":"Krizhevsky","year":"2010","journal-title":"Unpubl. Manuscr."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. arXiv.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep residual learning for image recognition. arXiv.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2017). Squeeze-and-excitation networks. arXiv.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_36","unstructured":"Yosinski, J., Clune, J., Bengio, Y., and Lipson, H. (2014, January 8\u201313). How transferable are features in deep neural networks?. Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS), Montreal, QC, Canada."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vision"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. arXiv.","DOI":"10.1109\/ICCV.2015.123"},{"key":"ref_39","unstructured":"ISPRS (International Society for Photogrammetry and Remote Sensing) (2018, November 10). 2D Semantic Labeling Challenge. Available online: http:\/\/www2.isprs.org\/commissions\/comm3\/wg4\/semantic-labeling.html."},{"key":"ref_40","unstructured":"(2017, September 09). Facebook. Available online: http:\/\/pytorch.org."},{"key":"ref_41","unstructured":"Duda, R., Hart, P., and Stork, D. (2000). Pattern Classification, Wiley Press."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Zhao, H., Qi, X., Shen, X., Shi, J., and Jia, J. (2018). ICNet for Real-Time Semantic Segmentation on High-Resolution Images. arXiv.","DOI":"10.1007\/978-3-030-01219-9_25"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Mehta, S., Rastegari, M., Caspi, A., Shapiro, L., and Hajishirzi, H. (2018). ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation. arXiv.","DOI":"10.1007\/978-3-030-01249-6_34"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Yu, C., Wang, J., Peng, C., Gao, C., Yu, G., and Sang, N. (2018, January 8\u201314). BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01261-8_20"},{"key":"ref_45","unstructured":"Nekrasov, V., Shen, C., and Reid, I. (2018, January 3\u20136). Light-Weight RefineNet for Real-Time Semantic Segmentation. Proceedings of the 29th British Machine Vision Conference (BMVC), Newcastle, UK."},{"key":"ref_46","unstructured":"Li, G., Milan, A., Shen, C., and Reid, I. (2016). RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation. arXiv."}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/8\/12\/582\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:41:35Z","timestamp":1760190095000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/8\/12\/582"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,12,12]]},"references-count":46,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2019,12]]}},"alternative-id":["ijgi8120582"],"URL":"https:\/\/doi.org\/10.3390\/ijgi8120582","relation":{},"ISSN":["2220-9964"],"issn-type":[{"value":"2220-9964","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,12,12]]}}}