{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T04:30:28Z","timestamp":1772253028854,"version":"3.50.1"},"reference-count":96,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2017,5,19]],"date-time":"2017-05-19T00:00:00Z","timestamp":1495152000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Semantic image segmentation has recently witnessed considerable progress by training deep convolutional neural networks (CNNs). The core issue of this technique is the limited capacity of CNNs to depict visual objects. Existing approaches tend to utilize approximate inference in a discrete domain or additional aides and do not have a global optimum guarantee. We propose the use of the multi-label manifold ranking (MR) method in solving the linear objective energy function in a continuous domain to delineate visual objects and solve these problems. We present a novel embedded single stream optimization method based on the MR model to avoid approximations without sacrificing expressive power. In addition, we propose a novel network, which we refer to as dual multi-scale manifold ranking (DMSMR) network, that combines the dilated, multi-scale strategies with the single stream MR optimization method in the deep learning architecture to further improve the performance. Experiments on high resolution images, including close-range and remote sensing datasets, demonstrate that the proposed approach can achieve competitive accuracy without additional aides in an end-to-end manner.<\/jats:p>","DOI":"10.3390\/rs9050500","type":"journal-article","created":{"date-parts":[[2017,5,23]],"date-time":"2017-05-23T01:47:33Z","timestamp":1495504053000},"page":"500","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":33,"title":["Learning Dual Multi-Scale Manifold Ranking for Semantic Segmentation of High-Resolution Images"],"prefix":"10.3390","volume":"9","author":[{"given":"Mi","family":"Zhang","sequence":"first","affiliation":[{"name":"School of Remote Sensing and Information Engineering, 129 Luoyu Road, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiangyun","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Remote Sensing and Information Engineering, 129 Luoyu Road, Wuhan University, Wuhan 430079, China"},{"name":"Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Like","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Remote Sensing and Information Engineering, 129 Luoyu Road, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ye","family":"Lv","sequence":"additional","affiliation":[{"name":"School of Remote Sensing and Information Engineering, 129 Luoyu Road, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Min","family":"Luo","sequence":"additional","affiliation":[{"name":"School of Remote Sensing and Information Engineering, 129 Luoyu Road, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shiyan","family":"Pang","sequence":"additional","affiliation":[{"name":"Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, China"},{"name":"School of Resource and Environmental Sciences, 129 Luoyu Road, Wuhan University, Wuhan 430079, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2017,5,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ladicky, L., Torr, P., and Zisserman, A. (2013, January 23\u201328). Human Pose Estimation using a Joint Pixel-wise and Part-wise Formulation. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.459"},{"key":"ref_2","unstructured":"Romera, E., Bergasa, L., and Arroyo, R. (arXiv, 2016). Can we unify monocular detectors for autonomous driving by using the pixel-wise semantic segmentation of CNNs?, arXiv."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Barrnes, D., Maddern, W., and Posner, I. (arXiv, 2016). Find Your Own Way: Weakly-Supervised Segmentation of Path Proposals for Urban Autonomy, arXiv.","DOI":"10.1109\/ICRA.2017.7989025"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Kendall, A., and Cipolla, R. (arXiv, 2015). Modelling Uncertainty in Deep Learning for Camera Relocalization, arXiv.","DOI":"10.1109\/ICRA.2016.7487679"},{"key":"ref_5","unstructured":"Xiao, J., and Quan, L. (October, January 29). Multiple View Semantic Segmentation for Street View Images. Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Floros, G., and Leibe, B. (2012, January 16\u201321). Joint 2D-3D Temporally Consistent Semantic Segmentation of Street Scenes. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248007"},{"key":"ref_7","unstructured":"Huval, B., Wang, T., Tandon, S., Kiske, J., Song, W., Pazhayampallil, J., and Mujica, F. (arXiv, 2015). An empirical evaluation of deep learning on highway driving, arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Chen, C., Seff, A., Kornhauser, A., and Xiao, J. (2015, January 7\u201313). Deepdriving: Learning affordance for direct perception in autonomous driving. Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.312"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Toshev, A., and Szegedy, C. (arXiv, 2014). DeepPose: Human Pose Estimation via Deep Neural Networks, arXiv.","DOI":"10.1109\/CVPR.2014.214"},{"key":"ref_10","unstructured":"Tompson, J.J., Jain, A., LeCun, Y., and Bregler, C. (arXiv, 2014). Joint training of a convolutional network and a graphical model for human pose estimation, arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Jackson, A., Valstar, M., and Tzimiropoulos, G. (arXiv, 2016). A CNN Cascade for Landmark Guided Semantic Part Segmentation, arXiv.","DOI":"10.1007\/978-3-319-49409-8_14"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Maggiori, E., Tarabalka, Y., Charpiat, G., and Alliez, P. (arXiv, 2016). High-Resolution Semantic Labeling with Convolutional Neural Networks, arXiv.","DOI":"10.1109\/IGARSS.2017.8128163"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Kampffmeyer, M., Salberg, A.B., and Jenssen, R. (July, January 26). Semantic segmentation of small objects and modeling of uncertainty in urban remote sensing images using deep convolutional neural networks. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, USA.","DOI":"10.1109\/CVPRW.2016.90"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Audebert, N., Saux, B.L., and Lef\u00e8vre, S. (arXiv, 2016). Semantic Segmentation of Earth Observation Data Using Multimodal and Multi-scale Deep Networks, arXiv.","DOI":"10.1007\/978-3-319-54181-5_12"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"L\u00e4ngkvist, M., Kiselev, A., Alirezaie, M., and Loutfi, A. (2016). Classification and segmentation of satellite orthoimagery using convolutional neural networks. Remote Sens., 8.","DOI":"10.3390\/rs8040329"},{"key":"ref_16","unstructured":"Muruganandham, S. (2016). Semantic Segmentation of Satellite Images Using Deep Learning. [Master\u2019s Thesis, Department of Computer Science, Electrical and Space Engineering, Lule\u00e5 University of Technology]."},{"key":"ref_17","unstructured":"Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., and Xiao, J. (2015). 3d Shapenets: A Deep Representation for Volumetric Shapes, Princeton University."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Kendall, A., Grimes, M., and Cipolla, R. (arXiv, 2016). PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization, arXiv.","DOI":"10.1109\/ICCV.2015.336"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Barron, J.T., and Poole, B. (arXiv, 2016). The fast bilateral solver, arXiv.","DOI":"10.1007\/978-3-319-46487-9_38"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Mostajabi, M., Yadollahpour, P., and Shakhnarovich, G. (arXiv, 2015). Feedforward semantic segmentation with zoom-out features, arXiv.","DOI":"10.1109\/CVPR.2015.7298959"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Dai, J., He, K., and Sun, J. (arXiv, 2015). Instance-aware Semantic Segmentation via Multi-task Network Cascades, arXiv.","DOI":"10.1109\/CVPR.2016.343"},{"key":"ref_22","unstructured":"Shelhamer, E., Long, J., and Darrell, T. (arXiv, 2015). Fully Convolutional Networks for Semantic Segmentation, arXiv."},{"key":"ref_23","unstructured":"Yu, F., and Koltun, V. (arXiv, 2016). Multi-Scale Context Aggregation by Dilated Convolutions, arXiv."},{"key":"ref_24","unstructured":"Chen, L., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A. (arXiv, 2015). DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zheng, S., Jayasumana, S., Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., and Torr, P. (arXiv, 2015). Conditional Random Fields as Recurrent Neural Networks, arXiv.","DOI":"10.1109\/ICCV.2015.179"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Chandra, S., and Kokkinos, I. (arXiv, 2016). Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs, arXiv.","DOI":"10.1007\/978-3-319-46478-7_25"},{"key":"ref_27","unstructured":"Badrinarayanan, V., Handa, A., and Cipolla, R. (arXiv, 2015). SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, arXiv."},{"key":"ref_28","unstructured":"Hyeonwoo, N., Hong, S., and Han, B. (arXiv, 2015). Learning deconvolution network for semantic segmentation, arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Lin, G., Shen, C., Hengel, A., and Reid, I. (arXiv, 2016). Efficient piecewise training of deep structured models for semantic segmentation, arXiv.","DOI":"10.1109\/CVPR.2016.348"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Eigen, D., and Fergus, R. (arXiv, 2015). Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture, arXiv.","DOI":"10.1109\/ICCV.2015.304"},{"key":"ref_31","unstructured":"Chen, L., Schwing, A., Yuille, A., and Urtasun, R. (arXiv, 2015). Learning Deep Structured Models, arXiv."},{"key":"ref_32","unstructured":"Chen, L., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A. (arXiv, 2015). Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, arXiv."},{"key":"ref_33","unstructured":"Kr\u00e4henb\u00fchl, P., and Koltun, V. (arXiv, 2012). Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Arnab, A., Jayasumana, S., Zheng, S., and Torr, P. (arXiv, 2016). Higher Order Conditional Random Fields in Deep Neural Networks, arXiv.","DOI":"10.1007\/978-3-319-46475-6_33"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Vemulapalli, R., Tuzel, O., Liu, M., and Chellappa, R. (2016, January 27\u201330). Gaussian Conditional Random Field Network for Semantic Segmentation. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.351"},{"key":"ref_36","unstructured":"Zhou, D., Weston, J., Gretton, A., Bousquent, O., and Scholkopf, B. (2003, January 9\u201311). Ranking on data manifolds. Proceedings of the 16th International Conference on Neural Information Processing Systems, Whistler, BC, Canada."},{"key":"ref_37","unstructured":"Zhou, D., Bousquent, O., Lal, T., Weston, J., and Scholkopf, B. (2003, January 9\u201311). Learning with Local and Global Consistency. Proceedings of the 16th International Conference on Neural Information Processing Systems, Whistler, BC, Canada."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Yang, C., Zhang, L., Lu, H., Ruan, X., and Yang, M. (2013, January 23\u201328). Saliency Detection via Graph-Based Manifold Ranking. Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.407"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"527","DOI":"10.1109\/LGRS.2014.2349538","article-title":"Fusion of Extreme Learning Machine and Graph-Based Optimization Methods for Active Classification of Remote Sensing Images","volume":"12","author":"Bencherif","year":"2015","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_40","unstructured":"Kr\u00e4henb\u00fchl, P., and Koltun, V. (2013, January 16\u201321). Parameter Learning and Convergent Inference for Dense Random Fields. Proceedings of the 30th International Conference on Machine Learning, Atlanta, GA, USA."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Gupta, S., Girshick, R., Arbel\u00e1ez, P., and Malik, J. (arXiv, 2014). Learning Rich Features from RGB-D Images for Object Detection and Segmentation, arXiv.","DOI":"10.1007\/978-3-319-10584-0_23"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Hariharan, B., Arbel\u00e1ez, P., Girshick, R., and Malik, J. (arXiv, 2014). Simultaneous Detection and Segmentation, arXiv.","DOI":"10.1007\/978-3-319-10584-0_20"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Dai, J., He, K., and Sun, J. (arXiv, 2015). Convolutional Feature Masking for Joint Object and Stuff Segmentation, arXiv.","DOI":"10.1109\/CVPR.2015.7299025"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1915","DOI":"10.1109\/TPAMI.2012.231","article-title":"Learning Hierarchical Features for Scene Labeling","volume":"35","author":"Farabet","year":"2013","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Chen, L., Yang, Y., Wang, J., Xu, W., and Yuille, A. (arXiv, 2016). Attention to Scale: Scale-aware Semantic Image Segmentation, arXiv.","DOI":"10.1109\/CVPR.2016.396"},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Bearman, A., Russakovsky, O., Ferrari, V., and Li, F.F. (arXiv, 2016). What\u2019s the Point: Semantic Segmentation with Point Supervision, arXiv.","DOI":"10.1007\/978-3-319-46478-7_34"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"1349","DOI":"10.1109\/TGRS.2015.2478379","article-title":"Unsupervised Deep Feature Extraction for Remote Sensing Image Classification","volume":"54","author":"Romero","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"5547","DOI":"10.1109\/JSTARS.2016.2569162","article-title":"Processing of Extremely High-Resolution LiDAR and RGB Data: Outcome of the 2015 IEEE GRSS Data Fusion Contest\u2013Part A: 2-D Contest","volume":"9","author":"Gatta","year":"2016","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Tschannen, M., Cavigelli, L., Mentzer, F., Wiatowski, T., and Benini, L. (arXiv, 2016). Deep Structured Features for Semantic Segmentation, arXiv.","DOI":"10.23919\/EUSIPCO.2017.8081169"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Piramanayagam, S., Schwartzkopf, W., Koehler, F.W., and Saber, E. (2016). Classification of remote sensed images using random forests and deep learning framework. SPIE Remote Sens. Int. Soc. Opt. Photonics.","DOI":"10.1117\/12.2243169"},{"key":"ref_51","unstructured":"Marcu, A., and Leordeanu, M. (arXiv, 2016). Dual Local-Global Contextual Pathways for Recognition in Aerial Imagery, arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"1431","DOI":"10.1109\/TGRS.2015.2480866","article-title":"Dual-clustering-based hyperspectral band selection by contextual analysis","volume":"54","author":"Yuan","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_53","unstructured":"Kendall, A., Badrinarayanan, V., and Cipolla, R. (arXiv, 2015). Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding, arXiv."},{"key":"ref_54","unstructured":"Simonyan, K., and Zisserman, A. (arXiv, 2015). Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv."},{"key":"ref_55","unstructured":"Hong, S., Noh, H., and Han, B. (arXiv, 2015). Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation, arXiv."},{"key":"ref_56","doi-asserted-by":"crossref","unstructured":"Audebert, N., Saux, B.L., and Lef\u00e8vre, S. (2017). Segment-before-Detect: Vehicle Detection and Classification through Semantic Segmentation of Aerial Images. Remote Sens., 9.","DOI":"10.3390\/rs9040368"},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Huang, Z., Cheng, G., Wang, H., Li, H., Shi, L., and Pan, C. (2016, January 10\u201315). Building extraction from multi-source remote sensing images via deep deconvolution neural networks. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China.","DOI":"10.1109\/IGARSS.2016.7729471"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Audebert, N., Boulch, A., Lagrange, A., Le Saux, B., and Lefevre, S. (2016). Deep Learning for Remote Sensing, ONERA The French Aerospace Lab, DTIM & Univ. Bretagne-Sud & ENSTA ParisTech. Technical Report.","DOI":"10.1109\/JURSE.2017.7924536"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Paisitkriangkrai, S., Sherrah, J., Janney, P., and Hengel, V.D. (2015, January 7\u201312). Effective semantic pixel labelling with convolutional networks and conditional random fields. Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Boston, MA, USA.","DOI":"10.1109\/CVPRW.2015.7301381"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Alam, F.I., Zhou, J., Liew, A.W.C., and Jia, X. (2016, January 10\u201315). CRF learning with CNN features for hyperspectral image segmentation. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Beijing, China.","DOI":"10.1109\/IGARSS.2016.7730798"},{"key":"ref_61","unstructured":"He, X., Cai, D., and Niyogi, P. (2005, January 5\u20138). Laplacian Score for Feature Selection. Proceedings of the 18th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Quan, R., Han, J., Zhang, D., and Nie, F. (2016, January 27\u201330). Object co-segmentation via graph optimized-flexible manifold ranking. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.81"},{"key":"ref_63","doi-asserted-by":"crossref","first-page":"1279","DOI":"10.1109\/TNNLS.2015.2477537","article-title":"Salient band selection for hyperspectral image classification via manifold ranking","volume":"27","author":"Wang","year":"2016","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_64","doi-asserted-by":"crossref","first-page":"637","DOI":"10.1109\/LSP.2013.2260737","article-title":"Graph-regularized saliency detection with convex-hull-based center prior","volume":"20","author":"Yang","year":"2013","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Xu, B., Bu, J., Chen, C., Cai, D., He, X., Liu, W., and Luo, J. (2011, January 24\u201328). Efficient Manifold Ranking for Image Retrieval. Proceedings of the 34th international ACM SIGIR Conference on Research and Development in Information Retrieval, Beijing, China.","DOI":"10.1145\/2009916.2009988"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Hsieh, C., Han, C., Shih, J., Lee, C., and Fan, K. (2015, January 24\u201326). 3D Model Retrieval Using Multiple Features and Manifold Ranking. Proceedings of the 2015 8th International Conference on Ubi-Media Computing (UMEDIA), Colombo, Sri Lanka.","DOI":"10.1109\/UMEDIA.2015.7297419"},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"2459","DOI":"10.1016\/j.patcog.2015.03.008","article-title":"Robust visual tracking via efficient manifold ranking with low-dimensional compressive features","volume":"48","author":"Zhou","year":"2015","journal-title":"Pattern Recognit."},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Brostow, G., Shotton, J., Fauqueur, J., and Cipolla, R. (2008, January 12\u201318). Segmentation and Recognition Using Structure from Motion Point Clouds. Proceedings of the 10th European Conference on Computer Vision, Marseille, France.","DOI":"10.1007\/978-3-540-88682-2_5"},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"88","DOI":"10.1016\/j.patrec.2008.04.005","article-title":"Semantic object classes in video: A high-definition ground truth database","volume":"30","author":"Brostow","year":"2009","journal-title":"Pattern Recognit. Lett."},{"key":"ref_70","unstructured":"Ruder, S. (arXiv, 2016). An overview of gradient descent optimization algorithms, arXiv."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (arXiv, 2015). Deep Residual Learning for Image Recognition, arXiv.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_72","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The Pascal Visual Object Classes (VOC) Challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"293","DOI":"10.5194\/isprsannals-I-3-293-2012","article-title":"The ISPRS benchmark on urban object classification and 3D building reconstruction","volume":"I-3","author":"Rottensteiner","year":"2012","journal-title":"ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci."},{"key":"ref_74","unstructured":"Glorot, X., Bordes, A., and Bengio, Y. (2011, January 11\u201313). Deep Sparse Rectifier Neural Networks. Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA."},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. (2014, January 3\u20137). Caffe: Convolutional Architecture for Fast Feature Embedding. Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, FL, USA.","DOI":"10.1145\/2647868.2654889"},{"key":"ref_76","doi-asserted-by":"crossref","first-page":"473","DOI":"10.5194\/isprs-annals-III-3-473-2016","article-title":"Semantic Segmentation of Aerial Images with an Ensemble of CNSS","volume":"3","author":"Marmanis","year":"2016","journal-title":"ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci."},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Hariharan, B., Arbel\u00e1ez, P., Bourdev, L., Maji, S., and Malik, J. (2011, January 6\u201313). Semantic Contours from Inverse Detectors. Proceedings of the 2011 IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126343"},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Zoran, D., and Weiss, Y. (2011, January 6\u201313). From Learning Models of Natural Image Patches to Whole Image Restoration. Proceedings of the 2011 IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126278"},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"Lin, T., Maire, M., Belongie, S., Bourdev, L., Girshick, R., Hays, J., Perona, P., Ramanan, D., Zitnick, C., and Doll\u00e1r, P. (arXiv, 2014). Microsoft coco: Common objects in context, arXiv.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Lin, G., Milan, A., Shen, C., and Reid, I. (arXiv, 2016). RefineNet: Multi-Path Refinement Networks with Identity Mappings for High-Resolution Semantic Segmentation, arXiv.","DOI":"10.1109\/CVPR.2017.549"},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"302","DOI":"10.1007\/s11263-008-0202-0","article-title":"Robust higher order potentials for enforcing label consistency","volume":"82","author":"Kohli","year":"2009","journal-title":"Int. J. Comput. Vis."},{"key":"ref_82","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_83","doi-asserted-by":"crossref","unstructured":"Quang, N.T., Thuy, N.T., Sang, D.V., and Binh, H.T.T. (2015, January 3\u20134). An efficient framework for pixel-wise building segmentation from aerial images. Proceedings of the Sixth International Symposium on Information and Communication Technology, Hue City, Vietnam.","DOI":"10.1145\/2833258.2833272"},{"key":"ref_84","unstructured":"Boulch, A. (2015). DAG of Convolutional Networks for Semantic Labeling, Office National d\u2019\u00e9tudes et de Recherches A\u00e9rospatiales. Technical Report."},{"key":"ref_85","unstructured":"Gerke, M., Speldekamp, T., Fries, C., and Gevaert, C. (2015). Automatic semantic labelling of urban areas using a rule-based approach and realized with mevislab. Unpublished."},{"key":"ref_86","unstructured":"Sherrah, J. (arXiv, 2016). Fully convolutional networks for dense semantic labelling of high-resolution aerial imagery, arXiv."},{"key":"ref_87","unstructured":"Gerke, M. (2015). Use of the Stair Vision Library within the ISPRS 2D Semantic Labeling Benchmark (Vaihingen), University of Twente. Technical Report."},{"key":"ref_88","unstructured":"Petersen, K., and Pedersen, M. (2008). The Matrix Cookbook, Technical University of Denmark."},{"key":"ref_89","unstructured":"The National Survey of Geographical Conditions Leading Group Office, Sate Council, P.R.C. (2013). General Situation and Index of Geographical Conditions (Chinese Manual, GDPJ 01-2013), The National Survey of Geographical Conditions Leading Group Office, Sate Council, P.R.C."},{"key":"ref_90","doi-asserted-by":"crossref","first-page":"2661","DOI":"10.3390\/rs4092661","article-title":"Tree species classification with random forest using very high spatial resolution 8-band WorldView-2 satellite data","volume":"4","author":"Immitzer","year":"2012","journal-title":"Remote Sens."},{"key":"ref_91","doi-asserted-by":"crossref","first-page":"1887","DOI":"10.3390\/rs4071887","article-title":"Monitoring seasonal hydrological dynamics of minerotrophic peatlands using multi-date GeoEye-1 very high resolution imagery and object-based classification","volume":"4","author":"Dribault","year":"2012","journal-title":"Remote Sens."},{"key":"ref_92","doi-asserted-by":"crossref","first-page":"8121","DOI":"10.1080\/01431161.2010.532822","article-title":"Mapping reedbed habitats using texture-based classification of QuickBird imagery","volume":"32","author":"Onojeghuo","year":"2011","journal-title":"Int. J. Remote Sens."},{"key":"ref_93","first-page":"255","article-title":"Comparison between GF-1 and Landsat-8 images in land cover classification","volume":"35","author":"Junwei","year":"2016","journal-title":"Prog. Geogr."},{"key":"ref_94","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, The MIT Press."},{"key":"ref_95","unstructured":"Mirza, M., and Osindero, S. (arXiv, 2014). Conditional generative adversarial nets, arXiv."},{"key":"ref_96","unstructured":"Luc, P., Couprie, C., Chintala, S., and Verbeek, J. (arXiv, 2016). Semantic Segmentation using Adversarial Networks, arXiv."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/9\/5\/500\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T18:36:20Z","timestamp":1760207780000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/9\/5\/500"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,5,19]]},"references-count":96,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2017,5]]}},"alternative-id":["rs9050500"],"URL":"https:\/\/doi.org\/10.3390\/rs9050500","relation":{"has-preprint":[{"id-type":"doi","id":"10.20944\/preprints201704.0061.v1","asserted-by":"object"}]},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,5,19]]}}}