{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T11:08:12Z","timestamp":1769166492045,"version":"3.49.0"},"reference-count":63,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2018,9,7]],"date-time":"2018-09-07T00:00:00Z","timestamp":1536278400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>In this paper, we present a convolutional neural network (CNN)-based method to efficiently combine information from multisensor remotely sensed images for pixel-wise semantic classification. The CNN features obtained from multiple spectral bands are fused at the initial layers of deep neural networks as opposed to final layers. The early fusion architecture has fewer parameters and thereby reduces the computational time and GPU memory during training and inference. We also propose a composite fusion architecture that fuses features throughout the network. The methods were validated on four different datasets: ISPRS Potsdam, Vaihingen, IEEE Zeebruges and Sentinel-1, Sentinel-2 dataset. For the Sentinel-1,-2 datasets, we obtain the ground truth labels for three classes from OpenStreetMap. Results on all the images show early fusion, specifically after layer three of the network, achieves results similar to or better than a decision level fusion mechanism. 
The performance of the proposed architecture is also on par with the state-of-the-art results.<\/jats:p>","DOI":"10.3390\/rs10091429","type":"journal-article","created":{"date-parts":[[2018,9,7]],"date-time":"2018-09-07T11:47:41Z","timestamp":1536320861000},"page":"1429","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":82,"title":["Supervised Classification of Multisensor Remotely Sensed Images Using a Deep Learning Framework"],"prefix":"10.3390","volume":"10","author":[{"given":"Sankaranarayanan","family":"Piramanayagam","sequence":"first","affiliation":[{"name":"Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, 54 Lomb Memorial Drive, Rochester, NY 14623, USA"}]},{"given":"Eli","family":"Saber","sequence":"additional","affiliation":[{"name":"Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, 54 Lomb Memorial Drive, Rochester, NY 14623, USA"},{"name":"Department of Electrical & Microelectronic Engineering, Rochester Institute of Technology, 54 Lomb Memorial Drive, Rochester, NY 14623, USA"}]},{"given":"Wade","family":"Schwartzkopf","sequence":"additional","affiliation":[{"name":"National Geospatial-Intelligence Agency, 7500 GEOINT Dr, Springfield, VA 22153, USA"}]},{"given":"Frederick W.","family":"Koehler","sequence":"additional","affiliation":[{"name":"National Geospatial-Intelligence Agency, 7500 GEOINT Dr, Springfield, VA 22153, USA"}]}],"member":"1968","published-online":{"date-parts":[[2018,9,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1560","DOI":"10.1109\/JPROC.2015.2449668","article-title":"Multimodal classification of remote sensing images: A review and future directions","volume":"103","author":"Tuia","year":"2015","journal-title":"Proc. 
IEEE"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"3858","DOI":"10.1109\/TGRS.2007.898446","article-title":"Fusion of support vector machines for classification of multisensor data","volume":"45","author":"Waske","year":"2007","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"217","DOI":"10.1080\/01431160412331269698","article-title":"Random forest classifier for remote sensing classification","volume":"26","author":"Pal","year":"2005","journal-title":"Int. J. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"3180","DOI":"10.1109\/TGRS.2009.2019636","article-title":"A Novel Approach to the Selection of Spatially Invariant Features for the Classification of Hyperspectral Images With Improved Generalization Capability","volume":"47","author":"Bruzzone","year":"2009","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1109\/TIP.2016.2617462","article-title":"Discovering Diverse Subset for Unsupervised Hyperspectral Band Selection","volume":"26","author":"Yuan","year":"2017","journal-title":"IEEE Trans. Image Process."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"280","DOI":"10.1109\/TGRS.2014.2321423","article-title":"Features, color spaces, and boosting: New insights on semantic classification of remote sensing images","volume":"53","author":"Tokarczyk","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_8","unstructured":"Simonyan, K., and Zisserman, A. (arXiv, 2014). 
Very deep convolutional networks for large-scale image recognition, arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"100040L","DOI":"10.1117\/12.2243169","article-title":"Classification of remote sensed images using random forests and deep learning framework","volume":"Volume 10004","author":"Piramanayagam","year":"2016","journal-title":"Image and Signal Processing for Remote Sensing XXII"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"73","DOI":"10.1127\/1432-8364\/2010\/0041","article-title":"The DGPF-test on digital airborne camera evaluation; Overview and Test design","volume":"2","author":"Cramer","year":"2010","journal-title":"Photogramm.-Fernerkund.-Geoinf."},{"key":"ref_11","unstructured":"ISPRS Contest Website: ISPRS WG III\/4 (2017, January 01). ISPRS 2D Semantic Labeling Contest. Available online: http:\/\/www2.isprs.org\/commissions\/comm3\/wg4\/semantic-labeling.html."},{"key":"ref_12","unstructured":"(2017, January 01). IEEE GRSS Contest Website: 2015 IEEE GRSS Data Fusion Contest. Available online: http:\/\/www.grss-ieee.org\/community\/technical-committees\/data-fusion."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5547","DOI":"10.1109\/JSTARS.2016.2569162","article-title":"Processing of Extremely High-Resolution LiDAR and RGB Data: Outcome of the 2015 IEEE GRSS Data Fusion Contest\u2013Part A: 2-D Contest","volume":"9","author":"Gatta","year":"2016","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"640","DOI":"10.1109\/TPAMI.2016.2572683","article-title":"Fully Convolutional Networks for Semantic Segmentation","volume":"39","author":"Shelhamer","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_15","unstructured":"Badrinarayanan, V., Kendall, A., and Cipolla, R. (arXiv, 2015). 
Segnet: A deep convolutional encoder-decoder architecture for image segmentation, arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1016\/j.isprsjprs.2016.12.008","article-title":"Multi-source remotely sensed data fusion for improving land cover classification","volume":"124","author":"Chen","year":"2017","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"222","DOI":"10.1080\/22797254.2017.1314179","article-title":"A metaheuristic feature-level fusion strategy in classification of urban area using hyperspectral imagery and LiDAR data","volume":"50","author":"Hasani","year":"2017","journal-title":"Eur. J. Remote Sens."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1080\/19479830903561035","article-title":"Multi-source remote sensing data fusion: Status and trends","volume":"1","author":"Zhang","year":"2010","journal-title":"Int. J. Image Data Fusion"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2828","DOI":"10.1109\/TGRS.2006.876708","article-title":"Decision Fusion for the Classification of Urban Remote Sensing Images","volume":"44","author":"Fauvel","year":"2006","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"540","DOI":"10.1109\/TGRS.1990.572944","article-title":"Neural Network Approaches versus Statistical Methods in Classification of Multisource Remote Sensing Data","volume":"28","author":"Benediktsson","year":"1990","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/MGRS.2013.2248301","article-title":"A tutorial on synthetic aperture radar","volume":"1","author":"Moreira","year":"2013","journal-title":"IEEE Geosci. Remote Sens. 
Mag."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Joshi, N., Baumann, M., Ehammer, A., Fensholt, R., Grogan, K., Hostert, P., Jepsen, M.R., Kuemmerle, T., Meyfroidt, P., and Mitchard, E.T. (2016). A review of the application of optical and radar remote sensing data fusion to land use mapping and monitoring. Remote Sens., 8.","DOI":"10.3390\/rs8010070"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"71","DOI":"10.1016\/S0924-2716(03)00018-2","article-title":"Detection of building outlines based on the fusion of SAR and optical features","volume":"58","author":"Tupin","year":"2003","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1016\/j.rse.2013.10.028","article-title":"Improving the impervious surface estimation with combined use of optical and SAR remote sensing images","volume":"141","author":"Zhang","year":"2014","journal-title":"Remote Sens. Environ."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1457","DOI":"10.1109\/TGRS.2008.916089","article-title":"Classifying Multilevel Imagery From SAR and Optical Sensors by Decision Fusion","volume":"46","author":"Waske","year":"2008","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1769","DOI":"10.1109\/JSTARS.2017.2657607","article-title":"Spatiotemporal Fuzzy Clustering Strategy for Urban Expansion Monitoring Based on Time Series of Pixel-Level Optical and SAR Images","volume":"10","author":"Li","year":"2017","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_27","first-page":"41","article-title":"Sentinel-1A SAR and sentinel-2A MSI data fusion for urban ecosystem service mapping","volume":"8","author":"Haas","year":"2017","journal-title":"Remote Sens. Appl. Soc. 
Environ."},{"key":"ref_28","unstructured":"Gyorgy, S., Gizella, N., Zolt\u00e1n, F., M\u00e1ty\u00e1s, R., Anik\u00f3 Rottern\u00e9, K., Ir\u00e9n, H., B\u00e1lint, G., and Cecilia, T. (2016, January 20\u201324). Fusion of the Sentinel-1 and Sentinel-2 Data for Mapping High Resolution Land Cover Layers. Proceedings of the 36th EARSeL Symposium 2016, Bonn, Germany."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1016\/j.isprsjprs.2007.01.001","article-title":"Data fusion of high-resolution satellite imagery and LiDAR data for automatic building extraction","volume":"62","author":"Sohn","year":"2007","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"457","DOI":"10.1016\/j.isprsjprs.2010.06.001","article-title":"Automatic detection of residential buildings using LIDAR data and multispectral imagery","volume":"65","author":"Awrangjeb","year":"2010","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1016\/j.isprsjprs.2017.02.011","article-title":"Semantic segmentation of forest stands of pure species combining airborne lidar data and very high resolution multispectral imagery","volume":"126","author":"Dechesne","year":"2017","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"2405","DOI":"10.1109\/JSTARS.2014.2305441","article-title":"Hyperspectral and LiDAR data fusion: Outcome of the 2013 GRSS data fusion contest","volume":"7","author":"Debes","year":"2014","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1109\/LGRS.2008.915939","article-title":"Urban mapping using coarse SAR and optical data: Outcome of the 2007 GRSS data fusion contest","volume":"5","author":"Pacifici","year":"2008","journal-title":"IEEE Geosci. Remote Sens. 
Lett."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., and Fei-Fei, L. (2014, January 24\u201327). Large-scale video classification with convolutional neural networks. Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.223"},{"key":"ref_35","unstructured":"Simonyan, K., and Zisserman, A. (2014). Two-stream convolutional networks for action recognition in videos. Advances in Neural Information Processing Systems 27, Curran Associates, Inc."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Feichtenhofer, C., Pinz, A., and Zisserman, A. (2016, January 27\u201330). Convolutional two-stream network fusion for video action recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.213"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Hazirbas, C., Ma, L., Domokos, C., and Cremers, D. (2016, January 20\u201324). FuseNet: Incorporating depth into semantic segmentation via fusion-based CNN architecture. Proceedings of the Asian Conference on Computer Vision, Taipei, Taiwan.","DOI":"10.1007\/978-3-319-54181-5_14"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"20","DOI":"10.1016\/j.isprsjprs.2017.11.011","article-title":"Beyond RGB: Very high resolution urban remote sensing with multimodal deep networks","volume":"140","author":"Audebert","year":"2018","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Garcia-Garcia, A., Orts-Escolano, S., Oprea, S., Villena-Martinez, V., and Garcia-Rodriguez, J. (arXiv, 2017). A Review on Deep Learning Techniques Applied to Semantic Segmentation, arXiv.","DOI":"10.1016\/j.asoc.2018.05.018"},{"key":"ref_40","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). 
Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_41","unstructured":"Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (arXiv, 2016). Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs, arXiv."},{"key":"ref_42","unstructured":"Tran, D., Bourdev, L., Fergus, R., Torresani, L., and Paluri, M. (July, January 26). Deep End2End Voxel2Voxel Prediction. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Las Vegas, NV, USA."},{"key":"ref_43","unstructured":"Li, Z., Gan, Y., Liang, X., Yu, Y., Cheng, H., and Lin, L. (arXiv, 2016). RGB-D scene labeling with long short-term memorized fusion model, arXiv."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"2868","DOI":"10.1109\/JSTARS.2016.2582921","article-title":"Semantic labeling of aerial and satellite imagery","volume":"9","author":"Paisitkriangkrai","year":"2016","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_45","unstructured":"Sherrah, J. (arXiv, 2016). Fully convolutional networks for dense semantic labelling of high-resolution aerial imagery, arXiv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Liu, Y., Piramanayagam, S., Monteiro, S.T., and Saber, E. (2017, January 21\u201326). Dense Semantic Labeling of Very-High-Resolution Aerial Imagery and LiDAR with Fully-Convolutional Neural Networks and Higher-Order CRFs. 
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.200"},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1109\/TGRS.2016.2616585","article-title":"Dense semantic labeling of subdecimeter resolution images with convolutional neural networks","volume":"55","author":"Volpi","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.isprsjprs.2017.05.002","article-title":"Simultaneous extraction of roads and buildings in remote sensing imagery with convolutional neural networks","volume":"130","author":"Alshehhi","year":"2017","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_49","unstructured":"(2017, May 01). Spacenet Challenge: Building Detectors. Available online: https:\/\/github.com\/SpaceNetChallenge\/BuildingDetectors."},{"key":"ref_50","unstructured":"Karpathy, A., and Li, F.F. (2016, April 01). Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition. Available online: http:\/\/cs231n.github.io\/."},{"key":"ref_51","unstructured":"Dumoulin, V., and Visin, F. (arXiv, 2016). A guide to convolution arithmetic for deep learning, arXiv."},{"key":"ref_52","unstructured":"Yosinski, J., Clune, J., Bengio, Y., and Lipson, H. (2014). How transferable are features in deep neural networks?. Advances in Neural Information Processing Systems 27, Curran Associates, Inc."},{"key":"ref_53","unstructured":"Rottensteiner, F., Sohn, G., Gerke, M., and Wegner, J.D. (2017, January 01). ISPRS Test Project on Urban Classification and 3D Building Reconstruction. Available online: http:\/\/www2.isprs.org\/tl_files\/isprs\/wg34\/docs\/ComplexScenes_revision_v4.pdf."},{"key":"ref_54","unstructured":"Gerke, M. (2015). Use of the Stair Vision Library within the ISPRS 2D Semantic Labeling Benchmark (Vaihingen), University of Twente. 
Technical Report."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Lagrange, A., Le Saux, B., Beaupere, A., Boulch, A., Chan-Hon-Tong, A., Herbin, S., Randrianarivo, H., and Ferecatu, M. (2015, January 26\u201331). Benchmarking classification of earth-observation data: From learning explicit features to convolutional networks. Proceedings of the 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Milan, Italy.","DOI":"10.1109\/IGARSS.2015.7326745"},{"key":"ref_56","unstructured":"(2017, February 01). IEEE GRSS Data and Algorithm Standard Evaluation Website. Available online: http:\/\/dase.ticinumaerospace.com\/index.php."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., and Darrell, T. (arXiv, 2014). Caffe: Convolutional Architecture for Fast Feature Embedding, arXiv.","DOI":"10.1145\/2647868.2654889"},{"key":"ref_58","unstructured":"Thakker, A. (2016, October 01). Skynet-Machine Learning with Satellites and OpenStreetMap Data. Available online: https:\/\/2016.stateofthemap.us\/skynet\/."},{"key":"ref_59","unstructured":"Glorot, X., and Bengio, Y. (2010, January 13\u201315). Understanding the difficulty of training deep feedforward neural networks. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy."},{"key":"ref_60","unstructured":"Marmanis, D., Schindler, K., Wegner, J.D., Galliani, S., Datcu, M., and Stilla, U. (arXiv, 2016). Classification with an edge: Improving semantic image segmentation with boundary detection, arXiv."},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. 
Learn."},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1961189.1961199","article-title":"LIBSVM: A library for support vector machines","volume":"2","author":"Chang","year":"2011","journal-title":"ACM Trans. Intell. Syst. Technol."},{"key":"ref_63","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems 28, Curran Associates, Inc."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/9\/1429\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:19:23Z","timestamp":1760195963000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/10\/9\/1429"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,9,7]]},"references-count":63,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2018,9]]}},"alternative-id":["rs10091429"],"URL":"https:\/\/doi.org\/10.3390\/rs10091429","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2018,9,7]]}}}