{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T16:09:17Z","timestamp":1774627757569,"version":"3.50.1"},"reference-count":34,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2024,10,24]],"date-time":"2024-10-24T00:00:00Z","timestamp":1729728000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Research Grant Council of the HKSAR, China","award":["CityU 11205421"],"award-info":[{"award-number":["CityU 11205421"]}]},{"name":"Research Grant Council of the HKSAR, China","award":["DTEC202102"],"award-info":[{"award-number":["DTEC202102"]}]},{"name":"Jiangsu Engineering Research Center of Digital Twinning Technology for Key Equipment in Petrochemical Process","award":["CityU 11205421"],"award-info":[{"award-number":["CityU 11205421"]}]},{"name":"Jiangsu Engineering Research Center of Digital Twinning Technology for Key Equipment in Petrochemical Process","award":["DTEC202102"],"award-info":[{"award-number":["DTEC202102"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>The advancement in satellite image sensors has enabled the acquisition of high-resolution remote sensing (HRRS) images. However, interpreting these images accurately and obtaining the computational power needed to do so is challenging due to the complexity involved. This manuscript proposed a multi-stream convolutional neural network (CNN) fusion framework that involves multi-scale and multi-CNN integration for HRRS image recognition. The pre-trained CNNs were used to learn and extract semantic features from multi-scale HRRS images. Feature extraction using pre-trained CNNs is more efficient than training a CNN from scratch or fine-tuning a CNN. Discriminative canonical correlation analysis (DCCA) was used to fuse deep features extracted across CNNs and image scales. DCCA reduced the dimension of the features extracted from CNNs while providing a discriminative representation by maximizing the within-class correlation and minimizing the between-class correlation. The proposed model has been evaluated on NWPU-RESISC45 and UC Merced datasets. The accuracy associated with DCCA was 10% and 6% higher than discriminant correlation analysis (DCA) in the NWPU-RESISC45 and UC Merced datasets. The advantage of DCCA was better demonstrated in the NWPU-RESISC45 dataset due to the incorporation of richer within-class variability in this dataset. While both DCA and DCCA minimize between-class correlation, only DCCA maximizes the within-class correlation and, therefore, attains better accuracy. The proposed framework achieved higher accuracy than all state-of-the-art frameworks involving unsupervised learning and pre-trained CNNs and 2\u20133% higher than the majority of fine-tuned CNNs. The proposed framework offers computational time advantages, requiring only 13 s for training in NWPU-RESISC45, compared to a day for fine-tuning the existing CNNs. Thus, the proposed framework achieves a favourable balance between efficiency and accuracy in HRRS image recognition.<\/jats:p>","DOI":"10.3390\/rs16213961","type":"journal-article","created":{"date-parts":[[2024,10,25]],"date-time":"2024-10-25T03:46:04Z","timestamp":1729827964000},"page":"3961","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Multi-Scale and Multi-Network Deep Feature Fusion for Discriminative Scene Classification of High-Resolution Remote Sensing Images"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9694-9250","authenticated-orcid":false,"given":"Baohua","family":"Yuan","sequence":"first","affiliation":[{"name":"Jiangsu Engineering Research Center of Digital Twinning Technology for Key Equipment in Petrochemical Process, Changzhou University, Changzhou 213164, China"},{"name":"Department of Electrical Engineering, City University of Hong Kong, Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9058-7869","authenticated-orcid":false,"given":"Sukhjit Singh","family":"Sehra","sequence":"additional","affiliation":[{"name":"Department of Physics & Computer Science, Wilfrid Laurier University, Waterloo, ON N2L 3C5, Canada"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5237-2410","authenticated-orcid":false,"given":"Bernard","family":"Chiu","sequence":"additional","affiliation":[{"name":"Department of Physics & Computer Science, Wilfrid Laurier University, Waterloo, ON N2L 3C5, Canada"},{"name":"Department of Electrical Engineering, City University of Hong Kong, Hong Kong"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,10,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1675","DOI":"10.1080\/13658816.2017.1324976","article-title":"Classifying urban land use by integrating remote sensing and social media data","volume":"31","author":"Liu","year":"2017","journal-title":"Int. J. Geogr. Inf. Sci."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1109\/LGRS.2017.2731997","article-title":"Remote sensing image scene classification using bag of convolutional features","volume":"14","author":"Cheng","year":"2017","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"015010","DOI":"10.1117\/1.JRS.12.015010","article-title":"Multiscale deep features learning for land-use scene recognition","volume":"12","author":"Yuan","year":"2018","journal-title":"J. Appl. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1016\/j.patcog.2016.07.001","article-title":"Towards better exploiting convolutional neural networks for remote sensing scene classification","volume":"61","author":"Nogueira","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"4775","DOI":"10.1109\/TGRS.2017.2700322","article-title":"Deep feature fusion for VHR remote sensing scene classification","volume":"55","author":"Chaib","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1865","DOI":"10.1109\/JPROC.2017.2675998","article-title":"Remote sensing image scene classification: Benchmark and state of the art","volume":"105","author":"Cheng","year":"2017","journal-title":"Proc. IEEE"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"2047","DOI":"10.1007\/s00521-020-05071-7","article-title":"Multi-deep features fusion for high-resolution remote sensing image scene classification","volume":"33","author":"Yuan","year":"2020","journal-title":"Neural Comput. Appl."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"519","DOI":"10.1109\/TGRS.2019.2937830","article-title":"Attention GANs: Unsupervised Deep Feature Learning for Aerial Scene Classification","volume":"58","author":"Yu","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"67200","DOI":"10.1109\/ACCESS.2019.2918732","article-title":"Global-local attention network for aerial scene classification","volume":"7","author":"Guo","year":"2019","journal-title":"IEEE Access"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"8506","DOI":"10.1109\/TGRS.2019.2921342","article-title":"Adaptive Multiscale Deep Fusion Residual Network for Remote Sensing Image Classification","volume":"57","author":"Li","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"105284","DOI":"10.1016\/j.cageo.2022.105284","article-title":"Impact of dataset size and convolutional neural network architecture on transfer learning for carbonate rock classification","volume":"171","author":"Dawson","year":"2023","journal-title":"Comput. Geosci."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/j.patcog.2018.12.019","article-title":"Dictionaries of deep features for land-use scene classification of very high spatial resolution images","volume":"89","author":"Flores","year":"2019","journal-title":"Pattern Recognit."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1655001","DOI":"10.1142\/S0218001416550016","article-title":"Scene categorization through combining LBP and SIFT features effectively","volume":"30","author":"Bai","year":"2016","journal-title":"Int. J. Pattern Recognit. Artif. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"3173","DOI":"10.1109\/TGRS.2018.2794326","article-title":"Hyperspectral image classification with deep feature fusion network","volume":"56","author":"Song","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Sun, T., Chen, S., Yang, J., and Shi, P. (2008, January 5\u201319). A novel method of combined feature extraction for recognition. Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy.","DOI":"10.1109\/ICDM.2008.28"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A. (2016). Inception-v4, inception-resnet and the impact of residual connections on learning. arXiv.","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 21\u201326). Xception: Deep learning with depthwise separable convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1369","DOI":"10.1016\/S0031-3203(02)00262-5","article-title":"Feature fusion: Parallel strategy vs. serial strategy","volume":"36","author":"Yang","year":"2003","journal-title":"Pattern Recognit."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 2\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1961189.1961199","article-title":"LIBSVM: A library for support vector machines","volume":"2","author":"Chang","year":"2011","journal-title":"Acm Trans. Intell. Syst. Technol. (TIST)"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1414","DOI":"10.1109\/TNNLS.2020.3042276","article-title":"Looking closer at the scene: Multiscale representation learning for remote sensing image scene classification","volume":"33","author":"Wang","year":"2022","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_25","first-page":"1","article-title":"MFST: A multi-level fusion network for remote sensing scene classification","volume":"19","author":"Wang","year":"2022","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"7071","DOI":"10.1007\/s00521-024-09446-y","article-title":"Enhanced multi-level features for very high resolution remote sensing scene classification","volume":"36","author":"Sitaula","year":"2024","journal-title":"Neural Comput. Appl."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"8639367","DOI":"10.1155\/2018\/8639367","article-title":"A two-stream deep fusion framework for high-resolution aerial scene classification","volume":"2018","author":"Yu","year":"2018","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1947","DOI":"10.1109\/TGRS.2014.2351395","article-title":"Pyramid of spatial relatons for scene-level land use classification","volume":"53","author":"Chen","year":"2014","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2811","DOI":"10.1109\/TGRS.2017.2783902","article-title":"When deep learning meets metric learning: Remote sensing image scene classification via learning discriminative CNNs","volume":"56","author":"Cheng","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_30","first-page":"197","article-title":"Satmae: Pre-training transformers for temporal and multi-spectral satellite imagery","volume":"35","author":"Cong","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Noman, M., Naseer, M., Cholakkal, H., Anwer, R.M., Khan, S., and Khan, F.S. (2024, January 17\u201321). Rethinking transformers pre-training for multi-spectral satellite imagery. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52733.2024.02627"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Penatti, O.A., Nogueira, K., and Dos Santos, J.A. (2015, January 7\u201312). Do deep features generalize from everyday objects to remote sensing and aerial scenes domains?. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Boston, MA, USA.","DOI":"10.1109\/CVPRW.2015.7301382"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1016\/j.isprsjprs.2017.11.004","article-title":"A semi-supervised generative framework with deep learning features for high-resolution remote sensing image scene classification","volume":"145","author":"Han","year":"2018","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1109\/LGRS.2015.2499239","article-title":"Deep learning earth observation classification using ImageNet pretrained networks","volume":"13","author":"Marmanis","year":"2015","journal-title":"IEEE Geosci. Remote. Sens. Lett."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/21\/3961\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:19:53Z","timestamp":1760113193000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/21\/3961"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,24]]},"references-count":34,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2024,11]]}},"alternative-id":["rs16213961"],"URL":"https:\/\/doi.org\/10.3390\/rs16213961","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,10,24]]}}}