{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,5]],"date-time":"2026-01-05T01:12:44Z","timestamp":1767575564143,"version":"build-2065373602"},"reference-count":47,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T00:00:00Z","timestamp":1707436800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natural Science Basic Research Program of Shaanxi","award":["2023-YBGY-028"],"award-info":[{"award-number":["2023-YBGY-028"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Remote sensing image classification (RSIC) is designed to assign specific semantic labels to aerial images, which is significant and fundamental in many applications. In recent years, substantial work has been conducted on RSIC with the help of deep learning models. Even though these models have greatly enhanced the performance of RSIC, the issues of diversity in the same class and similarity between different classes in remote sensing images remain huge challenges for RSIC. To solve these problems, a duplex-hierarchy representation learning (DHRL) method is proposed. The proposed DHRL method aims to explore duplex-hierarchy spaces, including a common space and a label space, to learn discriminative representations for RSIC. The proposed DHRL method consists of three main steps: First, paired images are fed to a pretrained ResNet network for extracting the corresponding features. Second, the extracted features are further explored and mapped into a common space for reducing the intra-class scatter and enlarging the inter-class separation. Third, the obtained representations are used to predict the categories of the input images, and the discrimination loss in the label space is minimized to further promote the learning of discriminative representations. Meanwhile, a confusion score is computed and added to the classification loss for guiding the discriminative representation learning via backpropagation. The comprehensive experimental results show that the proposed method is superior to the existing state-of-the-art methods on two challenging remote sensing image scene datasets, demonstrating that the proposed method is significantly effective.<\/jats:p>","DOI":"10.3390\/s24041130","type":"journal-article","created":{"date-parts":[[2024,2,9]],"date-time":"2024-02-09T03:53:46Z","timestamp":1707450826000},"page":"1130","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Duplex-Hierarchy Representation Learning for Remote Sensing Image Classification"],"prefix":"10.3390","volume":"24","author":[{"given":"Xiaobin","family":"Yuan","sequence":"first","affiliation":[{"name":"The School of Electronic and Information Engineering, Xi\u2019an Jiaotong University, Xi\u2019an 710049, China"},{"name":"The Xi\u2019an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi\u2019an 710119, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jingping","family":"Zhu","sequence":"additional","affiliation":[{"name":"The School of Electronic and Information Engineering, Xi\u2019an Jiaotong University, Xi\u2019an 710049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hao","family":"Lei","sequence":"additional","affiliation":[{"name":"National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi\u2019an Jiaotong University, Xi\u2019an 710049, China"},{"name":"Institute of Artificial Intelligence and Robotics, Xi\u2019an Jiaotong University, Xi\u2019an 710049, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shengjun","family":"Peng","sequence":"additional","affiliation":[{"name":"The State Key Laboratory of Astronautic Dynamics, China Xi\u2019an Satellite Control Center, Xi\u2019an 710043, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weidong","family":"Wang","sequence":"additional","affiliation":[{"name":"PLA 63768, Xi\u2019an 710600, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaobin","family":"Li","sequence":"additional","affiliation":[{"name":"The Beijing Institute of Remote Sensing Information, Beijing 100192, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,2,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1080\/01431161.2012.705443","article-title":"Automatic landslide detection from remote-sensing imagery using a scene classification method based on BoVW and pLSA","volume":"34","author":"Cheng","year":"2013","journal-title":"Int. J. Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2250","DOI":"10.1109\/TGRS.2016.2640186","article-title":"Unsupervised feature learning for land-use scene recognition","volume":"55","author":"Fan","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1155","DOI":"10.1109\/TGRS.2011.2165548","article-title":"Very high resolution multiangle urban classification analysis","volume":"50","author":"Longbotham","year":"2011","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1007\/BF00130487","article-title":"Color indexing","volume":"7","author":"Swain","year":"1991","journal-title":"Int. J. Comput. Vis."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"925","DOI":"10.1109\/TIP.2005.849319","article-title":"Design-based texture feature fusion using Gabor filters and co-occurrence probabilities","volume":"14","author":"Clausi","year":"2005","journal-title":"IEEE Trans. Image Process."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"2950","DOI":"10.1109\/TGRS.2006.876704","article-title":"A pixel shape index coupled with spectral information for classification of high spatial resolution remotely sensed imagery","volume":"44","author":"Zhang","year":"2006","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1109\/TSMC.1973.4309314","article-title":"Textural features for image classification","volume":"3","author":"Haralick","year":"1973","journal-title":"IEEE Trans. Syst. Man Cybern."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1023\/A:1011139631724","article-title":"Modeling the shape of the scene: A holistic representation of the spatial envelope","volume":"42","author":"Oliva","year":"2001","journal-title":"Int. J. Comput. Vis."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive image features from scale-invariant key points","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_10","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201326). Histograms of oriented gradients for human detection. Proceedings of the Computer Vision and Pattern Recognition, San Diego, CA, USA."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Perronnin, F., Sanchez, J., and Mensink, T. (2010, January 5\u201311). Improving the fisher kernel for large-scale image classification. Proceedings of the European Conference on Computer Vision, Heraklion, Greece.","DOI":"10.1007\/978-3-642-15561-1_11"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1704","DOI":"10.1109\/TPAMI.2011.235","article-title":"Aggregating local image descriptors into compact codes","volume":"34","author":"Jegou","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","first-page":"2169","article-title":"Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories","volume":"2","author":"Lazebnik","year":"2006","journal-title":"IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 2\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Penatti, O.A.B., Nogueira, K., and Dos Santos, J.A. (2015, January 7\u201312). Do deep features generalize from everyday objects to remote sensing and aerial scenes domains?. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Boston, MA, USA.","DOI":"10.1109\/CVPRW.2015.7301382"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1109\/TGRS.2016.2612821","article-title":"Convolutional Neural networks for large-scale remote-sensing image classification","volume":"55","author":"Maggiori","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"7109","DOI":"10.1109\/TGRS.2018.2848473","article-title":"Scene classification based on multiscale convolutional neural network","volume":"56","author":"Liu","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"6916","DOI":"10.1109\/TGRS.2019.2909695","article-title":"Scale-free convolutional neural network for remote sensing scene classification","volume":"57","author":"Xie","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_19","unstructured":"Castelluccio, M., Poggi, G., Sansone, C., and Verdoliva, L. (2015). Land use classification in remote sensing images by convolutional neural networks. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1865","DOI":"10.1109\/JPROC.2017.2675998","article-title":"Remote sensing image classification: Benchmark and state of the art","volume":"105","author":"Cheng","year":"2017","journal-title":"Proc. IEEE"},{"key":"ref_21","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"2352","DOI":"10.1162\/neco_a_00990","article-title":"Deep convolutional neural networks for image classification: A comprehensive review","volume":"29","author":"Rawat","year":"2017","journal-title":"Neural Comput."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Agrawal, P., Girshick, R., and Malik, J. (2014, January 6\u201312). Analyzing the performance of multilayer neural networks for object recognition. Proceedings of the Computer Vision\u2014ECCV 2014, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10584-0_22"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1007\/s13735-017-0141-z","article-title":"A review of semantic segmentation using deep neural networks","volume":"7","author":"Guo","year":"2018","journal-title":"Int. J. Multimed. Inf. Retr."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"14680","DOI":"10.3390\/rs71114680","article-title":"Transferring Deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery","volume":"7","author":"Hu","year":"2015","journal-title":"Remote Sens."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1109\/LGRS.2017.2731997","article-title":"Remote sensing image scene classification using bag of convolutional features","volume":"14","author":"Cheng","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"7894","DOI":"10.1109\/TGRS.2019.2917161","article-title":"A feature aggregation convolutional neural network for remote sensing scene classification","volume":"57","author":"Lu","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"2494","DOI":"10.1109\/TGRS.2018.2873966","article-title":"Scene classification using hierarchical Wasserstein CNN","volume":"57","author":"Liu","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1155","DOI":"10.1109\/TGRS.2018.2864987","article-title":"Scene classification with recurrent attention of VHR remote sensing images","volume":"57","author":"Wang","year":"2018","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1793","DOI":"10.1109\/TGRS.2015.2488681","article-title":"Scene classification via a gradient boosting random convolutional network framework","volume":"54","author":"Zhang","year":"2015","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1461","DOI":"10.1109\/TNNLS.2019.2920374","article-title":"Skip-connected covariance network for remote sensing scene classification","volume":"31","author":"He","year":"2019","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_32","first-page":"1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"25","author":"Krizhevsky","year":"2012","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_33","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"3965","DOI":"10.1109\/TGRS.2017.2685945","article-title":"AID: A benchmark data set for performance evaluation of aerial scene classification","volume":"55","author":"Xia","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_35","unstructured":"Abadi, M., Barham, P., Chen, J., Davis, J., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., and Kudlur, M. (2016, January 2\u20134). TensorFlow: A system for Large-Scale machine learning. Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, Savannah, GA, USA."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Ketkar, N. (2017). Deep Learning with Python: A Hands-On Introduction, Apress.","DOI":"10.1007\/978-1-4842-2766-4"},{"key":"ref_37","unstructured":"Kingma, D., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhang, W., Tang, P., and Zhao, L. (2019). Remote sensing image scene classification using CNN-CapsNet. Remote Sens., 11.","DOI":"10.3390\/rs11050494"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1200","DOI":"10.1109\/LGRS.2019.2894399","article-title":"Siamese convolutional neural networks for remote sensing scene classification","volume":"16","author":"Liu","year":"2019","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Dong, R., Xu, D., Jiao, L., Zhao, J., and An, J. (2020). A Fast Deep Perception Network for Remote Sensing Scene Classification. Remote Sens., 12.","DOI":"10.3390\/rs12040729"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Li, J., Lin, D., Wang, Y., Xu, G., Zhang, Y., Ding, C., and Zhou, Y. (2020). Deep discriminative representation learning with attention map for scene classification. Remote Sens., 12.","DOI":"10.3390\/rs12091366"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"6372","DOI":"10.1109\/JSTARS.2020.3030257","article-title":"Hierarchical Attention and Bilinear Fusion for Remote Sensing Image Scene Classification","volume":"13","author":"Yu","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_43","first-page":"1","article-title":"Bag-of-visual-words scene classifier for remote sensing image based on region covariance","volume":"19","author":"Chen","year":"2022","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"542","DOI":"10.1109\/TBDATA.2022.3196314","article-title":"Mining Hierarchical Information of CNNs for Scene Classification of VHR Remote Sensing Images","volume":"9","author":"Xu","year":"2022","journal-title":"IEEE Trans. Big Data"},{"key":"ref_45","first-page":"5615312","article-title":"Remote sensing scene classification via multi-stage self-guided separation network","volume":"61","author":"Wang","year":"2023","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1109\/TGRS.2019.2931801","article-title":"Remote sensing scene classification by gated bidirectional network","volume":"58","author":"Sun","year":"2019","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_47","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1109\/LGRS.2020.2968550","article-title":"Self-Attention-Based Deep Feature Fusion for Remote Sensing Scene Classification","volume":"18","author":"Cao","year":"2020","journal-title":"IEEE Geosci. Remote Sens. Lett."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/4\/1130\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:57:37Z","timestamp":1760104657000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/24\/4\/1130"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,9]]},"references-count":47,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2024,2]]}},"alternative-id":["s24041130"],"URL":"https:\/\/doi.org\/10.3390\/s24041130","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2024,2,9]]}}}