{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T18:17:20Z","timestamp":1770574640864,"version":"3.49.0"},"reference-count":90,"publisher":"MDPI AG","issue":"13","license":[{"start":{"date-parts":[[2021,6,28]],"date-time":"2021-06-28T00:00:00Z","timestamp":1624838400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Science and technology research project of Sinopec","award":["pe19003-3"],"award-info":[{"award-number":["pe19003-3"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>In real applications, it is necessary to classify new unseen classes that cannot be acquired in training datasets. To solve this problem, few-shot learning methods are usually adopted to recognize new categories with only a few (out-of-bag) labeled samples together with the known classes available in the (large-scale) training dataset. Unlike common scene classification images obtained by CCD (Charge-Coupled Device) cameras, remote sensing scene classification datasets tend to have plentiful texture features rather than shape features. Therefore, it is important to extract more valuable texture semantic features from a limited number of labeled input images. In this paper, a multi-scale feature fusion network for few-shot remote sensing scene classification is proposed by integrating a novel self-attention feature selection module, denoted as SAFFNet. Unlike a pyramidal feature hierarchy for object detection, the informative representations of the images with different receptive fields are automatically selected and re-weighted for feature fusion after refining network and global pooling operation for a few-shot remote sensing classification task. Here, the feature weighting value can be fine-tuned by the support set in the few-shot learning task. 
The proposed model is evaluated on three publicly available datasets for few-shot remote sensing scene classification. Experimental results demonstrate the effectiveness of the proposed SAFFNet to improve the few-shot classification accuracy significantly compared to other few-shot methods and the typical multi-scale feature fusion network.<\/jats:p>","DOI":"10.3390\/rs13132532","type":"journal-article","created":{"date-parts":[[2021,6,28]],"date-time":"2021-06-28T13:39:22Z","timestamp":1624887562000},"page":"2532","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":55,"title":["SAFFNet: Self-Attention-Based Feature Fusion Network for Remote Sensing Few-Shot Scene Classification"],"prefix":"10.3390","volume":"13","author":[{"given":"Joseph","family":"Kim","sequence":"first","affiliation":[{"name":"Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, 2005 SongHu Road, Shanghai 200438, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2650-4146","authenticated-orcid":false,"given":"Mingmin","family":"Chi","sequence":"additional","affiliation":[{"name":"Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, 2005 SongHu Road, Shanghai 200438, China"},{"name":"Zhongshan Fudan Joint Innovation Center, Zhongshan PoolNet Technology Co. Ltd., 6, Xiangxing Road, Zhongshan 528400, China"}]}],"member":"1968","published-online":{"date-parts":[[2021,6,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4928","DOI":"10.1109\/TGRS.2011.2151866","article-title":"Segment optimization and data-driven thresholding for knowledge-based landslide detection by object-based image analysis","volume":"49","author":"Martha","year":"2011","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1080\/01431161.2012.705443","article-title":"Automatic landslide detection from remote-sensing imagery using a scene classification method based on BoVW and pLSA","volume":"34","author":"Cheng","year":"2013","journal-title":"Int. J. Remote Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"3983","DOI":"10.1080\/0143116031000103826","article-title":"Application of satellite remote sensing in natural hazard management: The Mount Mangart landslide case study","volume":"24","author":"Veljanovski","year":"2003","journal-title":"Int. J. Remote Sens."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1847","DOI":"10.1080\/01431160701874553","article-title":"Contribution of remote sensing to disaster management activities: A case study of the large fires in the Peloponnese, Greece","volume":"29","author":"Gitas","year":"2008","journal-title":"Int. J. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2564","DOI":"10.1016\/j.rse.2011.05.013","article-title":"Object-oriented mapping of landslides using Random Forests","volume":"115","author":"Stumpf","year":"2011","journal-title":"Remote Sens. Environ."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.isprsjprs.2014.10.002","article-title":"Multi-class geospatial object detection and geographic image classification based on collection of part detectors","volume":"98","author":"Cheng","year":"2014","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1016\/j.isprsjprs.2013.08.001","article-title":"Object detection in remote sensing imagery using a discriminatively trained mixture model","volume":"85","author":"Cheng","year":"2013","journal-title":"ISPRS J. Photogramm. 
Remote Sens."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"7405","DOI":"10.1109\/TGRS.2016.2601622","article-title":"Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images","volume":"54","author":"Cheng","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"162","DOI":"10.1016\/j.neucom.2015.02.073","article-title":"A coarse-to-fine model for airport detection from remote sensing images using target-oriented visual saliency and CRF","volume":"164","author":"Yao","year":"2015","journal-title":"Neurocomputing"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"2074","DOI":"10.1109\/JSEN.2017.2664864","article-title":"A multi-parametric indicator design for ECT sensor optimization used in oil transmission","volume":"17","author":"Li","year":"2017","journal-title":"IEEE Sensors J."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Yang, X., Yang, J., Yan, J., Zhang, Y., Zhang, T., Guo, Z., Sun, X., and Fu, K. (2019, January 27\u201328). SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects. Proceedings of the IEEE\/CVF International Conference on Computer Vision 2019, Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00832"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"6020","DOI":"10.1109\/TGRS.2016.2579648","article-title":"A three-layered graph-based learning approach for remote sensing image retrieval","volume":"54","author":"Wang","year":"2016","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"818","DOI":"10.1109\/TGRS.2012.2205158","article-title":"Geographic image retrieval using local invariant features","volume":"51","author":"Yang","year":"2013","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"839","DOI":"10.1109\/TGRS.2006.890579","article-title":"GeoIRIS: Geospatial information retrieval and indexing system\u2014Content mining, semantics modeling, and complex queries","volume":"45","author":"Shyu","year":"2007","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhang, Y., Tao, C., and Zhu, H. (2016). Content-based high-resolution remote sensing image retrieval via unsupervised feature learning and collaborative affinity metric fusion. Remote Sens., 8.","DOI":"10.3390\/rs8090709"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"317","DOI":"10.1016\/j.rse.2005.08.006","article-title":"Land cover classification and change analysis of the Twin Cities (Minnesota) Metropolitan Area by multitemporal Landsat remote sensing","volume":"98","author":"Yuan","year":"2005","journal-title":"Remote Sens. Environ."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"173","DOI":"10.1016\/S0034-4257(01)00204-8","article-title":"Monitoring urban land cover change: An expert system approach to land cover classification of semiarid to arid urban centers","volume":"77","author":"Stefanov","year":"2001","journal-title":"Remote Sens. Environ."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1016\/j.apgeog.2010.10.012","article-title":"Land use and land cover change detection in the western Nile delta of Egypt using remote sensing data","volume":"31","author":"Ismail","year":"2011","journal-title":"Appl. Geogr."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Lima, R.P.D., and Marfurt, K. (2020). Convolutional Neural Network for Remote-Sensing Scene Classification: Transfer Learning Analysis. 
Remote Sens., 12.","DOI":"10.3390\/rs12234003"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1865","DOI":"10.1109\/JPROC.2017.2675998","article-title":"Remote sensing image scene classification: Benchmark and state of the art","volume":"105","author":"Cheng","year":"2017","journal-title":"Proc. IEEE"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"123","DOI":"10.1016\/j.isprsjprs.2020.05.016","article-title":"HyperLi-Net: A hyper-light deep learning network for high-accurate and high-speed ship detection from synthetic aperture radar imagery","volume":"167","author":"Zhang","year":"2020","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1584","DOI":"10.1093\/nsr\/nwaa047","article-title":"Deep-learning-based information mining from ocean remote-sensing imagery","volume":"7","author":"Li","year":"2020","journal-title":"Natl. Sci. Rev."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1109\/MGRS.2017.2762307","article-title":"Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources","volume":"5","author":"Zhu","year":"2017","journal-title":"IEEE Geosci. Remote Sens. Mag."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1461","DOI":"10.1109\/TNNLS.2019.2920374","article-title":"Skip-Connected Covariance Network for Remote Sensing Scene Classification","volume":"31","author":"He","year":"2020","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"G\u00f3mez, P., and Meoni, G. (2021). MSMatch: Semi-Supervised Multispectral Scene Classification with Few Labels. arXiv.","DOI":"10.1109\/JSTARS.2021.3126082"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Schmitt, M., and Wu, Y.L. (2021). Remote Sensing Image Classification with the SEN12MS Dataset. 
arXiv.","DOI":"10.5194\/isprs-annals-V-2-2021-101-2021"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1108","DOI":"10.1109\/TGRS.2008.2007741","article-title":"Toward the Automatic Updating of Land-Cover Maps by a Domain-Adaptation SVM Classifier and a Circular Validation Strategy","volume":"47","author":"Bruzzone","year":"2009","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"336","DOI":"10.1109\/LGRS.2008.916070","article-title":"Semisupervised Image Classification With Laplacian Support Vector Machines","volume":"5","year":"2008","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_29","unstructured":"Vinyals, O., Blundell, C., Lillicrap, T., and Wierstra, D. (2016, January 5\u201310). Matching networks for one shot learning. Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, Barcelona, Spain."},{"key":"ref_30","unstructured":"Snell, J., Swersky, K., and Zemel, R. (2017). Prototypical networks for few-shot learning. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P., and Hospedales, T.M. (2018, January 18\u201323). Learning to Compare: Relation Network for Few-Shot Learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00131"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"4157","DOI":"10.1109\/TGRS.2017.2689071","article-title":"Zero-Shot Scene Classification for High Spatial Resolution Remote Sensing Images","volume":"55","author":"Li","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_33","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., and Dean, J. (2013, January 5\u201310). Distributed representations of words and phrases and their compositionality. 
Proceedings of the Advances in Neural Information Processing Systems 2013, Stateline, NV, USA."},{"key":"ref_34","unstructured":"Dong, X., Zheng, L., Ma, F., Yang, Y., and Meng, D. (2017). Few-shot Object Detection. arXiv."},{"key":"ref_35","unstructured":"Lampinen, A.K., and McClelland, J.L. (2017). One-shot and few-shot learning of word embeddings. arXiv."},{"key":"ref_36","unstructured":"Brown, T.B., Mann, B.P., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., and Askell, A. (2020). Language Models are Few-Shot Learners. arXiv."},{"key":"ref_37","unstructured":"Lake, B., Lee, C.y., Glass, J., and Tenenbaum, J. (2014, January 23\u201326). One-shot learning of generative speech concepts. Proceedings of the Annual Meeting of the Cognitive Science Society, Quebec City, QC, Canada."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Wu, D., Zhu, F., and Shao, L. (2012, January 16\u201321). One shot learning gesture recognition from rgbd images. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI, USA.","DOI":"10.1109\/CVPRW.2012.6239179"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Kim, M., Zuallaert, J., and De Neve, W. (2017, January 23). Few-shot Learning Using a Small-Sized Dataset of High-Resolution FUNDUS Images for Glaucoma Diagnosis. Proceedings of the 2nd International Workshop on Multimedia for Personal Health and Health Care, Mountain View, CA, USA.","DOI":"10.1145\/3132635.3132650"},{"key":"ref_40","unstructured":"Liu, M.Y., Huang, X., Mallya, A., Karras, T., Aila, T., Lehtinen, J., and Kautz, J. (November, January 27). Few-Shot Unsupervised Image-to-Image Translation. 
Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1021\/acscentsci.6b00367","article-title":"Low data drug discovery with one-shot learning","volume":"3","author":"Ramsundar","year":"2017","journal-title":"ACS Cent. Sci."},{"key":"ref_42","unstructured":"Finn, C., Abbeel, P., and Levine, S. (2017, January 6\u201311). Model-agnostic meta-learning for fast adaptation of deep networks. Proceedings of the International Conference on Machine Learning, Sydney, Australia."},{"key":"ref_43","unstructured":"Ravi, S., and Larochelle, H. (2017, January 24\u201326). Optimization as a model for few-shot learning. Proceedings of the 5th International Conference on Learning Representations, Toulon, France."},{"key":"ref_44","unstructured":"Li, Z., Zhou, F., Chen, F., and Li, H. (2017). Meta-SGD: Learning to Learn Quickly for Few Shot Learning. arXiv."},{"key":"ref_45","unstructured":"Munkhdalai, T., and Yu, H. (2017, January 6\u201311). Meta networks. Proceedings of the International Conference on Machine Learning, Sydney, NSW, Australia."},{"key":"ref_46","unstructured":"Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., and Lillicrap, T. (2016). One-shot learning with memory-augmented neural networks. arXiv."},{"key":"ref_47","unstructured":"Kaiser, \u0141., Nachum, O., Roy, A., and Bengio, S. (2017). Learning to remember rare events. arXiv."},{"key":"ref_48","unstructured":"Ramalho, T., and Garnelo, M. (2019, January 6\u20139). Adaptive Posterior Learning: Few-shot learning with a surprise-based memory module. Proceedings of the 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA."},{"key":"ref_49","unstructured":"Koch, G., Zemel, R., and Salakhutdinov, R. (2015, January 6\u201311). Siamese neural networks for one-shot image recognition. 
Proceedings of the International Conference on Machine Learning, Lille, France."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"3965","DOI":"10.1109\/TGRS.2017.2685945","article-title":"AID: A benchmark data set for performance evaluation of aerial scene classification","volume":"55","author":"Xia","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201321). Imagenet: A large-scale hierarchical image database. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_52","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Qiao, S., Chen, L.C., and Yuille, A. (2020). DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution. arXiv.","DOI":"10.1109\/CVPR46437.2021.01008"},{"key":"ref_54","unstructured":"Tao, A., Sapra, K., and Catanzaro, B. (2020). Hierarchical Multi-Scale Attention for Semantic Segmentation. arXiv."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Kim, S., Kook, H.K., Sun, J.Y., Kang, M.C., and Ko, S. (2018, January 8\u201314). Parallel Feature Pyramid Network for Object Detection. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01228-1_15"},{"key":"ref_58","unstructured":"Satorras, V.G., and Estrach, J.B. (May, January 30). Few-Shot Learning with Graph Neural Networks. Proceedings of the 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada."},{"key":"ref_59","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.patrec.2020.07.015","article-title":"Improved prototypical networks for few-Shot learning","volume":"140","author":"Ji","year":"2020","journal-title":"Pattern Recognit. Lett."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Wang, K., Liew, J., Zou, Y., Zhou, D., and Feng, J. (2019, January 27\u201328). PANet: Few-Shot Image Semantic Segmentation With Prototype Alignment. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Korea.","DOI":"10.1109\/ICCV.2019.00929"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Zheng, Y.D., Ma, Y., Liu, R.Z., and Lu, T. (2019, January 14\u201319). A Novel Group-Aware Pruning Method for Few-shot Learning. Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary.","DOI":"10.1109\/IJCNN.2019.8852221"},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"19891","DOI":"10.1109\/ACCESS.2020.3044192","article-title":"Few-Shot Scene Classification With Multi-Attention Deepemd Network in Remote Sensing","volume":"9","author":"Yuan","year":"2021","journal-title":"IEEE Access"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Ghiasi, G., Cui, Y., Srinivas, A., Qian, R., Lin, T.Y., Cubuk, E.D., Le, Q.V., and Zoph, B. (2020). 
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation. arXiv.","DOI":"10.1109\/CVPR46437.2021.00294"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path aggregation network for instance segmentation. Proceedings of the IEEE conference on computer vision and pattern recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_65","unstructured":"Zoph, B., and Le, Q.V. (2016). Neural architecture search with reinforcement learning. arXiv."},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Ghiasi, G., Lin, T.Y., and Le, Q.V. (2019, January 16\u201320). NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00720"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 13\u201319). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Sun, K., Xiao, B., Liu, D., and Wang, J. (2019, January 16\u201320). Deep High-Resolution Representation Learning for Human Pose Estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00584"},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"652","DOI":"10.1109\/TPAMI.2019.2938758","article-title":"Res2net: A new multi-scale backbone architecture","volume":"43","author":"Gao","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_70","unstructured":"Singh, B., Najibi, M., and Davis, L.S. (2018). Sniper: Efficient multi-scale training. 
arXiv."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Pan, X., Ren, Y., Sheng, K., Dong, W., Yuan, H., Guo, X., Ma, C., and Xu, C. (2020, January 13\u201319). Dynamic Refinement Network for Oriented and Densely Packed Object Detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01122"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Pang, J., Chen, K., Shi, J., Feng, H., Ouyang, W., and Lin, D. (2019, January 15\u201320). Libra R-CNN: Towards Balanced Learning for Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00091"},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1109\/LGRS.2020.2968550","article-title":"Self-Attention-Based Deep Feature Fusion for Remote Sensing Scene Classification","volume":"18","author":"Cao","year":"2021","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Cao, Y., Xu, J., Lin, S., Wei, F., and Hu, H. (2019, January 27\u201328). GCNet: Non-Local Networks Meet Squeeze-Excitation Networks and Beyond. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Korea.","DOI":"10.1109\/ICCVW.2019.00246"},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Wang, X., Girshick, R., Gupta, A., and He, K. (2018, January 18\u201323). Non-local neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00813"},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Dai, J., Qi, H., Xiong, Y., Li, Y., Zhang, G., Hu, H., and Wei, Y. (2017, January 22\u201329). Deformable Convolutional Networks. 
Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.89"},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Zhu, X., Cheng, D., Zhang, Z., Lin, S., and Dai, J. (2019). An Empirical Study of Spatial Attention Mechanisms in Deep Networks. arXiv.","DOI":"10.1109\/ICCV.2019.00679"},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional Block Attention Module. Proceedings of the 15th European Conference, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Fu, J., Liu, J., Tian, H., Li, Y., Bao, Y., Fang, Z., and Lu, H. (2019, January 15\u201320). Dual attention network for scene segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00326"},{"key":"ref_81","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv."},{"key":"ref_82","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv."},{"key":"ref_83","unstructured":"Ioffe, S., and Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv."},{"key":"ref_84","unstructured":"Nair, V., and Hinton, G.E. (2010, January 21\u201324). 
Rectified Linear Units Improve Restricted Boltzmann Machines. Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel."},{"key":"ref_85","unstructured":"Chen, W.Y., Liu, Y.C., Kira, Z., Wang, Y.C., and Huang, J.B. (2019, January 6\u20139). A Closer Look at Few-shot Classification. Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA."},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 3\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th ACM SIGSPATIAL International Symposium on Advances in Geographic Information Systems, ACM-GIS, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_87","unstructured":"Kingma, D., and Ba, J. (2015, January 7\u20139). Adam: A Method for Stochastic Optimization. Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA."},{"key":"ref_88","unstructured":"Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014, January 8\u201313). Generative Adversarial Nets. Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, Montreal, QC, Canada."},{"key":"ref_89","first-page":"10276","article-title":"Learning to self-train for semi-supervised few-shot classification","volume":"32","author":"Li","year":"2019","journal-title":"Adv. Neural Inf. Proc. Syst."},{"key":"ref_90","doi-asserted-by":"crossref","unstructured":"Ma, T., and Zhang, A. (February, January 27). AffinityNet: Semi-supervised few-shot learning for disease type prediction. 
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, HI, USA.","DOI":"10.1609\/aaai.v33i01.33011069"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/13\/2532\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T06:26:15Z","timestamp":1760163975000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/13\/13\/2532"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,28]]},"references-count":90,"journal-issue":{"issue":"13","published-online":{"date-parts":[[2021,7]]}},"alternative-id":["rs13132532"],"URL":"https:\/\/doi.org\/10.3390\/rs13132532","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,28]]}}}