{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T16:13:35Z","timestamp":1774628015083,"version":"3.50.1"},"reference-count":72,"publisher":"MDPI AG","issue":"14","license":[{"start":{"date-parts":[[2023,7,21]],"date-time":"2023-07-21T00:00:00Z","timestamp":1689897600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["No. 62076192"],"award-info":[{"award-number":["No. 62076192"]}]},{"name":"National Natural Science Foundation of China","award":["No. 61836009"],"award-info":[{"award-number":["No. 61836009"]}]},{"name":"National Natural Science Foundation of China","award":["No. IRT_15R53"],"award-info":[{"award-number":["No. IRT_15R53"]}]},{"name":"National Natural Science Foundation of China","award":["No. B07048"],"award-info":[{"award-number":["No. B07048"]}]},{"name":"State Key Program of National Natural Science of China","award":["No. 62076192"],"award-info":[{"award-number":["No. 62076192"]}]},{"name":"State Key Program of National Natural Science of China","award":["No. 61836009"],"award-info":[{"award-number":["No. 61836009"]}]},{"name":"State Key Program of National Natural Science of China","award":["No. IRT_15R53"],"award-info":[{"award-number":["No. IRT_15R53"]}]},{"name":"State Key Program of National Natural Science of China","award":["No. B07048"],"award-info":[{"award-number":["No. B07048"]}]},{"name":"Program for Cheung Kong Scholars and Innovative Research Team in University","award":["No. 62076192"],"award-info":[{"award-number":["No. 62076192"]}]},{"name":"Program for Cheung Kong Scholars and Innovative Research Team in University","award":["No. 61836009"],"award-info":[{"award-number":["No. 61836009"]}]},{"name":"Program for Cheung Kong Scholars and Innovative Research Team in University","award":["No. 
IRT_15R53"],"award-info":[{"award-number":["No. IRT_15R53"]}]},{"name":"Program for Cheung Kong Scholars and Innovative Research Team in University","award":["No. B07048"],"award-info":[{"award-number":["No. B07048"]}]},{"name":"Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project)","award":["No. 62076192"],"award-info":[{"award-number":["No. 62076192"]}]},{"name":"Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project)","award":["No. 61836009"],"award-info":[{"award-number":["No. 61836009"]}]},{"name":"Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project)","award":["No. IRT_15R53"],"award-info":[{"award-number":["No. IRT_15R53"]}]},{"name":"Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project)","award":["No. B07048"],"award-info":[{"award-number":["No. B07048"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Remote sensing (RS) scene classification has received considerable attention due to its wide applications in the RS community. Many methods based on convolutional neural networks (CNNs) have been proposed to classify complex RS scenes, but they cannot fully capture the context in RS images because of the lack of long-range dependencies (the dependency relationship between two distant elements). Recently, some researchers fine-tuned the large pretrained vision transformer (ViT) on small RS datasets to extract long-range dependencies effectively in RS scenes. However, it usually takes more time to fine-tune the ViT on account of high computational complexity. The lack of good local feature representation in the ViT limits classification performance improvement. To this end, we propose a lightweight transformer network (LTNet) for RS scene classification. First, a multi-level group convolution (MLGC) module is presented. 
It enriches the diversity of local features and requires a lower computational cost by co-representing multi-level and multi-group features in a single module. Then, based on the MLGC module, a lightweight transformer block, LightFormer, is designed to capture global dependencies with fewer computing resources. Finally, the LTNet is built using the MLGC module and LightFormer. Experiments fine-tuning the LTNet on four RS scene classification datasets demonstrate that the proposed network achieves competitive classification performance with less training time.<\/jats:p>","DOI":"10.3390\/rs15143645","type":"journal-article","created":{"date-parts":[[2023,7,24]],"date-time":"2023-07-24T01:12:28Z","timestamp":1690161148000},"page":"3645","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":29,"title":["Faster and Better: A Lightweight Transformer Network for Remote Sensing Scene Classification"],"prefix":"10.3390","volume":"15","author":[{"given":"Xinyan","family":"Huang","sequence":"first","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi\u2019an 710071, China"},{"name":"International Research Center for Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"Joint International Research Laboratory of Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Fang","family":"Liu","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi\u2019an 710071, China"},{"name":"International Research Center for Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"Joint International Research Laboratory of Intelligent 
Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Yuanhao","family":"Cui","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi\u2019an 710071, China"},{"name":"International Research Center for Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"Joint International Research Laboratory of Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"given":"Puhua","family":"Chen","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi\u2019an 710071, China"},{"name":"International Research Center for Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"Joint International Research Laboratory of Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6130-2518","authenticated-orcid":false,"given":"Lingling","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi\u2019an 710071, China"},{"name":"International Research Center for Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"Joint International Research Laboratory of Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, 
China"}]},{"given":"Pengfang","family":"Li","sequence":"additional","affiliation":[{"name":"Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi\u2019an 710071, China"},{"name":"International Research Center for Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"Joint International Research Laboratory of Intelligent Perception and Computation, Xidian University, Xi\u2019an 710071, China"},{"name":"School of Artificial Intelligence, Xidian University, Xi\u2019an 710071, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,7,21]]},"reference":[{"key":"ref_1","unstructured":"Xiao, Y., and Zhan, Q. (2009, January 20\u201322). A review of remote sensing applications in urban planning and management in China. Proceedings of the 2009 Joint Urban Remote Sensing Event, Shanghai, China."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"4928","DOI":"10.1109\/TGRS.2011.2151866","article-title":"Segment optimization and data-driven thresholding for knowledge-based landslide detection by object-based image analysis","volume":"49","author":"Martha","year":"2011","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2564","DOI":"10.1016\/j.rse.2011.05.013","article-title":"Object-oriented mapping of landslides using Random Forests","volume":"115","author":"Stumpf","year":"2011","journal-title":"Remote. Sens. Environ."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1080\/01431161.2012.705443","article-title":"Automatic landslide detection from remote-sensing imagery using a scene classification method based on BoVW and pLSA","volume":"34","author":"Cheng","year":"2013","journal-title":"Int. J. Remote. 
Sens."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"111322","DOI":"10.1016\/j.rse.2019.111322","article-title":"Land-cover classification with high-resolution remote sensing images using transferable deep models","volume":"237","author":"Tong","year":"2020","journal-title":"Remote. Sens. Environ."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhang, Y., Tao, C., and Zhu, H. (2016). Content-based high-resolution remote sensing image retrieval via unsupervised feature learning and collaborative affinity metric fusion. Remote Sens., 8.","DOI":"10.3390\/rs8090709"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1016\/j.neucom.2016.05.061","article-title":"Local structure learning in high resolution remote sensing image retrieval","volume":"207","author":"Du","year":"2016","journal-title":"Neurocomputing"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"255","DOI":"10.1016\/j.patcog.2016.11.015","article-title":"SAR image segmentation based on convolutional-wavelet neural network and Markov random field","volume":"64","author":"Duan","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"125","DOI":"10.1016\/j.patcog.2017.10.025","article-title":"A modified convolutional neural network for face sketch synthesis","volume":"76","author":"Jiao","year":"2018","journal-title":"Pattern Recognit."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"107110","DOI":"10.1016\/j.patcog.2019.107110","article-title":"Complex Contourlet-CNN for polarimetric SAR image classification","volume":"100","author":"Li","year":"2020","journal-title":"Pattern Recognit."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"4225","DOI":"10.1109\/TIP.2021.3065244","article-title":"Semantic perceptual image compression with a laplacian pyramid of convolutional networks","volume":"30","author":"Wang","year":"2021","journal-title":"IEEE Trans. 
Image Process."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"108284","DOI":"10.1016\/j.patcog.2021.108284","article-title":"Context extraction module for deep convolutional neural networks","volume":"122","author":"Singh","year":"2022","journal-title":"Pattern Recognit."},{"key":"ref_13","first-page":"1","article-title":"Polarimetric multipath convolutional neural network for PolSAR image classification","volume":"60","author":"Cui","year":"2021","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"539","DOI":"10.1016\/j.patcog.2016.07.001","article-title":"Towards better exploiting convolutional neural networks for remote sensing scene classification","volume":"61","author":"Nogueira","year":"2017","journal-title":"Pattern Recognit."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1865","DOI":"10.1109\/JPROC.2017.2675998","article-title":"Remote sensing image scene classification: Benchmark and state of the art","volume":"105","author":"Cheng","year":"2017","journal-title":"Proc. IEEE"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Bazi, Y., Al Rahhal, M.M., Alhichri, H., and Alajlan, N. (2019). Simple yet effective fine-tuning of deep CNNs using an auxiliary classification loss for remote sensing scene classification. Remote Sens., 11.","DOI":"10.3390\/rs11242908"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1986","DOI":"10.1109\/JSTARS.2020.2988477","article-title":"Classification of high-spatial-resolution remote sensing scenes method using transfer learning and deep convolutional neural network","volume":"13","author":"Li","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote. 
Sens."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"7894","DOI":"10.1109\/TGRS.2019.2917161","article-title":"A feature aggregation convolutional neural network for remote sensing scene classification","volume":"57","author":"Lu","year":"2019","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1109\/TGRS.2019.2931801","article-title":"Remote sensing scene classification by gated bidirectional network","volume":"58","author":"Sun","year":"2019","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"6899","DOI":"10.1109\/TGRS.2018.2845668","article-title":"Remote sensing scene classification using multilayer stacked covariance pooling","volume":"56","author":"He","year":"2018","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1109\/LGRS.2017.2779469","article-title":"Scene classification based on two-stage deep feature fusion","volume":"15","author":"Liu","year":"2017","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"28746","DOI":"10.1109\/ACCESS.2020.2968771","article-title":"Remote sensing scene classification based on multi-structure deep features fusion","volume":"8","author":"Xue","year":"2020","journal-title":"IEEE Access"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"7918","DOI":"10.1109\/TGRS.2020.3044655","article-title":"Enhanced feature pyramid network with deep semantic embedding for remote sensing scene classification","volume":"59","author":"Wang","year":"2021","journal-title":"IEEE Trans. Geosci. Remote. 
Sens."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1155","DOI":"10.1109\/TGRS.2018.2864987","article-title":"Scene classification with recurrent attention of VHR remote sensing images","volume":"57","author":"Wang","year":"2019","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"2030","DOI":"10.1109\/JSTARS.2021.3051569","article-title":"Attention consistent network for remote sensing scene classification","volume":"14","author":"Tang","year":"2021","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens."},{"key":"ref_26","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS), Long Beach, CA, USA."},{"key":"ref_27","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2021, January 4). An image is worth 16x16 words: Transformers for image recognition at scale. Proceedings of the 9th International Conference on Learning Representations (ICLR), Vienna, Austria."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wu, H., Xiao, B., Codella, N., Liu, M., Dai, X., Yuan, L., and Zhang, L. (2021, January 11\u201317). Cvt: Introducing convolutions to vision transformers. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00009"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Yuan, L., Chen, Y., Wang, T., Yu, W., Shi, Y., Jiang, Z.H., Tay, F.E., Feng, J., and Yan, S. (2021, January 11\u201317). Tokens-to-token vit: Training vision transformers from scratch on imagenet. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00060"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with transformers. Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Sun, Z., Cao, S., Yang, Y., and Kitani, K.M. (2021, January 11\u201317). Rethinking transformer-based set prediction for object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00359"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Chen, H., Wang, Y., Guo, T., Xu, C., Deng, Y., Liu, Z., Ma, S., Xu, C., Xu, C., and Gao, W. (2021, January 20\u201325). Pre-trained image processing transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01212"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wang, Y., Xu, Z., Wang, X., Shen, C., Cheng, B., Shen, H., and Xia, H. (2021, January 20\u201325). End-to-end video instance segmentation with transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00863"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., and Torr, P.H. (2021, January 20\u201325). Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_35","unstructured":"Li, S., Liu, F., and Jiao, L. (March, January 22). Self-Training Multi-Sequence Learning with Transformer for Weakly Supervised Video Anomaly Detection. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Virtual Event."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Bazi, Y., Bashmal, L., Rahhal, M.M.A., Dayil, R.A., and Ajlan, N.A. (2021). Vision transformers for remote sensing image classification. Remote Sens., 13.","DOI":"10.3390\/rs13030516"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2223","DOI":"10.1109\/JSTARS.2022.3155665","article-title":"Homo\u2013heterogenous transformer learning framework for RS scene classification","volume":"15","author":"Ma","year":"2022","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Srinivas, A., Lin, T.Y., Parmar, N., Shlens, J., Abbeel, P., and Vaswani, A. (2021, January 20\u201325). Bottleneck transformers for visual recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01625"},{"key":"ref_39","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201323). Mobilenetv2: Inverted residuals and linear bottlenecks. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_41","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L.C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (November, January 27). Searching for mobilenetv3. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201323). Shufflenet: An extremely efficient convolutional neural network for mobile devices. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Ma, N., Zhang, X., Zheng, H.T., and Sun, J. (2018, January 8\u201314). Shufflenet v2: Practical guidelines for efficient cnn architecture design. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_8"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., and Xu, C. (2020, January 13\u201319). Ghostnet: More features from cheap operations. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00165"},{"key":"ref_45","unstructured":"Hassani, A., Walton, S., Shah, N., Abuduweili, A., Li, J., and Shi, H. (2021). Escaping the Big Data Paradigm with Compact Transformers. arXiv."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Graham, B., El-Nouby, A., Touvron, H., Stock, P., Joulin, A., J\u00e9gou, H., and Douze, M. (2021, January 11\u201317). LeViT: A Vision Transformer in ConvNet\u2019s Clothing for Faster Inference. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.01204"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Yang, Y., and Newsam, S. (2010, January 2\u20135). Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, San Jose, CA, USA.","DOI":"10.1145\/1869790.1869829"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"3965","DOI":"10.1109\/TGRS.2017.2685945","article-title":"AID: A benchmark data set for performance evaluation of aerial scene classification","volume":"55","author":"Xia","year":"2017","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_50","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and J\u00e9gou, H. (2021, January 18\u201324). Training data-efficient image transformers & distillation through attention. Proceedings of the International Conference on Machine Learning (ICML), Virtual Event."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Chen, C.F.R., Fan, Q., and Panda, R. (2021, January 11\u201317). Crossvit: Cross-attention multi-scale vision transformer for image classification. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00041"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Dai, Z., Cai, B., Lin, Y., and Chen, J. (2021, January 20\u201325). 
Up-detr: Unsupervised pre-training for object detection with transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00165"},{"key":"ref_53","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2021, January 3\u20137). Deformable detr: Deformable transformers for end-to-end object detection. Proceedings of the 9th International Conference on Learning Representations (ICLR), Virtual Event."},{"key":"ref_54","unstructured":"Mehta, S., and Rastegari, M. (2022, January 25\u201329). Mobilevit: Light-weight, general-purpose, and mobile-friendly vision transformer. Proceedings of the 10th International Conference on Learning Representations (ICLR), Virtual Event."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"106379","DOI":"10.1016\/j.cmpb.2021.106379","article-title":"MFB-LANN: A lightweight and updatable myocardial infarction diagnosis system based on convolutional neural networks and active learning","volume":"210","author":"He","year":"2021","journal-title":"Comput. Methods Programs Biomed."},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1016\/j.neunet.2021.08.002","article-title":"Learning lightweight super-resolution networks with weight pruning","volume":"144","author":"Jiang","year":"2021","journal-title":"Neural Networks"},{"key":"ref_57","doi-asserted-by":"crossref","first-page":"9290","DOI":"10.1109\/TGRS.2021.3051057","article-title":"Ridgelet-Nets With Speckle Reduction Regularization for SAR Image Scene Classification","volume":"59","author":"Qian","year":"2021","journal-title":"IEEE Trans. Geosci. Remote. 
Sens."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"220","DOI":"10.1016\/j.neucom.2021.04.086","article-title":"Progressive Mimic Learning: A new perspective to train lightweight CNN models","volume":"456","author":"Ma","year":"2021","journal-title":"Neurocomputing"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Ioannou, Y., Robertson, D., Cipolla, R., and Criminisi, A. (2017, January 21\u201326). Deep roots: Improving cnn efficiency with hierarchical filter groups. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.633"},{"key":"ref_60","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very deep convolutional networks for large-scale image recognition. Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_61","unstructured":"Krizhevsky, A., and Hinton, G. (2009). Learning Multiple Layers of Features from Tiny Images, University of Toronto."},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_63","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019, January 8\u201314). Pytorch: An imperative style, high-performance deep learning library. Proceedings of the Advances in Neural Information Processing Systems 32 (NIPS), Vancouver, BC, Canada."},{"key":"ref_64","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet Classification with Deep Convolutional Neural Networks. 
Proceedings of the Advances in Neural Information Processing Systems 25 (NIPS), Lake Tahoe, NV, USA."},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_66","unstructured":"Tan, M., and Le, Q. (2019, January 9\u201315). Efficientnet: Rethinking model scaling for convolutional neural networks. Proceedings of the International Conference on Machine Learning (ICML). PMLR, Long Beach, CA, USA."},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_68","first-page":"9969","article-title":"GhostNetv2: Enhance cheap operation with long-range attention","volume":"35","author":"Tang","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst. (NIPS)"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Luo, J.H., Wu, J., and Lin, W. (2017, January 22\u201329). Thinet: A filter level pruning method for deep neural network compression. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.541"},{"key":"ref_70","unstructured":"Wang, Y., Xu, C., Xu, C., Xu, C., and Tao, D. (2018, January 3\u20138). Learning versatile filters for efficient convolutional neural networks. 
Proceedings of the Advances in Neural Information Processing Systems 31 (NIPS), Montr\u00e9al, QC, Canada."},{"key":"ref_71","unstructured":"Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A.L. (2015, January 7\u20139). Semantic image segmentation with deep convolutional nets and fully connected crfs. Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA."},{"key":"ref_72","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/14\/3645\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:16:36Z","timestamp":1760127396000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/14\/3645"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,21]]},"references-count":72,"journal-issue":{"issue":"14","published-online":{"date-parts":[[2023,7]]}},"alternative-id":["rs15143645"],"URL":"https:\/\/doi.org\/10.3390\/rs15143645","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,7,21]]}}}