{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,16]],"date-time":"2026-01-16T09:03:37Z","timestamp":1768554217935,"version":"3.49.0"},"reference-count":61,"publisher":"MDPI AG","issue":"16","license":[{"start":{"date-parts":[[2022,8,16]],"date-time":"2022-08-16T00:00:00Z","timestamp":1660608000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Science Foundation of China (NSFC)","doi-asserted-by":"publisher","award":["U1931134"],"award-info":[{"award-number":["U1931134"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Science Foundation of China (NSFC)","doi-asserted-by":"publisher","award":["A2020202001"],"award-info":[{"award-number":["A2020202001"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Natural Science Foundation of Hebei","award":["U1931134"],"award-info":[{"award-number":["U1931134"]}]},{"name":"Natural Science Foundation of Hebei","award":["A2020202001"],"award-info":[{"award-number":["A2020202001"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>In recent years, convolutional neural networks (CNNs) have achieved competitive performance in the field of ground-based cloud image (GCI) classification. Proposed CNN-based methods can fully extract the local features of images. However, due to the locality of the convolution operation, they cannot well establish the long-range dependencies between the images, and thus they cannot extract the global features of images. Transformer has been applied to computer vision with great success due to its powerful global modeling capability. 
Inspired by this, we propose a Transformer-based GCI classification method that combines the advantages of the CNN and Transformer models. Firstly, the CNN model acts as a low-level feature extraction tool to generate local feature sequences of images. Then, the Transformer model is used to learn the global features of the images by efficiently extracting the long-range dependencies between the sequences. Finally, a linear classifier is used for GCI classification. In addition, we introduce a center loss function to address the problem of the simple cross-entropy loss not adequately supervising feature learning. Our method is evaluated on three commonly used datasets: ASGC, CCSN, and GCD. The experimental results show that the method achieves 94.24%, 92.73%, and 93.57% accuracy, respectively, outperforming other state-of-the-art methods. This demonstrates that the Transformer has great potential for GCI classification tasks.<\/jats:p>","DOI":"10.3390\/rs14163978","type":"journal-article","created":{"date-parts":[[2022,8,17]],"date-time":"2022-08-17T03:15:27Z","timestamp":1660706127000},"page":"3978","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":26,"title":["A Novel Method for Ground-Based Cloud Image Classification Using Transformer"],"prefix":"10.3390","volume":"14","author":[{"given":"Xiaotong","family":"Li","sequence":"first","affiliation":[{"name":"Department of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China"}]},{"given":"Bo","family":"Qiu","sequence":"additional","affiliation":[{"name":"Department of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China"}]},{"given":"Guanlong","family":"Cao","sequence":"additional","affiliation":[{"name":"Department of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, 
China"}]},{"given":"Chao","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China"}]},{"given":"Liwen","family":"Zhang","sequence":"additional","affiliation":[{"name":"Department of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,8,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"117834","DOI":"10.1016\/j.apenergy.2021.117834","article-title":"Machine Learning techniques for solar irradiation nowcasting: Cloud type classification forecast through satellite data and imagery","volume":"305","author":"Nespoli","year":"2022","journal-title":"Appl. Energ."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"82","DOI":"10.1088\/1674-4527\/20\/6\/82","article-title":"Data processing and data products from 2017 to 2019 campaign of astronomical site testing at Ali, Daocheng and Muztagh-ata","volume":"20","author":"Cao","year":"2020","journal-title":"Res. Astron. Astrophys."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"94","DOI":"10.1002\/qj.3907","article-title":"Effects of terrain-following vertical coordinates on simulation of stratus clouds in numerical weather prediction models","volume":"147","author":"Westerhuis","year":"2021","journal-title":"Q. J. R. Meteorol. Soc."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"633","DOI":"10.1175\/JTECH1875.1","article-title":"Retrieving cloud characteristics from ground-based daytime color all-sky images","volume":"23","author":"Long","year":"2006","journal-title":"J. Atmos. Ocean. 
Technol."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"6657","DOI":"10.1080\/01431161.2018.1466069","article-title":"Cloud detection for high-resolution remote-sensing images of urban areas using colour and edge features based on dual-colour models","volume":"39","author":"Huang","year":"2018","journal-title":"Int. J. Remote Sens."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Liu, Y., Tang, Y., Hua, S., Luo, R., and Zhu, Q. (2019). Features of the cloud base height and determining the threshold of relative humidity over southeast China. Remote Sens., 11.","DOI":"10.3390\/rs11242900"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"871","DOI":"10.1038\/ngeo2828","article-title":"Impact of decadal cloud variations on the Earth\u2019s energy budget","volume":"9","author":"Zhou","year":"2016","journal-title":"Nat. Geosci."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"542","DOI":"10.3390\/make3030028","article-title":"Voting in transfer learning system for ground-based cloud classification","volume":"3","author":"Manzo","year":"2021","journal-title":"Mach. Learn. Knowl. Extr."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"4787","DOI":"10.1007\/s00382-018-4413-y","article-title":"The cloud-free global energy balance and inferred cloud radiative effects: An assessment based on direct observations and climate models","volume":"52","author":"Wild","year":"2019","journal-title":"Clim. Dynam."},{"key":"ref_10","first-page":"11045","article-title":"Automatic cloud-type classification based on the combined use of a sky camera and a ceilometer","volume":"122","year":"2017","journal-title":"J. Geophys. Res. 
Atmos."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"4898","DOI":"10.1109\/JSTARS.2017.2734912","article-title":"A cloud detection method based on relationship between objects of cloud and cloud-shadow for Chinese moderate to high resolution satellite imagery","volume":"10","author":"Zhong","year":"2017","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"583","DOI":"10.5194\/essd-10-583-2018","article-title":"The international satellite cloud climatology project H-Series climate data record product","volume":"10","author":"Young","year":"2018","journal-title":"Earth Syst. Sci. Data"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1900","DOI":"10.1007\/s12517-021-08259-w","article-title":"An integrated deep learning framework of U-Net and inception module for cloud detection of remote sensing images","volume":"14","author":"Kumthekar","year":"2021","journal-title":"Arab. J. Geosci."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Jain, M., Gollini, I., Bertolotto, M., McArdle, G., and Dev, S. (2021, January 11\u201316). An extremely-low cost ground-based whole sky imager. Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium, Brussels, Belgium.","DOI":"10.1109\/IGARSS47720.2021.9553032"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"251","DOI":"10.1016\/j.solener.2019.02.004","article-title":"Determination of cloud transmittance for all sky imager based solar nowcasting","volume":"181","author":"Nouri","year":"2019","journal-title":"Sol. Energy"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1016\/j.solener.2018.10.079","article-title":"Cloud height and tracking accuracy of three all sky imager systems for individual clouds","volume":"177","author":"Nouri","year":"2019","journal-title":"Sol. 
Energy"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"557","DOI":"10.5194\/amt-3-557-2010","article-title":"Automatic cloud classification of whole sky images","volume":"3","author":"Heinle","year":"2010","journal-title":"Atmos. Meas. Technol."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"753","DOI":"10.5194\/amt-9-753-2016","article-title":"From pixels to patches: A cloud classification method based on a bag of micro-structures","volume":"9","author":"Li","year":"2016","journal-title":"Atmos. Meas. Technol."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Dev, S., Lee, Y.H., and Winkler, S. (2015, January 27\u201330). Categorization of cloud image patches using an improved texton-based approach. Proceedings of the 2015 IEEE International Conference on Image Processing, Quebec City, QC, Canada.","DOI":"10.1109\/ICIP.2015.7350833"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"789","DOI":"10.1175\/JTECH-D-15-0015.1","article-title":"mCLOUD: A multiview visual feature extraction mechanism for ground-based cloud image categorization","volume":"33","author":"Xiao","year":"2016","journal-title":"J. Atmos. Ocean. Technol."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1175\/JTECH-D-13-00048.1","article-title":"Cloud classification of ground-based images using texture\u2013structure features","volume":"31","author":"Zhuo","year":"2014","journal-title":"J. Atmos. Ocean. Technol."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"816","DOI":"10.1109\/LGRS.2017.2681658","article-title":"Deep convolutional activations-based features for ground-based cloud classification","volume":"14","author":"Shi","year":"2017","journal-title":"IEEE Geosci. Remote Sens. 
Lett."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"5729","DOI":"10.1109\/TGRS.2017.2712809","article-title":"DeepCloud: Ground-based cloud image categorization using deep convolutional features","volume":"55","author":"Ye","year":"2017","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"510","DOI":"10.1016\/j.solener.2019.01.096","article-title":"3D-CNN-based feature extraction of ground-based cloud images for direct normal irradiance prediction","volume":"181","author":"Zhao","year":"2019","journal-title":"Sol. Energy"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"44111","DOI":"10.1109\/ACCESS.2020.2978090","article-title":"Cloud shape classification system based on multi-channel cnn and improved fdm","volume":"8","author":"Zhao","year":"2020","journal-title":"IEEE Access"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"63081","DOI":"10.1109\/ACCESS.2019.2916905","article-title":"Dual guided loss for ground-based cloud classification in weather station networks","volume":"7","author":"Li","year":"2019","journal-title":"IEEE Access"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"8665","DOI":"10.1029\/2018GL077787","article-title":"CloudNet: Ground-based cloud classification with deep convolutional neural network","volume":"45","author":"Zhang","year":"2018","journal-title":"Geophys. Res. Lett."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"e2020GL087338","DOI":"10.1029\/2020GL087338","article-title":"Ground-based cloud classification using task-based graph convolutional network","volume":"47","author":"Liu","year":"2020","journal-title":"Geophys. Res. Lett."},{"key":"ref_29","first-page":"5602711","article-title":"Ground-Based Remote Sensing Cloud Classification via Context Graph Attention Network","volume":"60","author":"Liu","year":"2021","journal-title":"IEEE Trans. Geosci. 
Remote Sens."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Liu, S., Li, M., Zhang, Z., Xiao, B., and Cao, X. (2018). Multimodal ground-based cloud classification using joint fusion convolutional neural network. Remote Sens., 10.","DOI":"10.3390\/rs10060822"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Liu, S., Li, M., Zhang, Z., Xiao, B., and Durrani, T.S. (2020). Multi-evidence and multi-modal fusion network for ground-based cloud recognition. Remote Sens., 12.","DOI":"10.3390\/rs12030464"},{"key":"ref_33","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Mare\u010dek, D., and Rosa, R. (2018, January 1). Extracting syntactic trees from transformer encoder self-attentions. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, Brussels, Belgium.","DOI":"10.18653\/v1\/W18-5444"},{"key":"ref_35","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_36","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2020). Deformable detr: Deformable transformers for end-to-end object detection. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., and Wang, Y. (2021, January 20\u201325). 
Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_38","unstructured":"Parmar, N., Vaswani, A., Uszkoreit, J., Kaiser, L., Shazeer, N., Ku, A., and Tran, D. (2018, January 10\u201315). Image transformer. Proceedings of the International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_39","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_40","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and J\u00e9gou, H. (2021, January 18\u201324). Training data-efficient image transformers & distillation through attention. Proceedings of the International Conference on Machine Learning, online."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Reedha, R., Dericquebourg, E., Canals, R., and Hafiane, A. (2022). Transformer Neural Network for Weed and Crop Classification of High Resolution UAV Images. Remote Sens., 14.","DOI":"10.3390\/rs14030592"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Chen, Y., Gu, X., Liu, Z., and Liang, J. (2022). A Fast Inference Vision Transformer for Automatic Pavement Image Classification and Its Visual Interpretation Method. Remote Sens., 14.","DOI":"10.3390\/rs14081877"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Shome, D., Kar, T., Mohanty, S.N., Tiwari, P., Muhammad, K., AlTameem, A., Zhang, Y.Z., and Saudagar, A.K.J. (2021). COVID-transformer: Interpretable COVID-19 detection using vision transformer for healthcare. Int. J. Environ. Res. 
Public Health, 18.","DOI":"10.3390\/ijerph182111086"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"He, X., Chen, Y., and Lin, Z. (2021). Spatial-spectral transformer for hyperspectral image classification. Remote Sens., 13.","DOI":"10.3390\/rs13030498"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Jogin, M., Madhulika, M.S., Divya, G.D., Meghana, R.K., and Apoorva, S. (2018, January 18\u201319). Feature extraction using convolution neural networks (CNN) and deep learning. Proceedings of the 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bangalore, India.","DOI":"10.1109\/RTEICT42901.2018.9012507"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1016\/j.tifs.2021.04.042","article-title":"Efficient extraction of deep image features using convolutional neural network (CNN) for applications in detecting and analysing complex food matrices","volume":"113","author":"Liu","year":"2021","journal-title":"Trends Food Sci. Technol."},{"key":"ref_47","unstructured":"Tan, M., and Le, Q. (2019, January 10\u201315). Efficientnet: Rethinking model scaling for convolutional neural networks. Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201323). Mobilenetv2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_49","unstructured":"Ramachandran, P., Zoph, B., and Le, Q.V. (2017). Searching for activation functions. arXiv."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_51","unstructured":"Hendrycks, D., and Gimpel, K. (2016). Gaussian error linear units (gelus). arXiv."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Wen, Y., Zhang, K., Li, Z., and Qiao, Y. (2016, January 8\u201316). A discriminative feature learning approach for deep face recognition. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46478-7_31"},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"178","DOI":"10.3847\/1538-3881\/ab744f","article-title":"Cloud Identification from All-sky Camera Data with Machine Learning","volume":"159","author":"Mommert","year":"2020","journal-title":"Astron. J."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/j.jfa.2013.05.001","article-title":"Bilinear interpolation theorems and applications","volume":"265","year":"2013","journal-title":"J. Funct. Anal."},{"key":"ref_56","unstructured":"Loshchilov, I., and Hutter, F. (2017). Decoupled weight decay regularization. arXiv."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017, January 22\u201329). Grad-cam: Visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.74"},{"key":"ref_58","unstructured":"Simonyan, K., and Zisserman, A. 
(2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_61","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/16\/3978\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:09:24Z","timestamp":1760141364000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/16\/3978"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,16]]},"references-count":61,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2022,8]]}},"alternative-id":["rs14163978"],"URL":"https:\/\/doi.org\/10.3390\/rs14163978","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,16]]}}}