{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T16:38:03Z","timestamp":1775839083241,"version":"3.50.1"},"reference-count":43,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2023,2,19]],"date-time":"2023-02-19T00:00:00Z","timestamp":1676764800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Natural Science Foundation Youth Science Fund Project of Hebei Province","award":["F2021502008"],"award-info":[{"award-number":["F2021502008"]}]},{"name":"Natural Science Foundation Youth Science Fund Project of Hebei Province","award":["2021MS081"],"award-info":[{"award-number":["2021MS081"]}]},{"name":"Natural Science Foundation Youth Science Fund Project of Hebei Province","award":["20220102"],"award-info":[{"award-number":["20220102"]}]},{"name":"General Project of the Fundamental Research Funds for the Central Universities","award":["F2021502008"],"award-info":[{"award-number":["F2021502008"]}]},{"name":"General Project of the Fundamental Research Funds for the Central Universities","award":["2021MS081"],"award-info":[{"award-number":["2021MS081"]}]},{"name":"General Project of the Fundamental Research Funds for the Central Universities","award":["20220102"],"award-info":[{"award-number":["20220102"]}]},{"name":"Open Research Fund of The State Key Laboratory for Management and Control of Complex Systems","award":["F2021502008"],"award-info":[{"award-number":["F2021502008"]}]},{"name":"Open Research Fund of The State Key Laboratory for Management and Control of Complex Systems","award":["2021MS081"],"award-info":[{"award-number":["2021MS081"]}]},{"name":"Open Research Fund of The State Key Laboratory for Management and Control of Complex 
Systems","award":["20220102"],"award-info":[{"award-number":["20220102"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>The zero-shot image classification (ZSIC) is designed to solve the classification problem when the sample is very small, or the category is missing. A common method is to use attribute or word vectors as a priori category features (auxiliary information) and complete the domain transfer from training of seen classes to recognition of unseen classes by building a mapping between image features and a priori category features. However, feature extraction of the whole image lacks discrimination, and the amount of information of single attribute features or word vector features of categories is insufficient, which makes the matching degree between image features and prior class features not high and affects the accuracy of the ZSIC model. To this end, a spatial attention mechanism is designed, and an image feature extraction module based on this attention mechanism is constructed to screen critical features with discrimination. A semantic information fusion method based on matrix decomposition is proposed, which first decomposes the attribute features and then fuses them with the extracted word vector features of a dataset to achieve information expansion. Through the above two improvement measures, the classification accuracy of the ZSIC model for unseen images is improved. 
The experimental results on public datasets verify the effect and superiority of the proposed methods.<\/jats:p>","DOI":"10.3390\/s23042311","type":"journal-article","created":{"date-parts":[[2023,2,20]],"date-time":"2023-02-20T02:29:08Z","timestamp":1676860148000},"page":"2311","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Zero-Shot Image Classification Method Based on Attention Mechanism and Semantic Information Fusion"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4437-144X","authenticated-orcid":false,"given":"Yaru","family":"Wang","sequence":"first","affiliation":[{"name":"Department of Automation, North China Electric Power University, Baoding 071003, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Lilong","family":"Feng","sequence":"additional","affiliation":[{"name":"Department of Automation, North China Electric Power University, Baoding 071003, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoke","family":"Song","sequence":"additional","affiliation":[{"name":"Department of Automation, North China Electric Power University, Baoding 071003, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dawei","family":"Xu","sequence":"additional","affiliation":[{"name":"Department of Automation, North China Electric Power University, Baoding 071003, China"},{"name":"State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2997-5840","authenticated-orcid":false,"given":"Yongjie","family":"Zhai","sequence":"additional","affiliation":[{"name":"Department of Automation, North China Electric Power University, Baoding 071003, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,19]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"436","DOI":"10.1038\/nature14539","article-title":"Deep learning","volume":"521","author":"Lecun","year":"2015","journal-title":"Nature"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"3600","DOI":"10.1007\/s10489-020-02075-7","article-title":"Research progress of zero-shot learning","volume":"51","author":"Sun","year":"2021","journal-title":"Appl. Intell."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Li, L.W., Liu, L., Du, X.H., Wang, X., Zhang, Z., Zhang, J., and Liu, J. (2022). CGUN-2A: Deep Graph Convolutional Network via Contrastive Learning for Large-Scale Zero-Shot Image Classification. Sensors, 22.","DOI":"10.3390\/s22249980"},{"key":"ref_4","unstructured":"Palatucci, M., Pomerleau, D., and Hinton, G.E. (2009, January 7\u201310). Zero-shot learning with semantic output codes. Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.neunet.2021.04.014","article-title":"Augmented semantic feature based generative network for generalized zero-shot learning","volume":"143","author":"Li","year":"2021","journal-title":"Neural Netw."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ohashi, H., Al-Naser, M., Ahmed, S., Nakamura, K., Sato, T., and Dengel, A. (2018). Attributes\u2019 Importance for Zero-Shot Pose-Classification Based on Wearable Sensors. Sensors, 18.","DOI":"10.3390\/s18082485"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1791","DOI":"10.1109\/TCYB.2018.2813971","article-title":"Deep attention-based spatially recursive networks for fine-grained visual recognition","volume":"49","author":"Wu","year":"2018","journal-title":"IEEE Trans. 
Cybern."},{"key":"ref_8","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). ImageNet classification with deep convolutional neural networks. Proceedings of the Advances In Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"453","DOI":"10.1109\/TPAMI.2013.140","article-title":"Attribute-based classification for zero-shot visual object categorization","volume":"36","author":"Lampert","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_11","first-page":"21969","article-title":"Attribute prototype network for zero-shot learning","volume":"33","author":"Xu","year":"2020","journal-title":"Neural Inf. Process. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Xie, G.S., Liu, L., Jin, X.B., Zhu, F., Zhang, Z., Qin, J., Yao, Y.Z., and Shao, L. (2019, January 16\u201317). Attentive region embedding network for zero-shot learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00961"},{"key":"ref_13","unstructured":"Li, K., Min, M.R., and Fu, Y. (November, January 27). Rethinking zero-shot learning: A conditional visual classification perspective. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhang, L., Xiang, T., and Gong, S. (2017, January 21\u201326). Learning a deep embedding model for zero-shot learning. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.321"},{"key":"ref_15","first-page":"16622","article-title":"Hsva: Hierarchical semantic-visual adaptation for zero-shot learning","volume":"34","author":"Chen","year":"2021","journal-title":"Neural Inf. Process. Syst."},{"key":"ref_16","unstructured":"Zhu, Y.Z., Tang, Z., Peng, X., and Elgammal, A. (2019, January 8\u201314). Semantic-guided multi-attention localization for zero-shot learning. Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada."},{"key":"ref_17","unstructured":"Jayaraman, D., and Kristen, G. (2014, January 8\u201313). Zero-shot recognition with unreliable attributes. Proceedings of the International Conference on Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_18","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., and Dean, J. (2013, January 5\u20138). Distributed representations of words and phrases and their compositionality. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Pennington, J., Socher, R., and Manning, C.D. (2014, January 25\u201329). Glove: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, Doha, Qatar.","DOI":"10.3115\/v1\/D14-1162"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Joulin, A., Grave, E., Bojanowski, P., and Mikolov, T. (2016). Bag of tricks for efficient text classification. arXiv.","DOI":"10.18653\/v1\/E17-2068"},{"key":"ref_21","unstructured":"Xu, W., Xian, Y., Wang, J., Schiele, B., and Akata, Z. (2020). Attribute prototype network for zero-shot learning. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chen, S., Hong, Z., Liu, Y., Xie, G.S., Sun, B., Li, H., Peng, Q., Lu, K., and You, X. (2021). 
Transzero: Attribute-guided transformer for zero-shot learning. arXiv.","DOI":"10.1109\/TPAMI.2022.3229526"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Yang, Z., Liu, Y., Xu, W., Huang, C., Zhou, L., and Tong, C. (2022). Learning prototype via placeholder for zero-shot recognition. arXiv.","DOI":"10.24963\/ijcai.2022\/217"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Chen, L., Zhang, H.-W., Xiao, J., Liu, W., and Chang, S. (2018, January 18\u201322). Zero-shot visual recognition using semantics preserving adversarial embedding networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00115"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1425","DOI":"10.1109\/TPAMI.2015.2487986","article-title":"Label-embedding for image classification","volume":"38","author":"Akata","year":"2016","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_26","unstructured":"Liu, Y., Zhou, L., Bai, X., Gu, L., Harada, T., and Zhou, J. (2020). Information bottleneck constrained latent bidirectional embedding for zero-shot learning. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1109\/TPAMI.2018.2857768","article-title":"Zero-Shot Learning-A Comprehensive Evaluation of the Good, the Bad and the Ugly","volume":"41","author":"Xian","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhao, B., Wu, B., Wu, T., and Wang, Y. (2017, January 22\u201329). Zero-shot learning posed as a missing data problem. Proceedings of the IEEE International Conference on Computer Vision Workshops, Venice, Italy.","DOI":"10.1109\/ICCVW.2017.310"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wang, D., Li, Y., Lin, Y., and Zhuang, Y. (2016, January 12\u201317). Relational knowledge transfer for zero-shot learning. 
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA.","DOI":"10.1609\/aaai.v30i1.10195"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Changpinyo, S., Chao, W.L., Gong, B., and Sha, F. (2016, January 27\u201330). Synthesized classifiers for zero-shot learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.575"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Shigeto, Y., Suzuki, I., Hara, K., Shimbo, M., and Matsumoto, Y. (2015, January 7\u201311). Ridge Regression, Hubness, and Zero-shot Learning. Proceedings of the Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2015, Porto, Portugal.","DOI":"10.1007\/978-3-319-23528-8_9"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"6543","DOI":"10.1109\/TCYB.2020.3004641","article-title":"Semantic-guided class-imbalance learning model for zero-shot image classification","volume":"52","author":"Ji","year":"2021","journal-title":"IEEE Trans. Cybern."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Chen, S.-M., Wang, W.J., Xia, B.H., Peng, Q.M., You, X.G., Zheng, F., and Shao, L. (2021, January 10\u201317). Free: Feature refinement for generalized zero-shot learning. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00019"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Li, J., Jing, M.M., Lu, K., Ding, Z., Zhu, L., and Huang, Z. (2019, January 16\u201317). Leveraging the invariant side of generative zero-shot learning. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00758"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Keshari, R., Singh, R., and Vatsa, M. (2020, January 13\u201319). 
Generalized zero-shot learning via over-complete distribution. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01331"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Schonfeld, E., Ebrahimi, S., Sinha, S., Darrell, T., and Akata, Z. (2019, January 16\u201317). Generalized zero- and few-shot learning via aligned variational autoencoders. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00844"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Shen, Y., Qin, J., Huang, L., Liu, L., Zhu, F., and Shao, L. (2020, January 23\u201328). Invertible zero-shot recognition flows. Proceedings of the European Conference on Computer Vision, 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58517-4_36"},{"key":"ref_38","unstructured":"Yao-Hung, H.T., Huang, L.-K., and Salakhutdinov, R. (2017, January 22\u201329). Learning robust visual-semantic embeddings. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"2908","DOI":"10.1109\/TCYB.2017.2751741","article-title":"Transductive zero-shot learning with a self-training dictionary approach","volume":"48","author":"Yu","year":"2018","journal-title":"IEEE Trans. Cybern."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Zhu, X.L., He, Z.L., Zhao, L., Dai, Z.C., and Yang, Q.L. (2022). A Cascade Attention Based Facial Expression Recognition Network by Fusing Multi-Scale Spatio-Temporal Features. Sensors, 22.","DOI":"10.3390\/s22041350"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Sun, Y., Bi, F., Gao, Y.E., Chen, L., and Feng, S.T. (2022). A Multi-Attention UNet for Semantic Segmentation in Remote Sensing Images. 
Symmetry, 14.","DOI":"10.3390\/sym14050906"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Liu, R., Tao, F., Liu, X., Na, J., Leng, H., Wu, J., and Zhou, T. (2022). RAANet: A Residual ASPP with Attention Framework for Semantic Segmentation of High-Resolution Remote Sensing Images. Remote Sens., 14.","DOI":"10.3390\/rs14133109"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"108411","DOI":"10.1016\/j.patcog.2021.108411","article-title":"Visual vs internal attention mechanisms in deep neural networks for image classification and object detection","volume":"123","author":"Obeso","year":"2022","journal-title":"Pattern Recognit."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/2311\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:36:23Z","timestamp":1760121383000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/4\/2311"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,19]]},"references-count":43,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["s23042311"],"URL":"https:\/\/doi.org\/10.3390\/s23042311","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,19]]}}}