{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,11]],"date-time":"2026-03-11T17:13:36Z","timestamp":1773249216281,"version":"3.50.1"},"reference-count":37,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2024,12,7]],"date-time":"2024-12-07T00:00:00Z","timestamp":1733529600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62001155"],"award-info":[{"award-number":["62001155"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62301396"],"award-info":[{"award-number":["62301396"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["F2023202013"],"award-info":[{"award-number":["F2023202013"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["XH-KY-202306-0281"],"award-info":[{"award-number":["XH-KY-202306-0281"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003787","name":"Natural Science Foundation of Hebei Province","doi-asserted-by":"publisher","award":["62001155"],"award-info":[{"award-number":["62001155"]}],"id":[{"id":"10.13039\/501100003787","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003787","name":"Natural Science Foundation of Hebei 
Province","doi-asserted-by":"publisher","award":["62301396"],"award-info":[{"award-number":["62301396"]}],"id":[{"id":"10.13039\/501100003787","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003787","name":"Natural Science Foundation of Hebei Province","doi-asserted-by":"publisher","award":["F2023202013"],"award-info":[{"award-number":["F2023202013"]}],"id":[{"id":"10.13039\/501100003787","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003787","name":"Natural Science Foundation of Hebei Province","doi-asserted-by":"publisher","award":["XH-KY-202306-0281"],"award-info":[{"award-number":["XH-KY-202306-0281"]}],"id":[{"id":"10.13039\/501100003787","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Open Foundation for the Hangzhou Institute of Technology Academician Workstation at Xidian University","award":["62001155"],"award-info":[{"award-number":["62001155"]}]},{"name":"Open Foundation for the Hangzhou Institute of Technology Academician Workstation at Xidian University","award":["62301396"],"award-info":[{"award-number":["62301396"]}]},{"name":"Open Foundation for the Hangzhou Institute of Technology Academician Workstation at Xidian University","award":["F2023202013"],"award-info":[{"award-number":["F2023202013"]}]},{"name":"Open Foundation for the Hangzhou Institute of Technology Academician Workstation at Xidian University","award":["XH-KY-202306-0281"],"award-info":[{"award-number":["XH-KY-202306-0281"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>A millimeter-wave radar is widely accepted by the public due to its low susceptibility to interference, such as changes in light, and the protection of personal privacy. With the development of the deep learning theory, the deep learning method has been dominant in the millimeter-wave radar field, which usually uses convolutional neural networks for feature extraction. 
In recent years, transformer networks have also attracted considerable attention from researchers because of their parallel processing and long-range dependency modeling capabilities. However, traditional convolutional neural networks (CNNs) and vision transformers each have limitations: CNNs usually overlook the global features of images, while vision transformers may neglect local image continuity, and either shortcoming can impede gesture recognition performance. In addition, for both CNNs and transformers, implementation is hindered by the scarcity of public radar gesture datasets. To address these limitations, this paper proposes a new recognition method using a local pyramid vision transformer (LPVT) based on millimeter-wave radar. LPVT can capture global and local features in dynamic gesture spectrograms, ultimately improving gesture recognition ability. In this paper, we mainly carried out the following two tasks: building the corresponding dataset and performing gesture recognition. First, we constructed a gesture dataset for training. In this stage, we used a 77 GHz radar to collect the echo signals of gestures and preprocessed them to build a dataset. Second, we propose the LPVT network, designed specifically for gesture recognition tasks. By integrating local sensing into the globally focused transformer, we improve its capacity to capture both global and local features in dynamic gesture spectrograms. 
The experimental results using the dataset we constructed show that the proposed LPVT network achieved a gesture recognition accuracy of 92.2%, which exceeds the performance of other networks.<\/jats:p>","DOI":"10.3390\/rs16234602","type":"journal-article","created":{"date-parts":[[2024,12,9]],"date-time":"2024-12-09T10:11:47Z","timestamp":1733739107000},"page":"4602","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Local Pyramid Vision Transformer: Millimeter-Wave Radar Gesture Recognition Based on Transformer with Integrated Local and Global Awareness"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4890-2540","authenticated-orcid":false,"given":"Zhaocheng","family":"Wang","sequence":"first","affiliation":[{"name":"School of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guangxuan","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuo","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2866-9609","authenticated-orcid":false,"given":"Ruonan","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Electronics and Information Engineering, Hebei University of Technology, Tianjin 300401, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hailong","family":"Kang","sequence":"additional","affiliation":[{"name":"Hangzhou Institute of Technology, Xidian University, Hangzhou 311200, China"},{"name":"National Key Laboratory 
of Radar Signal Processing, Xidian University, Xi\u2019an 710071, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Feng","family":"Luo","sequence":"additional","affiliation":[{"name":"Hangzhou Institute of Technology, Xidian University, Hangzhou 311200, China"},{"name":"National Key Laboratory of Radar Signal Processing, Xidian University, Xi\u2019an 710071, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,12,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Wang, Y., Wang, D., Fu, Y., Yao, D., Xie, L., and Zhou, M. (2022). Multi-hand gesture recognition using automotive FMCW radar sensor. Remote Sens., 14.","DOI":"10.3390\/rs14102374"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"3278","DOI":"10.1109\/JSEN.2018.2808688","article-title":"Latern: Dynamic continuous hand gesture recognition using FMCW radar sensor","volume":"18","author":"Zhang","year":"2018","journal-title":"IEEE Sens. J."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"14610","DOI":"10.1109\/JSEN.2022.3181518","article-title":"Video hand gestures recognition using depth camera and lightweight CNN","volume":"22","author":"Leon","year":"2022","journal-title":"IEEE Sens. J."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Jawad, S.K., and Alaziz, M. (2022, January 7\u20138). Human Activity and Gesture Recognition based on WiFi. 
Proceedings of the 2022 Iraqi International Conference on Communication and Information Technologies (IICCIT), IEEE, Basrah, Iraq.","DOI":"10.1109\/IICCIT55816.2022.10010433"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"10336","DOI":"10.1109\/JIOT.2021.3067382","article-title":"Tinyradarnn: Combining spatial and temporal convolutional neural networks for embedded gesture recognition with short range radars","volume":"8","author":"Scherer","year":"2021","journal-title":"IEEE Internet Things J."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"119042","DOI":"10.1016\/j.eswa.2022.119042","article-title":"mmGesture: Semi-supervised gesture recognition system using mmWave radar","volume":"213","author":"Yan","year":"2023","journal-title":"Expert Syst. Appl."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"7125","DOI":"10.1109\/ACCESS.2016.2617282","article-title":"Hand gesture recognition using micro-Doppler signatures with convolutional neural network","volume":"4","author":"Kim","year":"2016","journal-title":"IEEE Access."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"144610","DOI":"10.1109\/ACCESS.2020.3010063","article-title":"Enhanced multi-channel feature synthesis for hand gesture recognition based on CNN with a channel and spatial attention mechanism","volume":"8","author":"Du","year":"2020","journal-title":"IEEE Access."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Jiang, W., Ren, Y., Liu, Y., Wang, Z., and Wang, X. (2021, January 6\u201311). Recognition of dynamic hand gesture based on mm-wave FMCW radar micro-Doppler signatures. 
Proceedings of the ICASSP 2021\u20132021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, Toronto, ON, Canada.","DOI":"10.1109\/ICASSP39728.2021.9414837"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"4749","DOI":"10.1109\/TGRS.2020.3010880","article-title":"Multidimensional feature representation and learning for robust hand-gesture recognition on commercial millimeter-wave radar","volume":"59","author":"Xia","year":"2020","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"10893","DOI":"10.1109\/ACCESS.2021.3051454","article-title":"Improved static hand gesture classification on deep convolutional neural networks using novel sterile training technique","volume":"9","author":"Smith","year":"2021","journal-title":"IEEE Access."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Peng, L., Ma, G., Man, M., and Liu, S. (2022). Dynamic gesture recognition model based on millimeter-wave radar with ResNet-18 and LSTM. Front. Neurorobot., 16.","DOI":"10.3389\/fnbot.2022.903197"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2897824.2925953","article-title":"Soli: Ubiquitous gesture sensing with millimeter wave radar","volume":"35","author":"Lien","year":"2016","journal-title":"ACM Trans. Graph. (TOG)"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"26701","DOI":"10.1109\/JSEN.2023.3319339","article-title":"DGSCR: Double-Target Gesture Separation and Classification Recognition Based on Deep Learning and Millimeter-Wave Radar","volume":"23","author":"Zhao","year":"2023","journal-title":"IEEE Sens. J."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Sun, B., Xu, Z., Wu, Z., and Zhang, S. (2022, January 17\u201318). SwinFMCW: A Joint Swin Transformer and LSTM Method for Gesture and Identity Recognition Using FMCW Radar. 
Proceedings of the 2022 Cross Strait Radio Science & Wireless Technology Conference (CSRSWTC), IEEE, Haidian, China.","DOI":"10.1109\/CSRSWTC56224.2022.10098436"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Narayanan, A.L., KT, A.B., Wu, H., and Ma, J. (2022, January 28\u201330). mm-Wave Radar Hand Shape Classification Using Deformable Transformers. Proceedings of the 2022 19th European Radar Conference (EuRAD), IEEE, Milan, Italy.","DOI":"10.23919\/EuRAD54643.2022.9924850"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"17680","DOI":"10.1109\/JIOT.2023.3280227","article-title":"Dcs-ctn: Subtle gesture recognition based on td-cnn-transformer via millimeter-wave radar","volume":"10","author":"Wang","year":"2023","journal-title":"IEEE Internet Things J."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2741","DOI":"10.1109\/JIOT.2023.3293092","article-title":"Interference-robust millimeter-wave radar-based dynamic hand gesture recognition using 2D CNN-transformer networks","volume":"11","author":"Jin","year":"2023","journal-title":"IEEE Internet Things J."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"87425","DOI":"10.1109\/ACCESS.2022.3200757","article-title":"Fmcw radar-based real-time hand gesture recognition system capable of out-of-distribution detection","volume":"10","author":"Choi","year":"2022","journal-title":"IEEE Access"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/LSENS.2022.3206439","article-title":"Vision transformer with convolutional encoder\u2013decoder for hand gesture recognition using 24-GHz Doppler radar","volume":"6","author":"Kehelella","year":"2022","journal-title":"IEEE Sens. Lett."},{"key":"ref_21","unstructured":"Alexey, D. (2020). An image is worth 16 \u00d7 16 words: Transformers for image recognition at scale. 
arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Fan, D.P., Song, K., Liang, D., Lu, T., Luo, P., and Shao, L. (2021, January 10\u201317). Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1007\/s41095-022-0274-8","article-title":"Pvt v2: Improved baselines with pyramid vision transformer","volume":"8","author":"Wang","year":"2022","journal-title":"Comput. Vis. Media."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Li, Y., Zhang, K., Cao, J., Timofte, R., Magno, M., Benini, L., and Goo, L. (2023, January 1\u20135). LocalViT: Analyzing Locality in Vision Transformers. Proceedings of the 2023 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), IEEE, Detroit, MI, USA.","DOI":"10.1109\/IROS55552.2023.10342025"},{"key":"ref_25","unstructured":"Chen, C.F., Panda, R., and Fan, Q. (2021). Regionvit: Regional-to-local attention for vision transformers. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Li, B., Hu, Y., Nie, X., Han, C., Jiang, X., Guo, T., and Liu, L. (2023, January 17\u201324). Dropkey for vision transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.02174"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Zhao, S., Wang, Z., Kang, H., Wang, R., Hu, G., and Zhang, G. (2023, January 3\u20135). Gesture recognition for millimeter wave radar based on LocalPVT. 
Proceedings of the IET International Radar Conference (IRC 2023), IET, Chongqing, China.","DOI":"10.1049\/icp.2024.1209"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"16329","DOI":"10.1109\/JSTARS.2024.3455891","article-title":"Spatial Reduction Attention in Multiscale Vision Transform for Surface Water-Land Interface Zone Segmentation","volume":"17","author":"Chen","year":"2024","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"103504","DOI":"10.1109\/ACCESS.2024.3431939","article-title":"From Text to Insight: An Integrated CNN-BiLSTM-GRU Model for Arabic Cyberbullying Detection","volume":"2","author":"Daraghmi","year":"2024","journal-title":"IEEE Access"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"110936","DOI":"10.1016\/j.ymssp.2023.110936","article-title":"Bayesian variational transformer: A generalizable model for rotating machinery fault diagnosis","volume":"207","author":"Xiao","year":"2024","journal-title":"Mech. Syst. Signal Process."},{"key":"ref_31","first-page":"1","article-title":"Complex Surface Electromyography Signal Gesture Recognition Based on Multi-Pathway Featured Scale Convolutional Neural Network","volume":"73","author":"Liu","year":"2024","journal-title":"IEEE Trans. Instrum. Meas."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1142\/S0218488598000094","article-title":"The vanishing gradient problem during learning recurrent neural nets and problem solutions","volume":"6","author":"Hochreiter","year":"1998","journal-title":"Int. J. Uncertain. Fuzziness Knowl. Based Syst."},{"key":"ref_33","unstructured":"Philipp, G., Song, D., and Carbonell, J.G. (2017). The exploding gradient problem demystified-definition, prevalence, impact, origin, tradeoffs, and solutions. 
arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long Short-term Memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Mascarenhas, S., and Agarwal, M. (2021, January 19\u201321). A comparison between VGG16, VGG19 and ResNet50 architecture frameworks for Image Classification. Proceedings of the 2021 International Conference on Disruptive Technologies for Multi-Disciplinary Research and Applications (CENTCON), IEEE, Bengaluru, India.","DOI":"10.1109\/CENTCON52345.2021.9687944"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/23\/4602\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:49:37Z","timestamp":1760114977000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/23\/4602"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,7]]},"references-count":37,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["rs16234602"],"URL":"https:\/\/doi.org\/10.3390\/rs16234602","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,7]]}}}