{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,21]],"date-time":"2026-05-21T05:21:21Z","timestamp":1779340881700,"version":"3.51.4"},"reference-count":35,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2021,11,18]],"date-time":"2021-11-18T00:00:00Z","timestamp":1637193600000},"content-version":"vor","delay-in-days":321,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61632006"],"award-info":[{"award-number":["61632006"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61772049"],"award-info":[{"award-number":["61772049"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61876012"],"award-info":[{"award-number":["61876012"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61771058"],"award-info":[{"award-number":["61771058"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Computational Intelligence and Neuroscience"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p>Hand gesture recognition is a challenging topic in the field of computer vision. Multimodal hand gesture recognition based on RGB\u2010D is with higher accuracy than that of only RGB or depth. 
It is not difficult to conclude that the gain originates from the complementary information existing in the two modalities. However, in reality, multimodal data are not always easy to acquire simultaneously, while unimodal RGB or depth hand gesture data are more common. Therefore, a hand gesture system is desirable in which only unimodal RGB or depth data are required for testing, while multimodal RGB\u2010D data are available for training so as to capture the complementary information. Fortunately, methods based on multimodal training and unimodal testing have been proposed. However, unimodal feature representation and cross\u2010modality transfer still need to be further improved. To this end, this paper proposes a new 3D\u2010Ghost and Spatial Attention Inflated 3D ConvNet (3DGSAI) to extract high\u2010quality features for each modality. The baseline of the 3DGSAI network is the Inflated 3D ConvNet (I3D), and two main improvements are proposed: the 3D\u2010Ghost module and the spatial attention mechanism. The 3D\u2010Ghost module can extract richer features for hand gesture representation, and the spatial attention mechanism makes the network pay more attention to the hand region. This paper also proposes an adaptive parameter for positive knowledge transfer, which ensures that the transfer always occurs from the strong modality network to the weak one. Extensive experiments on SKIG, VIVA, and NVGesture datasets demonstrate that our method is competitive with the state of the art. 
Especially, the performance of our method reaches 97.87% on the SKIG dataset using only RGB, which is the current best result.<\/jats:p>","DOI":"10.1155\/2021\/5044916","type":"journal-article","created":{"date-parts":[[2021,11,18]],"date-time":"2021-11-18T20:50:08Z","timestamp":1637268608000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Attentive 3D\u2010Ghost Module for Dynamic Hand Gesture Recognition with Positive Knowledge Transfer"],"prefix":"10.1155","volume":"2021","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5583-8260","authenticated-orcid":false,"given":"Jinghua","family":"Li","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2402-220X","authenticated-orcid":false,"given":"Runze","family":"Liu","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7722-7172","authenticated-orcid":false,"given":"Dehui","family":"Kong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3045-624X","authenticated-orcid":false,"given":"Shaofan","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4977-0183","authenticated-orcid":false,"given":"Lichun","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8125-4648","authenticated-orcid":false,"given":"Baocai","family":"Yin","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2991-5522","authenticated-orcid":false,"given":"Ronghua","family":"Gao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"me
mber":"311","published-online":{"date-parts":[[2021,11,18]]},"reference":[{"key":"e_1_2_9_1_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-012-9356-9"},{"key":"e_1_2_9_2_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00779-015-0844-1"},{"key":"e_1_2_9_3_2","doi-asserted-by":"crossref","unstructured":"MiaoQ. LiY. OuyangW. MaZ. XuX. ShiW. andCaoX. Multimodal gesture recognition based on the resc3d network Proceedings of the IEEE International Conference on Computer Vision Workshop (ICCVW) October 2017 Seoul South Korea 3047\u20133055 https:\/\/doi.org\/10.1109\/iccvw.2017.360 2-s2.0-85046273669.","DOI":"10.1109\/ICCVW.2017.360"},{"key":"e_1_2_9_4_2","doi-asserted-by":"crossref","unstructured":"WangH. WangP. SongZ. andLiW. Large-scale multimodal gesture recognition using heterogeneous networks Proceedings of the IEEE International Conference on Computer Vision Workshop (ICCVW) October 2017 Venice Italy 3129\u20133137 https:\/\/doi.org\/10.1109\/iccvw.2017.370 2-s2.0-85046292749.","DOI":"10.1109\/ICCVW.2017.370"},{"key":"e_1_2_9_5_2","doi-asserted-by":"crossref","unstructured":"AbavisaniM. Vaezi JozeH. R. andPatelV. M. Improving the performance of unimodal dynamic hand-gesture recognition with multimodal training Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2019 Seattle WA USA 1165\u20131174 https:\/\/doi.org\/10.1109\/cvpr.2019.00126.","DOI":"10.1109\/CVPR.2019.00126"},{"key":"e_1_2_9_6_2","doi-asserted-by":"crossref","unstructured":"CarreiraJ.andZissermanA. Quo vadis action recognition? a new model and the kinetics dataset Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) July 2017 Honolulu HI USA 4724\u20134733.","DOI":"10.1109\/CVPR.2017.502"},{"key":"e_1_2_9_7_2","doi-asserted-by":"crossref","unstructured":"HanK. WangY. QiT. GuoJ. XuC. andXuC. 
GhostNet: more features from cheap operations Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2020 Seattle WA USA 1580\u20131589 https:\/\/doi.org\/10.1109\/cvpr42600.2020.00165.","DOI":"10.1109\/CVPR42600.2020.00165"},{"key":"e_1_2_9_8_2","unstructured":"XuK. JimmyB. RyanK. ChoK. CourvilleA. SalakhudinovR. ZemelR. andBengioY. Show attend and tell: neural image caption generation with visual attention Proceedings of the International Conference on Machine Learning (ICML) July 2015 Lille France 2048\u20132057."},{"key":"e_1_2_9_9_2","doi-asserted-by":"publisher","DOI":"10.1155\/2021\/6215281"},{"key":"e_1_2_9_10_2","doi-asserted-by":"publisher","DOI":"10.1155\/2020\/8871605"},{"key":"e_1_2_9_11_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-33709-4_14"},{"key":"e_1_2_9_12_2","doi-asserted-by":"crossref","unstructured":"MolchanovP. GuptaS. KimK. andPulliK. Multi-sensor system for driver\u2019s hand-gesture recognition Proceedings of the 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG) May 2015 Ljubljana Slovenia 1\u20138 https:\/\/doi.org\/10.1109\/fg.2015.7163132 2-s2.0-84944930863.","DOI":"10.1109\/FG.2015.7163132"},{"key":"e_1_2_9_13_2","doi-asserted-by":"crossref","unstructured":"MolchanovP. GuptaS. KimK. andKautzJ. Hand gesture recognition with 3d convolutional neural networks Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshop June 2015 Boston MA USA 1\u20137 https:\/\/doi.org\/10.1109\/cvprw.2015.7301342 2-s2.0-84952021150.","DOI":"10.1109\/CVPRW.2015.7301342"},{"key":"e_1_2_9_14_2","doi-asserted-by":"crossref","unstructured":"LiY. MiaoQ. TianK. FanY. XuX. LiR. andSongJ. 
Large-scale gesture recognition with a fusion of rgb-d data based on the c3d model Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR) December 2016 Cancun Mexico 25\u201330 https:\/\/doi.org\/10.1109\/icpr.2016.7899602 2-s2.0-85019054109.","DOI":"10.1109\/ICPR.2016.7899602"},{"key":"e_1_2_9_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/access.2017.2684186"},{"key":"e_1_2_9_16_2","unstructured":"ZhangL. ZhuG. MeiL.et al. Attention in convolutional LSTM for gesture recognition Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS) December 2018 Montreal Canada 1957\u20131966."},{"key":"e_1_2_9_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2018.2870740"},{"key":"e_1_2_9_18_2","unstructured":"LiangZ. ZhuG. ShenP.et al. Learning spatiotemporal features using 3DCNN and convolutional LSTM for gesture recognition Proceedings of the IEEE International Conference on Computer Vision Workshop (ICCVW) October 2017 Venice Italy 3120\u20133128."},{"key":"e_1_2_9_19_2","doi-asserted-by":"crossref","unstructured":"DuW. WangY. andYuQ. RPAN: an end-to-end recurrent pose-attention network for action recognition in videos Proceedings of the IEEE International Conference on Computer Vision (ICCV) October 2017 Venice Italy 3745\u20133754 https:\/\/doi.org\/10.1109\/iccv.2017.402 2-s2.0-85041905918.","DOI":"10.1109\/ICCV.2017.402"},{"key":"e_1_2_9_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3131343"},{"key":"e_1_2_9_21_2","doi-asserted-by":"crossref","unstructured":"YiY. NiF. MaY.et al. 
High performance gesture recognition via effective and efficient temporal modeling Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI) August 2019 Macao China 1003\u20131009 https:\/\/doi.org\/10.24963\/ijcai.2019\/141.","DOI":"10.24963\/ijcai.2019\/141"},{"key":"e_1_2_9_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/tkde.2009.191"},{"key":"e_1_2_9_23_2","doi-asserted-by":"publisher","DOI":"10.4018\/978-1-60566-766-9.ch011"},{"key":"e_1_2_9_24_2","doi-asserted-by":"crossref","unstructured":"OquabM. LeonB. LaptevI. andSivicJ. Learning and transferring mid-level image representations using convolutional neural networks Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2014 Columbus OH USA 1717\u20131724.","DOI":"10.1109\/CVPR.2014.222"},{"key":"e_1_2_9_25_2","doi-asserted-by":"crossref","unstructured":"PereraP.andPatelV. Deep transfer learning for multiple class novelty detection Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2019 Long Beach CA USA 11544\u201311552 https:\/\/doi.org\/10.1109\/cvpr.2019.01181.","DOI":"10.1109\/CVPR.2019.01181"},{"key":"e_1_2_9_26_2","doi-asserted-by":"crossref","unstructured":"WooS. ParkJ. LeeJ.-Y. andKweonI. S. CBAM: convolutional block Attention module Proceedings of the European Conference on Computer Vision (ECCV) September 2018 Munich Germany 3\u201319 https:\/\/doi.org\/10.1007\/978-3-030-01234-2_1 2-s2.0-85055111544.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"e_1_2_9_27_2","unstructured":"LiL.andLingS. Learning discriminative representations from RGB-D video data Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI) August 2013 Beijing China 1493\u20131500."},{"key":"e_1_2_9_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/tits.2014.2337331"},{"key":"e_1_2_9_29_2","doi-asserted-by":"crossref","unstructured":"MolchanovP. YangX. GuptaS. KimK. 
StephenT. andKautzJ. Online detection and classification of dynamic hand gestures with recurrent 3D convolutional neural networks Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) June 2016 Las Vegas NV USA 4207\u20134215 https:\/\/doi.org\/10.1109\/cvpr.2016.456 2-s2.0-84986321396.","DOI":"10.1109\/CVPR.2016.456"},{"key":"e_1_2_9_30_2","unstructured":"PaszkeA. GrossS. ChintalaS. ChananG. YangE. DeVitoZ. LinZ. DesmaisonA. AntigaL. andLererA. Automatic differentiation in pytorch Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS) December 2017 Long Beach CA USA 1\u20134."},{"key":"e_1_2_9_31_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2015.11.005"},{"key":"e_1_2_9_32_2","doi-asserted-by":"crossref","unstructured":"NishidaN.andNakayamaH. Multimodal gesture recognition using multi-stream recurrent neural network Proceedings of the Pacific-Rim Symposium on Image and Video Technology February 2015 Auckland New Zealand 682\u2013694.","DOI":"10.1007\/978-3-319-29451-3_54"},{"key":"e_1_2_9_33_2","doi-asserted-by":"crossref","unstructured":"TranDu BourdevL. FergusR. TorresaniL. andPaluriM. Learning spatiotemporal features with 3d convolutional networks Proceedings of the IEEE International Conference on Computer Vision (ICCV) December 2015 Santiago Chile 4489\u20134497 https:\/\/doi.org\/10.1109\/iccv.2015.510 2-s2.0-84973865953.","DOI":"10.1109\/ICCV.2015.510"},{"key":"e_1_2_9_34_2","unstructured":"SimonyanK.andZissermanA. 
Two-stream convolutional networks for action recognition in videos Proceedings of the Neural Information Processing Systems (NIPS) December 2014 Montreal Canada 568\u2013576."},{"key":"e_1_2_9_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0846-5"}],"container-title":["Computational Intelligence and Neuroscience"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2021\/5044916.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/cin\/2021\/5044916.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/pdf\/10.1155\/2021\/5044916","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,6]],"date-time":"2024-08-06T11:21:27Z","timestamp":1722943287000},"score":1,"resource":{"primary":{"URL":"https:\/\/onlinelibrary.wiley.com\/doi\/10.1155\/2021\/5044916"}},"subtitle":[],"editor":[{"given":"Simone","family":"Ranaldi","sequence":"additional","affiliation":[],"role":[{"role":"editor","vocabulary":"crossref"}]}],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":35,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1155\/2021\/5044916"],"URL":"https:\/\/doi.org\/10.1155\/2021\/5044916","archive":["Portico"],"relation":{},"ISSN":["1687-5265","1687-5273"],"issn-type":[{"value":"1687-5265","type":"print"},{"value":"1687-5273","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,1]]},"assertion":[{"value":"2021-06-04","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-10-15","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2021-11-18","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"5044916"}}