{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T02:54:34Z","timestamp":1760237674698,"version":"build-2065373602"},"reference-count":29,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2020,6,5]],"date-time":"2020-06-05T00:00:00Z","timestamp":1591315200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100007424","name":"Universitatea Politehnica Timi\u015foara","doi-asserted-by":"publisher","award":["GNAC2018 - ARUT 1356\/01.02.2019"],"award-info":[{"award-number":["GNAC2018 - ARUT 1356\/01.02.2019"]}],"id":[{"id":"10.13039\/501100007424","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Gesture recognition is an intensively researched area for several reasons. One of the most important is this technology\u2019s numerous applications in various domains (e.g., robotics, games, medicine, automotive, etc.). Additionally, the introduction of three-dimensional (3D) image acquisition techniques (e.g., stereovision, projected-light, time-of-flight, etc.) overcomes the limitations of traditional two-dimensional (2D) approaches. Combined with the wider availability of 3D sensors (e.g., Microsoft Kinect, Intel RealSense, photonic mixer device (PMD), CamCube, etc.), this has sparked recent interest in the domain. Moreover, in many computer vision tasks, the top traditional statistical approaches were outperformed by deep neural network-based solutions. In view of these considerations, we proposed a deep neural network solution that employs the PointNet architecture for the problem of hand gesture recognition using depth data produced by a time-of-flight (ToF) sensor. 
We created a custom hand gesture dataset, then proposed a multistage hand segmentation consisting of filtering, clustering, finding the hand in the volume of interest, and hand-forearm segmentation. For comparison purposes, two equivalent datasets were tested: a 3D point cloud dataset and a 2D image dataset, both obtained from the same stream. Besides the advantages of the 3D technology, the accuracy of the 3D method using PointNet is shown to exceed that of the 2D method in all circumstances, even when the 2D method employs a deep neural network.<\/jats:p>","DOI":"10.3390\/s20113226","type":"journal-article","created":{"date-parts":[[2020,6,9]],"date-time":"2020-06-09T05:16:14Z","timestamp":1591679774000},"page":"3226","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["A PointNet-Based Solution for 3D Hand Gesture Recognition"],"prefix":"10.3390","volume":"20","author":[{"given":"Radu","family":"Mirsu","sequence":"first","affiliation":[{"name":"Applied Electronics Department, Faculty of Electronics, Telecommunications and Information Technologies, Politehnica University Timi\u0219oara, 300223 Timi\u0219oara, Romania"}]},{"given":"Georgiana","family":"Simion","sequence":"additional","affiliation":[{"name":"Applied Electronics Department, Faculty of Electronics, Telecommunications and Information Technologies, Politehnica University Timi\u0219oara, 300223 Timi\u0219oara, Romania"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0252-7047","authenticated-orcid":false,"given":"Catalin Daniel","family":"Caleanu","sequence":"additional","affiliation":[{"name":"Applied Electronics Department, Faculty of Electronics, Telecommunications and Information Technologies, Politehnica University Timi\u0219oara, 300223 Timi\u0219oara, Romania"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8476-3299","authenticated-orcid":false,"given":"Ioana 
Monica","family":"Pop-Calimanu","sequence":"additional","affiliation":[{"name":"Applied Electronics Department, Faculty of Electronics, Telecommunications and Information Technologies, Politehnica University Timi\u0219oara, 300223 Timi\u0219oara, Romania"}]}],"member":"1968","published-online":{"date-parts":[[2020,6,5]]},"reference":[{"key":"ref_1","unstructured":"(2020, March 03). Terabee 3Dcam. Available online: https:\/\/www.terabee.com\/shop\/3d-tof-cameras\/terabee-3dcam\/."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"2202","DOI":"10.1016\/j.patcog.2013.01.033","article-title":"A robust static hand gesture recognition system using geometry based normalizations and Krawtchouk moments","volume":"46","author":"Priyal","year":"2013","journal-title":"Pattern Recognit."},{"key":"ref_3","first-page":"2513","article-title":"One-shot-learning gesture recognition using HOG-HOF features","volume":"15","author":"Konecny","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Wu, D., Zhu, F., and Shao, L. (2012, January 16\u201321). One shot learning gesture recognition from RGB-D images. Proceedings of the IEEE Conference Computer Vision Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPRW.2012.6239179"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1626","DOI":"10.1109\/TPAMI.2015.2513479","article-title":"Explore efficient local features from RGB-D data for one-shot learning gesture recognition","volume":"38","author":"Wan","year":"2016","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Tamrakar, A., Ali, S., Yu, Q., Liu, J., Javed, O., Divakaran, A., Cheng, H., and Sawhney, H. (2012, January 16\u201321). Evaluation of low-level features and their combinations for complex event detection in opensource videos. 
Proceedings of the IEEE Conference Computer Vision Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248114"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"219","DOI":"10.1007\/s11263-015-0846-5","article-title":"A robust and efficient video representation for action recognition","volume":"119","author":"Wang","year":"2015","journal-title":"IJCV"},{"key":"ref_8","first-page":"1","article-title":"Hand gesture recognition in real time for automotive interfaces: A multimodal vision-based approach and evaluations","volume":"15","author":"Trivedi","year":"2014","journal-title":"IEEE ITS"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Wan, J., Li, S.Z., Zhao, Y., Zhou, S., Guyon, I., and Escalera, S. (2016). ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition, CVPR.","DOI":"10.1109\/CVPRW.2016.100"},{"key":"ref_10","unstructured":"Maturana, D., and Scherer, S. (October, January 28). Voxnet: A 3d convolutional neural network for real-time object recognition. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany."},{"key":"ref_11","unstructured":"Wu, Z. (2015). 3D ShapeNets: A Deep Representation for Volumetric Shapes, CVPR."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Qi, C.R., Su, H., Nie\u00dfner, M., Dai, A., Yan, M., and Guibas, L. (2016). Volumetric and Multi-View Cnns for Object Classification on 3d Data, CVPR.","DOI":"10.1109\/CVPR.2016.609"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Molchanov, P., Gupta, S., Kim, K., and Kautz, J. (2015, January 7\u201312). Hand gesture recognition with 3D convolutional neural networks. Proceedings of the 28th IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPRW.2015.7301342"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Molchanov, P., Yang, X., Gupta, S., Kim, K., Tyree, S., and Kautz, J. (2016). 
Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks, IEEE CVPR.","DOI":"10.1109\/CVPR.2016.456"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"4517","DOI":"10.1109\/ACCESS.2017.2684186","article-title":"Multimodal Gesture Recognition Using 3-D Convolution and Convolutional LSTM","volume":"5","author":"Zhu","year":"2017","journal-title":"IEEE Access"},{"key":"ref_16","unstructured":"Zhang, L., Zhu, G.M., and Mei, L. (2018, January 4). Attention in convolutional LSTM for gesture recognition. Proceedings of the Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_17","unstructured":"Kingkan, C., Owoyemi, J., and Hashimoto, K. (2018). Point Attention Network for Gesture Recognition Using Point Cloud Data, BMVC."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ge, L., Liang, H., Yuan, J., and Thalmann, D. (2016). Robust 3Dhand Pose Estimation in Single Depth Images: From Single View CNN to Multi-View CNNs, CVPR.","DOI":"10.1109\/CVPR.2016.391"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Ge, L., Liang, H., Yuan, J., and Thalmann, D. (2017). 3d Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation from Single Depth Images, CVPR.","DOI":"10.1109\/CVPR.2017.602"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ge, L., Cai, Y., Weng, J., and Yuan, J. (2018). Hand PointNet: 3d Hand Pose Estimation Using Point Sets, CVPR.","DOI":"10.1109\/CVPR.2018.00878"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Asadi-Aghbolaghi, M., Clapes, A., Bellantonio, M., Esacalnte, H., Ponce-Lopez, V., Baro, X., Guyon, I., Kasaei, S., and Escalera, S. (2017). Deep learning for action and gesture recognition in image sequences: A survey. Gesture Recognition, Springer.","DOI":"10.1007\/978-3-319-57021-1_19"},{"key":"ref_22","unstructured":"Qi, C.R., Su, H., Mo, K., and Guibas, L.J. (2016). 
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, CVPR."},{"key":"ref_23","unstructured":"Qi, C.R., Yi, L., Su, H., and Guibas, L.J. (2017, January 4). PointNet++: Deep hierarchical feature learning on point sets in a metric space. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Owoyemi, J., Chiba, N., and Hashimoto, K. (2019, January 6\u20138). Discriminative Recognition of Point Cloud Gesture Classes through One-Shot Learning. Proceedings of the 2019 IEEE International Conference on Robotics and Biomimetics, Dali, China.","DOI":"10.1109\/ROBIO49542.2019.8961778"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Chugunov, I., and Zakhor, A. (2019, January 22\u201325). Duodepth: Static Gesture Recognition Via Dual Depth Sensors. Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803665"},{"key":"ref_26","unstructured":"(2020, March 03). UPT Time of Flight 3D Hand Gesture Database. Available online: https:\/\/www.kaggle.com\/cdcaleanu\/upt-tof-3d-hand-gesture-database."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Simion, G., and C\u0103leanu, C.D. (2012, January 15\u201316). A ToF 3D database for hand gesture recognition. Proceedings of the 2012 10th International Symposium on Electronics and Telecommunications, Timisoara, Romania.","DOI":"10.1109\/ISETC.2012.6408145"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Simion, G., and C\u0103leanu, C. (2014, January 15\u201316). Multi-stage 3D segmentation for ToF based gesture recognition system. Proceedings of the 2014 11th International Symposium on Electronics and Telecommunications (ISETC), Timisoara, Romania.","DOI":"10.1109\/ISETC.2014.7010800"},{"key":"ref_29","unstructured":"Liu, Z., Tang, H., Lin, Y., and Han, S. (2019, January 8). 
Point-Voxel CNN for Efficient 3D Deep Learning. Proceedings of the Advances in Neural Information Processing Systems 2019, Vancouver, BC, Canada."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/11\/3226\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:36:11Z","timestamp":1760175371000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/11\/3226"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,5]]},"references-count":29,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2020,6]]}},"alternative-id":["s20113226"],"URL":"https:\/\/doi.org\/10.3390\/s20113226","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2020,6,5]]}}}