{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T17:58:02Z","timestamp":1770746282693,"version":"3.49.0"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2015,3,31]],"date-time":"2015-03-31T00:00:00Z","timestamp":1427760000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"crossref","award":["No. 61325009, No. 61390514, and No. 61272316"],"award-info":[{"award-number":["No. 61325009, No. 61390514, and No. 61272316"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Intell. Syst. Technol."],"published-print":{"date-parts":[[2015,5,4]]},"abstract":"<jats:p>Hand posture recognition (HPR) is quite a challenging task, due to both the difficulty in detecting and tracking hands with normal cameras and the limitations of traditional manually selected features. In this article, we propose a two-stage HPR system for Sign Language Recognition using a Kinect sensor. In the first stage, we propose an effective algorithm to implement hand detection and tracking. The algorithm incorporates both color and depth information, without specific requirements on uniform-colored or stable background. It can handle the situations in which hands are very close to other parts of the body or hands are not the nearest objects to the camera and allows for occlusion of hands caused by faces or other hands. In the second stage, we apply deep neural networks (DNNs) to automatically learn features from hand posture images that are insensitive to movement, scaling, and rotation. 
Experiments verify that the proposed system works quickly and accurately and achieves a recognition accuracy as high as 98.12%.<\/jats:p>","DOI":"10.1145\/2735952","type":"journal-article","created":{"date-parts":[[2015,4,3]],"date-time":"2015-04-03T20:29:44Z","timestamp":1428092984000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":108,"title":["A Real-Time Hand Posture Recognition System Using Deep Neural Networks"],"prefix":"10.1145","volume":"6","author":[{"given":"Ao","family":"Tang","sequence":"first","affiliation":[{"name":"University of Science and Technology of China, Hefei, China"}]},{"given":"Ke","family":"Lu","sequence":"additional","affiliation":[{"name":"University of the Chinese Academy of Sciences, Beijing, China"}]},{"given":"Yufei","family":"Wang","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China, Hefei, China"}]},{"given":"Jie","family":"Huang","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China, Hefei, China"}]},{"given":"Houqiang","family":"Li","sequence":"additional","affiliation":[{"name":"University of Science and Technology of China, Hefei, China"}]}],"member":"320","published-online":{"date-parts":[[2015,3,31]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 2011 7th International Conference on Electrical and Electronics Engineering (ELECO\u201911)","author":"Aksa\u00e7 Alper","year":"2011","unstructured":"Alper Aksa\u00e7 , Orkun \u00d6zt\u00fcrk , and Tansel \u00d6zyer . 2011 . Real-time multi-objective hand posture\/gesture recognition by using distance classifiers and finite state machine for virtual mouse operations . In Proceedings of the 2011 7th International Conference on Electrical and Electronics Engineering (ELECO\u201911) . IEEE, II--457. Alper Aksa\u00e7, Orkun \u00d6zt\u00fcrk, and Tansel \u00d6zyer. 2011. 
Real-time multi-objective hand posture\/gesture recognition by using distance classifiers and finite state machine for virtual mouse operations. In Proceedings of the 2011 7th International Conference on Electrical and Electronics Engineering (ELECO\u201911). IEEE, II--457."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201904)","author":"Antonis","unstructured":"Antonis A. Argyros and Manolis I. A. Lourakis. 2004. Real-time tracking of multiple skin-colored objects with a possibly moving camera . In Proceedings of the European Conference on Computer Vision (ECCV\u201904) . Springer, 368--379. Antonis A. Argyros and Manolis I. A. Lourakis. 2004. Real-time tracking of multiple skin-colored objects with a possibly moving camera. In Proceedings of the European Conference on Computer Vision (ECCV\u201904). Springer, 368--379."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/MVHI.2010.39"},{"key":"e_1_2_1_4_1","volume-title":"Mensch & Computer 2012","author":"Caputo Manuel","year":"2012","unstructured":"Manuel Caputo , Klaus Denker , Benjamin Dums , and Georg Umlauf . 2012. 3D hand gesture recognition based on sensor fusion of commodity hardware . In Mensch & Computer 2012: interaktiv informiert--allgegenw\u00e4rtig und allumfassend!? Manuel Caputo, Klaus Denker, Benjamin Dums, and Georg Umlauf. 2012. 3D hand gesture recognition based on sensor fusion of commodity hardware. 
In Mensch & Computer 2012: interaktiv informiert--allgegenw\u00e4rtig und allumfassend!?"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/76.767122"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0262-8856(03)00070-2"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2011.2134090"},{"key":"e_1_2_1_8_1","volume-title":"Neural Nets and Surroundings","author":"Fagiani Marco","unstructured":"Marco Fagiani , Emanuele Principi , Stefano Squartini , and Francesco Piazza . 2013. A new system for automatic recognition of italian sign language . In Neural Nets and Surroundings . Springer , 69--79. Marco Fagiani, Emanuele Principi, Stefano Squartini, and Francesco Piazza. 2013. A new system for automatic recognition of italian sign language. In Neural Nets and Surroundings. Springer, 69--79."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/76.795058"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2004.04.008"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1162\/089976602760128018"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.2006.18.7.1527"},{"key":"e_1_2_1_13_1","volume-title":"Salakhutdinov","author":"Hinton Geoffrey E.","year":"2006","unstructured":"Geoffrey E. Hinton and Ruslan R . Salakhutdinov . 2006 . Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006), 504--507. Geoffrey E. Hinton and Ruslan R. Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006), 504--507."},{"key":"e_1_2_1_14_1","volume-title":"Department of Computer Science","author":"Hsu Chih-Wei","year":"2003","unstructured":"Chih-Wei Hsu , Chih-Chung Chang , and Chih-Jen Lin . 2003. A practical guide to support vector classification. Technical report , Department of Computer Science , National Taiwan University . July , 2003 . 
Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin. 2003. A practical guide to support vector classification. Technical report, Department of Computer Science, National Taiwan University. July, 2003."},{"key":"e_1_2_1_15_1","unstructured":"Shuiwang Ji Wei Xu Ming Yang and Kai Yu. 2013. 3D convolutional neural networks for human action recognition. (2013).  Shuiwang Ji Wei Xu Ming Yang and Kai Yu. 2013. 3D convolutional neural networks for human action recognition. (2013)."},{"key":"e_1_2_1_16_1","volume-title":"Proceeding of the European Symposium on Artificial Neural Networks (ESANN\u201911)","author":"Krizhevsky Alex","unstructured":"Alex Krizhevsky and Geoffrey E. Hinton . 2011. Using very deep autoencoders for content-based image retrieval . In Proceeding of the European Symposium on Artificial Neural Networks (ESANN\u201911) . Alex Krizhevsky and Geoffrey E. Hinton. 2011. Using very deep autoencoders for content-based image retrieval. In Proceeding of the European Symposium on Artificial Neural Networks (ESANN\u201911)."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 20th European Signal Processing Conference (EUSIPCO\u201912)","author":"Kurakin A.","year":"2012","unstructured":"A. Kurakin , Z. Zhang , and Z. Liu . 2012. A real time system for dynamic hand gesture recognition with a depth sensor . In Proceedings of the 20th European Signal Processing Conference (EUSIPCO\u201912) . IEEE, 1975 --1979. A. Kurakin, Z. Zhang, and Z. Liu. 2012. A real time system for dynamic hand gesture recognition with a depth sensor. In Proceedings of the 20th European Signal Processing Conference (EUSIPCO\u201912). IEEE, 1975--1979."},{"key":"e_1_2_1_18_1","volume-title":"Generalization and network design strategies. Connectionism in Perspective","author":"LeCun Yann","year":"1989","unstructured":"Yann LeCun . 1989. Generalization and network design strategies. Connectionism in Perspective ( 1989 ), 143--155. Yann LeCun. 1989. 
Generalization and network design strategies. Connectionism in Perspective (1989), 143--155."},{"key":"e_1_2_1_19_1","volume-title":"Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks 3361, 310","author":"LeCun Yann","year":"1995","unstructured":"Yann LeCun and Yoshua Bengio . 1995. Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks 3361, 310 ( 1995 ). Yann LeCun and Yoshua Bengio. 1995. Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks 3361, 310 (1995)."},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.726791"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2013.6475017"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.18297\/etd\/823"},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the Australasian Conference on Robotics and Automation. 21--27","author":"Li Zhi","year":"2009","unstructured":"Zhi Li and Ray Jarvis . 2009 . Real time hand gesture recognition using a range camera . In Proceedings of the Australasian Conference on Robotics and Automation. 21--27 . Zhi Li and Ray Jarvis. 2009. Real time hand gesture recognition using a range camera. In Proceedings of the Australasian Conference on Robotics and Automation. 21--27."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI).","author":"Liu Li","year":"2013","unstructured":"Li Liu and Ling Shao . 2013 . Learning discriminative representations from RGB-D video data . In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). Li Liu and Ling Shao. 2013. Learning discriminative representations from RGB-D video data. 
In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI)."},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2007.11.007"},{"key":"e_1_2_1_26_1","first-page":"1339","article-title":"3-d object recognition with deep belief nets","volume":"22","author":"Nair Vinod","year":"2009","unstructured":"Vinod Nair and Geoffrey Hinton . 2009 . 3-d object recognition with deep belief nets . Advances in Neural Information Processing Systems 22 (2009), 1339 -- 1347 . Vinod Nair and Geoffrey Hinton. 2009. 3-d object recognition with deep belief nets. Advances in Neural Information Processing Systems 22 (2009), 1339--1347.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.701181"},{"key":"e_1_2_1_28_1","volume-title":"Proceedings of the World Congress on Nature & Biologically Inspired Computing","author":"Radha V.","year":"2009","unstructured":"V. Radha and M. Krishnaveni . 2009. Threshold based segmentation using median filter for sign language recognition system . In Proceedings of the World Congress on Nature & Biologically Inspired Computing , 2009 (NaBIC\u201909). IEEE, 1394--1399. V. Radha and M. Krishnaveni. 2009. Threshold based segmentation using median filter for sign language recognition system. In Proceedings of the World Congress on Nature & Biologically Inspired Computing, 2009 (NaBIC\u201909). IEEE, 1394--1399."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2072298.2071946"},{"key":"e_1_2_1_30_1","volume-title":"Hinton","author":"Salakhutdinov Ruslan","year":"2008","unstructured":"Ruslan Salakhutdinov and Geoffrey E . Hinton . 2008 . Using deep belief nets to learn covariance kernels for Gaussian processes. Advances in Neural Information Processing Systems ( 2008), 1249--1256. Ruslan Salakhutdinov and Geoffrey E. Hinton. 2008. 
Using deep belief nets to learn covariance kernels for Gaussian processes. Advances in Neural Information Processing Systems (2008), 1249--1256."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of Interspeech. 437--440","author":"Seide Frank","year":"2011","unstructured":"Frank Seide , Gang Li , and Dong Yu . 2011 . Conversational speech transcription using context-dependent deep neural networks . In Proceedings of Interspeech. 437--440 . Frank Seide, Gang Li, and Dong Yu. 2011. Conversational speech transcription using context-dependent deep neural networks. In Proceedings of Interspeech. 437--440."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2010.760"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/0734-189X(85)90016-7"},{"key":"e_1_2_1_34_1","volume-title":"Proceedings of the 2010 IEEE Instrumentation and Measurement Technology Conference (I2MTC\u201910)","author":"Tusor Balazs","unstructured":"Balazs Tusor and A. R. Varkonyi-Koczy . 2010. Circular fuzzy neural network based hand gesture and posture modeling . In Proceedings of the 2010 IEEE Instrumentation and Measurement Technology Conference (I2MTC\u201910) . IEEE, 815--820. Balazs Tusor and A. R. Varkonyi-Koczy. 2010. Circular fuzzy neural network based hand gesture and posture modeling. In Proceedings of the 2010 IEEE Instrumentation and Measurement Technology Conference (I2MTC\u201910). 
IEEE, 815--820."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2011.5711485"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.5555\/2964398.2964463"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2012.2199502"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2008.08.003"},{"key":"e_1_2_1_39_1","volume-title":"Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW\u201912)","author":"Xia Lu","unstructured":"Lu Xia , Chia-Chih Chen , and J. K. Aggarwal . 2012. View invariant human action recognition using histograms of 3D joints . In Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW\u201912) . IEEE, 20--27. Lu Xia, Chia-Chih Chen, and J. K. Aggarwal. 2012. View invariant human action recognition using histograms of 3D joints. In Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW\u201912). IEEE, 20--27."},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","unstructured":"X. Zabulis H. Baltzakis and A. Argyros. 2009. Vision-based hand gesture recognition for human-computer interaction. The Universal Access Handbook. LEA (2009).  X. Zabulis H. Baltzakis and A. Argyros. 2009. Vision-based hand gesture recognition for human-computer interaction. The Universal Access Handbook. 
LEA (2009).","DOI":"10.1201\/9781420064995-c34"}],"container-title":["ACM Transactions on Intelligent Systems and Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2735952","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2735952","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T06:16:36Z","timestamp":1750227396000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2735952"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2015,3,31]]},"references-count":40,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2015,5,4]]}},"alternative-id":["10.1145\/2735952"],"URL":"https:\/\/doi.org\/10.1145\/2735952","relation":{},"ISSN":["2157-6904","2157-6912"],"issn-type":[{"value":"2157-6904","type":"print"},{"value":"2157-6912","type":"electronic"}],"subject":[],"published":{"date-parts":[[2015,3,31]]},"assertion":[{"value":"2013-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2014-01-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2015-03-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}