{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T06:32:22Z","timestamp":1770532342811,"version":"3.49.0"},"reference-count":60,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,6,23]],"date-time":"2021-06-23T00:00:00Z","timestamp":1624406400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["2021R1A2C4002380"],"award-info":[{"award-number":["2021R1A2C4002380"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002573","name":"Yonsei University","doi-asserted-by":"publisher","award":["2020-22-0513"],"award-info":[{"award-number":["2020-22-0513"]}],"id":[{"id":"10.13039\/501100002573","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."],"published-print":{"date-parts":[[2021,6,23]]},"abstract":"<jats:p>In this work we present SUGO, a depth video-based system for translating sign language to text using a smartphone's front camera. While exploiting depth-only videos offer benefits such as being less privacy-invasive compared to using RGB videos, it introduces new challenges which include dealing with low video resolutions and the sensors' sensitiveness towards user motion. We overcome these challenges by diversifying our sign language video dataset to be robust to various usage scenarios via data augmentation and design a set of schemes to emphasize human gestures from the input images for effective sign detection. The inference engine of SUGO is based on a 3-dimensional convolutional neural network (3DCNN) to classify a sequence of video frames as a pre-trained word. Furthermore, the overall operations are designed to be light-weight so that sign language translation takes place in real-time using only the resources available on a smartphone, with no help from cloud servers nor external sensing components. Specifically, to train and test SUGO, we collect sign language data from 20 individuals for 50 Korean Sign Language words, summing up to a dataset of ~5,000 sign gestures and collect additional in-the-wild data to evaluate the performance of SUGO in real-world usage scenarios with different lighting conditions and daily activities. 
Overall, our extensive evaluations show that SUGO can properly classify sign words with an accuracy of up to 91% and suggest that the system is suitable (in terms of resource usage, latency, and environmental robustness) for enabling a fully mobile solution for sign language translation.<\/jats:p>","DOI":"10.1145\/3463498","type":"journal-article","created":{"date-parts":[[2021,6,24]],"date-time":"2021-06-24T16:29:19Z","timestamp":1624552159000},"page":"1-30","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":41,"title":["Enabling Real-time Sign Language Translation on Mobile Platforms with On-board Depth Cameras"],"prefix":"10.1145","volume":"5","author":[{"given":"HyeonJung","family":"Park","sequence":"first","affiliation":[{"name":"School of Integrated Technology, Yonsei University, Yeonsu-Gu, Incheon, South Korea"}]},{"given":"Youngki","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, Seoul National University, Gwanak-Gu, Seoul, South Korea"}]},{"given":"JeongGil","family":"Ko","sequence":"additional","affiliation":[{"name":"School of Integrated Technology, Yonsei University, Yeonsu-Gu, Incheon, South Korea"}]}],"member":"320","published-online":{"date-parts":[[2021,6,24]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2020. Deafness and hearing loss. https:\/\/www.who.int\/news-room\/fact-sheets\/detail\/deafness-and-hearing-loss"},{"key":"e_1_2_1_2_1","unstructured":"2020. Harness AI at the Edge with the Jetson TX2. https:\/\/developer.nvidia.com\/embedded\/jetson-tx2-developer-kit"},{"key":"e_1_2_1_3_1","unstructured":"2020. Jetson Nano. https:\/\/developer.nvidia.com\/embedded\/jetson-nano"},{"key":"e_1_2_1_4_1","first-page":"68","article-title":"Finding Small-Bowel Lesions: Challenges in Endoscopy-Image-Based Learning Systems","volume":"51","author":"Ahn J.","year":"2018","unstructured":"J. Ahn, H. Nguyen Loc, R. Krishna Balan, Y. Lee, and J. Ko. 2018. Finding Small-Bowel Lesions: Challenges in Endoscopy-Image-Based Learning Systems. Computer 51, 5 (2018), 68--76.","journal-title":"Computer"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2990699"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2938829"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1016\/0010-0277(72)90018-2"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3308561.3353774"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.502"},{"key":"e_1_2_1_10_1","volume-title":"IEEE Conf. on AFGR","volume":"655","author":"Chai Xiujuan","year":"2013","unstructured":"Xiujuan Chai, Guang Li, Yushun Lin, Zhihao Xu, Yili Tang, Xilin Chen, and Ming Zhou. 2013.
Sign language recognition and translation with kinect. In IEEE Conf. on AFGR, Vol. 655. 4."},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 44--52","author":"Dong Cao","year":"2015","unstructured":"Cao Dong, Ming C Leu, and Zhaozheng Yin. 2015. American sign language alphabet recognition using microsoft kinect. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 44--52."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3379337.3415881"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/SIU.2017.7960255"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3131672.3131693"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1017\/S0140525X15001247"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3152121"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2017.373"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_19_1","volume-title":"The 25th Annual International Conference on Mobile Computing and Networking (MobiCom '19)","author":"Hou Jiahui","year":"2019","unstructured":"Jiahui Hou, Xiang-Yang Li, Peide Zhu, Zefan Wang, Yu Wang, Jianwei Qian, and Panlong Yang. 2019. SignSpeaker: A Real-Time, High-Precision SmartWatch-Based Sign Language Translator. In The 25th Annual International Conference on Mobile Computing and Networking (MobiCom '19). Association for Computing Machinery, New York, NY, USA, Article 24, 15 pages. https:\/\/doi.org\/10.1145\/3300061.3300117"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2015.7177428"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11903"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3081333.3081360"},{"key":"e_1_2_1_23_1","volume-title":"Ms-asl: A large-scale data set and benchmark for understanding american sign language. arXiv preprint arXiv:1812.01053","author":"Vaezi Joze Hamid Reza","year":"2018","unstructured":"Hamid Reza Vaezi Joze and Oscar Koller. 2018. Ms-asl: A large-scale data set and benchmark for understanding american sign language. arXiv preprint arXiv:1812.01053 (2018)."},{"key":"e_1_2_1_24_1","volume-title":"The Kinetics Human Action Video Dataset. CoRR abs\/1705.06950","author":"Kay Will","year":"2017","unstructured":"Will Kay, Jo\u00e3o Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, and Andrew Zisserman. 2017. The Kinetics Human Action Video Dataset. CoRR abs\/1705.06950 (2017). arXiv:1705.06950 http:\/\/arxiv.org\/abs\/1705.06950"},
{"key":"e_1_2_1_25_1","volume-title":"Indoor and outdoor depth imaging of leaves with time-of-flight and stereo vision sensors: Analysis and comparison. ISPRS journal of photogrammetry and remote sensing 88","author":"Kazmi Wajahat","year":"2014","unstructured":"Wajahat Kazmi, Sergi Foix, Guillem Aleny\u00e0, and Hans J\u00f8rgen Andersen. 2014. Indoor and outdoor depth imaging of leaves with time-of-flight and stereo vision sensors: Analysis and comparison. ISPRS journal of photogrammetry and remote sensing 88 (2014), 128--146."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.3390\/s20072129"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2911077"},{"key":"e_1_2_1_28_1","volume-title":"Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers. Computer Vision and Image Understanding 141 (Dec. 2015)","author":"Koller Oscar","year":"2015","unstructured":"Oscar Koller, Jens Forster, and Hermann Ney. 2015. Continuous sign language recognition: Towards large vocabulary statistical recognition systems handling multiple signers. Computer Vision and Image Understanding 141 (Dec. 2015), 108--125."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-018-1121-3"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2019.00240"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV45572.2020.9093512"},{"key":"e_1_2_1_32_1","volume-title":"Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710","author":"Li Hao","year":"2016","unstructured":"Hao Li, Asim Kadav, Igor Durdanovic, Hanan Samet, and Hans Peter Graf. 2016. Pruning filters for efficient convnets. arXiv preprint arXiv:1608.08710 (2016)."},
{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2904749"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3191755"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1145\/3316782.3316786"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/1355734.1355746"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1007\/s13369-012-0378-z"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.456"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3134230.3134236"},{"key":"e_1_2_1_40_1","unstructured":"myoarmband [n. d.]. https:\/\/support.getmyo.com\/hc\/en-us"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/INVENTIVE.2016.7830097"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3411843"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cola.2019.04.002"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3381010"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/MASS.2017.41"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/SLT.2018.8639639"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01262053"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2629481"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01424-7_27"},{"key":"e_1_2_1_50_1","unstructured":"Ultraleap. [n. d.]. Leap Motion Controller. Available at https:\/\/www.ultraleap.com\/product\/leap-motion-controller\/."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2897735"},{"key":"e_1_2_1_52_1","unstructured":"Seongok Won. 2019. A Study on the Korean Sign Language (KSL) Grammar."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/JBHI.2016.2598302"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIEA.2010.5514688"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3282353.3282356"},{"key":"e_1_2_1_56_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3377553","article-title":"WiSign: Ubiquitous American Sign Language Recognition Using Commercial Wi-Fi Devices","volume":"11","author":"Zhang Lei","year":"2020","unstructured":"Lei Zhang, Yixiang Zhang, and Xiaolong Zheng. 2020. WiSign: Ubiquitous American Sign Language Recognition Using Commercial Wi-Fi Devices. ACM Transactions on Intelligent Systems and Technology (TIST) 11, 3 (2020), 1--24.","journal-title":"ACM Transactions on Intelligent Systems and Technology (TIST)"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3301275.3302296"},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/MMUL.2012.24"},{"key":"e_1_2_1_59_1","volume-title":"2017 IEEE International Conference on Pervasive Computing and Communications (PerCom). IEEE, 2--10","author":"Zhong Henry","year":"2017","unstructured":"Henry Zhong, Salil S Kanhere, and Chun Tung Chou. 2017. VeinDeep: smartphone unlock using vein patterns. In 2017 IEEE International Conference on Pervasive Computing and Communications (PerCom). IEEE, 2--10."},
{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3287080"}],"container-title":["Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3463498","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3463498","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:31:28Z","timestamp":1750195888000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3463498"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,6,23]]},"references-count":60,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2021,6,23]]}},"alternative-id":["10.1145\/3463498"],"URL":"https:\/\/doi.org\/10.1145\/3463498","relation":{},"ISSN":["2474-9567"],"issn-type":[{"value":"2474-9567","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,6,23]]},"assertion":[{"value":"2021-06-24","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
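
Note: the abstract describes classifying a short clip of depth frames into one of 50 Korean Sign Language words with a 3DCNN. The sketch below (Python/PyTorch) illustrates only the general shape of such a pipeline; it is not the authors' model. The layer sizes, the 16-frame clip length, and the 64x64 input resolution are illustrative assumptions, not values taken from the paper.

# Minimal sketch of a 3DCNN sign-word classifier over depth-frame clips.
# Assumptions (not from the paper): clip length, resolution, layer widths.
import torch
import torch.nn as nn

NUM_WORDS = 50      # vocabulary size reported in the abstract
CLIP_LEN = 16       # assumed frames per clip
H = W = 64          # assumed depth-frame resolution after downscaling

class Depth3DCNN(nn.Module):
    def __init__(self, num_classes: int = NUM_WORDS):
        super().__init__()
        # Input shape: (batch, 1 depth channel, frames, height, width)
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm3d(16), nn.ReLU(inplace=True),
            nn.MaxPool3d((1, 2, 2)),              # pool space only at first
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm3d(32), nn.ReLU(inplace=True),
            nn.MaxPool3d(2),                      # pool time and space
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm3d(64), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),              # global average pool
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: normalized depth frames, (B, 1, CLIP_LEN, H, W)
        x = self.features(clip)
        return self.classifier(x.flatten(1))      # (B, NUM_WORDS) logits

if __name__ == "__main__":
    model = Depth3DCNN().eval()
    # One fake normalized depth clip standing in for camera input.
    clip = torch.rand(1, 1, CLIP_LEN, H, W)
    with torch.no_grad():
        word_id = model(clip).argmax(dim=1).item()
    print(f"predicted word id: {word_id}")

A model of roughly this size is small enough to export (e.g., via TorchScript) and run on-device, which is consistent with the paper's goal of real-time, smartphone-only inference without cloud servers.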