{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:36:07Z","timestamp":1750221367233,"version":"3.41.0"},"reference-count":40,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2017,12,13]],"date-time":"2017-12-13T00:00:00Z","timestamp":1513123200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"City University of Hong","award":["7004889"],"award-info":[{"award-number":["7004889"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61772495, 61232013, and 61702194"],"award-info":[{"award-number":["61772495, 61232013, and 61702194"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/100014380","name":"Beijing Advanced Innovation Center for Imaging Technology","doi-asserted-by":"crossref","award":["BAICIT-2016009"],"award-info":[{"award-number":["BAICIT-2016009"]}],"id":[{"id":"10.13039\/100014380","id-type":"DOI","asserted-by":"crossref"}]},{"name":"SRG"},{"name":"National Key Research and Development Program of China","award":["2017YFB1002203"],"award-info":[{"award-number":["2017YFB1002203"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2018,2,28]]},"abstract":"<jats:p>Egocentric videos, which mainly record the activities carried out by the users of wearable cameras, have drawn much research attention in recent years. Due to its lengthy content, a large number of ego-related applications have been developed to abstract the captured videos. 
As the users are accustomed to interacting with the target objects using their own hands, while their hands usually appear within their visual fields during the interaction, an egocentric hand detection step is involved in tasks like gesture recognition, action recognition, and social interaction understanding. In this work, we propose a dynamic region-growing approach for hand region detection in egocentric videos, by jointly considering hand-related motion and egocentric cues. We first determine seed regions that most likely belong to the hand, by analyzing the motion patterns across successive frames. The hand regions can then be located by extending from the seed regions, according to the scores computed for the adjacent superpixels. These scores are derived from four egocentric cues: contrast, location, position consistency, and appearance continuity. We discuss how to apply the proposed method in real-life scenarios, where multiple hands irregularly appear and disappear from the videos. 
Experimental results on public datasets show that the proposed method achieves superior performance compared with the state-of-the-art methods, especially in complicated scenarios.<\/jats:p>","DOI":"10.1145\/3152129","type":"journal-article","created":{"date-parts":[[2017,12,20]],"date-time":"2017-12-20T14:54:00Z","timestamp":1513781640000},"page":"1-17","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Egocentric Hand Detection Via Dynamic Region Growing"],"prefix":"10.1145","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5043-1448","authenticated-orcid":false,"given":"Shao","family":"Huang","sequence":"first","affiliation":[{"name":"University of Chinese Academy of Sciences, City University of Hong Kong, Beijing, China"}]},{"given":"Weiqiang","family":"Wang","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, Beijing, China"}]},{"given":"Shengfeng","family":"He","sequence":"additional","affiliation":[{"name":"South China University of Technology, Guangzhou, China"}]},{"given":"Rynson W. H.","family":"Lau","sequence":"additional","affiliation":[{"name":"City University of Hong Kong, Kowloon, Hong Kong SAR"}]}],"member":"320","published-online":{"date-parts":[[2017,12,13]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.120"},{"volume-title":"A survey on recent advances of computer vision algorithms for egocentric video. 
arXiv:1501.02825","year":"2015","author":"Bambach Sven","key":"e_1_2_1_2_1"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.226"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2014.107"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/11744023_32"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2014.92"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2015.2409731"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Hakan Cevikalp Bill Triggs and Vojtech Franc. 2013. Face and landmark detection by using cascade of classifiers. In Automatic Face and Gesture Recognition. 1--7.","DOI":"10.1109\/FG.2013.6553705"},{"key":"e_1_2_1_9_1","first-page":"65","article-title":"Summarization of egocentric videos: A comprehensive survey","volume":"47","author":"del Molino Ana Garcia","year":"2017","journal-title":"IEEE Trans. Hum.-Mach. Syst."},{"volume-title":"Joint hand detection and rotation estimation by using CNN. arXiv:1612.02742","year":"2016","author":"Deng Xiaoming","key":"e_1_2_1_10_1"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2006.879872"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33718-5_23"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2011.5995444"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/358669.358692"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11760-014-0631-x"},{"volume-title":"Proceedings of the CVPR. 
1346--1353","year":"2012","author":"Ghosh Joydeep","key":"e_1_2_1_16_1"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.5555\/946247.946644"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.464"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1013200319198"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/1032641.1033046"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2016.2608002"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2015.7301344"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2014.86"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.326"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.458"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.399"},{"volume-title":"Rehg","year":"2015","author":"Li Yin","key":"e_1_2_1_27_1"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2733373.2807972"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1023\/B:VISI.0000029664.99615.94"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.350"},{"key":"e_1_2_1_31_1","first-page":"674","article-title":"An iterative image registration technique with an application to stereo vision","volume":"81","author":"Lucas Bruce D.","year":"1981","journal-title":"Proceedings of the IJCAI"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.5555\/2354409.2355089"},{"volume-title":"Proceedings of the ITSC. 
2545--2550","year":"2016","author":"Rangesh Akshay","key":"e_1_2_1_33_1"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5540074"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299061"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126544"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2009.5459334"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-013-0620-5"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.252"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_19"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3152129","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3152129","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T02:26:26Z","timestamp":1750213586000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3152129"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,12,13]]},"references-count":40,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2018,2,28]]}},"alternative-id":["10.1145\/3152129"],"URL":"https:\/\/doi.org\/10.1145\/3152129","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2017,12,13]]},"assertion":[{"value":"2017-04-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication 
History"}},{"value":"2017-10-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-12-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}