{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T02:53:04Z","timestamp":1770346384023,"version":"3.49.0"},"reference-count":47,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,5,11]],"date-time":"2024-05-11T00:00:00Z","timestamp":1715385600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62272213 and 62272223"],"award-info":[{"award-number":["62272213 and 62272223"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"program B for Outstanding Ph.D. candidate of Nanjing University"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Sen. Netw."],"published-print":{"date-parts":[[2024,7,31]]},"abstract":"<jats:p>We propose UltraCLR, a new contrastive learning framework that fuses dual modulation ultrasonic sensing signals to enhance gesture representation. Most existing ultrasound-based gesture recognition tasks rely on a large amount of manually labeled samples to learn task-specific representations via end-to-end training. However, they cannot exploit unlabeled continuous gesture signals that are easy to collect. Inspired by recent self-supervised learning techniques, UltraCLR aims to autonomously learn a ubiquitous gesture signal representation that can benefit all tasks from low-cost unlabeled signals. We use the STFT heatmap as a secondary input and leverage the contrastive learning framework to improve the high-quality Channel Impulsive Response heatmap input representations. The learned representations can better represent the spatial-position information and intermediate states of gesture movement. 
With the representation learned by UltraCLR, we can greatly reduce the complexity of downstream gesture recognition tasks so that they can be completed using a simple classifier trained with a small training set and a lower computational cost. Our experimental results show that UltraCLR outperforms state-of-the-art gesture recognition systems with only a few labeled samples and achieves more than 85% reduction in computational complexity and over 9\u00d7 improvement in inference speed.<\/jats:p>","DOI":"10.1145\/3597498","type":"journal-article","created":{"date-parts":[[2023,5,29]],"date-time":"2023-05-29T11:01:28Z","timestamp":1685358088000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["UltraCLR: Contrastive Representation Learning Framework for Ultrasound-based Sensing"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2479-6571","authenticated-orcid":false,"given":"Xun","family":"Wang","sequence":"first","affiliation":[{"name":"State Key Lab. for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-1931-8602","authenticated-orcid":false,"given":"Zhizheng","family":"Yang","sequence":"additional","affiliation":[{"name":"State Key Lab. for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9882-2090","authenticated-orcid":false,"given":"Wei","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Lab. for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0545-8187","authenticated-orcid":false,"given":"Haipeng","family":"Dai","sequence":"additional","affiliation":[{"name":"State Key Lab. 
for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7510-5120","authenticated-orcid":false,"given":"Shuyu","family":"Shi","sequence":"additional","affiliation":[{"name":"State Key Lab. for Novel Software Technology, Nanjing University, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1112-790X","authenticated-orcid":false,"given":"Qing","family":"Gu","sequence":"additional","affiliation":[{"name":"State Key Lab. for Novel Software Technology, Nanjing University, Nanjing, China"}]}],"member":"320","published-online":{"date-parts":[[2024,5,11]]},"reference":[{"issue":"4","key":"e_1_3_1_2_2","first-page":"1","article-title":"IMU2Doppler: Cross-modal domain adaptation for doppler-based activity recognition using IMU data","volume":"5","author":"Bhalla Sejal","year":"2022","unstructured":"Sejal Bhalla, Mayank Goel, and Rushil Khurana. 2022. IMU2Doppler: Cross-modal domain adaptation for doppler-based activity recognition using IMU data. Proc. ACM Interact. Mob. Wear. Ubiq. Technol. 5, 4 (2022), 1\u201320.","journal-title":"Proc. ACM Interact. Mob. Wear. Ubiq. Technol."},{"key":"e_1_3_1_3_2","first-page":"119","volume-title":"Proceedings of USENIX NSDI","author":"Bhardwaj Romil","year":"2022","unstructured":"Romil Bhardwaj, Zhengxu Xia, Ganesh Ananthanarayanan, Junchen Jiang, Yuanchao Shu, Nikolaos Karianakis, Kevin Hsieh, Paramvir Bahl, and Ion Stoica. 2022. Ekya: Continuous learning of video analytics models on edge compute servers. In Proceedings of USENIX NSDI. 119\u2013135."},{"key":"e_1_3_1_4_2","first-page":"9912","volume-title":"Proceedings of NeurIPS","author":"Caron Mathilde","year":"2020","unstructured":"Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin. 2020. Unsupervised learning of visual features by contrasting cluster assignments. In Proceedings of NeurIPS. 
9912\u20139924."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3380985"},{"key":"e_1_3_1_6_2","first-page":"1597","volume-title":"Proceedings of ICML","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of ICML. 1597\u20131607."},{"key":"e_1_3_1_7_2","first-page":"22243","volume-title":"Proceedings of NeurlPS","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, and Geoffrey Hinton. 2020. Big self-supervised models are strong semi-supervised learners. In Proceedings of NeurlPS. 22243\u201322255 pages."},{"key":"e_1_3_1_8_2","first-page":"1","article-title":"Improved baselines with momentum contrastive learning","volume":"2003","author":"Chen Xinlei","year":"2020","unstructured":"Xinlei Chen, Haoqi Fan, Ross B. Girshick, and Kaiming He. 2020. Improved baselines with momentum contrastive learning. CoRR abs\/2003.04297 (2020), 1\u201320.","journal-title":"CoRR"},{"issue":"4","key":"e_1_3_1_9_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3494962","article-title":"ChestLive: Fortifying voice-based authentication with chest motion biometric on smart devices","volume":"5","author":"Chen Yanjiao","year":"2021","unstructured":"Yanjiao Chen, Meng Xue, Jian Zhang, Qianyun Guan, Zhiyuan Wang, Qian Zhang, and Wei Wang. 2021. ChestLive: Fortifying voice-based authentication with chest motion biometric on smart devices. Proc. ACM Interact. Mob. Wear. Ubiq. Technol. 5, 4 (2021), 1\u201325.","journal-title":"Proc. ACM Interact. Mob. Wear. Ubiq. 
Technol."},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM42981.2021.9488703"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3356250.3360020"},{"key":"e_1_3_1_12_2","first-page":"21271","volume-title":"Proceedings of NeurlPS","author":"Grill Jean-Bastien","year":"2020","unstructured":"Jean-Bastien Grill, Florian Strub, Florent Altch\u00e9, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, Bilal Piot, koray kavukcuoglu, Remi Munos, and Michal Valko. 2020. Bootstrap your own latent - a new approach to self-supervised learning. In Proceedings of NeurlPS. 21271\u201321284."},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM48880.2022.9796879"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/2207676.2208331"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/WCNC45663.2020.9120726"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/TASSP.1981.1163535"},{"key":"e_1_3_1_18_2","first-page":"10944","volume-title":"Proceedings of NeurlPS","author":"Huang Yu","year":"2021","unstructured":"Yu Huang, Chenzhuang Du, Zihui Xue, Xuanyao Chen, Hang Zhao, and Longbo Huang. 2021. What makes multi-modal learning better than single (provably). In Proceedings of NeurlPS. 10944\u201310956."},{"issue":"4","key":"e_1_3_1_19_2","first-page":"1","article-title":"DAFI: WiFi-based device-free indoor localization via domain adaptation","volume":"5","author":"Li Hang","year":"2022","unstructured":"Hang Li, Xi Chen, Ju Wang, Di Wu, and Xue Liu. 2022. DAFI: WiFi-based device-free indoor localization via domain adaptation. Proc. ACM Interact. Mob. Wear. Ubiq. Technol. 5, 4 (2022), 1\u201321.","journal-title":"Proc. ACM Interact. Mob. Wear. Ubiq. 
Technol."},{"key":"e_1_3_1_20_2","first-page":"872","volume-title":"Proceedings of IEEE ICCV","author":"Li Tianhong","year":"2019","unstructured":"Tianhong Li, Lijie Fan, Mingmin Zhao, Yingcheng Liu, and Dina Katabi. 2019. Making the invisible bisible: Action recognition through walls and occlusions. In Proceedings of IEEE ICCV. 872\u2013881."},{"issue":"7","key":"e_1_3_1_21_2","first-page":"2620","article-title":"UltraGesture: Fine-grained gesture sensing and recognition","volume":"21","author":"Ling Kang","year":"2022","unstructured":"Kang Ling, Haipeng Dai, Yuntang Liu, Alex X. Liu, Wei Wang, and Qing Gu. 2022. UltraGesture: Fine-grained gesture sensing and recognition. IEEE Trans. Mob. Comput. 21, 7 (2022), 2620\u20132636.","journal-title":"IEEE Trans. Mob. Comput."},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3384419.3430776"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/CSE.2014.273"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/2971648.2971736"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3053704"},{"key":"e_1_3_1_26_2","first-page":"5628","volume-title":"Proceedings of ICML","author":"Saunshi Nikunj","year":"2019","unstructured":"Nikunj Saunshi, Orestis Plevrakis, Sanjeev Arora, Mikhail Khodak, and Hrishikesh Khandeparkar. 2019. A theoretical analysis of contrastive unsupervised representation learning. In Proceedings of ICML. 5628\u20135637."},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/SECON55815.2022.9918549"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3495243.3560529"},{"key":"e_1_3_1_29_2","first-page":"298","volume-title":"Proceedings of ACM SenSys","author":"Sun Ke","year":"2020","unstructured":"Ke Sun, Chen Chen, and Xinyu Zhang. 2020. \u201cAlexa, stop spying on me!\u201d speech privacy protection against voice assistants. In Proceedings of ACM SenSys. 
298\u2013311."},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3447993.3448626"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241568"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3448112"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58621-8_45"},{"key":"e_1_3_1_34_2","unstructured":"A\u00e4ron van den Oord Yazhe Li and Oriol Vinyals. 2018. Representation learning with contrastive predictive coding. arXiv:1807.03748. Retrieved from http:\/\/arxiv.org\/abs\/1807.03748."},{"issue":"11","key":"e_1_3_1_35_2","first-page":"2579","article-title":"Visualizing data using t-SNE.","volume":"9","author":"Maaten Laurens Van der","year":"2008","unstructured":"Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 11 (2008), 2579\u20132605.","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM42981.2021.9488881"},{"key":"e_1_3_1_37_2","first-page":"82","volume-title":"Proceedings of ACM MobiCom","author":"Wang Wei","year":"2016","unstructured":"Wei Wang, Alex X. Liu, and Ke Sun. 2016. Device-free gesture tracking using acoustic signals. In Proceedings of ACM MobiCom. 
82\u201394."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM41043.2020.9155491"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMC.2020.3032278"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3485730.3485936"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3485730.3485937"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00637"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-19809-0_38"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3081333.3081356"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3241539.3241570"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00768"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230579"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCCN.2018.8487345"}],"container-title":["ACM Transactions on Sensor 
Networks"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3597498","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3597498","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:48:45Z","timestamp":1750182525000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3597498"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,11]]},"references-count":47,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,7,31]]}},"alternative-id":["10.1145\/3597498"],"URL":"https:\/\/doi.org\/10.1145\/3597498","relation":{},"ISSN":["1550-4859","1550-4867"],"issn-type":[{"value":"1550-4859","type":"print"},{"value":"1550-4867","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,11]]},"assertion":[{"value":"2022-12-22","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-05-08","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2024-05-11","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}