{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,27]],"date-time":"2026-04-27T17:40:06Z","timestamp":1777311606997,"version":"3.51.4"},"reference-count":61,"publisher":"Association for Computing Machinery (ACM)","issue":"3","funder":[{"name":"National Major Scientific Instruments and Equipments Development Project of National Natural Science Foundation of China","award":["62427810"],"award-info":[{"award-number":["62427810"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62222202, 62232004, U24B20176"],"award-info":[{"award-number":["62222202, 62232004, U24B20176"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Beijing Natural Science Foundation","award":["L223002"],"award-info":[{"award-number":["L223002"]}]},{"DOI":"10.13039\/501100013314","name":"111 Project","doi-asserted-by":"crossref","award":["B18008"],"award-info":[{"award-number":["B18008"]}],"id":[{"id":"10.13039\/501100013314","id-type":"DOI","asserted-by":"crossref"}]},{"name":"BUPT Excellent Ph.D. Students Foundation","award":["CX20242002"],"award-info":[{"award-number":["CX20242002"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."],"published-print":{"date-parts":[[2025,9,3]]},"abstract":"<jats:p>Multi-modal sensing has become crucial in Human Activity Recognition (HAR) due to its ability to combine data from diverse sensors. However, challenges arise in recognizing various activities in different scenes using multi-modal data from different positions and devices, due to dynamic combinations of modal inputs, data heterogeneity, and scarcity of labeled data. To tackle these challenges, we propose MASTER, a multi-modal foundation model specifically designed for HAR. 
MASTER introduces a masked-data modeling-based self-supervised pre-training method, enabling the model to learn from unlabeled data and adapt to dynamic combinations of modal inputs. Moreover, it incorporates a few-shot alignment mechanism to facilitate adaptation to different activities, scenes, positions, and devices. Through the pre-training and fine-tuning on 7 multi-modal HAR datasets, MASTER currently supports, but is not limited to, 8 modalities (ACC, Gyro, mmWave, WiFi, Skeleton, Lidar, Infrared, and RGB) and 45 human activities. The results demonstrate that MASTER achieves the highest accuracy with minimal labeled data across various situations, surpassing alternative solutions.<\/jats:p>","DOI":"10.1145\/3749511","type":"journal-article","created":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T17:15:45Z","timestamp":1756919745000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["MASTER: A Multi-modal Foundation Model for Human Activity Recognition"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5671-9637","authenticated-orcid":false,"given":"Guanzhou","family":"Zhu","sequence":"first","affiliation":[{"name":"State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7337-9168","authenticated-orcid":false,"given":"Dong","family":"Zhao","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-0682-7447","authenticated-orcid":false,"given":"Chunliang","family":"Li","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-7944-9391","authenticated-orcid":false,"given":"Mingyue","family":"Zhao","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-0262-3323","authenticated-orcid":false,"given":"Zhengyuan","family":"Zhang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-8656-0352","authenticated-orcid":false,"given":"Hefeng","family":"Quan","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7199-5047","authenticated-orcid":false,"given":"Huadong","family":"Ma","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Network and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2025,9,3]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"2021. Apple: Measuring Walking Quality Through iPhone Mobility Metrics. https:\/\/www.apple.com\/ca\/healthcare\/docs\/site\/Measuring_Walking_Quality_Through_iPhone_Mobility_Metrics.pdf."},{"key":"e_1_2_1_2_1","unstructured":"2024. MotionNode IMU platform. http:\/\/www.motionnode.com\/."},{"key":"e_1_2_1_3_1","volume-title":"Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al.","author":"Achiam Josh","year":"2023","unstructured":"Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. 2023. GPT-4 technical report. 
arXiv:2303.08774 (2023)."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3563948"},{"key":"e_1_2_1_5_1","first-page":"27414","article-title":"mri: Multi-modal 3d human pose estimation dataset using mmwave, rgb-d, and inertial sensors","volume":"35","author":"An Sizhe","year":"2022","unstructured":"Sizhe An, Yin Li, and Umit Ogras. 2022. mri: Multi-modal 3d human pose estimation dataset using mmwave, rgb-d, and inertial sensors. Advances in NeurIPS 35 (2022), 27414--27426.","journal-title":"Advances in NeurIPS"},{"key":"e_1_2_1_6_1","first-page":"3","article-title":"A public domain dataset for human activity recognition using smartphones","volume":"3","author":"Anguita Davide","year":"2013","unstructured":"Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, Jorge Luis Reyes-Ortiz, et al. 2013. A public domain dataset for human activity recognition using smartphones. In Esann, Vol. 3. 3--4.","journal-title":"Esann"},{"key":"e_1_2_1_7_1","first-page":"32897","article-title":"Vlmo: Unified vision-language pre-training with mixture-of-modality-experts","volume":"35","author":"Bao Hangbo","year":"2022","unstructured":"Hangbo Bao, Wenhui Wang, Li Dong, Qiang Liu, Owais Khan Mohammed, Kriti Aggarwal, Subhojit Som, Songhao Piao, and Furu Wei. 2022. Vlmo: Unified vision-language pre-training with mixture-of-modality-experts. Advances in NeurIPS 35 (2022), 32897--32912.","journal-title":"Advances in NeurIPS"},{"key":"e_1_2_1_8_1","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. 
Advances in NeurIPS 33 (2020), 1877--1901.","journal-title":"Advances in NeurIPS"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2015.7350781"},{"key":"e_1_2_1_10_1","first-page":"1","article-title":"PaLM: Scaling language modeling with pathways","volume":"24","author":"Chowdhery Aakanksha","year":"2023","unstructured":"Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, et al. 2023. PaLM: Scaling language modeling with pathways. Journal of Machine Learning Research 24, 240 (2023), 1--113.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_1_11_1","volume-title":"Advancing Multi-Modal Sensing Through Expandable Modality Alignment. arXiv preprint arXiv:2407.17777","author":"Dai Shenghong","year":"2024","unstructured":"Shenghong Dai, Shiqi Jiang, Yifan Yang, Ting Cao, Mo Li, Suman Banerjee, and Lili Qiu. 2024. Advancing Multi-Modal Sensing Through Expandable Modality Alignment. arXiv preprint arXiv:2407.17777 (2024)."},{"key":"e_1_2_1_12_1","first-page":"1","article-title":"COCOA: Cross modality contrastive learning for sensor data","volume":"6","author":"Deldari Shohreh","year":"2022","unstructured":"Shohreh Deldari, Hao Xue, Aaqib Saeed, Daniel V Smith, and Flora D Salim. 2022. COCOA: Cross modality contrastive learning for sensor data. Proceedings of the ACM IMWUT 6, 3 (2022), 1--28.","journal-title":"Proceedings of the ACM IMWUT"},{"key":"e_1_2_1_13_1","volume-title":"G3R: Generating Rich and Fine-Grained Mmwave Radar Data From 2D Videos for Generalized Gesture Recognition","author":"Deng Kaikai","year":"2024","unstructured":"Kaikai Deng, Dong Zhao, Wenxin Zheng, Yue Ling, Kangwen Yin, and Huadong Ma. 2024. G3R: Generating Rich and Fine-Grained Mmwave Radar Data From 2D Videos for Generalized Gesture Recognition. 
IEEE Transactions on Mobile Computing (2024)."},{"key":"e_1_2_1_14_1","volume-title":"Proceedings of the NAACL. 4171--4186","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the NAACL. 4171--4186."},{"key":"e_1_2_1_15_1","volume-title":"Words: Transformers for Image Recognition at Scale. In ICLR.","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. 2020. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In ICLR."},{"key":"e_1_2_1_16_1","volume-title":"Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, et al.","author":"Driess Danny","year":"2023","unstructured":"Danny Driess, Fei Xia, Mehdi SM Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, et al. 2023. PaLM-E: An Embodied Multimodal Language Model. In ICML. 8469--8488."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2021\/324"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01457"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01553"},{"key":"e_1_2_1_20_1","first-page":"1","article-title":"CrossHAR: Generalizing Cross-dataset Human Activity Recognition via Hierarchical Self-Supervised Pretraining","volume":"8","author":"Hong Zhiqing","year":"2024","unstructured":"Zhiqing Hong, Zelong Li, Shuxin Zhong, Wenjun Lyu, Haotian Wang, Yi Ding, Tian He, and Desheng Zhang. 2024. CrossHAR: Generalizing Cross-dataset Human Activity Recognition via Hierarchical Self-Supervised Pretraining. 
Proceedings of the ACM IMWUT 8, 2 (2024), 1--26.","journal-title":"Proceedings of the ACM IMWUT"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1147"},{"key":"e_1_2_1_22_1","first-page":"1","article-title":"Augmented adversarial learning for human activity recognition with partial sensor sets","volume":"6","author":"Kang Hua","year":"2022","unstructured":"Hua Kang, Qianyi Huang, and Qian Zhang. 2022. Augmented adversarial learning for human activity recognition with partial sensor sets. Proceedings of the ACM IMWUT 6, 3 (2022), 1--30.","journal-title":"Proceedings of the ACM IMWUT"},{"key":"e_1_2_1_23_1","unstructured":"Junnan Li Dongxu Li Silvio Savarese and Steven Hoi. 2023. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. In ICML. 19730--19742."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3511808.3557402"},{"key":"e_1_2_1_25_1","first-page":"1","article-title":"mmStress: Distilling Human Stress from Daily Activities via Contact-less Millimeter-wave Sensing","volume":"7","author":"Liang Kun","year":"2023","unstructured":"Kun Liang, Anfu Zhou, Zhan Zhang, Hao Zhou, Huadong Ma, and Chenshu Wu. 2023. mmStress: Distilling Human Stress from Daily Activities via Contact-less Millimeter-wave Sensing. Proceedings of the ACM IMWUT 7, 3 (2023), 1--36.","journal-title":"Proceedings of the ACM IMWUT"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3699754"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2019.2934489"},{"key":"e_1_2_1_28_1","doi-asserted-by":"crossref","unstructured":"Haojie Ma Wenzhong Li Xiao Zhang Songcheng Gao and Sanglu Lu. 2019. AttnSense: Multi-level attention mechanism for multimodal human activity recognition. In IJCAI. 
3109--3115.","DOI":"10.24963\/ijcai.2019\/431"},{"key":"e_1_2_1_29_1","first-page":"1","article-title":"Spatial-Temporal Masked Autoencoder for Multi-Device Wearable Human Activity Recognition","volume":"7","author":"Miao Shenghuan","year":"2024","unstructured":"Shenghuan Miao, Ling Chen, and Rong Hu. 2024. Spatial-Temporal Masked Autoencoder for Multi-Device Wearable Human Activity Recognition. Proceedings of the ACM IMWUT 7, 4 (2024), 1--25.","journal-title":"Proceedings of the ACM IMWUT"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3636534.3649370"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3495243.3560519"},{"key":"e_1_2_1_32_1","volume-title":"Proceedings of the ACM ICML. 8748--8763","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In Proceedings of the ACM ICML. 8748--8763."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12062-020-09260-z"},{"key":"e_1_2_1_34_1","volume-title":"CDFi: Cross-Domain Action Recognition using WiFi Signals","author":"Sheng Biyun","year":"2024","unstructured":"Biyun Sheng, Rui Han, Hui Cai, Fu Xiao, Linqing Gui, and Zhengxin Guo. 2024. CDFi: Cross-Domain Action Recognition using WiFi Signals. IEEE Transactions on Mobile Computing (2024)."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.3390\/s140610146"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2809695.2809718"},{"key":"e_1_2_1_37_1","first-page":"1","article-title":"Multimodal Daily-Life Logging in Free-living Environment Using Non-Visual Egocentric Sensors on a Smartphone","volume":"8","author":"Sun Ke","year":"2024","unstructured":"Ke Sun, Chunyu Xia, Xinyu Zhang, Hao Chen, and Charlie Jianzhong Zhang. 2024. 
Multimodal Daily-Life Logging in Free-living Environment Using Non-Visual Egocentric Sensors on a Smartphone. Proceedings of the ACM IMWUT 8, 1 (2024), 1--32.","journal-title":"Proceedings of the ACM IMWUT"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58621-8_45"},{"key":"e_1_2_1_39_1","volume-title":"Flexible imputation of missing data","author":"Buuren Stef Van","unstructured":"Stef Van Buuren. 2018. Flexible imputation of missing data. CRC press."},{"key":"e_1_2_1_40_1","article-title":"Visualizing data using t-SNE","volume":"9","author":"der Maaten Laurens Van","year":"2008","unstructured":"Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9, 11 (2008).","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_1_41_1","volume-title":"Attention is all you need. Advances in NeurIPS 30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in NeurIPS 30 (2017)."},{"key":"e_1_2_1_42_1","volume-title":"Deepnet: Scaling transformers to 1,000 layers","author":"Wang Hongyu","year":"2024","unstructured":"Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, and Furu Wei. 2024. Deepnet: Scaling transformers to 1,000 layers. IEEE Transactions on Pattern Analysis and Machine Intelligence (2024)."},{"key":"e_1_2_1_43_1","volume-title":"Saksham Singhal, Subhojit Som, et al.","author":"Wang Wenhui","year":"2022","unstructured":"Wenhui Wang, Hangbo Bao, Li Dong, Johan Bjorck, Zhiliang Peng, Qiang Liu, Kriti Aggarwal, Owais Khan Mohammed, Saksham Singhal, Subhojit Som, et al. 2022. Image as a foreign language: Beit pretraining for all vision and vision-language tasks. 
arXiv:2208.10442 (2022)."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/3666025.3699349"},{"key":"e_1_2_1_45_1","volume-title":"The Dawn of Synthetic Era: Synthesizing mmWave Radar Data from 2D Videos for Human Sensing","author":"Xing Ling","year":"2025","unstructured":"Ling Xing, Kaikai Deng, Honghai Wu, Huahong Ma, Jianping Gao, and Yue Ling. 2025. The Dawn of Synthetic Era: Synthesizing mmWave Radar Data from 2D Videos for Human Sensing. IEEE Communications Magazine (2025)."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3570361.3613299"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3485730.3485937"},{"key":"e_1_2_1_48_1","volume-title":"RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data. arXiv preprint arXiv:2411.18822","author":"Xu Maxwell A","year":"2024","unstructured":"Maxwell A Xu, Jaya Narain, Gregory Darnell, Haraldur Hallgrimsson, Hyewon Jeong, Darren Forde, Richard Fineman, Karthik J Raghuram, James M Rehg, and Shirley Ren. 2024. RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data. arXiv preprint arXiv:2411.18822 (2024)."},{"key":"e_1_2_1_49_1","volume-title":"Chris Xiaoxuan Lu, and Lihua Xie","author":"Yang Jianfei","year":"2024","unstructured":"Jianfei Yang, He Huang, Yunjiao Zhou, Xinyan Chen, Yuecong Xu, Shenghai Yuan, Han Zou, Chris Xiaoxuan Lu, and Lihua Xie. 2024. Mm-fi: Multi-modal non-intrusive 4d human dataset for versatile wireless sensing. Advances in NeurIPS 36 (2024)."},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052577"},{"key":"e_1_2_1_51_1","volume-title":"A Survey on Multimodal Large Language Models. arXiv:2306.13549","author":"Yin Shukang","year":"2023","unstructured":"Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, Tong Xu, and Enhong Chen. 2023. A Survey on Multimodal Large Language Models. 
arXiv:2306.13549 (2023)."},{"key":"e_1_2_1_52_1","volume-title":"Self-supervised learning for human activity recognition using 700,000 person-days of wearable data. NPJ digital medicine 7, 1","author":"Yuan Hang","year":"2024","unstructured":"Hang Yuan, Shing Chan, Andrew P Creagh, Catherine Tong, Aidan Acquah, David A Clifton, and Aiden Doherty. 2024. Self-supervised learning for human activity recognition using 700,000 person-days of wearable data. NPJ digital medicine 7, 1 (2024), 91."},{"key":"e_1_2_1_53_1","first-page":"1","article-title":"Lt-fall: The design and implementation of a life-threatening fall detection and alarming system","volume":"7","author":"Zhang Duo","year":"2023","unstructured":"Duo Zhang, Xusheng Zhang, Shengjie Li, Yaxiong Xie, Yang Li, Xuanzhi Wang, and Daqing Zhang. 2023. Lt-fall: The design and implementation of a life-threatening fall detection and alarming system. Proceedings of the ACM IMWUT 7, 1 (2023), 1--24.","journal-title":"Proceedings of the ACM IMWUT"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3380889"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/3300061.3300125"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2370216.2370438"},{"key":"e_1_2_1_57_1","volume-title":"CamoNet: On-Device Neural Network Adaptation With Zero Interaction and Unlabeled Data for Diverse Edge Environments","author":"Zhang Zhengyuan","year":"2024","unstructured":"Zhengyuan Zhang, Dong Zhao, Renhao Liu, Kuo Tian, Yuxing Yao, YuanChun Li, and Huadong Ma. 2024. CamoNet: On-Device Neural Network Adaptation With Zero Interaction and Unlabeled Data for Diverse Edge Environments. 
IEEE Transactions on Mobile Computing (2024)."},{"key":"e_1_2_1_58_1","volume-title":"ACL: Adaptive Edge-Cloud Collaborative Learning for Heterogeneous Devices with Unlabeled Local Data","author":"Zhang Zhengyuan","year":"2025","unstructured":"Zhengyuan Zhang, Dong Zhao, Renhao Liu, Yuxing Yao, Xiangyu Li, and Huadong Ma. 2025. ACL: Adaptive Edge-Cloud Collaborative Learning for Heterogeneous Devices with Unlabeled Local Data. IEEE Transactions on Mobile Computing (2025)."},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM55648.2025.11044700"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1145\/3570361.3592517"},{"key":"e_1_2_1_61_1","first-page":"1","article-title":"Combining Smart Speaker and Smart Meter to Infer Your Residential Power Usage by Self-supervised Cross-modal Learning","volume":"7","author":"Zhu Guanzhou","year":"2023","unstructured":"Guanzhou Zhu, Dong Zhao, Kuo Tian, Zhengyuan Zhang, Rui Yuan, and Huadong Ma. 2023. Combining Smart Speaker and Smart Meter to Infer Your Residential Power Usage by Self-supervised Cross-modal Learning. 
Proceedings of the ACM IMWUT 7, 3 (2023), 1--26.","journal-title":"Proceedings of the ACM IMWUT"}],"container-title":["Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3749511","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,25]],"date-time":"2025-09-25T16:25:50Z","timestamp":1758817550000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3749511"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,3]]},"references-count":61,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2025,9,3]]}},"alternative-id":["10.1145\/3749511"],"URL":"https:\/\/doi.org\/10.1145\/3749511","relation":{},"ISSN":["2474-9567"],"issn-type":[{"value":"2474-9567","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,3]]},"assertion":[{"value":"2025-09-03","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}