{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T17:10:03Z","timestamp":1776791403929,"version":"3.51.2"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"2","funder":[{"name":"Scientific Research Project of Hunan Provincial Department of Science and Technology","award":["2024RC3165, 2025JJ50409"],"award-info":[{"award-number":["2024RC3165, 2025JJ50409"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Internet Things"],"published-print":{"date-parts":[[2026,5,31]]},"abstract":"<jats:p>In the Internet of Vehicles (IoV) systems, recognizing driver emotions is crucial to alleviate dangerous driving behaviors caused by emotional instability. Current research predominantly utilizes multimodal data generated by various types of sensors in IoV systems as input to analyze driver emotion changes using multimodal models. However, existing methods are not enough to fully exploit the advantages of large language models (LLM) in information extraction and multimodal feature fusion, which limits the inference capability of emotion recognition models. Therefore, this article proposes an LLM-auxiliary supervision module, which assists in the training phase through LLM to enhance the performance of multimodal emotion recognition models. Specifically, we designed a label text feature extraction (LTFE) module that employs LLM for text data augmentation and extraction, converting label text into semantically informative feature representations. Additionally, we proposed the label-auxiliary supervision (LAS) strategy, which effectively integrates the LLM label text features learned from the LTFE module with the multimodal emotion recognition model during the training phase to enhance the model\u2019s inference ability. 
Notably, the LTFE and LAS modules are used only during the training phase, ensuring that the backbone model requires minimal computational resources during inference, making it compatible with the computational constraints of intelligent vehicular devices. Extensive experiments conducted on the PPB-Emo, RAVDESS, and IEMOCAP datasets demonstrate that the proposed method outperforms existing approaches in driver emotion recognition tasks.<\/jats:p>","DOI":"10.1145\/3786767","type":"journal-article","created":{"date-parts":[[2025,12,30]],"date-time":"2025-12-30T21:25:39Z","timestamp":1767129939000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Application of LLM-powered Multimodal Driver Emotion Recognition in IoV System"],"prefix":"10.1145","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2571-8065","authenticated-orcid":false,"given":"Yiming","family":"Wu","sequence":"first","affiliation":[{"name":"College of Information Science and Engineering, Hunan University","place":["Changsha, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-4805-090X","authenticated-orcid":false,"given":"Ronghui","family":"Cao","sequence":"additional","affiliation":[{"name":"College of Computer and Communication Engineering, Changsha University of Science and Technology","place":["Changsha, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-3960-5397","authenticated-orcid":false,"given":"Zeyu","family":"Chen","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Hunan University","place":["Changsha, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9081-8153","authenticated-orcid":false,"given":"Zhuo","family":"Tang","sequence":"additional","affiliation":[{"name":"College of Information Science 
and Engineering, Hunan University","place":["Changsha, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2681-7898","authenticated-orcid":false,"given":"Wangdong","family":"Yang","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Hunan University","place":["Changsha, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0884-4051","authenticated-orcid":false,"given":"Huilong","family":"Pi","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Hunan University","place":["Changsha, China"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2026,4,21]]},"reference":[{"key":"e_1_3_1_2_2","unstructured":"Josh Achiam Steven Adler Sandhini Agarwal Lama Ahmad Ilge Akkaya Florencia Leoni Aleman Diogo Almeida Janko Altenschmidt Sam Altman Shyamal Anadkat et\u00a0al. 2023. Gpt-4 technical report. arXiv preprint arXiv:2303.08774. Retrieved from https:\/\/arxiv.org\/abs\/2303.08774"},{"key":"e_1_3_1_3_2","first-page":"12449","article-title":"wav2vec 2.0: A framework for self-supervised learning of speech representations","volume":"33","author":"Baevski Alexei","year":"2020","unstructured":"Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, and Michael Auli. 2020. wav2vec 2.0: A framework for self-supervised learning of speech representations. 
Advances in Neural Information Processing Systems 33 (2020), 12449\u201312460.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10579-008-9076-6"},{"key":"e_1_3_1_5_2","first-page":"1493","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Cai Zhixi","year":"2023","unstructured":"Zhixi Cai, Shreya Ghosh, Kalin Stefanov, Abhinav Dhall, Jianfei Cai, Hamid Rezatofighi, Reza Haffari, and Munawar Hayat. 2023. Marlin: Masked autoencoder for facial video representation learning. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1493\u20131504."},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2015.05.037"},{"key":"e_1_3_1_7_2","unstructured":"Xiangxiang Chu Limeng Qiao Xinyang Lin Shuang Xu Yang Yang Yiming Hu Fei Wei Xinyu Zhang Bo Zhang Xiaolin Wei et\u00a0al. 2023. Mobilevlm: A fast reproducible and strong vision language assistant for mobile devices. arXiv preprint arXiv:2312.16886. Retrieved from https:\/\/arxiv.org\/abs\/2312.16886"},{"key":"e_1_3_1_8_2","first-page":"960","volume-title":"Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing.","author":"Degottex Gilles","year":"2014","unstructured":"Gilles Degottex, John Kane, Thomas Drugman, Tuomo Raitio, and Stefan Scherer. 2014. COVAREP\u2013A collaborative voice analysis repository for speech technologies. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 960\u2013964."},{"key":"e_1_3_1_9_2","unstructured":"Jacob Devlin Ming-Wei Chang Kenton Lee and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. 
Retrieved from https:\/\/arxiv.org\/abs\/1810.04805"},{"key":"e_1_3_1_10_2","first-page":"371","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Dong Zhekang","year":"2024","unstructured":"Zhekang Dong, Chenhao Hu, Shiqi Zhou, Liyan Zhu, Junfan Wang, Yi Chen, Xudong Lv, and Xiaoyue Ji. 2024. DECNet: A non-contacting dual-modality emotion classification network for driver health monitoring. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 371\u2013379."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1145\/3549551"},{"key":"e_1_3_1_12_2","unstructured":"Ziwang Fu Feng Liu Hanyang Wang Jiayin Qi Xiangling Fu Aimin Zhou and Zhibin Li. 2021. A cross-modal fusion network based on self-attention and residual structure for multimodal emotion recognition. arXiv preprint arXiv:2111.02172. Retrieved from https:\/\/arxiv.org\/abs\/2111.02172"},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"Akira Fukui Dong Huk Park Daylen Yang Anna Rohrbach Trevor Darrell and Marcus Rohrbach. 2016. Multimodal compact bilinear pooling for visual question answering and visual grounding. arXiv preprint arXiv:1606.01847. Retrieved from https:\/\/arxiv.org\/abs\/1606.01847","DOI":"10.18653\/v1\/D16-1044"},{"key":"e_1_3_1_14_2","unstructured":"Aaron Grattafiori Abhimanyu Dubey Abhinav Jauhri Abhinav Pandey Abhishek Kadian Ahmad Al-Dahle Aiesha Letman Akhil Mathur Alan Schelten Alex Vaughan et\u00a0al. 2024. The llama 3 herd of models. arXiv preprint arXiv:2407.21783. Retrieved from https:\/\/arxiv.org\/abs\/2407.21783"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3695883"},{"key":"e_1_3_1_16_2","unstructured":"Binyuan Hui Jian Yang Zeyu Cui Jiaxi Yang Dayiheng Liu Lei Zhang Tianyu Liu Jiajun Zhang Bowen Yu Keming Lu et\u00a0al. 2024. Qwen2.5-coder technical report. arXiv preprint arXiv:2409.12186. 
Retrieved from https:\/\/arxiv.org\/abs\/2409.12186"},{"key":"e_1_3_1_17_2","unstructured":"Daniel P Jeong Zachary C Lipton and Pradeep Ravikumar. 2024. Llm-select: Feature selection with large language models. arXiv:2407.02694. Retrieved from https:\/\/arxiv.org\/abs\/2407.02694"},{"key":"e_1_3_1_18_2","first-page":"13289","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Joze Hamid Reza Vaezi","year":"2020","unstructured":"Hamid Reza Vaezi Joze, Amirreza Shaban, Michael L. Iuzzolino, and Kazuhito Koishida. 2020. MMTM: Multimodal transfer module for CNN fusion. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 13289\u201313299."},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3614437"},{"issue":"1","key":"e_1_3_1_20_2","first-page":"264","article-title":"Wedea: A new eeg-based framework for emotion recognition","volume":"26","author":"Kim Sun-Hee","year":"2021","unstructured":"Sun-Hee Kim, Hyung-Jeong Yang, Ngoc Anh Thi Nguyen, Sunil Kumar Prabhakar, and Seong-Whan Lee. 2021. Wedea: A new eeg-based framework for emotion recognition. IEEE Journal of Biomedical and Health Informatics 26, 1 (2021), 264\u2013275.","journal-title":"IEEE Journal of Biomedical and Health Informatics"},{"key":"e_1_3_1_21_2","first-page":"67","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Li Hanting","year":"2023","unstructured":"Hanting Li, Hongjing Niu, Zhaoqing Zhu, and Feng Zhao. 2023. Intensity-aware loss for dynamic facial expression recognition in the wild. In Proceedings of the AAAI Conference on Artificial Intelligence. 
67\u201375."},{"issue":"1","key":"e_1_3_1_22_2","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1038\/s41597-022-01557-2","article-title":"A multimodal psychological, physiological and behavioural dataset for human emotions in driving tasks","volume":"9","author":"Li Wenbo","year":"2022","unstructured":"Wenbo Li, Ruichen Tan, Yang Xing, Guofa Li, Shen Li, Guanzhong Zeng, Peizhi Wang, Bingbing Zhang, Xinyu Su, Dawei Pi, et\u00a0al. 2022. A multimodal psychological, physiological and behavioural dataset for human emotions in driving tasks. Scientific Data 9, 1 (2022), 481.","journal-title":"Scientific Data"},{"issue":"3","key":"e_1_3_1_23_2","first-page":"667","article-title":"Cogemonet: A cognitive-feature-augmented driver emotion recognition model for smart cockpit","volume":"9","author":"Li Wenbo","year":"2021","unstructured":"Wenbo Li, Guanzhong Zeng, Juncheng Zhang, Yan Xu, Yang Xing, Rui Zhou, Gang Guo, Yu Shen, Dongpu Cao, and Fei-Yue Wang. 2021. Cogemonet: A cognitive-feature-augmented driver emotion recognition model for smart cockpit. IEEE Transactions on Computational Social Systems 9, 3 (2021), 667\u2013678.","journal-title":"IEEE Transactions on Computational Social Systems"},{"key":"e_1_3_1_24_2","unstructured":"Huanshuo Liu Hao Zhang Zhijiang Guo Kuicai Dong Xiangyang Li Yi Quan Lee Cong Zhang and Yong Liu. 2024. CtrlA: Adaptive retrieval-augmented generation via probe-guided control. arXiv preprint arXiv:2405.18727. Retrieved from https:\/\/arxiv.org\/abs\/2405.18727"},{"key":"e_1_3_1_25_2","unstructured":"Kuan Liu Yanen Li Ning Xu and Prem Natarajan. 2018. Learn to combine modalities in multimodal deep learning. arXiv preprint arXiv:1805.11730. 
Retrieved from https:\/\/arxiv.org\/abs\/1805.11730"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2022.03.062"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0196391"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.3390\/app12010327"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00258"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3549548"},{"key":"e_1_3_1_31_2","first-page":"3866","volume-title":"Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP)","author":"Meng Debin","year":"2019","unstructured":"Debin Meng, Xiaojiang Peng, Kai Wang, and Yu Qiao. 2019. Frame attention networks for facial expression recognition in videos. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 3866\u20133870."},{"key":"e_1_3_1_32_2","unstructured":"Tom\u00e1\u0161 Mikolov. 2012. Statistical Language Models Based on Neural Networks. Ph.D. Dissertation. Brno University of Technology Brno Czech Republic."},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2023.104676"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2023.3250460"},{"key":"e_1_3_1_35_2","unstructured":"Juan DS Ortega Mohammed Senoussaoui Eric Granger Marco Pedersoli Patrick Cardinal and Alessandro L. Koerich. 2019. Multimodal fusion with deep neural networks for audio-video emotion recognition. arXiv preprint arXiv:1907.03196. Retrieved from https:\/\/arxiv.org\/abs\/1907.03196"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33016892"},{"key":"e_1_3_1_37_2","unstructured":"Alec Radford Karthik Narasimhan Tim Salimans and Ilya Sutskever. 2018. Improving Language Understanding by Generative Pre-training. Technical Report. 
OpenAI."},{"key":"e_1_3_1_38_2","first-page":"8748","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et\u00a0al. 2021. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning. PMLR, 8748\u20138763."},{"issue":"8","key":"e_1_3_1_39_2","doi-asserted-by":"crossref","first-page":"e0256109","DOI":"10.1371\/journal.pone.0256109","article-title":"Developmental differences in the visual processing of emotionally ambiguous neutral faces based on perceived valence","volume":"16","author":"Rollins Leslie","year":"2021","unstructured":"Leslie Rollins, Erin Bertero, and Laurie Hunter. 2021. Developmental differences in the visual processing of emotionally ambiguous neutral faces based on perceived valence. PloS One 16, 8 (2021), e0256109.","journal-title":"PloS One"},{"key":"e_1_3_1_40_2","article-title":"Global road safety 2010\u201318: An analysis of global status reports","author":"Rosen Heather E.","year":"2025","unstructured":"Heather E. Rosen, Imran Bari, Nino Paichadze, Margaret Peden, Meleckidzedeck Khayesi, Jes\u00fas Moncl\u00fas, and Adnan A Hyder. 2025. Global road safety 2010\u201318: An analysis of global status reports. Injury 56, 6 (2025), 110266.","journal-title":"Injury"},{"key":"e_1_3_1_41_2","doi-asserted-by":"crossref","unstructured":"James A. Russell. 1980. A circumplex model of affect. Journal of Personality and Social Psychology 39 6 (1980) 1161\u20131178.","DOI":"10.1037\/h0077714"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/TAFFC.2020.3014842"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1145\/3674150"},{"key":"e_1_3_1_44_2","unstructured":"Lang Su Chuqing Hu Guofa Li and Dongpu Cao. 2020. 
Msaf: Multimodal split attention fusion. arXiv preprint arXiv:2012.07175. Retrieved from https:\/\/arxiv.org\/abs\/2012.07175"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ssci.2014.08.013"},{"key":"e_1_3_1_46_2","first-page":"6105","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Tan Mingxing","year":"2019","unstructured":"Mingxing Tan and Quoc Le. 2019. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning. PMLR, 6105\u20136114."},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2023.119125"},{"key":"e_1_3_1_48_2","first-page":"6558","volume-title":"Proceedings of the Conference. Association for Computational Linguistics. Meeting","author":"Tsai Yao-Hung Hubert","year":"2019","unstructured":"Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J Zico Kolter, Louis-Philippe Morency, and Ruslan Salakhutdinov. 2019. Multimodal transformer for unaligned multimodal language sequences. In Proceedings of the Conference. Association for Computational Linguistics. Meeting. NIH Public Access, 6558."},{"key":"e_1_3_1_49_2","unstructured":"Yao-Hung Hubert Tsai Paul Pu Liang Amir Zadeh Louis-Philippe Morency and Ruslan Salakhutdinov. 2018. Learning factorized multimodal representations. arXiv preprint arXiv:1806.06176. Retrieved from https:\/\/arxiv.org\/abs\/1806.06176"},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2022.07.012"},{"key":"e_1_3_1_51_2","first-page":"17958","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Hanyang","year":"2023","unstructured":"Hanyang Wang, Bo Li, Shuang Wu, Siyuan Shen, Feng Liu, Shouhong Ding, and Aimin Zhou. 2023. Rethinking the learning paradigm for dynamic facial expression recognition. 
In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 17958\u201317968."},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33017216"},{"key":"e_1_3_1_53_2","first-page":"20459","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Yang Dingkang","year":"2023","unstructured":"Dingkang Yang, Shuai Huang, Zhi Xu, Zhenpeng Li, Shunli Wang, Mingcheng Li, Yuzheng Wang, Yang Liu, Kun Yang, Zhaoyu Chen, et\u00a0al. 2023. Aide: A vision-driven multi-view, multi-modal, multi-tasking dataset for assistive driving perception. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 20459\u201320470."},{"key":"e_1_3_1_54_2","article-title":"Analyzing audiovisual data for understanding user\u2019s emotion in human-computer interaction environment","author":"Yang Juan","year":"2024","unstructured":"Juan Yang, Zhenkun Li, and Xu Du. 2024. Analyzing audiovisual data for understanding user\u2019s emotion in human-computer interaction environment. Data Technologies and Applications 58, 2 (2024), 318\u2013343.","journal-title":"Data Technologies and Applications"},{"key":"e_1_3_1_55_2","unstructured":"Jianing Yang Yongxin Wang Ruitao Yi Yuying Zhu Azaan Rehman Amir Zadeh Soujanya Poria and Louis-Philippe Morency. 2020. Mtag: Modal-temporal attention graph for unaligned human multimodal language sequences. arXiv preprint arXiv:2010.11985. Retrieved from https:\/\/arxiv.org\/abs\/2010.11985"},{"key":"e_1_3_1_56_2","article-title":"A robust driver emotion recognition method based on high-purity feature separation","author":"Yang Lie","year":"2023","unstructured":"Lie Yang, Haohan Yang, Bin-Bin Hu, Yan Wang, and Chen Lv. 2023. A robust driver emotion recognition method based on high-purity feature separation. 
IEEE Transactions on Intelligent Transportation Systems 24, 12 (2023), 15092\u201315104.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_1_57_2","unstructured":"Werner Zellinger Thomas Grubinger Edwin Lughofer Thomas Natschl\u00e4ger and Susanne Saminger-Platz. 2017. Central moment discrepancy (cmd) for domain-invariant representation learning. arXiv preprint arXiv:1702.08811. Retrieved from https:\/\/arxiv.org\/abs\/1702.08811"},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1145\/3474085.3475292"},{"key":"e_1_3_1_59_2","first-page":"7614","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Zhong Yicheng","year":"2024","unstructured":"Yicheng Zhong, Huawei Wei, Peiji Yang, and Zhisheng Wang. 2024. Expclip: Bridging text and facial expressions via semantic alignment. In Proceedings of the AAAI Conference on Artificial Intelligence. 7614\u20137622."},{"key":"e_1_3_1_60_2","first-page":"18","volume-title":"Proceedings of the 1st International Workshop on Efficient Multimedia Computing under Limited","author":"Zhu Yichen","year":"2024","unstructured":"Yichen Zhu, Minjie Zhu, Ning Liu, Zhiyuan Xu, and Yaxin Peng. 2024. Llava-phi: Efficient multi-modal assistant with small language model. In Proceedings of the 1st International Workshop on Efficient Multimedia Computing under Limited. 
18\u201322."}],"container-title":["ACM Transactions on Internet of Things"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3786767","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T16:23:56Z","timestamp":1776788636000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3786767"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,21]]},"references-count":59,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,5,31]]}},"alternative-id":["10.1145\/3786767"],"URL":"https:\/\/doi.org\/10.1145\/3786767","relation":{},"ISSN":["2691-1914","2577-6207"],"issn-type":[{"value":"2691-1914","type":"print"},{"value":"2577-6207","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4,21]]},"assertion":[{"value":"2025-01-25","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-12-12","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-04-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}