{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,26]],"date-time":"2026-01-26T10:22:11Z","timestamp":1769422931455,"version":"3.49.0"},"reference-count":95,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2023,9,27]],"date-time":"2023-09-27T00:00:00Z","timestamp":1695772800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."],"published-print":{"date-parts":[[2023,9,27]]},"abstract":"<jats:p>Recent advancements in deep learning have shown that multimodal inference can be particularly useful in tasks like autonomous driving, human health, and production line monitoring. However, deploying state-of-the-art multimodal models in distributed IoT systems poses unique challenges since the sensor data from low-cost edge devices can get corrupted, lost, or delayed before reaching the cloud. These problems are magnified in the presence of asymmetric data generation rates from different sensor modalities, wireless network dynamics, or unpredictable sensor behavior, leading to either increased latency or degradation in inference accuracy, which could affect the normal operation of the system with severe consequences like human injury or car accident. In this paper, we propose PATCH, a framework of speculative inference to adapt to these complex scenarios. PATCH serves as a plug-in module in the existing multimodal models, and it enables speculative inference of these off-the-shelf deep learning models. 
PATCH consists of 1) a Masked-AutoEncoder-based cross-modality imputation module to impute missing data using partially-available sensor data, 2) a lightweight feature pair ranking module that effectively limits the searching space for the optimal imputation configuration with low computation overhead, and 3) a data alignment module that aligns multimodal heterogeneous data streams without using accurate timestamp or external synchronization mechanisms. We implement PATCH in nine popular multimodal models using five public datasets and one self-collected dataset. The experimental results show that PATCH achieves up to 13% mean accuracy improvement over the state-of-art method while only using 10% of training data and reducing the training overhead by 73% compared to the original cost of retraining the model.<\/jats:p>","DOI":"10.1145\/3610885","type":"journal-article","created":{"date-parts":[[2023,9,27]],"date-time":"2023-09-27T15:45:03Z","timestamp":1695829503000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["PATCH"],"prefix":"10.1145","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-4330-7098","authenticated-orcid":false,"given":"Juexing","family":"Wang","sequence":"first","affiliation":[{"name":"Michigan State University, East Lansing, Michigan, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9353-9042","authenticated-orcid":false,"given":"Guangjing","family":"Wang","sequence":"additional","affiliation":[{"name":"Michigan State University, East Lansing, Michigan, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7392-3477","authenticated-orcid":false,"given":"Xiao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Michigan State University, East Lansing, Michigan, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-0418-736X","authenticated-orcid":false,"given":"Li","family":"Liu","sequence":"additional","affiliation":[{"name":"Michigan State 
University, East Lansing, Michigan, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3272-5239","authenticated-orcid":false,"given":"Huacheng","family":"Zeng","sequence":"additional","affiliation":[{"name":"Michigan State University, East Lansing, Michigan, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2861-8438","authenticated-orcid":false,"given":"Li","family":"Xiao","sequence":"additional","affiliation":[{"name":"Michigan State University, East Lansing, Michigan, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8159-9072","authenticated-orcid":false,"given":"Zhichao","family":"Cao","sequence":"additional","affiliation":[{"name":"Michigan State University, East Lansing, Michigan, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7419-6240","authenticated-orcid":false,"given":"Lin","family":"Gu","sequence":"additional","affiliation":[{"name":"RIKEN AIP, Tokyo, Tokyo, JAPAN and The University of Tokyo, Tokyo, Tokyo, JAPAN"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0808-2285","authenticated-orcid":false,"given":"Tianxing","family":"Li","sequence":"additional","affiliation":[{"name":"Michigan State University, East Lansing, Michigan, USA"}]}],"member":"320","published-online":{"date-parts":[[2023,9,27]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Church","author":"Aach John","year":"2001","unstructured":"John Aach and George M. Church. 2001. Aligning gene expression time series with time warping algorithms. Bioinformatics 17, 6 (06 2001), 495--508."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 21th international European symposium on artificial neural networks, computational intelligence and machine learning. 437--442","author":"Anguita Davide","year":"2013","unstructured":"Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra Perez, and Jorge Luis Reyes Ortiz. 2013. A public domain dataset for human activity recognition using smartphones. 
In Proceedings of the 21th international European symposium on artificial neural networks, computational intelligence and machine learning. 437--442."},{"key":"e_1_2_1_3_1","volume-title":"Security and privacy issues in deep learning. arXiv preprint arXiv:1807.11655","author":"Bae Ho","year":"2018","unstructured":"Ho Bae, Jaehee Jang, Dahuin Jung, Hyemi Jang, Heonseok Ha, Hyungyu Lee, and Sungroh Yoon. 2018. Security and privacy issues in deep learning. arXiv preprint arXiv:1807.11655 (2018)."},{"key":"e_1_2_1_4_1","volume-title":"Proceedings of ICML workshop on unsupervised and transfer learning. JMLR Workshop and Conference Proceedings, 37--49","author":"Baldi Pierre","year":"2012","unstructured":"Pierre Baldi. 2012. Autoencoders, unsupervised learning, and deep architectures. In Proceedings of ICML workshop on unsupervised and transfer learning. JMLR Workshop and Conference Proceedings, 37--49."},{"key":"e_1_2_1_5_1","volume-title":"CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning. arXiv preprint arXiv:2303.03323","author":"Bansal Hritik","year":"2023","unstructured":"Hritik Bansal, Nishad Singhi, Yu Yang, Fan Yin, Aditya Grover, and Kai-Wei Chang. 2023. CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning. arXiv preprint arXiv:2303.03323 (2023)."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.engappai.2018.11.013"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11265-020-01596-1"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICC45855.2022.9838942"},{"key":"e_1_2_1_9_1","volume-title":"Scaling video analytics on constrained edge nodes. arXiv preprint arXiv:1905.13536","author":"Canel Christopher","year":"2019","unstructured":"Christopher Canel, Thomas Kim, Giulio Zhou, Conglong Li, Hyeontaek Lim, David G Andersen, Michael Kaminsky, and Subramanya R Dulloor. 2019. Scaling video analytics on constrained edge nodes. 
arXiv preprint arXiv:1905.13536 (2019)."},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2012.12.014"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSAC.2019.2927065"},{"key":"e_1_2_1_12_1","volume-title":"Detecting backdoor attacks on deep neural networks by activation clustering. arXiv preprint arXiv:1811.03728","author":"Chen Bryant","year":"2018","unstructured":"Bryant Chen, Wilka Carvalho, Nathalie Baracaldo, Heiko Ludwig, Benjamin Edwards, Taesung Lee, Ian Molloy, and Biplav Srivastava. 2018. Detecting backdoor attacks on deep neural networks by activation clustering. arXiv preprint arXiv:1811.03728 (2018)."},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2018.08.007"},{"key":"e_1_2_1_14_1","volume-title":"TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving. arXiv preprint arXiv:2205.15997","author":"Chitta Kashyap","year":"2022","unstructured":"Kashyap Chitta, Aditya Prakash, Bernhard Jaeger, Zehao Yu, Katrin Renz, and Andreas Geiger. 2022. TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving. arXiv preprint arXiv:2205.15997 (2022)."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ohx.2021.e00191"},{"key":"e_1_2_1_16_1","volume-title":"international conference on machine learning. PMLR, 1310--1320","author":"Cohen Jeremy","year":"2019","unstructured":"Jeremy Cohen, Elan Rosenfeld, and Zico Kolter. 2019. Certified adversarial robustness via randomized smoothing. In international conference on machine learning. PMLR, 1310--1320."},{"key":"e_1_2_1_17_1","volume-title":"Meindert Niemeijer, Michael Abr\u00e0moff, Ana Maria Mendon\u00e7a, and Aur\u00e9lio Campilho.","author":"Costa Pedro","year":"2017","unstructured":"Pedro Costa, Adrian Galdran, Maria Ines Meyer, Meindert Niemeijer, Michael Abr\u00e0moff, Ana Maria Mendon\u00e7a, and Aur\u00e9lio Campilho. 2017. 
End-to-end adversarial retinal image synthesis. IEEE transactions on medical imaging 37, 3 (2017), 781--791."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41467-021-27112-y"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.3390\/s140915760"},{"key":"e_1_2_1_20_1","volume-title":"HyperDense-Net: a hyper-densely connected CNN for multi-modal image segmentation","author":"Dolz Jose","year":"2018","unstructured":"Jose Dolz, Karthik Gopinath, Jing Yuan, Herve Lombaert, Christian Desrosiers, and Ismail Ben Ayed. 2018. HyperDense-Net: a hyper-densely connected CNN for multi-modal image segmentation. IEEE transactions on medical imaging 38, 5 (2018), 1116--1126."},{"key":"e_1_2_1_21_1","volume-title":"An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR","author":"Dosovitskiy Alexey","year":"2021","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR (2021)."},{"key":"e_1_2_1_22_1","volume-title":"Conference on robot learning. PMLR, 1--16","author":"Dosovitskiy Alexey","year":"2017","unstructured":"Alexey Dosovitskiy, German Ros, Felipe Codevilla, Antonio Lopez, and Vladlen Koltun. 2017. CARLA: An open urban driving simulator. In Conference on robot learning. PMLR, 1--16."},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.23919\/FUSION45008.2020.9190246"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2021.107728"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2022.3162397"},{"key":"e_1_2_1_26_1","volume-title":"Deep learning","author":"Goodfellow Ian","unstructured":"Ian Goodfellow, Yoshua Bengio, and Aaron Courville. 2016. Deep learning. 
MIT press."},{"key":"e_1_2_1_27_1","volume-title":"SHAD: Privacy-Friendly Shared Activity Detection and Data Sharing. In 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). IEEE, 109--117","author":"Han Feng","year":"2019","unstructured":"Feng Han, Lan Zhang, Xuanke You, Guangjing Wang, and Xiang-Yang Li. 2019. SHAD: Privacy-Friendly Shared Activity Detection and Data Sharing. In 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems (MASS). IEEE, 109--117."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2906388.2906396"},{"key":"e_1_2_1_29_1","volume-title":"Master's thesis. EECS Department","author":"Hardiman Mark","year":"2015","unstructured":"Mark Hardiman, Ying Ou, Ryan Frazier, Zeyi Lee, and Longxiang Cui. 2015. Project NoScope. Master's thesis. EECS Department, University of California, Berkeley. http:\/\/www2.eecs.berkeley.edu\/Pubs\/TechRpts\/2015\/EECS-2015-51.html"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2009.5178958"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01553"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACII.2019.8925538"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-00536-8_4"},{"key":"e_1_2_1_35_1","volume-title":"An introduction to image synthesis with generative adversarial nets. arXiv preprint arXiv:1803.04469","author":"Huang He","year":"2018","unstructured":"He Huang, Philip S Yu, and Changhu Wang. 2018. An introduction to image synthesis with generative adversarial nets. 
arXiv preprint arXiv:1803.04469 (2018)."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/2534169.2486006"},{"key":"e_1_2_1_37_1","first-page":"12080","article-title":"Metapoison: Practical general-purpose clean-label data poisoning","volume":"33","author":"Huang W Ronny","year":"2020","unstructured":"W Ronny Huang, Jonas Geiping, Liam Fowl, Gavin Taylor, and Tom Goldstein. 2020. Metapoison: Practical general-purpose clean-label data poisoning. Advances in Neural Information Processing Systems 33 (2020), 12080--12091.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1089\/tmj.2014.0028"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.632"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACII.2017.8273601"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/3230543.3230574"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1002\/j.1538-7305.1984.tb00034.x"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.223"},{"key":"e_1_2_1_44_1","volume-title":"International conference on machine learning. PMLR","author":"Kim Taeksoo","year":"2017","unstructured":"Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, and Jiwon Kim. 2017. Learning to discover cross-domain relations with generative adversarial networks. In International conference on machine learning. PMLR, 1857--1865."},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","unstructured":"Diederik P Kingma and Max Welling. 2013. Auto-Encoding Variational Bayes. https:\/\/doi.org\/10.48550\/ARXIV.1312.6114","DOI":"10.48550\/ARXIV.1312.6114"},{"key":"e_1_2_1_46_1","volume-title":"Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25","author":"Krizhevsky Alex","year":"2012","unstructured":"Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 
2012. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems 25 (2012), 1097--1105."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2015.2460697"},{"key":"e_1_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2020.101716"},{"key":"e_1_2_1_49_1","volume-title":"Alice: Towards understanding adversarial learning for joint distribution matching. Advances in neural information processing systems 30","author":"Li Chunyuan","year":"2017","unstructured":"Chunyuan Li, Hao Liu, Changyou Chen, Yuchen Pu, Liqun Chen, Ricardo Henao, and Lawrence Carin. 2017. Alice: Towards understanding adversarial learning for joint distribution matching. Advances in neural information processing systems 30 (2017)."},{"key":"e_1_2_1_50_1","volume-title":"Learning IoT in edge: Deep learning for the Internet of Things with edge computing","author":"Li He","year":"2018","unstructured":"He Li, Kaoru Ota, and Mianxiong Dong. 2018. Learning IoT in edge: Deep learning for the Internet of Things with edge computing. IEEE network 32, 1 (2018), 96--101."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3458864.3467884"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2020.3043716"},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_2_1_54_1","unstructured":"Peizhuo Lv Chang Yue Ruigang Liang Yunfei Yang Shengzhi Zhang Hualong Ma and Kai Chen. 2023. A Data-free Backdoor Injection Approach in Neural Networks. (2023)."},{"key":"e_1_2_1_55_1","volume-title":"A system for clock synchronization in an internet of things. arXiv preprint arXiv:1806.02474","author":"Mani Sathiya Kumaran","year":"2018","unstructured":"Sathiya Kumaran Mani, Ramakrishnan Durairajan, Paul Barford, and Joel Sommers. 2018. A system for clock synchronization in an internet of things. 
arXiv preprint arXiv:1806.02474 (2018)."},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1145\/2663204.2663236"},{"key":"e_1_2_1_57_1","volume-title":"Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784","author":"Mirza Mehdi","year":"2014","unstructured":"Mehdi Mirza and Simon Osindero. 2014. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014)."},{"key":"e_1_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.3390\/s16010115"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2018.8461326"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00700"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2019\/779"},{"key":"e_1_2_1_62_1","volume-title":"Distributed sensor systems: practice and applications","author":"Rashvand Habib F","unstructured":"Habib F Rashvand and Jose M Alcaraz Calero. 2012. Distributed sensor systems: practice and applications. John Wiley & Sons."},{"key":"e_1_2_1_63_1","volume-title":"2011 International Conference on Computer Vision. 2572--2578","author":"Shariat S.","unstructured":"S. Shariat and V. Pavlovic. 2011. Isotonic CCA for sequence alignment and activity recognition. In 2011 International Conference on Computer Vision. 2572--2578."},{"key":"e_1_2_1_64_1","volume-title":"Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556","author":"Simonyan Karen","year":"2014","unstructured":"Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)."},{"key":"e_1_2_1_65_1","doi-asserted-by":"crossref","unstructured":"Yapeng Tian Dingzeyu Li and Chenliang Xu. 2020. Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing. 
In ECCV.","DOI":"10.1007\/978-3-030-58580-8_26"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01216-8_16"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1145\/3161192"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01494"},{"key":"e_1_2_1_69_1","volume-title":"Federated IoT Interaction Vulnerability Analysis. In 2023 IEEE 39th International Conference on Data Engineering (ICDE). IEEE.","author":"Wang Guangjing","year":"2023","unstructured":"Guangjing Wang, Hanqing Guo, Anran Li, Xiaorui Liu, and Qiben Yan. 2023. Federated IoT Interaction Vulnerability Analysis. In 2023 IEEE 39th International Conference on Data Engineering (ICDE). IEEE."},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1145\/3588956"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/PCCC.2018.8710834"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.adhoc.2016.10.003"},{"key":"e_1_2_1_73_1","volume-title":"VSMask: Defending Against Voice Synthesis Attack via Real-Time Predictive Perturbation. arXiv preprint arXiv:2305.05736","author":"Wang Yuanda","year":"2023","unstructured":"Yuanda Wang, Hanqing Guo, Guangjing Wang, Bocheng Chen, and Qiben Yan. 2023. VSMask: Defending Against Voice Synthesis Attack via Real-Time Predictive Perturbation. arXiv preprint arXiv:2305.05736 (2023)."},{"key":"e_1_2_1_74_1","volume-title":"Multimodal generative models for scalable weakly-supervised learning. Advances in Neural Information Processing Systems 31","author":"Wu Mike","year":"2018","unstructured":"Mike Wu and Noah Goodman. 2018. Multimodal generative models for scalable weakly-supervised learning. Advances in Neural Information Processing Systems 31 (2018)."},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00138"},{"key":"e_1_2_1_76_1","volume-title":"Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization. 
In ACM International Conference on Multimedia.","author":"Xu Haoming","year":"2020","unstructured":"Haoming Xu, Runhao Zeng, Qingyao Wu, Mingkui Tan, and Chuang Gan. 2020. Cross-Modal Relation-Aware Networks for Audio-Visual Event Localization. In ACM International Conference on Multimedia."},{"key":"e_1_2_1_77_1","volume-title":"Data Poisoning Attacks Against Multimodal Encoders. arXiv preprint arXiv:2209.15266","author":"Yang Ziqing","year":"2022","unstructured":"Ziqing Yang, Xinlei He, Zheng Li, Michael Backes, Mathias Humbert, Pascal Berrang, and Yang Zhang. 2022. Data Poisoning Attacks Against Multimodal Encoders. arXiv preprint arXiv:2209.15266 (2022)."},{"key":"e_1_2_1_78_1","doi-asserted-by":"publisher","DOI":"10.1145\/3038912.3052577"},{"key":"e_1_2_1_79_1","doi-asserted-by":"publisher","DOI":"10.1145\/3212725.3212729"},{"key":"e_1_2_1_80_1","doi-asserted-by":"publisher","unstructured":"Jongwon Yoon Sayandeep Sen and Joshua Hare. 2012. CRAWDAD dataset wisc\/wiscape (v. 2012-08-03). Downloaded from https:\/\/crawdad.org\/wisc\/wiscape\/20120803. https:\/\/doi.org\/10.15783\/C71C7D","DOI":"10.15783\/C71C7D"},{"key":"e_1_2_1_81_1","volume-title":"Ea-GANs: edge-aware generative adversarial networks for cross-modality MR image synthesis","author":"Yu Biting","year":"2019","unstructured":"Biting Yu, Luping Zhou, Lei Wang, Yinghuan Shi, Jurgen Fripp, and Pierrick Bourgeat. 2019. Ea-GANs: edge-aware generative adversarial networks for cross-modality MR image synthesis. IEEE transactions on medical imaging 38, 7 (2019), 1750--1762."},{"key":"e_1_2_1_82_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.2019.1632079"},{"key":"e_1_2_1_83_1","volume-title":"Timestamp Shift Detection for Synchrophasor Data Based on Similarity Analysis between Relative Phase Angle and Frequency","author":"Yu Wenpeng","year":"2019","unstructured":"Wenpeng Yu, Wenxuan Yao, Xianda Deng, Yinfeng Zhao, and Yilu Liu. 2019. 
Timestamp Shift Detection for Synchrophasor Data Based on Similarity Analysis between Relative Phase Angle and Frequency. IEEE Transactions on Power Delivery (2019)."},{"key":"e_1_2_1_84_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neuroimage.2012.03.059"},{"key":"e_1_2_1_85_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2019.103312"},{"key":"e_1_2_1_86_1","volume-title":"14th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 17). 377--392.","author":"Zhang Haoyu","unstructured":"Haoyu Zhang, Ganesh Ananthanarayanan, Peter Bodik, Matthai Philipose, Paramvir Bahl, and Michael J Freedman. 2017. Live video analytics at scale with approximation and delay-tolerance. In 14th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 17). 377--392."},{"key":"e_1_2_1_87_1","unstructured":"Xuezhou Zhang Xiaojin Zhu and Laurent Lessard. 2020. Online data poisoning attacks. In Learning for Dynamics and Control. PMLR 201--210."},{"key":"e_1_2_1_88_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2020.104042"},{"key":"e_1_2_1_89_1","doi-asserted-by":"crossref","unstructured":"Ce Zhou Qian Li Chen Li Jun Yu Yixin Liu Guangjing Wang Kai Zhang Cheng Ji Qiben Yan Lifang He et al. 2023. A comprehensive survey on pretrained foundation models: A history from bert to chatgpt. arXiv preprint arXiv:2302.09419 (2023).","DOI":"10.1007\/s13042-024-02443-6"},{"key":"e_1_2_1_90_1","volume-title":"2012 IEEE Conference on Computer Vision and Pattern Recognition. 1282--1289","author":"Zhou F.","unstructured":"F. Zhou and F. De la Torre. 2012. Generalized time warping for multi-modal alignment of human motion. In 2012 IEEE Conference on Computer Vision and Pattern Recognition. 1282--1289."},{"key":"e_1_2_1_91_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2414429"},{"key":"e_1_2_1_92_1","volume-title":"Advances in Neural Information Processing Systems 22","author":"Zhou Feng","unstructured":"Feng Zhou and Fernando Torre. 
2009. Canonical Time Warping for Alignment of Human Behavior. In Advances in Neural Information Processing Systems 22, Y. Bengio, D. Schuurmans, J. D. Lafferty, C. K. I. Williams, and A. Culotta (Eds.). Curran Associates, Inc., 2286--2294. http:\/\/papers.nips.cc\/paper\/3728-canonical-time-warping-for-alignment-of-human-behavior.pdf"},{"key":"e_1_2_1_93_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00833"},{"key":"e_1_2_1_94_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.244"},{"key":"e_1_2_1_95_1","doi-asserted-by":"publisher","DOI":"10.1145\/3322205.3311074"}],"container-title":["Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3610885","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3610885","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,28]],"date-time":"2025-07-28T16:26:07Z","timestamp":1753719967000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3610885"}},"subtitle":["A Plug-in Framework of Non-blocking Inference for Distributed Multimodal System"],"short-title":[],"issued":{"date-parts":[[2023,9,27]]},"references-count":95,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,9,27]]}},"alternative-id":["10.1145\/3610885"],"URL":"https:\/\/doi.org\/10.1145\/3610885","relation":{},"ISSN":["2474-9567"],"issn-type":[{"value":"2474-9567","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,27]]},"assertion":[{"value":"2023-09-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}