{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,23]],"date-time":"2026-02-23T11:05:30Z","timestamp":1771844730396,"version":"3.50.1"},"reference-count":63,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,10,18]],"date-time":"2023-10-18T00:00:00Z","timestamp":1697587200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62172417, 62272461, 62276266"],"award-info":[{"award-number":["62172417, 62272461, 62276266"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Xuzhou Key Research and Development Program","award":["KC22287"],"award-info":[{"award-number":["KC22287"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2024,2,29]]},"abstract":"<jats:p>Deep learning models have been proven to be susceptible to malicious adversarial attacks, which manipulate input images to deceive the model into making erroneous decisions. Consequently, the threat posed to these models serves as a poignant reminder of the necessity to focus on the model security of object segmentation algorithms based on deep learning. However, the current landscape of research on adversarial attacks primarily centers around static images, resulting in a dearth of studies on adversarial attacks targeting Video Object Segmentation (VOS) models. 
Given that a majority of self-supervised VOS models rely on affinity matrices to learn feature representations of video sequences and achieve robust pixel correspondence, our investigation has delved into the impact of adversarial attacks on self-supervised VOS models. In response, we propose an innovative black-box attack method incorporating contrastive loss. This method induces segmentation errors in the model through perturbations in the feature space and the application of a pixel-level loss function. Diverging from conventional gradient-based attack techniques, we adopt an iterative black-box attack strategy that incorporates contrastive loss across the current frame, any two consecutive frames, and multiple frames. Through extensive experimentation conducted on the DAVIS 2016 and DAVIS 2017 datasets using three self-supervised VOS models and one unsupervised VOS model, we unequivocally demonstrate the potent attack efficiency of the black-box approach. Remarkably, the <jats:italic>J&amp;F<\/jats:italic> metric value experiences a significant decline of up to 50.08% post-attack.<\/jats:p>","DOI":"10.1145\/3617502","type":"journal-article","created":{"date-parts":[[2023,8,25]],"date-time":"2023-08-25T11:28:39Z","timestamp":1692962919000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["Black-box Attack against Self-supervised Video Object Segmentation Models with Contrastive Loss"],"prefix":"10.1145","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-8416-8886","authenticated-orcid":false,"given":"Ying","family":"Chen","sequence":"first","affiliation":[{"name":"School of Computer Sciences and Technology, China University of Mining and Technology, and Engineering Research Center of Mine Digitization of Ministry of Education of the Peoples Republic of China, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2734-915X","authenticated-orcid":false,"given":"Rui","family":"Yao","sequence":"additional","affiliation":[{"name":"School of Computer Sciences and Technology, China University of Mining and Technology, and Engineering Research Center of Mine Digitization of Ministry of Education of the Peoples Republic of China, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6207-0299","authenticated-orcid":false,"given":"Yong","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Computer Sciences and Technology, China University of Mining and Technology, and Engineering Research Center of Mine Digitization of Ministry of Education of the Peoples Republic of China, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3564-5090","authenticated-orcid":false,"given":"Jiaqi","family":"Zhao","sequence":"additional","affiliation":[{"name":"School of Computer Sciences and Technology, China University of Mining and Technology, and Engineering Research Center of Mine Digitization of Ministry of Education of the Peoples Republic of China, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2365-6606","authenticated-orcid":false,"given":"Bing","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer Sciences and Technology, China University of Mining and Technology, and Engineering Research Center of Mine Digitization of Ministry of Education of the Peoples Republic of China, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7690-8547","authenticated-orcid":false,"given":"Abdulmotaleb El","family":"Saddik","sequence":"additional","affiliation":[{"name":"School of Electrical Engineering and Computer Science, Multimedia Communications Research Laboratory, University of Ottawa, 
Canada"}]}],"member":"320","published-online":{"date-parts":[[2023,10,18]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.13"},{"key":"e_1_3_1_3_2","doi-asserted-by":"crossref","first-page":"5320","DOI":"10.1109\/CVPR.2017.565","volume-title":"Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917)","author":"Caelles S.","year":"2017","unstructured":"S. Caelles, K. -K. Maninis, J. Pont-Tuset, L. Leal-Taix\u00e9, D. Cremers, and L. Van Gool. 2017. One-shot video object segmentation. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201917). 5320\u20135329."},{"key":"e_1_3_1_4_2","first-page":"9796","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Chen Shixing","year":"2021","unstructured":"Shixing Chen, Xiaohan Nie, David Fan, Dongqing Zhang, Vimal Bhat, and Raffay Hamid. 2021. Shot contrastive self-supervised learning for scene boundary detection. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9796\u20139805."},{"key":"e_1_3_1_5_2","first-page":"1597","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Chen Ting","year":"2020","unstructured":"Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020. A simple framework for contrastive learning of visual representations. In Proceedings of the International Conference on Machine Learning. PMLR, 1597\u20131607."},{"key":"e_1_3_1_6_2","unstructured":"Zedu Chen Bineng Zhong Guorong Li Shengping Zhang Rongrong Ji Zhenjun Tang and Xianxian Li. 2022. SiamBAN: Target-aware tracking with siamese box adaptive network. IEEE Transactions on Pattern Analysis and Machine Intelligence 45 4 (2023) 5158\u20135173."},{"key":"e_1_3_1_7_2","unstructured":"Emily L. Denton and Vighnesh Birodkar. 2017. 
Unsupervised learning of disentangled representations from video. Advances in Neural Information Processing Systems 30 1 (2017) 1\u201310."},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00586"},{"key":"e_1_3_1_9_2","unstructured":"Ian J. Goodfellow Jonathon Shlens and Christian Szegedy. 2015. Explaining and harnessing adversarial examples. International Conference on Learning Representations . 1\u201312."},{"key":"e_1_3_1_10_2","first-page":"9080","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Han Junwei","year":"2018","unstructured":"Junwei Han, Le Yang, Dingwen Zhang, Xiaojun Chang, and Xiaodan Liang. 2018. Reinforcement cutting-agent learning for video object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 9080\u20139089."},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00975"},{"key":"e_1_3_1_12_2","unstructured":"R. Devon Hjelm Alex Fedorov Samuel Lavoie-Marchildon Karan Grewal Phil Bachman Adam Trischler and Yoshua Bengio. 2019. Learning deep representations by mutual information estimation and maximization. International Conference on Learning Representations . 1\u201324."},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2022.01.066"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/JAS.2021.1004210"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00483"},{"key":"e_1_3_1_16_2","first-page":"20791","article-title":"Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability","volume":"33","author":"Inkawhich Nathan","year":"2020","unstructured":"Nathan Inkawhich, Kevin Liang, Binghui Wang, Matthew Inkawhich, Lawrence Carin, and Yiran Chen. 2020. Perturbing across the feature hierarchy to improve standard and strict blackbox attack transferability. 
Advances in Neural Information Processing Systems 33 (2020), 20791\u201320801.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00723"},{"key":"e_1_3_1_18_2","first-page":"19545","article-title":"Space-time correspondence as a contrastive random walk","volume":"33","author":"Jabri Allan","year":"2020","unstructured":"Allan Jabri, Andrew Owens, and Alexei Efros. 2020. Space-time correspondence as a contrastive random walk. Advances in Neural Information Processing Systems 33 (2020), 19545\u201319560.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_19_2","first-page":"19","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"Jiang Huaizu","year":"2018","unstructured":"Huaizu Jiang, Gustav Larsson, Michael Maire, Greg Shakhnarovich, and Erik Learned-Miller. 2018. Self-supervised relative depth learning for urban scene understanding. In Proceedings of the European Conference on Computer Vision (ECCV\u201918). 19\u201335."},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33018545"},{"key":"e_1_3_1_21_2","first-page":"2057","volume-title":"Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","author":"Kim Youngeun","year":"2020","unstructured":"Youngeun Kim, Seokeon Choi, Hankyeol Lee, Taekyung Kim, and Changick Kim. 2020. Rpm-net: Robust pixel-level matching networks for self-supervised video object segmentation. In Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision. 2057\u20132065."},{"key":"e_1_3_1_22_2","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A method for stochastic optimization. International Conference on Learning Representations . 1\u201315."},{"key":"e_1_3_1_23_2","unstructured":"Jyoti Kini Fahad Shahbaz Khan Salman Khan and Mubarak Shah. 2022. 
Self-supervised video object segmentation via cutout prediction and tagging. arXiv:2204.10846. Retrieved from https:\/\/arxiv.org\/abs\/2204.10846"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1201\/9781351251389-8"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00651"},{"key":"e_1_3_1_26_2","first-page":"2278","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Lee Junghyup","year":"2019","unstructured":"Junghyup Lee, Dohyung Kim, Jean Ponce, and Bumsub Ham. 2019. Sfnet: Learning object-aware semantic correspondence. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 2278\u20132287."},{"key":"e_1_3_1_27_2","first-page":"9522","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Li Hanchao","year":"2019","unstructured":"Hanchao Li, Pengfei Xiong, Haoqiang Fan, and Jian Sun. 2019. Dfanet: Deep feature aggregation for real-time semantic segmentation. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 9522\u20139531."},{"key":"e_1_3_1_28_2","first-page":"6526","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Li Siyang","year":"2018","unstructured":"Siyang Li, Bryan Seybold, Alexey Vorobyov, Alireza Fathi, Qin Huang, and C. -C. Jay Kuo. 2018. Instance embedding transfer to unsupervised video object segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6526\u20136535."},{"key":"e_1_3_1_29_2","first-page":"207","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"Li Siyang","year":"2018","unstructured":"Siyang Li, Bryan Seybold, Alexey Vorobyov, Xuejing Lei, and C. -C. Jay Kuo. 2018. Unsupervised video object segmentation with motion-based bilateral networks. 
In Proceedings of the European Conference on Computer Vision (ECCV\u201918). 207\u2013223."},{"key":"e_1_3_1_30_2","unstructured":"Xueting Li Sifei Liu Shalini De Mello Xiaolong Wang Jan Kautz and Ming-Hsuan Yang. 2019. Joint-task self-supervised learning for temporal correspondence. Advances in Neural Information Processing Systems 32 (2019) 1\u201311."},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3267935.3267946"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00374"},{"key":"e_1_3_1_33_2","unstructured":"Aleksander Madry Aleksandar Makelov Ludwig Schmidt Dimitris Tsipras and Adrian Vladu. 2018. Towards deep learning models resistant to adversarial attacks. International Conference on Learning Representations . 1\u201328."},{"issue":"6","key":"e_1_3_1_34_2","doi-asserted-by":"crossref","first-page":"1515","DOI":"10.1109\/TPAMI.2018.2838670","article-title":"Video object segmentation without temporal information","volume":"41","author":"Maninis K. -K.","year":"2018","unstructured":"K. -K. Maninis, Sergi Caelles, Yuhua Chen, Jordi Pont-Tuset, Laura Leal-Taix\u00e9, Daniel Cremers, and Luc Van Gool. 2018. Video object segmentation without temporal information. IEEE Transactions on Pattern Analysis and Machine Intelligence 41, 6 (2018), 1515\u20131530.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.282"},{"key":"e_1_3_1_36_2","unstructured":"Aaron van den Oord Yazhe Li and Oriol Vinyals. 2018. Representation learning with contrastive predictive coding. arXiv:1807.03748. 
Retrieved from https:\/\/arxiv.org\/abs\/1807.03748"},{"key":"e_1_3_1_37_2","unstructured":"Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein Luca Antiga Alban Desmaison Andreas K\u00f6pf Edward Yang Zach DeVito Martin Raison Alykhan Tejani Sasank Chilamkurthy Benoit Steiner Lu Fang Junjie Bai and Soumith Chintala. 2019. Pytorch: An imperative style high-performance deep learning library. Advances in Neural Information Processing Systems 32 (2019) 1\u201312."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.638"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.85"},{"key":"e_1_3_1_40_2","unstructured":"Jordi Pont-Tuset Federico Perazzi Sergi Caelles Pablo Arbel\u00e1ez Alex Sorkine-Hornung and Luc Van Gool. 2017. The 2017 davis challenge on video object segmentation. The 2017 DAVIS Challenge on Video Object Segmentation - CVPR Workshops . 1\u20136."},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58621-8_45"},{"key":"e_1_3_1_42_2","first-page":"6827","article-title":"What makes for good views for contrastive learning?","volume":"33","author":"Tian Yonglong","year":"2020","unstructured":"Yonglong Tian, Chen Sun, Ben Poole, Dilip Krishnan, Cordelia Schmid, and Phillip Isola. 2020. What makes for good views for contrastive learning? Advances in Neural Information Processing Systems 33 (2020), 6827\u20136839.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_43_2","doi-asserted-by":"crossref","unstructured":"Paul Voigtlaender and Bastian Leibe. 2017. Online adaptation of convolutional neural networks for video object segmentation. The 2017 DAVIS Challenge on Video Object Segmentation - CVPR Workshops . 
1\u20136.","DOI":"10.5244\/C.31.116"},{"key":"e_1_3_1_44_2","first-page":"391","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"Vondrick Carl","year":"2018","unstructured":"Carl Vondrick, Abhinav Shrivastava, Alireza Fathi, Sergio Guadarrama, and Kevin Murphy. 2018. Tracking emerges by colorizing videos. In Proceedings of the European Conference on Computer Vision (ECCV\u201918). 391\u2013408."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00252"},{"key":"e_1_3_1_46_2","first-page":"1308","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Wang Ning","year":"2019","unstructured":"Ning Wang, Yibing Song, Chao Ma, Wengang Zhou, Wei Liu, and Houqiang Li. 2019. Unsupervised deep tracking. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1308\u20131317."},{"key":"e_1_3_1_47_2","first-page":"10174","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","author":"Wang Ning","year":"2021","unstructured":"Ning Wang, Wengang Zhou, and Houqiang Li. 2021. Contrastive transformation for self-supervised correspondence learning. In Proceedings of the AAAI Conference on Artificial Intelligence. 10174\u201310182."},{"key":"e_1_3_1_48_2","first-page":"9236","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Wang Wenguan","year":"2019","unstructured":"Wenguan Wang, Xiankai Lu, Jianbing Shen, David J. Crandall, and Ling Shao. 2019. Zero-shot video object segmentation via attentive graph neural networks. In Proceedings of the IEEE International Conference on Computer Vision. 9236\u20139245."},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00754"},{"key":"e_1_3_1_50_2","doi-asserted-by":"crossref","unstructured":"Xingxing Wei Siyuan Liang Ning Chen and Xiaochun Cao. 2019. 
Transferable adversarial attacks for image and video object detection. International Joint Conferences on Artificial Intelligence . 954\u2013960.","DOI":"10.24963\/ijcai.2019\/134"},{"key":"e_1_3_1_51_2","doi-asserted-by":"crossref","unstructured":"Olivia Wiles A. Koepke and Andrew Zisserman. 2018. Self-supervised learning of a facial attribute embedding from video. The British Machine Vision Conference (BMVC) . 1\u201315.","DOI":"10.1109\/ICCVW.2019.00364"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01261-8_41"},{"key":"e_1_3_1_53_2","doi-asserted-by":"crossref","unstructured":"Chaowei Xiao Bo Li Jun-Yan Zhu Warren He Mingyan Liu and Dawn Song. 2018. Generating adversarial examples with adversarial networks. International Joint Conferences on Artificial Intelligence . 3905\u20133911.","DOI":"10.24963\/ijcai.2018\/543"},{"key":"e_1_3_1_54_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.153"},{"key":"e_1_3_1_55_2","first-page":"7177","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Yang Charig","year":"2021","unstructured":"Charig Yang, Hala Lamdouar, Erika Lu, Andrew Zisserman, and Weidi Xie. 2021. Self-supervised video object segmentation by motion grouping. In Proceedings of the IEEE International Conference on Computer Vision. 7177\u20137188."},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00637"},{"key":"e_1_3_1_57_2","first-page":"6731","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision","author":"Yu Ning","year":"2021","unstructured":"Ning Yu, Guilin Liu, Aysegul Dundar, Andrew Tao, Bryan Catanzaro, Larry S. Davis, and Mario Fritz. 2021. Dual contrastive loss and attention for gans. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 
6731\u20136742."},{"issue":"2","key":"e_1_3_1_58_2","doi-asserted-by":"crossref","first-page":"475","DOI":"10.1109\/TPAMI.2018.2881114","article-title":"SPFTN: A joint learning framework for localizing and segmenting objects in weakly labeled videos","volume":"42","author":"Zhang Dingwen","year":"2018","unstructured":"Dingwen Zhang, Junwei Han, Le Yang, and Dong Xu. 2018. SPFTN: A joint learning framework for localizing and segmenting objects in weakly labeled videos. IEEE Transactions on Pattern Analysis and Machine Intelligence 42, 2 (2018), 475\u2013489.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_59_2","doi-asserted-by":"crossref","first-page":"445","DOI":"10.1007\/978-3-030-58583-9_27","volume-title":"Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference","author":"Zhen Mingmin","year":"2020","unstructured":"Mingmin Zhen, Shiwei Li, Lei Zhou, Jiaxiang Shang, Haoan Feng, Tian Fang, and Long Quan. 2020. Learning discriminative feature with crf for unsupervised video object segmentation. In Proceedings of the Computer Vision\u2013ECCV 2020: 16th European Conference. 
Springer, 445\u2013462."},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00911"},{"key":"e_1_3_1_61_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i07.7008"},{"key":"e_1_3_1_62_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01264-9_28"},{"key":"e_1_3_1_63_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.04.090"},{"key":"e_1_3_1_64_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.52"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3617502","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3617502","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T16:45:58Z","timestamp":1750178758000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3617502"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,18]]},"references-count":63,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,2,29]]}},"alternative-id":["10.1145\/3617502"],"URL":"https:\/\/doi.org\/10.1145\/3617502","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,18]]},"assertion":[{"value":"2022-08-23","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-08-20","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-10-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}