{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T10:50:04Z","timestamp":1770720604672,"version":"3.49.0"},"reference-count":52,"publisher":"Association for Computing Machinery (ACM)","issue":"2","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62325206, 62532003"],"award-info":[{"award-number":["62325206, 62532003"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Key Research and Development Program of Jiangsu Province","award":["BE2023016-4"],"award-info":[{"award-number":["BE2023016-4"]}]},{"DOI":"10.13039\/501100004608","name":"Natural Science Foundation of Jiangsu Province","doi-asserted-by":"crossref","award":["BK20210595"],"award-info":[{"award-number":["BK20210595"]}],"id":[{"id":"10.13039\/501100004608","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:p>Fake news video detection has become a pressing concern with the growth of short video platforms. However, previous studies have primarily focused on videos with modality-complete data, failing to handle the uncertain missing modality issue in real-world applications. They fall short in two key aspects: (1) Highly coupled feature fusion hinders the model to learn intra- and inter-modality dependencies, making it difficult to form robust multimodal representations when facing uncertain modality missing. 
(2) Excessive reliance on discriminative modality combinations for fake news leads to inferior performance on other modality combinations.<\/jats:p>\n                  <jats:p>\n                    To this end, we propose a novel model for modality-incomplete fake news video detection called MyGO. It contains three modules: (1) Caption-guided Keyframe Attention (CKA) leverages embedded captions to guide feature extraction, adaptively excluding irrelevant frames to enhance the learning of intra-modality dependencies and yield refined modality features. (2) Based on the refined modality features from CKA, the Modality Disentangling Network (MDN) decomposes them into shared and specific parts, effectively capturing fine-grained inter-modality dependencies. These two kinds of dependencies help avoid coupled multimodal fusion and bridge information gaps caused by missing modalities. (3) Furthermore,\n                    <jats:italic toggle=\"yes\">missing prompts<\/jats:italic>\n                    are newly introduced to explicitly mark modality combinations within each news video. By integrating\n                    <jats:italic toggle=\"yes\">missing prompts<\/jats:italic>\n                    with the aforementioned inter-modality dependencies within the Prompt-assisted Modality Aligning (PMA) module, we alleviate over-reliance on discriminative modality combinations and enhance the representation of less discriminative ones. 
Extensive experiments showcase that MyGO achieves 3.79\u20134.85% improvements in accuracy, demonstrating its performance over state-of-the-art approaches under different missing conditions.\n                  <\/jats:p>","DOI":"10.1145\/3785481","type":"journal-article","created":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T16:01:57Z","timestamp":1766073717000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["MyGO: Modality-incomplete Fake News Video Detection via Prompt-assisted Modality Disentangling Model"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-8855-2879","authenticated-orcid":false,"given":"Mingjie","family":"Qiu","sequence":"first","affiliation":[{"name":"School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1209-2817","authenticated-orcid":false,"given":"Zhiyi","family":"Tan","sequence":"additional","affiliation":[{"name":"School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5956-831X","authenticated-orcid":false,"given":"Bing-Kun","family":"Bao","sequence":"additional","affiliation":[{"name":"Nanjing University of Posts and Telecommunications, Nanjing, China"}]}],"member":"320","published-online":{"date-parts":[[2026,2,9]]},"reference":[{"key":"e_1_3_1_2_2","first-page":"663","volume-title":"2021 8th International Conference on Computing for Sustainable Global Development (INDIACom)","author":"Agrawal Ronak","year":"2021","unstructured":"Ronak Agrawal and Dilip Kumar Sharma. 2021. A survey on video-based fake news detection techniques. 
In 2021 8th International Conference on Computing for Sustainable Global Development (INDIACom), 663\u2013669."},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/CICSyN.2012.51"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581783.3612426"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3664647.3680663"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3219819.3219963"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-32248-9_50"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3482212"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2022.01.007"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1423"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00394"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2018.2794996"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2023.3337134"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3474085.3475508"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/AVSS.2018.8639163"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3624748"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2017.7952132"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i05.6307"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3413676"},{"key":"e_1_3_1_20_2","first-page":"18661","volume-title":"Advances in Neural Information Processing Systems","author":"Khosla Prannay","year":"2020","unstructured":"Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. 2020. Supervised contrastive learning. In Advances in Neural Information Processing Systems. H. Larochelle, M. 
Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 18661\u201318673. Retrieved from https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/d89a66c7c80a29b1bdbab0f2a1a94af8-Paper.pdf"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3711866"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3414034"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3548128"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2023.3234553"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2021.3065495"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.eacl-main.14"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMI.2020.2983085"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3091214"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2015.2466088"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v37i12.26689"},{"key":"e_1_3_1_31_2","unstructured":"Peng Qi Yuyang Zhao Yufeng Shen Wei Ji Juan Cao and Tat-Seng Chua. 2023. Two heads are better than one: improving fake news video detection by correlating with neighbors. arXiv:2306.05241. 
Retrieved from https:\/\/arxiv.org\/abs\/2306.05241"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3451215"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2023.03.003"},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1038\/s44159-023-00183-y"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/BigData52589.2021.9671928"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3137597.3137600"},{"key":"e_1_3_1_37_2","first-page":"1","volume-title":"3rd International Conference on Learning Representations","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman.. 2015. Very deep convolutional networks for large-scale image recognition. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May7\u20139, 2015, Conference Track Proceedings. Yoshua Bengio and Yann LeCun (Eds.), 1\u201314."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2023.3275586"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3699959"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403234"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3742786"},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01919"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2023.3280555"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2023.101944"},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-30675-4_19"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1145\/3672566"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-99-2356-4_5"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3499026"},{"key":"e_1_3_1_49_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3532064"},{"key":"e_1_3_1_50_2","doi-asserted-b
y":"publisher","DOI":"10.1145\/3664647.3681673"},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1145\/3534678.3539388"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-16443-9_11"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3070752"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3785481","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,9]],"date-time":"2026-02-09T14:58:15Z","timestamp":1770649095000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3785481"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,9]]},"references-count":52,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2026,2,28]]}},"alternative-id":["10.1145\/3785481"],"URL":"https:\/\/doi.org\/10.1145\/3785481","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,9]]},"assertion":[{"value":"2025-06-15","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-11-24","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-02-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}