{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,16]],"date-time":"2026-05-16T18:26:57Z","timestamp":1778956017434,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":62,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,10,10]],"date-time":"2022-10-10T00:00:00Z","timestamp":1665360000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,10,10]]},"DOI":"10.1145\/3503161.3548255","type":"proceedings-article","created":{"date-parts":[[2022,10,10]],"date-time":"2022-10-10T15:42:35Z","timestamp":1665416555000},"page":"4505-4514","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":36,"title":["Multimodal Hate Speech Detection via Cross-Domain Knowledge Transfer"],"prefix":"10.1145","author":[{"given":"Chuanpeng","family":"Yang","sequence":"first","affiliation":[{"name":"Institute of Information Engineering, Chinese Academy of Sciences &amp; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Fuqing","family":"Zhu","sequence":"additional","affiliation":[{"name":"Institute of Information Engineering, Chinese Academy of Sciences &amp; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Guihua","family":"Liu","sequence":"additional","affiliation":[{"name":"Institute of Information Engineering, Chinese Academy of Sciences &amp; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jizhong","family":"Han","sequence":"additional","affiliation":[{"name":"Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Songlin","family":"Hu","sequence":"additional","affiliation":[{"name":"Institute of Information Engineering, Chinese Academy of Sciences &amp; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,10,10]]},"reference":[{"key":"e_1_3_2_2_1_1","volume-title":"Representation learning: A review and new perspectives","author":"Bengio Yoshua","year":"2013","unstructured":"Yoshua Bengio , Aaron Courville , and Pascal Vincent . 2013. Representation learning: A review and new perspectives . IEEE Transactions on Pattern Analysis and Machine Intelligence ( 2013 ), 1798--1828. Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2013. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence (2013), 1798--1828."},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553380"},{"key":"e_1_3_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1239"},{"key":"e_1_3_2_2_4_1","volume-title":"From Big to Small: Adaptive Learning to Partial-Set Domains","author":"Cao Zhangjie","year":"2022","unstructured":"Zhangjie Cao , Kaichao You , Ziyang Zhang , Jianmin Wang , and Mingsheng Long . 2022. From Big to Small: Adaptive Learning to Partial-Set Domains . IEEE Transactions on Pattern Analysis and Machine Intelligence ( 2022 ). Zhangjie Cao, Kaichao You, Ziyang Zhang, Jianmin Wang, and Mingsheng Long. 2022. From Big to Small: Adaptive Learning to Partial-Set Domains. 
IEEE Transactions on Pattern Analysis and Machine Intelligence (2022)."},{"key":"e_1_3_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58577-8_7"},{"key":"e_1_3_2_2_6_1","volume-title":"Japsimar Singh Wahi, and Siyao Li","author":"Das Abhishek","year":"2020","unstructured":"Abhishek Das , Japsimar Singh Wahi, and Siyao Li . 2020 . Detecting hate speech in multi-modal memes. arXiv preprint arXiv:2012.14891 (2020). Abhishek Das, Japsimar Singh Wahi, and Siyao Li. 2020. Detecting hate speech in multi-modal memes. arXiv preprint arXiv:2012.14891 (2020)."},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1301"},{"key":"e_1_3_2_2_8_1","volume-title":"A survey on automatic detection of hate speech in text. ACM Computing Surveys (CSUR)","author":"Fortuna Paula","year":"2018","unstructured":"Paula Fortuna and S\u00e9rgio Nunes . 2018. A survey on automatic detection of hate speech in text. ACM Computing Surveys (CSUR) ( 2018 ), 1--30. Paula Fortuna and S\u00e9rgio Nunes. 2018. A survey on automatic detection of hate speech in text. ACM Computing Surveys (CSUR) (2018), 1--30."},{"key":"e_1_3_2_2_9_1","volume-title":"International Conference on Machine Learning. 1180--1189","author":"Ganin Yaroslav","year":"2015","unstructured":"Yaroslav Ganin and Victor Lempitsky . 2015 . Unsupervised domain adaptation by backpropagation . In International Conference on Machine Learning. 1180--1189 . Yaroslav Ganin and Victor Lempitsky. 2015. Unsupervised domain adaptation by backpropagation. In International Conference on Machine Learning. 1180--1189."},{"key":"e_1_3_2_2_10_1","volume-title":"The Journal of Machine Learning Research (2016)","author":"Ganin Yaroslav","year":"2016","unstructured":"Yaroslav Ganin , Evgeniya Ustinova , Hana Ajakan , Pascal Germain , Hugo Larochelle , Fran\u00e7ois Laviolette , Mario Marchand , and Victor Lempitsky . 2016 . Domain-adversarial training of neural networks . 
The Journal of Machine Learning Research (2016) , 2096--2030. Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, Fran\u00e7ois Laviolette, Mario Marchand, and Victor Lempitsky. 2016. Domain-adversarial training of neural networks. The Journal of Machine Learning Research (2016), 2096--2030."},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3270101.3270103"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-21735-7_6"},{"key":"e_1_3_2_2_14_1","volume-title":"International Conference on Machine Learning. 1989--1998","author":"Hoffman Judy","year":"2018","unstructured":"Judy Hoffman , Eric Tzeng , Taesung Park , Jun-Yan Zhu , Phillip Isola , Kate Saenko , Alexei Efros , and Trevor Darrell . 2018 . Cycada: Cycle-consistent adversarial domain adaptation . In International Conference on Machine Learning. 1989--1998 . Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alexei Efros, and Trevor Darrell. 2018. Cycada: Cycle-consistent adversarial domain adaptation. In International Conference on Machine Learning. 1989--1998."},{"key":"e_1_3_2_2_15_1","volume-title":"Categorical reparameterization with gumbel-softmax. arXiv preprint arXiv:1611.01144","author":"Jang Eric","year":"2016","unstructured":"Eric Jang , Shixiang Gu , and Ben Poole . 2016. Categorical reparameterization with gumbel-softmax. arXiv preprint arXiv:1611.01144 ( 2016 ). Eric Jang, Shixiang Gu, and Ben Poole. 2016. Categorical reparameterization with gumbel-softmax. arXiv preprint arXiv:1611.01144 (2016)."},{"key":"e_1_3_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v29i1.9608"},{"key":"e_1_3_2_2_17_1","volume-title":"Bayesian representation learning with oracle constraints. 
arXiv preprint arXiv:1506.05011","author":"Karaletsos Theofanis","year":"2015","unstructured":"Theofanis Karaletsos , Serge Belongie , and Gunnar R\u00e4tsch. 2015. Bayesian representation learning with oracle constraints. arXiv preprint arXiv:1506.05011 ( 2015 ). Theofanis Karaletsos, Serge Belongie, and Gunnar R\u00e4tsch. 2015. Bayesian representation learning with oracle constraints. arXiv preprint arXiv:1506.05011 (2015)."},{"key":"e_1_3_2_2_18_1","volume-title":"Proceedings of NAACL-HLT. 4171--4186","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin , Ming-Wei Chang , Kenton Lee , and Kristina Toutanova . 2019 . BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding . In Proceedings of NAACL-HLT. 4171--4186 . Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL-HLT. 4171--4186."},{"key":"e_1_3_2_2_19_1","volume-title":"Supervised multimodal bitransformers for classifying images and text. arXiv preprint arXiv:1909.02950","author":"Kiela Douwe","year":"2019","unstructured":"Douwe Kiela , Suvrat Bhooshan , Hamed Firooz , Ethan Perez , and Davide Testuggine . 2019. Supervised multimodal bitransformers for classifying images and text. arXiv preprint arXiv:1909.02950 ( 2019 ). Douwe Kiela, Suvrat Bhooshan, Hamed Firooz, Ethan Perez, and Davide Testuggine. 2019. Supervised multimodal bitransformers for classifying images and text. arXiv preprint arXiv:1909.02950 (2019)."},{"key":"e_1_3_2_2_20_1","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems. 2611--2624","author":"Kiela Douwe","year":"2020","unstructured":"Douwe Kiela , Hamed Firooz , Aravind Mohan , Vedanuj Goswami , Amanpreet Singh , Pratik Ringshia , and Davide Testuggine . 2020 . The hateful memes challenge: detecting hate speech in multimodal memes . 
In Proceedings of the International Conference on Neural Information Processing Systems. 2611--2624 . Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet Singh, Pratik Ringshia, and Davide Testuggine. 2020. The hateful memes challenge: detecting hate speech in multimodal memes. In Proceedings of the International Conference on Neural Information Processing Systems. 2611--2624."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/1873951.1874104"},{"key":"e_1_3_2_2_22_1","volume-title":"On information and sufficiency. The Annals of Mathematical Statistics","author":"Kullback Solomon","year":"1951","unstructured":"Solomon Kullback and Richard A Leibler . 1951. On information and sufficiency. The Annals of Mathematical Statistics ( 1951 ), 79--86. Solomon Kullback and Richard A Leibler. 1951. On information and sufficiency. The Annals of Mathematical Statistics (1951), 79--86."},{"key":"e_1_3_2_2_23_1","volume-title":"Proceedings of ACM International Conference on Multimedia. 5138--5147","author":"Ka-Wei Lee Roy","year":"2021","unstructured":"Roy Ka-Wei Lee , Rui Cao , Ziqing Fan , Jing Jiang , and Wen-Haw Chong . 2021 . Disentangling Hate in Online Memes . In Proceedings of ACM International Conference on Multimedia. 5138--5147 . Roy Ka-Wei Lee, Rui Cao, Ziqing Fan, Jing Jiang, and Wen-Haw Chong. 2021. Disentangling Hate in Online Memes. In Proceedings of ACM International Conference on Multimedia. 5138--5147."},{"key":"e_1_3_2_2_24_1","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems. 1978--1992","author":"Li Junnan","year":"2021","unstructured":"Junnan Li , Ramprasaath Selvaraju , Akhilesh Gotmare , Shafiq Joty , Caiming Xiong , and Steven Chu Hong Hoi . 2021 . Align before Fuse: Vision and Language Representation Learning with Momentum Distillation . In Proceedings of the International Conference on Neural Information Processing Systems. 1978--1992 . 
Junnan Li, Ramprasaath Selvaraju, Akhilesh Gotmare, Shafiq Joty, Caiming Xiong, and Steven Chu Hong Hoi. 2021. Align before Fuse: Vision and Language Representation Learning with Momentum Distillation. In Proceedings of the International Conference on Neural Information Processing Systems. 1978--1992."},{"key":"e_1_3_2_2_25_1","volume-title":"Visualbert: A simple and performant baseline for vision and language. arXiv preprint arXiv:1908.03557","author":"Li Liunian Harold","year":"2019","unstructured":"Liunian Harold Li , Mark Yatskar , Da Yin , Cho-Jui Hsieh , and Kai-Wei Chang . 2019 . Visualbert: A simple and performant baseline for vision and language. arXiv preprint arXiv:1908.03557 (2019). Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, and Kai-Wei Chang. 2019. Visualbert: A simple and performant baseline for vision and language. arXiv preprint arXiv:1908.03557 (2019)."},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"e_1_3_2_2_27_1","volume-title":"A multimodal framework for the detection of hateful memes. arXiv preprint arXiv:2012.12871","author":"Lippe Phillip","year":"2020","unstructured":"Phillip Lippe , Nithin Holla , Shantanu Chandra , Santhosh Rajamanickam , Georgios Antoniou , Ekaterina Shutova , and Helen Yannakoudakis . 2020. A multimodal framework for the detection of hateful memes. arXiv preprint arXiv:2012.12871 ( 2020 ). Phillip Lippe, Nithin Holla, Shantanu Chandra, Santhosh Rajamanickam, Georgios Antoniou, Ekaterina Shutova, and Helen Yannakoudakis. 2020. A multimodal framework for the detection of hateful memes. arXiv preprint arXiv:2012.12871 (2020)."},{"key":"e_1_3_2_2_28_1","volume-title":"International Conference on Machine Learning. 97--105","author":"Long Mingsheng","year":"2015","unstructured":"Mingsheng Long , Yue Cao , Jianmin Wang , and Michael Jordan . 2015 . Learning transferable features with deep adaptation networks . In International Conference on Machine Learning. 
97--105 . Mingsheng Long, Yue Cao, Jianmin Wang, and Michael Jordan. 2015. Learning transferable features with deep adaptation networks. In International Conference on Machine Learning. 97--105."},{"key":"e_1_3_2_2_29_1","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems. 1647--1657","author":"Long Mingsheng","year":"2018","unstructured":"Mingsheng Long , Zhangjie Cao , Jianmin Wang , and Michael I Jordan . 2018 . Conditional adversarial domain adaptation . In Proceedings of the International Conference on Neural Information Processing Systems. 1647--1657 . Mingsheng Long, Zhangjie Cao, Jianmin Wang, and Michael I Jordan. 2018. Conditional adversarial domain adaptation. In Proceedings of the International Conference on Neural Information Processing Systems. 1647--1657."},{"key":"e_1_3_2_2_30_1","volume-title":"International Conference on Machine Learning. 2208--2217","author":"Long Mingsheng","year":"2017","unstructured":"Mingsheng Long , Han Zhu , Jianmin Wang , and Michael I Jordan . 2017 . Deep transfer learning with joint adaptation networks . In International Conference on Machine Learning. 2208--2217 . Mingsheng Long, Han Zhu, Jianmin Wang, and Michael I Jordan. 2017. Deep transfer learning with joint adaptation networks. In International Conference on Machine Learning. 2208--2217."},{"key":"e_1_3_2_2_31_1","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems. 13--23","author":"Lu Jiasen","year":"2019","unstructured":"Jiasen Lu , Dhruv Batra , Devi Parikh , and Stefan Lee . 2019 . ViLBERT: pretraining task-agnostic visiolinguistic representations for vision-and-language tasks . In Proceedings of the International Conference on Neural Information Processing Systems. 13--23 . Jiasen Lu, Dhruv Batra, Devi Parikh, and Stefan Lee. 2019. ViLBERT: pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. 
In Proceedings of the International Conference on Neural Information Processing Systems. 13--23."},{"key":"e_1_3_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403091"},{"key":"e_1_3_2_2_33_1","volume-title":"Detecting hate speech in social media. arXiv preprint arXiv:1712.06427","author":"Malmasi Shervin","year":"2017","unstructured":"Shervin Malmasi and Marcos Zampieri . 2017. Detecting hate speech in social media. arXiv preprint arXiv:1712.06427 ( 2017 ). Shervin Malmasi and Marcos Zampieri. 2017. Detecting hate speech in social media. arXiv preprint arXiv:1712.06427 (2017)."},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v35i17.17745"},{"key":"e_1_3_2_2_35_1","volume-title":"Vilio: state-of-the-art Visio-Linguistic models applied to hateful memes. arXiv preprint arXiv:2012.07788","author":"Muennighoff Niklas","year":"2020","unstructured":"Niklas Muennighoff . 2020. Vilio: state-of-the-art Visio-Linguistic models applied to hateful memes. arXiv preprint arXiv:2012.07788 ( 2020 ). Niklas Muennighoff. 2020. Vilio: state-of-the-art Visio-Linguistic models applied to hateful memes. arXiv preprint arXiv:2012.07788 (2020)."},{"key":"e_1_3_2_2_36_1","unstructured":"Hongliang Pan Zheng Lin Peng Fu Yatao Qi and Weiping Wang. 2020. Modeling intra and inter-modality incongruity for multi-modal sarcasm detection. In Findings of the Association for Computational Linguistics: EMNLP. 1383--1392.  Hongliang Pan Zheng Lin Peng Fu Yatao Qi and Weiping Wang. 2020. Modeling intra and inter-modality incongruity for multi-modal sarcasm detection. In Findings of the Association for Computational Linguistics: EMNLP. 
1383--1392."},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11767"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1119"},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1043"},{"key":"e_1_3_2_2_40_1","volume-title":"Detecting hateful memes using a multimodal deep ensemble. arXiv preprint arXiv:2012.13235","author":"Sandulescu Vlad","year":"2020","unstructured":"Vlad Sandulescu . 2020. Detecting hateful memes using a multimodal deep ensemble. arXiv preprint arXiv:2012.13235 ( 2020 ). Vlad Sandulescu. 2020. Detecting hateful memes using a multimodal deep ensemble. arXiv preprint arXiv:2012.13235 (2020)."},{"key":"e_1_3_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W17-1101"},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1238"},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33014951"},{"key":"e_1_3_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00129"},{"key":"e_1_3_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-49409-8_35"},{"key":"e_1_3_2_2_46_1","volume-title":"Proceedings of the Workshop on Trolling, Aggression and Cyberbullying. 32--41","author":"Suryawanshi Shardul","year":"2020","unstructured":"Shardul Suryawanshi , Bharathi Raja Chakravarthi , Mihael Arcan , and Paul Buitelaar . 2020 . Multimodal meme dataset (MultiOFF) for identifying offensive content in image and text . In Proceedings of the Workshop on Trolling, Aggression and Cyberbullying. 32--41 . Shardul Suryawanshi, Bharathi Raja Chakravarthi, Mihael Arcan, and Paul Buitelaar. 2020. Multimodal meme dataset (MultiOFF) for identifying offensive content in image and text. In Proceedings of the Workshop on Trolling, Aggression and Cyberbullying. 
32--41."},{"key":"e_1_3_2_2_47_1","volume-title":"Siu Cheung Hui, and Jian Su.","author":"Tay Yi","year":"2018","unstructured":"Yi Tay , Anh Tuan Luu , Siu Cheung Hui, and Jian Su. 2018 . Reasoning with Sarcasm by Reading In-Between. In Proceedings of the Association for Computational Linguistics . 1010--1020. Yi Tay, Anh Tuan Luu, Siu Cheung Hui, and Jian Su. 2018. Reasoning with Sarcasm by Reading In-Between. In Proceedings of the Association for Computational Linguistics. 1010--1020."},{"key":"e_1_3_2_2_48_1","volume-title":"Jie Fu, Minh C Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, and Aston Zhang.","author":"Tay Yi","year":"2019","unstructured":"Yi Tay , Shuohang Wang , Anh Tuan Luu , Jie Fu, Minh C Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, and Aston Zhang. 2019 . Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives. In Proceedings of the Association for Computational Linguistics . 4922--4931. Yi Tay, Shuohang Wang, Anh Tuan Luu, Jie Fu, Minh C Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, and Aston Zhang. 2019. Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives. In Proceedings of the Association for Computational Linguistics. 4922--4931."},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.316"},{"key":"e_1_3_2_2_50_1","volume-title":"Deep domain confusion: Maximizing for domain invariance. arXiv preprint arXiv:1412.3474","author":"Tzeng Eric","year":"2014","unstructured":"Eric Tzeng , Judy Hoffman , Ning Zhang , Kate Saenko , and Trevor Darrell . 2014. Deep domain confusion: Maximizing for domain invariance. arXiv preprint arXiv:1412.3474 ( 2014 ). Eric Tzeng, Judy Hoffman, Ning Zhang, Kate Saenko, and Trevor Darrell. 2014. Deep domain confusion: Maximizing for domain invariance. 
arXiv preprint arXiv:1412.3474 (2014)."},{"key":"e_1_3_2_2_51_1","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems. 6000--6010","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani , Noam Shazeer , Niki Parmar , Jakob Uszkoreit , Llion Jones , Aidan N Gomez , \u0141ukasz Kaiser , and Illia Polosukhin . 2017 . Attention is all you need . In Proceedings of the International Conference on Neural Information Processing Systems. 6000--6010 . Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the International Conference on Neural Information Processing Systems. 6000--6010."},{"key":"e_1_3_2_2_52_1","volume-title":"Detecting hate speech in memes using multimodal deep learning approaches: Prize-winning solution to hateful memes challenge. arXiv preprint arXiv:2012.12975","author":"Velioglu Riza","year":"2020","unstructured":"Riza Velioglu and Jewgeni Rose . 2020. Detecting hate speech in memes using multimodal deep learning approaches: Prize-winning solution to hateful memes challenge. arXiv preprint arXiv:2012.12975 ( 2020 ). Riza Velioglu and Jewgeni Rose. 2020. Detecting hate speech in memes using multimodal deep learning approaches: Prize-winning solution to hateful memes challenge. 
arXiv preprint arXiv:2012.12975 (2020)."},{"key":"e_1_3_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3463032"},{"key":"e_1_3_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11536"},{"key":"e_1_3_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.nlpbt-1.3"},{"key":"e_1_3_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N16-2013"},{"key":"e_1_3_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.349"},{"key":"e_1_3_2_2_58_1","volume-title":"Hateful memes detection via complementary visual and linguistic networks. arXiv preprint arXiv:2012.04977","author":"Zhang Weibo","year":"2020","unstructured":"Weibo Zhang , Guihua Liu , Zhuohua Li , and Fuqing Zhu . 2020. Hateful memes detection via complementary visual and linguistic networks. arXiv preprint arXiv:2012.04977 ( 2020 ). Weibo Zhang, Guihua Liu, Zhuohua Li, and Fuqing Zhu. 2020. Hateful memes detection via complementary visual and linguistic networks. arXiv preprint arXiv:2012.04977 (2020)."},{"key":"e_1_3_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICMEW53276.2021.9455994"},{"key":"e_1_3_2_2_60_1","volume-title":"Enhance multimodal transformer with external label and in-domain pretrain: Hateful meme challenge winning solution. arXiv preprint arXiv:2012.08290","author":"Zhu Ron","year":"2020","unstructured":"Ron Zhu . 2020. Enhance multimodal transformer with external label and in-domain pretrain: Hateful meme challenge winning solution. arXiv preprint arXiv:2012.08290 ( 2020 ). Ron Zhu. 2020. Enhance multimodal transformer with external label and in-domain pretrain: Hateful meme challenge winning solution. arXiv preprint arXiv:2012.08290 (2020)."},{"key":"e_1_3_2_2_61_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33015989"},{"key":"e_1_3_2_2_62_1","volume-title":"Multi-representation adaptation network for cross-domain image classification. 
Neural Networks","author":"Zhu Yongchun","year":"2019","unstructured":"Yongchun Zhu , Fuzhen Zhuang , Jindong Wang , Jingwu Chen , Zhiping Shi , Wenjuan Wu , and Qing He. 2019b. Multi-representation adaptation network for cross-domain image classification. Neural Networks ( 2019 ), 214--221. Yongchun Zhu, Fuzhen Zhuang, Jindong Wang, Jingwu Chen, Zhiping Shi, Wenjuan Wu, and Qing He. 2019b. Multi-representation adaptation network for cross-domain image classification. Neural Networks (2019), 214--221."}],"event":{"name":"MM '22: The 30th ACM International Conference on Multimedia","location":"Lisboa Portugal","acronym":"MM '22","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 30th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3503161.3548255","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3503161.3548255","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:42Z","timestamp":1750186842000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3503161.3548255"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,10]]},"references-count":62,"alternative-id":["10.1145\/3503161.3548255","10.1145\/3503161"],"URL":"https:\/\/doi.org\/10.1145\/3503161.3548255","relation":{},"subject":[],"published":{"date-parts":[[2022,10,10]]},"assertion":[{"value":"2022-10-10","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}