{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T18:14:59Z","timestamp":1775326499435,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":44,"publisher":"ACM","license":[{"start":{"date-parts":[[2019,10,15]],"date-time":"2019-10-15T00:00:00Z","timestamp":1571097600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2019,10,15]]},"DOI":"10.1145\/3343031.3351088","type":"proceedings-article","created":{"date-parts":[[2019,10,21]],"date-time":"2019-10-21T16:32:26Z","timestamp":1571675546000},"page":"864-872","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":107,"title":["Black-box Adversarial Attacks on Video Recognition Models"],"prefix":"10.1145","author":[{"given":"Linxi","family":"Jiang","sequence":"first","affiliation":[{"name":"Fudan University, Shanghai, China"}]},{"given":"Xingjun","family":"Ma","sequence":"additional","affiliation":[{"name":"The University of Melbourne, Melbourne, Australia"}]},{"given":"Shaoxiang","family":"Chen","sequence":"additional","affiliation":[{"name":"Fudan University, Shanghai, China"}]},{"given":"James","family":"Bailey","sequence":"additional","affiliation":[{"name":"The University of Melbourne, Melbourne, Australia"}]},{"given":"Yu-Gang","family":"Jiang","sequence":"additional","affiliation":[{"name":"Fudan University and Jilian Technology Group, Shanghai, China"}]}],"member":"320","published-online":{"date-parts":[[2019,10,15]]},"reference":[{"key":"e_1_3_2_1_1_1","unstructured":"Anish Athalye Nicholas Carlini and David Wagner. 2018. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In ICML."},
{"key":"e_1_3_2_1_2_1","doi-asserted-by":"crossref","unstructured":"Arjun Nitin Bhagoji Warren He Bo Li and Dawn Song. 2018. Practical black-box attacks on deep neural networks using efficient query mechanisms. In ECCV.","DOI":"10.1007\/978-3-030-01258-8_10"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"crossref","unstructured":"Nicholas Carlini and David Wagner. 2017. Towards evaluating the robustness of neural networks. In S&P.","DOI":"10.1109\/SP.2017.49"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Joao Carreira and Andrew Zisserman. 2017. Quo vadis action recognition? a new model and the kinetics dataset. In CVPR.","DOI":"10.1109\/CVPR.2017.502"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"crossref","unstructured":"Pin-Yu Chen Yash Sharma Huan Zhang Jinfeng Yi and Cho-Jui Hsieh. 2018. EAD: elastic-net attacks to deep neural networks via adversarial examples. In AAAI.","DOI":"10.1609\/aaai.v32i1.11302"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3128572.3140448"},{"key":"e_1_3_2_1_7_1","volume-title":"Imagenet: A large-scale hierarchical image database. In CVPR.","author":"Deng Jia","year":"2009"},
{"key":"e_1_3_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Kevin Eykholt Ivan Evtimov Earlence Fernandes Bo Li Amir Rahmati Chaowei Xiao Atul Prakash Tadayoshi Kohno and Dawn Song. 2018. Robust Physical-World Attacks on Deep Learning Visual Classification. In CVPR.","DOI":"10.1109\/CVPR.2018.00175"},{"key":"e_1_3_2_1_9_1","unstructured":"Ian J. Goodfellow Jonathon Shlens and Christian Szegedy. 2015. Explaining and harnessing adversarial examples. In ICLR."},{"key":"e_1_3_2_1_10_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR."},{"key":"e_1_3_2_1_11_1","volume-title":"Laurens Van Der Maaten, and Kilian Q Weinberger","author":"Huang Gao","year":"2017"},{"key":"e_1_3_2_1_12_1","unstructured":"Andrew Ilyas Logan Engstrom Anish Athalye and Jessy Lin. 2018. Black-box adversarial attacks with limited queries and information. ICML."},
{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2012.59"},{"key":"e_1_3_2_1_14_1","article-title":"DeepProduct: Mobile Product Search With Portable Deep Features","volume":"14","author":"Jiang Yu-Gang","year":"2018","journal-title":"ACM Transactions on Multimedia Computing, Communications and Applications (TOMM)"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2018.2823900"},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2670560"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"crossref","unstructured":"Andrej Karpathy George Toderici Sanketh Shetty Thomas Leung Rahul Sukthankar and Li Fei-Fei. 2014. Large-scale video classification with convolutional neural networks. In CVPR.","DOI":"10.1109\/CVPR.2014.223"},{"key":"e_1_3_2_1_18_1","volume-title":"et almbox","author":"Kay Will","year":"2017"},{"key":"e_1_3_2_1_20_1","volume-title":"Hinton","author":"Krizhevsky Alex","year":"2012"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Hildegard Kuehne Hueihan Jhuang Estíbaliz Garrote Tomaso Poggio and Thomas Serre. 2011. HMDB: a large video database for human motion recognition. In ICCV.","DOI":"10.1109\/ICCV.2011.6126543"},{"key":"e_1_3_2_1_22_1","volume-title":"Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533","author":"Kurakin Alexey","year":"2016"},{"key":"e_1_3_2_1_23_1","unstructured":"Yann LeCun Bernhard E Boser John S Denker Donnie Henderson Richard E Howard Wayne E Hubbard and Lawrence D Jackel. 1990. Handwritten digit recognition with a back-propagation network. In NIPS."},
{"key":"e_1_3_2_1_24_1","volume-title":"Amit K Roy Chowdhury, and Ananthram Swami","author":"Li Shasha","year":"2018"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Sheng Liu Zhou Ren and Junsong Yuan. 2018. SibNet: Sibling Convolutional Encoder for Video Captioning. In ACM MM.","DOI":"10.1145\/3240508.3240667"},{"key":"e_1_3_2_1_26_1","unstructured":"Xingjun Ma Bo Li Yisen Wang Sarah M. Erfani Sudanthi Wijewickrema Michael E. Houle Grant Schoenebeck Dawn Song and James Bailey. 2018. Characterizing adversarial subspaces using local intrinsic dimensionality. In ICLR."},{"key":"e_1_3_2_1_27_1","unstructured":"Aleksander Madry Aleksandar Makelov Ludwig Schmidt Dimitris Tsipras and Adrian Vladu. 2018. Towards deep learning models resistant to adversarial attacks. In ICLR."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"crossref","unstructured":"Seyed-Mohsen Moosavi-Dezfooli Alhussein Fawzi Omar Fawzi and Pascal Frossard. 2017. Universal adversarial perturbations. In CVPR.","DOI":"10.1109\/CVPR.2017.17"},
{"key":"e_1_3_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Seyed-Mohsen Moosavi-Dezfooli Alhussein Fawzi and Pascal Frossard. 2016. Deepfool: a simple and accurate method to fool deep neural networks. In CVPR.","DOI":"10.1109\/CVPR.2016.282"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Anh Nguyen Jason Yosinski and Jeff Clune. 2015. Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In CVPR.","DOI":"10.1109\/CVPR.2015.7298640"},{"key":"e_1_3_2_1_31_1","volume-title":"Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277","author":"Papernot Nicolas","year":"2016"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"crossref","unstructured":"Nicolas Papernot Patrick McDaniel Ian Goodfellow Somesh Jha Z Berkay Celik and Ananthram Swami. 2017. Practical black-box attacks against machine learning. In ASIACCS.","DOI":"10.1145\/3052973.3053009"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"crossref","unstructured":"Nicolas Papernot Patrick McDaniel Somesh Jha Matt Fredrikson Z Berkay Celik and Ananthram Swami. 2016b. The limitations of deep learning in adversarial settings. In EuroS&P.","DOI":"10.1109\/EuroSP.2016.36"},
{"key":"e_1_3_2_1_34_1","volume-title":"Targeted nonlinear adversarial perturbations in images and videos. arXiv preprint arXiv:1809.00958","author":"de Castro Roberto","year":"2018"},{"key":"e_1_3_2_1_35_1","volume-title":"Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864","author":"Salimans Tim","year":"2017"},{"key":"e_1_3_2_1_36_1","volume-title":"Amir Roshan Zamir, and Mubarak Shah","author":"Soomro Khurram","year":"2012"},{"key":"e_1_3_2_1_37_1","unstructured":"Christian Szegedy Wojciech Zaremba Ilya Sutskever Joan Bruna Dumitru Erhan Ian Goodfellow and Rob Fergus. 2014. Intriguing properties of neural networks. In ICLR."},{"key":"e_1_3_2_1_38_1","unstructured":"Dimitris Tsipras Shibani Santurkar Logan Engstrom Alexander Turner and Aleksander Madry. 2019. Robustness may be at odds with accuracy. In ICLR."},{"key":"e_1_3_2_1_39_1","unstructured":"Yisen Wang Xingjun Ma James Bailey Jinfeng Yi Bowen Zhou and Quanquan Gu. 2019. On the Convergence and Robustness of Adversarial Training. In ICML. 6586--6595."},{"key":"e_1_3_2_1_40_1","volume-title":"Sparse adversarial perturbations for videos. arXiv preprint arXiv:1803.02536","author":"Wei Xingxing","year":"2018"},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.5555\/2627435.2638566"},{"key":"e_1_3_2_1_42_1","unstructured":"Zuxuan Wu Yu-Gang Jiang Xi Wang Hao Ye and Xiangyang Xue. 2016. Multi-stream multi-class fusion of deep networks for video classification. In ACM MM."},
{"key":"e_1_3_2_1_43_1","doi-asserted-by":"crossref","unstructured":"Ziwei Yang Yahong Han and Zheng Wang. 2017. Catching the temporal regions-of-interest for video captioning. In ACM MM.","DOI":"10.1145\/3123266.3123327"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"crossref","unstructured":"Joe Yue-Hei Ng Matthew Hausknecht Sudheendra Vijayanarasimhan Oriol Vinyals Rajat Monga and George Toderici. 2015. Beyond short snippets: Deep networks for video classification. In CVPR.","DOI":"10.1109\/CVPR.2015.7299101"},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"crossref","unstructured":"Rui-Wei Zhao Zuxuan Wu Jianguo Li and Yu-Gang Jiang. 2017. Learning semantic feature map for visual content recognition. In ACM MM.","DOI":"10.1145\/3123266.3123379"}],
"event":{"name":"MM '19: The 27th ACM International Conference on Multimedia","location":"Nice France","acronym":"MM '19","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 27th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3343031.3351088","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3343031.3351088","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:13:12Z","timestamp":1750201992000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3343031.3351088"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,10,15]]},"references-count":44,"alternative-id":["10.1145\/3343031.3351088","10.1145\/3343031"],"URL":"https:\/\/doi.org\/10.1145\/3343031.3351088","relation":{},"subject":[],"published":{"date-parts":[[2019,10,15]]},"assertion":[{"value":"2019-10-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}