{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,25]],"date-time":"2025-10-25T21:25:58Z","timestamp":1761427558512,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":55,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,6,8]],"date-time":"2020-06-08T00:00:00Z","timestamp":1591574400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Science,Technology and Innovation Commission of Shenzhen Municipality","award":["JCYJ20180307151516166"],"award-info":[{"award-number":["JCYJ20180307151516166"]}]},{"name":"Natural Science Foundation of Jiangsu Province","award":["BK20191248"],"award-info":[{"award-number":["BK20191248"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,6,8]]},"DOI":"10.1145\/3372278.3390671","type":"proceedings-article","created":{"date-parts":[[2020,6,2]],"date-time":"2020-06-02T04:35:27Z","timestamp":1591072527000},"page":"26-34","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Human Object Interaction Detection via Multi-level Conditioned Network"],"prefix":"10.1145","author":[{"given":"Xu","family":"Sun","sequence":"first","affiliation":[{"name":"Nanjing University, Nanjing, China"}]},{"given":"Xinwen","family":"Hu","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}]},{"given":"Tongwei","family":"Ren","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, China"}]},{"given":"Gangshan","family":"Wu","sequence":"additional","affiliation":[{"name":"Nanjing University, Nanjing, 
China"}]}],"member":"320","published-online":{"date-parts":[[2020,6,8]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2018.00048"},{"key":"e_1_3_2_1_2_1","volume-title":"HICO: A Benchmark for Recognizing Human-Object Interactions in Images. In IEEE International Conference on Computer Vision. 1017--1025","author":"Chao Yu-Wei","year":"2015","unstructured":"Yu-Wei Chao, Zhan Wang, Yugeng He, Jiaxuan Wang, and Jia Deng. 2015. HICO: A Benchmark for Recognizing Human-Object Interactions in Images. In IEEE International Conference on Computer Vision. 1017--1025."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.254"},{"key":"e_1_3_2_1_4_1","volume-title":"Context-Dependent Diffusion Network for Visual Relationship Detection. In ACM International Conference on Multimedia. 1475--1482","author":"Cui Zhen","year":"2018","unstructured":"Zhen Cui, Chunyan Xu, Wenming Zheng, and Jian Yang. 2018. Context-Dependent Diffusion Network for Visual Relationship Detection. In ACM International Conference on Multimedia. 1475--1482."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3206025.3206047"},{"key":"e_1_3_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.121"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"e_1_3_2_1_8_1","volume-title":"Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer. 
arXiv preprint arXiv:1805.04310","author":"Fang Hao-Shu","year":"2018","unstructured":"Hao-Shu Fang, Guansong Lu, Xiaolin Fang, Jianwen Xie, Yu-Wing Tai, and Cewu Lu. 2018. Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer. arXiv preprint arXiv:1805.04310 (2018)."},{"key":"e_1_3_2_1_9_1","volume-title":"RMPE: Regional Multi-Person Pose Estimation. In IEEE International Conference on Computer Vision. 2334--2343","author":"Fang Hao-Shu","year":"2017","unstructured":"Hao-Shu Fang, Shuqin Xie, Yu-Wing Tai, and Cewu Lu. 2017. RMPE: Regional Multi-Person Pose Estimation. In IEEE International Conference on Computer Vision. 2334--2343."},{"key":"e_1_3_2_1_10_1","volume-title":"iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection. arXiv preprint arXiv:1808.10437","author":"Gao Chen","year":"2018","unstructured":"Chen Gao, Yuliang Zou, and Jia-Bin Huang. 2018. iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection. arXiv preprint arXiv:1808.10437 (2018)."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123442"},{"key":"e_1_3_2_1_12_1","volume-title":"Detecting and Recognizing Human-Object Interactions. In IEEE Conference on Computer Vision and Pattern Recognition. 
8359--8367","author":"Gkioxari Georgia","year":"2018","unstructured":"Georgia Gkioxari, Ross Girshick, Piotr Doll\u00e1r, and Kaiming He. 2018. Detecting and Recognizing Human-Object Interactions. In IEEE Conference on Computer Vision and Pattern Recognition. 8359--8367."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.129"},{"key":"e_1_3_2_1_14_1","volume-title":"Aligning Linguistic Words and Visual Semantic Units for Image Captioning. In ACM International Conference on Multimedia. 765--773","author":"Guo Longteng","year":"2019","unstructured":"Longteng Guo, Jing Liu, Jinhui Tang, Jiangwei Li, Wei Luo, and Hanqing Lu. 2019. Aligning Linguistic Words and Visual Semantic Units for Image Captioning. In ACM International Conference on Multimedia. 765--773."},{"key":"e_1_3_2_1_15_1","volume-title":"Visual Semantic Role Labeling. arXiv preprint arXiv:1505.04474","author":"Gupta Saurabh","year":"2015","unstructured":"Saurabh Gupta and Jitendra Malik. 2015. Visual Semantic Role Labeling. arXiv preprint arXiv:1505.04474 (2015)."},{"key":"e_1_3_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00977"},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.123"},{"key":"e_1_3_2_1_18_1","volume-title":"Deep Residual Learning for Image Recognition. In IEEE Conference on Computer Vision and Pattern Recognition. 
770--778","author":"He Kaiming","year":"2016","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In IEEE Conference on Computer Vision and Pattern Recognition. 770--778."},{"key":"e_1_3_2_1_19_1","volume-title":"Unsupervised Rank-Preserving Hashing for Large-Scale Image Retrieval. In ACM International Conference on Multimedia Retrieval. 192--196","author":"Karaman Svebor","year":"2019","unstructured":"Svebor Karaman, Xudong Lin, Xuefeng Hu, and Shih-Fu Chang. 2019. Unsupervised Rank-Preserving Hashing for Large-Scale Image Retrieval. In ACM International Conference on Multimedia Retrieval. 192--196."},{"key":"e_1_3_2_1_20_1","volume-title":"Compositional Learning for Human Object Interaction. In European Conference on Computer Vision. 234--251","author":"Kato Keizo","year":"2018","unstructured":"Keizo Kato, Yin Li, and Abhinav Gupta. 2018. Compositional Learning for Human Object Interaction. In European Conference on Computer Vision. 234--251."},{"key":"e_1_3_2_1_21_1","unstructured":"Alex Krizhevsky Ilya Sutskever and Geoffrey E Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems. 
1097--1105."},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123366"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911996.2912001"},{"key":"e_1_3_2_1_24_1","volume-title":"Transferable Interactiveness Knowledge for Human-Object Interaction Detection. In IEEE Conference on Computer Vision and Pattern Recognition. 3585--3594","author":"Li Yong-Lu","year":"2019","unstructured":"Yong-Lu Li, Siyuan Zhou, Xijie Huang, Liang Xu, Ze Ma, Hao-Shu Fang, Yanfeng Wang, and Cewu Lu. 2019. Transferable Interactiveness Knowledge for Human-Object Interaction Detection. In IEEE Conference on Computer Vision and Pattern Recognition. 3585--3594."},{"key":"e_1_3_2_1_25_1","volume-title":"Network in Network. arXiv preprint arXiv:1312.4400","author":"Lin Min","year":"2013","unstructured":"Min Lin, Qiang Chen, and Shuicheng Yan. 2013. Network in Network. arXiv preprint arXiv:1312.4400 (2013)."},{"key":"e_1_3_2_1_26_1","volume-title":"Feature Pyramid Networks for Object Detection. In IEEE Conference on Computer Vision and Pattern Recognition. 2117--2125","author":"Lin Tsung-Yi","year":"2017","unstructured":"Tsung-Yi Lin, Piotr Doll\u00e1r, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. 2017. Feature Pyramid Networks for Object Detection. In IEEE Conference on Computer Vision and Pattern Recognition. 
2117--2125."},{"key":"e_1_3_2_1_27_1","volume-title":"Microsoft COCO: Common Objects in Context. In European Conference on Computer Vision. 740--755","author":"Lin Tsung-Yi","year":"2014","unstructured":"Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll\u00e1r, and C Lawrence Zitnick. 2014. Microsoft COCO: Common Objects in Context. In European Conference on Computer Vision. 740--755."},{"key":"e_1_3_2_1_28_1","volume-title":"Context-Aware Visual Policy Network for Sequence-Level Image Captioning. In ACM International Conference on Multimedia. 1416--1424","author":"Liu Daqing","year":"2018","unstructured":"Daqing Liu, Zheng-Jun Zha, Hanwang Zhang, Yongdong Zhang, and Feng Wu. 2018. Context-Aware Visual Policy Network for Sequence-Level Image Captioning. In ACM International Conference on Multimedia. 1416--1424."},{"key":"e_1_3_2_1_29_1","volume-title":"Visual Relationship Detection with Language Priors. In European Conference on Computer Vision. 852--869","author":"Lu Cewu","year":"2016","unstructured":"Cewu Lu, Ranjay Krishna, Michael Bernstein, and Li Fei-Fei. 2016. Visual Relationship Detection with Language Priors. In European Conference on Computer Vision. 
852--869."},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3323873.3325017"},{"key":"e_1_3_2_1_31_1","unstructured":"Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein Luca Antiga et al. 2019. PyTorch: An Imperative Style High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems. 8024--8035."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350925"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11671"},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01240-3_25"},{"key":"e_1_3_2_1_35_1","volume-title":"Video Relation Detection with Spatio-Temporal Graph. In ACM International Conference on Multimedia. 84--93","author":"Qian Xufeng","year":"2019","unstructured":"Xufeng Qian, Yueting Zhuang, Yimeng Li, Shaoning Xiao, Shiliang Pu, and Jun Xiao. 2019. Video Relation Detection with Spatio-Temporal Graph. In ACM International Conference on Multimedia. 84--93."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3078971.3078985"},{"key":"e_1_3_2_1_37_1","volume-title":"Annotating Objects and Relations in User-Generated Videos. In ACM International Conference on Multimedia Retrieval. 279--287","author":"Shang Xindi","year":"2019","unstructured":"Xindi Shang, Donglin Di, Junbin Xiao, Yu Cao, Xun Yang, and Tat-Seng Chua. 2019. 
Annotating Objects and Relations in User-Generated Videos. In ACM International Conference on Multimedia Retrieval. 279--287."},{"key":"e_1_3_2_1_38_1","volume-title":"Video Visual Relation Detection. In ACM International Conference on Multimedia. 1300--1308","author":"Shang Xindi","year":"2017","unstructured":"Xindi Shang, Tongwei Ren, Jingfan Guo, Hanwang Zhang, and Tat-Seng Chua. 2017. Video Visual Relation Detection. In ACM International Conference on Multimedia. 1300--1308."},{"key":"e_1_3_2_1_39_1","volume-title":"Scaling Human-Object Interaction Recognition Through Zero-Shot Learning. In IEEE Winter Conference on Applications of Computer Vision. 1568--1576","author":"Shen Liyue","year":"2018","unstructured":"Liyue Shen, Serena Yeung, Judy Hoffman, Greg Mori, and Li Fei-Fei. 2018. Scaling Human-Object Interaction Recognition Through Zero-Shot Learning. In IEEE Winter Conference on Applications of Computer Vision. 1568--1576."},{"key":"e_1_3_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3356076"},{"key":"e_1_3_2_1_41_1","volume-title":"Hierarchical Visual Relationship Detection. In ACM International Conference on Multimedia. 94--102","author":"Sun Xu","year":"2019","unstructured":"Xu Sun, Yuan Zi, Tongwei Ren, Jinhui Tang, and Gangshan Wu. 2019 b. Hierarchical Visual Relationship Detection. 
In ACM International Conference on Multimedia. 94--102."},{"key":"e_1_3_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/2964284.2967234"},{"key":"e_1_3_2_1_43_1","volume-title":"Pose-aware Multi-level Feature Network for Human Object Interaction Detection. In IEEE International Conference on Computer Vision. 9469--9478","author":"Wan Bo","year":"2019","unstructured":"Bo Wan, Desen Zhou, Yongfei Liu, Rongjie Li, and Xuming He. 2019. Pose-aware Multi-level Feature Network for Human Object Interaction Detection. In IEEE International Conference on Computer Vision. 9469--9478."},{"key":"e_1_3_2_1_44_1","volume-title":"Deep Contextual Attention for Human-Object Interaction Detection. In IEEE International Conference on Computer Vision. 5694--5702","author":"Wang Tiancai","year":"2019","unstructured":"Tiancai Wang, Rao Muhammad Anwer, Muhammad Haris Khan, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, and Jorma Laaksonen. 2019. Deep Contextual Attention for Human-Object Interaction Detection. In IEEE International Conference on Computer Vision. 
5694--5702."},{"key":"e_1_3_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00070"},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00212"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3206025.3206028"},{"key":"e_1_3_2_1_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/3078971.3079037"},{"key":"e_1_3_2_1_49_1","volume-title":"Human-centric Visual Relation Segmentation Using Mask R-CNN and VTransE. In European Conference on Computer Vision. 582--589","author":"Yu Fan","year":"2018","unstructured":"Fan Yu, Xin Tan, Tongwei Ren, and Gangshan Wu. 2018. Human-centric Visual Relation Segmentation Using Mask R-CNN and VTransE. In European Conference on Computer Vision. 582--589."},{"key":"e_1_3_2_1_50_1","volume-title":"Visual Translation Embedding Network for Visual Relation Detection. In IEEE Conference on Computer Vision and Pattern Recognition. 5532--5540","author":"Zhang Hanwang","year":"2017","unstructured":"Hanwang Zhang, Zawlin Kyaw, Shih-Fu Chang, and Tat-Seng Chua. 2017. Visual Translation Embedding Network for Visual Relation Detection. In IEEE Conference on Computer Vision and Pattern Recognition. 5532--5540."},{"key":"e_1_3_2_1_51_1","volume-title":"Visual Relation Detection with Multi-Level Attention. In ACM International Conference on Multimedia. 121--129","author":"Zheng Sipeng","year":"2019","unstructured":"
Sipeng Zheng, Shizhe Chen, and Qin Jin. 2019. Visual Relation Detection with Multi-Level Attention. In ACM International Conference on Multimedia. 121--129."},{"key":"e_1_3_2_1_52_1","volume-title":"Structure Guided Photorealistic Style Transfer. In ACM International Conference on Multimedia Conference. 365--373","author":"Zhi Yuheng","year":"2018","unstructured":"Yuheng Zhi, Huawei Wei, and Bingbing Ni. 2018. Structure Guided Photorealistic Style Transfer. In ACM International Conference on Multimedia Conference. 365--373."},{"key":"e_1_3_2_1_53_1","volume-title":"Visual Relationship Detection with Relative Location Mining. In ACM International Conference on Multimedia. 30--38","author":"Zhou Hao","year":"2019","unstructured":"Hao Zhou, Chongyang Zhang, and Chuanping Hu. 2019. Visual Relationship Detection with Relative Location Mining. In ACM International Conference on Multimedia. 30--38."},{"key":"e_1_3_2_1_54_1","volume-title":"Relation Parsing Neural Network for Human-Object Interaction Detection. In IEEE International Conference on Computer Vision. 843--851","author":"Zhou Penghao","year":"2019","unstructured":"Penghao Zhou and Mingmin Chi. 2019. Relation Parsing Neural Network for Human-Object Interaction Detection. In IEEE International Conference on Computer Vision. 843--851."},{"volume-title":"HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detection. In AAAI Conference on Artificial Intelligence. 
7632--7638","author":"Zhuang Bohan","key":"e_1_3_2_1_55_1","unstructured":"Bohan Zhuang, Qi Wu, Chunhua Shen, Ian Reid, and Anton van den Hengel. 2018. HCVRD: A Benchmark for Large-Scale Human-Centered Visual Relationship Detection. In AAAI Conference on Artificial Intelligence. 7632--7638."}],"event":{"name":"ICMR '20: International Conference on Multimedia Retrieval","sponsor":["SIGMM ACM Special Interest Group on Multimedia"],"location":"Dublin Ireland","acronym":"ICMR '20"},"container-title":["Proceedings of the 2020 International Conference on Multimedia Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3372278.3390671","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3372278.3390671","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:32:10Z","timestamp":1750195930000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3372278.3390671"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,8]]},"references-count":55,"alternative-id":["10.1145\/3372278.3390671","10.1145\/3372278"],"URL":"https:\/\/doi.org\/10.1145\/3372278.3390671","relation":{},"subject":[],"published":{"date-parts":[[2020,6,8]]},"assertion":[{"value":"2020-06-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}