{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T21:54:39Z","timestamp":1775253279485,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":40,"publisher":"ACM","license":[{"start":{"date-parts":[[2018,10,15]],"date-time":"2018-10-15T00:00:00Z","timestamp":1539561600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Natural Science Foundation of China","award":["61672072, 61532003"],"award-info":[{"award-number":["61672072, 61532003"]}]},{"name":"Beijing Nova Program","award":["Z181100006218063"],"award-info":[{"award-number":["Z181100006218063"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2018,10,15]]},"DOI":"10.1145\/3240508.3240540","type":"proceedings-article","created":{"date-parts":[[2018,10,18]],"date-time":"2018-10-18T13:52:08Z","timestamp":1539870728000},"page":"474-482","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Collaborative Annotation of Semantic Objects in Images with Multi-granularity Supervisions"],"prefix":"10.1145","author":[{"given":"Lishi","family":"Zhang","sequence":"first","affiliation":[{"name":"Beihang University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chenghan","family":"Fu","sequence":"additional","affiliation":[{"name":"Beihang University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jia","family":"Li","sequence":"additional","affiliation":[{"name":"Beihang University, Beijing, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,10,15]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"crossref","unstructured":"Radhakrishna Achanta and Sabine Susstrunk. 2017. Superpixels and polygons using simple non-iterative clustering. In CVPR. 4895--4904. Radhakrishna Achanta and Sabine Susstrunk. 2017. Superpixels and polygons using simple non-iterative clustering. In CVPR. 4895--4904.","DOI":"10.1109\/CVPR.2017.520"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.49"},{"key":"e_1_3_2_1_3_1","first-page":"2","article-title":"Annotating object instances with a polygon-rnn","volume":"1","author":"Castrejon Llu's","year":"2017","unstructured":"Llu's Castrejon , Kaustav Kundu , Raquel Urtasun , and Sanja Fidler . 2017 . Annotating object instances with a polygon-rnn . In CVPR , Vol. 1. 2 . Llu's Castrejon, Kaustav Kundu, Raquel Urtasun, and Sanja Fidler. 2017. Annotating object instances with a polygon-rnn. In CVPR, Vol. 1. 2.","journal-title":"CVPR"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.312"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.409"},{"key":"e_1_3_2_1_6_1","unstructured":"Yunpeng Chen Jianan Li Huaxin Xiao Xiaojie Jin Shuicheng Yan and Jiashi Feng. 2017. Dual path networks. In NIPS. 4470-- 4478. Yunpeng Chen Jianan Li Huaxin Xiao Xiaojie Jin Shuicheng Yan and Jiashi Feng. 2017. Dual path networks. In NIPS. 4470-- 4478."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"crossref","unstructured":"Zetao Chen Fabiola Maffra Inkyu Sa and Margarita Chli. 2017. Only look once mining distinctive landmarks from ConvNet for visual place recognition. In IROS. 9--16. Zetao Chen Fabiola Maffra Inkyu Sa and Margarita Chli. 2017. Only look once mining distinctive landmarks from ConvNet for visual place recognition. In IROS. 9--16.","DOI":"10.1109\/IROS.2017.8202131"},{"key":"e_1_3_2_1_8_1","doi-asserted-by":"crossref","unstructured":"Marius Cordts Mohamed Omran Sebastian Ramos Timo Rehfeld Markus Enzweiler Rodrigo Benenson Uwe Franke Stefan Roth and Bernt Schiele. 2016. The cityscapes dataset for semantic urban scene understanding. In CVPR. 3213--3223. Marius Cordts Mohamed Omran Sebastian Ramos Timo Rehfeld Markus Enzweiler Rodrigo Benenson Uwe Franke Stefan Roth and Bernt Schiele. 2016. The cityscapes dataset for semantic urban scene understanding. In CVPR. 3213--3223.","DOI":"10.1109\/CVPR.2016.350"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.191"},{"key":"e_1_3_2_1_10_1","volume-title":"Imagenet: A large-scale hierarchical image database. In CVPR. 248--255.","author":"Deng Jia","year":"2009","unstructured":"Jia Deng , Wei Dong , Richard Socher , Li-Jia Li , Kai Li , and Li Fei-Fei . 2009 . Imagenet: A large-scale hierarchical image database. In CVPR. 248--255. Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In CVPR. 248--255."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2011.2173241"},{"key":"e_1_3_2_1_12_1","volume-title":"Kinect v2 for mobile robot navigation: Evaluation and modeling","author":"Fankhauser Peter","unstructured":"Peter Fankhauser , Michael Bloesch , Diego Rodriguez , Ralf Kaestner , Marco Hutter , and Roland Siegwart . 2015. Kinect v2 for mobile robot navigation: Evaluation and modeling . In IEEE ICRA. 388--394. Peter Fankhauser, Michael Bloesch, Diego Rodriguez, Ralf Kaestner, Marco Hutter, and Roland Siegwart. 2015. Kinect v2 for mobile robot navigation: Evaluation and modeling. In IEEE ICRA. 388--394."},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Andreas Geiger Philip Lenz and Raquel Urtasun. 2012. Are we ready for autonomous driving? the kitti vision benchmark suite. In CVPR. 3354--3361. Andreas Geiger Philip Lenz and Raquel Urtasun. 2012. Are we ready for autonomous driving? the kitti vision benchmark suite. In CVPR. 3354--3361.","DOI":"10.1109\/CVPR.2012.6248074"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Stephen Gould. 2012. Multiclass pixel labeling with non-local matching constraints. In CVPR. 2783--2790. Stephen Gould. 2012. Multiclass pixel labeling with non-local matching constraints. In CVPR. 2783--2790.","DOI":"10.1109\/CVPR.2012.6248002"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"crossref","unstructured":"Yash Goyal Tejas Khot Douglas Summers-Stay Dhruv Batra and Devi Parikh. 2017. Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering. In CVPR. 9. Yash Goyal Tejas Khot Douglas Summers-Stay Dhruv Batra and Devi Parikh. 2017. Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering. In CVPR. 9.","DOI":"10.1109\/CVPR.2017.670"},{"key":"e_1_3_2_1_16_1","unstructured":"Suyog Dutt Jain and Kristen Grauman. 2016. Active Image Segmentation Propagation.. In CVPR. 4. Suyog Dutt Jain and Kristen Grauman. 2016. Active Image Segmentation Propagation.. In CVPR. 4."},{"key":"e_1_3_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2011.2161275"},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"crossref","unstructured":"Jun-Sik Kim and Jung-Min Park. 2017. Direct hand manipulation of constrained virtual objects. In IROS. 357--362. Jun-Sik Kim and Jung-Min Park. 2017. Direct hand manipulation of constrained virtual objects. In IROS. 357--362.","DOI":"10.1109\/IROS.2017.8202180"},{"key":"e_1_3_2_1_19_1","volume-title":"Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882","author":"Kim Yoon","year":"2014","unstructured":"Yoon Kim . 2014. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 ( 2014 ). Yoon Kim. 2014. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)."},{"key":"e_1_3_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-33786-4_34"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"crossref","unstructured":"Honglak Lee Alexis Battle Rajat Raina and Andrew Y Ng. 2007. Efficient sparse coding algorithms. In Advances in neural information processing systems. 801--808. Honglak Lee Alexis Battle Rajat Raina and Andrew Y Ng. 2007. Efficient sparse coding algorithms. In Advances in neural information processing systems. 801--808.","DOI":"10.7551\/mitpress\/7503.003.0105"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Ke Li Bharath Hariharan and Jitendra Malik. 2016. Iterative instance segmentation. In CVPR. 3659--3667. Ke Li Bharath Hariharan and Jitendra Malik. 2016. Iterative instance segmentation. In CVPR. 3659--3667.","DOI":"10.1109\/CVPR.2016.398"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.344"},{"key":"e_1_3_2_1_24_1","volume-title":"Piotr Dollar, and C Lawrence Zitnick","author":"Lin Tsung-Yi","year":"2014","unstructured":"Tsung-Yi Lin , Michael Maire , Serge Belongie , James Hays , Pietro Perona , Deva Ramanan , Piotr Dollar, and C Lawrence Zitnick . 2014 . Microsoft coco: Common objects in context. In ECCV. 740--755. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollar, and C Lawrence Zitnick. 2014. Microsoft coco: Common objects in context. In ECCV. 740--755."},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Huitan Mao Jing Xiao Mabel M Zhang and Kostas Daniilidis. 2017. Shape-based object classification and recognition through continuum manipulation. In IROS. 456--463. Huitan Mao Jing Xiao Mabel M Zhang and Kostas Daniilidis. 2017. Shape-based object classification and recognition through continuum manipulation. In IROS. 456--463.","DOI":"10.1109\/IROS.2017.8202193"},{"key":"e_1_3_2_1_26_1","volume-title":"Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781","author":"Mikolov Tomas","year":"2013","unstructured":"Tomas Mikolov , Kai Chen , Greg Corrado , and Jeffrey Dean . 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 ( 2013 ). Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)."},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"crossref","unstructured":"Arsalan Mousavian Dragomir Anguelov John Flynn and Jana Ko'secka. 2017. 3d bounding box estimation using deep learning and geometry. In CVPR. 5632--5640. Arsalan Mousavian Dragomir Anguelov John Flynn and Jana Ko'secka. 2017. 3d bounding box estimation using deep learning and geometry. In CVPR. 5632--5640.","DOI":"10.1109\/CVPR.2017.597"},{"key":"e_1_3_2_1_28_1","unstructured":"Pedro O Pinheiro Ronan Collobert and Piotr Dollar. 2015. Learning to segment object candidates. In NIPS. 1990--1998. Pedro O Pinheiro Ronan Collobert and Piotr Dollar. 2015. Learning to segment object candidates. In NIPS. 1990--1998."},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"crossref","unstructured":"Pedro O Pinheiro Tsung-Yi Lin Ronan Collobert and Piotr Dollar. 2016. Learning to refine object segments. In ECCV. 75-- 91. Pedro O Pinheiro Tsung-Yi Lin Ronan Collobert and Piotr Dollar. 2016. Learning to refine object segments. In ECCV. 75-- 91.","DOI":"10.1007\/978-3-319-46448-0_5"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1015706.1015720"},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-007-0090-8"},{"key":"e_1_3_2_1_33_1","volume-title":"Weakly-supervised image annotation and segmentation with objects and attributes","author":"Shi Zhiyuan","year":"2017","unstructured":"Zhiyuan Shi , Yongxin Yang , Timothy M Hospedales , and Tao Xiang . 2017. Weakly-supervised image annotation and segmentation with objects and attributes . IEEE TPAMI ( 2017 ), 2525--2538. Zhiyuan Shi, Yongxin Yang, Timothy M Hospedales, and Tao Xiang. 2017. Weakly-supervised image annotation and segmentation with objects and attributes. IEEE TPAMI (2017), 2525--2538."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compedu.2010.03.008"},{"key":"e_1_3_2_1_35_1","volume-title":"AAAI","volume":"1","author":"Su Hao","year":"2012","unstructured":"Hao Su , Jia Deng , and Li Fei-Fei . 2012 . Crowdsourcing annotations for visual object detection . In AAAI , Vol. 1 . Hao Su, Jia Deng, and Li Fei-Fei. 2012. Crowdsourcing annotations for visual object detection. In AAAI, Vol. 1."},{"key":"e_1_3_2_1_36_1","doi-asserted-by":"crossref","unstructured":"Jonas Uhrig Marius Cordts Uwe Franke and Thomas Brox. 2016. Pixel-level encoding and depth layering for instance-level semantic labeling. 14--25. Jonas Uhrig Marius Cordts Uwe Franke and Thomas Brox. 2016. Pixel-level encoding and depth layering for instance-level semantic labeling. 14--25.","DOI":"10.1007\/978-3-319-45886-1_2"},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123427"},{"key":"e_1_3_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123364"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123335"},{"key":"e_1_3_2_1_40_1","volume-title":"Target-driven visual navigation in indoor scenes using deep reinforcement learning","author":"Zhu Yuke","unstructured":"Yuke Zhu , Roozbeh Mottaghi , Eric Kolve , Joseph J Lim , Abhinav Gupta , Li Fei-Fei , and Ali Farhadi . 2017. Target-driven visual navigation in indoor scenes using deep reinforcement learning . In IEEE ICRA. 3357--3364. Yuke Zhu, Roozbeh Mottaghi, Eric Kolve, Joseph J Lim, Abhinav Gupta, Li Fei-Fei, and Ali Farhadi. 2017. Target-driven visual navigation in indoor scenes using deep reinforcement learning. In IEEE ICRA. 3357--3364."}],"event":{"name":"MM '18: ACM Multimedia Conference","location":"Seoul Republic of Korea","acronym":"MM '18","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 26th ACM international conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3240508.3240540","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3240508.3240540","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T20:40:07Z","timestamp":1775248807000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3240508.3240540"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,10,15]]},"references-count":40,"alternative-id":["10.1145\/3240508.3240540","10.1145\/3240508"],"URL":"https:\/\/doi.org\/10.1145\/3240508.3240540","relation":{},"subject":[],"published":{"date-parts":[[2018,10,15]]},"assertion":[{"value":"2018-10-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}