{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,16]],"date-time":"2026-03-16T17:26:04Z","timestamp":1773681964300,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":66,"publisher":"ACM","license":[{"start":{"date-parts":[[2021,10,17]],"date-time":"2021-10-17T00:00:00Z","timestamp":1634428800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"the National Key Research and Development Program of China","award":["2018YFB1800204"],"award-info":[{"award-number":["2018YFB1800204"]}]},{"name":"the National Natural Science Foundation of China","award":["61771273;61972188"],"award-info":[{"award-number":["61771273;61972188"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2021,10,17]]},"DOI":"10.1145\/3474085.3475217","type":"proceedings-article","created":{"date-parts":[[2021,10,18]],"date-time":"2021-10-18T06:09:05Z","timestamp":1634537345000},"page":"2995-3004","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":9,"title":["WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations"],"prefix":"10.1145","author":[{"given":"Peidong","family":"Liu","sequence":"first","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"given":"Zibin","family":"He","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"given":"Xiyu","family":"Yan","sequence":"additional","affiliation":[{"name":"Tsinghua University, Beijing, China"}]},{"given":"Yong","family":"Jiang","sequence":"additional","affiliation":[{"name":"Tsinghua University & Peng Cheng Laboratory, Beijing;Shenzhen, China"}]},{"given":"Shu-Tao","family":"Xia","sequence":"additional","affiliation":[{"name":"Tsinghua University & Peng 
Cheng Laboratory, Beijing;Shenzhen, China"}]},{"given":"Feng","family":"Zheng","sequence":"additional","affiliation":[{"name":"Southern University of Science and Technology, Shenzhen, China"}]},{"given":"Hu","family":"Maowei","sequence":"additional","affiliation":[{"name":"Tsinghua University & Shenzhen Rejoice Sport Tech. Co., LTD, Beijing;Shenzhen, China"}]}],"member":"320","published-online":{"date-parts":[[2021,10,17]]},"reference":[{"key":"e_1_3_2_2_1_1","doi-asserted-by":"crossref","unstructured":"Amy Bearman Olga Russakovsky Vittorio Ferrari and Li Fei-Fei. 2016. What's the point: Semantic segmentation with point supervision. In ECCV. 549--565.","DOI":"10.1007\/978-3-319-46478-7_34"},{"key":"e_1_3_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2008.04.005"},{"key":"e_1_3_2_2_3_1","volume-title":"Robinson Piramuthu, and Ming Hsuan Yang.","author":"Chang Yu Ting","year":"2020","unstructured":"Yu Ting Chang, Qiaosong Wang, Wei Chih Hung, Robinson Piramuthu, and Ming Hsuan Yang. 2020. Weakly-Supervised Semantic Segmentation via Sub-category Exploration. In CVPR."},{"key":"e_1_3_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.5555\/3294771.3294842"},{"key":"e_1_3_2_2_5_1","volume-title":"Bowen Cheng, Maxwell D Collins, Ekin D Cubuk, Barret Zoph, Hartwig Adam, and Jonathon Shlens.","author":"Chen Liang-Chieh","year":"2020","unstructured":"
Liang-Chieh Chen, Raphael Gontijo Lopes, Bowen Cheng, Maxwell D Collins, Ekin D Cubuk, Barret Zoph, Hartwig Adam, and Jonathon Shlens. 2020. Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation. arXiv preprint arXiv:2005.10266 (2020)."},{"key":"e_1_3_2_2_6_1","doi-asserted-by":"crossref","unstructured":"Liang-Chieh Chen Yukun Zhu George Papandreou Florian Schroff and Hartwig Adam. 2018. Encoder-decoder with atrous separable convolution for semantic image segmentation. In ECCV. 801--818.","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"e_1_3_2_2_7_1","doi-asserted-by":"crossref","unstructured":"\u00d6zg\u00fcn \u00c7i\u00e7ek Ahmed Abdulkadir Soeren S Lienkamp Thomas Brox and Olaf Ronneberger. 2016. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In MICCAI. 424--432.","DOI":"10.1007\/978-3-319-46723-8_49"},{"key":"e_1_3_2_2_8_1","doi-asserted-by":"crossref","unstructured":"Marius Cordts Mohamed Omran Sebastian Ramos Timo Rehfeld Markus Enzweiler Rodrigo Benenson Uwe Franke Stefan Roth and Bernt Schiele. 2016. The Cityscapes Dataset for Semantic Urban Scene Understanding. In CVPR. 3213--3223.","DOI":"10.1109\/CVPR.2016.350"},{"key":"e_1_3_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.191"},{"key":"e_1_3_2_2_10_1","volume-title":"Imagenet: A large-scale hierarchical image database. In CVPR. 
248--255.","author":"Deng Jia","year":"2009","unstructured":"Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. Imagenet: A large-scale hierarchical image database. In CVPR. 248--255."},{"key":"e_1_3_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-009-0275-4"},{"key":"e_1_3_2_2_12_1","doi-asserted-by":"crossref","unstructured":"Raghudeep Gadde Varun Jampani and Peter V Gehler. 2017. Semantic video cnns through representation warping. In ICCV. 4453--4462.","DOI":"10.1109\/ICCV.2017.477"},{"key":"e_1_3_2_2_13_1","unstructured":"Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In AISTATS. 249--256."},{"key":"e_1_3_2_2_14_1","volume-title":"Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149","author":"Han Song","year":"2015","unstructured":"Song Han, Huizi Mao, and William J Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv preprint arXiv:1510.00149 (2015)."},{"key":"e_1_3_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2011.6126343"},{"key":"e_1_3_2_2_16_1","unstructured":"Kaiming He Xiangyu Zhang Shaoqing Ren and Jian Sun. 2016. Deep residual learning for image recognition. 
In CVPR. 770--778."},{"key":"e_1_3_2_2_17_1","unstructured":"Yihui He Xiangyu Zhang and Jian Sun. 2017. Channel pruning for accelerating very deep neural networks. In ICCV. 1389--1397."},{"key":"e_1_3_2_2_18_1","volume-title":"Fakd: Feature-Affinity Based Knowledge Distillation for Efficient Image Super-Resolution. In ICIP. 518--522.","author":"He Zibin","year":"2020","unstructured":"Zibin He, Tao Dai, Jian Lu, Yong Jiang, and Shu-Tao Xia. 2020. Fakd: Feature-Affinity Based Knowledge Distillation for Efficient Image Super-Resolution. In ICIP. 518--522."},{"key":"e_1_3_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33013779"},{"key":"e_1_3_2_2_20_1","volume-title":"NeurIPS Workshop.","author":"Hinton Geoffrey","year":"2014","unstructured":"Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. 2014. Distilling the knowledge in a neural network. In NeurIPS Workshop."},{"key":"e_1_3_2_2_21_1","doi-asserted-by":"crossref","unstructured":"Yuenan Hou Zheng Ma Chunxiao Liu and Chen Change Loy. 2019. Learning lightweight lane detection cnns by self attention distillation. In ICCV. 1013--1021.","DOI":"10.1109\/ICCV.2019.00110"},{"key":"e_1_3_2_2_22_1","volume-title":"Mobilenets: Efficient convolutional neural networks for mobile vision applications. 
arXiv preprint arXiv:1704.04861","author":"Howard Andrew G","year":"2017","unstructured":"Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. 2017. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)."},{"key":"e_1_3_2_2_23_1","volume-title":"Weakly supervised semantic image segmentation with self-correcting networks. arXiv preprint arXiv:1811.07073","author":"Ibrahim Mostafa S","year":"2018","unstructured":"Mostafa S Ibrahim, Arash Vahdat, Mani Ranjbar, and William G Macready. 2018. Weakly supervised semantic image segmentation with self-correcting networks. arXiv preprint arXiv:1811.07073 (2018)."},{"key":"e_1_3_2_2_24_1","doi-asserted-by":"crossref","unstructured":"Eddy Ilg Nikolaus Mayer Tonmoy Saikia Margret Keuper Alexey Dosovitskiy and Thomas Brox. 2017. Flownet 2.0: Evolution of optical flow estimation with deep networks. In CVPR. 2462--2470.","DOI":"10.1109\/CVPR.2017.179"},{"key":"e_1_3_2_2_25_1","doi-asserted-by":"crossref","unstructured":"
Anna Khoreva Rodrigo Benenson Jan Hosang Matthias Hein and Bernt Schiele. 2017. Simple does it: Weakly supervised instance and semantic segmentation. In CVPR. 876--885.","DOI":"10.1109\/CVPR.2017.181"},{"key":"e_1_3_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.5555\/3327144.3327200"},{"key":"e_1_3_2_2_27_1","unstructured":"Jungbeom Lee Eunji Kim Sungmin Lee Jangho Lee and Sungroh Yoon. 2019. Frame-to-Frame Aggregation of Active Regions in Web Videos for Weakly Supervised Semantic Segmentation. In ICCV. 6808--6818."},{"key":"e_1_3_2_2_28_1","unstructured":"Wonkyung Lee Junghyup Lee Dohyung Kim and Bumsub Ham. 2020. Learning with Privileged Information for Efficient Image Super-Resolution. In ECCV."},{"key":"e_1_3_2_2_29_1","unstructured":"Quanquan Li Shengying Jin and Junjie Yan. 2017. Mimicking very efficient network for object detection. In CVPR. 6356--6364."},{"key":"e_1_3_2_2_30_1","unstructured":"Yule Li Jianping Shi and Dahua Lin. 2018. Low-latency video semantic segmentation. In CVPR. 5997--6005."},{"key":"e_1_3_2_2_31_1","volume-title":"Scribblesup: Scribble-supervised convolutional networks for semantic segmentation. In CVPR. 3159--3167.","author":"Lin Di","year":"2016","unstructured":"
Di Lin, Jifeng Dai, Jiaya Jia, Kaiming He, and Jian Sun. 2016. Scribblesup: Scribble-supervised convolutional networks for semantic segmentation. In CVPR. 3159--3167."},{"key":"e_1_3_2_2_32_1","volume-title":"2020 b. Deep Flow Collaborative Network for Online Visual Tracking","author":"Liu Peidong","unstructured":"Peidong Liu, Xiyu Yan, Yong Jiang, and Shu-Tao Xia. 2020b. Deep Flow Collaborative Network for Online Visual Tracking. In ICASSP. IEEE, 2598--2602."},{"key":"e_1_3_2_2_33_1","doi-asserted-by":"crossref","unstructured":"Yufan Liu Jiajiong Cao Bing Li Chunfeng Yuan Weiming Hu Yangxi Li and Yunqiang Duan. 2019a. Knowledge distillation via instance relationship graph. In CVPR. 7096--7104.","DOI":"10.1109\/CVPR.2019.00726"},{"key":"e_1_3_2_2_34_1","doi-asserted-by":"crossref","unstructured":"Yifan Liu Ke Chen Chris Liu Zengchang Qin Zhenbo Luo and Jingdong Wang. 2019b. Structured knowledge distillation for semantic segmentation. In CVPR. 2604--2613.","DOI":"10.1109\/CVPR.2019.00271"},{"key":"e_1_3_2_2_35_1","doi-asserted-by":"crossref","unstructured":"Yifan Liu Chunhua Shen Changqian Yu and Jingdong Wang. 2020a. Efficient Semantic Video Segmentation with Per-frame Inference. In ECCV.","DOI":"10.1007\/978-3-030-58607-2_21"},{"key":"e_1_3_2_2_36_1","unstructured":"Zhuang Liu Jianguo Li Zhiqiang Shen Gao Huang Shoumeng Yan and Changshui Zhang. 
2017. Learning efficient convolutional networks through network slimming. In ICCV. 2736--2744."},{"key":"e_1_3_2_2_37_1","doi-asserted-by":"crossref","unstructured":"Jonathan Long Evan Shelhamer and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In CVPR. 3431--3440.","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_3_2_2_38_1","doi-asserted-by":"crossref","unstructured":"Kevis-Kokitsi Maninis Sergi Caelles Jordi Pont-Tuset and Luc Van Gool. 2018. Deep extreme cut: From extreme points to object segmentation. In CVPR. 616--625.","DOI":"10.1109\/CVPR.2018.00071"},{"key":"e_1_3_2_2_39_1","doi-asserted-by":"crossref","unstructured":"David Nilsson and Cristian Sminchisescu. 2018. Semantic video segmentation by gated recurrent flow propagation. In CVPR. 6819--6828.","DOI":"10.1109\/CVPR.2018.00713"},{"key":"e_1_3_2_2_40_1","volume-title":"Gated crf loss for weakly supervised semantic image segmentation. arXiv preprint arXiv:1906.04651","author":"Obukhov Anton","year":"2019","unstructured":"Anton Obukhov, Stamatios Georgoulis, Dengxin Dai, and Luc Van Gool. 2019. Gated crf loss for weakly supervised semantic image segmentation. 
arXiv preprint arXiv:1906.04651 (2019)."},{"key":"e_1_3_2_2_41_1","volume-title":"Frank Keller, and Vittorio Ferrari.","author":"Papadopoulos Dim P","year":"2014","unstructured":"Dim P Papadopoulos, Alasdair DF Clarke, Frank Keller, and Vittorio Ferrari. 2014. Training object class detectors from eye tracking data. In ECCV. Springer, 361--376."},{"key":"e_1_3_2_2_42_1","doi-asserted-by":"crossref","unstructured":"Wonpyo Park Dongju Kim Yan Lu and Minsu Cho. 2019. Relational knowledge distillation. In CVPR. 3967--3976.","DOI":"10.1109\/CVPR.2019.00409"},{"key":"e_1_3_2_2_43_1","doi-asserted-by":"crossref","unstructured":"Baoyun Peng Xiao Jin Jiaheng Liu Dongsheng Li Yichao Wu Yu Liu Shunfeng Zhou and Zhaoning Zhang. 2019. Correlation congruence for knowledge distillation. In ICCV. 5007--5016.","DOI":"10.1109\/ICCV.2019.00511"},{"key":"e_1_3_2_2_44_1","volume-title":"Xnor-net: Imagenet classification using binary convolutional neural networks","author":"Rastegari Mohammad","year":"2016","unstructured":"Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. 2016. Xnor-net: Imagenet classification using binary convolutional neural networks. In ECCV. 
Springer, 525--542."},{"key":"e_1_3_2_2_45_1","volume-title":"Antoine Chassang, Carlo Gatta, and Yoshua Bengio.","author":"Romero Adriana","year":"2014","unstructured":"Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, and Yoshua Bengio. 2014. Fitnets: Hints for thin deep nets. arXiv preprint arXiv:1412.6550 (2014)."},{"key":"e_1_3_2_2_46_1","volume-title":"Mathieu Salzmann, Lars Petersson, and Jose M Alvarez.","author":"Saleh Fatemeh Sadat","year":"2017","unstructured":"Fatemeh Sadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, and Jose M Alvarez. 2017. Bringing background into the foreground: making all classes equal in weakly-supervised video semantic segmentation. In ICCV. 2106--2116."},{"key":"e_1_3_2_2_47_1","volume-title":"Mathieu Salzmann, Lars Petersson, and Jose M. Alvarez.","author":"Saleh Fatemeh Sadat","year":"2017","unstructured":"Fatemeh Sadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, and Jose M. Alvarez. 2017. Bringing background into the foreground: Making all classes equal in weakly-supervised video semantic segmentation. In ICCV. 
2125--2135."},{"key":"e_1_3_2_2_48_1","doi-asserted-by":"crossref","unstructured":"Mark Sandler Andrew Howard Menglong Zhu Andrey Zhmoginov and Liang-Chieh Chen. 2018. Mobilenetv2: Inverted residuals and linear bottlenecks. In CVPR. 4510--4520.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"e_1_3_2_2_49_1","doi-asserted-by":"crossref","unstructured":"Meng Tang Abdelaziz Djelouah Federico Perazzi Yuri Boykov and Christopher Schroers. 2018a. Normalized cut loss for weakly-supervised cnn segmentation. In CVPR. 1818--1827.","DOI":"10.1109\/CVPR.2018.00195"},{"key":"e_1_3_2_2_50_1","volume-title":"Christopher Schroers, and Yuri Boykov.","author":"Tang Meng","year":"2018","unstructured":"Meng Tang, Federico Perazzi, Abdelaziz Djelouah, Ismail Ben Ayed, Christopher Schroers, and Yuri Boykov. 2018b. On regularized losses for weakly-supervised cnn segmentation. In ECCV. 507--522."},{"key":"e_1_3_2_2_51_1","unstructured":"Frederick Tung and Greg Mori. 2019. Similarity-preserving knowledge distillation. In ICCV. 1365--1374."},{"key":"e_1_3_2_2_52_1","volume-title":"ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation. CVPR","author":"Vu Tuan Hung","year":"2019","unstructured":"Tuan Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, and Patrick P\u00e9rez. 2019. 
ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation. CVPR (2019)."},{"key":"e_1_3_2_2_53_1","volume-title":"International Journal of Computer Vision","author":"Wang Xiang","year":"2020","unstructured":"Xiang Wang, Sifei Liu, Huimin Ma, and Ming-Hsuan Yang. 2020a. Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning. International Journal of Computer Vision (2020), 1--14."},{"key":"e_1_3_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2019.11.019"},{"key":"e_1_3_2_2_55_1","doi-asserted-by":"crossref","unstructured":"Zian Wang David Acuna Huan Ling Amlan Kar and Sanja Fidler. 2019. Object instance annotation with deep extreme level set evolution. In CVPR. 7500--7508.","DOI":"10.1109\/CVPR.2019.00768"},{"key":"e_1_3_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2016.01.015"},{"key":"e_1_3_2_2_57_1","unstructured":"Jiafeng Xie Bing Shuai Jian-Fang Hu Jingyang Lin and Wei-Shi Zheng. 2018. Improving fast segmentation with teacher-student learning. In BMVC."},{"key":"e_1_3_2_2_58_1","unstructured":"Yu-Syuan Xu Tsu-Jui Fu Hsuan-Kung Yang and Chun-Yi Lee. 2018. 
Dynamic video segmentation network. In CVPR. 6556--6565."},{"key":"e_1_3_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2020.103077"},{"key":"e_1_3_2_2_60_1","doi-asserted-by":"crossref","unstructured":"Chenglin Yang Lingxi Xie Chi Su and Alan L Yuille. 2019. Snapshot distillation: Teacher-student optimization in one generation. In CVPR. 2859--2868.","DOI":"10.1109\/CVPR.2019.00297"},{"key":"e_1_3_2_2_61_1","doi-asserted-by":"crossref","unstructured":"Junho Yim Donggyu Joo Jihoon Bae and Junmo Kim. 2017. A gift from knowledge distillation: Fast optimization network minimization and transfer learning. In CVPR. 4133--4141.","DOI":"10.1109\/CVPR.2017.754"},{"key":"e_1_3_2_2_62_1","volume-title":"Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. arXiv preprint arXiv:1612.03928","author":"Zagoruyko Sergey","year":"2016","unstructured":"Sergey Zagoruyko and Nikos Komodakis. 2016. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. arXiv preprint arXiv:1612.03928 (2016)."},{"key":"e_1_3_2_2_63_1","doi-asserted-by":"crossref","unstructured":"Feng Zhang Xiatian Zhu and Mao Ye. 2019b. Fast human pose estimation. In CVPR. 
3517--3526.","DOI":"10.1109\/CVPR.2019.00363"},{"key":"e_1_3_2_2_64_1","doi-asserted-by":"crossref","unstructured":"Linfeng Zhang Jiebo Song Anni Gao Jingwei Chen Chenglong Bao and Kaisheng Ma. 2019a. Be your own teacher: Improve the performance of convolutional neural networks via self distillation. In ICCV. 3713--3722.","DOI":"10.1109\/ICCV.2019.00381"},{"key":"e_1_3_2_2_65_1","unstructured":"Hengshuang Zhao Jianping Shi Xiaojuan Qi Xiaogang Wang and Jiaya Jia. 2017. Pyramid Scene Parsing Network. In CVPR. 2881--2890."},{"key":"e_1_3_2_2_66_1","unstructured":"Xizhou Zhu Yuwen Xiong Jifeng Dai Lu Yuan and Yichen Wei. 2017. Deep feature flow for video recognition. In CVPR. 
2349--2358."}],"event":{"name":"MM '21: ACM Multimedia Conference","location":"Virtual Event China","acronym":"MM '21","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 29th ACM International Conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3474085.3475217","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3474085.3475217","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:48:16Z","timestamp":1750193296000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3474085.3475217"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,17]]},"references-count":66,"alternative-id":["10.1145\/3474085.3475217","10.1145\/3474085"],"URL":"https:\/\/doi.org\/10.1145\/3474085.3475217","relation":{},"subject":[],"published":{"date-parts":[[2021,10,17]]},"assertion":[{"value":"2021-10-17","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}