{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T21:54:11Z","timestamp":1775253251758,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":62,"publisher":"ACM","license":[{"start":{"date-parts":[[2018,10,15]],"date-time":"2018-10-15T00:00:00Z","timestamp":1539561600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2018,10,15]]},"DOI":"10.1145\/3240508.3240511","type":"proceedings-article","created":{"date-parts":[[2018,10,18]],"date-time":"2018-10-18T13:52:08Z","timestamp":1539870728000},"page":"35-44","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":93,"title":["Step-by-step Erasion, One-by-one Collection"],"prefix":"10.1145","author":[{"given":"Jia-Xing","family":"Zhong","sequence":"first","affiliation":[{"name":"Peking University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nannan","family":"Li","sequence":"additional","affiliation":[{"name":"Peking University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weijie","family":"Kong","sequence":"additional","affiliation":[{"name":"Peking University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Peking University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thomas H.","family":"Li","sequence":"additional","affiliation":[{"name":"Peking University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ge","family":"Li","sequence":"additional","affiliation":[{"name":"Peking University, Shenzhen, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,10,15]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.223"},{"key":"e_1_3_2_1_2_1","volume-title":"Weakly Supervised Action Labeling in Videos under Ordering Constraints Computer Vision - ECCV","author":"Bojanowski Piotr","year":"2014","unstructured":"Piotr Bojanowski , R\u00e9mi Lajugie , Francis Bach , Ivan Laptev , Jean Ponce , Cordelia Schmid , and Josef Sivic . 2014. Weakly Supervised Action Labeling in Videos under Ordering Constraints Computer Vision - ECCV 2014 , David Fleet, Tomas Pajdla , Bernt Schiele, and Tinne Tuytelaars (Eds.). Springer International Publishing , Cham, 628--643. Piotr Bojanowski, R\u00e9mi Lajugie, Francis Bach, Ivan Laptev, Jean Ponce, Cordelia Schmid, and Josef Sivic. 2014. Weakly Supervised Action Labeling in Videos under Ordering Constraints Computer Vision - ECCV 2014, David Fleet, Tomas Pajdla, Bernt Schiele, and Tinne Tuytelaars (Eds.). Springer International Publishing, Cham, 628--643."},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.507"},{"key":"e_1_3_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.502"},{"key":"e_1_3_2_1_5_1","volume-title":"Temporal Context Network for Activity Localization in Videos. In The IEEE International Conference on Computer Vision (ICCV). 5727--5736","author":"Dai Xiyang","year":"2017","unstructured":"Xiyang Dai , Bharat Singh , Guyue Zhang , Larry S. Davis , and Yan Qiu Chen . 2017 . Temporal Context Network for Activity Localization in Videos. In The IEEE International Conference on Computer Vision (ICCV). 5727--5736 . Xiyang Dai, Bharat Singh, Guyue Zhang, Larry S. Davis, and Yan Qiu Chen. 2017. Temporal Context Network for Activity Localization in Videos. In The IEEE International Conference on Computer Vision (ICCV). 
5727--5736."},{"key":"e_1_3_2_1_6_1","volume-title":"The LEAR submission at Thumos","author":"Dan Oneata","year":"2014","unstructured":"Oneata Dan , Jakob Verbeek , and Cordelia Schmid . 2014. The LEAR submission at Thumos 2014 . Computer Vision and Pattern Recognition {cs.CV} (2014). Oneata Dan, Jakob Verbeek, and Cordelia Schmid. 2014. The LEAR submission at Thumos 2014. Computer Vision and Pattern Recognition {cs.CV} (2014)."},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2599174"},{"key":"e_1_3_2_1_8_1","volume-title":"Juan Carlos Niebles, and Bernard Ghanem.","author":"Escorcia Victor","year":"2016","unstructured":"Victor Escorcia , Fabian Caba Heilbron , Juan Carlos Niebles, and Bernard Ghanem. 2016 . DAPs: Deep Action Proposals for Action Understanding Computer Vision - ECCV 2016, Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). Springer International Publishing , Cham, 768--784. Victor Escorcia, Fabian Caba Heilbron, Juan Carlos Niebles, and Bernard Ghanem. 2016. DAPs: Deep Action Proposals for Action Understanding Computer Vision - ECCV 2016, Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). Springer International Publishing, Cham, 768--784."},{"key":"e_1_3_2_1_9_1","volume-title":"Juan Carlos Niebles, and Bernard Ghanem.","author":"Escorcia Victor","year":"2016","unstructured":"Victor Escorcia , Fabian Caba Heilbron , Juan Carlos Niebles, and Bernard Ghanem. 2016 . DAPs: Deep Action Proposals for Action Understanding Computer Vision - ECCV 2016, Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). Springer International Publishing , Cham, 768--784. Victor Escorcia, Fabian Caba Heilbron, Juan Carlos Niebles, and Bernard Ghanem. 2016. DAPs: Deep Action Proposals for Action Understanding Computer Vision - ECCV 2016, Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). 
Springer International Publishing, Cham, 768--784."},{"key":"e_1_3_2_1_10_1","volume-title":"ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 961--970","author":"Fabian Caba Heilbron Bernard Ghanem","year":"2015","unstructured":"Bernard Ghanem Fabian Caba Heilbron , Victor Escorcia and Juan Carlos Niebles . 2015 . ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 961--970 . Bernard Ghanem Fabian Caba Heilbron, Victor Escorcia and Juan Carlos Niebles. 2015. ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 961--970."},{"key":"e_1_3_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.213"},{"key":"e_1_3_2_1_12_1","volume-title":"Computer Vision - ECCV","author":"Gan Chuang","year":"2016","unstructured":"Chuang Gan , Chen Sun , Lixin Duan , and Boqing Gong . 2016. Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames . In Computer Vision - ECCV 2016 , Bastian Leibe, Jiri Matas , Nicu Sebe, and Max Welling (Eds.). Springer International Publishing , Cham, 849--866. Chuang Gan, Chen Sun, Lixin Duan, and Boqing Gong. 2016. Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames. In Computer Vision - ECCV 2016, Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). Springer International Publishing, Cham, 849--866."},{"key":"e_1_3_2_1_13_1","volume-title":"TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals. (Oct.","author":"Gao Jiyang","year":"2017","unstructured":"Jiyang Gao , Zhenheng Yang , Kan Chen , Chen Sun , and Ram Nevatia . 2017 . TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals. (Oct. 2017). 
Jiyang Gao, Zhenheng Yang, Kan Chen, Chen Sun, and Ram Nevatia. 2017. TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals. (Oct. 2017)."},{"key":"e_1_3_2_1_14_1","volume-title":"SCC: Semantic Context Cascade for Efficient Action Detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Heilbron Fabian Caba","year":"2017","unstructured":"Fabian Caba Heilbron , Wayner Barrios , Victor Escorcia , and Bernard Ghanem . 2017 . SCC: Semantic Context Cascade for Efficient Action Detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Fabian Caba Heilbron, Wayner Barrios, Victor Escorcia, and Bernard Ghanem. 2017. SCC: Semantic Context Cascade for Efficient Action Detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_2_1_15_1","volume-title":"Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1914--1923","author":"Heilbron Fabian Caba","year":"2016","unstructured":"Fabian Caba Heilbron , Juan Carlos Niebles , and Bernard Ghanem . 2016 . Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1914--1923 . Fabian Caba Heilbron, Juan Carlos Niebles, and Bernard Ghanem. 2016. Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1914--1923."},{"key":"e_1_3_2_1_16_1","volume-title":"Computer Vision - ECCV","author":"Huang De-An","year":"2016","unstructured":"De-An Huang , Li Fei-Fei , and Juan Carlos Niebles . 2016. Connectionist temporal modeling for weakly supervised action labeling . In Computer Vision - ECCV 2016 . Springer International Publishing , Cham , 137--153. 
De-An Huang, Li Fei-Fei, and Juan Carlos Niebles. 2016. Connectionist temporal modeling for weakly supervised action labeling. In Computer Vision - ECCV 2016. Springer International Publishing, Cham, 137--153."},{"key":"e_1_3_2_1_17_1","volume-title":"SAP: Self-Adaptive Proposal Model for Temporal Action Detection Based on Reinforcement Learning.","author":"Huang Jingjia","year":"2018","unstructured":"Jingjia Huang , Nannan Li , Tao Zhang , Ge Li , Tiejun Huang , and Wen Gao . 2018 . SAP: Self-Adaptive Proposal Model for Temporal Action Detection Based on Reinforcement Learning. (2018). https:\/\/www.aaai.org\/ocs\/index.php\/AAAI\/AAAI18\/paper\/view\/16109 Jingjia Huang, Nannan Li, Tao Zhang, Ge Li, Tiejun Huang, and Wen Gao. 2018. SAP: Self-Adaptive Proposal Model for Temporal Action Detection Based on Reinforcement Learning. (2018). https:\/\/www.aaai.org\/ocs\/index.php\/AAAI\/AAAI18\/paper\/view\/16109"},{"key":"e_1_3_2_1_18_1","volume-title":"Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3852--3861","author":"Jayaraman Dinesh","year":"2016","unstructured":"Dinesh Jayaraman and Kristen Grauman . 2016 . Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3852--3861 . Dinesh Jayaraman and Kristen Grauman. 2016. Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3852--3861."},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},{"key":"e_1_3_2_1_20_1","unstructured":"Y.-G. Jiang J. Liu A. Roshan Zamir G. Toderici I. Laptev M. Shah and R. Sukthankar. 2014. THUMOS Challenge: Action Recognition with a Large Number of Classes. http:\/\/crcv.ucf.edu\/THUMOS14\/. (2014). Y.-G. Jiang J. Liu A. Roshan Zamir G. Toderici I. Laptev M. 
Shah and R. Sukthankar. 2014. THUMOS Challenge: Action Recognition with a Large Number of Classes. http:\/\/crcv.ucf.edu\/THUMOS14\/. (2014)."},{"key":"e_1_3_2_1_21_1","unstructured":"Svebor Karaman Lorenzo Seidenari and Alberto Del Bimbo. {n. d.}. Fast saliency based pooling of Fisher encoded dense trajectories. ({n. d.}). Svebor Karaman Lorenzo Seidenari and Alberto Del Bimbo. {n. d.}. Fast saliency based pooling of Fisher encoded dense trajectories. ({n. d.})."},{"key":"e_1_3_2_1_22_1","volume-title":"Advances in Neural Information Processing Systems 24","author":"Kr\u00e4henb\u00fchl Philipp","unstructured":"Philipp Kr\u00e4henb\u00fchl and Vladlen Koltun . 2011. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials . In Advances in Neural Information Processing Systems 24 , J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger (Eds.). Curran Associates, Inc. , 109--117. http:\/\/papers.nips.cc\/paper\/4296-efficient-inference-in-fully-connected-crfs-with-gaussian-edge-potentials.pdf Philipp Kr\u00e4henb\u00fchl and Vladlen Koltun. 2011. Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials. In Advances in Neural Information Processing Systems 24, J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger (Eds.). Curran Associates, Inc., 109--117. http:\/\/papers.nips.cc\/paper\/4296-efficient-inference-in-fully-connected-crfs-with-gaussian-edge-potentials.pdf"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2017.06.004"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/946247.946605"},{"key":"e_1_3_2_1_25_1","volume-title":"The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1--8.","author":"Laptev I.","unstructured":"I. Laptev , M. Marszalek , C. Schmid , and B. Rozenfeld . 2008. Learning realistic human actions from movies . In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1--8. I. 
Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld. 2008. Learning realistic human actions from movies. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1--8."},{"key":"e_1_3_2_1_26_1","volume-title":"Temporal Convolutional Networks for Action Segmentation and Detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1003--1012","author":"Lea Colin","unstructured":"Colin Lea , Michael D. Flynn , Rene Vidal , Austin Reiter , and Gregory D. Hager . 2017 . Temporal Convolutional Networks for Action Segmentation and Detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1003--1012 . Colin Lea, Michael D. Flynn, Rene Vidal, Austin Reiter, and Gregory D. Hager. 2017. Temporal Convolutional Networks for Action Segmentation and Detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 1003--1012."},{"key":"e_1_3_2_1_27_1","volume-title":"Computer Vision - ACCV","author":"Li Nannan","year":"2016","unstructured":"Nannan Li , Dan Xu , Zhenqiang Ying , Zhihao Li , and Ge Li. 2017. Searching Action Proposals via Spatial Actionness Estimation and Temporal Path Inference and Tracking . In Computer Vision - ACCV 2016 , Shang-Hong Lai, Vincent Lepetit , Ko Nishino, and Yoichi Sato (Eds.). Springer International Publishing , Cham, 384--399. Nannan Li, Dan Xu, Zhenqiang Ying, Zhihao Li, and Ge Li. 2017. Searching Action Proposals via Spatial Actionness Estimation and Temporal Path Inference and Tracking. In Computer Vision - ACCV 2016, Shang-Hong Lai, Vincent Lepetit, Ko Nishino, and Yoichi Sato (Eds.). 
Springer International Publishing, Cham, 384--399."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123343"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206557"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553469"},{"key":"e_1_3_2_1_31_1","volume-title":"The IEEE International Conference on Computer Vision (ICCV). 1491--1498","author":"Duchenne J. Sivic F. R.","unstructured":"J. Sivic F. R. Bach O. Duchenne , I. Laptev and J. Ponce . 2009. Automatic annotation of human actions in video . In The IEEE International Conference on Computer Vision (ICCV). 1491--1498 . J. Sivic F. R. Bach O. Duchenne, I. Laptev and J. Ponce. 2009. Automatic annotation of human actions in video. In The IEEE International Conference on Computer Vision (ICCV). 1491--1498."},{"key":"e_1_3_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.590"},{"key":"e_1_3_2_1_33_1","volume-title":"Learning Discriminative Aggregation Network for Video-Based Face Recognition. In The IEEE International Conference on Computer Vision (ICCV).","author":"Rao Yongming","year":"2017","unstructured":"Yongming Rao , Ji Lin , Jiwen Lu , and Jie Zhou . 2017 . Learning Discriminative Aggregation Network for Video-Based Face Recognition. In The IEEE International Conference on Computer Vision (ICCV). Yongming Rao, Ji Lin, Jiwen Lu, and Jie Zhou. 2017. Learning Discriminative Aggregation Network for Video-Based Face Recognition. In The IEEE International Conference on Computer Vision (ICCV)."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.341"},{"key":"e_1_3_2_1_35_1","volume-title":"Computer Vision - ECCV","author":"Satkin Scott","year":"2010","unstructured":"Scott Satkin and Martial Hebert . 2010. Modeling the Temporal Extent of Actions . In Computer Vision - ECCV 2010 , Kostas Daniilidis, Petros Maragos , and Nikos Paragios (Eds.). 
Springer Berlin Heidelberg , Berlin, Heidelberg, 536--548. Scott Satkin and Martial Hebert. 2010. Modeling the Temporal Extent of Actions. In Computer Vision - ECCV 2010, Kostas Daniilidis, Petros Maragos, and Nikos Paragios (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 536--548."},{"key":"e_1_3_2_1_36_1","volume-title":"CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos.","author":"Shou Zheng","year":"2017","unstructured":"Zheng Shou , Jonathan Chan , Alireza Zareian , Kazuyuki Miyazawa , and Shih Fu Chang . 2017 . CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos. (2017). Zheng Shou, Jonathan Chan, Alireza Zareian, Kazuyuki Miyazawa, and Shih Fu Chang. 2017. CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos. (2017)."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.119"},{"key":"e_1_3_2_1_38_1","volume-title":"Advances in Neural Information Processing Systems 27","author":"Simonyan Karen","unstructured":"Karen Simonyan and Andrew Zisserman . 2014. Two-Stream Convolutional Networks for Action Recognition in Videos . In Advances in Neural Information Processing Systems 27 , Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (Eds.). Curran Associates, Inc. , 568--576. http:\/\/papers.nips.cc\/paper\/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf Karen Simonyan and Andrew Zisserman. 2014. Two-Stream Convolutional Networks for Action Recognition in Videos. In Advances in Neural Information Processing Systems 27, Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (Eds.). Curran Associates, Inc., 568--576. 
http:\/\/papers.nips.cc\/paper\/5353-two-stream-convolutional-networks-for-action-recognition-in-videos.pdf"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.381"},{"key":"e_1_3_2_1_40_1","volume-title":"Gurkirt Singh and Fabio Cuzzolin","author":"Philip Torr Suman Saha Michael Sapienza","year":"2016","unstructured":"Michael Sapienza Philip Torr Suman Saha , Gurkirt Singh and Fabio Cuzzolin . 2016 . Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos. Article 58 (September. 2016), 13 pages. Michael Sapienza Philip Torr Suman Saha, Gurkirt Singh and Fabio Cuzzolin. 2016. Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos. Article 58 (September. 2016), 13 pages."},{"key":"e_1_3_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1145\/2733373.2806226"},{"key":"e_1_3_2_1_42_1","volume-title":"Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Tang Yansong","year":"2018","unstructured":"Yansong Tang , Yi Tian , Jiwen Lu , Peiyang Li , and Jie Zhou . 2018 . Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Yansong Tang, Yi Tian, Jiwen Lu, Peiyang Li, and Jie Zhou. 2018. Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_3_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.510"},{"key":"e_1_3_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.441"},{"key":"e_1_3_2_1_45_1","unstructured":"Limin Wang Yu Qiao and Xiaoou Tang. {n.d.}. Action Recognition and Detection by Combining Motion and Appearance Features. ({n. d.}). Limin Wang Yu Qiao and Xiaoou Tang. {n.d.}. Action Recognition and Detection by Combining Motion and Appearance Features. ({n. 
d.})."},{"key":"e_1_3_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299059"},{"key":"e_1_3_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0859-0"},{"key":"e_1_3_2_1_48_1","volume-title":"UntrimmedNets for Weakly Supervised Action Recognition and Detection The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 6402--6411","author":"Wang Limin","year":"2017","unstructured":"Limin Wang , Yuanjun Xiong , Dahua Lin , and Luc Van Gool . 2017 . UntrimmedNets for Weakly Supervised Action Recognition and Detection The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 6402--6411 . Limin Wang, Yuanjun Xiong, Dahua Lin, and Luc Van Gool. 2017. UntrimmedNets for Weakly Supervised Action Recognition and Detection The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 6402--6411."},{"key":"e_1_3_2_1_49_1","volume-title":"Temporal Segment Networks for Action Recognition in Videos. CoRR abs\/1705.02953","author":"Wang Limin","year":"2017","unstructured":"Limin Wang , Yuanjun Xiong , Zhe Wang , Yu Qiao , Dahua Lin , Xiaoou Tang , and Luc Van Gool . 2017. Temporal Segment Networks for Action Recognition in Videos. CoRR abs\/1705.02953 ( 2017 ). arxiv: 1705.02953 http:\/\/arxiv.org\/abs\/1705.02953 Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, and Luc Van Gool. 2017. Temporal Segment Networks for Action Recognition in Videos. CoRR abs\/1705.02953 (2017). arxiv: 1705.02953 http:\/\/arxiv.org\/abs\/1705.02953"},{"key":"e_1_3_2_1_50_1","doi-asserted-by":"crossref","unstructured":"Yunchao Wei Jiashi Feng Xiaodan Liang Ming-Ming Cheng Yao Zhao and Shuicheng Yan. 2017. Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach. (2017) 6488--6496. Yunchao Wei Jiashi Feng Xiaodan Liang Ming-Ming Cheng Yao Zhao and Shuicheng Yan. 2017. 
Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach. (2017) 6488--6496.","DOI":"10.1109\/CVPR.2017.687"},{"key":"e_1_3_2_1_51_1","volume-title":"CNN: Single-label to Multi-label. Computer Science","author":"Wei Yunchao","year":"2014","unstructured":"Yunchao Wei , Wei Xia , Junshi Huang , Bingbing Ni , Jian Dong , Yao Zhao , and Shuicheng Yan . 2014 . CNN: Single-label to Multi-label. Computer Science (2014). Yunchao Wei, Wei Xia, Junshi Huang, Bingbing Ni, Jian Dong, Yao Zhao, and Shuicheng Yan. 2014. CNN: Single-label to Multi-label. Computer Science (2014)."},{"key":"e_1_3_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1162\/089976602317318938"},{"key":"e_1_3_2_1_53_1","volume-title":"A Pursuit of Temporal Accuracy in General Activity Detection. CoRR abs\/1703.02716","author":"Xiong Yuanjun","year":"2017","unstructured":"Yuanjun Xiong , Yue Zhao , Limin Wang , Dahua Lin , and Xiaoou Tang . 2017. A Pursuit of Temporal Accuracy in General Activity Detection. CoRR abs\/1703.02716 ( 2017 ). arxiv: 1703.02716 http:\/\/arxiv.org\/abs\/1703.02716 Yuanjun Xiong, Yue Zhao, Limin Wang, Dahua Lin, and Xiaoou Tang. 2017. A Pursuit of Temporal Accuracy in General Activity Detection. CoRR abs\/1703.02716 (2017). arxiv: 1703.02716 http:\/\/arxiv.org\/abs\/1703.02716"},{"key":"e_1_3_2_1_54_1","volume-title":"Monocular Depth Estimation using Multi-Scale Continuous CRFs as Sequential Deep Networks. CoRR abs\/1803.00891","author":"Xu Dan","year":"2018","unstructured":"Dan Xu , Elisa Ricci , Wanli Ouyang , Xiaogang Wang , and Nicu Sebe . 2018. Monocular Depth Estimation using Multi-Scale Continuous CRFs as Sequential Deep Networks. CoRR abs\/1803.00891 ( 2018 ). arxiv: 1803.00891 http:\/\/arxiv.org\/abs\/1803.00891 Dan Xu, Elisa Ricci, Wanli Ouyang, Xiaogang Wang, and Nicu Sebe. 2018. Monocular Depth Estimation using Multi-Scale Continuous CRFs as Sequential Deep Networks. CoRR abs\/1803.00891 (2018). 
arxiv: 1803.00891 http:\/\/arxiv.org\/abs\/1803.00891"},{"key":"e_1_3_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.617"},{"key":"e_1_3_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.293"},{"key":"e_1_3_2_1_57_1","volume-title":"Temporal Action Localization with Pyramid of Score Distribution Features. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3093--3102","author":"Yuan Jun","unstructured":"Jun Yuan , Bingbing Ni , Xiaokang Yang , and Ashraf A. Kassim . 2016 . Temporal Action Localization with Pyramid of Score Distribution Features. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3093--3102 . Jun Yuan, Bingbing Ni, Xiaokang Yang, and Ashraf A. Kassim. 2016. Temporal Action Localization with Pyramid of Score Distribution Features. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 3093--3102."},{"key":"e_1_3_2_1_58_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.342"},{"key":"e_1_3_2_1_59_1","volume-title":"2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 00","author":"Zhang Yimeng","year":"2012","unstructured":"Yimeng Zhang and Tsuhan Chen . 2012 . Efficient inference for fully-connected CRFs with stationarity . 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 00 (2012), 582--589. Yimeng Zhang and Tsuhan Chen. 2012. Efficient inference for fully-connected CRFs with stationarity. 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 00 (2012), 582--589."},{"key":"e_1_3_2_1_60_1","volume-title":"Temporal Action Detection with Structured Segment Networks. In The IEEE International Conference on Computer Vision (ICCV). 2933--2942","author":"Zhao Yue","year":"2017","unstructured":"Yue Zhao , Yuanjun Xiong , Limin Wang , Zhirong Wu , Xiaoou Tang , and Dahua Lin . 2017 . Temporal Action Detection with Structured Segment Networks. 
In The IEEE International Conference on Computer Vision (ICCV). 2933--2942 . Yue Zhao, Yuanjun Xiong, Limin Wang, Zhirong Wu, Xiaoou Tang, and Dahua Lin. 2017. Temporal Action Detection with Structured Segment Networks. In The IEEE International Conference on Computer Vision (ICCV). 2933--2942."},{"key":"e_1_3_2_1_61_1","volume-title":"Deep Metric Learning with False Positive Probability","author":"Zhong Jia-Xing","unstructured":"Jia-Xing Zhong , Ge Li , and Nannan Li. 2017. Deep Metric Learning with False Positive Probability . Springer International Publishing , Cham , 653--664. Jia-Xing Zhong, Ge Li, and Nannan Li. 2017. Deep Metric Learning with False Positive Probability. Springer International Publishing, Cham, 653--664."},{"key":"e_1_3_2_1_62_1","volume-title":"Newsam","author":"Zhu Yi","year":"2017","unstructured":"Yi Zhu and Shawn D . Newsam . 2017 . Efficient Action Detection in Untrimmed Videos via Multi-task Learning . (2017), 197--206. Yi Zhu and Shawn D. Newsam. 2017. Efficient Action Detection in Untrimmed Videos via Multi-task Learning. 
(2017), 197--206."}],"event":{"name":"MM '18: ACM Multimedia Conference","location":"Seoul Republic of Korea","acronym":"MM '18","sponsor":["SIGMM ACM Special Interest Group on Multimedia"]},"container-title":["Proceedings of the 26th ACM international conference on Multimedia"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3240508.3240511","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3240508.3240511","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T20:39:46Z","timestamp":1775248786000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3240508.3240511"}},"subtitle":["A Weakly Supervised Temporal Action Detector"],"short-title":[],"issued":{"date-parts":[[2018,10,15]]},"references-count":62,"alternative-id":["10.1145\/3240508.3240511","10.1145\/3240508"],"URL":"https:\/\/doi.org\/10.1145\/3240508.3240511","relation":{},"subject":[],"published":{"date-parts":[[2018,10,15]]},"assertion":[{"value":"2018-10-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}