{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,8,29]],"date-time":"2025-08-29T10:25:17Z","timestamp":1756463117899,"version":"3.41.0"},"reference-count":26,"publisher":"Association for Computing Machinery (ACM)","issue":"2s","license":[{"start":{"date-parts":[[2020,4,30]],"date-time":"2020-04-30T00:00:00Z","timestamp":1588204800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100014718","name":"Innovative Research Group Project of the National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61836002?61622205?61902101"],"award-info":[{"award-number":["61836002?61622205?61902101"]}],"id":[{"id":"10.13039\/100014718","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2020,4,30]]},"abstract":"<jats:p>Temporal action detection not only requires correct classification but also needs to detect the start and end times of each action accurately. However, traditional approaches always employ sliding windows or actionness to predict the actions, and it is difficult to train a model with sliding windows or actionness by end-to-end means. In this article, we attempt a different idea to detect the actions end-to-end, which can calculate the probabilities of actions directly through one network as one part of the results. We present PCAD, a novel proposal complementary action detector to deal with video streams under continuous, untrimmed conditions. Our approach first uses a simple fully 3D convolutional network to encode the video streams and then generates candidate temporal proposals for activities by using anchor segments. 
To generate more precise proposals, we also design a boundary proposal network to offer some complementary information for the candidate proposals. Finally, we learn an efficient classifier to classify the generated proposals into different activities and refine their temporal boundaries at the same time. Our model can achieve end-to-end training by jointly optimizing classification loss and regression loss. When evaluating on the THUMOS\u201914 detection benchmark, PCAD achieves state-of-the-art performance in high-speed models.<\/jats:p>","DOI":"10.1145\/3361845","type":"journal-article","created":{"date-parts":[[2020,6,22]],"date-time":"2020-06-22T02:49:20Z","timestamp":1592794160000},"page":"1-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["Proposal Complementary Action Detection"],"prefix":"10.1145","volume":"16","author":[{"given":"Suguo","family":"Zhu","sequence":"first","affiliation":[{"name":"Hangzhou Dianzi University, Hangzhou, China"}]},{"given":"Xiaoxian","family":"Yang","sequence":"additional","affiliation":[{"name":"Shanghai Polytechnic University, Shanghai, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1922-7283","authenticated-orcid":false,"given":"Jun","family":"Yu","sequence":"additional","affiliation":[{"name":"Hangzhou Dianzi University, Hangzhou, China"}]},{"given":"Zhenying","family":"Fang","sequence":"additional","affiliation":[{"name":"Hangzhou Dianzi University, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3094-7735","authenticated-orcid":false,"given":"Meng","family":"Wang","sequence":"additional","affiliation":[{"name":"Hefei University of Technology, Hefei, China"}]},{"given":"Qingming","family":"Huang","sequence":"additional","affiliation":[{"name":"University of Chinese Academy of Sciences, Beijing, 
China"}]}],"member":"320","published-online":{"date-parts":[[2020,6,21]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5244\/C.31.93"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00124"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.610"},{"volume-title":"Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition.","author":"Dalal N.","key":"e_1_2_1_4_1","unstructured":"N. Dalal and B. Triggs . 2005. Histograms of oriented gradients for human detection . In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. N. Dalal and B. Triggs. 2005. Histograms of oriented gradients for human detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition."},{"volume-title":"Proceedings of the 13th International Conference on Machine Learning (ICML\u201996)","author":"Freund Yoav","key":"e_1_2_1_5_1","unstructured":"Yoav Freund and Robert E. Schapire . 1996. Experiments with a new boosting algorithm . In Proceedings of the 13th International Conference on Machine Learning (ICML\u201996) . 148--156. Yoav Freund and Robert E. Schapire. 1996. Experiments with a new boosting algorithm. In Proceedings of the 13th International Conference on Machine Learning (ICML\u201996). 148--156."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01216-8_5"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.392"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5244\/C.31.52"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2437384"},{"key":"e_1_2_1_10_1","volume-title":"Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision. 2961--2969","author":"He Kaiming","year":"2017","unstructured":"Kaiming He , Georgia Gkioxari , Piotr Doll\u00e1r , and Ross Girshick . 
2017 . Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision. 2961--2969 . Kaiming He, Georgia Gkioxari, Piotr Doll\u00e1r, and Ross Girshick. 2017. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision. 2961--2969."},{"key":"e_1_2_1_11_1","volume-title":"Retrieved","author":"Jiang Y.-G.","year":"2020","unstructured":"Y.-G. Jiang , J. Liu , A. Roshan Zamir , G. Toderici , I. Laptev , M. Shah , and R. Sukthankar . 2014. THUMOS Challenge 2014: Action Recognition with a Large Number of Classes . Retrieved May 14, 2020 from http:\/\/crcv.ucf.edu\/THUMOS14\/. Y.-G. Jiang, J. Liu, A. Roshan Zamir, G. Toderici, I. Laptev, M. Shah, and R. Sukthankar. 2014. THUMOS Challenge 2014: Action Recognition with a Large Number of Classes. Retrieved May 14, 2020 from http:\/\/crcv.ucf.edu\/THUMOS14\/."},{"key":"e_1_2_1_12_1","first-page":"18","article-title":"Classification and regression by randomForest","volume":"2","author":"Liaw Andy","year":"2002","unstructured":"Andy Liaw and Matthew Wiener . 2002 . Classification and regression by randomForest . R News 2 , 3 (2002), 18 -- 22 . Andy Liaw and Matthew Wiener. 2002. Classification and regression by randomForest. R News 2, 3 (2002), 18--22.","journal-title":"R News"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/3123266.3123343"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01225-0_1"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.324"},{"volume-title":"Proceedings of the 6th International Conference on Computer Vision. 555","author":"Papageorgiou Constantine","key":"e_1_2_1_16_1","unstructured":"Constantine Papageorgiou , Michael Oren , and Tomaso A. Poggio . 1998. A general framework for object detection . In Proceedings of the 6th International Conference on Computer Vision. 555 . Constantine Papageorgiou, Michael Oren, and Tomaso A. Poggio. 1998. 
A general framework for object detection. In Proceedings of the 6th International Conference on Computer Vision. 555."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/11499145_13"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2577031"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.155"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.119"},{"key":"e_1_2_1_21_1","volume-title":"Xiaolong Wang, Ali Farhadi, Ivan Laptev, and Abhinav Gupta.","author":"Sigurdsson Gunnar A.","year":"2016","unstructured":"Gunnar A. Sigurdsson , G\u00fcl Varol Varol , Xiaolong Wang, Ali Farhadi, Ivan Laptev, and Abhinav Gupta. 2016 . Hollywood in homes: Crowdsourcing data collection for activity understanding. In Computer Vision\u2014ECCV 2016. Lecture Notes in Computer Science, Vol. 9905 . Springer , 510--526. Gunnar A. Sigurdsson, G\u00fcl Varol Varol, Xiaolong Wang, Ali Farhadi, Ivan Laptev, and Abhinav Gupta. 2016. Hollywood in homes: Crowdsourcing data collection for activity understanding. In Computer Vision\u2014ECCV 2016. Lecture Notes in Computer Science, Vol. 9905. Springer, 510--526."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.510"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4757-3264-1"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.617"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the British Machine Vision Conference (BMVC\u201918)","author":"Zhang Da","year":"2018","unstructured":"Da Zhang , Xiyang Dai , Xin Wang , and Yuan-Fang Wang . 2018 . S3D: Single shot multi-span detector via fully 3D convolutional network . In Proceedings of the British Machine Vision Conference (BMVC\u201918) . 1--11. Da Zhang, Xiyang Dai, Xin Wang, and Yuan-Fang Wang. 2018. S3D: Single shot multi-span detector via fully 3D convolutional network. 
In Proceedings of the British Machine Vision Conference (BMVC\u201918). 1--11."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.317"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3361845","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3361845","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:23:10Z","timestamp":1750202590000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3361845"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,30]]},"references-count":26,"journal-issue":{"issue":"2s","published-print":{"date-parts":[[2020,4,30]]}},"alternative-id":["10.1145\/3361845"],"URL":"https:\/\/doi.org\/10.1145\/3361845","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2020,4,30]]},"assertion":[{"value":"2019-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-09-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-06-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}