{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,30]],"date-time":"2026-05-30T01:24:07Z","timestamp":1780104247668,"version":"3.54.0"},"reference-count":27,"publisher":"MDPI AG","issue":"22","license":[{"start":{"date-parts":[[2021,11,12]],"date-time":"2021-11-12T00:00:00Z","timestamp":1636675200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Weakly supervised video anomaly detection is a recent focus of computer vision research thanks to the availability of large-scale weakly supervised video datasets. However, most existing research works are limited to the frame-level classification with emphasis on finding the presence of specific objects or activities. In this article, a new neural network architecture is proposed to efficiently extract the prominent features for detecting whether a video contains anomalies. A video is treated as an integral input and the detection follows the procedure of video-label assignment. The extraction of spatial and temporal features is carried out by three-dimensional convolutions, and then their relationship is further modeled using an LSTM network. The concise structure of the proposed method enables high computational efficiency, and extensive experiments demonstrate its effectiveness.<\/jats:p>","DOI":"10.3390\/s21227508","type":"journal-article","created":{"date-parts":[[2021,11,14]],"date-time":"2021-11-14T20:51:53Z","timestamp":1636923113000},"page":"7508","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Weakly Supervised Video Anomaly Detection Based on 3D Convolution and LSTM"],"prefix":"10.3390","volume":"21","author":[{"given":"Zhen","family":"Ma","sequence":"first","affiliation":[{"name":"Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias s\/n, 4200-465 Porto, Portugal"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jos\u00e9 J. M.","family":"Machado","sequence":"additional","affiliation":[{"name":"Departamento de Engenharia Mec\u00e2nica, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias s\/n, 4200-465 Porto, Portugal"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7603-6526","authenticated-orcid":false,"given":"Jo\u00e3o Manuel R. S.","family":"Tavares","sequence":"additional","affiliation":[{"name":"Departamento de Engenharia Mec\u00e2nica, Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias s\/n, 4200-465 Porto, Portugal"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2021,11,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1541880.1541882","article-title":"Anomaly detection: A survey","volume":"41","author":"Chandola","year":"2009","journal-title":"ACM Comput. Surv."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"976","DOI":"10.1016\/j.imavis.2009.11.014","article-title":"A survey on vision-based human action recognition","volume":"28","author":"Poppe","year":"2010","journal-title":"Image Vis. Comput."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Jiang, F., Yuan, J., Tsaftaris, S.A., and Katsaggelos, A.K. (2010, January 26\u201329). Video anomaly detection in spatiotemporal context. Proceedings of the IEEE International Conference on Image Processing, Hong Kong, China.","DOI":"10.1109\/ICIP.2010.5650993"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"104078","DOI":"10.1016\/j.imavis.2020.104078","article-title":"A comprehensive review on deep learning-based methods for video anomaly detection","volume":"106","author":"Nayak","year":"2021","journal-title":"Image Vis. Comput."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Sultani, W., Chen, C., and Shah, M. (2018, January 18\u201322). Real-world anomaly detection in surveillance videos. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00678"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ghadiyaram, D., Feiszli, M., Tran, D., Yan, X., Wang, H., and Mahajan, D. (2019, January 16\u201320). Large-scale weakly-supervised pre-training for video action recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.01232"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wu, P., Liu, J., Shi, Y., Sun, Y., Shao, F., Wu, Z., and Yang, Z. (2020, January 23\u201328). Not only look, but also listen: Learning multimodal violence detection under weak supervision. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58577-8_20"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Hasan, M., Choi, J., Neumann, J., Roy-Chowdhury, A.K., and Davis, L.S. (2016, January 27\u201330). Learning temporal regularity in video sequences. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.86"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Luo, W., Liu, W., and Gao, S. (2017, January 10\u201314). Remembering history with convolutional LSTM for anomaly detection. Proceedings of the IEEE International Conference on Multimedia and Expo, Hong Kong, China.","DOI":"10.1109\/ICME.2017.8019325"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Luo, W., Liu, W., and Gao, S. (2017, January 22\u201329). A revisit of sparse coding based anomaly detection in stacked rnn framework. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.45"},{"key":"ref_12","unstructured":"Nguyen, T.N., and Meunier, J. (November, January 27). Anomaly detection in video sequence with appearance-motion correspondence. Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"121","DOI":"10.1016\/j.patcog.2016.03.028","article-title":"High-dimensional and largescale anomaly detection using a linear one-class svm with deep learning","volume":"58","author":"Erfani","year":"2016","journal-title":"Pattern Recognit."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Ionescu, R.T., Smeureanu, S., Alexe, B., and Popescu, M. (2017, January 22\u201329). Unmasking the abnormal events in video. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.315"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Georgescu, M.I., and Ionescu, R.T. (2019, January 22\u201325). Clustering images by unmasking\u2014A new baseline. Proceedings of the IEEE International Conference on Computer Vision, Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803097"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ionescu, R.T., Khan, F.S., Georgescu, M.I., and Shao, L. (2019, January 15\u201320). Object-centric auto-encoders and dummy anomalies for abnormal event detection in video. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00803"},{"key":"ref_17","unstructured":"Simonyan, K., and Zisserman, A. (2015, January 7\u20139). Very deep convolutional networks for large-scale image recognition. Proceedings of the International Conference on Learning Representations, San Diego, CA, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Liu, W., Lian, D., Luo, W., and Gao, S. (2018, January 18\u201323). Future frame prediction for anomaly detection\u2014A new baseline. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00684"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Ye, M., Peng, X., Gan, W., Wu, W., and Qiao, Y. (2019, January 21\u201325). Anopcn: Video anomaly detection via deep predictive coding network. Proceedings of the 27th ACM International Conference on Multimedia, Nice, France.","DOI":"10.1145\/3343031.3350899"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhang, J., Qing, L., and Miao, J. (2019, January 22\u201325). Temporal convolutional network with complementary inner bag loss for weakly supervised anomaly detection. Proceedings of the IEEE International Conference on Image Processing, Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803657"},{"key":"ref_21","unstructured":"Zhu, Y., and Newsam, S. (2019, January 9\u201312). Motion-aware feature for improved video anomaly detection. Proceedings of the British Machine Vision Conference, Cardiff, UK."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wan, B., Fang, Y., Xia, X., and Mei, J. (2020, January 6\u201310). Weakly supervised video anomaly detection via center-guided discriminative learning. Proceedings of the IEEE International Conference on Multimedia and Expo, London, UK.","DOI":"10.1109\/ICME46284.2020.9102722"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhong, J., Li, N., Kong, W., Liu, S., Li, T.H., and Li, G. (2019, January 15\u201320). Graph convolutional label noise cleaner: Train a plug-and-play action classifier for anomaly detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00133"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Tran, D., Bourdev, L., Fergus, R., Torresani, L., and Paluri, M. (2015, January 7\u201313). Learning Spatiotemporal Features with 3D Convolutional Networks. Proceedings of the IEEE International Conference on Image Processing, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.510"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zaheer, M.Z., Mahmood, A., Astrid, M., and Lee, S. (2020, January 23\u201328). CLAWS: Clustering assisted weakly supervised learning with normalcy suppression for anomalous event detection. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58542-6_22"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Feng, J., Hong, F., and Zheng, W. (2021, January 19\u201325). MIST: Multiple instance self-training framework for video anomaly detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01379"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Tian, Y., Pang, G., Chen, Y., Singh, R., Verjans, J.W., and Carneiro, G. (2021). Weakly-supervised video anomaly detection with robust temporal feature magnitude learning. arXiv.","DOI":"10.1109\/ICCV48922.2021.00493"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/22\/7508\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:28:59Z","timestamp":1760167739000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/22\/7508"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,12]]},"references-count":27,"journal-issue":{"issue":"22","published-online":{"date-parts":[[2021,11]]}},"alternative-id":["s21227508"],"URL":"https:\/\/doi.org\/10.3390\/s21227508","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,11,12]]}}}