{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,24]],"date-time":"2026-01-24T03:01:27Z","timestamp":1769223687895,"version":"3.49.0"},"reference-count":28,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T00:00:00Z","timestamp":1769126400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T00:00:00Z","timestamp":1769126400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Strategic Priority Research Program of the Chinese Academy of Sciences","award":["E3W00233C3"],"award-info":[{"award-number":["E3W00233C3"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Cybersecurity"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    Taking photos of sensitive facilities and sensitive information in no photography area may cause sensitive information leakage if not discovered in time. Employing action recognition models to detect instances of photography can effectively prevent information leakage. Current action recognition models have shown unsatisfactory performance in detecting photo-taking actions in surveillance videos, and their reliance on GPU devices hinder their practicality. This paper presents a novel approach to address the detection of photo-taking actions. The method utilizes object detection to filter out background data and incorporates human pose estimation to extract human skeleton data. By combining these AI techniques, the method enables accurate recognition of photo-taking actions. We introduce a novel technique called self-annotation that enables the model to focus on the crucial elements associated with photo-taking actions. Additionally, we introduce a new alarm mechanism that leads to a 69\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$\\%$$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mo>%<\/mml:mo>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    reduction in false positives while maintaining the same level of recall by integrating the labels over a period to recognize actions. Compared with traditional action recognition approaches, our method is more flexible and lightweight in actual engineering applications. Moreover, our model is capable of running on CPU-only devices. Experimental results show that our model achieves a precision of 91\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$\\%$$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mo>%<\/mml:mo>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    on our dataset.\n                  <\/jats:p>","DOI":"10.1186\/s42400-025-00429-7","type":"journal-article","created":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T01:02:13Z","timestamp":1769130133000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Detecting photo-taking actions in surveillance videos based on CPU-only devices"],"prefix":"10.1186","volume":"9","author":[{"given":"Zixiang","family":"Liu","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4507-3356","authenticated-orcid":false,"given":"Peisong","family":"Shen","sequence":"additional","affiliation":[]},{"given":"Chi","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Shuguang","family":"Yuan","sequence":"additional","affiliation":[]},{"given":"Xiaojie","family":"Zhu","sequence":"additional","affiliation":[]},{"given":"Houzhe","family":"Wang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2026,1,23]]},"reference":[{"key":"429_CR1","unstructured":"Shahid Z (2024) Human behaviour recognition of elderly in single-resident iotenabled smart homes: An applied machine learning approach. PhD thesis, Lule\u00e5 University of Technology"},{"issue":"4","key":"429_CR2","doi-asserted-by":"publisher","first-page":"2358","DOI":"10.3390\/s23042358","volume":"23","author":"M Islam","year":"2023","unstructured":"Islam M, Dukyil AS, Alyahya S, Habib S (2023) An iot enable anomaly detection system for smart city surveillance. Sensors 23(4):2358","journal-title":"Sensors"},{"key":"429_CR3","doi-asserted-by":"publisher","unstructured":"Tran D, Bourdev L, Fergus R, Torresani L, Paluri M (2015) Learning spatiotemporal features with 3d convolutional networks. In: 2015 IEEE international conference on computer vision (ICCV), pp. 4489\u20134497. https:\/\/doi.org\/10.1109\/ICCV.2015.510","DOI":"10.1109\/ICCV.2015.510"},{"key":"429_CR4","doi-asserted-by":"publisher","unstructured":"Carreira J, Zisserman A (2017) Quo vadis, action recognition? a new model and the kinetics dataset. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp. 4724\u20134733. https:\/\/doi.org\/10.1109\/CVPR.2017.502","DOI":"10.1109\/CVPR.2017.502"},{"key":"429_CR5","doi-asserted-by":"crossref","unstructured":"Feichtenhofer C, Fan H, Malik J, He K (2019) Slowfast networks for video recognition. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp. 6202\u20136211","DOI":"10.1109\/ICCV.2019.00630"},{"key":"429_CR6","doi-asserted-by":"publisher","first-page":"488","DOI":"10.1016\/j.neucom.2021.12.059","volume":"489","author":"A Lamas","year":"2022","unstructured":"Lamas A, Tabik S, Montes AC, P\u00e9rez-Hern\u00e1ndez F, Garc\u00eda J, Olmos R, Herrera F (2022) Human pose estimation for mitigating false negatives in weapon detection in video-surveillance. Neurocomputing 489:488\u2013503","journal-title":"Neurocomputing"},{"issue":"11","key":"429_CR7","doi-asserted-by":"publisher","first-page":"4773","DOI":"10.1109\/TITS.2019.2946642","volume":"21","author":"Z Fang","year":"2020","unstructured":"Fang Z, L\u00f3pez AM (2020) Intention recognition of pedestrians and cyclists by 2d pose estimation. IEEE Trans Intell Transp Syst 21(11):4773\u20134783. https:\/\/doi.org\/10.1109\/TITS.2019.2946642","journal-title":"IEEE Trans Intell Transp Syst"},{"key":"429_CR8","unstructured":"Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst, 25"},{"key":"429_CR9","doi-asserted-by":"publisher","first-page":"35479","DOI":"10.1109\/ACCESS.2023.3266093","volume":"11","author":"AB Amjoud","year":"2023","unstructured":"Amjoud AB, Amrouch M (2023) Object detection using deep learning, cnns and vision transformers: a review. IEEE Access 11:35479\u201335516. https:\/\/doi.org\/10.1109\/ACCESS.2023.3266093","journal-title":"IEEE Access"},{"issue":"1","key":"429_CR10","doi-asserted-by":"publisher","first-page":"142","DOI":"10.1109\/TPAMI.2015.2437384","volume":"38","author":"R Girshick","year":"2016","unstructured":"Girshick R, Donahue J, Darrell T, Malik J (2016) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142\u2013158. https:\/\/doi.org\/10.1109\/TPAMI.2015.2437384","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"429_CR11","doi-asserted-by":"crossref","unstructured":"Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) Ssd: single shot multibox detector. In: computer vision\u2013ECCV 2016: 14th European conference, Amsterdam, The Netherlands, October 11\u201314, 2016, Proceedings, Part I 14, pp. 21\u201337. Springer","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"429_CR12","first-page":"1","volume-title":"Computer Vision and Pattern Recognition","author":"A Farhadi","year":"2018","unstructured":"Farhadi A, Redmon J (2018) Yolov3: An incremental improvement. Computer Vision and Pattern Recognition, vol 1804. Springer, Berlin\/Heidelberg, Germany, pp 1\u20136"},{"key":"429_CR13","doi-asserted-by":"crossref","unstructured":"Kim K, Lee HS (2020) Probabilistic anchor assignment with iou prediction for object detection. In: computer vision\u2013ECCV 2020: 16th European conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXV 16, pp. 355\u2013371. Springer","DOI":"10.1007\/978-3-030-58595-2_22"},{"key":"429_CR14","doi-asserted-by":"crossref","unstructured":"Sun K, Xiao B, Liu D, Wang J (2019) Deep high-resolution representation learning for human pose estimation. In: proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 5693\u20135703","DOI":"10.1109\/CVPR.2019.00584"},{"key":"429_CR15","doi-asserted-by":"publisher","unstructured":"Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, Kipman A, Blake A (2011) Real-time human pose recognition in parts from single depth images. In: CVPR 2011, pp. 1297\u20131304. https:\/\/doi.org\/10.1109\/CVPR.2011.5995316","DOI":"10.1109\/CVPR.2011.5995316"},{"issue":"1","key":"429_CR16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3603618","volume":"56","author":"C Zheng","year":"2023","unstructured":"Zheng C, Wu W, Chen C, Yang T, Zhu S, Shen J, Kehtarnavaz N, Shah M (2023) Deep learning-based human pose estimation: a survey. ACM Comput Surv 56(1):1\u201337","journal-title":"ACM Comput Surv"},{"key":"429_CR17","doi-asserted-by":"publisher","unstructured":"Toshev A, Szegedy C (2014) Deeppose: Human pose estimation via deep neural networks. In: 2014 IEEE conference on computer vision and pattern recognition, pp. 1653\u20131660. https:\/\/doi.org\/10.1109\/CVPR.2014.214","DOI":"10.1109\/CVPR.2014.214"},{"key":"429_CR18","doi-asserted-by":"publisher","unstructured":"Tompson J, Goroshin R, Jain A, LeCun Y, Bregler C (2015) Efficient object localization using convolutional networks. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 648\u2013656. https:\/\/doi.org\/10.1109\/CVPR.2015.7298664","DOI":"10.1109\/CVPR.2015.7298664"},{"key":"429_CR19","doi-asserted-by":"crossref","unstructured":"Nie Q, Wang X, Wang J, Wang M, Liu Y (2018) A child caring robot for the dangerous behavior detection based on the object recognition and human action recognition. In: 2018 IEEE international conference on robotics and biomimetics (ROBIO), pp. 1921\u20131926. IEEE","DOI":"10.1109\/ROBIO.2018.8665218"},{"key":"429_CR20","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1016\/j.neucom.2017.05.012","volume":"275","author":"R Olmos","year":"2018","unstructured":"Olmos R, Tabik S, Herrera F (2018) Automatic handgun detection alarm in videos using deep learning. Neurocomputing 275:66\u201372","journal-title":"Neurocomputing"},{"issue":"8","key":"429_CR21","doi-asserted-by":"publisher","first-page":"3729","DOI":"10.1007\/s12652-022-03848-3","volume":"13","author":"SA Khowaja","year":"2022","unstructured":"Khowaja SA, Lee S-L (2022) Skeleton-based human action recognition with sequential convolutional-lstm networks and fusion strategies. J Ambient Intell Humaniz Comput 13(8):3729\u20133746","journal-title":"J Ambient Intell Humaniz Comput"},{"key":"429_CR22","doi-asserted-by":"publisher","first-page":"4980","DOI":"10.1109\/TIP.2022.3191461","volume":"31","author":"X Chen","year":"2022","unstructured":"Chen X, Han Y, Wang X, Sun Y, Yang Y (2022) Action keypoint network for efficient video recognition. IEEE Trans Image Process 31:4980\u20134993. https:\/\/doi.org\/10.1109\/TIP.2022.3191461","journal-title":"IEEE Trans Image Process"},{"key":"429_CR23","doi-asserted-by":"publisher","unstructured":"Shen M, Lu H (2022) Rarn: A real-time skeleton-based action recognition network for auxiliary rehabilitation therapy. In: 2022 IEEE international symposium on circuits and systems (ISCAS), pp. 2482\u20132486. https:\/\/doi.org\/10.1109\/ISCAS48785.2022.9937262","DOI":"10.1109\/ISCAS48785.2022.9937262"},{"key":"429_CR24","doi-asserted-by":"publisher","unstructured":"Hsu H-M, Yuan X, Zhu B, Cheng Z, Chen L (2022) Package theft detection from smart home security cameras. In: 2022 IEEE international conference on multimedia and expo workshops (ICMEW), pp. 1\u20134. https:\/\/doi.org\/10.1109\/ICMEW56448.2022.9859522","DOI":"10.1109\/ICMEW56448.2022.9859522"},{"key":"429_CR25","doi-asserted-by":"crossref","unstructured":"Osokin D (2018) Real-time 2d multi-person pose estimation on cpu: Lightweight openpose. arXiv preprint arXiv:1811.12004","DOI":"10.5220\/0007555407440748"},{"key":"429_CR26","unstructured":"Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S et al (2020) An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929"},{"issue":"5","key":"429_CR27","doi-asserted-by":"publisher","DOI":"10.1098\/rsos.171790","volume":"5","author":"TM Versluys","year":"2018","unstructured":"Versluys TM, Foley RA, Skylark WJ (2018) The influence of leg-to-body ratio, arm-to-body ratio and intra-limb ratio on male human attractiveness. R Soc Open Sci 5(5):171790","journal-title":"R Soc Open Sci"},{"key":"429_CR28","doi-asserted-by":"crossref","unstructured":"Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-cam: Visual explanations from deep networks via gradient-based localization. In: proceedings of the IEEE international conference on computer vision, pp. 618\u2013626","DOI":"10.1109\/ICCV.2017.74"}],"container-title":["Cybersecurity"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-025-00429-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42400-025-00429-7","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42400-025-00429-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T01:02:17Z","timestamp":1769130137000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1186\/s42400-025-00429-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,1,23]]},"references-count":28,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2026,12]]}},"alternative-id":["429"],"URL":"https:\/\/doi.org\/10.1186\/s42400-025-00429-7","relation":{},"ISSN":["2523-3246"],"issn-type":[{"value":"2523-3246","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,1,23]]},"assertion":[{"value":"27 November 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 June 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 January 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no Conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"17"}}