{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,27]],"date-time":"2026-01-27T20:57:48Z","timestamp":1769547468189,"version":"3.49.0"},"reference-count":98,"publisher":"Springer Science and Business Media LLC","issue":"30","license":[{"start":{"date-parts":[[2024,2,14]],"date-time":"2024-02-14T00:00:00Z","timestamp":1707868800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,14]],"date-time":"2024-02-14T00:00:00Z","timestamp":1707868800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100006752","name":"Universidade do Porto","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100006752","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Since digital media has become increasingly popular, video processing has expanded in recent years. Video processing systems require high levels of processing, which is one of the challenges in this field. Various approaches, such as hardware upgrades, algorithmic optimizations, and removing unnecessary information, have been suggested to solve this problem. This study proposes a video saliency map based method that identifies the critical parts of the video and improves the system\u2019s overall performance. Using an image registration algorithm, the proposed method first removes the camera\u2019s motion. Subsequently, each video frame\u2019s color, edge, and gradient information are used to obtain a spatial saliency map. Combining spatial saliency with motion information derived from optical flow and color-based segmentation can produce a saliency map containing both motion and spatial data. 
A nonlinear function, optimized using a multi-objective genetic algorithm, is proposed to properly combine the temporal and spatial saliency maps. The proposed saliency map method was added as a preprocessing step to several deep learning-based Human Action Recognition (HAR) systems, and its performance was evaluated. Furthermore, the proposed method was compared with similar saliency map-based methods, and its superiority was confirmed. The results show that the proposed method can improve HAR efficiency by up to 6.5% relative to HAR methods with no preprocessing step, and by 3.9% relative to a HAR method using a temporal saliency map.<\/jats:p>","DOI":"10.1007\/s11042-024-18126-x","type":"journal-article","created":{"date-parts":[[2024,2,14]],"date-time":"2024-02-14T07:02:18Z","timestamp":1707894138000},"page":"74053-74073","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Hybrid time-spatial video saliency detection method to enhance human action recognition systems"],"prefix":"10.1007","volume":"83","author":[{"given":"Abdorreza Alavi","family":"Gharahbagh","sequence":"first","affiliation":[]},{"given":"Vahid","family":"Hajihashemi","sequence":"additional","affiliation":[]},{"given":"Marta Campos","family":"Ferreira","sequence":"additional","affiliation":[]},{"given":"J. J. M.","family":"Machado","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7603-6526","authenticated-orcid":false,"given":"Jo\u00e3o Manuel R. 
S.","family":"Tavares","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,2,14]]},"reference":[{"key":"18126_CR1","unstructured":"Walther D (2006) Interactions of visual attention and object recognition: computational modeling, algorithms, and psychophysics, California Institute of Technology"},{"key":"18126_CR2","volume-title":"Human activity recognition in videos based on a two levels k-means and hierarchical codebooks","author":"V Hajihashemi","year":"2016","unstructured":"Hajihashemi V, Pakizeh E (2016) Human activity recognition in videos based on a two levels k-means and hierarchical codebooks. Int J Mechatron, Electr Comput Technol"},{"issue":"3","key":"18126_CR3","doi-asserted-by":"crossref","first-page":"748","DOI":"10.1109\/TCSVT.2019.2896029","volume":"30","author":"X Song","year":"2019","unstructured":"Song X, Lan C, Zeng W, Xing J, Sun X, Yang J (2019) Temporal-spatial mapping for action recognition. IEEE Trans Circuits Syst Video Technol 30(3):748\u2013759","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"18126_CR4","doi-asserted-by":"crossref","unstructured":"Deshpnande A, Warhade KK (2021) An improved model for human activity recognition by integrated feature approach and optimized SVM. In: 2021 International conference on emerging smart computing and informatics (ESCI). IEEE, pp 571\u2013576","DOI":"10.1109\/ESCI50559.2021.9396914"},{"issue":"10","key":"18126_CR5","doi-asserted-by":"crossref","first-page":"2941","DOI":"10.1109\/TCSVT.2018.2870832","volume":"29","author":"R Cong","year":"2018","unstructured":"Cong R, Lei J, Fu H, Cheng MM, Lin W, Huang Q (2018) Review of visual saliency detection with comprehensive information. 
IEEE Trans Circuits Syst Video Technol 29(10):2941\u20132959","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"issue":"10","key":"18126_CR6","doi-asserted-by":"crossref","first-page":"1174","DOI":"10.3390\/e22101174","volume":"22","author":"AK Gupta","year":"2020","unstructured":"Gupta AK, Seal A, Prasad M, Khanna P (2020) Salient object detection techniques in computer vision\u2013a survey. Entropy 22(10):1174","journal-title":"Entropy"},{"issue":"2","key":"18126_CR7","doi-asserted-by":"crossref","first-page":"660","DOI":"10.1109\/TSMCB.2012.2214210","volume":"43","author":"Q Wang","year":"2013","unstructured":"Wang Q, Yuan Y, Yan P, Li X (2013) Saliency detection by multiple-instance learning. IEEE Trans Cybern 43(2):660\u2013672","journal-title":"IEEE Trans Cybern"},{"key":"18126_CR8","doi-asserted-by":"crossref","unstructured":"Li G, Xie Y, Wei T, Wang K, Lin L (2018) Flow guided recurrent neural encoder for video salient object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3243\u20133252","DOI":"10.1109\/CVPR.2018.00342"},{"issue":"8","key":"18126_CR9","doi-asserted-by":"crossref","first-page":"2900","DOI":"10.1109\/TCYB.2018.2832053","volume":"49","author":"M Sun","year":"2018","unstructured":"Sun M, Zhou Z, Hu Q, Wang Z, Jiang J (2018) SG-FCN: a motion and memory-based deep learning model for video saliency detection. IEEE Trans Cybern 49(8):2900\u20132911","journal-title":"IEEE Trans Cybern"},{"key":"18126_CR10","doi-asserted-by":"crossref","unstructured":"Lee S, Jang D, Jeong J, Ryu ES (2019) \u201cMotion-constrained tile set based 360-degree video streaming using saliency map prediction. 
In: Proceedings of the 29th ACM workshop on network and operating systems support for digital audio and video, pp 20\u201324","DOI":"10.1145\/3304112.3325614"},{"key":"18126_CR11","doi-asserted-by":"crossref","unstructured":"Li H, Chen G, Li G, Yu Y (2019) Motion guided attention for video salient object detection. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 7274\u20137283","DOI":"10.1109\/ICCV.2019.00737"},{"key":"18126_CR12","doi-asserted-by":"crossref","unstructured":"Yan P, Li G, Xie Y, Li Z, Wang C, Chen T, Lin L (2019) Semi-supervised video salient object detection using pseudo-labels. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 7284\u20137293","DOI":"10.1109\/ICCV.2019.00738"},{"key":"18126_CR13","doi-asserted-by":"crossref","unstructured":"Fan DP, Wang W, Cheng MM, Shen J (2019) Shifting more attention to video salient object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 8554\u20138564","DOI":"10.1109\/CVPR.2019.00875"},{"key":"18126_CR14","doi-asserted-by":"crossref","DOI":"10.1016\/j.imavis.2019.10.008","volume":"93","author":"J Yang","year":"2020","unstructured":"Yang J, Fang X, Zhang L, Lu H, Wei G (2020) Salient object detection via double random walks with dual restarts. Image Vis Comput 93:103822","journal-title":"Image Vis Comput"},{"issue":"8","key":"18126_CR15","doi-asserted-by":"crossref","first-page":"2811","DOI":"10.3390\/app10082811","volume":"10","author":"F Liu","year":"2020","unstructured":"Liu F, Zhao L, Cheng X, Dai Q, Shi X, Qiao J (2020) Fine-grained action recognition by motion saliency and mid-level patches. 
Appl Sci 10(8):2811","journal-title":"Appl Sci"},{"key":"18126_CR16","doi-asserted-by":"crossref","first-page":"10869","DOI":"10.1609\/aaai.v34i07.6718","volume":"34","author":"Y Gu","year":"2020","unstructured":"Gu Y, Wang L, Wang Z, Liu Y, Cheng MM, Lu SP (2020) Pyramid constrained self-attention network for fast video salient object detection. Proceedings of the AAAI conference on artificial intelligence 34:10869\u201310876","journal-title":"Proceedings of the AAAI conference on artificial intelligence"},{"key":"18126_CR17","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1016\/j.ins.2020.09.003","volume":"546","author":"Y Ji","year":"2021","unstructured":"Ji Y, Zhang H, Zhang Z, Liu M (2021) CNN-based encoder-decoder networks for salient object detection: a comprehensive review and recent advances. Inf Sci 546:835\u2013857","journal-title":"Inf Sci"},{"key":"18126_CR18","doi-asserted-by":"crossref","DOI":"10.1016\/j.eswa.2020.114064","volume":"166","author":"N Kousik","year":"2021","unstructured":"Kousik N, Natarajan Y, Raja RA, Kallam S, Patan R, Gandomi AH (2021) Improved salient object detection using hybrid convolution recurrent neural network. Expert Syst Appl 166:114064","journal-title":"Expert Syst Appl"},{"key":"18126_CR19","doi-asserted-by":"crossref","DOI":"10.1016\/j.imavis.2021.104108","volume":"107","author":"M Zong","year":"2021","unstructured":"Zong M, Wang R, Chen X, Chen Z, Gong Y (2021) Motion saliency based multi-stream multiplier resnets for action recognition. Image Vis Comput 107:104108","journal-title":"Image Vis Comput"},{"issue":"6","key":"18126_CR20","doi-asserted-by":"crossref","first-page":"2676","DOI":"10.1109\/TNNLS.2020.3007534","volume":"32","author":"Y Ji","year":"2020","unstructured":"Ji Y, Zhang H, Jie Z, Ma L, Wu QJ (2020) CASNet: a cross-attention Siamese network for video salient object detection. 
IEEE Trans Neural Networks Learn Syst 32(6):2676\u20132690","journal-title":"IEEE Trans Neural Networks Learn Syst"},{"key":"18126_CR21","doi-asserted-by":"crossref","unstructured":"Zhang M, Liu J, Wang Y, Piao Y, Yao S, Ji W, Li J, Lu H, Luo Z (2021) Dynamic context-sensitive filtering network for video salient object detection. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 1553\u20131563","DOI":"10.1109\/ICCV48922.2021.00158"},{"key":"18126_CR22","first-page":"1","volume":"60","author":"Q Wang","year":"2022","unstructured":"Wang Q, Liu Y, Xiong Z, Yuan Y (2022) Hybrid feature aligned network for salient object detection in optical remote sensing imagery. IEEE Trans Geosci Remote Sens 60:1\u201315","journal-title":"IEEE Trans Geosci Remote Sens"},{"key":"18126_CR23","doi-asserted-by":"crossref","unstructured":"Liu Y, Xiong Z, Yuan Y, Wang Q (2023) Transcending pixels: boosting saliency detection via scene understanding from aerial imagery. IEEE Trans Geosci Remote Sens","DOI":"10.1109\/TGRS.2023.3298661"},{"key":"18126_CR24","doi-asserted-by":"crossref","unstructured":"Liu Y, Xiong Z, Yuan Y, Wang Q (2023) Distilling knowledge from super resolution for efficient remote sensing salient object detection. IEEE Trans Geosci Remote Sens","DOI":"10.1109\/TGRS.2023.3267271"},{"issue":"11","key":"18126_CR25","doi-asserted-by":"crossref","first-page":"616","DOI":"10.3390\/info14110616","volume":"14","author":"A Alavigharahbagh","year":"2023","unstructured":"Alavigharahbagh A, Hajihashemi V, Machado JJ, Tavares JM (2023) Deep learning approach for human action recognition using a time saliency map based on motion features considering camera movement and shot in video image sequences. 
Information 14(11):616","journal-title":"Information"},{"key":"18126_CR26","first-page":"1","volume":"60","author":"Y Liu","year":"2021","unstructured":"Liu Y, Li Q, Yuan Y, Du Q, Wang Q (2021) ABNet: adaptive balanced network for multiscale object detection in remote sensing imagery. IEEE Trans Geosci Remote Sens 60:1\u201314","journal-title":"IEEE Trans Geosci Remote Sens"},{"issue":"6","key":"18126_CR27","doi-asserted-by":"crossref","first-page":"7055","DOI":"10.1007\/s11042-018-6459-6","volume":"78","author":"M Vijayan","year":"2019","unstructured":"Vijayan M, Ramasundaram M (2019) A fast DGPSO-motion saliency map based moving object detection. Multimed Tools Appl 78(6):7055\u20137075","journal-title":"Multimed Tools Appl"},{"key":"18126_CR28","doi-asserted-by":"crossref","unstructured":"Huang T, McKenna S (2018) Sequential recognition of manipulation actions using discriminative superpixel group mining. In: 2018 25th IEEE International conference on image processing (ICIP). IEEE, pp 579\u2013583","DOI":"10.1109\/ICIP.2018.8451451"},{"key":"18126_CR29","doi-asserted-by":"crossref","unstructured":"Mahapatra D, Winkler S, Yen SC (2008) Motion saliency outweighs other low-level features while watching videos. In: Human vision and electronic imaging XIII, vol 6806. SPIE, pp 246\u2013255","DOI":"10.1117\/12.766243"},{"key":"18126_CR30","doi-asserted-by":"crossref","unstructured":"Lee I, Ban SW, Fukushima K, Lee M (2006) Selective motion analysis based on dynamic visual saliency map model. In: International conference on artificial intelligence and soft computing. Springer, pp 814\u2013822","DOI":"10.1007\/11785231_85"},{"issue":"10","key":"18126_CR31","doi-asserted-by":"crossref","first-page":"1420","DOI":"10.1016\/j.neunet.2008.10.002","volume":"21","author":"S Jeong","year":"2008","unstructured":"Jeong S, Ban SW, Lee M (2008) Stereo saliency map considering affective factors and selective motion analysis in a dynamic environment. 
Neural Netw 21(10):1420\u20131430","journal-title":"Neural Netw"},{"key":"18126_CR32","doi-asserted-by":"crossref","unstructured":"Cui X, Liu Q, Metaxas D (2009) Temporal spectral residual: fast motion saliency detection. In: Proceedings of the 17th ACM international conference on multimedia, pp 617\u2013620","DOI":"10.1145\/1631272.1631370"},{"key":"18126_CR33","doi-asserted-by":"crossref","unstructured":"Woo JW, Lim YC, Lee M (2009) Obstacle categorization based on hybridizing global and local features. In: International conference on neural information processing. Springer, pp 1\u201310","DOI":"10.1007\/978-3-642-10684-2_1"},{"key":"18126_CR34","unstructured":"Kim S, Kim M (2014) Improvement of saliency map using motion information. In: Proceedings of the Korean society of broadcast engineers conference. The Korean Institute of Broadcast and Media Engineers, pp 259\u2013260"},{"key":"18126_CR35","doi-asserted-by":"crossref","unstructured":"Morita S (2008) Generating saliency map related to motion based on self-organized feature extracting. In: International conference on neural information processing. Springer, pp 784\u2013791","DOI":"10.1007\/978-3-642-03040-6_96"},{"key":"18126_CR36","doi-asserted-by":"crossref","unstructured":"Morita S (2009) Generating self-organized saliency map based on color and motion. In: International conference on neural information processing. Springer, pp 28\u201337","DOI":"10.1007\/978-3-642-10684-2_4"},{"key":"18126_CR37","unstructured":"Hu J, Pitsianis N, Sun X Motion saliency map generations for video data analysis: spatio-temporalsignatures in the array operations"},{"key":"18126_CR38","doi-asserted-by":"crossref","unstructured":"Mej\u00eda-Oca\u00f1a AB, De\u00a0Frutos-L\u00f3pez M, Sanz-Rodr\u00edguez S,\u00a0del Ama-Esteban \u00d3, Pel\u00e1ez-Moreno C, D\u00edaz-de Mar\u00eda F (2011) Low-complexity motion-based saliency map estimation for perceptual video coding. 
IEEE","DOI":"10.1109\/CONATEL.2011.5958666"},{"key":"18126_CR39","doi-asserted-by":"crossref","unstructured":"Gkamas T, Nikou C (2011) Guiding optical flow estimation using superpixels. In: 2011 17th International Conference on Digital Signal Processing (DSP). IEEE, pp 1\u20136","DOI":"10.1109\/ICDSP.2011.6004871"},{"issue":"7","key":"18126_CR40","doi-asserted-by":"crossref","first-page":"2600","DOI":"10.1109\/TIP.2013.2253483","volume":"22","author":"WT Li","year":"2013","unstructured":"Li WT, Chang HS, Lien KC, Chang HT, Wang YC (2013) Exploring visual and motion saliency for automatic video object extraction. IEEE Trans Image Process 22(7):2600\u20132610","journal-title":"IEEE Trans Image Process"},{"key":"18126_CR41","doi-asserted-by":"crossref","unstructured":"Chang HS, Wang YC (2013) Superpixel-based large displacement optical flow. In: 2013 IEEE international conference on image processing, pp 3835\u20133839","DOI":"10.1109\/ICIP.2013.6738790"},{"issue":"8","key":"18126_CR42","doi-asserted-by":"crossref","first-page":"1336","DOI":"10.1109\/TCSVT.2014.2308652","volume":"24","author":"CR Huang","year":"2014","unstructured":"Huang CR, Chang YJ, Yang ZX, Lin YY (2014) Video saliency map detection by dominant camera motion removal. IEEE Trans Circuits Syst Video Technol 24(8):1336\u20131349","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"18126_CR43","doi-asserted-by":"crossref","unstructured":"Dong X, Tsoi AC, Lo SL (2014) Superpixel appearance and motion descriptors for action recognition. In: 2014 International joint conference on neural networks (IJCNN). IEEE, pp 1173\u20131178","DOI":"10.1109\/IJCNN.2014.6889575"},{"key":"18126_CR44","doi-asserted-by":"crossref","unstructured":"Giosan I, Nedevschi S (2014) Superpixel-based obstacle segmentation from dense stereo urban traffic scenarios using intensity, depth and optical flow information. In: 17th International IEEE conference on intelligent transportation systems (ITSC). 
IEEE, pp 1662\u20131668","DOI":"10.1109\/ITSC.2014.6957932"},{"key":"18126_CR45","doi-asserted-by":"crossref","unstructured":"Roberts R, Dellaert F (2014) Direct superpixel labeling for mobile robot navigation using learned general optical flow templates. In: 2014 IEEE\/RSJ international conference on intelligent robots and systems. IEEE, pp 1032\u20131037","DOI":"10.1109\/IROS.2014.6942685"},{"key":"18126_CR46","unstructured":"Xu J, Tu Q, Li C, Gao R, Men A (2015) Video saliency map detection based on global motion estimation. In: 2015 IEEE international conference on multimedia & expo workshops (ICMEW). IEEE, pp 1\u20136"},{"key":"18126_CR47","doi-asserted-by":"crossref","unstructured":"Srivatsa RS, Babu RV (2015) Salient object detection via objectness measure. In: 2015 IEEE international conference on image processing (ICIP). IEEE, pp 4481\u20134485","DOI":"10.1109\/ICIP.2015.7351654"},{"key":"18126_CR48","doi-asserted-by":"crossref","unstructured":"Donn\u00e9 S, Aelterman J, Goossens B, Philips W (2015) Fast and robust variational optical flow for high-resolution images using slic superpixels. In: International conference on advanced concepts for intelligent vision systems. Springer, pp 205\u2013216","DOI":"10.1007\/978-3-319-25903-1_18"},{"key":"18126_CR49","doi-asserted-by":"crossref","first-page":"100","DOI":"10.1016\/j.image.2015.04.014","volume":"38","author":"J Li","year":"2015","unstructured":"Li J, Liu Z, Zhang X, Le Meur O, Shen L (2015) Spatiotemporal saliency detection based on superpixel-level trajectory. Signal Process Image Commun 38:100\u2013114","journal-title":"Signal Process Image Commun"},{"key":"18126_CR50","doi-asserted-by":"crossref","first-page":"167","DOI":"10.1016\/j.imavis.2016.06.004","volume":"52","author":"Y Hu","year":"2016","unstructured":"Hu Y, Song R, Li Y, Rao P, Wang Y (2016) Highly accurate optical flow estimation on superpixel tree. 
Image Vis Comput 52:167\u2013177","journal-title":"Image Vis Comput"},{"key":"18126_CR51","doi-asserted-by":"crossref","unstructured":"Guo J, Ren T, Huang L, Liu X, Cheng MM, Wu G (2017) Video salient object detection via cross-frame cellular automata. In: 2017 IEEE international conference on multimedia and expo (ICME). IEEE, pp 325\u2013330","DOI":"10.1109\/ICME.2017.8019389"},{"key":"18126_CR52","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1016\/j.patcog.2017.07.028","volume":"72","author":"Z Tu","year":"2017","unstructured":"Tu Z, Guo Z, Xie W, Yan M, Veltkamp RC, Li B, Yuan J (2017) Fusing disparate object signatures for salient object detection in video. Pattern Recognit 72:285\u2013299","journal-title":"Pattern Recognit"},{"key":"18126_CR53","doi-asserted-by":"crossref","unstructured":"Hu YT, Huang JB, Schwing AG (2018) Unsupervised video object segmentation using motion saliency-guided spatio-temporal propagation. In: Proceedings of the European conference on computer vision (ECCV), pp 786\u2013802","DOI":"10.1007\/978-3-030-01246-5_48"},{"issue":"3","key":"18126_CR54","doi-asserted-by":"crossref","first-page":"561","DOI":"10.1109\/TCSVT.2016.2618934","volume":"28","author":"Q Ling","year":"2016","unstructured":"Ling Q, Deng S, Li F, Huang Q, Li X (2016) A feedback-based robust video stabilization method for traffic videos. IEEE Trans Circuits Syst Video Technol 28(3):561\u2013572","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"18126_CR55","doi-asserted-by":"crossref","first-page":"44","DOI":"10.1016\/j.image.2018.01.005","volume":"63","author":"J Wang","year":"2018","unstructured":"Wang J, Liu W, Xing W, Zhang S (2018) Visual object tracking with multi-scale superpixels and color-feature guided kernelized correlation filters. 
Signal Process Image Commun 63:44\u201362","journal-title":"Signal Process Image Commun"},{"key":"18126_CR56","doi-asserted-by":"crossref","unstructured":"Chen R, Tong Y, Yang J, Wu M (2019) Video foreground detection algorithm based on fast principal component pursuit and motion saliency. Comput Intell Neurosci 2019","DOI":"10.1155\/2019\/4769185"},{"key":"18126_CR57","doi-asserted-by":"crossref","unstructured":"Maczyta L, Bouthemy P, Le Meur O (2019) Unsupervised motion saliency map estimation based on optical flow inpainting. In: 2019 IEEE international conference on image processing (ICIP). IEEE, pp 4469\u20134473","DOI":"10.1109\/ICIP.2019.8803542"},{"key":"18126_CR58","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1109\/TCI.2019.2897937","volume":"6","author":"H Zhu","year":"2019","unstructured":"Zhu H, Sun X, Zhang Q, Wang Q, Robles-Kelly A, Li H, You S (2019) Full view optical flow estimation leveraged from light field superpixel. IEEE Trans Comput Imaging 6:12\u201323","journal-title":"IEEE Trans Comput Imaging"},{"key":"18126_CR59","doi-asserted-by":"crossref","first-page":"194","DOI":"10.1016\/j.ins.2018.12.042","volume":"480","author":"C Kim","year":"2019","unstructured":"Kim C, Song D, Kim CS, Park SK (2019) Object tracking under large motion: combining coarse-to-fine search with superpixels. Inf Sci 480:194\u2013210","journal-title":"Inf Sci"},{"issue":"9","key":"18126_CR60","doi-asserted-by":"crossref","first-page":"1397","DOI":"10.3390\/sym12091397","volume":"12","author":"TT Ngo","year":"2020","unstructured":"Ngo TT, Nguyen V, Pham XQ, Hossain MA, Huh EN (2020) Motion saliency detection for surveillance systems using streaming dynamic mode decomposition. Symmetry 12(9):1397","journal-title":"Symmetry"},{"key":"18126_CR61","doi-asserted-by":"crossref","unstructured":"Qiu G, Wang Y, Wei Y (2020) An algorithm for the hole filling of motion foreground based on superpixel segmentation. 
In: 2020 International conference on communications, information system and computer engineering (CISCE). IEEE, pp 450\u2013453","DOI":"10.1109\/CISCE50729.2020.00101"},{"key":"18126_CR62","doi-asserted-by":"crossref","unstructured":"Tian H, Cai W, Ding W, Liang P, Yu J, Huang Q (2023) Long-term liver lesion tracking in contrast-enhanced ultrasound videos via a siamese network with temporal motion attention. Front Physiol 14","DOI":"10.3389\/fphys.2023.1180713"},{"key":"18126_CR63","doi-asserted-by":"crossref","unstructured":"Bay H, Tuytelaars T, Van Gool L (2006) \u201cSURF: speeded up robust features. In: European conference on computer vision. Springer, pp 404\u2013417","DOI":"10.1007\/11744023_32"},{"key":"18126_CR64","doi-asserted-by":"crossref","unstructured":"Kim J, Han D, Tai YW, Kim J (2014) Salient region detection via high-dimensional color transform. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 883\u2013890","DOI":"10.1109\/CVPR.2014.118"},{"issue":"3","key":"18126_CR65","first-page":"527","volume":"35","author":"B Nan","year":"2014","unstructured":"Nan B, Mu Z (2014) Slic0-based superpixel segmentation method with texture fusion. Chin J Sci Instrum 35(3):527\u2013534","journal-title":"Chin J Sci Instrum"},{"key":"18126_CR66","doi-asserted-by":"crossref","unstructured":"Hetherington R (1952) The perception of the visual world. by James J. Gibson. USA: Houghton mifflin company, 1950 (George Allen & Unwin, Ltd., London). price 35s. J Mental Sci 98(413):717\u2013717","DOI":"10.1192\/bjp.98.413.717-a"},{"key":"18126_CR67","volume-title":"The senses considered as perceptual systems","author":"JJ Gibson","year":"1966","unstructured":"Gibson JJ, Carmichael L (1966) The senses considered as perceptual systems, vol 2. 
Houghton Mifflin, Boston"},{"issue":"1","key":"18126_CR68","doi-asserted-by":"crossref","first-page":"43","DOI":"10.1007\/BF01420984","volume":"12","author":"JL Barron","year":"1994","unstructured":"Barron JL, Fleet DJ, Beauchemin SS (1994) Performance of optical flow techniques. Int J Comput Vis 12(1):43\u201377","journal-title":"Int J Comput Vis"},{"key":"18126_CR69","volume-title":"Handbook of mathematics","author":"IN Bronshtein","year":"2013","unstructured":"Bronshtein IN, Semendyayev KA (2013) Handbook of mathematics. Springer"},{"issue":"1\u20133","key":"18126_CR70","doi-asserted-by":"crossref","first-page":"185","DOI":"10.1016\/0004-3702(81)90024-2","volume":"17","author":"BK Horn","year":"1981","unstructured":"Horn BK, Schunck BG (1981) Determining optical flow. Artif Intell 17(1\u20133):185\u2013203","journal-title":"Artif Intell"},{"key":"18126_CR71","doi-asserted-by":"crossref","unstructured":"Brox T (2020) Optical flow: traditional approaches. In: Computer vision: a reference guide, pp 1\u20135","DOI":"10.1007\/978-3-030-03243-2_600-1"},{"issue":"21","key":"18126_CR72","doi-asserted-by":"crossref","first-page":"10176","DOI":"10.3390\/app112110176","volume":"11","author":"R Bensaci","year":"2021","unstructured":"Bensaci R, Khaldi B, Aiadi O, Benchabana A (2021) Deep convolutional neural network with KNN regression for automatic image annotation. Appl Sci 11(21):10176","journal-title":"Appl Sci"},{"issue":"1","key":"18126_CR73","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1002\/col.5080150109","volume":"15","author":"S Wan","year":"1990","unstructured":"Wan S, Prusinkiewicz P, Wong S (1990) Variance-based color image quantization for frame buffer display. Color Res Appl 15(1):52\u201358","journal-title":"Color Res Appl"},{"key":"18126_CR74","first-page":"75","volume":"17","author":"RW Floyd","year":"1976","unstructured":"Floyd RW (1976) An adaptive algorithm for spatial gray-scale. 
Proceedings of the Society for Information Display 17:75\u201377","journal-title":"Proceedings of the Society for Information Display"},{"key":"18126_CR75","unstructured":"Pont-Tuset J, Perazzi F, Caelles S, Arbel\u00e1ez P, Sorkine-Hornung A, Van Gool L (2017) The 2017 Davis challenge on video object segmentation. arXiv:1704.00675"},{"key":"18126_CR76","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1016\/j.neucom.2021.07.088","volume":"462","author":"J Chen","year":"2021","unstructured":"Chen J, Li Z, Jin Y, Ren D, Ling H (2021) Video saliency prediction via spatio-temporal reasoning. Neurocomputing 462:59\u201368","journal-title":"Neurocomputing"},{"key":"18126_CR77","doi-asserted-by":"crossref","first-page":"3995","DOI":"10.1109\/TIP.2021.3068644","volume":"30","author":"C Chen","year":"2021","unstructured":"Chen C, Wang G, Peng C, Fang Y, Zhang D, Qin H (2021) Exploring rich and efficient spatial temporal interactions for real-time video salient object detection. IEEE Trans Image Process 30:3995\u20134007","journal-title":"IEEE Trans Image Process"},{"key":"18126_CR78","doi-asserted-by":"crossref","unstructured":"Huang X, Zhang YJ (2021) Fast video saliency detection via maximally stable region motion and object repeatability. IEEE Trans Multimedia","DOI":"10.1109\/TMM.2021.3094356"},{"issue":"2","key":"18126_CR79","volume":"30","author":"J Shang","year":"2021","unstructured":"Shang J, Liu Y, Zhou H, Wang M (2021) Moving object properties-based video saliency detection. J Electron Imaging 30(2):023005","journal-title":"J Electron Imaging"},{"key":"18126_CR80","doi-asserted-by":"crossref","unstructured":"Rosten E, Drummond T (2005) Fusing points and lines for high performance tracking. In: 10th IEEE international conference on computer vision (ICCV\u201905) vol 1, vol\u00a02. 
IEEE, pp 1508\u20131515","DOI":"10.1109\/ICCV.2005.104"},{"key":"18126_CR81","doi-asserted-by":"crossref","unstructured":"Harris C, Stephens M et\u00a0al (1988) A combined corner and edge detector. In: Alvey vision conference, vol\u00a015. Citeseer, pp 10\u20135244","DOI":"10.5244\/C.2.23"},{"key":"18126_CR82","doi-asserted-by":"crossref","unstructured":"Alcantarilla PF, Bartoli A, Davison AJ (2012) KAZE features. In: European conference on computer vision. Springer, pp 214\u2013227","DOI":"10.1007\/978-3-642-33783-3_16"},{"key":"18126_CR83","unstructured":"Shi J et\u00a0al (1994) Good features to track. In: 1994 Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 593\u2013600"},{"key":"18126_CR84","doi-asserted-by":"crossref","unstructured":"Nist\u00e9r D, Stew\u00e9nius H (2008) Linear time maximally stable extremal regions. In: European conference on computer vision. Springer, pp 183\u2013196","DOI":"10.1007\/978-3-540-88688-4_14"},{"key":"18126_CR85","doi-asserted-by":"crossref","unstructured":"Rublee E, Rabaud V, Konolige K, Bradski G (2011) ORB: an efficient alternative to SIFT or SURF. In: 2011 International conference on computer vision. IEEE, pp 2564\u20132571","DOI":"10.1109\/ICCV.2011.6126544"},{"issue":"2","key":"18126_CR86","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","volume":"60","author":"DG Lowe","year":"2004","unstructured":"Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91\u2013110","journal-title":"Int J Comput Vis"},{"key":"18126_CR87","doi-asserted-by":"crossref","unstructured":"Perazzi F, Pont-Tuset J, McWilliams B, Van Gool L, Gross M, Sorkine-Hornung A (2016) A benchmark dataset and evaluation methodology for video object segmentation. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 724\u2013732","DOI":"10.1109\/CVPR.2016.85"},{"key":"18126_CR88","doi-asserted-by":"crossref","unstructured":"Farneb\u00e4ck G (2003) Two-frame motion estimation based on polynomial expansion. In: Scandinavian conference on Image analysis. Springer, pp 363\u2013370","DOI":"10.1007\/3-540-45103-X_50"},{"key":"18126_CR89","unstructured":"Lucas BD, Kanade T et\u00a0al (1981) An iterative image registration technique with an application to stereo vision, vol\u00a081"},{"issue":"3","key":"18126_CR90","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1023\/B:VISI.0000011205.11775.fd","volume":"56","author":"S Baker","year":"2004","unstructured":"Baker S, Matthews I (2004) Lucas-Kanade 20 years on: a unifying framework. Int J Comput Vis 56(3):221\u2013255","journal-title":"Int J Comput Vis"},{"key":"18126_CR91","doi-asserted-by":"crossref","unstructured":"Carreira J, Zisserman A (2017) Quo vadis, action recognition, a new model and the kinetics dataset. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6299\u20136308","DOI":"10.1109\/CVPR.2017.502"},{"key":"18126_CR92","doi-asserted-by":"crossref","unstructured":"Zheng Z, An G, Ruan Q (2020) Motion guided feature-augmented network for action recognition. In: 2020 15th IEEE international conference on signal processing (ICSP), vol\u00a01. IEEE, pp 391\u2013394","DOI":"10.1109\/ICSP48669.2020.9321026"},{"key":"18126_CR93","doi-asserted-by":"crossref","first-page":"57267","DOI":"10.1109\/ACCESS.2019.2910604","volume":"7","author":"E Chen","year":"2019","unstructured":"Chen E, Bai X, Gao L, Tinega HC, Ding Y (2019) A spatiotemporal heterogeneous two-stream network for action recognition. 
IEEE Access 7:57267\u201357275","journal-title":"IEEE Access"},{"key":"18126_CR94","doi-asserted-by":"crossref","DOI":"10.1016\/j.image.2019.115731","volume":"82","author":"N Yudistira","year":"2020","unstructured":"Yudistira N, Kurita T (2020) Correlation Net: spatiotemporal multimodal deep learning for action recognition. Signal Process Image Commun 82:115731","journal-title":"Signal Process Image Commun"},{"key":"18126_CR95","doi-asserted-by":"crossref","unstructured":"Gharahbagh AA, Hajihashemi V, Ferreira MC, Machado JJ, Tavares JMR (2022) Best frame selection to enhance training step efficiency in video-based human action recognition. Appl Sci 12(4):1830","DOI":"10.3390\/app12041830"},{"issue":"12","key":"18126_CR96","doi-asserted-by":"crossref","first-page":"2119","DOI":"10.1587\/transinf.2022EDP7058","volume":"105","author":"K Omi","year":"2022","unstructured":"Omi K, Kimata J, Tamaki T (2022) Model-agnostic multi-domain learning with domain-specific adapters for action recognition. IEICE Trans Inf Syst 105(12):2119\u20132126","journal-title":"IEICE Trans Inf Syst"},{"key":"18126_CR97","doi-asserted-by":"crossref","DOI":"10.1016\/j.cviu.2022.103406","volume":"219","author":"I Dave","year":"2022","unstructured":"Dave I, Gupta R, Rizve MN, Shah M (2022) TCLR: temporal contrastive learning for video representation. Comput Vis Image Understand 219:103406","journal-title":"Comput Vis Image Understand"},{"issue":"5","key":"18126_CR98","doi-asserted-by":"crossref","first-page":"813","DOI":"10.1109\/TETCI.2020.3014367","volume":"5","author":"SP Sahoo","year":"2020","unstructured":"Sahoo SP, Ari S, Mahapatra K, Mohanty SP (2020) HAR-depth: a novel framework for human action recognition using sequential learning and depth estimated history images. 
IEEE Trans Emerg Top Comput Intell 5(5):813\u2013825","journal-title":"IEEE Trans Emerg Top Comput Intell"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-024-18126-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-024-18126-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-024-18126-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,3]],"date-time":"2024-09-03T02:12:54Z","timestamp":1725329574000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-024-18126-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,14]]},"references-count":98,"journal-issue":{"issue":"30","published-online":{"date-parts":[[2024,9]]}},"alternative-id":["18126"],"URL":"https:\/\/doi.org\/10.1007\/s11042-024-18126-x","relation":{},"ISSN":["1573-7721"],"issn-type":[{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,14]]},"assertion":[{"value":"20 July 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 December 2023","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 January 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"14 February 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with 
ethical standards"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflicts of interest"}}]}}