{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T14:10:58Z","timestamp":1758636658591,"version":"3.37.3"},"reference-count":72,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2024,2,13]],"date-time":"2024-02-13T00:00:00Z","timestamp":1707782400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,13]],"date-time":"2024-02-13T00:00:00Z","timestamp":1707782400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61861032"],"award-info":[{"award-number":["61861032"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Convolutional neural networks (CNNs) have been the dominant architectures for feature extraction tasks, but CNNs do not look for and focus on some specific image features. Correlation operations play an important role in visual tracking. However, the correlation operation reserves a large amount of unfavorable background information. In this paper, we propose an effective feature recognizer including channel and spatial attention modules to focus on important object feature information. Thus, the representation power of the feature extraction network is improved. Further, we design a multi-scale feature fusion network. The fusion network performs feature fusion on template feature and encoded feature branches to establish connections between features at different scales. 
Experiments on six benchmarks demonstrate that the proposed tracker outperforms the state-of-the-art trackers. In particular, the proposed tracker achieves an 80.4% AUC on TrackingNet and a 68.4% AUC on GOT-10k while running at a real-time speed.<\/jats:p>","DOI":"10.1007\/s40747-024-01345-y","type":"journal-article","created":{"date-parts":[[2024,2,13]],"date-time":"2024-02-13T19:02:16Z","timestamp":1707850936000},"page":"3617-3632","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Exploiting multi-scale hierarchical feature representation for visual tracking"],"prefix":"10.1007","volume":"10","author":[{"given":"Jun","family":"Wang","sequence":"first","affiliation":[]},{"given":"Peng","family":"Yin","sequence":"additional","affiliation":[]},{"given":"Wenhui","family":"Yang","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6446-5873","authenticated-orcid":false,"given":"Yuanyun","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Shengqian","family":"Wang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,2,13]]},"reference":[{"issue":"3","key":"1345_CR1","doi-asserted-by":"publisher","first-page":"1403","DOI":"10.1109\/TCSVT.2021.3072207","volume":"32","author":"T Zhang","year":"2022","unstructured":"Zhang T, Liu X, Zhang Q, Han J (2022) Siamcda: complementarity- and distractor-aware rgb-t tracking based on siamese network. IEEE Trans Circ Syst Video Technol 32(3):1403\u20131417","journal-title":"IEEE Trans Circ Syst Video Technol"},{"key":"1345_CR2","doi-asserted-by":"crossref","unstructured":"Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PH (2016) Fully-convolutional siamese networks for object tracking, in: European conference on computer vision, Springer, pp. 
850\u2013865","DOI":"10.1007\/978-3-319-48881-3_56"},{"key":"1345_CR3","doi-asserted-by":"crossref","unstructured":"Li B, Yan J, Wu W, Zhu Z, Hu X (2018) High performance visual tracking with siamese region proposal network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8971\u20138980","DOI":"10.1109\/CVPR.2018.00935"},{"key":"1345_CR4","doi-asserted-by":"crossref","unstructured":"Guo D, Shao Y, Cui Y, Wang Z, Zhang L, Shen C (2021) Graph attention tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 9543\u20139552","DOI":"10.1109\/CVPR46437.2021.00942"},{"key":"1345_CR5","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"1345_CR6","doi-asserted-by":"crossref","unstructured":"Chen C-F, Fan Q, Panda R (2021) Crossvit: Cross-attention multi-scale vision transformer for image classification, arXiv preprint arXiv:2103.14899","DOI":"10.1109\/ICCV48922.2021.00041"},{"key":"1345_CR7","doi-asserted-by":"crossref","unstructured":"Fan H, Lin L, Yang F, Chu P, Deng G, Yu S, Bai H, Xu Y, Liao C, Ling H (2019) Lasot: A high-quality benchmark for large-scale single object tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 5374\u20135383","DOI":"10.1109\/CVPR.2019.00552"},{"key":"1345_CR8","unstructured":"Huang L, Zhao X, Huang K (2019) Got-10k: A large high-diversity benchmark for generic object tracking in the wild, IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"1345_CR9","doi-asserted-by":"crossref","unstructured":"Wu Y, Lim J, Yang M-H (2013) Online object tracking: A benchmark. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 
2411\u20132418","DOI":"10.1109\/CVPR.2013.312"},{"key":"1345_CR10","doi-asserted-by":"crossref","unstructured":"Mueller M, Smith N, Ghanem B (2016) A benchmark and simulator for uav tracking, in: European conference on computer vision, Springer, pp. 445\u2013461","DOI":"10.1007\/978-3-319-46448-0_27"},{"key":"1345_CR11","unstructured":"Kristan M, Leonardis A, Matas J, Felsberg M, Pflugfelder R, \u010cehovin\u00a0Zajc L, Vojir T, Bhat G, Lukezic A, Eldesokey A, et\u00a0al (2018) The sixth visual object tracking vot2018 challenge results. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops"},{"issue":"3","key":"1345_CR12","doi-asserted-by":"publisher","first-page":"1537","DOI":"10.1109\/TCSVT.2021.3077640","volume":"32","author":"X Li","year":"2022","unstructured":"Li X, Huang L, Wei Z (2022) A twofold convolutional regression tracking network with temporal and spatial mechanism. IEEE Trans Circ Syst Video Technol 32(3):1537\u20131551","journal-title":"IEEE Trans Circ Syst Video Technol"},{"key":"1345_CR13","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2023.110380","volume":"265","author":"Y Wang","year":"2023","unstructured":"Wang Y, Zhang W, Lai C, Wang J (2023) Adaptive temporal feature modeling for visual tracking via cross-channel learning. Knowl-Based Syst 265:110380","journal-title":"Knowl-Based Syst"},{"key":"1345_CR14","doi-asserted-by":"crossref","unstructured":"Guo Q, Feng W, Zhou C, Huang R, Wan L, Wang S (2017) Learning dynamic siamese network for visual object tracking. In: Proceedings of the IEEE international conference on computer vision, pp. 1763\u20131771","DOI":"10.1109\/ICCV.2017.196"},{"key":"1345_CR15","doi-asserted-by":"crossref","unstructured":"He A, Luo C, Tian X, Zeng W (2018) A twofold siamese network for real-time object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 
4834\u20134843","DOI":"10.1109\/CVPR.2018.00508"},{"key":"1345_CR16","doi-asserted-by":"crossref","unstructured":"Zhu Z, Wang Q, Li B, Wu W, Yan J, Hu W (2018) Distractor-aware siamese networks for visual object tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 101\u2013117","DOI":"10.1007\/978-3-030-01240-3_7"},{"key":"1345_CR17","doi-asserted-by":"crossref","unstructured":"Fan H, Ling H (2019) Siamese cascaded region proposal networks for real-time visual tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 7952\u20137961","DOI":"10.1109\/CVPR.2019.00814"},{"key":"1345_CR18","doi-asserted-by":"crossref","unstructured":"Chen Z, Zhong B, Li G, Zhang S, Ji R (2020) Siamese box adaptive network for visual tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 6668\u20136677","DOI":"10.1109\/CVPR42600.2020.00670"},{"key":"1345_CR19","first-page":"12549","volume":"34","author":"Y Xu","year":"2020","unstructured":"Xu Y, Wang Z, Li Z, Yuan Y, Yu G (2020) Siamfc++: towards robust and accurate visual tracking with target estimation guidelines. Proc AAAI Conf Artificial Intell 34:12549\u201312556","journal-title":"Proc AAAI Conf Artificial Intell"},{"key":"1345_CR20","doi-asserted-by":"crossref","unstructured":"Guo D, Wang J, Cui Y, Wang Z, Chen S (2020) Siamcar: Siamese fully convolutional classification and regression for visual tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 6269\u20136277","DOI":"10.1109\/CVPR42600.2020.00630"},{"key":"1345_CR21","doi-asserted-by":"crossref","unstructured":"Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 
7132\u20137141","DOI":"10.1109\/CVPR.2018.00745"},{"issue":"4","key":"1345_CR22","doi-asserted-by":"publisher","first-page":"783","DOI":"10.1007\/s11263-019-01283-0","volume":"128","author":"J Park","year":"2020","unstructured":"Park J, Woo S, Lee J-Y, Kweon IS (2020) A simple and light-weight attention module for convolutional neural networks. Int J Comput Vis 128(4):783\u2013798","journal-title":"Int J Comput Vis"},{"key":"1345_CR23","doi-asserted-by":"crossref","unstructured":"Yang Z, Zhu L, Wu Y, Yang Y (2020) Gated channel transformation for visual recognition, in: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 11794\u201311803","DOI":"10.1109\/CVPR42600.2020.01181"},{"key":"1345_CR24","doi-asserted-by":"crossref","unstructured":"Fan J, Wu Y, Dai S (2010) Discriminative spatial attention for robust tracking. In: European Conference on computer vision, Springer, pp. 480\u2013493","DOI":"10.1007\/978-3-642-15549-9_35"},{"key":"1345_CR25","doi-asserted-by":"crossref","unstructured":"Choi J, Jin\u00a0Chang H, Yun S, Fischer T, Demiris Y, Young\u00a0Choi J (2017) Attentional correlation filter network for adaptive visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4807\u20134816","DOI":"10.1109\/CVPR.2017.513"},{"key":"1345_CR26","doi-asserted-by":"crossref","unstructured":"Lukezic A, Vojir T, \u010cehovin\u00a0Zajc L, Matas J, Kristan M (2017) Discriminative correlation filter with channel and spatial reliability. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 6309\u20136318","DOI":"10.1109\/CVPR.2017.515"},{"key":"1345_CR27","doi-asserted-by":"crossref","unstructured":"Wang Q, Teng Z, Xing J, Gao J, Hu W, Maybank S (2018) Learning attentions: residual attentional siamese network for high performance online visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 
4854\u20134863","DOI":"10.1109\/CVPR.2018.00510"},{"key":"1345_CR28","doi-asserted-by":"crossref","unstructured":"Yu Y, Xiong Y, Huang W, Scott MR (2020) Deformable siamese attention networks for visual object tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 6728\u20136737","DOI":"10.1109\/CVPR42600.2020.00676"},{"key":"1345_CR29","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp. 5998\u20136008"},{"key":"1345_CR30","doi-asserted-by":"crossref","unstructured":"Cui Y, Jiang C, Wang L, Wu G (2022) Mixformer: End-to-end tracking with iterative mixed attention, in: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608\u201313618","DOI":"10.1109\/CVPR52688.2022.01324"},{"key":"1345_CR31","doi-asserted-by":"crossref","unstructured":"Wang N, Zhou W, Wang J, Li H (2021) Transformer meets tracker: Exploiting temporal context for robust visual tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 1571\u20131580","DOI":"10.1109\/CVPR46437.2021.00162"},{"key":"1345_CR32","doi-asserted-by":"crossref","unstructured":"Chen X, Yan B, Zhu J, Wang D, Yang X, Lu H (2021) Transformer tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 8126\u20138135","DOI":"10.1109\/CVPR46437.2021.00803"},{"key":"1345_CR33","doi-asserted-by":"crossref","unstructured":"Cao Z, Fu C, Ye J, Li B, Li Y (2021) Hift: Hierarchical feature transformer for aerial tracking, in: Proceedings of the IEEE\/CVF international conference on computer vision, pp. 
15457\u201315466","DOI":"10.1109\/ICCV48922.2021.01517"},{"key":"1345_CR34","unstructured":"Lin L, Fan H, Xu Y, Ling H (2021) Swintrack: A simple and strong baseline for transformer tracking, arXiv preprint arXiv:2112.00995"},{"key":"1345_CR35","doi-asserted-by":"crossref","unstructured":"Xie F, Wang C, Wang G, Yang W, Zeng W (2021) Learning tracking representations via dual-branch fully transformer networks. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp. 2688\u20132697","DOI":"10.1109\/ICCVW54120.2021.00303"},{"key":"1345_CR36","doi-asserted-by":"crossref","unstructured":"Xie F, Wang C, Wang G, Cao Y, Yang W, Zeng W (2022) Correlation-aware deep tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 8751\u20138760","DOI":"10.1109\/CVPR52688.2022.00855"},{"key":"1345_CR37","unstructured":"Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser \u0141, Polosukhin I (2017) Attention is all you need, Advances in neural information processing systems 30"},{"key":"1345_CR38","unstructured":"Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556"},{"key":"1345_CR39","first-page":"1097","volume":"25","author":"A Krizhevsky","year":"2012","unstructured":"Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inform Process Syst 25:1097\u20131105","journal-title":"Adv Neural Inform Process Syst"},{"key":"1345_CR40","unstructured":"Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning, PMLR, pp. 448\u2013456"},{"key":"1345_CR41","doi-asserted-by":"crossref","unstructured":"Muller M, Bibi A, Giancola S, Alsubaihi S, Ghanem B (2018) Trackingnet: A large-scale dataset and benchmark for object tracking in the wild. 
In: Proceedings of the European conference on computer vision (ECCV), pp. 300\u2013317","DOI":"10.1007\/978-3-030-01246-5_19"},{"key":"1345_CR42","doi-asserted-by":"crossref","unstructured":"Kiani\u00a0Galoogahi H, Fagg A, Huang C, Ramanan D, Lucey S (2017) Need for speed: A benchmark for higher frame rate object tracking. In: Proceedings of the IEEE international conference on computer vision, pp. 1125\u20131134","DOI":"10.1109\/ICCV.2017.128"},{"key":"1345_CR43","doi-asserted-by":"crossref","unstructured":"Danelljan M, Bhat G, Khan FS, Felsberg M (2019) Atom: Accurate tracking by overlap maximization, in: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 4660\u20134669","DOI":"10.1109\/CVPR.2019.00479"},{"key":"1345_CR44","doi-asserted-by":"crossref","unstructured":"Li B, Wu W, Wang Q, Zhang F, Xing J, Yan J (2019) Siamrpn++: Evolution of siamese visual tracking with very deep networks. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 4282\u20134291","DOI":"10.1109\/CVPR.2019.00441"},{"key":"1345_CR45","doi-asserted-by":"crossref","unstructured":"Bhat G, Danelljan M, Gool LV, Timofte R (2019) Learning discriminative model prediction for tracking. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp. 6182\u20136191","DOI":"10.1109\/ICCV.2019.00628"},{"key":"1345_CR46","doi-asserted-by":"crossref","unstructured":"Mayer C, Danelljan M, Paudel DP, Van\u00a0Gool L (2021) Learning target candidate association to keep track of what not to track. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp. 13444\u201313454","DOI":"10.1109\/ICCV48922.2021.01319"},{"key":"1345_CR47","doi-asserted-by":"crossref","unstructured":"Blatter P, Kanakis M, Danelljan M, Van\u00a0Gool L (2023) Efficient visual tracking with exemplar transformers. In: Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, pp. 
1571\u20131581","DOI":"10.1109\/WACV56688.2023.00162"},{"key":"1345_CR48","doi-asserted-by":"crossref","unstructured":"Mayer C, Danelljan M, Bhat G, Paul M, Paudel DP, Yu F, Van\u00a0Gool L (2022) Transforming model prediction for tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 8731\u20138740","DOI":"10.1109\/CVPR52688.2022.00853"},{"key":"1345_CR49","doi-asserted-by":"crossref","unstructured":"Dong X, Shen J, Shao L, Porikli F (2020) Clnet: A compact latent network for fast adjusting siamese trackers. In: Computer vision\u2013ECCV 2020: 16th European conference, Glasgow, UK, August 23\u201328 Proceedings, Part XX 16, Springer, 2020, pp. 378\u2013395","DOI":"10.1007\/978-3-030-58565-5_23"},{"key":"1345_CR50","doi-asserted-by":"crossref","unstructured":"Fu Z, Liu Q, Fu Z, Wang Y (2021) Stmtrack: Template-free visual tracking with space-time memory networks. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 13774\u201313783","DOI":"10.1109\/CVPR46437.2021.01356"},{"issue":"11","key":"1345_CR51","doi-asserted-by":"publisher","first-page":"5596","DOI":"10.1109\/TIP.2019.2919201","volume":"28","author":"T Xu","year":"2019","unstructured":"Xu T, Feng Z-H, Wu X-J, Kittler J (2019) Learning adaptive discriminative correlation filters via temporal consistency preserving spatial feature selection for robust visual object tracking. IEEE Trans Image Process 28(11):5596\u20135609","journal-title":"IEEE Trans Image Process"},{"key":"1345_CR52","doi-asserted-by":"crossref","unstructured":"Wang Q, Zhang L, Bertinetto L, Hu W, Torr PH (2019) Fast online object tracking and segmentation: A unifying approach. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 
1328\u20131338","DOI":"10.1109\/CVPR.2019.00142"},{"key":"1345_CR53","doi-asserted-by":"crossref","unstructured":"Bhat G, Johnander J, Danelljan M, Khan FS, Felsberg M (2018) Unveiling the power of deep tracking. In: Proceedings of the European conference on computer vision (ECCV), pp. 483\u2013498","DOI":"10.1007\/978-3-030-01216-8_30"},{"key":"1345_CR54","doi-asserted-by":"crossref","unstructured":"He Z, Fan Y, Zhuang J, Dong Y, Bai H (2017) Correlation filters with weighted convolution responses. In: Proceedings of the IEEE international conference on computer vision workshops, pp. 1992\u20132000","DOI":"10.1109\/ICCVW.2017.233"},{"key":"1345_CR55","doi-asserted-by":"crossref","unstructured":"Li F, Tian C, Zuo W, Zhang L, Yang M-H (2018) Learning spatial-temporal regularized correlation filters for visual tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4904\u20134913","DOI":"10.1109\/CVPR.2018.00515"},{"key":"1345_CR56","doi-asserted-by":"crossref","unstructured":"Che M, Wang R, Lu Y, Li Y, Zhi H, Xiong C (2018) Channel pruning for visual tracking. In: Proceedings of the European conference on computer vision (ECCV) Workshops,","DOI":"10.1007\/978-3-030-11009-3_3"},{"key":"1345_CR57","doi-asserted-by":"crossref","unstructured":"He A, Luo C, Tian X, Zeng W (2018) Towards a better match in siamese network based visual object tracker. in: Proceedings of the European conference on computer vision (ECCV) workshops","DOI":"10.1007\/978-3-030-11009-3_7"},{"key":"1345_CR58","doi-asserted-by":"crossref","unstructured":"Sun C, Wang D, Lu H, Yang M-H (2018) Correlation tracking via joint discrimination and reliability learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 
489\u2013497","DOI":"10.1109\/CVPR.2018.00058"},{"key":"1345_CR59","doi-asserted-by":"crossref","unstructured":"Sun C, Wang D, Lu H, Yang M-H (2018) Learning spatial-aware regressions for visual tracking, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8962\u20138970","DOI":"10.1109\/CVPR.2018.00934"},{"key":"1345_CR60","doi-asserted-by":"crossref","unstructured":"Danelljan M, Bhat G, Shahbaz\u00a0Khan F, Felsberg M (2017) Eco: Efficient convolution operators for tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 6638\u20136646","DOI":"10.1109\/CVPR.2017.733"},{"key":"1345_CR61","doi-asserted-by":"crossref","unstructured":"Danelljan M, Robinson A, Khan FS, Felsberg M (2016) Beyond correlation filters: Learning continuous convolution operators for visual tracking. In: European conference on computer vision, Springer, pp. 472\u2013488","DOI":"10.1007\/978-3-319-46454-1_29"},{"key":"1345_CR62","doi-asserted-by":"crossref","unstructured":"Bhat G, Danelljan M, Van\u00a0Gool L, Timofte R (2020) Know your surroundings: Exploiting scene information for object tracking. In: European conference on computer vision, Springer, pp. 205\u2013221","DOI":"10.1007\/978-3-030-58592-1_13"},{"key":"1345_CR63","doi-asserted-by":"crossref","unstructured":"Danelljan M, Gool LV, Timofte R (2020) Probabilistic regression for visual tracking, in: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 7183\u20137192","DOI":"10.1109\/CVPR42600.2020.00721"},{"key":"1345_CR64","doi-asserted-by":"crossref","unstructured":"Lukezic A, Matas J, Kristan M (2020) D3s-a discriminative single shot segmentation tracker, in: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 
7133\u20137142","DOI":"10.1109\/CVPR42600.2020.00716"},{"key":"1345_CR65","doi-asserted-by":"crossref","unstructured":"Zhang Z, Peng H, Fu J, Li B, Hu W (2020) Ocean: Object-aware anchor-free tracking. In: Computer vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXI 16, Springer, pp. 771\u2013787","DOI":"10.1007\/978-3-030-58589-1_46"},{"key":"1345_CR66","doi-asserted-by":"crossref","unstructured":"Wang G, Luo C, Xiong Z, Zeng W (2019) Spm-tracker: series-parallel matching for real-time visual object tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 3643\u20133652","DOI":"10.1109\/CVPR.2019.00376"},{"key":"1345_CR67","first-page":"11037","volume":"34","author":"L Huang","year":"2020","unstructured":"Huang L, Zhao X, Huang K (2020) Globaltrack: a simple and strong baseline for long-term tracking. Proc AAAI Conf Artificial Intell 34:11037\u201311044","journal-title":"Proc AAAI Conf Artificial Intell"},{"key":"1345_CR68","doi-asserted-by":"crossref","unstructured":"Ma F, Shou MZ, Zhu L, Fan H, Xu Y, Yang Y, Yan Z (2022) Unified transformer tracker for object tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 8781\u20138790","DOI":"10.1109\/CVPR52688.2022.00858"},{"key":"1345_CR69","unstructured":"Zhao M, Okada K, Inaba M (2021) Trtr: Visual tracking with transformer, arXiv preprint arXiv:2105.03817"},{"key":"1345_CR70","unstructured":"Cui Y, Jiang C, Wang L, Wu G (2021) Target transformed regression for accurate tracking, arXiv preprint arXiv:2104.00403"},{"key":"1345_CR71","doi-asserted-by":"crossref","unstructured":"Shen Q, Qiao L, Guo J, Li P, Li X, Li B, Feng W, Gan W, Wu W, Ouyang W (2022) Unsupervised learning of accurate siamese tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 
8101\u20138110","DOI":"10.1109\/CVPR52688.2022.00793"},{"key":"1345_CR72","doi-asserted-by":"crossref","unstructured":"Zheng J, Ma C, Peng H, Yang X (2021) Learning to track objects from unlabeled videos, in: Proceedings of the IEEE\/CVF international conference on computer vision, pp. 13546\u201313555","DOI":"10.1109\/ICCV48922.2021.01329"}],"container-title":["Complex & Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01345-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-024-01345-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01345-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T18:16:27Z","timestamp":1715883387000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-024-01345-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,13]]},"references-count":72,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,6]]}},"alternative-id":["1345"],"URL":"https:\/\/doi.org\/10.1007\/s40747-024-01345-y","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"type":"print","value":"2199-4536"},{"type":"electronic","value":"2198-6053"}],"subject":[],"published":{"date-parts":[[2024,2,13]]},"assertion":[{"value":"17 January 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 December 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}