{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T20:29:51Z","timestamp":1757622591870,"version":"3.44.0"},"reference-count":89,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2025,8,6]],"date-time":"2025-08-06T00:00:00Z","timestamp":1754438400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,6]],"date-time":"2025-08-06T00:00:00Z","timestamp":1754438400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004479","name":"Jiangxi Provincial Natural Science Foundation","doi-asserted-by":"crossref","award":["20242BAB25058","20242BAB25075"],"award-info":[{"award-number":["20242BAB25058","20242BAB25075"]}],"id":[{"id":"10.13039\/501100004479","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Siamese-based trackers achieve much progress in accuracy and tracking speeds, which use the cross-correlation to compute the target similarity. However, these trackers based on cross-correlation ignore the spatial layout of feature maps and the correspondences of the features between the template and search regions. And these trackers based on cross-correlation either lose lots of foreground information or retain numerous background information due to pre-fixed feature regions. We design a Kronecker Product Matching based feature fusion network for establishing the spatial region correspondences from the target templates to search images. The spatial information of template features and search region features are obtained, which is helpful to obtain more accurate target similarity. In addition, a normalization based attention module is introduced in the template branch to suppress less salient features and background information. By integrating the designed feature fusion network and the attention module, a novel tracking algorithm is proposed in the Siamese tracking framework. Extensive experiments on six challenging benchmarks including OTB-100, GOT-10k, LaSOT, UAV123, VOT2018 and NFS demonstrate the generalization ability and effectiveness of the proposed tracker. In particular, the proposed tracker achieves the AUC score of 63.7% on GOT-10k, 62.6% on UAV123 and precision score of 89.6% on OTB-100, while running at 40 frames per second (FPS).<\/jats:p>","DOI":"10.1007\/s40747-025-02031-3","type":"journal-article","created":{"date-parts":[[2025,8,6]],"date-time":"2025-08-06T06:41:35Z","timestamp":1754462495000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Spatial correspondence matching based feature fusion for object tracking with attention learning"],"prefix":"10.1007","volume":"11","author":[{"given":"Yuanyun","family":"Wang","sequence":"first","affiliation":[]},{"given":"Wenhui","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Lingtao","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Shenmiao","family":"Jin","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6750-5105","authenticated-orcid":false,"given":"Jun","family":"Wang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,8,6]]},"reference":[{"key":"2031_CR1","doi-asserted-by":"crossref","unstructured":"Wu Y, Lim J, Yang M-H (2013) Online object tracking: A benchmark. Proc IEEE Conf Comput Vis Pattern Recognit 2411\u20132418","DOI":"10.1109\/CVPR.2013.312"},{"issue":"5","key":"2031_CR2","first-page":"2386","volume":"44","author":"X Lu","year":"2020","unstructured":"Lu X, Ma C, Shen J, Yang X, Reid I, Yang M-H (2020) Deep object tracking with shrinkage loss. IEEE Trans Pattern Anal Mach Intell 44(5):2386\u20132401","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"109107","key":"2031_CR3","first-page":"1","volume":"135","author":"T Mandel","year":"2023","unstructured":"Mandel T, Jimenez M, Risley E, Nammoto T, Williams R, Panoff M, Ballesteros M, Suarez B (2023) Detection confidence driven multi-object tracking to recover reliable tracks from unreliable detections. Pattern Recogn 135(109107):1\u201313","journal-title":"Pattern Recogn"},{"key":"2031_CR4","unstructured":"Wang Y-H, Hsieh J-W, Chen P-Y, Chang M-C, So HH, Li X (2024) Smiletrack: Similarity learning for occlusion-aware multiple object tracking. arxiv:2211.08824"},{"key":"2031_CR5","doi-asserted-by":"crossref","unstructured":"Bertinetto L, Valmadre J, Henriques JF, Vedaldi A, Torr PH (2016) Fully-convolutional siamese networks for object tracking, in: European conference on computer vision, Springer, pp. 850\u2013865","DOI":"10.1007\/978-3-319-48881-3_56"},{"key":"2031_CR6","doi-asserted-by":"crossref","unstructured":"Li B, Yan J, Wu W, Zhu Z, Hu X (2018) High performance visual tracking with siamese region proposal network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 8971\u20138980","DOI":"10.1109\/CVPR.2018.00935"},{"issue":"3","key":"2031_CR7","doi-asserted-by":"publisher","first-page":"1403","DOI":"10.1109\/TCSVT.2021.3072207","volume":"32","author":"T Zhang","year":"2021","unstructured":"Zhang T, Liu X, Zhang Q, Han J (2021) Siamcda: complementarity-and distractor-aware rgb-t tracking based on siamese network. IEEE Trans Circuits Syst Video Technol 32(3):1403\u20131417","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"2031_CR8","doi-asserted-by":"crossref","unstructured":"Xu Y, Wang Z, Li Z, Yuan Y, Yu G (2020) Siamfc++: Towards robust and accurate visual tracking with target estimation guidelines. In: Proceedings of the AAAI Conference on Artificial Intelligence, 34, 12549\u201312556","DOI":"10.1609\/aaai.v34i07.6944"},{"issue":"1","key":"2031_CR9","doi-asserted-by":"publisher","first-page":"315","DOI":"10.1109\/TCSVT.2020.2978194","volume":"31","author":"Y Shan","year":"2020","unstructured":"Shan Y, Zhou X, Liu S, Zhang Y, Huang K (2020) Siamfpn: A deep learning method for accurate and real-time maritime ship tracking. IEEE Trans Circuits Syst Video Technol 31(1):315\u2013325","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"2031_CR10","doi-asserted-by":"crossref","unstructured":"Guo D, Wang J, Cui Y, Wang Z, Chen S (2020) Siamcar: Siamese fully convolutional classification and regression for visual tracking, in: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 6269\u20136277","DOI":"10.1109\/CVPR42600.2020.00630"},{"key":"2031_CR11","doi-asserted-by":"crossref","unstructured":"Shen Y, Xiao T, Li H, Yi S, Wang X (2018) End-to-end deep kronecker-product matching for person re-identification, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 6886\u20136895","DOI":"10.1109\/CVPR.2018.00720"},{"key":"2031_CR12","unstructured":"Liu Y, Shao Z, Teng Y, Hoffmann N (2021) Nam: Normalization-based attention module, arXiv preprint arXiv:2111.12419"},{"key":"2031_CR13","doi-asserted-by":"crossref","unstructured":"Fan H, Lin L, Yang F, Chu P, Deng G, Yu S, Bai H, Xu Y, Liao C, Ling H (2019) Lasot: A high-quality benchmark for large-scale single object tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 5374\u20135383","DOI":"10.1109\/CVPR.2019.00552"},{"key":"2031_CR14","unstructured":"Huang L, Zhao X, Huang K (2019) Got-10k: A large high-diversity benchmark for generic object tracking in the wild. IEEE Trans Pattern Anal Mach Intell"},{"key":"2031_CR15","doi-asserted-by":"crossref","unstructured":"Mueller M, Smith N, Ghanem B (2016) A benchmark and simulator for uav tracking. European conference on computer vision Springer 445\u2013461","DOI":"10.1007\/978-3-319-46448-0_27"},{"key":"2031_CR16","unstructured":"Kristan M, Leonardis A, Matas J, Felsberg M, Pflugfelder R, \u010cehovin Zajc L, Vojir T, Bht, G, Lukezic A, Eldesokey A et al (2018) The sixth visual object tracking vot2018 challenge results. in: Proceedings of the European Conference on Computer Vision (ECCV) Workshops"},{"key":"2031_CR17","doi-asserted-by":"crossref","unstructured":"Kiani Galoogahi H, Fagg A, Huang C, Ramanan D, Lucey S (2017) Need for speed: A benchmark for higher frame rate object tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1125\u20131134","DOI":"10.1109\/ICCV.2017.128"},{"issue":"4","key":"2031_CR18","doi-asserted-by":"publisher","first-page":"1296","DOI":"10.1109\/TCSVT.2020.2987601","volume":"31","author":"J Fan","year":"2020","unstructured":"Fan J, Song H, Zhang K, Yang K, Liu Q (2020) Feature alignment and aggregation siamese networks for fast visual tracking. IEEE Trans Circuits Syst Video Technol 31(4):1296\u20131307","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"2031_CR19","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2023.110380","volume":"265","author":"Y Wang","year":"2023","unstructured":"Wang Y, Zhang W, Lai C, Wang J (2023) Adaptive temporal feature modeling for visual tracking via cross-channel learning. Knowl-Based Syst 265:110380","journal-title":"Knowl-Based Syst"},{"key":"2031_CR20","doi-asserted-by":"crossref","unstructured":"Wang Q, Zhang L, Bertinetto L, Hu W, Torr PH (2019) Fast online object tracking and segmentation: A unifying approach. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 1328\u20131338","DOI":"10.1109\/CVPR.2019.00142"},{"key":"2031_CR21","doi-asserted-by":"crossref","unstructured":"Chen Z, Zhong B, Li G, Zhang S, Ji R (2020) Siamese box adaptive network for visual tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, 6668\u20136677","DOI":"10.1109\/CVPR42600.2020.00670"},{"key":"2031_CR22","doi-asserted-by":"crossref","unstructured":"Valmadre J, Bertinetto L, Henriques J, Vedaldi A, Torr PH (2017) End-to-end representation learning for correlation filter based tracking. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2805\u20132813","DOI":"10.1109\/CVPR.2017.531"},{"issue":"12","key":"2031_CR23","doi-asserted-by":"publisher","first-page":"8896","DOI":"10.1109\/TPAMI.2021.3127492","volume":"44","author":"J Shen","year":"2021","unstructured":"Shen J, Liu Y, Dong X, Lu X, Khan FS, Hoi S (2021) Distilled siamese networks for visual tracking. IEEE Trans Pattern Anal Mach Intell 44(12):8896\u20138909","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2031_CR24","doi-asserted-by":"crossref","unstructured":"Guo Q, Feng W, Zhou C, Huang R, Wan L, Wang S (2017) Learning dynamic siamese network for visual object tracking. In: Proceedings of the IEEE international conference on computer vision, pp. 1763\u20131771","DOI":"10.1109\/ICCV.2017.196"},{"key":"2031_CR25","doi-asserted-by":"crossref","unstructured":"He A, Luo C, Tian X, Zeng W (2018) A twofold siamese network for real-time object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4834\u20134843","DOI":"10.1109\/CVPR.2018.00508"},{"key":"2031_CR26","doi-asserted-by":"publisher","first-page":"3351","DOI":"10.1109\/TIP.2019.2959256","volume":"29","author":"Z Liang","year":"2019","unstructured":"Liang Z, Shen J (2019) Local semantic siamese networks for fast tracking. IEEE Trans Image Process 29:3351\u20133364","journal-title":"IEEE Trans Image Process"},{"issue":"2","key":"2031_CR27","doi-asserted-by":"publisher","first-page":"509","DOI":"10.1109\/TCSVT.2019.2892759","volume":"30","author":"D Li","year":"2019","unstructured":"Li D, Porikli F, Wen G, Kuai Y (2019) When correlation filters meet siamese networks for real-time complementary tracking. IEEE Trans Circ Syst Video Technol 30(2):509\u2013519","journal-title":"IEEE Trans Circ Syst Video Technol"},{"issue":"8","key":"2031_CR28","doi-asserted-by":"publisher","first-page":"4069","DOI":"10.1109\/TIV.2023.3282567","volume":"8","author":"Z Meng","year":"2023","unstructured":"Meng Z, Xia X, Xu R, Liu W, Ma J (2023) Hydro-3d: Hybrid object detection and tracking for cooperative perception using 3d lidar. IEEE Trans Intell Vehicles 8(8):4069\u20134080","journal-title":"IEEE Trans Intell Vehicles"},{"key":"2031_CR29","doi-asserted-by":"crossref","unstructured":"Du F, Liu P, Zhao W, Tang X (2020) Correlation-guided attention for corner detection based visual tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 6836\u20136845","DOI":"10.1109\/CVPR42600.2020.00687"},{"key":"2031_CR30","doi-asserted-by":"crossref","unstructured":"Voigtlaender P, Luiten J, Torr PH, Leibe B (2020) Siam r-cnn: Visual tracking by re-detection. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 6578\u20136588","DOI":"10.1109\/CVPR42600.2020.00661"},{"key":"2031_CR31","doi-asserted-by":"crossref","unstructured":"Zhu Z, Wang Q, Li B, Wu W, Yan J, Hu W (2018) Distractor-aware siamese networks for visual object tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), 101\u2013117","DOI":"10.1007\/978-3-030-01240-3_7"},{"key":"2031_CR32","doi-asserted-by":"crossref","unstructured":"Fan H, Ling H (2019) Siamese cascaded region proposal networks for real-time visual tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 7952\u20137961","DOI":"10.1109\/CVPR.2019.00814"},{"issue":"5","key":"2031_CR33","doi-asserted-by":"publisher","first-page":"1515","DOI":"10.1109\/TPAMI.2019.2956703","volume":"43","author":"X Dong","year":"2019","unstructured":"Dong X, Shen J, Wang W, Shao L, Ling H, Porikli F (2019) Dynamical hyperparameter optimization via deep reinforcement learning in tracking. IEEE Trans Pattern Anal Mach Intell 43(5):1515\u20131529","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"2031_CR34","doi-asserted-by":"crossref","unstructured":"Li B, Wu W, Wang Q, Zhang F, Xing J, Yan J (2019) Siamrpn++: Evolution of siamese visual tracking with very deep networks. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 4282\u20134291","DOI":"10.1109\/CVPR.2019.00441"},{"key":"2031_CR35","first-page":"1097","volume":"25","author":"A Krizhevsky","year":"2012","unstructured":"Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25:1097\u20131105","journal-title":"Adv Neural Inf Process Syst"},{"key":"2031_CR36","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"2031_CR37","doi-asserted-by":"crossref","unstructured":"Zhang Z, Peng H (2019) Deeper and wider siamese networks for real-time visual tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pp. 4591\u20134600","DOI":"10.1109\/CVPR.2019.00472"},{"key":"2031_CR38","doi-asserted-by":"crossref","unstructured":"Zhang Z, Peng H, Fu J, Li B, Hu W (2020) Ocean: Object-aware anchor-free tracking. In: Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XXI 16, Springer, pp. 771\u2013787","DOI":"10.1007\/978-3-030-58589-1_46"},{"key":"2031_CR39","doi-asserted-by":"crossref","unstructured":"Han W, Dong X, Khan FS, Shao L, Shen J (2021) Learning to fuse asymmetric feature maps in siamese trackers. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp. 16570\u201316580","DOI":"10.1109\/CVPR46437.2021.01630"},{"key":"2031_CR40","doi-asserted-by":"crossref","unstructured":"Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 7132\u20137141","DOI":"10.1109\/CVPR.2018.00745"},{"key":"2031_CR41","doi-asserted-by":"crossref","unstructured":"Wang X, Girshick R, Gupta A, He K (2018) Non-local neural networks. Proc IEEE Conf Comput Vis Pattern Recognit 7794\u20137803","DOI":"10.1109\/CVPR.2018.00813"},{"key":"2031_CR42","doi-asserted-by":"crossref","unstructured":"Fu J, Liu J, Tian H, Li Y, Bao Y, Fang Z, Lu H (2019) Dual attention network for scene segmentation. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 3146\u20133154","DOI":"10.1109\/CVPR.2019.00326"},{"key":"2031_CR43","doi-asserted-by":"crossref","unstructured":"Choi J, Jin Chang H, Yun S, Fischer T, Demiris Y, Young Choi J (2017) Attentional correlation filter network for adaptive visual tracking. Proc IEEE Conf Comput Vis Pattern Recognit 4807\u20134816","DOI":"10.1109\/CVPR.2017.513"},{"key":"2031_CR44","doi-asserted-by":"crossref","unstructured":"Wang Q, Teng Z, Xing J, Gao J, Hu W, Maybank S (2018) Learning attentions: residual attentional siamese network for high performance online visual tracking. Proc IEEE Conf Comput Vis Pattern Recognit 4854\u20134863","DOI":"10.1109\/CVPR.2018.00510"},{"key":"2031_CR45","doi-asserted-by":"crossref","unstructured":"Yu Y, Xiong Y, Huang W, Scott MR (2020) Deformable siamese attention networks for visual object tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 6728\u20136737","DOI":"10.1109\/CVPR42600.2020.00676"},{"key":"2031_CR46","doi-asserted-by":"crossref","unstructured":"Dong X, Shi P, Liang T, Yang A (2024) Ctaffnet: Cnn-transformer adaptive feature fusion object detection algorithm for complex traffic scenarios. Transportation Research Record Journal of the Transportation Research Board","DOI":"10.1177\/03611981241258753"},{"key":"2031_CR47","doi-asserted-by":"publisher","DOI":"10.1016\/j.displa.2024.102814","volume":"84","author":"X Dong","year":"2024","unstructured":"Dong X, Shi P, Qi H, Yang A, Liang T (2024) Ts-bev: Bev object detection algorithm based on temporal-spatial feature fusion. Displays 84:102814","journal-title":"Displays"},{"issue":"8","key":"2031_CR48","doi-asserted-by":"publisher","first-page":"3154","DOI":"10.1109\/TCSVT.2020.3037947","volume":"31","author":"M Jiang","year":"2020","unstructured":"Jiang M, Zhao Y, Kong J (2020) Mutual learning and feature fusion siamese networks for visual object tracking. IEEE Trans Circuits Syst Video Technol 31(8):3154\u20133167","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"issue":"7","key":"2031_CR49","doi-asserted-by":"publisher","first-page":"3068","DOI":"10.1109\/TCYB.2019.2936503","volume":"50","author":"J Shen","year":"2019","unstructured":"Shen J, Tang X, Dong X, Shao L (2019) Visual object tracking by hierarchical attention siamese network. IEEE Trans Cybern 50(7):3068\u20133080","journal-title":"IEEE Trans Cybern"},{"key":"2031_CR50","unstructured":"Zhou X, Guo P, Hong L, Li J, Zhang W, Ge W, Zhang W (2024) Reading relevant feature from global representation memory for visual object tracking arxiv:2402.14392"},{"key":"2031_CR51","doi-asserted-by":"crossref","unstructured":"Chu P, Wang J, You Q, Ling H, Liu Z (2023) Transmot: Spatial-temporal graph transformer for multiple object tracking. In: Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision 4870\u20134880","DOI":"10.1109\/WACV56688.2023.00485"},{"issue":"3","key":"2031_CR52","doi-asserted-by":"publisher","first-page":"2426","DOI":"10.1109\/TIV.2022.3216102","volume":"8","author":"Y Ma","year":"2022","unstructured":"Ma Y, Zhang J, Qin G, Jin J, Zhang K, Pan D, Chen M (2022) 3d multi-object tracking based on dual-tracker and d-s evidence theory. IEEE Trans Intell Vehicles 8(3):2426\u20132436","journal-title":"IEEE Trans Intell Vehicles"},{"key":"2031_CR53","doi-asserted-by":"crossref","unstructured":"Liu J, Bai L, Xia Y, Huang T, Zhu B, Han QL (2023) Gnn-pmb: A simple but effective online 3d multi-object tracker without bells and whistles. IEEE Transactions on Intelligent Vehicles 1\u201316","DOI":"10.1109\/TIV.2022.3217490"},{"key":"2031_CR54","unstructured":"Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. International conference on machine learning PMLR 448\u2013456"},{"key":"2031_CR55","doi-asserted-by":"crossref","unstructured":"Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. Proc IEEE Conf Comput Vis Pattern Recognit 2818\u20132826","DOI":"10.1109\/CVPR.2016.308"},{"issue":"3","key":"2031_CR56","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","volume":"115","author":"O Russakovsky","year":"2015","unstructured":"Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vision 115(3):211\u2013252","journal-title":"Int J Comput Vision"},{"key":"2031_CR57","doi-asserted-by":"crossref","unstructured":"Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Doll\u00e1r P, Zitnick CL (2014) Microsoft coco: Common objects in context. European conference on computer vision Springer 740\u2013755","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"2031_CR58","doi-asserted-by":"crossref","unstructured":"Guo D, Shao Y, Cui Y, Wang Z, Zhang L, Shen C (2021) Graph attention tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 9543\u20139552","DOI":"10.1109\/CVPR46437.2021.00942"},{"key":"2031_CR59","doi-asserted-by":"crossref","unstructured":"Chen X, Yan B, Zhu J, Wang D, Yang X, Lu H (2021) Transformer tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 8126\u20138135","DOI":"10.1109\/CVPR46437.2021.00803"},{"key":"2031_CR60","doi-asserted-by":"crossref","unstructured":"Yan B, Peng H, Fu J, Wang D, Lu H (2021) Learning spatio-temporal transformer for visual tracking. Proceedings of the IEEE\/CVF international conference on computer vision 10448\u201310457","DOI":"10.1109\/ICCV48922.2021.01028"},{"key":"2031_CR61","doi-asserted-by":"crossref","unstructured":"Cui Y, Jiang C, Wang L, Wu G (2022) Mixformer: End-to-end tracking with iterative mixed attention. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 13608\u201313618","DOI":"10.1109\/CVPR52688.2022.01324"},{"key":"2031_CR62","doi-asserted-by":"crossref","unstructured":"Xie F, Wang C, Wang G, Yang W, Zeng W (2021) Learning tracking representations via dual-branch fully transformer networks. Proceedings of the IEEE\/CVF International Conference on Computer Vision 2688\u20132697","DOI":"10.1109\/ICCVW54120.2021.00303"},{"key":"2031_CR63","doi-asserted-by":"publisher","first-page":"725","DOI":"10.1109\/TIP.2020.3038356","volume":"30","author":"S Pu","year":"2020","unstructured":"Pu S, Song Y, Ma C, Zhang H, Yang M-H (2020) Learning recurrent memory activation networks for visual tracking. IEEE Trans Image Process 30:725\u2013738","journal-title":"IEEE Trans Image Process"},{"key":"2031_CR64","doi-asserted-by":"crossref","unstructured":"Danelljan M, Bhat G, Khan FS, Felsberg M (2019) Atom: Accurate tracking by overlap maximization. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 4660\u20134669","DOI":"10.1109\/CVPR.2019.00479"},{"key":"2031_CR65","doi-asserted-by":"crossref","unstructured":"Li P, Chen B, Ouyang W, Wang D, Yang X, Lu H (2019) Gradnet: Gradient-guided network for visual object tracking. Proceedings of the IEEE\/CVF International conference on computer vision 6162\u20136171","DOI":"10.1109\/ICCV.2019.00626"},{"key":"2031_CR66","doi-asserted-by":"crossref","unstructured":"Dong X, Shen J (2018) Triplet loss in siamese network for object tracking. Proceedings of the European conference on computer vision (ECCV) 459\u2013474","DOI":"10.1007\/978-3-030-01261-8_28"},{"key":"2031_CR67","first-page":"7183","volume":"1","author":"M Danelljan","year":"2020","unstructured":"Danelljan M, Gool LV, Timofte R (2020) Probabilistic regression for visual tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 1:7183\u20137192","journal-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition"},{"key":"2031_CR68","unstructured":"Gopal GY, Amer MA (2023) Mobile vision transformer-based visual object tracking. arxiv:2309.05829"},{"key":"2031_CR69","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.121763","volume":"238","author":"S Pan","year":"2024","unstructured":"Pan S, Zhang C, Li Z, Hu L (2024) Siamca: Siamese visual tracking with customized anchor and target-aware interaction. Expert Syst Appl 238:121763","journal-title":"Expert Syst Appl"},{"issue":"8","key":"2031_CR70","first-page":"3774","volume":"32","author":"Z Pi","year":"2021","unstructured":"Pi Z, Shao Y, Gao C, Sang N (2021) Instance-based feature pyramid for visual object tracking. IEEE Trans Circuits Syst Video Technol 32(8):3774\u20133787","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"2031_CR71","doi-asserted-by":"crossref","unstructured":"Bhat G, Danelljan M, Gool LV, Timofte R (2019) Learning discriminative model prediction for tracking. Proceedings of the IEEE\/CVF International Conference on Computer Vision 6182\u20136191","DOI":"10.1109\/ICCV.2019.00628"},{"key":"2031_CR72","doi-asserted-by":"crossref","unstructured":"Lukezic A, Matas J, Kristan M (2020) D3s-a discriminative single shot segmentation tracker. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 7133\u20137142","DOI":"10.1109\/CVPR42600.2020.00716"},{"key":"2031_CR73","doi-asserted-by":"crossref","unstructured":"Wang G, Luo C, Xiong Z, Zeng W (2019) Spm-tracker: Series-parallel matching for real-time visual object tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 3643\u20133652","DOI":"10.1109\/CVPR.2019.00376"},{"key":"2031_CR74","doi-asserted-by":"crossref","unstructured":"Nam H, Han B (2016) Learning multi-domain convolutional neural networks for visual tracking. Proc IEEE Conf Comput Vis Pattern Recognit 4293\u20134302","DOI":"10.1109\/CVPR.2016.465"},{"key":"2031_CR75","unstructured":"Zhu J, Chen X, Wang D, Zhao W, Lu H (2022) Srrt: Search region regulation tracking, arXiv preprint arXiv:2207.04438"},{"key":"2031_CR76","doi-asserted-by":"crossref","unstructured":"Tang F, Ling Q (2022) Ranking-based siamese visual tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 8741\u20138750","DOI":"10.1109\/CVPR52688.2022.00854"},{"key":"2031_CR77","doi-asserted-by":"crossref","unstructured":"Shen Q, Qiao L, Guo J, Li P, Li X, Li B, Feng W, Gan W, Wu W, Ouyang W (2022) Unsupervised learning of accurate siamese tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 8101\u20138110","DOI":"10.1109\/CVPR52688.2022.00793"},{"key":"2031_CR78","doi-asserted-by":"crossref","unstructured":"Dong X, Shen J, Shao L, Porikli F (2020) Clnet: A compact latent network for fast adjusting siamese trackers. Proceedings of the European conference on computer vision (ECCV) Springer, 378\u2013395","DOI":"10.1007\/978-3-030-58565-5_23"},{"key":"2031_CR79","first-page":"15457","volume":"1","author":"Z Cao","year":"2021","unstructured":"Cao Z, Fu C, Ye J, Li B, Li Y (2021) Hift: Hierarchical feature transformer for aerial tracking. Proceedings of the IEEE\/CVF International Conference on Computer Vision 1:15457\u201315466","journal-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision"},{"key":"2031_CR80","doi-asserted-by":"crossref","unstructured":"Jung I, Son J, Baek M, Han B (2018) Real-time mdnet. Proceedings of the European conference on computer vision (ECCV) 83\u201398","DOI":"10.1007\/978-3-030-01225-0_6"},{"key":"2031_CR81","doi-asserted-by":"crossref","unstructured":"Danelljan M, Bhat G, Shahbaz Khan F, Felsberg M (2017) Eco: Efficient convolution operators for tracking. Proc IEEE Conf Comput Vis Pattern Recognit 6638\u20136646","DOI":"10.1109\/CVPR.2017.733"},{"key":"2031_CR82","doi-asserted-by":"crossref","unstructured":"Gao J, Zhang T, Xu C (2019) Graph convolutional tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 4649\u20134659","DOI":"10.1109\/CVPR.2019.00478"},{"key":"2031_CR83","doi-asserted-by":"crossref","unstructured":"Li Y, Fu C, Ding F, Huang Z, Lu G (2020) Autotrack: Towards high-performance visual tracking for uav with automatic spatio-temporal regularization. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition 11923\u201311932","DOI":"10.1109\/CVPR42600.2020.01194"},{"key":"2031_CR84","doi-asserted-by":"crossref","unstructured":"Li F, Tian C, Zuo W, Zhang L, Yang M-H (2018) Learning spatial-temporal regularized correlation filters for visual tracking. Proc IEEE Conf Comput Vis Pattern Recognit 4904\u20134913","DOI":"10.1109\/CVPR.2018.00515"},{"key":"2031_CR85","doi-asserted-by":"crossref","unstructured":"Guo M, Zhang Z, Fan H, Jing L, Lyu Y, Li B, Hu W (2022) Learning target-aware representation for visual tracking via informative interactions, arXiv preprint arXiv:2201.02526","DOI":"10.24963\/ijcai.2022\/130"},{"key":"2031_CR86","doi-asserted-by":"crossref","unstructured":"Blatter P, Kanakis M, Danelljan M, Van Gool L (2023) Efficient visual tracking with exemplar transformers. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision","DOI":"10.1109\/WACV56688.2023.00162"},{"key":"2031_CR87","unstructured":"Gopal GY, Amer MA (2023) Separable self and mixed attention transformers for efficient object tracking. arxiv:2309.03979"},{"key":"2031_CR88","doi-asserted-by":"publisher","first-page":"2149","DOI":"10.1007\/s11760-022-02177-4","volume":"16","author":"J Wang","year":"2022","unstructured":"Wang J, Meng C, Deng C, Wang Y (2022) Learning attention modules for visual tracking. SIViP 16:2149\u20132156","journal-title":"SIViP"},{"key":"2031_CR89","doi-asserted-by":"crossref","unstructured":"Zhu J, Chen X, Zhang P, Wang X, Wang D, Zhao W, Lu H (2024) Srrt: Exploring search region regulation for visual object tracking. arxiv:2207.04438","DOI":"10.1109\/TCSVT.2024.3409898"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-025-02031-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-025-02031-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-025-02031-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,8]],"date-time":"2025-09-08T21:04:15Z","timestamp":1757365455000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-025-02031-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,6]]},"references-count":89,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2025,9]]}},"alternative-id":["2031"],"URL":"https:\/\/doi.org\/10.1007\/s40747-025-02031-3","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"type":"print","value":"2199-4536"},{"type":"electronic","value":"2198-6053"}],"subject":[],"published":{"date-parts":[[2025,8,6]]},"assertion":[{"value":"12 July 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"12 February 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 August 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"411"}}