{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:39:27Z","timestamp":1760146767395,"version":"build-2065373602"},"reference-count":41,"publisher":"MDPI AG","issue":"23","license":[{"start":{"date-parts":[[2024,12,4]],"date-time":"2024-12-04T00:00:00Z","timestamp":1733270400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Fundamental Research Funds for the Provincial Universities of Zhejiang","award":["GK249909299001-006"],"award-info":[{"award-number":["GK249909299001-006"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Object tracking in remote sensing videos is a challenging task in computer vision. Recent advances in deep learning have sparked significant interest in tracking algorithms based on Siamese neural networks. However, many existing algorithms fail to deliver satisfactory performance in complex scenarios due to challenging conditions and limited computational resources. Thus, enhancing tracking efficiency and improving algorithm responsiveness in complex scenarios are crucial. To address tracking drift caused by similar objects and background interference in remote sensing image tracking, we propose an enhanced Siamese network based on the SiamRhic architecture, incorporating a cross-correlation and ranking head for improved object tracking. We first use convolutional neural networks for feature extraction and integrate the CBAM (Convolutional Block Attention Module) to enhance the tracker\u2019s representational capacity, allowing it to focus more effectively on the objects. Additionally, we replace the original depth-wise cross-correlation operation with asymmetric convolution, enhancing both speed and performance. 
We also introduce a ranking loss to reduce the classification confidence of interference objects, addressing the mismatch between classification and regression. We validate the proposed algorithm through experiments on the OTB100, UAV123, and OOTB remote sensing datasets. Specifically, SiamRhic achieves success, normalized precision, and precision rates of 0.533, 0.786, and 0.812, respectively, on the OOTB benchmark. On the OTB100 benchmark, it achieves a success rate of 0.670 and a precision rate of 0.892. Similarly, on the UAV123 benchmark, SiamRhic achieves a success rate of 0.621 and a precision rate of 0.823. These results demonstrate the algorithm\u2019s high precision and success rates, highlighting its practical value.<\/jats:p>","DOI":"10.3390\/rs16234549","type":"journal-article","created":{"date-parts":[[2024,12,4]],"date-time":"2024-12-04T10:07:10Z","timestamp":1733306830000},"page":"4549","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["SiamRhic: Improved Cross-Correlation and Ranking Head-Based Siamese Network for Object Tracking in Remote Sensing Videos"],"prefix":"10.3390","volume":"16","author":[{"given":"Afeng","family":"Yang","sequence":"first","affiliation":[{"name":"School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China"}]},{"given":"Zhuolin","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China"}]},{"given":"Wenqing","family":"Feng","sequence":"additional","affiliation":[{"name":"School of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,12,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Bolme, D.S., Beveridge, J.R., Draper, B.A., and Lui, Y.M. (2010, January 13\u201318). Visual object tracking using adaptive correlation filters. 
Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5539960"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Henriques, J.F., Caseiro, R., Martins, P., and Batista, J. (2012, January 7\u201313). Exploiting the circulant structure of tracking-by-detection with kernels. Proceedings of the Computer Vision-ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy. Part IV 12.","DOI":"10.1007\/978-3-642-33765-9_50"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1109\/TPAMI.2014.2345390","article-title":"High-speed tracking with kernelized correlation filters","volume":"37","author":"Henriques","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_4","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1561","DOI":"10.1109\/TPAMI.2016.2609928","article-title":"Discriminative scale space tracking","volume":"39","author":"Danelljan","year":"2016","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ma, C., Yang, X., Zhang, C., and Yang, M.H. (2015, January 7\u201312). Long-term correlation tracking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299177"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., and Torr, P. (2016, January 11\u201314). Fully-Convolutional Siamese Networks for Object Tracking. 
Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-48881-3_56"},{"key":"ref_8","unstructured":"Bo, L., Yan, J., Wei, W., Zheng, Z., and Hu, X. (2018, January 18\u201323). High Performance Visual Tracking with Siamese Region Proposal Network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhu, Z., Wang, Q., Li, B., Wu, W., Yan, J., and Hu, W. (2018, January 8\u201314). Distractor-aware Siamese Networks for Visual Object Tracking. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01240-3_7"},{"key":"ref_11","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (2019, October 27\u2013November 2). FCOS: Fully Convolutional One-Stage Object Detection. Proceedings of the International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Xu, Y., Wang, Z., Li, Z., Yuan, Y., and Yu, G. (2020, January 7\u201312). SiamFC++: Towards Robust and Accurate Visual Tracking with Target Estimation Guidelines. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.6944"},{"key":"ref_13","unstructured":"Zhou, X., Wang, D., and Krahenbuhl, P. (2019). Objects as Points. arXiv."},{"key":"ref_14","unstructured":"Li, Q., Qin, Z., Zhang, W., and Zheng, W. (2020). Siamese Keypoint Prediction Network for Visual Object Tracking. 
arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"783","DOI":"10.1109\/JSTARS.2020.2971657","article-title":"Object tracking in satellite videos based on convolutional regression network with appearance and motion features","volume":"13","author":"Hu","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_16","unstructured":"Li, Z., Yuan, L., and Nevada, R. (2008, January 23\u201328). Global data association for multi-object tracking using network flows. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Feng, J., Hui, B., Liang, Y., Yao, Q., and Zhang, X. (2021, January 11\u201316). Improved SiamRPN++ with Clustering-Based Frame Differencing for Object Tracking of Remote Sensing Videos. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium.","DOI":"10.1109\/IGARSS47720.2021.9553779"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Li, B., Wu, W., Wang, Q., Zhang, F., Xing, J., and Yan, J. (2019, January 15\u201320). SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00441"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Yang, J., Pan, Z., Liu, Y., Niu, B., and Lei, B. (2023). Single Object Tracking in Satellite Videos Based on Feature Enhancement and Multi-Level Matching Strategy. Remote Sens., 15.","DOI":"10.3390\/rs15174351"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018). CBAM: Convolutional Block Attention Module. 
Computer Vision\u2014ECCV 2018, Proceedings of the 15th European Conference on Computer Vision, Munich, Germany, 8\u201314 September 2018, Springer.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1562","DOI":"10.1109\/TPAMI.2019.2957464","article-title":"GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild","volume":"43","author":"Huang","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Fan, H., Lin, L., Yang, F., Chu, P., Deng, G., Yu, S., Bai, H., Xu, Y., Liao, C., and Ling, H. (2019, January 15\u201320). LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00552"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"ImageNet Classification with Deep Convolutional Neural Networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Wu, Y., Lim, J., and Yang, M.-H. (2013, January 23\u201328). Online object tracking: A benchmark. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.312"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Peng, H., Fu, J., Li, B., and Hu, W. (2020, January 23\u201328). Ocean: Object-aware anchor-free tracking. Proceedings of the Computer Vision-ECCV 2020: 16th European Conference, Glasgow, UK. Part XXI 16.","DOI":"10.1007\/978-3-030-58589-1_46"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Danelljan, M., Bhat, G., Khan, F.S., and Felsberg, M. (2019, January 15\u201320). Atom: Accurate tracking by overlap maximization. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00479"},{"key":"ref_27","unstructured":"Li, P., Chen, B., Ouyang, W., Wang, D., Yang, X., and Lu, H. (2019, October 27\u2013November 2). GradNet: Gradient-guided network for visual object tracking. Proceedings of the International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhang, Z., and Peng, H. (2019, January 15\u201320). Deeper and Wider Siamese Networks for Real-Time Visual Tracking. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00472"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Danelljan, M., Hager, G., Shahbaz Khan, F., and Felsberg, M. (2015, January 7\u201313). Learning spatially regularized correlation filters for visual tracking. Proceedings of the International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.490"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zhang, G., Li, Z., Li, J., and Hu, X. (2023). CFNet: Cascade Fusion Network for Dense Prediction. arXiv.","DOI":"10.2139\/ssrn.4857945"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Mueller, M., Smith, N., and Ghanem, B. (2016, January 11\u201314). A benchmark and simulator for uav tracking. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_27"},{"key":"ref_32","unstructured":"Fei, D. (2021). Research on Visual Target Tracking Method Based on Attention Mechanism. [Master\u2019s Thesis, Harbin Institute of Technology]."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Chen, Z., Zhong, B., Li, G., Zhang, S., and Ji, R. (2020, January 13\u201319). Siamese box adaptive network for visual tracking. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00670"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Guo, D., Wang, J., Cui, Y., Wang, Z., and Chen, S. (2020, January 13\u201319). SiamCAR: Siamese fully convolutional classification and regression for visual tracking. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00630"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Guo, D.Y., Shao, Y.Y., Cui, Y., Wang, Z., and Shen, C. (2021, January 20\u201325). Graph attention tracking. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00942"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Cao, Z., Fu, C., Ye, J., Li, B., and Li, Y. (2021, September 27\u2013October 1). SiamAPN++: Siamese Attentional Aggregation Network for Real-Time UAV Tracking. Proceedings of the 2021 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic.","DOI":"10.1109\/IROS51168.2021.9636309"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"463","DOI":"10.1007\/s11554-021-01190-z","article-title":"A real-time siamese tracker deployed on UAVs","volume":"19","author":"Shen","year":"2022","journal-title":"J. Real-Time Image Process."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"212","DOI":"10.1016\/j.isprsjprs.2024.03.013","article-title":"Satellite video single object tracking: A systematic review and an oriented object tracking benchmark","volume":"210","author":"Chen","year":"2024","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhou, J., Wang, P., and Sun, H. (2020, January 7\u201312). Discriminative and Robust Online Learning for Siamese Visual Tracking. 
Proceedings of the 34th AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.7002"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Dong, X., Shen, J., Shao, L., and Porikli, F. (2020, January 23\u201328). CLNet: A Compact Latent Network for Fast Adjusting Siamese Trackers. Proceedings of the Computer Vision\u2014ECCV 2020, 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58565-5_23"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"9349","DOI":"10.1109\/TII.2022.3228197","article-title":"Scale-Aware Siamese Object Tracking for Vision-Based UAM Approaching","volume":"19","author":"Zheng","year":"2023","journal-title":"IEEE Trans. Ind. Inform."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/23\/4549\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:46:53Z","timestamp":1760114813000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/23\/4549"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,4]]},"references-count":41,"journal-issue":{"issue":"23","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["rs16234549"],"URL":"https:\/\/doi.org\/10.3390\/rs16234549","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2024,12,4]]}}}