{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:10:24Z","timestamp":1760238624839,"version":"build-2065373602"},"reference-count":44,"publisher":"MDPI AG","issue":"17","license":[{"start":{"date-parts":[[2020,8,28]],"date-time":"2020-08-28T00:00:00Z","timestamp":1598572800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Institute for Information & communications Technology Promotion(IITP) grant funded by the Korea government(MSIT) (2017-0-00250, Intelligent Defense Boundary Surveillance Technology Using Collaborative Reinforced Learning of Embedded Edge Camera and Image","award":["2017-0-00250"],"award-info":[{"award-number":["2017-0-00250"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Recent advances in object tracking based on deep Siamese networks have shifted attention away from correlation filters. However, the Siamese network alone is not as accurate as state-of-the-art correlation filter-based trackers, whereas correlation filter-based trackers alone have a frame update problem. In this paper, we present a Siamese network with spatially semantic correlation features (SNS-CF) for accurate, robust object tracking. To deal with various types of features spread in many regions of the input image frame, the proposed SNS-CF consists of (1) a Siamese feature extractor, (2) a spatially semantic feature extractor, and (3) an adaptive correlation filter. 
To the best of the authors' knowledge, the proposed SNS-CF is the first attempt to fuse the Siamese network and the correlation filter to provide high frame rate, real-time visual tracking with performance that compares favorably with state-of-the-art methods on multiple benchmarks.<\/jats:p>","DOI":"10.3390\/s20174881","type":"journal-article","created":{"date-parts":[[2020,8,28]],"date-time":"2020-08-28T09:17:08Z","timestamp":1598606228000},"page":"4881","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["SNS-CF: Siamese Network with Spatially Semantic Correlation Features for Object Tracking"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9095-1835","authenticated-orcid":false,"given":"Thierry","family":"Ntwari","sequence":"first","affiliation":[{"name":"Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9882-6094","authenticated-orcid":false,"given":"Hasil","family":"Park","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3818-6587","authenticated-orcid":false,"given":"Joongchol","family":"Shin","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8593-7155","authenticated-orcid":false,"given":"Joonki","family":"Paik","sequence":"additional","affiliation":[{"name":"Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2020,8,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zhang, Z., and Peng, H. (2019, January 16\u201320). 
Deeper and wider siamese networks for real-time visual tracking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00472"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Li, B., Wu, W., Wang, Q., Zhang, F., Xing, J., and Yan, J. (2019, January 16\u201320). Siamrpn++: Evolution of siamese visual tracking with very deep networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00441"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Ma, C., Huang, J.B., Yang, X., and Yang, M.H. (2015, January 7\u201313). Hierarchical convolutional features for visual tracking. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.352"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2","DOI":"10.1287\/ijoo.2018.0001","article-title":"Robust classification","volume":"1","author":"Bertsimas","year":"2018","journal-title":"INFORMS J. Optim."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Bertinetto, L., Valmadre, J., Henriques, J.F., Vedaldi, A., and Torr, P.H. (2016, January 8\u201316). Fully-convolutional siamese networks for object tracking. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-48881-3_56"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Li, B., Yan, J., Wu, W., Zhu, Z., and Hu, X. (2018, January 18\u201322). High performance visual tracking with siamese region proposal network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00935"},{"key":"ref_7","first-page":"651","article-title":"SSReF-Spatial_Semantic Residual Features for Object Tracking","volume":"11","author":"Thierry","year":"2019","journal-title":"Inst. Electron. Inf. 
Eng."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Wang, Q., Zhang, L., Bertinetto, L., Hu, W., and Torr, P.H. (2019, January 16\u201320). Fast online object tracking and segmentation: A unifying approach. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00142"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Moudgil, A., and Gandhi, V. (2018, January 2\u20136). Long-term visual object tracking benchmark. Proceedings of the Asian Conference on Computer Vision, Perth, Australia.","DOI":"10.1007\/978-3-030-20890-5_40"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1834","DOI":"10.1109\/TPAMI.2014.2388226","article-title":"Object tracking benchmark","volume":"37","author":"Wu","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Gray, R.M. (2006). Toeplitz and Circulant Matrices: A Review, Now Publishers Inc.","DOI":"10.1561\/9781933019680"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"583","DOI":"10.1109\/TPAMI.2014.2345390","article-title":"High-speed tracking with kernelized correlation filters","volume":"37","author":"Henriques","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2526","DOI":"10.1109\/TIP.2018.2806280","article-title":"Good features to correlate for visual tracking","volume":"27","author":"Gundogdu","year":"2018","journal-title":"IEEE Trans. 
Image Process."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"3068","DOI":"10.1109\/TCYB.2019.2936503","article-title":"Visual object tracking by hierarchical attention siamese network","volume":"50","author":"Shen","year":"2019","journal-title":"IEEE Trans. Cybern."},{"key":"ref_16","unstructured":"Hariharan, B., Arbel\u00e1ez, P., Girshick, R., and Malik, J. (2008, January 23\u201328). Hypercolumns for object segmentation and fine-grained localization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Danelljan, M., Bhat, G., Shahbaz Khan, F., and Felsberg, M. (2017, January 21\u201326). Eco: Efficient convolution operators for tracking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.733"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Li, F., Tian, C., Zuo, W., Zhang, L., and Yang, M.H. (2018, January 18\u201322). Learning spatial-temporal regularized correlation filters for visual tracking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00515"},{"key":"ref_19","unstructured":"Van Rossum, G., and Drake, F.L. (2009). Python 3 Reference Manual, PythonLabs."},{"key":"ref_20","unstructured":"MATLAB (2020). version 9.8.0.1417392 (R2020a) Update 4, MATLAB."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Fan, H., Lin, L., Yang, F., Chu, P., Deng, G., Yu, S., Bai, H., Xu, Y., Liao, C., and Ling, H. (2019, January 16\u201320). Lasot: A high-quality benchmark for large-scale single object tracking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00552"},{"key":"ref_22","unstructured":"Moore, G., and Noyuce, R.O. 
Personal communication."},{"key":"ref_23","unstructured":"Huang, J., Priem, C., and Malachowsky, C. Personal communication."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Xu, N., Yang, L., Fan, Y., Yang, J., Yue, D., Liang, Y., Price, B., Cohen, S., and Huang, T. (2018, January 8\u201314). Youtube-vos: Sequence-to- sequence video object segmentation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01228-1_36"},{"key":"ref_27","unstructured":"Zou, W., Zhu, S., Yu, K., and Ng, A.Y. (2012, January 3\u20138). Deep learning of invariant features via simulated fixations in video. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, Harrahs and Harveys, NV, USA."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Zhang, K., Zhang, L., Liu, Q., Zhang, D., and Yang, M.H. (2014, January 6\u201312). Fast visual tracking via dense spatio-temporal context learning. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_9"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2096","DOI":"10.1109\/TPAMI.2015.2509974","article-title":"Struck: Structured output tracking with kernels","volume":"38","author":"Hare","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2356","DOI":"10.1109\/TIP.2014.2313227","article-title":"Robust object tracking via sparse collaborative appearance model","volume":"23","author":"Zhong","year":"2014","journal-title":"IEEE Trans. Image Process."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"2002","DOI":"10.1109\/TPAMI.2014.2315808","article-title":"Fast compressive tracking","volume":"36","author":"Zhang","year":"2014","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1619","DOI":"10.1109\/TPAMI.2010.226","article-title":"Robust object tracking with online multiple instance learning","volume":"33","author":"Babenko","year":"2010","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"1409","DOI":"10.1109\/TPAMI.2011.239","article-title":"Tracking-learning-detection","volume":"34","author":"Kalal","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Henriques, J.F., Caseiro, R., Martins, P., and Batista, J. (2012, January 7\u201313). Exploiting the circulant structure of tracking-by-detection with kernels. Proceedings of the European Conference on Computer Vision, Florence, Italy.","DOI":"10.1007\/978-3-642-33765-9_50"},{"key":"ref_35","unstructured":"Sauer, A., Aljalbout, E., and Haddadin, S. (2019). Tracking Holistic Object Representations. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Sun, C., Wang, D., Lu, H., and Yang, M.H. (2018, January 18\u201322). Learning spatial-aware regressions for visual tracking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00934"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zhu, Z., Wang, Q., Li, B., Wu, W., Yan, J., and Hu, W. 
(2018, January 8\u201314). Distractor-aware siamese networks for visual object tracking. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01240-3_7"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"5596","DOI":"10.1109\/TIP.2019.2919201","article-title":"Learning adaptive discriminative correlation filters via temporal consistency preserving spatial feature selection for robust visual object tracking","volume":"28","author":"Xu","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Bai, S., He, Z., Dong, Y., and Bai, H. (2020, January 6\u201310). Multi-hierarchical independent correlation filters for visual tracking. Proceedings of the 2020 IEEE International Conference on Multimedia and Expo (ICME), London, UK.","DOI":"10.1109\/ICME46284.2020.9102759"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Bhat, G., Johnander, J., Danelljan, M., Shahbaz Khan, F., and Felsberg, M. (2018, January 8\u201314). Unveiling the power of deep tracking. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01216-8_30"},{"key":"ref_41","unstructured":"He, A., Luo, C., Tian, X., and Zeng, W. (2008, January 23\u201328). A twofold siamese network for real-time object tracking. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA."},{"key":"ref_42","unstructured":"Sun, C., Wang, D., Lu, H., and Yang, M.H. (2008, January 23\u201328). Correlation tracking via joint discrimination and reliability learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Nam, H., and Han, B. (2016, January 27\u201330). Learning multi-domain convolutional neural networks for visual tracking. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.465"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Song, Y., Ma, C., Wu, X., Gong, L., Bao, L., Zuo, W., Shen, C., Lau, R.W., and Yang, M.H. (2018, January 18\u201322). Vital: Visual tracking via adversarial learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00937"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/17\/4881\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:04:20Z","timestamp":1760177060000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/17\/4881"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,8,28]]},"references-count":44,"journal-issue":{"issue":"17","published-online":{"date-parts":[[2020,9]]}},"alternative-id":["s20174881"],"URL":"https:\/\/doi.org\/10.3390\/s20174881","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2020,8,28]]}}}