{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T00:42:04Z","timestamp":1776127324510,"version":"3.50.1"},"reference-count":45,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2022,12,15]],"date-time":"2022-12-15T00:00:00Z","timestamp":1671062400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"The Key Research & Development of Hubei Province","award":["2020BIB006"],"award-info":[{"award-number":["2020BIB006"]}]},{"name":"The Key Research & Development of Hubei Province","award":["2020AAA004"],"award-info":[{"award-number":["2020AAA004"]}]},{"name":"The Key Research & Development of Hubei Province","award":["2020CFA001"],"award-info":[{"award-number":["2020CFA001"]}]},{"name":"The Natural Science Foundation of Hubei Province","award":["2020BIB006"],"award-info":[{"award-number":["2020BIB006"]}]},{"name":"The Natural Science Foundation of Hubei Province","award":["2020AAA004"],"award-info":[{"award-number":["2020AAA004"]}]},{"name":"The Natural Science Foundation of Hubei Province","award":["2020CFA001"],"award-info":[{"award-number":["2020CFA001"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Current multi-target multi-camera tracking algorithms demand increased requirements for re-identification accuracy and tracking reliability. This study proposed an improved end-to-end multi-target tracking algorithm that adapts to multi-view multi-scale scenes based on the self-attentive mechanism of the transformer\u2019s encoder\u2013decoder structure. A multi-dimensional feature extraction backbone network was combined with a self-built raster semantic map which was stored in the encoder for correlation and generated target position encoding and multi-dimensional feature vectors. 
The decoder incorporated four methods: spatial clustering and semantic filtering of multi-view targets; dynamic matching of multi-dimensional features; space\u2013time logic-based multi-target tracking, and space\u2013time convergence network (STCN)-based parameter passing. Through the fusion of multiple decoding methods, multi-camera targets were tracked in three dimensions: temporal logic, spatial logic, and feature matching. For the MOT17 dataset, this study\u2019s method significantly outperformed the current state-of-the-art method by 2.2% on the multiple object tracking accuracy (MOTA) metric. Furthermore, this study proposed a retrospective mechanism for the first time and adopted a reverse-order processing method to optimize the historical mislabeled targets for improving the identification F1-score (IDF1). For the self-built dataset OVIT-MOT01, the IDF1 improved from 0.948 to 0.967, and the multi-camera tracking accuracy (MCTA) improved from 0.878 to 0.909, which significantly improved the continuous tracking accuracy and reliability.<\/jats:p>","DOI":"10.3390\/rs14246354","type":"journal-article","created":{"date-parts":[[2022,12,16]],"date-time":"2022-12-16T02:54:02Z","timestamp":1671159242000},"page":"6354","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["An Improved End-to-End Multi-Target Tracking Method Based on Transformer Self-Attention"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1290-849X","authenticated-orcid":false,"given":"Yong","family":"Hong","sequence":"first","affiliation":[{"name":"State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China"},{"name":"Mobile Broadcasting and Information Service Industry Innovation Research Institute (Wuhan) Co., Ltd., Wuhan 430068, 
China"}]},{"given":"Deren","family":"Li","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China"}]},{"given":"Shupei","family":"Luo","sequence":"additional","affiliation":[{"name":"Wuhan Optics Valley Information Technology Co., Ltd., Wuhan 430068, China"}]},{"given":"Xin","family":"Chen","sequence":"additional","affiliation":[{"name":"Wuhan Optics Valley Information Technology Co., Ltd., Wuhan 430068, China"}]},{"given":"Yi","family":"Yang","sequence":"additional","affiliation":[{"name":"Wuhan Optics Valley Information Technology Co., Ltd., Wuhan 430068, China"}]},{"given":"Mi","family":"Wang","sequence":"additional","affiliation":[{"name":"State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,12,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Liu, S., Kong, W., Chen, X., Xu, M., Yasir, M., Zhao, L., and Li, J. (2022). Multi-Scale Ship Detection Algorithm Based on a Lightweight Neural Network for Spaceborne SAR Images. Remote Sens., 14.","DOI":"10.3390\/rs14051149"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020). End-to-End Object Detection with Transformers. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Meinhardt, T., Kirillov, A., Leal-Taix\u00e9, L., and Feichtenhofer, C. (2022). TrackFormer: Multi-Object Tracking with Transformers. coRR.","DOI":"10.1109\/CVPR52688.2022.00864"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Zeng, F., Dong, B., Zhang, Y., Wang, T., Zhang, X., and Wei, Y. (2022). MOTR: End-to-End Multiple-Object Tracking with Transformer. 
European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-031-19812-0_38"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"5191","DOI":"10.1109\/TIP.2020.2980070","article-title":"Multi-Target Multi-Camera Tracking by Tracklet-to-Target Assignment","volume":"29","author":"He","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Bewley, A., Ge, Z., Ott, L., Ramos, F., and Upcroft, B. (2016, January 25\u201328). Simple online and real-time tracking. Proceedings of the IEEE International Conference on Image Processing, Phoenix, AZ, USA.","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Yu, F., Li, W., Li, Q., Liu, Y., Shi, X., and Yan, J. (2016, January 8\u201316). Poi: Multiple object tracking with high performance detection and appearance feature. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-48881-3_3"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Tang, S., Andriluka, M., Andres, B., and Schiele, B. (2017, January 21\u201326). Multiple people tracking by lifted multicut and person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.394"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Xu, J., Cao, Y., Zhang, Z., and Hu, H. (2019, January 27\u201328). Spatial-temporal relation networks for multi-object tracking. Proceedings of the IEEE\/CVF International Conference on Computer Vision 2019, Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00409"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Wojke, N., Bewley, A., and Paulus, D. (2017, January 17\u201320). \u201cSimple online and real-time tracking with a deep association metric,\u201d in Image. 
Proceedings of the (ICIP), 2017 IEEE International Conference on IEEE, Beijing, China.","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Xu, Y., Liu, X., Liu, Y., and Zhu, S.-C. (2016, January 27\u201330). Multi-view people tracking via hierarchical trajectory composition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.461"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"1806","DOI":"10.1109\/TPAMI.2011.21","article-title":"Multiple Object Tracking Using K-Shortest Paths Optimization","volume":"33","author":"Berclaz","year":"2011","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1109\/TPAMI.2006.80","article-title":"Principal axis-based correspondence between multiple cameras for people tracking","volume":"28","author":"Hu","year":"2006","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Cai, Y., and Medioni, G. (2014). Exploring Context Information for Inter-Camera Multiple Target Tracking, IEEE.","DOI":"10.1109\/WACV.2014.6836026"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Ristani, E., and Tomasi, C. (2018, January 18\u201323). Features for Multi-target Multi-camera Tracking and Re-identification. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00632"},{"key":"ref_16","unstructured":"Chen, K., Lai, C., Hung, Y., and Chen, C. (2008, January 24\u201326). An adaptive learning method for target tracking across multiple cameras. 
Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2367","DOI":"10.1109\/TCSVT.2016.2589619","article-title":"An equalized global graph model-based approach for multi-camera object tracking","volume":"27","author":"Chen","year":"2016","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2382","DOI":"10.1109\/TCSVT.2016.2565978","article-title":"Integrating social grouping for multi-target tracking across cameras in a crf model","volume":"27","author":"Chen","year":"2016","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2870","DOI":"10.1109\/TCSVT.2017.2707399","article-title":"Online-learning-based human tracking across non-overlapping cameras","volume":"28","author":"Lee","year":"2017","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep Residual Learning for Image Recognition. arXiv.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Yang, C., Wang, Y., Zhang, J., Zhang, H., Wei, Z., Lin, Z., and Yuille, A. (2022, January 18\u201324). Lite Vision Transformer with Enhanced Self-Attention. 
Proceedings of the 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01169"},{"key":"ref_23","unstructured":"Wang, G., Lai, J., Huang, P., and Xie, X. (February, January 27). Spatial-Temporal Person Re-identification. Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2022.3225843","article-title":"Generalized Scene Classification From Small-Scale Datasets With Multitask Learning","volume":"60","author":"Zheng","year":"2022","journal-title":"IEEE Transactions on Geoscience and Remote Sensing"},{"key":"ref_25","first-page":"4","article-title":"An Application of Hungarian Algorithm to the Multi-Target Assignment","volume":"27","author":"Liu","year":"2002","journal-title":"Fire Control. Command. Control."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1287\/mnsc.14.3.205","article-title":"A primal method for minimal cost flows with applications to the assignment and transportation problems","volume":"14","author":"Klein","year":"1967","journal-title":"Manag. Sci."},{"key":"ref_27","unstructured":"Milan, A., Leal-Taix\u00e9, L., Reid, I., Roth, S., and Schindler, K. (2016). MOT16: A benchmark for multi-object tracking. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Olson, E.B. (2009, January 12\u201317). Real-time correlative scan matching. Proceedings of the International Conference on Robotics and Automation, Kobe, Japan.","DOI":"10.1109\/ROBOT.2009.5152375"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Konolige, K., Grisetti, G., K\u00fcmmerle, R., Burgard, W., Limketkai, B., and Vincent, R. (2010, January 18\u201322). Efficient sparse pose adjustment for 2D mapping. 
Proceedings of the 2010 IEEE\/RSJ International Conference on Intelligent Robots and Systems IEEE, Taipei, Taiwan.","DOI":"10.1109\/IROS.2010.5649043"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Hess, W., Kohler, D., Rapp, H., and Andor, D. (2016, January 16\u201321). Real-time Loop Closure in 2D LIDAR SLAM. Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA) IEEE, Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487258"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Wang, S., Yang, D., Wu, Y., Liu, Y., and Sheng, H. (2022, January 10\u201314). Tracking Game: Self-adaptative Agent based Multi-object Tracking. Proceedings of the 30th ACM International Conference on Multimedia (MM \u201922), Lisbon, Portugal.","DOI":"10.1145\/3503161.3548231"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Sun, P., Jiang, Y., Yu, D., Weng, F., Yuan, Z., Luo, P., Liu, W., and Wang, X. (2022, January 23\u201327). ByteTrack: Multi-Object Tracking by Associating Every Detection Box. Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-20047-2_1"},{"key":"ref_33","unstructured":"Dendorfer, P., Yugay, V., and O\u0161ep, A. (2022). Systems Quo Vadis: Is Trajectory Forecasting the Key Towards Long-Term Multi-Object Tracking?. arXiv."},{"key":"ref_34","unstructured":"Nasseri, M., Babaee, M., Moradi, H., and Hosseini, R. (2022). Fast Online and Relational Tracking. arXiv."},{"key":"ref_35","unstructured":"Aharon, N., Orfaig, R., and Bobrovsky, B. (2022). BoT-SORT: Robust Associations Multi-Pedestrian Tracking. arXiv."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Stadler, D., and Beyerer, J. (2022, January 21\u201325). BYTEv2: Associating More Detection Boxes under Occlusion for Improved Multi-Person Tracking. 
Proceedings of the ICPR Workshops 2022, Montr\u00e9al, QC, Canada.","DOI":"10.1007\/978-3-031-37660-3_6"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Solera, F., Calderara, S., and Cucchiara, R. (2015). Towards the Evaluation of Reproducible Robustness in Tracking-by-Detection, AVSS.","DOI":"10.1109\/AVSS.2015.7301755"},{"key":"ref_38","unstructured":"Wen, L., Du, D., Cai, Z., Lei, Z., Chang, M.C., Qi, H., Lim, J., Yang, M.H., and Lyu, S. (2015). A New Benchmark and Protocol for Multi-Object Detection and Tracking. arXiv."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Wu, C.W., Zhong, M.T., Tsao, Y., Yang, S.W., Chen, Y.K., and Chien, S.Y. (2017, January 21\u201326). Track-Clustering Error Evaluation for Track-Based Multi-camera Tracking System Employing Human Re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops 2017, Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.184"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Ristani, E., Solera, F., Zou, R., Cucchiara, R., and Tomasi, C. (2016, January 9). Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking. Proceedings of the ECCV 2016 Workshop on Benchmarking Multi-Target Tracking, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-48881-3_2"},{"key":"ref_41","unstructured":"Weber, M., Osep, A., and Leal-Taix\u00e9, L. (2022, October 01). The Multiple Object Tracking Benchmark. Available online: https:\/\/motchallenge.net\/results\/MOT17\/?det=Public."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., and Ren, N. (2020, January 23\u201328). Nerf: Representing scenes as neural radiance fields for view synthesis. 
Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_24"},{"key":"ref_43","first-page":"10","article-title":"Tightly-coupled integration of acoustic signal and MEMS sensors on smartphones for indoor positioning","volume":"50","author":"Chen","year":"2021","journal-title":"Acta Geod. Et Cartogr. Sin."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Zhang, X.W., Zheng, W.Y., and Chen, Y. (2022). A Group Learning Based Optimization Algorithm Applied to UWB Positioning, IOP Publishing Ltd.","DOI":"10.1088\/1742-6596\/2294\/1\/012001"},{"key":"ref_45","first-page":"1179","article-title":"Bluetooth-controlled Parking System Based on WiFi Positioning Technology","volume":"34","author":"Chen","year":"2022","journal-title":"Sens. Mater. Int. J. Sens. Technol."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/24\/6354\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:42:04Z","timestamp":1760146924000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/24\/6354"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,15]]},"references-count":45,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["rs14246354"],"URL":"https:\/\/doi.org\/10.3390\/rs14246354","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,15]]}}}