{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,7]],"date-time":"2026-02-07T20:39:13Z","timestamp":1770496753276,"version":"3.49.0"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2023,11,22]],"date-time":"2023-11-22T00:00:00Z","timestamp":1700611200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,11,22]],"date-time":"2023-11-22T00:00:00Z","timestamp":1700611200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100009110","name":"Natural Science Foundation of Xinjiang Province","doi-asserted-by":"publisher","award":["2022D01B05"],"award-info":[{"award-number":["2022D01B05"]}],"id":[{"id":"10.13039\/100009110","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,4]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Multi-object tracking (MOT) aims to locate and identify objects in videos. As deep learning brings excellent performances to object detection, the tracking-by-detection (TBD) has gradually become a mainstream tracking framework. However, some drawbacks still exist in the current TBD framework: (1) inaccurate prediction of the bounding boxes would occur in the detection part, which is caused by overlooking the actual pedestrian ratio in the surveillance scene. (2) The width of the bounding boxes in the next frame might be indirectly predicted by the aspect ratio, which increases the error of width prediction in the motion prediction part. 
(3) Association is only performed for high-confidence detection boxes, and the low-confidence boxes caused by occlusion are discarded in the data association part, resulting in fragmentation of trajectories. To address the above issues, we propose a multi-target tracking model incorporating motion estimation and multi-stage association (MEMA). First, the aspect ratio of the ground-truth bounding box is introduced to improve the fit between the detected and ground-truth bounding boxes, and we design an elliptical Gaussian kernel to improve the positioning accuracy of the object center point. Then, the prediction state vector of the Kalman filter is modified to predict the width and its corresponding velocity directly. This can reduce the width error of the predicted box and eliminate the velocity error of the motion estimation, which leads to a predicted bounding box better suited to pedestrians. Finally, we propose a multi-stage association strategy to associate boxes of different confidence levels. Without using appearance features, the strategy can reduce the impact of occlusion and improve the tracking performance. 
On the MOT17 test set, the method proposed in this paper achieves a MOTA of 74.3% and an IDF1 of 72.4%, outperforming the current SOTA.<\/jats:p>","DOI":"10.1007\/s40747-023-01273-3","type":"journal-article","created":{"date-parts":[[2023,11,22]],"date-time":"2023-11-22T09:08:35Z","timestamp":1700644115000},"page":"2445-2458","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Motion estimation and multi-stage association for tracking-by-detection"],"prefix":"10.1007","volume":"10","author":[{"given":"Ye","family":"Li","sequence":"first","affiliation":[]},{"given":"Lei","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Yiping","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Xinzhong","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Guangqiang","family":"Yin","sequence":"additional","affiliation":[]},{"given":"Zhiguo","family":"Wang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,11,22]]},"reference":[{"key":"1273_CR1","doi-asserted-by":"crossref","unstructured":"Bewley A, Ge Z, Ott L, Ramos F, Upcroft B (2016) Simple online and realtime tracking. In: 2016 IEEE International Conference on image processing (ICIP), pp 3464\u20133468","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"1273_CR2","doi-asserted-by":"crossref","unstructured":"Wojke N, Bewley A, Paulus D (2017) Simple online and realtime tracking with a deep association metric. In: 2017 IEEE International Conference on image processing (ICIP), pp 3645\u20133649","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"1273_CR3","doi-asserted-by":"crossref","unstructured":"Chen L, Ai H, Zhuang Z, Shang C (2018) Real-time multiple people tracking with deeply learned candidate selection and person re-identification. 
In: 2018 IEEE International Conference on multimedia and expo (ICME), pp 1\u20136","DOI":"10.1109\/ICME.2018.8486597"},{"key":"1273_CR4","doi-asserted-by":"crossref","unstructured":"Yu F, Li W, Li Q, Liu Y, Shi X, Yan J (2016) Poi: multiple object tracking with high performance detection and appearance feature. In: European Conference on computer vision, pp 36\u201342","DOI":"10.1007\/978-3-319-48881-3_3"},{"key":"1273_CR5","doi-asserted-by":"publisher","first-page":"7077","DOI":"10.1007\/s11042-018-6467-6","volume":"78","author":"M Nima","year":"2019","unstructured":"Nima M, Mohammad AS, Mohammad R (2019) Multi-target tracking using cnn-based features: Cnnmtt. Multimed Tools Appl 78:7077\u20137096","journal-title":"Multimed Tools Appl"},{"key":"1273_CR6","doi-asserted-by":"crossref","unstructured":"Zhou Z, Xing J, Zhang M, Hu W (2018) Online multi-target tracking with tensor-based high-order graph matching. In: 2018 24th International Conference on pattern recognition (ICPR), pp 1051\u20134651","DOI":"10.1109\/ICPR.2018.8545450"},{"key":"1273_CR7","doi-asserted-by":"crossref","unstructured":"Fang K, Xiang Y, Li X, Savarese S (2018) Recurrent autoregressive networks for online multi-object tracking. In: 2018 IEEE Winter Conference on applications of computer vision (WACV), pp 466\u2013475","DOI":"10.1109\/WACV.2018.00057"},{"key":"1273_CR8","doi-asserted-by":"crossref","unstructured":"Duan K, Bai S, Xie L, Qi H, Huang Q, Tian Q (2019) Centernet: Keypoint triplets for object detection. In: Proceedings of the IEEE\/CVF International Conference on computer vision, pp 6569\u20136578","DOI":"10.1109\/ICCV.2019.00667"},{"issue":"11","key":"1273_CR9","doi-asserted-by":"publisher","first-page":"3069","DOI":"10.1007\/s11263-021-01513-4","volume":"129","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Wang C, Wang X, Zeng W, Liu W (2021) Fairmot: on the fairness of detection and re-identification in multiple object tracking. 
Int J Comput Vis 129(11):3069\u20133087","journal-title":"Int J Comput Vis"},{"issue":"6","key":"1273_CR10","first-page":"1137","volume":"39","author":"S Ren","year":"2017","unstructured":"Ren S, He K, Girshick R, Sun J (2017) Faster r-cnn: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 39(6):1137\u20131149","journal-title":"Adv Neural Inf Process Syst"},{"key":"1273_CR11","doi-asserted-by":"crossref","unstructured":"Kim C, Li F, Ciptadi A, Rehg JM (2015) Multiple hypothesis tracking revisited. In IEEE International Conference on computer vision","DOI":"10.1109\/ICCV.2015.533"},{"key":"1273_CR12","doi-asserted-by":"crossref","unstructured":"Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg A\u00a0C (2016) Ssd: Single shot multibox detector. In: Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11\u201314, 2016, Proceedings, Part I 14, pp 21\u201337","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"1273_CR13","unstructured":"Redmon J, Farhadi A (2018) Yolov3: an incremental improvement. arXiv preprint arXiv:1804.02767"},{"key":"1273_CR14","doi-asserted-by":"crossref","unstructured":"Redmon J, Farhadi A (2017) Yolo9000: better, faster, stronger. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 7263\u20137271","DOI":"10.1109\/CVPR.2017.690"},{"key":"1273_CR15","doi-asserted-by":"crossref","unstructured":"Xu Y, Osep A, Ban Y, Horaud R, Leal-Taixe L, Alameda-Pineda X (2019) How to train your deep multi-object tracker. In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, pp 6786\u20136795","DOI":"10.1109\/CVPR42600.2020.00682"},{"key":"1273_CR16","doi-asserted-by":"crossref","unstructured":"Pang B, Li Y, Zhang Y, Li M, Lu C (2020) Tubetk: Adopting tubes to track multi-object in a one-step training model. 
In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, pp 6307\u20136317","DOI":"10.1109\/CVPR42600.2020.00634"},{"key":"1273_CR17","doi-asserted-by":"crossref","unstructured":"Feichtenhofer C, Pinz A, Zisserman A (2017) Detect to track and track to detect. In: Proceedings of the IEEE International Conference on computer vision, pp 3038\u20133046","DOI":"10.1109\/ICCV.2017.330"},{"key":"1273_CR18","doi-asserted-by":"crossref","unstructured":"Wang Z, Zheng L, Liu Y, Li Y, Wang S (2020) Towards real-time multi-object tracking. In: Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XI 16, pp 107\u2013122","DOI":"10.1007\/978-3-030-58621-8_7"},{"key":"1273_CR19","doi-asserted-by":"crossref","unstructured":"Bergmann P, Meinhardt T, Leal-Taixe L (2019) Tracking without bells and whistles. In: Proceedings of the IEEE\/CVF International Conference on computer vision, pp 941\u2013951","DOI":"10.1109\/ICCV.2019.00103"},{"key":"1273_CR20","unstructured":"Zhou X, Wang D, Kr\u00e4henb\u00fchl P (2019) Objects as points. arXiv preprint arXiv:1904.07850"},{"key":"1273_CR21","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"1273_CR22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1155\/2008\/246309","volume":"2008","author":"K Bernardin","year":"2008","unstructured":"Bernardin K, Stiefelhagen R (2008) Evaluating multiple object tracking performance: the clear mot metrics. EURASIP J Image Video Process 2008:1\u201310","journal-title":"EURASIP J Image Video Process"},{"key":"1273_CR23","doi-asserted-by":"crossref","unstructured":"Ristani E, Solera F, Zou R, Cucchiara R, Tomasi C (2016) Performance measures and a data set for multi-target, multi-camera tracking. 
In: Computer Vision\u2013ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part II, pp 17\u201335","DOI":"10.1007\/978-3-319-48881-3_2"},{"key":"1273_CR24","doi-asserted-by":"publisher","first-page":"548","DOI":"10.1007\/s11263-020-01375-2","volume":"129","author":"J Luiten","year":"2021","unstructured":"Luiten J, Osep A, Dendorfer P, Torr P, Geiger A, Leal-Taix\u00e9 L, Leibe B (2021) Hota: a higher order metric for evaluating multi-object tracking. Int J Comput Vis 129:548\u2013578","journal-title":"Int J Comput Vis"},{"key":"1273_CR25","doi-asserted-by":"crossref","unstructured":"Sanchez-Matilla R, Poiesi F, Cavallaro A (2016) Online multi-target tracking with strong and weak detections. In: Computer Vision\u2013ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part II 14, pp 84\u201399","DOI":"10.1007\/978-3-319-48881-3_7"},{"issue":"9","key":"1273_CR26","doi-asserted-by":"publisher","first-page":"7892","DOI":"10.1109\/JIOT.2020.2996609","volume":"7","author":"Y Zhang","year":"2020","unstructured":"Zhang Y, Sheng H, Wu Y, Wang S, Ke W, Xiong Z (2020) Multiplex labeling graph for near-online tracking in crowded scenes. IEEE Internet Things J 7(9):7892\u20137902","journal-title":"IEEE Internet Things J"},{"issue":"3","key":"1273_CR27","doi-asserted-by":"publisher","first-page":"913","DOI":"10.1007\/s11554-020-01054-y","volume":"18","author":"M Meneses","year":"2021","unstructured":"Meneses M, Matos L, Prado B, Carvalho A, Macedo H (2021) Smartsort: an mlp-based method for tracking multiple objects in real-time. J Real-Time Image Proc 18(3):913\u2013921","journal-title":"J Real-Time Image Proc"},{"key":"1273_CR28","doi-asserted-by":"crossref","unstructured":"Wang Y, Kitani K, Weng X (2021) Joint object detection and multi-object tracking with graph neural networks. 
In: 2021 IEEE International Conference on robotics and automation (ICRA), pp 13708\u201313715","DOI":"10.1109\/ICRA48506.2021.9561110"},{"key":"1273_CR29","doi-asserted-by":"crossref","unstructured":"Peng J, Wang C, Wan F, Wu Y, Wang Y, Tai Y, Wang C, Li J, Huang F, Fu Y (2020) Chained-tracker: Chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking. In: Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part IV 16, pp 145\u2013161","DOI":"10.1007\/978-3-030-58548-8_9"},{"key":"1273_CR30","doi-asserted-by":"crossref","unstructured":"Pang J, Qiu L, Li X, Chen H, Li Q, Darrell T, Yu F (2021) Quasi-dense similarity learning for multiple object tracking. In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, pp 164\u2013173","DOI":"10.1109\/CVPR46437.2021.00023"},{"key":"1273_CR31","doi-asserted-by":"crossref","unstructured":"Wu J, Cao J, Song L, Wang Y, Yang M, Yuan J (2021) Track to detect and segment: an online multi-object tracker. In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, pp 12352\u201312361","DOI":"10.1109\/CVPR46437.2021.01217"},{"issue":"1","key":"1273_CR32","first-page":"104","volume":"43","author":"S Sun","year":"2019","unstructured":"Sun S, Akhtar N, Song H, Mian A, Shah M (2019) Deep affinity network for multiple object tracking. IEEE Trans Pattern Anal Mach Intell 43(1):104\u2013119","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1273_CR33","doi-asserted-by":"publisher","first-page":"75","DOI":"10.1016\/j.neucom.2021.12.104","volume":"476","author":"S Han","year":"2022","unstructured":"Han S, Huang P, Wang H, Yu E, Liu D, Pan X (2022) Mat: motion-aware multi-object tracking. 
Neurocomputing 476:75\u201386","journal-title":"Neurocomputing"},{"key":"1273_CR34","unstructured":"Xu Y, Ban Y, Delorme G, Gan G, Rus D, Alameda-Pineda X (2021) Transcenter: transformers with dense queries for multiple-object tracking. arXiv e-prints arXiv\u20132103"},{"key":"1273_CR35","unstructured":"Sun P, Cao J, Jiang Y, Zhang R, Xie E, Yuan Z, Wang C, Luo P (2020) Transtrack: multiple object tracking with transformer. arXiv preprint arXiv:2012.15460"},{"key":"1273_CR36","doi-asserted-by":"crossref","unstructured":"Wang Q, Zheng Y, Pan P, Xu Y (2021) Multiple object tracking with correlation learning. In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, pp 3876\u20133886","DOI":"10.1109\/CVPR46437.2021.00387"},{"key":"1273_CR37","unstructured":"Li W, Xiong Y, Yang S, Xu M, Wang Y, Xia W (2021) Semi-tcl: semi-supervised track contrastive representation learning. arXiv preprint arXiv:2107.02396"},{"key":"1273_CR38","doi-asserted-by":"publisher","first-page":"3182","DOI":"10.1109\/TIP.2022.3165376","volume":"31","author":"C Liang","year":"2022","unstructured":"Liang C, Zhang Z, Zhou X, Li B, Zhu S, Hu W (2022) Rethinking the competition between detection and reid in multiobject tracking. IEEE Trans Image Process 31:3182\u20133196","journal-title":"IEEE Trans Image Process"},{"key":"1273_CR39","doi-asserted-by":"crossref","unstructured":"Liang C, Zhang Z, Zhou X, Li B, Hu W (2022) One more check: making \u201cfake background\u201d be tracked again. In: Proceedings of the AAAI Conference on. artif intell 36:1546\u20131554","DOI":"10.1609\/aaai.v36i2.20045"},{"key":"1273_CR40","doi-asserted-by":"crossref","unstructured":"Yu E, Li Z, Han S, Wang H (2023) RelationTrack: Relation-Aware Multiple Object Tracking With Decoupled Representation, In: IEEE Transactions on Multimedia, vol. 25, pp. 2686\u20132697. 
https:\/\/doi.org\/10.1109\/TMM.2022.3150169","DOI":"10.1109\/TMM.2022.3150169"},{"key":"1273_CR41","doi-asserted-by":"publisher","unstructured":"Nasseri MH, Babaee M, Moradi H, Hosseini R (2023) Online relational tracking with camera motion suppression. J Vis Commun Image Represent 90:103750. https:\/\/doi.org\/10.1016\/j.jvcir.2022.103750. https:\/\/www.sciencedirect.com\/science\/article\/pii\/S104732032200270X","DOI":"10.1016\/j.jvcir.2022.103750"},{"key":"1273_CR42","doi-asserted-by":"crossref","unstructured":"Chu P, Wang J, You Q, Ling H, Liu Z (2023) Transmot: spatial-temporal graph transformer for multiple object tracking. In: Proceedings of the IEEE\/CVF Winter Conference on applications of computer vision, pp 4870\u20134880","DOI":"10.1109\/WACV56688.2023.00485"},{"key":"1273_CR43","doi-asserted-by":"crossref","unstructured":"Zhang Y, Sun P, Jiang Y, Yu D, Weng F, Yuan Z, Luo P, Liu W, Wang X (2022) Bytetrack: multi-object tracking by associating every detection box. In: European Conference on computer vision, Springer, pp 1\u201321","DOI":"10.1007\/978-3-031-20047-2_1"},{"issue":"5","key":"1273_CR44","doi-asserted-by":"publisher","first-page":"1188","DOI":"10.3390\/s19051188","volume":"19","author":"L Zhang","year":"2019","unstructured":"Zhang L, Gray H, Ye X, Collins L, Allinson N (2019) Automatic individual pig detection and tracking in pig farms. Sensors 19(5):1188","journal-title":"Sensors"},{"key":"1273_CR45","doi-asserted-by":"crossref","unstructured":"Kim C, Li F, Rehg JM (2018) Multi-object tracking with neural gating using bilinear lstm. In: Proceedings of the European Conference on computer vision (ECCV), pp 200\u2013215","DOI":"10.1007\/978-3-030-01237-3_13"},{"key":"1273_CR46","doi-asserted-by":"crossref","unstructured":"Zhou X, Koltun V, Kr\u00e4henb\u00fchl P (2020) Tracking objects as points. 
In: Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part IV, pp 474\u2013490","DOI":"10.1007\/978-3-030-58548-8_28"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01273-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-023-01273-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01273-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,3,30]],"date-time":"2024-03-30T15:29:28Z","timestamp":1711812568000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-023-01273-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,22]]},"references-count":46,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2024,4]]}},"alternative-id":["1273"],"URL":"https:\/\/doi.org\/10.1007\/s40747-023-01273-3","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,11,22]]},"assertion":[{"value":"17 April 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 October 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 November 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The 
authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}