{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T14:41:04Z","timestamp":1775745664644,"version":"3.50.1"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2024,5,11]],"date-time":"2024-05-11T00:00:00Z","timestamp":1715385600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,5,11]],"date-time":"2024-05-11T00:00:00Z","timestamp":1715385600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"the Key project of Chongqing Technology Innovation and Application Development","award":["Grant No.cstc2021jscx-dxwtBX0018"],"award-info":[{"award-number":["Grant No.cstc2021jscx-dxwtBX0018"]}]},{"name":"Natural Science Foundation of Chongqing,China","award":["Grant No.CSTB2022NSCQ-MSX0493"],"award-info":[{"award-number":["Grant No.CSTB2022NSCQ-MSX0493"]}]},{"name":"Chongqing Postgraduate Scientific Research Innovation Project and the Action Plan for the High-quality Development of Postgraduate Education of Chongqing University of Technology","award":["Grant No. gzlcx20233218"],"award-info":[{"award-number":["Grant No. gzlcx20233218"]}]},{"name":"Chongqing Postgraduate Scientific Research Innovation Project and the Action Plan for the High-quality Development of Postgraduate Education of Chongqing University of Technology","award":["Grant No. gzlcx20222062"],"award-info":[{"award-number":["Grant No. gzlcx20222062"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. 
Syst."],"published-print":{"date-parts":[[2024,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Multi-object tracking (MOT) is a task to identify objects in videos, however, objects with similar appearance or occlusion may cause frequent ID switching, which is the main challenge of current MOT. In this paper, we propose a novel self-cross graph neural network-based multi-object tracking method, which we termed as SCGTracker. This method seamlessly integrates object detection and tracking through graph neural networks, building upon the foundation of the JDE paradigm. Specifically, we construct graph structures to capture the correlation between objects in both spatial and temporal dimensions. To further tackle the frequent ID switching problem, we employ an attention mechanism to aggregate object context information within the same frame and across different frames, updating the object information via graph neural networks to derive highly distinctive appearance features. Ultimately, the obtained strongly distinguishable object appearance features serve to mitigate the issue of frequent object ID switches. In experiments conducted on the MOT17 test set, our proposed method yields promising results, achieving a 73% Multiple Object Tracking Accuracy (MOTA) and a 73.2% ID <jats:italic>F<\/jats:italic>1 score. 
Furthermore, it demonstrates a substantial reduction in ID switches compared with state-of-the-art methods.<\/jats:p>","DOI":"10.1007\/s40747-024-01426-y","type":"journal-article","created":{"date-parts":[[2024,5,11]],"date-time":"2024-05-11T16:01:36Z","timestamp":1715443296000},"page":"5513-5527","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["SCGTracker: object feature embedding enhancement based on graph attention networks for multi-object tracking"],"prefix":"10.1007","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-0766-5164","authenticated-orcid":false,"given":"Xin","family":"Feng","sequence":"first","affiliation":[]},{"given":"Xiaoning","family":"Jiao","sequence":"additional","affiliation":[]},{"given":"Siping","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Zhixian","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Yan","family":"Liu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,5,11]]},"reference":[{"key":"1426_CR1","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s40747-020-00206-8","volume":"2008","author":"K Bernardin","year":"2008","unstructured":"Bernardin K, Stiefelhagen R (2008) Evaluating multiple object tracking performance: the CLEAR MOT metrics. EURASIP J Image Video Process 2008:1\u201310. https:\/\/doi.org\/10.1007\/s40747-020-00206-8","journal-title":"EURASIP J Image Video Process"},{"key":"1426_CR2","doi-asserted-by":"publisher","unstructured":"Bewley A, Ge Z, Ott L, et al (2016) Simple online and real-time tracking. In: 2016 IEEE International Conference on image processing (ICIP), pp. 3464\u20133468. https:\/\/doi.org\/10.1109\/ICIP.2016.7533003","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"1426_CR3","doi-asserted-by":"publisher","unstructured":"Bras\u00f3 G, Leal-Taix\u00e9 L (2020) Learning a neural solver for multiple object tracking. 
In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, 2020, pp 6247\u20136257. https:\/\/doi.org\/10.48550\/arXiv.1912.07515.","DOI":"10.48550\/arXiv.1912.07515"},{"key":"1426_CR4","doi-asserted-by":"publisher","unstructured":"Chu P, Wang J, You Q, et al (2023) Transmot: Spatial-temporal graph transformer for multiple object tracking. In: Proceedings of the IEEE\/CVF Winter Conference on applications of computer vision, 2023, pp 4870\u20134880. https:\/\/doi.org\/10.48550\/arXiv.2104.00194","DOI":"10.48550\/arXiv.2104.00194"},{"key":"1426_CR5","unstructured":"Dendorfer P, Rezatofighi H, Milan A, et al (2020) Mot20: A benchmark for multi-object tracking in crowded scenes. arXiv preprint arXiv:2003.09003"},{"key":"1426_CR6","doi-asserted-by":"publisher","unstructured":"Duan K, Bai S, Xie L, et al (2019) Centernet: keypoint triplets for object detection. In: Proceedings of the IEEE\/CVF International Conference on computer vision, 2019, pp 6569\u20136578. https:\/\/doi.org\/10.1109\/ICCV.2019.00667.","DOI":"10.1109\/ICCV.2019.00667"},{"key":"1426_CR7","doi-asserted-by":"publisher","unstructured":"Fabbri M, Bras\u00f3 G, Maugeri G, et al (2021) Motsynth: How can synthetic data help pedestrian detection and tracking? In: Proceedings of the IEEE\/CVF International Conference on computer vision. 2021, pp 10849\u201310859. https:\/\/doi.org\/10.1109\/ICCV48922.2021.01067.","DOI":"10.1109\/ICCV48922.2021.01067"},{"key":"1426_CR8","doi-asserted-by":"publisher","unstructured":"He J, Huang Z, Wang N, et al (2021) Learnable graph matching: Incorporating graph partitioning with deep feature learning for multiple object tracking. In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, 2021, pp 5299\u20135309. 
https:\/\/doi.org\/10.48550\/arXiv.2103.16178.","DOI":"10.48550\/arXiv.2103.16178"},{"issue":"1","key":"1426_CR9","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1088\/0004-637X\/823\/1\/37","volume":"823","author":"YD Hezaveh","year":"2016","unstructured":"Hezaveh YD, Dalal N, Marrone DP et al (2016) Detection of lensing substructure using ALMA observations of the dusty galaxy SDP.81. Astrophys J 823(1):37. https:\/\/doi.org\/10.1088\/0004-637X\/823\/1\/37","journal-title":"Astrophys J"},{"key":"1426_CR10","doi-asserted-by":"publisher","unstructured":"Hyun J, Kang M, Wee D, et al (2023) Detection recovery in online multi-object tracking with sparse graph tracker. In: Proceedings of the IEEE\/CVF Winter Conference on applications of computer vision, 2023, pp 4850\u20134859. https:\/\/doi.org\/10.48550\/arXiv.2205.00968.","DOI":"10.48550\/arXiv.2205.00968"},{"issue":"19","key":"1426_CR11","doi-asserted-by":"publisher","first-page":"14658","DOI":"10.1007\/s40747-020-00206-8","volume":"8","author":"H Jiang","year":"2021","unstructured":"Jiang H, Wang M, Liu D et al (2021) Ctrack: acoustic device-free and collaborative hands motion tracking on smartphones. IEEE Internet Things J 8(19):14658\u201314671. https:\/\/doi.org\/10.1007\/s40747-020-00206-8","journal-title":"IEEE Internet Things J"},{"key":"1426_CR12","doi-asserted-by":"publisher","unstructured":"Ke L, Li X, Danelljan M, et al (2021) Prototypical cross-attention networks for multiple object tracking and segmentation. In: Advances in neural information processing systems 34, pp 1192\u20131203. https:\/\/doi.org\/10.48550\/arXiv.2106.11958.","DOI":"10.48550\/arXiv.2106.11958"},{"key":"1426_CR13","doi-asserted-by":"publisher","first-page":"114535","DOI":"10.1109\/ACCESS.2021.3105118","volume":"9","author":"J Lee","year":"2021","unstructured":"Lee J, Jeong M, Ko BC (2021) Graph convolution neural network-based data association for online multi-object tracking. IEEE Access 9:114535\u2013114546. 
https:\/\/doi.org\/10.1109\/ACCESS.2021.3105118","journal-title":"IEEE Access"},{"key":"1426_CR14","doi-asserted-by":"publisher","unstructured":"Li J, Gao X, Jiang T (2020) Graph networks for multiple object tracking. In: Proceedings of the IEEE\/CVF Winter Conference on applications of computer vision, pp 719\u2013728. IEEE. https:\/\/doi.org\/10.1109\/WACV45572.2020.9093347","DOI":"10.1109\/WACV45572.2020.9093347"},{"key":"1426_CR15","doi-asserted-by":"publisher","first-page":"3182","DOI":"10.48550\/arXiv.2010.12138","volume":"31","author":"C Liang","year":"2022","unstructured":"Liang C, Zhang Z, Zhou X et al (2022) Rethinking the competition between detection and reid in multiobject tracking. IEEE Trans Image Process 31:3182\u20133196. https:\/\/doi.org\/10.48550\/arXiv.2010.12138","journal-title":"IEEE Trans Image Process"},{"key":"1426_CR16","doi-asserted-by":"publisher","DOI":"10.1007\/s40747-020-00206-8","author":"H Liang","year":"2022","unstructured":"Liang H, Song H, Yun X et al (2022) Traffic incident detection based on a global trajectory spatiotemporal map. Complex Intell Syst. https:\/\/doi.org\/10.1007\/s40747-020-00206-8","journal-title":"Complex Intell Syst"},{"key":"1426_CR17","doi-asserted-by":"publisher","unstructured":"Martinez M, Stiefelhagen R (2019) Taming the cross entropy loss. In: Pattern Recognition: 40th German Conference, GCPR 2018, Proceedings 40, pp 628\u2013637. Springer. https:\/\/doi.org\/10.1007\/978-3-030-12939-2 43","DOI":"10.1007\/978-3-030-12939-2"},{"key":"1426_CR18","doi-asserted-by":"publisher","unstructured":"Meinhardt T, Kirillov A, Leal-Taixe L, et al (2022) Trackformer: Multiobject tracking with transformers. In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, pp 8844\u20138854. IEEE. 
https:\/\/doi.org\/10.48550\/arXiv.2101.02702","DOI":"10.48550\/arXiv.2101.02702"},{"key":"1426_CR19","doi-asserted-by":"publisher","unstructured":"Milan A, Leal-Taixe L, Reid I, et al (2016) Mot16: a benchmark for multiobject tracking. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops, pp 807\u2013823. Springer. https:\/\/doi.org\/10.1007\/978-3-319-46478-7 40","DOI":"10.1007\/978-3-319-46478-7"},{"key":"1426_CR20","doi-asserted-by":"publisher","unstructured":"Mills-Tettey GA, Stentz A, Dias MB (2007) The dynamic Hungarian algorithm for the assignment problem with changing costs. Robotics Institute, Pittsburgh, PA, Tech Rep CMU-RI-TR-07-27. https:\/\/doi.org\/10.1177\/0278364911404579","DOI":"10.1177\/0278364911404579"},{"key":"1426_CR21","doi-asserted-by":"publisher","first-page":"577","DOI":"10.1007\/s40747-020-00206-8","volume":"7","author":"C Ning","year":"2021","unstructured":"Ning C, Menglu L, Hao Y et al (2021) Survey of pedestrian detection with occlusion. Complex Intell Syst 7:577\u2013587. https:\/\/doi.org\/10.1007\/s40747-020-00206-8","journal-title":"Complex Intell Syst"},{"key":"1426_CR22","doi-asserted-by":"publisher","unstructured":"Pang J, Qiu L, Li X, et al (2021) Quasi-dense similarity learning for multiple object tracking. In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition, pp 164\u2013173. IEEE. https:\/\/doi.org\/10.48550\/arXiv.2006.06664","DOI":"10.48550\/arXiv.2006.06664"},{"key":"1426_CR23","doi-asserted-by":"publisher","unstructured":"Papakis I, Sarkar A, Karpatne A (2020) Gcnnmatch: Graph convolutional neural networks for multi-object tracking via sinkhorn normalization. arXiv preprint arXiv:2010.00067. https:\/\/doi.org\/10.1007\/s40747-020-00206-8","DOI":"10.1007\/s40747-020-00206-8"},{"key":"1426_CR24","unstructured":"Rangesh A, Maheshwari P, Gebre M, et al (2021) TrackMPNN: a message passing graph neural architecture for multi-object tracking. 
arXiv preprint arXiv:2101.04206."},{"key":"1426_CR25","doi-asserted-by":"publisher","unstructured":"Girshick R (2015) Fast R-CNN. In: Proceedings of the IEEE International Conference on computer vision, pp 1440\u20131448. https:\/\/doi.org\/10.1109\/ICCV.2015.169.","DOI":"10.1109\/ICCV.2015.169"},{"key":"1426_CR26","doi-asserted-by":"publisher","unstructured":"Shao S, Zhao Z, Li B, et al (2018) CrowdHuman: a benchmark for detecting humans in a crowd. arXiv preprint arXiv:1805.00123. https:\/\/doi.org\/10.48550\/arXiv.1805.00123","DOI":"10.48550\/arXiv.1805.00123"},{"issue":"1","key":"1426_CR27","doi-asserted-by":"publisher","first-page":"104","DOI":"10.1109\/TPAMI.2019.2896507","volume":"43","author":"S Sun","year":"2019","unstructured":"Sun S, Akhtar N, Song H et al (2019) Deep affinity network for multiple object tracking. IEEE Trans Pattern Anal Mach Intell 43(1):104\u2013119. https:\/\/doi.org\/10.1109\/TPAMI.2019.2896507","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1426_CR28","doi-asserted-by":"publisher","unstructured":"Tang S, Andriluka M, Andres B, et al (2017) Multiple people tracking by lifted multicut and person re-identification. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 3539\u20133548. https:\/\/doi.org\/10.1109\/CVPR.2017.394","DOI":"10.1109\/CVPR.2017.394"},{"key":"1426_CR29","doi-asserted-by":"publisher","unstructured":"Vaswani A, Shazeer N, Parmar N, et al (2017) Attention is all you need. In: Advances in Neural Information Processing Systems 30, pp 5998\u20136008. https:\/\/doi.org\/10.5555\/3295222.3295349","DOI":"10.5555\/3295222.3295349"},{"key":"1426_CR30","doi-asserted-by":"publisher","unstructured":"Wang Y, Kitani K, Weng X (2021) Joint object detection and multi-object tracking with graph neural networks. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), IEEE, pp 13708\u201313715. 
https:\/\/doi.org\/10.1109\/ICRA.2021.9553818","DOI":"10.1109\/ICRA.2021.9553818"},{"key":"1426_CR31","doi-asserted-by":"publisher","unstructured":"Wang Z, Zheng L, Liu Y, et al (2020) Towards real-time multi-object tracking. In: Computer Vision\u2014ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XI 16, Springer, pp 107\u2013122. https:\/\/doi.org\/10.48550\/arXiv.1909.12605","DOI":"10.48550\/arXiv.1909.12605"},{"key":"1426_CR32","doi-asserted-by":"publisher","unstructured":"Willner D, Chang C, Dunn K (1976) Kalman filter algorithms for a multi-sensor system. In: 1976 IEEE Conference on Decision and Control including the 15th Symposium on adaptive processes, IEEE, pp 570\u2013574. https:\/\/doi.org\/10.1109\/CDC.1976.267794","DOI":"10.1109\/CDC.1976.267794"},{"key":"1426_CR33","doi-asserted-by":"publisher","unstructured":"Wojke N, Bewley A, Paulus D (2017) Simple online and realtime tracking with a deep association metric. In: 2017 IEEE International Conference on image processing (ICIP), IEEE, pp 3645\u20133649. https:\/\/doi.org\/10.1109\/ICIP.2017.8296962","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"1426_CR34","doi-asserted-by":"publisher","unstructured":"Wu J, Cao J, Song L, et al (2021) Track to detect and segment: An online multi-object tracker. In: Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition (CVPR), pp 12352\u201312361. https:\/\/doi.org\/10.1109\/CVPR42934.2021.01253","DOI":"10.1109\/CVPR42934.2021.01253"},{"key":"1426_CR35","doi-asserted-by":"publisher","unstructured":"Yu F, Wang D, Shelhamer E, et al (2018) Deep layer aggregation. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 2403\u20132412. 
https:\/\/doi.org\/10.48550\/arXiv.1707.06484","DOI":"10.48550\/arXiv.1707.06484"},{"issue":"2","key":"1426_CR36","doi-asserted-by":"publisher","first-page":"5103","DOI":"10.48550\/arXiv.2104.11747","volume":"7","author":"JN Zaech","year":"2022","unstructured":"Zaech JN, Liniger A, Dai D et al (2022) Learnable online graph representations for 3D multi-object tracking. IEEE Robot Autom Lett 7(2):5103\u20135110. https:\/\/doi.org\/10.48550\/arXiv.2104.11747","journal-title":"IEEE Robot Autom Lett"},{"key":"1426_CR37","doi-asserted-by":"publisher","unstructured":"Zeng F, Dong B, Zhang Y, et al (2022) MOTR: End-to-end multiple-object tracking with transformer. In: European Conference on computer vision, pp 659\u2013675. Springer. https:\/\/doi.org\/10.48550\/arXiv.2105.03247","DOI":"10.48550\/arXiv.2105.03247"},{"key":"1426_CR38","doi-asserted-by":"publisher","unstructured":"Zhang W, He L, Chen P, et al (2021) Boosting end-to-end multi-object tracking and person search via knowledge distillation. In: Proceedings of the 29th ACM International Conference on multimedia, pp 1192\u20131201. ACM. https:\/\/doi.org\/10.1145\/3474085.3481546","DOI":"10.1145\/3474085.3481546"},{"key":"1426_CR39","doi-asserted-by":"publisher","first-page":"3069","DOI":"10.1007\/s40747-020-00206-8","volume":"129","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Wang C, Wang X et al (2021) Fairmot: On the fairness of detection and re-identification in multiple object tracking. Int J Comput Vision 129:3069\u20133087. https:\/\/doi.org\/10.1007\/s40747-020-00206-8","journal-title":"Int J Comput Vision"},{"issue":"1","key":"1426_CR40","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1109\/TCI.2016.2644865","volume":"3","author":"H Zhao","year":"2016","unstructured":"Zhao H, Gallo O, Frosio I et al (2016) Loss functions for image restoration with neural networks. IEEE Trans Comput Imaging 3(1):47\u201357. 
https:\/\/doi.org\/10.1109\/TCI.2016.2644865","journal-title":"IEEE Trans Comput Imaging"},{"key":"1426_CR41","doi-asserted-by":"publisher","first-page":"57","DOI":"10.48550\/arXiv.1812.08434","volume":"1","author":"J Zhou","year":"2020","unstructured":"Zhou J, Cui G, Hu S et al (2020) Graph neural networks: a review of methods and applications. AI Open 1:57\u201381. https:\/\/doi.org\/10.48550\/arXiv.1812.08434","journal-title":"AI Open"},{"key":"1426_CR42","doi-asserted-by":"publisher","unstructured":"Zhou X, Koltun V, Kr\u00e4henb\u00fchl P (2020) Tracking objects as points. In: Proceedings of the 16th European Conference on computer vision (ECCV2020), Glasgow, UK, August 23\u201328, 2020, Part IV, pp 474\u2013490. Springer. https:\/\/doi.org\/10.48550\/arXiv.2004.01177","DOI":"10.48550\/arXiv.2004.01177"},{"key":"1426_CR43","doi-asserted-by":"publisher","unstructured":"Zhu X, Lyu S, Wang X, et al (2021) Tph-yolov5: Improved yolov5 based on transformer prediction head for object detection on drone-captured scenarios. In: Proceedings of the IEEE\/CVF International Conference on computer vision, pp 2778\u20132788. 
https:\/\/doi.org\/10.1109\/ICCVW54120.2021.00312","DOI":"10.1109\/ICCVW54120.2021.00312"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01426-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-024-01426-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01426-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,17]],"date-time":"2024-07-17T17:24:57Z","timestamp":1721237097000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-024-01426-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,11]]},"references-count":43,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,8]]}},"alternative-id":["1426"],"URL":"https:\/\/doi.org\/10.1007\/s40747-024-01426-y","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,11]]},"assertion":[{"value":"17 July 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 March 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"11 May 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"There is no conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}