{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,1]],"date-time":"2026-03-01T05:34:19Z","timestamp":1772343259016,"version":"3.50.1"},"reference-count":22,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2024,7,3]],"date-time":"2024-07-03T00:00:00Z","timestamp":1719964800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,7,3]],"date-time":"2024-07-03T00:00:00Z","timestamp":1719964800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2024,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The current paradigm of joint detection and tracking still requires a large amount of instance-level trajectory annotation, which incurs high annotation costs. Moreover, treating embedding training as a classification problem would lead to difficulties in model fitting. In this paper, we propose a new self-supervised multi-object tracking method based on the real-time joint detection and embedding (JDE) framework, which we term self-supervised multi-object tracking (SS-MOT). In SS-MOT, the short-term temporal correlations between objects within and across adjacent video frames are both used as self-supervised constraints, where the distances between different objects are enlarged while the distances between the same object in adjacent frames are reduced. In addition, short trajectories are formed by matching pairs of adjacent frames using a matching algorithm, and these matched pairs are treated as positive samples. The distances between positive samples are then minimized to further improve the feature representation of the same object. 
Therefore, our method can be trained on videos without instance-level annotations. We apply our approach to state-of-the-art JDE models, such as FairMOT, Cstrack, and SiamMOT, and achieve comparable results to these supervised methods on the widely used MOT17 and MOT20 challenges.<\/jats:p>","DOI":"10.1007\/s40747-024-01475-3","type":"journal-article","created":{"date-parts":[[2024,7,3]],"date-time":"2024-07-03T05:03:35Z","timestamp":1719983015000},"page":"7077-7088","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Self-supervised multi-object tracking based on metric learning"],"prefix":"10.1007","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0009-0009-0766-5164","authenticated-orcid":false,"given":"Xin","family":"Feng","sequence":"first","affiliation":[]},{"given":"Yan","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Hanzhi","family":"Yang","sequence":"additional","affiliation":[]},{"given":"Xiaoning","family":"Jiao","sequence":"additional","affiliation":[]},{"given":"Zhi","family":"Liu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,7,3]]},"reference":[{"key":"1475_CR1","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2020.103448","volume":"293","author":"W Luo","year":"2021","unstructured":"Luo W, Xing J, Milan A, Zhang X, Liu W, Kim T-K (2021) Multiple object tracking: a literature review. Artificial Intell 293:103448","journal-title":"Artificial Intell"},{"key":"1475_CR2","doi-asserted-by":"publisher","first-page":"32650","DOI":"10.1109\/ACCESS.2021.3060821","volume":"9","author":"L Kalake","year":"2021","unstructured":"Kalake L, Wan W, Hou L (2021) Analysis based on recent deep learning approaches applied in real-time multi-object tracking: a review. 
IEEE Access 9:32650\u201332671","journal-title":"IEEE Access"},{"key":"1475_CR3","doi-asserted-by":"crossref","unstructured":"Du Y, Zhicheng Z, Yang S, Yanyun Z, Su F, Tao G, Hongying M (2023) Strongsort: Make deepsort great again. IEEE Transactions on Multimedia","DOI":"10.1109\/TMM.2023.3240881"},{"key":"1475_CR4","doi-asserted-by":"publisher","first-page":"3069","DOI":"10.1007\/s11263-021-01513-4","volume":"129","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Wang C, Wang X, Zeng W, Liu W (2021) Fairmot: On the fairness of detection and re-identification in multiple object tracking. Int J Comput Vis 129:3069\u20133087","journal-title":"Int J Comput Vis"},{"key":"1475_CR5","unstructured":"Laura L-T, Anton M, Ian R, Stefan R, Konrad S (2015) Motchallenge 2015: Towards a benchmark for multi-target tracking. arXiv preprint arXiv:1504.01942"},{"key":"1475_CR6","doi-asserted-by":"crossref","unstructured":"Santiago M, Michael G, Dengxin D, Luc VG (2017) Pathtrack: Fast trajectory annotation with path supervision. In: Proceedings of the IEEE International Conference on Computer Vision, pages 290\u2013299","DOI":"10.1109\/ICCV.2017.40"},{"issue":"9","key":"1475_CR7","doi-asserted-by":"publisher","first-page":"1627","DOI":"10.1109\/TPAMI.2009.167","volume":"32","author":"PF Felzenszwalb","year":"2009","unstructured":"Felzenszwalb PF, Girshick RB, McAllester D, Ramanan D (2009) Object detection with discriminatively trained part-based models. IEEE Trans Pattern Anal Mach Intell 32(9):1627\u20131645","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1475_CR8","unstructured":"Shaoqing R, Kaiming H, Ross G, Jian S (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. Advances in neural information processing systems 28"},{"key":"1475_CR9","unstructured":"Hei L, Jia D (2018) Cornernet: Detecting objects as paired keypoints. 
In: Proceedings of the European conference on computer vision (ECCV), pages 734\u2013750"},{"key":"1475_CR10","unstructured":"Xingyi Z, Dequan W, Philipp K (2019) Objects as points. arXiv preprint arXiv:1904.07850"},{"key":"1475_CR11","doi-asserted-by":"crossref","unstructured":"Zhongdao W, Liang Z, Yixuan L, Yali L, Shengjin W (2020) Towards real-time multi-object tracking. In: Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part XI 16, pages 107\u2013122. Springer","DOI":"10.1007\/978-3-030-58621-8_7"},{"key":"1475_CR12","doi-asserted-by":"crossref","unstructured":"Liang C, Zhang Z, Zhou X, Li B, Zhu S, Weiming H (2022) Rethinking the competition between detection and reid in multiobject tracking. IEEE Trans Image Process 31:3182\u20133196","DOI":"10.1109\/TIP.2022.3165376"},{"key":"1475_CR13","unstructured":"Shyamgopal K, Ameya P, Vineet G (2020) Simple unsupervised multi-object tracking. arXiv preprint arXiv:2006.02609"},{"key":"1475_CR14","unstructured":"Tae-young C, Heansung L, Myeong Ah C, Suhwan C, Sangyoun L (2020) Multi-object tracking with self-supervised associating network. arXiv preprint arXiv:2010.13424"},{"key":"1475_CR15","unstructured":"Wei L, Yuanjun X, Shuo Y, Mingze X, Yongxin W, Wei X (2021) Semi-tcl: Semi-supervised track contrastive representation learning. arXiv preprint arXiv:2107.02396"},{"key":"1475_CR16","unstructured":"Zheng G, Songtao L, Feng W, Zeming L, Jian S (2021) Yolox: Exceeding yolo series in 2021. arXiv preprint arXiv:2107.08430"},{"key":"1475_CR17","unstructured":"Bing S, Andrew B, Xinyu L, Davide M, Joseph T (2021) Siammot: Siamese multi-object tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pages 12372\u201312382"},{"key":"1475_CR18","doi-asserted-by":"crossref","unstructured":"Xingyi Z, Vladlen K, Philipp K (2020) Tracking objects as points. 
In: Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part IV, pages 474\u2013490. Springer","DOI":"10.1007\/978-3-030-58548-8_28"},{"issue":"1","key":"1475_CR19","first-page":"104","volume":"43","author":"S Sun","year":"2019","unstructured":"Sun S, Akhtar N, Song H, Mian A, Shah M (2019) Deep affinity network for multiple object tracking. IEEE Trans Pattern Anal Mach Intell 43(1):104\u2013119","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1475_CR20","doi-asserted-by":"crossref","unstructured":"Jinlong P, Changan W, Fangbin W, Yang W, Yabiao W, Ying T, Chengjie W, Jilin L, Feiyue H, Yanwei F (2020) Chained-tracker: Chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking. In: Computer Vision\u2013ECCV 2020: 16th European Conference, Glasgow, UK, August 23\u201328, 2020, Proceedings, Part IV 16, pages 145\u2013161. Springer","DOI":"10.1007\/978-3-030-58548-8_9"},{"key":"1475_CR21","unstructured":"Ning W, Yibing S, Chao M, Wengang Z, Wei L, Houqiang L (2019) Unsupervised deep tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, pages 1308\u20131317"},{"key":"1475_CR22","unstructured":"Ting C, Simon K, Mohammad N, Geoffrey H (2020) A simple framework for contrastive learning of visual representations. In: International conference on machine learning, pages 1597\u20131607. 
PMLR"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01475-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-024-01475-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-024-01475-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,14]],"date-time":"2024-09-14T15:22:18Z","timestamp":1726327338000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-024-01475-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,7,3]]},"references-count":22,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2024,10]]}},"alternative-id":["1475"],"URL":"https:\/\/doi.org\/10.1007\/s40747-024-01475-3","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,7,3]]},"assertion":[{"value":"17 July 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"21 January 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 July 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}