{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,10]],"date-time":"2026-02-10T16:14:59Z","timestamp":1770740099184,"version":"3.49.0"},"reference-count":58,"publisher":"Association for Computing Machinery (ACM)","issue":"1s","license":[{"start":{"date-parts":[[2023,2,3]],"date-time":"2023-02-03T00:00:00Z","timestamp":1675382400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"National Key Research and Development Program of China","award":["2018YFB1308604"],"award-info":[{"award-number":["2018YFB1308604"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["U21A20518, 61976086"],"award-info":[{"award-number":["U21A20518, 61976086"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Foshan Science and Technology Innovation Team","award":["FS0AA-KJ919-4402-0069"],"award-info":[{"award-number":["FS0AA-KJ919-4402-0069"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,2,28]]},"abstract":"<jats:p>In the last few years, enormous strides have been made for object detection and data association, which are vital subtasks for one-stage online multi-object tracking (MOT). However, the two separated submodules involved in the whole MOT pipeline are processed or optimized separately, resulting in a complex method design and requiring manual settings. In addition, few works integrate the two subtasks into a single end-to-end network to optimize the overall task. In this study, we propose an end-to-end MOT network called joint detection and association network (JDAN) that is trained and inferred in a single network. 
All layers in JDAN are differentiable, and can be optimized jointly to detect targets and output an association matrix for robust multi-object tracking. What\u2019s more, we generate suitable pseudo-labels to address the data inconsistency between object detection and association. The detection and association submodules could be optimized by the composite loss function that is derived from the detection results and the generated pseudo association labels, respectively. The proposed approach is evaluated on two MOT challenge datasets, and achieves promising performance compared with classic and latest methods.<\/jats:p>","DOI":"10.1145\/3533253","type":"journal-article","created":{"date-parts":[[2022,5,2]],"date-time":"2022-05-02T12:27:33Z","timestamp":1651494453000},"page":"1-17","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["JDAN: Joint Detection and Association Network for Real-Time Online Multi-Object Tracking"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4614-5817","authenticated-orcid":false,"given":"Haidong","family":"Wang","sequence":"first","affiliation":[{"name":"Hunan University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8150-0135","authenticated-orcid":false,"given":"Xuan","family":"He","sequence":"additional","affiliation":[{"name":"Hunan University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9720-5915","authenticated-orcid":false,"given":"Zhiyong","family":"Li","sequence":"additional","affiliation":[{"name":"Hunan University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9600-7789","authenticated-orcid":false,"given":"Jin","family":"Yuan","sequence":"additional","affiliation":[{"name":"Hunan University, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0585-9848","authenticated-orcid":false,"given":"Shutao","family":"Li","sequence":"additional","affiliation":[{"name":"Hunan University, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,2,3]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3441656"},{"key":"e_1_3_1_3_2","first-page":"1","article-title":"Confidence-based data association and discriminative deep appearance learning for robust online multi-object tracking","author":"Bae Seung-Hwan","year":"2018","unstructured":"Seung-Hwan Bae and Kuk-Jin Yoon. 2018. Confidence-based data association and discriminative deep appearance learning for robust online multi-object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence (2018), 1\u20131.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_1_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00103"},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1155\/2008\/246309"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2017.8296360"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00627"},{"key":"e_1_3_1_8_2","article-title":"CVPR19 tracking and detection challenge: How crowded can it get?","author":"Dendorfer Patrick","year":"2019","unstructured":"Patrick Dendorfer, Hamid Seyed Rezatofighi, Anton Milan, Javen Shi, Daniel Cremers, D. Ian Reid, Stefan Roth, Konrad Schindler, and Laura Leal-Taix\u00e9. 2019. CVPR19 tracking and detection challenge: How crowded can it get? 
In Proceedings of CoRR (2019).","journal-title":"Proceedings of CoRR"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2009.5206631"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2008.4587581"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV.2018.00057"},{"key":"e_1_3_1_12_2","first-page":"1","article-title":"Multi-level cooperative fusion of GM-PHD filters for online multiple human tracking","author":"Fu Zeyu","year":"2019","unstructured":"Zeyu Fu, Federico Angelini, Jonathon Chambers, and Mohsen Syed Naqvi. 2019. Multi-level cooperative fusion of GM-PHD filters for online multiple human tracking. IEEE Transactions on Multimedia (2019), 1\u20131.","journal-title":"IEEE Transactions on Multimedia"},{"key":"e_1_3_1_13_2","first-page":"3354","article-title":"Are we ready for autonomous driving? The KITTI vision benchmark suite","author":"Geiger Andreas","year":"2012","unstructured":"Andreas Geiger, Philip Lenz, and Raquel Urtasun. 2012. Are we ready for autonomous driving? The KITTI vision benchmark suite. Computer Vision and Pattern Recognition (2012), 3354\u20133361.","journal-title":"Computer Vision and Pattern Recognition"},{"key":"e_1_3_1_14_2","first-page":"1","article-title":"Mask R-CNN","volume":"99","author":"He Kaiming","year":"2017","unstructured":"Kaiming He, Georgia Gkioxari, Piotr Dollar, and Ross Girshick. 2017. Mask R-CNN. IEEE Transactions on Pattern Analysis & Machine Intelligence (PP), 99 (2017), 1\u20131.","journal-title":"IEEE Transactions on Pattern Analysis & Machine Intelligence (PP)"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_16_2","article-title":"Hadamard product for low-rank bilinear pooling","author":"Kim Jin-Hwa","year":"2017","unstructured":"Jin-Hwa Kim, Woon Kyoung On, Jeonghee Kim, JungWoo Ha, and Byoung-Tak Zhang. 2017. Hadamard product for low-rank bilinear pooling. 
In Proceedings of ICLR (2017).","journal-title":"Proceedings of ICLR"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.579"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1002\/nav.3800020109"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01264-9_45"},{"key":"e_1_3_1_20_2","article-title":"MOTChallenge 2015: Towards a benchmark for multi-target tracking","author":"Leal-Taix\u00e9 Laura","year":"2015","unstructured":"Laura Leal-Taix\u00e9, Anton Milan, D. Ian Reid, Stefan Roth, and Konrad Schindler. 2015. MOTChallenge 2015: Towards a benchmark for multi-target tracking. CoRR (2015).","journal-title":"CoRR"},{"key":"e_1_3_1_21_2","first-page":"9047","article-title":"Multi-view correlation tracking with adaptive memory-improved update model","author":"Li Guiji","year":"2019","unstructured":"Guiji Li, Manman Peng, Ke Nai, Zhiyong Li, and Keqin Li. 2019. Multi-view correlation tracking with adaptive memory-improved update model. Neural Computing and Applications (2019), 9047\u20139063.","journal-title":"Neural Computing and Applications"},{"key":"e_1_3_1_22_2","article-title":"Learning a dynamic feature fusion tracker for object tracking","author":"Li Zhiyong","year":"2020","unstructured":"Zhiyong Li, Ke Nai, Guiji Li, and Shilong Jiang. 2020. Learning a dynamic feature fusion tracker for object tracking. IEEE Transactions on Intelligent Transportation Systems (2020).","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_1_23_2","first-page":"560","article-title":"Robust object tracking via weight-based local sparse appearance model","author":"Li Zhiyong","year":"2016","unstructured":"Zhiyong Li, Dongming Wang, Ke Nai, Tong Shen, and Ying Zeng. 2016. Robust object tracking via weight-based local sparse appearance model. 
ICNC-FSKD (2016), 560\u2013565.","journal-title":"ICNC-FSKD"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.324"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394171.3416304"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2020.103875"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-018-6467-6"},{"key":"e_1_3_1_30_2","unstructured":"Anton Milan Laura Leal-Taixe Ian Reid Stefan Roth and Konrad Schindler. 2016. MOT16: A benchmark for multi-object tracking. (2016)."},{"issue":"1","key":"e_1_3_1_31_2","first-page":"32","article-title":"Algorithms for the assignment and transportation problems","volume":"5","year":"2006","unstructured":"Munkres and James. 2006. Algorithms for the assignment and transportation problems. Journal of the Society for Industrial and Applied Mathematics 5, 1 (2006), 32\u201338.","journal-title":"Journal of the Society for Industrial and Applied Mathematics"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2848465"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.05.032"},{"key":"e_1_3_1_34_2","first-page":"8024","volume-title":"Advances in Neural Information Processing Systems (NIPS\u201919)","author":"Paszke Adam","year":"2019","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, and Luca Antiga. 2019. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems (NIPS\u201919). 
8024\u20138035."},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2781233"},{"key":"e_1_3_1_36_2","article-title":"Yolov3: An incremental improvement","author":"Redmon Joseph","year":"2018","unstructured":"Joseph Redmon and Ali Farhadi. 2018. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767 (2018).","journal-title":"arXiv preprint arXiv:1804.02767"},{"key":"e_1_3_1_37_2","first-page":"91","volume-title":"Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2015)","author":"Ren Shaoqing","year":"2015","unstructured":"Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2015). 91\u201399."},{"key":"e_1_3_1_38_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-48881-3_2"},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-48881-3_7"},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2019.2929520"},{"key":"e_1_3_1_41_2","first-page":"7942","article-title":"MOTS: Multi-object tracking and segmentation","author":"Voigtlaender Paul","year":"2019","unstructured":"Paul Voigtlaender, Michael Krause, Aljosa Osep, Jonathon Luiten, Balachandar Gnana Berin Sekar, Andreas Geiger, and Bastian Leibe. 2019. MOTS: Multi-object tracking and segmentation. In Proceedings of (CVPR\u20192019), 7942\u20137951.","journal-title":"("},{"key":"e_1_3_1_42_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2020.103983"},{"key":"e_1_3_1_43_2","doi-asserted-by":"crossref","unstructured":"Zhongdao Wang Liang Zheng Yixuan Liu Yali Li and Shengjin Wang. 2020. Towards real-time multi-object tracking. (2020) 107\u2013122.","DOI":"10.1007\/978-3-030-58621-8_7"},{"key":"e_1_3_1_44_2","unstructured":"Greg Welch Gary Bishop et\u00a0al. 1995. 
An introduction to the Kalman filter. (1995)."},{"key":"e_1_3_1_45_2","article-title":"Simple online and realtime tracking with a deep association metric","author":"Wojke Nicolai","year":"2017","unstructured":"Nicolai Wojke, Alex Bewley, and Dietrich Paulus. 2017. Simple online and realtime tracking with a deep association metric. In Proceedings of the (ICIP\u201917).","journal-title":"("},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.534"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.360"},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3424341"},{"key":"e_1_3_1_49_2","first-page":"6786","article-title":"How to train your deep multi-object tracker","author":"Xu Yihong","year":"2020","unstructured":"Yihong Xu, Aljosa Osep, Yutong Ban, Radu Horaud, Laura Leal-Taix\u00e9, and Xavier Alameda-Pineda. 2020. How to train your deep multi-object tracker. In Proceedings of the (CVPR\u201920), 6786\u20136795.","journal-title":"("},{"key":"e_1_3_1_50_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-48881-3_3"},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00255"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.474"},{"key":"e_1_3_1_53_2","first-page":"685","volume-title":"Advances in Neural Information Processing Systems","author":"Zhang Sixin","year":"2015","unstructured":"Sixin Zhang, Anna E. Choromanska, and Yann LeCun. 2015. Deep learning with elastic averaging SGD. In Advances in Neural Information Processing Systems. 685\u2013693."},{"key":"e_1_3_1_54_2","first-page":"1","article-title":"Fairmot: On the fairness of detection and re-identification in multiple object tracking","author":"Zhang Yifu","year":"2021","unstructured":"Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, and Wenyu Liu. 2021. Fairmot: On the fairness of detection and re-identification in multiple object tracking. 
International Journal of Computer Vision (2021), 1\u201319.","journal-title":"International Journal of Computer Vision"},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.357"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58548-8_28"},{"key":"e_1_3_1_57_2","article-title":"Objects as points","author":"Zhou Xingyi","year":"2019","unstructured":"Xingyi Zhou, Dequan Wang, and Philipp Kr\u00e4henb\u00fchl. 2019. Objects as points. arXiv preprint arXiv:1904.07850 (2019).","journal-title":"arXiv preprint arXiv:1904.07850"},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR.2018.8545450"},{"key":"e_1_3_1_59_2","first-page":"379","article-title":"Online multi-object tracking with dual matching attention networks","author":"Zhu Ji","year":"2018","unstructured":"Ji Zhu, Hua Yang, Nian Liu, Minyoung Kim, Wenjun Zhang, and Ming-Hsuan Yang. 2018. Online multi-object tracking with dual matching attention networks. In Proceedings of the ECCV (2018), 379\u2013396.","journal-title":"ECCV"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and 
Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3533253","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3533253","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:00:38Z","timestamp":1750186838000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3533253"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,3]]},"references-count":58,"journal-issue":{"issue":"1s","published-print":{"date-parts":[[2023,2,28]]}},"alternative-id":["10.1145\/3533253"],"URL":"https:\/\/doi.org\/10.1145\/3533253","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,3]]},"assertion":[{"value":"2021-08-14","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-04-22","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}