{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,8]],"date-time":"2025-07-08T17:26:06Z","timestamp":1751995566841,"version":"3.41.0"},"reference-count":59,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2023,7,12]],"date-time":"2023-07-12T00:00:00Z","timestamp":1689120000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities of China","doi-asserted-by":"crossref","award":["2022JBMC009"],"award-info":[{"award-number":["2022JBMC009"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61972027"],"award-info":[{"award-number":["61972027"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Beijing Municipal Natural Science Foundation","award":["4212041"],"award-info":[{"award-number":["4212041"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,11,30]]},"abstract":"<jats:p>The popular tracking-by-detection paradigm of multi-object tracking (MOT) takes detections of each frame as the input and associates detections from one frame to another. Existing association methods based on the relative motion have attracted attention, because they restrain the effect of noisy detections and improve the performance of MOT. However, these methods depend only on the immediately previous frame, which may easily lead to inaccurate matches and even large accumulated errors. 
Furthermore, multiple objects involved in occlusions are not fully exploited in these existing methods, which leads to the aggravation of inaccurate matches. Motivated by these issues, we design the pivot to represent each object and propose a novel pivot association network (PANet) for the MOT task. Specifically, pivots are learned from spatial semantic and historical contextual clues, which alleviates the dependency on the immediately previous frame. Our online tracker PANet employs pivots and a lightweight associator to localize tracklets of objects, which can inhibit noise detections and improve the accuracy of tracklet prediction by learning the correlation responses between pivots and spatial search areas. Extensive experiments conducted on two-dimensional MOT15, MOT16, MOT17, and MOT20 demonstrate the effectiveness of the proposed method against numerous state-of-the-art MOT trackers.<\/jats:p>","DOI":"10.1145\/3595379","type":"journal-article","created":{"date-parts":[[2023,5,5]],"date-time":"2023-05-05T12:27:38Z","timestamp":1683289658000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["PANet: An End-to-end Network Based on Relative Motion for Online Multi-object Tracking"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1679-5904","authenticated-orcid":false,"given":"Rui","family":"Li","sequence":"first","affiliation":[{"name":"School of Computer and Information Technology, Beijing Jiaotong University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2592-2354","authenticated-orcid":false,"given":"Baopeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer and Information Technology, Beijing Jiaotong University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3583-1392","authenticated-orcid":false,"given":"Wei","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computer and 
Information Technology, Beijing Jiaotong University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1754-4878","authenticated-orcid":false,"given":"Zhu","family":"Teng","sequence":"additional","affiliation":[{"name":"School of Computer and Information Technology, Beijing Jiaotong University, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2290-1785","authenticated-orcid":false,"given":"Jianping","family":"Fan","sequence":"additional","affiliation":[{"name":"AI Lab, Lenovo Research, China"}]}],"member":"320","published-online":{"date-parts":[[2023,7,12]]},"reference":[{"key":"e_1_3_1_2_2","article-title":"Layer normalization","author":"Ba Jimmy Lei","year":"2016","unstructured":"Jimmy Lei Ba, Jamie Ryan Kiros, and Geoffrey E. Hinton. 2016. Layer normalization. arXiv:1607.06450. Retrieved from https:\/\/arxiv.org\/abs\/1607.06450.","journal-title":"arXiv:1607.06450"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvcir.2021.103279"},{"key":"e_1_3_1_4_2","first-page":"941","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Bergmann Philipp","year":"2019","unstructured":"Philipp Bergmann, Tim Meinhardt, and Laura Leal-Taixe. 2019. Tracking without bells and whistles. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 941\u2013951."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1155\/2008\/246309"},{"key":"e_1_3_1_6_2","first-page":"3464","volume-title":"Proceedings of the International Conference on Image Processing","author":"Bewley Alex","year":"2016","unstructured":"Alex Bewley, Zongyuan Ge, Lionel Ott, Fabio Ramos, and Ben Upcroft. 2016. Simple online and realtime tracking. In Proceedings of the International Conference on Image Processing. 
IEEE, 3464\u20133468."},{"key":"e_1_3_1_7_2","first-page":"6247","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Bras\u00f3 Guillem","year":"2020","unstructured":"Guillem Bras\u00f3 and Laura Leal-Taix\u00e9. 2020. Learning a neural solver for multiple object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 6247\u20136257."},{"key":"e_1_3_1_8_2","first-page":"213","volume-title":"European Conference on Computer Vision","author":"Carion Nicolas","year":"2020","unstructured":"Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. 2020. End-to-end object detection with transformers. In European Conference on Computer Vision. Springer, 213\u2013229."},{"key":"e_1_3_1_9_2","first-page":"6172","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Chu Peng","year":"2019","unstructured":"Peng Chu and Haibin Ling. 2019. Famnet: Joint learning of feature, affinity and multi-dimensional assignment for online multiple object tracking. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 6172\u20136181."},{"key":"e_1_3_1_10_2","first-page":"4836","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Chu Qi","year":"2017","unstructured":"Qi Chu, Wanli Ouyang, Hongsheng Li, Xiaogang Wang, Bin Liu, and Nenghai Yu. 2017. Online multi-object tracking using CNN-based single object tracker with spatial-temporal attention mechanism. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 4836\u20134845."},{"key":"e_1_3_1_11_2","first-page":"10672","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"34","author":"Chu Qi","year":"2020","unstructured":"Qi Chu, Wanli Ouyang, Bin Liu, Feng Zhu, and Nenghai Yu. 2020. 
Dasot: A unified framework integrating data association and single object tracking for online multi-object tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 10672\u201310679."},{"key":"e_1_3_1_12_2","first-page":"2443","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Dai Peng","year":"2021","unstructured":"Peng Dai, Renliang Weng, Wongun Choi, Changshui Zhang, Zhangping He, and Wei Ding. 2021. Learning a proposal classifier for multiple object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2443\u20132452."},{"key":"e_1_3_1_13_2","first-page":"65.1\u201365.11","volume-title":"Proceedings of the British Machine Vision Conference","author":"Danelljan Martin","year":"2014","unstructured":"Martin Danelljan, Gustav H\u00e4ger, Fahad Khan, and Michael Felsberg. 2014. Accurate scale estimation for robust visual tracking. In Proceedings of the British Machine Vision Conference. 65.1\u201365.11."},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.143"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2300479"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2009.167"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2022.3154286"},{"key":"e_1_3_1_19_2","first-page":"4364","volume-title":"International Conference on Machine Learning","author":"Hornakova Andrea","year":"2020","unstructured":"Andrea Hornakova, Roberto Henschel, Bodo Rosenhahn, and Paul Swoboda. 2020. Lifted disjoint paths with application in multiple object tracking. In International Conference on Machine Learning. 
PMLR, 4364\u20134375."},{"key":"e_1_3_1_20_2","first-page":"6330","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Hornakova Andrea","year":"2021","unstructured":"Andrea Hornakova, Timo Kaiser, Paul Swoboda, Michal Rolinek, Bodo Rosenhahn, and Roberto Henschel. 2021. Making higher order MOT scalable: An efficient approximate solver for lifted disjoint paths. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 6330\u20136340."},{"key":"e_1_3_1_21_2","first-page":"9553","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Kim Chanho","year":"2021","unstructured":"Chanho Kim, Li Fuxin, Mazen Alotaibi, and James M. Rehg. 2021. Discriminative appearance modeling with multi-track pooling for real-time multi-object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 9553\u20139562."},{"key":"e_1_3_1_22_2","article-title":"Motchallenge 2015: Towards a benchmark for multi-target tracking","author":"Leal-Taix\u00e9 Laura","year":"2015","unstructured":"Laura Leal-Taix\u00e9, Anton Milan, Ian Reid, Stefan Roth, and Konrad Schindler. 2015. Motchallenge 2015: Towards a benchmark for multi-target tracking. arXiv:1504.01942. Retrieved from https:\/\/arxiv.org\/abs\/1504.01942.","journal-title":"arXiv:1504.01942"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2022.108738"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.106"},{"key":"e_1_3_1_25_2","first-page":"530","volume-title":"Proceedings of the International Joint Conference on Artificial Intelligence","author":"Liu Qiankun","year":"2020","unstructured":"Qiankun Liu, Qi Chu, Bin Liu, and Nenghai Yu. 2020. GSM: Graph similarity model for multi-object tracking. In Proceedings of the International Joint Conference on Artificial Intelligence. 
530\u2013536."},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2022.3140929"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.02.084"},{"issue":"5","key":"e_1_3_1_29_2","first-page":"2386","article-title":"Deep object tracking with shrinkage loss","volume":"44","author":"Lu Xiankai","year":"2020","unstructured":"Xiankai Lu, Chao Ma, Jianbing Shen, Xiaokang Yang, Ian Reid, and Ming-Hsuan Yang. 2020. Deep object tracking with shrinkage loss. IEEE Trans. Pattern Anal. Mach. Intell. 44, 5 (2020), 2386\u20132401.","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"e_1_3_1_30_2","article-title":"MOT16: A benchmark for multi-object tracking","author":"Milan Anton","year":"2016","unstructured":"Anton Milan, Laura Leal-Taix\u00e9, Ian Reid, Stefan Roth, and Konrad Schindler. 2016. MOT16: A benchmark for multi-object tracking. arXiv:1603.00831. Retrieved from https:\/\/arxiv.org\/abs\/1603.00831.","journal-title":"arXiv:1603.00831"},{"key":"e_1_3_1_31_2","doi-asserted-by":"crossref","first-page":"107480","DOI":"10.1016\/j.patcog.2020.107480","article-title":"TPM: Multiple object tracking with tracklet-plane matching","volume":"107","author":"Peng Jinlong","year":"2020","unstructured":"Jinlong Peng, Tao Wang, Weiyao Lin, Jian Wang, John See, Shilei Wen, and Erui Ding. 2020. TPM: Multiple object tracking with tracklet-plane matching. Pattern Recogn. 107 (2020), 107480.","journal-title":"Pattern Recogn."},{"key":"e_1_3_1_32_2","first-page":"779","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Redmon Joseph","year":"2016","unstructured":"Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE International Conference on Computer Vision. 
IEEE, 779\u2013788."},{"key":"e_1_3_1_33_2","first-page":"91","volume-title":"Proceedings of the Conference and Workshop on Neural Information Processing Systems","author":"Ren Shaoqing","year":"2015","unstructured":"Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. In Proceedings of the Conference and Workshop on Neural Information Processing Systems. 91\u201399."},{"key":"e_1_3_1_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.3044219"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-48881-3_2"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00632"},{"key":"e_1_3_1_37_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-015-0816-y"},{"issue":"6","key":"e_1_3_1_38_2","doi-asserted-by":"crossref","first-page":"1990","DOI":"10.1109\/TCYB.2018.2803217","article-title":"Multiobject tracking by submodular optimization","volume":"49","author":"Shen Jianbing","year":"2018","unstructured":"Jianbing Shen, Zhiyuan Liang, Jianhong Liu, Hanqiu Sun, Ling Shao, and Dacheng Tao. 2018. Multiobject tracking by submodular optimization. IEEE Trans. Cybernet. 49, 6 (2018), 1990\u20132001.","journal-title":"IEEE Trans. Cybernet."},{"issue":"12","key":"e_1_3_1_39_2","doi-asserted-by":"crossref","first-page":"8896","DOI":"10.1109\/TPAMI.2021.3127492","article-title":"Distilled siamese networks for visual tracking","volume":"44","author":"Shen Jianbing","year":"2021","unstructured":"Jianbing Shen, Yuanpei Liu, Xingping Dong, Xiankai Lu, Fahad Shahbaz Khan, and Steven Hoi. 2021. Distilled siamese networks for visual tracking. IEEE Trans. Pattern Anal. Mach. Intell. 44, 12 (2021), 8896\u20138909.","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"e_1_3_1_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2017.2750082"},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2020.2988649"},{"key":"e_1_3_1_42_2","article-title":"Transtrack: Multiple object tracking with transformer","author":"Sun Peize","year":"2020","unstructured":"Peize Sun, Jinkun Cao, Yi Jiang, Rufeng Zhang, Enze Xie, Zehuan Yuan, Changhu Wang, and Ping Luo. 2020. Transtrack: Multiple object tracking with transformer. arXiv:2012.15460. Retrieved from https:\/\/arxiv.org\/abs\/2012.15460.","journal-title":"arXiv:2012.15460"},{"key":"e_1_3_1_43_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-013-0664-6"},{"key":"e_1_3_1_44_2","first-page":"5998","volume-title":"Advances in Neural Information Processing Systems","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems. 5998\u20136008."},{"key":"e_1_3_1_45_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2022.3140919"},{"key":"e_1_3_1_46_2","first-page":"482","volume-title":"Proceedings of the ACM International Conference on Multimedia","author":"Wang Gaoang","year":"2019","unstructured":"Gaoang Wang, Yizhou Wang, Haotian Zhang, Renshu Gu, and Jenq-Neng Hwang. 2019. Exploit the connectivity: Multi-object tracking with trackletnet. In Proceedings of the ACM International Conference on Multimedia. 482\u2013490."},{"key":"e_1_3_1_47_2","first-page":"3645","volume-title":"Proceedings of the International Conference on Image Processing","author":"Wojke Nicolai","year":"2017","unstructured":"Nicolai Wojke, Alex Bewley, and Dietrich Paulus. 2017. Simple online and realtime tracking with a deep association metric. In Proceedings of the International Conference on Image Processing. 
IEEE, 3645\u20133649."},{"key":"e_1_3_1_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2020.2975842"},{"key":"e_1_3_1_49_2","first-page":"3988","volume-title":"Proceedings of the IEEE International Conference on Computer Vision","author":"Xu Jiarui","year":"2019","unstructured":"Jiarui Xu, Yue Cao, Zheng Zhang, and Han Hu. 2019. Spatial-temporal relation networks for multi-object tracking. In Proceedings of the IEEE International Conference on Computer Vision. IEEE, 3988\u20133998."},{"key":"e_1_3_1_50_2","first-page":"6787","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Xu Yihong","year":"2020","unstructured":"Yihong Xu, Aljosa Osep, Yutong Ban, Radu Horaud, Laura Leal-Taix\u00e9, and Xavier Alameda-Pineda. 2020. How to train your deep multi-object tracker. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 6787\u20136796."},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.234"},{"key":"e_1_3_1_52_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-021-02457-5"},{"key":"e_1_3_1_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2021.3074239"},{"key":"e_1_3_1_54_2","first-page":"6768","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","author":"Yin Junbo","year":"2020","unstructured":"Junbo Yin, Wenguan Wang, Qinghao Meng, Ruigang Yang, and Jianbing Shen. 2020. A unified object motion and affinity model for online multi-object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 
IEEE, 6768\u20136777."},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2020.10.002"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.3037518"},{"issue":"3","key":"e_1_3_1_57_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3486678","article-title":"Learning adaptive spatial-temporal context-aware correlation filters for UAV tracking","volume":"18","author":"Yuan Di","year":"2022","unstructured":"Di Yuan, Xiaojun Chang, Zhihui Li, and Zhenyu He. 2022. Learning adaptive spatial-temporal context-aware correlation filters for UAV tracking. ACM Trans. Multimedia Comput. Communicat. Appl. 18, 3 (2022), 1\u201318.","journal-title":"ACM Trans. Multimedia Comput. Communicat. Appl."},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298684"},{"key":"e_1_3_1_59_2","first-page":"1468","volume-title":"Proceedings of the International Conference on Image Processing","author":"Zhang Mengdan","year":"2015","unstructured":"Mengdan Zhang, Junliang Xing, Jin Gao, and Weiming Hu. 2015. Robust visual tracking using joint scale-spatial correlation filters. In Proceedings of the International Conference on Image Processing. 
IEEE, 1468\u20131472."},{"key":"e_1_3_1_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.2993073"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3595379","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3595379","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:48:39Z","timestamp":1750286919000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3595379"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,7,12]]},"references-count":59,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2023,11,30]]}},"alternative-id":["10.1145\/3595379"],"URL":"https:\/\/doi.org\/10.1145\/3595379","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"type":"print","value":"1551-6857"},{"type":"electronic","value":"1551-6865"}],"subject":[],"published":{"date-parts":[[2023,7,12]]},"assertion":[{"value":"2022-07-13","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-04-25","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-07-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}