{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,8]],"date-time":"2026-07-08T02:27:34Z","timestamp":1783477654964,"version":"3.55.0"},"reference-count":98,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2023,3,8]],"date-time":"2023-03-08T00:00:00Z","timestamp":1678233600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,3,8]],"date-time":"2023-03-08T00:00:00Z","timestamp":1678233600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"the National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61971078"],"award-info":[{"award-number":["61971078"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"the Chongqing University of Technology Graduate Innovation Foundation","award":["gzlcx20223221"],"award-info":[{"award-number":["gzlcx20223221"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2023,10]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In recent years, YOLOv5 networks have become a research focus in many fields because they are capable of outperforming state-of-the-art (SOTA) approaches in different computer vision tasks. Nevertheless, there is still room for improvement in YOLOv5 in terms of target tracking. We modified YOLOv5 according to the anchor-free paradigm to be on par with other state-of-the-art tracking paradigms and modified the network backbone to design an efficient module, thus proposing the RetinaYOLO detector, which, after combining state-of-the-art tracking algorithms, achieves state-of-the-art performance: we call it RetinaMOT. 
To the best of our knowledge, RetinaMOT is the first such approach. The anchor-free paradigm SOTA method for the YOLOv5 architecture and RetinaYOLO outperforms all lightweight YOLO architecture methods on the MS COCO dataset. In this paper, we show the details of the RetinaYOLO backbone, embedding Kalman filtering and the Hungarian algorithm into the network, with one framework used to accomplish two tasks. Our RetinaMOT shows that MOTA metrics reach 74.8, 74.1, and 66.8 on MOT Challenge MOT16, 17, and 20 test datasets, and our method is at the top of the list when compared with state-of-the-art methods.<\/jats:p>","DOI":"10.1007\/s40747-023-01009-3","type":"journal-article","created":{"date-parts":[[2023,4,6]],"date-time":"2023-04-06T00:03:46Z","timestamp":1680739426000},"page":"5115-5133","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["RetinaMOT: rethinking anchor-free YOLOv5 for online multiple object tracking"],"prefix":"10.1007","volume":"9","author":[{"given":"Jie","family":"Cao","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jianxun","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Bowen","family":"Li","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Linfeng","family":"Gao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jie","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2023,3,8]]},"reference":[{"key":"1009_CR1","unstructured":"Aharon N, Orfaig R, Bobrovsky BZ (2022) Bot-sort: robust associations multi-pedestrian tracking. 
arXiv preprint arXiv:2206.14651"},{"key":"1009_CR2","doi-asserted-by":"crossref","unstructured":"Ahmed M, Maher A, Bai X (2022) Aircraft tracking in aerial videos based on fused RetinaNet and low-score detection classification. IET Image Process","DOI":"10.1049\/ipr2.12665"},{"key":"1009_CR3","doi-asserted-by":"publisher","first-page":"175228","DOI":"10.1109\/ACCESS.2019.2957336","volume":"7","author":"MO Almasawa","year":"2019","unstructured":"Almasawa MO, Elrefaei LA, Moria K (2019) A survey on deep learning-based person re-identification systems. IEEE Access 7:175228\u2013175247","journal-title":"IEEE Access"},{"key":"1009_CR4","doi-asserted-by":"publisher","DOI":"10.1016\/j.aei.2021.101393","volume":"50","author":"J Azimjonov","year":"2021","unstructured":"Azimjonov J, \u00d6zmen A (2021) A real-time vehicle detection and a novel vehicle tracking systems for estimating and monitoring traffic flow on highways. Adv Eng Inform 50:101393","journal-title":"Adv Eng Inform"},{"key":"1009_CR5","unstructured":"Benjumea A, Teeti I, Cuzzolin F, Bradley A (2021) YOLO-z: improving small object detection in YOLOv5 for autonomous vehicles. arXiv preprint arXiv:2112.11798"},{"key":"1009_CR6","doi-asserted-by":"crossref","unstructured":"Bewley A, Ge Z, Ott L, Ramos F, Upcroft B (2016) Simple online and realtime tracking. In: 2016 IEEE international conference on image processing (ICIP). IEEE, pp 3464\u20133468","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"1009_CR7","unstructured":"Bochkovskiy A, Wang CY, Liao HYM (2020) Yolov4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934"},{"key":"1009_CR8","doi-asserted-by":"crossref","unstructured":"Chen L, Ai H, Zhuang Z, Shang C (2018) Real-time multiple people tracking with deeply learned candidate selection and person re-identification. In: 2018 IEEE international conference on multimedia and expo (ICME). 
IEEE, pp 1\u20136","DOI":"10.1109\/ICME.2018.8486597"},{"key":"1009_CR9","doi-asserted-by":"publisher","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","volume":"40","author":"LC Chen","year":"2017","unstructured":"Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2017) DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell 40:834\u2013848","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1009_CR10","doi-asserted-by":"crossref","unstructured":"Chen Q, Wang Y, Yang T, Zhang X, Cheng J, Sun J (2021) You only look one-level feature. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 13039\u201313048","DOI":"10.1109\/CVPR46437.2021.01284"},{"key":"1009_CR11","doi-asserted-by":"crossref","unstructured":"Dai J, Qi H, Xiong Y, Li Y, Zhang G, Hu H, Wei Y (2017) Deformable convolutional networks. In: Proceedings of the IEEE international conference on computer vision, pp 764\u2013773","DOI":"10.1109\/ICCV.2017.89"},{"key":"1009_CR12","unstructured":"Dendorfer P, Rezatofighi H, Milan A, Shi J, Cremers D, Reid I, Roth S, Schindler K, Leal-Taix\u00e9 L (2020) MOT20: a benchmark for multi object tracking in crowded scenes. arXiv preprint arXiv:2003.09003"},{"key":"1009_CR13","doi-asserted-by":"crossref","unstructured":"Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 248\u2013255","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"1009_CR14","doi-asserted-by":"crossref","unstructured":"Ding X, Guo Y, Ding G, Han J (2019a) ACNet: strengthening the kernel skeletons for powerful CNN via asymmetric convolution blocks. 
In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 1911\u20131920","DOI":"10.1109\/ICCV.2019.00200"},{"key":"1009_CR15","doi-asserted-by":"crossref","unstructured":"Ding X, Guo Y, Ding G, Han J (2019b) ACNet: strengthening the kernel skeletons for powerful CNN via asymmetric convolution blocks. In: The IEEE international conference on computer vision (ICCV)","DOI":"10.1109\/ICCV.2019.00200"},{"key":"1009_CR16","doi-asserted-by":"crossref","unstructured":"Doll\u00e1r P, Wojek C, Schiele B, Perona P (2009) Pedestrian detection: a benchmark. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 304\u2013311","DOI":"10.1109\/CVPR.2009.5206631"},{"key":"1009_CR17","doi-asserted-by":"crossref","unstructured":"Du Y, Song Y, Yang B, Zhao Y (2022) StrongSort: make DeepSort great again. arXiv preprint arXiv:2202.13514","DOI":"10.1109\/TMM.2023.3240881"},{"key":"1009_CR18","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1177\/14771535211034330","volume":"54","author":"D Durmus","year":"2022","unstructured":"Durmus D (2022) Correlated color temperature: use and limitations. Light Res Technol 54:363\u2013375","journal-title":"Light Res Technol"},{"key":"1009_CR19","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.neunet.2017.12.012","volume":"107","author":"S Elfwing","year":"2018","unstructured":"Elfwing S, Uchibe E, Doya K (2018) Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw 107:3\u201311","journal-title":"Neural Netw"},{"key":"1009_CR20","doi-asserted-by":"publisher","first-page":"2000797","DOI":"10.1002\/admt.202000797","volume":"6","author":"M Elsherif","year":"2021","unstructured":"Elsherif M, Salih AE, Yetisen AK, Butt H (2021) Contact lenses for color vision deficiency. 
Adv Mater Technol 6:2000797","journal-title":"Adv Mater Technol"},{"key":"1009_CR21","doi-asserted-by":"crossref","unstructured":"Ess A, Leibe B, Schindler K, Van Gool L (2008) A mobile vision system for robust multi-person tracking. In: 2008 IEEE conference on computer vision and pattern recognition. IEEE, pp 1\u20138","DOI":"10.1109\/CVPR.2008.4587581"},{"key":"1009_CR22","unstructured":"Galor A, Orfaig R, Bobrovsky BZ (2022) Strong-transcenter: improved multi-object tracking based on transformers with dense representations. arXiv preprint arXiv:2210.13570"},{"key":"1009_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-022-32761-8","volume":"13","author":"JA Gaynes","year":"2022","unstructured":"Gaynes JA, Budoff SA, Grybko MJ, Hunt JB, Poleg-Polsky A (2022) Classical center-surround receptive fields facilitate novel object detection in retinal bipolar cells. Nat Commun 13:1\u201317","journal-title":"Nat Commun"},{"key":"1009_CR24","unstructured":"Ge Z, Liu S, Wang F, Li Z, Sun J (2021) YOLOx: exceeding yolo series in 2021. arXiv preprint arXiv:2107.08430"},{"key":"1009_CR25","unstructured":"Girbau A, Gir\u00f3-i Nieto X, Rius I, Marqu\u00e9s F (2021) Multiple object tracking with mixture density networks for trajectory estimation. arXiv preprint arXiv:2106.10950"},{"key":"1009_CR26","doi-asserted-by":"publisher","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","volume":"37","author":"K He","year":"2015","unstructured":"He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37:1904\u20131916","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1009_CR27","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"1009_CR28","doi-asserted-by":"crossref","unstructured":"Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: CVPR","DOI":"10.1109\/CVPR46437.2021.01350"},{"key":"1009_CR29","doi-asserted-by":"crossref","unstructured":"Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7132\u20137141","DOI":"10.1109\/CVPR.2018.00745"},{"key":"1009_CR30","doi-asserted-by":"crossref","unstructured":"Huang S, Lu Z, Cheng R, He C (2021) FAPN: feature-aligned pyramid network for dense image prediction. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 864\u2013873","DOI":"10.1109\/ICCV48922.2021.00090"},{"key":"1009_CR31","unstructured":"Jocher G, Chaurasia A, Stoken A, Borovec J, NanoCode012 Kwon Y, TaoXie Fang J, imyhxy Michael K, Lorna V A, Montes D, Nadar J, Laughing tkianai yxNONG Skalski P, Wang Z, Hogan A, Fati C, Mammana L, AlexWang1900 Patel D, Yiwei D, You F, Hajek J, Diaconu L, Minh MT (2022) ultralytics\/yolov5: v6.1\u2014TensorRT, TensorFlow edge TPU and OpenVINO export and inference. https:\/\/doi.org\/10.5281\/zenodo.6222936"},{"key":"1009_CR32","doi-asserted-by":"crossref","unstructured":"Kawai F (2022) Certain retinal horizontal cells have a center-surround antagonistic organization. J Neurophysiol","DOI":"10.1152\/jn.00163.2022"},{"key":"1009_CR33","doi-asserted-by":"crossref","unstructured":"Koonce B (2021) Mobilenetv3. In: Convolutional neural networks with swift for Tensorflow. 
Springer, pp 125\u2013144","DOI":"10.1007\/978-1-4842-6168-2_11"},{"key":"1009_CR34","doi-asserted-by":"publisher","first-page":"667","DOI":"10.1038\/nn.2117","volume":"11","author":"PS Lagali","year":"2008","unstructured":"Lagali PS, Balya D, Awatramani GB, M\u00fcnch TA, Kim DS, Busskamp V, Cepko CL, Roska B (2008) Light-activated channels targeted to on bipolar cells restore visual function in retinal degeneration. Nat Neurosci 11:667\u2013675","journal-title":"Nat Neurosci"},{"key":"1009_CR35","unstructured":"Li C, Li L, Jiang H, Weng K, Geng Y, Li L, Ke Z, Li Q, Cheng M, Nie W et\u00a0al (2022) Yolov6: a single-stage object detection framework for industrial applications. arXiv preprint arXiv:2209.02976"},{"key":"1009_CR36","unstructured":"Li W, Xiong Y, Yang S, Xu M, Wang Y, Xia W (2021a) Semi-TCL: semi-supervised track contrastive representation learning. arXiv preprint arXiv:2107.02396"},{"key":"1009_CR37","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1016\/j.neucom.2020.12.018","volume":"433","author":"Y Li","year":"2021","unstructured":"Li Y, Yin G, Liu C, Yang X, Wang Z (2021) Triplet online instance matching loss for person re-identification. Neurocomputing 433:10\u201318","journal-title":"Neurocomputing"},{"key":"1009_CR38","doi-asserted-by":"crossref","unstructured":"Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Doll\u00e1r P, Zitnick CL (2014) Microsoft coco: common objects in context. In: European conference on computer vision. Springer, pp 740\u2013755","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"1009_CR39","doi-asserted-by":"publisher","first-page":"200","DOI":"10.1016\/j.patrec.2021.03.022","volume":"146","author":"X Lin","year":"2021","unstructured":"Lin X, Li CT, Sanchez V, Maple C (2021) On the detection-to-track association for online multi-object tracking. 
Pattern Recogn Lett 146:200\u2013207","journal-title":"Pattern Recogn Lett"},{"key":"1009_CR40","doi-asserted-by":"crossref","unstructured":"Lit Z, Cai S, Wang X, Shao H, Niu L, Xue N (2021) Multiple object tracking with GRU association and Kalman prediction. In: 2021 international joint conference on neural networks (IJCNN). IEEE, pp 1\u20138","DOI":"10.1109\/IJCNN52387.2021.9533828"},{"key":"1009_CR41","doi-asserted-by":"publisher","first-page":"3032","DOI":"10.1109\/TIP.2022.3152627","volume":"31","author":"C Liu","year":"2022","unstructured":"Liu C, Sun H, Katto J, Zeng X, Fan Y (2022) Qa-filter: a QP-adaptive convolutional neural network filter for video coding. IEEE Trans Image Process 31:3032\u20133045","journal-title":"IEEE Trans Image Process"},{"key":"1009_CR42","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1016\/j.neucom.2020.10.019","volume":"423","author":"H Liu","year":"2021","unstructured":"Liu H, Xiao Z, Fan B, Zeng H, Zhang Y, Jiang G (2021) PrGCN: probability prediction with graph convolutional network for person re-identification. Neurocomputing 423:57\u201370","journal-title":"Neurocomputing"},{"key":"1009_CR43","doi-asserted-by":"crossref","unstructured":"Liu J, Luo X, Huang Y (2022b) Facial expression recognition based on improved residual network. In: 2nd international conference on information technology and intelligent control (CITIC 2022). SPIE, pp 349\u2013355","DOI":"10.1117\/12.2653443"},{"key":"1009_CR44","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1016\/j.neucom.2022.01.008","volume":"483","author":"Q Liu","year":"2022","unstructured":"Liu Q, Chen D, Chu Q, Yuan L, Liu B, Zhang L, Yu N (2022) Online multi-object tracking with unsupervised re-identification learning and occlusion estimation. 
Neurocomputing 483:333\u2013347","journal-title":"Neurocomputing"},{"key":"1009_CR45","doi-asserted-by":"crossref","unstructured":"Liu S, Liu D, Srivastava G, Po\u0142ap D, Wo\u017aniak M (2020) Overview of correlation filter based algorithms in object tracking. Complex Intell Syst","DOI":"10.1007\/s40747-020-00161-4"},{"key":"1009_CR46","doi-asserted-by":"publisher","first-page":"740","DOI":"10.1126\/science.3283936","volume":"240","author":"M Livingstone","year":"1988","unstructured":"Livingstone M, Hubel D (1988) Segregation of form, color, movement, and depth: anatomy, physiology, and perception. Science 240:740\u2013749","journal-title":"Science"},{"key":"1009_CR47","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2020.103448","volume":"293","author":"W Luo","year":"2021","unstructured":"Luo W, Xing J, Milan A, Zhang X, Liu W, Kim TK (2021) Multiple object tracking: a literature review. Artif Intell 293:103448","journal-title":"Artif Intell"},{"key":"1009_CR48","doi-asserted-by":"crossref","unstructured":"Miao J, Wu Y, Liu P, Ding Y, Yang Y (2019) Pose-guided feature alignment for occluded person re-identification. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 542\u2013551","DOI":"10.1109\/ICCV.2019.00063"},{"key":"1009_CR49","unstructured":"Milan A, Leal-Taixe L, Reid I, Roth S, Schindler K (2016) Mot16: a benchmark for multi-object tracking"},{"key":"1009_CR50","doi-asserted-by":"crossref","unstructured":"Misra D, Nalamada T, Arasanipalai AU, Hou Q (2021) Rotate to attend: Convolutional triplet attention module. 
In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision (WACV), pp 3139\u20133148","DOI":"10.1109\/WACV48630.2021.00318"},{"key":"1009_CR51","doi-asserted-by":"publisher","first-page":"83085","DOI":"10.1109\/ACCESS.2022.3197157","volume":"10","author":"R Mostafa","year":"2022","unstructured":"Mostafa R, Baraka H, Bayoumi A (2022) LMOT: efficient light-weight detection and tracking in crowds. IEEE Access 10:83085\u201383095","journal-title":"IEEE Access"},{"key":"1009_CR52","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/LGRS.2022.3227055","volume":"19","author":"Y Nie","year":"2022","unstructured":"Nie Y, Bian C, Li L (2022) Object tracking in satellite videos based on Siamese network with multidimensional information-aware and temporal motion compensation. IEEE Geosci Remote Sens Lett 19:1\u20135","journal-title":"IEEE Geosci Remote Sens Lett"},{"key":"1009_CR53","doi-asserted-by":"crossref","unstructured":"Pang B, Li Y, Zhang Y, Li M, Lu C (2020) Tubetk: adopting tubes to track multi-object in a one-step training model. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 6308\u20136318","DOI":"10.1109\/CVPR42600.2020.00634"},{"key":"1009_CR54","doi-asserted-by":"crossref","unstructured":"Pang J, Qiu L, Li X, Chen H, Li Q, Darrell T, Yu F (2021) Quasi-dense similarity learning for multiple object tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 164\u2013173","DOI":"10.1109\/CVPR46437.2021.00023"},{"key":"1009_CR55","doi-asserted-by":"crossref","unstructured":"Peng J, Wang C, Wan F, Wu Y, Wang Y, Tai Y, Wang C, Li J, Huang F, Fu Y (2020a) Chained-tracker: chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking. In: European conference on computer vision. 
Springer, pp 145\u2013161","DOI":"10.1007\/978-3-030-58548-8_9"},{"key":"1009_CR56","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2020.107480","volume":"107","author":"J Peng","year":"2020","unstructured":"Peng J, Wang T, Lin W, Wang J, See J, Wen S, Ding E (2020) TPM: multiple object tracking with tracklet-plane matching. Pattern Recogn 107:107480","journal-title":"Pattern Recogn"},{"key":"1009_CR57","doi-asserted-by":"publisher","first-page":"363","DOI":"10.1152\/jn.1981.45.3.363","volume":"45","author":"M Piccolino","year":"1981","unstructured":"Piccolino M, Neyton J, Gerschenfeld H (1981) Center-surround antagonistic organization in small-field luminosity horizontal cells of turtle retina. J Neurophysiol 45:363\u2013375","journal-title":"J Neurophysiol"},{"key":"1009_CR58","doi-asserted-by":"publisher","first-page":"3233","DOI":"10.1016\/j.cub.2021.05.017","volume":"31","author":"Y Qiu","year":"2021","unstructured":"Qiu Y, Zhao Z, Klindt D, Kautzky M, Szatko KP, Schaeffel F, Rifai K, Franke K, Busse L, Euler T (2021) Natural environment statistics in the upper and lower visual field are reflected in mouse retinal specializations. Curr Biol 31:3233\u20133247","journal-title":"Curr Biol"},{"key":"1009_CR59","unstructured":"Quan H, Ablameyko S (2022) Multi-object tracking by using strong sort tracker and YOLOv7 network"},{"key":"1009_CR60","doi-asserted-by":"crossref","unstructured":"Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-CAM: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision, pp 618\u2013626","DOI":"10.1109\/ICCV.2017.74"},{"key":"1009_CR61","unstructured":"Shan C, Wei C, Deng B, Huang J, Hua XS, Cheng X, Liang K (2020) Tracklets predicting based adaptive graph tracking. 
arXiv preprint arXiv:2010.09015"},{"key":"1009_CR62","unstructured":"Shao S, Zhao Z, Li B, Xiao T, Yu G, Zhang X, Sun J (2018) CrowdHuman: a benchmark for detecting human in a crowd. arXiv preprint arXiv:1805.00123"},{"key":"1009_CR63","doi-asserted-by":"crossref","unstructured":"Shopov VK, Markova VD (2021) Application of Hungarian algorithm for assignment problem. In: 2021 international conference on information technologies (InfoTech). IEEE, pp 1\u20134","DOI":"10.1109\/InfoTech52438.2021.9548600"},{"key":"1009_CR64","doi-asserted-by":"crossref","unstructured":"Stergiou Alexandros PR, Grigorios K (2021) Refining activation downsampling with softpool. In: International conference on computer vision (ICCV). IEEE, pp 10357\u201310366","DOI":"10.1109\/ICCV48922.2021.01019"},{"key":"1009_CR65","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.neucom.2021.01.073","volume":"440","author":"J Sun","year":"2021","unstructured":"Sun J, Li Y, Chen H, Peng Y, Zhu X, Zhu J (2021) Visible-infrared cross-modality person re-identification based on whole-individual training. Neurocomputing 440:1\u201311","journal-title":"Neurocomputing"},{"key":"1009_CR66","doi-asserted-by":"publisher","first-page":"3718","DOI":"10.1109\/TSMC.2021.3069265","volume":"52","author":"C Tian","year":"2021","unstructured":"Tian C, Xu Y, Zuo W, Lin CW, Zhang D (2021) Asymmetric CNN for image superresolution. IEEE Trans Syst Man Cybern Syst 52:3718\u20133730","journal-title":"IEEE Trans Syst Man Cybern Syst"},{"key":"1009_CR67","doi-asserted-by":"publisher","first-page":"443","DOI":"10.1016\/j.neucom.2021.03.056","volume":"449","author":"Z Tu","year":"2021","unstructured":"Tu Z, Zhou A, Gan C, Jiang B, Hussain A, Luo B (2021) A novel domain activation mapping-guided network (DA-GNT) for visual tracking. 
Neurocomputing 449:443\u2013454","journal-title":"Neurocomputing"},{"key":"1009_CR68","doi-asserted-by":"crossref","unstructured":"Wan X, Zhou S, Wang J, Meng R (2021) Multiple object tracking by trajectory map regression with temporal priors embedding. In: Proceedings of the 29th ACM international conference on multimedia, pp 1377\u20131386","DOI":"10.1145\/3474085.3475304"},{"key":"1009_CR69","doi-asserted-by":"crossref","unstructured":"Wang Q, Wu B, Z P, L P, Z W, Hu Q (2020) ECA-Net: efficient channel attention for deep convolutional neural networks. In: The IEEE conference on computer vision and pattern recognition (CVPR)","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"1009_CR70","doi-asserted-by":"crossref","unstructured":"Wang CY, Bochkovskiy A, Liao HYM (2022) Yolov7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv preprint arXiv:2207.02696","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"1009_CR71","doi-asserted-by":"crossref","unstructured":"Wang J, Zhu C (2021) Semantically enhanced multi-scale feature pyramid fusion for pedestrian detection. In: 2021 13th international conference on machine learning and computing, pp 423\u2013431","DOI":"10.1145\/3457682.3457747"},{"key":"1009_CR72","unstructured":"Wang T, Chen K, Lin W, See J, Zhang Z, Xu Q, Jia X (2020a) Spatio-temporal point process for multiple object tracking. IEEE Trans Neural Netw Learn Syst"},{"key":"1009_CR73","doi-asserted-by":"crossref","unstructured":"Wang Y, Kitani K, Weng X (2021) Joint object detection and multi-object tracking with graph neural networks. In: 2021 IEEE international conference on robotics and automation (ICRA). IEEE, pp 13708\u201313715","DOI":"10.1109\/ICRA48506.2021.9561110"},{"key":"1009_CR74","doi-asserted-by":"crossref","unstructured":"Wang Z, Zheng L, Liu Y, Li Y, Wang S (2020b) Towards real-time multi-object tracking. In: European conference on computer vision. 
Springer, pp 107\u2013122","DOI":"10.1007\/978-3-030-58621-8_7"},{"key":"1009_CR75","doi-asserted-by":"crossref","unstructured":"Welch GF (2020) Kalman filter. Computer vision: a reference guide, pp 1\u20133","DOI":"10.1007\/978-3-030-03243-2_716-1"},{"key":"1009_CR76","doi-asserted-by":"crossref","unstructured":"Wojke N, Bewley A, Paulus D (2017) Simple online and realtime tracking with a deep association metric. In: 2017 IEEE international conference on image processing (ICIP). IEEE, pp 3645\u20133649","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"1009_CR77","doi-asserted-by":"crossref","unstructured":"Wu J, Cao J, Song L, Wang Y, Yang M, Yuan J (2021) Track to detect and segment: an online multi-object tracker. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 12352\u201312361","DOI":"10.1109\/CVPR46437.2021.01217"},{"key":"1009_CR78","doi-asserted-by":"publisher","first-page":"275","DOI":"10.1016\/j.cmpb.2019.07.009","volume":"178","author":"S Xiang","year":"2019","unstructured":"Xiang S, Liang Q, Hu Y, Tang P, Coppola G, Zhang D, Sun W (2019) AMC-Net: asymmetric and multi-scale convolutional neural network for multi-label HPA classification. Comput Methods Progr Biomed 178:275\u2013287","journal-title":"Comput Methods Progr Biomed"},{"key":"1009_CR79","doi-asserted-by":"crossref","unstructured":"Xiao T, Li S, Wang B, Lin L, Wang X (2017) Joint detection and identification feature learning for person search. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3415\u20133424","DOI":"10.1109\/CVPR.2017.360"},{"key":"1009_CR80","unstructured":"Xu S, Wang X, Lv W, Chang Q, Cui C, Deng K, Wang G, Dang Q, Wei S, Du Y et\u00a0al (2022) PP-YOLOE: an evolved version of YOLO. arXiv preprint arXiv:2203.16250"},{"key":"1009_CR81","unstructured":"Xu Y, Ban Y, Delorme G, Gan C, Rus D, Alameda-Pineda X (2021) Transcenter: transformers with dense queries for multiple-object tracking. 
arXiv preprint arXiv:2103.15145"},{"key":"1009_CR82","doi-asserted-by":"crossref","unstructured":"Yang G (2022) Asymptotic tracking with novel integral robust schemes for mismatched uncertain nonlinear systems. Int J Robust Nonlinear Control","DOI":"10.1002\/rnc.6499"},{"key":"1009_CR83","doi-asserted-by":"publisher","first-page":"2993","DOI":"10.1002\/rnc.5436","volume":"31","author":"G Yang","year":"2021","unstructured":"Yang G, Wang H, Chen J (2021) Disturbance compensation based asymptotic tracking control for nonlinear systems with mismatched modeling uncertainties. Int J Robust Nonlinear Control 31:2993\u20133010","journal-title":"Int J Robust Nonlinear Control"},{"key":"1009_CR84","doi-asserted-by":"crossref","unstructured":"Yang G, Yao J, Dong Z (2022) Neuroadaptive learning algorithm for constrained nonlinear systems with disturbance rejection. Int J Robust Nonlinear Control","DOI":"10.1002\/rnc.6143"},{"key":"1009_CR85","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.2104884118","volume":"118","author":"BK Young","year":"2021","unstructured":"Young BK, Ramakrishnan C, Ganjawala T, Wang P, Deisseroth K, Tian N (2021) An uncommon neuronal class conveys visual signals from rods and cones to retinal ganglion cells. Proc Natl Acad Sci 118:e2104884118","journal-title":"Proc Natl Acad Sci"},{"key":"1009_CR86","doi-asserted-by":"crossref","unstructured":"Yu E, Li Z, Han S, Wang H (2022) Relationtrack: relation-aware multiple object tracking with decoupled representation. IEEE Trans Multimed","DOI":"10.1109\/TMM.2022.3150169"},{"key":"1009_CR87","unstructured":"Yu G, Chang Q, Lv W, Xu C, Cui C, Ji W, Dang Q, Deng K, Wang G, Du Y et\u00a0al (2021) PP-PicoDet: a better real-time object detector on mobile devices. arXiv preprint arXiv:2111.00902"},{"key":"1009_CR88","doi-asserted-by":"crossref","unstructured":"Zhang S, Benenson R, Schiele B (2017) CityPersons: a diverse dataset for pedestrian detection. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3213\u20133221","DOI":"10.1109\/CVPR.2017.474"},{"key":"1009_CR89","doi-asserted-by":"crossref","unstructured":"Zhang Y, Sun P, Jiang Y, Yu D, Weng F, Yuan Z, Luo P, Liu W, Wang X (2022) ByteTrack: multi-object tracking by associating every detection box","DOI":"10.1007\/978-3-031-20047-2_1"},{"key":"1009_CR90","doi-asserted-by":"publisher","first-page":"3069","DOI":"10.1007\/s11263-021-01513-4","volume":"129","author":"Y Zhang","year":"2021","unstructured":"Zhang Y, Wang C, Wang X, Zeng W, Liu W (2021) FairMOT: on the fairness of detection and re-identification in multiple object tracking. Int J Comput Vis 129:3069\u20133087","journal-title":"Int J Comput Vis"},{"key":"1009_CR91","doi-asserted-by":"crossref","unstructured":"Zheng L, Tang M, Chen Y, Zhu G, Wang J, Lu H (2021) Improving multiple object tracking with single object tracking. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 2453\u20132462","DOI":"10.1109\/CVPR46437.2021.00248"},{"key":"1009_CR92","doi-asserted-by":"crossref","unstructured":"Zheng L, Zhang H, Sun S, Chandraker M, Yang Y, Tian Q (2017) Person re-identification in the wild. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1367\u20131376","DOI":"10.1109\/CVPR.2017.357"},{"key":"1009_CR93","doi-asserted-by":"publisher","first-page":"1011","DOI":"10.1109\/TCSVT.2018.2825679","volume":"29","author":"H Zhou","year":"2018","unstructured":"Zhou H, Ouyang W, Cheng J, Wang X, Li H (2018) Deep continuous conditional random fields with asymmetric inter-object constraints for online multi-object tracking. IEEE Trans Circuits Syst Video Technol 29:1011\u20131022","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"1009_CR94","doi-asserted-by":"crossref","unstructured":"Zhou X, Koltun V, Kr\u00e4henb\u00fchl P (2020) Tracking objects as points. 
In: European conference on computer vision. Springer, pp 474\u2013490","DOI":"10.1007\/978-3-030-58548-8_28"},{"key":"1009_CR95","unstructured":"Zhou X, Wang D, Kr\u00e4henb\u00fchl P (2019) Objects as points. In: arXiv preprint arXiv:1904.07850"},{"key":"1009_CR96","doi-asserted-by":"crossref","unstructured":"Zhu F, Yan H, Chen X, Li T, Zhang Z (2021) A multi-scale and multi-level feature aggregation network for crowd counting. Neurocomputing 423:46\u201356","DOI":"10.1016\/j.neucom.2020.09.059"},{"key":"1009_CR97","doi-asserted-by":"crossref","unstructured":"Zhuo L, Liu B, Zhang H, Zhang S, Li J (2021) MultiRPN-DIDnet: multiple RPNs and distance-IoU discriminative network for real-time UAV target tracking. Remote Sens 13:2772","DOI":"10.3390\/rs13142772"},{"key":"1009_CR98","doi-asserted-by":"crossref","unstructured":"Zou Z, Huang J, Luo P (2022) Compensation tracker: reprocessing lost object for multi-object tracking. In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision, pp 307\u2013317","DOI":"10.1109\/WACV51458.2022.00273"}],"container-title":["Complex & Intelligent 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01009-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-023-01009-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-023-01009-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,22]],"date-time":"2023-09-22T17:16:54Z","timestamp":1695403014000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-023-01009-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,8]]},"references-count":98,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,10]]}},"alternative-id":["1009"],"URL":"https:\/\/doi.org\/10.1007\/s40747-023-01009-3","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,8]]},"assertion":[{"value":"30 November 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 February 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 March 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}