{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,24]],"date-time":"2026-04-24T15:04:51Z","timestamp":1777043091495,"version":"3.51.4"},"reference-count":300,"publisher":"Association for Computing Machinery (ACM)","issue":"3","funder":[{"name":"Major Research Plan of National Natural Science Foundation of China","award":["92370203"],"award-info":[{"award-number":["92370203"]}]},{"name":"National Key Research and Development Program of China","award":["2023YFE0106800"],"award-info":[{"award-number":["2023YFE0106800"]}]},{"DOI":"10.13039\/501100018541","name":"Science Fund for Distinguished Young Scholars of Jiangsu Province","doi-asserted-by":"crossref","award":["BK20231531"],"award-info":[{"award-number":["BK20231531"]}],"id":[{"id":"10.13039\/501100018541","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2026,2,28]]},"abstract":"<jats:p>Traffic Surveillance Systems (TSS) have become increasingly crucial in modern intelligent transportation systems, with vision technologies playing a central role for scene perception and understanding. While existing surveys typically focus on isolated aspects of TSS, a comprehensive analytical framework bridging low-level and high-level perception tasks, particularly considering emerging technologies, remains lacking. This article presents a systematic review of vision technologies in TSS, examining both low-level perception tasks (object detection, classification, and tracking) and high-level perception tasks (parameter estimation, anomaly detection, and behavior understanding). Specifically, we first provide a detailed methodological categorization and comprehensive performance evaluation for each task. 
Our investigation reveals five fundamental limitations in current TSS: perceptual data degradation in complex scenarios, data-driven learning constraints, semantic understanding gaps, sensing coverage limitations, and computational resource demands. To address these challenges, we systematically analyze five categories of current approaches and potential trends: advanced perception enhancement, efficient learning paradigms, knowledge-enhanced understanding, cooperative sensing frameworks, and efficient computing frameworks, critically assessing their real-world applicability. Furthermore, we evaluate the transformative potential of foundation models in TSS, which exhibit remarkable zero-shot learning abilities, strong generalization, and sophisticated reasoning capabilities across diverse tasks. This review provides a unified analytical framework bridging low-level and high-level perception tasks, systematically analyzes current limitations and solutions, and presents a structured roadmap for integrating emerging technologies, particularly foundation models, to enhance TSS capabilities.<\/jats:p>","DOI":"10.1145\/3760525","type":"journal-article","created":{"date-parts":[[2025,8,13]],"date-time":"2025-08-13T11:15:47Z","timestamp":1755083747000},"page":"1-47","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Vision Technologies with Applications in Traffic Surveillance Systems: A Holistic Survey"],"prefix":"10.1145","volume":"58","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3225-0576","authenticated-orcid":false,"given":"Wei","family":"Zhou","sequence":"first","affiliation":[{"name":"School of Automation, Nanjing University of Science and Technology","place":["Nanjing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8152-7642","authenticated-orcid":false,"given":"Li","family":"Yang","sequence":"additional","affiliation":[{"name":"School of Automation, Nanjing University of Science and 
Technology","place":["Nanjing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6336-048X","authenticated-orcid":false,"given":"Lei","family":"Zhao","sequence":"additional","affiliation":[{"name":"Southeast University","place":["Nanjing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-6278-9915","authenticated-orcid":false,"given":"Runyu","family":"Zhang","sequence":"additional","affiliation":[{"name":"Southeast University","place":["Nanjing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-2322-7841","authenticated-orcid":false,"given":"Yifan","family":"Cui","sequence":"additional","affiliation":[{"name":"Southeast University","place":["Nanjing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-0210-6589","authenticated-orcid":false,"given":"Hongpu","family":"Huang","sequence":"additional","affiliation":[{"name":"Southeast University","place":["Nanjing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-8406-7996","authenticated-orcid":false,"given":"Kun","family":"Qie","sequence":"additional","affiliation":[{"name":"Beijing University of Civil Engineering and Architecture","place":["Beijing, China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4573-9047","authenticated-orcid":false,"given":"Chen","family":"Wang","sequence":"additional","affiliation":[{"name":"Southeast University","place":["Nanjing, China"]}]}],"member":"320","published-online":{"date-parts":[[2025,9,9]]},"reference":[{"key":"e_1_3_2_2_2","article-title":"Pedestrian crossing intention prediction from surveillance videos for over-the-horizon safety warning","author":"Zhou Wei","year":"2024","unstructured":"Wei Zhou, Yuqing Liu, Lei Zhao, Sixuan Xu, and Chen Wang. 2024. Pedestrian crossing intention prediction from surveillance videos for over-the-horizon safety warning. 
IEEE Transactions on Intelligent Transportation Systems 25, 2 (2024), 1394\u20131407.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_2_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_2_4_2","article-title":"An image is worth 16x16 words: Transformers for image recognition at scale","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy. 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020). Retrieved from https:\/\/arxiv.org\/abs\/2010.11929","journal-title":"arXiv preprint"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"e_1_3_2_6_2","first-page":"8748","volume-title":"Proceedings of the International Conference on Machine Learning","author":"Radford Alec","year":"2021","unstructured":"Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et\u00a0al. 2021. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning. 
PMLR, 8748\u20138763."},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1109\/tits.2016.2530146"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3417989"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/3434398"},{"key":"e_1_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2020.2997084"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2020.3004066"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2022.3147770"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3258683"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2022.04.087"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2022.3195509"},{"key":"e_1_3_2_16_2","article-title":"Toward effective traffic sign detection via two-stage fusion neural networks","author":"Li Zhishan","year":"2024","unstructured":"Zhishan Li, Hongxu Chen, Battista Biggio, Yifan He, Haoran Cai, Fabio Roli, and Lei Xie. 2024. Toward effective traffic sign detection via two-stage fusion neural networks. IEEE Transactions on Intelligent Transportation Systems 25, 8 (2024), 8283\u20138294.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_2_17_2","article-title":"Monitoring-based traffic participant detection in urban mixed traffic: A novel dataset and a tailored detector","author":"Zhou Wei","year":"2023","unstructured":"Wei Zhou, Chen Wang, Jingxin Xia, Zhendong Qian, and Yuan Wu. 2023. Monitoring-based traffic participant detection in urban mixed traffic: A novel dataset and a tailored detector. 
IEEE Transactions on Intelligent Transportation Systems 25, 1 (2023), 189\u2013202.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.121209"},{"key":"e_1_3_2_19_2","article-title":"MDFD2-DETR: A real-time complex road object detection model based on multi-domain feature decomposition and de-redundancy","author":"Liu Jia-wei","year":"2024","unstructured":"Jia-wei Liu, Da Yang, Ting-wei Feng, and Jun-jie Fu. 2024. MDFD2-DETR: A real-time complex road object detection model based on multi-domain feature decomposition and de-redundancy. IEEE Transactions on Intelligent Vehicles (2024). (Early Access).","journal-title":"IEEE Transactions on Intelligent Vehicles"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2016.2577031"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00644"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2023.3238524"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.91"},{"key":"e_1_3_2_24_2","unstructured":"Joseph Redmon. 2018. YOLOv3: An incremental improvement. arXiv:1804.02767 (2018). Retrieved from https:\/\/arxiv.org\/abs\/1804.02767"},{"key":"e_1_3_2_25_2","unstructured":"Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, and Guiguang Ding. 2024. YOLOv10: Real-time end-to-end object detection. Advances in Neural Information Processing Systems 37 (2024), 107984\u2013108011. Retrieved from https:\/\/arxiv.org\/abs\/2405.14458"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"e_1_3_2_27_2","doi-asserted-by":"crossref","unstructured":"Z. Tian, C. Shen, H. Chen, and T. He. 2019. FCOS: Fully convolutional one-stage object detection. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 9627\u20139636. 
Retrieved from https:\/\/arxiv.org\/abs\/1904.01355","DOI":"10.1109\/ICCV.2019.00972"},{"key":"e_1_3_2_28_2","unstructured":"Xingyi Zhou, Dequan Wang, and Philipp Kr\u00e4henb\u00fchl. 2019. Objects as points. arXiv:1904.07850 (2019). Retrieved from https:\/\/arxiv.org\/abs\/1904.07850"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.01605"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3328195"},{"key":"e_1_3_2_32_2","article-title":"Data-efficient object detection on construction sites using reweighting mechanism and cross-batch contrastive learning","author":"Zhou Wei","year":"2025","unstructured":"Wei Zhou, Lei Zhao, Hongpu Huang, and Chen Wang. 2025. Data-efficient object detection on construction sites using reweighting mechanism and cross-batch contrastive learning. IEEE Transactions on Industrial Informatics 21, 8 (2025), 6028\u20136037.","journal-title":"IEEE Transactions on Industrial Informatics"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.5220\/0010783600003124"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.5244\/C.28.42"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00138-020-01117-x"},{"key":"e_1_3_2_36_2","first-page":"1995","volume-title":"ICASSP 2022-Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","author":"Chen Yiqiang","year":"2022","unstructured":"Yiqiang Chen, Feng Liu, and Ke Pei. 2022. Monocular vehicle 3D bounding box estimation using homography and geometry in traffic scene. In ICASSP 2022-Proceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 
IEEE, 1995\u20131999."},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3061343"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.02065"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00938"},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00469"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00330"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.02070"},{"key":"e_1_3_2_43_2","article-title":"MonoUNI: A unified vehicle and infrastructure-side monocular 3D object detection network with sufficient depth clues","author":"Jinrang Jia","year":"2024","unstructured":"Jia Jinrang, Zhenjia Li, and Yifeng Shi. 2024. MonoUNI: A unified vehicle and infrastructure-side monocular 3D object detection network with sufficient depth clues. Advances in Neural Information Processing Systems 36 (2023), 11703\u201311715.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_44_2","first-page":"1238","volume-title":"Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems (ITSC)","author":"Ou Yuanchang","year":"2014","unstructured":"Yuanchang Ou, Huicheng Zheng, Shuyue Chen, and Jiangtao Chen. 2014. Vehicle logo recognition based on a weighted spatial pyramid framework. In Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems (ITSC). 
IEEE, 1238\u20131244."},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2014.12.018"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2018.07.045"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2014.2387069"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1049\/iet-its.2018.5127"},{"key":"e_1_3_2_49_2","article-title":"A new method for vehicle logo recognition based on swin transformer","author":"Li Yang","year":"2024","unstructured":"Yang Li, Doudou Zhang, and Jianli Xiao. 2024. A new method for vehicle logo recognition based on swin transformer. arXiv preprint arXiv:2401.15458 (2024). Retrieved from https:\/\/arxiv.org\/abs\/2401.15458","journal-title":"arXiv preprint"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2020.2981737"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2017.8296310"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2017.8019491"},{"key":"e_1_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICME.2018.8486589"},{"key":"e_1_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2023.110526"},{"key":"e_1_3_2_55_2","article-title":"Multi-Branch enhanced discriminative network for vehicle re-identification","author":"Lian Jiawei","year":"2023","unstructured":"Jiawei Lian, Da-Han Wang, Yun Wu, and Shunzhi Zhu. 2023. Multi-Branch enhanced discriminative network for vehicle re-identification. 
IEEE Transactions on Intelligent Transportation Systems 25, 2 (2023), 1263\u20131274.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2023.3238642"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00837"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00623"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2021.08.126"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3257873"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2010.5539960"},{"key":"e_1_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2345390"},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.156"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-48881-3_56"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00935"},{"key":"e_1_3_2_66_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00676"},{"key":"e_1_3_2_67_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2023.03.083"},{"key":"e_1_3_2_68_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"e_1_3_2_69_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"e_1_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-20047-2_1"},{"key":"e_1_3_2_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2023.3240881"},{"key":"e_1_3_2_72_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v38i6.28386"},{"key":"e_1_3_2_73_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58621-8_7"},{"key":"e_1_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-021-01513-4"},{"key":"e_1_3_2_75_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00864"},{"key":"e_1_3_2_76_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV5
1070.2023.00908"},{"key":"e_1_3_2_77_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2020.102907"},{"key":"e_1_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.1186\/s12544-019-0390-4"},{"key":"e_1_3_2_79_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2018.2848705"},{"key":"e_1_3_2_80_2","article-title":"Baai-vanjee roadside dataset: Towards the connected automated vehicle highway technologies in challenging environments of China","author":"Yongqiang Deng","year":"2021","unstructured":"Deng Yongqiang, Wang Dengjiang, Cao Gang, Ma Bing, Guan Xijia, Wang Yajun, Liu Jianchao, Fang Yanming, and Li Juanjuan. 2021. Baai-vanjee roadside dataset: Towards the connected automated vehicle highway technologies in challenging environments of China. arXiv preprint arXiv:2105.14370 (2021). Retrieved from https:\/\/arxiv.org\/abs\/2105.14370","journal-title":"arXiv preprint"},{"key":"e_1_3_2_81_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA46639.2022.9811699"},{"key":"e_1_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1109\/IV51971.2022.9827401"},{"key":"e_1_3_2_83_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.02067"},{"key":"e_1_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2013.77"},{"key":"e_1_3_2_85_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7299023"},{"key":"e_1_3_2_86_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3062113"},{"key":"e_1_3_2_87_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.238"},{"key":"e_1_3_2_88_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46475-6_53"},{"key":"e_1_3_2_89_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00900"},{"key":"e_1_3_2_90_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2021.3099253"},{"key":"e_1_3_2_91_2","volume-title":"Proceedings of the European Conference on Computer Vision","author":"Benchmark UT","year":"2016","unstructured":"UT Benchmark. 2016. A benchmark and simulator for UAV tracking. 
In Proceedings of the European Conference on Computer Vision."},{"key":"e_1_3_2_92_2","article-title":"Vision meets drones: A challenge","author":"Zhu Pengfei","year":"2018","unstructured":"Pengfei Zhu, Longyin Wen, Xiao Bian, Haibin Ling, and Qinghua Hu. 2018. Vision meets drones: A challenge. arXiv preprint arXiv:1804.07437 (2018). Retrieved from https:\/\/arxiv.org\/abs\/1804.07437","journal-title":"arXiv preprint"},{"key":"e_1_3_2_93_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11263-020-01393-0"},{"key":"e_1_3_2_94_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2024.3439557"},{"key":"e_1_3_2_95_2","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913491297"},{"key":"e_1_3_2_96_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01164"},{"key":"e_1_3_2_97_2","article-title":"A comprehensive survey on deep-learning-based vehicle re-identification: Models, data sets and challenges","author":"Amiri Ali","year":"2024","unstructured":"Ali Amiri, Aydin Kaya, and Ali Seydi Keceli. 2024. A comprehensive survey on deep-learning-based vehicle re-identification: Models, data sets and challenges. arXiv preprint arXiv:2401.10643 (2024). Retrieved from https:\/\/arxiv.org\/abs\/2401.10643","journal-title":"arXiv preprint"},{"key":"e_1_3_2_98_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.artint.2020.103448"},{"key":"e_1_3_2_99_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-981-19-7580-6_2"},{"key":"e_1_3_2_100_2","doi-asserted-by":"publisher","DOI":"10.1109\/3477.584964"},{"key":"e_1_3_2_101_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2024.103996"},{"key":"e_1_3_2_102_2","article-title":"Deformable DETR: Deformable transformers for end-to-end object detection","author":"Zhu Xizhou","year":"2020","unstructured":"Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, and Jifeng Dai. 2020. Deformable DETR: Deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159 (2020). 
Retrieved from https:\/\/arxiv.org\/abs\/2010.04159","journal-title":"arXiv preprint"},{"key":"e_1_3_2_103_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"e_1_3_2_104_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2013.2294646"},{"key":"e_1_3_2_105_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2018.2799228"},{"key":"e_1_3_2_106_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-020-09171-3"},{"key":"e_1_3_2_107_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3131530"},{"key":"e_1_3_2_108_2","article-title":"AMLNet: Attention multibranch loss CNN models for fine-grained vehicle recognition","author":"Lu Hongchun","year":"2023","unstructured":"Hongchun Lu, Min Han, Chaoqing Wang, and Junlong Cheng. 2023. AMLNet: Attention multibranch loss CNN models for fine-grained vehicle recognition. IEEE Transactions on Vehicular Technology 73, 1 (2023), 375\u2013384.","journal-title":"IEEE Transactions on Vehicular Technology"},{"key":"e_1_3_2_109_2","first-page":"1228","volume-title":"Proceedings of the 2016 19th International Conference on Information Fusion (FUSION)","author":"Chen Ruilong","year":"2016","unstructured":"Ruilong Chen, Matthew Hawes, Lyudmila Mihaylova, Jingjing Xiao, and Wei Liu. 2016. Vehicle logo recognition by spatial-SIFT combined with logistic regression. In Proceedings of the 2016 19th International Conference on Information Fusion (FUSION). IEEE, 1228\u20131235."},{"key":"e_1_3_2_110_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIM.2022.3178482"},{"key":"e_1_3_2_111_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-60639-8_34"},{"key":"e_1_3_2_112_2","article-title":"Siamdmu: Siamese dual mask update network for visual object tracking","author":"Liu Jing","year":"2024","unstructured":"Jing Liu, Han Wang, Chao Ma, Yuting Su, and Xiaokang Yang. 2024. Siamdmu: Siamese dual mask update network for visual object tracking. 
IEEE Transactions on Emerging Topics in Computational Intelligence 8, 2 (2024), 1656\u20131669.","journal-title":"IEEE Transactions on Emerging Topics in Computational Intelligence"},{"key":"e_1_3_2_113_2","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC.2008.4732673"},{"key":"e_1_3_2_114_2","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC.2012.6338660"},{"key":"e_1_3_2_115_2","first-page":"123","volume-title":"Proceedings of the 2012 Federated Conference on Computer Science and Information Systems (FedCSIS)","author":"Orghidan Radu","year":"2012","unstructured":"Radu Orghidan, Joaquim Salvi, Mihaela Gordan, and Bogdan Orza. 2012. Camera calibration using two or three vanishing points. In Proceedings of the 2012 Federated Conference on Computer Science and Information Systems (FedCSIS). IEEE, 123\u2013130."},{"key":"e_1_3_2_116_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2012.2210670"},{"key":"e_1_3_2_117_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-86383-8_50"},{"key":"e_1_3_2_118_2","doi-asserted-by":"publisher","DOI":"10.1061\/JTEPBS.TEENG-7412"},{"key":"e_1_3_2_119_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00138-024-01576-6"},{"key":"e_1_3_2_120_2","doi-asserted-by":"publisher","DOI":"10.1145\/3199667"},{"key":"e_1_3_2_121_2","doi-asserted-by":"publisher","DOI":"10.1109\/DICTA51227.2020.9363417"},{"key":"e_1_3_2_122_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00138-020-01125-x"},{"key":"e_1_3_2_123_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIE.2009.2038395"},{"key":"e_1_3_2_124_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICSIPA.2009.5478629"},{"key":"e_1_3_2_125_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00138-021-01255-w"},{"key":"e_1_3_2_126_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jksuci.2023.101657"},{"key":"e_1_3_2_127_2","first-page":"161","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops","author":"Huang 
Tingting","year":"2018","unstructured":"Tingting Huang. 2018. Traffic speed estimation from surveillance video data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 161\u2013165."},{"key":"e_1_3_2_128_2","doi-asserted-by":"publisher","DOI":"10.5194\/isprs-annals-V-2-2020-419-2020"},{"key":"e_1_3_2_129_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3236512"},{"key":"e_1_3_2_130_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3318077"},{"key":"e_1_3_2_131_2","doi-asserted-by":"publisher","DOI":"10.1049\/itr2.12079"},{"key":"e_1_3_2_132_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2019.2914254"},{"key":"e_1_3_2_133_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW50498.2020.00315"},{"key":"e_1_3_2_134_2","doi-asserted-by":"publisher","DOI":"10.3390\/jimaging9070131"},{"key":"e_1_3_2_135_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2022.104597"},{"key":"e_1_3_2_136_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46478-7_38"},{"key":"e_1_3_2_137_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.396"},{"key":"e_1_3_2_138_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2021.3124675"},{"key":"e_1_3_2_139_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3296571"},{"key":"e_1_3_2_140_2","doi-asserted-by":"publisher","DOI":"10.1145\/3645101"},{"key":"e_1_3_2_141_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2017.2670780"},{"key":"e_1_3_2_142_2","doi-asserted-by":"publisher","DOI":"10.1109\/IPTA.2019.8936124"},{"issue":"1","key":"e_1_3_2_143_2","first-page":"8848874","article-title":"A new video-based crash detection method: Balancing speed and accuracy using a feature fusion deep learning framework","volume":"2020","author":"Lu Zhenbo","year":"2020","unstructured":"Zhenbo Lu, Wei Zhou, Shixiang Zhang, and Chen Wang. 2020. 
A new video-based crash detection method: Balancing speed and accuracy using a feature fusion deep learning framework. Journal of Advanced Transportation 2020, 1 (2020), 8848874.","journal-title":"Journal of Advanced Transportation"},{"key":"e_1_3_2_144_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00133"},{"key":"e_1_3_2_145_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01379"},{"key":"e_1_3_2_146_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3297589"},{"key":"e_1_3_2_147_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2024.3355812"},{"key":"e_1_3_2_148_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00678"},{"key":"e_1_3_2_149_2","article-title":"Motion-aware feature for improved video anomaly detection","author":"Zhu Yi","year":"2019","unstructured":"Yi Zhu and Shawn Newsam. 2019. Motion-aware feature for improved video anomaly detection. arXiv preprint arXiv:1907.10211 (2019). Retrieved from https:\/\/arxiv.org\/abs\/1907.10211","journal-title":"arXiv 
preprint"},{"key":"e_1_3_2_150_2","doi-asserted-by":"publisher","DOI":"10.1109\/LSP.2020.3025688"},{"key":"e_1_3_2_151_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2023.109765"},{"key":"e_1_3_2_152_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-024-09611-3"},{"key":"e_1_3_2_153_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN55064.2022.9892231"},{"key":"e_1_3_2_154_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00179"},{"key":"e_1_3_2_155_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.86"},{"key":"e_1_3_2_156_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11760-020-01740-1"},{"key":"e_1_3_2_157_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3108504"},{"key":"e_1_3_2_158_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-022-07335-w"},{"key":"e_1_3_2_159_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00684"},{"key":"e_1_3_2_160_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV48922.2021.01333"},{"key":"e_1_3_2_161_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2021.3083152"},{"key":"e_1_3_2_162_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2024.3376399"},{"key":"e_1_3_2_163_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVT.2014.2307958"},{"key":"e_1_3_2_164_2","doi-asserted-by":"publisher","DOI":"10.1109\/HNICEM.2015.7393241"},{"key":"e_1_3_2_165_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2011.2179537"},{"key":"e_1_3_2_166_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3095053"},{"key":"e_1_3_2_167_2","doi-asserted-by":"publisher","DOI":"10.3390\/s23249800"},{"key":"e_1_3_2_168_2","doi-asserted-by":"publisher","DOI":"10.1111\/mice.12819"},{"key":"e_1_3_2_169_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIE.2017.2782236"},{"key":"e_1_3_2_170_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2021.3094491"},{"key":"e_1_3_2_171_2","first-page":"0","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV) 
Workshops","author":"Nikhil Nishant","year":"2018","unstructured":"Nishant Nikhil and Brendan Tran Morris. 2018. Convolutional neural network for trajectory prediction. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops. 0\u20130."},{"key":"e_1_3_2_172_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2022.3172015"},{"key":"e_1_3_2_173_2","doi-asserted-by":"publisher","DOI":"10.1109\/MITS.2021.3049404"},{"key":"e_1_3_2_174_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-20074-8_30"},{"key":"e_1_3_2_175_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIM.2022.3163136"},{"key":"e_1_3_2_176_2","article-title":"Emsin: Enhanced multi-stream interaction network for vehicle trajectory prediction","author":"Ren Yilong","year":"2024","unstructured":"Yilong Ren, Zhengxing Lan, Lingshan Liu, and Haiyang Yu. 2024. Emsin: Enhanced multi-stream interaction network for vehicle trajectory prediction. IEEE Transactions on Fuzzy Systems 33, 1 (2024), 54\u201368.","journal-title":"IEEE Transactions on Fuzzy Systems"},{"key":"e_1_3_2_177_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW63382.2024.00574"},{"key":"e_1_3_2_178_2","article-title":"A temporal multi-gate mixture-of-experts approach for vehicle trajectory and driving intention prediction","author":"Yuan Renteng","year":"2023","unstructured":"Renteng Yuan, Mohamed Abdel-Aty, Qiaojun Xiang, Zijin Wang, and Xin Gu. 2023. A temporal multi-gate mixture-of-experts approach for vehicle trajectory and driving intention prediction. 
IEEE Transactions on Intelligent Vehicles 9, 1 (2023), 1204\u20131216.","journal-title":"IEEE Transactions on Intelligent Vehicles"},{"key":"e_1_3_2_179_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2019.2923319"},{"key":"e_1_3_2_180_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIV.2018.2873901"},{"key":"e_1_3_2_181_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00638"},{"key":"e_1_3_2_182_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3074829"},{"key":"e_1_3_2_183_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2019.2946642"},{"key":"e_1_3_2_184_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3135251"},{"key":"e_1_3_2_185_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW.2017.33"},{"key":"e_1_3_2_186_2","doi-asserted-by":"publisher","DOI":"10.1109\/FG52635.2021.9666989"},{"key":"e_1_3_2_187_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACV48630.2021.00130"},{"key":"e_1_3_2_188_2","doi-asserted-by":"crossref","unstructured":"Mohsen Azarmi, Mahdi Rezaei, He Wang, and Sebastien Glaser. 2025. PIP-Net: Pedestrian intention prediction in the wild. IEEE Transactions on Intelligent Transportation Systems 26, 7 (2025), 9824\u20139837. Retrieved from https:\/\/arxiv.org\/abs\/2402.12810","DOI":"10.1109\/TITS.2025.3570794"},{"key":"e_1_3_2_189_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3281393"},{"key":"e_1_3_2_190_2","article-title":"DPCIAN: A novel dual-channel pedestrian crossing intention anticipation network","author":"Yang Biao","year":"2023","unstructured":"Biao Yang, Zhiwen Wei, Hongyu Hu, Rui Wang, Changchun Yang, and Rongrong Ni. 2023. DPCIAN: A novel dual-channel pedestrian crossing intention anticipation network. 
IEEE Transactions on Intelligent Transportation Systems 25, 6 (2023), 6023\u20136034.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_2_191_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2014.283"},{"issue":"1","key":"e_1_3_2_192_2","first-page":"8741534","article-title":"Simulation of pedestrian crossing behaviors at unmarked roadways based on social force model","volume":"2017","author":"Ningbo Cao","year":"2017","unstructured":"Cao Ningbo, Wei Wei, Qu Zhaowei, Zhao Liying, and Bai Qiaowen. 2017. Simulation of pedestrian crossing behaviors at unmarked roadways based on social force model. Discrete Dynamics in Nature and Society 2017, 1 (2017), 8741534.","journal-title":"Discrete Dynamics in Nature and Society"},{"key":"e_1_3_2_193_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.110"},{"key":"e_1_3_2_194_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00240"},{"key":"e_1_3_2_195_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00887"},{"key":"e_1_3_2_196_2","doi-asserted-by":"publisher","DOI":"10.1109\/IV55152.2023.10186643"},{"key":"e_1_3_2_197_2","first-page":"8994","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition","author":"Shi Liushuai","year":"2021","unstructured":"Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, and Gang Hua. 2021. SGCN: Sparse graph convolution network for pedestrian trajectory prediction. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 8994\u20139003."},{"key":"e_1_3_2_198_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10489-022-03524-1"},{"key":"e_1_3_2_199_2","unstructured":"Pei Lv Wentong Wang Yunxin Wang Yuzhen Zhang Mingliang Xu and Changsheng Xu. 2021. SSAGCN: Social soft attention graph convolution network for pedestrian trajectory prediction. arXiv preprint arXiv:2112.02459 (2021). 
Retrieved from https:\/\/arxiv.org\/abs\/2112.02459"},{"key":"e_1_3_2_200_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2018.00015"},{"key":"e_1_3_2_201_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2014.6854869"},{"key":"e_1_3_2_202_2","doi-asserted-by":"publisher","DOI":"10.5555\/2317018.2317068"},{"key":"e_1_3_2_203_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW50498.2020.00318"},{"key":"e_1_3_2_204_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-19390-8_48"},{"key":"e_1_3_2_205_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.446"},{"issue":"1","key":"e_1_3_2_206_2","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/TPAMI.2013.111","article-title":"Anomaly detection and localization in crowded scenes","volume":"36","author":"Li Weixin","year":"2013","unstructured":"Weixin Li, Vijay Mahadevan, and Nuno Vasconcelos. 2013. Anomaly detection and localization in crowded scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence 36, 1 (2013), 18\u201332.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_207_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.338"},{"key":"e_1_3_2_208_2","doi-asserted-by":"publisher","DOI":"10.1109\/AVSS.2018.8639160"},{"key":"e_1_3_2_209_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSTARS.2023.3285905"},{"key":"e_1_3_2_210_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15549-9_33"},{"key":"e_1_3_2_211_2","first-page":"655","volume-title":"Computer Graphics Forum","year":"2007","unstructured":"Alon Lerner, Yiorgos Chrysanthou, and Dani Lischinski. 2007. Crowds by example. In Computer Graphics Forum, Vol. 26. 
Wiley Online Library, 655\u2013664."},{"key":"e_1_3_2_212_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00636"},{"key":"e_1_3_2_213_2","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC.2018.8569552"},{"key":"e_1_3_2_214_2","doi-asserted-by":"publisher","DOI":"10.1177\/03611981231185768"},{"key":"e_1_3_2_215_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2018.00141"},{"key":"e_1_3_2_216_2","first-page":"409","volume-title":"Proceedings of the Conference on Robot Learning","author":"Houston John","year":"2021","unstructured":"John Houston, Guido Zuidhof, Luca Bergamini, Yawei Ye, Long Chen, Ashesh Jain, Sammy Omari, Vladimir Iglovikov, and Peter Ondruska. 2021. One thousand and one hours: Self-driving motion prediction dataset. In Proceedings of the Conference on Robot Learning. PMLR, 409\u2013418."},{"key":"e_1_3_2_217_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00531"},{"key":"e_1_3_2_218_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2987072"},{"key":"e_1_3_2_219_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3069362"},{"key":"e_1_3_2_220_2","first-page":"012144","volume-title":"Journal of Physics: Conference Series","volume":"890","year":"2017","unstructured":"Budi Setiyono, Dwi Ratna Sulistyaningrum, Soetrisno, Farah Fajriyah, and Danang Wahyu Wicaksono. 2017. Vehicle speed detection based on Gaussian mixture model using sequential of images. Journal of Physics: Conference Series 890, 1 (2017), 012144."},{"key":"e_1_3_2_221_2","article-title":"DACR-AMTP: Adaptive multi-modal vehicle trajectory prediction for dynamic drivable areas based on collision risk","author":"Cong Peichao","year":"2023","unstructured":"Peichao Cong, Yixuan Xiao, Xianquan Wan, Murong Deng, Jiaxing Li, and Xin Zhang. 2023. DACR-AMTP: Adaptive multi-modal vehicle trajectory prediction for dynamic drivable areas based on collision risk. 
IEEE Transactions on Intelligent Vehicles 9, 9 (2023), 5339\u20135360.","journal-title":"IEEE Transactions on Intelligent Vehicles"},{"key":"e_1_3_2_222_2","article-title":"Pedestrian crossing intention prediction based on cross-modal transformer and uncertainty-aware multi-task learning for autonomous driving","author":"Chen Xiaobo","year":"2024","unstructured":"Xiaobo Chen, Shilin Zhang, Jun Li, and Jian Yang. 2024. Pedestrian crossing intention prediction based on cross-modal transformer and uncertainty-aware multi-task learning for autonomous driving. IEEE Transactions on Intelligent Transportation Systems 25, 9 (2024), 12538\u201312549.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_2_223_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01443"},{"key":"e_1_3_2_224_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW59228.2023.00044"},{"key":"e_1_3_2_225_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2021.3051462"},{"key":"e_1_3_2_226_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIV.2020.3039456"},{"key":"e_1_3_2_227_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.01406"},{"key":"e_1_3_2_228_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.01440"},{"key":"e_1_3_2_229_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.trc.2020.102946"},{"key":"e_1_3_2_230_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2022.3183612"},{"key":"e_1_3_2_231_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00571"},{"key":"e_1_3_2_232_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIV.2024.3510563"},{"key":"e_1_3_2_233_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00352"},{"key":"e_1_3_2_234_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00889"},{"key":"e_1_3_2_235_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2023.3290135"},{"key":"e_1_3_2_236_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00564"},{"key":"e
_1_3_2_237_2","doi-asserted-by":"publisher","DOI":"10.1109\/TIP.2020.2993403"},{"key":"e_1_3_2_238_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11042-023-16337-2"},{"key":"e_1_3_2_239_2","unstructured":"Oriol Vinyals Charles Blundell Timothy Lillicrap Koray Kavukcuoglu and Daan Wierstra. 2016. Matching networks for one shot learning. Advances in Neural Information Processing Systems 29 (2016) 1\u20139."},{"issue":"11","key":"e_1_3_2_240_2","first-page":"12832","article-title":"Meta-detr: Image-level few-shot detection with inter-class correlation exploitation","volume":"45","author":"Zhang Gongjie","year":"2022","unstructured":"Gongjie Zhang, Zhipeng Luo, Kaiwen Cui, Shijian Lu, and Eric P. Xing. 2022. Meta-detr: Image-level few-shot detection with inter-class correlation exploitation. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 11 (2022), 12832\u201312843.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_241_2","article-title":"Unsupervised representation learning by predicting image rotations","author":"Gidaris Spyros","year":"2018","unstructured":"Spyros Gidaris, Praveer Singh, and Nikos Komodakis. 2018. Unsupervised representation learning by predicting image rotations. arXiv preprint arXiv:1803.07728 (2018). 
Retrieved from https:\/\/arxiv.org\/abs\/1803.07728","journal-title":"arXiv preprint"},{"key":"e_1_3_2_242_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01553"},{"key":"e_1_3_2_243_2","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2021.3122801"},{"key":"e_1_3_2_244_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.cviu.2023.103656"},{"key":"e_1_3_2_245_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49357.2023.10094809"},{"key":"e_1_3_2_246_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00219"},{"key":"e_1_3_2_247_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2023.3253919"},{"key":"e_1_3_2_248_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2022.3166687"},{"issue":"2","key":"e_1_3_2_249_2","first-page":"1926","article-title":"Detection of road accidents using synthetically generated multi-perspective accident videos","volume":"24","author":"Vijay Thakare Kamalakar","year":"2022","unstructured":"Thakare Kamalakar Vijay, Debi Prosad Dogra, Heeseung Choi, Gipyo Nam, and Ig-Jae Kim. 2022. Detection of road accidents using synthetically generated multi-perspective accident videos. IEEE Transactions on Intelligent Transportation Systems 24, 2 (2022), 1926\u20131935.","journal-title":"IEEE Transactions on Intelligent Transportation Systems"},{"key":"e_1_3_2_250_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2024.3398252"},{"key":"e_1_3_2_251_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-88361-4_7"},{"key":"e_1_3_2_252_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2022.3143171"},{"issue":"5","key":"e_1_3_2_253_2","first-page":"2313","article-title":"Auto-encoding and distilling scene graphs for image captioning","volume":"44","author":"Yang Xu","year":"2020","unstructured":"Xu Yang, Hanwang Zhang, and Jianfei Cai. 2020. Auto-encoding and distilling scene graphs for image captioning. 
IEEE Transactions on Pattern Analysis and Machine Intelligence 44, 5 (2020), 2313\u20132327.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_2_254_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00033"},{"key":"e_1_3_2_255_2","first-page":"1","article-title":"Scene adaptation in adverse conditions: A multi-sensor fusion framework for roadside traffic perception","author":"Li Kong","year":"2024","unstructured":"Kong Li, Zhe Dai, Chen Zuo, Xuan Wang, Hua Cui, Huansheng Song, and Mengying Cui. 2024. Scene adaptation in adverse conditions: A multi-sensor fusion framework for roadside traffic perception. Journal of Intelligent Transportation Systems (2024), 1\u201321.","journal-title":"Journal of Intelligent Transportation Systems"},{"key":"e_1_3_2_256_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.aap.2021.105973"},{"issue":"2","key":"e_1_3_2_257_2","first-page":"1","article-title":"TelecomTM: A fine-grained and ubiquitous traffic monitoring system using pre-existing telecommunication fiber-optic cables as sensors","volume":"7","author":"Liu Jingxiao","year":"2023","unstructured":"Jingxiao Liu, Siyuan Yuan, Yiwen Dong, Biondo Biondi, and Hae Young Noh. 2023. TelecomTM: A fine-grained and ubiquitous traffic monitoring system using pre-existing telecommunication fiber-optic cables as sensors. 
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7, 2 (2023), 1\u201324.","journal-title":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies"},{"key":"e_1_3_2_258_2","doi-asserted-by":"publisher","DOI":"10.1109\/LRA.2024.3368234"},{"key":"e_1_3_2_259_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.trb.2023.102869"},{"key":"e_1_3_2_260_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compenvurbsys.2019.101364"},{"key":"e_1_3_2_261_2","doi-asserted-by":"publisher","DOI":"10.1049\/iet-its.2017.0116"},{"key":"e_1_3_2_262_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2021.3095408"},{"key":"e_1_3_2_263_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2020.01.043"},{"key":"e_1_3_2_264_2","doi-asserted-by":"publisher","DOI":"10.1109\/JSEN.2024.3397534"},{"key":"e_1_3_2_265_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2020.3028424"},{"key":"e_1_3_2_266_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2024.3435937"},{"key":"e_1_3_2_267_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA48891.2023.10160367"},{"key":"e_1_3_2_268_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA46639.2022.9812038"},{"key":"e_1_3_2_269_2","article-title":"A survey of collaborative perception in intelligent vehicles at intersections","author":"Gao Xin","year":"2024","unstructured":"Xin Gao, Xinyu Zhang, Yiguo Lu, Yuning Huang, Lei Yang, Yijin Xiong, and Peng Liu. 2024. A survey of collaborative perception in intelligent vehicles at intersections. IEEE Transactions on Intelligent Vehicles (2024). Early Access.","journal-title":"IEEE Transactions on Intelligent Vehicles"},{"key":"e_1_3_2_270_2","article-title":"Rethinking DABNet: Light-weight network for real-time semantic segmentation of road scenes","author":"Mazhar Saquib","year":"2023","unstructured":"Saquib Mazhar, Nadeem Atif, MK Bhuyan, and Shaik Rafi Ahamed. 2023. 
Rethinking DABNet: Light-weight network for real-time semantic segmentation of road scenes. IEEE Transactions on Artificial Intelligence 5, 6 (2023), 3098\u20133108.","journal-title":"IEEE Transactions on Artificial Intelligence"},{"key":"e_1_3_2_271_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCVW60793.2023.00162"},{"key":"e_1_3_2_272_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2021.3100554"},{"key":"e_1_3_2_273_2","article-title":"Mobilevit: Light-weight, general-purpose, and mobile-friendly vision transformer","author":"Mehta Sachin","year":"2021","unstructured":"Sachin Mehta and Mohammad Rastegari. 2021. Mobilevit: Light-weight, general-purpose, and mobile-friendly vision transformer. arXiv preprint arXiv:2110.02178 (2021). Retrieved from https:\/\/arxiv.org\/abs\/2110.02178","journal-title":"arXiv preprint"},{"key":"e_1_3_2_274_2","first-page":"12934","article-title":"Efficientformer: Vision transformers at mobilenet speed","volume":"35","author":"Li Yanyu","year":"2022","unstructured":"Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, and Jian Ren. 2022. Efficientformer: Vision transformers at mobilenet speed. 
Advances in Neural Information Processing Systems 35 (2022), 12934\u201312949.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_275_2","doi-asserted-by":"publisher","DOI":"10.1504\/ijbic.2023.135467"},{"key":"e_1_3_2_276_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2019.2936939"},{"key":"e_1_3_2_277_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.01152"},{"key":"e_1_3_2_278_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v34i04.5963"},{"key":"e_1_3_2_279_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v37i9.26244"},{"key":"e_1_3_2_280_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2023.3267469"},{"key":"e_1_3_2_281_2","doi-asserted-by":"publisher","DOI":"10.1109\/COMST.2024.3393230"},{"key":"e_1_3_2_282_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICDCS.2019.00058"},{"key":"e_1_3_2_283_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jnca.2024.103886"},{"key":"e_1_3_2_284_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2023.09.008"},{"key":"e_1_3_2_285_2","doi-asserted-by":"publisher","DOI":"10.3390\/electronics13142883"},{"key":"e_1_3_2_286_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2024.3475371"},{"key":"e_1_3_2_287_2","article-title":"TSCLIP: Robust CLIP fine-tuning for worldwide cross-regional traffic sign recognition","author":"Zhao Guoyang","year":"2024","unstructured":"Guoyang Zhao, Fulong Ma, Weiqing Qi, Chenguang Zhang, Yuxuan Liu, Ming Liu, and Jun Ma. 2024. TSCLIP: Robust CLIP fine-tuning for worldwide cross-regional traffic sign recognition. arXiv preprint arXiv:2409.15077 (2024). Retrieved from https:\/\/arxiv.org\/abs\/2409.15077","journal-title":"arXiv preprint"},{"key":"e_1_3_2_288_2","doi-asserted-by":"crossref","unstructured":"Aaron Lohner Francesco Compagno Jonathan Francis and Alessandro Oltramari. 2024. Enhancing vision-language models with scene graphs for traffic accident understanding. 
IEEE International Automated Vehicle Validation Conference (IAVVC). IEEE 1\u20137. Retrieved from https:\/\/arxiv.org\/abs\/2407.05910","DOI":"10.1109\/IAVVC63304.2024.10786395"},{"key":"e_1_3_2_289_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2024.3361862"},{"key":"e_1_3_2_290_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-41731-3_9"},{"key":"e_1_3_2_291_2","doi-asserted-by":"publisher","DOI":"10.3390\/vehicles6030074"},{"key":"e_1_3_2_292_2","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2024.3409453"},{"key":"e_1_3_2_293_2","doi-asserted-by":"publisher","DOI":"10.1145\/3580305.3599246"},{"key":"e_1_3_2_294_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.02061"},{"key":"e_1_3_2_295_2","unstructured":"Lening Wang Yilong Ren Han Jiang Pinlong Cai Daocheng Fu Tianqi Wang Zhiyong Cui Haiyang Yu Xuesong Wang Hanchu Zhou et\u00a0al. 2023. AccidentGPT: Accident analysis and prevention from v2x environmental perception with multi-modal large model. arXiv:2312.13156 (2023). Retrieved from https:\/\/arxiv.org\/abs\/2312.13156"},{"key":"e_1_3_2_296_2","article-title":"OccSora: 4D occupancy generation models as world simulators for autonomous driving","author":"Wang Lening","year":"2024","unstructured":"Lening Wang, Wenzhao Zheng, Yilong Ren, Han Jiang, Zhiyong Cui, Haiyang Yu, and Jiwen Lu. 2024. OccSora: 4D occupancy generation models as world simulators for autonomous driving. arXiv preprint arXiv:2405.20337 (2024). Retrieved from https:\/\/arxiv.org\/abs\/2405.20337","journal-title":"arXiv preprint"},{"key":"e_1_3_2_297_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2024.3362821"},{"key":"e_1_3_2_298_2","article-title":"Towards label-free scene understanding by vision foundation models","author":"Chen Runnan","year":"2024","unstructured":"Runnan Chen, Youquan Liu, Lingdong Kong, Nenglun Chen, Xinge Zhu, Yuexin Ma, Tongliang Liu, and Wenping Wang. 2024. Towards label-free scene understanding by vision foundation models. 
Advances in Neural Information Processing Systems 36 (2023), 75896\u201375910.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_299_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.aap.2021.106261"},{"key":"e_1_3_2_300_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.trc.2022.103570"},{"key":"e_1_3_2_301_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01712"}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3760525","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,9]],"date-time":"2025-09-09T14:33:58Z","timestamp":1757428438000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3760525"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,9]]},"references-count":300,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,2,28]]}},"alternative-id":["10.1145\/3760525"],"URL":"https:\/\/doi.org\/10.1145\/3760525","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"value":"0360-0300","type":"print"},{"value":"1557-7341","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,9,9]]},"assertion":[{"value":"2024-11-29","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-08-07","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-09-09","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}