{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T00:06:26Z","timestamp":1769731586019,"version":"3.49.0"},"reference-count":105,"publisher":"MDPI AG","issue":"22","license":[{"start":{"date-parts":[[2022,11,9]],"date-time":"2022-11-09T00:00:00Z","timestamp":1667952000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"University Postgraduate Award (UPA)"},{"name":"University of Wollongong (UOW)"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>With the increase of large camera networks around us, it is becoming more difficult to manually identify vehicles. Computer vision enables us to automate this task. More specifically, vehicle re-identification (ReID) aims to identify cars in a camera network with non-overlapping views. Images captured of vehicles can undergo intense variations of appearance due to illumination, pose, or viewpoint. Furthermore, due to small inter-class similarities and large intra-class differences, feature learning is often enhanced with non-visual cues, such as the topology of camera networks and temporal information. These are, however, not always available or can be resource intensive for the model. Following the success of Transformer baselines in ReID, we propose for the first time an outlook-attention-based vehicle ReID framework using the Vision Outlooker as its backbone, which is able to encode finer-level features. We show that, without embedding any additional side information and using only the visual cues, we can achieve an 80.31% mAP and 97.13% R-1 on the VeRi-776 dataset. Besides documenting our research, this paper also aims to provide a comprehensive walkthrough of vehicle ReID. 
We aim to provide a starting point for individuals and organisations, as it is difficult to navigate through the myriad of complex research in this field.<\/jats:p>","DOI":"10.3390\/s22228651","type":"journal-article","created":{"date-parts":[[2022,11,10]],"date-time":"2022-11-10T02:11:15Z","timestamp":1668046275000},"page":"8651","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["V2ReID: Vision-Outlooker-Based Vehicle Re-Identification"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2418-6992","authenticated-orcid":false,"given":"Yan","family":"Qian","sequence":"first","affiliation":[{"name":"SMART Infrastructure Facility, University of Wollongong, Wollongong 2500, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7800-5309","authenticated-orcid":false,"given":"Johan","family":"Barthelemy","sequence":"additional","affiliation":[{"name":"NVIDIA, Santa Clara, CA 95051, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0894-0489","authenticated-orcid":false,"given":"Umair","family":"Iqbal","sequence":"additional","affiliation":[{"name":"SMART Infrastructure Facility, University of Wollongong, Wollongong 2500, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6151-6228","authenticated-orcid":false,"given":"Pascal","family":"Perez","sequence":"additional","affiliation":[{"name":"SMART Infrastructure Facility, University of Wollongong, Wollongong 2500, Australia"}]}],"member":"1968","published-online":{"date-parts":[[2022,11,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1624","DOI":"10.1109\/TITS.2011.2158001","article-title":"Data-driven intelligent transportation systems: A survey","volume":"12","author":"Zhang","year":"2011","journal-title":"IEEE Trans. Intell. Transp. 
Syst."},{"key":"ref_2","first-page":"1","article-title":"Urban computing: Concepts, methodologies, and applications","volume":"5","author":"Zheng","year":"2014","journal-title":"ACM Trans. Intell. Syst. Technol."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Liu, X., Liu, W., Ma, H., and Fu, H. (2016, January 11-15). Large-scale vehicle re-identification in urban surveillance videos. Proceedings of the 2016 IEEE International Conference on Multimedia and Expo (ICME), Seattle, WA, USA.","DOI":"10.1109\/ICME.2016.7553002"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1442","DOI":"10.1109\/TCYB.2013.2272636","article-title":"Accurate estimation of human body orientation from RGB-D sensors","volume":"43","author":"Liu","year":"2013","journal-title":"IEEE Trans. Cybern."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"3162","DOI":"10.3390\/math9243162","article-title":"Trends in vehicle re-identification past, present, and future: A comprehensive review","volume":"9","author":"Deng","year":"2021","journal-title":"Mathematics"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1665","DOI":"10.1109\/TMM.2021.3069562","article-title":"Beyond triplet loss: Person re-identification with fine-grained difference-aware pairwise loss","volume":"24","author":"Yan","year":"2021","journal-title":"IEEE Trans. Multimed."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, Z., Tang, L., Liu, X., Yao, Z., Yi, S., Shao, J., Yan, J., Wang, S., Li, H., and Wang, X. (2017, January 22\u201329). Orientation invariant feature embedding and spatial temporal regularization for vehicle re-identification. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.49"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Liu, X., Zhang, S., Huang, Q., and Gao, W. (2018, January 23\u201327). Ram: A region-aware deep model for vehicle re-identification. 
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo (ICME), San Diego, CA, USA.","DOI":"10.1109\/ICME.2018.8486589"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"He, B., Li, J., Zhao, Y., and Tian, Y. (2019, January 15\u201319). Part-regularized near-duplicate vehicle re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00412"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Yuan, L., Hou, Q., Jiang, Z., Feng, J., and Yan, S. (2021). Volo: Vision outlooker for visual recognition. arXiv.","DOI":"10.1109\/TPAMI.2022.3206108"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"172443","DOI":"10.1109\/ACCESS.2019.2956172","article-title":"A Survey of Vehicle Re-Identification Based on Deep Learning","volume":"7","author":"Wang","year":"2019","journal-title":"IEEE Access"},{"key":"ref_12","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Gazzah, S., Essoukri, N., and Amara, B. (2017, January 22\u201324). Vehicle Re-identification in Camera Networks: A Review and New Perspectives. Proceedings of the ACIT\u20192017 The International Arab Conference on Information Technology, Yassmine Hammamet, Tunisia.","DOI":"10.1109\/DT.2017.8012146"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.cviu.2019.03.001","article-title":"A survey of advances in vision-based vehicle re-identification","volume":"182","author":"Khan","year":"2019","journal-title":"Comput. Vis. 
Image Underst."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"2872","DOI":"10.1109\/TPAMI.2021.3054775","article-title":"Deep learning for person re-identification: A survey and outlook","volume":"44","author":"Ye","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"29","DOI":"10.3389\/fncom.2020.00029","article-title":"Attention in psychology, neuroscience, and machine learning","volume":"14","author":"Lindsay","year":"2020","journal-title":"Front. Comput. Neurosci."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Teng, S., Liu, X., Zhang, S., and Huang, Q. (2018, January 21\u201322). Scan: Spatial and channel attention network for vehicle re-identification. Proceedings of the Pacific Rim Conference on Multimedia, Hefei, China.","DOI":"10.1007\/978-3-030-00764-5_32"},{"key":"ref_18","unstructured":"Khorramshahi, P., Kumar, A., Peri, N., Rambhatla, S.S., Chen, J.C., and Chellappa, R. (November, January 27). A dual-path model with adaptive attention for vehicle re-identification. Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1245","DOI":"10.1109\/TMM.2017.2648498","article-title":"Diversified visual attention networks for fine-grained object classification","volume":"19","author":"Zhao","year":"2017","journal-title":"IEEE Trans. Multimed."},{"key":"ref_20","unstructured":"Mnih, V., Heess, N., and Graves, A. (2014, January 8\u201313). Recurrent models of visual attention. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, USA."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Naphade, M., Wang, S., Anastasiu, D.C., Tang, Z., Chang, M.C., Yang, X., Yao, Y., Zheng, L., Chakraborty, P., and Lopez, C.E. (2021, January 19\u201325). The 5th ai city challenge. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPRW53098.2021.00482"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Wu, M., Qian, Y., Wang, C., and Yang, M. (2021, January 19\u201325). A multi-camera vehicle tracking system based on city-scale vehicle Re-ID and spatial-temporal information. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPRW53098.2021.00460"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Huynh, S.V. (2021, January 19\u201325). A strong baseline for vehicle re-identification. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPRW53098.2021.00468"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Fernandez, M., Moral, P., Garcia-Martin, A., and Martinez, J.M. (2021, January 19\u201325). Vehicle Re-Identification based on Ensembling Deep Learning Features including a Synthetic Training Dataset, Orientation and Background Features, and Camera Verification. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Virtual.","DOI":"10.1109\/CVPRW53098.2021.00459"},{"key":"ref_25","unstructured":"Liu, H., Tian, Y., Yang, Y., Pang, L., and Huang, T. (July, January 26). Deep relative distance learning: Tell the difference between similar vehicles. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Shen, Y., Xiao, T., Li, H., Yi, S., and Wang, X. (2017, January 22\u201329). Learning deep neural networks for vehicle re-id with visual-spatio-temporal path proposals. 
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.210"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"645","DOI":"10.1109\/TMM.2017.2751966","article-title":"Provid: Progressive and multimodal vehicle reidentification for large-scale urban surveillance","volume":"20","author":"Liu","year":"2017","journal-title":"IEEE Trans. Multimed."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"43724","DOI":"10.1109\/ACCESS.2018.2862382","article-title":"Joint feature and similarity deep learning for vehicle re-identification","volume":"6","author":"Zhu","year":"2018","journal-title":"IEEE Access"},{"key":"ref_29","unstructured":"Chu, R., Sun, Y., Li, Y., Liu, Z., Zhang, C., and Wei, Y. (November, January 27). Vehicle re-identification with viewpoint-aware metric learning. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"115673","DOI":"10.1109\/ACCESS.2020.3004092","article-title":"Unifying Person and Vehicle Re-Identification","volume":"8","author":"Organisciak","year":"2020","journal-title":"IEEE Access"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Wei, X.S., Zhang, C.L., Liu, L., Shen, C., and Wu, J. (2018, January 2\u20136). Coarse-to-fine: A RNN-based hierarchical attention model for vehicle re-identification. Proceedings of the Asian Conference on Computer Vision, Perth, Australia.","DOI":"10.1007\/978-3-030-20890-5_37"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Chen, T.S., Liu, C.T., Wu, C.W., and Chien, S.Y. (2020). Orientation-aware Vehicle Re-identification with Semantics-guided Part Attention Network. arXiv.","DOI":"10.1007\/978-3-030-58536-5_20"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhou, Y., and Shao, L. (2017, January 4\u20137). Cross-View GAN Based Vehicle Generation for Re-identification. 
Proceedings of the BMVC, London, UK.","DOI":"10.5244\/C.31.186"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Wu, F., Yan, S., Smith, J.S., and Zhang, B. (2018, January 20\u201324). Joint semi-supervised learning and re-ranking for vehicle re-identification. Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China.","DOI":"10.1109\/ICPR.2018.8545584"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"261","DOI":"10.1016\/j.image.2019.04.021","article-title":"Vehicle re-identification in still images: Application of semi-supervised learning and re-ranking","volume":"76","author":"Wu","year":"2019","journal-title":"Signal Process. Image Commun."},{"key":"ref_36","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_39","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4\u20139). Attention is all you need. Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_40","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. 
(2020). An image is worth 16 x 16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_41","unstructured":"Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., and J\u00e9gou, H. (2021, January 18\u201324). Training data-efficient image Transformers & distillation through attention. Proceedings of the International Conference on Machine Learning, PMLR, Virtual."},{"key":"ref_42","unstructured":"Lin, T., Wang, Y., Liu, X., and Qiu, X. (2021). A survey of Transformers. arXiv."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with Transformers. Proceedings of the European Conference on Computer Vision, Virtual.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_44","unstructured":"Fang, Y., Liao, B., Wang, X., Fang, J., Qi, J., Wu, R., Niu, J., and Liu, W. (2021, January 6\u201314). You only look at one sequence: Rethinking Transformer in vision through object detection. Proceedings of the Advances in Neural Information Processing Systems, Virtual."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Zheng, S., Lu, J., Zhao, H., Zhu, X., Luo, Z., Wang, Y., Fu, Y., Feng, J., Xiang, T., and Torr, P.H. (2021, January 19\u201325). Rethinking semantic segmentation from a sequence-to-sequence perspective with Transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00681"},{"key":"ref_46","unstructured":"Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M., and Luo, P. (2021, January 6\u201314). SegFormer: Simple and efficient design for semantic segmentation with Transformers. Proceedings of the Advances in Neural Information Processing Systems, Virtual."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"He, S., Luo, H., Wang, P., Wang, F., Li, H., and Jiang, W. 
(2021, January 10\u201317). Transreid: Transformer-based object re-identification. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.01474"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Han, K., Wang, Y., Chen, H., Chen, X., Guo, J., Liu, Z., Tang, Y., Xiao, A., Xu, C., and Xu, Y. (2022). A survey on vision Transformer. IEEE Trans. Pattern Anal. Mach. Intell.","DOI":"10.1109\/TPAMI.2022.3152247"},{"key":"ref_49","unstructured":"Liu, Y., Zhang, Y., Wang, Y., Hou, F., Yuan, J., Tian, J., Zhang, Y., Shi, Z., Fan, J., and He, Z. (2021). A Survey of Visual Transformers. arXiv."},{"key":"ref_50","first-page":"200","article-title":"Transformers in vision: A survey","volume":"54","author":"Khan","year":"2021","journal-title":"ACM Comput. Surv."},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"331","DOI":"10.1007\/s41095-022-0271-y","article-title":"Attention mechanisms in computer vision: A survey","volume":"8","author":"Guo","year":"2022","journal-title":"Comput. Vis. Media"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"33","DOI":"10.1007\/s41095-021-0247-3","article-title":"Transformers in computational visual media: A survey","volume":"8","author":"Xu","year":"2022","journal-title":"Comput. Vis. Media"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1007\/s41095-022-0274-8","article-title":"Pvt v2: Improved baselines with pyramid vision Transformer","volume":"8","author":"Wang","year":"2022","journal-title":"Comput. Vis. Media"},{"key":"ref_54","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional Transformers for language understanding. arXiv."},{"key":"ref_55","unstructured":"Ba, J.L., Kiros, J.R., and Hinton, G.E. (2016). Layer normalization. 
arXiv."},{"key":"ref_56","unstructured":"Battaglia, P.W., Hamrick, J.B., Bapst, V., Sanchez-Gonzalez, A., Zambaldi, V., Malinowski, M., Tacchetti, A., Raposo, D., Santoro, A., and Faulkner, R. (2018). Relational inductive biases, deep learning, and graph networks. arXiv."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Sun, C., Shrivastava, A., Singh, S., and Gupta, A. (2017, January 22\u201329). Revisiting unreasonable effectiveness of data in deep learning era. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.97"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Li, F.-F. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision And Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_59","unstructured":"Krizhevsky, A., and Hinton, G. (2009). Learning Multiple Layers of Features from Tiny Images, Technical Report; University of Toronto."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Srinivas, A., Lin, T.Y., Parmar, N., Shlens, J., Abbeel, P., and Vaswani, A. (2021, January 20\u201325). Bottleneck Transformers for visual recognition. Proceedings of the IEEE\/CVF Conference On Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01625"},{"key":"ref_61","unstructured":"Wu, B., Xu, C., Dai, X., Wan, A., Zhang, P., Yan, Z., Tomizuka, M., Gonzalez, J., Keutzer, K., and Vajda, P. (2020). Visual Transformers: Token-based image representation and processing for computer vision. arXiv."},{"key":"ref_62","unstructured":"D\u2019Ascoli, S., Touvron, H., Leavitt, M.L., Morcos, A.S., Biroli, G., and Sagun, L. (2021, January 18\u201324). Convit: Improving vision Transformers with soft convolutional inductive biases. 
Proceedings of the International Conference on Machine Learning, PMLR, Virtual."},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Yuan, K., Guo, S., Liu, Z., Zhou, A., Yu, F., and Wu, W. (2021, January 10\u201317). Incorporating convolution designs into visual Transformers. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00062"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Wu, H., Xiao, B., Codella, N., Liu, M., Dai, X., Yuan, L., and Zhang, L. (2021, January 10\u201317). Cvt: Introducing convolutions to vision Transformers. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00009"},{"key":"ref_65","unstructured":"Li, Y., Zhang, K., Cao, J., Timofte, R., and Van Gool, L. (2021). Localvit: Bringing locality to vision Transformers. arXiv."},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Graham, B., El-Nouby, A., Touvron, H., Stock, P., Joulin, A., J\u00e9gou, H., and Douze, M. (2021, January 10\u201317). Levit: A vision Transformer in convnet\u2019s clothing for faster inference. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.01204"},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 10\u201317). Swin Transformer: Hierarchical vision Transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_68","unstructured":"Han, K., Xiao, A., Wu, E., Guo, J., Xu, C., and Wang, Y. (2021, January 6\u201314). Transformer in Transformer. 
Proceedings of the Advances in Neural Information Processing Systems, Virtual."},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Yuan, L., Chen, Y., Wang, T., Yu, W., Shi, Y., Jiang, Z.H., Tay, F.E., Feng, J., and Yan, S. (2021, January 10\u201317). Tokens-to-token vit: Training vision Transformers from scratch on imagenet. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00060"},{"key":"ref_70","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Fan, D.P., Song, K., Liang, D., Lu, T., Luo, P., and Shao, L. (2021, January 10\u201317). Pyramid vision Transformer: A versatile backbone for dense prediction without convolutions. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Touvron, H., Cord, M., Sablayrolles, A., Synnaeve, G., and J\u00e9gou, H. (2021, January 10\u201317). Going deeper with image Transformers. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00010"},{"key":"ref_72","unstructured":"Zhou, D., Kang, B., Jin, X., Yang, L., Lian, X., Jiang, Z., Hou, Q., and Feng, J. (2021). Deepvit: Towards deeper vision Transformer. arXiv."},{"key":"ref_73","doi-asserted-by":"crossref","unstructured":"Li, D., Hu, J., Wang, C., Li, X., She, Q., Zhu, L., Zhang, T., and Chen, Q. (2021, January 10\u201317). Involution: Inverting the inherence of convolution for visual recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Montreal, QC, Canada.","DOI":"10.1109\/CVPR46437.2021.01214"},{"key":"ref_74","unstructured":"Jiang, Z.H., Hou, Q., Yuan, L., Zhou, D., Shi, Y., Jin, X., Wang, A., and Feng, J. (2021, January 6\u201314). All tokens matter: Token labelling for training better vision Transformers. 
Proceedings of the Advances in Neural Information Processing Systems, Virtual."},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., and Tian, Q. (2015, January 7\u201313). Scalable person re-identification: A benchmark. Proceedings of the IEEE international conference on computer vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.133"},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Ristani, E., Solera, F., Zou, R., Cucchiara, R., and Tomasi, C. (2016, January 11\u201314). Performance measures and a data set for multi-target, multi-camera tracking. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-48881-3_2"},{"key":"ref_77","doi-asserted-by":"crossref","unstructured":"Zhu, H., Ke, W., Li, D., Liu, J., Tian, L., and Shan, Y. (2022, January 19\u201324). Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00465"},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Lu, T., Zhang, H., Min, F., and Jia, S. (2022). Vehicle Re-identification Based on Quadratic Split Architecture and Auxiliary Information Embedding. IEICE Trans. Fundam. Electron. Commun. Comput. Sci.","DOI":"10.1587\/transfun.2022EAL2008"},{"key":"ref_79","unstructured":"Shen, F., Xie, Y., Zhu, J., Zhu, X., and Zeng, H. (2021). Git: Graph interactive Transformer for vehicle re-identification. arXiv."},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Lian, J., Wang, D., Zhu, S., Wu, Y., and Li, C. (2022). Transformer-Based Attention Network for Vehicle Re-Identification. 
Electronics, 11.","DOI":"10.3390\/electronics11071016"},{"key":"ref_81","doi-asserted-by":"crossref","first-page":"19557","DOI":"10.1109\/TITS.2022.3166463","article-title":"MsKAT: Multi-Scale Knowledge-Aware Transformer for Vehicle Re-Identification","volume":"23","author":"Li","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Luo, H., Chen, W., Xu, X., Gu, J., Zhang, Y., Liu, C., Jiang, Y., He, S., Wang, F., and Li, H. (2021, January 20\u201325). An empirical study of vehicle re-identification on the AI City Challenge. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPRW53098.2021.00462"},{"key":"ref_83","doi-asserted-by":"crossref","first-page":"102868","DOI":"10.1016\/j.ipm.2022.102868","article-title":"Multi-attribute adaptive aggregation Transformer for vehicle re-identification","volume":"59","author":"Yu","year":"2022","journal-title":"Inf. Process. Manag."},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Gibbs, J.W. (1902). Elementary Principles in Statistical Mechanics\u2014Developed with Especial Reference to the Rational Foundation of Thermodynamics, C. Scribner\u2019s Sons. Available online: www.gutenberg.org\/ebooks\/50992.","DOI":"10.5962\/bhl.title.32624"},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Bridle, J.S. (1990). Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. Neurocomputing, Springer.","DOI":"10.1007\/978-3-642-76153-9_28"},{"key":"ref_86","first-page":"45","article-title":"Shannon equations reform and applications","volume":"44","author":"Lu","year":"1990","journal-title":"BUSEFAL"},{"key":"ref_87","doi-asserted-by":"crossref","unstructured":"Zheng, L., Zhang, H., Sun, S., Chandraker, M., Yang, Y., and Tian, Q. (2017, January 21\u201326). Person re-identification in the wild. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.357"},{"key":"ref_88","unstructured":"Liu, W., Wen, Y., Yu, Z., and Yang, M. (2016, January 20\u201322). Large-margin SoftMax loss for convolutional neural networks. Proceedings of the ICML, New York, NY, USA."},{"key":"ref_89","doi-asserted-by":"crossref","unstructured":"Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., and Song, L. (2017, January 21\u201326). Sphereface: Deep hypersphere embedding for face recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.713"},{"key":"ref_90","unstructured":"Chen, B., Deng, W., and Shen, H. (2018, January 3\u20138). Virtual class enhanced discriminative embedding learning. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_91","unstructured":"Hermans, A., Beyer, L., and Leibe, B. (2017). In defense of the triplet loss for person re-identification. arXiv."},{"key":"ref_92","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). Facenet: A unified embedding for face recognition and clustering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_93","doi-asserted-by":"crossref","unstructured":"Chen, W., Chen, X., Zhang, J., and Huang, K. (2017, January 21\u201326). Beyond triplet loss: A deep quadruplet network for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.145"},{"key":"ref_94","doi-asserted-by":"crossref","unstructured":"Sun, Y., Cheng, C., Zhang, Y., Zhang, C., Zheng, L., Wang, Z., and Wei, Y. (2020, January 13\u201319). Circle loss: A unified perspective of pair similarity optimization. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00643"},{"key":"ref_95","doi-asserted-by":"crossref","unstructured":"Wen, Y., Zhang, K., Li, Z., and Qiao, Y. (2016). A discriminative feature learning approach for deep face recognition. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46478-7_31"},{"key":"ref_96","doi-asserted-by":"crossref","unstructured":"Zhu, X., Luo, Z., Fu, P., and Ji, X. (2020, January 14\u201319). VOC-ReID: Vehicle re-identification based on vehicle-orientation-camera. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00309"},{"key":"ref_97","unstructured":"Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., Maschinot, A., Liu, C., and Krishnan, D. (2020, January 6\u201312). Supervised contrastive learning. Proceedings of the Advances in Neural Information Processing Systems, Virtual."},{"key":"ref_98","doi-asserted-by":"crossref","unstructured":"Luo, H., Gu, Y., Liao, X., Lai, S., and Jiang, W. (2019, January 16\u201317). Bag of tricks and a strong baseline for deep person re-identification. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA.","DOI":"10.1109\/CVPRW.2019.00190"},{"key":"ref_99","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna, Z. (2016, January 27\u201330). Rethinking the inception architecture for computer vision. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.308"},{"key":"ref_100","doi-asserted-by":"crossref","first-page":"2597","DOI":"10.1109\/TMM.2019.2958756","article-title":"A strong baseline and batch normalization neck for deep person re-identification","volume":"22","author":"Luo","year":"2019","journal-title":"IEEE Trans. Multimed."},{"key":"ref_101","doi-asserted-by":"crossref","unstructured":"Liu, X., Liu, W., Mei, T., and Ma, H. (2016). A deep learning-based approach to progressive vehicle re-identification for urban surveillance. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46475-6_53"},{"key":"ref_102","unstructured":"Loshchilov, I., and Hutter, F. (2016). Sgdr: Stochastic gradient descent with warm restarts. arXiv."},{"key":"ref_103","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1016\/j.jvcir.2019.01.010","article-title":"Spherereid: Deep hypersphere manifold embedding for person re-identification","volume":"60","author":"Fan","year":"2019","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_104","unstructured":"Goodfellow, I.J., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press. Available online: http:\/\/www.deeplearningbook.org."},{"key":"ref_105","unstructured":"Zhang, T., and Li, W. (2020). k-decay: A new method for learning rate schedule. 
arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/22\/8651\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:13:23Z","timestamp":1760145203000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/22\/8651"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,9]]},"references-count":105,"journal-issue":{"issue":"22","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["s22228651"],"URL":"https:\/\/doi.org\/10.3390\/s22228651","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,9]]}}}