{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T03:02:12Z","timestamp":1760151732880,"version":"build-2065373602"},"reference-count":43,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2022,4,15]],"date-time":"2022-04-15T00:00:00Z","timestamp":1649980800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Partial occlusion and background clutter in camera video surveillance affect the accuracy of video-based person re-identification (re-ID). To address these problems, we propose a person re-ID method based on random erasure of frame sampling and temporal weight aggregation of mutual information of partial and global features. First, for the case in which the target person is interfered or partially occluded, the frame sampling\u2013random erasure (FSE) method is used for data enhancement to effectively alleviate the occlusion problem, improve the generalization ability of the model, and match persons more accurately. Second, to further improve the re-ID accuracy of video-based persons and learn more discriminative feature representations, we use a ResNet-50 network to extract global and partial features and fuse these features to obtain frame-level features. In the time dimension, based on a mutual information\u2013temporal weight aggregation (MI\u2013TWA) module, the partial features are added according to different weights and the global features are added according to equal weights and connected to output sequence features. The proposed method is extensively experimented on three public video datasets, MARS, DukeMTMC-VideoReID, and PRID-2011; the mean average precision (mAP) values are 82.4%, 94.1%, and 95.3% and Rank-1 values are 86.4%, 94.8%, and 95.2%, respectively.<\/jats:p>","DOI":"10.3390\/s22083047","type":"journal-article","created":{"date-parts":[[2022,4,19]],"date-time":"2022-04-19T02:39:31Z","timestamp":1650335971000},"page":"3047","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Video Person Re-Identification with Frame Sampling\u2013Random Erasure and Mutual Information\u2013Temporal Weight Aggregation"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1738-9425","authenticated-orcid":false,"given":"Jiayue","family":"Li","sequence":"first","affiliation":[{"name":"Information and Communication Engineering, Electronics Information Engineering College, Changchun University of Science and Technology, Changchun 130022, China"},{"name":"High-Speed Railway Comprehensive Technical College, Jilin Railway Technology College, Jilin 132299, China"}]},{"given":"Yan","family":"Piao","sequence":"additional","affiliation":[{"name":"Information and Communication Engineering, Electronics Information Engineering College, Changchun University of Science and Technology, Changchun 130022, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,4,15]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Song, C., Huang, Y., OuYang, W., and Wang, L. (2018, January 18\u201321). Mask-guided contrastive attention model for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00129"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Chen, T.L., Ding, S.J., Xie, J.Y., Yuan, Y., Chen, W., Yang, Y., Ren, Z., and Wang, Z. (2018, January 18\u201321). ABD-Net: Attentive but diverse person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/ICCV.2019.00844"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Sun, Y., Zheng, L., Yang, Y., Tian, Q., and Wang, S. (2018, January 8\u201314). Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). Proceedings of the 2018 European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01225-0_30"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Zheng, M., Karanam, S., Wu, Z., and Radke, R.J. (2019, January 15\u201321). Re-identification with consistent attentive siamese networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00588"},{"key":"ref_5","unstructured":"Fu, Y., Wang, X.Y., Wei, Y.C., and Huang, T. (February, January 27). STA: Spatial-Temporal Attention for Large-Scale Video-Based Person Re-Identification. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), Honolulu, HI, USA."},{"key":"ref_6","unstructured":"Liu, X.H., Zhang, P.P., Yu, C.Y., Lu, H.C., Qian, X.S., and Yang, X.Y. (2021). A video is worth three views: Trigeminal transformers for video-based person reidentification. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Li, S.Z., Yu, H.M., and Hu, H.J. (2020, January 7\u201312). Appearance and Motion Enhancement for Video-Based Person Re-Identification. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.6802"},{"key":"ref_8","unstructured":"McLaughlin, N., Rincon, J.M., and Miller, P. (July, January 26). Recurrent Convolutional Network for Video-Based Person Re-Identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhou, Z., Huang, Y., Wang, W., Wang, L., and Tan, T.N. (2017, January 21\u201326). See the Forest for the Trees: Joint Spatial and Temporal Recurrent Neural Networks for Video-Based Person Re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.717"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1366","DOI":"10.1109\/TIP.2018.2878505","article-title":"Video Person Re-Identification by Temporal Residual Learning","volume":"28","author":"Dai","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Wu, Z., Wang, X., Jiang, Y.G., Ye, H., and Xue, X. (2015, January 11\u201316). Modeling spatial-temporal clues in a hybrid deep learning framework for video classification. Proceedings of the ACM Multimedia (ACM), Sydney, Australia.","DOI":"10.1145\/2733373.2806222"},{"key":"ref_12","unstructured":"Yu, Z., Li, T., Yu, N., and Gong, X. (2017). Three-stream convolutional networks for video-based person re-identification. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"He, L.X., Liang, J., Li, H.Q., and Sun, Z.N. (2018, January 18\u201321). Deep spatial feature reconstruction for partial person reidentification: Alignment-free approach. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00739"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zhang, S.S., Yang, J., and Schiele, B. (2018, January 18\u201321). Occluded pedestrian detection through guided attention in cnns. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00731"},{"key":"ref_15","unstructured":"He, K.M., Zhang, X.Y., Ren, S.Q., and Sun, J. (July, January 26). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ge, Y., Gu, X., Chen, M., Wang, H., and Yang, D. (2018, January 23\u201327). Deep multi-metric learning for person re-identification. Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), San Diego, CA, USA.","DOI":"10.1109\/ICME.2018.8486502"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Lan, C., Zeng, W., and Chen, Z. (2019, January 15\u201321). Densely semantically aligned person reidentification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00076"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chen, D., Xu, D., Li, H., Sebe, N., and Wang, X. (2018, January 18\u201321). Group consistent similarity learning via deep crf for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00902"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Liu, F., and Zhang, L. (2019, January 15\u201321). View confusion feature learning for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/ICCV.2019.00674"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Hou, R., Ma, B., Chang, H., Gu, X., Shan, S., and Chen, X. (2019, January 15\u201321). Interaction-and-aggregation network for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00954"},{"key":"ref_21","unstructured":"Gu, X., Ma, B., Chang, H., Shan, S., and Chen, X. (November, January 27). Temporal knowledge propagation for image-to-video person re-identification. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Liu, Y., Yan, J., and Ouyang, W. (2017, January 21\u201326). Quality aware network for set to set recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.499"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Li, S., Bak, S., Carr, P., He, T.C., and Wang, X. (2018, January 18\u201321). Diversity regularized spatiotemporal attention for video-based person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00046"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Xu, S., Cheng, Y., Gu, K., Yang, Y., Chang, S., and Zhou, P. (2017, January 22\u201329). Jointly attentive spatial temporal pooling networks for video-based person re-identification. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.507"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Hou, R., Ma, B., Chang, H., Gu, X., Shan, S., and Chen, X. (2019, January 15\u201321). VRSTC: Occlusion-free video person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00735"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Gu, X., Ma, B., Chang, H., Zhang, H., and Chen, X. (2020, January 23\u201328). Appearance-Preserving 3D Convolution for Video-Based Person Re-Identification. Proceedings of the 2020 European Conference on Computer Vision (ECCV), Glasgow, UK.","DOI":"10.1007\/978-3-030-58536-5_14"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Liao, X.Y., He, L.X., Yang, Z.W., and Zhang, C. (2018, January 2\u20136). Video-Based Person Re-identification via 3D Convolutional Networks and Non-local Attention. Proceedings of the 14th Asian Conference on Computer Vision (ACCV), Perth, Australia.","DOI":"10.1007\/978-3-030-20876-9_39"},{"key":"ref_28","unstructured":"Li, J.N., Zhang, S.L., and Huang, T.J. (February, January 27). Multiscale 3d convolution network for video-based person reidentification. Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), Honolulu, HI, USA."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Yang, J.R., Zheng, W.S., Yang, Q.Z., Chen, Y.-C., and Tian, Q. (2020, January 14\u201319). Spatial-Temporal Graph Convolutional Network for Video-Based Person Re-Identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00335"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Chung, D., Tahboub, K., and Edward, J.D. (2017, January 22\u201329). A two stream siamese convolutional neural network for person re-identification. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.218"},{"key":"ref_31","unstructured":"Karen, S.Y., and Andrew, Z. (2014, January 8\u201313). Two-stream convolutional networks for action recognition in videos. Proceedings of the Conference on Neural Information Processing Systems(NIPS), Montreal, QC, Canada."},{"key":"ref_32","unstructured":"Zhong, Z., Zheng, L., Kang, G., Li, S., and Yang, Y. (2017). Random erasing data augmentation. arXiv."},{"key":"ref_33","unstructured":"Sergey, I., and Christian, S. (2015, January 6\u201311). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proceedings of the International Conference on Machine Learning (ICML), Lille, France."},{"key":"ref_34","unstructured":"Miao, J.X., Wu, Y., Liu, P., Ding, Y.H., and Yang, Y. (November, January 27). Pose-Guided Feature Alignment for Occluded Person Re-Identification. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_35","unstructured":"Hermans, A., Beyer, L., and Leibe, B. (2017). In defense of the triplet loss for person reidentification. arXiv."},{"key":"ref_36","unstructured":"Subramaniam, A., Nambiar, A., and Mittal, A. (November, January 27). Co-segmentation inspired attention networks for video-based person re-identification. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zheng, L., Bie, Z., Sun, Y.F., Wang, J.D., Su, C., Wang, S.J., and Tian, Q. (2016, January 10\u201316). Mars: A video benchmark for large-scale person re-identification. Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46466-4_52"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Hirzer, M., Beleznai, C., Roth, P.M., and Bischof, H. (2011, January 13\u201315). Person re-identification by descriptive and discriminative classification. Proceedings of the Scandinavian conference on Image analysis, Ystad, Sweden.","DOI":"10.1007\/978-3-642-21227-7_9"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Wu, Y., Lin, Y., Dong, X., Yan, Y., Quyang, W., and Yang, Y. (2018, January 18\u201321). Exploit the unknown gradually: One-shot video-based person re-identification by stepwise learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00543"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"2788","DOI":"10.1109\/TCSVT.2017.2715499","article-title":"Video-based person re-identification with accumulative motion context","volume":"28","author":"Liu","year":"2018","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Suh, Y.M., Wang, J.D., Tang, S.Y., Tao, M., and Lee, K.M. (2018, January 8\u201314). Part-aligned bilinear representations for person re-identification. Proceedings of the 2018 European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_25"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Zhang, J., Wang, N., and Zhang, L. (2018, January 18\u201321). Multi-shot pedestrian reidentification via sequential decision making. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00709"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Zhao, Y., Shen, X., Jin, Z., Lu, H., and Hua, X.S. (2019, January 15\u201321). Attribute-driven feature disentangling and temporal aggregation for video person reidentification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00505"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/8\/3047\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:54:50Z","timestamp":1760136890000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/8\/3047"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,15]]},"references-count":43,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2022,4]]}},"alternative-id":["s22083047"],"URL":"https:\/\/doi.org\/10.3390\/s22083047","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2022,4,15]]}}}