{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T10:08:00Z","timestamp":1760609280690,"version":"build-2065373602"},"reference-count":55,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2020,6,26]],"date-time":"2020-06-26T00:00:00Z","timestamp":1593129600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100010418","name":"Institute for Information and Communications Technology Promotion","doi-asserted-by":"publisher","award":["2017-0-00250"],"award-info":[{"award-number":["2017-0-00250"]}],"id":[{"id":"10.13039\/501100010418","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Person re-identification (Re-ID) has a problem that makes learning difficult such as misalignment and occlusion. To solve these problems, it is important to focus on robust features in intra-class variation. Existing attention-based Re-ID methods focus only on common features without considering distinctive features. In this paper, we present a novel attentive learning-based Siamese network for person Re-ID. Unlike existing methods, we designed an attention module and attention loss using the properties of the Siamese network to concentrate attention on common and distinctive features. The attention module consists of channel attention to select important channels and encoder-decoder attention to observe the whole body shape. We modified the triplet loss into an attention loss, called uniformity loss. The uniformity loss generates a unique attention map, which focuses on both common and discriminative features. 
Extensive experiments show that the proposed network compares favorably to the state-of-the-art methods on three large-scale benchmarks including Market-1501, CUHK03 and DukeMTMC-ReID datasets.<\/jats:p>","DOI":"10.3390\/s20123603","type":"journal-article","created":{"date-parts":[[2020,6,29]],"date-time":"2020-06-29T11:17:17Z","timestamp":1593429437000},"page":"3603","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Uniformity Attentive Learning-Based Siamese Network for Person Re-Identification"],"prefix":"10.3390","volume":"20","author":[{"given":"Dasol","family":"Jeong","sequence":"first","affiliation":[{"name":"Department of Image, Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]},{"given":"Hasil","family":"Park","sequence":"additional","affiliation":[{"name":"Department of Image, Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]},{"given":"Joongchol","family":"Shin","sequence":"additional","affiliation":[{"name":"Department of Image, Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]},{"given":"Donggoo","family":"Kang","sequence":"additional","affiliation":[{"name":"Department of Image, Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8593-7155","authenticated-orcid":false,"given":"Joonki","family":"Paik","sequence":"additional","affiliation":[{"name":"Department of Image, Graduate School of Advanced Imaging Science, Multimedia and Film, Chung-Ang University, Seoul 06974, Korea"}]}],"member":"1968","published-online":{"date-parts":[[2020,6,26]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.patrec.2012.07.005","article-title":"Intelligent multi-camera video 
surveillance: A review","volume":"34","author":"Wang","year":"2013","journal-title":"Pattern Recognit. Lett."},{"key":"ref_2","unstructured":"Loy, C.C., Xiang, T., and Gong, S. (2009, January 20\u201325). Multi-camera activity correlation analysis. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA."},{"key":"ref_3","first-page":"207","article-title":"Distance metric learning for large margin nearest neighbor classification","volume":"10","author":"Weinberger","year":"2009","journal-title":"J. Mach. Learn. Res."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"653","DOI":"10.1109\/TPAMI.2012.138","article-title":"Reidentification by relative distance comparison","volume":"35","author":"Zheng","year":"2012","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_5","unstructured":"Zajdel, W., Zivkovic, Z., and Krose, B. (2005, January 18\u201322). Keeping track of humans: Have I seen this person before?. Proceedings of the 2005 IEEE International Conference on Robotics and Automation, Barcelona, Spain."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Gray, D., and Tao, H. (2008). Viewpoint invariant pedestrian recognition with an ensemble of localized features. European Conference Computer Vision, Springer.","DOI":"10.1007\/978-3-540-88682-2_21"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Farenzena, M., Bazzani, L., Perina, A., Murino, V., and Cristani, M. (2010, January 13\u201318). Person re-identification by symmetry-driven accumulation of local features. Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5539926"},{"key":"ref_8","unstructured":"Gheissari, N., Sebastian, T.B., and Hartley, R. (2006, January 17\u201322). Person reidentification using spatiotemporal appearance. 
Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201906), New York, NY, USA."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Zhao, R., Ouyang, W., and Wang, X. (2013, January 23\u201328). Unsupervised salience learning for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.460"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Mignon, A., and Jurie, F. (2012, January 18\u201320). Pcca: A new approach for distance learning from sparse pairwise constraints. Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6247987"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Liao, S., Hu, Y., Zhu, X., and Li, S.Z. (2015, January 7\u201312). Person re-identification by local maximal occurrence representation and metric learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298832"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., and Tian, Q. (2015, January 11\u201318). Scalable person re-identification: A benchmark. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.133"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Li, W., Zhao, R., Xiao, T., and Wang, X. (2014, January 24\u201327). Deepreid: Deep filter pairing neural network for person re-identification. Proceedings of the IEEE conference on computer vision and pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.27"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Zheng, L., and Yang, Y. (2017, January 22\u201329). 
Unlabeled samples generated by gan improve the person re-identification baseline in vitro. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.405"},{"key":"ref_15","first-page":"1097","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2012","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_16","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A.A. (2017, January 4\u201310). Inception-v4, inception-resnet and the impact of residual connections on learning. Proceedings of the Thirty-first AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_20","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_21","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017, January 4-9). Attention is all you need. 
Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA."},{"key":"ref_22","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Luong, M.T., Pham, H., and Manning, C.D. (2015). Effective approaches to attention-based neural machine translation. arXiv.","DOI":"10.18653\/v1\/D15-1166"},{"key":"ref_24","unstructured":"Mnih, V., Heess, N., and Graves, A. (2014, January 8\u201313). Recurrent models of visual attention. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_25","unstructured":"Bahdanau, D., Cho, K., and Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv."},{"key":"ref_26","unstructured":"Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhudinov, R., Zemel, R., and Bengio, Y. (2015, January 6\u201311). Show, attend and tell: Neural image caption generation with visual attention. Proceedings of the International Conference on Machine Learning, Lille, France."},{"key":"ref_27","unstructured":"Sermanet, P., Frome, A., and Real, E. (2014). Attention for fine-grained categorization. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"944","DOI":"10.1109\/TMM.2016.2642789","article-title":"Attentive contexts for object detection","volume":"19","author":"Li","year":"2016","journal-title":"IEEE Trans. Multimed."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Liu, X., Zhao, H., Tian, M., Sheng, L., Shao, J., Yi, S., Yan, J., and Wang, X. (2017, January 22\u201329). Hydraplus-net: Attentive deep features for pedestrian analysis. 
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.46"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Li, W., Zhu, X., and Gong, S. (2018, January 18\u201322). Harmonious attention network for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake, UT, USA.","DOI":"10.1109\/CVPR.2018.00243"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Li, S., Bak, S., Carr, P., and Wang, X. (2018, January 18\u201322). Diversity regularized spatiotemporal attention for video-based person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake, UT, USA.","DOI":"10.1109\/CVPR.2018.00046"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Varior, R.R., Haloi, M., and Wang, G. (2016). Gated siamese convolutional neural network architecture for human re-identification. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46484-8_48"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zheng, M., Karanam, S., Wu, Z., and Radke, R.J. (2019, January 16\u201320). Re-identification with consistent attentive siamese networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00588"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3159171","article-title":"A discriminatively learned cnn embedding for person reidentification","volume":"14","author":"Zheng","year":"2017","journal-title":"ACM Trans. Multimed. Comput. Commun. Appl."},{"key":"ref_35","unstructured":"Cheng, D., Gong, Y., Zhou, S., Wang, J., and Zheng, N. (July, January 26). Person re-identification by multi-channel parts-based cnn with improved triplet loss function. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Chen, W., Chen, X., Zhang, J., and Huang, K. (2017, January 21\u201326). Beyond triplet loss: A deep quadruplet network for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.145"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Li, D.X., Fei, G.Y., and Teng, S.W. (2020). Learning Large Margin Multiple Granularity Features with an Improved Siamese Network for Person Re-Identification. Symmetry, 12.","DOI":"10.3390\/sym12010092"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Guo, H., Zheng, K., Fan, X., Yu, H., and Wang, S. (2019, January 16\u201320). Visual attention consistency under image transforms for multi-label image classification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00082"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Newell, A., Yang, K., and Deng, J. (2016). Stacked hourglass networks for human pose estimation. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"ref_40","unstructured":"Zheng, L., Yang, Y., and Hauptmann, A.G. (2016). Person re-identification: Past, present and future. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Schroff, F., Kalenichenko, D., and Philbin, J. (2015, January 7\u201312). Facenet: A unified embedding for face recognition and clustering. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298682"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Zhong, Z., Zheng, L., Cao, D., and Li, S. (2017, January 21\u201326). Re-ranking person re-identification with k-reciprocal encoding. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.389"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Ristani, E., Solera, F., Zou, R., Cucchiara, R., and Tomasi, C. (2016). Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking. European Conference on Computer Vision Workshop on Benchmarking Multi-Target Tracking, Springer.","DOI":"10.1007\/978-3-319-48881-3_2"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Li, D., Chen, X., Zhang, Z., and Huang, K. (2017, January 21\u201326). Learning deep context-aware features over body and latent parts for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.782"},{"key":"ref_46","unstructured":"Chen, D., Yuan, Z., Chen, B., and Zheng, N. (July, January 26). Similarity learning with spatial constraints for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_47","unstructured":"Zhang, L., Xiang, T., and Gong, S. (July, January 26). Learning a discriminative null space for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Chen, Y., Zhu, X., and Gong, S. (2017, January 22\u201329). Person re-identification by deep learning multi-scale representations. 
Proceedings of the IEEE International Conference on Computer Vision Workshops, Venice, Italy.","DOI":"10.1109\/ICCVW.2017.304"},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"3492","DOI":"10.1109\/TIP.2017.2700762","article-title":"End-to-end comparative attention networks for person re-identification","volume":"26","author":"Liu","year":"2017","journal-title":"IEEE Trans. Image Process."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Varior, R.R., Shuai, B., Lu, J., Xu, D., and Wang, G. (2016). A siamese long short-term memory architecture for human re-identification. European Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-319-46478-7_9"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Sun, Y., Zheng, L., Deng, W., and Wang, S. (2017, January 22\u201329). Svdnet for pedestrian retrieval. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.410"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Wang, H., Gong, S., and Xiang, T. (2016). Highly efficient regression for scalable person re-identification. arXiv.","DOI":"10.5244\/C.30.134"},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"3037","DOI":"10.1109\/TCSVT.2018.2873599","article-title":"Pedestrian alignment network for large-scale person re-identification","volume":"29","author":"Zheng","year":"2018","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Chang, X., Hospedales, T.M., and Xiang, T. (2018, January 18\u201322). Multi-level factorisation net for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake, UT, USA.","DOI":"10.1109\/CVPR.2018.00225"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Li, W., Zhu, X., and Gong, S. (2017). Person re-identification by deep joint learning of multi-loss classification. 
arXiv.","DOI":"10.24963\/ijcai.2017\/305"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/12\/3603\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:43:18Z","timestamp":1760175798000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/12\/3603"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,26]]},"references-count":55,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2020,6]]}},"alternative-id":["s20123603"],"URL":"https:\/\/doi.org\/10.3390\/s20123603","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2020,6,26]]}}}