{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,2]],"date-time":"2026-01-02T07:35:42Z","timestamp":1767339342251,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2023,2,3]],"date-time":"2023-02-03T00:00:00Z","timestamp":1675382400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Object detection based on deep learning is one of the most important and fundamental tasks of computer vision. High-performance detection algorithms have been widely used in many practical fields. For the management of workers wearing helmets in construction scenarios, this paper proposes a framework model based on the YOLOv5 detection algorithm, combined with multi-object tracking algorithms, to monitor and track whether workers wear safety helmets in real-time video. The improved StrongSORT tracking algorithm of DeepSORT is selected to reduce the loss of the tracked object caused by the occlusion, trajectory blur, and motion scale of the object. The safety helmet dataset is trained with YOLOv5s, and the best result of training is used as the weight model in the StrongSORT tracking algorithm. The experimental results show that the mAP@0.5 of all classes in the YOLOv5s model can reach 95.1% in the validation dataset, mAP@0.5:0.95 is 62.1%, and the precision of wearing helmet is 95.7%. After the box regression loss function was changed from CIOU to Focal-EIOU, the mAP@0.5 increased to 95.4%, mAP@0.5:0.95 increased to 62.9%, and the precision of wearing helmet increased to 96.5%, which were increased by 0.3%, 0.8% and 0.8%, respectively. StrongSORT can update object trajectories in video frames at a speed of 0.05 s per frame. Based on the improved YOLOv5s combined with the StrongSORT tracking algorithm, the helmet-wearing tracking detection can achieve better performance.<\/jats:p>","DOI":"10.3390\/s23031682","type":"journal-article","created":{"date-parts":[[2023,2,3]],"date-time":"2023-02-03T01:40:25Z","timestamp":1675388425000},"page":"1682","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":19,"title":["Helmet-Wearing Tracking Detection Based on StrongSORT"],"prefix":"10.3390","volume":"23","author":[{"given":"Fufang","family":"Li","sequence":"first","affiliation":[{"name":"School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5073-3956","authenticated-orcid":false,"given":"Yan","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China"}]},{"given":"Ming","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China"}]},{"given":"Manlin","family":"Luo","sequence":"additional","affiliation":[{"name":"School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China"}]},{"given":"Guobin","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,2,3]]},"reference":[{"key":"ref_1","unstructured":"Viola, P., and Jones, M. (2001, January 8\u201314). Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2001, Kauai, HI, USA."},{"key":"ref_2","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_3","unstructured":"O\u2019Shea, K., and Nash, R. (2015). An introduction to convolutional neural networks. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"154","DOI":"10.1007\/s11263-013-0620-5","article-title":"Selective search for object recognition","volume":"104","author":"Uijlings","year":"2013","journal-title":"Int. J. Comput. Vis."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1145\/3065386","article-title":"Imagenet classification with deep convolutional neural networks","volume":"60","author":"Krizhevsky","year":"2017","journal-title":"Commun. ACM"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards real-time object detection with region proposal networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_9","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao HY, M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_12","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_13","unstructured":"Jocher, G. (2020, December 20). YOLOv5 by Ultralytics (Version 7.0) [Computer software]. Available online: https:\/\/doi.org\/10.5281\/zenodo.3908559."},{"key":"ref_14","unstructured":"Tian, Z., Shen, C., Chen, H., and He, T. (November, January 27). Fcos: Fully convolutional one-stage object detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 11\u201314). SSD: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_17","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_19","unstructured":"Redmon, J. (2013, December 20). Darknet: Open Source Neural Networks in c. Available online: http:\/\/pjreddie.com\/darknet\/."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_22","unstructured":"Misra, D. (2019). Mish: A self regularized non-monotonic activation function. arXiv."},{"key":"ref_23","first-page":"12993","article-title":"Distance-IoU loss: Faster and better learning for bounding box regression","volume":"34","author":"Zheng","year":"2020","journal-title":"Proc. AAAI Conf. Artif. Intell."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"32650","DOI":"10.1109\/ACCESS.2021.3060821","article-title":"Analysis based on recent deep learning approaches applied in real-time multi-object tracking: A review","volume":"9","author":"Kalake","year":"2021","journal-title":"IEEE Access"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Bewley, A., Ge, Z., Ott, L., Ramos, F., and Upcroft, B. (2016, January 25\u201328). Simple online and realtime tracking. Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA.","DOI":"10.1109\/ICIP.2016.7533003"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wojke, N., Bewley, A., and Paulus, D. (2017, January 17\u201320). Simple online and realtime tracking with a deep association metric. Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China.","DOI":"10.1109\/ICIP.2017.8296962"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Du, Y., Song, Y., Yang, B., and Zhao, Y. (2022). Strongsort: Make deepsort great again. arXiv.","DOI":"10.1109\/TMM.2023.3240881"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2597","DOI":"10.1109\/TMM.2019.2958756","article-title":"A strong baseline and batch normalization neck for deep person re-identification","volume":"22","author":"Luo","year":"2019","journal-title":"IEEE Trans. Multimed."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Song, H., Zhang, X., Song, J., and Zhao, J. (2022). Detection and tracking of safety helmet based on DeepSort and YOLOv5. Multimedia Tools Appl., 1\u201314.","DOI":"10.1007\/s11042-022-13305-0"},{"key":"ref_31","first-page":"961","article-title":"Object tracking using improved deep SORT YOLOv3 architecture","volume":"14","author":"Dang","year":"2020","journal-title":"ICIC Express Lett."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1007\/s00521-021-06391-y","article-title":"Real-time multiple object tracking using deep learning methods","volume":"35","author":"Meimetis","year":"2021","journal-title":"Neural Comput. Appl."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"103911","DOI":"10.1016\/j.imavis.2020.103911","article-title":"IoU-aware single-stage object detector for accurate localization","volume":"97","author":"Wu","year":"2020","journal-title":"Image Vis. Comput."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J.Y., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 15\u201320). Generalized intersection over union: A metric and a loss for bounding box regression. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"146","DOI":"10.1016\/j.neucom.2022.07.042","article-title":"Focal and efficient IOU loss for accurate bounding box regression","volume":"506","author":"Zhang","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1115\/1.3662552","article-title":"A new approach to linear filtering and prediction problems","volume":"82","author":"Kalman","year":"1960","journal-title":"J. Basic Eng."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/3\/1682\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:23:01Z","timestamp":1760120581000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/3\/1682"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,3]]},"references-count":36,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["s23031682"],"URL":"https:\/\/doi.org\/10.3390\/s23031682","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2023,2,3]]}}}