{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,30]],"date-time":"2026-01-30T05:23:58Z","timestamp":1769750638910,"version":"3.49.0"},"reference-count":89,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2023,9,24]],"date-time":"2023-09-24T00:00:00Z","timestamp":1695513600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100018571","name":"Guangxi Science and Technology Base and Talent Special Project","doi-asserted-by":"publisher","award":["Guike AD22035127"],"award-info":[{"award-number":["Guike AD22035127"]}],"id":[{"id":"10.13039\/501100018571","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100018571","name":"Guangxi Science and Technology Base and Talent Special Project","doi-asserted-by":"publisher","award":["2023KY0264"],"award-info":[{"award-number":["2023KY0264"]}],"id":[{"id":"10.13039\/501100018571","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100018571","name":"Guangxi Science and Technology Base and Talent Special Project","doi-asserted-by":"publisher","award":["62262011"],"award-info":[{"award-number":["62262011"]}],"id":[{"id":"10.13039\/501100018571","id-type":"DOI","asserted-by":"publisher"}]},{"name":"2023 Guangxi Province University Young and Middle-aged Teachers\u2019 Research Basic Ability Improvement Project","award":["Guike AD22035127"],"award-info":[{"award-number":["Guike AD22035127"]}]},{"name":"2023 Guangxi Province University Young and Middle-aged Teachers\u2019 Research Basic Ability Improvement Project","award":["2023KY0264"],"award-info":[{"award-number":["2023KY0264"]}]},{"name":"2023 Guangxi Province University Young and Middle-aged Teachers\u2019 Research Basic Ability Improvement Project","award":["62262011"],"award-info":[{"award-number":["62262011"]}]},{"name":"National Natural Science Foundation of China","award":["Guike 
AD22035127"],"award-info":[{"award-number":["Guike AD22035127"]}]},{"name":"National Natural Science Foundation of China","award":["2023KY0264"],"award-info":[{"award-number":["2023KY0264"]}]},{"name":"National Natural Science Foundation of China","award":["62262011"],"award-info":[{"award-number":["62262011"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Human detection is the task of locating all instances of human beings present in an image, which has a wide range of applications across various fields, including search and rescue, surveillance, and autonomous driving. The rapid advancement of computer vision and deep learning technologies has brought significant improvements in human detection. However, for more advanced applications like healthcare, human\u2013computer interaction, and scene understanding, it is crucial to obtain information beyond just the localization of humans. These applications require a deeper understanding of human behavior and state to enable effective and safe interactions with humans and the environment. This study presents a comprehensive benchmark, the Common Human Postures (CHP) dataset, aimed at promoting a more informative and more encouraging task beyond mere human detection. The benchmark dataset comprises a diverse collection of images, featuring individuals in different environments, clothing, and occlusions, performing a wide range of postures and activities. The benchmark aims to enhance research in this challenging task by designing novel and precise methods specifically for it. The CHP dataset consists of 5250 human images collected from different scenes, annotated with bounding boxes for seven common human poses. Using this well-annotated dataset, we have developed two baseline detectors, namely CHP-YOLOF and CHP-YOLOX, building upon two identity-preserved human posture detectors: IPH-YOLOF and IPH-YOLOX. 
We evaluate the performance of these baseline detectors through extensive experiments. The results demonstrate that these baseline detectors effectively detect human postures on the CHP dataset. By releasing the CHP dataset, we aim to facilitate further research on human pose estimation and to attract more researchers to focus on this challenging task.<\/jats:p>","DOI":"10.3390\/s23198061","type":"journal-article","created":{"date-parts":[[2023,9,24]],"date-time":"2023-09-24T10:48:31Z","timestamp":1695552511000},"page":"8061","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":11,"title":["Beyond Human Detection: A Benchmark for Detecting Common Human Posture"],"prefix":"10.3390","volume":"23","author":[{"given":"Yongxin","family":"Li","sequence":"first","affiliation":[{"name":"Guangxi Key Laboratory of Embedded Technology and Intelligent Information Processing, College of Information Science and Engineering, Guilin University of Technology, Guilin 541006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-4230-899X","authenticated-orcid":false,"given":"You","family":"Wu","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Embedded Technology and Intelligent Information Processing, College of Information Science and Engineering, Guilin University of Technology, Guilin 541006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaoting","family":"Chen","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Embedded Technology and Intelligent Information Processing, College of Information Science and Engineering, Guilin University of Technology, Guilin 541006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Han","family":"Chen","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Embedded Technology and Intelligent Information Processing, College of 
Information Science and Engineering, Guilin University of Technology, Guilin 541006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Depeng","family":"Kong","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Embedded Technology and Intelligent Information Processing, College of Information Science and Engineering, Guilin University of Technology, Guilin 541006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haihua","family":"Tang","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Embedded Technology and Intelligent Information Processing, College of Information Science and Engineering, Guilin University of Technology, Guilin 541006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4587-513X","authenticated-orcid":false,"given":"Shuiwang","family":"Li","sequence":"additional","affiliation":[{"name":"Guangxi Key Laboratory of Embedded Technology and Intelligent Information Processing, College of Information Science and Engineering, Guilin University of Technology, Guilin 541006, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,9,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1109\/MAES.2020.3021322","article-title":"High Precision Human Detection and Tracking Using Millimeter-Wave Radars","volume":"36","author":"Cui","year":"2020","journal-title":"IEEE Aerosp. Electron. Syst. Mag."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"8759","DOI":"10.1007\/s11042-020-10103-4","article-title":"Human detection techniques for real time surveillance: A comprehensive survey","volume":"80","author":"Ansari","year":"2020","journal-title":"Multimed. 
Tools Appl."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"27867","DOI":"10.1007\/s11042-021-10811-5","article-title":"A deep survey on supervised learning based human detection and activity classification methods","volume":"80","author":"Khan","year":"2021","journal-title":"Multimed. Tools Appl."},{"key":"ref_4","first-page":"462","article-title":"Real-Time Human Detection Using Deep Learning on Embedded Platforms: A Review","volume":"2","author":"Rahmaniar","year":"2021","journal-title":"J. Robot. Control. (JRC)"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"42724","DOI":"10.1109\/ACCESS.2021.3063028","article-title":"Vision-Based Human Detection Techniques: A Descriptive Review","volume":"9","author":"Sumit","year":"2021","journal-title":"IEEE Access"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Pawar, P., and Devendran, V. (2019, January 28\u201329). Scene Understanding: A Survey to See the World at a Single Glance. Proceedings of the 2019 2nd International Conference on Intelligent Communication and Computational Techniques (ICCT), Jaipur, India.","DOI":"10.1109\/ICCT46177.2019.8969051"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1859","DOI":"10.1109\/ACCESS.2018.2886133","article-title":"Indoor Scene Understanding in 2.5\/3D for Autonomous Agents: A Survey","volume":"7","author":"Naseer","year":"2018","journal-title":"IEEE Access"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1080\/01441647.2018.1494640","article-title":"Governing autonomous vehicles: Emerging responses for safety, liability, privacy, cybersecurity, and industry risks","volume":"39","author":"Taeihagh","year":"2018","journal-title":"Transp. Rev."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Guo, Z., Huang, Y., Hu, X., Wei, H., and Zhao, B. (2021). A Survey on Deep Learning Based Approaches for Scene Understanding in Autonomous Driving. 
Electronics, 10.","DOI":"10.3390\/electronics10040471"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"197","DOI":"10.1108\/JOSM-11-2021-0409","article-title":"To serve and protect: A typology of service robots and their role in physically safe services","volume":"32","author":"Schepers","year":"2022","journal-title":"J. Serv. Manag."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1108\/JOSM-04-2018-0119","article-title":"Brave new world: Service robots in the frontline","volume":"29","author":"Wirtz","year":"2018","journal-title":"J. Serv. Manag."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1108\/JSTP-04-2019-0088","article-title":"Service robots, customers and service employees: What can we learn from the academic literature and where are the gaps?","volume":"30","author":"Lu","year":"2020","journal-title":"J. Serv. Theory Pract."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhang, H., Zhang, F., Zhang, Y., Cheng, H., Gao, R., Li, Z., Zhao, J., and Zhang, M. (2022, January 28\u201331). An Elderly Living-alone Guardianship Model Based on Wavelet Transform. Proceedings of the 2022 4th International Conference on Power and Energy Technology (ICPET), Xining, China.","DOI":"10.1109\/ICPET55165.2022.9918289"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"395","DOI":"10.1007\/s11023-021-09561-y","article-title":"Value Sensitive Design to Achieve the UN SDGs with AI: A Case of Elderly Care Robots","volume":"31","author":"Umbrello","year":"2021","journal-title":"Minds Mach."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"629","DOI":"10.1007\/s12369-020-00653-w","article-title":"Trust in and Ethical Design of Carebots: The Case for Ethics of Care","volume":"13","author":"Yew","year":"2020","journal-title":"Int. J. Soc. Robot."},{"key":"ref_16","unstructured":"Coin, A., and Dubljevi\u0107, V. (2021). 
Trust in Human-Robot Interaction, Academic Press."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"657","DOI":"10.1007\/s12369-021-00816-3","article-title":"Robots for Elderly Care in the Home: A Landscape Analysis and Co-Design Toolkit","volume":"14","author":"Bardaro","year":"2021","journal-title":"Int. J. Soc. Robot."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Jang, J., Kim, D., Park, C., Jang, M., Lee, J., and Kim, J. (2020, January 25\u201329). ETRI-Activity3D: A Large-Scale RGB-D Dataset for Robots to Recognize Daily Activities of the Elderly. Proceedings of the 2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA.","DOI":"10.1109\/IROS45743.2020.9341160"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"602","DOI":"10.1109\/JAS.2017.7510604","article-title":"A survey of human-centered intelligent robots: Issues and challenges","volume":"4","author":"He","year":"2017","journal-title":"IEEE\/CAA J. Autom. Sin."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Quiroz, M., Pati\u00f1o, R., Diaz-Amado, J., and Cardinale, Y. (2022). Group emotion detection based on social robot perception. Sensors, 22.","DOI":"10.3390\/s22103749"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.ijhcs.2015.01.006","article-title":"Emotionally expressive dynamic physical behaviors in robots","volume":"78","author":"Bretan","year":"2015","journal-title":"Int. J. Hum. Comput. Stud."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"677590","DOI":"10.3389\/fpsyg.2021.677590","article-title":"The application of human\u2013computer interaction technology fused with artificial intelligence in sports moving target detection education for college athlete","volume":"12","author":"Liu","year":"2021","journal-title":"Front. 
Psychol."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"35","DOI":"10.3991\/ijoe.v18i12.30893","article-title":"Applying Deep Learning and Computer Vision Techniques for an e-Sport and Smart Coaching System Using a Multiview Dataset: Case of Shotokan Karate","volume":"18","author":"Aaroud","year":"2022","journal-title":"Int. J. Online Biomed. Eng."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"766","DOI":"10.1002\/tee.23113","article-title":"Detection and recognition of human body posture in motion based on sensor technology","volume":"15","author":"Zhao","year":"2020","journal-title":"IEEJ Trans. Electr. Electron. Eng."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Wang, J., Qiu, K., Peng, H., Fu, J., and Zhu, J. (2019, January 21\u201325). Ai coach: Deep human pose estimation and analysis for personalized athletic training assistance. Proceedings of the 27th ACM International Conference on Multimedia, Nice, France.","DOI":"10.1145\/3343031.3350609"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"21247","DOI":"10.3390\/s141121247","article-title":"Fast human detection for intelligent monitoring using surveillance visible sensors","volume":"14","author":"Ko","year":"2014","journal-title":"Sensors"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"8895","DOI":"10.3390\/s140508895","article-title":"A vision-based system for intelligent monitoring: Human behaviour analysis and privacy by context","volume":"14","author":"Chaaraoui","year":"2014","journal-title":"Sensors"},{"key":"ref_28","first-page":"54","article-title":"A depth video-based human detection and activity recognition using multi-features and embedded hidden Markov models for health care monitoring systems","volume":"4","author":"Jalal","year":"2017","journal-title":"Int. J. Interact. Multimed. Artif. 
Intell."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Cort\u00e9s, C., Ardanza, A., Molina-Rueda, F., Cuesta-Gomez, A., Unzueta, L., Epelde, G., Ruiz, O.E., De Mauro, A., and Florez, J. (2014). Upper limb posture estimation in robotic and virtual reality-based rehabilitation. BioMed Res. Int., 2014.","DOI":"10.1155\/2014\/821908"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"562","DOI":"10.1109\/JSAC.2020.3020600","article-title":"Remote monitoring of physical rehabilitation of stroke patients using IoT and virtual reality","volume":"39","author":"Postolache","year":"2020","journal-title":"IEEE J. Sel. Areas Commun."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"102802","DOI":"10.1016\/j.jvcir.2020.102802","article-title":"Hand pose estimation in object-interaction based on deep learning for virtual reality applications","volume":"70","author":"Wu","year":"2020","journal-title":"J. Vis. Commun. Image Represent."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Negrillo-C\u00e1rdenas, J., Jim\u00e9nez-P\u00e9rez, J.R., and Feito, F.R. (2020). The role of virtual and augmented reality in orthopedic trauma surgery: From diagnosis to rehabilitation. Comput. Methods Programs Biomed., 191.","DOI":"10.1016\/j.cmpb.2020.105407"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Lv, X., Ta, N., Chen, T., Zhao, J., and Wei, H. (2022). Analysis of Gait Characteristics of Patients with Knee Arthritis Based on Human Posture Estimation. BioMed Res. Int., 2022.","DOI":"10.1155\/2022\/7020804"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1109\/MIE.2020.2970790","article-title":"A Human-Like Traffic Scene Understanding System: A Survey","volume":"15","author":"Xia","year":"2021","journal-title":"IEEE Ind. Electron. 
Mag."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"102897","DOI":"10.1016\/j.cviu.2019.102897","article-title":"Monocular human pose estimation: A survey of deep learning-based methods","volume":"192","author":"Chen","year":"2020","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"103275","DOI":"10.1016\/j.cviu.2021.103275","article-title":"A review of 3D human pose estimation algorithms for markerless motion capture","volume":"212","author":"Desmarais","year":"2021","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"2887","DOI":"10.1109\/TCSVT.2019.2950449","article-title":"3D Mapping and 6D Pose Computation for Real Time Augmented Reality on Cylindrical Objects","volume":"30","author":"Tang","year":"2020","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"4606","DOI":"10.1109\/LRA.2022.3150497","article-title":"Vision-Only Robot Navigation in a Neural Radiance World","volume":"7","author":"Adamkiewicz","year":"2022","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Chen, H., Feng, R., Wu, S., Xu, H., Zhou, F., and Liu, Z. (2022). 2D Human Pose Estimation: A Survey. arXiv.","DOI":"10.1007\/s00530-022-01019-0"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s42979-022-01567-2","article-title":"PoseAnalyser: A Survey on Human Pose Estimation","volume":"4","author":"Kulkarni","year":"2023","journal-title":"SN Comput. Sci."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Zanfir, M., Leordeanu, M., and Sminchisescu, C. (2013, January 2\u20138). The Moving Pose: An Efficient 3D Kinematics Descriptor for Low-Latency Action Recognition and Detection. 
Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Australia.","DOI":"10.1109\/ICCV.2013.342"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Rutjes, H., Willemsen, M.C., and IJsselsteijn, W.A. (2019, January 4\u20139). Beyond Behavior: The Coach\u2019s Perspective on Technology in Health Coaching. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK.","DOI":"10.1145\/3290605.3300900"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"1949","DOI":"10.1519\/JSC.0000000000003027","article-title":"Is What You See What You Get? Perceptions of Personal Trainers\u2019 Competence, Knowledge, and Preferred Sex of Personal Trainer Relative to Physique","volume":"35","author":"Boerner","year":"2019","journal-title":"J. Strength Cond. Res."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Deng, X., Xiang, Y., Mousavian, A., Eppner, C., Bretl, T., and Fox, D. (August, January 31). Self-supervised 6D Object Pose Estimation for Robot Manipulation. Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9196714"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"488","DOI":"10.1016\/j.neucom.2021.12.059","article-title":"Human pose estimation for mitigating false negatives in weapon detection in video-surveillance","volume":"489","author":"Lamas","year":"2022","journal-title":"Neurocomputing"},{"key":"ref_46","unstructured":"Thyagarajmurthy, A., Ninad, M.G., Rakesh, B., Niranjan, S.K., and Manvi, B. (2019). Lecture Notes in Electrical Engineering, Springer."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Guo, Y., Chen, Y., Deng, J., Li, S., and Zhou, H. (2023). Identity-Preserved Human Posture Detection in Infrared Thermal Images: A Benchmark. Sensors, 23.","DOI":"10.3390\/s23010092"},{"key":"ref_48","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). 
Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), San Diego, CA, USA."},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Wang, Q.J., and Zhang, R.B. (2008, January 21\u201322). LPP-HOG: A new local image descriptor for fast human detection. Proceedings of the 2008 IEEE International Symposium on Knowledge Acquisition and Modeling Workshop, Wuhan, China.","DOI":"10.1109\/KAMW.2008.4810570"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Shen, J., Sun, C., Yang, W., and Sun, Z. (June, January 29). Fast human detection based on enhanced variable size HOG features. Proceedings of the Advances in Neural Networks\u2013ISNN 2011: 8th International Symposium on Neural Networks, ISNN 2011, Guilin, China. Proceedings, Part II 8.","DOI":"10.1007\/978-3-642-21090-7_40"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Wang, X., Han, T.X., and Yan, S. (October, January 29). An HOG-LBP human detector with partial occlusion handling. Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan.","DOI":"10.1109\/ICCV.2009.5459207"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"773","DOI":"10.1016\/j.sigpro.2010.08.010","article-title":"Efficient HOG human detection","volume":"91","author":"Pang","year":"2011","journal-title":"Signal Process."},{"key":"ref_53","first-page":"778","article-title":"Human detection in images via piecewise linear support vector machines","volume":"22","author":"Ye","year":"2012","journal-title":"IEEE Trans. Image Process."},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_56","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_57","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Chen, Q., Wang, Y., Yang, T., Zhang, X., Cheng, J., and Sun, J. (2021, January 20\u201325). You only look one-level feature. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01284"},{"key":"ref_59","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). Yolox: Exceeding yolo series in 2021. arXiv."},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. 
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Nikouei, S.Y., Chen, Y., Song, S., Xu, R., Choi, B.Y., and Faughnan, T.R. (2018, January 2\u20137). Real-time human detection as an edge service enabled by a lightweight cnn. Proceedings of the 2018 IEEE International Conference on Edge Computing (EDGE), San Francisco, CA, USA.","DOI":"10.1109\/EDGE.2018.00025"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Zhao, J., Zhang, G., Tian, L., and Chen, Y.Q. (2017, January 10\u201314). Real-time human detection with depth camera via a physical radius-depth detector and a CNN descriptor. Proceedings of the 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, China.","DOI":"10.1109\/ICME.2017.8019323"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Lan, W., Dang, J., Wang, Y., and Wang, S. (2018, January 5\u20138). Pedestrian detection based on YOLO network model. Proceedings of the 2018 IEEE International Conference on Mechatronics and Automation (ICMA), Changchun, China.","DOI":"10.1109\/ICMA.2018.8484698"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Buri\u0107, M., Pobar, M., and Iva\u0161i\u0107-Kos, M. (2019, January 19\u201321). Adapting YOLO network for ball and player detection. Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods, Prague, Czech Republic.","DOI":"10.5220\/0007582008450851"},{"key":"ref_67","doi-asserted-by":"crossref","first-page":"485","DOI":"10.5755\/j01.itc.51.3.30540","article-title":"Human Detection Algorithm Based on Improved YOLO v4","volume":"51","author":"Zhou","year":"2022","journal-title":"Inf. Technol. Control"},{"key":"ref_68","doi-asserted-by":"crossref","unstructured":"Cao, Z., Simon, T., Wei, S.E., and Sheikh, Y. (2017, January 21\u201326). 
Realtime multi-person 2d pose estimation using part affinity fields. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.143"},{"key":"ref_69","doi-asserted-by":"crossref","unstructured":"Newell, A., Yang, K., and Deng, J. (2016, January 11\u201314). Stacked hourglass networks for human pose estimation. Proceedings of the Computer Vision\u2013ECCV 2016: 14th European Conference, Amsterdam, The Netherlands. Proceedings, Part VIII 14.","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"ref_70","doi-asserted-by":"crossref","first-page":"2518","DOI":"10.1007\/s10489-020-01918-7","article-title":"EfficientPose: Scalable single-person pose estimation","volume":"51","author":"Groos","year":"2021","journal-title":"Appl. Intell."},{"key":"ref_71","doi-asserted-by":"crossref","unstructured":"Wei, S.E., Ramakrishna, V., Kanade, T., and Sheikh, Y. (2016, January 27\u201330). Convolutional pose machines. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.511"},{"key":"ref_72","doi-asserted-by":"crossref","unstructured":"Sun, X., Xiao, B., Wei, F., Liang, S., and Wei, Y. (2018, January 8\u201314). Integral human pose regression. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01231-1_33"},{"key":"ref_73","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_74","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the Computer Vision\u2013ECCV 2014: 13th European Conference, Zurich, Switzerland. 
Proceedings, Part V 13.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_75","doi-asserted-by":"crossref","unstructured":"Xu, W., Xu, Y., Chang, T., and Tu, Z. (2021, January 11\u201317). Co-scale conv-attentional image transformers. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00983"},{"key":"ref_76","doi-asserted-by":"crossref","unstructured":"Wang, W., Dai, J., Chen, Z., Huang, Z., Li, Z., Zhu, X., Hu, X., Lu, T., Lu, L., and Li, H. (2023, January 18\u201322). Internimage: Exploring large-scale vision foundation models with deformable convolutions. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01385"},{"key":"ref_77","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_78","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201323). Path aggregation network for instance segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_79","doi-asserted-by":"crossref","unstructured":"Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., and Hu, Q. (2020, January 13\u201319). ECA-Net: Efficient channel attention for deep convolutional neural networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"ref_80","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). Cbam: Convolutional block attention module. 
Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_81","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_82","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). Imagenet: A large-scale hierarchical image database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_83","doi-asserted-by":"crossref","unstructured":"Qin, L., Zhou, H., Wang, Z., Deng, J., Liao, Y., and Li, S. (2022, January 4\u20137). Detection Beyond What and Where: A Benchmark for Detecting Occlusion State. Proceedings of the Pattern Recognition and Computer Vision: 5th Chinese Conference, PRCV 2022, Shenzhen, China. Proceedings, Part IV.","DOI":"10.1007\/978-3-031-18916-6_38"},{"key":"ref_84","doi-asserted-by":"crossref","unstructured":"Wu, Y., Ye, H., Yang, Y., Wang, Z., and Li, S. (2023). Liquid Content Detection in Transparent Containers: A Benchmark. Sensors, 23.","DOI":"10.3390\/s23156656"},{"key":"ref_85","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_86","doi-asserted-by":"crossref","unstructured":"Wang, W., Dai, J., Chen, Z., Huang, Z., Li, Z., Zhu, X., Hu, X., Lu, T., Lu, L., and Li, H. (2022). Internimage: Exploring large-scale vision foundation models with deformable convolutions. 
arXiv.","DOI":"10.1109\/CVPR52729.2023.01385"},{"key":"ref_87","unstructured":"Ba, J.L., Kiros, J.R., and Hinton, G.E. (2016). Layer normalization. arXiv."},{"key":"ref_88","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. arXiv."},{"key":"ref_89","unstructured":"Hendrycks, D., and Gimpel, K. (2016). Gaussian error linear units (gelus). arXiv."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/19\/8061\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T20:57:13Z","timestamp":1760129833000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/19\/8061"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,24]]},"references-count":89,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["s23198061"],"URL":"https:\/\/doi.org\/10.3390\/s23198061","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,24]]}}}