{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T17:09:48Z","timestamp":1779210588372,"version":"3.51.4"},"reference-count":47,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2024,3,6]],"date-time":"2024-03-06T00:00:00Z","timestamp":1709683200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,3,6]],"date-time":"2024-03-06T00:00:00Z","timestamp":1709683200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100019447","name":"Applied Basic Research Project of Shanxi Province, China","doi-asserted-by":"publisher","award":["202203021221145"],"award-info":[{"award-number":["202203021221145"]}],"id":[{"id":"10.13039\/501100019447","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Graduate Joint Training Demonstration Base Project of Shanxi Province,China","award":["2022JD11"],"award-info":[{"award-number":["2022JD11"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Process Lett"],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Pedestrian detection is crucial for various applications, including intelligent transportation and video surveillance systems. 
Although recent research has advanced pedestrian detection models such as the YOLO series, these models still struggle with diverse pedestrian scales, which limits their performance. To address this, we propose HF-YOLO, an advanced pedestrian detection model that targets scale variation and occlusion among pedestrians in complex scenes. In the feature fusion stage, our algorithm leverages both shallow localization information and deep semantic information: it fuses P2 layer features and adds a high-resolution detection layer, significantly improving the detection of small-scale and occluded pedestrians. To enhance feature representation, HF-YOLO incorporates the HardSwish activation function, which introduces more non-linearity and strengthens the model\u2019s ability to represent complex and discriminative features. Additionally, to address regression imbalance, a balance factor is introduced into the CIoU loss function. This modification resolves the imbalance and enhances pedestrian localization accuracy. Experimental results demonstrate the effectiveness of the proposed algorithm: HF-YOLO achieves a 3.52% increase in average precision, a 1.35% boost in accuracy, and a 4.83% enhancement in recall. 
Moreover, the algorithm maintains real-time performance with a detection time of 8.5ms, meeting the stringent requirements of real-time applications.<\/jats:p>","DOI":"10.1007\/s11063-024-11558-4","type":"journal-article","created":{"date-parts":[[2024,3,6]],"date-time":"2024-03-06T18:01:51Z","timestamp":1709748111000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["HF-YOLO: Advanced Pedestrian Detection Model with Feature Fusion and Imbalance Resolution"],"prefix":"10.1007","volume":"56","author":[{"given":"Lihu","family":"Pan","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jianzhong","family":"Diao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhengkui","family":"Wang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shouxin","family":"Peng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Cunhui","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2024,3,6]]},"reference":[{"issue":"4","key":"11558_CR1","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-022-2050-4","volume":"17","author":"M Maqsood","year":"2023","unstructured":"Maqsood M, Yasmin S, Gillani S et al (2023) An efficient deep learning-assisted person re-identification solution for intelligent video surveillance in smart cities. Front Comp Sci 17(4):174329","journal-title":"Front Comp Sci"},{"key":"11558_CR2","doi-asserted-by":"publisher","DOI":"10.1016\/j.trc.2020.102856","volume":"121","author":"S El Hamdani","year":"2020","unstructured":"El Hamdani S, Benamar N, Younis M (2020) Pedestrian support in intelligent transportation systems: challenges, solutions and open issues. 
Transp Res part C Emerg Technol 121:102856. https:\/\/doi.org\/10.1016\/j.trc.2020.102856","journal-title":"Transp Res part C Emerg Technol"},{"key":"11558_CR3","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2022.119242","volume":"213","author":"S Lee","year":"2023","unstructured":"Lee S, Lee S, Seong H et al (2023) Fallen person detection for autonomous driving. Expert Syst Appl 213:119242. https:\/\/doi.org\/10.1016\/j.eswa.2022.119242","journal-title":"Expert Syst Appl"},{"key":"11558_CR4","doi-asserted-by":"publisher","DOI":"10.1016\/j.aap.2020.105692","volume":"145","author":"K Wang","year":"2020","unstructured":"Wang K, Li G, Chen J et al (2020) The adaptability and challenges of autonomous vehicles to pedestrians in urban china. Accid Anal Prev 145:105692. https:\/\/doi.org\/10.1016\/j.aap.2020.105692","journal-title":"Accid Anal Prev"},{"key":"11558_CR5","doi-asserted-by":"publisher","first-page":"144","DOI":"10.1016\/j.neucom.2016.12.050","volume":"234","author":"J Hariyono","year":"2017","unstructured":"Hariyono J, Jo KH (2017) Detection of pedestrian crossing road: a study on pedestrian pose recognition. Neurocomputing 234:144\u2013153. https:\/\/doi.org\/10.1016\/j.neucom.2016.12.050","journal-title":"Neurocomputing"},{"key":"11558_CR6","doi-asserted-by":"crossref","unstructured":"Lee S, Rim J, Jeong B, et\u00a0al (2023) Human pose estimation in extremely low-light conditions. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 704\u2013714","DOI":"10.1109\/CVPR52729.2023.00075"},{"key":"11558_CR7","doi-asserted-by":"publisher","DOI":"10.1016\/j.aei.2021.101356","volume":"49","author":"PKY Wong","year":"2021","unstructured":"Wong PKY, Luo H, Wang M et al (2021) Recognition of pedestrian trajectories and attributes with computer vision and deep learning techniques. Adv Eng Inf 49:101356. 
https:\/\/doi.org\/10.1016\/j.aei.2021.101356","journal-title":"Adv Eng Inf"},{"key":"11558_CR8","doi-asserted-by":"crossref","unstructured":"Feng J, Wu A, Zheng WS (2023) Shape-erased feature learning for visible-infrared person re-identification. In: 2023 IEEE\/CVF conference on computer vision and pattern recognition (CVPR). IEEE, pp 22752\u201322761","DOI":"10.1109\/CVPR52729.2023.02179"},{"key":"11558_CR9","unstructured":"Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition. CVPR 2001, pp I\u2013I"},{"key":"11558_CR10","doi-asserted-by":"crossref","unstructured":"Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR\u201905), pp 886\u2013893","DOI":"10.1109\/CVPR.2005.177"},{"key":"11558_CR11","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","volume":"60","author":"DG Lowe","year":"2004","unstructured":"Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60:91\u2013110. https:\/\/doi.org\/10.1023\/B:VISI.0000029664.99615.94","journal-title":"Int J Comput Vis"},{"key":"11558_CR12","doi-asserted-by":"publisher","DOI":"10.1016\/j.infrared.2021.103694","volume":"115","author":"X Dai","year":"2021","unstructured":"Dai X, Hu J, Zhang H et al (2021) Multi-task faster R-CNN for nighttime pedestrian detection and distance estimation. Infrared Phys Technol 115:103694. https:\/\/doi.org\/10.1016\/j.infrared.2021.103694","journal-title":"Infrared Phys Technol"},{"key":"11558_CR13","doi-asserted-by":"publisher","DOI":"10.1016\/j.infrared.2021.103906","volume":"118","author":"Y Xue","year":"2021","unstructured":"Xue Y, Ju Z, Li Y et al (2021) MAF-YOLO: multi-modal attention fusion based yolo for pedestrian detection. 
Infrared Phys Technol 118:103906. https:\/\/doi.org\/10.1016\/j.infrared.2021.103906","journal-title":"Infrared Phys Technol"},{"key":"11558_CR14","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1016\/j.inffus.2023.02.014","volume":"95","author":"DK Jain","year":"2023","unstructured":"Jain DK, Zhao X, Gonz\u00e1lez-Almagro G et al (2023) Multimodal pedestrian detection using metaheuristics with deep convolutional neural network in crowded scenes. Inf Fusion 95:401\u2013414","journal-title":"Inf Fusion"},{"key":"11558_CR15","doi-asserted-by":"crossref","unstructured":"Girshick R, Donahue J, Darrell T, et\u00a0al (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580\u2013587","DOI":"10.1109\/CVPR.2014.81"},{"key":"11558_CR16","doi-asserted-by":"crossref","unstructured":"Girshick R (2015) Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 1440\u20131448","DOI":"10.1109\/ICCV.2015.169"},{"key":"11558_CR17","doi-asserted-by":"publisher","first-page":"1137","DOI":"10.1109\/tpami.2016.2577031","volume":"39","author":"S Ren","year":"2015","unstructured":"Ren S, He K, Girshick RB et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39:1137\u20131149. https:\/\/doi.org\/10.1109\/tpami.2016.2577031","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11558_CR18","doi-asserted-by":"crossref","unstructured":"Liu W, Anguelov D, Erhan D, et\u00a0al (2016) Ssd: Single shot multibox detector. 
In: Computer vision\u2013ECCV 2016: 14th European conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part I 14, pp 21\u201337","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"11558_CR19","doi-asserted-by":"crossref","unstructured":"Redmon J, Divvala S, Girshick R, et\u00a0al (2016) You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779\u2013788","DOI":"10.1109\/CVPR.2016.91"},{"key":"11558_CR20","doi-asserted-by":"crossref","unstructured":"Redmon J, Farhadi A (2017) Yolo9000: better, faster, stronger. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7263\u20137271","DOI":"10.1109\/CVPR.2017.690"},{"key":"11558_CR21","unstructured":"Redmon J, Farhadi A (2018) Yolov3: an incremental improvement. Preprint at https:\/\/arXiv.org\/abs\/1804.02767"},{"key":"11558_CR22","unstructured":"Zhou X, Wang D, Kr\u00e4henb\u00fchl P (2019) Objects as points. Preprint at https:\/\/arXiv.org\/abs\/1904.07850"},{"issue":"4","key":"11558_CR23","first-page":"1922","volume":"44","author":"Z Tian","year":"2020","unstructured":"Tian Z, Shen C, Chen H et al (2020) FCOS: a simple and strong anchor-free object detector. IEEE Trans Pattern Anal Mach Intell 44(4):1922\u20131933","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"5","key":"11558_CR24","doi-asserted-by":"publisher","first-page":"2019","DOI":"10.1109\/TIP.2014.2311377","volume":"23","author":"J Yu","year":"2014","unstructured":"Yu J, Rui Y, Tao D (2014) Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process 23(5):2019\u20132032","journal-title":"IEEE Trans Image Process"},{"key":"11558_CR25","doi-asserted-by":"crossref","unstructured":"Cao J, Cholakkal H, Anwer RM, et\u00a0al (2020) D2det: Towards high quality object detection and instance segmentation. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 11485\u201311494","DOI":"10.1109\/CVPR42600.2020.01150"},{"issue":"2","key":"11558_CR26","doi-asserted-by":"publisher","first-page":"563","DOI":"10.1109\/TPAMI.2019.2932058","volume":"44","author":"J Yu","year":"2019","unstructured":"Yu J, Tan M, Zhang H et al (2019) Hierarchical deep click feature prediction for fine-grained image recognition. IEEE Trans Pattern Anal Mach Intell 44(2):563\u2013578","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11558_CR27","doi-asserted-by":"crossref","unstructured":"Woo S, Park J, Lee JY, et\u00a0al (2018) Cbam: Convolutional block attention module. In: Proceedings of the European conference on computer vision (ECCV), pp 3\u201319","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"11558_CR28","unstructured":"Lv W, Xu S, Zhao Y, et\u00a0al (2023) Detrs beat yolos on real-time object detection. Preprint at https:\/\/arxiv.org\/abs\/2304.08069"},{"key":"11558_CR29","doi-asserted-by":"crossref","unstructured":"Zhang L, Lin L, Liang X, et\u00a0al (2016) Is faster R-CNN doing well for pedestrian detection? In: Computer vision\u2013ECCV 2016: 14th European conference, Amsterdam, The Netherlands, October 11\u201314, 2016, Proceedings, Part II 14, Springer, pp 443\u2013457","DOI":"10.1007\/978-3-319-46475-6_28"},{"key":"11558_CR30","doi-asserted-by":"crossref","unstructured":"Zhang S, Wen L, Bian X, et\u00a0al (2018) Occlusion-aware R-CNN: Detecting pedestrians in a crowd. In: Proceedings of the European conference on computer vision (ECCV), pp 637\u2013653","DOI":"10.1007\/978-3-030-01219-9_39"},{"key":"11558_CR31","doi-asserted-by":"crossref","unstructured":"Liu S, Huang D, Wang Y (2019) Adaptive NMS: refining pedestrian detection in a crowd. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 6459\u20136468","DOI":"10.1109\/CVPR.2019.00662"},{"key":"11558_CR32","doi-asserted-by":"crossref","unstructured":"Chu X, Zheng A, Zhang X, et\u00a0al (2020) Detection in crowded scenes: one proposal, multiple predictions. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 12214\u201312223","DOI":"10.1109\/CVPR42600.2020.01223"},{"key":"11558_CR33","doi-asserted-by":"publisher","DOI":"10.1016\/j.dsp.2021.103311","volume":"121","author":"H Xia","year":"2021","unstructured":"Xia H, Ma J, Ou J et al (2021) Pedestrian detection algorithm based on multi-scale feature extraction and attention feature fusion. Digital Signal Process 121:103311. https:\/\/doi.org\/10.1016\/j.dsp.2021.103311","journal-title":"Digital Signal Process"},{"key":"11558_CR34","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.ins.2020.10.049","volume":"550","author":"Q Li","year":"2021","unstructured":"Li Q, Qiang H, Li J (2021) Conditional random fields as message passing mechanism in anchor-free network for multi-scale pedestrian detection. Inf Sci 550:1\u201312. https:\/\/doi.org\/10.1016\/j.ins.2020.10.049","journal-title":"Inf Sci"},{"key":"11558_CR35","doi-asserted-by":"publisher","DOI":"10.1016\/j.csi.2022.103702","volume":"84","author":"M Wang","year":"2023","unstructured":"Wang M, Ma H, Liu S et al (2023) A novel small-scale pedestrian detection method base on residual block group of CenterNet. Comput Stand Interfaces 84:103702. https:\/\/doi.org\/10.1016\/j.csi.2022.103702","journal-title":"Comput Stand Interfaces"},{"issue":"2","key":"11558_CR36","doi-asserted-by":"publisher","first-page":"652","DOI":"10.1109\/TPAMI.2019.2938758","volume":"43","author":"S Gao","year":"2019","unstructured":"Gao S, Cheng MM, Zhao K et al (2019) Res2Net: a new multi-scale backbone architecture. 
IEEE Trans Pattern Anal Mach Intell 43(2):652\u2013662","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"11558_CR37","doi-asserted-by":"crossref","unstructured":"Wang CY, Bochkovskiy A, Liao HYM (2023) Yolov7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 7464\u20137475","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"11558_CR38","doi-asserted-by":"crossref","unstructured":"Lee Y, won Hwang J, Lee S, et\u00a0al (2019) An energy and GPU-computation efficient backbone network for real-time object detection. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition workshops, pp 0\u20130","DOI":"10.1109\/CVPRW.2019.00103"},{"key":"11558_CR39","doi-asserted-by":"crossref","unstructured":"Wang CY, Liao HYM, Wu YH, et\u00a0al (2020) Cspnet: a new backbone that can enhance learning capability of CNN. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition workshops, pp 390\u2013391","DOI":"10.1109\/CVPRW50498.2020.00203"},{"key":"11558_CR40","doi-asserted-by":"crossref","unstructured":"Howard A, Sandler M, Chu G, et\u00a0al (2019) Searching for mobilenetv3. In: Proceedings of the IEEE\/CVF international conference on computer vision, pp 1314\u20131324","DOI":"10.1109\/ICCV.2019.00140"},{"issue":"8","key":"11558_CR41","doi-asserted-by":"publisher","first-page":"8574","DOI":"10.1109\/TCYB.2021.3095305","volume":"52","author":"Z Zheng","year":"2021","unstructured":"Zheng Z, Wang P, Ren D et al (2021) Enhancing geometric factors in model learning and inference for object detection and instance segmentation. IEEE Trans Cybern 52(8):8574\u20138586","journal-title":"IEEE Trans Cybern"},{"key":"11558_CR42","unstructured":"Shao S, Zhao Z, Li B, et\u00a0al (2018) Crowdhuman: a benchmark for detecting human in a crowd. 
Preprint at https:\/\/arXiv.org\/abs\/1805.00123"},{"issue":"2","key":"11558_CR43","doi-asserted-by":"publisher","first-page":"380","DOI":"10.1109\/TMM.2019.2929005","volume":"22","author":"S Zhang","year":"2019","unstructured":"Zhang S, Xie Y, Wan J et al (2019) Widerperson: a diverse dataset for dense pedestrian detection in the wild. IEEE Trans Multimed 22(2):380\u2013393","journal-title":"IEEE Trans Multimed"},{"key":"11558_CR44","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1016\/j.neunet.2017.12.012","volume":"107","author":"S Elfwing","year":"2018","unstructured":"Elfwing S, Uchibe E, Doya K (2018) Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw 107:3\u201311. https:\/\/doi.org\/10.1016\/j.neunet.2017.12.012","journal-title":"Neural Netw"},{"key":"11558_CR45","unstructured":"Hendrycks D, Gimpel K (2016) Bridging nonlinearities and stochastic regularizers with gaussian error linear units. Preprint at https:\/\/arxiv.org\/abs\/1606.08415"},{"key":"11558_CR46","unstructured":"Gevorgyan Z (2022) Siou Loss: More powerful learning for bounding box regression. Preprint at https:\/\/arxiv.org\/abs\/2205.12740"},{"key":"11558_CR47","unstructured":"Tong Z, Chen Y, Xu Z, et\u00a0al (2023) Wise-iou: Bounding box regression loss with dynamic focusing mechanism. 
Preprint at https:\/\/arxiv.org\/abs\/2301.10051"}],"container-title":["Neural Processing Letters"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-024-11558-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11063-024-11558-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11063-024-11558-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,5,16]],"date-time":"2024-05-16T20:32:41Z","timestamp":1715891561000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11063-024-11558-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,3,6]]},"references-count":47,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,4]]}},"alternative-id":["11558"],"URL":"https:\/\/doi.org\/10.1007\/s11063-024-11558-4","relation":{},"ISSN":["1573-773X"],"issn-type":[{"value":"1573-773X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,3,6]]},"assertion":[{"value":"11 February 2024","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 March 2024","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}],"article-number":"90"}}