{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,17]],"date-time":"2026-02-17T15:00:53Z","timestamp":1771340453263,"version":"3.50.1"},"reference-count":52,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T00:00:00Z","timestamp":1708560000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T00:00:00Z","timestamp":1708560000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62172225"],"award-info":[{"award-number":["62172225"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["30920032201"],"award-info":[{"award-number":["30920032201"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Vis. Intell."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>It remains a challenging task to detect pedestrians in crowds and it needs more efforts to understand why the detectors fail. When we perform an error analysis based on the traditional evaluation strategy, we find that it produces many misleading false positives, which in fact cover occluded pedestrians. The reason for this is that we usually have two kinds of annotations in the dataset: regular pedestrians (detection targets) labeled by full-body boxes and ignored pedestrians (NOT detection targets) labeled by visible boxes. 
Ignored pedestrians are labeled as an additional category termed the \u201cignore region\u201d. Nevertheless, our detectors always predict a full-body box for each pedestrian. This gap results in the following case: when a detector successfully predicts a full-body box for those ignored pedestrians, a false positive is triggered due to the low overlap between the predicted full-body box and the labeled visible box for the ignored pedestrian. This becomes even more harmful as the detector improves and becomes more capable of locating occluded pedestrians. To alleviate this issue, we devise a new pedestrian detection pipeline, which considers the additional visible box at both the detection and evaluation stages. During detection, we predict an extra visible box apart from the full-body box for every instance; during evaluation, we employ visible boxes instead of full-body boxes to match the \u201cignore region\u201d. We apply the new pipeline to dozens of detection methods and validate the effectiveness of our pipeline in reducing the over-reporting of false positives and providing more reliable evaluation results.<\/jats:p>","DOI":"10.1007\/s44267-024-00036-z","type":"journal-article","created":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T04:33:08Z","timestamp":1708576388000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Towards more reliable evaluation in pedestrian detection by rethinking \u201cignore 
regions\u201d"],"prefix":"10.1007","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9956-7653","authenticated-orcid":false,"given":"Gang","family":"Li","sequence":"first","affiliation":[]},{"given":"Xiang","family":"Li","sequence":"additional","affiliation":[]},{"given":"Shanshan","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Jian","family":"Yang","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,2,22]]},"reference":[{"key":"36_CR1","first-page":"5693","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"K. Sun","year":"2019","unstructured":"Sun, K., Xiao, B., Liu, D., & Wang, J. (2019). Deep high-resolution representation learning for human pose estimation. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 5693\u20135703). Piscataway: IEEE."},{"key":"36_CR2","first-page":"10863","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"J. Li","year":"2019","unstructured":"Li, J., Wang, C., Zhu, H., Mao, Y., Fang, H.-S., & Lu, C. (2019). CrowdPose: efficient crowded scenes pose estimation and a new benchmark. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 10863\u201310872). Piscataway: IEEE."},{"key":"36_CR3","first-page":"486","volume-title":"Proceedings of the 15th European conference on computer vision","author":"Y. Shen","year":"2018","unstructured":"Shen, Y., Li, H., Yi, S., Chen, D., & Wang, X. (2018). Person re-identification with deep similarity-guided graph neural network. In V. Ferrari, M. Hebert, C. Sminchisescu, et al. (Eds.), Proceedings of the 15th European conference on computer vision (pp. 486\u2013504). Cham: Springer."},{"key":"36_CR4","first-page":"8514","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"F. 
Zheng","year":"2019","unstructured":"Zheng, F., Deng, C., Sun, X., Jiang, X., Guo, X., Yu, Z., et al. (2019). Pyramidal person re-identification via multi-loss dynamic training. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 8514\u20138522). Piscataway: IEEE."},{"key":"36_CR5","first-page":"932","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"K. Gong","year":"2017","unstructured":"Gong, K., Liang, X., Zhang, D., Shen, X., & Lin, L. (2017). Look into person: self-supervised structure-sensitive learning and a new benchmark for human parsing. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 932\u2013940). Piscataway: IEEE."},{"key":"36_CR6","first-page":"2117","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"T.-Y. Lin","year":"2017","unstructured":"Lin, T.-Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017). Feature pyramid networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2117\u20132125). Piscataway: IEEE."},{"key":"36_CR7","unstructured":"Shao, S., Zhao, Z., Li, B., Xiao, T., Yu, G., Zhang, X., et al. (2018). CrowdHuman: a benchmark for detecting human in a crowd. arXiv preprint. arXiv:1805.00123."},{"key":"36_CR8","first-page":"3213","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"S. Zhang","year":"2017","unstructured":"Zhang, S., Benenson, R., & Schiele, B. (2017). CityPersons: a diverse dataset for pedestrian detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3213\u20133221). Piscataway: IEEE."},{"issue":"4","key":"36_CR9","doi-asserted-by":"publisher","first-page":"743","DOI":"10.1109\/TPAMI.2011.155","volume":"34","author":"P. 
Dollar","year":"2011","unstructured":"Dollar, P., Wojek, C., Schiele, B., & Perona, P. (2011). Pedestrian detection: an evaluation of the state of the art. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 743\u2013761.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"36_CR10","first-page":"9759","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"S. Zhang","year":"2020","unstructured":"Zhang, S., Chi, C., Yao, Y., Lei, Z., & Li, S. Z. (2020). Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 9759\u20139768). Piscataway: IEEE."},{"key":"36_CR11","first-page":"21002","volume-title":"Proceedings of the 34th international conference on neural information processing systems","author":"X. Li","year":"2020","unstructured":"Li, X., Wang, W., Wu, L., Chen, S., Hu, X., Li, J., et al. (2020). Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection. In H. Larochelle, M. Ranzato, R. Hadsell, et al. (Eds.), Proceedings of the 34th international conference on neural information processing systems (pp. 21002\u201321012). Red Hook: Curran Associates."},{"issue":"3","key":"36_CR12","doi-asserted-by":"publisher","first-page":"1150","DOI":"10.1109\/TCSVT.2020.3000223","volume":"31","author":"Y. Jiao","year":"2021","unstructured":"Jiao, Y., Yao, H., & Xu, C. (2021). PEN: pose-embedding network for pedestrian detection. IEEE Transactions on Circuits and Systems for Video Technology, 31(3), 1150\u20131162.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"issue":"12","key":"36_CR13","doi-asserted-by":"publisher","first-page":"3608","DOI":"10.1109\/TCSVT.2018.2883558","volume":"29","author":"C. 
Lin","year":"2018","unstructured":"Lin, C., Lu, J., & Zhou, J. (2018). Multi-grained deep feature learning for robust pedestrian detection. IEEE Transactions on Circuits and Systems for Video Technology, 29(12), 3608\u20133621.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"issue":"10","key":"36_CR14","doi-asserted-by":"publisher","first-page":"3332","DOI":"10.1109\/TCSVT.2019.2913114","volume":"30","author":"X. Wang","year":"2019","unstructured":"Wang, X., Liang, C., Chen, C., Chen, J., Wang, Z., Han, Z., et al. (2019). S3d: scalable pedestrian detection via score scale surface discrimination. IEEE Transactions on Circuits and Systems for Video Technology, 30(10), 3332\u20133344.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"36_CR15","first-page":"9180","volume-title":"Proceedings of the 20th international conference on pattern recognition","author":"G. Li","year":"2021","unstructured":"Li, G., Zhang, S., & Yang, J. (2021). Nighttime pedestrian detection based on feature attention and transformation. In Proceedings of the 20th international conference on pattern recognition (pp. 9180\u20139187). Piscataway: IEEE."},{"issue":"8","key":"36_CR16","doi-asserted-by":"publisher","first-page":"2663","DOI":"10.1109\/TCSVT.2019.2924912","volume":"30","author":"X. Wang","year":"2019","unstructured":"Wang, X., Shen, C., Li, H., & Xu, S. (2019). Human detection aided by deeply learned semantic masks. IEEE Transactions on Circuits and Systems for Video Technology, 30(8), 2663\u20132673.","journal-title":"IEEE Transactions on Circuits and Systems for Video Technology"},{"key":"36_CR17","first-page":"12214","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"X. Chu","year":"2020","unstructured":"Chu, X., Zheng, A., Zhang, X., & Sun, J. (2020). Detection in crowded scenes: one proposal, multiple predictions. 
In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 12214\u201312223). Piscataway: IEEE."},{"key":"36_CR18","first-page":"2980","volume-title":"Proceedings of the IEEE international conference on computer vision","author":"T.-Y. Lin","year":"2017","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Doll\u00e1r, P. (2017). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision (pp. 2980\u20132988). Piscataway: IEEE."},{"key":"36_CR19","first-page":"11632","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"X. Li","year":"2021","unstructured":"Li, X., Wang, W., Hu, X., Li, J., Tang, J., & Yang, J. (2021). Generalized focal loss v2: learning reliable localization quality estimation for dense object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 11632\u201311641). Piscataway: IEEE."},{"key":"36_CR20","first-page":"6154","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Z. Cai","year":"2018","unstructured":"Cai, Z., & Vasconcelos, N. (2018). Cascade R-CNN: delving into high quality object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 6154\u20136162). Piscataway: IEEE."},{"key":"36_CR21","unstructured":"Vu, T., Jang, H., Pham, T. X., & Yoo, C. D. (2019). Cascade RPN: delving into high-quality region proposal network with adaptive convolution. arXiv preprint. arXiv:1909.06720."},{"key":"36_CR22","first-page":"4974","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"K. Chen","year":"2019","unstructured":"Chen, K., Pang, J., Wang, J., Xiong, Y., Li, X., Sun, S., et al. (2019). Hybrid task cascade for instance segmentation. 
In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 4974\u20134983). Piscataway: IEEE."},{"key":"36_CR23","doi-asserted-by":"publisher","first-page":"1597","DOI":"10.1145\/3394171.3413983","volume-title":"Proceedings of the 28th ACM international conference on multimedia","author":"G. Li","year":"2020","unstructured":"Li, G., Li, J., Zhang, S., & Yang, J. (2020). Learning hierarchical graph for occluded pedestrian detection. In Proceedings of the 28th ACM international conference on multimedia (pp. 1597\u20131605). New York: ACM."},{"key":"36_CR24","first-page":"135","volume-title":"Proceedings of the 15th European conference on computer vision","author":"C. Zhou","year":"2018","unstructured":"Zhou, C., & Yuan, J. (2018). Bi-box regression for pedestrian detection and occlusion estimation. In V. Ferrari, M. Hebert, C. Sminchisescu, et al. (Eds.), Proceedings of the 15th European conference on computer vision (pp. 135\u2013151). Cham: Springer."},{"key":"36_CR25","first-page":"4967","volume-title":"Proceedings of the IEEE\/CVF international conference on computer vision","author":"Y. Pang","year":"2019","unstructured":"Pang, Y., Xie, J., Khan, M.H., Anwer, R.M., Khan, F.S., & Shao, L. (2019). Mask-guided attention network for occluded pedestrian detection. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 4967\u20134975). Piscataway: IEEE."},{"key":"36_CR26","volume-title":"Proceedings of the 34th international conference on neural information processing systems","author":"Z. Xu","year":"2020","unstructured":"Xu, Z., Li, B., Yuan, Y., & Dang, A. (2020). Beta R-CNN: looking into pedestrian detection from another perspective. In H. Larochelle, M. Ranzato, R. Hadsell, et al. (Eds.), Proceedings of the 34th international conference on neural information processing systems. 
Red Hook: Curran Associates."},{"key":"36_CR27","first-page":"213","volume-title":"Proceedings of the 16th European conference on computer vision","author":"N. Carion","year":"2020","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020). End-to-end object detection with transformers. In A. Vedaldi, H. Bischof, T. Brox, et al. (Eds.), Proceedings of the 16th European conference on computer vision (pp. 213\u2013229). Cham: Springer."},{"key":"36_CR28","unstructured":"Zhu, X., Su, W., Lu, L., Li, B., Wang, X., & Dai, J. (2020). Deformable DETR: deformable transformers for end-to-end object detection. arXiv preprint. arXiv:2010.04159."},{"key":"36_CR29","unstructured":"Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., et al. (2022). Dino: DETR with improved denoising anchor boxes for end-to-end object detection. arXiv preprint. arXiv:2203.03605."},{"key":"36_CR30","first-page":"3651","volume-title":"Proceedings of the IEEE\/CVF computer vision and pattern recognition","author":"D. Meng","year":"2021","unstructured":"Meng, D., Chen, X., Fan, Z., Zeng, G., Li, H., Yuan, Y., et al. (2021). Conditional DETR for fast training convergence. In Proceedings of the IEEE\/CVF computer vision and pattern recognition (pp. 3651\u20133660). Piscataway: IEEE."},{"key":"36_CR31","first-page":"7329","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"S. Zhang","year":"2023","unstructured":"Zhang, S., Wang, X., Wang, J., Pang, J., Lyu, C., Zhang, W., et al. (2023). Dense distinct query for end-to-end object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 7329\u20137338). Piscataway: IEEE."},{"key":"36_CR32","first-page":"23809","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"J. 
Zhang","year":"2023","unstructured":"Zhang, J., Lin, X., Zhang, W., Wang, K., Tan, X., Han, J., et al. (2023). Semi-DETR: semi-supervised object detection with detection transformers. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 23809\u201323818). Piscataway: IEEE."},{"key":"36_CR33","first-page":"2567","volume-title":"Proceedings of the AAAI conference on artificial intelligence","author":"Y. Wang","year":"2022","unstructured":"Wang, Y., Zhang, X., Yang, T., & Sun, J. (2022). Anchor DETR: query design for transformer-based detector. In Proceedings of the AAAI conference on artificial intelligence (pp. 2567\u20132575). Palo Alto: AAAI Press."},{"key":"36_CR34","first-page":"185","volume-title":"Proceedings of the AAAI conference on artificial intelligence","author":"X. Cao","year":"2022","unstructured":"Cao, X., Yuan, P., Feng, B., & Niu, K. (2022). CF-DETR: coarse-to-fine transformers for end-to-end object detection. In Proceedings of the AAAI conference on artificial intelligence (pp. 185\u2013193). Palo Alto: AAAI Press."},{"key":"36_CR35","first-page":"13619","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"F. Li","year":"2022","unstructured":"Li, F., Zhang, H., Liu, S., Guo, J., Ni, L. M., & Zhang, L. (2022). DN-DETR: accelerate DETR training by introducing query denoising. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 13619\u201313627). Piscataway: IEEE."},{"key":"36_CR36","first-page":"857","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"A. Zheng","year":"2022","unstructured":"Zheng, A., Zhang, Y., Zhang, X., Qi, X., & Sun, J. (2022). Progressive end-to-end object detection in crowded scenes. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 857\u2013866). 
Piscataway: IEEE."},{"key":"36_CR37","first-page":"10750","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"X. Huang","year":"2020","unstructured":"Huang, X., Ge, Z., Jie, Z., & Yoshie, O. (2020). NMS by representative region: towards crowded pedestrian detection by proposal pairing. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 10750\u201310759). Piscataway: IEEE."},{"key":"36_CR38","first-page":"1259","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"S. Zhang","year":"2016","unstructured":"Zhang, S., Benenson, R., Omran, M., Hosang, J., & Schiele, B. (2016). How far are we from solving pedestrian detection? In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1259\u20131267). Piscataway: IEEE."},{"key":"36_CR39","first-page":"740","volume-title":"Proceedings of the 13th European conference on computer vision","author":"T.-Y. Lin","year":"2014","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., et al. (2014). Microsoft COCO: common objects in context. In D. J. Fleet, T. Pajdla, B. Schiele, et al. (Eds.), Proceedings of the 13th European conference on computer vision (pp. 740\u2013755). Cham: Springer."},{"key":"36_CR40","first-page":"9627","volume-title":"Proceedings of the IEEE\/CVF international conference on computer vision","author":"Z. Tian","year":"2019","unstructured":"Tian, Z., Shen, C., Chen, H., & He, T. (2019). FCOS: fully convolutional one-stage object detection. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 9627\u20139636). Piscataway: IEEE."},{"key":"36_CR41","unstructured":"Kang, K., & Lee, H. S. (2020). Probabilistic anchor assignment with IoU prediction for object detection. arXiv preprint. 
arXiv:2007.08103."},{"key":"36_CR42","first-page":"9657","volume-title":"Proceedings of the IEEE\/CVF international conference on computer vision","author":"Z. Yang","year":"2019","unstructured":"Yang, Z., Liu, S., Hu, H., Wang, L., & Lin, S. (2019). RepPoints: point set representation for object detection. In Proceedings of the IEEE\/CVF international conference on computer vision (pp. 9657\u20139666). Piscataway: IEEE."},{"key":"36_CR43","first-page":"260","volume-title":"Proceedings of the 16th European conference on computer vision","author":"H. Zhang","year":"2020","unstructured":"Zhang, H., Chang, H., Ma, B., Wang, N., & Chen, X. (2020). Dynamic R-CNN: towards high quality object detection via dynamic training. In A. Vedaldi, H. Bischof, T. Brox, et al. (Eds.), Proceedings of the 16th European conference on computer vision (pp. 260\u2013275). Cham: Springer."},{"key":"36_CR44","doi-asserted-by":"crossref","unstructured":"Zhang, H., Wang, Y., Dayoub, F., & S\u00fcnderhauf, N. (2020). VarifocalNet: an IoU-aware dense object detector. arXiv preprint. arXiv:2008.13367.","DOI":"10.1109\/CVPR46437.2021.00841"},{"key":"36_CR45","first-page":"821","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"J. Pang","year":"2019","unstructured":"Pang, J., Chen, K., Shi, J., Feng, H., Ouyang, W., & Lin, D. (2019). Libra R-CNN: towards balanced learning for object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 821\u2013830). Piscataway: IEEE."},{"key":"36_CR46","first-page":"2965","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"J. Wang","year":"2019","unstructured":"Wang, J., Chen, K., Yang, S., Loy, C. C., & Lin, D. (2019). Region proposal by guided anchoring. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 2965\u20132974). 
Piscataway: IEEE."},{"key":"36_CR47","first-page":"10186","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Y. Wu","year":"2020","unstructured":"Wu, Y., Chen, Y., Yuan, L., Liu, Z., Wang, L., Li, H., et al. (2020). Rethinking classification and localization for object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 10186\u201310195). Piscataway: IEEE."},{"key":"36_CR48","first-page":"11583","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Y. Cao","year":"2020","unstructured":"Cao, Y., Chen, K., Loy, C. C., & Lin, D. (2020). Prime sample attention in object detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 11583\u201311591). Piscataway: IEEE."},{"key":"36_CR49","first-page":"5187","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"W. Liu","year":"2019","unstructured":"Liu, W., Liao, S., Ren, W., Hu, W., & Yu, Y. (2019). High-level semantic feature detection: a new perspective for pedestrian detection. In Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition (pp. 5187\u20135196). Piscataway: IEEE."},{"key":"36_CR50","unstructured":"Chen, K., Wang, J., Pang, J., Cao, Y., Xiong, Y., Li, X., et al. (2019). MMDetection: open MMLab detection toolbox and benchmark. arXiv preprint. arXiv:1906.07155."},{"key":"36_CR51","first-page":"770","volume-title":"Proceedings of the IEEE conference on computer vision and pattern recognition","author":"K. He","year":"2016","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770\u2013778). Piscataway: IEEE."},{"key":"36_CR52","unstructured":"Simonyan, K., & Zisserman, A. (2014). 
Very deep convolutional networks for large-scale image recognition. arXiv preprint. arXiv:1409.1556."}],"container-title":["Visual Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44267-024-00036-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s44267-024-00036-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s44267-024-00036-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,2,22]],"date-time":"2024-02-22T05:37:50Z","timestamp":1708580270000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s44267-024-00036-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,22]]},"references-count":52,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["36"],"URL":"https:\/\/doi.org\/10.1007\/s44267-024-00036-z","relation":{},"ISSN":["2731-9008"],"issn-type":[{"value":"2731-9008","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,22]]},"assertion":[{"value":"2 July 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 January 2024","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 January 2024","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2024","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors 
declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"4"}}