{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,12]],"date-time":"2025-09-12T18:31:23Z","timestamp":1757701883337},"reference-count":39,"publisher":"Walter de Gruyter GmbH","issue":"1","license":[{"start":{"date-parts":[[2022,1,1]],"date-time":"2022-01-01T00:00:00Z","timestamp":1640995200000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022,3,8]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Recent object detectors have achieved excellent performance in accuracy and speed. Even with such impressive results, the most advanced detectors are challenging in dense scenes. In this article, we analyze and find the reasons for the decrease in detection accuracy in dense scenes. We started our work in terms of region proposal and location loss. We found that low-quality proposal regions during the training process are the main factors affecting detection accuracy. To prove our research, we established and trained a dense detection model based on Cascade R-CNN. The model achieves an accuracy of mAP 0.413 on the SKU-110K sub-dataset. 
Our results show that improving the quality of proposal regions can effectively improve detection accuracy in dense scenes.<\/jats:p>","DOI":"10.1515\/comp-2022-0231","type":"journal-article","created":{"date-parts":[[2022,3,9]],"date-time":"2022-03-09T00:31:16Z","timestamp":1646785876000},"page":"75-82","source":"Crossref","is-referenced-by-count":3,"title":["A method for detecting objects in dense scenes"],"prefix":"10.1515","volume":"12","author":[{"given":"Chuanyun","family":"Xu","sequence":"first","affiliation":[{"name":"College of Computer Science and Engineering, Chongqing University of Technology , Chongqing , China"}]},{"given":"Yu","family":"Zheng","sequence":"additional","affiliation":[{"name":"College of Computer Science and Engineering, Chongqing University of Technology , Chongqing , China"}]},{"given":"Yang","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Computer and Information Science, Chongqing Normal University , Chongqing , China"}]},{"given":"Gang","family":"Li","sequence":"additional","affiliation":[{"name":"College of Computer Science and Engineering, Chongqing University of Technology , Chongqing , China"}]},{"given":"Ying","family":"Wang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Engineering, Chongqing University of Technology , Chongqing , China"}]}],"member":"374","published-online":{"date-parts":[[2022,3,8]]},"reference":[{"key":"2022081707553229446_j_comp-2022-0231_ref_001","doi-asserted-by":"crossref","unstructured":"A. Krizhevsky, I. Sutskever, and G. E. Hinton, \u201cImagenet classification with deep convolutional neural networks,\u201d Commun. ACM, vol. 60, 2017, pp. 84\u201390.","DOI":"10.1145\/3065386"},{"key":"2022081707553229446_j_comp-2022-0231_ref_002","doi-asserted-by":"crossref","unstructured":"C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. 
Anguelov, et al., Going deeper with convolutions, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015. pp. 1\u20139.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"2022081707553229446_j_comp-2022-0231_ref_003","unstructured":"K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014, arXiv:http:\/\/arXiv.org\/abs\/arXiv:1409.1556."},{"key":"2022081707553229446_j_comp-2022-0231_ref_004","doi-asserted-by":"crossref","unstructured":"K. He, X. Zhang, S. Ren, and J. Sun, \u201cDeep residual learning for image recognition,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770\u2013778.","DOI":"10.1109\/CVPR.2016.90"},{"key":"2022081707553229446_j_comp-2022-0231_ref_005","doi-asserted-by":"crossref","unstructured":"J. Deng, W. Dong, R. Socher, L-J. Li, K. Li, and L. Fei-Fei. \u201cImagenet: A large-scale hierarchical image database,\u201d Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009, p. 1.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"2022081707553229446_j_comp-2022-0231_ref_006","doi-asserted-by":"crossref","unstructured":"B. Zhou, A. Lapedriza, A. Khosla, A. Oliva, and A. Torralba, \u201cPlaces: A 10 million image database for scene recognition,\u201d IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 6, pp. 1452\u20131464, 2017.","DOI":"10.1109\/TPAMI.2017.2723009"},{"key":"2022081707553229446_j_comp-2022-0231_ref_007","doi-asserted-by":"crossref","unstructured":"R. Girshick, J. Donahue, T. Darrell, and J. Malik, \u201cRich feature hierarchies for accurate object detection and semantic segmentation,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580\u2013587.","DOI":"10.1109\/CVPR.2014.81"},{"key":"2022081707553229446_j_comp-2022-0231_ref_008","doi-asserted-by":"crossref","unstructured":"R. 
Girshick, \u201cFast r-cnn,\u201d In Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440\u20131448.","DOI":"10.1109\/ICCV.2015.169"},{"key":"2022081707553229446_j_comp-2022-0231_ref_009","unstructured":"S. Ren, K. He, R. Girshick, and J. Sun, \u201cFaster r-cnn: Towards real-time object detection with region proposal networks,\u201d Adv. Neural Inf. Process Syst., 2015, vol. 28, pp. 91\u201399."},{"key":"2022081707553229446_j_comp-2022-0231_ref_010","doi-asserted-by":"crossref","unstructured":"T. Y. Lin, P. Goyal, R. Girshick, K. He, and P. Doll\u00e1r, \u201cFocal loss for dense object detection,\u201d Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2980\u20132988.","DOI":"10.1109\/ICCV.2017.324"},{"key":"2022081707553229446_j_comp-2022-0231_ref_011","unstructured":"A. Bochkovskiy, C. Y. Wang, and H. Y. M. Liao, YOLOv4: Optimal Speed and Accuracy of Object Detection, 2020, arXiv: http:\/\/arXiv.org\/abs\/arXiv:2004.10934."},{"key":"2022081707553229446_j_comp-2022-0231_ref_012","doi-asserted-by":"crossref","unstructured":"J. Redmon, S. K. Divvala, R. B. Girshick, and A. Farhadi, \u201cYou only look once: unified, real-time object detection,\u201d In Proceedings of Conference on Computer Vision Pattern Recognition, 2016.","DOI":"10.1109\/CVPR.2016.91"},{"key":"2022081707553229446_j_comp-2022-0231_ref_013","doi-asserted-by":"crossref","unstructured":"W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, et al., \u201cSSD: Single shot multibox detector,\u201d In European Conference on Computer Vision, 2016.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"2022081707553229446_j_comp-2022-0231_ref_014","unstructured":"J. Redmon and A. Farhadi, \u201cYolov3: An incremental improvement,\u201d 2018, arXiv:http:\/\/arXiv.org\/abs\/arXiv:1804.02767."},{"key":"2022081707553229446_j_comp-2022-0231_ref_015","doi-asserted-by":"crossref","unstructured":"E. 
Goldman, R. Herzig, A. Eisenschtat, J. Goldberger, and T. Hassner, \u201cPrecise detection in densely packed scenes,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 5227\u20135236.","DOI":"10.1109\/CVPR.2019.00537"},{"key":"2022081707553229446_j_comp-2022-0231_ref_016","doi-asserted-by":"crossref","unstructured":"Z. Q. Zhao, P. Zheng, S. Xu, and X. Wu, \u201cObject detection with deep learning: A review,\u201d IEEE Trans. Neural Networks Learning Syst., vol. 30, no. 11, pp. 3212\u20133232, 2019.","DOI":"10.1109\/TNNLS.2018.2876865"},{"key":"2022081707553229446_j_comp-2022-0231_ref_017","doi-asserted-by":"crossref","unstructured":"N. Dalal and B. Triggs, \u201cHistograms of oriented gradients for human detection,\u201d In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR\u201905), vol. 1, IEEE, 2005, pp. 886\u2013893.","DOI":"10.1109\/CVPR.2005.177"},{"key":"2022081707553229446_j_comp-2022-0231_ref_018","doi-asserted-by":"crossref","unstructured":"P. F. Felzenszwalb, R. B. Girshick, and D. McAllester, \u201cCascade object detection with deformable part models,\u201d In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, 2010, pp. 2241\u20132248.","DOI":"10.1109\/CVPR.2010.5539906"},{"key":"2022081707553229446_j_comp-2022-0231_ref_019","doi-asserted-by":"crossref","unstructured":"R. Girshick, J. Donahue, T. Darrell, and J. Malik, \u201cRich feature hierarchies for accurate object detection and semantic segmentation,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580\u2013587.","DOI":"10.1109\/CVPR.2014.81"},{"key":"2022081707553229446_j_comp-2022-0231_ref_020","doi-asserted-by":"crossref","unstructured":"C. L. Zitnick and P. Doll\u00e1r, \u201cEdge boxes: locating object proposals from edges,\u201d In European Conference on Computer Vision, Cham: Springer, 2014, pp. 
391\u2013405.","DOI":"10.1007\/978-3-319-10602-1_26"},{"key":"2022081707553229446_j_comp-2022-0231_ref_021","doi-asserted-by":"crossref","unstructured":"J. R. R. Uijlings, K. E. A. Van De Sande, T. Gevers, and A. W. Smeulders, \u201cSelective search for object recognition,\u201d Int. J. Comput. Vision, vol. 104, no. 2, pp. 154\u2013171, 2013.","DOI":"10.1007\/s11263-013-0620-5"},{"key":"2022081707553229446_j_comp-2022-0231_ref_022","doi-asserted-by":"crossref","unstructured":"B. Alexe, T. Deselaers, and V. Ferrari, \u201cMeasuring the objectness of image windows,\u201d IEEE Trans. Pattern Anal. Machine Intell., vol. 34, no. 11, pp. 2189\u20132202, 2012.","DOI":"10.1109\/TPAMI.2012.28"},{"key":"2022081707553229446_j_comp-2022-0231_ref_023","unstructured":"P. Purkait, C. Zhao, and C. Zach, \u201cSPP-Net: Deep absolute pose regression with synthetic views,\u201d 2017, arXiv:http:\/\/arXiv.org\/abs\/arXiv:1712.03452."},{"key":"2022081707553229446_j_comp-2022-0231_ref_024","doi-asserted-by":"crossref","unstructured":"E. Goldman, R. Herzig, A. Eisenschtat, J. Goldberger, and T. Hassner, \u201cPrecise detection in densely packed scenes,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 5227\u20135236.","DOI":"10.1109\/CVPR.2019.00537"},{"key":"2022081707553229446_j_comp-2022-0231_ref_025","doi-asserted-by":"crossref","unstructured":"T. Y. Lin, P. Doll\u00e1r, R. Girshick, K. He, B. Hariharan, and S. Belongie, \u201cFeature pyramid networks for object detection,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2117\u20132125.","DOI":"10.1109\/CVPR.2017.106"},{"key":"2022081707553229446_j_comp-2022-0231_ref_026","doi-asserted-by":"crossref","unstructured":"H. Rezatofighi, N. Tsoi, J. Y. Gwak, A. Sadeghian, I. Reid, S. 
Savarese, \u201cGeneralized intersection over union: A metric and a loss for bounding box regression,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 658\u2013666.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"2022081707553229446_j_comp-2022-0231_ref_027","doi-asserted-by":"crossref","unstructured":"Z. Tian, C. Shen, H. Chen, and T. He, \u201cFcos: Fully convolutional one-stage object detection,\u201d In Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 9627\u20139636.","DOI":"10.1109\/ICCV.2019.00972"},{"key":"2022081707553229446_j_comp-2022-0231_ref_028","doi-asserted-by":"crossref","unstructured":"M. Everingham, L. Van Gool, C. K. I. Williams, and J. Winn, \u201cThe pascal visual object classes (voc) challenge,\u201d Int. J. Comput. Vision, vol. 88, no. 2, 303\u2013338, 2010.","DOI":"10.1007\/s11263-009-0275-4"},{"key":"2022081707553229446_j_comp-2022-0231_ref_029","doi-asserted-by":"crossref","unstructured":"T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, et al., \u201cMicrosoft coco: Common objects in context,\u201d In ECCV, 2014. p. 1.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"2022081707553229446_j_comp-2022-0231_ref_030","doi-asserted-by":"crossref","unstructured":"J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, and L. Fei-Fei, \u201cImagenet: A large-scale hierarchical image database,\u201d In 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009, pp. 248\u2013255.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"2022081707553229446_j_comp-2022-0231_ref_031","doi-asserted-by":"crossref","unstructured":"C. Arteta, V. Lempitsky, and A. Zisserman, \u201cCounting in the wild,\u201d In European Conference on Computer Vision, Cham: Springer, 2016, pp. 483\u2013498.","DOI":"10.1007\/978-3-319-46478-7_30"},{"key":"2022081707553229446_j_comp-2022-0231_ref_032","doi-asserted-by":"crossref","unstructured":"D. Onoro-Rubio and R. J. 
L\u00f3pez-Sastre, \u201cTowards perspective-free object counting with deep learning,\u201d European Conference on Computer Vision. Cham: Springer, 2016, pp. 615\u2013629.","DOI":"10.1007\/978-3-319-46478-7_38"},{"key":"2022081707553229446_j_comp-2022-0231_ref_033","doi-asserted-by":"crossref","unstructured":"X. Chu, A. Zheng, X. Zhang, and J. Sun, \u201cDetection in crowded scenes: one proposal multiple predictions,\u201d In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 12214\u201312223.","DOI":"10.1109\/CVPR42600.2020.01223"},{"key":"2022081707553229446_j_comp-2022-0231_ref_034","doi-asserted-by":"crossref","unstructured":"Z. Zheng, P. Wang, W. Liu, J. Li, R. Ye, and D. Ren, \u201cDistance-IoU loss: faster and better learning for bounding box regression,\u201d In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 7, pp. 12993\u201313000, 2020.","DOI":"10.1609\/aaai.v34i07.6999"},{"key":"2022081707553229446_j_comp-2022-0231_ref_035","doi-asserted-by":"crossref","unstructured":"B. Zoph, E. D. Cubuk, G. Ghiasi, T. Y. Lin, J. Shlens, and Q. V. Le, \u201cLearning data augmentation strategies for object detection,\u201d In European Conference on Computer Vision, Cham: Springer, 2020, pp. 566\u2013583.","DOI":"10.1007\/978-3-030-58583-9_34"},{"key":"2022081707553229446_j_comp-2022-0231_ref_036","doi-asserted-by":"crossref","unstructured":"K. Sun, B. Xiao, D. Liu, and J. Wang, \u201cDeep high-resolution representation learning for human pose estimation,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 5693\u20135703.","DOI":"10.1109\/CVPR.2019.00584"},{"key":"2022081707553229446_j_comp-2022-0231_ref_037","unstructured":"X. Long, K. Deng, G. Wang, Y. Zhang, Q. Dang, and Y. 
Gao, \u201cPP-YOLO: An effective and efficient implementation of object detector,\u201d 2020, arXiv:http:\/\/arXiv.org\/abs\/arXiv:2007.12099."},{"key":"2022081707553229446_j_comp-2022-0231_ref_038","doi-asserted-by":"crossref","unstructured":"J. Hu, L. Shen, and G. Sun, \u201cSqueeze-and-excitation networks,\u201d In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132\u20137141.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"2022081707553229446_j_comp-2022-0231_ref_039","doi-asserted-by":"crossref","unstructured":"Z. Huang, J. Wang, X. Fu, T. Yu, Y. Guo, and R. Wang, \u201cDC-SPP-YOLO: Dense connection and spatial pyramid pooling based YOLO for object detection,\u201d Inform. Sci., vol. 522, pp. 241\u2013258, 2020.","DOI":"10.1016\/j.ins.2020.02.067"}],"container-title":["Open Computer Science"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/comp-2022-0231\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/comp-2022-0231\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T23:18:31Z","timestamp":1726787911000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.degruyter.com\/document\/doi\/10.1515\/comp-2022-0231\/html"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,1]]},"references-count":39,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,3,16]]},"published-print":{"date-parts":[[2022,3,16]]}},"alternative-id":["10.1515\/comp-2022-0231"],"URL":"https:\/\/doi.org\/10.1515\/comp-2022-0231","relation":{},"ISSN":["2299-1093"],"issn-type":[{"type":"electronic","value":"2299-1093"}],"subject":[],"published":{"date-parts":[[2022,1,1]]}}}