{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,10]],"date-time":"2025-12-10T09:04:41Z","timestamp":1765357481017,"version":"build-2065373602"},"reference-count":66,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T00:00:00Z","timestamp":1700179200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["12062009","Y202352263"],"award-info":[{"award-number":["12062009","Y202352263"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Scientific Research Fund of Zhejiang Provincial Education Department","award":["12062009","Y202352263"],"award-info":[{"award-number":["12062009","Y202352263"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Symmetry"],"abstract":"<jats:p>Object detection methods based on deep learning typically require devices with ample computing capabilities, which limits their deployment in restricted environments such as those with embedded devices. To address this challenge, we propose Mini-YOLOv4, a lightweight real-time object detection network that achieves an excellent trade-off between speed and accuracy. Based on CSPDarknet-Tiny as the backbone network, we enhance the detection performance of the network in three ways. First, we use a multibranch structure embedded in an attention module for simultaneous spatial and channel attention calibration. Second, we design a group self-attention block with a symmetric structure consisting of a pair of complementary self-attention modules to mine contextual information, thereby ensuring that the detection accuracy is improved without increasing the computational cost. 
Finally, we introduce a hierarchical feature pyramid network to fully exploit multiscale feature maps and promote the extraction of fine-grained features. The experimental results demonstrate that Mini-YOLOv4 requires only 4.7 M parameters and has a billion floating point operations (BFLOPs) value of 3.1. Compared with YOLOv4-Tiny, our approach achieves a 3.2% improvement in mean average precision (mAP) for the PASCAL VOC dataset and obtains a significant improvement of 3.5% in overall detection accuracy for the MS COCO dataset. In testing with an embedded platform, Mini-YOLOv4 achieves a real-time detection speed of 25.6 FPS on the NVIDIA Jetson Nano, thus meeting the demand for real-time detection in computationally limited devices.<\/jats:p>","DOI":"10.3390\/sym15112080","type":"journal-article","created":{"date-parts":[[2023,11,17]],"date-time":"2023-11-17T09:10:38Z","timestamp":1700212238000},"page":"2080","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["A Novel Lightweight Object Detection Network with Attention Modules and Hierarchical Feature Pyramid"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3010-1622","authenticated-orcid":false,"given":"Shengying","family":"Yang","sequence":"first","affiliation":[{"name":"School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China"},{"name":"School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China"}]},{"given":"Linfeng","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, China"}]},{"given":"Junxia","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Information and Electronic Engineering, Zhejiang University of Science and Technology, Hangzhou 310023, 
China"}]},{"given":"Wuyin","family":"Jin","sequence":"additional","affiliation":[{"name":"School of Mechanical and Electrical Engineering, Lanzhou University of Technology, Lanzhou 730050, China"}]},{"given":"Yunxiang","family":"Yu","sequence":"additional","affiliation":[{"name":"Zhejiang Dingli Industry Co., Ltd., Lishui 321400, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,11,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Chen, R., Liu, Y., Zhang, M., Liu, S., Yu, B., and Tai, Y.-W. (2020, January 23\u201328). Dive deeper into box for object detection. Proceedings of the 2020 European Conference on Computer Vision (ECCV): 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58542-6_25"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Wu, Y., Chen, Y., Yuan, L., Liu, Z., Wang, L., Li, H., and Fu, Y. (2020, January 14\u201319). Rethinking classification and localization for object detection. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01020"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Qiu, H., Li, H., Wu, Q., and Shi, H. (2020, January 14\u201319). Offset bin classification network for accurate object detection. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01320"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Shi, H., Zhou, Q., Ni, Y., Wu, X., and Latecki, L.J. (2022, January 16\u201319). DPNET: Dual-path network for efficient object detection with Lightweight Self-Attention. 
Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France.","DOI":"10.1109\/ICIP46576.2022.9897803"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"104397","DOI":"10.1016\/j.engappai.2021.104397","article-title":"EEEA-Net: An early exit evolutionary neural architecture search","volume":"104","author":"Termritthikun","year":"2021","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_6","unstructured":"Sun, Z., Lin, M., Sun, X., Tan, Z., Li, H., and Jin, R. (2022, January 17\u201323). MAE-DET: Revisiting maximum entropy principle in zero-shot nas for efficient object detection. Proceedings of the 39th International Conference on Machine Learning (PMLR), Virtual."},{"key":"ref_7","unstructured":"Chen, T., Saxena, S., Li, L., Fleet, D.J., and Hinton, G. (2022). Pix2seq: A language modeling framework for object detection. arXiv."},{"key":"ref_8","unstructured":"Du, X., Zoph, B., Hung, W.-C., and Lin, T.-Y. (2021). Simple training strategies and model scaling for object detection. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2127","DOI":"10.1007\/s00371-020-01974-7","article-title":"Disam: Density independent and scale aware model for crowd counting and localization","volume":"37","author":"Khan","year":"2021","journal-title":"Vis. Comput."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"415","DOI":"10.1007\/s41095-022-0274-8","article-title":"PVT v2: Improved baselines with pyramid vision transformer","volume":"8","author":"Wang","year":"2022","journal-title":"Comput. Vis. Media"},{"key":"ref_11","unstructured":"Xin, Y., Wang, G., Mao, M., Feng, Y., Dang, Q., Ma, Y., Ding, E., and Han, S. (2021). PAFNet: An efficient anchor-free object detector guidance. 
arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"16799","DOI":"10.1109\/TITS.2021.3102266","article-title":"Generating and restoring private face images for internet of vehicles based on semantic features and adversarial examples","volume":"23","author":"Yang","year":"2022","journal-title":"IEEE Trans. Intell. Transp. Syst."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Wang, W., Xie, E., Li, X., Fan, D.-P., Song, K., Liang, D., Lu, T., Luo, P., and Shao, L. (2021, January 10\u201317). Pyramid vision transformer: A versatile backbone for dense prediction without convolutions. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00061"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Ke, W., Zhang, T., Huang, Z., Ye, Q., Liu, J., and Huang, D. (2020, January 14\u201319). Multiple anchor learning for visual object detection. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01022"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Wang, J., Zhang, W., Cao, Y., Chen, K., Pang, J., Gong, T., Shi, J., Loy, C.C., and Lin, D. (2020, January 23\u201328). Side-Aware boundary localization for more precise object detection. Proceedings of the Computer Vision\u2014ECCV 2020: 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58548-8_24"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Cao, J., Cholakkal, H., Anwer, R.M., Khan, F.S., Pang, Y., and Shao, L. (2020, January 14\u201319). D2Det: Towards high quality object detection and Instance Segmentation. 
Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01150"},{"key":"ref_17","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). MobileNets: Efficient Convolutional Neural Networks for mobile vision applications. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.-C. (2018, January 18\u201322). MobileNetV2: Inverted residuals and linear bottlenecks. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Howard, A., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q.V., and Adam, H. (2019, October 27\u2013November 2). Searching for MobileNetV3. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00140"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201322). ShuffleNet: An extremely efficient Convolutional Neural Network for mobile devices. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Ma, N., Zhang, X., Zheng, H.-T., and Sun, J. (2018, January 8\u201314). ShuffleNet V2: Practical guidelines for efficient CNN architecture design. Proceedings of the 2018 European Conference on Computer Vision (ECCV): 15th European Conference, Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_8"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Chollet, F. (2017, January 21\u201326). 
Xception: Deep learning with depthwise separable convolutions. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.195"},{"key":"ref_23","unstructured":"Tan, M., and Le, Q.V. (2019). MixConv: Mixed depthwise convolutional kernels. arXiv."},{"key":"ref_24","unstructured":"Tan, M., and Le, Q. (2019). EfficientNet: Rethinking model scaling for Convolutional Neural Networks. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A.C. (2016, January 11\u201314). SSD: Single shot multibox detector. Proceedings of the 2016 European Conference on Computer Vision (ECCV): 14th European Conference, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 21\u201326). YOLO9000: Better, faster, stronger. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_28","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv."},{"key":"ref_29","unstructured":"Bochkovskiy, A., Wang, C.-Y., and Liao, H.-Y.M. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. 
Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Lu, X., Li, Q., Li, B., and Yan, J. (2020, January 23\u201328). MimicDet: Bridging the gap between one-stage and two-stage object detection. Proceedings of the 2020 European Conference on Computer Vision (ECCV): 16th European Conference, Glasgow, UK.","DOI":"10.1007\/978-3-030-58568-6_32"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"133529","DOI":"10.1109\/ACCESS.2019.2941547","article-title":"Mini-YOLOv3: Real-Time object detector for embedded applications","volume":"7","author":"Mao","year":"2019","journal-title":"IEEE Access"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wang, C.-Y., Bochkovskiy, A., and Liao, H.-Y.M. (2021, January 20\u201325). Scaled-YOLOv4: Scaling cross stage partial network. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01283"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201322). Squeeze-and-excitation networks. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.-Y., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional block attention module. Proceedings of the 2018 European Conference on Computer Vision (ECCV): 15th European Conference, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Wang, X., Girshick, R., Gupta, A., and He, K. (2018, January 18\u201322). Non-local neural networks. 
Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00813"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Li, X., Wang, W., Hu, X., and Yang, J. (2019, January 16\u201320). Selective kernel networks. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00060"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Cao, Y., Xu, J., Lin, S., Wei, F., and Hu, H. (2019, January 27\u201328). GCNet: Non-local networks meet squeeze-excitation networks and beyond. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea.","DOI":"10.1109\/ICCVW.2019.00246"},{"key":"ref_39","unstructured":"Liu, Y., Shao, Z., Teng, Y., and Hoffmann, N. (2021). NAM: Normalization-based attention module. arXiv."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., and Hu, Q. (2020, January 14\u201319). ECA-Net: Efficient channel attention for deep convolutional neural networks. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Zhang, Q.-L., and Yang, Y.-B. (2021, January 6\u201311). SA-Net: Shuffle Attention for Deep Convolutional Neural Networks. Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada.","DOI":"10.1109\/ICASSP39728.2021.9414568"},{"key":"ref_42","unstructured":"Yang, L., Zhang, R., Li, L., and Xie, X. (2021, January 18\u201324). SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks. 
Proceedings of the International Conference on Machine Learning, Virtual."},{"key":"ref_43","unstructured":"Liu, Y., Shao, Z., and Hoffmann, N. (2021). Global attention mechanism: Retain information to enhance channel-spatial interactions. arXiv."},{"key":"ref_44","unstructured":"Liu, S., Huang, D., and Wang, Y. (2019). Learning spatial fusion for single-shot object detection. arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs","volume":"40","author":"Chen","year":"2018","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Liu, S., Huang, D., and Wang, Y. (2018, January 8\u201314). Receptive field block net for accurate and fast object detection. Proceedings of the 2018 European Conference on Computer Vision (ECCV): 15th European Conference, Munich, Germany.","DOI":"10.1007\/978-3-030-01252-6_24"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature pyramid networks for object detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018, January 18\u201322). Path aggregation network for Instance segmentation. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00913"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Huang, X., Ge, Z., Jie, Z., and Yoshie, O. (2020, January 14\u201319). NMS by representative region: Towards crowded pedestrian detection by proposal pairing. 
Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01076"},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., and Ren, D. (2020, January 7\u201312). Distance-IoU loss: Faster and better learning for bounding box regression. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.6999"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (VOC) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Zitnick, C.L., and Doll\u00e1r, P. (2014, January 6\u201312). Microsoft COCO: Common objects in context. Proceedings of the 2014 European Conference on Computer Vision (ECCV), Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_53","first-page":"14","article-title":"Nvidia makes it easy to embed AI: The Jetson nano packs a lot of machine-learning power into DIY projects\u2014[Hands on]","volume":"57","author":"Cass","year":"2020","journal-title":"IEEE Spectr."},{"key":"ref_54","unstructured":"Long, X., Deng, K., Wang, G., Zhang, Y., Dang, Q., Gao, Y., Shen, H., Ren, J., Han, S., and Ding, E. (2020). PP-YOLO: An effective and efficient implementation of object detector. arXiv."},{"key":"ref_55","unstructured":"Couturier, R., Noura, H.N., Salman, O., and Sider, A. (2021). A deep learning object detection method for an efficient clusters initialization. arXiv."},{"key":"ref_56","unstructured":"Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). YOLOX: Exceeding yolo series in 2021. arXiv."},{"key":"ref_57","unstructured":"Fu, C.-Y., Liu, W., Ranga, A., Tyagi, A., and Berg, A.C. 
(2018). DSSD: Deconvolutional single shot detector. arXiv."},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Zhang, S., Wen, L., Bian, X., Lei, Z., and Li, S.Z. (2018, January 18\u201322). Single-Shot Refinement Neural Network for Object Detection. Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00442"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Zhao, Q., Sheng, T., Wang, Y., Tang, Z., Chen, Y., Cai, L., and Ling, H. (2019, January 29\u201331). M2det: A single-shot object detector based on multi-level feature pyramid network. Proceedings of the 2019 AAAI Conference on Artificial Intelligence, Honolulu, HI, USA.","DOI":"10.1609\/aaai.v33i01.33019259"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Wang, T., Anwer, R.M., Cholakkal, H., Khan, F.S., Pang, Y., and Shao, L. (2019, October 27\u2013November 2). Learning rich features at high-speed for single-shot object detection. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00206"},{"key":"ref_61","doi-asserted-by":"crossref","first-page":"642","DOI":"10.1007\/s11263-019-01204-1","article-title":"CornerNet: Detecting objects as paired keypoints","volume":"128","author":"Law","year":"2020","journal-title":"Int. J. Comput. Vis."},{"key":"ref_62","doi-asserted-by":"crossref","unstructured":"Kim, S.-W., Kook, H.-K., Sun, J.-Y., Kang, M.-C., and Ko, S.-J. (2018, January 8\u201314). Parallel feature pyramid network for object detection. Proceedings of the European Conference on Computer Vision (ECCV): 15th European Conference, Munich, Germany.","DOI":"10.1007\/978-3-030-01228-1_15"},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"Cao, J., Pang, Y., Han, J., and Li, X. (2019, October 27\u2013November 2). Hierarchical shot detector. 
Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00980"},{"key":"ref_64","doi-asserted-by":"crossref","unstructured":"Nie, J., Anwer, R.M., Cholakkal, H., Khan, F.S., Pang, Y., and Shao, L. (2019, October 27\u2013November 2). Enriched feature guided refinement network for object detection. Proceedings of the 2019 IEEE\/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea.","DOI":"10.1109\/ICCV.2019.00963"},{"key":"ref_65","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 14\u201319). EfficientDet: Scalable and efficient object detection. Proceedings of the 2020 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_66","doi-asserted-by":"crossref","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017, January 22\u201329). Grad-CAM: Visual explanations from deep networks via gradient-based localization. 
Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.74"}],"container-title":["Symmetry"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-8994\/15\/11\/2080\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:24:41Z","timestamp":1760131481000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-8994\/15\/11\/2080"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,11,17]]},"references-count":66,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2023,11]]}},"alternative-id":["sym15112080"],"URL":"https:\/\/doi.org\/10.3390\/sym15112080","relation":{},"ISSN":["2073-8994"],"issn-type":[{"type":"electronic","value":"2073-8994"}],"subject":[],"published":{"date-parts":[[2023,11,17]]}}}