{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T14:18:21Z","timestamp":1780582701390,"version":"3.54.1"},"reference-count":44,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T00:00:00Z","timestamp":1679011200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"the National Natural Science Foundation of China","award":["62001173"],"award-info":[{"award-number":["62001173"]}]},{"name":"the National Natural Science Foundation of China","award":["pdjh2022a0131"],"award-info":[{"award-number":["pdjh2022a0131"]}]},{"name":"the National Natural Science Foundation of China","award":["pdjh2023b0141"],"award-info":[{"award-number":["pdjh2023b0141"]}]},{"name":"Special Funds for the Cultivation of Guangdong College Students\u2019 Scientific and Technological Innovation (\u201dClimbing Program\u201d Special Funds)","award":["62001173"],"award-info":[{"award-number":["62001173"]}]},{"name":"Special Funds for the Cultivation of Guangdong College Students\u2019 Scientific and Technological Innovation (\u201dClimbing Program\u201d Special Funds)","award":["pdjh2022a0131"],"award-info":[{"award-number":["pdjh2022a0131"]}]},{"name":"Special Funds for the Cultivation of Guangdong College Students\u2019 Scientific and Technological Innovation (\u201dClimbing Program\u201d Special Funds)","award":["pdjh2023b0141"],"award-info":[{"award-number":["pdjh2023b0141"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Insect pests have always been one of the main hazards affecting crop yield and quality in traditional agriculture. 
An accurate and timely pest detection algorithm is essential for effective pest control; however, existing approaches suffer a sharp performance drop on the pest detection task because learning samples and models suited to small pest detection are lacking. In this paper, we study how to improve convolutional neural network (CNN) models on the Teddy Cup pest dataset and propose Yolo-Pest, a lightweight and effective detection method for small target pests in agriculture. Specifically, we tackle the problem of feature extraction in small sample learning with the proposed CAC3 module, which is built as a stacked residual structure based on the standard BottleNeck module. By applying a ConvNext module based on the vision transformer (ViT), the proposed method achieves effective feature extraction while keeping the network lightweight. Comparative experiments demonstrate the effectiveness of our approach. Our proposal achieves 91.9% mAP0.5 on the Teddy Cup pest dataset, which outperforms the Yolov5s model by nearly 8% in mAP0.5. 
It also achieves great performance on public datasets, such as IP102, with a great reduction in the number of parameters.<\/jats:p>","DOI":"10.3390\/s23063221","type":"journal-article","created":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T05:36:01Z","timestamp":1679031361000},"page":"3221","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":41,"title":["Yolo-Pest: An Insect Pest Object Detection Algorithm via CAC3 Module"],"prefix":"10.3390","volume":"23","author":[{"given":"Qiuchi","family":"Xiang","sequence":"first","affiliation":[{"name":"School of Data Science and Engineering, Xingzhi College, South China Normal University, Shanwei 516600, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaoning","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Data Science and Engineering, Xingzhi College, South China Normal University, Shanwei 516600, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Zhouxu","family":"Huang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Aberystwyth University, Aberystwyth SY23 3FL, UK"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xingming","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Electronics and Information Engineering, South China Normal University, Foshan 528000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jintao","family":"Cheng","sequence":"additional","affiliation":[{"name":"School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6038-9623","authenticated-orcid":false,"given":"Xiaoyu","family":"Tang","sequence":"additional","affiliation":[{"name":"School of Data Science and Engineering, Xingzhi College, South China Normal University, Shanwei 
516600, China"},{"name":"School of Electronics and Information Engineering, South China Normal University, Foshan 528000, China"},{"name":"School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou 510006, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,17]]},"reference":[{"key":"ref_1","first-page":"4034","article-title":"Classification of agricultural pests using dwt and back propagation neural networks","volume":"5","author":"Kandalkar","year":"2014","journal-title":"Int. J. Comput. Sci. Inf. Technol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.biosystemseng.2018.02.008","article-title":"Research on insect pest image detection and recognition based on bio-inspired methods","volume":"169","author":"Deng","year":"2018","journal-title":"Biosyst. Eng."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Wang, R., Liu, L., Xie, C., Yang, P., Li, R., and Zhou, M. (2021). AgriPest: A Large-Scale Domain-Specific Benchmark Dataset for Practical Agricultural Pest Detection in the Wild. Sensors, 21.","DOI":"10.3390\/s21051601"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"3118","DOI":"10.3390\/s120303118","article-title":"Feasibility Study on a Portable Field Pest Classification System Design Based on DSP and 3G Wireless Communication Technology","volume":"12","author":"Han","year":"2012","journal-title":"Sensors"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Aladhadh, S., Habib, S., Islam, M., Aloraini, M., Aladhadh, M., and Al-Rawashdeh, H.S. (2022). An Efficient Pest Detection Framework with a Medium-Scale Benchmark to Increase the Agricultural Productivity. Sensors, 22.","DOI":"10.3390\/s22249749"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Li, C., Zhen, T., and Li, Z. (2022). Image classification of pests with residual neural network based on transfer learning. Appl. 
Sci., 12.","DOI":"10.3390\/app12094356"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s00170-021-08614-x","article-title":"A new lightweight deep neural network for surface scratch detection","volume":"123","author":"Li","year":"2022","journal-title":"Int. J. Adv. Manuf. Technol."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., and Xie, S. (2022, January 18\u201324). A convnet for the 2020s. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01167"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_10","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_11","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_12","unstructured":"Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., and Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50\u00d7 fewer parameters and <0.5 MB model size. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201323). Shufflenet: An extremely efficient convolutional neural network for mobile devices. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Ma, N., Zhang, X., Zheng, H.T., and Sun, J. (2018, January 8\u201314). Shufflenet v2: Practical guidelines for efficient cnn architecture design. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_8"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_17","first-page":"1","article-title":"Faster r-cnn: Towards real-time object detection with region proposal networks","volume":"28","author":"Ren","year":"2015","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_18","first-page":"1","article-title":"R-fcn: Object detection via region-based fully convolutional networks","volume":"29","author":"Dai","year":"2016","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Pang, J., Chen, K., Shi, J., Feng, H., Ouyang, W., and Lin, D. (2019, January 15\u201320). Libra r-cnn: Towards balanced learning for object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00091"},{"key":"ref_20","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. 
arXiv."},{"key":"ref_21","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (July, January 26). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_22","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_23","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 8\u201316). Ssd: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1904","DOI":"10.1109\/TPAMI.2015.2389824","article-title":"Spatial pyramid pooling in deep convolutional networks for visual recognition","volume":"37","author":"He","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). Cbam: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Hou, Q., Zhou, D., and Feng, J. (2021, January 20\u201325). Coordinate attention for efficient mobile network design. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01350"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Nagar, H., and Sharma, R. (2020, January 13\u201315). A comprehensive survey on pest detection techniques using image processing. Proceedings of the 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India.","DOI":"10.1109\/ICICCS48265.2020.9120889"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1016\/j.compag.2012.08.008","article-title":"Image-based orchard insect automated identification and classification method","volume":"89","author":"Wen","year":"2012","journal-title":"Comput. Electron. Agric."},{"key":"ref_31","first-page":"23","article-title":"Automatic classification of insects using color-based and shape-based descriptors","volume":"2","author":"Hassan","year":"2014","journal-title":"Int. J. Appl. Control. Electr. Electron. Eng."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Huang, X., Dong, J., Zhu, Z., Ma, D., Ma, F., and Lang, L. (2022). TSD-Truncated Structurally Aware Distance for Small Pest Object Detection. Sensors, 22.","DOI":"10.3390\/s22228691"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Liao, H.Y.M., Wu, Y.H., Chen, P.Y., Hsieh, J.W., and Yeh, I.H. (2020, January 14\u201319). CSPNet: A new backbone that can enhance learning capability of CNN. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00203"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ghiasi, G., Cui, Y., Srinivas, A., Qian, R., Lin, T.Y., Cubuk, E.D., Le, Q.V., and Zoph, B. (2021, January 20\u201325). Simple copy-paste is a strong data augmentation method for instance segmentation. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00294"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201323). Mobilenetv2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_36","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16 \u00d7 16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Wu, X., Zhan, C., Lai, Y.K., Cheng, M.M., and Yang, J. (2019, January 15\u201320). Ip102: A large-scale benchmark dataset for insect pest recognition. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00899"},{"key":"ref_38","unstructured":"Rao, Y., Zhao, W., Tang, Y., Zhou, J., Lim, S.N., and Lu, J. (2022). Hornet: Efficient high-order spatial interactions with recursive gated convolutions. arXiv."},{"key":"ref_39","unstructured":"Howard, A., Sandler, M., Chu, G., Chen, L.C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., and Vasudevan, V. (November, January 27). Searching for mobilenetv3. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Seoul, Republic of Korea."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Radosavovic, I., Kosaraju, R.P., Girshick, R., He, K., and Doll\u00e1r, P. (2020, January 13\u201319). Designing network design spaces. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01044"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Li, X., Wang, W., Hu, X., and Yang, J. (2019, January 15\u201320). Selective kernel networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00060"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., and Xu, C. (2020, January 13\u201319). Ghostnet: More features from cheap operations. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00165"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Ding, X., Zhang, X., Ma, N., Han, J., Ding, G., and Sun, J. (2021, January 20\u201325). Repvgg: Making vgg-style convnets great again. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01352"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/6\/3221\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:57:32Z","timestamp":1760122652000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/6\/3221"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,17]]},"references-count":44,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2023,3]]}},"alternative-id":["s23063221"],"URL":"https:\/\/doi.org\/10.3390\/s23063221","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,17]]}}}