{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T14:59:26Z","timestamp":1777388366980,"version":"3.51.4"},"reference-count":37,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2023,9,30]],"date-time":"2023-09-30T00:00:00Z","timestamp":1696032000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Key R & D Plan Project Sub Project","award":["2021YFD2200304-2"],"award-info":[{"award-number":["2021YFD2200304-2"]}]},{"name":"National Key R & D Plan Project Sub Project","award":["2021C02037"],"award-info":[{"award-number":["2021C02037"]}]},{"name":"Zhejiang Province Key R & D Plan Project","award":["2021YFD2200304-2"],"award-info":[{"award-number":["2021YFD2200304-2"]}]},{"name":"Zhejiang Province Key R & D Plan Project","award":["2021C02037"],"award-info":[{"award-number":["2021C02037"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Birds play a vital role in maintaining biodiversity. Accurate identification of bird species is essential for conducting biodiversity surveys. However, fine-grained image recognition of birds encounters challenges due to large within-class differences and small inter-class differences. To solve this problem, our study took a part-based approach, dividing the identification task into two parts: part detection and identification classification. We proposed an improved bird part detection algorithm based on YOLOv5, which can handle partial overlap and complex environmental conditions between part objects. The backbone network incorporates the Res2Net-CBAM module to enhance the receptive fields of each network layer, strengthen the channel characteristics, and improve the sensitivity of the model to important information. 
Additionally, in order to boost data on features extraction and channel self-regulation, we have integrated CBAM attention mechanisms into the neck. The success rate of our suggested model, according to experimental findings, is 86.6%, 1.2% greater than the accuracy of the original model. Furthermore, when compared with other algorithms, our model\u2019s accuracy shows noticeable improvement. These results show how useful the method we suggested is for quickly and precisely recognizing different bird species.<\/jats:p>","DOI":"10.3390\/s23198204","type":"journal-article","created":{"date-parts":[[2023,10,2]],"date-time":"2023-10-02T04:39:30Z","timestamp":1696221570000},"page":"8204","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["Research on Fine-Grained Image Recognition of Birds Based on Improved YOLOv5"],"prefix":"10.3390","volume":"23","author":[{"given":"Xiaomei","family":"Yi","sequence":"first","affiliation":[{"name":"College of Mathematics & Computer Science, Zhejiang A & F University, Hangzhou 311300, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-2278-8938","authenticated-orcid":false,"given":"Cheng","family":"Qian","sequence":"additional","affiliation":[{"name":"College of Mathematics & Computer Science, Zhejiang A & F University, Hangzhou 311300, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8946-3447","authenticated-orcid":false,"given":"Peng","family":"Wu","sequence":"additional","affiliation":[{"name":"College of Mathematics & Computer Science, Zhejiang A & F University, Hangzhou 311300, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Brian Tapiwanashe","family":"Maponde","sequence":"additional","affiliation":[{"name":"College of Mathematics & Computer Science, Zhejiang A & F University, Hangzhou 311300, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tengteng","family":"Jiang","sequence":"additional","affiliation":[{"name":"College of Mathematics & Computer Science, Zhejiang A & F University, Hangzhou 311300, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Wenying","family":"Ge","sequence":"additional","affiliation":[{"name":"College of Mathematics & Computer Science, Zhejiang A & F University, Hangzhou 311300, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,9,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1038\/362149a0","article-title":"Dark habitats and bright birds illustrate the role of the environment in species divergence","volume":"362","author":"Karen","year":"1993","journal-title":"Nature"},{"key":"ref_2","unstructured":"Koskimies, P. (1989, January 24-28). Birds as a Tool in Environmental Monitoring. Proceedings of the 10th International Conference on Bird Census Work and Atlas Studies, Helsinki, Finland."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1111\/j.1600-0587.2012.07799.x","article-title":"Rapid changes in bird community composition at multiple temporal and spatial scales in response to recent climate change","volume":"36","author":"Martin","year":"2013","journal-title":"Ecography"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"349","DOI":"10.1111\/j.1365-2664.2011.02094.x","article-title":"Birds as biodiversity surrogates: Will supplementing birds with other taxa improve effectiveness?","volume":"49","author":"Frank","year":"2012","journal-title":"J. Appl. Ecol."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Ramirez, A.D.P., de la Rosa Vargas, J.I., Valdez, R.R., and Becerra, A. (2018, January 7\u20139). 
A Comparative between Mel Frequency Cepstral Coefficients (MFCC) and Inverse Mel Frequency Cepstral Coefficients (IMFCC) Features for an Automatic Bird Species Recognition System. Proceedings of the 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI), Guadalajara, Mexico.","DOI":"10.1109\/LA-CCI.2018.8625230"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Shan-shan, X., Hai-feng, X., Jiang, L., Yan, Z., and Dan-jv, L. (2021, January 8\u201310). Research on Bird Songs Recognition Based on MFCC-HMM. Proceedings of the 2021 International Conference on Computer, Control and Robotics (ICCCR), Shanghai, China.","DOI":"10.1109\/ICCCR49711.2021.9349284"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, H., Xu, Y., and Ran, J. (2022). An Efficient Model for a Vast Number of Bird Species Identification Based on Acoustic Features. Animals, 12.","DOI":"10.3390\/ani12182434"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Branson, S., Van, H.G., Belongie, S., and Perona, P. (2014, January 1\u20135). Bird Species Categorization Using Pose Normalized Deep Convolutional Nets. Proceedings of the BMVC 2014, Nottingham, UK.","DOI":"10.5244\/C.28.87"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Krause, J., Jin, H., Yang, J., and Fei-Fei, L. (2015, January 7\u201312). Fine-Grained Recognition without Part Annotations. Proceedings of the 28th IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299194"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhou, J., Wang, Y., Zhang, C., Wu, W., Ji, Y., and Zou, Y. (2022). Eyebirds: Enabling the Public to Recognize Water Birds at Hand. Animals, 12.","DOI":"10.3390\/ani12213000"},{"key":"ref_11","unstructured":"Wah, C., Branson, S., Welinder, P., and Belongie, S. (2011). 
The Caltech-UCSD Birds-200-2011 Dataset: Cns-tr-2011-001, California Institute of Technology."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Berg, T., and Belhumeur, P.N. (2013, January 23\u201328). POOF: Part-Based One-vs.-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.128"},{"key":"ref_13","unstructured":"Yao, B., Bradski, G., and Fei-Fei, L. (2012, January 16\u201321). A Codebook-Free and Annotation-Free Approach for Fine-Grained Image Categorization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA."},{"key":"ref_14","first-page":"3132","article-title":"Unsupervised Template Learning for Fine-Grained Object Recognition","volume":"2","author":"Yang","year":"2012","journal-title":"NIPS"},{"key":"ref_15","first-page":"647","article-title":"DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition","volume":"32","author":"Donahue","year":"2014","journal-title":"Int. Conf. Mach. Learn."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhang, N., Donahue, J., Girshick, R., and Darrell, T. (2014, January 6\u201312). Part-Based R-CNNs for Fine-Grained Category Detection. Proceedings of the 13th European Conference, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10590-1_54"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhang, H., Xu, T., Elhoseiny, M., Huang, X., Zhang, S., Elgammal, A., and Metaxas, D. (2016, January 27\u201330). SPDA-CNN: Unifying semantic part detection and abstraction for fine-grained recognition. Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.129"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Wang, Y.M., Choi, J., Morariu, V.I., and Davis, L.S. 
(2016, January 27\u201330). Mining discriminative triplets of patches for fine-grained classification. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.131"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1791","DOI":"10.1109\/TCYB.2018.2813971","article-title":"Deep Attention-Based Spatially Recursive Networks for Fine-Grained Visual Recognition","volume":"49","author":"Wu","year":"2019","journal-title":"IEEE Trans. Cybern."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Lam, M., Mahasseni, B., and Todorovic, S. (2017, January 21\u201326). Fine-grained recognition as hsnet search for informative image parts. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.688"},{"key":"ref_21","unstructured":"Lin, T., RoyChowdhury, A., and Maji, S. (2015, January 07\u201313). Bilinear Convolutional Neural Networks for Fine-grained Visual Recognition. Proceedings of the IEEE Transactions on Pattern Analysis & Machine Intelligence, Santiago, Chile."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Li, P., Xie, J., Wang, Q., and Zuo, W. (2017, January 22\u201329). Is Second-order Information Helpful for Large-scale Visual Recognition?. Proceedings of the International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.228"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"1345","DOI":"10.1109\/TMM.2019.2939747","article-title":"Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network","volume":"22","author":"Zhang","year":"2020","journal-title":"IEEE Trans. Multimed."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Wang, Y., Morariu, V.I., and Davis, L.S. (2018, January 18\u201322). Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah.","DOI":"10.1109\/CVPR.2018.00436"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Nepal, U., and Eslamiat, H. (2022). Comparing YOLOv3, YOLOv4 and YOLOv5 for Autonomous Landing Spot Detection in Faulty UAVs. Sensors, 22.","DOI":"10.3390\/s22020464"},{"key":"ref_26","unstructured":"Xingkui, Z., Shuchang, L., Xu, W., and Zhao, Q. (2021, January 11\u201317). TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured Scenarios. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV) Workshops, Montreal, QC, Canada."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). Cbam: Convolutional block attention module. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Hou, Q., Zhou, D., and Feng, J. (2021, January 20\u201325). Coordinate attention for efficient mobile network design. Proceedings of the 2021 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01350"},{"key":"ref_29","first-page":"652","article-title":"Res2Net: A New Multi-Scale Backbone Architecture","volume":"43","author":"Gao","year":"2021","journal-title":"Comput. Vis. Pattern Recognit."},{"key":"ref_30","unstructured":"Liu, X., Xia, T., Wang, J., Zhou, F., and Lin, Y. (2016). Fully Convolutional Attention Networks for Fine-Grained Recognition. Computer Vision and Pattern Recognition. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Wang, D., Shen, Z., Shao, J., Xue, X., and Zhang, Z. (2015, January 7\u201313). Multiple Granularity Descriptors for Fine-Grained Categorization. 
Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.276"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Fu, J., Zheng, H., and Mei, T. (2017, January 21\u201326). Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition. Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.476"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Dubey, A., Gupta, O., Guo, P., Raskar, R., Farrell, R., and Naik, N. (2018, January 8\u201314). Pairwise Confusion for Fine-Grained Visual Classification. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01258-8_5"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"119","DOI":"10.1016\/j.neucom.2020.05.049","article-title":"Stochastic region pooling: Make attention more expressive","volume":"409","author":"Luo","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., and Batra, D. (2017, January 22\u201329). Grad-cam: Visual explanations from deep networks via gradient-based localization. 
Proceedings of the 24th IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.74"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"4009","DOI":"10.1007\/s11760-023-02631-x","article-title":"Innovative local texture descriptor in joint of human-based color features for content-based image retrieval","volume":"17","author":"Karimian","year":"2023","journal-title":"Signal Image Video Process."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"66","DOI":"10.1016\/j.neucom.2016.11.030","article-title":"Effective pixel classification of Mars images based on ant colony optimization feature selection and extreme learning machine","volume":"226","author":"Rashno","year":"2017","journal-title":"Neurocomputing"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/19\/8204\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:03:23Z","timestamp":1760130203000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/19\/8204"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,30]]},"references-count":37,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["s23198204"],"URL":"https:\/\/doi.org\/10.3390\/s23198204","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,30]]}}}