{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,1]],"date-time":"2026-02-01T04:01:55Z","timestamp":1769918515619,"version":"3.49.0"},"reference-count":34,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2023,3,9]],"date-time":"2023-03-09T00:00:00Z","timestamp":1678320000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Fondazione Caritro (Trento, Italy)"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Horticulture and agriculture are considered important pillars of any economy. Recent technological advancements have led to the development of several new technologies that are useful in automating the agricultural process. Apple farming plays a significant role in Italy\u2019s agricultural sector, where manual labor is widely employed for apple picking and could be replaced by automated robotic mechanisms. These mechanisms, however, rely on computer vision methods that focus on detecting, localizing, and tracking apples in given video frames. Appropriate actions can then be taken to enhance production and harvesting. Several techniques have been presented for apple detection, but complex backgrounds, noise, and image blur are the major factors that can degrade system performance. Thus, in this work, we present a deep learning-based scheme that uses the Yolov5 architecture to detect apples in live apple farm images. We further improve the Yolov5 architecture by incorporating an adaptive pooling scheme and an attribute augmentation model. This model detects smaller objects and improves the feature quality for detecting apples in complex backgrounds. Moreover, a loss function is incorporated to obtain accurate bounding boxes, which helps to maximize the detection accuracy. 
The comparative study shows that the proposed approach with the improved Yolov5 architecture achieves overall accuracy of 0.97, 0.99, and 0.98 in terms of precision, recall, and F1-score, respectively.<\/jats:p>","DOI":"10.3390\/rs15061516","type":"journal-article","created":{"date-parts":[[2023,3,10]],"date-time":"2023-03-10T01:31:41Z","timestamp":1678411901000},"page":"1516","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":64,"title":["Deep Learning-Based Apple Detection with Attention Module and Improved Loss Function in YOLO"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-0097-0317","authenticated-orcid":false,"given":"Praveen Kumar","family":"Sekharamantry","sequence":"first","affiliation":[{"name":"Department of Information Engineering and Computer Science, University of Trento, 38123 Trento, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9745-3732","authenticated-orcid":false,"given":"Farid","family":"Melgani","sequence":"additional","affiliation":[{"name":"Department of Information Engineering and Computer Science, University of Trento, 38123 Trento, Italy"}]},{"given":"Jonni","family":"Malacarne","sequence":"additional","affiliation":[{"name":"Department of Information Engineering and Computer Science, University of Trento, 38123 Trento, Italy"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,9]]},"reference":[{"key":"ref_1","unstructured":"Zou, Z., Shi, Z., Guo, Y., and Ye, J. (2019). Object detection in 20 years: A survey. arXiv."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"9581275","DOI":"10.1155\/2019\/9581275","article-title":"Vision Based Computing Systems for Healthcare Applications","volume":"2019","author":"Murala","year":"2019","journal-title":"J. Healthc. Eng."},{"key":"ref_3","unstructured":"Chandra, A.L., Desai, S.V., Guo, W., and Balasubramanian, V.N. (2020). 
Computer vision with deep learning for plant phenotyping in agriculture: A survey. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1016\/j.compag.2019.01.012","article-title":"Apple detection during different growth stages in orchards using the improved YOLO-V3 model","volume":"157","author":"Tian","year":"2019","journal-title":"Comput. Electron. Agric."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Kuznetsova, A., Maleva, T., and Soloviev, V. (2020). Using YOLOv3 algorithm with pre-and post-processing for apple detection in fruit-harvesting robot. Agronomy, 10.","DOI":"10.3390\/agronomy10071016"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1729881420925310","DOI":"10.1177\/1729881420925310","article-title":"Apple harvesting robot under information technology: A review","volume":"17","author":"Jia","year":"2020","journal-title":"Int. J. Adv. Robot. Syst."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Jiao, Y., Luo, R., Li, Q., Deng, X., Yin, X., Ruan, C., and Jia, W. (2020). Detection and localization of overlapped fruits application in an apple harvesting robot. Electronics, 9.","DOI":"10.3390\/electronics9061023"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Li, T., Fang, W., Zhao, G., Gao, F., Wu, Z., Li, R., and Dhupia, J. (2021). An improved binocular localization method for apple based on fruit detection using deep learning. Inf. Process. Agric., in press.","DOI":"10.1016\/j.inpa.2021.12.003"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"689","DOI":"10.1016\/j.compag.2019.05.016","article-title":"Multi-modal deep learning for Fuji apple detection using RGB-D cameras and their radiometric capabilities","volume":"162","author":"Vilaplana","year":"2019","journal-title":"Comput. Electron. 
Agric."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"171","DOI":"10.1016\/j.biosystemseng.2019.08.017","article-title":"Fruit detection in an apple orchard using a mobile terrestrial laser scanner","volume":"187","author":"Gregorio","year":"2019","journal-title":"Biosyst. Eng."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Liu, Q., Zhao, X., Yang, H., Zhao, L., Ling, W., Ma, X., and Zhao, Y. (2021, January 17\u201319). Image segmentation of Huaniu apple based on pulse coupled neural network and watershed algorithm. Proceedings of the International Conference on Electronic Information Engineering and Computer Communication (EIECC 2021), Nanchang, China.","DOI":"10.1117\/12.2634516"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Zhang, C., Zou, K., and Pan, Y. (2020). A method of apple image segmentation based on color-texture fusion feature and machine learning. Agronomy, 10.","DOI":"10.3390\/agronomy10070972"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"322","DOI":"10.1007\/s13198-021-01415-1","article-title":"Development of image recognition software based on artificial intelligence algorithm for the efficient sorting of apple fruit","volume":"13","author":"Yang","year":"2022","journal-title":"Int. J. Syst. Assur. Eng. Manag."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"781","DOI":"10.1109\/LRA.2017.2651944","article-title":"Counting apples and oranges with deep learning: A data-driven approach","volume":"2","author":"Chen","year":"2017","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.compind.2018.03.010","article-title":"Apple flower detection using deep convolutional networks","volume":"99","author":"Dias","year":"2018","journal-title":"Comput. Ind."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). 
You only look once: Unified, real-time object detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_17","unstructured":"Farhadi, A., and Redmon, J. (2018). YOLOv3: An incremental improvement. arXiv."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Biffi, L.J., Mitishita, E., Liesenberg, V., Santos AA, D., Gon\u00e7alves, D.N., Estrabis, N.V., and Gon\u00e7alves, W.N. (2020). ATSS deep learning-based approach to detect apple fruits. Remote Sens., 13.","DOI":"10.3390\/rs13010054"},{"key":"ref_19","unstructured":"(2021, September 01). www.personaldrones.it. Available online: https:\/\/www.personaldrones.it\/341-mavic-3."},{"key":"ref_20","unstructured":"(2021, September 01). www.dji.com. Available online: https:\/\/www.dji.com\/it\/mavic-3."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Wang, J.L., Li, A.Y., Huang, M., Ibrahim, A.K., Zhuang, H., and Ali, A.M. (2018, January 6\u20138). Classification of white blood cells with pattern net-fused ensemble of convolutional neural networks (pecnn). Proceedings of the 2018 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), Louisville, KY, USA.","DOI":"10.1109\/ISSPIT.2018.8642630"},{"key":"ref_22","unstructured":"Brock, H., Rengot, J., and Nakadai, K. (2018, January 7\u201312). Augmenting sparse corpora for enhanced sign language recognition and generation. Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018) and the 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community, Miyazaki, Japan."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Nepal, U., and Eslamiat, H. (2022). Comparing YOLOv3, YOLOv4 and YOLOv5 for Autonomous Landing Spot Detection in Faulty UAVs. 
Sensors, 22.","DOI":"10.3390\/s22020464"},{"key":"ref_24","unstructured":"Bahdanau, D., Cho, K., and Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Ghaffarian, S., Valente, J., Van Der Voort, M., and Tekinerdogan, B. (2022). Effect of attention mechanism in deep learning-based remote sensing image processing: A systematic literature review. Remote Sens., 13.","DOI":"10.3390\/rs13152965"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"30716","DOI":"10.1109\/ACCESS.2022.3158681","article-title":"DWANet: Focus on Foreground Features for More Accurate Location","volume":"10","author":"Hu","year":"2022","journal-title":"IEEE Access"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"012021","DOI":"10.1088\/1742-6596\/1948\/1\/012021","article-title":"Research Towards Yolo-Series Algorithms: Comparison and Analysis of Object Detection Models for Real-Time UAV Applications","volume":"1948","author":"Wang","year":"2021","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 13\u201319). Efficientdet: Scalable and efficient object detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},{"key":"ref_29","unstructured":"Wolter, M., and Garcke, J. (2021, January 13\u201315). Adaptive wavelet pooling for convolutional neural networks. Proceedings of the International Conference on Artificial Intelligence and Statistics, San Diego, CA, USA."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Tsai, Y.H., Hamsici, O.C., and Yang, M.H. (2015, January 7\u201312). Adaptive region pooling for object detection. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298673"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Yang, X., and Liu, Q. (2021). Scale-sensitive feature reassembly network for pedestrian detection. Sensors, 21.","DOI":"10.3390\/s21124189"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"2863","DOI":"10.1049\/ipr2.12387","article-title":"Airport small object detection based on feature enhancement","volume":"16","author":"Zhu","year":"2021","journal-title":"IET Image Process."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"7389","DOI":"10.1109\/TIP.2020.3002345","article-title":"Foveabox: Beyound anchor-based object detection","volume":"29","author":"Kong","year":"2020","journal-title":"IEEE Trans. Image Process."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2016). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv.","DOI":"10.1109\/TPAMI.2016.2577031"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/6\/1516\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:51:30Z","timestamp":1760122290000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/6\/1516"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,9]]},"references-count":34,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2023,3]]}},"alternative-id":["rs15061516"],"URL":"https:\/\/doi.org\/10.3390\/rs15061516","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,3,9]]}}}