{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T19:06:28Z","timestamp":1780599988985,"version":"3.54.1"},"reference-count":37,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2020,4,10]],"date-time":"2020-04-10T00:00:00Z","timestamp":1586476800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Automatic fruit detection is a very important benefit of harvesting robots. However, complicated environment conditions, such as illumination variation, branch, and leaf occlusion as well as tomato overlap, have made fruit detection very challenging. In this study, an improved tomato detection model called YOLO-Tomato is proposed for dealing with these problems, based on YOLOv3. A dense architecture is incorporated into YOLOv3 to facilitate the reuse of features and help to learn a more compact and accurate model. Moreover, the model replaces the traditional rectangular bounding box (R-Bbox) with a circular bounding box (C-Bbox) for tomato localization. The new bounding boxes can then match the tomatoes more precisely, and thus improve the Intersection-over-Union (IoU) calculation for the Non-Maximum Suppression (NMS). They also reduce prediction coordinates. An ablation study demonstrated the efficacy of these modifications. The YOLO-Tomato was compared to several state-of-the-art detection methods and it had the best detection performance.<\/jats:p>","DOI":"10.3390\/s20072145","type":"journal-article","created":{"date-parts":[[2020,4,13]],"date-time":"2020-04-13T10:41:52Z","timestamp":1586774512000},"page":"2145","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":368,"title":["YOLO-Tomato: A Robust Algorithm for Tomato Detection Based on YOLOv3"],"prefix":"10.3390","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3668-4569","authenticated-orcid":false,"given":"Guoxu","family":"Liu","sequence":"first","affiliation":[{"name":"Computer Software Institute, Weifang University of Science and Technology, Shouguang 262-700, China"},{"name":"Department of Electronics Engineering, Pusan National University, Busan 46241, Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0074-227X","authenticated-orcid":false,"given":"Joseph Christian","family":"Nouaze","sequence":"additional","affiliation":[{"name":"Department of Electronics Engineering, Pusan National University, Busan 46241, Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Philippe Lyonel","family":"Touko Mbouembe","sequence":"additional","affiliation":[{"name":"Department of Electronics Engineering, Pusan National University, Busan 46241, Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jae Ho","family":"Kim","sequence":"additional","affiliation":[{"name":"Department of Electronics Engineering, Pusan National University, Busan 46241, Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2020,4,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1016\/j.compag.2016.06.022","article-title":"A review of key techniques of vision-based control for harvesting robot","volume":"127","author":"Zhao","year":"2016","journal-title":"Comput. Electron. Agric."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1016\/j.compag.2015.05.021","article-title":"Sensors and systems for fruit detection and localization: A review","volume":"116","author":"Gongal","year":"2015","journal-title":"Comput. Electron. Agric."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1016\/j.compag.2011.11.007","article-title":"Determination of the number of green apples in RGB images recorded in orchards","volume":"81","author":"Linker","year":"2012","journal-title":"Comput. Electron. Agric."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"5684","DOI":"10.1016\/j.ijleo.2014.07.001","article-title":"Automatic method of fruit object extraction under complex agricultural background for vision system of fruit picking robot","volume":"125","author":"Wei","year":"2014","journal-title":"Optik"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"174","DOI":"10.1016\/j.biosystemseng.2013.11.007","article-title":"Vision-based localisation of mature apples in tree images using convexity","volume":"118","author":"Kelman","year":"2014","journal-title":"Biosyst. Eng."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"160","DOI":"10.1016\/j.compag.2013.11.011","article-title":"Estimating mango crop yield using image analysis using fruit at \u2018stone hardening\u2019stage and night time imaging","volume":"100","author":"Payne","year":"2014","journal-title":"Comput. Electron. Agric."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1016\/j.compag.2012.11.009","article-title":"Estimation of mango crop yield using image analysis\u2013segmentation method","volume":"91","author":"Payne","year":"2013","journal-title":"Comput. Electron. Agric."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Zhao, Y., Gong, L., Huang, Y., and Liu, C. (2016). Robust tomato recognition for robotic harvesting using feature images fusion. Sensors, 16.","DOI":"10.3390\/s16020173"},{"key":"ref_9","first-page":"115","article-title":"Identification of fruit and branch in natural scenes for citrus harvesting robot using machine vision and support vector machine","volume":"7","author":"Qiang","year":"2014","journal-title":"Int. J. Agric. Biol. Eng."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1007\/s11119-013-9323-8","article-title":"Immature peach detection in colour images acquired in natural illumination conditions using statistical classifiers and neural network","volume":"15","author":"Kurtulmus","year":"2014","journal-title":"Precis. Agric."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"12191","DOI":"10.3390\/s140712191","article-title":"On plant detection of intact tomato fruits using image analysis and machine learning methods","volume":"14","author":"Yamamoto","year":"2014","journal-title":"Sensors"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"127","DOI":"10.1016\/j.biosystemseng.2016.05.001","article-title":"Detecting tomatoes in greenhouse scenes by combining AdaBoost classifier and colour analysis","volume":"148","author":"Zhao","year":"2016","journal-title":"Biosyst. Eng."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Luo, L., Tang, Y., Zou, X., Wang, C., Zhang, P., and Feng, W. (2016). Robust grape cluster detection in a vineyard by combining the AdaBoost framework and multiple color components. Sensors, 16.","DOI":"10.3390\/s16122098"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Liu, G., Mao, S., and Kim, J.H. (2019). A mature-tomato detection algorithm using machine learning and color analysis. Sensors, 19.","DOI":"10.3390\/s19092023"},{"key":"ref_15","unstructured":"Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012, January 3\u20136). Imagenet classification with deep convolutional neural networks. Proceedings of the International Conference on Neural Information Processing Systems 25, Lake Tahoe, NV, USA."},{"key":"ref_16","unstructured":"Simonyan, K., and Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"70","DOI":"10.1016\/j.compag.2018.02.016","article-title":"Deep learning in agriculture: A survey","volume":"147","author":"Kamilaris","year":"2018","journal-title":"Comput. Electron. Agric."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Sa, I., Ge, Z., Dayoub, F., Upcroft, B., Perez, T., and McCool, C. (2016). Deepfruits: A fruit detection system using deep neural networks. Sensors, 16.","DOI":"10.3390\/s16081222"},{"key":"ref_19","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster r-cnn: Towards real-time object detection with region proposal networks. Proceedings of the International Conference on Neural Information Processing Systems 28, Montreal, QC, Canada."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Bargoti, S., and Underwood, J. (2017, January 3). Deep fruit detection in orchards. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore.","DOI":"10.1109\/ICRA.2017.7989417"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Rahnemoonfar, M., and Sheppard, C. (2017). Deep count: Fruit counting based on deep simulated learning. Sensors, 17.","DOI":"10.3390\/s17040905"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A.A. (2017, January 9). Inception-v4, inception-resnet and the impact of residual connections on learning. Proceedings of the Thirty-first AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Redmon, J., and Farhadi, A. (2017, January 26). YOLO9000: Better, faster, stronger. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.690"},{"key":"ref_25","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 26). Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_28","unstructured":"Ioffe, S., and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 30). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Doll\u00e1r, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 26). Feature pyramid networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_31","unstructured":"Glorot, X., Bordes, A., and Bengio, Y. (2011, January 13). Deep sparse rectifier neural networks. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, Ft. Lauderdale, FL, USA."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1627","DOI":"10.1109\/TPAMI.2009.167","article-title":"Object detection with discriminatively trained part-based models","volume":"32","author":"Felzenszwalb","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 1). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"303","DOI":"10.1007\/s11263-009-0275-4","article-title":"The pascal visual object classes (voc) challenge","volume":"88","author":"Everingham","year":"2010","journal-title":"Int. J. Comput. Vis."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6). Microsoft coco: Common objects in context. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"80","DOI":"10.2307\/3001968","article-title":"Individual comparisons by ranking methods","volume":"1","author":"Wilcoxon","year":"1945","journal-title":"Biom. Bull."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/7\/2145\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T09:17:20Z","timestamp":1760174240000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/7\/2145"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,4,10]]},"references-count":37,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2020,4]]}},"alternative-id":["s20072145"],"URL":"https:\/\/doi.org\/10.3390\/s20072145","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,4,10]]}}}