{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T05:54:01Z","timestamp":1775627641477,"version":"3.50.1"},"reference-count":56,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2020,10,4]],"date-time":"2020-10-04T00:00:00Z","timestamp":1601769600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"ARC Nanocomm Hub","award":["ARC ITRH IH150100006"],"award-info":[{"award-number":["ARC ITRH IH150100006"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Robotic harvesting shows a promising aspect in future development of agricultural industry. However, there are many challenges which are still presented in the development of a fully functional robotic harvesting system. Vision is one of the most important keys among these challenges. Traditional vision methods always suffer from defects in accuracy, robustness, and efficiency in real implementation environments. In this work, a fully deep learning-based vision method for autonomous apple harvesting is developed and evaluated. The developed method includes a light-weight one-stage detection and segmentation network for fruit recognition and a PointNet to process the point clouds and estimate a proper approach pose for each fruit before grasping. Fruit recognition network takes raw inputs from RGB-D camera and performs fruit detection and instance segmentation on RGB images. The PointNet grasping network combines depth information and results from the fruit recognition as input and outputs the approach pose of each fruit for robotic arm execution. The developed vision method is evaluated on RGB-D image data which are collected from both laboratory and orchard environments. Robotic harvesting experiments in both indoor and outdoor conditions are also included to validate the performance of the developed harvesting system. Experimental results show that the developed vision method can perform highly efficient and accurate to guide robotic harvesting. Overall, the developed robotic harvesting system achieves 0.8 on harvesting success rate and cycle time is 6.5 s.<\/jats:p>","DOI":"10.3390\/s20195670","type":"journal-article","created":{"date-parts":[[2020,10,5]],"date-time":"2020-10-05T08:35:57Z","timestamp":1601886957000},"page":"5670","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":170,"title":["Real-Time Fruit Recognition and Grasping Estimation for Robotic Apple Harvesting"],"prefix":"10.3390","volume":"20","author":[{"given":"Hanwen","family":"Kang","sequence":"first","affiliation":[{"name":"Laboratory of Motion Generation and Analysis, Faculty of Engineering, Monash University, Clayton, VIC 3800, Australia"}]},{"given":"Hongyu","family":"Zhou","sequence":"additional","affiliation":[{"name":"Laboratory of Motion Generation and Analysis, Faculty of Engineering, Monash University, Clayton, VIC 3800, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8676-7821","authenticated-orcid":false,"given":"Xing","family":"Wang","sequence":"additional","affiliation":[{"name":"Laboratory of Motion Generation and Analysis, Faculty of Engineering, Monash University, Clayton, VIC 3800, Australia"}]},{"given":"Chao","family":"Chen","sequence":"additional","affiliation":[{"name":"Laboratory of Motion Generation and Analysis, Faculty of Engineering, Monash University, Clayton, VIC 3800, Australia"}]}],"member":"1968","published-online":{"date-parts":[[2020,10,4]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"35","DOI":"10.1016\/j.biosystemseng.2018.12.005","article-title":"Human\u2013robot interaction in agriculture: A survey and current challenges","volume":"179","author":"Vasconez","year":"2019","journal-title":"Biosyst. Eng."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Sa, I., Ge, Z., Dayoub, F., Upcroft, B., Perez, T., and McCool, C. (2016). Deepfruits: A fruit detection system using deep neural networks. Sensors, 16.","DOI":"10.3390\/s16081222"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"888","DOI":"10.1002\/rob.21525","article-title":"Harvesting robots for high-value crops: State-of-the-art review and challenges ahead","volume":"31","author":"Bac","year":"2014","journal-title":"J. Field Robot."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"311","DOI":"10.1016\/j.compag.2016.06.022","article-title":"A review of key techniques of vision-based control for harvesting robot","volume":"127","author":"Zhao","year":"2016","journal-title":"Comput. Electron. Agric."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lin, G., Tang, Y., Zou, X., Xiong, J., and Li, J. (2019). Guava detection and pose estimation using a low-cost RGB-D sensor in the field. Sensors, 19.","DOI":"10.3390\/s19020428"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"62151","DOI":"10.1109\/ACCESS.2020.2984556","article-title":"Visual Perception and Modelling for Autonomous Apple Harvesting","volume":"8","author":"Kang","year":"2020","journal-title":"IEEE Access"},{"key":"ref_7","unstructured":"Qi, C.R., Su, H., Mo, K., and Guibas, L.J. (2017, January 21\u201326). Pointnet: Deep learning on point sets for 3d classification and segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Vibhute, A., and Bodhe, S. (2012). Applications of image processing in agriculture: A survey. Int. J. Comput. Appl., 52.","DOI":"10.5120\/8176-1495"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Lin, G., Tang, Y., Zou, X., Xiong, J., and Fang, Y. (2019). Color-, depth-, and shape-based 3D fruit detection. Precis. Agric., 1\u201317.","DOI":"10.1007\/s11119-019-09654-w"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Lin, G., Tang, Y., Zou, X., Cheng, J., and Xiong, J. (2019). Fruit detection in natural environment using partial shape matching and probabilistic Hough transform. Precis. Agric., 1\u201318.","DOI":"10.1007\/s11119-019-09662-w"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"184","DOI":"10.1016\/j.biosystemseng.2019.04.024","article-title":"A novel image processing algorithm to separate linearly clustered kiwifruits","volume":"183","author":"Fu","year":"2019","journal-title":"Biosyst. Eng."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1504\/IJCVR.2012.046419","article-title":"Computer vision for fruit harvesting robots\u2013state of the art and challenges ahead","volume":"3","author":"Kapach","year":"2012","journal-title":"Int. J. Comput. Vis. Robot."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1016\/j.neunet.2014.09.003","article-title":"Deep learning in neural networks: An overview","volume":"61","author":"Schmidhuber","year":"2015","journal-title":"Neural Netw."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"84","DOI":"10.1109\/MSP.2017.2749125","article-title":"Advanced deep-learning techniques for salient and category-specific object detection: A survey","volume":"35","author":"Han","year":"2018","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"3212","DOI":"10.1109\/TNNLS.2018.2876865","article-title":"Object detection with deep learning: A review","volume":"30","author":"Zhao","year":"2019","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast r-cnn. Proceedings of the IEEE International Conference on Computer Vision, 2015, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_17","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster r-cnn: Towards real-time object detection with region proposal networks. Proceedings of the Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_18","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., and Berg, A.C. (2016, January 11\u201314). Ssd: Single shot multibox detector. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"105108","DOI":"10.1016\/j.compag.2019.105108","article-title":"Fast implementation of real-time fruit detection in apple orchards using deep learning","volume":"168","author":"Kang","year":"2020","journal-title":"Comput. Electron. Agric."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Bargoti, S., and Underwood, J. (June, January 29). Deep fruit detection in orchards. Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Marina Bay Sands, Singapore.","DOI":"10.1109\/ICRA.2017.7989417"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"104846","DOI":"10.1016\/j.compag.2019.06.001","article-title":"Fruit detection for strawberry harvesting robot in non-structural environment based on Mask-RCNN","volume":"163","author":"Yu","year":"2019","journal-title":"Comput. Electron. Agric."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"He, K., Gkioxari, G., Doll\u00e1r, P., and Girshick, R. (2017, January 22\u201329). Mask r-cnn. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.322"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"2327","DOI":"10.1109\/ACCESS.2019.2962513","article-title":"Improved kiwifruit detection using pre-trained VGG16 with RGB and NIR information fusion","volume":"8","author":"Liu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"417","DOI":"10.1016\/j.compag.2019.01.012","article-title":"Apple detection during different growth stages in orchards using the improved YOLO-V3 model","volume":"157","author":"Tian","year":"2019","journal-title":"Comput. Electron. Agric."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Koirala, A., Walsh, K., Wang, Z., and McCarthy, C. (2019). Deep learning for real-time fruit detection and orchard fruit load estimation: Benchmarking of \u2018MangoYOLO\u2019. Precis. Agric., 1\u201329.","DOI":"10.1007\/s11119-019-09642-0"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Kang, H., and Chen, C. (2019). Fruit detection and segmentation for apple harvesting using visual sensor in orchards. Sensors, 19.","DOI":"10.3390\/s19204599"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"105302","DOI":"10.1016\/j.compag.2020.105302","article-title":"Fruit detection, segmentation and 3d visualisation of environments in apple orchards","volume":"171","author":"Kang","year":"2020","journal-title":"Comput. Electron. Agric."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1109\/MRA.2012.2191995","article-title":"Perception, planning, and execution for mobile manipulation in unstructured environments","volume":"19","author":"Chitta","year":"2012","journal-title":"IEEE Robot. Autom. Mag. Spec. Issue Mob. Manip."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Caldera, S., Rassau, A., and Chai, D. (2018). Review of deep learning methods in robotic grasp detection. Multimodal Technol. Interact., 2.","DOI":"10.20944\/preprints201805.0484.v1"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"80","DOI":"10.1109\/MRA.2012.2206675","article-title":"Three-dimensional object recognition and 6 DoF pose estimation","volume":"19","author":"Aldoma","year":"2012","journal-title":"IEEE Robot. Autom. Mag."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1455","DOI":"10.1177\/0278364917735594","article-title":"Grasp pose detection in point clouds","volume":"36","author":"Gualtieri","year":"2017","journal-title":"Int. J. Robot. Res."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"705","DOI":"10.1177\/0278364914549607","article-title":"Deep learning for detecting robotic grasps","volume":"34","author":"Lenz","year":"2015","journal-title":"Int. J. Robot. Res."},{"key":"ref_34","unstructured":"Qi, C.R., Yi, L., Su, H., and Guibas, L.J. (2017, January 4\u20139). Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Proceedings of the Neural Information Processing Systems, Long Beach, CA, USA."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Gualtieri, M., Ten Pas, A., Saenko, K., and Platt, R. (2016, January 9\u201314). High precision grasp pose detection in dense clutter. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea.","DOI":"10.1109\/IROS.2016.7759114"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liang, H., Ma, X., Li, S., G\u00f6rner, M., Tang, S., Fang, B., Sun, F., and Zhang, J. (2019, January 20\u201324). Pointnetgpd: Detecting grasp configurations from point sets. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8794435"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1016\/j.compag.2015.01.010","article-title":"Location of apples in trees using stereoscopic vision","volume":"112","author":"Si","year":"2015","journal-title":"Comput. Electron. Agric."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Yaguchi, H., Nagahama, K., Hasegawa, T., and Inaba, M. (2016, January 9\u201314). Development of an autonomous tomato harvesting robot with rotational plucking gripper. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Korea.","DOI":"10.1109\/IROS.2016.7759122"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1186\/s40648-019-0141-2","article-title":"An automated fruit harvesting robot by using deep learning","volume":"6","author":"Onishi","year":"2019","journal-title":"Robomech J."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Lehnert, C., Sa, I., McCool, C., Upcroft, B., and Perez, T. (2016, January 16\u201321). Sweet pepper pose detection and grasping for automated crop harvesting. Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487394"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"872","DOI":"10.1109\/LRA.2017.2655622","article-title":"Autonomous sweet pepper harvesting for protected cropping systems","volume":"2","author":"Lehnert","year":"2017","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_42","unstructured":"Quigley, M., Conley, K., Gerkey, B., Faust, J., Foote, T., Leibs, J., Wheeler, R., and Ng, A.Y. (2020, October 01). ROS: An open-source Robot Operating System. Available online: https:\/\/www.willowgarage.com\/sites\/default\/files\/icraoss09-ROS.pdf."},{"key":"ref_43","unstructured":"Sucan, I.A., and Chitta, S. (2020, October 01). Moveit!. Available online: https:\/\/www.researchgate.net\/profile\/Sachin_Chitta\/publication\/254057457_MoveitROS_topics\/links\/565a2a0608aefe619b232fa8.pdf."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Beeson, P., and Ames, B. (2015, January 3\u20135). TRAC-IK: An open-source library for improved solving of generic inverse kinematics. Proceedings of the 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), Seoul, Korea.","DOI":"10.1109\/HUMANOIDS.2015.7363472"},{"key":"ref_45","first-page":"70","article-title":"Fin ray\u00ae effect inspired soft robotic gripper: From the robosoft grand challenge toward optimization","volume":"3","author":"Crooks","year":"2016","journal-title":"Front. Robot."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"056017","DOI":"10.1088\/1748-3190\/aba091","article-title":"A soft pneumatic bistable reinforced actuator bioinspired by Venus Flytrap with enhanced grasping capability","volume":"15","author":"Wang","year":"2020","journal-title":"Bioinspir. Biomimetics"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_48","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TPAMI.2017.2699184","article-title":"Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs","volume":"40","author":"Chen","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_50","unstructured":"Yu, J., Yao, J., Zhang, J., Yu, Z., and Tao, D. (2020). SPRNet: Single-Pixel Reconstruction for One-Stage Instance Segmentation. IEEE Trans. Cybern., 1\u201312."},{"key":"ref_51","unstructured":"Tzutalin (2020, September 30). LabelImg. Git Code (2015). Available online: https:\/\/github.com\/tzutalin\/labelImg."},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Goyal, P., Girshick, R., He, K., and Doll\u00e1r, P. (2017, January 22\u201329). Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.324"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Qi, C.R., Liu, W., Wu, C., Su, H., and Guibas, L.J. (2018, January 18\u201322). Frustum pointnets for 3d object detection from rgb-d data. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake, UT, USA.","DOI":"10.1109\/CVPR.2018.00102"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Xu, J., Ma, Y., He, S., and Zhu, J. (2019). 3D-GIoU: 3D Generalized Intersection over Union for Object Detection in Point Cloud. Sensors, 19.","DOI":"10.3390\/s19194093"},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"214","DOI":"10.1111\/j.1467-8659.2007.01016.x","article-title":"Efficient RANSAC for point-cloud shape detection","volume":"26","author":"Schnabel","year":"2007","journal-title":"Comput. Graph. Forum"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"1186","DOI":"10.1016\/j.patrec.2007.02.002","article-title":"The randomized-Hough-transform-based method for great-circle detection on sphere","volume":"28","author":"Torii","year":"2007","journal-title":"Pattern Recognit. Lett."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/19\/5670\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:16:29Z","timestamp":1760177789000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/20\/19\/5670"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,10,4]]},"references-count":56,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2020,10]]}},"alternative-id":["s20195670"],"URL":"https:\/\/doi.org\/10.3390\/s20195670","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,10,4]]}}}