{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T16:16:50Z","timestamp":1776442610515,"version":"3.51.2"},"reference-count":54,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2022,3,21]],"date-time":"2022-03-21T00:00:00Z","timestamp":1647820800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001871","name":"Funda\u00e7\u00e3o para a Ci\u00eancia e Tecnologia","doi-asserted-by":"publisher","award":["UIDB\/05757\/2020"],"award-info":[{"award-number":["UIDB\/05757\/2020"]}],"id":[{"id":"10.13039\/501100001871","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Innovation Cluster Drachten (ICD)","award":["Collaborative Connected Robots (Cobots) 2.0"],"award-info":[{"award-number":["Collaborative Connected Robots (Cobots) 2.0"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Automation"],"abstract":"<jats:p>The number of applications in which industrial robots share their working environment with people is increasing. Robots appropriate for such applications are equipped with safety systems according to ISO\/TS 15066:2016 and are often referred to as collaborative robots (cobots). Due to the nature of human-robot collaboration, the working environment of cobots is subjected to unforeseeable modifications caused by people. Vision systems are often used to increase the adaptability of cobots, but they usually require knowledge of the objects to be manipulated. The application of machine learning techniques can increase the flexibility by enabling the control system of a cobot to continuously learn and adapt to unexpected changes in the working environment. In this paper we address this issue by investigating the use of Reinforcement Learning (RL) to control a cobot to perform pick-and-place tasks. 
We present the implementation of a control system that can adapt to changes in position and enables a cobot to grasp objects which were not part of the training. Our proposed system uses deep Q-learning to process color and depth images and generates an \u03f5-greedy policy to define robot actions. The Q-values are estimated using Convolutional Neural Networks (CNNs) based on pre-trained models for feature extraction. To reduce training time, we implement a simulation environment to first train the RL agent, then we apply the resulting system on a real cobot. System performance is compared when using the pre-trained CNN models ResNext, DenseNet, MobileNet, and MNASNet. Simulation and experimental results validate the proposed approach and show that our system reaches a grasping success rate of 89.9% when manipulating a never-seen object while operating with the pre-trained CNN model MobileNet.<\/jats:p>","DOI":"10.3390\/automation3010011","type":"journal-article","created":{"date-parts":[[2022,3,21]],"date-time":"2022-03-21T13:24:59Z","timestamp":1647869099000},"page":"223-241","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":35,"title":["Reinforcement Learning for Collaborative Robots Pick-and-Place Applications: A Case Study"],"prefix":"10.3390","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2444-375X","authenticated-orcid":false,"given":"Natanael Magno","family":"Gomes","sequence":"first","affiliation":[{"name":"Sensors and Smart Systems Group, Institute of Engineering, Hanze University of Applied Sciences, 9747 AS Groningen, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1032-6162","authenticated-orcid":false,"given":"Felipe Nascimento","family":"Martins","sequence":"additional","affiliation":[{"name":"Sensors and Smart Systems Group, Institute of Engineering, Hanze University of Applied Sciences, 9747 AS Groningen, The 
Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7902-1207","authenticated-orcid":false,"given":"Jos\u00e9","family":"Lima","sequence":"additional","affiliation":[{"name":"The Research Centre in Digitalization and Intelligent Robotics (CeDRI), Instituto Polit\u00e9cnico de Bragan\u00e7a, 5300-252 Bragan\u00e7a, Portugal"},{"name":"Centre for Robotics in Industry and Intelligent Systems\u2014INESC TEC, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2263-0495","authenticated-orcid":false,"given":"Heinrich","family":"W\u00f6rtche","sequence":"additional","affiliation":[{"name":"Sensors and Smart Systems Group, Institute of Engineering, Hanze University of Applied Sciences, 9747 AS Groningen, The Netherlands"},{"name":"Department of Electrical Engineering, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands"}]}],"member":"1968","published-online":{"date-parts":[[2022,3,21]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Siciliano, B., and Khatib, O. (2016). Springer Handbook of Robotics, Springer International Publishing.","DOI":"10.1007\/978-3-319-32552-1"},{"key":"ref_2","unstructured":"(2016). Robots and Robotic Devices-Collaborative Robots (Standard No. ISO\/TS 15066)."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"989","DOI":"10.1007\/s00217-012-1844-2","article-title":"Applications of computer vision techniques in the agriculture and food industry: A review","volume":"235","author":"Gomes","year":"2012","journal-title":"Eur. Food Res. Technol."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"426","DOI":"10.1016\/j.procs.2016.03.055","article-title":"Computer Vision Based Fruit Grading System for Quality Evaluation of Tomato in Agriculture industry","volume":"Volume 79","author":"Arakeri","year":"2016","journal-title":"Procedia Computer Science"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Bhutta, M.U.M., Aslam, S., Yun, P., Jiao, J., and Liu, M. 
(January, January 24). Smart-Inspect: Micro Scale Localization and Classification of Smartphone Glass Defects for Industrial Automation. Proceedings of the 2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA.","DOI":"10.1109\/IROS45743.2020.9341509"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Saxena, A., Driemeyer, J., Kearns, J., and Ng, A.Y. (2007). Robotic grasping of novel objects. Advances in Neural Information Processing Systems, IEEE.","DOI":"10.7551\/mitpress\/7503.003.0156"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Torras, C. (1992). Computer Vision: Theory and Industrial Applications, Springer.","DOI":"10.1007\/978-3-642-48675-3"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Kumra, S., and Kanan, C. (2017, January 24\u201328). Robotic grasp detection using deep convolutional neural networks. Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Vancouver, BC, Canada.","DOI":"10.1109\/IROS.2017.8202237"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"183","DOI":"10.1177\/0278364919859066","article-title":"Learning robust, real-time, reactive robotic grasping","volume":"39","author":"Morrison","year":"2020","journal-title":"Int. J. Robot. Res."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Shafii, N., Kasaei, S.H., and Lopes, L.S. (2016, January 9\u201314). Learning to grasp familiar objects using object view recognition and template matching. Proceedings of the IEEE International Conference on Intelligent Robots and Systems, Daejeon, Korea.","DOI":"10.1109\/IROS.2016.7759448"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1721","DOI":"10.1016\/j.eswa.2012.09.010","article-title":"Neural network Reinforcement Learning for visual control of robot manipulators","volume":"40","year":"2013","journal-title":"Expert Syst. 
Appl."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Pereira, A.I., Fernandes, F.P., Coelho, J.P., Teixeira, J.P., Pacheco, M.F., Alves, P., and Lopes, R.P. (2021). Deep Reinforcement Learning Applied to a Robotic Pick-and-Place Application. Optimization, Learning Algorithms and Applications, Springer International Publishing.","DOI":"10.1007\/978-3-030-91885-9"},{"key":"ref_13","unstructured":"Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, The MIT Press. [2nd ed.]."},{"key":"ref_14","unstructured":"Saha, S. (2020, June 20). A Comprehensive Guide to Convolutional Neural Networks-Towards Data Science. Available online: https:\/\/towardsdatascience.com\/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014, January 23\u201328). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA.","DOI":"10.1109\/CVPR.2014.81"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Girshick, R. (2015, January 7\u201313). Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.169"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2017","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"386","DOI":"10.1109\/TPAMI.2018.2844175","article-title":"Mask R-CNN","volume":"42","author":"He","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_19","unstructured":"Girshick, R., Radosavovic, I., Gkioxari, G., Doll\u00e1r, P., and He, K. (2020, June 20). Detectron. Available online: https:\/\/github.com\/facebookresearch\/detectron."},{"key":"ref_20","unstructured":"Redmon, J., and Farhadi, A. (2020, June 20). YOLO: Real-Time Object Detection. Available online: https:\/\/pjreddie.com\/darknet\/yolo."},{"key":"ref_21","unstructured":"Redmon, J., and Farhadi, A. (2018). YOLOv3: An Incremental Improvement. arXiv."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1345","DOI":"10.1109\/TKDE.2009.191","article-title":"A survey on transfer learning","volume":"22","author":"Pan","year":"2009","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"7068349","DOI":"10.1155\/2018\/7068349","article-title":"Deep learning for computer vision: A brief review","volume":"2018","author":"Voulodimos","year":"2018","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_24","unstructured":"Luo, C., He, X., Zhan, J., Wang, L., Gao, W., and Dai, J. (2020). Comparison and benchmarking of ai models and frameworks on mobile devices. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Xie, S., Girshick, R., Doll\u00e1r, P., Tu, Z., and He, K. (2017, January 21\u201326). Aggregated residual transformations for deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.634"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely Connected Convolutional Networks. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_27","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). 
Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Tan, M., Chen, B., Pang, R., Vasudevan, V., Sandler, M., Howard, A., and Le, Q.V. (2019, January 15\u201320). Mnasnet: Platform-aware neural architecture search for mobile. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00293"},{"key":"ref_29","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing atari with deep reinforcement learning. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zanuttigh, P., Mutto, C.D., Minto, L., Marin, G., Dominio, F., and Cortelazzo, G.M. (2016). Time-of-Flight and Structured Light Depth Cameras: Technology and Applications, Springer International Publishing.","DOI":"10.1007\/978-3-319-30973-6"},{"key":"ref_31","unstructured":"Zhang, F., Leitner, J., Milford, M., Upcroft, B., and Corke, P. (2015). Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Rahman, M.M., Rashid, S.M.H., and Hossain, M.M. (2018). Implementation of Q learning and deep Q network for controlling a self balancing robot model. Robot. Biomim., 5.","DOI":"10.1186\/s40638-018-0091-9"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Hase, H., Azampour, M.F., Tirindelli, M., Paschali, M., Simson, W., Fatemizadeh, E., and Navab, N. (2020\u201324, January 25). Ultrasound-Guided Robotic Navigation with Deep Reinforcement Learning. Proceedings of the 2020 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA.","DOI":"10.1109\/IROS45743.2020.9340913"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Joshi, S., Kumra, S., and Sahin, F. (2020, January 20\u201321). 
Robotic grasping using deep reinforcement learning. Proceedings of the 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE), Hong Kong, China.","DOI":"10.1109\/CASE48305.2020.9216986"},{"key":"ref_35","first-page":"8026","article-title":"PyTorch: An imperative style, high-performance deep learning library","volume":"32","author":"Paszke","year":"2019","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_36","unstructured":"Gomes, N.M. (2021, January 22). Natanaelmgomes\/drl_ros: ROS Package with Webots Simulation Environment, Layer of Control and a Deep Reinforcement Learning Algorithm Using Convolutional Neural Network. Available online: https:\/\/github.com\/natanaelmgomes\/drl_ros."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1238","DOI":"10.1177\/0278364913495721","article-title":"Reinforcement learning in robotics: A survey","volume":"32","author":"Kober","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_38","unstructured":"(2022, January 03). Models and Pre-Trained Weights. Available online: https:\/\/pytorch.org\/vision\/stable\/models.html."},{"key":"ref_39","unstructured":"Webots (2020, August 18). Commercial Mobile Robot Simulation Misc. Available online: http:\/\/www.cyberbotics.com."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Ayala, A., Cruz, F., Campos, D., Rubio, R., Fernandes, B., and Dazeley, R. (2020, January 7\u201311). A comparison of humanoid robot simulators: A quantitative approach. Proceedings of the 2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), Valparaiso, Chile.","DOI":"10.1109\/ICDL-EpiRob48136.2020.9278116"},{"key":"ref_41","unstructured":"Open Robotics (2020, August 18). Robot Operating System. Available online: http:\/\/wiki.ros.org\/melodic."},{"key":"ref_42","unstructured":"Universal Robots (2020, August 05). Universal_Robots_ROS_Driver. 
Available online: https:\/\/github.com\/UniversalRobots\/Universal_Robots_ROS_Driver."},{"key":"ref_43","unstructured":"(2020, October 19). Ros-Industrial\/Robotiq: Robotiq Packages. Available online: http:\/\/wiki.ros.org\/robotiq."},{"key":"ref_44","unstructured":"(2022, January 03). Intel(R) RealSense(TM) ROS Wrapper for D400 Series, SR300 Camera and T265 Tracking Module: IntelRealSense\/realsense-ros. Available online: https:\/\/github.com\/IntelRealSense\/realsense-ros."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Rajeswaran, A., Kumar, V., Gupta, A., Vezzani, G., Schulman, J., Todorov, E., and Levine, S. (2017). Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations. Technical Report. arXiv.","DOI":"10.15607\/RSS.2018.XIV.049"},{"key":"ref_46","unstructured":"Hawkins, K.P. (2013). Analytic Inverse Kinematics for the Universal Robots UR-5\/UR-10 Arms, Georgia Institute of Technology. Technical Report."},{"key":"ref_47","unstructured":"(2020, December 31). Universal Robots-Parameters for Calculations of Kinematics and Dynamics. Available online: https:\/\/www.universal-robots.com\/articles\/ur\/application-installation\/dh-parameters-for-calculations-of-kinematics-and-dynamics\/."},{"key":"ref_48","unstructured":"Universal Robots (2020, August 05). Universal Robots e-Series User Manual-US Version 5.7, Available online: https:\/\/www.universal-robots.com\/download\/manuals-e-series\/user\/ur5e\/57\/user-manual-ur5e-e-series-sw-57-english-us-en-us\/."},{"key":"ref_49","unstructured":"Robotiq Inc. (Manual Robotiq 2F-85 & 2F-140 for e-Series Universal Robots, 2018). Manual Robotiq 2F-85 & 2F-140 for e-Series Universal Robots."},{"key":"ref_50","unstructured":"(2021, January 15). SmoothL1Loss - PyTorch 1.7.0 Documentation. Available online: https:\/\/pytorch.org\/docs\/stable\/generated\/torch.nn.SmoothL1Loss.html."},{"key":"ref_51","unstructured":"Kingma, D.P., and Ba, J.L. (2015, January 7\u20139). 
Adam: A method for stochastic optimization. Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015-Conference Track Proceedings. International Conference on Learning Representations, ICLR, San Diego, CA, USA."},{"key":"ref_52","unstructured":"Loshchilov, I., and Hutter, F. (2017). Decoupled Weight Decay Regularization. arXiv."},{"key":"ref_53","unstructured":"Brys, T., Harutyunyan, A., Suay, H.B., Chernova, S., Taylor, M.E., and Now\u00e9, A. (2015, January 25\u201331). Reinforcement learning from demonstration through shaping. Proceedings of the IJCAI International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina."},{"key":"ref_54","first-page":"1","article-title":"Experience selection in deep reinforcement learning for control","volume":"19","author":"Kober","year":"2018","journal-title":"J. Mach. Learn. Res."}],"container-title":["Automation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2673-4052\/3\/1\/11\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:40:15Z","timestamp":1760136015000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2673-4052\/3\/1\/11"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,21]]},"references-count":54,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,3]]}},"alternative-id":["automation3010011"],"URL":"https:\/\/doi.org\/10.3390\/automation3010011","relation":{},"ISSN":["2673-4052"],"issn-type":[{"value":"2673-4052","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,21]]}}}