{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,1]],"date-time":"2026-01-01T10:06:31Z","timestamp":1767261991659,"version":"build-2065373602"},"reference-count":45,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2023,1,16]],"date-time":"2023-01-16T00:00:00Z","timestamp":1673827200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Korea-Canada Artificial Intelligence Joint Research Center at the Korea Electrotechnology Research Institute","award":["22A03009"],"award-info":[{"award-number":["22A03009"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Robotics"],"abstract":"<jats:p>The majority of robots in factories today are operated with conventional control strategies that require individual programming on a task-by-task basis, with no margin for error. As an alternative to the rudimentary operation planning and task-programming techniques, machine learning has shown significant promise for higher-level task planning, with the development of reinforcement learning (RL)-based control strategies. This paper reviews the implementation of combined traditional and RL control for simulated and real environments to validate the RL approach for standard industrial tasks such as reach, grasp, and pick-and-place. The goal of this research is to bring intelligence to robotic control so that robotic operations can be completed without precisely defining the environment, constraints, and the action plan. 
The results from this approach provide optimistic preliminary data on the application of RL to real-world robotics.<\/jats:p>","DOI":"10.3390\/robotics12010012","type":"journal-article","created":{"date-parts":[[2023,1,16]],"date-time":"2023-01-16T01:31:15Z","timestamp":1673832675000},"page":"12","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["Simulated and Real Robotic Reach, Grasp, and Pick-and-Place Using Combined Reinforcement Learning and Traditional Controls"],"prefix":"10.3390","volume":"12","author":[{"given":"Andrew","family":"Lobbezoo","sequence":"first","affiliation":[{"name":"AI for Manufacturing Laboratory, Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}]},{"given":"Hyock-Ju","family":"Kwon","sequence":"additional","affiliation":[{"name":"AI for Manufacturing Laboratory, Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada"}]}],"member":"1968","published-online":{"date-parts":[[2023,1,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Massa, D., Callegari, M., and Cristalli, C. (2015). Manual Guidance for Industrial Robot Programming, Emerald Group Publishing Limited.","DOI":"10.1108\/IR-11-2014-0413"},{"key":"ref_2","unstructured":"Biggs, G., and Macdonald, B. (2003). A Survey of Robot Programming Systems, Society of Robots."},{"key":"ref_3","unstructured":"Saha, S.K. (2014). Introduction to Robotics, McGraw Hill Education. [2nd ed.]."},{"key":"ref_4","unstructured":"Craig, J. (2005). Introduction to Robotics Mechanics and Control, Pearson Education International."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Al-Selwi, H.F., Aziz, A.A., Abas, F.S., and Zyada, Z. (2021, January 5\u20136). Reinforcement Learning for Robotic Applications with Vision Feedback. 
Proceedings of the 2021 IEEE 17th International Colloquium on Signal Processing & Its Applications (CSPA), Langkawi, Malaysia.","DOI":"10.1109\/CSPA52141.2021.9377292"},{"key":"ref_6","unstructured":"Tai, L., Zhang, J., Liu, M., Boedecker, J., and Burgard, W. (2016). A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation. arXiv."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1238","DOI":"10.1177\/0278364913495721","article-title":"Reinforcement Learning in Robotics: A Survey","volume":"32","author":"Kober","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"108429","DOI":"10.1109\/ACCESS.2020.3001130","article-title":"A Reinforcement Learning-Based Framework for Robot Manipulation Skill Acquisition","volume":"8","author":"Liu","year":"2020","journal-title":"IEEE Access"},{"key":"ref_9","unstructured":"Kalashnikov, D., Irpan, A., Pastor, P., Ibarz, J., Herzog, A., Jang, E., Quillen, D., Holly, E., Kalakrishnan, M., and Vanhoucke, V. (2018, January 29). Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation. Proceedings of the 2nd Conference on Robot Learning, Z\u00fcrich, Switzerland."},{"key":"ref_10","first-page":"50","article-title":"Pick and Place Objects in a Cluttered Scene Using Deep Reinforcement Learning","volume":"20","author":"Mohammed","year":"2020","journal-title":"Int. J. Mech. Mechatron. Eng."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Liu, R., Nageotte, F., Zanne, P., de Mathelin, M., and Dresp-Langley, B. (2021). Deep Reinforcement Learning for the Control of Robotic Manipulation: A Focussed Mini-Review. arXiv.","DOI":"10.3390\/robotics10010022"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"239","DOI":"10.1007\/s43154-020-00021-6","article-title":"A Survey on Learning-Based Robotic Grasping","volume":"1","author":"Kleeberger","year":"2020","journal-title":"Curr. Robot. 
Rep."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Xiao, Y., Katt, S., ten Pas, A., Chen, S., and Amato, C. (2019, January 20\u201324). Online Planning for Target Object Search in Clutter under Partial Observability. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793494"},{"key":"ref_14","unstructured":"Sutton, R., and Barto, A. (2018). Reinforcement Learning: An Introduction, The MIT Press."},{"key":"ref_15","unstructured":"Russell, S., and Norvig, P. Artificial Intelligence A Modern Approach, Pearson Education, Inc.. [4th ed.]."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1109\/MSP.2017.2743240","article-title":"Deep Reinforcement Learning: A Brief Survey","volume":"34","author":"Arulkumaran","year":"2017","journal-title":"IEEE Signal Process. Magazine"},{"key":"ref_17","unstructured":"Ng, A., Harada, D., and Russell, S. (1999, January 27). Policy invariance under reward transformations theory and application to reward shaping. Proceedings of the Sixteenth International Conference on Machine Learning, San Francisco, CA, USA."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Gualtieri, M., Pas, A., and Platt, R. (2018). Pick and Place Without Geometric Object Models, IEEE.","DOI":"10.1109\/ICRA.2018.8460553"},{"key":"ref_19","unstructured":"Gualtieri, M., and Platt, R. (2018). Learning 6-DoF Grasping and Pick-Place Using Attention Focus. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Pore, A., and Aragon-Camarasa, G. (August, January 31). On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning. Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9197262"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Li, B., Lu, T., Li, J., Lu, N., Cai, Y., and Wang, S. (2020, January 21). 
ACDER: Augmented Curiosity-Driven Experience Replay. Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9197421"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Marzari, L., Pore, A., Dall\u2019Alba, D., Aragon-Camarasa, G., Farinelli, A., and Fiorini, P. (2021). Towards Hierarchical Task Decomposition Using Deep Reinforcement Learning for Pick and Place Subtasks. arXiv.","DOI":"10.1109\/ICAR53236.2021.9659344"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"282","DOI":"10.1016\/j.rcim.2015.04.002","article-title":"Robot skills for manufacturing: From concept to industrial deployment","volume":"37","author":"Pedersen","year":"2016","journal-title":"Robot. Comput.-Integr. Manuf."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Lobbezoo, A., Qian, Y., and Kwon, H.-J. (2021). Reinforcement Learning for Pick and Place Operations in Robotics: A Survey. Robotics, 10.","DOI":"10.3390\/robotics10030105"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"178450","DOI":"10.1109\/ACCESS.2020.3027923","article-title":"Review of Deep Reinforcement Learning-Based Object Grasping: Techniques, Open Challenges, and Recommendations","volume":"8","author":"Mohammed","year":"2020","journal-title":"IEEE Access"},{"key":"ref_26","unstructured":"Howard, A. (2022, September 20). Gazebo. Available online: http:\/\/gazebosim.org\/."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Erez, T., Tassa, Y., and Todorov, E. (2015, January 26\u201330). Simulation Tools for Model-Based Robotics: Comparison of Bullet, Havok, MuJoCo, ODE and PhysX. Proceedings of the 2015 IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA.","DOI":"10.1109\/ICRA.2015.7139807"},{"key":"ref_28","unstructured":"(2022, July 11). DeepMind Opening up a Physics Simulator for Robotics. 
Available online: https:\/\/www.deepmind.com\/blog\/opening-up-a-physics-simulator-for-robotics."},{"key":"ref_29","unstructured":"Coumans, E. (2022, June 10). Tiny Differentiable Simulator. Available online: https:\/\/pybullet.org\/wordpress\/."},{"key":"ref_30","unstructured":"Gallou\u00e9dec, Q., Cazin, N., Dellandr\u00e9a, E., and Chen, L. (2021). Multi-Goal Reinforcement Learning Environments for Simulated Franka Emika Panda Robot. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1007\/s10514-022-10034-z","article-title":"Continuous Control Actions Learning and Adaptation for Robotic Manipulation through Reinforcement Learning","volume":"46","author":"Shahid","year":"2022","journal-title":"Autonomous Robots"},{"key":"ref_32","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. arXiv."},{"key":"ref_33","unstructured":"Karagiannakos, S. (2021, December 13). Trust Region and Proximal Policy Optimization (TRPO and PPO). Available online: https:\/\/theaisummer.com\/TRPO_PPO\/."},{"key":"ref_34","unstructured":"Schulman, J., Levine, S., Moritz, P., Jordan, M.I., and Abbeel, P. (2015). Trust Region Policy Optimization. arXiv."},{"key":"ref_35","unstructured":"Haarnoja, T., Zhou, A., Abbeel, P., and Levine, S. (2018, January 10\u201315). Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden."},{"key":"ref_36","unstructured":"Haarnoja, T., Zhou, A., Hartikainen, K., and Tucker, G. (2019). Soft Actor-Critic Algorithms and Applications. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Haarnoja, T., Ha, S., Zhou, A., Tan, J., Tucker, G., and Levine, S. (2019). Learning To Walk via Deep Reinforcement Learning. 
arXiv.","DOI":"10.15607\/RSS.2019.XV.011"},{"key":"ref_38","first-page":"1","article-title":"Stable-Baselines3: Reliable Reinforcement Learning Implementations","volume":"22","author":"Raffin","year":"2021","journal-title":"J. Mach. Learn. Res."},{"key":"ref_39","unstructured":"Bergstra, J., Bardenet, R., Bengio, Y., and Kegl, B. (2011). Algorithms for Hyper-Parameter Optimization, Curran Associates Inc."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019, January 4). Optuna: A Next-generation Hyperparameter Optimization Framework. Proceedings of the Applied Data Science Track Paper, Anchorage, AK, USA.","DOI":"10.1145\/3292500.3330701"},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Mataric, M.J. (1994). Reward functions for accelerated learning. Machine Learning Proceedings 1994, Elsevier.","DOI":"10.1016\/B978-1-55860-335-6.50030-1"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Anca, M., and Studley, M. (2021, January 4\u20136). Twin Delayed Hierarchical Actor-Critic. Proceedings of the 2021 7th International Conference on Automation, Robotics and Applications (ICARA), Prague, Czech Republic.","DOI":"10.1109\/ICARA51699.2021.9376459"},{"key":"ref_43","unstructured":"Franka Emika (2021, July 13). Data Sheet Robot\u2014Arm & Control. Available online: https:\/\/pkj-robotics.dk\/wp-content\/uploads\/2020\/09\/Franka-Emika_Brochure_EN_April20_PKJ.pdf."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"G\u00f6rner, M., Haschke, R., Ritter, H., and Zhang, J. (2019, January 20\u201324). MoveIt! Task Constructor for Task-Level Motion Planning. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793898"},{"key":"ref_45","unstructured":"Coumans, E., and Bai, Y. (2022, March 12). PyBullet Quickstart Guide. 
Available online: https:\/\/docs.google.com\/document\/d\/10sXEhzFRSnvFcl3XxNGhnD4N2SedqwdAvK3dsihxVUA\/edit#heading=h.2ye70wns7io3."}],"container-title":["Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2218-6581\/12\/1\/12\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:06:55Z","timestamp":1760119615000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2218-6581\/12\/1\/12"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,1,16]]},"references-count":45,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,2]]}},"alternative-id":["robotics12010012"],"URL":"https:\/\/doi.org\/10.3390\/robotics12010012","relation":{},"ISSN":["2218-6581"],"issn-type":[{"type":"electronic","value":"2218-6581"}],"subject":[],"published":{"date-parts":[[2023,1,16]]}}}