{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,24]],"date-time":"2026-06-24T16:07:46Z","timestamp":1782317266360,"version":"3.54.5"},"reference-count":44,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,11,6]],"date-time":"2025-11-06T00:00:00Z","timestamp":1762387200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Virtual Worlds"],"abstract":"<jats:p>Modern myoelectric prostheses remain difficult to control, particularly during rehabilitation, leading to high abandonment rates in favor of static devices. This highlights the need for advanced controllers that can automate some motions. This study presents an end-to-end framework coupling deep reinforcement learning with augmented reality (AR) for prosthetic actuation. A 14-degree-of-freedom hand was modeled in Blender and deployed in Unity. Two reinforcement learning agents were trained with distinct reward functions for a grasping task: (i) a discrete, Boolean reward with contact penalties and (ii) a continuous distance-based reward between joints and the target object. Each agent trained for 3 \u00d7 10<jats:sup>7<\/jats:sup> timesteps at 50 Hz. The Boolean reward function performed poorly by entropy and convergence metrics, while the continuous reward function achieved success. The trained agent using the continuous reward was integrated into a dynamic AR scene, where a user controlled the prosthesis via a myoelectric armband while the grasping motion was actuated automatically. 
This framework demonstrates potential for assisting patients by automating certain movements to reduce initial control difficulty and improve rehabilitation outcomes.<\/jats:p>","DOI":"10.3390\/virtualworlds4040053","type":"journal-article","created":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T10:56:45Z","timestamp":1762513005000},"page":"53","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Reinforcement Learning-Driven Prosthetic Hand Actuation in a Virtual Environment Using Unity ML-Agents"],"prefix":"10.3390","volume":"4","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-8967-0175","authenticated-orcid":false,"given":"Christian","family":"Done","sequence":"first","affiliation":[{"name":"Department of Mechanical and Measurement & Control Engineering, Idaho State University, Pocatello, ID 83209, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-2832-3601","authenticated-orcid":false,"given":"Jaden","family":"Palmer","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Measurement & Control Engineering, Idaho State University, Pocatello, ID 83209, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-8962-0186","authenticated-orcid":false,"given":"Kayson","family":"Oakey","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Measurement & Control Engineering, Idaho State University, Pocatello, ID 83209, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-0632-2473","authenticated-orcid":false,"given":"Atulan","family":"Gupta","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Measurement & Control Engineering, Idaho State University, Pocatello, ID 83209, 
USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Constantine","family":"Thiros","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Measurement & Control Engineering, Idaho State University, Pocatello, ID 83209, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Janet","family":"Franklin","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Measurement & Control Engineering, Idaho State University, Pocatello, ID 83209, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6572-0119","authenticated-orcid":false,"given":"Marco P.","family":"Schoen","sequence":"additional","affiliation":[{"name":"Department of Mechanical and Measurement & Control Engineering, Idaho State University, Pocatello, ID 83209, USA"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,6]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"3708","DOI":"10.1080\/09638288.2020.1866684","article-title":"Current rates of prosthetic usage in upper-limb amputees\u2014Have innovations had an impact on device acceptance?","volume":"44","author":"Salminger","year":"2022","journal-title":"Disabil. Rehabil."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"236","DOI":"10.1080\/03093640600994581","article-title":"Upper limb prosthesis use and abandonment: A survey of the last 25 years","volume":"31","author":"Biddiss","year":"2007","journal-title":"Prosthetics Orthot. Int."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"363","DOI":"10.1007\/s00402-003-0546-4","article-title":"Prosthetic rehabilitation in traumatic upper limb amputees (an Indian perspective)","volume":"123","author":"Bhaskaranand","year":"2003","journal-title":"Arch. Orthop. 
Trauma Surg."},{"key":"ref_4","first-page":"657","article-title":"Predictive factors for successful prosthetic rehabilitation after vascular transtibial amputation","volume":"60","author":"Budinski","year":"2021","journal-title":"Acta Clin. Croat."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1082","DOI":"10.1038\/s41467-019-09103-2","article-title":"Machine-learning reprogrammable metasurface imager","volume":"10","author":"Li","year":"2019","journal-title":"Nat. Commun."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1938","DOI":"10.1080\/10447318.2022.2089085","article-title":"Machine learning techniques in adaptive and personalized systems for health and wellness","volume":"39","author":"Oyebode","year":"2023","journal-title":"Int. J. Hum. Comput. Interact."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"56","DOI":"10.20544\/HORIZONS.B.04.1.17.P05","article-title":"An overview of the supervised machine learning methods","volume":"4","author":"Nasteski","year":"2017","journal-title":"Horizons B"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"2346","DOI":"10.1109\/TCYB.2019.2890974","article-title":"Online reinforcement learning control for the personalization of a robotic knee prosthesis","volume":"50","author":"Wen","year":"2019","journal-title":"IEEE Trans. Cybern."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Gao, X., Si, J., Wen, Y., Li, M., and Huang, H.H. (August, January 31). Knowledge-guided reinforcement learning control for robotic lower limb prosthesis. 
Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9196749"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"178450","DOI":"10.1109\/ACCESS.2020.3027923","article-title":"Review of deep reinforcement learning-based object grasping: Techniques, open challenges, and recommendations","volume":"8","author":"Mohammed","year":"2020","journal-title":"IEEE Access"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1729881420936851","DOI":"10.1177\/1729881420936851","article-title":"Continuous shared control in prosthetic hand grasp tasks by Deep Deterministic Policy Gradient with Hindsight Experience Replay","volume":"17","author":"Gao","year":"2020","journal-title":"Int. J. Adv. Robot. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Cilla, M., Borgiani, E., Mart\u00ednez, J., Duda, G.N., and Checa, S. (2017). Machine learning techniques for the optimization of joint replacements: Application to a short-stem hip implant. PLoS ONE, 12.","DOI":"10.1371\/journal.pone.0183755"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Parajuli, N., Sreenivasan, N., Bifulco, P., Cesarelli, M., Savino, S., Niola, V., Esposito, D., Hamilton, T.J., Naik, G.R., and Gunawardana, U. (2019). Real-time EMG based pattern recognition control for hand prostheses: A review on existing methods, challenges and future implementation. Sensors, 19.","DOI":"10.3390\/s19204596"},{"key":"ref_14","unstructured":"Joshi, D., Atreya, S., Arora, A., and Anand, S. (2009). Trends in EMG based prosthetic hand development: A review. Indian J. Biomech. Spec. Issue, 228\u2013232."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"8578","DOI":"10.1109\/JSEN.2018.2865623","article-title":"Knit band sensor for myoelectric control of surface EMG-based prosthetic hand","volume":"18","author":"Lee","year":"2018","journal-title":"IEEE Sens. 
J."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1007\/s13534-023-00281-z","article-title":"Recent trends and challenges of surface electromyography in prosthetic applications","volume":"13","author":"Yadav","year":"2023","journal-title":"Biomed. Eng. Lett."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Abdikenov, B., Zholtayev, D., Suleimenov, K., Assan, N., Ozhikenov, K., Ozhikenova, A., Nadirov, N., and Kapsalyamov, A. (2025). Emerging Frontiers in Robotic Upper-Limb Prostheses: Mechanisms, Materials, Tactile Sensors and Machine Learning-Based EMG Control. Sensors, 25.","DOI":"10.3390\/s25133892"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1007\/s40137-016-0128-3","article-title":"Hand transplantation versus hand prosthetics: Pros and cons","volume":"4","author":"Salminger","year":"2016","journal-title":"Curr. Surg. Rep."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"189","DOI":"10.1016\/j.jelectrocard.2010.12.004","article-title":"A comparison of conductive textile-based and silver\/silver chloride gel electrodes in exercise electrocardiogram recordings","volume":"44","author":"Marozas","year":"2011","journal-title":"J. Electrocardiol."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"42","DOI":"10.1186\/1743-0003-7-42","article-title":"Cognitive vision system for control of dexterous prosthetic hands: Experimental evaluation","volume":"7","author":"Cipriani","year":"2010","journal-title":"J. Neuroeng. Rehabil."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1007\/s10856-008-3492-4","article-title":"Study of issues in the development of surface EMG controlled human hand","volume":"20","author":"Ryait","year":"2009","journal-title":"J. Mater. Sci. Mater. 
Med."},{"key":"ref_22","first-page":"18","article-title":"Adaptive switching in practice: Improving myoelectric prosthesis performance through reinforcement learning","volume":"14","author":"Edwards","year":"2014","journal-title":"Proc. MEC"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1186\/s12984-021-00822-6","article-title":"Immersive augmented reality system for the training of pattern classification control with a myoelectric prosthesis","volume":"18","author":"Boschmann","year":"2021","journal-title":"J. Neuroeng. Rehabil."},{"key":"ref_24","unstructured":"Microsoft (2025, August 27). Mixed Reality Toolkit (MRTK). Available online: https:\/\/learn.microsoft.com\/en-us\/windows\/mixed-reality\/mrtk-unity\/mrtk2\/."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"276","DOI":"10.1007\/s13534-016-0240-4","article-title":"A real time surface electromyography signal driven prosthetic hand model using PID controlled DC motor","volume":"6","author":"Raj","year":"2016","journal-title":"Biomed. Eng. Lett."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Garc\u00eda-Ort\u00edz, J.V., Mora, M.C., and Cerd\u00e1-Boluda, J. (2024). Modeling the Dynamics of Prosthetic Fingers for the Development of Predictive Control Algorithms. Mathematics, 12.","DOI":"10.3390\/math12203236"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Gupta, A., and Schoen, M.P. (2025, January 9\u201310). Analysis of Simulated Autonomous Wheelchair Driving using GA-PID and RL based Controllers. Proceedings of the 2025 Intermountain Engineering, Technology and Computing (IETC), Orem, UT, USA.","DOI":"10.1109\/IETC64455.2025.11039341"},{"key":"ref_28","first-page":"1","article-title":"End-to-end training of deep visuomotor policies","volume":"17","author":"Levine","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_29","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). 
Proximal policy optimization algorithms. arXiv."},{"key":"ref_30","unstructured":"Taheri, H., Hosseini, S.R., and Nekoui, M.A. (2024). Deep reinforcement learning with enhanced ppo for safe mobile robot navigation. arXiv."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Guo, Z., Fu, H., Wu, J., Han, W., Huang, W., Zheng, W., and Li, T. (2025). Dynamic Task Planning for Multi-Arm Apple-Harvesting Robots Using LSTM-PPO Reinforcement Learning Algorithm. Agriculture, 15.","DOI":"10.3390\/agriculture15060588"},{"key":"ref_32","first-page":"24611","article-title":"The surprising effectiveness of ppo in cooperative multi-agent games","volume":"35","author":"Yu","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_33","unstructured":"Juliani, A., Berges, V.P., Teng, E., Cohen, A., Harper, J., Elion, C., Goy, C., Gao, Y., Henry, H., and Mattar, M. (2020). Unity: A General Platform for Intelligent Agents. arXiv."},{"key":"ref_34","unstructured":"Kaup, M., Wolff, C., Hwang, H., Mayer, J., and Bruni, E. (2024). A review of nine physics engines for reinforcement learning research. arXiv."},{"key":"ref_35","first-page":"15","article-title":"Quaternions and rotations","volume":"477","author":"Jia","year":"2008","journal-title":"Com S"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liu, Y., Jiang, L., Liu, H., and Ming, D. (2021). A systematic analysis of hand movement functionality: Qualitative classification and quantitative investigation of hand grasp behavior. Front. Neurorobot., 15.","DOI":"10.3389\/fnbot.2021.658075"},{"key":"ref_37","unstructured":"Eimer, T., Lindauer, M., and Raileanu, R. (2023, January 23\u201329). Hyperparameters in reinforcement learning and how to tune them. 
Proceedings of the International Conference on Machine Learning, PMLR, Honolulu, HI, USA."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"106415","DOI":"10.1109\/ACCESS.2024.3436710","article-title":"Impact of mixed reality-based rehabilitation on muscle activity in lower-limb amputees: An EMG analysis","volume":"12","author":"Lim","year":"2024","journal-title":"IEEE Access"},{"key":"ref_39","unstructured":"Nota, C., and Thomas, P.S. (2019). Is the policy gradient a gradient?. arXiv."},{"key":"ref_40","first-page":"47","article-title":"Convolutional neural networks for time series data processing applicable to sEMG controlled hand prosthesis","volume":"44","author":"Jaman","year":"2024","journal-title":"Tech. Mech.-Eur. J. Eng. Mech."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Kim, H., Miyakoshi, M., Kim, Y., Stapornchaisit, S., Yoshimura, N., and Koike, Y. (2022). Electroencephalography reflects user satisfaction in controlling robot hand through electromyographic signals. Sensors, 23.","DOI":"10.3390\/s23010277"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Wu, H., Dyson, M., and Nazarpour, K. (2021). Arduino-based myoelectric control: Towards longitudinal study of prosthesis use. Sensors, 21.","DOI":"10.3390\/s21030763"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Fedorov\u00e1, L., Rajt\u2019\u00fakov\u00e1, V., T\u00f3th, T., and \u017div\u010d\u00e1k, J. (2014, January 23\u201325). EMG system application in muscle parametrization of the upper extremities. Proceedings of the 2014 IEEE 12th International Symposium on Applied Machine Intelligence and Informatics (SAMI), Herl\u2019any, Slovakia.","DOI":"10.1109\/SAMI.2014.6822381"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Cardona-\u00c1lvarez, Y.N., \u00c1lvarez-Meza, A.M., C\u00e1rdenas-Pe\u00f1a, D.A., Casta\u00f1o-Duque, G.A., and Castellanos-Dominguez, G. (2023). 
A novel OpenBCI framework for EEG-based neurophysiological experiments. Sensors, 23.","DOI":"10.3390\/s23073763"}],"container-title":["Virtual Worlds"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2813-2084\/4\/4\/53\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T11:39:36Z","timestamp":1762515576000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2813-2084\/4\/4\/53"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,6]]},"references-count":44,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["virtualworlds4040053"],"URL":"https:\/\/doi.org\/10.3390\/virtualworlds4040053","relation":{},"ISSN":["2813-2084"],"issn-type":[{"value":"2813-2084","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,11,6]]}}}