{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,19]],"date-time":"2026-05-19T05:33:51Z","timestamp":1779168831004,"version":"3.51.4"},"reference-count":32,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2023,10,9]],"date-time":"2023-10-09T00:00:00Z","timestamp":1696809600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Robotics"],"abstract":"<jats:p>Reinforcement learning (RL) is explored for motor control of a novel pneumatic-driven soft robot modeled after continuum media with a varying density. This model complies with closed-form Lagrangian dynamics, which fulfills the fundamental structural property of passivity, among others. Then, the question arises of how to synthesize a passivity-based RL model to control the unknown continuum soft robot dynamics to exploit its input\u2013output energy properties advantageously throughout a reward-based neural network controller. Thus, we propose a continuous-time Actor\u2013Critic scheme for tracking tasks of the continuum 3D soft robot subject to Lipschitz disturbances. A reward-based temporal difference leads to learning with a novel discontinuous adaptive mechanism of Critic neural weights. Finally, the reward and integral of the Bellman error approximation reinforce the adaptive mechanism of Actor neural weights. Closed-loop stability is guaranteed in the sense of Lyapunov, which leads to local exponential convergence of tracking errors based on integral sliding modes. Notably, it is assumed that dynamics are unknown, yet the control is continuous and robust. A representative simulation study shows the effectiveness of our proposal for tracking tasks.<\/jats:p>","DOI":"10.3390\/robotics12050141","type":"journal-article","created":{"date-parts":[[2023,10,9]],"date-time":"2023-10-09T10:48:33Z","timestamp":1696848513000},"page":"141","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["A Novel Actor\u2014Critic Motor Reinforcement Learning for Continuum Soft Robots"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6606-861X","authenticated-orcid":false,"given":"Luis","family":"Pantoja-Garcia","sequence":"first","affiliation":[{"name":"Robotics and Advanced Manufacturing Department, Research Center for Advanced Studies (Cinvestav-Ipn), Ramos Arizpe 25903, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1813-0394","authenticated-orcid":false,"given":"Vicente","family":"Parra-Vega","sequence":"additional","affiliation":[{"name":"Robotics and Advanced Manufacturing Department, Research Center for Advanced Studies (Cinvestav-Ipn), Ramos Arizpe 25903, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4402-6072","authenticated-orcid":false,"given":"Rodolfo","family":"Garcia-Rodriguez","sequence":"additional","affiliation":[{"name":"Facultad de Ciencias de la Administraci\u00f3n, Universidad Aut\u00f3noma de Coahuila, Saltillo 25280, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1405-2675","authenticated-orcid":false,"given":"Carlos Ernesto","family":"V\u00e1zquez-Garc\u00eda","sequence":"additional","affiliation":[{"name":"Robotics and Advanced Manufacturing Department, Research Center for Advanced Studies (Cinvestav-Ipn), Ramos Arizpe 25903, Mexico"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1109\/TSMC.2020.3041775","article-title":"Looking Back on the Actor\u2014Critic Architecture","volume":"51","author":"Barto","year":"2021","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1109\/MCI.2009.932261","article-title":"Adaptive Dynamic Programming: An Introduction","volume":"4","author":"Wang","year":"2009","journal-title":"IEEE Comput. Intell. Mag."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Lewis, F., Vrabie, D., and Syrmos, V. (2012). Optimal Control, Wiley. EngineeringPro Collection.","DOI":"10.1002\/9781118122631"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1016\/j.arcontrol.2022.12.001","article-title":"Composite adaptation and learning for robot control: A survey","volume":"55","author":"Guo","year":"2023","journal-title":"Annu. Rev. Control."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"23","DOI":"10.1016\/j.neucom.2018.01.002","article-title":"Robot manipulator control using neural networks: A survey","volume":"285","author":"Jin","year":"2018","journal-title":"Neurocomputing"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"620","DOI":"10.1109\/TCYB.2015.2411285","article-title":"Adaptive Neural Network Control of an Uncertain Robot with Full-State Constraints","volume":"46","author":"He","year":"2016","journal-title":"IEEE Trans. Cybern."},{"key":"ref_7","unstructured":"Song, B., Slotine, J.J., and Pham, Q.C. (2022). Stability Guarantees for Continuous RL Control. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Bhagat, S., Banerjee, H., Ho Tse, Z.T., and Ren, H. (2019). Deep reinforcement learning for soft, flexible robots: Brief review with impending challenges. Robotics, 8.","DOI":"10.3390\/robotics8010004"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"121004","DOI":"10.1115\/1.4055692","article-title":"Lagrangian and Quasi-Lagrangian Models for Noninertial Pneumatic Soft Cylindrical Robots","volume":"144","year":"2022","journal-title":"J. Dyn. Syst. Meas. Control."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Guan, Z., and Yamamoto, T. (2020, January 19\u201324). Design of a Reinforcement Learning PID controller. Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK.","DOI":"10.1109\/IJCNN48605.2020.9207641"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"7326","DOI":"10.1109\/TSMC.2020.2975232","article-title":"Reinforcement Learning Control of a Flexible Two-Link Manipulator: An Experimental Investigation","volume":"51","author":"He","year":"2021","journal-title":"IEEE Trans. Syst. Man Cybern. Syst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"V\u00e1zquez-Garc\u00eda, C.E., Trejo-Ramos, C.A., Parra-Vega, V., and Olgu\u00edn-D\u00edaz, E. (2021, January 11\u201315). Quasi-static Optimal Design of a Pneumatic Soft Robot to Maximize Pressure-to-Force Transference. Proceedings of the 2021 Latin American Robotics Symposium (LARS), 2021 Brazilian Symposium on Robotics (SBR), and 2021 Workshop on Robotics in Education (WRE), Natal, Brazil.","DOI":"10.1109\/LARS\/SBR\/WRE54079.2021.9605376"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"967","DOI":"10.1109\/TRA.2003.819600","article-title":"Dynamic sliding PID control for tracking of robot manipulators: Theory and experiments","volume":"19","author":"Arimoto","year":"2003","journal-title":"IEEE Trans. Robot. Autom."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"14518","DOI":"10.1038\/s41598-018-32757-9","article-title":"A soft artificial muscle driven robot with reinforcement learning","volume":"8","author":"Yang","year":"2018","journal-title":"Sci. Rep."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"579","DOI":"10.1089\/soro.2018.0126","article-title":"Exploring Behaviors of Caterpillar-Like Soft Robots with a Central Pattern Generator-Based Controller and Reinforcement Learning","volume":"6","author":"Ishige","year":"2019","journal-title":"Soft Robot."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Boyraz, P., Runge, G., and Raatz, A. (2018). An overview of novel actuators for soft robotics. Actuators, 7.","DOI":"10.20944\/preprints201806.0172.v1"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Cianchetti, M., Ranzani, T., Gerboni, G., De Falco, I., Laschi, C., and Menciassi, A. (2013, January 3\u20137). STIFF-FLOP surgical manipulator: Mechanical design and experimental characterization of the single module. Proceedings of the 2013 IEEE\/RSJ International Conference on Intelligent Robots And Systems, Tokyo, Japan.","DOI":"10.1109\/IROS.2013.6696866"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2726","DOI":"10.1109\/TMECH.2018.2872972","article-title":"Underwater dynamic modeling for a cable-driven soft robot arm","volume":"23","author":"Xu","year":"2018","journal-title":"IEEE\/ASME Trans. Mechatronics"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1089\/soro.2014.0022","article-title":"A recipe for soft fluidic elastomer robots","volume":"2","author":"Marchese","year":"2015","journal-title":"Soft Robot."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Marchese, A.D., Komorowski, K., Onal, C.D., and Rus, D. (June, January 31). Design and control of a soft and continuously deformable 2d robotic manipulation system. Proceedings of the 2014 IEEE International Conference ON Robotics And Automation (ICRA), Hong Kong, China.","DOI":"10.1109\/ICRA.2014.6907161"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"155","DOI":"10.1089\/soro.2015.0013","article-title":"Autonomous object manipulation using a soft planar grasping manipulator","volume":"2","author":"Katzschmann","year":"2015","journal-title":"Soft Robot."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1073\/pnas.1615140114","article-title":"Automatic design of fiber-reinforced soft actuators for trajectory matching","volume":"114","author":"Connolly","year":"2017","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1002\/rob.10070","article-title":"Kinematics and the implementation of an elephant\u2019s trunk manipulator and other continuum style robots","volume":"20","author":"Hannan","year":"2003","journal-title":"J. Robot. Syst."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"22","DOI":"10.3389\/frobt.2017.00022","article-title":"A geometry deformation model for braided continuum manipulators","volume":"4","author":"Sadati","year":"2017","journal-title":"Front. Robot. AI"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1661","DOI":"10.1177\/0278364910368147","article-title":"Design and kinematic modeling of constant curvature continuum robots: A review","volume":"29","author":"Webster","year":"2010","journal-title":"Int. J. Robot. Res."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Godage, I.S., Branson, D.T., Guglielmino, E., Medrano-Cerda, G.A., and Caldwell, D.G. (2011, January 3\u20139). Shape function-based kinematics and dynamics for variable length continuum robotic arms. Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China.","DOI":"10.1109\/ICRA.2011.5979607"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Odom, E.M., and Egelhoff, C.J. (2011, January 12\u201315). Teaching deflection of stepped shafts: Castigliano\u2019s theorem, dummy loads, heaviside step functions and numerical integration. Proceedings of the 2011 Frontiers in Education Conference (FIE), Rapid City, SD, USA.","DOI":"10.1109\/FIE.2011.6143039"},{"key":"ref_28","first-page":"285","article-title":"Tracking control of robot manipulators using second order neuro sliding mode","volume":"39","author":"Garcia","year":"2009","journal-title":"Lat. Am. Appl. Res."},{"key":"ref_29","first-page":"1073","article-title":"Temporal Difference Learning in Continuous Time and Space","volume":"Volume 8","author":"Touretzky","year":"1995","journal-title":"Advances in Neural Information Processing Systems"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Kandasamy, S., Teo, M., Ravichandran, N., McDaid, A., Jayaraman, K., and Aw, K. (2022). Body-powered and portable soft hydraulic actuators as prosthetic hands. Robotics, 11.","DOI":"10.3390\/robotics11040071"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"348","DOI":"10.1089\/soro.2017.0079","article-title":"Finite Element Method-Based Kinematics and Closed-Loop Control of Soft, Continuum Manipulators","volume":"5","author":"Bieze","year":"2018","journal-title":"Soft Robot."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"364","DOI":"10.1109\/TMRB.2020.3011291","article-title":"FEM-Based Mechanics Modeling of Bio-Inspired Compliant Mechanisms for Medical Applications","volume":"2","author":"Sun","year":"2020","journal-title":"IEEE Trans. Med. Robot. Bionics"}],"container-title":["Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2218-6581\/12\/5\/141\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:03:42Z","timestamp":1760130222000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2218-6581\/12\/5\/141"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,9]]},"references-count":32,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["robotics12050141"],"URL":"https:\/\/doi.org\/10.3390\/robotics12050141","relation":{},"ISSN":["2218-6581"],"issn-type":[{"value":"2218-6581","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,9]]}}}