{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T18:03:54Z","timestamp":1770833034638,"version":"3.50.1"},"reference-count":59,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2023,10,6]],"date-time":"2023-10-06T00:00:00Z","timestamp":1696550400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union\u2019s Horizon 2020 Research and Innovation Programme","award":["860108"],"award-info":[{"award-number":["860108"]}]},{"name":"European Union\u2019s Horizon 2020 Research and Innovation Programme","award":["863212"],"award-info":[{"award-number":["863212"]}]},{"name":"PROBOSCIS","award":["860108"],"award-info":[{"award-number":["860108"]}]},{"name":"PROBOSCIS","award":["863212"],"award-info":[{"award-number":["863212"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This paper presents Soft DAgger, an efficient imitation learning-based approach for training control solutions for soft robots. To demonstrate the effectiveness of the proposed algorithm, we implement it on a two-module soft robotic arm involved in the task of writing letters in 3D space. Soft DAgger uses a dynamic behavioral map of the soft robot, which maps the robot\u2019s task space to its actuation space. The map acts as a teacher and is responsible for predicting the optimal actions for the soft robot based on its previous state action history, expert demonstrations, and current position. This algorithm achieves generalization ability without depending on costly exploration techniques or reinforcement learning-based synthetic agents. We propose two variants of the control algorithm and demonstrate that good generalization capabilities and improved task reproducibility can be achieved, along with a consistent decrease in the optimization time and samples. Overall, Soft DAgger provides a practical control solution to perform complex tasks in fewer samples with soft robots. To the best of our knowledge, our study is an initial exploration of imitation learning with online optimization for soft robot control.<\/jats:p>","DOI":"10.3390\/s23198278","type":"journal-article","created":{"date-parts":[[2023,10,6]],"date-time":"2023-10-06T08:29:12Z","timestamp":1696580952000},"page":"8278","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Soft DAgger: Sample-Efficient Imitation Learning for Control of Soft Robots"],"prefix":"10.3390","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-9341-8665","authenticated-orcid":false,"given":"Muhammad Sunny","family":"Nazeer","sequence":"first","affiliation":[{"name":"The BioRobotics Institute, Scuola Superiore Sant\u2019Anna, 56025 Pontedera, Italy"},{"name":"Department of Excellence in Robotics and AI, Scuola Superiore Sant\u2019Anna, 56125 Pisa, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5248-1043","authenticated-orcid":false,"given":"Cecilia","family":"Laschi","sequence":"additional","affiliation":[{"name":"Department of Mechanical Engineering, National University of Singapore, Singapore 117575, Singapore"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8060-8080","authenticated-orcid":false,"given":"Egidio","family":"Falotico","sequence":"additional","affiliation":[{"name":"The BioRobotics Institute, Scuola Superiore Sant\u2019Anna, 56025 Pontedera, Italy"},{"name":"Department of Excellence in Robotics and AI, Scuola Superiore Sant\u2019Anna, 56125 Pisa, Italy"}]}],"member":"1968","published-online":{"date-parts":[[2023,10,6]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"287","DOI":"10.1016\/j.tibtech.2013.03.002","article-title":"Soft robotics: A bioinspired evolution in robotics","volume":"31","author":"Kim","year":"2013","journal-title":"Trends Biotechnol."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"467","DOI":"10.1038\/nature14543","article-title":"Design, fabrication and control of soft robots","volume":"521","author":"Rus","year":"2015","journal-title":"Nature"},{"key":"ref_3","unstructured":"Armanini, C., Boyer, F., Mathew, A.T., Duriez, C., and Renda, F. (2021). Soft Robots Modeling: A Structured Overview. arXiv."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"1661","DOI":"10.1177\/0278364910368147","article-title":"Design and Kinematic Modeling of Constant Curvature Continuum Robots: A Review","volume":"29","author":"Webster","year":"2010","journal-title":"Int. J. Robot. Res."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Chawla, A., Frazelle, C., and Walker, I. (February, January 31). A Comparison of Constant Curvature Forward Kinematics for Multisection Continuum Manipulators. Proceedings of the 2018 Second IEEE International Conference on Robotic Computing (IRC), Laguna Hills, CA, USA.","DOI":"10.1109\/IRC.2018.00046"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"460","DOI":"10.1016\/j.ijsolstr.2007.08.016","article-title":"Nonlinear dynamics of elastic rods using the Cosserat theory: Modelling and simulation","volume":"45","author":"Cao","year":"2008","journal-title":"Int. J. Solids Struct."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1033","DOI":"10.1109\/TRO.2011.2160469","article-title":"Statics and Dynamics of Continuum Robots With General Tendon Routing and External Loading","volume":"27","author":"Rucker","year":"2011","journal-title":"IEEE Trans. Robot."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Pozzi, M., Miguel, E., Deimel, R., Malvezzi, M., Bickel, B., Brock, O., and Prattichizzo, D. (2018, January 21\u201325). Efficient FEM-Based Simulation of Soft Robots Modeled as Kinematic Chains. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8461106"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Schegg, P., and Duriez, C. (2022). Review on generic methods for mechanical modeling, simulation and control of soft robots. PLoS ONE, 17.","DOI":"10.1371\/journal.pone.0251059"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Kim, D., Kim, S.H., Kim, T., Kang, B.B., Lee, M., Park, W., Ku, S., Kim, D., Kwon, J., and Lee, H. (2021). Review of machine learning methods in soft robotics. PLoS ONE, 16.","DOI":"10.1371\/journal.pone.0246102"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"149","DOI":"10.1089\/soro.2017.0007","article-title":"Control Strategies for Soft Robotic Manipulators: A Survey","volume":"5","author":"Ansari","year":"2018","journal-title":"Soft Robot."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"285","DOI":"10.1089\/soro.2016.0051","article-title":"Learning Closed Loop Kinematic Controllers for Continuum Manipulators in Unstructured Environments","volume":"4","author":"Falotico","year":"2017","journal-title":"Soft Robot."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"256","DOI":"10.1007\/978-3-319-22979-9_26","article-title":"Integrating feedback and predictive control in a Bio-inspired model of visual pursuit implemented on a humanoid robot","volume":"9222","author":"Vannucci","year":"2015","journal-title":"Lect. Notes Comput. Sci."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"5022","DOI":"10.1109\/TIE.2016.2554078","article-title":"Kinematic Control of Continuum Manipulators Using a Fuzzy-Model-Based Approach","volume":"63","author":"Qi","year":"2016","journal-title":"IEEE Trans. Ind. Electron."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Oikonomou, P., Dometios, A., Khamassi, M., and Tzafestas, C.S. (October, January 27). Task Driven Skill Learning in a Soft-Robotic Arm. Proceedings of the 2021 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS 2021), Prague, Czech Republic.","DOI":"10.1109\/IROS51168.2021.9636812"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"935","DOI":"10.1109\/TRO.2014.2314777","article-title":"A Variable Curvature Continuum Kinematics for Kinematic Control of the Bionic Handling Assistant","volume":"30","author":"Mahl","year":"2014","journal-title":"IEEE Trans. Robot."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Gillespie, M., Best, C., Townsend, E., Wingate, D., and Killpack, M. (2018, January 24\u201328). Learning nonlinear dynamic models of soft robots for model predictive control with neural networks. Proceedings of the 2018 IEEE International Conference on Soft Robotics (RoboSoft), Livorno, Italy.","DOI":"10.1109\/ROBOSOFT.2018.8404894"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Thuruthel, T.G., Falotico, E., Renda, F., and Laschi, C. (2017). Learning dynamic models for open loop predictive control of soft robotic manipulators. Bioinspiration Biomimetics, 12.","DOI":"10.1088\/1748-3190\/aa839f"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"1292","DOI":"10.1109\/LRA.2018.2797241","article-title":"Stable Open Loop Control of Soft Robotic Manipulators","volume":"3","author":"Thuruthel","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"124","DOI":"10.1109\/TRO.2018.2878318","article-title":"Model-Based Reinforcement Learning for Closed-Loop Dynamic Control of Soft Robotic Manipulators","volume":"35","author":"Thuruthel","year":"2019","journal-title":"IEEE Trans. Robot."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Wu, Q., Gu, Y., Li, Y., Zhang, B., Chepinskiy, S.A., Wang, J., Zhilenkov, A.A., Krasnov, A.Y., and Chernyi, S. (2020). Position Control of Cable-Driven Robotic Soft Arm Based on Deep Reinforcement Learning. Information, 11.","DOI":"10.3390\/info11060310"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"5469","DOI":"10.1109\/LRA.2022.3157369","article-title":"Controlling Soft Robotic Arms Using Continual Learning","volume":"7","author":"Kalidindi","year":"2022","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1089\/soro.2021.0123","article-title":"SofaGym: An Open Platform for Reinforcement Learning Based on Soft Robot Simulations","volume":"10","author":"Schegg","year":"2022","journal-title":"Soft Robot."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"280","DOI":"10.3389\/frobt.2021.730330","article-title":"A Survey for Machine Learning-Based Control of Continuum Robots","volume":"8","author":"Wang","year":"2021","journal-title":"Front. Robot. AI"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"880","DOI":"10.1109\/TRO.2014.2309194","article-title":"Model-Less Feedback Control of Continuum Manipulators in Constrained Environments","volume":"30","author":"Yip","year":"2014","journal-title":"IEEE Trans. Robot."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1194","DOI":"10.1109\/LRA.2019.2893691","article-title":"Vision-Based Online Learning Kinematic Control for Soft Robots Using Local Gaussian Process Regression","volume":"4","author":"Ge","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Bhagat, S., Banerjee, H., and Tse, Z. (2019). Deep Reinforcement Learning for Soft, Flexible Robots: Brief Review with Impending Challenges. Robotics, 8.","DOI":"10.3390\/robotics8010004"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"4741","DOI":"10.1109\/LRA.2022.3146903","article-title":"Closed-Loop Dynamic Control of a Soft Manipulator Using Deep Reinforcement Learning","volume":"7","author":"Centurelli","year":"2022","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"2471","DOI":"10.1109\/LRA.2018.2800106","article-title":"Model-plant mismatch compensation using reinforcement learning","volume":"3","author":"Koryakovskiy","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_30","unstructured":"Balakrishna, A., Thananjeyan, B., Lee, J., Li, F., Zahed, A., Gonzalez, J.E., and Goldberg, K. (November, January 30). On-Policy Robot Imitation Learning from a Converging Supervisor. Proceedings of the Conference on Robot Learning, Osaka, Japan."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1238","DOI":"10.1177\/0278364913495721","article-title":"Reinforcement learning in robotics: A survey","volume":"32","author":"Kober","year":"2013","journal-title":"Int. J. Robot. Res."},{"key":"ref_32","first-page":"1","article-title":"Learning by Imitation with the STIFF-FLOP Surgical Robot: A Biomimetic Approach Inspired by Octopus Movements","volume":"1","author":"Malekzadeh","year":"2014","journal-title":"Robot. Biomimetics Spec. Issue Med. Robot."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1016\/j.cmpb.2013.12.015","article-title":"Human\u2013robot skills transfer interfaces for a flexible surgical robot","volume":"116","author":"Calinon","year":"2014","journal-title":"Comput. Methods Programs Biomed."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"6169","DOI":"10.1109\/LRA.2020.3011353","article-title":"Imitation Learning Based on Bilateral Control for Human\u2013Robot Cooperation","volume":"5","author":"Sasagawa","year":"2020","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Racinskis, P., Arents, J., and Greitans, M. (2022). A Motion Capture and Imitation Learning Based Approach to Robot Control. Appl. Sci., 12.","DOI":"10.20944\/preprints202206.0427.v1"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1016\/S0893-6080(99)00098-2","article-title":"Reinforcement Learning: An Introduction; R.S. Sutton, A.G. Barto (Eds.)","volume":"13","author":"Rao","year":"2000","journal-title":"Neural Netw."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/2300000053","article-title":"An Algorithmic Perspective on Imitation Learning","volume":"7","author":"Osa","year":"2018","journal-title":"Found. Trends Robot."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Zhang, D., Fan, W., Lloyd, J., Yang, C., and Lepora, N.F. (2022). One-shot domain-adaptive imitation learning via progressive learning applied to robotic pouring. IEEE Trans. Autom. Sci. Eng.","DOI":"10.1109\/TASE.2022.3220728"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhu, Y., Wang, Z., Merel, J., Rusu, A., Erez, T., Cabi, S., Tunyasuvunakool, S., Kram\u00e1r, J., Hadsell, R., and Freitas, N. (2018). Reinforcement and Imitation Learning for Diverse Visuomotor Skills. arXiv.","DOI":"10.15607\/RSS.2018.XIV.009"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1892","DOI":"10.1109\/LRA.2019.2898035","article-title":"Combining Imitation Learning With Constraint-Based Task Specification and Control","volume":"4","author":"Perico","year":"2019","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_41","unstructured":"Sasaki, F., Yohira, T., and Kawaguchi, A. (2019, January 6\u20139). Sample Efficient Imitation Learning for Continuous Control. Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA."},{"key":"ref_42","unstructured":"Stadie, B., Abbeel, P., and Sutskever, I. (2017). Third-Person Imitation Learning. arXiv."},{"key":"ref_43","unstructured":"Chen, Z., and Lin, M. (2020). Self-Imitation Learning in Sparse Reward Settings. arXiv."},{"key":"ref_44","unstructured":"Rusu, A., G\u00f3mez, S., Gulcehre, C., Desjardins, G., Kirkpatrick, J., Pascanu, R., Mnih, V., Kavukcuoglu, K., and Hadsell, R. (2015). Policy Distillation. arXiv."},{"key":"ref_45","unstructured":"Duan, Y., Andrychowicz, M., Stadie, B., Ho, J., Schneider, J., Sutskever, I., Abbeel, P., and Zaremba, W. (2017). One-Shot Imitation Learning. arXiv."},{"key":"ref_46","unstructured":"Finn, C., Yu, T., Zhang, T., Abbeel, P., and Levine, S. (2017, January 13\u201315). One-Shot Visual Imitation Learning via Meta-Learning. Proceedings of the 1st Annual Conference on Robot Learning, Mountain View, CA, USA."},{"key":"ref_47","unstructured":"Spencer, J., Choudhury, S., Venkatraman, A., Ziebart, B., and Bagnell, J.A. (2021). Feedback in imitation learning: The three regimes of covariate shift. arXiv."},{"key":"ref_48","first-page":"627","article-title":"A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning","volume":"15","author":"Ross","year":"2010","journal-title":"J. Mach. Learn. Res. Proc. Track"},{"key":"ref_49","doi-asserted-by":"crossref","unstructured":"Kelly, M., Sidrane, C., Driggs-Campbell, K., and Kochenderfer, M.J. (2019, January 20\u201324). HG-DAgger: Interactive Imitation Learning with Human Experts. Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada.","DOI":"10.1109\/ICRA.2019.8793698"},{"key":"ref_50","unstructured":"Laskey, M., Lee, J., Fox, R., Dragan, A., and Goldberg, K. (2017, January 13\u201315). DART: Noise Injection for Robust Imitation Learning. Proceedings of the 1st Annual Conference on Robot Learning, Mountain View, CA, USA."},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Menda, K., Driggs-Campbell, K., and Kochenderfer, M.J. (2019, January 3\u20138). EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning. Proceedings of the 2019 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Venetian Macao, Macau.","DOI":"10.1109\/IROS40897.2019.8968287"},{"key":"ref_52","doi-asserted-by":"crossref","unstructured":"Malekzadeh, M., Bruno, D., Calinon, S., Nanayakkara, T., and Caldwell, D. (2013, January 3\u20137). Skills Transfer Across Dissimilar Robots by Learning Context-Dependent Rewards. Proceedings of the 2013 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan.","DOI":"10.1109\/IROS.2013.6696585"},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Oikonomou, P., Khamassi, M., and Tzafestas, C. (August, January 31). Periodic movement learning in a soft-robotic arm. Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9197035"},{"key":"ref_54","doi-asserted-by":"crossref","unstructured":"Oikonomou, P., Dometios, A., Khamassi, M., and Tzafestas, C. (2022, January 23\u201327). Reproduction of Human Demonstrations with a Soft-Robotic Arm based on a Library of Learned Probabilistic Movement Primitives. Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA.","DOI":"10.1109\/ICRA46639.2022.9811627"},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Manti, M., Pratesi, A., Falotico, E., Cianchetti, M., and Laschi, C. (2016, January 26\u201329). Soft assistive robot for personal care of elderly people. Proceedings of the 2016 6th IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob), Singapore.","DOI":"10.1109\/BIOROB.2016.7523731"},{"key":"ref_56","doi-asserted-by":"crossref","first-page":"103451","DOI":"10.1016\/j.robot.2020.103451","article-title":"I-Support: A robotic platform of an assistive bathing robot for the elderly population","volume":"126","author":"Zlatintsi","year":"2020","journal-title":"Robot. Auton. Syst."},{"key":"ref_57","unstructured":"O\u2019Malley, T., Bursztein, E., Long, J., Chollet, F., Jin, H., Invernizzi, L., de Marmiesse, G., Fu, Y., Hahn, A., and Mullenbach, J. (2023, June 30). KerasTuner. Available online: https:\/\/github.com\/keras-team\/keras-tuner."},{"key":"ref_58","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1109\/MCSE.2007.55","article-title":"Matplotlib: A 2D graphics environment","volume":"9","author":"Hunter","year":"2007","journal-title":"Comput. Sci. Eng."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Saputra, M.R.U., de Gusmao, P.P.B., Almalioglu, Y., Markham, A., and Trigoni, N. (2019). Distilling Knowledge From a Deep Pose Regressor Network. arXiv.","DOI":"10.1109\/ICCV.2019.00035"}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/19\/8278\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:02:05Z","timestamp":1760130125000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/23\/19\/8278"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,6]]},"references-count":59,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["s23198278"],"URL":"https:\/\/doi.org\/10.3390\/s23198278","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,10,6]]}}}