{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,13]],"date-time":"2026-05-13T10:41:05Z","timestamp":1778668865870,"version":"3.51.4"},"reference-count":21,"publisher":"Springer Science and Business Media LLC","issue":"7","license":[{"start":{"date-parts":[[2020,6,6]],"date-time":"2020-06-06T00:00:00Z","timestamp":1591401600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,6,6]],"date-time":"2020-06-06T00:00:00Z","timestamp":1591401600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100004440","name":"Wellcome Trust","doi-asserted-by":"publisher","award":["203145Z\/16\/Z"],"award-info":[{"award-number":["203145Z\/16\/Z"]}],"id":[{"id":"10.13039\/100004440","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/P027938\/1"],"award-info":[{"award-number":["EP\/P027938\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/R004080\/1"],"award-info":[{"award-number":["EP\/R004080\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/P012841\/1)"],"award-info":[{"award-number":["EP\/P012841\/1)"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000287","name":"Royal Academy of Engineering","doi-asserted-by":"publisher","award":["CiET1819\\2\\36"],"award-info":[{"award-number":["CiET1819\\2\\36"]}],"id":[{"id":"10.13039\/501100000287","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J CARS"],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Purpose<\/jats:title><jats:p>Concentric tube robots are composed of multiple concentric, pre-curved, super-elastic, telescopic tubes that are compliant and have a small diameter suitable for interventions that must be minimally invasive like fetal surgery. Combinations of rotation and extension of the tubes can alter the robot\u2019s shape but the inverse kinematics are complex to model due to the challenge of incorporating friction and other tube interactions or manufacturing imperfections. We propose a model-free reinforcement learning approach to form the inverse kinematics solution and directly obtain a control policy.<\/jats:p><\/jats:sec><jats:sec><jats:title>Method<\/jats:title><jats:p>Three exploration strategies are shown for deep deterministic policy gradient with hindsight experience replay for concentric tube robots in simulation environments. The aim is to overcome the joint to Cartesian sampling bias and be scalable with the number of robotic tubes. To compare strategies, evaluation of the trained policy network to selected Cartesian goals and associated errors are analyzed. The learned control policy is demonstrated with trajectory following tasks.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>Separation of extension and rotation joints for Gaussian exploration is required to overcome Cartesian sampling bias. Parameter noise and Ornstein\u2013Uhlenbeck were found to be optimal strategies with less than 1 mm error in all simulation environments. Various trajectories can be followed with the optimal exploration strategy learned policy at high joint extension values. Our inverse kinematics solver in evaluation has 0.44\u00a0mm extension and<jats:inline-formula><jats:alternatives><jats:tex-math>$$0.3^{\\circ }$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"><mml:mrow><mml:mn>0<\/mml:mn><mml:mo>.<\/mml:mo><mml:msup><mml:mn>3<\/mml:mn><mml:mo>\u2218<\/mml:mo><\/mml:msup><\/mml:mrow><\/mml:math><\/jats:alternatives><\/jats:inline-formula>rotation error.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusion<\/jats:title><jats:p>We demonstrate the feasibility of effective model-free control for concentric tube robots. Directly using the control policy, arbitrary trajectories can be followed and this is an important step towards overcoming the challenge of concentric tube robot control for clinical use in minimally invasive interventions.<\/jats:p><\/jats:sec>","DOI":"10.1007\/s11548-020-02194-z","type":"journal-article","created":{"date-parts":[[2020,6,6]],"date-time":"2020-06-06T11:02:43Z","timestamp":1591441363000},"page":"1157-1165","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Investigating exploration for deep reinforcement learning of concentric tube robot control"],"prefix":"10.1007","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7020-9537","authenticated-orcid":false,"given":"Keshav","family":"Iyengar","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8019-7680","authenticated-orcid":false,"given":"George","family":"Dwyer","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0980-3227","authenticated-orcid":false,"given":"Danail","family":"Stoyanov","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2020,6,6]]},"reference":[{"key":"2194_CR1","unstructured":"Andrychowicz M, Wolski F, Ray A, Schneider J, Fong R, Welinder P, McGrew B, Tobin J, Abbeel OP, Zaremba W (2017) Hindsight experience replay. In: Advances in neural information processing systems, pp 5048\u20135058"},{"key":"2194_CR2","unstructured":"Bergeles C, Lin FY, Yang GZ (2015) Concentric tube robot kinematics using neural networks. In: Hamlyn symposium on medical robotics, pp 13\u201314"},{"issue":"3","key":"2194_CR3","doi-asserted-by":"publisher","first-page":"996","DOI":"10.1109\/TMECH.2013.2265804","volume":"19","author":"J Burgner","year":"2014","unstructured":"Burgner J, Rucker DC, Gilbert HB, Swaney PJ, Russell PT, Weaver KD, Webster RJ (2014) A telerobotic system for transnasal surgery. IEEE\/ASME Trans Mechatron 19(3):996\u20131006. https:\/\/doi.org\/10.1109\/TMECH.2013.2265804","journal-title":"IEEE\/ASME Trans Mechatron"},{"key":"2194_CR4","unstructured":"Dupont P, Gosline A, Vasilyev N, Lock J, Butler E, Folk C, Cohen A, Chen R, Schmitz G RH, del Nido P (2012) Concentric tube robots for minimally invasive surgery. In: Hamlyn symposium on medical robotics, vol\u00a07, p\u00a08"},{"issue":"2","key":"2194_CR5","doi-asserted-by":"publisher","first-page":"209","DOI":"10.1109\/TRO.2009.2035740","volume":"26","author":"PE Dupont","year":"2010","unstructured":"Dupont PE, Lock J, Itkowitz B, Butler E (2010) Design and control of concentric-tube robots. IEEE Trans Robot 26(2):209\u2013225. https:\/\/doi.org\/10.1109\/TRO.2009.2035740","journal-title":"IEEE Trans Robot"},{"issue":"3","key":"2194_CR6","doi-asserted-by":"publisher","first-page":"1656","DOI":"10.1109\/LRA.2017.2679902","volume":"2","author":"G Dwyer","year":"2017","unstructured":"Dwyer G, Chadebecq F, Amo MT, Bergeles C, Maneas E, Pawar V, Vander Poorten E, Deprest J, Ourselin S, De Coppi P, Vercauteren T, Stoyanov D (2017) A continuum robot and control interface for surgical assist in fetoscopic interventions. IEEE Robot Autom Lett 2(3):1656\u20131663","journal-title":"IEEE Robot Autom Lett"},{"key":"2194_CR7","doi-asserted-by":"crossref","unstructured":"Dwyer G, Colchester RJ, Alles EJ, Maneas E, Ourselin S, Vercauteren T, Deprest J, Vander\u00a0Poorten E, De\u00a0Coppi P, Desjardins AE, Stoyanov D (2019) Robotic control of a multi-modal rigid endoscope combining optical imaging with all-optical ultrasound. In: 2019 International conference on robotics and automation (ICRA). IEEE, pp 3882\u20133888","DOI":"10.1109\/ICRA.2019.8794289"},{"key":"2194_CR8","doi-asserted-by":"publisher","unstructured":"Grassmann R, Modes V, Burgner-Kahrs J (2018) Learning the forward and inverse kinematics of a 6-DOF concentric tube continuum robot in SE(3). In: IEEE international conference on intelligent robots and systems, pp 5125\u20135132. Institute of Electrical and Electronics Engineers Inc. https:\/\/doi.org\/10.1109\/IROS.2018.8594451","DOI":"10.1109\/IROS.2018.8594451"},{"key":"2194_CR9","unstructured":"Grassmann RM, Burgner-Kahrs J (2019) On the merits of joint space and orientation representations in learning the forward kinematics in SE ( 3 ). In: Robotics: science and systems"},{"key":"2194_CR10","doi-asserted-by":"crossref","unstructured":"Henderson P, Islam R, Bachman P, Pineau J, Precup D, Meger D (2018) Deep reinforcement learning that matters. In: Thirty-second AAAI conference on artificial intelligence","DOI":"10.1609\/aaai.v32i1.11694"},{"key":"2194_CR11","unstructured":"Hill A, Raffin A, Ernestus M, Gleave A, Kanervisto A, Traore R, Dhariwal P, Hesse C, Klimov O, Nichol A, Plappert M, Radford A, Schulman J, Sidor S, Wu Y (2018) Stable baselines. https:\/\/github.com\/hill-a\/stable-baselines"},{"issue":"3","key":"2194_CR12","doi-asserted-by":"publisher","first-page":"307","DOI":"10.1207\/s15516709cog1603_1","volume":"16","author":"MI Jordan","year":"1992","unstructured":"Jordan MI, Rumelhart DE (1992) Forward models: supervised learning with a distal teacher. Cogn Sci 16(3):307\u2013354","journal-title":"Cogn Sci"},{"key":"2194_CR13","unstructured":"Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D (2015) Continuous control with deep reinforcement learning. arxiv:1509.02971"},{"key":"2194_CR14","doi-asserted-by":"publisher","unstructured":"Lock J, Dupont PE (2011) Friction modeling in concentric tube robots. In: Proceedings\u2014IEEE international conference on robotics and automation, pp 1139\u20131146. https:\/\/doi.org\/10.1109\/ICRA.2011.5980347","DOI":"10.1109\/ICRA.2011.5980347"},{"key":"2194_CR15","doi-asserted-by":"publisher","unstructured":"Nair A, McGrew B, Andrychowicz M, Zaremba W, Abbeel P (2018) Overcoming exploration in reinforcement learning with demonstrations. In: Proceedings\u2014IEEE international conference on robotics and automation, pp 6292\u20136299. https:\/\/doi.org\/10.1109\/ICRA.2018.8463162","DOI":"10.1109\/ICRA.2018.8463162"},{"key":"2194_CR16","unstructured":"Nikishin E, Izmailov P, Athiwaratkun B, Podoprikhin D, Garipov T, Shvechikov P, Vetrov D, Wilson AG (2018) Improving stability in deep reinforcement learning with weight averaging. In: Uncertainty in artificial intelligence workshop on uncertainty in deep learning, vol\u00a05"},{"key":"2194_CR17","unstructured":"OpenAI Andrychowicz M, Baker B, Chociej M, Jozefowicz R, McGrew B, Pachocki J, Petron A, Plappert M, Powell G, Ray A, Schneider J, Sidor S, Tobin J, Welinder P, Weng L, Zaremba W (2018) Learning dexterous in-hand manipulation. http:\/\/arxiv.org\/abs\/1808.00177"},{"key":"2194_CR18","unstructured":"Plappert M, Houthooft R, Dhariwal P, Sidor S, Chen RY, Chen X, Asfour T, Abbeel P, Andrychowicz M (2017) Parameter space noise for exploration. arXiv preprint arXiv:1706.01905"},{"issue":"5","key":"2194_CR19","doi-asserted-by":"publisher","first-page":"769","DOI":"10.1109\/TRO.2010.2062570","volume":"26","author":"DC Rucker","year":"2010","unstructured":"Rucker DC, Jones BA, Webster RJ (2010) A geometrically exact model for externally loaded concentric-tube continuum robots. IEEE Trans Robot 26(5):769\u2013780. https:\/\/doi.org\/10.1109\/TRO.2010.2062570","journal-title":"IEEE Trans Robot"},{"key":"2194_CR20","volume-title":"Reinforcement learning: an introduction","author":"RS Sutton","year":"2018","unstructured":"Sutton RS, Barto AG (2018) Reinforcement learning: an introduction. MIT Press, Cambridge"},{"issue":"3","key":"2194_CR21","doi-asserted-by":"publisher","first-page":"e1774","DOI":"10.1002\/rcs.1774","volume":"13","author":"W Xu","year":"2017","unstructured":"Xu W, Chen J, Lau HY, Ren H (2017) Data-driven methods towards learning the highly nonlinear inverse kinematics of tendon-driven surgical manipulators. Int J Med Robot Comput Assist Surg 13(3):e1774","journal-title":"Int J Med Robot Comput Assist Surg"}],"container-title":["International Journal of Computer Assisted Radiology and Surgery"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11548-020-02194-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11548-020-02194-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11548-020-02194-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,10,26]],"date-time":"2022-10-26T22:24:32Z","timestamp":1666823072000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11548-020-02194-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,6,6]]},"references-count":21,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2020,7]]}},"alternative-id":["2194"],"URL":"https:\/\/doi.org\/10.1007\/s11548-020-02194-z","relation":{},"ISSN":["1861-6410","1861-6429"],"issn-type":[{"value":"1861-6410","type":"print"},{"value":"1861-6429","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020,6,6]]},"assertion":[{"value":"19 November 2019","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 April 2020","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 June 2020","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Compliance with ethical standards"}},{"value":"The authors Keshav Iyengar, George Dwyer and Danail Stoyanov confirm that they do not have financial or non-financial conflict of interest related to the work presented in this paper and that the research here detailed did not involve human participants or animals, hence, the need for informed consent does not apply.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}