{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,6]],"date-time":"2025-10-06T18:23:41Z","timestamp":1759775021897,"version":"3.38.0"},"reference-count":41,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2021,8,28]],"date-time":"2021-08-28T00:00:00Z","timestamp":1630108800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61703228"],"award-info":[{"award-number":["61703228"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["61903219"],"award-info":[{"award-number":["61903219"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62073183"],"award-info":[{"award-number":["62073183"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002858","name":"China Postdoctoral Science Foundation","doi-asserted-by":"publisher","award":["2020T130351"],"award-info":[{"award-number":["2020T130351"]}],"id":[{"id":"10.13039\/501100002858","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Transactions of the Institute of Measurement and Control"],"published-print":{"date-parts":[[2022,2]]},"abstract":"<jats:p> Single-track two-wheeled robots have become an important research topic in recent years, owing to their simple structure, energy savings and ability to run on narrow roads. However, the ramp jump remains a challenging task. In this study, we propose to realize a single-track two-wheeled robot ramp jump. We present a control method that employs continuous action reinforcement learning techniques for single-track two-wheeled robot control. We design a novel reward function for reinforcement learning, optimize the dimensions of the action space, and enable training under the deep deterministic policy gradient algorithm. Finally, we validate the control method through simulation experiments and successfully realize the single-track two-wheeled robot ramp jump task. Simulation results validate that the control method is effective and has several advantages over high-dimension action space control, reinforcement learning control of sparse reward function and discrete action reinforcement learning control. <\/jats:p>","DOI":"10.1177\/01423312211037847","type":"journal-article","created":{"date-parts":[[2021,8,28]],"date-time":"2021-08-28T18:22:26Z","timestamp":1630174946000},"page":"892-904","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":15,"title":["Continuous reinforcement learning based ramp jump control for single-track two-wheeled robots"],"prefix":"10.1177","volume":"44","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6743-4327","authenticated-orcid":false,"given":"Qingyuan","family":"Zheng","sequence":"first","affiliation":[{"name":"Department of Automation, Tsinghua University, China"}]},{"given":"Duo","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Automation, Tsinghua University, China"}]},{"given":"Zhang","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Automation, Tsinghua University, China"}]},{"given":"Yiyong","family":"Sun","sequence":"additional","affiliation":[{"name":"Department of Automation, Tsinghua University, China"}]},{"given":"Bin","family":"Liang","sequence":"additional","affiliation":[{"name":"Department of Automation, Tsinghua University, China"}]}],"member":"179","published-online":{"date-parts":[[2021,8,28]]},"reference":[{"doi-asserted-by":"publisher","key":"bibr1-01423312211037847","DOI":"10.1109\/MCS.2005.1499389"},{"doi-asserted-by":"publisher","key":"bibr2-01423312211037847","DOI":"10.1109\/ROBOT.1998.680749"},{"volume-title":"Proceedings of the 21st Annual Conference on Neural Information Processing Systems","year":"2007","author":"Bhatnagar S","key":"bibr3-01423312211037847"},{"doi-asserted-by":"publisher","key":"bibr4-01423312211037847","DOI":"10.1007\/s11071-021-06380-9"},{"doi-asserted-by":"publisher","key":"bibr5-01423312211037847","DOI":"10.3390\/sym11020290"},{"issue":"3","key":"bibr6-01423312211037847","first-page":"279","volume":"8","author":"Dayan P","year":"1992","journal-title":"Machine Learning"},{"doi-asserted-by":"publisher","key":"bibr7-01423312211037847","DOI":"10.1109\/ACC.1994.751712"},{"doi-asserted-by":"publisher","key":"bibr8-01423312211037847","DOI":"10.1109\/CDC.1995.478913"},{"year":"2015","author":"Hausknecht M","journal-title":"arXiv preprint arXiv:1507.06527","key":"bibr9-01423312211037847"},{"doi-asserted-by":"publisher","key":"bibr10-01423312211037847","DOI":"10.1007\/978-3-662-46466-3_21"},{"doi-asserted-by":"publisher","key":"bibr11-01423312211037847","DOI":"10.1017\/S026357471400112X"},{"doi-asserted-by":"publisher","key":"bibr12-01423312211037847","DOI":"10.1109\/TCST.2008.2004349"},{"doi-asserted-by":"publisher","key":"bibr13-01423312211037847","DOI":"10.1163\/016918610X538462"},{"doi-asserted-by":"publisher","key":"bibr14-01423312211037847","DOI":"10.1007\/s12206-015-0442-1"},{"key":"bibr15-01423312211037847","first-page":"2200","volume-title":"IEEE\/RSJ International Conference on Intelligent Robots and Systems","volume":"3","author":"Lee S","year":"2002"},{"year":"2015","author":"Lillicrap TP","journal-title":"arXiv preprint arXiv:1509.02971","key":"bibr16-01423312211037847"},{"doi-asserted-by":"publisher","key":"bibr17-01423312211037847","DOI":"10.1007\/BF00992699"},{"year":"2016","author":"Lipton ZC","journal-title":"arXiv preprint ArXiv:1608.05081","key":"bibr18-01423312211037847"},{"year":"2017","author":"Mahajan A","journal-title":"arXiv preprint arXiv:1706.02999","key":"bibr19-01423312211037847"},{"year":"2013","author":"Mnih V","journal-title":"arXiv preprint arXiv:1312.5602","key":"bibr20-01423312211037847"},{"doi-asserted-by":"publisher","key":"bibr21-01423312211037847","DOI":"10.1038\/nature14236"},{"key":"bibr22-01423312211037847","first-page":"1928","volume-title":"ICML\u201916 Proceedings of the 33rd International Conference on International Conference on Machine Learning","volume":"48","author":"Mnih V","year":"2016"},{"doi-asserted-by":"publisher","key":"bibr23-01423312211037847","DOI":"10.1016\/j.neucom.2007.11.026"},{"doi-asserted-by":"publisher","key":"bibr24-01423312211037847","DOI":"10.1016\/J.MECHATRONICS.2020.102386"},{"year":"2015","author":"Schaul T","journal-title":"arXiv preprint arXiv:1511.05952","key":"bibr25-01423312211037847"},{"unstructured":"Silver D, Lever G, Heess N, et al. (2014). Deterministic Policy Gradient Algorithms. In: Proceedings of The 31st International Conference on Machine Learning. pp. 387\u2013395.","key":"bibr26-01423312211037847"},{"doi-asserted-by":"publisher","key":"bibr27-01423312211037847","DOI":"10.1109\/IECON43393.2020.9254572"},{"doi-asserted-by":"publisher","key":"bibr28-01423312211037847","DOI":"10.1007\/BF00115009"},{"key":"bibr29-01423312211037847","first-page":"1057","volume":"12","author":"Sutton RS","year":"1999","journal-title":"Advances in Neural Information Processing Systems"},{"doi-asserted-by":"publisher","key":"bibr30-01423312211037847","DOI":"10.1109\/AMC.2004.1297665"},{"doi-asserted-by":"publisher","key":"bibr31-01423312211037847","DOI":"10.1109\/TIE.2008.927406"},{"year":"2015","author":"Van Hasselt H","journal-title":"arXiv preprint arXiv:1509.06461","key":"bibr32-01423312211037847"},{"doi-asserted-by":"publisher","key":"bibr33-01423312211037847","DOI":"10.23919\/ACC.2019.8814916"},{"doi-asserted-by":"publisher","key":"bibr34-01423312211037847","DOI":"10.1109\/TASE.2019.2922068"},{"doi-asserted-by":"publisher","key":"bibr35-01423312211037847","DOI":"10.1109\/AIM43001.2020.9158912"},{"doi-asserted-by":"publisher","key":"bibr36-01423312211037847","DOI":"10.1007\/BF00992696"},{"key":"bibr37-01423312211037847","first-page":"1245","volume-title":"Proceedings, 2005 IEEE\/ASME International Conference on Advanced Intelligent Mechatronics","author":"Yamakita M","year":"2005"},{"doi-asserted-by":"publisher","key":"bibr38-01423312211037847","DOI":"10.1109\/IROS.2006.282281"},{"doi-asserted-by":"publisher","key":"bibr39-01423312211037847","DOI":"10.1109\/ACC.2014.6859392"},{"doi-asserted-by":"publisher","key":"bibr40-01423312211037847","DOI":"10.1109\/ROBIO.2018.8665347"},{"doi-asserted-by":"publisher","key":"bibr41-01423312211037847","DOI":"10.1109\/ICRA.2011.5979841"}],"container-title":["Transactions of the Institute of Measurement and Control"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/01423312211037847","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/01423312211037847","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/01423312211037847","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T06:49:37Z","timestamp":1740811777000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/01423312211037847"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,28]]},"references-count":41,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,2]]}},"alternative-id":["10.1177\/01423312211037847"],"URL":"https:\/\/doi.org\/10.1177\/01423312211037847","relation":{},"ISSN":["0142-3312","1477-0369"],"issn-type":[{"type":"print","value":"0142-3312"},{"type":"electronic","value":"1477-0369"}],"subject":[],"published":{"date-parts":[[2021,8,28]]}}}