{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,27]],"date-time":"2025-12-27T07:13:47Z","timestamp":1766819627695,"version":"build-2065373602"},"reference-count":34,"publisher":"MIT Press","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Neural Computation"],"published-print":{"date-parts":[[2021,1]]},"abstract":"<jats:p> This letter proposes a new idea to improve learning efficiency in reinforcement learning (RL) with the actor-critic method used as a muscle controller for posture stabilization of the human arm. Actor-critic RL (ACRL) is used for simulations to realize posture controls in humans or robots using muscle tension control. However, it requires very high computational costs to acquire a better muscle control policy for desirable postures. For efficient ACRL, we focused on embodiment that is supposed to potentially achieve efficient controls in research fields of artificial intelligence or robotics. According to the neurophysiology of motion control obtained from experimental studies using animals or humans, the pedunculopontine tegmental nucleus (PPTn) induces muscle tone suppression, and the midbrain locomotor region (MLR) induces muscle tone promotion. PPTn and MLR modulate the activation levels of mutually antagonizing muscles such as flexors and extensors in a process through which control signals are translated from the substantia nigra reticulata to the brain stem. Therefore, we hypothesized that the PPTn and MLR could control muscle tone, that is, the maximum values of activation levels of mutually antagonizing muscles using different sigmoidal functions for each muscle; then we introduced antagonism function models (AFMs) of PPTn and MLR for individual muscles, incorporating the hypothesis into the process to determine the activation level of each muscle based on the output of the actor in ACRL. <\/jats:p><jats:p> ACRL with AFMs representing the embodiment of muscle tone successfully achieved posture stabilization in five joint motions of the right arm of a human adult male under gravity in predetermined target angles at an earlier period of learning than the learning methods without AFMs. The results obtained from this study suggest that the introduction of embodiment of muscle tone can enhance learning efficiency in posture stabilization disorders of humans or humanoid robots. <\/jats:p>","DOI":"10.1162\/neco_a_01333","type":"journal-article","created":{"date-parts":[[2020,10,20]],"date-time":"2020-10-20T21:25:44Z","timestamp":1603229144000},"page":"129-156","source":"Crossref","is-referenced-by-count":5,"title":["Efficient Actor-Critic Reinforcement Learning With Embodiment of Muscle Tone for Posture Stabilization of the Human Arm"],"prefix":"10.1162","volume":"33","author":[{"given":"Masami","family":"Iwamoto","sequence":"first","affiliation":[{"name":"Toyota Central R&D Labs., Aichi 480-1192 Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Daichi","family":"Kato","sequence":"additional","affiliation":[{"name":"Toyota Central R&D Labs., Aichi 480-1192 Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"281","reference":[{"key":"B1","doi-asserted-by":"publisher","DOI":"10.1243\/EMED_JOUR_1979_008_010_02"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9290(81)90048-8"},{"key":"B3","first-page":"5048","volume-title":"Proceedings of the 31st Conference on Neural Information Processing Systems","author":"Andrychowicz M.","year":"2017"},{"key":"B4","first-page":"215","volume-title":"Models of information processing in the basal ganglia","author":"Houk J. C.","year":"1995"},{"key":"B5","doi-asserted-by":"publisher","DOI":"10.1162\/089976600300015961"},{"key":"B6","doi-asserted-by":"publisher","DOI":"10.1016\/S0959-4388(00)00153-7"},{"issue":"1","key":"B7","first-page":"160","volume":"10","author":"Gans C.","year":"1982","journal-title":"Exercise Sports Sciences Reviews"},{"journal-title":"The implications of embodiment for behavior and cognition: Animal and robotic case studies","year":"2012","author":"Hoffmann M.","key":"B8"},{"key":"B9","first-page":"231","volume":"56","author":"Iwamoto M.","year":"2012","journal-title":"Stapp Car Crash Journal"},{"key":"B10","doi-asserted-by":"publisher","DOI":"10.1109\/IEMBS.2004.1403200"},{"key":"B11","doi-asserted-by":"publisher","DOI":"10.1016\/j.neunet.2008.11.004"},{"volume-title":"Proceedings of the 2018 International IRCOBI Conference on the Biomechanics of Injury","year":"2018","author":"Kato D.","key":"B12"},{"key":"B13","doi-asserted-by":"publisher","DOI":"10.1016\/S1058-2746(97)70049-1"},{"key":"B14","doi-asserted-by":"publisher","DOI":"10.1016\/S0268-0033(99)00081-9"},{"key":"B15","doi-asserted-by":"publisher","DOI":"10.1162\/neco_a_01063"},{"key":"B16","doi-asserted-by":"publisher","DOI":"10.1162\/0899766053011528"},{"key":"B17","doi-asserted-by":"publisher","DOI":"10.1016\/S0021-9290(00)00051-8"},{"key":"B18","doi-asserted-by":"publisher","DOI":"10.1016\/S0021-9290(01)00173-7"},{"key":"B19","doi-asserted-by":"publisher","DOI":"10.1016\/0021-9290(94)00114-J"},{"key":"B20","volume-title":"Kinesiology of the musculoskeletal system: Foundations for rehabilitation","author":"Neumann D.","year":"2010","edition":"2"},{"volume-title":"Computer-aided analysis of mechanical systems","year":"1988","author":"Nikravesh P.","key":"B21"},{"key":"B22","doi-asserted-by":"publisher","DOI":"10.1126\/science.1145803"},{"volume-title":"Proceedings of the 31st Conference on Neural Information Processing Systems","year":"2018","author":"Popov I.","key":"B23"},{"volume-title":"Proceedings of the Conference on the Enhanced Safety of Vehicles","year":"2011","author":"Rooij L.","key":"B24"},{"key":"B25","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.1999.831131"},{"volume-title":"Proceedings of the 31th International Conference on Machine Learning","year":"2014","author":"Silver D.","key":"B26"},{"key":"B27","doi-asserted-by":"publisher","DOI":"10.14802\/jmd.16062"},{"key":"B28","doi-asserted-by":"publisher","DOI":"10.1086\/202916"},{"key":"B29","doi-asserted-by":"publisher","DOI":"10.1115\/1.1531112"},{"key":"B30","doi-asserted-by":"publisher","DOI":"10.1016\/S0021-9290(02)00432-3"},{"key":"B31","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6386109"},{"key":"B32","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4613-9030-5_5"},{"key":"B33","doi-asserted-by":"publisher","DOI":"10.1589\/jpts.26.1079"},{"issue":"4","key":"B34","first-page":"359","volume":"17","author":"Zajac F.","year":"1989","journal-title":"Critical Review in Biomedical Engineering"}],"container-title":["Neural Computation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mitpressjournals.org\/doi\/pdf\/10.1162\/neco_a_01333","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,3,12]],"date-time":"2021-03-12T21:43:54Z","timestamp":1615585434000},"score":1,"resource":{"primary":{"URL":"https:\/\/direct.mit.edu\/neco\/article\/33\/1\/129-156\/95657"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,1]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,1]]}},"alternative-id":["10.1162\/neco_a_01333"],"URL":"https:\/\/doi.org\/10.1162\/neco_a_01333","relation":{},"ISSN":["0899-7667","1530-888X"],"issn-type":[{"type":"print","value":"0899-7667"},{"type":"electronic","value":"1530-888X"}],"subject":[],"published":{"date-parts":[[2021,1]]}}}