{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T04:18:18Z","timestamp":1777522698699,"version":"3.51.4"},"reference-count":29,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2013,8,29]],"date-time":"2013-08-29T00:00:00Z","timestamp":1377734400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Adaptive Behavior"],"published-print":{"date-parts":[[2014,2]]},"abstract":"<jats:p>Imitation is an example of social learning in which an individual observes and copies another\u2019s actions. This paper presents a new method for using imitation as a way of enhancing the learning speed of individual agents that employ a well-known reinforcement learning algorithm, namely Q-learning. Compared with other research that uses imitation with reinforcement learning, our method uses imitation of purely observed behaviours to enhance learning, with no internal state access or sharing of experiences between agents. The paper evaluates our imitation-enhanced reinforcement learning approach in both simulation and with real robots in continuous space. 
Both simulation and real robot experimental results show that the learning speed of the group is improved.<\/jats:p>","DOI":"10.1177\/1059712313500503","type":"journal-article","created":{"date-parts":[[2013,8,29]],"date-time":"2013-08-29T21:57:21Z","timestamp":1377813441000},"page":"31-50","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":13,"title":["Embodied imitation-enhanced reinforcement learning in multi-agent systems"],"prefix":"10.1177","volume":"22","author":[{"given":"Mehmet D","family":"Erbas","sequence":"first","affiliation":[{"name":"Istanbul Kemerburgaz University, Faculty of Engineering and Architecture, Istanbul, Turkey"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alan FT","family":"Winfield","sequence":"additional","affiliation":[{"name":"University of the West of England, Faculty of Environment and Technology, Bristol, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Larry","family":"Bull","sequence":"additional","affiliation":[{"name":"University of the West of England, Faculty of Environment and Technology, Bristol, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2013,8,29]]},"reference":[{"key":"bibr1-1059712313500503","author":"Abbeel P.","year":"2004","journal-title":"Proceedings of ICML 2004"},{"key":"bibr2-1059712313500503","first-page":"105","volume":"6","author":"Barto A. G.","year":"2004","journal-title":"Artificial Intelligence"},{"key":"bibr3-1059712313500503","first-page":"551","author":"Bentivegna D. C.","year":"2004","journal-title":"Robotics"},{"key":"bibr4-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-30301-5_60"},{"key":"bibr5-1059712313500503","first-page":"393","volume-title":"Advances in Neural Information Processing Systems: Proceedings of the 1994 Conference","author":"Bradtke S. 
J.","year":"1995"},{"key":"bibr6-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1162\/1064546053278955"},{"key":"bibr7-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1145\/1228716.1228751"},{"key":"bibr8-1059712313500503","volume-title":"Proceedings of the Neural Information Processing Systems 8","author":"Crites R. H.","year":"1996"},{"key":"bibr9-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2004.03.005"},{"key":"bibr10-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1162\/089976600300015961"},{"key":"bibr11-1059712313500503","first-page":"219","volume-title":"Proceedings of the Eleventh European Conference on the Synthesis and Simulation of Living Systems (ECAL 2011)","author":"Erbas M. D.","year":"2011"},{"key":"bibr12-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1080\/088395198117596"},{"key":"bibr13-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2007.4399449"},{"key":"bibr14-1059712313500503","volume-title":"Proceedings of 10th International RoboCup Symposium","author":"Latzke T.","year":"2006"},{"key":"bibr15-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1016\/j.micpro.2010.08.002"},{"key":"bibr16-1059712313500503","first-page":"59","volume-title":"9th Conference on Autonomous Robot Systems and Competitions","author":"Mondada F.","year":"2009"},{"key":"bibr17-1059712313500503","doi-asserted-by":"crossref","first-page":"41","DOI":"10.7551\/mitpress\/3676.003.0003","volume-title":"Imitation in animals and artifacts","author":"Nehaniv C. 
L.","year":"2002"},{"key":"bibr18-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511489808"},{"key":"bibr19-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511489808.027"},{"key":"bibr20-1059712313500503","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/2889.001.0001"},{"key":"bibr21-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2006.282564"},{"key":"bibr22-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1613\/jair.898"},{"key":"bibr23-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1038\/323533a0"},{"key":"bibr24-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1177\/105971239700600201"},{"key":"bibr25-1059712313500503","first-page":"903","volume-title":"Proceedings of the 17th International Conference on Machine Learning","author":"Smart W. D.","year":"2000"},{"key":"bibr26-1059712313500503","volume-title":"Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems","author":"Strosslin T.","year":"2006"},{"key":"bibr27-1059712313500503","volume-title":"Reinforcement learning","author":"Sutton R. 
S.","year":"1998"},{"issue":"3","key":"bibr28-1059712313500503","first-page":"272","volume":"8","author":"Watkins C.","year":"1992","journal-title":"Machine Learning"},{"key":"bibr29-1059712313500503","doi-asserted-by":"publisher","DOI":"10.1007\/s12293-011-0063-x"}],"container-title":["Adaptive Behavior"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1059712313500503","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1059712313500503","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1059712313500503","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T16:18:25Z","timestamp":1777393105000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1059712313500503"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,8,29]]},"references-count":29,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2014,2]]}},"alternative-id":["10.1177\/1059712313500503"],"URL":"https:\/\/doi.org\/10.1177\/1059712313500503","relation":{},"ISSN":["1059-7123","1741-2633"],"issn-type":[{"value":"1059-7123","type":"print"},{"value":"1741-2633","type":"electronic"}],"subject":[],"published":{"date-parts":[[2013,8,29]]}}}