{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:31:40Z","timestamp":1760243500976,"version":"build-2065373602"},"reference-count":39,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2013,7,10]],"date-time":"2013-07-10T00:00:00Z","timestamp":1373414400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/3.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Robotics"],"abstract":"<jats:p>As a powerful and intelligent machine learning method, reinforcement learning (RL) has been widely used in many fields such as game theory, adaptive control, multi-agent systems, nonlinear forecasting, and so on. The main strength of this technique is its use of exploration and exploitation to find optimal or near-optimal solutions to goal-directed problems. However, when RL is applied to multi-agent systems (MASs), problems such as the \u201ccurse of dimensionality\u201d, the \u201cperceptual aliasing problem\u201d, and the uncertainty of the environment constitute high hurdles to RL. Meanwhile, although RL is inspired by behavioral psychology and uses reward\/punishment from the environment, higher mental factors such as affects, emotions, and motivations are rarely adopted in its learning procedure. In this paper, to address the challenges of agent learning in MASs, we propose a computational motivation function, which adopts the two principal affective factors \u201cArousal\u201d and \u201cPleasure\u201d of Russell\u2019s circumplex model of affect, to improve the learning performance of a conventional RL algorithm named Q-learning (QL). 
Computer simulations of pursuit problems with static and dynamic prey were carried out to compare the proposed method with the conventional QL, and the results showed that the proposed method gives agents a faster and more stable learning performance.<\/jats:p>","DOI":"10.3390\/robotics2030149","type":"journal-article","created":{"date-parts":[[2013,7,10]],"date-time":"2013-07-10T10:56:10Z","timestamp":1373453770000},"page":"149-164","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["An Improved Reinforcement Learning System Using Affective Factors"],"prefix":"10.3390","volume":"2","author":[{"given":"Takashi","family":"Kuremoto","sequence":"first","affiliation":[{"name":"Graduate School of Science and Engineering, Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi 755-8611, Japan"}]},{"given":"Tetsuya","family":"Tsurusaki","sequence":"additional","affiliation":[{"name":"Graduate School of Science and Engineering, Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi 755-8611, Japan"}]},{"given":"Kunikazu","family":"Kobayashi","sequence":"additional","affiliation":[{"name":"School of Information Science & Technology, Aichi Prefectural University, Ibaragabasama 152203, Nagakute, Aichi 480-1198, Japan"}]},{"given":"Shingo","family":"Mabu","sequence":"additional","affiliation":[{"name":"Graduate School of Science and Engineering, Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi 755-8611, Japan"}]},{"given":"Masanao","family":"Obayashi","sequence":"additional","affiliation":[{"name":"Graduate School of Science and Engineering, Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi 755-8611, Japan"}]}],"member":"1968","published-online":{"date-parts":[[2013,7,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Sutton, R.S., and Barto, A.G. (1998). 
Reinforcement Learning: An Introduction, The MIT Press.","DOI":"10.1109\/TNN.1998.712192"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"495","DOI":"10.1016\/S0893-6080(02)00044-8","article-title":"Metalearning and neuromodulation","volume":"15","author":"Doya","year":"2002","journal-title":"Neural Netw."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"275","DOI":"10.1016\/S0004-3702(99)00026-0","article-title":"Cooperative behavior acquisition for mobile robots in dynamically changing real worlds via vision-based reinforcement learning and development","volume":"110","author":"Asada","year":"1999","journal-title":"Artif. Intell."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1177\/0278364907087426","article-title":"Trajectory optimization using reinforcement learning for map exploration","volume":"27","author":"Kollar","year":"2008","journal-title":"Int. J. Robot. Res."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"338","DOI":"10.1109\/5326.704563","article-title":"Fuzzy inference system learning by reinforcement learning","volume":"28","author":"Jouffe","year":"1998","journal-title":"IEEE Trans. Syst. Man Cybern. B"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"526","DOI":"10.1007\/s10015-008-0608-3","article-title":"A robust reinforcement learning using concept of slide mode control","volume":"13","author":"Obayashi","year":"2009","journal-title":"Artif. Life Robot."},{"key":"ref_7","unstructured":"Kuremoto, T., Obayashi, M., Yamamoto, A., and Kobayashi, K. (2003, January 15\u201318). Predicting Chaotic Time Series by Reinforcement Learning. 
Proceedings of the 2nd International Conference on Computational Intelligence, Robotics, and Autonomous Systems, Singapore."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1085","DOI":"10.1007\/11538059_112","article-title":"Nonlinear prediction by reinforcement learning","volume":"3644","author":"Kuremoto","year":"2005","journal-title":"Lect. Note. Comput. Sci."},{"key":"ref_9","unstructured":"Kuremoto, T., Obayashi, M., and Kobayashi, K. (2007, January 24\u201327). Forecasting Time Series by SOFNN with Reinforcement Learning. Proceedings of the 27th Annual International Symposium on Forecasting, Neural Forecasting Competition (NN3), New York, NY, USA."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Weber, C., Elshaw, M., and Mayer, N.M. (2008). Reinforcement Learning, Theory and Applications, InTech.","DOI":"10.5772\/54"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Kuremoto, T., Obayashi, M., Kobayashi, K., Adachi, H., and Yoneda, K. (2008, January 1\u20136). A Reinforcement Learning System for Swarm Behaviors. Proceedings of IEEE World Congress Computational Intelligence (WCCI\/IJCNN 2008), Hong Kong.","DOI":"10.1109\/IJCNN.2008.4634330"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"724","DOI":"10.1108\/17563780911005854","article-title":"Swarm behavior acquisition by a neuro-fuzzy system and reinforcement learning algorithm","volume":"2","author":"Kuremoto","year":"2009","journal-title":"Int. J. Intell. Comput. Cybern."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"675","DOI":"10.1007\/978-3-540-85984-0_81","article-title":"A neuro-fuzzy learning system for adaptive swarm behaviors dealing with continuous state space","volume":"5227","author":"Kuremoto","year":"2008","journal-title":"Lect. Notes Comput. 
Sci."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1517","DOI":"10.1142\/S0218126609005836","article-title":"An improved internal model for swarm formation and adaptive swarm behavior acquisition","volume":"18","author":"Kuremoto","year":"2009","journal-title":"J. Circuit. Syst. Comput."},{"key":"ref_15","first-page":"79","article-title":"Multi-agent systems","volume":"19","author":"Sycara","year":"1998","journal-title":"Artif. Intell. Mag."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"77","DOI":"10.1023\/A:1008819414322","article-title":"Reinforcement learning in multi-robot domain","volume":"4","author":"Mataric","year":"1997","journal-title":"Auton. Robot."},{"key":"ref_17","first-page":"345","article-title":"Hierarchical multi agent reinforcement learning","volume":"12","author":"Makar","year":"2000","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"537","DOI":"10.1007\/978-3-642-34487-9_65","article-title":"Cooperative behavior acquisition using attention degree","volume":"7665","author":"Kobayashi","year":"2012","journal-title":"Lect. Notes Comput. Sci."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"834","DOI":"10.1109\/TSMC.1983.6313077","article-title":"Neuron-like adaptive elements that can solve difficult learning control problems","volume":"13","author":"Barto","year":"1983","journal-title":"IEEE Trans. Syst. Man. Cybern."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"9","DOI":"10.1007\/BF00115009","article-title":"Learning to predict by the method of temporal difference","volume":"3","author":"Sutton","year":"1988","journal-title":"Mach. Learn."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1007\/BF00992698","article-title":"Technical note: Q-learning","volume":"8","author":"Watkins","year":"1992","journal-title":"Mach. 
Learn."},{"key":"ref_22","first-page":"1008","article-title":"Actor-critic algorithms","volume":"12","author":"Konda","year":"2000","journal-title":"Adv. Neural Inf. Process."},{"key":"ref_23","unstructured":"LeDoux, J.E. (1996). The Emotional Brain: The Mysterious Underpinnings of Emotional Life, Simon & Schuster."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"49","DOI":"10.1037\/0708-5591.49.1.49","article-title":"Emotion and cognition in psychotherapy: The transforming power of affect","volume":"49","author":"Greenberg","year":"2008","journal-title":"Can. Psychol."},{"key":"ref_25","first-page":"1390","article-title":"Characteristics of behavior of robots with emotion model","volume":"124","author":"Sato","year":"2004","journal-title":"IEEJ Trans. Electron. Inf. Syst."},{"key":"ref_26","first-page":"1037","article-title":"Emergence of burden sharing of robots with emotion model (in Japanese)","volume":"125","author":"Kusano","year":"2005","journal-title":"IEEJ Trans. Electron. Inf. Syst."},{"key":"ref_27","first-page":"25","article-title":"Promises and problems with the circumplex model of emotion","volume":"Volume 13","author":"Clark","year":"1992","journal-title":"Review of Personality and Social Psychology"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1161","DOI":"10.1037\/h0077714","article-title":"A circumplex model of affect","volume":"39","author":"Russell","year":"1980","journal-title":"J. Personal. Soc. Psychol."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"541","DOI":"10.1007\/978-3-642-15615-1_63","article-title":"Autonomic behaviors of swarm robots driven by emotion and curiosity","volume":"6630","author":"Kuremoto","year":"2010","journal-title":"Lect. Notes Comput. 
Sci."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"501","DOI":"10.1007\/s12559-011-9102-7","article-title":"An improved internal model of autonomous robot by a psychological approach","volume":"3","author":"Kuremoto","year":"2011","journal-title":"Cogn. Comput."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"805","DOI":"10.1037\/0022-3514.76.5.805","article-title":"Core affect, prototypical emotional episodes, and other things called emotion: Dissecting the elephant","volume":"76","author":"Russell","year":"1999","journal-title":"J. Personal. Soc. Psychol."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1037\/0033-295X.110.1.145","article-title":"Core affect and the psychological construction of emotion","volume":"110","author":"Russell","year":"2003","journal-title":"Psychol. Rev."},{"key":"ref_33","unstructured":"Wundt, W. (1897). Outlines of Psychology, Wilhelm Engelmann."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ortony, A., Clore, G., and Collins, A. (1988). The Cognitive Structure of Emotions, Cambridge University Press.","DOI":"10.1017\/CBO9780511571299"},{"key":"ref_35","first-page":"345","article-title":"Reinforcement learning algorithm for partially observable Markov decision problems","volume":"7","author":"Jaakkola","year":"1994","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_36","unstructured":"Agogino, A.K., and Tumer, K. Quicker Q-Learning in Multi-Agent Systems. Available online: http:\/\/archive.org\/details\/nasa_techdoc_20050182925."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1007\/s11031-010-9162-0","article-title":"Composition and consistency of the desired affective state: The role of personality and motivation","volume":"34","author":"Augustine","year":"2010","journal-title":"Motiv. Emot."},{"key":"ref_38","unstructured":"Watanabe, S., Obayashi, M., Kuremoto, T., and Kobayashi, K. (February, January 30). 
A New Decision-Making System of an Agent Based on Emotional Models in Multi-Agent System. Proceedings of the 18th International Symposium on Artificial Life and Robotics, Daejeon, Korea."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"22","DOI":"10.1007\/s12559-009-9008-9","article-title":"Designing conscious systems","volume":"1","author":"Aleksander","year":"2009","journal-title":"Cogn. Comput."}],"container-title":["Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2218-6581\/2\/3\/149\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T21:47:51Z","timestamp":1760219271000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2218-6581\/2\/3\/149"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2013,7,10]]},"references-count":39,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2013,9]]}},"alternative-id":["robotics2030149"],"URL":"https:\/\/doi.org\/10.3390\/robotics2030149","relation":{},"ISSN":["2218-6581"],"issn-type":[{"type":"electronic","value":"2218-6581"}],"subject":[],"published":{"date-parts":[[2013,7,10]]}}}