{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:23:14Z","timestamp":1766067794133,"version":"3.40.3"},"publisher-location":"Cham","reference-count":7,"publisher":"Springer Nature Switzerland","isbn-type":[{"type":"print","value":"9783031376481"},{"type":"electronic","value":"9783031376498"}],"license":[{"start":{"date-parts":[[2023,1,1]],"date-time":"2023-01-01T00:00:00Z","timestamp":1672531200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,7,25]],"date-time":"2023-07-25T00:00:00Z","timestamp":1690243200000},"content-version":"vor","delay-in-days":205,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2023]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Reinforcement learning (RL) is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning. Reinforcement learning algorithms have become very popular in simple computer games and games like chess and GO. However, playing classical arcade fighting games would be challenging because of the complexity of the command system (the character makes moves according to the sequence of input) and combo system. In this paper, a creation of a game environment of The King of Fighters \u201997 (KOF \u201997), which implements the open gym env interface, is described. Based on the characteristics of the game, an innovative approach to represent the observations from the last few steps has been proposed, which guarantees the preservation of Markov\u2019s property. The observations are coded using the \u201cone-hot encoding\u201d technique to form a binary vector, while the sequence of stacked vectors from successive steps creates a binary image. This image encodes the character\u2019s input and behavioural pattern, which are then retrieved and recognized by the CNN network. A network structure based on the Advantage Actor-Critic network was proposed. In the experimental verification, the RL agent performing basic combos and complex moves (including the so-called \u201cdesperation moves\u201d) was able to defeat characters using the highest level of AI built into the game.<\/jats:p>","DOI":"10.1007\/978-3-031-37649-8_5","type":"book-chapter","created":{"date-parts":[[2023,7,25]],"date-time":"2023-07-25T04:02:08Z","timestamp":1690257728000},"page":"45-55","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Representation of\u00a0Observations in\u00a0Reinforcement Learning for\u00a0Playing Arcade Fighting Game"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6160-8330","authenticated-orcid":false,"given":"Huaiyu","family":"Du","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0753-7241","authenticated-orcid":false,"given":"Rafa\u0142","family":"J\u00f3\u017awiak","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,7,25]]},"reference":[{"key":"5_CR1","unstructured":"Firoiu, V., Whitney, W.F., Tenenbaum, J.B.: Beating the world\u2019s best at super smash bros. with deep reinforcement learning. 
arXiv preprint arXiv:1702.06230 (2017)"},{"key":"5_CR2","doi-asserted-by":"crossref","unstructured":"Kim, D.W., Park, S., Yang, S.: Mastering fighting game using deep reinforcement learning with self-play. In: 2020 IEEE Conference on Games (CoG), pp. 576\u2013583. IEEE (2020)","DOI":"10.1109\/CoG47356.2020.9231639"},{"key":"5_CR3","doi-asserted-by":"crossref","unstructured":"Li, Y.J., Chang, H.Y., Lin, Y.J., Wu, P.W., Wang, Y.C.F.: Deep reinforcement learning for playing 2.5 d fighting games. In: 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 3778\u20133782. IEEE (2018)","DOI":"10.1109\/ICIP.2018.8451491"},{"key":"5_CR4","unstructured":"M-J-Murray: M-j-murray\/mametoolkit: A python toolkit used to train reinforcement learning algorithms against arcade games. https:\/\/github.com\/M-J-Murray\/MAMEToolkit"},{"key":"5_CR5","doi-asserted-by":"publisher","first-page":"212","DOI":"10.1109\/TG.2021.3049539","volume":"14","author":"I Oh","year":"2021","unstructured":"Oh, I., Rho, S., Moon, S., Son, S., Lee, H., Chung, J.: Creating pro-level AI for a real-time fighting game using deep reinforcement learning. IEEE Trans. Games 14, 212\u2013220 (2021)","journal-title":"IEEE Trans. Games"},{"key":"5_CR6","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)"},{"issue":"7782","key":"5_CR7","doi-asserted-by":"publisher","first-page":"350","DOI":"10.1038\/s41586-019-1724-z","volume":"575","author":"O Vinyals","year":"2019","unstructured":"Vinyals, O., et al.: Grandmaster level in starcraft II using multi-agent reinforcement learning. Nature 575(7782), 350\u2013354 (2019)","journal-title":"Nature"}],"container-title":["Lecture Notes in Networks and Systems","Digital Interaction and Machine Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-37649-8_5","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,7,25]],"date-time":"2023-07-25T04:10:25Z","timestamp":1690258225000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-37649-8_5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023]]},"ISBN":["9783031376481","9783031376498"],"references-count":7,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-37649-8_5","relation":{},"ISSN":["2367-3370","2367-3389"],"issn-type":[{"type":"print","value":"2367-3370"},{"type":"electronic","value":"2367-3389"}],"subject":[],"published":{"date-parts":[[2023]]},"assertion":[{"value":"25 July 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"MIDI","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Machine Intelligence and Digital Interaction Conference","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2022","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"12 December 2022","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"15 December 2022","order":8,"name":"conference_end_date","label":"Conference End 
Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"midi12022","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/midi2022.opi.org.pl\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}}]}}