{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:23:06Z","timestamp":1750220586572,"version":"3.41.0"},"publisher-location":"New York, NY, USA","reference-count":39,"publisher":"ACM","license":[{"start":{"date-parts":[[2020,11,2]],"date-time":"2020-11-02T00:00:00Z","timestamp":1604275200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2020,11,2]]},"DOI":"10.1145\/3383668.3419938","type":"proceedings-article","created":{"date-parts":[[2020,11,3]],"date-time":"2020-11-03T12:27:35Z","timestamp":1604406455000},"page":"134-139","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["MarioMix"],"prefix":"10.1145","author":[{"given":"Christian","family":"Arzate Cruz","sequence":"first","affiliation":[{"name":"The University of Tokyo, Tokyo, Japan"}]},{"given":"Takeo","family":"Igarashi","sequence":"additional","affiliation":[{"name":"The University of Tokyo, Tokyo, Japan"}]}],"member":"320","published-online":{"date-parts":[[2020,11,3]]},"reference":[{"doi-asserted-by":"publisher","key":"e_1_3_2_2_1_1","DOI":"10.1145\/1015330.1015430"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_2_1","DOI":"10.1609\/aimag.v35i4.2513"},{"unstructured":"Ofra Amir, Ece Kamar, Andrey Kolobov, and Barbara Grosz. 2016. Interactive teaching strategies for agent training. (2016).","key":"e_1_3_2_2_3_1"},{"key":"e_1_3_2_2_4_1","volume-title":"DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback. 
CoRR abs\/1810.11748","author":"Arakawa Riku","year":"2018","unstructured":"Riku Arakawa, Sosuke Kobayashi, Yuya Unno, Yuta Tsuboi, and Shinichi Maeda. 2018. DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback. CoRR abs\/1810.11748 (2018). arXiv:1810.11748 http:\/\/arxiv.org\/abs\/1810.11748"},{"key":"e_1_3_2_2_5_1","volume-title":"Automated Video Game Testing Using Synthetic and Human-Like Agents","author":"Ariyurek Sinan","year":"2019","unstructured":"Sinan Ariyurek, Aysu Betin-Can, and Elif Surer. 2019. Automated Video Game Testing Using Synthetic and Human-Like Agents. IEEE Transactions on Games (2019)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_6_1","DOI":"10.1145\/3357236.3395525"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_7_1","DOI":"10.3390\/app8122453"},{"unstructured":"Francisco Elizalde and Luis Enrique Sucar. 2009. Expert Evaluation of Probabilistic Explanations. In ExaCt. 1--12.","key":"e_1_3_2_2_8_1"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_9_1","DOI":"10.1145\/3242671.3242709"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_10_1","DOI":"10.3390\/make1010002"},{"volume-title":"Proceedings of the 8th International Conference on Intelligent User Interfaces","author":"Fails Jerry Alan","unstructured":"Jerry Alan Fails and Dan R. Olsen, Jr. 2003. Interactive Machine Learning. 
In Proceedings of the 8th International Conference on Intelligent User Interfaces (Miami, Florida, USA) (IUI '03). ACM, New York, NY, USA, 39--45. https:\/\/doi.org\/10.1145\/604045.604056","key":"e_1_3_2_2_11_1"},{"key":"e_1_3_2_2_12_1","volume-title":"International Conference on Machine Learning. 49--58","author":"Finn Chelsea","year":"2016","unstructured":"Chelsea Finn, Sergey Levine, and Pieter Abbeel. 2016. Guided cost learning: Deep inverse optimal control via policy optimization. In International Conference on Machine Learning. 49--58."},{"key":"e_1_3_2_2_13_1","volume-title":"Visualizing and understanding atari agents. arXiv preprint arXiv:1711.00138","author":"Greydanus Sam","year":"2017","unstructured":"Sam Greydanus, Anurag Koul, Jonathan Dodge, and Alan Fern. 2017. Visualizing and understanding atari agents. arXiv preprint arXiv:1711.00138 (2017)."},{"unstructured":"Shane Griffith, Kaushik Subramanian, Jonathan Scholz, Charles L Isbell, and Andrea L Thomaz. 2013. Policy shaping: Integrating human feedback with reinforcement learning. In Advances in neural information processing systems. 
2625--2633.","key":"e_1_3_2_2_14_1"},{"unstructured":"Jonathan Ho and Stefano Ermon. 2016. Generative adversarial imitation learning. In Advances in Neural Information Processing Systems. 4565--4573.","key":"e_1_3_2_2_15_1"},{"key":"e_1_3_2_2_16_1","volume-title":"Visual analytics in deep learning: An interrogative survey for the next frontiers","author":"Hohman Fred Matthew","year":"2018","unstructured":"Fred Matthew Hohman, Minsuk Kahng, Robert Pienta, and Duen Horng Chau. 2018. Visual analytics in deep learning: An interrogative survey for the next frontiers. IEEE transactions on visualization and computer graphics (2018)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_17_1","DOI":"10.1145\/2908812.2908920"},{"key":"e_1_3_2_2_18_1","volume-title":"Why: Natural explanations from a robot navigator. arXiv preprint arXiv:1709.09741","author":"Korpan Raj","year":"2017","unstructured":"Raj Korpan, Susan L Epstein, Anoop Aroor, and Gil Dekel. 2017. Why: Natural explanations from a robot navigator. arXiv preprint arXiv:1709.09741 (2017)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_19_1","DOI":"10.1145\/3277904"},{"key":"e_1_3_2_2_20_1","volume-title":"Scalable agent alignment via reward modeling: a research direction. arXiv preprint arXiv:1811.07871","author":"Leike Jan","year":"2018","unstructured":"Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, and Shane Legg. 2018. 
Scalable agent alignment via reward modeling: a research direction. arXiv preprint arXiv:1811.07871 (2018)."},{"key":"e_1_3_2_2_21_1","volume-title":"12th International Conference on Autonomous Agents and Multiagent Systems 2013, AAMAS 2013 2.","author":"Li Guangliang","year":"2013","unstructured":"Guangliang Li, Hayley Hung, Shimon Whiteson, and W Knox. 2013. Using Informative Behavior to Increase Engagement in the TAMER Framework. 12th International Conference on Autonomous Agents and Multiagent Systems 2013, AAMAS 2013 2."},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_22_1","DOI":"10.5555\/3305890.3305917"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_23_1","DOI":"10.1016\/j.jvlc.2016.10.007"},{"key":"e_1_3_2_2_24_1","first-page":"278","article-title":"Policy invariance under reward transformations: Theory and application to reward shaping","volume":"99","author":"Ng Andrew Y","year":"1999","unstructured":"Andrew Y Ng, Daishi Harada, and Stuart Russell. 1999. Policy invariance under reward transformations: Theory and application to reward shaping. In ICML, Vol. 99. 
278--287.","journal-title":"ICML"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_25_1","DOI":"10.1145\/3242671.3242706"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_26_1","DOI":"10.1109\/TCIAIG.2014.2335273"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_27_1","DOI":"10.24963\/ijcai.2017\/534"},{"key":"e_1_3_2_2_28_1","volume-title":"Daniel Weiskopf, Stephen C. North, and Daniel A. Keim.","author":"Sacha Dominik","year":"2016","unstructured":"Dominik Sacha, Michael Sedlmair, Leishi Zhang, John Aldo Lee, Daniel Weiskopf, Stephen C. North, and Daniel A. Keim. 2016. Human-centered machine learning through interactive visualization: review and open challenges. In ESANN."},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_29_1","DOI":"10.1145\/3270316.3271539"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_30_1","DOI":"10.1109\/CIG.2016.7860436"},{"key":"e_1_3_2_2_31_1","volume-title":"Real-time neuroevolution in the NERO video game","author":"Stanley Kenneth O","year":"2005","unstructured":"Kenneth O Stanley, Bobby D Bryant, and Risto Miikkulainen. 2005. Real-time neuroevolution in the NERO video game. IEEE transactions on evolutionary computation 9, 6 (2005), 653--668."},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_32_1","DOI":"10.1109\/TG.2018.2846639"},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_33_1","DOI":"10.1109\/5.949485"},{"key":"e_1_3_2_2_34_1","volume-title":"The 10th International Conference on Autonomous Agents and Multiagent Systems-Volume 2. 
International Foundation for Autonomous Agents and Multiagent Systems, 617--624","author":"Taylor Matthew E","year":"2011","unstructured":"Matthew E Taylor, Halit Bener Suay, and Sonia Chernova. 2011. Integrating reinforcement learning with human demonstrations of varying ability. In The 10th International Conference on Autonomous Agents and Multiagent Systems-Volume 2. International Foundation for Autonomous Agents and Multiagent Systems, 617--624."},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_35_1","DOI":"10.1007\/978-3-319-78978-1_5"},{"key":"e_1_3_2_2_36_1","volume-title":"Conference'17","author":"Yang Zhao","year":"2018","unstructured":"Zhao Yang, Song Bai, Li Zhang, and Philip HS Torr. 2018. Learn to Interpret Atari Agents. arXiv preprint arXiv:1812.11276 (2018)."},{"doi-asserted-by":"publisher","key":"e_1_3_2_2_37_1","DOI":"10.1109\/TCIAIG.2014.2339221"},{"unstructured":"Brian D Ziebart, Andrew Maas, J Andrew Bagnell, and Anind K Dey. 2008. Maximum entropy inverse reinforcement learning. (2008).","key":"e_1_3_2_2_38_1"},{"key":"e_1_3_2_2_39_1","volume-title":"AAAI Spring Symposium: Human Behavior Modeling. 92","author":"Ziebart Brian D","year":"2009","unstructured":"Brian D Ziebart, Andrew L Maas, J Andrew Bagnell, and Anind K Dey. 2009. 
Human Behavior Modeling with Maximum Entropy Inverse Optimal Control. In AAAI Spring Symposium: Human Behavior Modeling. 92."}],"event":{"sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"],"acronym":"CHI PLAY '20","name":"CHI PLAY '20: The Annual Symposium on Computer-Human Interaction in Play","location":"Virtual Event Canada"},"container-title":["Extended Abstracts of the 2020 Annual Symposium on Computer-Human Interaction in Play"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3383668.3419938","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3383668.3419938","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:31:40Z","timestamp":1750195900000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3383668.3419938"}},"subtitle":["Creating Aligned Playstyles for Bots with Interactive Reinforcement Learning"],"short-title":[],"issued":{"date-parts":[[2020,11,2]]},"references-count":39,"alternative-id":["10.1145\/3383668.3419938","10.1145\/3383668"],"URL":"https:\/\/doi.org\/10.1145\/3383668.3419938","relation":{},"subject":[],"published":{"date-parts":[[2020,11,2]]},"assertion":[{"value":"2020-11-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}