{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T21:40:26Z","timestamp":1759700426865,"version":"build-2065373602"},"reference-count":79,"publisher":"Association for Computing Machinery (ACM)","issue":"6","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Hum.-Comput. Interact."],"published-print":{"date-parts":[[2025,10,5]]},"abstract":"<jats:p>\n            Recent advances in Large Language Models (LLMs) have demonstrated their potential as autonomous agents across various tasks. One emerging application is the use of LLMs in playing games. In this work, we explore a practical problem for the gaming industry:\n            <jats:bold>Can LLMs be used to measure game difficulty?<\/jats:bold>\n            We evaluate the feasibility of using LLM agents to test game difficulty, focusing on two widely played games:\n            <jats:italic toggle=\"yes\">Wordle<\/jats:italic>\n            and\n            <jats:italic toggle=\"yes\">Slay the Spire<\/jats:italic>\n            . Our results reveal an interesting finding: although LLMs may not perform as well as the average human player, their performance, when guided by simple, generic prompting techniques, shows a statistically significant and strong correlation with difficulty indicated by human players. This suggests that LLMs could potentially serve as human-like agents for measuring game difficulty during the development process, as their assessments may align closely with those of human players. 
Based on our experiments, we also propose general principles and guidelines for integrating LLMs into the game testing workflow.\n          <\/jats:p>","DOI":"10.1145\/3748634","type":"journal-article","created":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T21:01:11Z","timestamp":1759698071000},"page":"1097-1123","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["LLMs May Not Be Human-Level Players, But They Can Be Testers: Measuring Game Difficulty with LLM Agents"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-7143-2771","authenticated-orcid":false,"given":"Chang","family":"Xiao","sequence":"first","affiliation":[{"name":"Boston University, Boston, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3224-2154","authenticated-orcid":false,"given":"Zixiaofan","family":"Yang","sequence":"additional","affiliation":[{"name":"Columbia University, New York, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,10,5]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","first-page":"102370","DOI":"10.1016\/j.ijhcs.2019.102370","article-title":"Development and validation of the player experience inventory: A scale to measure player experiences at the level of functional and psychosocial consequences","volume":"135","author":"Abeele Vero Vanden","year":"2020","unstructured":"Vero Vanden Abeele, Katta Spiel, Lennart Nacke, Daniel Johnson, and Kathrin Gerling. 2020. Development and validation of the player experience inventory: A scale to measure player experiences at the level of functional and psychosocial consequences. International Journal of Human-Computer Studies, 135 (2020), 102370.","journal-title":"International Journal of Human-Computer Studies"},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 18th International Conference on the Foundations of Digital Games. 
1\u20139.","author":"Acharya Devi","year":"2023","unstructured":"Devi Acharya, Jack Kelly, William Tate, Maxwell Joslyn, Michael Mateas, and Noah Wardrip-Fruin. 2023. Shoelace: A storytelling assistant for GUMSHOE One-2-One. In Proceedings of the 18th International Conference on the Foundations of Digital Games. 1\u20139."},{"key":"e_1_2_1_3_1","volume-title":"International Conference on Machine Learning. 337\u2013371","author":"Aher Gati V","year":"2023","unstructured":"Gati V Aher, Rosa I Arriaga, and Adam Tauman Kalai. 2023. Using large language models to simulate multiple humans and replicate human subject studies. In International Conference on Machine Learning. 337\u2013371."},{"key":"e_1_2_1_4_1","volume-title":"Entertainment Computing\u2013ICEC 2009: 8th International Conference","author":"Aponte Maria-Virginia","year":"2009","unstructured":"Maria-Virginia Aponte, Guillaume Levieux, and St\u00e9phane Natkin. 2009. Scaling the level of difficulty in single player video games. In Entertainment Computing\u2013ICEC 2009: 8th International Conference, Paris, France, September 3-5, 2009. Proceedings 8. 24\u201335."},{"key":"e_1_2_1_5_1","doi-asserted-by":"crossref","first-page":"205","DOI":"10.1016\/j.entcom.2011.04.001","article-title":"Measuring the level of difficulty in single player video games","volume":"2","author":"Aponte Maria-Virginia","year":"2011","unstructured":"Maria-Virginia Aponte, Guillaume Levieux, and Stephane Natkin. 2011. Measuring the level of difficulty in single player video games. Entertainment Computing, 2, 4 (2011), 205\u2013213.","journal-title":"Entertainment Computing"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","first-page":"8","DOI":"10.1109\/MCI.2019.2919363","article-title":"Improving RTS game AI by supervised policy learning, tactical search, and deep reinforcement learning","volume":"14","author":"Barriga Nicolas A","year":"2019","unstructured":"Nicolas A Barriga, Marius Stanescu, Felipe Besoain, and Michael Buro. 2019. 
Improving RTS game AI by supervised policy learning, tactical search, and deep reinforcement learning. IEEE Computational Intelligence Magazine, 14, 3 (2019), 8\u201318.","journal-title":"IEEE Computational Intelligence Magazine"},{"key":"e_1_2_1_7_1","volume-title":"Proceedings of the 19th International Conference on the Foundations of Digital Games. 1\u201310","author":"Bateni Bahar","year":"2024","unstructured":"Bahar Bateni and Jim Whitehead. 2024. Language-Driven Play: Large Language Models as Game-Playing Agents in Slay the Spire. In Proceedings of the 19th International Conference on the Foundations of Digital Games. 1\u201310."},{"key":"e_1_2_1_8_1","unstructured":"Chris. 2024. Wordle Stats \u2013 How Hard is Today\u2019s Wordle? https:\/\/engaging-data.com\/wordle-guess-distribution\/"},{"key":"e_1_2_1_9_1","volume-title":"Mumbai","author":"Constant Thomas","year":"2017","unstructured":"Thomas Constant, Guillaume Levieux, Axel Buendia, and St\u00e9phane Natkin. 2017. From objective to subjective difficulty evaluation in video games. In Human-Computer Interaction-INTERACT 2017: 16th IFIP TC 13 International Conference, Mumbai, India, September 25-29, 2017, Proceedings, Part II 16. 107\u2013127."},{"key":"e_1_2_1_10_1","volume-title":"Flow: The psychology of optimal experience.","author":"Csikszentmihalyi Mihaly","year":"1990","unstructured":"Mihaly Csikszentmihalyi. 1990. Flow: The psychology of optimal experience. New York: Harper & Row."},{"key":"e_1_2_1_11_1","volume-title":"Chessgpt: Bridging policy learning and language modeling. Advances in Neural Information Processing Systems, 36","author":"Feng Xidong","year":"2024","unstructured":"Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, and Jun Wang. 2024. Chessgpt: Bridging policy learning and language modeling. 
Advances in Neural Information Processing Systems, 36 (2024)."},{"key":"e_1_2_1_12_1","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1016\/j.entcom.2014.08.004","article-title":"A methodological approach to identifying and quantifying video game difficulty factors","volume":"5","author":"Fraser James","year":"2014","unstructured":"James Fraser, Michael Katchabaw, and Robert E Mercer. 2014. A methodological approach to identifying and quantifying video game difficulty factors. Entertainment Computing, 5, 4 (2014), 441\u2013449.","journal-title":"Entertainment Computing"},{"key":"e_1_2_1_13_1","unstructured":"PC Gamer. 2024. Shadows of the Erdtree currently has a mixed status on Steam with many of the negative reviews complaining that the bosses are too hard. https:\/\/www.pcgamer.com\/games\/rpg\/shadows-of-the-erdtree-currently-has-a-mixed-status-on-steam-with-many-of-the-negative-reviews-complaining-that-the-bosses-are-too-hard\/"},{"key":"e_1_2_1_14_1","unstructured":"Stan Girard Nicolas Oulianov Pierre-Louis Biojout and Paul-Louis Venard. 2024. LLM Colosseum: Benchmarking LLMs with Street Fighter 3. https:\/\/github.com\/OpenGenerativeAI\/llm-colosseum"},{"key":"e_1_2_1_15_1","unstructured":"Jiawei Gu Xuhui Jiang Zhichao Shi Hexiang Tan Xuehao Zhai Chengjin Xu Wei Li Yinghan Shen Shengjie Ma Honghao Liu et al. 2024. A survey on llm-as-a-judge. arXiv preprint arXiv:2411.15594."},{"key":"e_1_2_1_16_1","volume-title":"2018 IEEE Conference on Computational Intelligence and Games (CIG). 1\u20138.","author":"Guerrero-Romero Cristina","year":"2018","unstructured":"Cristina Guerrero-Romero, Simon M Lucas, and Diego Perez-Liebana. 2018. Using a team of general ai algorithms to assist game design and testing. In 2018 IEEE Conference on Computational Intelligence and Games (CIG). 1\u20138."},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 2019 CHI conference on human factors in computing systems. 
1\u201313","author":"Guzdial Matthew","year":"2019","unstructured":"Matthew Guzdial, Nicholas Liao, Jonathan Chen, Shao-Yu Chen, Shukan Shah, Vishwa Shah, Joshua Reno, Gillian Smith, and Mark O Riedl. 2019. Friend, collaborator, student, manager: How design of an ai-driven game level editor affects creators. In Proceedings of the 2019 CHI conference on human factors in computing systems. 1\u201313."},{"key":"e_1_2_1_18_1","unstructured":"Johannes Heinrich and David Silver. 2016. Deep reinforcement learning from self-play in imperfect-information games. arXiv preprint arXiv:1603.01121."},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","unstructured":"John J Horton. 2023. Large language models as simulated economic agents: What can we learn from homo silicus? National Bureau of Economic Research.","DOI":"10.3386\/w31122"},{"key":"e_1_2_1_20_1","unstructured":"Sihao Hu Tiansheng Huang Fatih Ilhan Selim Tekin Gaowen Liu Ramana Kompella and Ling Liu. 2024. A survey on large language model-based game agents. arXiv preprint arXiv:2404.02039."},{"key":"e_1_2_1_21_1","unstructured":"Sihao Hu Tiansheng Huang and Ling Liu. 2024. Pok\u00e9LLMon: A Human-Parity Agent for Pok\u00e9mon Battles with Large Language Models. arXiv preprint arXiv:2402.01118."},{"key":"e_1_2_1_22_1","unstructured":"Chenghao Huang Yanbo Cao Yinlong Wen Tao Zhou and Yanru Zhang. 2024. PokerGPT: An End-to-End Lightweight Solver for Multi-Player Texas Hold\u2019em via Large Language Model. arXiv preprint arXiv:2401.06781."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the AAAI Workshop on Challenges in Game AI. 4, 1722","author":"Hunicke Robin","year":"2004","unstructured":"Robin Hunicke, Marc LeBlanc, Robert Zubek, et al. 2004. MDA: A formal approach to game design and game research. In Proceedings of the AAAI Workshop on Challenges in Game AI. 
4, 1722."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of DiGRA 2003 Conference: Level Up.","author":"Juul Jesper","year":"2003","unstructured":"Jesper Juul. 2003. The game, the player, the world: Looking for a heart of gameness. In Proceedings of DiGRA 2003 Conference: Level Up."},{"key":"e_1_2_1_25_1","volume-title":"The art of failure: An essay on the pain of playing video games","author":"Juul Jesper","unstructured":"Jesper Juul. 2013. The art of failure: An essay on the pain of playing video games. MIT press."},{"key":"e_1_2_1_26_1","volume-title":"Proceedings of the 18th International Conference on the Foundations of Digital Games. 1\u20134.","author":"Kelly Jack","year":"2023","unstructured":"Jack Kelly, Michael Mateas, and Noah Wardrip-Fruin. 2023. Towards computational support with language models for TTRPG game masters. In Proceedings of the 18th International Conference on the Foundations of Digital Games. 1\u20134."},{"key":"e_1_2_1_27_1","unstructured":"Christian Keszthelyi. 2023. Modl. ai: creating the ultimate AI game testers."},{"key":"e_1_2_1_28_1","unstructured":"Neil Kirby and Heather Hurley. 2011. Introduction to game AI. Course Technology\/Cengage Learning."},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the AAAI conference on artificial intelligence. 31","author":"Lample Guillaume","year":"2017","unstructured":"Guillaume Lample and Devendra Singh Chaplot. 2017. Playing FPS games with deep reinforcement learning. In Proceedings of the AAAI conference on artificial intelligence. 31."},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 15523\u201315536","author":"Li Nian","year":"2024","unstructured":"Nian Li, Chen Gao, Mingyu Li, Yong Li, and Qingmin Liao. 2024. EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities. 
In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 15523\u201315536."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 2019 CHI conference on human factors in computing systems. 1\u201313","author":"Liang Claire","year":"2019","unstructured":"Claire Liang, Julia Proft, Erik Andersen, and Ross A Knepper. 2019. Implicit communication of actionable information in human-ai teams. In Proceedings of the 2019 CHI conference on human factors in computing systems. 1\u201313."},{"key":"e_1_2_1_32_1","volume-title":"NeurIPS 2023 Foundation Models for Decision Making Workshop.","author":"Light Jonathan","year":"2023","unstructured":"Jonathan Light, Min Cai, Sheng Shen, and Ziniu Hu. 2023. Avalonbench: Evaluating llms playing the game of avalon. In NeurIPS 2023 Foundation Models for Decision Making Workshop."},{"key":"e_1_2_1_33_1","volume-title":"Pre-Release Experimentation in Indie Game Development: An Interview Survey. In International Conference on Software Business. 293\u2013308","author":"Lin\u00e5ker Johan","year":"2024","unstructured":"Johan Lin\u00e5ker, Elizabeth Bjarnason, and Fabian Fagerholm. 2024. Pre-Release Experimentation in Indie Game Development: An Interview Survey. In International Conference on Software Business. 293\u2013308."},{"key":"e_1_2_1_34_1","unstructured":"Weiyu Ma Qirui Mi Xue Yan Yuqiao Wu Runji Lin Haifeng Zhang and Jun Wang. 2023. Large language models play starcraft ii: Benchmarks and a chain of summarization approach. arXiv preprint arXiv:2312.11865."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 2005 ACM SIGCHI International Conference on Advances in computer entertainment technology. 421\u2013428","author":"Macleod Alasdair","year":"2005","unstructured":"Alasdair Macleod. 2005. Game design through self-play experiments. In Proceedings of the 2005 ACM SIGCHI International Conference on Advances in computer entertainment technology. 
421\u2013428."},{"key":"e_1_2_1_36_1","volume-title":"Automated social science: Language models as scientist and subjects","author":"Manning Benjamin S","unstructured":"Benjamin S Manning, Kehang Zhu, and John J Horton. 2024. Automated social science: Language models as scientist and subjects. National Bureau of Economic Research."},{"key":"e_1_2_1_37_1","unstructured":"MegaCrit. 2019. Slay the Spire. Video game. Available on Microsoft Windows macOS Linux PlayStation 4 Xbox One Nintendo Switch iOS Android"},{"key":"e_1_2_1_38_1","unstructured":"Volodymyr Mnih Koray Kavukcuoglu David Silver Alex Graves Ioannis Antonoglou Daan Wierstra and Martin Riedmiller. 2013. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602."},{"key":"e_1_2_1_39_1","volume-title":"Indie Game Market - Growth, Trends, COVID-19 Impact, and Forecasts (2024 - 2029)","author":"Mordor Intelligence","year":"2024","unstructured":"Mordor Intelligence. 2024. Indie Game Market - Growth, Trends, COVID-19 Impact, and Forecasts (2024 - 2029). https:\/\/www.mordorintelligence.com\/industry-reports\/indie-game-market Accessed: 2025-07-24"},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1109\/TCIAIG.2013.2286295","article-title":"A survey of real-time strategy game AI research and competition in StarCraft","volume":"5","author":"Onta\u00f1\u00f3n Santiago","year":"2013","unstructured":"Santiago Onta\u00f1\u00f3n, Gabriel Synnaeve, Alberto Uriarte, Florian Richoux, David Churchill, and Mike Preuss. 2013. A survey of real-time strategy game AI research and competition in StarCraft. IEEE Transactions on Computational Intelligence and AI in games, 5, 4 (2013), 293\u2013311.","journal-title":"IEEE Transactions on Computational Intelligence and AI in games"},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of the 36th annual acm symposium on user interface software and technology. 
1\u201322","author":"Park Joon Sung","year":"2023","unstructured":"Joon Sung Park, Joseph O\u2019Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S Bernstein. 2023. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th annual acm symposium on user interface software and technology. 1\u201322."},{"key":"e_1_2_1_42_1","doi-asserted-by":"crossref","unstructured":"Cheng Peng Xi Yang Aokun Chen Kaleb E Smith Nima PourNejatian Anthony B Costa Cheryl Martin Mona G Flores Ying Zhang Tanja Magoc et al. 2023. A study of generative large language model for medical research and healthcare. NPJ digital medicine 6 1 (2023) 210.","DOI":"10.1038\/s41746-023-00958-w"},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence. 30","author":"Perez-Liebana Diego","year":"2016","unstructured":"Diego Perez-Liebana, Spyridon Samothrakis, Julian Togelius, Tom Schaul, and Simon Lucas. 2016. General video game ai: Competition, challenges and opportunities. In Proceedings of the AAAI Conference on Artificial Intelligence. 30."},{"key":"e_1_2_1_44_1","volume-title":"2021 IEEE\/ACM International Conference on Automation of Software Test (AST). 90\u201399","author":"Politowski Cristiano","year":"2021","unstructured":"Cristiano Politowski, Fabio Petrillo, and Yann-Ga\u00ebl Gu\u00e9h\u00e9neuc. 2021. A survey of video game testing. In 2021 IEEE\/ACM International Conference on Automation of Software Test (AST). 90\u201399."},{"key":"e_1_2_1_45_1","unstructured":"Siyuan Qi Shuo Chen Yexin Li Xiangyu Kong Junqi Wang Bangcheng Yang Pring Wong Yifan Zhong Xiaoyuan Zhang Zhaowei Zhang et al. 2024. CivRealm: A learning and reasoning odyssey in Civilization for decision-making agents. arXiv preprint arXiv:2401.10568."},{"key":"e_1_2_1_46_1","unstructured":"Weihong Qi Hanjia Lyu and Jiebo Luo. 2024. Representation Bias in Political Sample Simulations with Large Language Models. 
arXiv preprint arXiv:2407.11409."},{"key":"e_1_2_1_47_1","unstructured":"Qubit Labs. 2024. How Many Game Developers Are There In the World? Surprising Statistics. https:\/\/qubit-labs.com\/how-many-game-developers-are-there-in-the-world-surprising-statistics\/ Accessed: 2025-07-24"},{"key":"e_1_2_1_48_1","unstructured":"Noah Ranella and Markus Eger. 2023. Towards Automated Video Game Commentary Using Generative AI.. In EXAG@ AIIDE."},{"key":"e_1_2_1_49_1","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1007\/s13218-020-00647-w","article-title":"From chess and atari to starcraft and beyond: How game ai is driving the world of ai","volume":"34","author":"Risi Sebastian","year":"2020","unstructured":"Sebastian Risi and Mike Preuss. 2020. From chess and atari to starcraft and beyond: How game ai is driving the world of ai. KI-K\u00fcnstliche Intelligenz, 34, 1 (2020), 7\u201317.","journal-title":"KI-K\u00fcnstliche Intelligenz"},{"key":"e_1_2_1_50_1","unstructured":"Andy Robinson. 2025. Microsoft\u2019s CEO says Xbox \u2019will have a catalog of games using generative AI\u2019. https:\/\/www.videogameschronicle.com\/news\/microsofts-ceo-says-xbox-will-have-a-catalog-of-games-using-generative-ai\/"},{"key":"e_1_2_1_51_1","unstructured":"Andrew Rollings and Ernest Adams. 2003. Andrew Rollings and Ernest Adams on game design. New Riders."},{"key":"e_1_2_1_52_1","unstructured":"Soumadeep Saha Sutanoya Chakraborty Saptarshi Saha and Utpal Garain. 2024. Language Models are Crossword Solvers. arXiv preprint arXiv:2406.09043."},{"key":"e_1_2_1_53_1","unstructured":"Charles P Schultz and Robert Denton Bryant. 2016. Game testing: All in one. Mercury Learning and Information."},{"key":"e_1_2_1_54_1","volume-title":"Role play with large language models. Nature, 623, 7987","author":"Shanahan Murray","year":"2023","unstructured":"Murray Shanahan, Kyle McDonell, and Laria Reynolds. 2023. Role play with large language models. 
Nature, 623, 7987 (2023), 493\u2013498."},{"key":"e_1_2_1_55_1","unstructured":"Kun Shao Zhentao Tang Yuanheng Zhu Nannan Li and Dongbin Zhao. 2019. A survey of deep reinforcement learning in video games. arXiv preprint arXiv:1912.10944."},{"key":"e_1_2_1_56_1","doi-asserted-by":"crossref","unstructured":"David Silver Thomas Hubert Julian Schrittwieser Ioannis Antonoglou Matthew Lai Arthur Guez Marc Lanctot Laurent Sifre Dharshan Kumaran Thore Graepel et al. 2018. A general reinforcement learning algorithm that masters chess shogi and Go through self-play. Science 362 6419 (2018) 1140\u20131144.","DOI":"10.1126\/science.aar6404"},{"key":"e_1_2_1_57_1","unstructured":"Statista. 2023. Number of games released on Steam 2004-2022. https:\/\/www.statista.com\/statistics\/552623\/number-games-released-steam\/ Accessed: 2024-09-06"},{"key":"e_1_2_1_58_1","volume-title":"2018 IEEE conference on computational intelligence and games (CIG). 1\u20138.","author":"\u015awiechowski Maciej","year":"2018","unstructured":"Maciej \u015awiechowski, Tomasz Tajmajer, and Andrzej Janusz. 2018. Improving hearthstone ai by combining mcts and supervised learning algorithms. In 2018 IEEE conference on computational intelligence and games (CIG). 1\u20138."},{"key":"e_1_2_1_59_1","unstructured":"Paul Tassi. 2024. \u2018Elden Ring: Shadow Of The Erdtree\u2019 At Mixed Reviews Because It\u2019s Too Hard. https:\/\/www.forbes.com\/sites\/paultassi\/2024\/06\/22\/elden-ring-shadow-of-the-erdtree-at-mixed-reviews-because-its-too-hard\/"},{"key":"e_1_2_1_60_1","unstructured":"Financial Times. 2024. AI could actually change the gaming industry. https:\/\/www.ft.com\/content\/962ec2b7-88d4-498f-b1c9-83f157eac828\/"},{"key":"e_1_2_1_61_1","unstructured":"Mike Treanor Alexander Zook Mirjam P Eladhari Julian Togelius Gillian Smith Michael Cook Tommy Thompson Brian Magerko John Levine and Adam Smith. 2015. 
AI-based game design patterns."},{"key":"e_1_2_1_62_1","volume-title":"Nelson Adami Andreollo, and Jos\u00e9 Eduardo de Aguilar-Nascimento","author":"Tustumi Francisco","year":"2023","unstructured":"Francisco Tustumi, Nelson Adami Andreollo, and Jos\u00e9 Eduardo de Aguilar-Nascimento. 2023. Future of the language models in healthcare: the role of chatGPT. ABCD. arquivos brasileiros de cirurgia digestiva (s\u00e3o paulo), 36 (2023), e1727."},{"key":"e_1_2_1_63_1","volume-title":"ACM International Conference Proceeding Series. 123","author":"Tychsen Anders","year":"2005","unstructured":"Anders Tychsen, Michael Hitchens, Thea Brolund, and Manolya Kavakli. 2005. The game master. In ACM International Conference Proceeding Series. 123, 215\u2013222."},{"key":"e_1_2_1_64_1","unstructured":"Unity Technologies. 2023. Unity. https:\/\/unity.com\/ Game development platform"},{"key":"e_1_2_1_65_1","volume-title":"Proceedings of the ACM on Human-Computer Interaction, 6, CHI PLAY","author":"Villareale Jennifer","year":"2022","unstructured":"Jennifer Villareale, Casper Harteveld, and Jichen Zhu. 2022. \" I Want To See How Smart This AI Really Is\": Player Mental Model Development of an Adversarial AI Player. Proceedings of the ACM on Human-Computer Interaction, 6, CHI PLAY (2022), 1\u201326."},{"key":"e_1_2_1_66_1","doi-asserted-by":"crossref","unstructured":"Oriol Vinyals Igor Babuschkin Wojciech M Czarnecki Micha\u00ebl Mathieu Andrew Dudzik Junyoung Chung David H Choi Richard Powell Timo Ewalds Petko Georgiev et al. 2019. Grandmaster level in StarCraft II using multi-agent reinforcement learning. nature 575 7782 (2019) 350\u2013354.","DOI":"10.1038\/s41586-019-1724-z"},{"key":"e_1_2_1_67_1","article-title":"Voyager: An Open-Ended Embodied Agent with Large Language Models","author":"Wang Guanzhi","year":"2023","unstructured":"Guanzhi Wang, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, and Anima Anandkumar. 2023. 
Voyager: An Open-Ended Embodied Agent with Large Language Models. Transactions on Machine Learning Research.","journal-title":"Transactions on Machine Learning Research."},{"key":"e_1_2_1_68_1","unstructured":"Wikipedia contributors. 2025. Indie game. https:\/\/en.wikipedia.org\/wiki\/Indie_game Accessed: 2025-07-24"},{"key":"e_1_2_1_69_1","unstructured":"Patrick Y Wu Jonathan Nagler Joshua A Tucker and Solomon Messing. 2023. Large language models can be used to estimate the latent positions of politicians. arXiv preprint arXiv:2303.12057."},{"key":"e_1_2_1_70_1","unstructured":"Zhiheng Xi Wenxiang Chen Xin Guo Wei He Yiwen Ding Boyang Hong Ming Zhang Junzhe Wang Senjie Jin Enyu Zhou et al. 2023. The rise and potential of large language model based agents: A survey. arXiv preprint arXiv:2309.07864."},{"key":"e_1_2_1_71_1","unstructured":"Zelai Xu Chao Yu Fei Fang Yu Wang and Yi Wu. 2023. Language agents with reinforcement learning for strategic play in the werewolf game. arXiv preprint arXiv:2310.18940."},{"key":"e_1_2_1_72_1","article-title":"GPT for Games: An Updated Scoping Review (2020-2024)","author":"Yang Daijin","year":"2025","unstructured":"Daijin Yang, Erica Kleinman, and Casper Harteveld. 2025. GPT for Games: An Updated Scoping Review (2020-2024). IEEE Transactions on Games.","journal-title":"IEEE Transactions on Games."},{"key":"e_1_2_1_73_1","first-page":"908","article-title":"Supervised learning achieves human-level performance in moba games: A case study of honor of kings","volume":"33","author":"Ye Deheng","year":"2020","unstructured":"Deheng Ye, Guibin Chen, Peilin Zhao, Fuhao Qiu, Bo Yuan, Wen Zhang, Sheng Chen, Mingfei Sun, Xiaoqian Li, Siqin Li, et al. 2020. Supervised learning achieves human-level performance in moba games: A case study of honor of kings. 
IEEE Transactions on Neural Networks and Learning Systems, 33, 3 (2020), 908\u2013918.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_2_1_74_1","volume-title":"Game accessibility: a survey. Universal Access in the information Society, 10","author":"Yuan Bei","year":"2011","unstructured":"Bei Yuan, Eelke Folmer, and Frederick C Harris. 2011. Game accessibility: a survey. Universal Access in the information Society, 10 (2011), 81\u2013100."},{"key":"e_1_2_1_75_1","volume-title":"Ufo: A ui-focused agent for windows os interaction. arXiv preprint arXiv:2402.07939.","author":"Zhang Chaoyun","year":"2024","unstructured":"Chaoyun Zhang, Liqun Li, Shilin He, Xu Zhang, Bo Qiao, Si Qin, Minghua Ma, Yu Kang, Qingwei Lin, Saravan Rajmohan, et al. 2024. Ufo: A ui-focused agent for windows os interaction. arXiv preprint arXiv:2402.07939."},{"key":"e_1_2_1_76_1","unstructured":"Yadong Zhang Shaoguang Mao Tao Ge Xun Wang Adrian de Wynter Yan Xia Wenshan Wu Ting Song Man Lan and Furu Wei. 2024. LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models. arXiv preprint arXiv:2404.01230."},{"key":"e_1_2_1_77_1","volume-title":"Assistants. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment. 19","author":"Zhu Andrew","year":"2023","unstructured":"Andrew Zhu, Lara Martin, Andrew Head, and Chris Callison-Burch. 2023. CALYPSO: LLMs as Dungeon Master\u2019s Assistants. In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment. 19, 380\u2013390."},{"key":"e_1_2_1_78_1","volume-title":"Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1\u201317","author":"Zhu Jichen","year":"2021","unstructured":"Jichen Zhu, Jennifer Villareale, Nithesh Javvaji, Sebastian Risi, Mathias L\u00f6we, Rush Weigelt, and Casper Harteveld. 2021. Player-AI interaction: What neural network games reveal about AI as play. 
In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems. 1\u201317."},{"key":"e_1_2_1_79_1","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1162\/coli_a_00502","article-title":"Can large language models transform computational social science","volume":"50","author":"Ziems Caleb","year":"2024","unstructured":"Caleb Ziems, William Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, and Diyi Yang. 2024. Can large language models transform computational social science? Computational Linguistics, 50, 1 (2024), 237\u2013291.","journal-title":"Computational Linguistics"}],"container-title":["Proceedings of the ACM on Human-Computer Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3748634","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,5]],"date-time":"2025-10-05T21:07:34Z","timestamp":1759698454000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3748634"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,5]]},"references-count":79,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,10,5]]}},"alternative-id":["10.1145\/3748634"],"URL":"https:\/\/doi.org\/10.1145\/3748634","relation":{},"ISSN":["2573-0142"],"issn-type":[{"value":"2573-0142","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,5]]},"assertion":[{"value":"2025-02-19","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-07-24","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-05","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}