{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T19:16:11Z","timestamp":1775848571650,"version":"3.50.1"},"reference-count":244,"publisher":"Elsevier BV","issue":"5","license":[{"start":{"date-parts":[[2025,7,10]],"date-time":"2025-07-10T00:00:00Z","timestamp":1752105600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,7,10]],"date-time":"2025-07-10T00:00:00Z","timestamp":1752105600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100008769","name":"Julius-Maximilians-Universit\u00e4t W\u00fcrzburg","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100008769","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Artif Intell Educ"],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>We conducted a systematic literature review to examine the current research on the application of Reinforcement Learning (RL) in education. RL is a type of Machine Learning that trains an agent to take actions in an environment in order to maximize a reward signal. In recent years, researchers have explored the potential of RL for improving educational outcomes and developing personalized interventions. This systematic review (according to the PRISMA standard) surveys and evaluates 89 manuscripts from three databases (IEEE Xplore, Google Scholar, and ACM) published between 2000 and 2024 with predefined eligibility criteria. 
We examined the following objectives: (1) Educational contexts and evaluation strategies in RL-based educational applications, (2) impact and significance of RL-based applications for cognitive and affective variables, (3) RL algorithms and baselines used in the context of RL in education, (4) adaptation mechanisms in RL-based education, and (5) best practices for implementing RL in education. Our results suggest that RL has shown promise for a range of educational applications, such as enhancing learning outcomes or promoting student engagement. However, there are currently significant challenges and limitations to the use of RL in education, including methodological issues and the need for broader, larger-scale deployments and evaluations with actual users rather than with simulated data alone. Overall, this review provides a comprehensive overview of the current state of research on the application of RL in education and identifies areas where further research is needed to fully realize its potential as a tool for enhancing teaching and learning. 
Additionally, we present a set of best practices for the field, distilling key insights from our systematic review for practical application.<\/jats:p>","DOI":"10.1007\/s40593-025-00494-6","type":"journal-article","created":{"date-parts":[[2025,7,10]],"date-time":"2025-07-10T22:11:39Z","timestamp":1752185499000},"page":"2669-2723","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Reinforcement Learning in Education: A Systematic Literature Review"],"prefix":"10.1007","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1969-8563","authenticated-orcid":false,"given":"Anna","family":"Riedmann","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5960-7921","authenticated-orcid":false,"given":"Philipp","family":"Schaper","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2362-0080","authenticated-orcid":false,"given":"Birgit","family":"Lugrin","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,7,10]]},"reference":[{"key":"494_CR1","unstructured":"Abdelshiheed, M., Hostetter, J. W., Barnes, T., & Chi, M. (2023b). Bridging Declarative, Procedural, and Conditional Metacognitive Knowledge Gap Using Deep Reinforcement Learning. In CogSci\u201923: The 45th Annual Conference of the Cognitive Science Society."},{"key":"494_CR2","doi-asserted-by":"publisher","unstructured":"*Abdelshiheed, M., Hostetter, J. W., Barnes, T., & Chi, M. (2023a). Leveraging deep reinforcement learning for metacognitive interventions across intelligent tutoring systems. In N. Wang, G. Rebolledo-Mendez, N. Matsuda, O. C. Santos, & V. Dimitrova (Eds.), Lecture Notes in Artificial Intelligence: Vol. 13916. Artificial Intelligence in Education: 24th International Conference, AIED 2023, Tokyo, Japan, July 3\u20137, 2023, Proceedings (Vol. 13916, pp.\u00a0291\u2013303). 
Springer Nature Switzerland; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-031-36272-9_24","DOI":"10.1007\/978-3-031-36272-9_24"},{"issue":"7","key":"494_CR3","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3543846","volume":"55","author":"MM Afsar","year":"2023","unstructured":"Afsar, M. M., Crump, T., & Far, B. (2023). Reinforcement learning based recommender systems: A survey. ACM Computing Surveys, 55(7), 1\u201338. https:\/\/doi.org\/10.1145\/3543846","journal-title":"ACM Computing Surveys"},{"key":"494_CR4","unstructured":"*Ai, F., Chen, Y., Guo, Y., Zhao, Y., Wang, Z., Fu, G., & Wang, G. (2019). Concept-aware deep knowledge tracing and exercise recommendation in an online learning system. In Proceedings of The 12th International Conference on Educational Data Mining (EDM 2019) (pp. 240\u2013245)."},{"key":"494_CR5","doi-asserted-by":"publisher","unstructured":"Akalin, N., & Loutfi, A. (2021). Reinforcement learning approaches in social robotics. Sensors, 21(4). https:\/\/doi.org\/10.3390\/s21041292","DOI":"10.3390\/s21041292"},{"key":"494_CR6","doi-asserted-by":"publisher","unstructured":"Akanksha, E., Jyoti, Sharma, N., & Gulati, K. (2021). Review on reinforcement learning, research evolution and scope of application. In 2021 5th International Conference on Computing Methodologies and Communication (ICCMC) (pp.\u00a01416\u20131423). IEEE. https:\/\/doi.org\/10.1109\/ICCMC51019.2021.9418283","DOI":"10.1109\/ICCMC51019.2021.9418283"},{"key":"494_CR7","unstructured":"*Alam, N., Mostafavi, B., Tithi, S. D., Chi, M., & Barnes, T. (2024). How Much Training is Needed? Reducing Training Time using Deep Reinforcement Learning in an Intelligent Tutor. In Proceedings of the 17th International Conference on Educational Data Mining."},{"key":"494_CR8","doi-asserted-by":"publisher","first-page":"89769","DOI":"10.1109\/ACCESS.2023.3305584","volume":"11","author":"S Amin","year":"2023","unstructured":"Amin, S., Uddin, M. I., Alarood, A. A., Mashwani, W. 
K., Alzahrani, A., & Alzahrani, A. O. (2023). Smart e-learning framework for personalized adaptive learning and sequential path recommendations using reinforcement learning. IEEE Access, 11, 89769\u201389790. https:\/\/doi.org\/10.1109\/ACCESS.2023.3305584","journal-title":"IEEE Access"},{"key":"494_CR9","unstructured":"*Ausin, M. S., Azizsoltani, H., Barnes, T., & Chi, M. (2019). Leveraging deep reinforcement learning for pedagogical policy induction in an intelligent tutoring system. In Proceedings of The 12th International Conference on Educational Data Mining (EDM 2019) (pp.\u00a0168\u2013177)."},{"key":"494_CR10","doi-asserted-by":"publisher","unstructured":"*Ausin, M. S., Maniktala, M., Barnes, T., & Chi, M. (2021). Tackling the credit assignment problem in reinforcement learning-induced pedagogical policies with neural networks. In I. Roll, D. McNamara, S. Sosnovsky, R. Luckin, & V. Dimitrova (Eds.), Lecture Notes in Computer Science. Artificial Intelligence in Education: 22nd International Conference, AIED 2021 (Vol. 12748, pp.\u00a0356\u2013368). Springer Nature. https:\/\/doi.org\/10.1007\/978-3-030-78292-4_29","DOI":"10.1007\/978-3-030-78292-4_29"},{"key":"494_CR11","doi-asserted-by":"publisher","DOI":"10.1007\/s40593-022-00312-3","author":"MS Ausin","year":"2022","unstructured":"Ausin, M. S., Maniktala, M., Barnes, T., & Chi, M. (2022). The impact of batch deep reinforcement learning on student performance: A simple act of explanation can go a long way. International Journal of Artificial Intelligence in Education. https:\/\/doi.org\/10.1007\/s40593-022-00312-3","journal-title":"International Journal of Artificial Intelligence in Education"},{"key":"494_CR12","doi-asserted-by":"publisher","unstructured":"Balaji, B., Mallya, S., Genc, S., Gupta, S., Dirac, L., Khare, V., Roy, G., Sun, T., Tao, Y., Townsend, B., Calleja, E., Muralidhara, S., & Karuppasamy, D. (2020). 
Deepracer: Autonomous racing platform for experimentation with sim2real reinforcement learning. In 2020 IEEE International Conference on Robotics and Automation (ICRA) (pp.\u00a02746\u20132754). IEEE. https:\/\/doi.org\/10.1109\/ICRA40945.2020.9197465","DOI":"10.1109\/ICRA40945.2020.9197465"},{"key":"494_CR13","unstructured":"*Barnes, T., Stamper, J., Lehman, L., & Croy, M. (2008). A pilot study on logic proof tutoring using hints generated from historical student data. In Proceedings of Educational Data Mining 2008, The 1st International Conference on Educational Data Mining, Montreal, Qu\u00e9bec, Canada, June 20\u201321, 2008."},{"key":"494_CR14","doi-asserted-by":"publisher","unstructured":"Barnes, T., & Stamper, J. (2008). Toward automatic hint generation for logic proof tutoring using historical student data. In B. P. Woolf, E. A\u00efmeur, R. Nkambou, & S. Lajoie (Eds.), Lecture Notes in Computer Science: Vol. 5091. Intelligent tutoring systems: 9th international conference, ITS 2008, Montreal, Canada, June 23 - 27, 2008; proceedings (Vol. 5091, pp.\u00a0373\u2013382). Springer. https:\/\/doi.org\/10.1007\/978-3-540-69132-7_41","DOI":"10.1007\/978-3-540-69132-7_41"},{"key":"494_CR15","doi-asserted-by":"publisher","unstructured":"*Bassen, J., Balaji, B., Schaarschmidt, M., Thille, C., Painter, J., Zimmaro, D., Games, A., Fast, E., & Mitchell, J. C. (2020). Reinforcement learning for the adaptive scheduling of educational activities. In R. Bernhaupt (Ed.), ACM Digital Library, Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp.\u00a01\u201312). Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3313831.3376518","DOI":"10.1145\/3313831.3376518"},{"key":"494_CR16","unstructured":"Beck, J. (1997). Modeling the student with reinforcement learning. In Machine learning for User Modeling Workshop at the Sixth International Conference on User Modeling."},{"key":"494_CR17","unstructured":"Beck, J. 
E. (1998). Learning to teach with a reinforcement learning agent. In J. Mostow & C. Rich (Eds.), Proceedings of the Fifteenth National Conference on Artificial Intelligence and Tenth Innovative Applications of Artificial Intelligence Conference, AAAI 98, IAAI 98, July 26\u201330, 1998, Madison, Wisconsin, USA (p.\u00a01185). AAAI Press \/ The MIT Press. http:\/\/www.aaai.org\/Library\/AAAI\/1998\/aaai98-181.php"},{"key":"494_CR18","unstructured":"Beck, J. E., Woolf, B. P., & Beal, C. R. (2000). ADVISOR: A machine learning architecture for intelligent tutor construction. In Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence (pp.\u00a0552\u2013557)."},{"key":"494_CR19","doi-asserted-by":"publisher","unstructured":"Beck, J. E., & Woolf, B. P. (2000). High-level student modeling with machine learning. In G. Gauthier (Ed.), Lecture Notes in Computer Science: Vol. 1839. Intelligent tutoring systems: 5th international conference; proceedings (Vol. 1839, pp.\u00a0584\u2013593). Springer. https:\/\/doi.org\/10.1007\/3-540-45108-0_62","DOI":"10.1007\/3-540-45108-0_62"},{"key":"494_CR20","doi-asserted-by":"publisher","unstructured":"Belfer, R., Kochmar, E., & Serban, I. V. (2022). Raising student completion rates with adaptive curriculum and contextual bandits. In M. M. Rodrigo, N. Matsuda, A. I. Cristea, & V. Dimitrova (Eds.), Lecture Notes in Computer Science: Vol. 13355. Artificial Intelligence in Education: 23rd International Conference, AIED 2022, Durham, UK, July 27\u201331, 2022, Proceedings, Part I (1st ed. 2022, Vol. 13355, pp.\u00a0724\u2013730). Springer International Publishing; Imprint Springer. 
https:\/\/doi.org\/10.1007\/978-3-031-11644-5_74","DOI":"10.1007\/978-3-031-11644-5_74"},{"issue":"8","key":"494_CR21","doi-asserted-by":"publisher","first-page":"716","DOI":"10.1073\/pnas.38.8.716","volume":"38","author":"R Bellman","year":"1952","unstructured":"Bellman, R. (1952). On the Theory of Dynamic Programming. Proceedings of the National Academy of Sciences of the United States of America, 38(8), 716\u2013719. https:\/\/doi.org\/10.1073\/pnas.38.8.716","journal-title":"Proceedings of the National Academy of Sciences of the United States of America"},{"issue":"4","key":"494_CR22","doi-asserted-by":"publisher","first-page":"264","DOI":"10.1109\/TCIAIG.2009.2035923","volume":"1","author":"F Bellotti","year":"2009","unstructured":"Bellotti, F., Berta, R., de Gloria, A., & Primavera, L. (2009). Adaptive experience engine for serious games. IEEE Transactions on Computational Intelligence and AI in Games, 1(4), 264\u2013280. https:\/\/doi.org\/10.1109\/TCIAIG.2009.2035923","journal-title":"IEEE Transactions on Computational Intelligence and AI in Games"},{"key":"494_CR23","doi-asserted-by":"publisher","unstructured":"Bennane, A. (2013). Adaptive educational software by applying reinforcement learning. Informatics in Education, 12(1), 13\u201328. https:\/\/doi.org\/10.15388\/infedu.2013.02","DOI":"10.15388\/infedu.2013.02"},{"key":"494_CR24","unstructured":"Bittencourt, I., Tadeu, M., & Costa, E. (2006). Combining AI techniques into a legal agent-based intelligent tutoring system. In Eighteenth International Conference on Software Engineering and Knowledge Engineering-SEKE (Vol. 18, pp.\u00a035\u201340)."},{"key":"494_CR25","doi-asserted-by":"publisher","first-page":"1198","DOI":"10.1016\/j.procs.2020.03.028","volume":"170","author":"M Boussakssou","year":"2020","unstructured":"Boussakssou, M., Hssina, B., & Erittali, M. (2020). Towards an adaptive e-learning system based on q-learning algorithm. Procedia Computer Science, 170, 1198\u20131203. 
https:\/\/doi.org\/10.1016\/j.procs.2020.03.028","journal-title":"Procedia Computer Science"},{"key":"494_CR26","doi-asserted-by":"publisher","unstructured":"*Caro, M. F., Quitian, L., Giraldo, J. C., & Lengua-Cantero, C. (2023). A Formal Model for Personalized Learning Path using Artificial Intelligence for Instructional Planning with a Focus on 21st-Century Skills and Environmental Awareness. In 2023 IEEE Colombian Caribbean Conference (C3) (pp.\u00a01\u20136). IEEE. https:\/\/doi.org\/10.1109\/C358072.2023.10436195","DOI":"10.1109\/C358072.2023.10436195"},{"key":"494_CR27","unstructured":"*Chakraborty, N., Roy, S., Leite, W. L., Faradonbeh, M. K. S., & Michailidis, G. (2021). The effects of a personalized recommendation system on students\u2019 high-stakes achievement scores: A field experiment. In Proceedings of The 14th International Conference on Educational Data Mining (EDM 2021) (pp.\u00a0588\u2013594)."},{"key":"494_CR28","doi-asserted-by":"publisher","DOI":"10.1016\/j.compedu.2020.103836","volume":"150","author":"H Chen","year":"2020","unstructured":"Chen, H., Park, H. W., & Breazeal, C. (2020). Teaching and learning with children: Impact of reciprocal peer learning with a social robot on children\u2019s learning and emotive engagement. Computers & Education, 150, Article 103836. https:\/\/doi.org\/10.1016\/j.compedu.2020.103836","journal-title":"Computers & Education"},{"key":"494_CR29","unstructured":"Chi, M., Jordan, P., VanLehn, K., & Hall, M. (2008). Reinforcement learning based feature selection for developing pedagogically effective tutorial dialogue tactics. In Proceedings of Educational Data Mining 2008 - 1st International Conference on Educational Data Mining (pp.\u00a0258\u2013265)."},{"key":"494_CR30","doi-asserted-by":"publisher","unstructured":"*Chi, M., VanLehn, K., Litman, D., & Jordan, P. (2010). Inducing effective pedagogical strategies using learning context features. In P. de Bra, A. Kobsa, & D. 
Chin (Eds.), Lecture Notes in Computer Science \/ Information Systems and Applications, incl. Internet\/Web, and HCI: Vol. 6075. User Modeling, Adaptation, and Personalization: 18th International Conference, UMAP 2010, Big Island, HI, USA, June 20\u201324, 2010 ; proceedings (Vol. 6075, pp.\u00a0147\u2013158). Springer. https:\/\/doi.org\/10.1007\/978-3-642-13470-8_15","DOI":"10.1007\/978-3-642-13470-8_15"},{"key":"494_CR31","doi-asserted-by":"publisher","unstructured":"*Chi, M., Jordan, P., & VanLehn, K. (2014). When is tutorial dialogue more effective than step-based tutoring? In D. Hutchison, T. Kanade, J. Kittler, J. M. Kleinberg, A. Kobsa, F. Mattern, J. C. Mitchell, M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Steffen, D. Terzopoulos, D. Tygar, G. Weikum, S. Trausan-Matu, K. E. Boyer, M. Crosby, K. Panourgia, & \u015e. Tr\u0103u\u015fan-Matu (Eds.), Lecture Notes in Computer Science: Vol. 8474. Intelligent tutoring systems: 12th international conference, ITS 2014, Honolulu, HI, USA, June 5 - 9, 2014; proceedings (Vol. 8474, pp.\u00a0210\u2013219). Springer. https:\/\/doi.org\/10.1007\/978-3-319-07221-0_25","DOI":"10.1007\/978-3-319-07221-0_25"},{"issue":"1\u20132","key":"494_CR32","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1007\/s11257-010-9093-1","volume":"21","author":"M Chi","year":"2011","unstructured":"Chi, M., VanLehn, K., Litman, D., & Jordan, P. (2011). Empirically evaluating the application of reinforcement learning to the induction of effective and adaptive pedagogical strategies. User Modeling and User-Adapted Interaction, 21(1\u20132), 137\u2013180. https:\/\/doi.org\/10.1007\/s11257-010-9093-1","journal-title":"User Modeling and User-Adapted Interaction"},{"issue":"2","key":"494_CR33","doi-asserted-by":"publisher","first-page":"20","DOI":"10.5281\/ZENODO.3554667","volume":"7","author":"B Clement","year":"2015","unstructured":"Clement, B., Roy, D., Oudeyer, P.-Y., & Lopes, M. (2015). 
Multi-armed bandits for intelligent tutoring systems. Journal of Educational Data Mining, 7(2), 20\u201348. https:\/\/doi.org\/10.5281\/ZENODO.3554667","journal-title":"Journal of Educational Data Mining"},{"key":"494_CR34","doi-asserted-by":"publisher","unstructured":"Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Erlbaum. https:\/\/doi.org\/10.4324\/9780203771587","DOI":"10.4324\/9780203771587"},{"key":"494_CR35","doi-asserted-by":"publisher","unstructured":"Condor, A., & Pardos, Z. (2022). A deep reinforcement learning approach to automatic formative feedback. In Proceedings of the 15th International Conference on Educational Data Mining (pp.\u00a0662\u2013666). https:\/\/doi.org\/10.5281\/zenodo.6853061","DOI":"10.5281\/zenodo.6853061"},{"key":"494_CR36","unstructured":"Croy, M., Barnes, T., & Stamper, J. (2008). Towards an intelligent tutoring system for propositional proof construction. In Proceedings of the 2008 conference on Current Issues in Computing and Philosophy."},{"key":"494_CR37","doi-asserted-by":"publisher","unstructured":"Cui, L. (2023). Research of Intelligent Tutoring System Based on Deep Learning Computer Technology. In 2023 International Conference on Applied Intelligence and Sustainable Computing (ICAISC) (pp.\u00a01\u20137). IEEE. https:\/\/doi.org\/10.1109\/ICAISC58445.2023.10199414","DOI":"10.1109\/ICAISC58445.2023.10199414"},{"issue":"4","key":"494_CR38","doi-asserted-by":"publisher","first-page":"812","DOI":"10.1213\/ANE.0000000000001596","volume":"123","author":"JE Dalton","year":"2016","unstructured":"Dalton, J. E., Bolen, S. D., & Mascha, E. J. (2016). Publication Bias: The Elephant in the Review. Anesthesia and Analgesia, 123(4), 812\u2013813. 
https:\/\/doi.org\/10.1213\/ANE.0000000000001596","journal-title":"Anesthesia and Analgesia"},{"issue":"2","key":"494_CR39","doi-asserted-by":"publisher","first-page":"107","DOI":"10.3233\/DS-200028","volume":"3","author":"F den Hengst","year":"2020","unstructured":"den Hengst, F., Grua, E. M., el Hassouni, A., & Hoogendoorn, M. (2020). Reinforcement learning for personalization: A systematic literature review. Data Science, 3(2), 107\u2013147. https:\/\/doi.org\/10.3233\/DS-200028","journal-title":"Data Science"},{"issue":"6","key":"494_CR40","doi-asserted-by":"publisher","first-page":"2092","DOI":"10.1016\/j.eswa.2012.10.014","volume":"40","author":"FA Dor\u00e7a","year":"2013","unstructured":"Dor\u00e7a, F. A., Lima, L. V., Fernandes, M. A., & Lopes, C. R. (2013). Comparing strategies for modeling students learning styles through reinforcement learning in adaptive and intelligent educational systems: An experimental analysis. Expert Systems with Applications, 40(6), 2092\u20132101. https:\/\/doi.org\/10.1016\/j.eswa.2012.10.014","journal-title":"Expert Systems with Applications"},{"issue":"4","key":"494_CR41","doi-asserted-by":"publisher","first-page":"568","DOI":"10.1007\/s40593-019-00187-x","volume":"29","author":"S Doroudi","year":"2019","unstructured":"Doroudi, S., Aleven, V., & Brunskill, E. (2019). Where\u2019s the reward? International Journal of Artificial Intelligence in Education, 29(4), 568\u2013620. https:\/\/doi.org\/10.1007\/s40593-019-00187-x","journal-title":"International Journal of Artificial Intelligence in Education"},{"issue":"9","key":"494_CR42","doi-asserted-by":"publisher","first-page":"2419","DOI":"10.1007\/s10994-021-05961-4","volume":"110","author":"G Dulac-Arnold","year":"2021","unstructured":"Dulac-Arnold, G., Levine, N., Mankowitz, D. J., Li, J., Paduraru, C., Gowal, S., & Hester, T. (2021). Challenges of real-world reinforcement learning: Definitions, benchmarks and analysis. Machine Learning, 110(9), 2419\u20132468. 
https:\/\/doi.org\/10.1007\/s10994-021-05961-4","journal-title":"Machine Learning"},{"key":"494_CR43","unstructured":"*Efremov, A., Ghosh, A., & Singla, A. K. (2020). Zero-shot learning of hint policy via reinforcement learning and program synthesis. In Proceedings of The 13th International Conference on Educational Data Mining (EDM 2020) (pp.\u00a0388\u2013394)."},{"issue":"7109","key":"494_CR44","doi-asserted-by":"publisher","first-page":"629","DOI":"10.1136\/bmj.315.7109.629","volume":"315","author":"M Egger","year":"1997","unstructured":"Egger, M., Davey Smith, G., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by a simple, graphical test. BMJ\u202f: British Medical Journal, 315(7109), 629\u2013634. https:\/\/doi.org\/10.1136\/bmj.315.7109.629","journal-title":"BMJ : British Medical Journal"},{"issue":"1","key":"494_CR45","doi-asserted-by":"publisher","first-page":"6637","DOI":"10.48084\/etasr.3905","volume":"11","author":"H El Fazazi","year":"2021","unstructured":"El Fazazi, H., Elgarej, M., Qbadou, M., & Mansouri, K. (2021). Design of an adaptive e-learning system based on multi-agent approach and reinforcement learning. Engineering, Technology & Applied Science Research, 11(1), 6637\u20136644. https:\/\/doi.org\/10.48084\/etasr.3905","journal-title":"Engineering, Technology & Applied Science Research"},{"issue":"3","key":"494_CR46","doi-asserted-by":"publisher","first-page":"74","DOI":"10.3390\/informatics10030074","volume":"10","author":"B Fahad Mon","year":"2023","unstructured":"Fahad Mon, B., Wasfi, A., Hayajneh, M., Slim, A., & Abu Ali, N. (2023). Reinforcement learning in education: A literature review. Informatics, 10(3), 74. https:\/\/doi.org\/10.3390\/informatics10030074","journal-title":"Informatics"},{"issue":"21","key":"494_CR47","doi-asserted-by":"publisher","first-page":"23191","DOI":"10.1609\/aaai.v38i21.30365","volume":"38","author":"FM Fahid","year":"2024","unstructured":"Fahid, F. 
M., Rowe, J., Kim, Y., Srivastava, S., & Lester, J. (2024). Online reinforcement learning-based pedagogical planning for narrative-centered learning environments. Proceedings of the AAAI Conference on Artificial Intelligence, 38(21), 23191\u201323199. https:\/\/doi.org\/10.1609\/aaai.v38i21.30365","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"issue":"4","key":"494_CR48","doi-asserted-by":"publisher","first-page":"e10271","DOI":"10.1371\/journal.pone.0010271","volume":"5","author":"D Fanelli","year":"2010","unstructured":"Fanelli, D. (2010). Do pressures to publish increase scientists\u2019 bias? An empirical support from US States Data. PLoS ONE, 5(4), e10271. https:\/\/doi.org\/10.1371\/journal.pone.0010271","journal-title":"PLoS ONE"},{"key":"494_CR49","doi-asserted-by":"publisher","unstructured":"Fenza, G., Orciuoli, F., & Sampson, D. G. (2017). Building adaptive tutoring model using artificial neural networks and reinforcement learning. In M. Chang (Ed.), ICALT 2017: IEEE 17th International Conference on Advanced Learning Technologies: Proceedings: 3\u20137 July 2017, Timi\u015foara, Romania. IEEE. https:\/\/doi.org\/10.1109\/ICALT.2017.124","DOI":"10.1109\/ICALT.2017.124"},{"key":"494_CR50","doi-asserted-by":"publisher","first-page":"605","DOI":"10.3233\/978-1-60750-028-5-605","volume":"200","author":"K Ferguson","year":"2009","unstructured":"Ferguson, K., Woolf, B. P., & Mahadevan, S. (2009). Transfer learning and representation discovery in intelligent tutoring systems. Frontiers in Artificial Intelligence and Applications, 200, 605\u2013607. https:\/\/doi.org\/10.3233\/978-1-60750-028-5-605","journal-title":"Frontiers in Artificial Intelligence and Applications"},{"key":"494_CR51","doi-asserted-by":"publisher","unstructured":"*Fernandes, C. W., Miari, T., Rafatirad, S., & Sayadi, H. (2023). Unleashing the Potential of Reinforcement Learning for Enhanced Personalized Education. 
In 2023 IEEE Frontiers in Education Conference (FIE) (pp.\u00a01\u20135). IEEE. https:\/\/doi.org\/10.1109\/FIE58773.2023.10342902","DOI":"10.1109\/FIE58773.2023.10342902"},{"key":"494_CR52","doi-asserted-by":"publisher","unstructured":"*Flores, A., Alfaro, L., & Herrera, J. (2019). Proposal model for e-learning based on case based reasoning and reinforcement learning. In C. d. Rocha Brito & M. M. Ciampi (Eds.), Modern educational paradigms for computer and engineering career: Proceedings: Edunine2019 - III IEEE World Engineering Education Conference : March 17 to 19, 2019, Lima, Peru (pp.\u00a01\u20136). IEEE. https:\/\/doi.org\/10.1109\/EDUNINE.2019.8875800","DOI":"10.1109\/EDUNINE.2019.8875800"},{"key":"494_CR53","doi-asserted-by":"publisher","unstructured":"*Fotopoulou, E., Zafeiropoulos, A., Feidakis, M., Metafas, D., & Papavassiliou, S. (2020). An interactive recommender system based on reinforcement learning for improving emotional competences in educational groups. In V. Kumar & C. Troussas (Eds.), Programming and Software Engineering: Vol. 12149. Intelligent Tutoring Systems: 16th International Conference, ITS 2020, Athens, Greece, June 8\u201312, 2020, Proceedings (1st ed. 2020, Vol. 12149, pp.\u00a0248\u2013258). Springer International Publishing; Imprint: Springer. https:\/\/doi.org\/10.1007\/978-3-030-49663-0_29","DOI":"10.1007\/978-3-030-49663-0_29"},{"key":"494_CR54","doi-asserted-by":"publisher","unstructured":"El Fouki, M., Aknin, N., & El Kadiri, K. E. (2017). Intelligent adapted e-learning system based on deep reinforcement learning. In J. Zbitou (Ed.), ACM Digital Library, Proceedings of the 2nd International Conference on Computing and Wireless Communication Systems (pp.\u00a01\u20136). ACM. https:\/\/doi.org\/10.1145\/3167486.3167574","DOI":"10.1145\/3167486.3167574"},{"key":"494_CR55","doi-asserted-by":"publisher","unstructured":"Fran\u00e7ois-Lavet, V., Henderson, P., Islam, R., Bellemare, M. G., & Pineau, J. (2018). 
An introduction to deep reinforcement learning. Foundations and Trends\u00ae in Machine Learning, 11(3\u20134), 219\u2013354. https:\/\/doi.org\/10.1561\/2200000071","DOI":"10.1561\/2200000071"},{"key":"494_CR56","doi-asserted-by":"publisher","unstructured":"Frej, J., Shah, N., Knezevic, M., Nazaretsky, T., & K\u00e4ser, T. (2024). Finding Paths for Explainable MOOC Recommendation: A Learner Perspective. In Proceedings of the 14th Learning Analytics and Knowledge Conference (pp.\u00a0426\u2013437). ACM. https:\/\/doi.org\/10.1145\/3636555.3636898","DOI":"10.1145\/3636555.3636898"},{"key":"494_CR57","doi-asserted-by":"publisher","unstructured":"Frenoy, R., Soullard, Y., Thouvenin, I., & Gapenne, O. (2016). Adaptive training environment without prior knowledge. In J. Vassileva (Ed.), Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization (pp.\u00a0131\u2013139). ACM. https:\/\/doi.org\/10.1145\/2930238.2930256","DOI":"10.1145\/2930238.2930256"},{"key":"494_CR58","unstructured":"Gao, Y., Barendregt, W., & Castellano, G. (2017). Personalised human-robot co-adaptation in instructional settings using reinforcement learning. In IVA Workshop on Persuasive Embodied Agents for Behavior Change: PEACH 2017, August 27, Stockholm, Sweden."},{"key":"494_CR59","doi-asserted-by":"publisher","unstructured":"Gao, Y., Barendregt, W., Obaid, M., & Castellano, G. (2018). When robot personalisation does not help: Insights from a robot-supported learning study. In J.-J. Cabibihan (Ed.), IEEE RO-MAN 2018: The 27th IEEE International Symposium on Robot and Human Interactive Communication (pp.\u00a0705\u2013712). IEEE. https:\/\/doi.org\/10.1109\/ROMAN.2018.8525832","DOI":"10.1109\/ROMAN.2018.8525832"},{"key":"494_CR60","unstructured":"*Gao, G., Ju, S., Ausin, M. S., & Chi, M. (2023a). HOPE: Human-centric off-policy evaluation for e-learning and healthcare. 
In AAMAS \u201923, Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (pp.\u00a01504\u20131513)."},{"key":"494_CR61","doi-asserted-by":"publisher","DOI":"10.1007\/s12559-023-10201-z","volume-title":"Improving knowledge learning through modelling students\u2019 practice-based cognitive processes","author":"H Gao","year":"2023","unstructured":"Gao, H., Zeng, Y., Ma, B., & Pan, Y. (2023b). Improving knowledge learning through modelling students\u2019 practice-based cognitive processes. Advance online publication. https:\/\/doi.org\/10.1007\/s12559-023-10201-z"},{"key":"494_CR62","unstructured":"*Georgila, K., Core, M. G., Nye, B. D., Karumbaiah, S., Auerbach, D., & Ram, M. (2019). Using reinforcement learning to optimize the policies of an intelligent tutoring system for interpersonal skills training. In AAMAS \u201919, Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (pp.\u00a0737\u2013745)."},{"issue":"1","key":"494_CR63","first-page":"1573","volume":"16","author":"A Geramifard","year":"2015","unstructured":"Geramifard, A., Dann, C., Klein, R. H., Dabney, W., & How, J. P. (2015). RLPy: A value-function-based reinforcement learning framework for education and research. Journal of Machine Learning Research, 16(1), 1573\u20131578.","journal-title":"Journal of Machine Learning Research"},{"key":"494_CR64","unstructured":"*Gkatzia, D., Hastie, H., Janarthanam, S., & Lemon, O. (2013). Generating student feedback from time-series data using reinforcement learning. In Proceedings of the 14th European Workshop on Natural Language Generation (pp.\u00a0115\u2013124)."},{"issue":"3","key":"494_CR65","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3580510","volume":"17","author":"J Gong","year":"2023","unstructured":"Gong, J., Wan, Y., Liu, Y., Li, X., Zhao, Y., Wang, C., Lin, Y., Fang, X., Feng, W., Zhang, J., & Tang, J. (2023). 
Reinforced MOOCs concept recommendation in heterogeneous information networks. ACM Transactions on the Web, 17(3), 1\u201327. https:\/\/doi.org\/10.1145\/3580510","journal-title":"ACM Transactions on the Web"},{"key":"494_CR66","doi-asserted-by":"crossref","unstructured":"*Gordon, G., Spaulding, S., Westlund, J. K., Lee, J. J., Plummer, L., Martinez, M., Das, M., & Breazeal, C. (2016). Affective personalization of a social robot tutor for children\u2019s second language skills. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1), 3951\u20133957. https:\/\/ojs.aaai.org\/index.php\/aaai\/article\/view\/9914","DOI":"10.1609\/aaai.v30i1.9914"},{"key":"494_CR67","doi-asserted-by":"publisher","unstructured":"Graesser, A. C., Conley, M. W., & Olney, A. (2012). Intelligent tutoring systems. In K. R. Harris, S. Graham, & T. C. Urdan (Eds.), APA handbooks in psychology. APA educational psychology handbook (1st ed., pp.\u00a0451\u2013473). American Psychological Association. https:\/\/doi.org\/10.1037\/13275-018","DOI":"10.1037\/13275-018"},{"key":"494_CR68","doi-asserted-by":"publisher","unstructured":"Guran, A.-M., Cojocar, G.-S., & Dio\u015fan, L.-S. (2021). Towards smart edutainment applications for young children: A proposal. In A. I. Cristea & C. Troussas (Eds.), Springer eBook Collection: Vol. 12677. Intelligent Tutoring Systems: 17th International Conference, ITS 2021, Virtual Event, June 7\u201311, 2021, Proceedings (1st ed. 2021, Vol. 12677, pp.\u00a0439\u2013443). Springer International Publishing; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-030-80421-3_48","DOI":"10.1007\/978-3-030-80421-3_48"},{"key":"494_CR69","doi-asserted-by":"publisher","unstructured":"Haider, S. A., Ahmad, K. M., Zahid, A., AlGhamdi, A., Keshta, I., & Soni, M. (2024). Genetic and Deep Reinforcement Learning-Based Intelligent Course Scheduling for Smart Education. 
In Proceedings of the 2024 International Conference on Artificial Intelligence and Teacher Education (pp.\u00a0117\u2013124). ACM. https:\/\/doi.org\/10.1145\/3702386.3702398","DOI":"10.1145\/3702386.3702398"},{"issue":"3","key":"494_CR70","doi-asserted-by":"publisher","first-page":"522","DOI":"10.1111\/bmsp.12199","volume":"73","author":"R Han","year":"2020","unstructured":"Han, R., Chen, K., & Tan, C. (2020). Curiosity-driven recommendation strategy for adaptive learning via deep reinforcement learning. The British Journal of Mathematical and Statistical Psychology, 73(3), 522\u2013540. https:\/\/doi.org\/10.1111\/bmsp.12199","journal-title":"The British Journal of Mathematical and Statistical Psychology"},{"key":"494_CR71","doi-asserted-by":"publisher","unstructured":"Hare, R., & Tang, Y. (2023). Reinforcement Learning with Experience Sharing for Intelligent Educational Systems. In 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (pp.\u00a01431\u20131436). IEEE. https:\/\/doi.org\/10.1109\/SMC53992.2023.10394095","DOI":"10.1109\/SMC53992.2023.10394095"},{"issue":"3","key":"494_CR72","doi-asserted-by":"publisher","first-page":"387","DOI":"10.1109\/TE.2024.3359001","volume":"67","author":"R Hare","year":"2024","unstructured":"Hare, R., Tang, Y., & Ferguson, S. (2024). An Intelligent Serious Game for Digital Logic Education to Enhance Student Learning. IEEE Transactions on Education, 67(3), 387\u2013394. https:\/\/doi.org\/10.1109\/TE.2024.3359001","journal-title":"IEEE Transactions on Education"},{"key":"494_CR73","doi-asserted-by":"crossref","unstructured":"Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., & Meger, D. (2018). Deep reinforcement learning that matters. In AAAI Conference On Artificial Intelligence (AAAI) (pp. 3207\u20133214).","DOI":"10.1609\/aaai.v32i1.11694"},{"key":"494_CR74","unstructured":"*Hostetter, J. W., Abdelshiheed, M., Barnes, T., & Chi, M. (2023c). 
A self-organizing neuro-fuzzy q-network: Systematic design with offline hybrid learning. In AAMAS \u201923, Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems (pp.\u00a01248\u20131257)."},{"key":"494_CR75","doi-asserted-by":"crossref","unstructured":"Hostetter, J., Conati, C., Yang, X., Abdelshiheed, M., Barnes, T., & Chi, M. (2023b). XAI to increase the effectiveness of an intelligent pedagogical agent. In ACM International Conference on Intelligent Virtual Agents (IVA2023).","DOI":"10.1145\/3570945.3607301"},{"key":"494_CR76","doi-asserted-by":"crossref","unstructured":"*Hostetter, J., Abdelshiheed, M., Barnes, T., & Chi, M. (2023a). Leveraging fuzzy logic towards more explainable reinforcement learning-induced pedagogical policies on intelligent tutoring systems. In IEEE International Conference on Fuzzy Systems (FUZZ 2023).","DOI":"10.1109\/FUZZ52849.2023.10309741"},{"key":"494_CR77","volume-title":"Dynamic programming and Markov processes","author":"RA Howard","year":"1960","unstructured":"Howard, R. A. (1960). Dynamic programming and Markov processes. MIT Press."},{"key":"494_CR78","doi-asserted-by":"publisher","unstructured":"*Huang, Z., Liu, Q., Zhai, C., Yin, Y., Chen, E., Gao, W., & Hu, G. (2019). Exploring multi-objective exercise recommendations in online education systems. In W. Zhu, D. Tao, & X. Cheng (Eds.), ACM Digital Library, CIKM \u201919: Proceedings of the 28th ACM International Conference on Information & Knowledge Management (pp.\u00a01261\u20131270). Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3357384.3357995","DOI":"10.1145\/3357384.3357995"},{"key":"494_CR79","unstructured":"Iglesias, A., Mart\u00ednez, P., & Fern\u00e1ndez, F. (2003). Navigating through the RLATES interface: A web-based adaptive and intelligent educational system. In G.
Goos (Ed.), OTM 2003 Workshops: OTM Confederated International Workshops, HCI-SWWA, IPW, JTRES, WORM, WMS, and WRSM 2003, Catania, Sicily, Italy, November 3\u20137, 2003. Proceedings (1st ed.). Springer Berlin Heidelberg."},{"key":"494_CR80","doi-asserted-by":"publisher","unstructured":"Iglesias, A., Mart\u00ednez, P., & Fern\u00e1ndez, F. (2003). An experience applying reinforcement learning in a web-based adaptive and intelligent educational system. Informatics in Education, 2(2), 223\u2013240. https:\/\/doi.org\/10.15388\/infedu.2003.17","DOI":"10.15388\/infedu.2003.17"},{"issue":"1","key":"494_CR81","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1007\/s10489-008-0115-1","volume":"31","author":"A Iglesias","year":"2009","unstructured":"Iglesias, A., Mart\u00ednez, P., Aler, R., & Fern\u00e1ndez, F. (2009a). Learning teaching strategies in an adaptive and intelligent educational system through reinforcement learning. Applied Intelligence, 31(1), 89\u2013106. https:\/\/doi.org\/10.1007\/s10489-008-0115-1","journal-title":"Applied Intelligence"},{"issue":"4","key":"494_CR82","doi-asserted-by":"publisher","first-page":"266","DOI":"10.1016\/j.knosys.2009.01.007","volume":"22","author":"A Iglesias","year":"2009","unstructured":"Iglesias, A., Mart\u00ednez, P., Aler, R., & Fern\u00e1ndez, F. (2009b). Reinforcement learning of pedagogical policies in adaptive and intelligent educational systems. Knowledge-Based Systems, 22(4), 266\u2013270. https:\/\/doi.org\/10.1016\/j.knosys.2009.01.007","journal-title":"Knowledge-Based Systems"},{"key":"494_CR83","doi-asserted-by":"publisher","unstructured":"*Intayoad, W., Kamyod, C., & Temdee, P. (2018). Reinforcement learning for online learning recommendation system. In The 6th Global Wireless Summit (GWS-2018): November 25\u201328, 2018, Mae Fah Luang University, Chiang Rai (pp.\u00a0167\u2013170). IEEE.
https:\/\/doi.org\/10.1109\/GWS.2018.8686513","DOI":"10.1109\/GWS.2018.8686513"},{"issue":"4","key":"494_CR84","doi-asserted-by":"publisher","first-page":"2917","DOI":"10.1007\/s11277-020-07199-0","volume":"115","author":"W Intayoad","year":"2020","unstructured":"Intayoad, W., Kamyod, C., & Temdee, P. (2020). Reinforcement learning based on contextual bandits for personalized online learning recommendation systems. Wireless Personal Communications, 115(4), 2917\u20132932. https:\/\/doi.org\/10.1007\/s11277-020-07199-0","journal-title":"Wireless Personal Communications"},{"key":"494_CR85","unstructured":"Islam, R., Henderson, P., Gomrokchi, M., & Precup, D. (2017). Reproducibility of benchmarked deep reinforcement learning tasks for continuous control. In ICML Reproducibility in Machine Learning Workshop, ICML\u201917."},{"key":"494_CR86","doi-asserted-by":"publisher","first-page":"155123","DOI":"10.1109\/ACCESS.2021.3128578","volume":"9","author":"MZ Islam","year":"2021","unstructured":"Islam, M. Z., Ali, R., Haider, A., Islam, M. Z., & Kim, H. S. (2021). PAKES: A reinforcement learning-based personalized adaptability knowledge extraction strategy for adaptive learning systems. IEEE Access, 9, 155123\u2013155137. https:\/\/doi.org\/10.1109\/ACCESS.2021.3128578","journal-title":"IEEE Access"},{"key":"494_CR87","unstructured":"JASP Team. (2021). JASP (Version 0.16.0.0) [Computer software]. https:\/\/jasp-stats.org\/"},{"key":"494_CR88","doi-asserted-by":"publisher","unstructured":"Jatzlau, S., Michaeli, T., Seegerer, S., & Romeike, R. (2019). It\u2019s not magic after all: Machine learning in Snap! using reinforcement learning. In 2019 IEEE Blocks and Beyond Workshop (B & B): B & B 2019 : October 18, 2019, Memphis, Tennessee, USA : Proceedings (pp.\u00a037\u201341). IEEE. https:\/\/doi.org\/10.1109\/BB48857.2019.8941208","DOI":"10.1109\/BB48857.2019.8941208"},{"key":"494_CR89","doi-asserted-by":"publisher","unstructured":"*Jeewantha, H. C. R., Gajasinghe, A. 
N., Naidabadu, N. I., Rajapaksha, T. N., Kasthurirathna, D., & Karunasena, A. (2021). English language trainer for non-native speakers using audio signal processing, reinforcement learning, and deep learning. In 21st International Conference on Advances in ICT for Emerging Regions (ICTer) 2021: Conference proceedings : 02nd & 03rd of December 2021, University of Colombo, School of Computing, Colombo, Sri Lanka (pp.\u00a0117\u2013122). IEEE. https:\/\/doi.org\/10.1109\/ICter53630.2021.9774785","DOI":"10.1109\/ICter53630.2021.9774785"},{"key":"494_CR90","doi-asserted-by":"crossref","unstructured":"Johnson, S., & Zaiane, O. R. (2013). Intelligent feedback polarity and timing selection in the Shufti intelligent tutoring system. In International Conference on Computers in Education.","DOI":"10.58459\/icce.2012.547"},{"key":"494_CR91","doi-asserted-by":"publisher","unstructured":"*Ju, S., Zhou, G., Abdelshiheed, M., Barnes, T., & Chi, M. (2021). Evaluating critical reinforcement learning framework in the field. In I. Roll, D. McNamara, S. Sosnovsky, R. Luckin, & V. Dimitrova (Eds.), Lecture Notes in Computer Science. Artificial Intelligence in Education: 22nd International Conference, AIED 2021 (Vol. 12748, pp.\u00a0215\u2013227). Springer Nature. https:\/\/doi.org\/10.1007\/978-3-030-78292-4_18","DOI":"10.1007\/978-3-030-78292-4_18"},{"key":"494_CR92","unstructured":"*Jung, G., Ausin, M. S., Barnes, T., & Chi, M. (2024). More, May not the Better: Insights from Applying Deep Reinforcement Learning for Pedagogical Policy Induction. In Proceedings of the 17th International Conference on Educational Data Mining."},{"issue":"1\u20132","key":"494_CR93","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1016\/S0004-3702(98)00023-X","volume":"101","author":"LP Kaelbling","year":"1998","unstructured":"Kaelbling, L. P., Littman, M. L., & Cassandra, A. R. (1998). Planning and acting in partially observable stochastic domains.
Artificial Intelligence, 101(1\u20132), 99\u2013134. https:\/\/doi.org\/10.1016\/S0004-3702(98)00023-X","journal-title":"Artificial Intelligence"},{"key":"494_CR94","doi-asserted-by":"publisher","first-page":"1248","DOI":"10.1109\/tlt.2024.3372508","volume":"17","author":"YC Kakdas","year":"2024","unstructured":"Kakdas, Y. C., Kockara, S., Halic, T., & Demirel, D. (2024). Enhancing Medical Training Through Learning From Mistakes by Interacting With an Ill-Trained Reinforcement Learning Agent. IEEE Transactions on Learning Technologies, 17, 1248\u20131260. https:\/\/doi.org\/10.1109\/tlt.2024.3372508","journal-title":"IEEE Transactions on Learning Technologies"},{"key":"494_CR95","doi-asserted-by":"publisher","unstructured":"Kandel, A., Ibrahim, I., & Fukuta, N. (2022). An analysis of educational cloud platforms using multi-agent learning. In T. Matsui, K. Takamatsu, & Y. Ono (Eds.), 2022 12th International Congress on Advanced Applied Informatics: IIAI-AAI 2022, Kanazawa, Japan, 2\u20137 July 2022: Proceedings (pp.\u00a0230\u2013233). IEEE. https:\/\/doi.org\/10.1109\/IIAIAAI55812.2022.00053","DOI":"10.1109\/IIAIAAI55812.2022.00053"},{"key":"494_CR96","doi-asserted-by":"publisher","unstructured":"*Kim, S., Kim, W., & Kim, H. (2021). Learning path construction using reinforcement learning and Bloom\u2019s taxonomy. In A. I. Cristea & C. Troussas (Eds.), Springer eBook Collection: Vol. 12677. Intelligent Tutoring Systems: 17th International Conference, ITS 2021, Virtual Event, June 7\u201311, 2021, Proceedings (1st ed. 2021, Vol. 12677, pp.\u00a0267\u2013278). Springer International Publishing; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-030-80421-3_29","DOI":"10.1007\/978-3-030-80421-3_29"},{"key":"494_CR97","doi-asserted-by":"publisher","unstructured":"Kochmar, E., Vu, D. D., Belfer, R., Gupta, V., Serban, I. V., & Pineau, J. (2020). Automated personalized feedback improves learning gains in an intelligent tutoring system. In I. I.
Bittencourt, M. Cukurova, K. Muldner, R. Luckin, & E. Mill\u00e1n (Eds.), Springer eBook Collection: Vol. 12164. Artificial Intelligence in Education: 21st International Conference, AIED 2020, Ifrane, Morocco, July 6\u201310, 2020, Proceedings, Part II (1st ed. 2020, Vol. 12164, pp.\u00a0140\u2013146). Springer International Publishing; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-030-52240-7_26","DOI":"10.1007\/978-3-030-52240-7_26"},{"key":"494_CR98","unstructured":"Konda, V., & Tsitsiklis, J. (1999). Actor-critic algorithms. In S. Solla, T. Leen, & K. M\u00fcller (Eds.), Advances in Neural Information Processing Systems (Vol. 12). MIT Press."},{"issue":"3","key":"494_CR99","first-page":"483","volume":"43","author":"J Koroveshi","year":"2020","unstructured":"Koroveshi, J., & Ktona, A. (2020). Modelling an intelligent tutoring system using reinforcement learning. Knowledge - International Journal, 43(3), 483\u2013487.","journal-title":"Knowledge - International Journal"},{"issue":"3","key":"494_CR100","doi-asserted-by":"publisher","first-page":"10","DOI":"10.5281\/zenodo.4661454","volume":"19","author":"J Koroveshi","year":"2021","unstructured":"Koroveshi, J., & Ktona, A. (2021). Training an intelligent tutoring system using reinforcement learning. International Journal of Computer Science and Information Security (IJCSIS), 19(3), 10\u201318. https:\/\/doi.org\/10.5281\/zenodo.4661454","journal-title":"International Journal of Computer Science and Information Security (IJCSIS)"},{"key":"494_CR101","unstructured":"Kubotani, Y., Fukuhara, Y., & Morishima, S. (2021). RLTutor: Reinforcement learning based adaptive tutoring system by modeling virtual student with fewer interactions. In AI4EDU workshop at IJCAI2021."},{"key":"494_CR102","doi-asserted-by":"publisher","unstructured":"Kumar, A., & Ahuja, N. J. (2020). An adaptive framework of learner model using learner characteristics for intelligent tutoring systems. In S. Choudhury, R. Mishra, R.
G. Mishra, & A. Kumar (Eds.), Springer eBooks Intelligent Technologies and Robotics: Vol. 989. Intelligent Communication, Control and Devices: Proceedings of ICICCD 2018 (1st ed. 2020, Vol. 989, pp.\u00a0425\u2013433). Springer. https:\/\/doi.org\/10.1007\/978-981-13-8618-3_45","DOI":"10.1007\/978-981-13-8618-3_45"},{"key":"494_CR103","unstructured":"Lagoudakis, M. G., & Parr, R. (2003). Least-squares policy iteration. The Journal of Machine Learning Research, 4, 1107\u20131149."},{"key":"494_CR104","doi-asserted-by":"publisher","DOI":"10.1017\/9781108571401","author":"T Lattimore","year":"2020","unstructured":"Lattimore, T., & Szepesv\u00e1ri, C. (2020). Bandit algorithms. Cambridge University Press. https:\/\/doi.org\/10.1017\/9781108571401","journal-title":"Cambridge University Press"},{"key":"494_CR105","doi-asserted-by":"publisher","first-page":"46","DOI":"10.1016\/j.jclinepi.2019.06.014","volume":"115","author":"V Leclercq","year":"2019","unstructured":"Leclercq, V., Beaudart, C., Ajamieh, S., Rabenda, V., Tirelli, E., & Bruy\u00e8re, O. (2019). Meta-analyses indexed in PsycINFO had a better completeness of reporting when they mention PRISMA. Journal of Clinical Epidemiology, 115, 46\u201354. https:\/\/doi.org\/10.1016\/j.jclinepi.2019.06.014","journal-title":"Journal of Clinical Epidemiology"},{"key":"494_CR106","unstructured":"Lee, J. I., & Brunskill, E. (2012). The impact on individualizing student models on necessary practice opportunities. In International Conference on Educational Data Mining (EDM), Chania, Greece."},{"key":"494_CR107","doi-asserted-by":"publisher","unstructured":"Legaspi, R. S., & Sison, R. C. (2002). A machine learning framework for an expert tutor construction. In Proceedings \/ International Conference on Computers in Education: December 3\u20136, 2002, Auckland, New Zealand (pp.\u00a0670\u2013674). IEEE Computer Society.
https:\/\/doi.org\/10.1109\/CIE.2002.1186038","DOI":"10.1109\/CIE.2002.1186038"},{"key":"494_CR108","doi-asserted-by":"publisher","unstructured":"Leite, W. L., Roy, S., Chakraborty, N., Michailidis, G., Huggins-Manley, A. C., D\u2019Mello, S., Shirani Faradonbeh, M. K., Jensen, E., Kuang, H., & Jing, Z. (2022). A novel video recommendation system for algebra: An effectiveness evaluation study. In A. F. Wise (Ed.), ACM Digital Library, Lak22: 12th International Learning Analytics and Knowledge Conference (pp.\u00a0294\u2013303). Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3506860.3506906","DOI":"10.1145\/3506860.3506906"},{"key":"494_CR109","doi-asserted-by":"publisher","unstructured":"Lenhard, W., & Lenhard, A. (2017). Computation of effect sizes. Psychometrica. https:\/\/doi.org\/10.13140\/RG.2.2.17823.92329","DOI":"10.13140\/RG.2.2.17823.92329"},{"key":"494_CR110","doi-asserted-by":"publisher","unstructured":"Li, Q., Xia, W., Yin, L., Jin, J., & Yu, Y. (2024a). Privileged Knowledge State Distillation for Reinforcement Learning-based Educational Path Recommendation. In R. Baeza-Yates & F. Bonchi (Eds.), Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp.\u00a01621\u20131630). ACM. https:\/\/doi.org\/10.1145\/3637528.3671872","DOI":"10.1145\/3637528.3671872"},{"issue":"9","key":"494_CR111","first-page":"1823","volume":"32","author":"J Li","year":"2024","unstructured":"Li, J., Yu, S., & Zhang, T. (2024b). Learning Path Recommendation Based on Reinforcement Learning. Engineering Letters, 32(9), 1823\u20131832.","journal-title":"Engineering Letters"},{"key":"494_CR112","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1016\/j.compedu.2019.01.003","volume":"132","author":"K Li","year":"2019","unstructured":"Li, K. (2019). MOOC learners\u2019 demographics, self-regulated learning strategy, perceived learning and satisfaction: A structural equation modeling approach. 
Computers & Education, 132, 16\u201330. https:\/\/doi.org\/10.1016\/j.compedu.2019.01.003","journal-title":"Computers & Education"},{"issue":"2","key":"494_CR113","doi-asserted-by":"publisher","first-page":"220","DOI":"10.3102\/10769986221129847","volume":"48","author":"X Li","year":"2023","unstructured":"Li, X., Xu, H., Zhang, J., & Chang, H.-H. (2023b). Deep reinforcement learning for adaptive learning systems. Journal of Educational and Behavioral Statistics, 48(2), 220\u2013243. https:\/\/doi.org\/10.3102\/10769986221129847","journal-title":"Journal of Educational and Behavioral Statistics"},{"issue":"34","key":"494_CR114","doi-asserted-by":"publisher","first-page":"24369","DOI":"10.1007\/s00521-023-08989-w","volume":"35","author":"Z Li","year":"2023","unstructured":"Li, Z., Shi, L., Wang, J., Cristea, A. I., & Zhou, Y. (2023a). Sim-GAIL: A generative adversarial imitation learning approach of student modelling for intelligent tutoring systems. Neural Computing and Applications, 35(34), 24369\u201324388. https:\/\/doi.org\/10.1007\/s00521-023-08989-w","journal-title":"Neural Computing and Applications"},{"key":"494_CR115","doi-asserted-by":"publisher","unstructured":"*Liang, K., & You, J. (2024). Research on personalized learning path recommendation model of artificial intelligence in new business. In 2024 4th International Signal Processing, Communications and Engineering Management Conference (ISPCEM) (pp.\u00a0801\u2013806). IEEE. https:\/\/doi.org\/10.1109\/ISPCEM64498.2024.00143","DOI":"10.1109\/ISPCEM64498.2024.00143"},{"key":"494_CR116","doi-asserted-by":"publisher","first-page":"120757","DOI":"10.1109\/ACCESS.2020.3006254","volume":"8","author":"J Lin","year":"2020","unstructured":"Lin, J., Ma, Z., Gomez, R., Nakamura, K., He, B., & Li, G. (2020). A review on interactive reinforcement learning from human social feedback. IEEE Access, 8, 120757\u2013120765. 
https:\/\/doi.org\/10.1109\/ACCESS.2020.3006254","journal-title":"IEEE Access"},{"key":"494_CR117","doi-asserted-by":"publisher","unstructured":"*Liu, S., Chen, Y., Huang, H., Xiao, L., & Hei, X. (2018). Towards smart educational recommendations with reinforcement learning in classroom. In M. J. W. Lee (Ed.), Proceedings of 2018 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE2018) (pp.\u00a01079\u20131084). IEEE. https:\/\/doi.org\/10.1109\/TALE.2018.8615217","DOI":"10.1109\/TALE.2018.8615217"},{"key":"494_CR118","doi-asserted-by":"publisher","unstructured":"*Liu, Q., Tong, S., Liu, C., Zhao, H., Chen, E., Ma, H., & Wang, S. (2019). Exploiting cognitive structure for adaptive learning. In A. Teredesai (Ed.), ACM Digital Library, Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp.\u00a0627\u2013635). Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3292500.3330922","DOI":"10.1145\/3292500.3330922"},{"key":"494_CR119","doi-asserted-by":"publisher","unstructured":"Liu, Y., Tang, W., & Pareek, P. K. (2022). The dynamic mode construction of mixed english learning based on reinforcement learning. In 2nd International Conference on Mobile Networks and Wireless Communications (ICMNWC-2022) (pp.\u00a01\u20135). IEEE. https:\/\/doi.org\/10.1109\/ICMNWC56175.2022.10031831","DOI":"10.1109\/ICMNWC56175.2022.10031831"},{"key":"494_CR120","doi-asserted-by":"publisher","unstructured":"Liu, Y., & Zoghi, B. (2023). Enhancing STEM Education using Machine Learning and Reinforcement Learning Techniques for Educational Software and Serious Games. In L. G\u00f3mez Chova, A. L\u00f3pez Mart\u00ednez, & I. Candel Torres (Eds.), EDULEARN Proceedings, EDULEARN23 Proceedings (pp.\u00a07148\u20137152). IATED. 
https:\/\/doi.org\/10.21125\/edulearn.2023.1871","DOI":"10.21125\/edulearn.2023.1871"},{"key":"494_CR121","doi-asserted-by":"publisher","unstructured":"Liu, F., Hu, X., Liu, S., Bu, C., & Wu, L. (2023). Meta multi-agent exercise recommendation: A game application perspective. In A. Singh (Ed.), ACM Digital Library, Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp.\u00a01441\u20131452). Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3580305.3599429","DOI":"10.1145\/3580305.3599429"},{"key":"494_CR122","doi-asserted-by":"publisher","unstructured":"Loh, H., Shin, D., Lee, S., Baek, J., Hwang, C., Lee, Y., Cha, Y., Kwon, S., Park, J., & Choi, Y. (2021). Recommendation for effective standardized exam preparation. In M. Scheffel (Ed.), ACM Digital Library, Lak21: 11th International Learning Analytics and Knowledge Conference (pp.\u00a0397\u2013404). Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3448139.3448177","DOI":"10.1145\/3448139.3448177"},{"key":"494_CR123","doi-asserted-by":"publisher","unstructured":"L\u00f3pez-L\u00f3pez, J. A., Page, M. J., Lipsey, M. W., & Higgins, J. P. T. (2018). Dealing with effect size multiplicity in systematic reviews and meta-analyses. Research Synthesis Methods, 9(3). https:\/\/doi.org\/10.1002\/jrsm.1310","DOI":"10.1002\/jrsm.1310"},{"key":"494_CR124","unstructured":"Maclellan, C., & Gupta, A. (2021). Learning expert models for educationally relevant tasks using reinforcement learning. In International Conference on Educational Data Mining (EDM)."},{"issue":"10","key":"494_CR125","doi-asserted-by":"publisher","first-page":"3921","DOI":"10.1007\/s12652-019-01627-1","volume":"11","author":"Y Madani","year":"2020","unstructured":"Madani, Y., Ezzikouri, H., Erritali, M., & Hssina, B. (2020). Finding optimal pedagogical content in an adaptive e-learning platform using a new recommendation approach and reinforcement learning. 
Journal of Ambient Intelligence and Humanized Computing, 11(10), 3921\u20133936. https:\/\/doi.org\/10.1007\/s12652-019-01627-1","journal-title":"Journal of Ambient Intelligence and Humanized Computing"},{"issue":"3","key":"494_CR126","doi-asserted-by":"publisher","DOI":"10.1002\/cl2.1256","volume":"18","author":"M Maier","year":"2022","unstructured":"Maier, M., VanderWeele, T. J., & Mathur, M. B. (2022). Using selection models to assess sensitivity to publication bias: A tutorial and call for more routine use. Campbell Systematic Reviews, 18(3), Article e1256. https:\/\/doi.org\/10.1002\/cl2.1256","journal-title":"Campbell Systematic Reviews"},{"key":"494_CR127","unstructured":"*Malpani, A., Ravindran, B., & Murthy, H. (2011). Personalized intelligent tutoring system using reinforcement learning. In Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference, May 18\u201320, 2011, Palm Beach, Florida, USA."},{"key":"494_CR128","unstructured":"*Mandel, T., Liu, Y.-E., Levine, S., Brunskill, E., & Popovic, Z. (2014). Offline policy evaluation across representations with applications to educational games. In AAMAS \u201914, Proceedings of the 2014 International Conference on Autonomous Agents and Multi-Agent Systems (pp.\u00a01077\u20131084). International Foundation for Autonomous Agents and Multiagent Systems."},{"key":"494_CR129","unstructured":"Mandel, T. S. (2017). Better education through improved reinforcement learning [Doctoral dissertation, University of Washington]. ProQuest Dissertations & Theses."},{"key":"494_CR130","doi-asserted-by":"publisher","unstructured":"*Martin, K. N., & Arroyo, I. (2004). AgentX: Using reinforcement learning to improve the effectiveness of intelligent tutoring systems. In J. C. Lester, R. M. Vicari, & F. Paragua\u00e7u (Eds.), Lecture Notes in Computer Science: Vol. 3220. Intelligent Tutoring Systems: 7th International Conference, ITS 2004 Proceedings (Vol.
3220, pp.\u00a0564\u2013572). Springer. https:\/\/doi.org\/10.1007\/978-3-540-30139-4_53","DOI":"10.1007\/978-3-540-30139-4_53"},{"issue":"2","key":"494_CR131","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1177\/0962280211432219","volume":"22","author":"D Mavridis","year":"2013","unstructured":"Mavridis, D., & Salanti, G. (2013). A practical introduction to multivariate meta-analysis. Statistical Methods in Medical Research, 22(2), 133\u2013158. https:\/\/doi.org\/10.1177\/0962280211432219","journal-title":"Statistical Methods in Medical Research"},{"issue":"8","key":"494_CR132","doi-asserted-by":"publisher","first-page":"9325","DOI":"10.1007\/s10639-022-11129-x","volume":"28","author":"C Mazon","year":"2023","unstructured":"Mazon, C., Cl\u00e9ment, B., Roy, D., Oudeyer, P.-Y., & Sauz\u00e9on, H. (2023). Pilot study of an intervention based on an intelligent tutoring system (ITS) for instructing mathematical skills of students with ASD and\/or ID. Education and Information Technologies, 28(8), 9325\u20139354. https:\/\/doi.org\/10.1007\/s10639-022-11129-x","journal-title":"Education and Information Technologies"},{"key":"494_CR133","doi-asserted-by":"publisher","DOI":"10.1016\/j.caeo.2024.100175","volume":"6","author":"B Memarian","year":"2024","unstructured":"Memarian, B., & Doleck, T. (2024). A scoping review of reinforcement learning in education. Computers and Education Open, 6, Article 100175. https:\/\/doi.org\/10.1016\/j.caeo.2024.100175","journal-title":"Computers and Education Open"},{"key":"494_CR134","doi-asserted-by":"publisher","unstructured":"*Ming, G. F., & Hua, S. (2010). Course-scheduling algorithm of option-based hierarchical reinforcement learning. In Z. Hu (Ed.), 2010 Second International Workshop on Education Technology and Computer Science: ETCS 2010, Wuhan, China, 6\u20137 March 2010: Proceedings (pp.\u00a0288\u2013291). IEEE.
https:\/\/doi.org\/10.1109\/ETCS.2010.584","DOI":"10.1109\/ETCS.2010.584"},{"issue":"5","key":"494_CR135","doi-asserted-by":"publisher","first-page":"6389","DOI":"10.1007\/s11042-021-11806-y","volume":"81","author":"SAH Minoofam","year":"2022","unstructured":"Minoofam, S. A. H., Bastanfard, A., & Keyvanpour, M. R. (2022). Ralf: An adaptive reinforcement learning framework for teaching dyslexic students. Multimedia Tools and Applications, 81(5), 6389\u20136412. https:\/\/doi.org\/10.1007\/s11042-021-11806-y","journal-title":"Multimedia Tools and Applications"},{"key":"494_CR136","doi-asserted-by":"publisher","unstructured":"Mirea, A.-M., & Preda, M. C. (2009). Adaptive learning based on exercises fitness degree. In R. Baeza-Yates & P. Boldi (Eds.), 2009 IEEE\/WIC\/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT): WI-IAT 2009 (pp.\u00a0215\u2013218). IEEE. https:\/\/doi.org\/10.1109\/WI-IAT.2009.266","DOI":"10.1109\/WI-IAT.2009.266"},{"key":"494_CR137","doi-asserted-by":"publisher","unstructured":"Mishima, C., & Asada, M. (1999). Active learning from cross perceptual aliasing caused by direct teaching. In Human and environment friendly robots with high intelligence and emotional quotients: Proceedings (pp.\u00a01420\u20131425). IEEE Operations Center. https:\/\/doi.org\/10.1109\/IROS.1999.811678","DOI":"10.1109\/IROS.1999.811678"},{"key":"494_CR138","doi-asserted-by":"publisher","unstructured":"Mitchell, C. M., Boyer, K. E., & Lester, J. C. (2013). A Markov decision process model of tutorial intervention in task-oriented dialogue. In H. C. Lane, K. Yacef, J. Mostow, & A. Graesser (Eds.), Lecture Notes in Computer Science, Lecture Notes in Artificial Intelligence: Vol. 7926. Artificial Intelligence in Education: 16th International Conference, AIED 2013 Proceedings (Vol. 7926, pp.\u00a0828\u2013831). Springer.
https:\/\/doi.org\/10.1007\/978-3-642-39112-5_123","DOI":"10.1007\/978-3-642-39112-5_123"},{"issue":"7540","key":"494_CR139","doi-asserted-by":"publisher","first-page":"529","DOI":"10.1038\/nature14236","volume":"518","author":"V Mnih","year":"2015","unstructured":"Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529\u2013533. https:\/\/doi.org\/10.1038\/nature14236","journal-title":"Nature"},{"key":"494_CR140","doi-asserted-by":"publisher","unstructured":"Moerland, T. M., Broekens, J., Plaat, A., & Jonker, C. M. (2023). Model-based reinforcement learning: A survey. Foundations and Trends\u00ae in Machine Learning, 16(1), 1\u2013118. https:\/\/doi.org\/10.1561\/2200000086","DOI":"10.1561\/2200000086"},{"key":"494_CR141","doi-asserted-by":"publisher","unstructured":"*Mohana, R., Sekhar, K. C., Sen Gupta, S., Punithaasree, K. S., Dorcas E, G., & Muthuperumal, S. (2024). Increasing Learner Engagement in English Language Acquisition Through AI-Powered Gamification. In 2024 International Conference on Artificial Intelligence and Quantum Computation-Based Sensor Application (ICAIQSA) (pp.\u00a01\u20136). IEEE. https:\/\/doi.org\/10.1109\/ICAIQSA64000.2024.10882349","DOI":"10.1109\/ICAIQSA64000.2024.10882349"},{"issue":"1","key":"494_CR142","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1287\/mnsc.28.1.1","volume":"28","author":"GE Monahan","year":"1982","unstructured":"Monahan, G. E. (1982). State of the Art\u2014A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms. Management Science, 28(1), 1\u201316. 
https:\/\/doi.org\/10.1287\/mnsc.28.1.1","journal-title":"Management Science"},{"key":"494_CR143","doi-asserted-by":"publisher","unstructured":"Mu, T., Wang, S., Andersen, E., & Brunskill, E. (2018). Combining adaptivity with progression ordering for intelligent tutoring systems. In S. Klemmer (Ed.), ACM Proceedings of the Fifth Annual ACM Conference on Learning at Scale (pp.\u00a01\u20134). ACM. https:\/\/doi.org\/10.1145\/3231644.3231672","DOI":"10.1145\/3231644.3231672"},{"key":"494_CR144","doi-asserted-by":"publisher","unstructured":"Mu, T., Wang, S., Andersen, E., & Brunskill, E. (2021). Automatic adaptive sequencing in a webgame. In A. I. Cristea & C. Troussas (Eds.), Springer eBook Collection: Vol. 12677. Intelligent Tutoring Systems: 17th International Conference, ITS 2021, Virtual Event, June 7\u201311, 2021, Proceedings (1st ed. 2021, Vol. 12677, pp.\u00a0430\u2013438). Springer International Publishing; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-030-80421-3_47","DOI":"10.1007\/978-3-030-80421-3_47"},{"key":"494_CR145","unstructured":"Murphy, K. (2025). Reinforcement learning: An overview. http:\/\/arxiv.org\/pdf\/2412.05265"},{"key":"494_CR146","doi-asserted-by":"publisher","unstructured":"Mustapha, R., Soukaina, G., Mohammed, Q., & Es-S\u00e2adia, A. (2023). Towards an adaptive e-learning system based on deep learner profile, machine learning approach, and reinforcement learning. International Journal of Advanced Computer Science and Applications, 14(5). https:\/\/doi.org\/10.14569\/IJACSA.2023.0140528","DOI":"10.14569\/IJACSA.2023.0140528"},{"issue":"3","key":"494_CR147","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1080\/00220671.2017.1289775","volume":"110","author":"LS Nadelson","year":"2017","unstructured":"Nadelson, L. S., & Seifert, A. L. (2017). Integrated STEM defined: Contexts, challenges, and the future. The Journal of Educational Research, 110(3), 221\u2013223.
https:\/\/doi.org\/10.1080\/00220671.2017.1289775","journal-title":"The Journal of Educational Research"},{"key":"494_CR148","doi-asserted-by":"publisher","unstructured":"Nguyen, H., & La, H. (2019). Review of deep reinforcement learning for robot manipulation. In 2019 Third IEEE International Conference on Robotic Computing (IRC) (pp.\u00a0590\u2013595). IEEE. https:\/\/doi.org\/10.1109\/IRC.2019.00120","DOI":"10.1109\/IRC.2019.00120"},{"key":"494_CR149","doi-asserted-by":"publisher","unstructured":"Nie, A., Reuel, A.-K., & Brunskill, E. (2023). Understanding the impact of reinforcement learning personalization on subgroups of students in math tutoring. In N. Wang, G. Rebolledo-Mendez, V. Dimitrova, N. Matsuda, & O. C. Santos (Eds.), Communications in Computer and Information Science: Vol. 1831. Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky: 24th International Conference, AIED 2023 Proceedings (1st ed. 2023, Vol. 1831, pp.\u00a0688\u2013694). Springer Nature Switzerland; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-031-36336-8_106","DOI":"10.1007\/978-3-031-36336-8_106"},{"key":"494_CR150","doi-asserted-by":"publisher","unstructured":"Nisansala, P., & Morawaka, A. (2019). Athwel: Gamification supportive tool for special educational centers in Sri Lanka. In 2019 IEEE 14th International Conference on Industrial and Information Systems: (ICIIS) : 18th-20th December, 2019 : Conference proceedings (pp.\u00a0446\u2013451). IEEE. https:\/\/doi.org\/10.1109\/ICIIS47346.2019.9063274","DOI":"10.1109\/ICIIS47346.2019.9063274"},{"key":"494_CR151","doi-asserted-by":"publisher","unstructured":"*Niu, S., & Cao, S. (2022). Get a sense of accomplishment in doing exercises: A reinforcement learning perspective. 
In 2022 IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD) (pp.\u00a0299\u2013304). IEEE. https:\/\/doi.org\/10.1109\/CSCWD54268.2022.9776133","DOI":"10.1109\/CSCWD54268.2022.9776133"},{"key":"494_CR152","doi-asserted-by":"publisher","unstructured":"Nwana, H. (1990). Intelligent tutoring systems: an overview. Artificial Intelligence Review, 4(4). https:\/\/doi.org\/10.1007\/BF00168958","DOI":"10.1007\/BF00168958"},{"key":"494_CR153","doi-asserted-by":"publisher","unstructured":"*Oralbayeva, N., Shakerimov, A., Sarmonov, S., Kantoreyeva, K., Dadebayeva, F., Serkali, N., & Sandygulova, A. (2022). K-Qbot: Language learning chatbot based on reinforcement learning. In S. \u0160abanovi\u0107 (Ed.), Hri \u201822: Proceedings of the 2022 ACM\/IEEE International Conference on Human-Robot Interaction (pp.\u00a0963\u2013967). IEEE. https:\/\/doi.org\/10.1109\/HRI53351.2022.9889428","DOI":"10.1109\/HRI53351.2022.9889428"},{"key":"494_CR154","doi-asserted-by":"publisher","unstructured":"*Orsoni, M., P\u00f6gelt, A., Duong-Trung, N., Benassi, M., Kravcik, M., & Gr\u00fcttm\u00fcller, M. (2023). Recommending mathematical tasks based on reinforcement learning and item response theory. In C. Frasson, P. Mylonas, & C. Troussas (Eds.), Lecture Notes in Computer Science: Vol. 13891. Augmented Intelligence and Intelligent Tutoring Systems: 19th International Conference, ITS 2023, Corfu, Greece, June 2\u20135, 2023, Proceedings (1st ed. 2023, Vol. 13891, pp.\u00a016\u201328). Springer Nature Switzerland; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-031-32883-1_2","DOI":"10.1007\/978-3-031-32883-1_2"},{"key":"494_CR155","doi-asserted-by":"publisher","unstructured":"*Oyuga Anne, D., & Maina, E. (2021). Reinforcement learning approach for adaptive e-learning based on multiple learner characteristics. Open Journal for Information Technology, 4(2), 55\u201376. 
https:\/\/doi.org\/10.32591\/coas.ojit.0402.03055o","DOI":"10.32591\/coas.ojit.0402.03055o"},{"key":"494_CR156","doi-asserted-by":"publisher","DOI":"10.1136\/bmj.n160","volume":"372","author":"MJ Page","year":"2021","unstructured":"Page, M. J., Moher, D., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hr\u00f3bjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., & McKenzie, J. E. (2021). Prisma 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ (Clinical Research Ed.), 372, Article n160. https:\/\/doi.org\/10.1136\/bmj.n160","journal-title":"BMJ (Clinical Research Ed.)"},{"key":"494_CR157","doi-asserted-by":"publisher","unstructured":"*Pan, J., & Yang, N. (2024). Application of Reinforcement Learning Algorithm in Personalized Music Teaching. In 2024 2nd International Conference on Mechatronics, IoT and Industrial Informatics (ICMIII) (pp.\u00a0599\u2013604). IEEE. https:\/\/doi.org\/10.1109\/ICMIII62623.2024.00118","DOI":"10.1109\/ICMIII62623.2024.00118"},{"issue":"12","key":"494_CR158","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0083138","volume":"8","author":"N Panic","year":"2013","unstructured":"Panic, N., Leoncini, E., de Belvis, G., Ricciardi, W., & Boccia, S. (2013). Evaluation of the endorsement of the preferred reporting items for systematic reviews and meta-analysis (PRISMA) statement on the quality of published systematic review and meta-analyses. PLoS ONE, 8(12), Article e83138. https:\/\/doi.org\/10.1371\/journal.pone.0083138","journal-title":"PLoS ONE"},{"key":"494_CR159","doi-asserted-by":"crossref","unstructured":"Papadimitriou, C. H., & Tsitsiklis, J. N. (1987). The Complexity of Markov Decision Processes. Mathematics of Operations Research, 12(3), 441\u2013450. 
http:\/\/www.jstor.org\/stable\/3689975","DOI":"10.1287\/moor.12.3.441"},{"key":"494_CR160","doi-asserted-by":"publisher","first-page":"687","DOI":"10.1609\/aaai.v33i01.3301687","volume":"33","author":"HW Park","year":"2019","unstructured":"Park, H. W., Grover, I., Spaulding, S., Gomez, L., & Breazeal, C. (2019). A model-free affective reinforcement learning approach to personalization of an autonomous social robot companion for early literacy education. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 687\u2013694. https:\/\/doi.org\/10.1609\/aaai.v33i01.3301687","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"494_CR161","doi-asserted-by":"publisher","unstructured":"Patel, M., & Sajja, P. S. (2021). Application for multi-agent system: A case of customised elearning. In S. L. Chavan (Ed.), 2021 International Conference on Computing, Communication and Green Engineering (CCGE2021) (pp.\u00a01\u20136). IEEE. https:\/\/doi.org\/10.1109\/CCGE50943.2021.9776390","DOI":"10.1109\/CCGE50943.2021.9776390"},{"key":"494_CR162","doi-asserted-by":"publisher","unstructured":"Perez, J., Dapena, E., Aguilar, J., & Carrillo, G. (2022). Reinforcement learning for estimating student proficiency in math word problems. In 2022 XVII Latin American Conference on Learning Technologies (LACLO) (pp.\u00a01\u20136). IEEE. https:\/\/doi.org\/10.1109\/LACLO56648.2022.10013399","DOI":"10.1109\/LACLO56648.2022.10013399"},{"issue":"16","key":"494_CR163","doi-asserted-by":"publisher","first-page":"21015","DOI":"10.1007\/s10639-024-12699-8","volume":"29","author":"J P\u00e9rez","year":"2024","unstructured":"P\u00e9rez, J., Dapena, E., & Aguilar, J. (2024). Emotions as implicit feedback for adapting difficulty in tutoring systems based on reinforcement learning. Education and Information Technologies, 29(16), 21015\u201321043. 
https:\/\/doi.org\/10.1007\/s10639-024-12699-8","journal-title":"Education and Information Technologies"},{"key":"494_CR164","doi-asserted-by":"crossref","unstructured":"*Pietquin, O., Daubigney, L., & Geist, M. (2011). Optimization of a tutoring system from a fixed set of data. In SLaTE 2011 (pp.\u00a01\u20134).","DOI":"10.21437\/SLaTE.2011-29"},{"key":"494_CR165","doi-asserted-by":"publisher","unstructured":"*P\u00f6gelt, A., Ihsberner, K., Pengel, N., Kravcik, M., Gr\u00fcttm\u00fcller, M., & Hardt, W. (2024). Individualised Mathematical Task Recommendations Through Intended Learning Outcomes and Reinforcement Learning. In A. Sifaleras & F. Lin (Eds.), Lecture Notes in Computer Science: Vol. 14798. Generative Intelligence and Intelligent Tutoring Systems (Vol. 14798, pp.\u00a0117\u2013130). Springer Nature Switzerland. https:\/\/doi.org\/10.1007\/978-3-031-63028-6_10","DOI":"10.1007\/978-3-031-63028-6_10"},{"key":"494_CR166","doi-asserted-by":"publisher","unstructured":"Priya, S. S., Subhashini, R., & Akilandeswari, J. (2012). Learning agent based knowledge management in intelligent tutoring system. In 2012 International Conference on Computer Communication and Informatics (ICCCI 2012) (pp.\u00a01\u20135). IEEE. https:\/\/doi.org\/10.1109\/ICCCI.2012.6158828","DOI":"10.1109\/ICCCI.2012.6158828"},{"key":"494_CR167","doi-asserted-by":"publisher","unstructured":"*Pu, Y., Wang, C., & Wu, W. (2020). A deep reinforcement learning framework for instructional sequencing. In 2020 IEEE International Conference on Big Data (Big Data) (pp.\u00a05201\u20135208). IEEE. https:\/\/doi.org\/10.1109\/BigData50022.2020.9378463","DOI":"10.1109\/BigData50022.2020.9378463"},{"key":"494_CR168","doi-asserted-by":"publisher","unstructured":"Puterman, M. L. (2005). Markov decision processes: Discrete stochastic dynamic programming. Wiley series in probability and mathematical statistics. Applied probability and statistics section. Wiley-Interscience. 
https:\/\/doi.org\/10.1002\/9780470316887","DOI":"10.1002\/9780470316887"},{"key":"494_CR169","doi-asserted-by":"publisher","unstructured":"*Raghuveer, V. R., Tripathy, B. K., Singh, T., & Khanna, S. (2014). Reinforcement learning approach towards effective content recommendation in MOOC environments. In 2014 IEEE International Conference on MOOC, Innovation and Technology in Education (MITE) (pp.\u00a0285\u2013289). IEEE. https:\/\/doi.org\/10.1109\/MITE.2014.7020289","DOI":"10.1109\/MITE.2014.7020289"},{"key":"494_CR170","unstructured":"Ramachandran, A., & Scassellati, B. (2014). Adapting difficulty levels in personalized robot-child tutoring interactions. In Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence. https:\/\/www.aaai.org\/ocs\/index.php\/WS\/AAAIW14\/paper\/viewPaper\/8736"},{"key":"494_CR171","doi-asserted-by":"publisher","unstructured":"*Ravari, P. B., Jen Lee, K., Law, E., & Kulic, D. (2021). Effects of an adaptive robot encouraging teamwork on students\u2019 learning. In 2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN) (pp.\u00a0250\u2013257). IEEE. https:\/\/doi.org\/10.1109\/RO-MAN50785.2021.9515354","DOI":"10.1109\/RO-MAN50785.2021.9515354"},{"key":"494_CR172","doi-asserted-by":"crossref","unstructured":"*Reddy, S., Levine, S., & Dragan, A. (2017). Accelerating human learning with deep reinforcement learning. In NIPS\u201917 Workshop: Teaching Machines, Robots, and Humans (pp.\u00a05\u20139).","DOI":"10.15607\/RSS.2018.XIV.005"},{"key":"494_CR173","doi-asserted-by":"publisher","unstructured":"Riedmann, A., & Lugrin, B. (2023). Towards an Adaptive Pedagogical Agent in a Reading Intervention Using Reinforcement Learning. In B. Lugrin, M. Latoschik, S. von Mammen, S. Kopp, F. P\u00e9cune, & C. Pelachaud (Eds.), ACM Digital Library, Proceedings of the 23rd ACM International Conference on Intelligent Virtual Agents (pp.\u00a01\u20133). Association for Computing Machinery. 
https:\/\/doi.org\/10.1145\/3570945.3607320","DOI":"10.1145\/3570945.3607320"},{"key":"494_CR174","doi-asserted-by":"publisher","unstructured":"*Riedmann, A., G\u00f6tz, J., D\u2019Eramo, C., & Lugrin, B. (2024). Uli-RL: A Real-World Deep Reinforcement Learning Pedagogical Agent for Children. In A. Hotho & S. Rudolph (Eds.), Lecture Notes in Artificial Intelligence: Vol. 14992. Ki 2024: Advances in Artificial Intelligence: 47th German Conference on AI, W\u00fcrzburg, Germany, September 25\u201327, 2024, Proceedings (1st ed. 2024, Vol. 1410). Springer Nature Switzerland. https:\/\/doi.org\/10.1007\/978-3-031-70893-0_25","DOI":"10.1007\/978-3-031-70893-0_25"},{"issue":"4","key":"494_CR175","doi-asserted-by":"publisher","first-page":"789","DOI":"10.1111\/j.1467-985X.2008.00593.x","volume":"172","author":"RD Riley","year":"2009","unstructured":"Riley, R. D. (2009). Multivariate meta-analysis: The effect of ignoring within-study correlation. Journal of the Royal Statistical Society Series a: Statistics in Society, 172(4), 789\u2013811. https:\/\/doi.org\/10.1111\/j.1467-985X.2008.00593.x","journal-title":"Journal of the Royal Statistical Society Series a: Statistics in Society"},{"key":"494_CR176","doi-asserted-by":"publisher","unstructured":"Rojas-Barahona, L. M., & Cerisara, C. (2014). Bayesian inverse reinforcement learning for modeling conversational agents in a virtual environment. In D. Hutchison, T. Kanade, J. Kittler, J. M. Kleinberg, F. Mattern, J. C. Mitchell, M. Naor, O. Nierstrasz, C. Pandu Rangan, B. Steffen, M. Sudan, D. Terzopoulos, D. Tygar, M. Y. Vardi, G. Weikum, & A. Gelbukh (Eds.), Lecture notes in computer science Theoretical computer science and general issues: Vol. 8403. Computational linguistics and intelligent text processing: 15th International Conference [on Intelligent Text Processing and Computational Linguistics], CICLing 2014 proceedings (Vol. 8403, pp.\u00a0503\u2013514). Springer. 
https:\/\/doi.org\/10.1007\/978-3-642-54906-9_41","DOI":"10.1007\/978-3-642-54906-9_41"},{"key":"494_CR177","doi-asserted-by":"publisher","unstructured":"*Rowe, J. P., & Lester, J. C. (2015). Improving student problem solving in narrative-centered learning environments: A modular reinforcement learning framework. In C. Conati, N. Heffernan, A. Mitrovic, & M. F. Verdejo (Eds.), Lecture notes in computer science Lecture notes in artificial intelligence: Vol. 9112. Artificial intelligence in education: 17th international conference, AIED 2015 proceedings (Vol. 9112, pp.\u00a0419\u2013428). Springer. https:\/\/doi.org\/10.1007\/978-3-319-19773-9_42","DOI":"10.1007\/978-3-319-19773-9_42"},{"key":"494_CR178","doi-asserted-by":"publisher","unstructured":"Roy, S., Crick, C., Kieson, E., & Abramson, C. (2018). A reinforcement learning model for robots as teachers. In J.-J. Cabibihan (Ed.), Ieee RO-MAN 2018: The 27th IEEE International Symposium on Robot and Human Interactive Communication (pp.\u00a0294\u2013299). IEEE. https:\/\/doi.org\/10.1109\/ROMAN.2018.8525563","DOI":"10.1109\/ROMAN.2018.8525563"},{"issue":"5","key":"494_CR179","doi-asserted-by":"publisher","first-page":"3023","DOI":"10.1007\/s10994-023-06423-9","volume":"113","author":"S Ruan","year":"2024","unstructured":"Ruan, S., Nie, A., Steenbergen, W., He, J., Zhang, J. Q., Guo, M., Liu, Y., Dang Nguyen, K., Wang, C. Y., Ying, R., Landay, J. A., & Brunskill, E. (2024). Reinforcement learning tutor better supported lower performers in a math task. Machine Learning, 113(5), 3023\u20133048. https:\/\/doi.org\/10.1007\/s10994-023-06423-9","journal-title":"Machine Learning"},{"key":"494_CR180","doi-asserted-by":"publisher","DOI":"10.1016\/j.compedu.2021.104426","volume":"180","author":"JA Ruip\u00e9rez-Valiente","year":"2022","unstructured":"Ruip\u00e9rez-Valiente, J. 
A., Staubitz, T., Jenner, M., Halawa, S., Zhang, J., Despujol, I., Maldonado-Mahauad, J., Montoro, G., Peffer, M., Rohloff, T., Lane, J., Turro, C., Li, X., P\u00e9rez-Sanagust\u00edn, M., & Reich, J. (2022). Large scale analytics of global and regional MOOC providers: Differences in learners\u2019 demographics, preferences, and perceptions. Computers & Education, 180, Article 104426. https:\/\/doi.org\/10.1016\/j.compedu.2021.104426","journal-title":"Computers & Education"},{"key":"494_CR181","doi-asserted-by":"publisher","unstructured":"Sarma, B. H. S., & Ravindran, B. (2007). Intelligent tutoring systems using reinforcement learning to teach autistic students. In A. Venkatesh, T. Gonzalves, A. Monk, & K. Buckner (Eds.), IFIP International Federation for Information Processing: Vol. 241. Home informatics and telematics: Ict for the next billion: Proceeding of IFIP TC 9, WG 9.3 HOIT 2007 Conference (Vol. 241, pp.\u00a065\u201378). Springer. https:\/\/doi.org\/10.1007\/978-0-387-73697-6_5","DOI":"10.1007\/978-0-387-73697-6_5"},{"key":"494_CR182","doi-asserted-by":"publisher","unstructured":"*Sawyer, R., Rowe, J., & Lester, J. (2017). Balancing learning and engagement in game-based learning environments with multi-objective reinforcement learning. In E. Andr\u00e9, R. Baker, X. Hu, M. M. T. Rodrigo, & B. Du Boulay (Eds.), Lecture Notes in Computer Science: Vol. 10331. Artificial Intelligence in Education: 18th International Conference, AIED 2017 Proceedings (Vol. 10331, pp.\u00a0323\u2013334). Springer International Publishing. https:\/\/doi.org\/10.1007\/978-3-319-61425-0_27","DOI":"10.1007\/978-3-319-61425-0_27"},{"key":"494_CR183","doi-asserted-by":"publisher","unstructured":"Scarlatos, A., Smith, D., Woodhead, S., & Lan, A. (2024). Improving the Validity of Automatically Generated Feedback via Reinforcement Learning. In A. M. Olney, I.-A. Chounta, Z. Liu, O. C. Santos, & I. I. 
Bittencourt (Eds.), Lecture Notes in Computer Science: Vol. 14829. Artificial Intelligence in Education (Vol. 14829, pp.\u00a0280\u2013294). Springer Nature Switzerland. https:\/\/doi.org\/10.1007\/978-3-031-64302-6_20","DOI":"10.1007\/978-3-031-64302-6_20"},{"key":"494_CR184","doi-asserted-by":"publisher","unstructured":"Schmucker, R., Pachapurkar, N., Bala, S., Shah, M., & Mitchell, T. (2023). Learning to give useful hints: Assistance action evaluation and policy improvements. In O. Viberg, I. Jivet, P. J. Mu\u00f1oz-Merino, M. Perifanou, & T. Papathoma (Eds.), Lecture Notes in Computer Science: Vol. 14200. Responsive and Sustainable Educational Futures: 18th European Conference on Technology Enhanced Learning, EC-TEL 2023 Proceedings (1st ed. 2023, Vol. 14200, pp.\u00a0383\u2013398). Springer Nature Switzerland; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-031-42682-7_26","DOI":"10.1007\/978-3-031-42682-7_26"},{"key":"494_CR185","unstructured":"Schulman, J., Levine, S., Abbeel, P., Jordan, M., & Moritz, P. (2015). Trust region policy optimization. In F. Bach & D. Blei (Eds.), Proceedings of Machine Learning Research, Proceedings of the 32nd International Conference on Machine Learning (pp.\u00a01889\u20131897). PMLR. https:\/\/proceedings.mlr.press\/v37\/schulman15.html"},{"key":"494_CR186","unstructured":"Schulman, J., Wolski, F., Dhariwal, P., Radford, A., & Klimov, O. (2017). Proximal policy optimization algorithms. https:\/\/arxiv.org\/pdf\/1707.06347.pdf"},{"key":"494_CR187","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.120495","volume":"231","author":"AK Shakya","year":"2023","unstructured":"Shakya, A. K., Pillai, G., & Chakrabarty, S. (2023). Reinforcement learning algorithms: A brief survey. Expert Systems with Applications, 231, Article 120495. https:\/\/doi.org\/10.1016\/j.eswa.2023.120495","journal-title":"Expert Systems with Applications"},{"key":"494_CR188","unstructured":"Sharma, P., & Li, Q. (2024). 
Designing Simulated Students to Emulate Learner Activity Data in an Open-Ended Learning Environment. In Proceedings of the 17th International Conference on Educational Data Mining."},{"key":"494_CR189","doi-asserted-by":"publisher","unstructured":"*Shawky, D., & Badawi, A. (2018). A reinforcement learning-based adaptive learning system. In A. E. Hassanien, M. F. Tolba, M. Elhoseny, & M. Mostafa (Eds.), Advances in Intelligent Systems and Computing: Vol. 723. The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2018) (Vol. 723, pp.\u00a0221\u2013231). Springer International Publishing. https:\/\/doi.org\/10.1007\/978-3-319-74690-6_22","DOI":"10.1007\/978-3-319-74690-6_22"},{"key":"494_CR190","doi-asserted-by":"publisher","unstructured":"Shawky, D., & Badawi, A. (2019). Towards a personalized learning experience using reinforcement learning. In A. E. Hassanien (Ed.), Studies in Computational Intelligence: Volume 801. Machine Learning Paradigms: Theory and Application (Vol. 801, pp.\u00a0169\u2013187). Springer International Publishing. https:\/\/doi.org\/10.1007\/978-3-030-02357-7_8","DOI":"10.1007\/978-3-030-02357-7_8"},{"key":"494_CR191","doi-asserted-by":"publisher","unstructured":"*Shen, S., Ausin, M. S., Mostafavi, B., & Chi, M. (2018a). Improving learning & reducing time: A constrained action-based reinforcement learning approach. In T. Mitrovic (Ed.), ACM Conferences, Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization (pp.\u00a043\u201351). ACM. https:\/\/doi.org\/10.1145\/3209219.3209232","DOI":"10.1145\/3209219.3209232"},{"key":"494_CR192","doi-asserted-by":"publisher","unstructured":"*Shen, S., Mostafavi, B., Lynch, C., Barnes, T., & Chi, M. (2018c). Empirically evaluating the effectiveness of pomdp vs. mdp towards the pedagogical strategies induction. In C. Ros\u00e9, R. Mart\u00ednez-Maldonado, H. U. Hoppe, R. Luckin, M. Mavrikis, K. 
Porayska-Pomsta, B. McLaren, & B. Du Boulay (Eds.), Vol. 10948. Artificial Intelligence in Education: 19th International Conference, AIED 2018 Proceedings, Part II (Vol. 10948, pp.\u00a0327\u2013331). Springer. https:\/\/doi.org\/10.1007\/978-3-319-93846-2_61","DOI":"10.1007\/978-3-319-93846-2_61"},{"key":"494_CR193","doi-asserted-by":"publisher","unstructured":"*Shen, D., Truong, T., & Weintz, C. (2021). Using q-learning to personalize pedagogical policies for addition problems. In 2021 International Conference on Signal Processing and Machine Learning: Conf-SPML 2021 Proceedings (pp.\u00a0186\u2013189). IEEE. https:\/\/doi.org\/10.1109\/CONF-SPML54095.2021.00043","DOI":"10.1109\/CONF-SPML54095.2021.00043"},{"issue":"3","key":"494_CR194","doi-asserted-by":"publisher","first-page":"27","DOI":"10.5281\/ZENODO.3554713","volume":"10","author":"S Shen","year":"2018","unstructured":"Shen, S., Mostafavi, B., Barnes, T., & Chi, M. (2018b). Exploring induced pedagogical strategies through a markov decision process framework: Lessons learned. Journal of Educational Data Mining, 10(3), 27\u201368. https:\/\/doi.org\/10.5281\/ZENODO.3554713","journal-title":"Journal of Educational Data Mining"},{"issue":"1","key":"494_CR195","doi-asserted-by":"publisher","first-page":"216","DOI":"10.3758\/s13428-021-01602-9","volume":"54","author":"J Shin","year":"2022","unstructured":"Shin, J., & Bulut, O. (2022). Building an intelligent recommendation system for personalized test scheduling in computerized assessments: A reinforcement learning approach. Behavior Research Methods, 54(1), 216\u2013232. https:\/\/doi.org\/10.3758\/s13428-021-01602-9","journal-title":"Behavior Research Methods"},{"key":"494_CR196","doi-asserted-by":"publisher","first-page":"24","DOI":"10.1186\/s13643-015-0004-8","volume":"4","author":"M Simmonds","year":"2015","unstructured":"Simmonds, M. (2015). Quantifying the risk of error when interpreting funnel plots. Systematic Reviews, 4, 24. 
https:\/\/doi.org\/10.1186\/s13643-015-0004-8","journal-title":"Systematic Reviews"},{"key":"494_CR197","unstructured":"Singla, A., Rafferty, A. N., Radanovic, G., & Heffernan, N. T. (2021). Reinforcement learning for education: Opportunities and challenges [Workshop]. In International Conference on Educational Data Mining (EDM)."},{"issue":"3","key":"494_CR198","doi-asserted-by":"publisher","first-page":"56","DOI":"10.5815\/ijmecs.2024.03.05","volume":"16","author":"D Soto Forero","year":"2024","unstructured":"Soto Forero, D., Ackermann, S., Laure Betbeder, M., & Henriet, J. (2024). Automatic Real-Time Adaptation of Training Session Difficulty Using Rules and Reinforcement Learning in the AI-VT ITS. International Journal of Modern Education and Computer Science, 16(3), 56\u201371. https:\/\/doi.org\/10.5815\/ijmecs.2024.03.05","journal-title":"International Journal of Modern Education and Computer Science"},{"key":"494_CR199","doi-asserted-by":"publisher","unstructured":"*Spain, R., Rowe, J., Smith, A., Goldberg, B., Pokorny, R., Mott, B., & Lester, J. (2021). A reinforcement learning approach to adaptive remediation in online training. The Journal of Defense Modeling and Simulation: Applications, Methodology, Technology, 154851292110283. https:\/\/doi.org\/10.1177\/15485129211028317","DOI":"10.1177\/15485129211028317"},{"key":"494_CR200","unstructured":"*Stamper, J., Barnes, T., Lehmann, L., & Croy, M. (2008). The hint factory: Automatic generation of contextualized help for existing computer aided instruction. In Proceedings of the 9th International Conference on Intelligent Tutoring Systems Young Researchers Track (pp.\u00a071\u201378)."},{"issue":"5","key":"494_CR201","doi-asserted-by":"publisher","first-page":"581","DOI":"10.1177\/1948550617693062","volume":"8","author":"TD Stanley","year":"2017","unstructured":"Stanley, T. D. (2017). Limitations of PET-PEESE and Other Meta-Analysis Methods. Social Psychological and Personality Science, 8(5), 581\u2013591. 
https:\/\/doi.org\/10.1177\/1948550617693062","journal-title":"Social Psychological and Personality Science"},{"issue":"1","key":"494_CR202","doi-asserted-by":"publisher","first-page":"60","DOI":"10.1002\/jrsm.1095","volume":"5","author":"TD Stanley","year":"2014","unstructured":"Stanley, T. D., & Doucouliagos, H. (2014). Meta-regression approximations to reduce publication selection bias. Research Synthesis Methods, 5(1), 60\u201378. https:\/\/doi.org\/10.1002\/jrsm.1095","journal-title":"Research Synthesis Methods"},{"issue":"7304","key":"494_CR203","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1136\/bmj.323.7304.101","volume":"323","author":"JA Sterne","year":"2001","unstructured":"Sterne, J. A., Egger, M., & Smith, G. D. (2001). Systematic reviews in health care: Investigating and dealing with publication and other biases in meta-analysis. BMJ\u202f: British Medical Journal, 323(7304), 101\u2013105. https:\/\/doi.org\/10.1136\/bmj.323.7304.101","journal-title":"BMJ : British Medical Journal"},{"key":"494_CR204","doi-asserted-by":"publisher","DOI":"10.1136\/bmj.d4002","volume":"343","author":"JAC Sterne","year":"2011","unstructured":"Sterne, J. A. C., Sutton, A. J., Ioannidis, J. P. A., Terrin, N., Jones, D. R., Lau, J., Carpenter, J., R\u00fccker, G., Harbord, R. M., Schmid, C. H., Tetzlaff, J., Deeks, J. J., Peters, J., Macaskill, P., Schwarzer, G., Duval, S., Altman, D. G., Moher, D., & Higgins, J. P. T. (2011). Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials. BMJ (Clinical Research Ed.), 343, Article d4002. https:\/\/doi.org\/10.1136\/bmj.d4002","journal-title":"BMJ (Clinical Research Ed.)"},{"key":"494_CR205","doi-asserted-by":"publisher","unstructured":"*Su, P.-h., Wang, Y.-B., Yu, T.-h., & Lee, L.-s. (2013). A dialogue game framework with personalized training using reinforcement learning for computer-assisted language learning. 
In Icassp 2013 - 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (pp.\u00a08213\u20138217). IEEE. https:\/\/doi.org\/10.1109\/ICASSP.2013.6639266","DOI":"10.1109\/ICASSP.2013.6639266"},{"key":"494_CR206","doi-asserted-by":"publisher","unstructured":"*Sun, M., Li, P., & Wang, D. (2024). Simulation and Optimization of Physical Education Teaching Based on Virtual Reality Technology and Reinforcement Learning Algorithms. In 2024 International Conference on Telecommunications and Power Electronics (TELEPE) (pp.\u00a0579\u2013584). IEEE. https:\/\/doi.org\/10.1109\/TELEPE64216.2024.00110","DOI":"10.1109\/TELEPE64216.2024.00110"},{"key":"494_CR207","volume-title":"Reinforcement learning: An introduction","author":"RS Sutton","year":"2018","unstructured":"Sutton, R. S., & Barto, A. (2018). Reinforcement learning: An introduction (2nd ed.). The MIT Press.","edition":"2"},{"key":"494_CR208","doi-asserted-by":"publisher","unstructured":"Tang, Y., Hare, R., & Ferguson, S. (2022). Classroom evaluation of a gamified adaptive tutoring system. In Fie 2022 Proceedings (pp.\u00a01\u20135). IEEE. https:\/\/doi.org\/10.1109\/FIE56618.2022.9962718","DOI":"10.1109\/FIE56618.2022.9962718"},{"issue":"1","key":"494_CR209","doi-asserted-by":"publisher","first-page":"108","DOI":"10.1111\/bmsp.12144","volume":"72","author":"X Tang","year":"2019","unstructured":"Tang, X., Chen, Y., Li, X., Liu, J., & Ying, Z. (2019). A reinforcement learning approach to personalized learning recommendation systems. The British Journal of Mathematical and Statistical Psychology, 72(1), 108\u2013135. https:\/\/doi.org\/10.1111\/bmsp.12144","journal-title":"The British Journal of Mathematical and Statistical Psychology"},{"key":"494_CR210","doi-asserted-by":"publisher","unstructured":"Teixeira da Silva, J. A., & Daly, T. (2024). Against Over-reliance on PRISMA Guidelines for Meta-analytical Studies. Rambam Maimonides Medical Journal, 15(1). 
https:\/\/doi.org\/10.5041\/RMMJ.10518","DOI":"10.5041\/RMMJ.10518"},{"key":"494_CR211","doi-asserted-by":"publisher","unstructured":"Tetreault, J. R., & Litman, D. J. (2006b). Comparing the utility of state features in spoken dialogue using reinforcement learning. In Moore, R. C. (Ed.), Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics - (pp.\u00a0272\u2013279). Association for Computational Linguistics. https:\/\/doi.org\/10.3115\/1220835.1220870","DOI":"10.3115\/1220835.1220870"},{"key":"494_CR212","unstructured":"*Tetreault, J., & Litman, D. (2006a). Using reinforcement learning to build a better model of dialogue state. In EACL 2006, 11st Conference of the European Chapter of the Association for Computational Linguistics Proceedings."},{"issue":"1","key":"494_CR213","first-page":"107","volume":"18","author":"SM Thede","year":"2002","unstructured":"Thede, S. M. (2002). Using reinforcement learning to introduce artificial intelligence in the CS curriculum. Journal of Computing Sciences in Colleges, 18(1), 107\u2013112.","journal-title":"Journal of Computing Sciences in Colleges"},{"key":"494_CR214","unstructured":"Vahidy, J. (2019). Enhancing STEM learning through technology. In R. Power (Ed.), Technology and the curriculum: Summer 2019. Power Learning Solution."},{"key":"494_CR215","doi-asserted-by":"crossref","unstructured":"VanLehn, K., Jordan, P., & Litman, D. (2007). Developing pedagogically effective tutorial dialogue tactics: Experiments and a testbed. In SLaTE-2007 (pp.\u00a017\u201320).","DOI":"10.21437\/SLaTE.2007-3"},{"key":"494_CR216","unstructured":"*Vassoyan, J., Vie, J.-J., & Lemberger, P. (2023). Towards Scalable Adaptive Learning with Graph Neural Networks and Reinforcement Learning. 
In Proceedings of the 16th International Conference on Educational Data Mining."},{"issue":"12","key":"494_CR217","doi-asserted-by":"publisher","first-page":"2306","DOI":"10.3923\/itj.2013.2306.2314","volume":"12","author":"B Velusamy","year":"2013","unstructured":"Velusamy, B., Anouneia, S. M., & Abraham, G. (2013). Reinforcement learning approach for adaptive e-learning systems using learning styles. Information Technology Journal, 12(12), 2306\u20132314. https:\/\/doi.org\/10.3923\/itj.2013.2306.2314","journal-title":"Information Technology Journal"},{"key":"494_CR218","doi-asserted-by":"publisher","unstructured":"Vijayan, A., Janmasree, S., Keerthana, C., & Baby Syla, L. (2018, July 5\u20137). A framework for intelligent learning assistant platform based on cognitive computing for children with autism spectrum disorder. In 2018 International CET Conference on Control, Communication, and Computing (IC4) (pp.\u00a0361\u2013365). IEEE. https:\/\/doi.org\/10.1109\/CETIC4.2018.8530940","DOI":"10.1109\/CETIC4.2018.8530940"},{"key":"494_CR219","doi-asserted-by":"publisher","unstructured":"*Wan, H., Che, B., Luo, H., & Luo, X. (2023). Learning path recommendation based on knowledge tracing and reinforcement learning. In 2023 IEEE International Conference on Advanced Learning Technologies (ICALT) (pp.\u00a055\u201357). IEEE. https:\/\/doi.org\/10.1109\/ICALT58122.2023.00021","DOI":"10.1109\/ICALT58122.2023.00021"},{"key":"494_CR220","doi-asserted-by":"publisher","unstructured":"*Wang, F. (2014b). Pomdp framework for building an intelligent tutoring system. In S. Zvacek (Ed.), Proceedings of the 6th International Conference on Computer Supported Education, Barcelona, Spain, 1 - 3 April, 2014 (pp.\u00a0233\u2013240). SCITEPRESS. https:\/\/doi.org\/10.5220\/0004801702330240","DOI":"10.5220\/0004801702330240"},{"key":"494_CR221","doi-asserted-by":"publisher","unstructured":"Wang, F. (2014a). 
Learning teaching in teaching: Online reinforcement learning for intelligent tutoring. In C.-h. Pak, I. Stojmenovic, M. Choi, & F. Xhafa (Eds.), Lecture Notes in Electrical Engineering: Vol. 276. Future information technology: Futuretech 2013 (Vol. 276, pp.\u00a0191\u2013196). Springer. https:\/\/doi.org\/10.1007\/978-3-642-40861-8_29","DOI":"10.1007\/978-3-642-40861-8_29"},{"key":"494_CR222","doi-asserted-by":"publisher","unstructured":"Wang, L., Zhang, D., Gao, L., Song, J., Guo, L., & Shen, H. T. (2018). MathDQN: Solving arithmetic word problems via deep reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https:\/\/doi.org\/10.1609\/aaai.v32i1.11981","DOI":"10.1609\/aaai.v32i1.11981"},{"key":"494_CR223","doi-asserted-by":"publisher","unstructured":"Wang, F. (2018). Reinforcement learning in a pomdp based intelligent tutoring system for optimizing teaching strategies. International Journal of Information and Education Technology, 8(8), 553\u2013558. https:\/\/doi.org\/10.18178\/ijiet.2018.8.8.1098","DOI":"10.18178\/ijiet.2018.8.8.1098"},{"key":"494_CR224","doi-asserted-by":"publisher","unstructured":"Wang, Y., Cai, W., Chen, M., & Shen, J. (2020). Poem: A personalized online education scheme based on reinforcement learning. In H. Mitsuhara (Ed.), Proceedings of 2020 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE): Date and venue: 8\u201311 December 2020, online (pp.\u00a0474\u2013481). IEEE. https:\/\/doi.org\/10.1109\/TALE48869.2020.9368369","DOI":"10.1109\/TALE48869.2020.9368369"},{"key":"494_CR225","doi-asserted-by":"publisher","unstructured":"*Wang, J., Zhang, Y., Sun, L., Liu, Y., Zhang, W., & Zhang, Y. (2023). Learning path design on knowledge graph by using reinforcement learning. In 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (pp.\u00a03480\u20133485). IEEE. 
https:\/\/doi.org\/10.1109\/BIBM58861.2023.10386061","DOI":"10.1109\/BIBM58861.2023.10386061"},{"key":"494_CR226","doi-asserted-by":"publisher","unstructured":"*Wang, W., & Song, S. (2024). Real-Time Wireless Adaptive Learning Systems Using Reinforcement Learning and IoT for Smart Education. In 2024 Cross Strait Radio Science and Wireless Technology Conference (CSRSWTC) (pp.\u00a01\u20134). IEEE. https:\/\/doi.org\/10.1109\/CSRSWTC64338.2024.10811643","DOI":"10.1109\/CSRSWTC64338.2024.10811643"},{"issue":"3\u20134","key":"494_CR227","doi-asserted-by":"publisher","first-page":"279","DOI":"10.1007\/BF00992698","volume":"8","author":"CJCH Watkins","year":"1992","unstructured":"Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3\u20134), 279\u2013292. https:\/\/doi.org\/10.1007\/BF00992698","journal-title":"Machine Learning"},{"issue":"2","key":"494_CR228","doi-asserted-by":"publisher","first-page":"152","DOI":"10.1109\/TLT.2017.2692761","volume":"11","author":"J Whitehill","year":"2018","unstructured":"Whitehill, J., & Movellan, J. (2018). Approximately optimal teaching of approximately optimal learners. IEEE Transactions on Learning Technologies, 11(2), 152\u2013164. https:\/\/doi.org\/10.1109\/TLT.2017.2692761","journal-title":"IEEE Transactions on Learning Technologies"},{"issue":"3\u20134","key":"494_CR229","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1007\/BF00992696","volume":"8","author":"RJ Williams","year":"1992","unstructured":"Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3\u20134), 229\u2013256. https:\/\/doi.org\/10.1007\/BF00992696","journal-title":"Machine Learning"},{"key":"494_CR230","doi-asserted-by":"publisher","first-page":"691","DOI":"10.1109\/TLT.2023.3326449","volume":"17","author":"S Wu","year":"2024","unstructured":"Wu, S., Wang, J., & Zhang, W. (2024). 
Contrastive Personalized Exercise Recommendation With Reinforcement Learning. IEEE Transactions on Learning Technologies, 17, 691\u2013703. https:\/\/doi.org\/10.1109\/TLT.2023.3326449","journal-title":"IEEE Transactions on Learning Technologies"},{"key":"494_CR231","doi-asserted-by":"publisher","unstructured":"Yang, J. (2024). English Learning Knowledge Point Recommendation Algorithm based on Deep Deterministic Policy Gradient. In 2024 International Conference on Integrated Intelligence and Communication Systems (ICIICS) (pp.\u00a01\u20135). IEEE. https:\/\/doi.org\/10.1109\/ICIICS63763.2024.10860216","DOI":"10.1109\/ICIICS63763.2024.10860216"},{"key":"494_CR232","doi-asserted-by":"publisher","unstructured":"*Yantao, L., & Wei, J. (2024). Enhancing Student Engagement in Smart Classrooms Using Reinforcement Learning Algorithms. In 2024 Cross Strait Radio Science and Wireless Technology Conference (CSRSWTC) (pp.\u00a01\u20134). IEEE. https:\/\/doi.org\/10.1109\/CSRSWTC64338.2024.10811575","DOI":"10.1109\/CSRSWTC64338.2024.10811575"},{"key":"494_CR233","doi-asserted-by":"publisher","unstructured":"*Yessad, A. (2023). Using the ITS components in improving the q-learning policy for instructional sequencing. In C. Frasson, P. Mylonas, & C. Troussas (Eds.), Lecture Notes in Computer Science: Vol. 13891. Augmented Intelligence and Intelligent Tutoring Systems: 19th International Conference, ITS 2023 Proceedings (1st ed. 2023, Vol. 13891, pp.\u00a0247\u2013256). Springer Nature Switzerland; Imprint Springer. https:\/\/doi.org\/10.1007\/978-3-031-32883-1_21","DOI":"10.1007\/978-3-031-32883-1_21"},{"key":"494_CR234","doi-asserted-by":"publisher","unstructured":"Yuh, M. S., Rabb, E., Thorpe, A., & Jain, N. (2024). Using Reward Shaping to Train Cognitive-Based Control Policies for Intelligent Tutoring Systems. In 2024 American Control Conference (ACC) (pp.\u00a03223\u20133230). IEEE. 
https:\/\/doi.org\/10.23919\/ACC60939.2024.10644169","DOI":"10.23919\/ACC60939.2024.10644169"},{"key":"494_CR235","doi-asserted-by":"crossref","unstructured":"Zadem, M., Mover, S., & Nguyen, S. M. (2023). Emergence of a Symbolic Goal Representation with an Intelligent Tutoring System based on Intrinsic Motivation. In NeurIPS 2023: IMOL Workshop \"Intrinsically-Motivated and Open-Ended Learning\" (pp.\u00a0423\u2013428). IEEE.","DOI":"10.1109\/ICDL55364.2023.10364473"},{"key":"494_CR236","unstructured":"*Zhang, Y., & Goh, W.-B. (2019). Bootstrapped policy gradient for difficulty adaptation in intelligent tutoring systems. In AAMAS \u201919, Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (pp.\u00a0711\u2013719). International Foundation for Autonomous Agents and Multiagent Systems."},{"key":"494_CR237","doi-asserted-by":"publisher","unstructured":"Zhang, H., & Yu, T. (2020). Taxonomy of reinforcement learning algorithms. In H. Dong, Z. Ding, & S. Zhang (Eds.), Springer eBook Collection. Deep Reinforcement Learning: Fundamentals, Research and Applications (1st ed. 2020, pp.\u00a0125\u2013133). Springer Singapore; Imprint Springer. https:\/\/doi.org\/10.1007\/978-981-15-4095-0_3","DOI":"10.1007\/978-981-15-4095-0_3"},{"key":"494_CR238","doi-asserted-by":"publisher","unstructured":"*Zhang, J. (2023). Game Design and Learning Effectiveness Evaluation of English Teaching Based on Reinforcement Learning Algorithm. In 2023 International Conference on Intelligent Computing, Communication & Convergence (ICI3C) (pp.\u00a0349\u2013353). IEEE. https:\/\/doi.org\/10.1109\/ICI3C60830.2023.00073","DOI":"10.1109\/ICI3C60830.2023.00073"},{"key":"494_CR239","doi-asserted-by":"publisher","unstructured":"Zhang, D. (2024). Using deep Reinforcement Learning to Optimize the Motivational Incentive Mechanism of Online English Learners. 
In Proceedings of the International Conference on Decision Science & Management (pp.\u00a0179\u2013183). ACM. https:\/\/doi.org\/10.1145\/3686081.3686110","DOI":"10.1145\/3686081.3686110"},{"issue":"4","key":"494_CR240","doi-asserted-by":"publisher","first-page":"753","DOI":"10.1007\/s11257-021-09292-w","volume":"31","author":"Y Zhang","year":"2021","unstructured":"Zhang, Y., & Goh, W.-B. (2021). Personalized task difficulty adaptation based on reinforcement learning. User Modeling and User-Adapted Interaction, 31(4), 753\u2013784. https:\/\/doi.org\/10.1007\/s11257-021-09292-w","journal-title":"User Modeling and User-Adapted Interaction"},{"key":"494_CR241","doi-asserted-by":"publisher","unstructured":"Zhiyong, J., Jing, T., & Jing, Z. (2021). Allocation of english remote guiding based on deep reinforcement learning and multi-objective optimization. In Proceedings of the 5th International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud): I-SMAC 2021 : 11\u201313, November 2021 (pp.\u00a0414\u2013417). IEEE. https:\/\/doi.org\/10.1109\/I-SMAC52330.2021.9640763","DOI":"10.1109\/I-SMAC52330.2021.9640763"},{"key":"494_CR242","doi-asserted-by":"publisher","unstructured":"*Zhou, G., Azizsoltani, H., Ausin, M. S., Barnes, T., & Chi, M. (2019). Hierarchical reinforcement learning for pedagogical policy induction. In S. Isotani, E. Mill\u00e1n, A. Ogan, P. Hastings, B. McLaren, & R. Luckin (Eds.), LNCS sublibrary: 11625\u201311626. Artificial intelligence in education: 20th international conference, AIED 2019, Chicago, IL, USA, June 25\u201329, 2019, proceedings (Vol. 11625, pp.\u00a0544\u2013556). Springer International Publishing. https:\/\/doi.org\/10.1007\/978-3-030-23204-7_45","DOI":"10.1007\/978-3-030-23204-7_45"},{"key":"494_CR243","doi-asserted-by":"publisher","unstructured":"*Zhou, G., Yang, X., Azizsoltani, H., Barnes, T., & Chi, M. (2020). 
Improving student-system interaction through data-driven explanations of hierarchical reinforcement learning induced pedagogical policies. In T. Kuflik (Ed.), ACM Digital Library, Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization (pp.\u00a0284\u2013292). Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3340631.3394848","DOI":"10.1145\/3340631.3394848"},{"issue":"2","key":"494_CR244","doi-asserted-by":"publisher","first-page":"454","DOI":"10.1007\/s40593-021-00269-9","volume":"32","author":"G Zhou","year":"2021","unstructured":"Zhou, G., Azizsoltani, H., Ausin, M. S., Barnes, T., & Chi, M. (2021). Leveraging granularity: Hierarchical reinforcement learning for pedagogical policy induction. International Journal of Artificial Intelligence in Education, 32(2), 454\u2013500. https:\/\/doi.org\/10.1007\/s40593-021-00269-9","journal-title":"International Journal of Artificial Intelligence in Education"}],"container-title":["International Journal of Artificial Intelligence in 
Education"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40593-025-00494-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40593-025-00494-6","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40593-025-00494-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T18:12:42Z","timestamp":1772647962000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40593-025-00494-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,10]]},"references-count":244,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["494"],"URL":"https:\/\/doi.org\/10.1007\/s40593-025-00494-6","relation":{},"ISSN":["1560-4292","1560-4306"],"issn-type":[{"value":"1560-4292","type":"print"},{"value":"1560-4306","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,10]]},"assertion":[{"value":"16 June 2025","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 July 2025","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}},{"value":"This review was not registered. The protocol for the systematic review is available upon request from the corresponding author.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Registration Information"}}]}}