{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T16:36:00Z","timestamp":1776098160153,"version":"3.50.1"},"reference-count":67,"publisher":"Elsevier BV","issue":"6","license":[{"start":{"date-parts":[[2025,7,25]],"date-time":"2025-07-25T00:00:00Z","timestamp":1753401600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,7,25]],"date-time":"2025-07-25T00:00:00Z","timestamp":1753401600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"NSF National AI Research Institutes Program","award":["2112532"],"award-info":[{"award-number":["2112532"]}]},{"name":"NSF National AI Research Institutes Program","award":["2112532"],"award-info":[{"award-number":["2112532"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Artif Intell Educ"],"published-print":{"date-parts":[[2025,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>This study explores how large language models (LLMs), specifically GPT-4, could be used to generate personalized feedback within an Intelligent Tutoring System (ITS). The research focuses on evaluating the model\u2019s ability to (1) diagnose student errors, (2) generate personalized corrective feedback, and (3) assess the accuracy of diagnoses and helpfulness of the feedback. We analyze student errors from the Apprentice Tutor College Algebra ITS and prompt GPT-4 to give targeted feedback on those errors. The findings suggest that while this model can effectively diagnose a range of student errors, its feedback varies in effectiveness based on the complexity of the problem and the type of error. 
While GPT-4 generates relevant, specific feedback a majority of the time, 35% of the hints were too general, incorrect, or gave away the correct answer. The study also explores methods for using an LLM to automatically evaluate the validity of generated feedback, and finds that only 35% of feedback passes automated helpfulness evaluations.<\/jats:p>","DOI":"10.1007\/s40593-025-00505-6","type":"journal-article","created":{"date-parts":[[2025,7,25]],"date-time":"2025-07-25T16:59:37Z","timestamp":1753462777000},"page":"3459-3500","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Generating In-Context, Personalized Feedback for Intelligent Tutors with Large Language Models"],"prefix":"10.1016","volume":"35","author":[{"given":"Jennifer M","family":"Reddig","sequence":"first","affiliation":[]},{"given":"Arav","family":"Arora","sequence":"additional","affiliation":[]},{"given":"Christopher J.","family":"MacLellan","sequence":"additional","affiliation":[]}],"member":"78","published-online":{"date-parts":[[2025,7,25]]},"reference":[{"issue":"2","key":"505_CR1","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1109\/TLT.2009.22","volume":"2","author":"V Aleven","year":"2009","unstructured":"Aleven, V., McLaren, B. M., & Sewall, J. (2009). Scaling up programming by demonstration for intelligent tutoring systems development: An open-access web site for middle school mathematics learning. IEEE Transactions on Learning Technologies, 2(2), 64\u201378.","journal-title":"IEEE Transactions on Learning Technologies"},{"key":"505_CR2","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1007\/s40593-015-0089-1","volume":"26","author":"V Aleven","year":"2016","unstructured":"Aleven, V., Roll, I., McLaren, B. M., & Koedinger, K. R. (2016). Help helps, but only so much: Research on help seeking with intelligent tutoring systems. 
International Journal of Artificial Intelligence in Education, 26, 205\u2013223.","journal-title":"International Journal of Artificial Intelligence in Education"},{"key":"505_CR3","unstructured":"Anderson, J., & Pelletier, R. (1991). A development system for model-tracing tutors"},{"key":"505_CR4","doi-asserted-by":"crossref","unstructured":"Azaiz, I., Kiesler, N., & Strickroth, S. (2024). Feedback-generation for programming exercises with gpt-4. In: Proceedings of the 2024 on Innovation and Technology in Computer Science Education (vol. 1, pp. 31\u201337).","DOI":"10.1145\/3649217.3653594"},{"key":"505_CR5","doi-asserted-by":"crossref","unstructured":"Azevedo, R., Witherspoon, A., Graesser, A., McNamara, D., Chauncey, A., Siler, E., Cai, Z., Rus, V., & Lintean, M. (2009). Metatutor: Analyzing self-regulated learning in a tutoring system for biology. In: Artificial Intelligence in Education, (pp. 635\u2013637). IOS Press","DOI":"10.3233\/978-1-60750-028-5-635"},{"key":"505_CR6","unstructured":"Berglund, L., Tong, M., Kaufmann, M., Balesni, M., Stickland, A.C., Korbak, T., & Evans, O. (2023). The reversal curse: Llms trained on \u201ca is b\u201d fail to learn \u201cb is a\u201d. arXiv:2309.12288"},{"key":"505_CR7","doi-asserted-by":"crossref","unstructured":"Butgereit, L. (2024). Using gpt-4 to tutor technical subjects in non-english languages in africa. In: International conference on information technology-new generations, (pp. 35\u201340). Springer","DOI":"10.1007\/978-3-031-56599-1_5"},{"key":"505_CR8","doi-asserted-by":"crossref","unstructured":"Butgereit, L., Martinus, H., & Abugosseisa, M.M. (2023). Prof pi: tutoring mathematics in arabic language using gpt-4 and whatsapp. In: 2023 IEEE 27th international conference on intelligent engineering systems (INES), (pp. 000161\u2013000164). 
IEEE","DOI":"10.1109\/INES59282.2023.10297824"},{"issue":"2","key":"505_CR9","doi-asserted-by":"publisher","first-page":"290","DOI":"10.1037\/a0031026","volume":"105","author":"AC Butler","year":"2013","unstructured":"Butler, A. C., Godbole, N., & Marsh, E. J. (2013). Explanation feedback is better than correct answer feedback for promoting transfer of learning. Journal of Educational Psychology, 105(2), 290.","journal-title":"Journal of Educational Psychology"},{"key":"505_CR10","doi-asserted-by":"crossref","unstructured":"Calo, T., & Maclellan, C. (2024). Towards educator-driven tutor authoring: Generative ai approaches for creating intelligent tutor interfaces. In: Proceedings of the Eleventh ACM Conference on Learning@ Scale, (pp. 305\u2013309).","DOI":"10.1145\/3657604.3664694"},{"key":"505_CR11","unstructured":"Choi, J.S., & Crossley, S.A. (2021). Arte: Automatic readability tool for english. NLP Tools for the Social Sciences. linguisticanalysistools.org."},{"key":"505_CR12","doi-asserted-by":"publisher","first-page":"253","DOI":"10.1007\/BF01099821","volume":"4","author":"AT Corbett","year":"1994","unstructured":"Corbett, A. T., & Anderson, J. R. (1994). Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-adapted Interaction, 4, 253\u2013278.","journal-title":"User Modeling and User-adapted Interaction"},{"issue":"3","key":"505_CR13","doi-asserted-by":"publisher","first-page":"409","DOI":"10.1017\/S0261444808005077","volume":"41","author":"SA Crossley","year":"2008","unstructured":"Crossley, S. A., & McNamara, D. S. (2008). Assessing l2 reading texts at the intermediate level: An approximate replication of crossley, louwerse, mccarthy & mcnamara (2007). 
Language Teaching, 41(3), 409\u2013429.","journal-title":"Language Teaching"},{"issue":"3\u20134","key":"505_CR14","doi-asserted-by":"publisher","first-page":"541","DOI":"10.1111\/1467-9817.12283","volume":"42","author":"SA Crossley","year":"2019","unstructured":"Crossley, S. A., Skalicky, S., & Dascalu, M. (2019). Moving beyond classic readability formulas: New methods and new models. Journal of Research in Reading, 42(3\u20134), 541\u2013561.","journal-title":"Journal of Research in Reading"},{"key":"505_CR15","doi-asserted-by":"publisher","first-page":"104094","DOI":"10.1016\/j.compedu.2020.104094","volume":"162","author":"G Deeva","year":"2021","unstructured":"Deeva, G., Bogdanova, D., Serral, E., Snoeck, M., & De Weerdt, J. (2021). A review of automated feedback systems for learners: Classification framework, challenges and opportunities. Computers & Education, 162, 104094.","journal-title":"Computers & Education"},{"key":"505_CR16","unstructured":"Dziri, N., Lu, X., Sclar, M., Li, X.L., Jiang, L., Lin, B.Y., Welleck, S., West, P., Bhagavatula, C., Le\u00a0Bras, R., et al. (2024). Faith and fate: Limits of transformers on compositionality. Advances in Neural Information Processing Systems, 36."},{"issue":"1","key":"505_CR17","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1177\/1475725720971205","volume":"20","author":"N Enders","year":"2021","unstructured":"Enders, N., Gaschler, R., & Kubik, V. (2021). Online quizzes with closed questions in formal assessment: How elaborate feedback can promote learning. Psychology Learning & Teaching, 20(1), 91\u2013106.","journal-title":"Psychology Learning & Teaching"},{"key":"505_CR18","doi-asserted-by":"publisher","first-page":"101329","DOI":"10.1016\/j.aos.2021.101329","volume":"99","author":"D Erickson","year":"2022","unstructured":"Erickson, D., Holderness, D. K., Olsen, K. J., & Thornock, T. A. (2022). Feedback with feeling? how emotional language in feedback affects individual performance. 
Accounting, Organizations and Society, 99, 101329. https:\/\/doi.org\/10.1016\/j.aos.2021.101329","journal-title":"Accounting, Organizations and Society"},{"key":"505_CR19","doi-asserted-by":"publisher","first-page":"104","DOI":"10.1016\/j.learninstruc.2017.08.007","volume":"54","author":"B Finn","year":"2018","unstructured":"Finn, B., Thomas, R., & Rawson, K. A. (2018). Learning more from feedback: Elaborating feedback with examples enhances concept learning. Learning and Instruction, 54, 104\u2013113.","journal-title":"Learning and Instruction"},{"key":"505_CR20","unstructured":"Frieder, S., Pinchetti, L., Griffiths, R.-R., Salvatori, T., Lukasiewicz, T., Petersen, P., & Berner, J. (2024). Mathematical capabilities of chatgpt. Advances in Neural Information Processing Systems, 36"},{"key":"505_CR21","unstructured":"Gendron, G., Bao, Q., Witbrock, M., & Dobbie, G. (2023). Large language models are not strong abstract reasoners. arXiv:2305.19555"},{"issue":"6p1","key":"505_CR22","doi-asserted-by":"publisher","first-page":"503","DOI":"10.1037\/h0028501","volume":"60","author":"DA Gilman","year":"1969","unstructured":"Gilman, D. A. (1969). Comparison of several feedback methods for correcting errors by computer-assisted instruction. Journal of Educational Psychology, 60(6p1), 503.","journal-title":"Journal of Educational Psychology"},{"issue":"1","key":"505_CR23","doi-asserted-by":"publisher","first-page":"81","DOI":"10.3102\/003465430298487","volume":"77","author":"J Hattie","year":"2007","unstructured":"Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81\u2013112.","journal-title":"Review of Educational Research"},{"key":"505_CR24","doi-asserted-by":"publisher","first-page":"470","DOI":"10.1007\/s40593-014-0024-x","volume":"24","author":"NT Heffernan","year":"2014","unstructured":"Heffernan, N. T., & Heffernan, C. L. (2014). 
The assistments ecosystem: Building a platform that brings scientists and teachers together for minimally invasive research on human learning and teaching. International Journal of Artificial Intelligence in Education, 24, 470\u2013497.","journal-title":"International Journal of Artificial Intelligence in Education"},{"key":"505_CR25","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1007\/s11423-020-09874-2","volume":"69","author":"NR Howard","year":"2021","unstructured":"Howard, N. R. (2021). \u201chow did i do?\u201d: Giving learners effective and affective feedback. Educational Technology Research and Development, 69, 123\u2013126.","journal-title":"Educational Technology Research and Development"},{"issue":"1","key":"505_CR26","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1016\/0004-3702(90)90094-G","volume":"42","author":"WL Johnson","year":"1990","unstructured":"Johnson, W. L. (1990). Understanding and debugging novice programs. Artificial Intelligence, 42(1), 51\u201397.","journal-title":"Artificial Intelligence"},{"issue":"2","key":"505_CR27","doi-asserted-by":"publisher","first-page":"106","DOI":"10.2307\/748867","volume":"7","author":"JD Knifong","year":"1976","unstructured":"Knifong, J. D., & Holtan, B. (1976). An analysis of children\u2019s written solutions to word problems. Journal for Research in Mathematics Education, 7(2), 106\u2013112.","journal-title":"Journal for Research in Mathematics Education"},{"key":"505_CR28","doi-asserted-by":"crossref","unstructured":"Kumar, H., Rothschild, D.M., Goldstein, D.G., & Hofman, J.M. (2023). Math education with large language models: Peril or promise? Available at SSRN 4641653","DOI":"10.2139\/ssrn.4641653"},{"key":"505_CR29","unstructured":"Lee, K., Firat, O., Agarwal, A., Fannjiang, C., & Sussillo, D. (2018). 
Hallucinations in neural machine translation"},{"issue":"8","key":"505_CR30","first-page":"269","volume":"2","author":"S Loria","year":"2018","unstructured":"Loria, S., et al. (2018). textblob documentation. Release 0.15, 2(8), 269.","journal-title":"Release 0.15"},{"key":"505_CR31","doi-asserted-by":"crossref","unstructured":"Lovett, M.C. (1998). Cognitive task analysis in service of intelligent tutoring system design: A case study in statistics. In: International Conference on Intelligent Tutoring Systems, (pp. 234\u2013243). Springer","DOI":"10.1007\/3-540-68716-5_29"},{"key":"505_CR32","doi-asserted-by":"crossref","unstructured":"Lu, X., & Wang, X. (2024). Generative students: using llm-simulated student profiles to support question item evaluation. In: Proceedings of the eleventh ACM conference on learning@ Scale, (pp. 16\u201327)","DOI":"10.1145\/3657604.3662031"},{"issue":"1","key":"505_CR33","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1007\/s40593-020-00214-2","volume":"32","author":"CJ MacLellan","year":"2022","unstructured":"MacLellan, C. J., & Koedinger, K. R. (2022). Domain-general tutor authoring with apprentice learner models. International Journal of Artificial Intelligence in Education, 32(1), 76\u2013117.","journal-title":"International Journal of Artificial Intelligence in Education"},{"issue":"1","key":"505_CR34","doi-asserted-by":"publisher","first-page":"1","DOI":"10.9743\/JEO.2005.1.5","volume":"2","author":"BJ Mandernach","year":"2005","unstructured":"Mandernach, B. J. (2005). Relative effectiveness of computer-based and human feedback for enhancing student learning. The Journal of Educators Online, 2(1), 1\u201317.","journal-title":"The Journal of Educators Online"},{"issue":"6","key":"505_CR35","doi-asserted-by":"publisher","first-page":"645","DOI":"10.1080\/09658211.2012.684882","volume":"20","author":"EJ Marsh","year":"2012","unstructured":"Marsh, E. J., Lozito, J. P., Umanath, S., Bjork, E. L., & Bjork, R. A. (2012). 
Using verification feedback to correct errors made on a multiple-choice test. Memory, 20(6), 645\u2013653.","journal-title":"Memory"},{"issue":"4","key":"505_CR36","doi-asserted-by":"publisher","first-page":"381","DOI":"10.1207\/s15327051hci0504_2","volume":"5","author":"J McKendree","year":"1990","unstructured":"McKendree, J. (1990). Effective feedback content for tutoring complex skills. Human-Computer Interaction, 5(4), 381\u2013413.","journal-title":"Human-Computer Interaction"},{"key":"505_CR37","unstructured":"McNichols, H., Feng, W., Lee, J., Scarlatos, A., Smith, D., Woodhead, S., & Lan, A. (2023). Automated distractor and feedback generation for math multiple-choice questions via in-context learning. NeurIPS."},{"key":"505_CR38","doi-asserted-by":"crossref","unstructured":"Mollick, E., & Mollick, L. (2024). Instructors as innovators: A future-focused approach to new ai learning opportunities, with prompts. arXiv:2407.05181","DOI":"10.2139\/ssrn.4802463"},{"key":"505_CR39","doi-asserted-by":"crossref","unstructured":"Nesbit, J.C., Adesope, O.O., Liu, Q., & Ma, W. (2014). How effective are intelligent tutoring systems in computer science education? In: 2014 IEEE 14th International Conference on Advanced Learning Technologies, (pp. 99\u2013103). IEEE","DOI":"10.1109\/ICALT.2014.38"},{"key":"505_CR40","first-page":"31","volume":"39","author":"MA Newman","year":"1977","unstructured":"Newman, M. A. (1977). An analysis of sixth-grade pupil\u2019s error on written mathematical tasks. Victorian Institute for Educational Research Bulletin, 39, 31\u201343.","journal-title":"Victorian Institute for Educational Research Bulletin"},{"issue":"5","key":"505_CR41","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1111\/jerd.12231","volume":"28","author":"C Olms","year":"2016","unstructured":"Olms, C., Jakstat, H. A., & Haak, R. (2016). The implementation of elaborative feedback for qualitative improvement of shade matching\u2014a randomized study. 
Journal of Esthetic and Restorative Dentistry, 28(5), 277\u2013286.","journal-title":"Journal of Esthetic and Restorative Dentistry"},{"key":"505_CR42","unstructured":"Paa\u00dfen, B., Hammer, B., Price, T. W., Barnes, T., Gross, S., & Pinkwart, N. (2018). The continuous hint factory - providing hints in vast and sparsely populated edit distance spaces. Journal of Educational Data Mining,10(2018), 1\u201335. arXiv:1708.06564"},{"key":"505_CR43","doi-asserted-by":"crossref","unstructured":"Pal\u00a0Chowdhury, S., Zouhar, V., & Sachan, M. (2024). Autotutor meets large language models: A language model tutor with rich pedagogy and guardrails. In: Proceedings of the Eleventh ACM Conference on Learning@ Scale, (pp. 5\u201315).","DOI":"10.1145\/3657604.3662041"},{"key":"505_CR44","doi-asserted-by":"crossref","unstructured":"Pardos, Z.A., Tang, M., Anastasopoulos, I., Sheel, S.K., & Zhang, E. (2023). Oatutor: An open-source adaptive tutoring system and curated content library for learning sciences research. In: Proceedings of the 2023 Chi Conference on Human Factors in Computing Systems, (pp. 1\u201317).","DOI":"10.1145\/3544548.3581574"},{"issue":"5","key":"505_CR45","doi-asserted-by":"publisher","first-page":"0304013","DOI":"10.1371\/journal.pone.0304013","volume":"19","author":"ZA Pardos","year":"2024","unstructured":"Pardos, Z. A., & Bhandari, S. (2024). Chatgpt-generated help produces learning gains equivalent to human tutor-authored help on mathematics skills. Plos One, 19(5), 0304013.","journal-title":"Plos One"},{"key":"505_CR46","doi-asserted-by":"crossref","unstructured":"Phung, T., P\u0103durean, V.-A., Cambronero, J., Gulwani, S., Kohn, T., Majumdar, R., Singla, A., & Soares, G. (2023). Generative ai for programming education: Benchmarking chatgpt, gpt-4, and human tutors. In: Proceedings of the 2023 ACM Conference on International Computing Education Research-Volume 2, (pp. 
41\u201342).","DOI":"10.1145\/3568812.3603476"},{"key":"505_CR47","doi-asserted-by":"crossref","unstructured":"Phung, T., P\u0103durean, V.-A., Singh, A., Brooks, C., Cambronero, J., Gulwani, S., Singla, A., & Soares, G. (2024). Automating human tutor-style programming feedback: Leveraging gpt-4 tutor model for hint generation and gpt-3.5 student model for hint validation. In: Proceedings of the 14th Learning Analytics and Knowledge Conference, (pp. 12\u201323).","DOI":"10.1145\/3636555.3636846"},{"key":"505_CR48","doi-asserted-by":"publisher","first-page":"460","DOI":"10.1016\/j.compedu.2014.12.009","volume":"82","author":"MA Rau","year":"2015","unstructured":"Rau, M. A., Michaelis, J. E., & Fay, N. (2015). Connection making between multiple graphical representations: A multi-methods approach for domain-specific grounding of an intelligent tutoring system for chemistry. Computers & Education, 82, 460\u2013485.","journal-title":"Computers & Education"},{"issue":"2","key":"505_CR49","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1109\/TLT.2009.23","volume":"2","author":"L Razzaq","year":"2009","unstructured":"Razzaq, L., Patvarczki, J., Almeida, S. F., Vartak, M., Feng, M., Heffernan, N. T., & Koedinger, K. R. (2009). The assistment builder: Supporting the life cycle of tutoring system content creation. IEEE Transactions on Learning Technologies, 2(2), 157\u2013166.","journal-title":"IEEE Transactions on Learning Technologies"},{"key":"505_CR50","doi-asserted-by":"publisher","first-page":"249","DOI":"10.3758\/BF03194060","volume":"14","author":"S Ritter","year":"2007","unstructured":"Ritter, S., Anderson, J. R., Koedinger, K. R., & Corbett, A. (2007). Cognitive tutor: Applied research in mathematics education. 
Psychonomic Bulletin & Review, 14, 249\u2013255.","journal-title":"Psychonomic Bulletin & Review"},{"key":"505_CR51","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1007\/s40593-015-0070-z","volume":"27","author":"K Rivers","year":"2017","unstructured":"Rivers, K., & Koedinger, K. R. (2017). Data-driven hint generation in vast solution spaces: a self-improving python programming tutor. International Journal of Artificial Intelligence in Education, 27, 37\u201364.","journal-title":"International Journal of Artificial Intelligence in Education"},{"key":"505_CR52","doi-asserted-by":"crossref","unstructured":"Roest, L., Keuning, H., & Jeuring, J. (2024). Next-step hint generation for introductory programming using large language models. In: Proceedings of the 26th Australasian Computing Education Conference, (pp. 144\u2013153).","DOI":"10.1145\/3636243.3636259"},{"key":"505_CR53","doi-asserted-by":"crossref","unstructured":"Shahri, H., Emad, M., Ibrahim, N., Rais, R.N.B., & Al-Fayoumi, Y. (2024). Elevating education through ai tutor: Utilizing gpt-4 for personalized learning. In: 2024 15th Annual undergraduate research conference on applied computing (URC), (pp. 1\u20135). IEEE","DOI":"10.1109\/URC62276.2024.10604578"},{"issue":"1","key":"505_CR54","doi-asserted-by":"publisher","first-page":"153","DOI":"10.3102\/0034654307313795","volume":"78","author":"VJ Shute","year":"2008","unstructured":"Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78(1), 153\u2013189.","journal-title":"Review of Educational Research"},{"key":"505_CR55","unstructured":"Smith, G., Gupta, A., & MacLellan, C. (2024). Apprentice Tutor Builder: A Platform For Users to Create and Personalize Intelligent Tutors. arxiv:2404.07883"},{"key":"505_CR56","unstructured":"Stamper, J., Barnes, T., Lehmann, L., & Croy, M. (2008). The hint factory: Automatic generation of contextualized help for existing computer aided instruction. 
In: Proceedings of the 9th International Conference on Intelligent Tutoring Systems Young Researchers Track, (pp. 71\u201378)."},{"issue":"3","key":"505_CR57","doi-asserted-by":"publisher","first-page":"60","DOI":"10.4018\/IJICTE.2019070105","volume":"15","author":"D Tafazoli","year":"2019","unstructured":"Tafazoli, D., Mar\u00eda, E. G., & Abril, C. A. H. (2019). Intelligent language tutoring system: Integrating intelligent computer-assisted language learning into language education. International Journal of Information and Communication Technology Education (IJICTE), 15(3), 60\u201374.","journal-title":"International Journal of Information and Communication Technology Education (IJICTE)"},{"key":"505_CR58","unstructured":"Thomas, D.R. (2003). A general inductive approach for qualitative data analysis"},{"key":"505_CR59","doi-asserted-by":"crossref","unstructured":"Truong, T.H., Baldwin, T., Verspoor, K., & Cohn, T. (2023). Language models are not naysayers: An analysis of language models on negation benchmarks. arxiv:2306.08189","DOI":"10.18653\/v1\/2023.starsem-1.10"},{"key":"505_CR60","unstructured":"Valmeekam, K., Olmo, A., Sreedharan, S., & Kambhampati, S. (2022). Large language models still can\u2019t plan (a benchmark for llms on planning and reasoning about change). In: NeurIPS 2022 Foundation Models for Decision Making Workshop"},{"key":"505_CR61","first-page":"75993","volume":"36","author":"K Valmeekam","year":"2023","unstructured":"Valmeekam, K., Marquez, M., Sreedharan, S., & Kambhampati, S. (2023). On the planning abilities of large language models-a critical investigation. Advances in Neural Information Processing Systems, 36, 75993\u201376005.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"3","key":"505_CR62","doi-asserted-by":"publisher","first-page":"227","DOI":"10.3233\/IRG-2006-16(3)02","volume":"16","author":"K VanLehn","year":"2006","unstructured":"VanLehn, K. (2006). The behavior of tutoring systems. 
International Journal of Artificial Intelligence in Education, 16(3), 227\u2013265.","journal-title":"International Journal of Artificial Intelligence in Education"},{"issue":"1","key":"505_CR63","doi-asserted-by":"publisher","first-page":"010152","DOI":"10.1103\/PhysRevPhysEducRes.20.010152","volume":"20","author":"T Wan","year":"2024","unstructured":"Wan, T., & Chen, Z. (2024). Exploring generative ai assisted feedback writing for students\u2019 written responses to a physics conceptual question with prompt engineering and few-shot learning. Physical Review Physics Education Research, 20(1), 010152.","journal-title":"Physical Review Physics Education Research"},{"key":"505_CR64","doi-asserted-by":"crossref","unstructured":"Weitekamp, D., Harpstead, E., & Koedinger, K.R. (2020). An interaction design for machine teaching to develop ai tutors. In: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, (pp. 1\u201311).","DOI":"10.1145\/3313831.3376226"},{"key":"505_CR65","first-page":"24824","volume":"35","author":"J Wei","year":"2022","unstructured":"Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q. V., Zhou, D., et al. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35, 24824\u201324837.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"505_CR66","doi-asserted-by":"publisher","unstructured":"Xiao, R., Hou, X., & Stamper, J. (2024). Exploring how multiple levels of gpt-generated programming hints support or disappoint novices. In: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems. CHI \u201924. ACM. https:\/\/doi.org\/10.1145\/3613905.3650937","DOI":"10.1145\/3613905.3650937"},{"key":"505_CR67","unstructured":"Yamkovenko, S. (2023). Sal Khan\u2019s 2023 TED talk: AI in the classroom can transform education. Khan Academy. 
Accessed: 26-Sept-2024"}],"container-title":["International Journal of Artificial Intelligence in Education"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40593-025-00505-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40593-025-00505-6","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40593-025-00505-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,4]],"date-time":"2026-03-04T18:12:46Z","timestamp":1772647966000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40593-025-00505-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,25]]},"references-count":67,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["505"],"URL":"https:\/\/doi.org\/10.1007\/s40593-025-00505-6","relation":{},"ISSN":["1560-4292","1560-4306"],"issn-type":[{"value":"1560-4292","type":"print"},{"value":"1560-4306","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,25]]},"assertion":[{"value":"10 July 2025","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 July 2025","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing Interests"}}]}}