{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T13:32:29Z","timestamp":1769693549057,"version":"3.49.0"},"reference-count":91,"publisher":"Springer Science and Business Media LLC","issue":"18","license":[{"start":{"date-parts":[[2025,10,2]],"date-time":"2025-10-02T00:00:00Z","timestamp":1759363200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"},{"start":{"date-parts":[[2025,10,2]],"date-time":"2025-10-02T00:00:00Z","timestamp":1759363200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.springernature.com\/gp\/researchers\/text-and-data-mining"}],"funder":[{"DOI":"10.13039\/501100017630","name":"Humanities and Social Sciences Youth Foundation, Ministry of Education","doi-asserted-by":"publisher","award":["24YJC880062"],"award-info":[{"award-number":["24YJC880062"]}],"id":[{"id":"10.13039\/501100017630","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Educ Inf Technol"],"published-print":{"date-parts":[[2025,12]]},"DOI":"10.1007\/s10639-025-13774-4","type":"journal-article","created":{"date-parts":[[2025,10,2]],"date-time":"2025-10-02T08:37:23Z","timestamp":1759394243000},"page":"25881-25908","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Evaluating the performance of ChatGPT and Claude in automated writing scoring: Insights from the Many-facet Rasch 
model"],"prefix":"10.1007","volume":"30","author":[{"given":"Rui","family":"Jin","sequence":"first","affiliation":[]},{"given":"Mingren","family":"Zhao","sequence":"additional","affiliation":[]},{"given":"Chunling","family":"Niu","sequence":"additional","affiliation":[]},{"given":"Yuyan","family":"Xia","sequence":"additional","affiliation":[]},{"given":"Hao","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Na","family":"Liu","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,10,2]]},"reference":[{"key":"13774_CR1","doi-asserted-by":"publisher","first-page":"197061","DOI":"10.1109\/ACCESS.2024.3521945","volume":"12","author":"Y Abdelgadir Mohamed","year":"2024","unstructured":"Abdelgadir Mohamed, Y., Mohamed, A. H. H. M., Khanan, A., Bashir, M., Adiel, M. A. E., & Elsadig, M. A. (2024). Navigating the ethical terrain of AI-generated text tools: A review. IEEE Access\u202f: Practical Innovations, Open Solutions, 12, 197061\u2013197120. https:\/\/doi.org\/10.1109\/ACCESS.2024.3521945","journal-title":"IEEE Access\u202f: Practical Innovations, Open Solutions"},{"issue":"4","key":"13774_CR2","doi-asserted-by":"publisher","first-page":"561","DOI":"10.1007\/BF02293814","volume":"43","author":"D Andrich","year":"1978","unstructured":"Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43(4), 561\u2013573. https:\/\/doi.org\/10.1007\/BF02293814","journal-title":"Psychometrika"},{"key":"13774_CR3","doi-asserted-by":"publisher","first-page":"100745","DOI":"10.1016\/j.asw.2023.100745","volume":"57","author":"JS Barrot","year":"2023","unstructured":"Barrot, J. S. (2023). Using ChatGPT for second language writing: Pitfalls and potentials. Assessing Writing, 57, 100745. https:\/\/doi.org\/10.1016\/j.asw.2023.100745","journal-title":"Assessing Writing"},{"key":"13774_CR4","unstructured":"Barshay, J. (2024, July 9). 
New evidence affirms teachers should go slow using AI to grade essays. FutureEd. https:\/\/www.future-ed.org\/new-evidence-affirms-teachers-should-go-slow-using-ai-to-grade-essays\/"},{"issue":"3","key":"13774_CR5","doi-asserted-by":"publisher","first-page":"101861","DOI":"10.1007\/s12528-021-09283-1","volume":"33","author":"M Beseiso","year":"2023","unstructured":"Beseiso, M., Alzubi, O. A., & Rashaideh, H. (2023). A novel automated essay scoring approach for reliable higher educational assessments. Information Fusion, 33(3), 101861. https:\/\/doi.org\/10.1007\/s12528-021-09283-1","journal-title":"Information Fusion"},{"key":"13774_CR6","doi-asserted-by":"publisher","DOI":"10.1007\/s11831-024-10115-5","author":"P Bhattacharya","year":"2024","unstructured":"Bhattacharya, P., Prasad, V. K., Verma, A., Gupta, D., Sapsomboon, A., Viriyasitavat, W., & Dhiman, G. (2024). Demystifying ChatGPT: An in-depth survey of OpenAI\u2019s robust large language models. Archives of Computational Methods in Engineering. https:\/\/doi.org\/10.1007\/s11831-024-10115-5","journal-title":"Archives of Computational Methods in Engineering"},{"key":"13774_CR7","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.4476855","author":"A Borji","year":"2025","unstructured":"Borji, A., & Mohammadian, M. (2025). Battle of the wordsmiths: Comparing chatgpt, GPT-4, claude, and bard. SSRN Electronic Journal. https:\/\/doi.org\/10.2139\/ssrn.4476855","journal-title":"SSRN Electronic Journal"},{"issue":"1","key":"13774_CR8","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1080\/08957347.2012.635502","volume":"25","author":"B Bridgeman","year":"2012","unstructured":"Bridgeman, B., Trapani, C., & Attali, Y. (2012). Comparison of human and machine scoring of essays: Differences by gender, ethnicity, and country. Applied Measurement in Education, 25(1), 27\u201340. 
https:\/\/doi.org\/10.1080\/08957347.2012.635502","journal-title":"Applied Measurement in Education"},{"key":"13774_CR9","doi-asserted-by":"publisher","DOI":"10.1007\/s10639-024-12891-w","author":"NM Bui","year":"2024","unstructured":"Bui, N. M., & Barrot, J. S. (2024). ChatGPT as an automated essay scoring tool in the writing classrooms: How it compares with human scoring. Education and Information Technologies. https:\/\/doi.org\/10.1007\/s10639-024-12891-w","journal-title":"Education and Information Technologies"},{"issue":"1","key":"13774_CR10","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1177\/02655322221076025","volume":"40","author":"KKY Chan","year":"2023","unstructured":"Chan, K. K. Y., Bond, T., & Yan, Z. (2023). Application of an automated essay scoring engine to English writing assessment using many-facet Rasch measurement. Language Testing, 40(1), 61\u201385. https:\/\/doi.org\/10.1177\/02655322221076025","journal-title":"Language Testing"},{"issue":"6","key":"13774_CR11","doi-asserted-by":"publisher","first-page":"1122","DOI":"10.3102\/00028312221106773","volume":"59","author":"D Chen","year":"2022","unstructured":"Chen, D., Hebert, M., & Wilson, J. (2022). Examining human and automated ratings of elementary students\u2019 writing quality: A multivariate generalizability theory application. American Educational Research Journal, 59(6), 1122\u20131156. https:\/\/doi.org\/10.3102\/00028312221106773","journal-title":"American Educational Research Journal"},{"key":"13774_CR12","doi-asserted-by":"publisher","first-page":"103826","DOI":"10.1016\/j.tate.2022.103826","volume":"118","author":"L Doornkamp","year":"2022","unstructured":"Doornkamp, L., Van der Pol, L. D., Groeneveld, S., Mesman, J., Endendijk, J. J., & Groeneveld, M. G. (2022). Understanding gender bias in teachers\u2019 grading: The role of gender stereotypical beliefs. Teaching and Teacher Education, 118, 103826. 
https:\/\/doi.org\/10.1016\/j.tate.2022.103826","journal-title":"Teaching and Teacher Education"},{"issue":"5","key":"13774_CR13","doi-asserted-by":"publisher","first-page":"329","DOI":"10.1136\/medethics-2020-106820","volume":"47","author":"JM Dur\u00e1n","year":"2021","unstructured":"Dur\u00e1n, J. M., & Jongsma, K. R. (2021). Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical AI. Journal of Medical Ethics, 47(5), 329\u2013335. https:\/\/doi.org\/10.1136\/medethics-2020-106820","journal-title":"Journal of Medical Ethics"},{"issue":"3","key":"13774_CR14","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1207\/s15434311laq0203_2","volume":"2","author":"T Eckes","year":"2005","unstructured":"Eckes, T. (2005). Examining rater effects in TestDaF writing and speaking performance assessments: A many-facet Rasch analysis. Language Assessment Quarterly, 2(3), 197\u2013221. https:\/\/doi.org\/10.1207\/s15434311laq0203_2","journal-title":"Language Assessment Quarterly"},{"key":"13774_CR15","doi-asserted-by":"crossref","unstructured":"Eckes, T. (2023). Introduction to many-facet Rasch measurement. Peter Lang.","DOI":"10.3726\/b20875"},{"issue":"1","key":"13774_CR16","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1186\/s41239-023-00425-2","volume":"20","author":"J Escalante","year":"2023","unstructured":"Escalante, J., Pack, A., & Barrett, A. (2023). AI-generated feedback on writing: Insights into efficacy and ENL student preference. International Journal of Educational Technology in Higher Education, 20(1), 57. https:\/\/doi.org\/10.1186\/s41239-023-00425-2","journal-title":"International Journal of Educational Technology in Higher Education"},{"issue":"9","key":"13774_CR17","doi-asserted-by":"publisher","first-page":"3780","DOI":"10.1021\/acs.jchemed.4c00231","volume":"101","author":"AA Fern\u00e1ndez","year":"2024","unstructured":"Fern\u00e1ndez, A. A., L\u00f3pez-Torres, M., Fern\u00e1ndez, J. 
J., & V\u00e1zquez-Garc\u00eda, D. (2024). ChatGPT as an instructor\u2019s assistant for generating and scoring exams. Journal Of Chemical Education, 101(9), 3780\u20133788. https:\/\/doi.org\/10.1021\/acs.jchemed.4c00231","journal-title":"Journal Of Chemical Education"},{"issue":"3","key":"13774_CR18","doi-asserted-by":"publisher","DOI":"10.1016\/j.ijme.2024.101081","volume":"22","author":"I Fischer","year":"2024","unstructured":"Fischer, I., Sweeney, S., Lucas, M., & Gupta, N. (2024). Making sense of generative AI for assessments: Contrasting student claims and assessor evaluations. The International Journal of Management Education, 22(3), Article 101081. https:\/\/doi.org\/10.1016\/j.ijme.2024.101081","journal-title":"The International Journal of Management Education"},{"key":"13774_CR19","doi-asserted-by":"publisher","DOI":"10.1007\/s10639-024-12912-8","author":"E Fokides","year":"2024","unstructured":"Fokides, E., & Peristeraki, E. (2024). Comparing ChatGPT\u2019s correction and feedback comments with that of educators in the context of primary students\u2019 short essays written in English and Greek. Education And Information Technologies. https:\/\/doi.org\/10.1007\/s10639-024-12912-8","journal-title":"Education And Information Technologies"},{"issue":"1","key":"13774_CR20","doi-asserted-by":"publisher","first-page":"29","DOI":"10.1080\/02188791.2024.2305173","volume":"44","author":"Y Gao","year":"2024","unstructured":"Gao, Y., Wang, Q., & Wang, X. (2024). Exploring EFL university teachers\u2019 beliefs in integrating ChatGPT and other large language models in language education: A study in China. Asia Pacific Journal of Education, 44(1), 29\u201344. https:\/\/doi.org\/10.1080\/02188791.2024.2305173","journal-title":"Asia Pacific Journal of Education"},{"key":"13774_CR21","doi-asserted-by":"publisher","first-page":"149","DOI":"10.1016\/j.ecresq.2020.11.005","volume":"55","author":"RA Gordon","year":"2021","unstructured":"Gordon, R. A., Peng, F., Curby, T. 
W., & Zinsser, K. M. (2021). An introduction to the many-facet Rasch model as a method to improve observational quality measures with an application to measuring the teaching of emotion skills. Early Childhood Research Quarterly, 55, 149\u2013164. https:\/\/doi.org\/10.1016\/j.ecresq.2020.11.005","journal-title":"Early Childhood Research Quarterly"},{"issue":"4","key":"13774_CR22","doi-asserted-by":"publisher","first-page":"961","DOI":"10.1080\/09588221.2022.2067179","volume":"37","author":"T Han","year":"2024","unstructured":"Han, T., & Sari, E. (2024). An investigation on the use of automated feedback in Turkish EFL students\u2019 writing classes. Computer Assisted Language Learning, 37(4), 961\u2013985. https:\/\/doi.org\/10.1080\/09588221.2022.2067179","journal-title":"Computer Assisted Language Learning"},{"issue":"3","key":"13774_CR23","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1007\/s10676-024-09777-3","volume":"26","author":"R Heersmink","year":"2024","unstructured":"Heersmink, R., de Rooij, B., Clavel V\u00e1zquez, M. J., & Colombo, M. (2024). A phenomenology and epistemology of large language models: Transparency, trust, and trustworthiness. Ethics and Information Technology, 26(3), 41. https:\/\/doi.org\/10.1007\/s10676-024-09777-3","journal-title":"Ethics and Information Technology"},{"issue":"7","key":"13774_CR24","doi-asserted-by":"publisher","first-page":"2219","DOI":"10.1111\/tgis.13233","volume":"28","author":"HH Hochmair","year":"2024","unstructured":"Hochmair, H. H., Juh\u00e1sz, L., & Kemp, T. (2024). Correctness comparison of ChatGPT-4, Gemini, Claude\u20103, and Copilot for spatial tasks. Transactions in GIS, 28(7), 2219\u20132231. https:\/\/doi.org\/10.1111\/tgis.13233","journal-title":"Transactions in GIS"},{"issue":"4","key":"13774_CR25","doi-asserted-by":"publisher","first-page":"403","DOI":"10.1037\/1082-989X.4.4.403","volume":"4","author":"WT Hoyt","year":"2024","unstructured":"Hoyt, W. T., & Kerns, M. D. (2024). 
Magnitude and moderators of bias in observer ratings: A meta-analysis. Psychological Methods, 4(4), 403\u2013424. https:\/\/doi.org\/10.1037\/1082-989X.4.4.403","journal-title":"Psychological Methods"},{"issue":"1","key":"13774_CR26","doi-asserted-by":"publisher","first-page":"64","DOI":"10.1037\/1082-989X.5.1.64","volume":"5","author":"WT Hoyt","year":"2000","unstructured":"Hoyt, W. T. (2000). Rater bias in psychological research: When is it a problem and what can we do about it? Psychological Methods, 5(1), 64\u201386. https:\/\/doi.org\/10.1037\/1082-989X.5.1.64","journal-title":"Psychological Methods"},{"key":"13774_CR27","doi-asserted-by":"publisher","unstructured":"Hurley, E., & Okyere-Badoo, J. (2024). A comparative study of few-shot vs. zero-shot prompting to generate quick and useful responses to students\u2019 periodic reflections. Proceedings of the 55th ACM Technical Symposium on Computer Science Education\u00a0(V. 2, pp. 1881\u20131881). https:\/\/doi.org\/10.1145\/3626253.3635400","DOI":"10.1145\/3626253.3635400"},{"key":"13774_CR28","doi-asserted-by":"publisher","first-page":"e208","DOI":"10.7717\/peerj-cs.208","volume":"5","author":"MA Hussein","year":"2021","unstructured":"Hussein, M. A., Hassan, H., & Nassef, M. (2021). Automated Language essay scoring systems: A literature review. PeerJ Computer Science, 5, e208. https:\/\/doi.org\/10.7717\/peerj-cs.208","journal-title":"PeerJ Computer Science"},{"key":"13774_CR29","doi-asserted-by":"publisher","first-page":"37","DOI":"10.1016\/j.asw.2017.08.004","volume":"34","author":"DR Isbell","year":"2017","unstructured":"Isbell, D. R. (2017). Assessing C2 writing ability on the certificate of english language proficiency: Rater and examinee age effects. Assessing Writing, 34, 37\u201349. https:\/\/doi.org\/10.1016\/j.asw.2017.08.004","journal-title":"Assessing Writing"},{"key":"13774_CR30","doi-asserted-by":"publisher","unstructured":"Jauhiainen, J. S., & Garagorry, G. A. (2024). 
Generative AI in education: ChatGPT-4 in evaluating students\u2019 written responses. Innovations in Education and Teaching International\u00a0(pp. 1\u201318). https:\/\/doi.org\/10.1080\/14703297.2024.2422337","DOI":"10.1080\/14703297.2024.2422337"},{"issue":"12","key":"13774_CR31","doi-asserted-by":"publisher","first-page":"15873","DOI":"10.1007\/s10639-023-11834-1","volume":"28","author":"J Jeon","year":"2023","unstructured":"Jeon, J., & Lee, S. (2023). Large language models in education: A focus on the complementary relationship between human teachers and ChatGPT. Education and Information Technologies, 28(12), 15873\u201315892. https:\/\/doi.org\/10.1007\/s10639-023-11834-1","journal-title":"Education and Information Technologies"},{"issue":"1","key":"13774_CR32","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-024-79208-2","volume":"14","author":"M Johnson","year":"2024","unstructured":"Johnson, M., & Zhang, M. (2024). Examining the responsible use of zero-shot AI approaches to scoring essays. Scientific Reports, 14(1), 1\u201310. https:\/\/doi.org\/10.1038\/s41598-024-79208-2","journal-title":"Scientific Reports"},{"key":"13774_CR33","doi-asserted-by":"publisher","unstructured":"Karakaya, I. (2015). Comparison of self, peer and instructor assessments in the portfolio assessment by using many facet Rasch model. Journal of Education and Human Development, 4(2). https:\/\/doi.org\/10.15640\/jehd.v4n2a22","DOI":"10.15640\/jehd.v4n2a22"},{"issue":"4","key":"13774_CR34","doi-asserted-by":"publisher","first-page":"349","DOI":"10.26822\/iejee.2020459464","volume":"12","author":"D Ko\u00e7ak","year":"2020","unstructured":"Ko\u00e7ak, D. (2020). Investigation of rater tendencies and reliability in different assessment methods with many facet Rasch model. International Electronic Journal of Elementary Education, 12(4), 349\u2013358. 
https:\/\/doi.org\/10.26822\/iejee.2020459464","journal-title":"International Electronic Journal of Elementary Education"},{"issue":"10","key":"13774_CR35","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-024-10888-y","volume":"57","author":"P Kumar","year":"2024","unstructured":"Kumar, P. (2024). Large language models (LLMs): Survey, technical frameworks, and future challenges. Psychological Methods, 57(10), Article 260. https:\/\/doi.org\/10.1007\/s10462-024-10888-y","journal-title":"Psychological Methods"},{"issue":"10\u201311","key":"13774_CR36","doi-asserted-by":"publisher","first-page":"2083","DOI":"10.1016\/j.jpubeco.2008.02.009","volume":"92","author":"V Lavy","year":"2008","unstructured":"Lavy, V. (2008). Do gender stereotypes reduce girls\u2019 or boys\u2019 human capital outcomes? Evidence from a natural experiment. Journal Of Public Economics, 92(10\u201311), 2083\u20132105. https:\/\/doi.org\/10.1016\/j.jpubeco.2008.02.009","journal-title":"Journal Of Public Economics"},{"issue":"5","key":"13774_CR37","doi-asserted-by":"publisher","first-page":"1982","DOI":"10.1111\/bjet.13505","volume":"55","author":"J Lee","year":"2024","unstructured":"Lee, J., Hicke, Y., Yu, R., Brooks, C., & Kizilcec, R. F. (2024a). The life cycle of large language models in education: A framework for understanding sources of bias. British Journal of Educational Technology, 55(5), 1982\u20132002. https:\/\/doi.org\/10.1111\/bjet.13505","journal-title":"British Journal of Educational Technology"},{"issue":"9","key":"13774_CR38","doi-asserted-by":"publisher","first-page":"11483","DOI":"10.1007\/s10639-023-12249-8","volume":"29","author":"U Lee","year":"2024","unstructured":"Lee, U., Jung, H., Jeon, Y., Sohn, Y., Hwang, W., Moon, J., & Kim, H. (2024b). Few-shot is enough: Exploring chatgpt prompt engineering method for automatic question generation in English education. Education and Information Technologies, 29(9), 11483\u201311515. 
https:\/\/doi.org\/10.1007\/s10639-023-12249-8","journal-title":"Education and Information Technologies"},{"issue":"10","key":"13774_CR39","doi-asserted-by":"publisher","first-page":"2409","DOI":"10.1007\/s11145-022-10279-1","volume":"35","author":"W Li","year":"2022","unstructured":"Li, W. (2022). Scoring rubric reliability and internal validity in rater-mediated EFL writing assessment: Insights from many-facet Rasch measurement. Reading and Writing, 35(10), 2409\u20132431. https:\/\/doi.org\/10.1007\/s11145-022-10279-1","journal-title":"Reading and Writing"},{"key":"13774_CR40","doi-asserted-by":"publisher","DOI":"10.14742\/ajet.9463","author":"J Li","year":"2024","unstructured":"Li, J., Jangamreddy, N. K., Hisamoto, R., Bhansali, R., Dyda, A., Zaphir, L., & Glencross, M. (2024a). AI-assisted marking: Functionality and limitations of chatGPT in written assessment evaluation. Australasian Journal of Educational Technology. https:\/\/doi.org\/10.14742\/ajet.9463","journal-title":"Australasian Journal of Educational Technology"},{"key":"13774_CR41","doi-asserted-by":"publisher","DOI":"10.1007\/s10639-024-12851-4","author":"K Li","year":"2024","unstructured":"Li, K., Qian, C., & Yang, X. (2024b). Evaluating the quality of student-generated content in learnersourcing: A large language model based approach. Education And Information Technologies. https:\/\/doi.org\/10.1007\/s10639-024-12851-4","journal-title":"Education And Information Technologies"},{"key":"13774_CR42","unstructured":"Linacre, J. M. (1989). Many-faceted Rasch measurement. MESA."},{"key":"13774_CR43","unstructured":"Linacre, J. M., & Wright, B. (2014). Facets. Computer program for Many-Faceted Rasch Measurement, 1998."},{"issue":"4","key":"13774_CR44","doi-asserted-by":"publisher","first-page":"479","DOI":"10.1177\/0265532214530699","volume":"31","author":"G Ling","year":"2014","unstructured":"Ling, G., Mollaun, P., & Xi, X. (2014). 
A study on the impact of fatigue on human raters when scoring speaking responses. Language Testing, 31(4), 479\u2013499. https:\/\/doi.org\/10.1177\/0265532214530699","journal-title":"Language Testing"},{"issue":"1","key":"13774_CR45","doi-asserted-by":"publisher","first-page":"e59273","DOI":"10.2196\/59273","volume":"12","author":"X Liu","year":"2024","unstructured":"Liu, X., Duan, C., Kim, M., Zhang, L., Jee, E., Maharjan, B., Huang, Y., Du, D., & Jiang, X. (2024). Claude 3 opus and ChatGPT with GPT-4 in dermoscopic image analysis for melanoma diagnosis: Comparative performance analysis. JMIR Medical Informatics, 12(1), e59273. https:\/\/doi.org\/10.2196\/59273","journal-title":"JMIR Medical Informatics"},{"key":"13774_CR46","doi-asserted-by":"publisher","DOI":"10.1080\/02602938.2024.2301722","author":"Q Lu","year":"2024","unstructured":"Lu, Q., Yao, Y., Xiao, L., Yuan, M., Wang, J., & Zhu, X. (2024). Can chatgpt effectively complement teacher assessment of undergraduate students\u2019 academic writing? Assessment & Evaluation in Higher Education. https:\/\/doi.org\/10.1080\/02602938.2024.2301722","journal-title":"Assessment & Evaluation in Higher Education"},{"issue":"1","key":"13774_CR47","doi-asserted-by":"publisher","first-page":"54","DOI":"10.1177\/026553229501200104","volume":"12","author":"T Lumley","year":"1995","unstructured":"Lumley, T., & McNamara, T. F. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12(1), 54\u201371. https:\/\/doi.org\/10.1177\/026553229501200104","journal-title":"Language Testing"},{"issue":"2","key":"13774_CR48","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1177\/0265532212456968","volume":"30","author":"ME Lunz","year":"2013","unstructured":"Lunz, M. E., Wright, B. D., Linacre, J. M., Winke, P., Gass, S., & Myford, C. (2013). Raters\u2019 l2 background as a potential source of bias in rating oral performance. Language Testing, 30(2), 231\u2013252. 
https:\/\/doi.org\/10.1177\/0265532212456968","journal-title":"Language Testing"},{"issue":"5","key":"13774_CR49","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1080\/02602938.2024.2309963","volume":"49","author":"J Luo","year":"2024","unstructured":"Luo, J. (2024). A critical review of GenAI policies in higher education assessment: A call to reconsider the originality of students\u2019 work. Assessment & Evaluation in Higher Education, 49(5), 651\u2013664. https:\/\/doi.org\/10.1080\/02602938.2024.2309963","journal-title":"Assessment & Evaluation in Higher Education"},{"issue":"2","key":"13774_CR50","doi-asserted-by":"publisher","first-page":"430","DOI":"10.1152\/advan.00093.2024","volume":"49","author":"V Mavrych","year":"2025","unstructured":"Mavrych, V., Yaqinuddin, A., & Bolgova, O. (2025). Claude, chatgpt, copilot, and gemini performance versus students in different topics of neuroscience. Advances in Physiology Education, 49(2), 430\u2013437. https:\/\/doi.org\/10.1152\/advan.00093.2024","journal-title":"Advances in Physiology Education"},{"issue":"1","key":"13774_CR51","doi-asserted-by":"publisher","first-page":"125","DOI":"10.1080\/15391523.2022.2142872","volume":"55","author":"CWF Mayer","year":"2023","unstructured":"Mayer, C. W. F., Ludwig, S., & Brandt, S. (2023). Prompt text classifications with transformer models! An exemplary introduction to prompt-based learning with large language models. Journal of Research on Technology in Education, 55(1), 125\u2013141. https:\/\/doi.org\/10.1080\/15391523.2022.2142872","journal-title":"Journal of Research on Technology in Education"},{"issue":"2","key":"13774_CR52","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1111\/ijsa.12459","volume":"32","author":"SM Merritt","year":"2024","unstructured":"Merritt, S. M., Ryan, A. M., Gardner, C., Liff, J., & Mondragon, N. (2024). Gendered competencies and gender composition: A human versus algorithm evaluator comparison. 
International Journal of Selection and Assessment, 32(2), 225\u2013248. https:\/\/doi.org\/10.1111\/ijsa.12459","journal-title":"International Journal of Selection and Assessment"},{"key":"13774_CR53","doi-asserted-by":"publisher","DOI":"10.1080\/14703297.2025.2516117","author":"K Misiejuk","year":"2025","unstructured":"Misiejuk, K., Bastesen, J., & Ershova, T. (2025). How does using generative AI for essay writing impact peer assessment patterns? Insights from early adopters. Innovations in Education and Teaching International. https:\/\/doi.org\/10.1080\/14703297.2025.2516117","journal-title":"Innovations in Education and Teaching International"},{"key":"13774_CR54","doi-asserted-by":"publisher","first-page":"933","DOI":"10.1162\/tacl_a_00681","volume":"12","author":"M Mizrahi","year":"2024","unstructured":"Mizrahi, M., Kaplan, G., Malkin, D., Dror, R., Shahaf, D., & Stanovsky, G. (2024). State of what art?? A call for multi-prompt LLM evaluation. Transactions of the Association for Computational Linguistics, 12, 933\u2013949. https:\/\/doi.org\/10.1162\/tacl_a_00681","journal-title":"Transactions of the Association for Computational Linguistics"},{"issue":"2","key":"13774_CR55","doi-asserted-by":"publisher","first-page":"100116","DOI":"10.1016\/j.rmal.2024.100116","volume":"3","author":"A Mizumoto","year":"2024","unstructured":"Mizumoto, A., Shintani, N., Sasaki, M., & Teng, M. F. (2024). Testing the viability of ChatGPT as a companion in L2 writing accuracy assessment. Research Methods in Applied Linguistics, 3(2), 100116. https:\/\/doi.org\/10.1016\/j.rmal.2024.100116","journal-title":"Research Methods in Applied Linguistics"},{"issue":"5","key":"13774_CR56","doi-asserted-by":"publisher","first-page":"780","DOI":"10.3102\/10769986231207886","volume":"49","author":"R Mozer","year":"2024","unstructured":"Mozer, R., Miratrix, L., Relyea, J. E., & Kim, J. S. (2024). 
Combining human and automated scoring methods in experimental assessments of writing: A case study tutorial. Journal of Educational and Behavioral Statistics, 49(5), 780\u2013816. https:\/\/doi.org\/10.3102\/10769986231207886","journal-title":"Journal of Educational and Behavioral Statistics"},{"issue":"4","key":"13774_CR57","first-page":"386","volume":"4","author":"CM Myford","year":"2003","unstructured":"Myford, C. M., & Wolfe, E. W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement: Part I. Journal of Applied Measurement, 4(4), 386\u2013422.","journal-title":"Journal of Applied Measurement"},{"issue":"2","key":"13774_CR58","first-page":"189","volume":"5","author":"CM Myford","year":"2004","unstructured":"Myford, C. M., & Wolfe, E. W. (2004). Detecting and measuring rater effects using many-facet Rasch measurement: Part II. Journal of Applied Measurement, 5(2), 189\u2013227.","journal-title":"Journal of Applied Measurement"},{"issue":"1","key":"13774_CR59","doi-asserted-by":"publisher","first-page":"102","DOI":"10.26417\/919pei96z","volume":"7","author":"P Panagiotidis","year":"2024","unstructured":"Panagiotidis, P. (2024). LLM-based chatbots in language learning. European Journal of Education, 7(1), 102\u2013123. https:\/\/doi.org\/10.26417\/919pei96z","journal-title":"European Journal of Education"},{"issue":"12","key":"13774_CR60","doi-asserted-by":"publisher","first-page":"721","DOI":"10.3928\/01484834-20231006-02","volume":"62","author":"JL Parker","year":"2023","unstructured":"Parker, J. L., Becker, K., & Carroca, C. (2023). ChatGPT for automated writing evaluation in scholarly writing instruction. Journal of Nursing Education, 62(12), 721\u2013727. 
https:\/\/doi.org\/10.3928\/01484834-20231006-02","journal-title":"Journal of Nursing Education"},{"key":"13774_CR61","doi-asserted-by":"publisher","first-page":"141","DOI":"10.1016\/j.stueduc.2018.07.006","volume":"59","author":"T Protiv\u00ednsk\u00fd","year":"2018","unstructured":"Protiv\u00ednsk\u00fd, T., & M\u00fcnich, D. (2018). Gender bias in teachers\u2019 grading: What is in the grade. Studies in Educational Evaluation, 59, 141\u2013149. https:\/\/doi.org\/10.1016\/j.stueduc.2018.07.006","journal-title":"Studies in Educational Evaluation"},{"issue":"3","key":"13774_CR62","doi-asserted-by":"publisher","first-page":"2495","DOI":"10.1007\/s10462-021-10068-2","volume":"55","author":"D Ramesh","year":"2022","unstructured":"Ramesh, D., & Sanampudi, S. K. (2022). An automated essay scoring systems: A systematic literature review. Artificial Intelligence Review, 55(3), 2495\u20132527. https:\/\/doi.org\/10.1007\/s10462-021-10068-2","journal-title":"Artificial Intelligence Review"},{"issue":"11","key":"13774_CR63","doi-asserted-by":"publisher","first-page":"6099","DOI":"10.1007\/s00405-024-08828-1","volume":"281","author":"B Schmidl","year":"2024","unstructured":"Schmidl, B., H\u00fctten, T., Pigorsch, S., St\u00f6gbauer, F., Hoch, C. C., Hussain, T., Wollenberg, B., & Wirth, M. (2024). Assessing the use of the novel tool Claude 3 in comparison to ChatGPT 4.0 as an artificial intelligence tool in the diagnosis and therapy of primary head and neck cancer cases. European Archives of Oto-Rhino-Laryngology, 281(11), 6099\u20136109. https:\/\/doi.org\/10.1007\/s00405-024-08828-1","journal-title":"European Archives of Oto-Rhino-Laryngology"},{"issue":"18","key":"13774_CR64","doi-asserted-by":"publisher","DOI":"10.1007\/s10639-024-12817-6","volume":"29","author":"D Shin","year":"2024","unstructured":"Shin, D., & Lee, J. H. (2024). Exploratory study on the potential of ChatGPT as a rater of second language writing. 
Education and Information Technologies, 29(18), Article 23. https:\/\/doi.org\/10.1007\/s10639-024-12817-6","journal-title":"Education and Information Technologies"},{"key":"13774_CR65","doi-asserted-by":"publisher","first-page":"e55318","DOI":"10.2196\/55318","volume":"12","author":"S Sivarajkumar","year":"2024","unstructured":"Sivarajkumar, S., Kelley, M., Samolyk-Mazzanti, A., Visweswaran, S., & Wang, Y. (2024). An empirical evaluation of prompting strategies for large language models in zero-shot clinical natural language processing: Algorithm development and validation study. JMIR Medical Informatics, 12, e55318. https:\/\/doi.org\/10.2196\/55318","journal-title":"JMIR Medical Informatics"},{"key":"13774_CR66","doi-asserted-by":"publisher","first-page":"1880","DOI":"10.1109\/TLT.2024.3396873","volume":"17","author":"Y Song","year":"2024","unstructured":"Song, Y., Zhu, Q., Wang, H., & Zheng, Q. (2024). Automated essay scoring and revising based on open-source large language models. IEEE Transactions on Learning Technologies, 17, 1880\u20131890. https:\/\/doi.org\/10.1109\/TLT.2024.3396873","journal-title":"IEEE Transactions on Learning Technologies"},{"issue":"1","key":"13774_CR67","doi-asserted-by":"publisher","DOI":"10.1007\/s11606-024-09050-9","volume":"40","author":"R Sreedhar","year":"2024","unstructured":"Sreedhar, R., Chang, L., Gangopadhyaya, A., Shiels, P. W., Loza, J., Chi, E., Gabel, E., & Park, Y. S. (2024). Comparing scoring consistency of large language models with faculty for formative assessments in medical education. Journal of General Internal Medicine, 40(1), Article 8. 
https:\/\/doi.org\/10.1007\/s11606-024-09050-9","journal-title":"Journal of General Internal Medicine"},{"key":"13774_CR68","doi-asserted-by":"publisher","first-page":"101894","DOI":"10.1016\/j.learninstruc.2024.101894","volume":"91","author":"J Steiss","year":"2024","unstructured":"Steiss, J., Tate, T., Graham, S., Cruz, J., Hebert, M., Wang, J., Moon, Y., Tseng, W., Warschauer, M., & Olson, C. B. (2024). Comparing the quality of human and ChatGPT feedback of students\u2019 writing. Learning and Instruction, 91, 101894. https:\/\/doi.org\/10.1016\/j.learninstruc.2024.101894","journal-title":"Learning and Instruction"},{"key":"13774_CR69","doi-asserted-by":"publisher","first-page":"100752","DOI":"10.1016\/j.asw.2023.100752","volume":"57","author":"Y Su","year":"2023","unstructured":"Su, Y., Lin, Y., & Lai, C. (2023). Collaborating with ChatGPT in argumentative writing classrooms. Assessing Writing, 57, 100752. https:\/\/doi.org\/10.1016\/j.asw.2023.100752","journal-title":"Assessing Writing"},{"issue":"3","key":"13774_CR70","doi-asserted-by":"publisher","first-page":"289","DOI":"10.1080\/0260293970220303","volume":"22","author":"K Sullivan","year":"1997","unstructured":"Sullivan, K., & Hall, C. (1997). Introducing students to self-assessment. Assessment & Evaluation in Higher Education, 22(3), 289\u2013305. https:\/\/doi.org\/10.1080\/0260293970220303","journal-title":"Assessment & Evaluation in Higher Education"},{"key":"13774_CR71","doi-asserted-by":"publisher","unstructured":"Sun, D., Boudouaia, A., Yang, J., Xu, J., Wang, Q., & Gayed, J. M. (2024). Effectiveness of large language models in automated evaluation of argumentative essays: Finetuning vs. zero-shot prompting. Computer Assisted Language Learning\u00a0(pp. 1\u201329).
https:\/\/doi.org\/10.1080\/09588221.2024.2371395","DOI":"10.1080\/09588221.2024.2371395"},{"issue":"3","key":"13774_CR72","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1097\/ACM.0000000000002495","volume":"94","author":"O ten Cate","year":"2019","unstructured":"ten Cate, O., & Regehr, G. (2019). The power of subjectivity in the assessment of medical trainees. Academic Medicine, 94(3), 333\u2013337. https:\/\/doi.org\/10.1097\/ACM.0000000000002495","journal-title":"Academic Medicine"},{"issue":"6630","key":"13774_CR73","doi-asserted-by":"publisher","first-page":"313","DOI":"10.1126\/science.adg7879","volume":"379","author":"HH Thorp","year":"2023","unstructured":"Thorp, H. H. (2023). ChatGPT is fun, but not an author. Science, 379(6630), 313\u2013313. https:\/\/doi.org\/10.1126\/science.adg7879","journal-title":"Science"},{"key":"13774_CR74","doi-asserted-by":"publisher","DOI":"10.1007\/s10639-024-12722-y","author":"CY Tsai","year":"2024","unstructured":"Tsai, C. Y., Lin, Y. T., & Brown, I. K. (2024). Impacts of ChatGPT-assisted writing for EFL english majors: Feasibility and challenges. Education and Information Technologies. https:\/\/doi.org\/10.1007\/s10639-024-12722-y","journal-title":"Education and Information Technologies"},{"key":"13774_CR75","doi-asserted-by":"publisher","unstructured":"Usher, M. (2025). Generative AI vs. instructor vs. peer assessments: A comparison of grading and feedback in higher education. Assessment & Evaluation in Higher Education(6). https:\/\/doi.org\/10.1080\/02602938.2025.2487495","DOI":"10.1080\/02602938.2025.2487495"},{"issue":"8","key":"13774_CR76","doi-asserted-by":"publisher","first-page":"8450","DOI":"10.3758\/s13428-024-02485-2","volume":"56","author":"M Uto","year":"2023","unstructured":"Uto, M., & Aramaki, K. (2023). Linking essay-writing tests using many-facet models and neural automated essay scoring. Behavior Research Methods, 56(8), 8450\u20138479. 
https:\/\/doi.org\/10.3758\/s13428-024-02485-2","journal-title":"Behavior Research Methods"},{"key":"13774_CR77","doi-asserted-by":"publisher","unstructured":"Vieira da Silva, L. M., Kocher, A., Gehlhoff, F., & Fay, A. (2024). On the use of large language models to generate capability ontologies. 2024 IEEE 29th International Conference on Emerging Technologies and Factory Automation (ETFA), 33, 1\u20138. https:\/\/doi.org\/10.1109\/ETFA61755.2024.10710775","DOI":"10.1109\/ETFA61755.2024.10710775"},{"issue":"1","key":"13774_CR78","doi-asserted-by":"publisher","first-page":"010152","DOI":"10.1103\/PhysRevPhysEducRes.20.010152","volume":"20","author":"T Wan","year":"2024","unstructured":"Wan, T., & Chen, Z. (2024). Exploring generative AI assisted feedback writing for students\u2019 written responses to a physics conceptual question with prompt engineering and few-shot learning. Physical Review Physics Education Research, 20(1), 010152. https:\/\/doi.org\/10.1103\/PhysRevPhysEducRes.20.010152","journal-title":"Physical Review Physics Education Research"},{"key":"13774_CR79","doi-asserted-by":"publisher","DOI":"10.1016\/j.tsc.2023.101440","volume":"51","author":"L Wang","year":"2024","unstructured":"Wang, L., Chen, X., Wang, C., Xu, L., Shadiev, R., & Li, Y. (2024a). ChatGPT\u2019s capabilities in providing feedback on undergraduate students\u2019 argumentation: A case study. Thinking Skills and Creativity, 51, Article 101440. https:\/\/doi.org\/10.1016\/j.tsc.2023.101440","journal-title":"Thinking Skills and Creativity"},{"key":"13774_CR80","doi-asserted-by":"publisher","first-page":"3917","DOI":"10.2147\/JMDH.S473680","volume":"17","author":"Y Wang","year":"2024","unstructured":"Wang, Y., Liang, L., Li, R., Wang, Y., & Hao, C. (2024b). Comparison of the performance of ChatGPT, Claude and Bard in support of myopia prevention and control. Journal of Multidisciplinary Healthcare, 17, 3917\u20133929. 
https:\/\/doi.org\/10.2147\/JMDH.S473680","journal-title":"Journal of Multidisciplinary Healthcare"},{"key":"13774_CR81","first-page":"24824","volume":"35","author":"J Wei","year":"2022","unstructured":"Wei, J., Wang, X., Schuurmans, D., Bosma, M., Xia, F., Chi, E., Le, Q. V., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35, 24824\u201324837.","journal-title":"Advances in Neural Information Processing Systems"},{"issue":"2","key":"13774_CR82","doi-asserted-by":"publisher","first-page":"263","DOI":"10.1191\/026553298670883954","volume":"15","author":"SC Weigle","year":"1998","unstructured":"Weigle, S. C. (1998). Using FACETS to model rater training effects. Language Testing, 15(2), 263\u2013287. https:\/\/doi.org\/10.1191\/026553298670883954","journal-title":"Language Testing"},{"issue":"1","key":"13774_CR83","doi-asserted-by":"publisher","first-page":"52","DOI":"10.1186\/s41239-024-00485-y","volume":"21","author":"A Williams","year":"2024","unstructured":"Williams, A. (2024). Comparison of generative AI performance on undergraduate and postgraduate written assessments in the biomedical sciences. International Journal of Educational Technology in Higher Education, 21(1), 52. https:\/\/doi.org\/10.1186\/s41239-024-00485-y","journal-title":"International Journal of Educational Technology in Higher Education"},{"issue":"1","key":"13774_CR84","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1186\/s41239-024-00468-z","volume":"21","author":"Q Xia","year":"2024","unstructured":"Xia, Q., Weng, X., Ouyang, F., Lin, T. J., & Chiu, T. K. F. (2024). A scoping review on how generative artificial intelligence transforms assessment in higher education. International Journal of Educational Technology in Higher Education, 21(1), 40. 
https:\/\/doi.org\/10.1186\/s41239-024-00468-z","journal-title":"International Journal of Educational Technology in Higher Education"},{"issue":"3","key":"13774_CR85","doi-asserted-by":"publisher","first-page":"100133","DOI":"10.1016\/j.rmal.2024.100133","volume":"3","author":"T Yamashita","year":"2024","unstructured":"Yamashita, T. (2024). An application of many-facet Rasch measurement to evaluate automated essay scoring: A case of ChatGPT-4.0. Research Methods in Applied Linguistics, 3(3), 100133. https:\/\/doi.org\/10.1016\/j.rmal.2024.100133","journal-title":"Research Methods in Applied Linguistics"},{"issue":"11","key":"13774_CR86","doi-asserted-by":"publisher","first-page":"13943","DOI":"10.1007\/s10639-023-11742-4","volume":"28","author":"D Yan","year":"2023","unstructured":"Yan, D. (2023). Impact of ChatGPT on learners in a L2 writing practicum: An exploratory investigation. Education and Information Technologies, 28(11), 13943\u201313967. https:\/\/doi.org\/10.1007\/s10639-023-11742-4","journal-title":"Education and Information Technologies"},{"issue":"4\u20135","key":"13774_CR87","doi-asserted-by":"publisher","first-page":"520","DOI":"10.1080\/15434303.2023.2288256","volume":"20","author":"Y Yan","year":"2023","unstructured":"Yan, Y., Dong, S., Yu, X., Voss, E., Cushing, S. T., Ockey, G. J., & Yan, X. (2023). The use of assistive technologies including generative AI by test takers in language assessment: A debate of theory and practice. Language Assessment Quarterly, 20(4\u20135), 520\u2013532. https:\/\/doi.org\/10.1080\/15434303.2023.2288256","journal-title":"Language Assessment Quarterly"},{"key":"13774_CR88","doi-asserted-by":"publisher","DOI":"10.1111\/bjet.13494","author":"F Yavuz","year":"2024","unstructured":"Yavuz, F., \u00c7elik, \u00d6., & Yava\u015f \u00c7elik, G. (2024). Utilizing large language models for EFL essay grading: An examination of reliability and validity in rubric-based assessments. British Journal of Educational Technology.
https:\/\/doi.org\/10.1111\/bjet.13494","journal-title":"British Journal of Educational Technology"},{"key":"13774_CR89","doi-asserted-by":"publisher","unstructured":"Yoshida, L. (2024). The Impact of example selection in few-shot prompting on automated essay scoring using GPT models. In A. M. Olney, I.-A. Chounta, Z. Liu, O. C. Santos, & I. I. Bittencourt (Eds.), Communications in Computer and Information Science (pp. 61\u201373). Springer Nature Switzerland. https:\/\/doi.org\/10.1007\/978-3-031-64315-6_5","DOI":"10.1007\/978-3-031-64315-6_5"},{"issue":"1","key":"13774_CR90","doi-asserted-by":"publisher","first-page":"170","DOI":"10.1515\/jccall-2021-2007","volume":"1","author":"S Zhang","year":"2021","unstructured":"Zhang, S. (2021). Review of automated writing evaluation systems. Journal of China Computer-Assisted Language Learning, 1(1), 170\u2013176. https:\/\/doi.org\/10.1515\/jccall-2021-2007","journal-title":"Journal of China Computer-Assisted Language Learning"},{"key":"13774_CR91","doi-asserted-by":"publisher","unstructured":"Zhuang, J. (2024). Robust Data-centric graph structure learning for text classification. Companion Proceedings of the ACM Web Conference 2024\u00a0(pp. 1486\u20131495). 
https:\/\/doi.org\/10.1145\/3589335.3651915","DOI":"10.1145\/3589335.3651915"}],"container-title":["Education and Information Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10639-025-13774-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10639-025-13774-4","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10639-025-13774-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,29]],"date-time":"2026-01-29T02:40:09Z","timestamp":1769654409000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10639-025-13774-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,2]]},"references-count":91,"journal-issue":{"issue":"18","published-print":{"date-parts":[[2025,12]]}},"alternative-id":["13774"],"URL":"https:\/\/doi.org\/10.1007\/s10639-025-13774-4","relation":{},"ISSN":["1360-2357","1573-7608"],"issn-type":[{"value":"1360-2357","type":"print"},{"value":"1573-7608","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,2]]},"assertion":[{"value":"7 January 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 August 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 October 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"We prioritized the protection of participant rights and adherence to research integrity. 
Participants, including 117 university students and four human raters, were fully informed about the study\u2019s purpose, data collection processes, and the anonymization of their data. Informed consent was obtained from all individual adult participants included in the study. We also ensured compliance with OpenAI\u2019s policy for using ChatGPT and Anthropic\u2019s policy for using Claude in research. The study protocol was reviewed and approved by the university\u2019s Institutional Review Board (IRB).","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical statement"}},{"value":"The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}