{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T10:34:23Z","timestamp":1777113263307,"version":"3.51.4"},"publisher-location":"New York, NY, USA","reference-count":62,"publisher":"ACM","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62407001"],"award-info":[{"award-number":["62407001"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2026,4,27]]},"DOI":"10.1145\/3785022.3785130","type":"proceedings-article","created":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T09:39:01Z","timestamp":1777109941000},"page":"685-696","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["When LLMs Fall Short in Deductive Coding: Model Comparisons and Human\u2013AI Collaboration Workflow Design"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0000-0566-9051","authenticated-orcid":false,"given":"Zijian","family":"Li","sequence":"first","affiliation":[{"name":"Department of Educational Technology, Graduate School of Education, Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-0102-5677","authenticated-orcid":false,"given":"Luzhen","family":"Tang","sequence":"additional","affiliation":[{"name":"Department of Educational Technology, Graduate School of Education, Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8496-8611","authenticated-orcid":false,"given":"Mengyu","family":"Xia","sequence":"additional","affiliation":[{"name":"Department of Educational Technology, Graduate School of Education, Peking University, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2681-4451","authenticated-orcid":false,"given":"Xinyu","family":"Li","sequence":"additional","affiliation":[{"name":"Centre for Learning Analytics, Faculty of Information Technology, Monash University, Melbourne, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3996-5454","authenticated-orcid":false,"given":"Naping","family":"Chen","sequence":"additional","affiliation":[{"name":"Medical College, Shantou University, Shantou, Guangdong, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9265-1908","authenticated-orcid":false,"given":"Dragan","family":"Ga\u0161evi\u0107","sequence":"additional","affiliation":[{"name":"Centre for Learning Analytics, Faculty of Information Technology, Monash University, Melbourne, Australia"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2777-1705","authenticated-orcid":false,"given":"Yizhou","family":"Fan","sequence":"additional","affiliation":[{"name":"Department of Educational Technology, Graduate School of Education, Peking University, Beijing, China and Centre for Learning Analytics, Faculty of Information Technology, Monash University, Melbourne, Australia"}]}],"member":"320","published-online":{"date-parts":[[2026,4,26]]},"reference":[{"key":"e_1_3_3_2_2_2","doi-asserted-by":"crossref","unstructured":"Laila Alrajhi Ahmed Alamri Filipe\u00a0Dwan Pereira Alexandra\u00a0I Cristea and Elaine\u00a0HT Oliveira. 2024. Solving the imbalanced data issue: automatic urgency detection for instructor assistance in MOOC discussion forums. User Modeling and User-Adapted Interaction 34 3 (2024) 797\u2013852.","DOI":"10.1007\/s11257-023-09381-y"},{"key":"e_1_3_3_2_3_2","doi-asserted-by":"crossref","unstructured":"Julian Ashwin Aditya Chhabra and Vijayendra Rao. 2023. Using large language models for qualitative analysis can introduce serious bias. Sociological Methods & Research (2023) 00491241251338246.","DOI":"10.1596\/1813-9450-10597"},{"key":"e_1_3_3_2_4_2","doi-asserted-by":"crossref","unstructured":"Alberto Benayas Miguel\u00a0Angel Sicilia and Mar\u00e7al Mora-Cantallops. 2025. A comparative analysis of encoder only and decoder only models in intent classification and sentiment analysis: Navigating the trade-offs in model size and performance. Language Resources and Evaluation 59 3 (2025) 2007\u20132030.","DOI":"10.1007\/s10579-024-09796-y"},{"key":"e_1_3_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-98420-4_27"},{"key":"e_1_3_3_2_6_2","unstructured":"Tom Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared\u00a0D Kaplan Prafulla Dhariwal Arvind Neelakantan Pranav Shyam Girish Sastry Amanda Askell et\u00a0al. 2020. Language models are few-shot learners. Advances in neural information processing systems 33 (2020) 1877\u20131901."},{"key":"e_1_3_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-65735-1_19"},{"key":"e_1_3_3_2_8_2","doi-asserted-by":"crossref","unstructured":"Siti Dianah\u00a0Abdul Bujang Ali Selamat Ondrej Krejcar Farhan Mohamed Lim\u00a0Kok Cheng Po\u00a0Chan Chiu and Hamido Fujita. 2022. Imbalanced classification methods for student grade prediction: A systematic literature review. IEEE Access 11 (2022) 1970\u20131989.","DOI":"10.1109\/ACCESS.2022.3225404"},{"key":"e_1_3_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-33232-7_4"},{"key":"e_1_3_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICALT.2019.00061"},{"key":"e_1_3_3_2_11_2","doi-asserted-by":"crossref","unstructured":"Wuxing Chen Kaixiang Yang Zhiwen Yu Yifan Shi and CL\u00a0Philip Chen. 2024. A survey on imbalanced learning: latest research applications and future directions. Artificial Intelligence Review 57 6 (2024) 137.","DOI":"10.1007\/s10462-024-10759-6"},{"key":"e_1_3_3_2_12_2","doi-asserted-by":"crossref","unstructured":"Yixin Cheng Yizhou Fan Xinyu Li Guanliang Chen Dragan Ga\u0161evi\u0107 and Zachari Swiecki. 2025. Asking generative artificial intelligence the right questions improves writing performance. Computers and Education: Artificial Intelligence 8 (2025) 100374.","DOI":"10.1016\/j.caeai.2025.100374"},{"key":"e_1_3_3_2_13_2","unstructured":"Robert Chew John Bollenbacher Michael Wenger Jessica Speer and Annice Kim. 2023. LLM-assisted content analysis: Using large language models to support deductive coding. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2306.14924 (2023)."},{"key":"e_1_3_3_2_14_2","unstructured":"Shih-Chieh Dai Aiping Xiong and Lun-Wei Ku. 2023. LLM-in-the-loop: Leveraging large language model for thematic analysis. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2310.15100 (2023)."},{"key":"e_1_3_3_2_15_2","first-page":"4171","volume-title":"Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers)","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers). 4171\u20134186."},{"key":"e_1_3_3_2_16_2","unstructured":"Zackary\u00a0Okun Dunivin. 2024. Scalable qualitative coding with llms: Chain-of-thought reasoning matches human performance in some hermeneutic tasks. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2401.15170 (2024)."},{"key":"e_1_3_3_2_17_2","doi-asserted-by":"crossref","unstructured":"Zackary\u00a0Okun Dunivin. 2025. Scaling hermeneutics: a guide to qualitative coding with LLMs for reflexive content analysis. EPJ Data Science 14 1 (2025) 28.","DOI":"10.1140\/epjds\/s13688-025-00548-8"},{"key":"e_1_3_3_2_18_2","doi-asserted-by":"crossref","unstructured":"Aleksandra Edwards and Jose Camacho-Collados. 2024. Language models for text classification: Is in-context learning enough? arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2403.17661 (2024).","DOI":"10.63317\/3dnziwdd5bh2"},{"key":"e_1_3_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3375462.3375495"},{"key":"e_1_3_3_2_20_2","doi-asserted-by":"crossref","unstructured":"Stephen\u00a0T Fife and Jacob\u00a0D Gossner. 2024. Deductive qualitative analysis: Evaluating expanding and refining theory. International Journal of Qualitative Methods 23 (2024) 16094069241244856.","DOI":"10.1177\/16094069241244856"},{"key":"e_1_3_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1145\/3613904.3642002"},{"key":"e_1_3_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3544548.3581352"},{"key":"e_1_3_3_2_23_2","doi-asserted-by":"crossref","unstructured":"Hyukjun Gweon and Matthias Schonlau. 2024. Automated classification for open-ended questions with BERT. Journal of Survey Statistics and Methodology 12 2 (2024) 493\u2013504.","DOI":"10.1093\/jssam\/smad015"},{"key":"e_1_3_3_2_24_2","doi-asserted-by":"crossref","unstructured":"Sean\u00a0N Halpin. 2024. Inter-coder agreement in qualitative coding: Considerations for its use. American Journal of Qualitative Research 8 3 (2024) 23\u201343.","DOI":"10.29333\/ajqr\/14887"},{"key":"e_1_3_3_2_25_2","volume-title":"Codebook LLMs: Evaluating LLMs as Measurement Tools for Political Science Concepts","author":"Halterman Andrew","year":"2024","unstructured":"Andrew Halterman and Katherine\u00a0A. Keith. 2024. Codebook LLMs: Evaluating LLMs as Measurement Tools for Political Science Concepts. arxiv:https:\/\/arXiv.org\/abs\/2407.10747\u00a0[cs.CL]"},{"key":"e_1_3_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706468.3706526"},{"key":"e_1_3_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/3636555.3636927"},{"key":"e_1_3_3_2_28_2","doi-asserted-by":"crossref","unstructured":"Charles Lang Alyssa\u00a0Friend Wise Agathe Merceron Dragan Ga\u0161evi\u0107 and George Siemens. 2022. What is learning analytics. The handbook of learning analytics (2022) 8\u201318.","DOI":"10.18608\/hla22.001"},{"key":"e_1_3_3_2_29_2","doi-asserted-by":"crossref","unstructured":"Jionghao Lin Wei Tan Lan Du Wray Buntine David Lang Dragan Ga\u0161evi\u0107 and Guanliang Chen. 2023. Enhancing educational dialogue act classification with discourse context and sample informativeness. IEEE Transactions on Learning Technologies 17 (2023) 258\u2013269.","DOI":"10.1109\/TLT.2023.3302573"},{"key":"e_1_3_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-36272-9_10"},{"key":"e_1_3_3_2_31_2","doi-asserted-by":"crossref","unstructured":"Xiner Liu Andres\u00a0Felipe Zambrano Ryan\u00a0S Baker Amanda Barany Jaclyn Ocumpaugh Jiayi Zhang Maciej Pankiewicz Nidhi Nasiar and Zhanlan Wei. 2025. Qualitative Coding with GPT-4: Where It Works Better. Journal of Learning Analytics 12 1 (2025) 169\u2013185.","DOI":"10.18608\/jla.2025.8575"},{"key":"e_1_3_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-76335-9_7"},{"key":"e_1_3_3_2_33_2","doi-asserted-by":"crossref","unstructured":"Yun Long Haifeng Luo and Yu Zhang. 2024. Evaluating large language models in analysing classroom dialogue. npj Science of Learning 9 1 (2024) 60.","DOI":"10.1038\/s41539-024-00273-3"},{"key":"e_1_3_3_2_34_2","doi-asserted-by":"crossref","unstructured":"Kathleen\u00a0M MacQueen Eleanor McLellan Kelly Kay and Bobby Milstein. 1998. Codebook development for team-based qualitative analysis. Cam Journal 10 2 (1998) 31\u201336.","DOI":"10.1177\/1525822X980100020301"},{"key":"e_1_3_3_2_35_2","doi-asserted-by":"crossref","unstructured":"Atsushi Mizumoto and Mark\u00a0Feng Teng. 2025. Large language models fall short in classifying learners\u2019 open-ended responses. Research Methods in Applied Linguistics 4 2 (2025) 100210.","DOI":"10.1016\/j.rmal.2025.100210"},{"key":"e_1_3_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-98459-4_18"},{"key":"e_1_3_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.4135\/9781071802878"},{"key":"e_1_3_3_2_38_2","unstructured":"Benjamin\u00a0D Nye Donald\u00a0M Morrison and Borhan Samei. 2015. Automated Session-Quality Assessment for Human Tutoring Based on Expert Ratings of Tutoring Success. International Educational Data Mining Society (2015)."},{"key":"e_1_3_3_2_39_2","unstructured":"Joel Oksanen Andr\u00e9s Lucero and Perttu H\u00e4m\u00e4l\u00e4inen. 2025. LLMCode: Evaluating and Enhancing Researcher-AI Alignment in Qualitative Analysis. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2504.16671 (2025)."},{"key":"e_1_3_3_2_40_2","doi-asserted-by":"crossref","unstructured":"Ikenna Osakwe Guanliang Chen Alex Whitelock-Wainwright Dragan Ga\u0161evi\u0107 Anderson\u00a0Pinheiro Cavalcanti and Rafael\u00a0Ferreira Mello. 2022. Towards automated content analysis of educational feedback: A multi-language study. Computers and Education: Artificial Intelligence 3 (2022) 100059.","DOI":"10.1016\/j.caeai.2022.100059"},{"key":"e_1_3_3_2_41_2","unstructured":"Zacharoula Papamitsiou and Anastasios\u00a0A Economides. 2014. Learning analytics and educational data mining in practice: A systematic literature review of empirical evidence. Journal of educational technology & society 17 4 (2014) 49\u201364."},{"key":"e_1_3_3_2_42_2","doi-asserted-by":"crossref","unstructured":"Anabel Pilicita and Enrique Barra. 2025. LLMs in Education: Evaluation GPT and BERT Models in Student Comment Classification. Multimodal Technologies and Interaction 9 5 (2025) 44.","DOI":"10.3390\/mti9050044"},{"key":"e_1_3_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICOIACT53268.2021.9563968"},{"key":"e_1_3_3_2_44_2","unstructured":"Colin Raffel Noam Shazeer Adam Roberts Katherine Lee Sharan Narang Michael Matena Yanqi Zhou Wei Li and Peter\u00a0J Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of machine learning research 21 140 (2020) 1\u201367."},{"key":"e_1_3_3_2_45_2","doi-asserted-by":"crossref","unstructured":"Yuval Reif and Roy Schwartz. 2024. Beyond performance: Quantifying and mitigating label bias in LLMs. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2405.02743 (2024).","DOI":"10.18653\/v1\/2024.naacl-long.378"},{"key":"e_1_3_3_2_46_2","doi-asserted-by":"crossref","unstructured":"Mona\u00a0J Ritchie Karen\u00a0L Drummond Brandy\u00a0N Smith Jennifer\u00a0L Sullivan and Sara\u00a0J Landes. 2022. Development of a qualitative data analysis codebook informed by the i-PARIHS framework. Implementation science communications 3 1 (2022) 98.","DOI":"10.1186\/s43058-022-00344-9"},{"key":"e_1_3_3_2_47_2","doi-asserted-by":"crossref","unstructured":"Soroush Sabbaghan. 2024. Exploring the synergy of human and AI-driven approaches in thematic analysis for qualitative educational research. Journal of Applied Learning and Teaching 7 2 (2024) 129\u2013140.","DOI":"10.37074\/jalt.2024.7.2.32"},{"key":"e_1_3_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706598.3713120"},{"key":"e_1_3_3_2_49_2","unstructured":"Burr Settles. 2009. Active learning literature survey. (2009)."},{"key":"e_1_3_3_2_50_2","doi-asserted-by":"crossref","unstructured":"Lele Sha Mladen Rakovi\u0107 Angel Das Dragan Ga\u0161evi\u0107 and Guanliang Chen. 2022. Leveraging class balancing techniques to alleviate algorithmic bias for predictive tasks in education. IEEE Transactions on Learning Technologies 15 4 (2022) 481\u2013492.","DOI":"10.1109\/TLT.2022.3196278"},{"key":"e_1_3_3_2_51_2","unstructured":"Jongyoon Song Sangwon Yu and Sungroh Yoon. 2024. Large language models are skeptics: False negative problem of input-conflicting hallucination. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2406.13929 (2024)."},{"key":"e_1_3_3_2_52_2","doi-asserted-by":"crossref","unstructured":"Nga Than Leanne Fan Tina Law Laura\u00a0K Nelson and Leslie McCall. 2025. Updating \u201cThe Future of Coding\u201d: Qualitative Coding with Generative Large Language Models. Sociological Methods & Research 54 3 (2025) 849\u2013888.","DOI":"10.1177\/00491241251339188"},{"key":"e_1_3_3_2_53_2","doi-asserted-by":"publisher","DOI":"10.1145\/3170358.3170367"},{"key":"e_1_3_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.24963\/ijcai.2017\/406"},{"key":"e_1_3_3_2_55_2","doi-asserted-by":"crossref","unstructured":"Tarid Wongvorachan Surina He and Okan Bulut. 2023. A comparison of undersampling oversampling and SMOTE methods for dealing with imbalanced classification in educational data mining. Information 14 1 (2023) 54.","DOI":"10.3390\/info14010054"},{"key":"e_1_3_3_2_56_2","doi-asserted-by":"crossref","unstructured":"Peter Wulff Lukas Mientus Anna Nowak and Andreas Borowski. 2023. Utilizing a pretrained language model (BERT) to classify preservice physics teachers\u2019 written reflections. International Journal of Artificial Intelligence in Education 33 3 (2023) 439\u2013466.","DOI":"10.1007\/s40593-022-00290-6"},{"key":"e_1_3_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581754.3584136"},{"key":"e_1_3_3_2_58_2","unstructured":"Huimin Xu Seungjun Yi Terence Lim Jiawei Xu Andrew Well Carlos Mery Aidong Zhang Yuji Zhang Heng Ji Keshav Pingali et\u00a0al. 2025. Tama: A human-ai collaborative thematic analysis framework using multi-agent llms for clinical interviews. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2503.20666 (2025)."},{"key":"e_1_3_3_2_59_2","unstructured":"Zhuo Xu. 2021. RoBERTa-WWM-EXT fine-tuning for Chinese text classification. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2103.00492 (2021)."},{"key":"e_1_3_3_2_60_2","doi-asserted-by":"crossref","unstructured":"Lixiang Yan Samuel Greiff Ziwen Teuber and Dragan Ga\u0161evi\u0107. 2024. Promises and challenges of generative artificial intelligence for human learning. Nature Human Behaviour 8 10 (2024) 1839\u20131850.","DOI":"10.1038\/s41562-024-02004-5"},{"key":"e_1_3_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-031-47014-1_32"},{"key":"e_1_3_3_2_62_2","unstructured":"He Zhang Chuhao Wu Jingyi Xie ChanMin Kim and John\u00a0M Carroll. 2023. QualiGPT: GPT as an easy-to-use tool for qualitative coding. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2310.07061 (2023)."},{"key":"e_1_3_3_2_63_2","doi-asserted-by":"crossref","unstructured":"Junyan Zhang Yiming Huang Shuliang Liu Yubo Gao and Xuming Hu. 2025. Do BERT-Like Bidirectional Models Still Perform Better on Text Classification in the Era of LLMs? arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2505.18215 (2025).","DOI":"10.18653\/v1\/2025.findings-emnlp.1033"}],"event":{"name":"LAK 2026: LAK26: 16th International Learning Analytics and Knowledge Conference","location":"Bergen Norway","acronym":"LAK 2026"},"container-title":["Proceedings of the LAK26: 16th International Learning Analytics and Knowledge Conference"],"original-title":[],"deposited":{"date-parts":[[2026,4,25]],"date-time":"2026-04-25T09:40:37Z","timestamp":1777110037000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3785022.3785130"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,26]]},"references-count":62,"alternative-id":["10.1145\/3785022.3785130","10.1145\/3785022"],"URL":"https:\/\/doi.org\/10.1145\/3785022.3785130","relation":{},"subject":[],"published":{"date-parts":[[2026,4,26]]},"assertion":[{"value":"2026-04-26","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}