{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T07:57:55Z","timestamp":1776931075910,"version":"3.51.2"},"publisher-location":"New York, NY, USA","reference-count":68,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2026,4,13]]},"DOI":"10.1145\/3772318.3791236","type":"proceedings-article","created":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T05:14:30Z","timestamp":1776057270000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["AI meets Mathematics Education: Supporting Instructors in Large Mathematics Classes with Context-Aware AI"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-5414-1557","authenticated-orcid":false,"given":"J\u00e9r\u00e9my Valentin","family":"Barghorn","sequence":"first","affiliation":[{"name":"Data science master, \u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, Lausanne, Vaud, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-6480-9501","authenticated-orcid":false,"given":"Anna","family":"Sotnikova","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, Lausanne, Vaud, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0009-0006-3388-6188","authenticated-orcid":false,"given":"Sacha","family":"Friedli","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, Lausanne, Vaud, Switzerland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8968-9649","authenticated-orcid":false,"given":"Antoine","family":"Bosselut","sequence":"additional","affiliation":[{"name":"\u00c9cole Polytechnique F\u00e9d\u00e9rale de Lausanne, Lausanne, Vaud, 
Switzerland"}]}],"member":"320","published-online":{"date-parts":[[2026,4,13]]},"reference":[{"key":"e_1_3_3_2_2_2","doi-asserted-by":"publisher","unstructured":"Samar Aad and Mariann Hardey. 2024. Generative AI: hopes controversies and the future of faculty roles in education. Quality Assurance in Education 33 2 (09 2024) 267\u2013282. arXiv:https:\/\/www.emerald.com\/qae\/article-pdf\/33\/2\/267\/9670069\/qae-02-2024-0043.pdf10.1108\/QAE-02-2024-0043","DOI":"10.1108\/QAE-02-2024-0043"},{"key":"e_1_3_3_2_3_2","doi-asserted-by":"publisher","unstructured":"S. Abdurahman M. Atari F. Karimi-Malekabadi M.\u00a0J. Xue J. Trager P.\u00a0S. Park P. Golazizian A. Omrani and M. Dehghani. 2024. Perils and opportunities in using large language models in psychological research. PNAS Nexus 3 7 (Jul 2024) pgae245. 10.1093\/pnasnexus\/pgae245","DOI":"10.1093\/pnasnexus\/pgae245"},{"key":"e_1_3_3_2_4_2","unstructured":"Borhane Blili-Hamelin Christopher Graziul Leif Hancox-Li Hananel Hazan El-Mahdi El-Mhamdi Avijit Ghosh Katherine Heller Jacob Metcalf Fabricio Murai Eryk Salvaggio Andrew Smart Todd Snider Mariame Tighanimine Talia Ringer Margaret Mitchell and Shiri Dori-Hacohen. 2025. Stop treating \u2018AGI\u2019 as the north-star goal of AI research. arxiv:https:\/\/arXiv.org\/abs\/2502.03689\u00a0[cs.CY] https:\/\/arxiv.org\/abs\/2502.03689"},{"key":"e_1_3_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706598.3713513"},{"key":"e_1_3_3_2_6_2","doi-asserted-by":"publisher","unstructured":"Beatriz Borges Negar Foroutan Deniz Bayazit Anna Sotnikova Syrielle Montariol Tanya Nazaretsky Mohammadreza Banaei Alireza Sakhaeirad Philippe Servant Seyed\u00a0Parsa Neshaei Jibril Frej Angelika Romanou Gail Weiss Sepideh Mamooler Zeming Chen Simin Fan Silin Gao Mete Ismayilzada Debjit Paul Philippe Schwaller Sacha Friedli Patrick Jermann Tanja K\u00e4ser Antoine Bosselut EPFL\u00a0Grader Consortium and EPFL\u00a0Data Consortium. 2024. Could ChatGPT get an engineering degree? 
Evaluating higher education vulnerability to AI assistants. Proceedings of the National Academy of Sciences 121 49 (2024) e2414955121. arXiv:https:\/\/www.pnas.org\/doi\/pdf\/10.1073\/pnas.241495512110.1073\/pnas.2414955121","DOI":"10.1073\/pnas.2414955121"},{"key":"e_1_3_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3613905.3650868"},{"key":"e_1_3_3_2_8_2","volume-title":"Artificial Intelligence and the Future of Teaching and Learning: Insights and Recommendations","author":"Cardona Miguel\u00a0A.","year":"2023","unstructured":"Miguel\u00a0A. Cardona, Roberto\u00a0J. Rodr\u00edguez, and Kristina Ishmael. 2023. Artificial Intelligence and the Future of Teaching and Learning: Insights and Recommendations. Technical Report. U.S. Department of Education, Office of Educational Technology, Washington, DC. https:\/\/www.ed.gov\/sites\/ed\/files\/documents\/ai-report\/ai-report.pdf Insights and Recommendations."},{"key":"e_1_3_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.emnlp-main.474"},{"key":"e_1_3_3_2_10_2","unstructured":"Konstantin Chernyshev Vitaliy Polshkov Ekaterina Artemova Alex Myasnikov Vlad Stepanov Alexei Miasnikov and Sergei Tilga. 2025. U-MATH: A University-Level Benchmark for Evaluating Mathematical Skills in LLMs. arxiv:https:\/\/arXiv.org\/abs\/2412.03205\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2412.03205"},{"key":"e_1_3_3_2_11_2","unstructured":"Alexis Chevalier Jiayi Geng Alexander Wettig Howard Chen Sebastian Mizera Toni Annala Max\u00a0Jameson Aragon Arturo\u00a0Rodr\u00edguez Fanlo Simon Frieder Simon Machado Akshara Prabhakar Ellie Thieu Jiachen\u00a0T. Wang Zirui Wang Xindi Wu Mengzhou Xia Wenhan Xia Jiatong Yu Jun-Jie Zhu Zhiyong\u00a0Jason Ren Sanjeev Arora and Danqi Chen. 2024. Language Models as Science Tutors. arxiv:https:\/\/arXiv.org\/abs\/2402.11111\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2402.11111"},{"key":"e_1_3_3_2_12_2","doi-asserted-by":"publisher","unstructured":"Cheng-Han Chiang and Hung-yi Lee. 2023. 
Can Large Language Models Be an Alternative to Human Evaluations?15607\u201315631. 10.18653\/v1\/2023.acl-long.870","DOI":"10.18653\/v1\/2023.acl-long.870"},{"key":"e_1_3_3_2_13_2","unstructured":"Sribala\u00a0Vidyadhari Chinta Zichong Wang Zhipeng Yin Nhat Hoang Matthew Gonzalez Tai\u00a0Le Quy and Wenbin Zhang. 2024. FairAIED: Navigating Fairness Bias and Ethics in Educational AI Applications. arxiv:https:\/\/arXiv.org\/abs\/2407.18745\u00a0[cs.LG] https:\/\/arxiv.org\/abs\/2407.18745"},{"key":"e_1_3_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.15870201"},{"key":"e_1_3_3_2_15_2","unstructured":"Karl Cobbe Vineet Kosaraju Mohammad Bavarian Mark Chen Heewoo Jun Lukasz Kaiser Matthias Plappert Jerry Tworek Jacob Hilton Reiichiro Nakano Christopher Hesse and John Schulman. 2021. Training Verifiers to Solve Math Word Problems. arxiv:https:\/\/arXiv.org\/abs\/2110.14168\u00a0[cs.LG] https:\/\/arxiv.org\/abs\/2110.14168"},{"key":"e_1_3_3_2_16_2","unstructured":"DeepSeek-AI Daya Guo Dejian Yang Haowei Zhang Junxiao Song Ruoyu Zhang Runxin Xu Qihao Zhu Shirong Ma Peiyi Wang Xiao Bi Xiaokang Zhang Xingkai Yu Yu Wu Z.\u00a0F. Wu Zhibin Gou Zhihong Shao Zhuoshu Li Ziyi Gao Aixin Liu Bing Xue Bingxuan Wang Bochao Wu Bei Feng Chengda Lu Chenggang Zhao Chengqi Deng Chenyu Zhang Chong Ruan Damai Dai Deli Chen Dongjie Ji Erhang Li Fangyun Lin Fucong Dai Fuli Luo Guangbo Hao Guanting Chen Guowei Li H. Zhang Han Bao Hanwei Xu Haocheng Wang Honghui Ding Huajian Xin Huazuo Gao Hui Qu Hui Li Jianzhong Guo Jiashi Li Jiawei Wang Jingchang Chen Jingyang Yuan Junjie Qiu Junlong Li J.\u00a0L. Cai Jiaqi Ni Jian Liang Jin Chen Kai Dong Kai Hu Kaige Gao Kang Guan Kexin Huang Kuai Yu Lean Wang Lecong Zhang Liang Zhao Litong Wang Liyue Zhang Lei Xu Leyi Xia Mingchuan Zhang Minghua Zhang Minghui Tang Meng Li Miaojun Wang Mingming Li Ning Tian Panpan Huang Peng Zhang Qiancheng Wang Qinyu Chen Qiushi Du Ruiqi Ge Ruisong Zhang Ruizhe Pan Runji Wang R.\u00a0J. Chen R.\u00a0L. 
Jin Ruyi Chen Shanghao Lu Shangyan Zhou Shanhuang Chen Shengfeng Ye Shiyu Wang Shuiping Yu Shunfeng Zhou Shuting Pan S.\u00a0S. Li Shuang Zhou Shaoqing Wu Shengfeng Ye Tao Yun Tian Pei Tianyu Sun T. Wang Wangding Zeng Wanjia Zhao Wen Liu Wenfeng Liang Wenjun Gao Wenqin Yu Wentao Zhang W.\u00a0L. Xiao Wei An Xiaodong Liu Xiaohan Wang Xiaokang Chen Xiaotao Nie Xin Cheng Xin Liu Xin Xie Xingchao Liu Xinyu Yang Xinyuan Li Xuecheng Su Xuheng Lin X.\u00a0Q. Li Xiangyue Jin Xiaojin Shen Xiaosha Chen Xiaowen Sun Xiaoxiang Wang Xinnan Song Xinyi Zhou Xianzu Wang Xinxia Shan Y.\u00a0K. Li Y.\u00a0Q. Wang Y.\u00a0X. Wei Yang Zhang Yanhong Xu Yao Li Yao Zhao Yaofeng Sun Yaohui Wang Yi Yu Yichao Zhang Yifan Shi Yiliang Xiong Ying He Yishi Piao Yisong Wang Yixuan Tan Yiyang Ma Yiyuan Liu Yongqiang Guo Yuan Ou Yuduan Wang Yue Gong Yuheng Zou Yujia He Yunfan Xiong Yuxiang Luo Yuxiang You Yuxuan Liu Yuyang Zhou Y.\u00a0X. Zhu Yanhong Xu Yanping Huang Yaohui Li Yi Zheng Yuchen Zhu Yunxian Ma Ying Tang Yukun Zha Yuting Yan Z.\u00a0Z. Ren Zehui Ren Zhangli Sha Zhe Fu Zhean Xu Zhenda Xie Zhengyan Zhang Zhewen Hao Zhicheng Ma Zhigang Yan Zhiyu Wu Zihui Gu Zijia Zhu Zijun Liu Zilin Li Ziwei Xie Ziyang Song Zizheng Pan Zhen Huang Zhipeng Xu Zhongyu Zhang and Zhen Zhang. 2025. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arxiv:https:\/\/arXiv.org\/abs\/2501.12948\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2501.12948"},{"key":"e_1_3_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706598.3713349"},{"key":"e_1_3_3_2_18_2","doi-asserted-by":"publisher","unstructured":"Y. Du C. Duan A. Bran A. Sotnikova Y. Qu H. Kulik et\u00a0al. 2024. Large Language Models are Catalyzing Chemistry Education. ChemRxiv (2024). 10.26434\/chemrxiv-2024-h722vPreprint. This content has not been peer-reviewed..","DOI":"10.26434\/chemrxiv-2024-h722v"},{"key":"e_1_3_3_2_19_2","unstructured":"Hugging Face. 2025. Open R1: A fully open reproduction of DeepSeek-R1. 
https:\/\/github.com\/huggingface\/open-r1"},{"key":"e_1_3_3_2_20_2","unstructured":"Menna Fateen and Tsunenori Mine. 2024. Developing a Tutoring Dialog Dataset to Optimize LLMs for Educational Use. arxiv:https:\/\/arXiv.org\/abs\/2410.19231\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2410.19231"},{"key":"e_1_3_3_2_21_2","unstructured":"Lucile Favero Juan-Antonio P\u00e9rez-Ortiz Tanja K\u00e4ser and Nuria Oliver. 2025. Do AI tutors empower or enslave learners? Toward a critical use of AI in education. arxiv:https:\/\/arXiv.org\/abs\/2507.06878\u00a0[cs.CY] https:\/\/arxiv.org\/abs\/2507.06878"},{"key":"e_1_3_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706468.3706481"},{"key":"e_1_3_3_2_23_2","unstructured":"Cl\u00e9mentine Fourrier Nathan Habib Hynek Kydl\u00ed\u010dek Thomas Wolf and Lewis Tunstall. 2023. LightEval: A lightweight framework for LLM evaluation. https:\/\/github.com\/huggingface\/lighteval"},{"key":"e_1_3_3_2_24_2","volume-title":"Student Generative AI Survey 2025","author":"Freeman Josh","year":"2025","unstructured":"Josh Freeman. 2025. Student Generative AI Survey 2025. Policy Note\u00a061. Higher Education Policy Institute (HEPI)."},{"key":"e_1_3_3_2_25_2","unstructured":"Caterina Fuligni Daniel\u00a0Dominguez Figaredo and Julia Stoyanovich. 2025. \"Would You Want an AI Tutor?\" Understanding Stakeholder Perceptions of LLM-based Systems in the Classroom. arxiv:https:\/\/arXiv.org\/abs\/2503.02885\u00a0[cs.CY] https:\/\/arxiv.org\/abs\/2503.02885"},{"key":"e_1_3_3_2_26_2","unstructured":"Grammarly Inc. 2024. Grammarly Writing Assistant. https:\/\/www.grammarly.com. 
Accessed: 2025-05-20."},{"key":"e_1_3_3_2_27_2","unstructured":"Aaron Grattafiori Abhimanyu Dubey Abhinav Jauhri Abhinav Pandey Abhishek Kadian Ahmad Al-Dahle Aiesha Letman Akhil Mathur Alan Schelten Alex Vaughan Amy Yang Angela Fan Anirudh Goyal Anthony Hartshorn Aobo Yang Archi Mitra Archie Sravankumar Artem Korenev Arthur Hinsvark Arun Rao Aston Zhang Aurelien Rodriguez Austen Gregerson Ava Spataru Baptiste Roziere Bethany Biron Binh Tang Bobbie Chern Charlotte Caucheteux Chaya Nayak Chloe Bi Chris Marra Chris McConnell Christian Keller Christophe Touret Chunyang Wu Corinne Wong Cristian\u00a0Canton Ferrer Cyrus Nikolaidis Damien Allonsius Daniel Song Danielle Pintz Danny Livshits Danny Wyatt David Esiobu Dhruv Choudhary Dhruv Mahajan Diego Garcia-Olano Diego Perino Dieuwke Hupkes Egor Lakomkin Ehab AlBadawy Elina Lobanova Emily Dinan Eric\u00a0Michael Smith Filip Radenovic Francisco Guzm\u00e1n Frank Zhang Gabriel Synnaeve Gabrielle Lee Georgia\u00a0Lewis Anderson Govind Thattai Graeme Nail Gregoire Mialon Guan Pang Guillem Cucurell Hailey Nguyen Hannah Korevaar Hu Xu Hugo Touvron Iliyan Zarov Imanol\u00a0Arrieta Ibarra Isabel Kloumann Ishan Misra Ivan Evtimov Jack Zhang Jade Copet Jaewon Lee Jan Geffert Jana Vranes Jason Park Jay Mahadeokar Jeet Shah Jelmer van\u00a0der Linde Jennifer Billock Jenny Hong Jenya Lee Jeremy Fu Jianfeng Chi Jianyu Huang Jiawen Liu Jie Wang Jiecao Yu Joanna Bitton Joe Spisak Jongsoo Park Joseph Rocca Joshua Johnstun Joshua Saxe Junteng Jia Kalyan\u00a0Vasuden Alwala Karthik Prasad Kartikeya Upasani Kate Plawiak Ke Li Kenneth Heafield Kevin Stone Khalid El-Arini Krithika Iyer Kshitiz Malik Kuenley Chiu Kunal Bhalla Kushal Lakhotia Lauren Rantala-Yeary Laurens van\u00a0der Maaten Lawrence Chen Liang Tan Liz Jenkins Louis Martin Lovish Madaan Lubo Malo Lukas Blecher Lukas Landzaat Luke de Oliveira Madeline Muzzi Mahesh Pasupuleti Mannat Singh Manohar Paluri Marcin Kardas Maria Tsimpoukelli Mathew Oldham Mathieu Rita Maya Pavlova 
Melanie Kambadur Mike Lewis Min Si Mitesh\u00a0Kumar Singh Mona Hassan Naman Goyal Narjes Torabi Nikolay Bashlykov Nikolay Bogoychev Niladri Chatterji Ning Zhang Olivier Duchenne Onur \u00c7elebi Patrick Alrassy Pengchuan Zhang Pengwei Li Petar Vasic Peter Weng Prajjwal Bhargava Pratik Dubal Praveen Krishnan Punit\u00a0Singh Koura Puxin Xu Qing He Qingxiao Dong Ragavan Srinivasan Raj Ganapathy Ramon Calderer Ricardo\u00a0Silveira Cabral Robert Stojnic Roberta Raileanu Rohan Maheswari Rohit Girdhar Rohit Patel Romain Sauvestre Ronnie Polidoro Roshan Sumbaly Ross Taylor Ruan Silva Rui Hou Rui Wang Saghar Hosseini Sahana Chennabasappa Sanjay Singh Sean Bell Seohyun\u00a0Sonia Kim Sergey Edunov Shaoliang Nie Sharan Narang Sharath Raparthy Sheng Shen Shengye Wan Shruti Bhosale Shun Zhang Simon Vandenhende Soumya Batra Spencer Whitman Sten Sootla Stephane Collot Suchin Gururangan Sydney Borodinsky Tamar Herman Tara Fowler Tarek Sheasha Thomas Georgiou Thomas Scialom Tobias Speckbacher Todor Mihaylov Tong Xiao Ujjwal Karn Vedanuj Goswami Vibhor Gupta Vignesh Ramanathan Viktor Kerkez Vincent Gonguet Virginie Do Vish Vogeti V\u00edtor Albiero Vladan Petrovic Weiwei Chu Wenhan Xiong Wenyin Fu Whitney Meers Xavier Martinet Xiaodong Wang Xiaofang Wang Xiaoqing\u00a0Ellen Tan Xide Xia Xinfeng Xie Xuchao Jia Xuewei Wang Yaelle Goldschlag Yashesh Gaur Yasmine Babaei Yi Wen Yiwen Song Yuchen Zhang Yue Li Yuning Mao Zacharie\u00a0Delpierre Coudert Zheng Yan Zhengxing Chen Zoe Papakipos Aaditya Singh Aayushi Srivastava Abha Jain Adam Kelsey Adam Shajnfeld Adithya Gangidi Adolfo Victoria Ahuva Goldstand Ajay Menon Ajay Sharma Alex Boesenberg Alexei Baevski Allie Feinstein Amanda Kallet Amit Sangani Amos Teo Anam Yunus Andrei Lupu Andres Alvarado Andrew Caples Andrew Gu Andrew Ho Andrew Poulton Andrew Ryan Ankit Ramchandani Annie Dong Annie Franco Anuj Goyal Aparajita Saraf Arkabandhu Chowdhury Ashley Gabriel Ashwin Bharambe Assaf Eisenman Azadeh Yazdan Beau James Ben Maurer Benjamin 
Leonhardi Bernie Huang Beth Loyd Beto\u00a0De Paola Bhargavi Paranjape Bing Liu Bo Wu Boyu Ni Braden Hancock Bram Wasti Brandon Spence Brani Stojkovic Brian Gamido Britt Montalvo Carl Parker Carly Burton Catalina Mejia Ce Liu Changhan Wang Changkyu Kim Chao Zhou Chester Hu Ching-Hsiang Chu Chris Cai Chris Tindal Christoph Feichtenhofer Cynthia Gao Damon Civin Dana Beaty Daniel Kreymer Daniel Li David Adkins David Xu Davide Testuggine Delia David Devi Parikh Diana Liskovich Didem Foss Dingkang Wang Duc Le Dustin Holland Edward Dowling Eissa Jamil Elaine Montgomery Eleonora Presani Emily Hahn Emily Wood Eric-Tuan Le Erik Brinkman Esteban Arcaute Evan Dunbar Evan Smothers Fei Sun Felix Kreuk Feng Tian Filippos Kokkinos Firat Ozgenel Francesco Caggioni Frank Kanayet Frank Seide Gabriela\u00a0Medina Florez Gabriella Schwarz Gada Badeer Georgia Swee Gil Halpern Grant Herman Grigory Sizov Guangyi Zhang Guna Lakshminarayanan Hakan Inan Hamid Shojanazeri Han Zou Hannah Wang Hanwen Zha Haroun Habeeb Harrison Rudolph Helen Suk Henry Aspegren Hunter Goldman Hongyuan Zhan Ibrahim Damlaj Igor Molybog Igor Tufanov Ilias Leontiadis Irina-Elena Veliche Itai Gat Jake Weissman James Geboski James Kohli Janice Lam Japhet Asher Jean-Baptiste Gaya Jeff Marcus Jeff Tang Jennifer Chan Jenny Zhen Jeremy Reizenstein Jeremy Teboul Jessica Zhong Jian Jin Jingyi Yang Joe Cummings Jon Carvill Jon Shepard Jonathan McPhie Jonathan Torres Josh Ginsburg Junjie Wang Kai Wu Kam\u00a0Hou U Karan Saxena Kartikay Khandelwal Katayoun Zand Kathy Matosich Kaushik Veeraraghavan Kelly Michelena Keqian Li Kiran Jagadeesh Kun Huang Kunal Chawla Kyle Huang Lailin Chen Lakshya Garg Lavender A Leandro Silva Lee Bell Lei Zhang Liangpeng Guo Licheng Yu Liron Moshkovich Luca Wehrstedt Madian Khabsa Manav Avalani Manish Bhatt Martynas Mankus Matan Hasson Matthew Lennie Matthias Reso Maxim Groshev Maxim Naumov Maya Lathi Meghan Keneally Miao Liu Michael\u00a0L. 
Seltzer Michal Valko Michelle Restrepo Mihir Patel Mik Vyatskov Mikayel Samvelyan Mike Clark Mike Macey Mike Wang Miquel\u00a0Jubert Hermoso Mo Metanat Mohammad Rastegari Munish Bansal Nandhini Santhanam Natascha Parks Natasha White Navyata Bawa Nayan Singhal Nick Egebo Nicolas Usunier Nikhil Mehta Nikolay\u00a0Pavlovich Laptev Ning Dong Norman Cheng Oleg Chernoguz Olivia Hart Omkar Salpekar Ozlem Kalinli Parkin Kent Parth Parekh Paul Saab Pavan Balaji Pedro Rittner Philip Bontrager Pierre Roux Piotr Dollar Polina Zvyagina Prashant Ratanchandani Pritish Yuvraj Qian Liang Rachad Alao Rachel Rodriguez Rafi Ayub Raghotham Murthy Raghu Nayani Rahul Mitra Rangaprabhu Parthasarathy Raymond Li Rebekkah Hogan Robin Battey Rocky Wang Russ Howes Ruty Rinott Sachin Mehta Sachin Siby Sai\u00a0Jayesh Bondu Samyak Datta Sara Chugh Sara Hunt Sargun Dhillon Sasha Sidorov Satadru Pan Saurabh Mahajan Saurabh Verma Seiji Yamamoto Sharadh Ramaswamy Shaun Lindsay Shaun Lindsay Sheng Feng Shenghao Lin Shengxin\u00a0Cindy Zha Shishir Patil Shiva Shankar Shuqiang Zhang Shuqiang Zhang Sinong Wang Sneha Agarwal Soji Sajuyigbe Soumith Chintala Stephanie Max Stephen Chen Steve Kehoe Steve Satterfield Sudarshan Govindaprasad Sumit Gupta Summer Deng Sungmin Cho Sunny Virk Suraj Subramanian Sy Choudhury Sydney Goldman Tal Remez Tamar Glaser Tamara Best Thilo Koehler Thomas Robinson Tianhe Li Tianjun Zhang Tim Matthews Timothy Chou Tzook Shaked Varun Vontimitta Victoria Ajayi Victoria Montanez Vijai Mohan Vinay\u00a0Satish Kumar Vishal Mangla Vlad Ionescu Vlad Poenaru Vlad\u00a0Tiberiu Mihailescu Vladimir Ivanov Wei Li Wenchen Wang Wenwen Jiang Wes Bouaziz Will Constable Xiaocheng Tang Xiaojian Wu Xiaolan Wang Xilun Wu Xinbo Gao Yaniv Kleinman Yanjun Chen Ye Hu Ye Jia Ye Qi Yenda Li Yilin Zhang Ying Zhang Yossi Adi Youngjin Nam Yu Wang Yu Zhao Yuchen Hao Yundi Qian Yunlu Li Yuzi He Zach Rait Zachary DeVito Zef Rosnbrick Zhaoduo Wen Zhenyu Yang Zhiwei Zhao and Zhiyu Ma. 2024. 
The Llama 3 Herd of Models. arxiv:https:\/\/arXiv.org\/abs\/2407.21783\u00a0[cs.AI] https:\/\/arxiv.org\/abs\/2407.21783"},{"key":"e_1_3_3_2_28_2","doi-asserted-by":"publisher","unstructured":"L. Guo D. Wang F. Gu Y. Li Y. Wang and R. Zhou. 2021. Evolution and trends in intelligent tutoring systems research: a multidisciplinary and scientometric view. Asia Pacific Education Review 22 3 (2021) 441\u2013461. 10.1007\/s12564-021-09697-7 Epub 2021 May 4.","DOI":"10.1007\/s12564-021-09697-7"},{"key":"e_1_3_3_2_29_2","doi-asserted-by":"publisher","unstructured":"Batel Hazan-Liran and Paul Miller. 2024. The Influence of Manipulating and Accentuating Task-Irrelevant Information on Learning Efficiency: Insights for Cognitive Load Theory. Journal of Cognition (Apr 2024). 10.5334\/joc.361","DOI":"10.5334\/joc.361"},{"key":"e_1_3_3_2_30_2","unstructured":"Owen Henkel Hannah Horne-Robinson Nessie Kozhakhmetova and Amanda Lee. 2024. Effective and Scalable Math Support: Evidence on the Impact of an AI-Tutor on Math Achievement in Ghana. arxiv:https:\/\/arXiv.org\/abs\/2402.09809\u00a0[cs.HC] https:\/\/arxiv.org\/abs\/2402.09809"},{"key":"e_1_3_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.bea-1.60"},{"key":"e_1_3_3_2_32_2","doi-asserted-by":"publisher","unstructured":"Kenneth Holstein Bruce\u00a0M. McLaren and Vincent Aleven. 2019. Co-Designing a Real-Time Classroom Orchestration Tool to Support Teacher\u2013AI Complementarity. Journal of Learning Analytics 6 2 (Jul. 2019) 27\u201352. 10.18608\/jla.2019.62.3","DOI":"10.18608\/jla.2019.62.3"},{"key":"e_1_3_3_2_33_2","doi-asserted-by":"publisher","unstructured":"John Jerrim and Sam Sims. 2022. School accountability and teacher stress: international evidence from the OECD TALIS study. Educational Assessment Evaluation and Accountability 34 (02 2022). 10.1007\/s11092-021-09360-0","DOI":"10.1007\/s11092-021-09360-0"},{"key":"e_1_3_3_2_34_2","unstructured":"Albert\u00a0Q. 
Jiang Alexandre Sablayrolles Arthur Mensch Chris Bamford Devendra\u00a0Singh Chaplot Diego de\u00a0las Casas Florian Bressand Gianna Lengyel Guillaume Lample Lucile Saulnier L\u00e9lio\u00a0Renard Lavaud Marie-Anne Lachaux Pierre Stock Teven\u00a0Le Scao Thibaut Lavril Thomas Wang Timoth\u00e9e Lacroix and William\u00a0El Sayed. 2023. Mistral 7B. arxiv:https:\/\/arXiv.org\/abs\/2310.06825\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2310.06825"},{"key":"e_1_3_3_2_35_2","doi-asserted-by":"publisher","unstructured":"Marcin Jukiewicz. 2025. How generative artificial intelligence transforms teaching and influences student wellbeing in future education. Frontiers in Education 10 (08 2025). 10.3389\/feduc.2025.1594572","DOI":"10.3389\/feduc.2025.1594572"},{"key":"e_1_3_3_2_36_2","unstructured":"Irina Jurenka Markus Kunesch Kevin McKee Daniel Gillick Shaojian Zhu Sara Wiltberger Shubham\u00a0Milind Phal Katherine Hermann Daniel Kasenberg Avishkar Bhoopchand Ankit Anand M\u00eeruna Pislar Stephanie Chan Lisa Wang Jennifer She Parsa Mahmoudieh Aliya Rysbek Wei-Jen Ko Andrea Huber Brett Wiltshire Gal Elidan Roni Rabin Jasmin Rubinovitz Amit Pitaru Mac McAllister Julia Wilkowski David Choi Roee Engelberg Lidan Hackmon Adva Levin Rachel Griffin Michael Sears Filip Bar Mia Mesar Mana Jabbour Arslan Chaudhry James Cohan Sridhar Thiagarajan Nir Levine Ben Brown Dilan Gorur Svetlana Grant Rachel Hashimoshoni Laura Weidinger Jieru Hu Dawn Chen Kuba Dolecki Canfer Akbulut Maxwell Bileschi Laura Culp Wen-Xin Dong Nahema Marchal Kelsi\u00a0Van Deman Hema\u00a0Bajaj Misra Michael Duah Moran Ambar Avi Caciularu Sandra Lefdal Christopher Summerfield James An Pierre-Alexandre Kamienny Abhinit Mohdi Theofilos Strinopoulous Annie Hale Wayne Anderson Luis\u00a0C. Cobo Niv Efron Muktha Ananda Shakir Mohamed Maureen Heymans Zoubin Ghahramani Yossi Matias Ben Gomes and Lila Ibrahim. 2024. Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach. 
ArXiv abs\/2407.12687 (2024). https:\/\/api.semanticscholar.org\/CorpusID:271245017"},{"key":"e_1_3_3_2_37_2","first-page":"205","volume-title":"Cognitive load theory","author":"Kirschner Paul A.","year":"2009","unstructured":"Paul A. Kirschner, Femke Kirschner, and Fred Paas. 2009. Cognitive load theory. Vol.\u00a01 (a-j). Macmillan Reference, United States, 205\u2013209."},{"key":"e_1_3_3_2_38_2","unstructured":"Soonwoo Kwon Sojung Kim Minju Park Seunghyun Lee and Kyuseok Kim. 2024. BIPED: Pedagogically Informed Tutoring System for ESL Education. arxiv:https:\/\/arXiv.org\/abs\/2406.03486\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2406.03486"},{"key":"e_1_3_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706598.3713960"},{"key":"e_1_3_3_2_40_2","doi-asserted-by":"publisher","unstructured":"Ming Li Ariunaa Enkhtur Beverley\u00a0Anne Yamamoto Fei Cheng and Lilan Chen. 2025. Potential Societal Biases of ChatGPT in Higher Education: A Scoping Review. Open Praxis 17 1 (2025) 79\u201394. 10.55982\/openpraxis.17.1.750","DOI":"10.55982\/openpraxis.17.1.750"},{"key":"e_1_3_3_2_41_2","unstructured":"Hunter Lightman Vineet Kosaraju Yura Burda Harri Edwards Bowen Baker Teddy Lee Jan Leike John Schulman Ilya Sutskever and Karl Cobbe. 2023. Let\u2019s Verify Step by Step. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2305.20050 (2023)."},{"key":"e_1_3_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3675812.3675874"},{"key":"e_1_3_3_2_43_2","doi-asserted-by":"publisher","unstructured":"Rose Luckin. 2017. Towards artificial intelligence-based assessment systems. Nature Human Behaviour 1 3 (March 2017) 1\u20133. 10.1038\/s41562-016-0028","DOI":"10.1038\/s41562-016-0028"},{"key":"e_1_3_3_2_44_2","doi-asserted-by":"crossref","unstructured":"Jakub Macina Nico Daheim Ido Hakimi Manu Kapur Iryna Gurevych and Mrinmaya Sachan. 2025. MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors. 
arxiv:https:\/\/arXiv.org\/abs\/2502.18940\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2502.18940","DOI":"10.18653\/v1\/2025.emnlp-main.11"},{"key":"e_1_3_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3641554.3701965"},{"key":"e_1_3_3_2_46_2","unstructured":"Ryan Mok Faraaz Akhtar Louis Clare Christine Li Jun Ida Lewis Ross and Mario Campanelli. 2024. Using AI Large Language Models for Grading in Education: A Hands-On Test for Physics. arxiv:https:\/\/arXiv.org\/abs\/2411.13685\u00a0[physics.ed-ph] https:\/\/arxiv.org\/abs\/2411.13685"},{"key":"e_1_3_3_2_47_2","doi-asserted-by":"publisher","unstructured":"W.\u00a0K. Monib A. Qazi R.\u00a0A. Apong M.\u00a0T. Azizan L. De\u00a0Silva and H. Yassin. 2024. Generative AI and future education: a review theoretical validation and authors\u2019 perspective on challenges and solutions. PeerJ Computer Science 10 (2024) e2105. 10.7717\/peerj-cs.2105","DOI":"10.7717\/peerj-cs.2105"},{"key":"e_1_3_3_2_48_2","unstructured":"Open R1 Team. 2025. OpenR1-Math-220k. https:\/\/huggingface.co\/datasets\/open-r1\/OpenR1-Math-220k. Accessed: 2025-04-29."},{"key":"e_1_3_3_2_49_2","unstructured":"OpenAI: Aaron Hurst Adam Lerer Adam\u00a0P. 
Goucher Adam Perelman Aditya Ramesh Aidan Clark AJ Ostrow Akila Welihinda Alan Hayes Alec Radford Aleksander M\u0105dry Alex Baker-Whitcomb Alex Beutel Alex Borzunov Alex Carney Alex Chow Alex Kirillov Alex Nichol Alex Paino Alex Renzin Alex\u00a0Tachard Passos Alexander Kirillov Alexi Christakis Alexis Conneau Ali Kamali Allan Jabri Allison Moyer Allison Tam Amadou Crookes Amin Tootoochian Amin Tootoonchian Ananya Kumar Andrea Vallone Andrej Karpathy Andrew Braunstein Andrew Cann Andrew Codispoti Andrew Galu Andrew Kondrich Andrew Tulloch Andrey Mishchenko Angela Baek Angela Jiang Antoine Pelisse Antonia Woodford Anuj Gosalia Arka Dhar Ashley Pantuliano Avi Nayak Avital Oliver Barret Zoph Behrooz Ghorbani Ben Leimberger Ben Rossen Ben Sokolowsky Ben Wang Benjamin Zweig Beth Hoover Blake Samic Bob McGrew Bobby Spero Bogo Giertler Bowen Cheng Brad Lightcap Brandon Walkin Brendan Quinn Brian Guarraci Brian Hsu Bright Kellogg Brydon Eastman Camillo Lugaresi Carroll Wainwright Cary Bassin Cary Hudson Casey Chu Chad Nelson Chak Li Chan\u00a0Jun Shern Channing Conger Charlotte Barette Chelsea Voss Chen Ding Cheng Lu Chong Zhang Chris Beaumont Chris Hallacy Chris Koch Christian Gibson Christina Kim Christine Choi Christine McLeavey Christopher Hesse Claudia Fischer Clemens Winter Coley Czarnecki Colin Jarvis Colin Wei Constantin Koumouzelis Dane Sherburn Daniel Kappler Daniel Levin Daniel Levy David Carr David Farhi David Mely David Robinson David Sasaki Denny Jin Dev Valladares Dimitris Tsipras Doug Li Duc\u00a0Phong Nguyen Duncan Findlay Edede Oiwoh Edmund Wong Ehsan Asdar Elizabeth Proehl Elizabeth Yang Eric Antonow Eric Kramer Eric Peterson Eric Sigler Eric Wallace Eugene Brevdo Evan Mays Farzad Khorasani Felipe\u00a0Petroski Such Filippo Raso Francis Zhang Fred von Lohmann Freddie Sulit Gabriel Goh Gene Oden Geoff Salmon Giulio Starace Greg Brockman Hadi Salman Haiming Bao Haitang Hu Hannah Wong Haoyu Wang Heather Schmidt Heather Whitney Heewoo Jun Hendrik Kirchner 
Henrique\u00a0Ponde de Oliveira\u00a0Pinto Hongyu Ren Huiwen Chang Hyung\u00a0Won Chung Ian Kivlichan Ian O\u2019Connell Ian O\u2019Connell Ian Osband Ian Silber Ian Sohl Ibrahim Okuyucu Ikai Lan Ilya Kostrikov Ilya Sutskever Ingmar Kanitscheider Ishaan Gulrajani Jacob Coxon Jacob Menick Jakub Pachocki James Aung James Betker James Crooks James Lennon Jamie Kiros Jan Leike Jane Park Jason Kwon Jason Phang Jason Teplitz Jason Wei Jason Wolfe Jay Chen Jeff Harris Jenia Varavva Jessica\u00a0Gan Lee Jessica Shieh Ji Lin Jiahui Yu Jiayi Weng Jie Tang Jieqi Yu Joanne Jang Joaquin\u00a0Quinonero Candela Joe Beutler Joe Landers Joel Parish Johannes Heidecke John Schulman Jonathan Lachman Jonathan McKay Jonathan Uesato Jonathan Ward Jong\u00a0Wook Kim Joost Huizinga Jordan Sitkin Jos Kraaijeveld Josh Gross Josh Kaplan Josh Snyder Joshua Achiam Joy Jiao Joyce Lee Juntang Zhuang Justyn Harriman Kai Fricke Kai Hayashi Karan Singhal Katy Shi Kavin Karthik Kayla Wood Kendra Rimbach Kenny Hsu Kenny Nguyen Keren Gu-Lemberg Kevin Button Kevin Liu Kiel Howe Krithika Muthukumar Kyle Luther Lama Ahmad Larry Kai Lauren Itow Lauren Workman Leher Pathak Leo Chen Li Jing Lia Guy Liam Fedus Liang Zhou Lien Mamitsuka Lilian Weng Lindsay McCallum Lindsey Held Long Ouyang Louis Feuvrier Lu Zhang Lukas Kondraciuk Lukasz Kaiser Luke Hewitt Luke Metz Lyric Doshi Mada Aflak Maddie Simens Madelaine Boyd Madeleine Thompson Marat Dukhan Mark Chen Mark Gray Mark Hudnall Marvin Zhang Marwan Aljubeh Mateusz Litwin Matthew Zeng Max Johnson Maya Shetty Mayank Gupta Meghan Shah Mehmet Yatbaz Meng\u00a0Jia Yang Mengchao Zhong Mia Glaese Mianna Chen Michael Janner Michael Lampe Michael Petrov Michael Wu Michele Wang Michelle Fradin Michelle Pokrass Miguel Castro Miguel Oom\u00a0Temudo de Castro Mikhail Pavlov Miles Brundage Miles Wang Minal Khan Mira Murati Mo Bavarian Molly Lin Murat Yesildal Nacho Soto Natalia Gimelshein Natalie Cone Natalie Staudacher Natalie Summers Natan LaFontaine Neil Chowdhury Nick 
Ryder Nick Stathas Nick Turley Nik Tezak Niko Felix Nithanth Kudige Nitish Keskar Noah Deutsch Noel Bundick Nora Puckett Ofir Nachum Ola Okelola Oleg Boiko Oleg Murk Oliver Jaffe Olivia Watkins Olivier Godement Owen Campbell-Moore Patrick Chao Paul McMillan Pavel Belov Peng Su Peter Bak Peter Bakkum Peter Deng Peter Dolan Peter Hoeschele Peter Welinder Phil Tillet Philip Pronin Philippe Tillet Prafulla Dhariwal Qiming Yuan Rachel Dias Rachel Lim Rahul Arora Rajan Troll Randall Lin Rapha\u00a0Gontijo Lopes Raul Puri Reah Miyara Reimar Leike Renaud Gaubert Reza Zamani Ricky Wang Rob Donnelly Rob Honsby Rocky Smith Rohan Sahai Rohit Ramchandani Romain Huet Rory Carmichael Rowan Zellers Roy Chen Ruby Chen Ruslan Nigmatullin Ryan Cheu Saachi Jain Sam Altman Sam Schoenholz Sam Toizer Samuel Miserendino Sandhini Agarwal Sara Culver Scott Ethersmith Scott Gray Sean Grove Sean Metzger Shamez Hermani Shantanu Jain Shengjia Zhao Sherwin Wu Shino Jomoto Shirong Wu Shuaiqi Xia Sonia Phene Spencer Papay Srinivas Narayanan Steve Coffey Steve Lee Stewart Hall Suchir Balaji Tal Broda Tal Stramer Tao Xu Tarun Gogineni Taya Christianson Ted Sanders Tejal Patwardhan Thomas Cunninghman Thomas Degry Thomas Dimson Thomas Raoux Thomas Shadwell Tianhao Zheng Todd Underwood Todor Markov Toki Sherbakov Tom Rubin Tom Stasi Tomer Kaftan Tristan Heywood Troy Peterson Tyce Walters Tyna Eloundou Valerie Qi Veit Moeller Vinnie Monaco Vishal Kuo Vlad Fomenko Wayne Chang Weiyi Zheng Wenda Zhou Wesam Manassra Will Sheu Wojciech Zaremba Yash Patil Yilei Qian Yongjik Kim Youlong Cheng Yu Zhang Yuchen He Yuchen Zhang Yujia Jin Yunxing Dai and Yury Malkov. 2024. GPT-4o System Card. arxiv:https:\/\/arXiv.org\/abs\/2410.21276\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2410.21276"},{"key":"e_1_3_3_2_50_2","volume-title":"Introducing OpenAI o3 and o4-mini","year":"2025","unstructured":"OpenAI. 2025. Introducing OpenAI o3 and o4-mini. 
https:\/\/openai.com\/index\/introducing-o3-and-o4-mini\/ Accessed: 2025-05-14."},{"key":"e_1_3_3_2_51_2","unstructured":"Sitong Pan Robin Schmucker Bernardo Garcia\u00a0Bulle Bueno Salome\u00a0Aguilar Llanes Fernanda\u00a0Albo Alarc\u00f3n Hangxiao Zhu Adam Teo and Meng Xia. 2025. TutorUp: What If Your Students Were Simulated? Training Tutors to Address Engagement Challenges in Online Learning. arxiv:https:\/\/arXiv.org\/abs\/2502.16178\u00a0[cs.HC] https:\/\/arxiv.org\/abs\/2502.16178"},{"key":"e_1_3_3_2_52_2","doi-asserted-by":"publisher","unstructured":"Yufeng Qian. 2025. Pedagogical Applications of Generative AI in Higher Education: A Systematic Review of the Field. TechTrends (2025). 10.1007\/s11528-025-01100-1","DOI":"10.1007\/s11528-025-01100-1"},{"key":"e_1_3_3_2_53_2","unstructured":"Qwen An Yang Baosong Yang Beichen Zhang Binyuan Hui Bo Zheng Bowen Yu Chengyuan Li Dayiheng Liu Fei Huang Haoran Wei Huan Lin Jian Yang Jianhong Tu Jianwei Zhang Jianxin Yang Jiaxi Yang Jingren Zhou Junyang Lin Kai Dang Keming Lu Keqin Bao Kexin Yang Le Yu Mei Li Mingfeng Xue Pei Zhang Qin Zhu Rui Men Runji Lin Tianhao Li Tianyi Tang Tingyu Xia Xingzhang Ren Xuancheng Ren Yang Fan Yang Su Yichang Zhang Yu Wan Yuqiong Liu Zeyu Cui Zhenru Zhang and Zihan Qiu. 2025. Qwen2.5 Technical Report. arxiv:https:\/\/arXiv.org\/abs\/2412.15115\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2412.15115"},{"key":"e_1_3_3_2_54_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706598.3713971"},{"key":"e_1_3_3_2_55_2","unstructured":"David Rein Betty\u00a0Li Hou Asa\u00a0Cooper Stickland Jackson Petty Richard\u00a0Yuanzhe Pang Julien Dirani Julian Michael and Samuel\u00a0R. Bowman. 2023. GPQA: A Graduate-Level Google-Proof Question and Answer Benchmark. 
arxiv:https:\/\/arXiv.org\/abs\/2311.12022\u00a0[cs.AI] https:\/\/arxiv.org\/abs\/2311.12022"},{"key":"e_1_3_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706598.3713644"},{"key":"e_1_3_3_2_57_2","unstructured":"David Rolnick Alan Aspuru-Guzik Sara Beery Bistra Dilkina Priya\u00a0L. Donti Marzyeh Ghassemi Hannah Kerner Claire Monteleoni Esther Rolf Milind Tambe and Adam White. 2024. Application-Driven Innovation in Machine Learning. arxiv:https:\/\/arXiv.org\/abs\/2403.17381\u00a0[cs.LG] https:\/\/arxiv.org\/abs\/2403.17381"},{"key":"e_1_3_3_2_58_2","unstructured":"Devansh Saxena Ji-Youn Jung Jodi Forlizzi Kenneth Holstein and John Zimmerman. 2025. AI Mismatches: Identifying Potential Algorithmic Harms Before AI Development. arxiv:https:\/\/arXiv.org\/abs\/2502.18682\u00a0[cs.HC] https:\/\/arxiv.org\/abs\/2502.18682"},{"key":"e_1_3_3_2_59_2","doi-asserted-by":"crossref","unstructured":"Orit Shaer Angelora Cooper Osnat Mokryn Andrew\u00a0L. Kun and Hagit\u00a0Ben Shoshan. 2024. AI-Augmented Brainwriting: Investigating the use of LLMs in group ideation. 
arxiv:https:\/\/arXiv.org\/abs\/2402.14978\u00a0[cs.HC] https:\/\/arxiv.org\/abs\/2402.14978","DOI":"10.1145\/3613904.3642414"},{"key":"e_1_3_3_2_60_2","unstructured":"Gemma Team Aishwarya Kamath Johan Ferret Shreya Pathak Nino Vieillard Ramona Merhej Sarah Perrin Tatiana Matejovicova Alexandre Ram\u00e9 Morgane Rivi\u00e8re Louis Rouillard Thomas Mesnard Geoffrey Cideron Jean bastien Grill Sabela Ramos Edouard Yvinec Michelle Casbon Etienne Pot Ivo Penchev Ga\u00ebl Liu Francesco Visin Kathleen Kenealy Lucas Beyer Xiaohai Zhai Anton Tsitsulin Robert Busa-Fekete Alex Feng Noveen Sachdeva Benjamin Coleman Yi Gao Basil Mustafa Iain Barr Emilio Parisotto David Tian Matan Eyal Colin Cherry Jan-Thorsten Peter Danila Sinopalnikov Surya Bhupatiraju Rishabh Agarwal Mehran Kazemi Dan Malkin Ravin Kumar David Vilar Idan Brusilovsky Jiaming Luo Andreas Steiner Abe Friesen Abhanshu Sharma Abheesht Sharma Adi\u00a0Mayrav Gilady Adrian Goedeckemeyer Alaa Saade Alex Feng Alexander Kolesnikov Alexei Bendebury Alvin Abdagic Amit Vadi Andr\u00e1s Gy\u00f6rgy Andr\u00e9\u00a0Susano Pinto Anil Das Ankur Bapna Antoine Miech Antoine Yang Antonia Paterson Ashish Shenoy Ayan Chakrabarti Bilal Piot Bo Wu Bobak Shahriari Bryce Petrini Charlie Chen Charline\u00a0Le Lan Christopher\u00a0A. 
Choquette-Choo CJ Carey Cormac Brick Daniel Deutsch Danielle Eisenbud Dee Cattle Derek Cheng Dimitris Paparas Divyashree\u00a0Shivakumar Sreepathihalli Doug Reid Dustin Tran Dustin Zelle Eric Noland Erwin Huizenga Eugene Kharitonov Frederick Liu Gagik Amirkhanyan Glenn Cameron Hadi Hashemi Hanna Klimczak-Pluci\u0144ska Harman Singh Harsh Mehta Harshal\u00a0Tushar Lehri Hussein Hazimeh Ian Ballantyne Idan Szpektor Ivan Nardini Jean Pouget-Abadie Jetha Chan Joe Stanton John Wieting Jonathan Lai Jordi Orbay Joseph Fernandez Josh Newlan Ju yeong Ji Jyotinder Singh Kat Black Kathy Yu Kevin Hui Kiran Vodrahalli Klaus Greff Linhai Qiu Marcella Valentine Marina Coelho Marvin Ritter Matt Hoffman Matthew Watson Mayank Chaturvedi Michael Moynihan Min Ma Nabila Babar Natasha Noy Nathan Byrd Nick Roy Nikola Momchev Nilay Chauhan Noveen Sachdeva Oskar Bunyan Pankil Botarda Paul Caron Paul\u00a0Kishan Rubenstein Phil Culliton Philipp Schmid Pier\u00a0Giuseppe Sessa Pingmei Xu Piotr Stanczyk Pouya Tafti Rakesh Shivanna Renjie Wu Renke Pan Reza Rokni Rob Willoughby Rohith Vallu Ryan Mullins Sammy Jerome Sara Smoot Sertan Girgin Shariq Iqbal Shashir Reddy Shruti Sheth Siim P\u00f5der Sijal Bhatnagar Sindhu\u00a0Raghuram Panyam Sivan Eiger Susan Zhang Tianqi Liu Trevor Yacovone Tyler Liechty Uday Kalra Utku Evci Vedant Misra Vincent Roseberry Vlad Feinberg Vlad Kolesnikov Woohyun Han Woosuk Kwon Xi Chen Yinlam Chow Yuvein Zhu Zichuan Wei Zoltan Egyed Victor Cotruta Minh Giang Phoebe Kirk Anand Rao Kat Black Nabila Babar Jessica Lo Erica Moreira Luiz\u00a0Gustavo Martins Omar Sanseviero Lucas Gonzalez Zach Gleicher Tris Warkentin Vahab Mirrokni Evan Senter Eli Collins Joelle Barral Zoubin Ghahramani Raia Hadsell Yossi Matias D. 
Sculley Slav Petrov Noah Fiedel Noam Shazeer Oriol Vinyals Jeff Dean Demis Hassabis Koray Kavukcuoglu Clement Farabet Elena Buchatskaya Jean-Baptiste Alayrac Rohan Anil Dmitry Lepikhin Sebastian Borgeaud Olivier Bachem Armand Joulin Alek Andreev Cassidy Hardin Robert Dadashi and L\u00e9onard Hussenot. 2025. Gemma 3 Technical Report. arxiv:https:\/\/arXiv.org\/abs\/2503.19786\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2503.19786"},{"key":"e_1_3_3_2_61_2","volume-title":"Hybrid AI Architecture Evaluation Framework","author":"Team LLM Reasoning-Augmented Benchmark\u00a0Framework","year":"2024","unstructured":"LLM Reasoning-Augmented Benchmark\u00a0Framework Team. 2024. Hybrid AI Architecture Evaluation Framework. https:\/\/github.com\/cavit99\/Reasoning-augmented-Sonnet3.5-Framework"},{"key":"e_1_3_3_2_62_2","unstructured":"Shubham Toshniwal Ivan Moshkov Sean Narenthiran Daria Gitman Fei Jia and Igor Gitman. 2024. OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset. arxiv:https:\/\/arXiv.org\/abs\/2402.10176\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2402.10176"},{"key":"e_1_3_3_2_63_2","doi-asserted-by":"crossref","unstructured":"Trieu Trinh Yuhuai\u00a0Tony Wu Quoc Le He He and Thang Luong. 2024. Solving olympiad geometry without human demonstrations. Nature 625 (2024) 476\u2013482. https:\/\/www.nature.com\/articles\/s41586-023-06747-5","DOI":"10.1038\/s41586-023-06747-5"},{"key":"e_1_3_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706598.3713368"},{"key":"e_1_3_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/3375462.3375504"},{"key":"e_1_3_3_2_66_2","unstructured":"Wenjing Xie Juxin Niu Chun\u00a0Jason Xue and Nan Guan. 2024. Grade Like a Human: Rethinking Automated Assessment with Large Language Models. arxiv:https:\/\/arXiv.org\/abs\/2405.19694\u00a0[cs.AI] https:\/\/arxiv.org\/abs\/2405.19694"},{"key":"e_1_3_3_2_67_2","unstructured":"Lance Ying Katherine\u00a0M. 
Collins Lionel Wong Ilia Sucholutsky Ryan Liu Adrian Weller Tianmin Shu Thomas\u00a0L. Griffiths and Joshua\u00a0B. Tenenbaum. 2025. On Benchmarking Human-Like Intelligence in Machines. arxiv:https:\/\/arXiv.org\/abs\/2502.20502\u00a0[cs.AI] https:\/\/arxiv.org\/abs\/2502.20502"},{"key":"e_1_3_3_2_68_2","unstructured":"Beichen Zhang Kun Zhou Xilin Wei Wayne\u00a0Xin Zhao Jing Sha Shijin Wang and Ji-Rong Wen. 2023. Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning. arxiv:https:\/\/arXiv.org\/abs\/2306.02408\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/2306.02408"},{"key":"e_1_3_3_2_69_2","doi-asserted-by":"publisher","unstructured":"Di Zhang. 2025. AIME 1983 2024 (Revision 6283828). 10.57967\/hf\/4687","DOI":"10.57967\/hf\/4687"}],"event":{"name":"CHI 2026: CHI Conference on Human Factors in Computing Systems","location":"Barcelona Spain","acronym":"CHI '26","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"]},"container-title":["Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3772318.3791236","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T09:52:03Z","timestamp":1776419523000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3772318.3791236"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,13]]},"references-count":68,"alternative-id":["10.1145\/3772318.3791236","10.1145\/3772318"],"URL":"https:\/\/doi.org\/10.1145\/3772318.3791236","relation":{},"subject":[],"published":{"date-parts":[[2026,4,13]]},"assertion":[{"value":"2026-04-13","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}