{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T18:38:35Z","timestamp":1771267115578,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":29,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2026,2,22]]},"DOI":"10.1145\/3773966.3777952","type":"proceedings-article","created":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T17:50:01Z","timestamp":1771264201000},"page":"946-954","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["TOOL-CURE: Tool Selection via Curriculum-Enhanced Reinforcement Learning with Sample Screening for LLMs"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4619-4950","authenticated-orcid":false,"given":"Jie","family":"Zhang","sequence":"first","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0667-0772","authenticated-orcid":false,"given":"Dongsheng","family":"Bi","sequence":"additional","affiliation":[{"name":"Ant Group, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6357-6726","authenticated-orcid":false,"given":"Tao","family":"Sun","sequence":"additional","affiliation":[{"name":"Ant Group, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9412-7509","authenticated-orcid":false,"given":"Minghui","family":"Yang","sequence":"additional","affiliation":[{"name":"Ant Group, Hangzhou, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4144-1753","authenticated-orcid":false,"given":"Jian","family":"Wang","sequence":"additional","affiliation":[{"name":"Ant Group, Hangzhou, 
China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2135-5359","authenticated-orcid":false,"given":"Yiwei","family":"Wang","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology, Wuhan, China"}]}],"member":"320","published-online":{"date-parts":[[2026,2,21]]},"reference":[{"key":"e_1_3_2_1_1_1","volume-title":"Can a single model master both multi-turn conversations and tool use? coalm: A unified conversational agentic language model. arXiv preprint arXiv:2502.08820","author":"Acikgoz Emre Can","year":"2025","unstructured":"Emre Can Acikgoz, Jeremiah Greer, Akul Datta, Ze Yang, William Zeng, Oussama Elachqar, Emmanouil Koukoumidis, Dilek Hakkani-T\u00fcr, and Gokhan Tur. 2025. Can a single model master both multi-turn conversations and tool use? coalm: A unified conversational agentic language model. arXiv preprint arXiv:2502.08820 (2025)."},{"key":"e_1_3_2_1_2_1","volume-title":"Agent-flan: Designing data and methods of effective agent tuning for large language models. arXiv preprint arXiv:2403.12881","author":"Chen Zehui","year":"2024","unstructured":"Zehui Chen, Kuikun Liu, Qiuchen Wang, Wenwei Zhang, Jiangning Liu, Dahua Lin, Kai Chen, and Feng Zhao. 2024. Agent-flan: Designing data and methods of effective agent tuning for large language models. arXiv preprint arXiv:2403.12881 (2024)."},{"key":"e_1_3_2_1_3_1","volume-title":"Sft memorizes, rl generalizes: A comparative study of foundation model post-training. arXiv preprint arXiv:2501.17161","author":"Chu Tianzhe","year":"2025","unstructured":"Tianzhe Chu, Yuexiang Zhai, Jihan Yang, Shengbang Tong, Saining Xie, Dale Schuurmans, Quoc V Le, Sergey Levine, and Yi Ma. 2025. Sft memorizes, rl generalizes: A comparative study of foundation model post-training. arXiv preprint arXiv:2501.17161 (2025)."},{"key":"e_1_3_2_1_4_1","volume-title":"Improving Generalization in Intent Detection: GRPO with Reward-Based Curriculum Sampling. 
arXiv preprint arXiv:2504.13592","author":"Feng Zihao","year":"2025","unstructured":"Zihao Feng, Xiaoxue Wang, Ziwei Bai, Donghang Su, Bowen Wu, Qun Yu, and Baoxun Wang. 2025. Improving Generalization in Intent Detection: GRPO with Reward-Based Curriculum Sampling. arXiv preprint arXiv:2504.13592 (2025)."},{"key":"e_1_3_2_1_5_1","unstructured":"Daya Guo Dejian Yang Haowei Zhang Junxiao Song Ruoyu Zhang Runxin Xu Qihao Zhu Shirong Ma Peiyi Wang Xiao Bi et al. 2025. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948 (2025)."},{"key":"e_1_3_2_1_6_1","volume-title":"Advancing Personalized Learning with Neural Collapse for Long-Tail Challenge. In Forty-second International Conference on Machine Learning.","author":"Hu Hanglei","unstructured":"Hanglei Hu, Yingying Guo, Zhikang Chen, Sen Cui, Fei Wu, Kun Kuang, Min Zhang, and Bo Jiang. [n.d.]. Advancing Personalized Learning with Neural Collapse for Long-Tail Challenge. In Forty-second International Conference on Machine Learning."},{"key":"e_1_3_2_1_7_1","volume-title":"Neil Zhenqiang Gong, et al","author":"Huang Yue","year":"2023","unstructured":"Yue Huang, Jiawen Shi, Yuan Li, Chenrui Fan, Siyuan Wu, Qihui Zhang, Yixin Liu, Pan Zhou, Yao Wan, Neil Zhenqiang Gong, et al. 2023. Metatool benchmark for large language models: Deciding whether to use tools and which to use. arXiv preprint arXiv:2310.03128 (2023)."},{"key":"e_1_3_2_1_8_1","unstructured":"Binyuan Hui Jian Yang Zeyu Cui Jiaxi Yang Dayiheng Liu Lei Zhang Tianyu Liu Jiajun Zhang Bowen Yu Keming Lu et al. 2024. Qwen2.5-coder technical report. arXiv preprint arXiv:2409.12186 (2024)."},{"key":"e_1_3_2_1_9_1","volume-title":"Search-r1: Training llms to reason and leverage search engines with reinforcement learning. arXiv preprint arXiv:2503.09516","author":"Jin Bowen","year":"2025","unstructured":"Bowen Jin, Hansi Zeng, Zhenrui Yue, Jinsung Yoon, Sercan Arik, Dong Wang, Hamed Zamani, and Jiawei Han. 
2025. Search-r1: Training llms to reason and leverage search engines with reinforcement learning. arXiv preprint arXiv:2503.09516 (2025)."},{"key":"e_1_3_2_1_10_1","volume-title":"From quantity to quality: Boosting llm performance with self-guided data selection for instruction tuning. arXiv preprint arXiv:2308.12032","author":"Li Ming","year":"2023","unstructured":"Ming Li, Yong Zhang, Zhitao Li, Jiuhai Chen, Lichang Chen, Ning Cheng, Jianzong Wang, Tianyi Zhou, and Jing Xiao. 2023. From quantity to quality: Boosting llm performance with self-guided data selection for instruction tuning. arXiv preprint arXiv:2308.12032 (2023)."},{"key":"e_1_3_2_1_11_1","volume-title":"Limr: Less is more for rl scaling. arXiv preprint arXiv:2502.11886","author":"Li Xuefeng","year":"2025","unstructured":"Xuefeng Li, Haoyang Zou, and Pengfei Liu. 2025. Limr: Less is more for rl scaling. arXiv preprint arXiv:2502.11886 (2025)."},{"key":"e_1_3_2_1_12_1","volume-title":"Hammer: Robust function-calling for on-device language models via function masking. arXiv preprint arXiv:2410.04587","author":"Lin Qiqiang","year":"2024","unstructured":"Qiqiang Lin, Muning Wen, Qiuying Peng, Guanyu Nie, Junwei Liao, Jun Wang, Xiaoyun Mo, Jiamu Zhou, Cheng Cheng, Yin Zhao, et al. 2024. Hammer: Robust function-calling for on-device language models via function masking. arXiv preprint arXiv:2410.04587 (2024)."},{"key":"e_1_3_2_1_13_1","volume-title":"Toolace: Winning the points of llm function calling. arXiv preprint arXiv:2409.00920","author":"Liu Weiwen","year":"2024","unstructured":"Weiwen Liu, Xu Huang, Xingshan Zeng, Xinlong Hao, Shuai Yu, Dexun Li, Shuai Wang, Weinan Gan, Zhengying Liu, Yuanqing Yu, et al. 2024. Toolace: Winning the points of llm function calling. 
arXiv preprint arXiv:2409.00920 (2024)."},{"key":"e_1_3_2_1_14_1","volume-title":"Junbo Niu, Chengyu Shen, Runming He, Bin Cui, et al.","author":"Ma Lu","year":"2025","unstructured":"Lu Ma, Hao Liang, Meiyi Qiang, Lexiang Tang, Xiaochen Ma, Zhen Hao Wong, Junbo Niu, Chengyu Shen, Runming He, Bin Cui, et al. 2025. Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions. arXiv preprint arXiv:2506.07527 (2025)."},{"key":"e_1_3_2_1_15_1","volume-title":"Forty-second International Conference on Machine Learning.","author":"Patil Shishir G","unstructured":"Shishir G Patil, Huanzhi Mao, Fanjia Yan, Charlie Cheng-Jie Ji, Vishnu Suresh, Ion Stoica, and Joseph E Gonzalez. [n.d.]. The Berkeley Function Calling Leaderboard (BFCL): From Tool Use to Agentic Evaluation of Large Language Models. In Forty-second International Conference on Machine Learning."},{"key":"e_1_3_2_1_16_1","volume-title":"Qi He, Hongru Wang, Xiusi Chen, Dilek Hakkani-T\u00fcr, Gokhan Tur, and Heng Ji.","author":"Qian Cheng","year":"2025","unstructured":"Cheng Qian, Emre Can Acikgoz, Qi He, Hongru Wang, Xiusi Chen, Dilek Hakkani-T\u00fcr, Gokhan Tur, and Heng Ji. 2025. Toolrl: Reward is all tool learning needs. arXiv preprint arXiv:2504.13958 (2025)."},{"key":"e_1_3_2_1_17_1","volume-title":"Toolllm: Facilitating large language models to master 16000 real-world apis. arXiv preprint arXiv:2307.16789","author":"Qin Yujia","year":"2023","unstructured":"Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, et al. 2023. Toolllm: Facilitating large language models to master 16000 real-world apis. 
arXiv preprint arXiv:2307.16789 (2023)."},{"key":"e_1_3_2_1_18_1","first-page":"68539","article-title":"Toolformer: Language models can teach themselves to use tools","volume":"36","author":"Schick Timo","year":"2023","unstructured":"Timo Schick, Jane Dwivedi-Yu, Roberto Dess\u00ec, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. 2023. Toolformer: Language models can teach themselves to use tools. Advances in Neural Information Processing Systems 36 (2023), 68539--68551.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_19_1","volume-title":"Deepseekmath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300","author":"Shao Zhihong","year":"2024","unstructured":"Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, YK Li, Yang Wu, et al. 2024. Deepseekmath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300 (2024)."},{"key":"e_1_3_2_1_20_1","volume-title":"HybridFlow: A Flexible and Efficient RLHF Framework. arXiv preprint arXiv:2409.19256","author":"Sheng Guangming","year":"2024","unstructured":"Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu, Wang Zhang, Ru Zhang, Yanghua Peng, Haibin Lin, and Chuan Wu. 2024. HybridFlow: A Flexible and Efficient RLHF Framework. arXiv preprint arXiv:2409.19256 (2024)."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-024-40231-1"},{"key":"e_1_3_2_1_22_1","volume-title":"Denny Zhou, et al.","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. 2022. Chain-of-thought prompting elicits reasoning in large language models. 
Advances in neural information processing systems 35 (2022), 24824--24837."},{"key":"e_1_3_2_1_23_1","volume-title":"Reward Hacking in Reinforcement Learning. lilianweng.github.io (Nov","author":"Weng Lilian","year":"2024","unstructured":"Lilian Weng. 2024. Reward Hacking in Reinforcement Learning. lilianweng.github.io (Nov 2024). https:\/\/lilianweng.github.io\/posts\/2024-11-28-reward-hacking\/"},{"key":"e_1_3_2_1_24_1","unstructured":"An Yang Anfeng Li Baosong Yang Beichen Zhang Binyuan Hui Bo Zheng Bowen Yu Chang Gao Chengen Huang Chenxu Lv et al. 2025. Qwen3 technical report. arXiv preprint arXiv:2505.09388 (2025)."},{"key":"e_1_3_2_1_25_1","volume-title":"International Conference on Learning Representations (ICLR).","author":"Yao Shunyu","year":"2023","unstructured":"Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2023. React: Synergizing reasoning and acting in language models. In International Conference on Learning Representations (ICLR)."},{"key":"e_1_3_2_1_26_1","volume-title":"When scaling meets llm finetuning: The effect of data, model and finetuning method. arXiv preprint arXiv:2402.17193","author":"Zhang Biao","year":"2024","unstructured":"Biao Zhang, Zhongtao Liu, Colin Cherry, and Orhan Firat. 2024. When scaling meets llm finetuning: The effect of data, model and finetuning method. arXiv preprint arXiv:2402.17193 (2024)."},{"key":"e_1_3_2_1_27_1","unstructured":"Jianguo Zhang Tian Lan Ming Zhu Zuxin Liu Thai Hoang Shirley Kokane Weiran Yao Juntao Tan Akshara Prabhakar Haolin Chen et al. 2024. xlam: A family of large action models to empower ai agent systems. arXiv preprint arXiv:2409.03215 (2024)."},{"key":"e_1_3_2_1_28_1","volume-title":"SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning. arXiv preprint arXiv:2506.09016","author":"Zhang Ruiqi","year":"2025","unstructured":"Ruiqi Zhang, Daman Arora, Song Mei, and Andrea Zanette. 2025. 
SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning. arXiv preprint arXiv:2506.09016 (2025)."},{"key":"e_1_3_2_1_29_1","unstructured":"Yuze Zhao Jintao Huang Jinghan Hu Xingjun Wang Yunlin Mao Daoze Zhang Zeyinzi Jiang Zhikai Wu Baole Ai Ang Wang Wenmeng Zhou and Yingda Chen. 2024. SWIFT: A Scalable lightWeight Infrastructure for Fine-Tuning. arXiv:2408.05517 [cs.CL] https:\/\/arxiv.org\/abs\/2408.05517"}],"event":{"name":"WSDM '26: The Nineteenth ACM International Conference on Web Search and Data Mining","location":"Boise ID USA","sponsor":["SIGKDD ACM Special Interest Group on Knowledge Discovery in Data","SIGWEB ACM Special Interest Group on Hypertext, Hypermedia, and Web","SIGIR ACM Special Interest Group on Information Retrieval","SIGMOD ACM Special Interest Group on Management of Data"]},"container-title":["Proceedings of the Nineteenth ACM International Conference on Web Search and Data Mining"],"original-title":[],"deposited":{"date-parts":[[2026,2,16]],"date-time":"2026-02-16T17:51:21Z","timestamp":1771264281000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3773966.3777952"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,21]]},"references-count":29,"alternative-id":["10.1145\/3773966.3777952","10.1145\/3773966"],"URL":"https:\/\/doi.org\/10.1145\/3773966.3777952","relation":{},"subject":[],"published":{"date-parts":[[2026,2,21]]},"assertion":[{"value":"2026-02-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}