{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T04:48:24Z","timestamp":1776746904321,"version":"3.51.2"},"reference-count":43,"publisher":"World Scientific Pub Co Pte Ltd","issue":"01","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Semantic Computing"],"published-print":{"date-parts":[[2026,3]]},"abstract":"<jats:p>Agentic large language models (LLMs) have emerged as powerful tools for autonomously interacting with external environments and performing multi-step reasoning. While most existing approaches rely on in-context learning with multi-turn few-shot prompts, these methods often require long inputs and, consequently, incur high computational costs and latency. Agent fine-tuning offers a resource-aware alternative by enabling models to internalize procedural reasoning patterns and domain-specific knowledge through demonstrations and curated training data. However, its effectiveness in highly specialized technical microdomains remains underexplored. This work investigates agent fine-tuning with knowledge distillation for adapting LLMs to Hitachi\u2019s JP1 middleware, a complex microdomain centered on IT operations management. We fine-tune models using JP1-specific corpora extracted from manuals and textbooks, together with distilled reasoning trajectories (ReAct and CoT) generated by larger LLMs (GPT-4). At inference time, we incorporate retrieval-augmented generation (RAG) with an agentic prompt and introduce a context-answer extractor (CAE) to improve grounding and relevance. 
On JP1 certification examinations, our model that was continually pre-trained on JP1 manuals and further fine-tuned with ReAct trajectories achieves substantial improvements over the base model \u2014 13% (Engineer), 12% (Professional), and 10% (Consultant) \u2014 while delivering up to 9.78 times higher cost efficiency than strong general-purpose LLMs such as GPT-4. These results demonstrate that agent fine-tuning combined with knowledge distillation is a highly effective and economically scalable strategy for building high-fidelity LLM agents tailored to specialized technical domains.<\/jats:p>","DOI":"10.1142\/s1793351x26410023","type":"journal-article","created":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T04:18:56Z","timestamp":1771561136000},"page":"31-44","source":"Crossref","is-referenced-by-count":0,"title":["Agent Fine\u2013Tuning with Knowledge Distillation: Achieving Cost\u2013Efficient and High\u2013Fidelity LLMs in Microdomains"],"prefix":"10.1142","volume":"20","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-8173-0153","authenticated-orcid":false,"given":"Yawen","family":"Xue","sequence":"first","affiliation":[{"name":"Research and Development Group, Hitachi, Ltd., Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9885-7584","authenticated-orcid":false,"given":"Masaya","family":"Tsunokake","sequence":"additional","affiliation":[{"name":"Research and Development Group, Hitachi, Ltd., Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-2262-3072","authenticated-orcid":false,"given":"Yuta","family":"Koreeda","sequence":"additional","affiliation":[{"name":"Research and Development Group, Hitachi, Ltd., Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yasuhiro","family":"Sogawa","sequence":"additional","affiliation":[{"name":"Research and Development Group, Hitachi, Ltd., Tokyo, 
Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"219","published-online":{"date-parts":[[2026,3,11]]},"reference":[{"key":"S1793351X26410023BIB001","unstructured":"OpenAI, Introducing GPT-4, OpenAI Blog (2023), https:\/\/openai.com\/research\/gpt-4."},{"key":"S1793351X26410023BIB002","unstructured":"G. DeepMind, Gemini: Google\u2019s multimodal AI model (2023), https:\/\/deepmind.google\/technologies\/gemini\/."},{"key":"S1793351X26410023BIB003","unstructured":"R. Vavekanand and K. Sam, Llama 3.1: An in-depth analysis of the next-generation large language model (2024)."},{"key":"S1793351X26410023BIB004","unstructured":"AI@Meta, Llama 3 model card (2024)."},{"key":"S1793351X26410023BIB005","unstructured":"Qwen Team, Qwen: A scalable and open large language model (2023), https:\/\/github.com\/QwenLM\/Qwen-7B."},{"key":"S1793351X26410023BIB006","unstructured":"Deepseek Team, Deepseek LLM: Bridging open-source and commercial LLMs (2023), https:\/\/github.com\/deepseek-ai\/DeepSeek-LLM."},{"key":"S1793351X26410023BIB007","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-024-40579-4"},{"key":"S1793351X26410023BIB008","unstructured":"R. Bommasani et al., On the opportunities and risks of foundation models (2021), arXiv:2108.07258."},{"key":"S1793351X26410023BIB009","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-024-00944-1"},{"key":"S1793351X26410023BIB010","unstructured":"M. A. Ferrag, N. Tihanyi and M. Debbah, From LLM reasoning to autonomous AI agents: A comprehensive review (2025), arXiv:2504.19678."},{"key":"S1793351X26410023BIB011","first-page":"68539","volume-title":"Advances in Neural Information Processing Systems (NeurIPS)","volume":"36","author":"Schick T.","year":"2023"},{"key":"S1793351X26410023BIB012","volume-title":"Int. Conf. Learning Representations (ICLR)","author":"Wang G.","year":"2024"},{"key":"S1793351X26410023BIB013","unstructured":"S. 
Hong et al., Metagpt: Meta programming for multi-agent collaborative framework (2024), arXiv:2308.00352."},{"key":"S1793351X26410023BIB014","unstructured":"N. Shinn et al., Reflexion: Language agents with verbal reinforcement learning (2023), arXiv:2303.11366."},{"key":"S1793351X26410023BIB015","unstructured":"T. Brown et al., Language models are few-shot learners (2020), arXiv:2005.14165."},{"key":"S1793351X26410023BIB016","first-page":"24824","volume-title":"Advances in Neural Information Processing Systems","volume":"35","author":"Wei J.","year":"2022"},{"key":"S1793351X26410023BIB017","volume-title":"Int. Conf. Learning Representations (ICLR)","author":"Yao S.","year":"2023"},{"key":"S1793351X26410023BIB018","unstructured":"Z. Sprague, F. Yin, J. D. Rodriguez, D. Jiang, M. Wadhwa, P. Singhal, X. Zhao, X. Ye, K. Mahowald and G. Durrett, To cot or not to cot? Chain-of-thought helps mainly on math and symbolic reasoning (2024), arXiv:2409.12183."},{"key":"S1793351X26410023BIB019","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00638"},{"key":"S1793351X26410023BIB020","first-page":"16344","volume-title":"Advances in Neural Information Processing Systems (NeurIPS)","author":"Dao T.","year":"2022"},{"key":"S1793351X26410023BIB021","unstructured":"B. Chen, C. Shu, E. Shareghi, N. Collier, K. Narasimhan and S. Yao, Fireact: Toward language agent fine-tuning (2023), arXiv:2310.05915."},{"key":"S1793351X26410023BIB022","doi-asserted-by":"crossref","unstructured":"S. Qiao, N. Zhang, R. Fang, Y. Luo, W. Zhou, Y. E. Jiang, C. Lv and H. Chen, AutoAct: Automatic agent learning from scratch for QA via self-planning (2024), arXiv:2401.05268.","DOI":"10.18653\/v1\/2024.acl-long.165"},{"key":"S1793351X26410023BIB023","first-page":"165","volume-title":"European Conf. 
Computer Vision","author":"Peng Z.","year":"2024"},{"key":"S1793351X26410023BIB024","doi-asserted-by":"crossref","unstructured":"Z. Chen, K. Liu, Q. Wang, W. Zhang, J. Liu, D. Lin, K. Chen and F. Zhao, Agent-flan: Designing data and methods of effective agent tuning for large language models (2024), arXiv:2403.12881.","DOI":"10.18653\/v1\/2024.findings-acl.557"},{"key":"S1793351X26410023BIB025","volume-title":"ICLR 2024 Workshop on Large Language Model (LLM) Agents","author":"Yin D.","year":"2023"},{"key":"S1793351X26410023BIB026","doi-asserted-by":"crossref","unstructured":"D. Yin, F. Brahman, A. Ravichander, K. Chandu, K.W. Chang, Y. Choi and B. Y. Lin, Agent lumos: Unified and modular training for open-source language agents (2023), arXiv:2311.05657.","DOI":"10.18653\/v1\/2024.acl-long.670"},{"key":"S1793351X26410023BIB027","unstructured":"Y. Wang et al., Self-refine: Iterative refinement with self-feedback (2023), arXiv:2303.17651."},{"key":"S1793351X26410023BIB028","first-page":"9459","volume":"33","author":"Lewis P.","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"S1793351X26410023BIB029","unstructured":"Mistral AI, Mistral 7b (2023), arXiv:2310.06825."},{"key":"S1793351X26410023BIB030","first-page":"52","volume":"66","author":"Hooker S.","year":"2023","journal-title":"Commun. ACM"},{"key":"S1793351X26410023BIB031","unstructured":"G. Hinton, O. Vinyals and J. Dean, Distilling the knowledge in a neural network (2015), arXiv:1503.02531."},{"key":"S1793351X26410023BIB032","first-page":"1","volume":"46","author":"Agarwal R.","year":"2024","journal-title":"Found. Trends Mach. Learn."},{"key":"S1793351X26410023BIB033","unstructured":"L. Magister et al., Distilling reasoning capabilities into smaller language models (2023), arXiv:2305.02301."},{"key":"S1793351X26410023BIB034","unstructured":"X. 
Li et al., Reasoning distillation of large language models (2023), arXiv:2311.11314."},{"key":"S1793351X26410023BIB035","volume-title":"Int. Conf. Learning Representations (ICLR)","author":"Izacard G.","year":"2022"},{"key":"S1793351X26410023BIB036","unstructured":"T. Yu et al., Evaluation of retrieval-augmented generation: A survey (2024), arXiv:2401.05856."},{"key":"S1793351X26410023BIB037","unstructured":"J. Achiam et al., Gpt-4 technical report (2023), arXiv:2303.08774."},{"key":"S1793351X26410023BIB038","volume-title":"Int. Conf. Artificial Intelligence x Business","author":"Xue Y.","year":"2025"},{"key":"S1793351X26410023BIB039","doi-asserted-by":"publisher","DOI":"10.1145\/3616855.3635752"},{"key":"S1793351X26410023BIB040","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.577"},{"key":"S1793351X26410023BIB041","unstructured":"R. Ishigami, cyberagent\/llama-3.1-70b-japanese-instruct-2407 (2024)."},{"key":"S1793351X26410023BIB042","unstructured":"I. Loshchilov and F. Hutter, Fixing weight decay regularization in adam, CoRR (2017), arXiv:1711.05101."},{"key":"S1793351X26410023BIB043","unstructured":"L. Debut, A. Zucker, Z. Mueller, Y.D. Shieh, B. Bossan and P. 
Cuenca, Fixing gradient accumulation (2024), https:\/\/huggingface.co\/blog\/gradient_accumulation, Accessed 24 October 2024."}],"container-title":["International Journal of Semantic Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S1793351X26410023","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,21]],"date-time":"2026-04-21T04:28:15Z","timestamp":1776745695000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S1793351X26410023"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3]]},"references-count":43,"journal-issue":{"issue":"01","published-print":{"date-parts":[[2026,3]]}},"alternative-id":["10.1142\/S1793351X26410023"],"URL":"https:\/\/doi.org\/10.1142\/s1793351x26410023","relation":{},"ISSN":["1793-351X","1793-7108"],"issn-type":[{"value":"1793-351X","type":"print"},{"value":"1793-7108","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3]]}}}