{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,12]],"date-time":"2026-06-12T23:56:53Z","timestamp":1781308613979,"version":"3.54.1"},"reference-count":41,"publisher":"Association for Computing Machinery (ACM)","issue":"4","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Digital Threats"],"published-print":{"date-parts":[[2025,12,31]]},"abstract":"<jats:p>Large Language Models (LLMs) have rapidly gained popularity in various fields, including Digital Forensics (DF), where they offer the potential to accelerate investigative processes. Although several studies have explored LLMs for tasks such as evidence identification, artifact analysis, and report writing, fine-tuning models for specific forensic applications remains underexplored. This article addresses this gap by proposing recommendations for fine-tuning LLMs tailored to DF tasks. A case study on chat summarization is presented to showcase the applicability of the recommendations, in which we evaluate multiple fine-tuned models to assess their performance. 
The study concludes by sharing the lessons learned from the case study.<\/jats:p>","DOI":"10.1145\/3748264","type":"journal-article","created":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T17:12:40Z","timestamp":1753377160000},"page":"1-18","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":5,"title":["Fine-Tuning Large Language Models for Digital Forensics: Case Study and General Recommendations"],"prefix":"10.1145","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8844-3425","authenticated-orcid":false,"given":"Ga\u00ebtan","family":"Michelet","sequence":"first","affiliation":[{"name":"Chair for Cybersecurity, University of Augsburg, Augsburg, Germany and School of Criminal Justice, University of Lausanne, Lausanne, Switzerland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-2810-442X","authenticated-orcid":false,"given":"Hans","family":"Henseler","sequence":"additional","affiliation":[{"name":"Netherlands Forensic Institute, The Hague, Netherlands and Leiden University of Applied Sciences, Leiden, Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6128-5201","authenticated-orcid":false,"given":"Harm van","family":"Beek","sequence":"additional","affiliation":[{"name":"Netherlands Forensic Institute, The Hague, Netherlands and Open Universiteit, Heerlen, Netherlands"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6581-7164","authenticated-orcid":false,"given":"Mark","family":"Scanlon","sequence":"additional","affiliation":[{"name":"University College Dublin, Dublin, Ireland"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5261-4600","authenticated-orcid":false,"given":"Frank","family":"Breitinger","sequence":"additional","affiliation":[{"name":"Chair for Cybersecurity, University of 
Augsburg, Augsburg, Germany"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,12,15]]},"reference":[{"key":"e_1_3_4_2_2","first-page":"10088","volume-title":"Advances in Neural Information Processing Systems","author":"Dettmers Tim","year":"2023","unstructured":"Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. 2023. QLoRA: Efficient finetuning of quantized LLMs. In Advances in Neural Information Processing Systems. A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (Eds.), Vol. 36, Curran Associates, Inc., 10088\u201310115. Retrieved from https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2023\/file\/1feb87871436031bdc0f2beaa62a049b-Paper-Conference.pdf"},{"key":"e_1_3_4_3_2","doi-asserted-by":"publisher","DOI":"10.1093\/fsr\/owad039"},{"key":"e_1_3_4_4_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.fsidi.2024.301801"},{"key":"e_1_3_4_5_2","unstructured":"Abhimanyu Dubey Abhinav Jauhri Abhinav Pandey Abhishek Kadian Ahmad Al-Dahle Aiesha Letman Akhil Mathur Alan Schelten Amy Yang Angela Fan et al. 2024. The Llama 3 Herd of models. arXiv:2407.21783. Retrieved from https:\/\/arxiv.org\/abs\/2407.21783"},{"key":"e_1_3_4_6_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-emnlp.24"},{"key":"e_1_3_4_7_2","doi-asserted-by":"crossref","unstructured":"Xiachong Feng Xiaocheng Feng and Bing Qin. 2022. A survey on dialogue summarization: Recent advances and new frontiers. arXiv:2107.03175. Retrieved from https:\/\/arxiv.org\/abs\/2107.03175","DOI":"10.24963\/ijcai.2022\/764"},{"key":"e_1_3_4_8_2","unstructured":"Gemma Team: Morgane Riviere Shreya Pathak Pier Giuseppe Sessa Cassidy Hardin Surya Bhupatiraju L\u00e9onard Hussenot Thomas Mesnard Bobak Shahriari Alexandre Ram\u00e9 Johan Ferret et al. 2024. Gemma 2: Improving open language models at a practical size. arXiv:2408.00118. 
Retrieved from https:\/\/arxiv.org\/abs\/2408.00118"},{"key":"e_1_3_4_9_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-5409"},{"key":"e_1_3_4_10_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.fsidi.2024.301844"},{"key":"e_1_3_4_11_2","first-page":"58","volume-title":"Proceedings of the Third International Workshop on Artificial Intelligence and Intelligent Assistance for Legal Professionals in the Digital Workplace (LegalAIIA \u201923), Vol. 3423. CEUR Workshop Proceedings","author":"Henseler Hans","year":"2023","unstructured":"Hans Henseler and Harm van Beek. 2023. ChatGPT as a Copilot for investigating digital evidence. In Proceedings of the Third International Workshop on Artificial Intelligence and Intelligent Assistance for Legal Professionals in the Digital Workplace (LegalAIIA \u201923), Vol. 3423. CEUR Workshop Proceedings, 58\u201369. Retrieved from https:\/\/ceur-ws.org\/Vol-3423\/paper6.pdf"},{"key":"e_1_3_4_12_2","unstructured":"Edward J. Hu Yelong Shen Phillip Wallis Zeyuan Allen-Zhu Yuanzhi Li Shean Wang Lu Wang and Weizhu Chen. 2021. LoRA: Low-rank adaptation of large language models. arXiv:2106.09685. Retrieved from https:\/\/arxiv.org\/abs\/2106.09685"},{"key":"e_1_3_4_13_2","doi-asserted-by":"publisher","DOI":"10.1109\/ASE56229.2023.00181"},{"key":"e_1_3_4_14_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.clinicalnlp-1.42"},{"key":"e_1_3_4_15_2","unstructured":"Albert Q. Jiang Alexandre Sablayrolles Arthur Mensch Chris Bamford Devendra Singh Chaplot Diego de las Casas Florian Bressand Gianna Lengyel Guillaume Lample Lucile Saulnier et al. 2023. Mistral 7B. arXiv:2310.06825. 
Retrieved from https:\/\/arxiv.org\/abs\/2310.06825"},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCCIS60361.2023.10425000"},{"key":"e_1_3_4_17_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-emnlp.391"},{"key":"e_1_3_4_18_2","first-page":"74","volume-title":"Text Summarization Branches Out","author":"Lin Chin-Yew","year":"2004","unstructured":"Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out. Association for Computational Linguistics, 74\u201381. Retrieved from https:\/\/aclanthology.org\/W04-1013"},{"key":"e_1_3_4_19_2","unstructured":"Raj J. Mathav V. M. Kushala Harikrishna Warrier and Yogesh Gupta. 2024. Fine tuning LLM for enterprise: Practical guidelines and recommendations. arXiv:2404.10779. Retrieved from https:\/\/arxiv.org\/abs\/2404.10779"},{"key":"e_1_3_4_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.fsidi.2023.301683"},{"key":"e_1_3_4_21_2","unstructured":"Shervin Minaee Tomas Mikolov Narjes Nikzad Meysam Chenaghlu Richard Socher Xavier Amatriain and Jianfeng Gao. 2024. Large language models: A survey. arXiv:2402.06196. Retrieved from https:\/\/arxiv.org\/abs\/2402.06196"},{"key":"e_1_3_4_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3744746"},{"key":"e_1_3_4_23_2","doi-asserted-by":"publisher","DOI":"10.3115\/1073083.1073135"},{"key":"e_1_3_4_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3736721"},{"key":"e_1_3_4_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.fsidi.2023.301609"},{"key":"e_1_3_4_26_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.fsidi.2023.301543"},{"key":"e_1_3_4_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2025.3546700"},{"key":"e_1_3_4_28_2","unstructured":"Kunal Suri Prakhar Mishra Saumajit Saha and Atul Singh. 2023. SuryaKiran at MEDIQA-Sum 2023: Leveraging LoRA for clinical dialogue summarization. arXiv:2307.05162. 
Retrieved from https:\/\/arxiv.org\/abs\/2307.05162"},{"key":"e_1_3_4_29_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.naacl-main.415"},{"key":"e_1_3_4_30_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.acl-long.385"},{"key":"e_1_3_4_31_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.patter.2023.100729"},{"key":"e_1_3_4_32_2","unstructured":"Lewis Tunstall Edward Beeching Nathan Lambert Nazneen Rajani Shengyi Huang Kashif Rasul Alvaro Bartolome Alexander M. Rush and Thomas Wolf. 2023. The Alignment Handbook. Version 0.3.0.dev0 Apache-2.0 License. Retrieved from https:\/\/github.com\/huggingface\/alignment-handbook"},{"key":"e_1_3_4_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3664476.3670908"},{"key":"e_1_3_4_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.fsidi.2024.301859"},{"key":"e_1_3_4_35_2","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN60899.2024.10650513"},{"key":"e_1_3_4_36_2","unstructured":"Shengbin Yue Wei Chen Siyuan Wang Bingxuan Li Chenchen Shen Shujun Liu Yuxuan Zhou Yao Xiao Song Yun Xuanjing Huang and Zhongyu Wei. 2023. DISC-LawLLM: Fine-tuning large language models for intelligent legal services. arXiv:2309.11325. Retrieved from https:\/\/arxiv.org\/abs\/2309.11325"},{"key":"e_1_3_4_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3604237.3626838"},{"key":"e_1_3_4_38_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-emnlp.313"},{"key":"e_1_3_4_39_2","unstructured":"Tianyi Zhang Varsha Kishore Felix Wu Kilian Q. Weinberger and Yoav Artzi. 2020. BERTScore: Evaluating text generation with BERT. arXiv:1904.09675. 
Retrieved from https:\/\/arxiv.org\/abs\/1904.09675"},{"key":"e_1_3_4_40_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.findings-emnlp.377"},{"key":"e_1_3_4_41_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.naacl-main.357"},{"key":"e_1_3_4_42_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v36i10.21432"}],"container-title":["Digital Threats: Research and Practice"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3748264","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,15]],"date-time":"2025-12-15T17:57:20Z","timestamp":1765821440000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3748264"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,12,15]]},"references-count":41,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,12,31]]}},"alternative-id":["10.1145\/3748264"],"URL":"https:\/\/doi.org\/10.1145\/3748264","relation":{},"ISSN":["2692-1626","2576-5337"],"issn-type":[{"value":"2692-1626","type":"print"},{"value":"2576-5337","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,12,15]]},"assertion":[{"value":"2025-05-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-07-07","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-12-15","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}