{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T09:35:58Z","timestamp":1758274558724,"version":"3.40.5"},"reference-count":0,"publisher":"IOS Press","isbn-type":[{"type":"electronic","value":"9781643685489"}],"license":[{"start":{"date-parts":[[2024,10,16]],"date-time":"2024-10-16T00:00:00Z","timestamp":1729036800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by-nc\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024,10,16]]},"abstract":"<jats:p>Large language models demonstrate impressive proficiency in language understanding and generation. Nonetheless, training these models from scratch, even the least complex billion-parameter variant demands significant computational resources rendering it economically impractical for many organizations. With large language models functioning as general-purpose task solvers, this paper investigates their task-specific fine-tuning. We employ task-specific datasets and prompts to fine-tune two pruned LLaMA models having 5 billion and 4 billion parameters. This process utilizes the pre-trained weights and focuses on a subset of weights using the LoRA method. One challenge in fine-tuning the LLaMA model is crafting a precise prompt tailored to the specific task. To address this, we propose a novel approach to fine-tune the LLaMA model under two primary constraints: task specificity and prompt effectiveness. Our approach, Tailored LLaMA initially employs structural pruning to reduce the model sizes from 7B to 5B and 4B parameters. Subsequently, it applies a carefully designed prompt specific to the task and utilizes the LoRA method to accelerate the fine-tuning process. Moreover, fine-tuning a model pruned by 50% for less than one hour restores the mean accuracy of classification tasks to 95.68% at a 20% compression ratio and to 86.54% at a 50% compression ratio through few-shot learning with 50 shots. Our validation of Tailored LLaMA on these two pruned variants demonstrates that even when compressed to 50%, the models maintain over 65% of the baseline model accuracy in few-shot classification and generation tasks. 
These findings highlight the efficacy of our tailored approach in maintaining high performance with significantly reduced model sizes.<\/jats:p>","DOI":"10.3233\/faia240947","type":"book-chapter","created":{"date-parts":[[2024,10,17]],"date-time":"2024-10-17T13:46:28Z","timestamp":1729172788000},"source":"Crossref","is-referenced-by-count":2,"title":["Tailored-LLaMA: Optimizing Few-Shot Learning in Pruned LLaMA Models with Task-Specific Prompts"],"prefix":"10.3233","author":[{"ORCID":"https:\/\/orcid.org\/0009-0006-9825-9134","authenticated-orcid":false,"given":"Danyal","family":"Aftab","sequence":"first","affiliation":[{"name":"Technological University Dublin, Ireland"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3300-1152","authenticated-orcid":false,"given":"Steven","family":"Davy","sequence":"additional","affiliation":[{"name":"Technological University Dublin, Ireland"}]}],"member":"7437","container-title":["Frontiers in Artificial Intelligence and Applications","ECAI 2024"],"original-title":[],"link":[{"URL":"https:\/\/ebooks.iospress.nl\/pdf\/doi\/10.3233\/FAIA240947","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,17]],"date-time":"2024-10-17T13:46:28Z","timestamp":1729172788000},"score":1,"resource":{"primary":{"URL":"https:\/\/ebooks.iospress.nl\/doi\/10.3233\/FAIA240947"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,10,16]]},"ISBN":["9781643685489"],"references-count":0,"URL":"https:\/\/doi.org\/10.3233\/faia240947","relation":{},"ISSN":["0922-6389","1879-8314"],"issn-type":[{"type":"print","value":"0922-6389"},{"type":"electronic","value":"1879-8314"}],"subject":[],"published":{"date-parts":[[2024,10,16]]}}}
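
Since the abstract above describes a prune-then-fine-tune pipeline (structural pruning, a task-specific prompt, and LoRA adapters), a minimal sketch of that kind of setup is given below. This is not the authors' released code: the checkpoint path "pruned-llama-5b", the prompt wording, and the toy 2-example dataset are all assumptions, and the snippet simply shows how LoRA fine-tuning over task-specific prompts could be wired up with the Hugging Face transformers and peft libraries.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Hypothetical path to a structurally pruned LLaMA checkpoint (assumption).
CHECKPOINT = "pruned-llama-5b"
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForCausalLM.from_pretrained(CHECKPOINT)

# Attach LoRA adapters so only a small subset of weights is trained.
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # common choice; the paper may target other modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()

# Task-specific prompt template for a classification task (illustrative wording only).
TEMPLATE = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: {text}\nSentiment: {label}"
)
few_shot_examples = [  # stand-in for a small few-shot training set
    {"text": "A warm, beautifully acted film.", "label": "positive"},
    {"text": "Overlong and painfully dull.", "label": "negative"},
]

# Only LoRA parameters require gradients, so the optimizer sees a tiny parameter set.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
model.train()
for epoch in range(3):
    for ex in few_shot_examples:
        batch = tokenizer(TEMPLATE.format(**ex), return_tensors="pt")
        # Standard causal-LM objective: labels are the input ids (shifted internally).
        out = model(**batch, labels=batch["input_ids"])
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("tailored-llama-5b-lora")  # saves the adapter weights only

Because the LoRA adapters are a small fraction of the pruned model's weights, a short run like this is cheap; the hyperparameters shown (rank 8, learning rate 1e-4, 3 epochs) are illustrative defaults rather than the settings reported in the chapter.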