{"status":"ok","message-type":"work","message-version":"1.0.0","message":{
"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T03:02:18Z","timestamp":1773802938288,"version":"3.50.1"},
"reference-count":0,
"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)",
"issue":"24",
"content-domain":{"domain":[],"crossmark-restriction":false},
"short-container-title":["AAAI"],
"abstract":"<jats:p>Layer pruning is a viable technique for compressing large language models while achieving acceleration proportional to the pruning ratio. In this work, we identify that removing any layer induces a magnitude gap in hidden states, and demonstrate that a simple compensation operation leads to superior performance in iterative layer pruning. This key observation motivates us to propose Prune&amp;Comp, a novel, plug-and-play iterative layer pruning scheme that leverages magnitude compensation to mitigate such gaps in a training-free manner. Specifically, we first estimate the magnitude gap of layer removal and then eliminate it by rescaling the remaining weights offline. We further demonstrate the advantages of Prune&amp;Comp in improving the stability of iterative pruning. When integrated with an iterative prune-and-compensate loop, Prune&amp;Comp consistently enhances existing layer pruning metrics. For instance, when 5 layers of LLaMA-3-8B are pruned with the prevalent Taylor+ metric, Prune&amp;Comp reduces PPL from 512.78 to 16.34 and retains 90.57% of the original performance across 9 question-answering tasks, outperforming the baseline by 24.72%.<\/jats:p>",
"DOI":"10.1609\/aaai.v40i24.39120",
"type":"journal-article",
"created":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T01:12:02Z","timestamp":1773796322000},
"page":"20316-20324",
"source":"Crossref",
"is-referenced-by-count":0,
"title":["Prune&amp;Comp: Free Lunch for Layer-Pruned LLMs via Iterative Pruning with Magnitude Compensation"],
"prefix":"10.1609",
"volume":"40",
"author":[{"given":"Xinrui","family":"Chen","sequence":"first","affiliation":[]},{"given":"Hongxing","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Fanyi","family":"Zeng","sequence":"additional","affiliation":[]},{"given":"Yongxian","family":"Wei","sequence":"additional","affiliation":[]},{"given":"Yizhi","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Xitong","family":"Ling","sequence":"additional","affiliation":[]},{"given":"Guanghao","family":"Li","sequence":"additional","affiliation":[]},{"given":"Chun","family":"Yuan","sequence":"additional","affiliation":[]}],
"member":"9382",
"published-online":{"date-parts":[[2026,3,14]]},
"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],
"original-title":[],
"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/39120\/43082","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/39120\/43082","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],
"deposited":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T01:12:02Z","timestamp":1773796322000},
"score":1,
"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/39120"}},
"subtitle":[],
"short-title":[],
"issued":{"date-parts":[[2026,3,14]]},
"references-count":0,
"journal-issue":{"issue":"24","published-online":{"date-parts":[[2026,3,17]]}},
"URL":"https:\/\/doi.org\/10.1609\/aaai.v40i24.39120",
"relation":{},
"ISSN":["2374-3468","2159-5399"],
"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],
"subject":[],
"published":{"date-parts":[[2026,3,14]]}}}