{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T04:30:30Z","timestamp":1773808230628,"version":"3.50.1"},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"42","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"<jats:p>Recently, Large Vision-Language Models (LVLMs) have been demonstrated to be vulnerable to jailbreak attacks, highlighting the urgent need for further research to comprehensively identify and mitigate these threats.\nUnfortunately, existing jailbreak studies primarily focus on coarse-grained input manipulation to elicit specific responses, overlooking the exploitation of internal representations, i.e., intermediate activations, which constrains their ability to penetrate alignment safeguards and generate harmful responses.\nTo tackle this issue, we propose the Activation Manipulation (ActMan) Attack framework, which performs fine-grained activation manipulations inspired by the perception and cognition stages of human decision-making, enhancing both the penetration capability and harmfulness of attacks.\nTo improve penetration capability, we introduce a Deceptive Visual Camouflage module inspired by the masking effect in human perception. This module uses a benign activation-guided attention redirection strategy to conceal abnormal activation patterns, thereby suppressing LVLM's defense detection during early-stage decoding.\nTo enhance harmfulness, we design a Malicious Semantic Induction module drawing from the framing effect in human cognition, which reconstructs jailbreak instructions using malicious activation guidance to change LVLM\u2019s risk assessment during late-stage decoding, thereby amplifying the harmfulness of model responses.\nExtensive experiments on six mainstream LVLMs demonstrate that our method remarkably outperforms state-of-the-art baselines, achieving an average relative ASR improvement of 12.06%.<\/jats:p>","DOI":"10.1609\/aaai.v40i42.40858","type":"journal-article","created":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T03:35:26Z","timestamp":1773804926000},"page":"35481-35489","source":"Crossref","is-referenced-by-count":0,"title":["Activation Manipulation Attack: Penetrating and Harmful Jailbreak Attack Against Large Vision-Language Models"],"prefix":"10.1609","volume":"40","author":[{"given":"Haojie","family":"Hao","sequence":"first","affiliation":[]},{"given":"Jiakai","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Aishan","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Yuqing","family":"Ma","sequence":"additional","affiliation":[]},{"given":"Haotong","family":"Qin","sequence":"additional","affiliation":[]},{"given":"Yuanfang","family":"Guo","sequence":"additional","affiliation":[]},{"given":"Xianglong","family":"Liu","sequence":"additional","affiliation":[]}],"member":"9382","published-online":{"date-parts":[[2026,3,14]]},"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/40858\/44819","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/40858\/44819","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T03:35:28Z","timestamp":1773804928000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/40858"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,14]]},"references-count":0,"journal-issue":{"issue":"42","published-online":{"date-parts":[[2026,3,17]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v40i42.40858","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2026,3,14]]}}}