{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T02:46:23Z","timestamp":1773801983753,"version":"3.50.1"},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"13","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"<jats:p>Vision-Language Models (VLMs) like CLIP struggle to understand negation, often embedding affirmatives and negatives similarly (e.g., matching \"no dog\" with dog images). Existing methods refine negation understanding via fine-tuning CLIP\u2019s text encoder, risking overfitting. In this work, we propose CLIPGlasses, a plug-and-play framework that enhances CLIP\u2019s ability to comprehend negated visual descriptions. CLIPGlasses adapts a dual-stage design: a Lens module disentangles negated semantics from text embeddings, and a Frame module predicts context-aware repulsion strength, which is integrated into the modified similarity computation to penalize alignment with negated semantics, thereby reducing false positive matches. Experiments show that CLIP equipped with CLIPGlasses achieves competitive in-domain performance and outperforms state-of-the-art methods in cross-domain generalization. Its superiority is especially evident under low-resource conditions, indicating stronger robustness across domains.<\/jats:p>","DOI":"10.1609\/aaai.v40i13.38075","type":"journal-article","created":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T00:07:31Z","timestamp":1773792451000},"page":"10978-10986","source":"Crossref","is-referenced-by-count":0,"title":["Not Just What\u2019s There: Enabling CLIP to Comprehend Negated Visual Descriptions Without Fine-Tuning"],"prefix":"10.1609","volume":"40","author":[{"given":"Junhao","family":"Xiao","sequence":"first","affiliation":[]},{"given":"Zhiyu","family":"Wu","sequence":"additional","affiliation":[]},{"given":"Hao","family":"Lin","sequence":"additional","affiliation":[]},{"given":"Yi","family":"Chen","sequence":"additional","affiliation":[]},{"given":"Yahui","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Xiaoran","family":"Zhao","sequence":"additional","affiliation":[]},{"given":"Zixu","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Zejiang","family":"He","sequence":"additional","affiliation":[]}],"member":"9382","published-online":{"date-parts":[[2026,3,14]]},"container-title":["Proceedings of the AAAI Conference on Artificial 
Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/38075\/42037","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/38075\/42037","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T00:07:31Z","timestamp":1773792451000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/38075"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,14]]},"references-count":0,"journal-issue":{"issue":"13","published-online":{"date-parts":[[2026,3,17]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v40i13.38075","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2026,3,14]]}}}