{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T17:45:33Z","timestamp":1777657533402,"version":"3.51.4"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:p>Significant progress has been made recently in developing few-shot object segmentation methods. Learning is shown to be successful in few-shot segmentation settings, using pixel-level, scribbles and bounding box supervision. This paper takes another approach, i.e., only requiring image-level label for few-shot object segmentation. We propose a novel multi-modal interaction module for few-shot object segmentation that utilizes a co-attention mechanism using both visual and word embedding. Our model using image-level labels achieves 4.8% improvement over previously proposed image-level few-shot object segmentation. It also outperforms state-of-the-art methods that use weak bounding box supervision on PASCAL-5^i. Our results show that few-shot segmentation benefits from utilizing word embeddings, and that we are able to perform few-shot segmentation using stacked joint visual semantic processing with weak image-level labels. We further propose a novel setup, Temporal Object Segmentation for Few-shot Learning (TOSFL) for videos. TOSFL can be used on a variety of public video data such as Youtube-VOS, as demonstrated in both instance-level and category-level TOSFL experiments.<\/jats:p>","DOI":"10.24963\/ijcai.2020\/120","type":"proceedings-article","created":{"date-parts":[[2020,7,8]],"date-time":"2020-07-08T12:12:10Z","timestamp":1594210330000},"page":"860-867","source":"Crossref","is-referenced-by-count":35,"title":["Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Embeddings"],"prefix":"10.24963","author":[{"given":"Mennatullah","family":"Siam","sequence":"first","affiliation":[{"name":"University of Alberta"},{"name":"HiSilicon, Huawei Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Naren","family":"Doraiswamy","sequence":"additional","affiliation":[{"name":"Indian Institute of Science"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Boris N.","family":"Oreshkin","sequence":"additional","affiliation":[{"name":"Element AI"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Hengshuai","family":"Yao","sequence":"additional","affiliation":[{"name":"HiSilicon, Huawei Research"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Martin","family":"Jagersand","sequence":"additional","affiliation":[{"name":"University of Alberta"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"name":"Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}","theme":"Artificial Intelligence","location":"Yokohama, Japan","acronym":"IJCAI-PRICAI-2020","number":"28","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"start":{"date-parts":[[2020,7,11]]},"end":{"date-parts":[[2020,7,17]]}},"container-title":["Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2020,7,9]],"date-time":"2020-07-09T02:13:24Z","timestamp":1594260804000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2020\/120"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2020,7]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2020\/120","relation":{},"subject":[],"published":{"date-parts":[[2020,7]]}}}