{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,6]],"date-time":"2024-09-06T09:43:08Z","timestamp":1725615788785},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020,7]]},"abstract":"<jats:p>Visually-grounded embodied language learning models have recently been shown to be effective at learning multiple multimodal tasks such as following navigational instructions and answering questions. In this paper, we address two key limitations of these models, (a) the inability to transfer the grounded knowledge across different tasks and (b) the inability to transfer to new words and concepts not seen during training using only a few examples. We propose a multitask model which facilitates knowledge transfer across tasks by disentangling the knowledge of words and visual attributes in the intermediate representations. We create scenarios and datasets to quantify cross-task knowledge transfer and show that the proposed model outperforms a range of baselines in simulated 3D environments. We also show that this disentanglement of representations makes our model modular and interpretable which allows for transfer to instructions containing new concepts.<\/jats:p>","DOI":"10.24963\/ijcai.2020\/338","type":"proceedings-article","created":{"date-parts":[[2020,7,8]],"date-time":"2020-07-08T08:12:10Z","timestamp":1594195930000},"page":"2442-2448","source":"Crossref","is-referenced-by-count":4,"title":["Embodied Multimodal Multitask Learning"],"prefix":"10.24963","author":[{"given":"Devendra Singh","family":"Chaplot","sequence":"first","affiliation":[{"name":"Carnegie Mellon University"}]},{"given":"Lisa","family":"Lee","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University"}]},{"given":"Ruslan","family":"Salakhutdinov","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University"}]},{"given":"Devi","family":"Parikh","sequence":"additional","affiliation":[{"name":"Georgia Institute of Technology & Facebook AI Research"}]},{"given":"Dhruv","family":"Batra","sequence":"additional","affiliation":[{"name":"Georgia Institute of Technology & Facebook AI Research"}]}],"member":"10584","event":{"number":"28","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-PRICAI-2020","name":"Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}","start":{"date-parts":[[2020,7,11]]},"theme":"Artificial Intelligence","location":"Yokohama, Japan","end":{"date-parts":[[2020,7,17]]}},"container-title":["Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2020,7,8]],"date-time":"2020-07-08T22:14:33Z","timestamp":1594246473000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2020\/338"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2020,7]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2020\/338","relation":{},"subject":[],"published":{"date-parts":[[2020,7]]}}}