{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,23]],"date-time":"2025-06-23T03:40:05Z","timestamp":1750650005201,"version":"3.41.0"},"reference-count":19,"publisher":"World Scientific Pub Co Pte Ltd","issue":"01","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. Semantic Computing"],"published-print":{"date-parts":[[2025,3]]},"abstract":"<jats:p> Deep Reinforcement Learning (DRL) has shown great potential in enabling robots to find certain objects (e.g. \u201cfind a fridge\u201d) in environments like homes or schools. This task is known as Object-Goal Navigation (ObjectNav). DRL methods are predominantly trained and evaluated using environment simulators. Although DRL has shown impressive results, the simulators may be biased or limited. This creates a risk of shortcut learning, i.e. learning a policy tailored to specific visual details of training environments. We aim to deepen our understanding of shortcut learning in ObjectNav, its implications and propose a solution. We design an experiment for inserting a shortcut bias in the appearance of training environments. As a proof-of-concept, we associate room types to specific wall colors (e.g. bedrooms with green walls), and observe poor generalization of a state-of-the-art (SOTA) ObjectNav method to environments where this is not the case (e.g. bedrooms with blue walls). We find that shortcut learning is the root cause: the agent learns to navigate to target objects, by simply searching for the associated wall color of the target object\u2019s room. To solve this, we propose Language-Based (L-B) augmentation. Our key insight is that we can leverage the multimodal feature space of a Vision-Language Model (VLM) to augment visual representations directly at the feature-level, requiring no changes to the simulator, and only an addition of one layer to the model. Where the SOTA ObjectNav method\u2019s success rate drops 69%, our proposal has only a drop of 23%. Code is available at https:\/\/github.com\/Dennishoftijzer\/L-B_Augmentation . <\/jats:p>","DOI":"10.1142\/s1793351x25410077","type":"journal-article","created":{"date-parts":[[2025,3,7]],"date-time":"2025-03-07T15:09:42Z","timestamp":1741360182000},"page":"147-167","source":"Crossref","is-referenced-by-count":0,"title":["Language-Based Augmentation to Mitigate Shortcut Learning in Object-Goal Navigation"],"prefix":"10.1142","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-5704-3452","authenticated-orcid":false,"given":"Dennis","family":"Hoftijzer","sequence":"first","affiliation":[{"name":"Electrical Engineering, Mathematics, and Computer Science (EEMCS), University of Twente, Drienerlolaan 5, Enschede, Overijssel, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6265-7276","authenticated-orcid":false,"given":"Gertjan","family":"Burghouts","sequence":"additional","affiliation":[{"name":"Intelligent Imaging, TNO, Oude Waalsdorperweg 63, Den Haag, Zuid-Holland, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8481-560X","authenticated-orcid":false,"given":"Luuk","family":"Spreeuwers","sequence":"additional","affiliation":[{"name":"Electrical Engineering, Mathematics, and Computer Science (EEMCS), University of Twente, Drienerlolaan 5, Enschede, Overijssel, The Netherlands"}]}],"member":"219","published-online":{"date-parts":[[2025,4,24]]},"reference":[{"key":"S1793351X25410077BIB001","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01441"},{"key":"S1793351X25410077BIB004","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00943"},{"key":"S1793351X25410077BIB005","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.00323"},{"key":"S1793351X25410077BIB006","doi-asserted-by":"publisher","DOI":"10.1109\/IROS51168.2021.9636667"},{"key":"S1793351X25410077BIB007","doi-asserted-by":"publisher","DOI":"10.1038\/s42256-020-00257-z"},{"key":"S1793351X25410077BIB008","doi-asserted-by":"publisher","DOI":"10.1109\/IRC59093.2023.00007"},{"key":"S1793351X25410077BIB009","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2017.8202133"},{"key":"S1793351X25410077BIB011","first-page":"5982","volume-title":"Advances in Neural Information Processing Systems","volume":"35","author":"Deitke M.","year":"2022"},{"key":"S1793351X25410077BIB012","doi-asserted-by":"publisher","DOI":"10.1109\/3DV.2017.00081"},{"volume-title":"Thirty-fifth Conf. Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)","year":"2021","author":"Ramakrishnan S. K.","key":"S1793351X25410077BIB013"},{"key":"S1793351X25410077BIB015","first-page":"8821","volume-title":"Proc. 38th Int. Conf. Machine Learning","author":"Ramesh A.","year":"2021"},{"key":"S1793351X25410077BIB016","first-page":"8748","volume-title":"Proc. 38th Int. Conf. Machine Learning","author":"Radford A.","year":"2021"},{"key":"S1793351X25410077BIB018","first-page":"492","volume-title":"6th Annual Conf. Robot Learning","author":"Shah D.","year":"2022"},{"key":"S1793351X25410077BIB020","first-page":"32340","volume-title":"Advances in Neural Information Processing Systems","author":"Majumdar A.","year":"2022"},{"key":"S1793351X25410077BIB021","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00945"},{"key":"S1793351X25410077BIB023","doi-asserted-by":"publisher","DOI":"10.1145\/3596490"},{"volume-title":"NeurIPS 2022 Workshop on Distribution Shifts: Connecting Methods and Applications","year":"2022","author":"Wang S.","key":"S1793351X25410077BIB024"},{"key":"S1793351X25410077BIB027","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00111"},{"volume-title":"4th Int. Conf. Learning Representations","year":"2016","author":"Schulman J.","key":"S1793351X25410077BIB030"}],"container-title":["International Journal of Semantic Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S1793351X25410077","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,23]],"date-time":"2025-06-23T03:13:01Z","timestamp":1750648381000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S1793351X25410077"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3]]},"references-count":19,"journal-issue":{"issue":"01","published-print":{"date-parts":[[2025,3]]}},"alternative-id":["10.1142\/S1793351X25410077"],"URL":"https:\/\/doi.org\/10.1142\/s1793351x25410077","relation":{},"ISSN":["1793-351X","1793-7108"],"issn-type":[{"type":"print","value":"1793-351X"},{"type":"electronic","value":"1793-7108"}],"subject":[],"published":{"date-parts":[[2025,3]]}}}