{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,10]],"date-time":"2026-01-10T20:19:41Z","timestamp":1768076381712,"version":"3.49.0"},"reference-count":46,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2022,11,4]],"date-time":"2022-11-04T00:00:00Z","timestamp":1667520000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,11,4]],"date-time":"2022-11-04T00:00:00Z","timestamp":1667520000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"the Wuhan Science and Technology Planning Application Foundation Frontier Project","award":["2019010701011413"],"award-info":[{"award-number":["2019010701011413"]}]},{"name":"the National Key Research and Development Program of China","award":["2018YFB1305001"],"award-info":[{"award-number":["2018YFB1305001"]}]},{"name":"the Project Supported by the Open Fund of Hubei Luojia Laboratory"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2023,4]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Visual mapless navigation (VMN), modeling a direct mapping between sensory inputs and agent actions, aims to navigate from a stochastic origin location to a prescribed goal in an unseen scene. A fundamental yet challenging issue in visual mapless navigation is generalizing to a new scene. Furthermore, it is of pivotal concern to design a method to make effective policy learning. To address these issues, we introduce a novel visual mapless navigation model, which integrates hierarchical semantic information represented by context vector with meta-learning to improve the generalization performance gap between known and unknown environments. 
Extensive experimental results on the AI2-THOR benchmark dataset demonstrate that our model significantly outperforms the state-of-the-art model by <jats:inline-formula><jats:alternatives><jats:tex-math>$$15.79\\%$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mn>15.79<\/mml:mn>\n                    <mml:mo>%<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> for the SPL and by <jats:inline-formula><jats:alternatives><jats:tex-math>$$23.83\\%$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mrow>\n                    <mml:mn>23.83<\/mml:mn>\n                    <mml:mo>%<\/mml:mo>\n                  <\/mml:mrow>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> for the success rate. In addition, the exploration rate experiment shows that our model effectively reduces the agent's invalid exploration behavior and accelerates the model's convergence. 
Our implementation code and data can be viewed on <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"uri\" xlink:href=\"https:\/\/github.com\/zhiyu-tech\/WHU-CVVMN\">https:\/\/github.com\/zhiyu-tech\/WHU-CVVMN<\/jats:ext-link>.<\/jats:p>","DOI":"10.1007\/s40747-022-00902-7","type":"journal-article","created":{"date-parts":[[2022,11,4]],"date-time":"2022-11-04T06:02:31Z","timestamp":1667541751000},"page":"2031-2041","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Context vector-based visual mapless navigation in indoor using hierarchical semantic information and meta-learning"],"prefix":"10.1007","volume":"9","author":[{"given":"Fei","family":"Li","sequence":"first","affiliation":[]},{"given":"Chi","family":"Guo","sequence":"additional","affiliation":[]},{"given":"Huyin","family":"Zhang","sequence":"additional","affiliation":[]},{"given":"Binhan","family":"Luo","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,11,4]]},"reference":[{"key":"902_CR1","doi-asserted-by":"crossref","unstructured":"Zhu Y, Mottaghi R, Kolve E, Lim JJ, Gupta A, Fei-Fei L, Farhadi A (2017) Target-driven visual navigation in indoor scenes using deep reinforcement learning. In: 2017 IEEE international conference on robotics and automation (ICRA). IEEE, pp 3357\u20133364","DOI":"10.1109\/ICRA.2017.7989381"},{"key":"902_CR2","unstructured":"Mirowski P, Grimes M, Malinowski M, Hermann KM, Anderson K, Teplyashin D, Simonyan K, Zisserman A, Hadsell R, et\u00a0al (2018) Learning to navigate in cities without a map. In: Advances in neural information processing systems, pp 2419\u20132430"},{"key":"902_CR3","doi-asserted-by":"crossref","unstructured":"Chaplot DS, Sathyendra KM, Pasumarthi RK, Rajagopal D, Salakhutdinov R (2018) Gated-attention architectures for task-oriented language grounding. 
In: Thirty-second AAAI conference on artificial intelligence","DOI":"10.1609\/aaai.v32i1.11832"},{"key":"902_CR4","unstructured":"Yang W, Wang X, Farhadi A, Gupta A, Mottaghi R (2018) Visual semantic navigation using scene priors. arXiv preprint arXiv:1810.06543"},{"key":"902_CR5","unstructured":"Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907"},{"key":"902_CR6","doi-asserted-by":"crossref","unstructured":"Wu Y, Wu Y, Tamar A, Russell S, Gkioxari G, Tian Y (2019) Bayesian relational memory for semantic visual navigation. In: Proceedings of the IEEE international conference on computer vision, pp 2769\u20132779","DOI":"10.1109\/ICCV.2019.00286"},{"key":"902_CR7","doi-asserted-by":"crossref","unstructured":"Moghaddam MK, Wu Q, Abbasnejad E, Shi J (2021) Optimistic agent: accurate graph-based value estimation for more successful visual navigation. In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision, pp 3733\u20133742","DOI":"10.1109\/WACV48630.2021.00378"},{"key":"902_CR8","doi-asserted-by":"crossref","unstructured":"Wortsman M, Ehsani K, Rastegari M, Farhadi A, Mottaghi R (2019) Learning to learn how to learn: Self-adaptive visual navigation using meta-learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6750\u20136759","DOI":"10.1109\/CVPR.2019.00691"},{"key":"902_CR9","doi-asserted-by":"crossref","unstructured":"Li J, Wang X, Tang S, Shi H, Wu F, Zhuang Y, Wang WY (2020) Unsupervised reinforcement learning of transferable meta-skills for embodied navigation. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 12123\u201312132","DOI":"10.1109\/CVPR42600.2020.01214"},{"key":"902_CR10","doi-asserted-by":"publisher","first-page":"368","DOI":"10.1016\/j.neucom.2021.03.084","volume":"449","author":"F Li","year":"2021","unstructured":"Li F, Guo C, Luo B, Zhang H (2021) Multi goals and multi scenes visual mapless navigation in indoor using meta-learning and scene priors. Neurocomputing 449:368\u2013377","journal-title":"Neurocomputing"},{"key":"902_CR11","doi-asserted-by":"crossref","unstructured":"Torralba A, Murphy KP, Freeman WT, Rubin MA (2003) Context-based vision system for place and object recognition. In: IEEE international conference on IEEE computer society computer vision, vol 2, pp 273\u2013273","DOI":"10.1109\/ICCV.2003.1238354"},{"key":"902_CR12","doi-asserted-by":"crossref","unstructured":"Mottaghi R, Chen X, Liu X, Cho NG, Lee SW, Fidler S, Urtasun R, Yuille A (2014) The role of context for object detection and semantic segmentation in the wild. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 891\u2013898","DOI":"10.1109\/CVPR.2014.119"},{"key":"902_CR13","doi-asserted-by":"publisher","first-page":"467","DOI":"10.1016\/j.ins.2021.06.084","volume":"577","author":"D Elayaperumal","year":"2021","unstructured":"Elayaperumal D, Joo YH (2021) Robust visual object tracking using context-based spatial variation via multi-feature fusion. Inf Sci 577:467\u2013482","journal-title":"Inf Sci"},{"issue":"1","key":"902_CR14","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1016\/S0004-3702(97)00078-7","volume":"99","author":"S Thrun","year":"1998","unstructured":"Thrun S (1998) Learning metric-topological maps for indoor mobile robot navigation. 
Artif Intell 99(1):21\u201371","journal-title":"Artif Intell"},{"issue":"2\u20133","key":"902_CR15","doi-asserted-by":"publisher","first-page":"121","DOI":"10.1016\/S0921-8890(02)00237-3","volume":"40","author":"K Kidono","year":"2002","unstructured":"Kidono K, Miura J, Shirai Y (2002) Autonomous visual navigation of a mobile robot using a human-guided experience. Robot Auton Syst 40(2\u20133):121\u2013130","journal-title":"Robot Auton Syst"},{"issue":"2","key":"902_CR16","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1109\/MRA.2006.1638022","volume":"13","author":"H Durrant-Whyte","year":"2006","unstructured":"Durrant-Whyte H, Bailey T (2006) Simultaneous localization and mapping: part I. IEEE Robot Autom Mag 13(2):99\u2013110","journal-title":"IEEE Robot Autom Mag"},{"key":"902_CR17","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.robot.2014.11.009","volume":"64","author":"E Garcia-Fidalgo","year":"2015","unstructured":"Garcia-Fidalgo E, Ortiz A (2015) Vision-based topological mapping and localization methods: a survey. Robot Auton Syst 64:1\u201320","journal-title":"Robot Auton Syst"},{"issue":"6","key":"902_CR18","doi-asserted-by":"publisher","first-page":"1309","DOI":"10.1109\/TRO.2016.2624754","volume":"32","author":"C Cadena","year":"2016","unstructured":"Cadena C, Carlone L, Carrillo H, Latif Y, Scaramuzza D, Neira J, Reid I, Leonard JJ (2016) Past, present, and future of simultaneous localization and mapping: toward the robust-perception age. IEEE Trans Rob 32(6):1309\u20131332","journal-title":"IEEE Trans Rob"},{"key":"902_CR19","doi-asserted-by":"crossref","unstructured":"Zhang P, Ouyang W, Zhang P, Xue J, Zheng N (2019) Sr-lstm: state refinement for lstm towards pedestrian trajectory prediction. 
In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 12085\u201312094","DOI":"10.1109\/CVPR.2019.01236"},{"issue":"7553","key":"902_CR20","doi-asserted-by":"publisher","first-page":"436","DOI":"10.1038\/nature14539","volume":"521","author":"Y LeCun","year":"2015","unstructured":"LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436\u2013444","journal-title":"Nature"},{"issue":"3","key":"902_CR21","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","volume":"115","author":"O Russakovsky","year":"2015","unstructured":"Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115(3):211\u2013252","journal-title":"Int J Comput Vis"},{"key":"902_CR22","doi-asserted-by":"crossref","unstructured":"He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770\u2013778","DOI":"10.1109\/CVPR.2016.90"},{"key":"902_CR23","unstructured":"Mnih V, Badia AP, Mirza M, Graves A, Lillicrap T, Harley T, Silver D, Kavukcuoglu K (2016) Asynchronous methods for deep reinforcement learning. In: International conference on machine learning, pp 1928\u20131937"},{"key":"902_CR24","unstructured":"Mirowski P, Pascanu R, Viola F, Soyer H, Ballard AJ, Banino A, Denil M, Goroshin R, Sifre L, Kavukcuoglu K et\u00a0al (2016) Learning to navigate in complex environments. arXiv preprint arXiv:1611.03673"},{"key":"902_CR25","doi-asserted-by":"crossref","unstructured":"Mousavian A, Toshev A, Fi\u0161er M, Ko\u0161eck\u00e1 J, Wahid A, Davidson J (2019) Visual representations for semantic target driven navigation. In: 2019 international conference on robotics and automation (ICRA). 
IEEE, pp 8846\u20138852","DOI":"10.1109\/ICRA.2019.8793493"},{"issue":"4","key":"902_CR26","doi-asserted-by":"publisher","first-page":"2393","DOI":"10.1109\/TII.2019.2936167","volume":"16","author":"H Shi","year":"2019","unstructured":"Shi H, Shi L, Xu M, Hwang KS (2019) End-to-end navigation strategy with deep reinforcement learning for mobile robots. IEEE Trans Ind Inf 16(4):2393\u20132402","journal-title":"IEEE Trans Ind Inf"},{"key":"902_CR27","doi-asserted-by":"crossref","unstructured":"Fang K, Toshev A, Fei-Fei L, Savarese S (2019) Scene memory transformer for embodied agents in long-horizon tasks. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 538\u2013547","DOI":"10.1109\/CVPR.2019.00063"},{"key":"902_CR28","unstructured":"Du H, Yu X, Zheng L (2021) Vtnet: Visual transformer network for object goal navigation. arXiv preprint arXiv:2105.09447"},{"key":"902_CR29","first-page":"10001","volume":"34","author":"Q Wu","year":"2020","unstructured":"Wu Q, Manocha D, Wang J, Xu K (2020) Neonav: improving the generalization of visual navigation via generating next expected observations. Proc AAAI Conf Artif Intell 34:10001\u201310008","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"902_CR30","doi-asserted-by":"crossref","unstructured":"Tang T, Yu X, Dong X, Yang Y (2021) Auto-navigator: decoupled neural architecture search for visual navigation. In: Proceedings of the IEEE\/CVF winter conference on applications of computer vision, pp 3743\u20133752","DOI":"10.1109\/WACV48630.2021.00379"},{"key":"902_CR31","doi-asserted-by":"crossref","unstructured":"Zeng Z, R\u00f6fer A, Jenkins OC (2020) Semantic linking maps for active visual object search. In: 2020 IEEE international conference on robotics and automation (ICRA). 
IEEE, pp 1984\u20131990","DOI":"10.1109\/ICRA40945.2020.9196830"},{"key":"902_CR32","unstructured":"Chaplot DS, Gandhi DP, Gupta A, Salakhutdinov RR (2020) Object goal navigation using goal-oriented semantic exploration. Adv Neural Inf Process Syst 33"},{"issue":"2","key":"902_CR33","doi-asserted-by":"publisher","first-page":"1279","DOI":"10.1109\/LRA.2020.2967677","volume":"5","author":"R Druon","year":"2020","unstructured":"Druon R, Yoshiyasu Y, Kanezaki A, Watt A (2020) Visual object search by learning spatial context. IEEE Robot Autom Lett 5(2):1279\u20131286","journal-title":"IEEE Robot Autom Lett"},{"key":"902_CR34","unstructured":"Qiu Y, Pal A, Christensen HI (2020) Learning hierarchical relationships for object-goal navigation. arXiv preprint arXiv:2003.06749"},{"key":"902_CR35","unstructured":"Nagabandi A, Clavera I, Liu S, Fearing RS, Abbeel P, Levine S, Finn C (2018) Learning to adapt in dynamic, real-world environments through meta-reinforcement learning. arXiv preprint arXiv:1803.11347"},{"key":"902_CR36","first-page":"9937","volume":"33","author":"AE Eshratifar","year":"2019","unstructured":"Eshratifar AE, Abrishami MS, Eigen D, Pedram M (2019) A meta-learning approach for custom model training. Proc AAAI Conf Artif Intell 33:9937\u20139938","journal-title":"Proc AAAI Conf Artif Intell"},{"key":"902_CR37","doi-asserted-by":"crossref","unstructured":"Yan L, Liu D, Song Y, Yu C (2020) Multimodal aggregation approach for memory vision-voice indoor navigation with meta-learning. In: 2020 IEEE\/rsj international conference on intelligent robots and systems (IROS). IEEE, pp 5847\u20135854","DOI":"10.1109\/IROS45743.2020.9341398"},{"key":"902_CR38","doi-asserted-by":"crossref","unstructured":"Wang C, Qiu M, Huang J, He X (2020) Keml: A knowledge-enriched meta-learning framework for lexical relation classification. 
arXiv preprint arXiv:2002.10903","DOI":"10.1609\/aaai.v35i15.17640"},{"key":"902_CR39","first-page":"11210","volume":"35","author":"H Zou","year":"2021","unstructured":"Zou H, Ren T, Yan D, Su H, Zhu J (2021) Learning task-distribution reward shaping with meta-learning. Proc AAAI Conf Artif Intell 35:11210\u201311218","journal-title":"Proc AAAI Conf Artif Intell"},{"issue":"8","key":"902_CR40","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735\u20131780","journal-title":"Neural Comput"},{"key":"902_CR41","doi-asserted-by":"crossref","unstructured":"Pennington J, Socher R, Manning CD (2014) Glove: Global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 1532\u20131543","DOI":"10.3115\/v1\/D14-1162"},{"key":"902_CR42","unstructured":"Finn C, Abbeel P, Levine S (2017) Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400"},{"key":"902_CR43","unstructured":"Kolve E, Mottaghi R, Gordon D, Zhu Y, Gupta A, Farhadi A (2017) Ai2-thor: an interactive 3d environment for visual AI. arXiv preprint arXiv:1712.05474"},{"key":"902_CR44","unstructured":"Anderson P, Chang A, Chaplot DS, Dosovitskiy A, Gupta S, Koltun V, Kosecka J, Malik J, Mottaghi R, Savva M et\u00a0al (2018) On evaluation of embodied navigation agents. arXiv preprint arXiv:1807.06757"},{"key":"902_CR45","unstructured":"Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980"},{"key":"902_CR46","doi-asserted-by":"crossref","unstructured":"Mayo B, Hazan T, Tal A (2021) Visual navigation with spatial attention. 
In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, pp 16898\u201316907","DOI":"10.1109\/CVPR46437.2021.01662"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00902-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-022-00902-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-022-00902-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,4,18]],"date-time":"2023-04-18T09:42:29Z","timestamp":1681810949000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-022-00902-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,4]]},"references-count":46,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,4]]}},"alternative-id":["902"],"URL":"https:\/\/doi.org\/10.1007\/s40747-022-00902-7","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,4]]},"assertion":[{"value":"26 January 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 October 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 November 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no 
conflicts of interest to report regarding the present study.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}