{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,26]],"date-time":"2025-11-26T16:25:42Z","timestamp":1764174342903,"version":"3.38.0"},"reference-count":57,"publisher":"SAGE Publications","issue":"1","license":[{"start":{"date-parts":[[2016,12,14]],"date-time":"2016-12-14T00:00:00Z","timestamp":1481673600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of Robotics Research"],"published-print":{"date-parts":[[2017,1]]},"abstract":"<jats:p> Autonomous vehicles are often tasked to explore unseen environments, aiming to acquire and understand large amounts of visual image data and other sensory information. In such scenarios, remote sensing data may be available a priori, and can help to build a semantic model of the environment and plan future autonomous missions. In this paper, we introduce two multimodal learning algorithms to model the relationship between visual images taken by an autonomous underwater vehicle during a survey and remotely sensed acoustic bathymetry (ocean depth) data that is available prior to the survey. We present a multi-layer architecture to capture the joint distribution between the bathymetry and visual modalities. We then propose an extension based on gated feature learning models, which allows the model to cluster the input data in an unsupervised fashion and predict visual image features using just the ocean depth information. Our experiments demonstrate that multimodal learning improves semantic classification accuracy regardless of which modalities are available at classification time, allows for unsupervised clustering of either or both modalities, and can facilitate mission planning by enabling class-based or image-based queries. <\/jats:p>","DOI":"10.1177\/0278364916679892","type":"journal-article","created":{"date-parts":[[2017,1,21]],"date-time":"2017-01-21T17:11:38Z","timestamp":1485018698000},"page":"24-43","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":32,"title":["Multimodal learning and inference from visual and remotely sensed data"],"prefix":"10.1177","volume":"36","author":[{"given":"Dushyant","family":"Rao","sequence":"first","affiliation":[{"name":"Australian Centre for Field Robotics, The University of Sydney, Sydney, Australia"}]},{"given":"Mark","family":"De Deuge","sequence":"additional","affiliation":[{"name":"Australian Centre for Field Robotics, The University of Sydney, Sydney, Australia"}]},{"given":"Navid","family":"Nourani\u2013Vatani","sequence":"additional","affiliation":[{"name":"Siemens AG, Berlin, Germany"}]},{"given":"Stefan B","family":"Williams","sequence":"additional","affiliation":[{"name":"Australian Centre for Field Robotics, The University of Sydney, Sydney, Australia"}]},{"given":"Oscar","family":"Pizarro","sequence":"additional","affiliation":[{"name":"Australian Centre for Field Robotics, The University of Sydney, Sydney, Australia"}]}],"member":"179","published-online":{"date-parts":[[2016,12,14]]},"reference":[{"key":"bibr1-0278364916679892","doi-asserted-by":"publisher","DOI":"10.3354\/meps08210"},{"key":"bibr2-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1007\/s10208-007-9011-z"},{"key":"bibr3-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2012.6247798"},{"key":"bibr4-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2012.6386258"},{"key":"bibr5-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2013.6630605"},{"key":"bibr6-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1038\/sdata.2015.57"},{"key":"bibr7-0278364916679892","first-page":"1","volume-title":"13th international symposium on experimental robotics","author":"Bo L","year":"2012"},{"key":"bibr8-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1016\/j.ecss.2011.02.007"},{"key":"bibr9-0278364916679892","first-page":"1","volume-title":"NIPS workshop on deep learning and unsupervised feature learning","author":"Coates A","year":"2010"},{"key":"bibr10-0278364916679892","first-page":"921","volume-title":"International conference on machine learning","author":"Coates A","year":"2011"},{"key":"bibr11-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1177\/0278364915587723"},{"key":"bibr12-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1177\/0278364910373409"},{"key":"bibr13-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/MRA.2011.2181683"},{"key":"bibr14-0278364916679892","unstructured":"Erhan D, Bengio Y, Courville A, (2009) Visualizing higher-layer features of a deep network. Technical Report, University of Montreal."},{"key":"bibr15-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0050440"},{"key":"bibr16-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2008.08.007"},{"key":"bibr17-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-00065-7_53"},{"key":"bibr18-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2013.6638947"},{"key":"bibr19-0278364916679892","unstructured":"Hinton G (2010) A practical guide to training restricted Boltzmann machines. Technical Report, Department of Computer Science, University of Toronto, Canada."},{"key":"bibr20-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1162\/089976602760128018"},{"key":"bibr21-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.49"},{"key":"bibr22-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2013.6630948"},{"key":"bibr23-0278364916679892","doi-asserted-by":"publisher","DOI":"10.3354\/meps219121"},{"key":"bibr24-0278364916679892","first-page":"1","volume-title":"Advances in neural information processing systems","author":"Krizhevsky A","year":"2012"},{"key":"bibr25-0278364916679892","first-page":"761","volume-title":"Advances in neural information processing systems","author":"Kurihara K","year":"2006"},{"key":"bibr26-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2014.6907298"},{"key":"bibr27-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1145\/1553374.1553453"},{"key":"bibr28-0278364916679892","first-page":"1096","volume-title":"Advances in neural information processing systems","author":"Lee H","year":"2009"},{"key":"bibr29-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1016\/j.ecss.2012.11.001"},{"key":"bibr30-0278364916679892","first-page":"1","author":"Mao J","year":"2014","journal-title":"arXiv e-Print Archive"},{"key":"bibr31-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1364\/OPEX.13.008766"},{"key":"bibr32-0278364916679892","first-page":"1339","volume":"22","author":"Nair V","year":"2009","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr33-0278364916679892","first-page":"1145","volume-title":"Advances in Neural Information Processing Systems","author":"Nair V","year":"2009"},{"key":"bibr34-0278364916679892","first-page":"689","volume-title":"Proceedings of the 28th annual international conference on machine learning","author":"Ngiam J","year":"2011"},{"key":"bibr35-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/ACSSC.1993.342465"},{"key":"bibr36-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2012.6224637"},{"issue":"2","key":"bibr37-0278364916679892","volume":"29","author":"Pronobis A","year":"2010","journal-title":"The International Journal of Robotics Research (IJRR)"},{"key":"bibr38-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2007.383157"},{"key":"bibr39-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2014.6907413"},{"volume-title":"IROS workshop on alternative sensing for robot perception","year":"2015","author":"Rao D","key":"bibr40-0278364916679892"},{"key":"bibr41-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1002\/rob.20372"},{"key":"bibr42-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1145\/1273496.1273596"},{"key":"bibr43-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1029\/2004EO310002"},{"key":"bibr44-0278364916679892","first-page":"2141","volume-title":"Advances in neural information processing systems","author":"Sohn K","year":"2014"},{"key":"bibr45-0278364916679892","unstructured":"Spinoccia M (2011) Bathymetry grids of south east Tasmania shelf. Geoscience Australia. https:\/\/data.gov.au\/dataset\/bathymetry-grids-of-south-east-tasmania-shelf"},{"key":"bibr46-0278364916679892","first-page":"1","volume":"25","author":"Srivastava N","year":"2012","journal-title":"Advances in Neural Information Processing Systems"},{"key":"bibr47-0278364916679892","unstructured":"Steinberg DM (2013) An unsupervised approach to modelling visual data. PhD Thesis, University of Sydney, Australia."},{"key":"bibr48-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2010.5652480"},{"volume-title":"Proceedings of the international symposium on robotics research","year":"2015","author":"Valada A","key":"bibr49-0278364916679892"},{"key":"bibr50-0278364916679892","doi-asserted-by":"publisher","DOI":"10.3723\/ut.28.099"},{"key":"bibr51-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1162\/NECO_a_00142"},{"key":"bibr52-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1145\/1390156.1390294"},{"key":"bibr53-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-13408-1_25"},{"key":"bibr54-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1109\/MRA.2011.2181772"},{"key":"bibr55-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1080\/01490410701295962"},{"key":"bibr56-0278364916679892","first-page":"1794","volume-title":"IEEE conference on computer vision and pattern recognition","author":"Yang J","year":"2009"},{"key":"bibr57-0278364916679892","doi-asserted-by":"publisher","DOI":"10.1080\/00207729808929596"}],"container-title":["The International Journal of Robotics Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364916679892","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/0278364916679892","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/0278364916679892","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T13:44:56Z","timestamp":1740836696000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/0278364916679892"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,12,14]]},"references-count":57,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2017,1]]}},"alternative-id":["10.1177\/0278364916679892"],"URL":"https:\/\/doi.org\/10.1177\/0278364916679892","relation":{},"ISSN":["0278-3649","1741-3176"],"issn-type":[{"type":"print","value":"0278-3649"},{"type":"electronic","value":"1741-3176"}],"subject":[],"published":{"date-parts":[[2016,12,14]]}}}