{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T21:23:03Z","timestamp":1780521783858,"version":"3.54.1"},"reference-count":88,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2021,10,30]],"date-time":"2021-10-30T00:00:00Z","timestamp":1635552000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,10,30]],"date-time":"2021-10-30T00:00:00Z","timestamp":1635552000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000266","name":"engineering and physical sciences research council","doi-asserted-by":"publisher","award":["EP\/L000539\/1"],"award-info":[{"award-number":["EP\/L000539\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"engineering and physical sciences research council","doi-asserted-by":"publisher","award":["EP\/P022529\/1"],"award-info":[{"award-number":["EP\/P022529\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Virtual Reality"],"published-print":{"date-parts":[[2022,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>As personalised immersive display systems have been intensely explored in virtual reality (VR), plausible 3D audio corresponding to the visual content is required to provide more realistic experiences to users. It is well known that spatial audio synchronised with visual information improves a sense of immersion but limited research progress has been achieved in immersive audio-visual content production and reproduction. 
In this paper, we propose an end-to-end pipeline to simultaneously reconstruct 3D geometry and acoustic properties of the environment from a pair of omnidirectional panoramic images. A semantic scene reconstruction and completion method using a deep convolutional neural network is proposed to estimate the complete semantic scene geometry in order to adapt spatial audio reproduction to the scene. Experiments provide objective and subjective evaluations of the proposed pipeline for plausible audio-visual VR reproduction of real scenes.<\/jats:p>","DOI":"10.1007\/s10055-021-00594-3","type":"journal-article","created":{"date-parts":[[2021,10,30]],"date-time":"2021-10-30T18:03:09Z","timestamp":1635616989000},"page":"823-838","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["Immersive audio-visual scene reproduction using semantic scene reconstruction from 360 cameras"],"prefix":"10.1007","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4907-0491","authenticated-orcid":false,"given":"Hansung","family":"Kim","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Luca","family":"Remaggi","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Aloisio","family":"Dourado","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Teofilo de","family":"Campos","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Philip J. 
B.","family":"Jackson","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Adrian","family":"Hilton","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2021,10,30]]},"reference":[{"key":"594_CR1","volume-title":"Simulated annealing and Boltzmann machines: a stochastic approach to combinatorial optimization and neural computing","author":"E Aarts","year":"1989","unstructured":"Aarts E, Korst J (1989) Simulated annealing and Boltzmann machines: a stochastic approach to combinatorial optimization and neural computing. Wiley, New York"},{"key":"594_CR2","doi-asserted-by":"crossref","unstructured":"Armeni I, Sener O, Zamir AR, Jiang H, Brilakis I, Fischer M, Savarese S (2016) 3D semantic parsing of large-scale indoor spaces. In: Proceedings of CVPR, pp 1534\u20131543","DOI":"10.1109\/CVPR.2016.170"},{"key":"594_CR3","doi-asserted-by":"crossref","unstructured":"Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell","DOI":"10.1109\/TPAMI.2016.2644615"},{"issue":"5","key":"594_CR4","doi-asserted-by":"publisher","first-page":"3510","DOI":"10.1121\/1.4987362","volume":"141","author":"W Bailey","year":"2017","unstructured":"Bailey W, Fazenda BM (2017) The effect of reverberation and audio spatialization on egocentric distance estimation of objects in stereoscopic virtual reality. J. Acoust Soc Am 141(5):3510","journal-title":"J. Acoust Soc Am"},{"key":"594_CR5","unstructured":"Bailey W, Fazenda BM (2018) The effect of visual cues and binaural rendering method on plausibility in virtual environments. 
In: Proceedings of the 144th AES convention, Milan, Italy"},{"key":"594_CR6","first-page":"69","volume":"XLII\u20132","author":"L Barazzetti","year":"2018","unstructured":"Barazzetti L, Previtali M, Roncoroni F (2018) Can we use low-cost 360 degree cameras to create accurate 3d models? ISPRS Int Arch Photogram Remote Sens Spat Inf Sci XLII\u20132:69\u201375","journal-title":"ISPRS Int Arch Photogram Remote Sens Spat Inf Sci"},{"issue":"4","key":"594_CR7","first-page":"320","volume":"81","author":"M Barron","year":"1995","unstructured":"Barron M (1995) Interpretation of early decay times in concert auditoria. Acta Acustica 81(4):320\u2013331","journal-title":"Acta Acustica"},{"key":"594_CR8","doi-asserted-by":"crossref","unstructured":"Bengio Y, Louradour J, Collobert R, Weston, J (2009) Curriculum learning. In: Proceedings of ICML, pp 41\u201348","DOI":"10.1145\/1553374.1553380"},{"key":"594_CR9","doi-asserted-by":"publisher","first-page":"163","DOI":"10.1007\/s10055-017-0321-4","volume":"24","author":"PRKS Bhama","year":"2017","unstructured":"Bhama PRKS, Hariharasubramanian V, Mythili OP, Ramachandran M (2017) Users-domain knowledge prediction in e-learning with speech-interfaced augmented and virtual reality contents. Virt Real 24:163\u2013173","journal-title":"Virt Real"},{"issue":"8","key":"594_CR10","doi-asserted-by":"publisher","first-page":"98","DOI":"10.3390\/jimaging4080098","volume":"4","author":"S Bianco","year":"2018","unstructured":"Bianco S, Ciocca G, Marelli D (2018) Evaluating the performance of structure from motion pipelines. J Imaging 4(8):98","journal-title":"J Imaging"},{"key":"594_CR11","doi-asserted-by":"publisher","DOI":"10.1007\/b139075","volume-title":"Communication acoustics","author":"J Blauert","year":"2005","unstructured":"Blauert J (2005) Communication acoustics. Springer, Berlin"},{"key":"594_CR12","doi-asserted-by":"crossref","unstructured":"Bleyer M, Breiteneder C (2013) Stereo matching-state-of-the-art and research challenges. 
In: Advanced topics in computer vision, pp 143\u2013179","DOI":"10.1007\/978-1-4471-5520-1_6"},{"issue":"10","key":"594_CR13","doi-asserted-by":"publisher","first-page":"713","DOI":"10.1016\/j.apacoust.2011.04.004","volume":"72","author":"JS Bradley","year":"2011","unstructured":"Bradley JS (2011) Review of objective room acoustics measures and future needs. Appl Acoust 72(10):713\u2013720","journal-title":"Appl Acoust"},{"key":"594_CR14","unstructured":"Brown K, Paradis M, Murphy D (2017) Openairlib: a javascript library for the acoustics of spaces. In: Audio engineering society convention, p 142"},{"issue":"6","key":"594_CR15","doi-asserted-by":"publisher","first-page":"1309","DOI":"10.1109\/TRO.2016.2624754","volume":"32","author":"C Cadena","year":"2016","unstructured":"Cadena C, Carlone L, Carrillo H, Latif Y, Scaramuzza D, Neira J, Reid I, Leonard J (2016) Past, present, and future of simultaneous localization and mapping: towards the robust-perception age. IEEE Trans Rob 32(6):1309\u20131332","journal-title":"IEEE Trans Rob"},{"key":"594_CR16","doi-asserted-by":"crossref","unstructured":"Chang A, Dai A, Funkhouser T, Halber M, Niessner M, Savva M, Song S, Zeng A, Zhang Y (2017) Matterport 3D: learning from RGB-D data in indoor environments. In: Proceedings of 3DV","DOI":"10.1109\/3DV.2017.00081"},{"key":"594_CR17","unstructured":"Corporation V (2021) Steam audio. https:\/\/valvesoftware.github.io\/steam-audio\/"},{"issue":"4","key":"594_CR18","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1109\/MMUL.2012.61","volume":"20","author":"D Cosker","year":"2013","unstructured":"Cosker D, Eisert P, Grau O, Hancock PJB, McKinnell J, Ong E (2013) Applications of face analysis and modeling in media production. IEEE Multimed 20(4):18\u201327","journal-title":"IEEE Multimed"},{"key":"594_CR19","unstructured":"Cox T (2013) Gun shot in anechoic chamber. Freesound. 
https:\/\/freesound.org\/people\/acs272\/sounds\/210766\/"},{"key":"594_CR20","doi-asserted-by":"crossref","unstructured":"Dourado A, de\u00a0Campos TE, Kim H, Hilton A (2021) EdgeNet: semantic scene completion from rgb-d images. In: Proceedings of ICPR","DOI":"10.1109\/ICPR48806.2021.9413252"},{"key":"594_CR21","unstructured":"Farina A (2000) Simultaneous measurement of impulse response and distortion with a swept-sine technique. In: Proceedings of the AES convention"},{"issue":"6","key":"594_CR22","doi-asserted-by":"publisher","first-page":"381","DOI":"10.1145\/358669.358692","volume":"24","author":"MA Fischler","year":"1981","unstructured":"Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381\u2013395","journal-title":"Commun ACM"},{"issue":"1\u20132","key":"594_CR23","first-page":"1","volume":"9","author":"Y Furukawa","year":"2015","unstructured":"Furukawa Y, Hern\u00e1ndez C (2015) Multi-view stereo: a tutorial. Found Trends Comput Gr Vis 9(1\u20132):1\u2013148","journal-title":"Found Trends Comput Gr Vis"},{"key":"594_CR24","doi-asserted-by":"crossref","unstructured":"Garofolo JS, Lamel LF, Fisher WM, Fiscus JG, Pallet DS, Dahlgren NL (1993) DARPA TIMIT acoustic phonetic continuous speech corpus CDROM. Technical report. NIST Interagency","DOI":"10.6028\/NIST.IR.4930"},{"key":"594_CR25","unstructured":"Gaudio: Gaudio vr audio (2021). https:\/\/gaudiolab.com\/solution-ar-vr-and-immersive\/"},{"key":"594_CR26","unstructured":"Gonzalez R, Woods R (2017) Digital image processing. Pearson"},{"issue":"1","key":"594_CR27","doi-asserted-by":"publisher","first-page":"1125","DOI":"10.3389\/fpsyg.2017.01125","volume":"8","author":"M Gonzalez-Franco","year":"2017","unstructured":"Gonzalez-Franco M, Lanier J (2017) Model of illusions and virtual reality. 
Front Psychol 8(1):1125","journal-title":"Front Psychol"},{"key":"594_CR28","unstructured":"Google: Google resonance audio (2021). https:\/\/resonance-audio.github.io\/resonance-audio\/"},{"key":"594_CR29","unstructured":"GoPro: Gopro fusion (2019). https:\/\/shop.gopro.com\/EMEA\/cameras\/fusion\/CHDHZ-103-master.html"},{"key":"594_CR30","unstructured":"Gorzel M, Allen A, Kelly I, Gungormusler A, Kammerl J, Yeh H, Boland F (2019) Efficient encoding and decoding of binaural sound with resonance audio. In: Proceedings of the AES conference on immersive and interactive audio, York, UK"},{"key":"594_CR31","unstructured":"Guo R, Zou C, Hoiem D (2015) Predicting complete 3D models of indoor scenes. CoRR abs\/1504.02437. http:\/\/arxiv.org\/abs\/1504.02437"},{"key":"594_CR32","doi-asserted-by":"crossref","unstructured":"Gupta A, Efros AA, Hebert M (2010) Blocks world revisited: image understanding using qualitative geometry and mechanics. In: Proceedings of ECCV","DOI":"10.1007\/978-3-642-15561-1_35"},{"key":"594_CR33","unstructured":"Handa A, Patraucean V, Badrinarayanan V, Stent S, Cipolla R (2015) SceneNet: understanding real world indoor scenes with synthetic data. CoRR abs\/1511.07041. http:\/\/arxiv.org\/abs\/1511.07041"},{"key":"594_CR34","doi-asserted-by":"crossref","unstructured":"Hicks M, Nichols S, O\u2019Malley C (2004) Comparing the roles of 3d representations in audio and audio-visual collaborations. Virt Real 7:148\u2013163","DOI":"10.1007\/s10055-004-0126-0"},{"key":"594_CR35","unstructured":"Hoeg W, Christensen L, Walker R (1997) Subjective assessment of audio quality\u2014the means and methods within the EBU. Technical report. EBU Technical Review"},{"key":"594_CR36","unstructured":"HTC: Vive pro (2018). 
https:\/\/www.vive.com\/uk\/product\/vive-pro-full-kit\/"},{"issue":"1","key":"594_CR37","doi-asserted-by":"publisher","first-page":"102","DOI":"10.1111\/j.1467-8659.2011.02086.x","volume":"31","author":"V Hulusic","year":"2012","unstructured":"Hulusic V, Harvey C, Debattista K, Tsingos N, Walker S, Howard D, Chalmers A (2012) Acoustic rendering and auditory-visual cross-modal perception and interaction. J Comput Gr Forum 31(1):102\u2013131","journal-title":"J Comput Gr Forum"},{"key":"594_CR38","doi-asserted-by":"crossref","unstructured":"Im S, Ha H, Rameau F, Jeon HG, Choe G, Kweon I (2016) All-around depth from small motion with a spherical panoramic camera. In: Proceedings of ECCV, vol 9907","DOI":"10.1007\/978-3-319-46487-9_10"},{"key":"594_CR39","unstructured":"Insta360: Insta360 one x (2019). https:\/\/www.insta360.com\/product\/insta360-onex"},{"issue":"11","key":"594_CR40","doi-asserted-by":"publisher","first-page":"1611","DOI":"10.1109\/TCSVT.2012.2202185","volume":"22","author":"H Kim","year":"2012","unstructured":"Kim H, Guillemaut JY, Takai T, Sarim M, Hilton A (2012) Outdoor dynamic 3d scene reconstruction. IEEE Trans Circuits Syst Video Technol 22(11):1611\u20131622","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"issue":"1","key":"594_CR41","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1007\/s11263-013-0616-1","volume":"104","author":"H Kim","year":"2013","unstructured":"Kim H, Hilton A (2013) 3D scene reconstruction from multiple spherical stereo pairs. Int J Comput Vis 104(1):94\u2013116","journal-title":"Int J Comput Vis"},{"key":"594_CR42","unstructured":"Kim H, Hughes RJ, Remaggi L, Jackson PJB, Hilton A, Cox TJ, Shirley B (2017) Acoustic room modelling using a spherical camera for reverberant spatial audio objects. 
In: Proceedings of the 142nd AES convention"},{"key":"594_CR43","doi-asserted-by":"crossref","unstructured":"Kim H, Remaggi L, Jackson PJ, Hilton A (2019) Immersive spatial audio reproduction for vr\/ar using room acoustic modelling from 360 images. In: Proceedings of IEEE VR","DOI":"10.1109\/VR.2019.8798247"},{"issue":"4","key":"594_CR44","doi-asserted-by":"publisher","first-page":"917","DOI":"10.1109\/TCSVT.2019.2898732","volume":"30","author":"HG Kim","year":"2020","unstructured":"Kim HG, Lim H, Ro YM (2020) Deep virtual reality image quality assessment with human perception guider for omnidirectional image. IEEE Trans Circuits Syst Video Technol 30(4):917\u2013928","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"issue":"12","key":"594_CR45","doi-asserted-by":"publisher","first-page":"4921","DOI":"10.1109\/TCYB.2019.2931042","volume":"50","author":"U Kim","year":"2020","unstructured":"Kim U, Park J, Song T, Kim J (2020) 3-d scene graph: a sparse and semantic representation of physical environments for intelligent agents. IEEE Trans Cybern 50(12):4921\u20134933","journal-title":"IEEE Trans Cybern"},{"key":"594_CR46","unstructured":"Kinetic A (2021) Wwise spatial audio. https:\/\/www.audiokinetic.com\/products\/wwise-spatial-audio\/"},{"issue":"3","key":"594_CR47","doi-asserted-by":"publisher","first-page":"226","DOI":"10.1109\/34.667881","volume":"20","author":"J Kittler","year":"1998","unstructured":"Kittler J, Hatef M, Duin RPW, Matas J (1998) On combining classifiers. IEEE Trans Pattern Anal Mach Intell 20(3):226\u2013239","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"594_CR48","unstructured":"Kon H, Koike H (2018) Deep neural networks for cross-modal estimations of acoustic reverberation characteristics from two-dimensional images. 
In: Proceedings of the 144th AES convention, Milan, Italy"},{"key":"594_CR49","doi-asserted-by":"crossref","unstructured":"Larsson P, V\u00e4ljam\u00e4e A, V\u00e4stfj\u00e4ll D, Tajadura-Jim\u00e9nez A, Kleiner M (2010) Auditory-induced presence in mixed reality environments and related technology","DOI":"10.1007\/978-1-84882-733-2_8"},{"key":"594_CR50","first-page":"1","volume":"2","author":"KE Laver","year":"2015","unstructured":"Laver KE, George S, Thomas S, Deutsch JE, Crotty M (2015) Virtual reality for stroke rehabilitation. Cochrane Collab 2:1\u201327","journal-title":"Cochrane Collab"},{"key":"594_CR51","doi-asserted-by":"crossref","unstructured":"Li D, Langlois TR, Zheng C (2018) Scene-aware audio for 360\u00b0 videos. ACM Trans Gr 37(4)","DOI":"10.1145\/3197517.3201391"},{"issue":"5","key":"594_CR52","doi-asserted-by":"publisher","first-page":"804","DOI":"10.3813\/AAA.918562","volume":"98","author":"A Lindau","year":"2012","unstructured":"Lindau A, Weinzierl S (2012) Assessing the plausibility of virtual acoustic environments. Acta Acust Acust 98(5):804\u2013810","journal-title":"Acta Acust Acust"},{"key":"594_CR53","unstructured":"Liu S, Hu Y, Zeng Y, Tang Q, Jin B, Han Y, Li X (2018) See and think: disentangling semantic scene completion. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R (eds) Proceedings of NIPS, pp 263\u2013274"},{"issue":"4","key":"594_CR54","doi-asserted-by":"publisher","first-page":"828","DOI":"10.1109\/TCSVT.2016.2543039","volume":"27","author":"R Mekuria","year":"2017","unstructured":"Mekuria R, Blom K, Cesar P (2017) Design, implementation, and evaluation of a point cloud codec for tele-immersive video. IEEE Trans Circuits Syst Video Technol 27(4):828\u2013842","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"594_CR55","doi-asserted-by":"crossref","unstructured":"Meng Z, Zhao F, He M (2006) The just noticeable difference of noise length and reverberation perception. 
In: Proceedings of the international symposium on communications and information technologies, Bangkok, Thailand","DOI":"10.1109\/ISCIT.2006.339980"},{"key":"594_CR56","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1007\/s10055-016-0288-6","volume":"20","author":"R Menzies","year":"2016","unstructured":"Menzies R, Rogers SJ, Phillips AM, Chiarovano E, Waele C, Verstraten F, MacDougall H (2016) An objective measure for the visual fidelity of virtual reality and the risks of falls in a virtual environment. Virt Real 20:173\u2013181","journal-title":"Virt Real"},{"key":"594_CR57","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1007\/s10055-019-00382-0","volume":"24","author":"S Narayanan","year":"2020","unstructured":"Narayanan S, Polys N, Bukvic I (2020) Cinemacraft: exploring fidelity cues in collaborative virtual world interactions. Virt Real 24:53\u201373","journal-title":"Virt Real"},{"key":"594_CR58","unstructured":"Neidhardt A, Tommy AI, Pereppadan AD (2018) Plausibility of an interactive approaching motion towards a virtual sound source based on simplified BRIR sets. In: Proceedings of the 144th AES convention, Milan, Italy"},{"key":"594_CR59","doi-asserted-by":"crossref","unstructured":"Newcombe R, Izadi S, Hilliges O, Molyneaux D, Kim D, Davison A, Kohli P, Shotton J, Hodges S, Fitzgibbon A (2011) Kinectfusion: real-time dense surface mapping and tracking. In: Proceedings of ISMAR","DOI":"10.1109\/ISMAR.2011.6092378"},{"key":"594_CR60","unstructured":"Morgado P, Vasconcelos N, Langlois T, Wang O (2018) Self-supervised generation of spatial audio for 360\u00b0 video. In: Proceedings of NIPS"},{"issue":"2","key":"594_CR61","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1109\/TCSVT.2014.2335832","volume":"25","author":"X Peng","year":"2015","unstructured":"Peng X, Bennamoun M, Wang Q, Ma Q, Xu Z (2015) A low-cost implementation of a 360\u00b0 vision distributed aperture system. 
IEEE Trans Circuits Syst Video Technol 25(2):225\u2013238","journal-title":"IEEE Trans Circuits Syst Video Technol"},{"key":"594_CR62","unstructured":"Politis A, Tervo S, Lokki T, Pulkki V (2018) Parametric multidirectional decomposition of microphone recordings for broadband high-order ambisonic encoding. In: Proceedings of the 144th AES convention"},{"key":"594_CR63","doi-asserted-by":"crossref","unstructured":"Pollard KA, Oiknine AH, Files BT, Sinatra AM, Patton D, Ericson M, Thomas J, Khooshabeh P (2020) Level of immersion affects spatial learning in virtual environments: results of a three-condition within-subjects study with long intersession intervals. Virt Real 1\u201314","DOI":"10.1007\/s10055-019-00411-y"},{"key":"594_CR64","doi-asserted-by":"publisher","first-page":"161","DOI":"10.1007\/s10055-015-0275-3","volume":"19","author":"BNJ Postma","year":"2015","unstructured":"Postma BNJ, Katz B (2015) Creation and calibration method of acoustical models for historic virtual reality auralizations. Virt Real 19:161\u2013180","journal-title":"Virt Real"},{"key":"594_CR65","unstructured":"Remaggi L, Jackson PJB, Coleman P (2015) Estimation of room reflection parameters for a reverberant spatial audio object. In: Proceedings of the 138th AES convention"},{"key":"594_CR66","unstructured":"Remaggi L, Kim H, Neidhardt A, Hilton A, Jackson PJ (2019) Perceived quality and spatial impression of room reverberation in VR reproduction from measured images and acoustics. In: Proceedings of ICA"},{"key":"594_CR67","unstructured":"Ricoh: Ricoh theta v (2019). https:\/\/theta360.com\/en\/about\/theta\/v.html"},{"key":"594_CR68","unstructured":"Robotham T, Rummukainen O, Herre J, Habets EAP (2018) Online vs offline multiple stimulus audio quality evaluation for virtual reality. 
In: Proceedings of the 145th AES convention, New York, USA"},{"key":"594_CR69","doi-asserted-by":"crossref","unstructured":"Ronneberger O, Fischer P, Brox,T (2015) U-Net: convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells WM, Frangi AF (eds) Proceedings of MICCAI, pp 234\u2013241","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"594_CR70","doi-asserted-by":"crossref","DOI":"10.1007\/978-1-4939-0755-7","volume-title":"Springer handbook of acoustics","author":"TD Rossing","year":"2014","unstructured":"Rossing TD (2014) Springer handbook of acoustics, 2nd edn. Springer, Berlin","edition":"2"},{"key":"594_CR71","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1007\/BF02009728","volume":"1","author":"D Rossiter","year":"1995","unstructured":"Rossiter D, Baciu G, Horner A (1995) An investigation into the modelling of virtual objects with sound vibration properties. Virt Real 1:117\u2013121","journal-title":"Virt Real"},{"key":"594_CR72","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1097\/00001648-199001000-00010","volume":"1","author":"KJ Rothman","year":"1990","unstructured":"Rothman KJ (1990) No adjustments are needed for multiple comparisons. Epidemiology 1:43\u201346","journal-title":"Epidemiology"},{"key":"594_CR73","doi-asserted-by":"publisher","first-page":"223","DOI":"10.1007\/s10055-015-0274-4","volume":"19","author":"D Ruminski","year":"2015","unstructured":"Ruminski D (2015) An experimental study of spatial sound usefulness in searching and navigating through AR environments. Virt Real 19:223\u2013233","journal-title":"Virt Real"},{"issue":"3","key":"594_CR74","doi-asserted-by":"publisher","first-page":"1246","DOI":"10.1109\/TVCG.2017.2666150","volume":"24","author":"C Schissler","year":"2018","unstructured":"Schissler C, Loftin C, Manocha D (2018) Acoustic classification and optimization for multi-modal rendering of real-world scenes. 
IEEE Trans Visual Comput Gr 24(3):1246\u20131259","journal-title":"IEEE Trans Visual Comput Gr"},{"key":"594_CR75","doi-asserted-by":"crossref","unstructured":"Silberman N, Hoiem D, Kohli P, Fergus R (2012) Indoor segmentation and support inference from RGBD images. In: Fitzgibbon A, Lazebnik S, Perona P, Sato Y, Schmid C (eds) Proceedings of ECCV, pp 746\u2013760","DOI":"10.1007\/978-3-642-33715-4_54"},{"key":"594_CR76","unstructured":"Smith LN (2018) A disciplined approach to neural network hyper-parameters: part 1\u2014learning rate, batch size, momentum, and weight decay. CoRR abs\/1803.09820"},{"key":"594_CR77","doi-asserted-by":"crossref","unstructured":"Song M, Watanabe H, Hara J (2018) Robust 3d reconstruction with omni-directional camera based on structure from motion. In: Proceedings of IWAIT, pp 1\u20134","DOI":"10.1109\/IWAIT.2018.8369715"},{"key":"594_CR78","doi-asserted-by":"crossref","unstructured":"Song S, Yu F, Zeng A, Chang AX, Savva M, Funkhouser T (2017) Semantic scene completion from a single depth image. In: Proceedings of CVPR","DOI":"10.1109\/CVPR.2017.28"},{"key":"594_CR79","doi-asserted-by":"crossref","unstructured":"Song S, Zeng A, Chang AX, Savva M, Savarese S, Funkhouser T (2018) Im2Pano3D: extrapolating 360\u00b0 structure and semantics beyond the field of view. In: Proceedings of CVPR","DOI":"10.1109\/CVPR.2018.00405"},{"issue":"4","key":"594_CR80","first-page":"249","volume":"50","author":"GB Stan","year":"2002","unstructured":"Stan GB, Embrechts JJ, Archambeau D (2002) Comparison of different impulse response measurement techniques. J Audio Eng Soc 50(4):249\u2013262","journal-title":"J Audio Eng Soc"},{"issue":"4","key":"594_CR81","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1016\/0169-7439(89)80095-4","volume":"6","author":"L Sthle","year":"1989","unstructured":"Sthle L, Wold S (1989) Analysis of variance (anova). 
Chemom Intell Lab Syst 6(4):259\u2013272","journal-title":"Chemom Intell Lab Syst"},{"key":"594_CR82","doi-asserted-by":"crossref","unstructured":"Student (1908) The probable error of a mean. Biometrika 6:1\u201325","DOI":"10.2307\/2331554"},{"issue":"1\/2","key":"594_CR83","first-page":"17","volume":"61","author":"S Tervo","year":"2013","unstructured":"Tervo S, Patynen J, Kuusinen A, Lokki T (2013) Spatial decomposition method for room impulse responses. J Audio Eng Soc 61(1\/2):17\u201328","journal-title":"J Audio Eng Soc"},{"key":"594_CR84","doi-asserted-by":"publisher","first-page":"189","DOI":"10.1016\/j.patrec.2013.07.003","volume":"36","author":"M Turk","year":"2014","unstructured":"Turk M (2014) Multimodal interaction: a review. Pattern Recogn Lett 36:189\u2013195","journal-title":"Pattern Recogn Lett"},{"key":"594_CR85","unstructured":"Unity (2019). https:\/\/unity.com\/"},{"issue":"5","key":"594_CR86","doi-asserted-by":"publisher","first-page":"1421","DOI":"10.1109\/TASL.2012.2189567","volume":"20","author":"V Valimaki","year":"2012","unstructured":"Valimaki V, Parker J, Savioja L, Smith J, Abel J (2012) Fifty years of artificial reverberation. IEEE Trans Audio Speech Lang Process 20(5):1421\u20131448","journal-title":"IEEE Trans Audio Speech Lang Process"},{"key":"594_CR87","unstructured":"Vorl\u00e4nder M (1995) International round robin on room acoustical computer simulations. In: Proceedings of the 15th ICA, Trondheim, Norway"},{"key":"594_CR88","doi-asserted-by":"crossref","unstructured":"Zhang J, Zhao H, Yao A, Chen Y, Zhang L, Liao H (2018) Efficient semantic scene completion network with spatial group convolution. 
In: Proceedings of ECCV","DOI":"10.1007\/978-3-030-01258-8_45"}],"container-title":["Virtual Reality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10055-021-00594-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10055-021-00594-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10055-021-00594-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,11,12]],"date-time":"2023-11-12T00:21:51Z","timestamp":1699748511000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10055-021-00594-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,30]]},"references-count":88,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,9]]}},"alternative-id":["594"],"URL":"https:\/\/doi.org\/10.1007\/s10055-021-00594-3","relation":{},"ISSN":["1359-4338","1434-9957"],"issn-type":[{"value":"1359-4338","type":"print"},{"value":"1434-9957","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,30]]},"assertion":[{"value":"9 October 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"4 October 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 October 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}