{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T20:37:40Z","timestamp":1776976660719,"version":"3.51.4"},"reference-count":76,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2026,3,10]],"date-time":"2026-03-10T00:00:00Z","timestamp":1773100800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Virtual Real."],"abstract":"<jats:sec>\n                    <jats:title>Introduction<\/jats:title>\n                    <jats:p>Cinematic VR (CVR) removes the director\u2019s frame, creating the challenge of guiding audience attention without breaking immersion. This systematic review synthesizes empirical evidence on two audio modalities with strong potential to function as narrative agents\u2014diegetic audio (sounds from within the story world) and object-based spatial audio (discrete sound \u201cobjects\u201d rendered with positional metadata)\u2014to clarify how they guide attention, shape affect and presence, and how these effects are validated.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>Following Preferred Reporting Items for Systematic Reviews and Meta\u2010Analyses (PRISMA) 2020, searches in IEEE Xplore, ACM Digital Library, Scopus, and Web of Science (last searched June 2025) identified studies using diegetic and\/or object\u2010based spatial audio as narrative devices in CVR with empirical user data; non-diegetic\u2010only or purely technical papers without user measures were excluded (except where used as baselines). 
We conducted a qualitative synthesis across behavioral (head\/eye tracking), subjective (presence\/engagement), and physiological (HR\/EMG\/EDA\/PPG) measures; no protocol registration was performed.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>Eighteen studies met inclusion criteria. Across studies, world-locked, off-screen diegetic cues were repeatedly reported to redirect gaze and shorten time-to-region-of-interest after cuts, while object-based rendering enabled precise, dynamic cue placement and was commonly associated with higher presence\/immersion and affective arousal relative to non-spatial or head-locked baselines.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion<\/jats:title>\n                    <jats:p>Evidence remains constrained by methodological heterogeneity, small-to-moderate samples, inconsistent reporting, and limited direct measures of narrative comprehension. 
We contribute (i) a functional taxonomy of CVR narrative audio techniques aligned to diegetic\/object-based practice, (ii) a Validation Triangulation Framework integrating behavioral, subjective, and physiological evidence, and (iii) a Minimum Reporting &amp; Sharing Standard for CVR Narrative Audio specifying what to report and how to share data\/metadata, aligned with PRISMA guidance and Findable, Accessible, Interoperable, Reusable (FAIR) principles.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.3389\/frvir.2026.1696677","type":"journal-article","created":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T22:10:09Z","timestamp":1773094209000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Diegetic and object-based spatial audio in cinematic VR: a PRISMA-Guided systematic review with a functional taxonomy and validation framework"],"prefix":"10.3389","volume":"7","author":[{"given":"Vimala","family":"Perumal","sequence":"first","affiliation":[{"name":"Multimedia University","place":["Cyberjaya, Malaysia"]}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zeeshan Jawed","family":"Shah","sequence":"additional","affiliation":[{"name":"Multimedia University","place":["Cyberjaya, Malaysia"]},{"name":"Prince Mohammad Bin Fahd University","place":["Dhahran, Saudi Arabia"]}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1965","published-online":{"date-parts":[[2026,3,10]]},"reference":[{"key":"B1","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1016\/j.cub.2004.01.029","article-title":"The ventriloquist effect results from near-optimal bimodal integration","volume":"14","author":"Alais","year":"2004","journal-title":"Curr. 
Biol."},{"key":"B2","first-page":"84","article-title":"Construction of the literature graph in Semantic Scholar","volume-title":"Proceedings of the 2018 conference of the North American chapter of the association for Computational Linguistics: Demonstrations","author":"Ammar","year":"2018"},{"key":"B3","unstructured":"AES E-Library"},{"key":"B4","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3290605.3300925","article-title":"When the elephant trumps: a comparative study on spatial audio for orientation in 360\u00b0 videos","volume-title":"Proceedings of the 2019 CHI conference on human factors in computing systems","author":"Bala","year":"2019"},{"key":"B5","doi-asserted-by":"publisher","first-page":"80","DOI":"10.1016\/j.jneumeth.2010.04.028","article-title":"A continuous measure of phasic electrodermal activity","volume":"190","author":"Benedek","year":"2010","journal-title":"J. Neurosci. Methods"},{"key":"B6","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1525\/fq.2002.55.3.16","article-title":"Intensified continuity","volume":"55","author":"Bordwell","year":"2002","journal-title":"Film. Q."},{"key":"B76","first-page":"2","article-title":"The nuts and bolts of PROSPERO: an international prospective register of systematic reviews","volume-title":"Syst. 
Rev.","author":"Booth","year":"2012"},{"key":"B7","volume-title":"Film art: an introduction","author":"Bordwell","year":"2017"},{"key":"B8","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/1486.001.0001","volume-title":"Auditory scene analysis: the perceptual organization of sound","author":"Bregman","year":"1990"},{"key":"B9","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1080\/15213260903287259","article-title":"Measuring narrative engagement","volume":"12","author":"Busselle","year":"2009","journal-title":"Media Psychol."},{"key":"B10","volume-title":"Handbook of psychophysiology","author":"Cacioppo","year":"2007"},{"key":"B11","doi-asserted-by":"crossref","first-page":"489","DOI":"10.7551\/mitpress\/3422.001.0001","article-title":"Multisensory integration: methodological approaches and neuroanatomical substrates","volume-title":"The handbook of multisensory processes","author":"Calvert","year":"2004"},{"key":"B12","doi-asserted-by":"publisher","first-page":"l6890","DOI":"10.1136\/bmj.l6890","article-title":"Synthesis without meta-analysis (SWiM) in systematic reviews: reporting guideline","volume":"368","author":"Campbell","year":"2020","journal-title":"BMJ"},{"key":"B13","unstructured":"Registered reports\n          \n          \n          2025"},{"key":"B14","doi-asserted-by":"publisher","first-page":"609","DOI":"10.1016\/j.cortex.2012.12.016","article-title":"Registered reports: a new publishing initiative at Cortex","volume":"49","author":"Chambers","year":"2013","journal-title":"Cortex"},{"key":"B15","doi-asserted-by":"publisher","first-page":"359","DOI":"10.1002\/asi.20317","article-title":"CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature","volume":"57","author":"Chen","year":"2006","journal-title":"J. Am. Soc. Inf. Sci. Technol."},{"key":"B16","doi-asserted-by":"crossref","DOI":"10.7312\/chio18588","volume-title":"Audio-vision: sound on screen (C. 
Gorbman, Trans.)","author":"Chion","year":"2019"},{"key":"B17","doi-asserted-by":"publisher","first-page":"43","DOI":"10.29173\/jchla29699","article-title":"Product review: researchrabbit","volume":"44","author":"Cole","year":"2023","journal-title":"J. Can. Health Libr. Assoc."},{"key":"B18","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/7909.001.0001","volume-title":"Game sound: an introduction to the history, theory, and practice of video game music and sound design","author":"Collins","year":"2008"},{"key":"B19","volume-title":"Dolby atmos: next-generation audio for cinema","year":"2012"},{"key":"B20","doi-asserted-by":"publisher","first-page":"245","DOI":"10.1016\/S0959-4388(98)80147-5","article-title":"Crossmodal attention","volume":"8","author":"Driver","year":"1998","journal-title":"Curr. Opin. Neurobiol."},{"key":"B21","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1007\/978-3-030-49760-6_29","article-title":"Analyzing the user experience of virtual reality storytelling with visual and aural stimuli","volume-title":"Lecture notes in computer science: vol. 12201. Design, user experience, and usability. 
Design for contemporary interactive environments","author":"Dumlu","year":"2020"},{"key":"B22","unstructured":"About elicit\/reliability and sources\n          \n          \n          2025"},{"key":"B23","unstructured":"EBU Tech Report 045: NGA deployment \u2013 the path to Next Generation Audio for broadcasters\n          \n          \n          2019"},{"key":"B24","unstructured":"ADM (Audio Definition Model) Guidelines (web resource)\n          \n          \n          2025"},{"key":"B25","article-title":"Cognitive indicators for acoustic source localization and presence in a vivid 3D scene","volume-title":"Proceedings of the 23rd international congress on acoustics","author":"Flore","year":"2019"},{"key":"B26","unstructured":"MPEG-H audio \u2013 production tools and authoring (MHAPi)\n          \n          \n            \n              Fraunhofer\n              I. I. S."},{"key":"B27","doi-asserted-by":"crossref","DOI":"10.26503\/dl.v2007i1.313","article-title":"Situating gaming as a sonic experience: the acoustic ecology of first-person shooters","volume-title":"Proceedings of DiGRA 2007: situated play","author":"Grimshaw","year":"2007"},{"key":"B28","first-page":"1","article-title":"Evaluating the effect of cinematography on the viewing experience in immersive environment","volume-title":"2022 IEEE international conference on multimedia and expo (ICME)","author":"Han","year":"2022"},{"key":"B29","doi-asserted-by":"publisher","first-page":"94","DOI":"10.14746\/i.2024.37.46.23","article-title":"Cinematic virtual reality: the paradox of the omniscient viewer in omnidirectional space versus artistic authorial control","volume":"2024","author":"Haqshenas","year":"2024","journal-title":"Stud. 
Filmowe i Telew."},{"key":"B30","volume-title":"The anatomy of the orchestra: a VR experience","author":"Hazelwood","year":"2018"},{"key":"B31","doi-asserted-by":"crossref","DOI":"10.1117\/12.2626118","article-title":"A study of multimodal head\/eye orientation prediction techniques in virtual space","volume-title":"Proceedings of SPIE 12177, International Workshop on Advanced Imaging Technology (IWAIT) 2022","author":"Higuchi","year":"2022"},{"key":"B32","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3650208","article-title":"A quality of experience and visual attention evaluation for 360\u00b0 videos with non-spatial and spatial audio","volume":"20","author":"Hirway","year":"2024","journal-title":"ACM Trans. Multimedia Comput. Commun. Appl."},{"key":"B33","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/SIVE.2017.7901610","article-title":"Binaural sound reduces reaction time in a virtual reality search task","volume-title":"2017 IEEE 3rd VR workshop on sonic interactions for virtual environments (SIVE)","author":"Hoeg","year":"2017"},{"key":"B34","volume-title":"Surround sound: up and running","author":"Holman","year":"2008"},{"key":"B35","volume-title":"ICAD proceedings","year":""},{"key":"B36","volume-title":"Recommendation ITU-R BS.2094-1: common definitions for the audio definition model","year":"2017"},{"key":"B37","volume-title":"Recommendation ITU-R BS.2076-2: audio definition model (ADM)","year":"2019"},{"key":"B38","volume-title":"Recommendation ITU-R BS.2076-3: audio definition model (ADM)","year":"2025"},{"key":"B39","unstructured":"Recommendation ITU-R BS.2168-0: requirements for the use of ADM metadata in next generation audio (PDF)\n          \n          \n          2025"},{"key":"B40","doi-asserted-by":"crossref","DOI":"10.7551\/mitpress\/9780262026864.001.0001","volume-title":"Gameworld 
interfaces","author":"J\u00f8rgensen","year":"2013"},{"key":"B41","doi-asserted-by":"publisher","first-page":"213","DOI":"10.3389\/fpsyg.2017.00213","article-title":"Heart rate variability and cardiac vagal tone in psychophysiological research\u2014Recommendations for experiment planning, data analysis, and data reporting","volume":"8","author":"Laborde","year":"2017","journal-title":"Front. Psychol."},{"key":"B42","doi-asserted-by":"publisher","first-page":"1086","DOI":"10.3389\/fpsyg.2015.01086","article-title":"Audiovisual crossmodal cuing effects in front and rear space","volume":"6","author":"Lee","year":"2015","journal-title":"Front. Psychol."},{"key":"B43","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1109\/VR.2017.7892230","article-title":"Cinematic virtual reality: evaluating the effect of display type on the viewing experience for panoramic video","volume-title":"2017 IEEE Virtual Reality (VR)","author":"MacQuarrie","year":"2017"},{"key":"B44","doi-asserted-by":"publisher","first-page":"79","DOI":"10.1109\/MCG.2021.3063228","article-title":"Influence of directional sound cues on users' exploration across 360\u00b0 movie cuts","volume":"41","author":"Masi\u00e1","year":"2021","journal-title":"IEEE Comput. Graph. Appl."},{"key":"B45","doi-asserted-by":"publisher","first-page":"40","DOI":"10.1016\/j.jclinepi.2016.01.021","article-title":"PRESS Peer Review of Electronic Search Strategies: 2015 guideline statement","volume":"75","author":"McGowan","year":"2016","journal-title":"J. Clin. 
Epidemiol."},{"key":"B46","article-title":"The exit sign: off-screen, in-theater","volume":"14","year":"2019","journal-title":"Media Fields J"},{"key":"B47","doi-asserted-by":"publisher","first-page":"e1000217","DOI":"10.1371\/journal.pmed.1000217","article-title":"Guidance for developers of health research reporting guidelines","volume":"7","author":"Moher","year":"2010","journal-title":"PLoS Med."},{"key":"B48","volume-title":"An introduction to the psychology of hearing","author":"Moore","year":"2012"},{"key":"B49","doi-asserted-by":"publisher","first-page":"154","DOI":"10.1016\/S0926-6410(03)00089-2","article-title":"Auditory capture of vision: examining temporal ventriloquism","volume":"17","author":"Morein-Zamir","year":"2003","journal-title":"Cognitive Brain Res."},{"key":"B50","first-page":"135","article-title":"Surround sound\u2014The chaos continues","volume":"26","author":"Newell","year":"2004","journal-title":"Proc. Inst. Acoust."},{"key":"B51","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1145\/2993369.2993405","article-title":"Missing the point: an exploration of how to guide users' attention during cinematic virtual reality","volume-title":"Proceedings of the 22nd ACM conference on virtual reality software and technology","author":"Nielsen","year":"2016"},{"key":"B52","doi-asserted-by":"publisher","first-page":"n71","DOI":"10.1136\/bmj.n71","article-title":"The PRISMA 2020 statement: an updated guideline for reporting systematic reviews","volume":"372","author":"Page","year":"2021","journal-title":"BMJ"},{"key":"B53","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3389\/fpsyg.2013.00086","article-title":"Achieving presence through evoked reality: a review of presence definitions and factors","volume":"4","author":"Pillai","year":"2013","journal-title":"Front. 
Psychol."},{"key":"B54","doi-asserted-by":"crossref","first-page":"129","DOI":"10.1109\/VR.2013.6549396","article-title":"Integration of spatial sound in immersive virtual environments: an experimental study on effects of spatial sound on presence","volume-title":"2013 IEEE virtual reality (VR)","author":"Poeschl","year":"2013"},{"key":"B55","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2205.01833","article-title":"OpenAlex: a fully-open index of scholarly works, authors, venues, institutions, and concepts","author":"Priem","year":"2022","journal-title":"arXiv"},{"key":"B56","doi-asserted-by":"publisher","first-page":"174","DOI":"10.5195\/jmla.2021.962","article-title":"PRISMA-S: an extension to the PRISMA statement for reporting literature searches in systematic reviews","volume":"109","author":"Rethlefsen","year":"2021","journal-title":"J. Med. Libr. Assoc."},{"key":"B57","doi-asserted-by":"publisher","first-page":"11547","DOI":"10.1038\/s41598-020-68253-2","article-title":"Engagement in video and audio narratives: contrasting self-report and physiological measures","volume":"10","author":"Richardson","year":"2020","journal-title":"Sci. Rep."},{"key":"B58","first-page":"101","article-title":"Guiding the viewer in cinematic virtual reality by diegetic cues","volume-title":"Lecture notes in computer science: vol. 10850. 
Augmented reality, virtual reality, and computer graphics","author":"Rothe","year":""},{"key":"B59","first-page":"1","article-title":"Spatial statistics for analyzing data in cinematic virtual reality","volume-title":"Proceedings of the 2018 international conference on advanced visual interfaces","author":"Rothe","year":""},{"key":"B60","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3139131.3143421","article-title":"Diegetic cues for guiding the viewer in cinematic virtual reality: extended abstract","volume-title":"Proceedings of the 23rd ACM symposium on virtual reality software and technology","author":"Rothe","year":"2017"},{"key":"B61","volume-title":"Spatial audio","author":"Rumsey","year":"2001"},{"key":"B62","doi-asserted-by":"crossref","first-page":"29","DOI":"10.1049\/ibc.2016.0029","article-title":"Directing attention in 360-degree video","volume-title":"IBC 2016 conference","author":"Sheikh","year":"2016"},{"key":"B63","doi-asserted-by":"publisher","first-page":"673","DOI":"10.3758\/BF03193770","article-title":"Visual dominance and attention: the Colavita effect revisited","volume":"69","author":"Sinnett","year":"2007","journal-title":"Percept. Psychophys."},{"key":"B64","unstructured":"SMPTE Digital Library"},{"key":"B65","doi-asserted-by":"publisher","first-page":"400","DOI":"10.1016\/j.tics.2010.06.008","article-title":"The multifaceted interplay between attention and multisensory integration","volume":"14","author":"Talsma","year":"2010","journal-title":"Trends Cognitive Sci."},{"key":"B66","unstructured":"Audio Spatializers in XR [Unity Manual]\n          \n          Unity Manual webpage\n          \n          2023"},{"key":"B67","doi-asserted-by":"publisher","first-page":"1053","DOI":"10.1037\/0096-1523.34.5.1053","article-title":"Pip and pop: nonspatial auditory signals improve spatial visual search","volume":"34","author":"van der Burg","year":"2008","journal-title":"J. Exp. Psychol. Hum. Percept. 
Perform."},{"key":"B68","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1007\/s11192-009-0146-3","article-title":"Software survey: VOSviewer, a computer program for bibliometric mapping","volume":"84","author":"van Eck","year":"2010","journal-title":"Scientometrics"},{"key":"B69","article-title":"Exploring the impact of audio-visual information in omnidirectional videos on user behaviors for virtual reality","author":"Wang","year":"2020","journal-title":"TARA, Trinity Coll. Dublin\u2019s Open Access Repos"},{"key":"B70","doi-asserted-by":"crossref","first-page":"262","DOI":"10.1109\/ISMAR-Adjunct57072.2022.00058","article-title":"Validating the effects of immersion and spatial audio using novel continuous biometric sensor measures for virtual reality","volume-title":"2022 IEEE international symposium on mixed and augmented reality adjunct (ISMAR-Adjunct)","author":"Warp","year":"2022"},{"key":"B71","doi-asserted-by":"publisher","first-page":"160018","DOI":"10.1038\/sdata.2016.18","article-title":"The FAIR guiding principles for scientific data management and stewardship","volume":"3","author":"Wilkinson","year":"2016","journal-title":"Sci. 
Data"},{"key":"B72","first-page":"1","article-title":"Guidelines for snowballing in systematic literature studies and a replication in software engineering","volume-title":"Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering (EASE \u201914)","author":"Wohlin","year":"2014"},{"key":"B73","doi-asserted-by":"crossref","first-page":"1233","DOI":"10.1109\/VR.2019.8798184","article-title":"Motivation to select point of view in cinematic virtual reality","volume-title":"2019 IEEE conference on virtual reality and 3D user interfaces (VR)","author":"Won","year":"2019"},{"key":"B74","first-page":"520","article-title":"Sound-guided framing in cinematic virtual reality \u2013 an eye-tracking study","volume-title":"Lecture notes in computer science: HCI International 2022 \u2013 late Breaking Papers. Interaction in new media, learning and culture","author":"Xue","year":"2022"},{"key":"B75","doi-asserted-by":"crossref","DOI":"10.1007\/978-3-030-17207-7","volume-title":"Ambisonics: a practical 3D audio theory for recording, studio production, sound reinforcement, and virtual reality","author":"Zotter","year":"2019"}],"container-title":["Frontiers in Virtual 
Reality"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frvir.2026.1696677\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,9]],"date-time":"2026-03-09T22:10:17Z","timestamp":1773094217000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frvir.2026.1696677\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,10]]},"references-count":76,"alternative-id":["10.3389\/frvir.2026.1696677"],"URL":"https:\/\/doi.org\/10.3389\/frvir.2026.1696677","relation":{},"ISSN":["2673-4192"],"issn-type":[{"value":"2673-4192","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,10]]},"article-number":"1696677"}}