{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:17:59Z","timestamp":1750220279848,"version":"3.41.0"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2021,9,22]],"date-time":"2021-09-22T00:00:00Z","timestamp":1632268800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Comput. Graph. Interact. Tech."],"published-print":{"date-parts":[[2021,9,22]]},"abstract":"<jats:p>We model acoustic perception in AI agents efficiently within complex scenes with many sound events. The key idea is to employ perceptual parameters that capture how each sound event propagates through the scene to the agent's location. This naturally conforms virtual perception to human. We propose a simplified auditory masking model that limits localization capability in the presence of distracting sounds. We show that anisotropic reflections as well as the initial sound serve as useful localization cues. Our system is simple, fast, and modular and obtains natural results in our tests, letting agents navigate through passageways and portals by sound alone, and anticipate or track occluded but audible targets. Source code is provided.<\/jats:p>","DOI":"10.1145\/3480139","type":"journal-article","created":{"date-parts":[[2021,9,28]],"date-time":"2021-09-28T04:43:36Z","timestamp":1632804216000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Efficient acoustic perception for virtual AI agents"],"prefix":"10.1145","volume":"4","author":[{"given":"Mike","family":"Chemistruck","sequence":"first","affiliation":[{"name":"Microsoft Mixed Reality, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andrew","family":"Allen","sequence":"additional","affiliation":[{"name":"Microsoft Mixed Reality, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"John","family":"Snyder","sequence":"additional","affiliation":[{"name":"Microsoft Research, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nikunj","family":"Raghuvanshi","sequence":"additional","affiliation":[{"name":"Microsoft Research, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,9,27]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/6391.001.0001"},{"key":"#cr-split#-e_1_2_2_2_1.1","doi-asserted-by":"crossref","unstructured":"Chang'an Chen Unnat Jain Carl Schissler S. V. A. Gar\u00ed Ziad Al-Halah Vamsi K. Ithapu Philip Robinson and K. Grauman. 2020. SoundSpaces: Audio-Visual Navigation in 3D Environments. In ECCV. https:\/\/doi.org\/10.1007\/978-3-030-58539-6_2 10.1007\/978-3-030-58539-6_2","DOI":"10.1007\/978-3-030-58539-6_2"},{"key":"#cr-split#-e_1_2_2_2_1.2","doi-asserted-by":"crossref","unstructured":"Chang'an Chen Unnat Jain Carl Schissler S. V. A. Gar\u00ed Ziad Al-Halah Vamsi K. Ithapu Philip Robinson and K. Grauman. 2020. SoundSpaces: Audio-Visual Navigation in 3D Environments. In ECCV. https:\/\/doi.org\/10.1007\/978-3-030-58539-6_2","DOI":"10.1007\/978-3-030-58539-6_2"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.1907229"},{"volume-title":"Audio Engineering Society Conference: 2020 AES International Conference on Audio for Virtual and Augmented Reality. Audio Engineering Society.","author":"Cowan Brent","key":"e_1_2_2_4_1","unstructured":"Brent Cowan , Bill Kapralos , and K. C. Collins . 2020. Realistic Auditory Artificial Intelligence: Spatial Sound Modelling to Provide NPCs with Sound Perception . In Audio Engineering Society Conference: 2020 AES International Conference on Audio for Virtual and Augmented Reality. Audio Engineering Society. Brent Cowan, Bill Kapralos, and K. C. Collins. 2020. Realistic Auditory Artificial Intelligence: Spatial Sound Modelling to Provide NPCs with Sound Perception. In Audio Engineering Society Conference: 2020 AES International Conference on Audio for Virtual and Augmented Reality. Audio Engineering Society."},{"key":"e_1_2_2_5_1","volume-title":"2021 International Conference on Machine Learning.","author":"Devlin Sam","year":"2021","unstructured":"Sam Devlin , Raluca Georgescu , Ida Momennejad , Jaroslaw Rzepecki , Evelyn Zuniga , Gavin Costello , Guy Leroy , Ali Shaw , and Katja Hofmann . 2021 . Navigation Turing Test (NTT): Learning to Evaluate Human-like Navigation . In 2021 International Conference on Machine Learning. Sam Devlin, Raluca Georgescu, Ida Momennejad, Jaroslaw Rzepecki, Evelyn Zuniga, Gavin Costello, Guy Leroy, Ali Shaw, and Katja Hofmann. 2021. Navigation Turing Test (NTT): Learning to Evaluate Human-like Navigation. In 2021 International Conference on Machine Learning."},{"volume-title":"Acoustics in Halls for Speech and Music","author":"Gade Anders","key":"e_1_2_2_6_1","unstructured":"Anders Gade . 2007. Acoustics in Halls for Speech and Music . In Springer Handbook of Acoustics (two thousand, seventh ed.), Thomas Rossing (Ed.). Springer , Chapter 9. Anders Gade. 2007. Acoustics in Halls for Speech and Music. In Springer Handbook of Acoustics (two thousand, seventh ed.), Thomas Rossing (Ed.). Springer, Chapter 9."},{"volume-title":"Proceedings of the 12th ACM SIGGRAPH\/Eurographics Symposium on Computer Animation (SCA '13)","author":"Huang Pengfei","key":"e_1_2_2_7_1","unstructured":"Pengfei Huang , Mubbasir Kapadia , and Norman I. Badler . 2013. SPREAD: Sound Propagation and Perception for Autonomous Agents in Dynamic Environments . In Proceedings of the 12th ACM SIGGRAPH\/Eurographics Symposium on Computer Animation (SCA '13) . Association for Computing Machinery, New York, NY, USA, 135--144. https:\/\/doi.org\/10.1145\/2485895.2485911 10.1145\/2485895.2485911 Pengfei Huang, Mubbasir Kapadia, and Norman I. Badler. 2013. SPREAD: Sound Propagation and Perception for Autonomous Agents in Dynamic Environments. In Proceedings of the 12th ACM SIGGRAPH\/Eurographics Symposium on Computer Animation (SCA '13). Association for Computing Machinery, New York, NY, USA, 135--144. https:\/\/doi.org\/10.1145\/2485895.2485911"},{"volume-title":"16th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment","author":"Jacob Mikhail","key":"e_1_2_2_8_1","unstructured":"Mikhail Jacob , Sam Devlin , and Katja Hofmann . 2020. \" It's Unwieldy and It Takes a Lot of Time .\" Challenges and Opportunities for Creating Agents in Commercial Games . In 16th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment . Association for the Advancement of Artificial Intelligence (AAAI), Association for the Advancement of Artificial Intelligence (AAAI). Mikhail Jacob, Sam Devlin, and Katja Hofmann. 2020. \"It's Unwieldy and It Takes a Lot of Time.\" Challenges and Opportunities for Creating Agents in Commercial Games. In 16th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment. Association for the Advancement of Artificial Intelligence (AAAI), Association for the Advancement of Artificial Intelligence (AAAI)."},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.427914"},{"key":"e_1_2_2_10_1","first-page":"3","article-title":"Sound Localization in Noise in Normal-Hearing Listeners","volume":"105","author":"Lorenzi Christian","year":"1999","unstructured":"Christian Lorenzi , Stuart Gatehouse , and Catherine Lever . 1999 . Sound Localization in Noise in Normal-Hearing Listeners . The Journal of the Acoustical Society of America 105 , 3 (March 1999), 1810--1820. https:\/\/doi.org\/10.1121\/1.426719 10.1121\/1.426719 Christian Lorenzi, Stuart Gatehouse, and Catherine Lever. 1999. Sound Localization in Noise in Normal-Hearing Listeners. The Journal of the Acoustical Society of America 105, 3 (March 1999), 1810--1820. https:\/\/doi.org\/10.1121\/1.426719","journal-title":"The Journal of the Acoustical Society of America"},{"key":"e_1_2_2_11_1","unstructured":"Microsoft Corp. 2018. Project Acoustics. https:\/\/aka.ms\/acoustics.  Microsoft Corp. 2018. Project Acoustics. https:\/\/aka.ms\/acoustics."},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.1909553"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1185657.1185832"},{"volume-title":"Computer Vision - ECCV 2016 (Lecture Notes in Computer Science)","author":"Owens Andrew","key":"e_1_2_2_14_1","unstructured":"Andrew Owens , Jiajun Wu , Josh H. McDermott , William T. Freeman , and Antonio Torralba . 2016. Ambient Sound Provides Supervision for Visual Learning . In Computer Vision - ECCV 2016 (Lecture Notes in Computer Science) , Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). Springer International Publishing , Cham , 801--816. https:\/\/doi.org\/10.1007\/978-3-319-46448-0_48 10.1007\/978-3-319-46448-0_48 Andrew Owens, Jiajun Wu, Josh H. McDermott, William T. Freeman, and Antonio Torralba. 2016. Ambient Sound Provides Supervision for Visual Learning. In Computer Vision - ECCV 2016 (Lecture Notes in Computer Science), Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling (Eds.). Springer International Publishing, Cham, 801--816. https:\/\/doi.org\/10.1007\/978-3-319-46448-0_48"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.842996"},{"key":"e_1_2_2_16_1","first-page":"6","article-title":"Virtual Sound Source Positioning Using Vector Base Amplitude Panning","volume":"45","author":"Pulkki Ville","year":"1997","unstructured":"Ville Pulkki . 1997 . Virtual Sound Source Positioning Using Vector Base Amplitude Panning . Journal of the Audio Engineering Society 45 , 6 (June 1997), 456--466. Ville Pulkki. 1997. Virtual Sound Source Positioning Using Vector Base Amplitude Panning. Journal of the Audio Engineering Society 45, 6 (June 1997), 456--466.","journal-title":"Journal of the Audio Engineering Society"},{"key":"e_1_2_2_17_1","doi-asserted-by":"crossref","unstructured":"Nikunj Raghuvanshi and John Snyder. 2018. Parametric Directional Coding for Precomputed Sound Propagation. ACM Trans. Graph. (2018).  Nikunj Raghuvanshi and John Snyder. 2018. Parametric Directional Coding for Precomputed Sound Propagation. ACM Trans. Graph. (2018).","DOI":"10.1145\/3197517.3201339"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14099"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.1121\/1.4926438"},{"key":"e_1_2_2_20_1","volume-title":"The Psychoacoustics of Multichannel Audio. In Audio Engineering Society Conference: UK 11th Conference: Audio for New Media (ANM). Audio Engineering Society.","author":"Stuart J. Robert","year":"1996","unstructured":"J. Robert Stuart . 1996 . The Psychoacoustics of Multichannel Audio. In Audio Engineering Society Conference: UK 11th Conference: Audio for New Media (ANM). Audio Engineering Society. J. Robert Stuart. 1996. The Psychoacoustics of Multichannel Audio. In Audio Engineering Society Conference: UK 11th Conference: Audio for New Media (ANM). Audio Engineering Society."},{"volume-title":"Proceedings of the 18th Meeting of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D '14)","author":"Wang Yu","key":"e_1_2_2_21_1","unstructured":"Yu Wang , Mubbasir Kapadia , Pengfei Huang , Ladislav Kavan , and Norman I. Badler . 2014. Sound Localization and Multi-Modal Steering for Autonomous Virtual Agents . In Proceedings of the 18th Meeting of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D '14) . Association for Computing Machinery, New York, NY, USA, 23--30. https:\/\/doi.org\/10.1145\/2556700.2556718 10.1145\/2556700.2556718 Yu Wang, Mubbasir Kapadia, Pengfei Huang, Ladislav Kavan, and Norman I. Badler. 2014. Sound Localization and Multi-Modal Steering for Autonomous Virtual Agents. In Proceedings of the 18th Meeting of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D '14). Association for Computing Machinery, New York, NY, USA, 23--30. https:\/\/doi.org\/10.1145\/2556700.2556718"},{"key":"e_1_2_2_22_1","article-title":"Ambient Sound Propagation. ACM","author":"Zhang Zechen","year":"2018","unstructured":"Zechen Zhang , Nikunj Raghuvanshi , John Snyder , and Steve Marschner . 2018 . Ambient Sound Propagation. ACM Trans. Graph. 6 ( Nov. 2018). https:\/\/doi.org\/10.1145\/3272127.3275100 10.1145\/3272127.3275100 Zechen Zhang, Nikunj Raghuvanshi, John Snyder, and Steve Marschner. 2018. Ambient Sound Propagation. ACM Trans. Graph. 6 (Nov. 2018). https:\/\/doi.org\/10.1145\/3272127.3275100","journal-title":"Trans. Graph. 6"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3355089.3356566"}],"container-title":["Proceedings of the ACM on Computer Graphics and Interactive Techniques"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3480139","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3480139","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:31:16Z","timestamp":1750188676000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3480139"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,22]]},"references-count":24,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2021,9,22]]}},"alternative-id":["10.1145\/3480139"],"URL":"https:\/\/doi.org\/10.1145\/3480139","relation":{},"ISSN":["2577-6193"],"issn-type":[{"type":"electronic","value":"2577-6193"}],"subject":[],"published":{"date-parts":[[2021,9,22]]},"assertion":[{"value":"2021-09-27","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}