{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,15]],"date-time":"2026-03-15T03:51:52Z","timestamp":1773546712247,"version":"3.50.1"},"reference-count":64,"publisher":"Association for Computing Machinery (ACM)","issue":"3","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62172241"],"award-info":[{"award-number":["62172241"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2026,3,31]]},"abstract":"<jats:p>\n                    The widespread integration of the Internet of Things with sensors like depth-of-field cameras, LiDAR scanners, and eye-tracking infrared sensors, in head-mounted devices, has ushered in a new era of immersive digital experiences. Full-scene volumetric video (VV), a key innovation in this integration, provides a deeply immersive experience by capturing the richness and detail of the 3D world. However, its massive data volume presents significant streaming challenges. While 3D tile-based viewport approaches have been proposed, they struggle to full-scene VV given the small video buffer limitation, high tile segmentation overhead, and lack of full-scene consideration. In this work, inspired by the advancements of implicit neural radiance field (NeRF), we present\n                    <jats:inline-formula content-type=\"math\/tex\">\n                      <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\({\\mathsf{V}^{2}\\mathsf{NeRF}}\\)<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    , a novel full-scene VV streaming system featured by layered representation. It harmonizes the NeRF with explicit point clouds to represent the static background and dynamic foreground, thereby avoiding large data transfers and achieving photorealistic content representation. To tackle the issues of intensive computation requirements and multiscale adaptation scheduling within\n                    <jats:inline-formula content-type=\"math\/tex\">\n                      <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\({\\mathsf{V}^{2}\\mathsf{NeRF}}\\)<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    system, we propose a lightweight non-visible background removal method and a two-stage decoupled architecture. In addition, an efficient buffer-aware simulated annealing algorithm is developed, alongside the utilization of a perceptually learned metric, to enhance user experience. We further discuss the concerns about practical development and deployment. Extensive prototype evaluations demonstrate\n                    <jats:inline-formula content-type=\"math\/tex\">\n                      <jats:tex-math notation=\"LaTeX\" version=\"MathJax\">\\({\\mathsf{V}^{2}\\mathsf{NeRF}}\\)<\/jats:tex-math>\n                    <\/jats:inline-formula>\n                    \u2019s superior streaming and viewing performance on a wide variety of networks, viewing motions, and scenes. For instance, compared to state-of-the-art approaches, it achieves a 24% increment in perceptual quality, an 83% reduction in rebuffering time, and a 54% enhancement in user experience on average.\n                  <\/jats:p>","DOI":"10.1145\/3728472","type":"journal-article","created":{"date-parts":[[2025,4,10]],"date-time":"2025-04-10T12:00:21Z","timestamp":1744286421000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Implicit Representation-based Volumetric Video Streaming for Photorealistic Full-scene Experience"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7687-8480","authenticated-orcid":false,"given":"Jianxin","family":"Shi","sequence":"first","affiliation":[{"name":"CCS, DISSec, ISN, Nankai University, Tianjin, China and Simon Fraser University, Burnaby, British Columbia, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6126-6142","authenticated-orcid":false,"given":"Miao","family":"Zhang","sequence":"additional","affiliation":[{"name":"Simon Fraser University, Burnaby, British Columbia, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4410-4779","authenticated-orcid":false,"given":"Linfeng","family":"Shen","sequence":"additional","affiliation":[{"name":"Simon Fraser University, Burnaby, British Columbia, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6592-1984","authenticated-orcid":false,"given":"Jiangchuan","family":"Liu","sequence":"additional","affiliation":[{"name":"Simon Fraser University, Burnaby, British Columbia, Canada"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1335-2780","authenticated-orcid":false,"given":"Yuan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Communication University of China, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3063-8887","authenticated-orcid":false,"given":"Lingjun","family":"Pu","sequence":"additional","affiliation":[{"name":"CS, DISSec, ISN, Nankai University, Tianjin, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2173-4076","authenticated-orcid":false,"given":"Jingdong","family":"Xu","sequence":"additional","affiliation":[{"name":"CS, DISSec, ISN, Nankai University, Tianjin, China"}]}],"member":"320","published-online":{"date-parts":[[2026,2,27]]},"reference":[{"key":"e_1_3_2_2_2","first-page":"495","volume-title":"Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI \u201920)","author":"Yan Francis Y.","year":"2020","unstructured":"Francis Y. Yan, Hudson Ayers, Chenzhi Zhu, Sadjad Fouladi, James Hong, Keyi Zhang, Philip Levis, and Keith Winstein. 2020. Learning in situ: A randomized experiment in video streaming. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI \u201920), 495\u2013511."},{"key":"e_1_3_2_3_2","unstructured":"Apple Vision Pro. 2024. Apple\u2019s First Spatial Computer. Retrieved April 10 2024 from https:\/\/www.apple.com\/ca\/apple-vision-pro\/"},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00539"},{"key":"e_1_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1145\/3544494"},{"key":"e_1_3_2_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3592135"},{"key":"e_1_3_2_7_2","unstructured":"Guikun Chen and Wenguan Wang. 2024. A survey on 3D Gaussian splatting. arXiv:2401.03890. Retrieved from https:\/\/arxiv.org\/abs\/2401.03890"},{"key":"e_1_3_2_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3626111.3628184"},{"key":"e_1_3_2_9_2","unstructured":"E. d\u2019Eon B. Harrison T. Myers and P. A. Chou. 2017. 8i voxelized full bodies: A voxelized point cloud dataset. ISO\/IEC JTC1\/SC29 Joint WG11\/WG1 (MPEG\/JPEG) Input Document WG11M40059\/WG1M74006."},{"key":"e_1_3_2_10_2","unstructured":"Federal Communications Commission (FCC). 2023. Measuring Broadband Raw Data Releases. Retrieved December 3 2023 from https:\/\/www.fcc.gov\/oet\/mba\/raw-data-releases\/"},{"key":"e_1_3_2_11_2","unstructured":"Google. 2023. Draco 3D Data Compression. Retrieved October 3 2023 from https:\/\/github.com\/google\/draco"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1017\/ATSIP.2020.12"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3570361.3592530"},{"key":"e_1_3_2_14_2","doi-asserted-by":"publisher","DOI":"10.1145\/3386290.3396933"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.01873"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3380888"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3649139"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3587819.3592551"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581783.3613810"},{"key":"e_1_3_2_20_2","unstructured":"International Organization for Standardization. 1999. ISO\/IEC 14496-2 Information Technology\u2014Coding of Audio-Visual Objects\u2014Part 2: Visual. Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:14775904"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/OJCOMS.2021.3057679"},{"key":"e_1_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3592433"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1126\/science.220.4598.671"},{"key":"e_1_3_2_24_2","doi-asserted-by":"publisher","DOI":"10.1145\/3372224.3419214"},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2022.3153208"},{"key":"e_1_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00411"},{"key":"e_1_3_2_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00544"},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2024.3372096"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3609395.3610593"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/VR55154.2023.00033"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3615452.3617938"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3495243.3517027"},{"key":"e_1_3_2_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/3550274"},{"key":"e_1_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/3636534.3649364"},{"key":"e_1_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/3098822.3098843"},{"key":"e_1_3_2_36_2","unstructured":"Meta Quest3. 2023. A Virtual Reality Headset Developed by Reality Labs a Division of Meta Platforms. Retrieved April 16 2024 from https:\/\/www.meta.com\/ca\/quest\/"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-58452-8_24"},{"key":"e_1_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530127"},{"key":"e_1_3_2_39_2","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV \u201924)","author":"Navaneet K. L.","year":"2024","unstructured":"K. L. Navaneet, Kossar Pourahmadi Meibodi, Soroush Abbasi Koohpayegani, and Hamed Pirsiavash. 2024. Compact3D: Compressing Gaussian splat radiance field models with vector quantization. In Proceedings of the European Conference on Computer Vision (ECCV \u201924)."},{"key":"e_1_3_2_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52733.2024.00985"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/3339825.3394938"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3651863.3651879"},{"key":"e_1_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNET.2020.2996964"},{"key":"e_1_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3548220"},{"key":"e_1_3_2_45_2","doi-asserted-by":"publisher","DOI":"10.1145\/3528233.3530727"},{"key":"e_1_3_2_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01191"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350917"},{"key":"e_1_3_2_48_2","unstructured":"VIVE Cosmos Elite. 2020. A High-Performance All-in-One Head-Mounted Displays. Retrieved April 20 2024 from https:\/\/www.vive.com\/us\/"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2022.3148585"},{"key":"e_1_3_2_50_2","doi-asserted-by":"publisher","DOI":"10.1145\/3581783.3613907"},{"key":"e_1_3_2_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2023.3340642"},{"key":"e_1_3_2_52_2","doi-asserted-by":"publisher","DOI":"10.1145\/2964284.2964300"},{"key":"e_1_3_2_53_2","unstructured":"Wiki. 2023. The Introduction of Framebuffer. Retrieved November 26 2023 from https:\/\/en.wikipedia.org\/wiki\/Framebuffer"},{"key":"e_1_3_2_54_2","unstructured":"Wiki. 2024. The Introduction of Stereoscopic Rendering. Retrieved January 10 2024 from https:\/\/en.wikipedia.org\/wiki\/Stereoscopy"},{"key":"e_1_3_2_55_2","doi-asserted-by":"publisher","DOI":"10.1145\/3603269.3604819"},{"key":"e_1_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1145\/3643832.3661858"},{"key":"e_1_3_2_57_2","doi-asserted-by":"publisher","DOI":"10.1145\/3544216.3544243"},{"key":"e_1_3_2_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/JIOT.2022.3197798"},{"key":"e_1_3_2_59_2","doi-asserted-by":"publisher","DOI":"10.1145\/3603146"},{"key":"e_1_3_2_60_2","doi-asserted-by":"publisher","DOI":"10.1145\/3591108"},{"key":"e_1_3_2_61_2","doi-asserted-by":"publisher","DOI":"10.1145\/3664647.3680908"},{"key":"e_1_3_2_62_2","first-page":"137","volume-title":"Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI \u201922)","author":"Zhang Anlan","year":"2022","unstructured":"Anlan Zhang, Chendong Wang, Bo Han, and Feng Qian. 2022. YuZu: Neural-enhanced volumetric video streaming. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI \u201922), 137\u2013154."},{"key":"e_1_3_2_63_2","doi-asserted-by":"publisher","DOI":"10.1145\/3560905.3568540"},{"key":"e_1_3_2_64_2","doi-asserted-by":"publisher","DOI":"10.1145\/3450626.3459756"},{"key":"e_1_3_2_65_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3728472","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,15]],"date-time":"2026-03-15T03:47:49Z","timestamp":1773546469000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3728472"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,27]]},"references-count":64,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,3,31]]}},"alternative-id":["10.1145\/3728472"],"URL":"https:\/\/doi.org\/10.1145\/3728472","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,27]]},"assertion":[{"value":"2024-09-05","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-02","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2026-02-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}