{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T17:00:27Z","timestamp":1777568427028,"version":"3.51.4"},"reference-count":87,"publisher":"Association for Computing Machinery (ACM)","issue":"4","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2025,8,1]]},"abstract":"<jats:p>\n                    Online free-view navigation in volumetric videos requires high-quality rendering and real-time streaming in order to provide immersive user experiences. However, existing methods (<jats:italic toggle=\"yes\">e.g.<\/jats:italic>, dynamic NeRF and 3DGS) may not handle dynamic scenes with complex motions, and their models may not be streamable due to storage and bandwidth constraints. In this paper, we propose a novel 4D Gaussian Video (4DGV) approach that enables the creation and streaming of photorealistic, volumetric videos for dynamic scenes over the Internet. The core of our 4DGV is a novel streamable group of Gaussians (GOG) representation based on motion layering. Each GOG consists of static and dynamic points obtained via lifting 2D segmentation into 3D in motion layering, where the deformation of each dynamic point is represented as the temporal offset of its attributes. We also adaptively convert static points back to dynamic points to handle appearance changes of static objects (<jats:italic toggle=\"yes\">e.g.<\/jats:italic>, moving shadows and reflections) through optimization. To support real-time streaming of 4DGVs, we show that by applying quantization on Gaussian attributes and H.265 encoding on deformation offsets, our GOG representation can be significantly compressed (to around 6% of the original model size) without sacrificing accuracy (PSNR loss less than 0.01 dB). 
Extensive experiments on standard benchmarks demonstrate that our method outperforms state-of-the-art volumetric video approaches, with superior rendering quality and minimum storage overheads.\n                  <\/jats:p>","DOI":"10.1145\/3731189","type":"journal-article","created":{"date-parts":[[2025,7,27]],"date-time":"2025-07-27T04:02:22Z","timestamp":1753588942000},"page":"1-14","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["4D Gaussian Videos with Motion Layering"],"prefix":"10.1145","volume":"44","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6983-3463","authenticated-orcid":false,"given":"Pinxuan","family":"Dai","sequence":"first","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8892-4683","authenticated-orcid":false,"given":"Peiquan","family":"Zhang","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-5191-9348","authenticated-orcid":false,"given":"Zheng","family":"Dong","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5855-3810","authenticated-orcid":false,"given":"Ke","family":"Xu","sequence":"additional","affiliation":[{"name":"City University of Hong Kong, Hong Kong, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0667-2599","authenticated-orcid":false,"given":"Yifan","family":"Peng","sequence":"additional","affiliation":[{"name":"The University of Hong Kong, Hong Kong, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2911-1321","authenticated-orcid":false,"given":"Dandan","family":"Ding","sequence":"additional","affiliation":[{"name":"Hangzhou Normal University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3801-6705","authenticated-orcid":false,"given":"Yujun","family":"Shen","sequence":"additional","affiliation":[{"name":"Ant Group, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7645-5931","authenticated-orcid":false,"given":"Yin","family":"Yang","sequence":"additional","affiliation":[{"name":"The University of Utah, Salt Lake City, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4650-1970","authenticated-orcid":false,"given":"Xinguo","family":"Liu","sequence":"additional","affiliation":[{"name":"Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8957-8129","authenticated-orcid":false,"given":"Rynson W. H.","family":"Lau","sequence":"additional","affiliation":[{"name":"City University of Hong Kong, Hong Kong, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3756-3539","authenticated-orcid":false,"given":"Weiwei","family":"Xu","sequence":"additional","affiliation":[{"name":"State Key Lab of CAD&amp;CG, Zhejiang University, Hangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2025,7,27]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"crossref","unstructured":"Benjamin Attal Jia-Bin Huang Christian Richardt Michael Zollh\u00f6fer Johannes Kopf Matthew O'Toole and Changil Kim. 2023. HyperReel: High-Fidelity 6-DoF Video With Ray-Conditioned Sampling. 
In CVPR.","DOI":"10.1109\/CVPR52729.2023.01594"},{"key":"e_1_2_2_2_1","volume-title":"Zoedepth: Zero-shot transfer by combining relative and metric depth. arXiv:2302.12288","author":"Bhat Shariq Farooq","year":"2023","unstructured":"Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, and Matthias M\u00fcller. 2023. Zoedepth: Zero-shot transfer by combining relative and metric depth. arXiv:2302.12288 (2023)."},{"key":"e_1_2_2_3_1","volume-title":"Large displacement optical flow: descriptor matching in variational motion estimation","author":"Brox Thomas","year":"2010","unstructured":"Thomas Brox and Jitendra Malik. 2010. Large displacement optical flow: descriptor matching in variational motion estimation. IEEE TPAMI (2010)."},{"key":"e_1_2_2_4_1","volume-title":"Immersive Light Field Video with a Layered Mesh Representation. ACM TOG","author":"Broxton Michael","year":"2020","unstructured":"Michael Broxton, John Flynn, Ryan Overbeck, Daniel Erickson, Peter Hedman, Matthew DuVall, Jason Dourgarian, Jay Busch, Matt Whalen, and Paul Debevec. 2020. Immersive Light Field Video with a Layered Mesh Representation. ACM TOG (2020)."},{"key":"e_1_2_2_5_1","doi-asserted-by":"crossref","unstructured":"Ang Cao and Justin Johnson. 2023. HexPlane: A Fast Representation for Dynamic Scenes. In CVPR.","DOI":"10.1109\/CVPR52729.2023.00021"},{"key":"e_1_2_2_6_1","doi-asserted-by":"crossref","unstructured":"Jason Chang Donglai Wei and John W Fisher. 2013. A video representation using temporal superpixels. In CVPR.","DOI":"10.1109\/CVPR.2013.267"},{"key":"e_1_2_2_7_1","volume-title":"HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression. In ECCV.","author":"Chen Yihang","year":"2024","unstructured":"Yihang Chen, Qianyi Wu, Weiyao Lin, Mehrtash Harandi, and Jianfei Cai. 2024. HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression. In ECCV."},{"key":"e_1_2_2_8_1","volume-title":"Tap-vid: A benchmark for tracking any point in a video. 
In NeurIPS.","author":"Doersch Carl","year":"2022","unstructured":"Carl Doersch, Ankush Gupta, Larisa Markeeva, Adria Recasens, Lucas Smaira, Yusuf Aytar, Joao Carreira, Andrew Zisserman, and Yi Yang. 2022. Tap-vid: A benchmark for tracking any point in a video. In NeurIPS."},{"key":"e_1_2_2_9_1","volume-title":"Tapir: Tracking any point with per-frame initialization and temporal refinement. In ICCV.","author":"Doersch Carl","year":"2023","unstructured":"Carl Doersch, Yi Yang, Mel Vecerik, Dilara Gokay, Ankush Gupta, Yusuf Aytar, Joao Carreira, and Andrew Zisserman. 2023. Tapir: Tracking any point with per-frame initialization and temporal refinement. In ICCV."},{"key":"e_1_2_2_10_1","volume-title":"SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture. ACM TOG","author":"Dong Zheng","year":"2023","unstructured":"Zheng Dong, Ke Xu, Yaoan Gao, Qilin Sun, Hujun Bao, Weiwei Xu, and Rynson WH Lau. 2023. SAILOR: Synergizing Radiance and Occupancy Fields for Live Human Performance Capture. ACM TOG (2023)."},{"key":"e_1_2_2_11_1","doi-asserted-by":"crossref","unstructured":"Yuanxing Duan Fangyin Wei Qiyu Dai Yuhang He Wenzheng Chen and Baoquan Chen. 2024. 4D-Rotor Gaussian Splatting: Towards Efficient Novel-View Synthesis for Dynamic Scenes. In SIGGRAPH.","DOI":"10.1145\/3641519.3657463"},{"key":"e_1_2_2_12_1","volume-title":"Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding. arXiv:1311.2540","author":"Duda Jarek","year":"2014","unstructured":"Jarek Duda. 2014. Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding. arXiv:1311.2540 (2014)."},{"key":"e_1_2_2_13_1","volume-title":"LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS. arXiv:2311.17245","author":"Fan Zhiwen","year":"2023","unstructured":"Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, and Zhangyang Wang. 2023. 
LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS. arXiv:2311.17245 (2023)."},{"key":"e_1_2_2_14_1","volume-title":"Benjamin Recht, and Angjoo Kanazawa.","author":"Fridovich-Keil Sara","year":"2023","unstructured":"Sara Fridovich-Keil, Giacomo Meanti, Frederik Rahb\u00e6k Warburg, Benjamin Recht, and Angjoo Kanazawa. 2023. K-Planes: Explicit Radiance Fields in Space, Time, and Appearance. In CVPR."},{"key":"e_1_2_2_15_1","unstructured":"Bastian Goldl\u00fccke Marcus A Magnor and Bennett Wilburn. 2002. Hardware-Accelerated Dynamic Light Field Rendering.. In VMV."},{"key":"e_1_2_2_16_1","unstructured":"Google. 2017. Draco 3D Graphics Compression. https:\/\/github.com\/google\/draco"},{"key":"e_1_2_2_17_1","volume-title":"Factormatte: Redefining video matting for re-composition tasks. ACM TOG","author":"Gu Zeqi","year":"2023","unstructured":"Zeqi Gu, Wenqi Xian, Noah Snavely, and Abe Davis. 2023. Factormatte: Redefining video matting for re-composition tasks. ACM TOG (2023)."},{"key":"e_1_2_2_18_1","doi-asserted-by":"crossref","unstructured":"Adam W Harley Zhaoyuan Fang and Katerina Fragkiadaki. 2022. Particle video revisited: Tracking through occlusions using point trajectories. In ECCV.","DOI":"10.1007\/978-3-031-20047-2_4"},{"key":"e_1_2_2_19_1","doi-asserted-by":"crossref","unstructured":"Yi-Hua Huang Yang-Tian Sun Ziyi Yang Xiaoyang Lyu Yan-Pei Cao and Xiaojuan Qi. 2024. SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes. In CVPR.","DOI":"10.1109\/CVPR52733.2024.00404"},{"key":"e_1_2_2_20_1","volume-title":"Liteflownet: A lightweight convolutional neural network for optical flow estimation. In CVPR.","author":"Hui Tak-Wai","year":"2018","unstructured":"Tak-Wai Hui, Xiaoou Tang, and Chen Change Loy. 2018. Liteflownet: A lightweight convolutional neural network for optical flow estimation. In CVPR."},{"key":"e_1_2_2_21_1","volume-title":"PCC WD G-PCC (Geometry-Based PCC). 
Standard","author":"ISO.","unstructured":"ISO. 2018a. PCC WD G-PCC (Geometry-Based PCC). Standard. International Organization for Standardization."},{"key":"e_1_2_2_22_1","volume-title":"PCC WD V-PCC (Video-Based PCC). Standard","author":"ISO.","unstructured":"ISO. 2018b. PCC WD V-PCC (Video-Based PCC). Standard. International Organization for Standardization."},{"key":"e_1_2_2_23_1","volume-title":"International Conference on Multimedia Computing and Systems.","author":"Jain R.","unstructured":"R. Jain and K. Wakimoto. 1995. Multiple perspective interactive video. In International Conference on Multimedia Computing and Systems."},{"key":"e_1_2_2_24_1","doi-asserted-by":"crossref","unstructured":"Joel Janai Fatma Guney Anurag Ranjan Michael Black and Andreas Geiger. 2018. Unsupervised learning of multi-frame optical flow with occlusions. In ECCV.","DOI":"10.1007\/978-3-030-01270-0_42"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/MMUL.2006.93"},{"key":"e_1_2_2_26_1","volume-title":"Cotracker: It is better to track together. In ECCV.","author":"Karaev Nikita","year":"2024","unstructured":"Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, and Christian Rupprecht. 2024. Cotracker: It is better to track together. In ECCV."},{"key":"e_1_2_2_27_1","volume-title":"3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM TOG","author":"Kerbl Bernhard","year":"2023","unstructured":"Bernhard Kerbl, Georgios Kopanas, Thomas Leimk\u00fchler, and George Drettakis. 2023. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM TOG (2023)."},{"key":"e_1_2_2_28_1","volume-title":"Splat: A WebGL Implementation of A Real-time Renderer for 3D Gaussian Splatting for Real-Time Radiance Field Rendering. https:\/\/github.com\/antimatter15\/splat","author":"Kwok Kevin","year":"2023","unstructured":"Kevin Kwok. 2023. 
Splat: A WebGL Implementation of A Real-time Renderer for 3D Gaussian Splatting for Real-Time Radiance Field Rendering. https:\/\/github.com\/antimatter15\/splat"},{"key":"e_1_2_2_29_1","unstructured":"Junoh Lee ChangYeon Won Hyunjun Jung Inhwan Bae and Hae-Gon Jeon. 2024c. Fully Explicit Dynamic Gaussian Splatting. In NeurIPS."},{"key":"e_1_2_2_30_1","volume-title":"Jong Hwan Ko, and Eunbyung Park","author":"Lee Joo Chan","year":"2024","unstructured":"Joo Chan Lee, Daniel Rho, Xiangyu Sun, Jong Hwan Ko, and Eunbyung Park. 2024b. Compact 3D Gaussian Representation for Radiance Field. In CVPR."},{"key":"e_1_2_2_31_1","volume-title":"Generative Omnimatte: Learning to Decompose Video into Layers. arXiv:2411.16683","author":"Lee Yao-Chih","year":"2024","unstructured":"Yao-Chih Lee, Erika Lu, Sarah Rumbley, Michal Geyer, Jia-Bin Huang, Tali Dekel, and Forrester Cole. 2024a. Generative Omnimatte: Learning to Decompose Video into Layers. arXiv:2411.16683 (2024)."},{"key":"e_1_2_2_32_1","doi-asserted-by":"crossref","unstructured":"Lingzhi Li Zhen Shen Zhongshu Wang Li Shen and Ping Tan. 2022a. Streaming radiance fields for 3d video synthesis. In NeurIPS.","DOI":"10.52202\/068431-0980"},{"key":"e_1_2_2_33_1","unstructured":"Tianye Li Mira Slavcheva Michael Zollh\u00f6fer Simon Green Christoph Lassner Changil Kim Tanner Schmidt Steven Lovegrove Michael Goesele Richard Newcombe and Zhaoyang Lv. 2022b. Neural 3D Video Synthesis From Multi-View Video. In CVPR."},{"key":"e_1_2_2_34_1","unstructured":"Zhan Li Zhang Chen Zhong Li and Yi Xu. 2024. Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis. In CVPR."},{"key":"e_1_2_2_35_1","doi-asserted-by":"crossref","unstructured":"Geng Lin Chen Gao Jia-Bin Huang Changil Kim Yipeng Wang Matthias Zwicker and Ayush Saraf. 2023. OmnimatteRF: Robust Omnimatte with 3D Background Modeling. 
In ICCV.","DOI":"10.1109\/ICCV51070.2023.02145"},{"key":"e_1_2_2_36_1","volume-title":"Efficient Neural Radiance Fields for Interactive Free-viewpoint Video. In SIGGRAPH Asia Conference Proceedings.","author":"Lin Haotong","year":"2022","unstructured":"Haotong Lin, Sida Peng, Zhen Xu, Yunzhi Yan, Qing Shuai, Hujun Bao, and Xiaowei Zhou. 2022. Efficient Neural Radiance Fields for Interactive Free-viewpoint Video. In SIGGRAPH Asia Conference Proceedings."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503161.3547979"},{"key":"e_1_2_2_38_1","volume-title":"Omnimatte: Associating objects and their effects in video. In CVPR.","author":"Lu Erika","year":"2021","unstructured":"Erika Lu, Forrester Cole, Tali Dekel, Andrew Zisserman, William T Freeman, and Michael Rubinstein. 2021. Omnimatte: Associating objects and their effects in video. In CVPR."},{"key":"e_1_2_2_39_1","volume-title":"Scaffold-gs: Structured 3d gaussians for view-adaptive rendering. In CVPR.","author":"Lu Tao","year":"2024","unstructured":"Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, and Bo Dai. 2024. Scaffold-gs: Structured 3d gaussians for view-adaptive rendering. In CVPR."},{"key":"e_1_2_2_40_1","volume-title":"Francisco Vicente Carrasco, and Fernando De La Torre","author":"Mallick Saswat Subhajyoti","year":"2024","unstructured":"Saswat Subhajyoti Mallick, Rahul Goel, Bernhard Kerbl, Markus Steinberger, Francisco Vicente Carrasco, and Fernando De La Torre. 2024. Taming 3DGS: High-Quality Radiance Fields with Limited Resources. In SIGGRAPH Asia."},{"key":"e_1_2_2_41_1","volume-title":"Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM","author":"Mildenhall Ben","year":"2021","unstructured":"Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. 2021. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. 
ACM (2021)."},{"key":"e_1_2_2_42_1","volume-title":"A Computer Oriented Geodetic Data Base and a New Technique in File Sequencing","author":"Morton G.M.","unstructured":"G.M. Morton. 1966. A Computer Oriented Geodetic Data Base and a New Technique in File Sequencing. International Business Machines Company. https:\/\/books.google.com\/books?id=9FFdHAAACAAJ"},{"key":"e_1_2_2_43_1","volume-title":"Soroush Abbasi Koohpayegani, and Hamed Pirsiavash.","author":"Navaneet KL","year":"2024","unstructured":"KL Navaneet, Kossar Pourahmadi Meibodi, Soroush Abbasi Koohpayegani, and Hamed Pirsiavash. 2024. CompGS: Smaller and Faster Gaussian Splatting with Vector Quantization. In ECCV."},{"key":"e_1_2_2_44_1","volume-title":"Dynamicfusion: Reconstruction and tracking of non-rigid scenes in real-time. In CVPR.","author":"Newcombe Richard A","year":"2015","unstructured":"Richard A Newcombe, Dieter Fox, and Steven M Seitz. 2015. Dynamicfusion: Reconstruction and tracking of non-rigid scenes in real-time. In CVPR."},{"key":"e_1_2_2_45_1","doi-asserted-by":"crossref","unstructured":"Simon Niedermayr Josef Stumpfegger and R\u00fcdiger Westermann. 2024. Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis. In CVPR.","DOI":"10.1109\/CVPR52733.2024.00985"},{"key":"e_1_2_2_46_1","unstructured":"Maxime Oquab Timoth\u00e9e Darcet Th\u00e9o Moutakanni Huy Vo Marc Szafraniec Vasil Khalidov Pierre Fernandez Daniel Haziza Francisco Massa Alaaeldin El-Nouby et al. 2023. Dinov2: Learning robust visual features without supervision. arXiv:2304.07193 (2023)."},{"key":"e_1_2_2_47_1","volume-title":"Seitz","author":"Park Keunhong","year":"2021","unstructured":"Keunhong Park, Utkarsh Sinha, Peter Hedman, Jonathan T. Barron, Sofien Bouaziz, Dan B Goldman, Ricardo Martin-Brualla, and Steven M. Seitz. 2021. HyperNeRF: a higher-dimensional representation for topologically varying neural radiance fields. 
ACM TOG (2021)."},{"key":"e_1_2_2_48_1","doi-asserted-by":"crossref","unstructured":"Albert Pumarola Enric Corona Gerard Pons-Moll and Francesc Moreno-Noguer. 2020. D-NeRF: Neural Radiance Fields for Dynamic Scenes. In CVPR.","DOI":"10.1109\/CVPR46437.2021.01018"},{"key":"e_1_2_2_49_1","unstructured":"Nikhila Ravi Valentin Gabeur Yuan-Ting Hu Ronghang Hu Chaitanya Ryali Tengyu Ma Haitham Khedr Roman R\u00e4dle Chloe Rolland Laura Gustafson et al. 2024a. Sam 2: Segment anything in images and videos. arXiv:2408.00714 (2024)."},{"key":"e_1_2_2_50_1","volume-title":"Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Doll\u00e5r, and Christoph Feichtenhofer.","author":"Ravi Nikhila","year":"2024","unstructured":"Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu, Ronghang Hu, Chaitanya Ryali, Tengyu Ma, Haitham Khedr, Roman R\u00e4dle, Chloe Rolland, Laura Gustafson, Eric Mintun, Junting Pan, Kalyan Vasudev Alwala, Nicolas Carion, Chao-Yuan Wu, Ross Girshick, Piotr Doll\u00e5r, and Christoph Feichtenhofer. 2024b. SAM 2: Segment Anything in Images and Videos. arXiv:2408.00714 (2024)."},{"key":"e_1_2_2_51_1","doi-asserted-by":"crossref","unstructured":"Zhile Ren Orazio Gallo Deqing Sun Ming-Hsuan Yang Erik B Sudderth and Jan Kautz. 2019. A fusion approach for multi-frame optical flow estimation. In WACV.","DOI":"10.1007\/978-3-030-11024-6_53"},{"key":"e_1_2_2_52_1","doi-asserted-by":"crossref","unstructured":"Michael Rubinstein Ce Liu and William T Freeman. 2012. Towards longer long-range motion trajectories. In BMVC.","DOI":"10.5244\/C.26.53"},{"key":"e_1_2_2_53_1","volume-title":"Dataset and Pipeline for Multi-View Light-Field Video. In CVPR Workshops.","author":"Sabater Neus","year":"2017","unstructured":"Neus Sabater, Guillaume Boisson, Benoit Vandame, Paul Kerbiriou, Frederic Babon, Matthieu Hog, Remy Gendrot, Tristan Langlois, Olivier Bureller, Arno Schubert, and Valerie Allie. 2017. Dataset and Pipeline for Multi-View Light-Field Video. 
In CVPR Workshops."},{"key":"e_1_2_2_54_1","volume-title":"Particle video: Long-range motion estimation using point trajectories. IJCV","author":"Sand Peter","year":"2008","unstructured":"Peter Sand and Seth Teller. 2008. Particle video: Long-range motion estimation using point trajectories. IJCV (2008)."},{"key":"e_1_2_2_55_1","volume-title":"On-the-Fly Processing of Generalized Lumigraphs. Computer Graphics Forum","author":"Schirmacher Hartmut","year":"2001","unstructured":"Hartmut Schirmacher, Li Ming, and Hans-Peter Seidel. 2001. On-the-Fly Processing of Generalized Lumigraphs. Computer Graphics Forum (2001)."},{"key":"e_1_2_2_56_1","volume-title":"NeRFPlayer: A Streamable Dynamic Scene Representation with Decomposed Neural Radiance Fields","author":"Song Liangchen","year":"2023","unstructured":"Liangchen Song, Anpei Chen, Zhong Li, Zhang Chen, Lele Chen, Junsong Yuan, Yi Xu, and Andreas Geiger. 2023. NeRFPlayer: A Streamable Dynamic Scene Representation with Decomposed Neural Radiance Fields. IEEE TVCG (2023)."},{"key":"e_1_2_2_57_1","doi-asserted-by":"crossref","unstructured":"Yunzhou Song Jiahui Lei Ziyun Wang Lingjie Liu and Kostas Daniilidis. 2024. Track everything everywhere fast and robustly. In ECCV.","DOI":"10.1007\/978-3-031-72646-0_20"},{"key":"e_1_2_2_58_1","doi-asserted-by":"crossref","unstructured":"Mohammed Suhail Erika Lu Zhengqi Li Noah Snavely Leonid Sigal and Forrester Cole. 2023. Omnimatte3D: Associating Objects and Their Effects in Unconstrained Monocular Video. In CVPR.","DOI":"10.1109\/CVPR52729.2023.00068"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSVT.2012.2221191"},{"key":"e_1_2_2_60_1","unstructured":"Deqing Sun Erik Sudderth and Michael Black. 2010. Layered image motion with explicit occlusions temporal consistency and depth ordering. In NeurIPS."},{"key":"e_1_2_2_61_1","doi-asserted-by":"crossref","unstructured":"Jiakai Sun Han Jiao Guangyuan Li Zhanjie Zhang Lei Zhao and Wei Xing. 2024. 
3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos. In CVPR.","DOI":"10.1109\/CVPR52733.2024.01954"},{"key":"e_1_2_2_62_1","volume-title":"Sullivan","author":"Sze Vivienne","year":"2014","unstructured":"Vivienne Sze, Madhukar Budagavi, and Gary J. Sullivan. 2014. High Efficiency Video Coding (HEVC): Algorithms and Architectures. Springer Publishing Company, Incorporated."},{"key":"e_1_2_2_63_1","volume-title":"Raft: Recurrent all-pairs field transforms for optical flow. In ECCV.","author":"Teed Zachary","year":"2020","unstructured":"Zachary Teed and Jia Deng. 2020. Raft: Recurrent all-pairs field transforms for optical flow. In ECCV."},{"key":"e_1_2_2_64_1","volume-title":"SDformerFlow: Spatiotemporal swin spikeformer for event-based optical flow estimation. arXiv:2409.04082","author":"Tian Yi","year":"2024","unstructured":"Yi Tian and Juan Andrade-Cetto. 2024. SDformerFlow: Spatiotemporal swin spikeformer for event-based optical flow estimation. arXiv:2409.04082 (2024)."},{"key":"e_1_2_2_65_1","unstructured":"Sundar Vedula Simon Baker Steven Seitz and Takeo Kanade. 2000. Shape and motion carving in 6D. In CVPR."},{"key":"e_1_2_2_66_1","doi-asserted-by":"crossref","unstructured":"Carl Vondrick Abhinav Shrivastava Alireza Fathi Sergio Guadarrama and Kevin Murphy. 2018. Tracking emerges by colorizing videos. In ECCV.","DOI":"10.1007\/978-3-030-01261-8_24"},{"key":"e_1_2_2_67_1","volume-title":"Masked space-time hash encoding for efficient dynamic scene reconstruction. Advances in neural information processing systems 36","author":"Wang Feng","year":"2023","unstructured":"Feng Wang, Zilong Chen, Guokang Wang, Yafei Song, and Huaping Liu. 2023b. Masked space-time hash encoding for efficient dynamic scene reconstruction. 
Advances in neural information processing systems 36 (2023), 70497\u201370510."},{"key":"e_1_2_2_68_1","doi-asserted-by":"crossref","unstructured":"Feng Wang Sinan Tan Xinghang Li Zeyue Tian Yafei Song and Huaping Liu. 2023d. Mixed Neural Voxels for Fast Multi-view Video Synthesis. In ICCV.","DOI":"10.1109\/ICCV51070.2023.01805"},{"key":"e_1_2_2_69_1","doi-asserted-by":"crossref","unstructured":"Henan Wang Hanxin Zhu Tianyu He Runsen Feng Jiajun Deng Jiang Bian and Zhibo Chen. 2024d. End-to-End Rate-Distortion Optimized 3D Gaussian Representation. In ECCV.","DOI":"10.1007\/978-3-031-73636-0_5"},{"key":"e_1_2_2_70_1","doi-asserted-by":"crossref","unstructured":"Liao Wang Kaixin Yao Chengcheng Guo Zhirui Zhang Qiang Hu Jingyi Yu Lan Xu and Minye Wu. 2024b. VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams. In CVPR.","DOI":"10.1109\/CVPR52733.2024.00052"},{"key":"e_1_2_2_71_1","volume-title":"V3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians. ACM TOG","author":"Wang Penghao","year":"2024","unstructured":"Penghao Wang, Zhirui Zhang, Liao Wang, Kaixin Yao, Siyuan Xie, Jingyi Yu, Minye Wu, and Lan Xu. 2024c. V3: Viewing Volumetric Videos on Mobiles via Streamable 2D Dynamic Gaussians. ACM TOG (2024)."},{"key":"e_1_2_2_72_1","doi-asserted-by":"crossref","unstructured":"Qianqian Wang Yen-Yu Chang Ruojin Cai Zhengqi Li Bharath Hariharan Aleksander Holynski and Noah Snavely. 2023a. Tracking everything everywhere all at once. In ICCV.","DOI":"10.1109\/ICCV51070.2023.01813"},{"key":"e_1_2_2_73_1","doi-asserted-by":"crossref","unstructured":"Yiming Wang Qin Han Marc Habermann Kostas Daniilidis Christian Theobalt and Lingjie Liu. 2023c. Neus2: Fast learning of neural implicit surfaces for multi-view reconstruction. In ICCV.","DOI":"10.1109\/ICCV51070.2023.00305"},{"key":"e_1_2_2_74_1","doi-asserted-by":"crossref","unstructured":"Yihan Wang Lahav Lipson and Jia Deng. 2024a. SEA-RAFT: Simple Efficient Accurate RAFT for Optical Flow. 
In ECCV.","DOI":"10.1007\/978-3-031-72667-5_3"},{"key":"e_1_2_2_75_1","doi-asserted-by":"crossref","unstructured":"Philippe Weinzaepfel Jerome Revaud Zaid Harchaoui and Cordelia Schmid. 2013. DeepFlow: Large displacement optical flow with deep matching. In ICCV.","DOI":"10.1109\/ICCV.2013.175"},{"key":"e_1_2_2_76_1","volume-title":"Humannerf: Free-viewpoint rendering of moving people from monocular video. In CVPR.","author":"Weng Chung-Yi","year":"2022","unstructured":"Chung-Yi Weng, Brian Curless, Pratul P Srinivasan, Jonathan T Barron, and Ira Kemelmacher-Shlizerman. 2022. Humannerf: Free-viewpoint rendering of moving people from monocular video. In CVPR."},{"key":"e_1_2_2_77_1","doi-asserted-by":"crossref","unstructured":"T. Wiegand G.J. Sullivan G. Bjontegaard and A. Luthra. 2003. Overview of the H.264\/AVC video coding standard. IEEE TCSVT (2003).","DOI":"10.1109\/TCSVT.2003.815165"},{"key":"e_1_2_2_78_1","unstructured":"Guanjun Wu Taoran Yi Jiemin Fang Lingxi Xie Xiaopeng Zhang Wei Wei Wenyu Liu Qi Tian and Xinggang Wang. 2024a. 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering. In CVPR."},{"key":"e_1_2_2_79_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41095-024-0436-y"},{"key":"e_1_2_2_80_1","unstructured":"Jiawei Xu Zexin Fan Jian Yang and Jin Xie. 2024a. Grid4D: 4D Decomposed Hash Encoding for High-fidelity Dynamic Scene Rendering. In NeurIPS."},{"key":"e_1_2_2_81_1","unstructured":"Zhen Xu Sida Peng Haotong Lin Guangzhao He Jiaming Sun Yujun Shen Hujun Bao and Xiaowei Zhou. 2024b. 4K4D: Real-Time 4D View Synthesis at 4K Resolution. In CVPR."},{"key":"e_1_2_2_82_1","volume-title":"Representing Long Volumetric Video with Temporal Gaussian Hierarchy. ACM TOG","author":"Xu Zhen","year":"2024","unstructured":"Zhen Xu, Yinghao Xu, Zhiyuan Yu, Sida Peng, Jiaming Sun, Hujun Bao, and Xiaowei Zhou. 2024c. Representing Long Volumetric Video with Temporal Gaussian Hierarchy. 
ACM TOG (2024)."},{"key":"e_1_2_2_83_1","doi-asserted-by":"crossref","unstructured":"Ziyi Yang Xinyu Gao Wen Zhou Shaohui Jiao Yuqing Zhang and Xiaogang Jin. 2024a. Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction. In CVPR.","DOI":"10.1109\/CVPR52733.2024.01922"},{"key":"e_1_2_2_84_1","unstructured":"Zeyu Yang Hongye Yang Zijie Pan and Li Zhang. 2024b. Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting. In ICLR."},{"key":"e_1_2_2_85_1","unstructured":"Ye Zhang and Chandra Kambhamettu. 2001. On 3D scene flow and structure estimation. In CVPR."},{"key":"e_1_2_2_86_1","volume-title":"Motiongs: Exploring explicit motion guidance for deformable 3d gaussian splatting. arXiv:2410.07707","author":"Zhu Ruijie","year":"2024","unstructured":"Ruijie Zhu, Yanzhe Liang, Hanzhi Chang, Jiacheng Deng, Jiahao Lu, Wenfei Yang, Tianzhu Zhang, and Yongdong Zhang. 2024. Motiongs: Exploring explicit motion guidance for deformable 3d gaussian splatting. arXiv:2410.07707 (2024)."},{"key":"e_1_2_2_87_1","volume-title":"Matthew Uyttendaele, Simon Winder, and Richard Szeliski.","author":"Zitnick C. Lawrence","year":"2004","unstructured":"C. Lawrence Zitnick, Sing Bing Kang, Matthew Uyttendaele, Simon Winder, and Richard Szeliski. 2004. High-quality video view interpolation using a layered representation. 
ACM TOG (2004)."}],"container-title":["ACM Transactions on Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3731189","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T17:50:23Z","timestamp":1774633823000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3731189"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,27]]},"references-count":87,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2025,8,1]]}},"alternative-id":["10.1145\/3731189"],"URL":"https:\/\/doi.org\/10.1145\/3731189","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"value":"0730-0301","type":"print"},{"value":"1557-7368","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,27]]},"assertion":[{"value":"2025-07-27","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}