{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T05:15:59Z","timestamp":1764998159659,"version":"3.46.0"},"reference-count":34,"publisher":"MDPI AG","issue":"12","license":[{"start":{"date-parts":[[2025,11,30]],"date-time":"2025-11-30T00:00:00Z","timestamp":1764460800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Liaoning Province Science and Technology Joint Plan","award":["2025JH2\/101800394"],"award-info":[{"award-number":["2025JH2\/101800394"]}]},{"name":"Opening Funding of National Key Laboratory of Electromagnetic Space Security","award":["JCKY2024240C008"],"award-info":[{"award-number":["JCKY2024240C008"]}]},{"DOI":"10.13039\/100012841","name":"Shenyang Ligong University","doi-asserted-by":"crossref","award":["1010147001133"],"award-info":[{"award-number":["1010147001133"]}],"id":[{"id":"10.13039\/100012841","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Shenyang Xing-Shen Talents Plan Project for Master Teachers","award":["XSMS2206003"],"award-info":[{"award-number":["XSMS2206003"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>3D Gaussian Splatting (3DGS) is a multi-view 3D reconstruction method that relies solely on image loss for supervision, lacking explicit constraints on the geometric consistency of the rendering model. It uses a multi-view scene-by-scene training paradigm, which limits generalization to unknown scenes in the case of single-view limited input. To address these issues, this paper proposes a Geometric Consistency-High Generalization (GC-HG), a single-view 3DGS reconstruction framework integrating depth prior and a pseudo-triplane. 
First, we utilize the pre-trained VGGT 3D geometry model to derive a depth prior, back-projecting it into a point cloud to construct a dual-modal input alongside the image. Second, we introduce a pseudo-triplane mechanism with a learnable Z-plane token for feature decoupling and pseudo-triplane feature fusion, thereby enhancing geometry perception and consistency. Finally, we integrate a parent\u2013child hierarchical Gaussian renderer into the feed-forward 3DGS framework, combining depth and 3D offsets to model depth and geometry information, while mapping parent and child Gaussians into a linear structure through an MLP. Evaluations on the RealEstate10K dataset validate our approach: the method improves Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS), demonstrating its advantages in geometric consistency modeling and cross-scene generalization for single-view reconstruction.<\/jats:p>","DOI":"10.3390\/a18120761","type":"journal-article","created":{"date-parts":[[2025,12,5]],"date-time":"2025-12-05T18:42:02Z","timestamp":1764960122000},"page":"761","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["GC-HG Gaussian Splatting Single-View 3D Reconstruction Method Based on Depth Prior and Pseudo-Triplane"],"prefix":"10.3390","volume":"18","author":[{"given":"Hua","family":"Gong","sequence":"first","affiliation":[{"name":"School of Science, Shenyang Ligong University, Shenyang 110159, China"},{"name":"Liaoning Key Laboratory of Intelligent Optimization and Control for Ordnance Industry, Shenyang 110159, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-8418-4074","authenticated-orcid":false,"given":"Peide","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Science, Shenyang Ligong University, Shenyang 110159, 
China"}]},{"given":"Yuanjing","family":"Ma","sequence":"additional","affiliation":[{"name":"School of Science, Shenyang Ligong University, Shenyang 110159, China"},{"name":"Liaoning Key Laboratory of Intelligent Optimization and Control for Ordnance Industry, Shenyang 110159, China"}]},{"given":"Yong","family":"Zhang","sequence":"additional","affiliation":[{"name":"National Key Laboratory of Electromagnetic Space Security, Tianjin 300308, China"}]}],"member":"1968","published-online":{"date-parts":[[2025,11,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Park, J.J., Florence, P., Straub, J., Newcombe, R., and Lovegrove, S. (2019, January 16\u201320). DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00025"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1145\/3503250","article-title":"NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis","volume":"65","author":"Mildenhall","year":"2021","journal-title":"Commun. ACM"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3592433","article-title":"3D Gaussian Splatting for Real-Time Radiance Field Rendering","volume":"42","author":"Kerbl","year":"2023","journal-title":"ACM Trans. Graph."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Li, J., Feng, Z., She, Q., Ding, H., Wang, C., and Lee, G.H. (2021, January 11\u201317). MINE: Towards Continuous Depth MPI with NeRF for Novel View Synthesis. Proceedings of the 2021 IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.01235"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Szymanowicz, S., Rupprecht, C., and Vedaldi, A. (2024, January 17\u201321). Splatter Image: Ultra-Fast Single-View 3D Reconstruction. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.00972"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Xu, H., Peng, S., Wang, F., Blum, H., Barath, D., Geiger, A., and Pollefeys, M. (2025, January 11\u201315). DepthSplat: Connecting Gaussian Splatting and Depth. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR52734.2025.01534"},{"key":"ref_7","unstructured":"Chen, Y., Xu, H., Zheng, C., Zhuang, B., Pollefeys, M., Geiger, A., Cham, T.-J., and Cai, J. (October, January 29). MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images. Proceedings of the European Conference on Computer Vision (ECCV), Milan, Italy."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Charatan, D., Li, S.L., Tagliasacchi, A., and Sitzmann, V. (2024, January 17\u201321). pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.01840"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Chan, E.R., Lin, C.Z., Chan, M.A., Nagano, K., Pan, B., De Mello, S., Gallo, O., Guibas, L., Tremblay, J., and Khamis, S. (2022, January 19\u201324). Efficient Geometry-aware 3D Generative Adversarial Networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01565"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Shue, J.R., Chan, E.R., Po, R., Ankner, Z., Wu, J., and Wetzstein, G. (2023, January 18\u201322). 3D Neural Field Generation using Triplane Diffusion. 
Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.02000"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Zou, Z.X., Yu, Z., Guo, Y.C., Li, Y., Liang, D., Cao, Y.P., and Zhang, S.H. (2024, January 17\u201321). Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.00983"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and Ommer, B. (2022, January 19\u201324). High-Resolution Image Synthesis with Latent Diffusion Models. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01042"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Yi, T., Fang, J., Wang, J., Wu, G., Xie, L., Zhang, X., Liu, W., Tian, Q., and Wang, X. (2024, January 17\u201321). GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.00649"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Liu, R., Wu, R., Van Hoorick, B., Tokmakov, P., Zakharov, S., and Vondrick, C. (2023, January 2\u20136). Zero-1-to-3: Zero-shot One Image to 3D Object. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Paris, France.","DOI":"10.1109\/ICCV51070.2023.00853"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Chan, E.R., Nagano, K., Chan, M.A., Bergman, A.W., Park, J.J., Levy, A., Aittala, M., De Mello, S., Karras, T., and Wetzstein, G. (2023, January 2\u20136). Generative Novel View Synthesis with 3D-Aware Diffusion Models. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Paris, France.","DOI":"10.1109\/ICCV51070.2023.00389"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhou, Z., and Tulsiani, S. (2023, January 18\u201322). SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.01211"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Ma, B., Gao, H., Deng, H., Luo, Z., Huang, T., Tang, L., and Wang, X. (2025, January 11\u201315). You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR52734.2025.00194"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Gong, T., Li, B., Zhong, Y., and Wang, F. (2025). ExScene: Free-View 3D Scene Reconstruction with Gaussian Splatting from a Single Image. arXiv.","DOI":"10.1109\/ICME59968.2025.11209941"},{"key":"ref_19","unstructured":"Li, J., Tan, H., Zhang, K., Xu, Z., Luan, F., Xu, Y., Hong, Y., Sunkavalli, K., Shakhnarovich, G., and Bi, S. (2023). Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model. arXiv."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Melas-Kyriazi, L., Rupprecht, C., Laina, I., and Vedaldi, A. (2023, January 18\u201322). RealFusion: 360\u00b0 Reconstruction of Any Object from a Single Image. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada.","DOI":"10.1109\/CVPR52729.2023.00816"},{"key":"ref_21","unstructured":"Melas-Kyriazi, L., Laina, I., Rupprecht, C., Neverova, N., Vedaldi, A., Gafni, O., and Kokkinos, F. (2024). IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation. 
arXiv."},{"key":"ref_22","unstructured":"Shi, Y., Wang, P., Ye, J., Long, M., Li, K., and Yang, X. (2023). MVDream: Multi-view Diffusion for 3D Generation. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zheng, C., and Vedaldi, A. (2024, January 17\u201321). Free3D: Consistent Novel View Synthesis without 3D Representation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR52733.2024.00928"},{"key":"ref_24","unstructured":"Blattmann, A., Dockhorn, T., Kulal, S., Mendelevitch, D., Kilian, M., Lorenz, D., Levi, Y., English, Z., Voleti, V., and Letts, A. (2023). Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets. arXiv."},{"key":"ref_25","unstructured":"Dai, X., Hou, J., Ma, C.Y., Tsai, S., Wang, J., Wang, R., Zhang, P., Vandenhende, S., Wang, X., and Dubey, A. (2023). Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack. arXiv."},{"key":"ref_26","unstructured":"Girdhar, R., Singh, M., Brown, A., Duval, Q., Azadi, S., Rambhatla, S.S., Shah, A., Yin, X., Parikh, D., and Misra, I. (October, January 29). Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning. Proceedings of the European Conference on Computer Vision (ECCV), Milan, Italy."},{"key":"ref_27","unstructured":"Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., M\u00fcller, J., Penna, J., and Rombach, R. (2023). SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Rombach, R., Esser, P., and Ommer, B. (2021, January 11\u201317). Geometry-Free View Synthesis: Transformers and no 3D Prior. 
Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.01409"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"223","DOI":"10.1109\/TVCG.2002.1021576","article-title":"EWA Splatting","volume":"8","author":"Zwicker","year":"2002","journal-title":"IEEE Trans. Vis. Comput. Graph."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Wang, J., Chen, M., Karaev, N., Vedaldi, A., Rupprecht, C., and Novotny, D. (2025, January 11\u201315). VGGT: Visual Geometry Grounded Transformer. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA.","DOI":"10.1109\/CVPR52734.2025.00499"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Zhou, T., Tucker, R., Flynn, J., Fyffe, G., and Snavely, N. (2018). Stereo Magnification: Learning View Synthesis using Multiplane Images. arXiv.","DOI":"10.1145\/3197517.3201323"},{"key":"ref_32","unstructured":"Belghazi, M.I., Baratin, A., Rajeswar, S., Ozair, S., Bengio, Y., Courville, A., and Hjelm, R.D. (2018). MINE: Mutual Information Neural Estimation. arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Tucker, R., and Snavely, N. (2020, January 16\u201320). Single-View View Synthesis with Multiplane Images. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.00063"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Szymanowicz, S., Insafutdinov, E., Zheng, C., Campbell, D., Henriques, J.F., Rupprecht, C., and Vedaldi, A. (2024). Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image. 
arXiv.","DOI":"10.1109\/3DV66043.2025.00067"}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/12\/761\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,6]],"date-time":"2025-12-06T05:13:48Z","timestamp":1764998028000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/18\/12\/761"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,11,30]]},"references-count":34,"journal-issue":{"issue":"12","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["a18120761"],"URL":"https:\/\/doi.org\/10.3390\/a18120761","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2025,11,30]]}}}