{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T05:01:04Z","timestamp":1775797264416,"version":"3.50.1"},"reference-count":34,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T00:00:00Z","timestamp":1775606400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["MAKE"],"abstract":"<jats:p>Sparse light field imaging often limits the quality of 3D scene reconstruction due to insufficient viewpoint coverage, resulting in incomplete or inaccurate reconstructions. This work introduces a hybrid CNN\u2013LSTM-based framework to address this issue by generating novel camera poses and the corresponding synthesized novel views, effectively densifying the light field representation. The CNN extracts spatial features from the sparse input views, while the LSTM predicts temporal and positional dependencies, enabling smooth interpolation of novel poses and views. The proposed method integrates these synthesized views with the original sparse dataset to produce a comprehensive set of images. Our approach was evaluated on several datasets, including challenging datasets. The inference capability of our method was tested extensively, and it showed good generalization across diverse datasets. The effectiveness of the framework was evaluated not only with local light field fusion (LLFF) but also with NeRF and 3D Gaussian Splatting, which are considered state-of-the-art reconstruction methods. Overall, the enriched dataset generated by our method led to consistent improvements in 3D reconstruction quality, including higher depth estimation accuracy, reduced artifacts, and enhanced structural consistency. Most importantly, LSTM-based approaches have so far attracted limited attention in the context of generating novel views. While LSTMs have been widely applied in sequential data domains such as natural language processing, their use for image generation conditioned on camera poses remains largely unexplored, which underscores the novelty and significance of the proposed work. This approach provides a scalable and generalizable solution to the sparsity problem in light fields, advancing the capabilities of computational imaging, photorealistic rendering, and immersive 3D scene reconstruction. The results firmly establish the proposed method as a robust and versatile tool for improving reconstruction quality in sparse-view settings.<\/jats:p>","DOI":"10.3390\/make8040094","type":"journal-article","created":{"date-parts":[[2026,4,9]],"date-time":"2026-04-09T07:48:07Z","timestamp":1775720887000},"page":"94","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Deep Learning-Driven Sparse Light Field Enhancement: A CNN-LSTM Framework for Novel View Synthesis and 3D Scene Reconstruction"],"prefix":"10.3390","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4970-4909","authenticated-orcid":false,"given":"Vivek","family":"Dwivedi","sequence":"first","affiliation":[{"name":"Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, 84104 Bratislava, Slovakia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7554-4748","authenticated-orcid":false,"given":"Gregor","family":"Rozinaj","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, 84104 Bratislava, Slovakia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3625-8199","authenticated-orcid":false,"given":"Javlon","family":"Tursunov","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, 84104 Bratislava, Slovakia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7973-0123","authenticated-orcid":false,"given":"Ivan","family":"Min\u00e1rik","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, 84104 Bratislava, Slovakia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3929-0183","authenticated-orcid":false,"given":"Marek","family":"Vanco","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, 84104 Bratislava, Slovakia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0617-5237","authenticated-orcid":false,"given":"Radoslav","family":"Vargic","sequence":"additional","affiliation":[{"name":"Faculty of Electrical Engineering and Information Technology, Slovak University of Technology in Bratislava, 84104 Bratislava, Slovakia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2026,4,8]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zhou, Y., Guo, H., Fu, R., Liang, G., Wang, C., and Wu, X. (2015). 3D reconstruction based on light field information. 2015 IEEE International Conference on Information and Automation, Lijiang, China, IEEE.","DOI":"10.1109\/ICInfA.2015.7279428"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Iwane, T. (2016). Light field display and 3D image reconstruction. Three-Dimensional Imaging, Visualization, and Display, Baltimore, MD, USA, SPIE.","DOI":"10.1117\/12.2227081"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Guo, H., Peng, S., Lin, H., Wang, Q., Zhang, G., Bao, H., and Zhou, X. (2022). Neural 3D Scene Reconstruction with the Manhattan-world Assumption. arXiv.","DOI":"10.1109\/CVPR52688.2022.00543"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"12","DOI":"10.1186\/s13640-024-00628-1","article-title":"Learning-based light field imaging: An overview","volume":"2024","author":"Mahmoudpour","year":"2024","journal-title":"J. Image Video Process."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Murez, Z., van As, T., Bartolozzi, J., Sinha, A., Badrinarayanan, V., and Rabinovich, A. (2020). Atlas: End-to-End 3D Scene Reconstruction from Posed Images. arXiv.","DOI":"10.1007\/978-3-030-58571-6_25"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., and Ng, R. (2020). NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. arXiv.","DOI":"10.1007\/978-3-030-58452-8_24"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Horn\u00e1\u010dek, M., and Rozinaj, G. (2024). Exploring 3D Gaussian Splatting: An Algorithmic Perspective. 2024 International Symposium ELMAR, IEEE.","DOI":"10.1109\/ELMAR62909.2024.10693978"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1145\/3592433","article-title":"3D Gaussian Splatting for Real-Time Radiance Field Rendering","volume":"42","author":"Kerbl","year":"2023","journal-title":"ACM Trans. Graph."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"193","DOI":"10.1145\/2980179.2980251","article-title":"Learning-based view synthesis for light field cameras","volume":"35","author":"Kalantari","year":"2016","journal-title":"ACM Trans. Graph."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"305","DOI":"10.1111\/j.1467-8659.2012.03009.x","article-title":"Unstructured Light Fields","volume":"31","author":"Davis","year":"2012","journal-title":"Comput. Graph. Forum"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Johannsen, O., Sulc, A., and Goldluecke, B. (2016). What Sparse Light Field Coding Reveals about Scene Structure. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, IEEE.","DOI":"10.1109\/CVPR.2016.355"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"108121","DOI":"10.1016\/j.sigpro.2021.108121","article-title":"Robust dense light field reconstruction from sparse noisy sampling","volume":"186","author":"Zhou","year":"2021","journal-title":"Signal Process."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Feng, W., Gao, J., Qu, T., Zhou, S., and Zhao, D. (2021). Three-Dimensional Reconstruction of Light Field Based on Phase Similarity. Sensors, 21.","DOI":"10.3390\/s21227734"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Huang, Z., Fessler, J.A., Norris, T.B., and Chun, I.Y. (2020). Light-Field Reconstruction and Depth Estimation from Focal Stack Images Using Convolutional Neural Networks. ICASSP 2020\u20142020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, IEEE.","DOI":"10.1109\/ICASSP40776.2020.9053586"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1007\/978-3-030-01231-1_9","article-title":"Fast Light Field Reconstruction with Deep Coarse-to-Fine Modeling of Spatial-Angular Clues","volume":"Volume 11210","author":"Ferrari","year":"2018","journal-title":"Computer Vision\u2014ECCV 2018"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1681","DOI":"10.1109\/TPAMI.2018.2845393","article-title":"Light Field Reconstruction Using Convolutional Network on EPI and Extended Applications","volume":"41","author":"Wu","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","first-page":"1162","article-title":"Light Field Super-Resolution using a Low-Rank Prior and Deep Convolutional Neural Networks","volume":"42","author":"Farrugia","year":"2019","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"873","DOI":"10.1109\/TPAMI.2019.2945027","article-title":"High-Dimensional Dense Residual Convolutional Neural Network for Light Field Reconstruction","volume":"43","author":"Meng","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Mildenhall, B., Srinivasan, P.P., Ortiz-Cayon, R., Kalantari, N.K., Ramamoorthi, R., Ng, R., and Kar, A. (2019). Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines. arXiv.","DOI":"10.1145\/3306346.3322980"},{"key":"ref_20","unstructured":"Deng, Y., Han, L., Lin, T., Li, L., Zhang, J., and Fang, L. (2023). RealLiFe: Real-Time Light Field Reconstruction via Hierarchical Sparse Gradient Descent. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Popov, S., Bauszat, P., and Ferrari, V. (2020). CoReNet: Coherent 3D scene reconstruction from a single RGB image. arXiv.","DOI":"10.1007\/978-3-030-58536-5_22"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"8486","DOI":"10.1109\/TPAMI.2024.3410032","article-title":"SSR-2D: Semantic 3D Scene Reconstruction from 2D Images","volume":"46","author":"Huang","year":"2024","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"7542","DOI":"10.1109\/TPAMI.2024.3393141","article-title":"NeuralRecon: Real-Time Coherent 3D Scene Reconstruction from Monocular Video","volume":"46","author":"Chen","year":"2024","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Min, C., Xiao, L., Zhao, D., Nie, Y., and Dai, B. (2023). UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving. arXiv.","DOI":"10.1109\/LRA.2024.3362635"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Sch\u00f6nberger, J.L., and Frahm, J.M. (2016). Structure-from-Motion Revisited. Conference on Computer Vision and Pattern Recognition (CVPR), IEEE.","DOI":"10.1109\/CVPR.2016.445"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Sch\u00f6nberger, J.L., Zheng, E., Pollefeys, M., and Frahm, J.M. (2016). Pixelwise View Selection for Unstructured Multi-View Stereo. European Conference on Computer Vision (ECCV), Springer International Publishing.","DOI":"10.1007\/978-3-319-46487-9_31"},{"key":"ref_27","unstructured":"(2026, April 01). The Stanford Bunny. The (New) Stanford Light Field Archive. Available online: https:\/\/faculty.cc.gatech.edu\/~turk\/bunny\/bunny.html."},{"key":"ref_28","unstructured":"Lengyel, E. (2012). Mathematics for 3D Game Programming and Computer Graphics, Course Technology. [3rd ed.]."},{"key":"ref_29","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., and Polosukhin, I. (2017). Attention Is All You Need. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"James, G., Witten, D., Hastie, T., and Tibshirani, R. (2021). An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics, Springer.","DOI":"10.1007\/978-1-0716-1418-1"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Mienye, I.D., Swart, T.G., and Obaido, G. (2024). Recurrent Neural Networks: A Comprehensive Review of Architectures, Variants, and Applications. Information, 15.","DOI":"10.20944\/preprints202408.0748.v1"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Jain, A., Tancik, M., and Abbeel, P. (2021). Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis. IEEE\/CVF International Conference on Computer Vision (ICCV), IEEE.","DOI":"10.1109\/ICCV48922.2021.00583"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. Image Process."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Venjarski, J., Tibensk\u00fd, \u0160., and Rozinaj, G. (2023). Analyzing Classical and LDI Depth-Aware Image Stitching for Enhanced Virtual View Representation. 2023 30th International Conference on Systems, Signals and Image Processing (IWSSIP), IEEE.","DOI":"10.1109\/IWSSIP58668.2023.10180238"}],"container-title":["Machine Learning and Knowledge Extraction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-4990\/8\/4\/94\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T04:12:14Z","timestamp":1775794334000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-4990\/8\/4\/94"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,8]]},"references-count":34,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2026,4]]}},"alternative-id":["make8040094"],"URL":"https:\/\/doi.org\/10.3390\/make8040094","relation":{},"ISSN":["2504-4990"],"issn-type":[{"value":"2504-4990","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4,8]]}}}