{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:53:39Z","timestamp":1760151219938,"version":"build-2065373602"},"reference-count":33,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2022,3,3]],"date-time":"2022-03-03T00:00:00Z","timestamp":1646265600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Depth estimation for light field images is essential for applications such as light field image compression, perspective view reconstruction, and 3D reconstruction. Previous depth map estimation approaches do not capture sharp transitions around object boundaries due to occlusions, making many of the current approaches unreliable at depth discontinuities. This is especially the case for light field images because the pixels do not exhibit photo-consistency in the presence of occlusions. In this paper, we propose an algorithm to estimate the depth map for light field images using depth from defocus. Our approach compares defocus cues over a small patch of pixels in each focal stack image, allowing the algorithm to generate sharper depth boundaries. Then, in contrast to existing approaches that use defocus cues for depth estimation, we use frequency-domain image similarity checking to generate the depth map. Processing in the frequency domain reduces the individual pixel errors that occur while directly comparing RGB images, making the algorithm more resilient to noise. The algorithm has been evaluated on both a synthetic image dataset and real-world images in the JPEG dataset. 
Experimental results demonstrate that our proposed algorithm outperforms state-of-the-art depth estimation techniques for light field images, particularly in the case of noisy images.<\/jats:p>","DOI":"10.3390\/s22051993","type":"journal-article","created":{"date-parts":[[2022,3,3]],"date-time":"2022-03-03T20:36:30Z","timestamp":1646339790000},"page":"1993","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Noise-Resilient Depth Estimation for Light Field Images Using Focal Stack and FFT Analysis"],"prefix":"10.3390","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-3559-7885","authenticated-orcid":false,"given":"Rishabh","family":"Sharma","sequence":"first","affiliation":[{"name":"School of Electrical and Data Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2794-3178","authenticated-orcid":false,"given":"Stuart","family":"Perry","sequence":"additional","affiliation":[{"name":"School of Electrical and Data Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Eva","family":"Cheng","sequence":"additional","affiliation":[{"name":"School of Electrical and Data Engineering, University of Technology Sydney, Ultimo, NSW 2007, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,3,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Zhou, J., Yang, D., Cui, Z., Wang, S., and Sheng, H. (2021, January 1\u20133). LRFNet: An Occlusion Robust Fusion Network for Semantic Segmentation with Light Field. 
Proceedings of the 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI), Virtual.","DOI":"10.1109\/ICTAI52525.2021.00186"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Hu, X., Yang, K., Fei, L., and Wang, K. (2019, January 22\u201325). Acnet: Attention based network to exploit complementary features for rgbd semantic segmentation. Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan.","DOI":"10.1109\/ICIP.2019.8803025"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1023\/A:1014573219977","article-title":"A taxonomy and evaluation of dense two-frame stereo correspondence algorithms","volume":"47","author":"Scharstein","year":"2002","journal-title":"Int. J. Comput. Vis."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Chen, C., Lin, H., Yu, Z., Bing Kang, S., and Yu, J. (2014, January 23\u201328). Light field stereo matching using bilateral statistics of surface cameras. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.197"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"739","DOI":"10.1109\/TCSVT.2016.2555778","article-title":"Light-Field Depth Estimation via Epipolar Plane Image Analysis and Locally Linear Embedding","volume":"27","author":"Zhang","year":"2017","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1023\/A:1008175127327","article-title":"Depth from Defocus vs. Stereo: How Different Really Are They?","volume":"39","author":"Schechner","year":"2000","journal-title":"Int. J. Comput. Vis."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3586","DOI":"10.1109\/TIP.2018.2814217","article-title":"Benchmark Data Set and Method for Depth Estimation from Light Field Images","volume":"27","author":"Feng","year":"2018","journal-title":"IEEE Trans. 
Image Process."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Heber, S., Yu, W., and Pock, T. (2017, January 22\u201329). Neural EPI-Volume Networks for Shape from Light Field. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.247"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Strecke, M., Alperovich, A., and Goldluecke, B. (2017, January 22\u201325). Accurate depth and normal maps from occlusion-aware focal stack symmetry. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.271"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Wang, T., Efros, A.A., and Ramamoorthi, R. (2015, January 7\u201313). Occlusion-Aware Depth Estimation Using Light-Field Cameras. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.398"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"148","DOI":"10.1016\/j.cviu.2015.12.007","article-title":"Robust depth estimation for light field via spinning parallelogram operator","volume":"145","author":"Zhang","year":"2016","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Shin, C., Jeon, H.G., Yoon, Y., Kweon, I.S., and Kim, S.J. (2018, January 18\u201323). Epinet: A fully-convolutional neural network using epipolar geometry for depth from light field images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00499"},{"key":"ref_13","unstructured":"Ng, R. (2006). Digital Light Field Photography, Stanford University."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"926","DOI":"10.1109\/JSTSP.2017.2747126","article-title":"Light field image processing: An overview","volume":"11","author":"Wu","year":"2017","journal-title":"IEEE J. Sel. 
Top. Signal Process."},{"key":"ref_15","unstructured":"Mousnier, A., Vural, E., and Guillemot, C. (2015). Partial light field tomographic reconstruction from a fixed-camera focal stack. arXiv."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"765","DOI":"10.1145\/1073204.1073259","article-title":"High performance imaging using large camera arrays","volume":"24","author":"Wilburn","year":"2005","journal-title":"ACM Trans. Graph. TOG"},{"key":"ref_17","first-page":"1","article-title":"Light field photography with a hand-held plenoptic camera","volume":"2","author":"Ng","year":"2005","journal-title":"Comput. Sci. Tech. Rep. CSTR"},{"key":"ref_18","unstructured":"Levoy, M., and Hanrahan, P. (2021, January 9\u201313). Light field rendering. Proceedings of the 23rd annual Conference on Computer Graphics and Interactive Techniques, ACM, Virtual Event."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Kolmogorov, V., and Zabih, R. (2002, January 28\u201331). Multi-camera scene reconstruction via graph cuts. Proceedings of the European Conference on Computer Vision, Copenhagen, Denmark.","DOI":"10.1007\/3-540-47977-5_6"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"2115","DOI":"10.1109\/TPAMI.2009.131","article-title":"Global stereo reconstruction under second-order smoothness priors","volume":"31","author":"Woodford","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Bleyer, M., Rother, C., and Kohli, P. (2010, January 13\u201318). Surface stereo with soft segmentation. Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5539783"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Johannsen, O., Sulc, A., and Goldluecke, B. (2016, January 27\u201330). What Sparse Light Field Coding Reveals about Scene Structure. 
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.355"},{"key":"ref_23","first-page":"1","article-title":"Reconstructing reflective and transparent surfaces from epipolar plane images","volume":"Volume 8142","author":"Wanner","year":"2013","journal-title":"Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1016\/j.cviu.2004.06.001","article-title":"Extracting layers and analyzing their specular properties using epipolar-plane-image analysis","volume":"97","author":"Criminisi","year":"2005","journal-title":"Comput. Vis. Image Underst."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Wanner, S., and Goldluecke, B. (2012, January 16\u201321). Globally Consistent Depth Labeling of 4D Light Fields. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6247656"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"4879","DOI":"10.1109\/TIP.2013.2279316","article-title":"Estimating Spatially Varying Defocus Blur from a Single Image","volume":"22","author":"Zhu","year":"2013","journal-title":"IEEE Trans. Image Process."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"1852","DOI":"10.1016\/j.patcog.2011.03.009","article-title":"Defocus map estimation from a single image","volume":"44","author":"Zhuo","year":"2011","journal-title":"Pattern Recognit."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"824","DOI":"10.1109\/34.308479","article-title":"Shape from focus","volume":"16","author":"Nayar","year":"1994","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Tao, M.W., Hadap, S., Malik, J., and Ramamoorthi, R. (2013, January 1\u20138). 
Depth from Combining Defocus and Correspondence Using Light-Field Cameras. Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Australia.","DOI":"10.1109\/ICCV.2013.89"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Heber, S., and Pock, T. (2016, January 27\u201330). Convolutional Networks for Shape from Light Field. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.407"},{"key":"ref_31","unstructured":"(2021, December 18). 3dMD Laser Scanner. Available online: https:\/\/3dmd.com."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Honauer, K., Johannsen, O., Kondermann, D., and Goldluecke, B. (2016, January 20\u201324). A dataset and evaluation methodology for depth estimation on 4d light fields. Proceedings of the Asian Conference on Computer Vision, Taipei, Taiwan.","DOI":"10.1007\/978-3-319-54187-7_2"},{"key":"ref_33","unstructured":"Rerabek, M., and Ebrahimi, T. (2016, January 6\u20138). New light field image dataset. 
Proceedings of the 8th International Conference on Quality of Multimedia Experience (QoMEX), Lisbon, Portugal."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/5\/1993\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:31:33Z","timestamp":1760135493000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/5\/1993"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,3]]},"references-count":33,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2022,3]]}},"alternative-id":["s22051993"],"URL":"https:\/\/doi.org\/10.3390\/s22051993","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2022,3,3]]}}}