{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,11]],"date-time":"2026-02-11T19:02:31Z","timestamp":1770836551452,"version":"3.50.1"},"reference-count":43,"publisher":"MDPI AG","issue":"24","license":[{"start":{"date-parts":[[2024,12,12]],"date-time":"2024-12-12T00:00:00Z","timestamp":1733961600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Since conventional photogrammetric approaches struggle with low-texture, reflective, and transparent regions, this study explores the application of Neural Radiance Fields (NeRFs) for large-scale 3D reconstruction of outdoor scenes, since NeRF-based methods have recently shown very impressive results in these areas. We evaluate three approaches: Mega-NeRF, Block-NeRF, and Direct Voxel Grid Optimization, focusing on their accuracy and completeness compared to ground truth point clouds. In addition, we analyze the effects of using multiple sub-modules, estimating the visibility by an additional neural network and varying the density threshold for the extraction of the point cloud. For performance evaluation, we use benchmark datasets that correspond to the setting of standard flight campaigns and therefore typically have nadir camera perspective and relatively little image overlap, which can be challenging for NeRF-based approaches that are typically trained with significantly more images and varying camera angles. We show that despite lower quality compared to classic photogrammetric approaches, NeRF-based reconstructions provide visually convincing results in challenging areas. 
Furthermore, our study shows that in particular increasing the number of sub-modules and predicting the visibility using an additional neural network improves the quality of the resulting reconstructions significantly.<\/jats:p>","DOI":"10.3390\/rs16244655","type":"journal-article","created":{"date-parts":[[2024,12,12]],"date-time":"2024-12-12T09:38:15Z","timestamp":1733996295000},"page":"4655","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Leveraging Neural Radiance Fields for Large-Scale 3D Reconstruction from Aerial Imagery"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0009-0002-6628-2381","authenticated-orcid":false,"given":"Max","family":"Hermann","sequence":"first","affiliation":[{"name":"Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany"},{"name":"Fraunhofer IOSB, 76131 Karlsruhe, Germany"}]},{"given":"Hyovin","family":"Kwak","sequence":"additional","affiliation":[{"name":"Fraunhofer IOSB, 76131 Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2627-3202","authenticated-orcid":false,"given":"Boitumelo","family":"Ruf","sequence":"additional","affiliation":[{"name":"Fraunhofer IOSB, 76131 Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8654-7546","authenticated-orcid":false,"given":"Martin","family":"Weinmann","sequence":"additional","affiliation":[{"name":"Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany"}]}],"member":"1968","published-online":{"date-parts":[[2024,12,12]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1362","DOI":"10.1109\/TPAMI.2009.161","article-title":"Accurate, dense, and robust multiview stereopsis","volume":"32","author":"Furukawa","year":"2009","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Galliani, S., Lasinger, K., and Schindler, K. (2015, January 7\u201313). Massively Parallel Multiview Stereopsis by Surface Normal Diffusion. Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile.","DOI":"10.1109\/ICCV.2015.106"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Goesele, M., Snavely, N., Curless, B., Hoppe, H., and Seitz, S.M. (2007, January 14\u201321). Multi-View Stereo for Community Photo Collections. Proceedings of the IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil.","DOI":"10.1109\/ICCV.2007.4408933"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Sch\u00f6nberger, J.L., Zheng, E., Pollefeys, M., and Frahm, J.M. (2016, January 11\u201314). Pixelwise View Selection for Unstructured Multi-View Stereo. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands.","DOI":"10.1007\/978-3-319-46487-9_31"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Xu, Q., and Tao, W. (2019, January 15\u201320). Multi-Scale Geometric Consistency Guided Multi-View Stereo. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00563"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Xu, Q., and Tao, W. (2020, January 7\u201312). Planar Prior Assisted PatchMatch Multi-View Stereo. Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA.","DOI":"10.1609\/aaai.v34i07.6940"},{"key":"ref_7","first-page":"4945","article-title":"Multi-Scale Geometric Consistency Guided and Planar Prior Assisted Multi-View Stereo","volume":"45","author":"Xu","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. 
Intell."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"99","DOI":"10.1145\/3503250","article-title":"Nerf: Representing scenes as neural radiance fields for view synthesis","volume":"65","author":"Mildenhall","year":"2021","journal-title":"Commun. ACM"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Barron, J.T., Mildenhall, B., Tancik, M., Hedman, P., Martin-Brualla, R., and Srinivasan, P.P. (2021, January 11\u201317). Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields. Proceedings of the IEEE International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00580"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Sun, C., Sun, M., and Chen, H.T. (2022, January 18\u201324). Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00538"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Sch\u00f6nberger, J.L., and Frahm, J.M. (2016, January 27\u201330). Structure-from-Motion Revisited. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.445"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Furukawa, Y., Curless, B., Seitz, S.M., and Szeliski, R. (2010, January 13\u201318). Towards internet-scale multi-view stereo. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA.","DOI":"10.1109\/CVPR.2010.5539802"},{"key":"ref_13","unstructured":"Han, X., Leung, T., Jia, Y., Sukthankar, R., and Berg, A.C. (2015, January 7\u201312). MatchNet: Unifying feature and metric learning for patch-based matching. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA."},{"key":"ref_14","first-page":"2287","article-title":"Stereo matching by training a convolutional neural network to compare image patches","volume":"17","author":"Zbontar","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Hartmann, W., Galliani, S., Havlena, M., Van Gool, L., and Schindler, K. (2017, January 22\u201329). Learned multi-patch similarity. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.176"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1109\/TPAMI.2007.1166","article-title":"Stereo processing by semi-global matching and mutual information","volume":"30","author":"Hirschmueller","year":"2008","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Yao, Y., Luo, Z., Li, S., Fang, T., and Quan, L. (2018, January 8\u201314). MVSNet: Depth inference for unstructured multi-view stereo. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01237-3_47"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Huang, P.H., Matzen, K., Kopf, J., Ahuja, N., and Huang, J.B. (2018, January 18\u201323). DeepMVS: Learning multi-view stereopsis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00298"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Yao, Y., Luo, Z., Li, S., Shen, T., Fang, T., and Quan, L. (2019, January 15\u201320). Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00567"},{"key":"ref_20","first-page":"4748","article-title":"Cost Volume Pyramid Based Depth Inference for Multi-View Stereo","volume":"44","author":"Yang","year":"2022","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_21","first-page":"6000","article-title":"Attention is All you Need","volume":"30","author":"Vaswani","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Ranftl, R., Bochkovskiy, A., and Koltun, V. (2021, January 11\u201317). Vision transformers for dense prediction. Proceedings of the IEEE International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.01196"},{"key":"ref_23","unstructured":"Cao, C., Ren, X., and Fu, Y. (2023). MVSFormer: Multi-View Stereo by Learning Robust Image Features and Temperature-based Depth. Trans. Mach. Learn. Res."},{"key":"ref_24","unstructured":"Cao, C., Ren, X., and Fu, Y. (2024, January 7\u201311). MVSFormer++: Revealing the Devil in Transformer\u2019s Details for Multi-View Stereo. Proceedings of the International Conference on Learning Representations, Vienna, Austria."},{"key":"ref_25","unstructured":"Xie, J., Girshick, R., and Farhadi, A. (2016, October 11\u201314). Deep3D: Fully automatic 2D-to-3D video conversion with deep convolutional neural networks. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Flynn, J., Neulander, I., Philbin, J., and Snavely, N. (2016, January 27\u201330). DeepStereo: Learning to predict new views from the world\u2019s imagery. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.595"},{"key":"ref_27","unstructured":"Wang, Z., Li, L., Shen, Z., Shen, L., and Bo, L. (2023). 4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions. arXiv."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Xiangli, Y., Xu, L., Pan, X., Zhao, N., Rao, A., Theobalt, C., Dai, B., and Lin, D. (2022, January 23\u201327). BungeeNeRF: Progressive Neural Radiance Field for Extreme Multi-scale Scene Rendering. Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-19824-3_7"},{"key":"ref_29","first-page":"1","article-title":"Instant neural graphics primitives with a multiresolution hash encoding","volume":"41","author":"Evans","year":"2022","journal-title":"ACM Trans. Graph."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Chen, A., Xu, Z., Geiger, A., Yu, J., and Su, H. (2022, January 23\u201327). TensoRF: Tensorial Radiance Fields. Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel.","DOI":"10.1007\/978-3-031-19824-3_20"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Martin-Brualla, R., Radwan, N., Sajjadi, M.S.M., Barron, J.T., Dosovitskiy, A., and Duckworth, D. (2021, January 20\u201325). NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.00713"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Turki, H., Ramanan, D., and Satyanarayanan, M. (2022, January 18\u201324). Mega-NERF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01258"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Tancik, M., Casser, V., Yan, X., Pradhan, S., Mildenhall, B.P., Srinivasan, P., Barron, J.T., and Kretzschmar, H. (2022, January 18\u201324). Block-NeRF: Scalable Large Scene Neural View Synthesis. Proceedings of the Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.00807"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"89","DOI":"10.7763\/IJCTE.2011.V3.288","article-title":"Statistical normalization and back propagation for classification","volume":"3","author":"Jayalakshmi","year":"2011","journal-title":"Int. J. Comput. Theory Eng."},{"key":"ref_35","unstructured":"Zhang, K., Riegler, G., Snavely, N., and Koltun, V. (2020). NeRF++: Analyzing and Improving Neural Radiance Fields. arXiv."},{"key":"ref_36","unstructured":"Seitz, S.M., Curless, B., Diebel, J., Scharstein, D., and Szeliski, R. (2006, January 17\u201322). A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New York, NY, USA."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"100065","DOI":"10.1016\/j.ophoto.2024.100065","article-title":"Depth estimation and 3D reconstruction from UAV-borne imagery: Evaluation on the UseGeo dataset","volume":"13","author":"Hermann","year":"2024","journal-title":"ISPRS Open J. Photogramm. Remote Sens."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"600","DOI":"10.1109\/TIP.2003.819861","article-title":"Image quality assessment: From error visibility to structural similarity","volume":"13","author":"Wang","year":"2004","journal-title":"IEEE Trans. 
Image Process."},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Zhang, R., Isola, P., Efros, A.A., Shechtman, E., and Wang, O. (2018, January 18\u201323). The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00068"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"100070","DOI":"10.1016\/j.ophoto.2024.100070","article-title":"UseGeo\u2014A UAV-based multi-sensor dataset for geospatial research","volume":"13","author":"Nex","year":"2024","journal-title":"ISPRS Open J. Photogramm. Remote Sens."},{"key":"ref_41","first-page":"11","article-title":"The Hessigheim 3D (H3D) benchmark on semantic segmentation of high-resolution 3D point clouds and textured meshes from UAV LiDAR and Multi-View-Stereo","volume":"1","author":"Laupheimer","year":"2021","journal-title":"ISPRS Open J. Photogramm. Remote Sens."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3072959.3073599","article-title":"Tanks and temples: Benchmarking large-scale scene reconstruction","volume":"36","author":"Knapitsch","year":"2017","journal-title":"ACM Trans. Graph."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Lindenberger, P., Sarlin, P.E., Larsson, V., and Pollefeys, M. (2021, January 11\u201317). Pixel-perfect structure-from-motion with featuremetric refinement. 
Proceedings of the IEEE International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCV48922.2021.00593"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/24\/4655\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T16:53:42Z","timestamp":1760115222000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/16\/24\/4655"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,12,12]]},"references-count":43,"journal-issue":{"issue":"24","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["rs16244655"],"URL":"https:\/\/doi.org\/10.3390\/rs16244655","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,12,12]]}}}