{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:59:23Z","timestamp":1760147963482,"version":"build-2065373602"},"reference-count":26,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T00:00:00Z","timestamp":1679011200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Natural Science Foundation of China","award":["U21A20515","61972459","61971418","U2003109","62171321","62071157","62162044"],"award-info":[{"award-number":["U21A20515","61972459","61971418","U2003109","62171321","62071157","62162044"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Traditional multi-view stereo (MVS) is not applicable for the point cloud reconstruction of serialized video frames. Among them, the exhausted feature extraction and matching for all the prepared frames are time-consuming, and the scope of the search requires covering all the key frames. In this paper, we propose a novel serialized reconstruction method to solve the above issues. Specifically, a joint feature descriptors-based covisibility cluster generation strategy is designed to accelerate the feature matching and improve the performance of the pose estimation. Then, a serialized structure-from-motion (SfM) and dense point cloud reconstruction framework is designed to achieve high efficiency and competitive precision reconstruction for serialized frames. To fully demonstrate the superiority of our method, we collect a public aerial sequences dataset with referable ground truth for the dense point cloud reconstruction evaluation. Through a time complexity analysis and the experimental validation in this dataset, the comprehensive performance of our algorithm is better than the other compared outstanding methods.<\/jats:p>","DOI":"10.3390\/rs15061625","type":"journal-article","created":{"date-parts":[[2023,3,17]],"date-time":"2023-03-17T04:10:48Z","timestamp":1679026248000},"page":"1625","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Accurate and Serialized Dense Point Cloud Reconstruction for Aerial Video Sequences"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4037-9900","authenticated-orcid":false,"given":"Shibiao","family":"Xu","sequence":"first","affiliation":[{"name":"School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bingbing","family":"Pan","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing 100090, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8212-1361","authenticated-orcid":false,"given":"Jiguang","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing 100090, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0092-6474","authenticated-orcid":false,"given":"Xiaopeng","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Automation, Chinese Academy of Sciences, Beijing 100090, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2023,3,17]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Lao, Y., Ait-Aider, O., and Bartoli, A. (2018, January 8\u201314). Rolling Shutter Pose and Ego-Motion Estimation Using Shape-from-Template. Proceedings of the ECCV, Munich, Germany.","DOI":"10.1007\/978-3-030-01216-8_29"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1040","DOI":"10.1109\/LRA.2021.3137520","article-title":"Burst imaging for light-constrained structure-from-motion","volume":"7","author":"Ravendran","year":"2021","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"1362","DOI":"10.1109\/TPAMI.2009.161","article-title":"Accurate, dense, and robust multiview stereopsis","volume":"32","author":"Furukawa","year":"2010","journal-title":"TPAMI"},{"key":"ref_4","unstructured":"Wu, C. (July, January 29). Towards Linear-Time Incremental Structure from Motion. Proceedings of the 3DV, Seattle, WA, USA."},{"key":"ref_5","unstructured":"Sch\u00f6nberger, J.L., and Frahm, J.M. (June, January 27). Structure-from-Motion Revisited. Proceedings of the CVPR, Las Vegas, NV, USA."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1016\/j.neucom.2020.02.044","article-title":"DRM-SLAM: Towards dense reconstruction of monocular SLAM with scene depth fusion","volume":"396","author":"Ye","year":"2020","journal-title":"Neurocomputing"},{"key":"ref_7","unstructured":"Hatem, I., and Yousif, Y. (2020). Robot Operating System (ROS), Springer."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Lan, Z., Yew, Z.J., and Lee, G.H. (2019). Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes. arXiv.","DOI":"10.1109\/CVPR.2019.00992"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Wang, K., Gao, F., and Shen, S. (2019). Real-time Scalable Dense Surfel Mapping. arXiv.","DOI":"10.1109\/ICRA.2019.8794101"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"108225","DOI":"10.1016\/j.patcog.2021.108225","article-title":"Blitz-SLAM: A semantic SLAM in dynamic environments","volume":"121","author":"Fan","year":"2022","journal-title":"Pattern Recognit."},{"key":"ref_11","unstructured":"Rosinol, A., Leonard, J.J., and Carlone, L. (2022). NeRF-SLAM: Real-Time Dense Monocular SLAM with Neural Radiance Fields. arXiv."},{"key":"ref_12","unstructured":"Chung, C.M., Tseng, Y.C., Hsu, Y.C., Shi, X.Q., Hua, Y.H., Yeh, J.F., Chen, W.C., Chen, Y.T., and Hsu, W.H. (2022). Orbeez-SLAM: A Real-time Monocular Visual SLAM with ORB Features and NeRF-realized Mapping. arXiv."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhu, Z., Peng, S., Larsson, V., Xu, W., Bao, H., Cui, Z., Oswald, M.R., and Pollefeys, M. (2022, January 18\u201324). Nice-slam: Neural implicit scalable encoding for slam. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01245"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Uhrig, J., Schneider, N., Schneider, L., Franke, U., Brox, T., and Geiger, A. (2017, January 10\u201312). Sparsity invariant cnns. Proceedings of the 2017 International Conference on 3D Vision (3DV), Qingdao, China.","DOI":"10.1109\/3DV.2017.00012"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"3429","DOI":"10.1109\/TIP.2019.2960589","article-title":"Hms-net: Hierarchical multi-scale sparsity-invariant network for sparse depth completion","volume":"29","author":"Huang","year":"2019","journal-title":"IEEE Trans. Image Process."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Cheng, X., Wang, P., and Yang, R. (2018, January 8\u201314). Depth estimation via affinity learned with convolutional spatial propagation network. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01270-0_7"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Fu, H., Gong, M., Wang, C., Batmanghelich, K., and Tao, D. (2018, January 18\u201322). Deep ordinal regression network for monocular depth estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00214"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chen, Z., Badrinarayanan, V., Drozdov, G., and Rabinovich, A. (2018, January 8\u201314). Estimating depth from rgb and sparse sensing. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01225-0_11"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Shivakumar, S.S., Nguyen, T., Miller, I.D., Chen, S.W., Kumar, V., and Taylor, C.J. (2019, January 27\u201330). Dfusenet: Deep fusion of rgb and sparse depth information for image guided dense depth completion. Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand.","DOI":"10.1109\/ITSC.2019.8917294"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ma, F., and Karaman, S. (2018, January 21\u201325). Sparse-to-dense: Depth prediction from sparse depth samples and a single image. Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia.","DOI":"10.1109\/ICRA.2018.8460184"},{"key":"ref_21","first-page":"2012","article-title":"Scale-invariant Feature Transform","volume":"7","author":"Miller","year":"2009","journal-title":"Scholarpedia"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Dusmanu, M., Rocco, I., Pajdla, T., Pollefeys, M., Sivic, J., Torii, A., and Sattler, T. (2019). D2-Net: A Trainable CNN for Joint Detection and Description of Local Features. arXiv.","DOI":"10.1109\/CVPR.2019.00828"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"76a","DOI":"10.1145\/3072959.3054739","article-title":"BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration","volume":"36","author":"Dai","year":"2017","journal-title":"ACM Trans. Graph."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1901","DOI":"10.1109\/TIP.2013.2237921","article-title":"Accurate Multiple View 3D Reconstruction Using Patch-Based Stereo for Large-Scale Scenes","volume":"22","author":"Shen","year":"2013","journal-title":"IEEE Trans. Image Process."},{"key":"ref_25","unstructured":"Bleyer, M., Rhemann, C., and Rother, C. (September, January 29). PatchMatch Stereo\u2014Stereo Matching with Slanted Support Windows. Proceedings of the BMVC, Dundee, UK."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/TGRS.2020.3042202","article-title":"RTSfM: Real-Time Structure From Motion for Mosaicing and DSM Mapping of Sequential Aerial Images With Low Overlap","volume":"60","author":"Zhao","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/6\/1625\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T18:57:26Z","timestamp":1760122646000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/6\/1625"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,3,17]]},"references-count":26,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2023,3]]}},"alternative-id":["rs15061625"],"URL":"https:\/\/doi.org\/10.3390\/rs15061625","relation":{},"ISSN":["2072-4292"],"issn-type":[{"type":"electronic","value":"2072-4292"}],"subject":[],"published":{"date-parts":[[2023,3,17]]}}}