{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,23]],"date-time":"2026-04-23T17:46:13Z","timestamp":1776966373991,"version":"3.51.4"},"reference-count":38,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2022,9,27]],"date-time":"2022-09-27T00:00:00Z","timestamp":1664236800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100012166","name":"National Key R &amp; D Program of China","doi-asserted-by":"publisher","award":["2019YFB1405600"],"award-info":[{"award-number":["2019YFB1405600"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key R &amp; D Program of China","doi-asserted-by":"publisher","award":["AR2201"],"award-info":[{"award-number":["AR2201"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key R &amp; D Program of China","doi-asserted-by":"publisher","award":["AR2209"],"award-info":[{"award-number":["AR2209"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Fundamental Research Funds for Chinese Academy of Surveying and Mapping","award":["2019YFB1405600"],"award-info":[{"award-number":["2019YFB1405600"]}]},{"name":"Fundamental Research Funds for Chinese Academy of Surveying and Mapping","award":["AR2201"],"award-info":[{"award-number":["AR2201"]}]},{"name":"Fundamental Research Funds for Chinese Academy of Surveying and Mapping","award":["AR2209"],"award-info":[{"award-number":["AR2209"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>The automatic 3D reconstruction of indoor scenes is of great significance in the application of 3D-scene understanding. The existing methods have poor resilience to the incomplete and noisy point cloud, which leads to low-quality results and tedious post-processing. Therefore, the objective of this work is to automatically reconstruct indoor scenes from an incomplete and noisy point-cloud base on semantics and primitives. In this paper, we propose a semantics-and-primitives-guided indoor 3D reconstruction method. Firstly, a local, fully connected graph neural network is designed for semantic segmentation. Secondly, based on the enumerable features of indoor scenes, a primitive-based reconstruction method is proposed, which retrieves the most similar model in a 3D-ESF indoor model library by using ESF descriptors and semantic labels. Finally, a coarse-to-fine registration method is proposed to register the model into the scene. The results indicate that our method can achieve high-quality results while remaining better resilience to the incompleteness and noise of point cloud. It is concluded that the proposed method is practical and is able to automatically reconstruct the indoor scene from the point cloud with incompleteness and noise.<\/jats:p>","DOI":"10.3390\/rs14194820","type":"journal-article","created":{"date-parts":[[2022,9,28]],"date-time":"2022-09-28T03:30:37Z","timestamp":1664335837000},"page":"4820","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Semantics-and-Primitives-Guided Indoor 3D Reconstruction from Point Clouds"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6117-2944","authenticated-orcid":false,"given":"Tengfei","family":"Wang","sequence":"first","affiliation":[{"name":"Institute of Photogrammetry and Remote Sensing, Chinese Academy of Surveying and Mapping (CASM), Beijing 100036, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Qingdong","family":"Wang","sequence":"additional","affiliation":[{"name":"Institute of Photogrammetry and Remote Sensing, Chinese Academy of Surveying and Mapping (CASM), Beijing 100036, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Haibin","family":"Ai","sequence":"additional","affiliation":[{"name":"Institute of Photogrammetry and Remote Sensing, Chinese Academy of Surveying and Mapping (CASM), Beijing 100036, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Li","family":"Zhang","sequence":"additional","affiliation":[{"name":"Institute of Photogrammetry and Remote Sensing, Chinese Academy of Surveying and Mapping (CASM), Beijing 100036, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2022,9,27]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Izadi, S., Kim, D., Hilliges, O., Molyneaux, D., Newcombe, R., Kohli, P., Shotton, J., Hodges, S., Freeman, D., and Davison, A. (2011, January 16\u201319). Kinect Fusion: Real-time 3D reconstruction and interaction using a moving depth camera. Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, Santa Barbara, CA, USA.","DOI":"10.1145\/2047196.2047270"},{"key":"ref_2","unstructured":"Whelan, T., Kaess, M., Fallon, M., Johannsson, H., Leonard, J., and McDonald, J. (2012). Kintinuous: Spatially Extended KinectFusion. CSAIL Tech. Rep., Available online: https:\/\/dspace.mit.edu\/handle\/1721.1\/71756."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Whelan, T., Leutenegger, S., Salas-Moreno, R., Glocker, B., and Davison, A. (2015). ElasticFusion: Dense SLAM without a pose graph. Robot. Sci. Syst.","DOI":"10.15607\/RSS.2015.XI.001"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"68","DOI":"10.1016\/j.autcon.2014.02.021","article-title":"Productive modeling for development of as-built BIM of existing indoor structures","volume":"42","author":"Jung","year":"2014","journal-title":"Autom. Constr."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.autcon.2015.04.001","article-title":"Automatic BIM component extraction from point clouds of existing buildings for sustainability applications","volume":"56","author":"Wang","year":"2015","journal-title":"Autom. Constr."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1842","DOI":"10.1109\/LGRS.2016.2614749","article-title":"An Efficient Planar Feature Fitting Method Using Point Cloud Simplification and Threshold-Independent BaySAC","volume":"13","author":"Kang","year":"2016","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Poux, F., Neuville, R., Nys, G.-A., and Billen, R. (2018). 3D Point Cloud Semantic Modelling: Integrated Framework for Indoor Spaces and Furniture. Remote Sens., 10.","DOI":"10.3390\/rs10091412"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2366145.2366156","article-title":"A search-classify approach for cluttered indoor scene understanding","volume":"31","author":"Nan","year":"2012","journal-title":"ACM Trans. Graph."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Xu, K., Li, H., Zhang, H., Cohen-Or, D., Xiong, Y., and Cheng, Z.-Q. (2010). Style-content separation by anisotropic part scales. Proceedings of the ACM SIGGRAPH Asia 2010 Papers, Association for Computing Machinery.","DOI":"10.1145\/1882262.1866206"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1541","DOI":"10.1109\/LGRS.2015.2412535","article-title":"Model-Driven Reconstruction of 3-D Buildings Using LiDAR Data","volume":"12","author":"Zheng","year":"2015","journal-title":"IEEE Geosci. Remote Sens. Lett."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., and Jiang, Y.-G. (2018). Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_ECCV_2018\/html\/Nanyang_Wang_Pixel2Mesh_Generating_3D_ECCV_2018_paper.html.","DOI":"10.1007\/978-3-030-01252-6_4"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Gkioxari, G., Malik, J., and Johnson, J. (2019). Mesh R-CNN, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_ICCV_2019\/html\/Gkioxari_Mesh_R-CNN_ICCV_2019_paper.html.","DOI":"10.1109\/ICCV.2019.00988"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Wen, C., Zhang, Y., Li, Z., and Fu, Y. (2019). Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_ICCV_2019\/html\/Wen_Pixel2Mesh_Multi-View_3D_Mesh_Generation_via_Deformation_ICCV_2019_paper.html.","DOI":"10.1109\/ICCV.2019.00113"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Park, J.J., Florence, P., Straub, J., Newcombe, R., and Lovegrove, S. (2019). DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_CVPR_2019\/html\/Park_DeepSDF_Learning_Continuous_Signed_Distance_Functions_for_Shape_Representation_CVPR_2019_paper.html.","DOI":"10.1109\/CVPR.2019.00025"},{"key":"ref_15","unstructured":"Huang, J., and You, S. (2016, January 4\u20138). Point cloud labeling using 3D Convolutional Neural Network. Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancun, Mexico."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Maturana, D., and Scherer, S. (October, January 28). VoxNet: A 3D Convolutional Neural Network for real-time object recognition. Proceedings of the 2015 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany.","DOI":"10.1109\/IROS.2015.7353481"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Tchapmi, L., Choy, C., Armeni, I., Gwak, J., and Savarese, S. (2017, January 10\u201312). SEGCloud: Semantic Segmentation of 3D Point Clouds. Proceedings of the 2017 International Conference on 3D Vision (3DV), Qingdao, China.","DOI":"10.1109\/3DV.2017.00067"},{"key":"ref_18","unstructured":"Qi, C.R., Su, H., Mo, K., and Guibas, L.J. (2017). PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_cvpr_2017\/html\/Qi_PointNet_Deep_Learning_CVPR_2017_paper.html."},{"key":"ref_19","unstructured":"Qi, C.R., Yi, L., Su, H., and Guibas, L.J. (2017). PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. Proceedings of the Advances in Neural Information Processing Systems, Curran Associates, Inc.. Available online: https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/d8bf84be3800d12f74d8b05e9b89836f-Abstract.html."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Jiang, M., Wu, Y., Zhao, T., Zhao, Z., and Lu, C. (2018). PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation. arXiv.","DOI":"10.1109\/IGARSS.2019.8900102"},{"key":"ref_21","unstructured":"Li, Y., Bu, R., Sun, M., Wu, W., Di, X., and Chen, B. (2018). PointCNN: Convolution On X-Transformed Points. Proceedings of the Advances in Neural Information Processing Systems, Curran Associates, Inc.. Available online: https:\/\/proceedings.neurips.cc\/paper\/2018\/hash\/f5f8590cd58a54e94377e6ae2eded4d9-Abstract.html."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1109\/TNN.2008.2005605","article-title":"The Graph Neural Network Model","volume":"20","author":"Scarselli","year":"2009","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_23","unstructured":"Kipf, T.N., and Welling, M. (2017). Semi-Supervised Classification with Graph Convolutional Networks. arXiv."},{"key":"ref_24","first-page":"1","article-title":"Dynamic Graph CNN for Learning on Point Clouds","volume":"38","author":"Wang","year":"2019","journal-title":"ACM Trans. Graph."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Landrieu, L., and Simonovsky, M. (2018). Large-Scale Point Cloud Semantic Segmentation With Superpoint Graphs, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_cvpr_2018\/html\/Landrieu_Large-Scale_Point_Cloud_CVPR_2018_paper.html.","DOI":"10.1109\/CVPR.2018.00479"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Hu, Q., Yang, B., Xie, L., Rosa, S., Guo, Y., Wang, Z., Trigoni, N., and Markham, A. (2020). RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_CVPR_2020\/html\/Hu_RandLA-Net_Efficient_Semantic_Segmentation_of_Large-Scale_Point_Clouds_CVPR_2020_paper.html.","DOI":"10.1109\/CVPR42600.2020.01112"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wohlkinger, W., and Vincze, M. (2010, January 7\u201311). Ensemble of shape functions for 3D object classification. Proceedings of the 2011 IEEE International Conference on Robotics and Biomimetics, Phuket, Thailand.","DOI":"10.1109\/ROBIO.2011.6181760"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wu, W., Qi, Z., and Fuxin, L. (2019). PointConv: Deep Convolutional Networks on 3D Point Clouds, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_CVPR_2019\/html\/Wu_PointConv_Deep_Convolutional_Networks_on_3D_Point_Clouds_CVPR_2019_paper.html.","DOI":"10.1109\/CVPR.2019.00985"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Liu, Y., Fan, B., Xiang, S., and Pan, C. (2019). Relation-Shape Convolutional Neural Network for Point Cloud Analysis, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_CVPR_2019\/html\/Liu_Relation-Shape_Convolutional_Neural_Network_for_Point_Cloud_Analysis_CVPR_2019_paper.html.","DOI":"10.1109\/CVPR.2019.00910"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Rusu, R.B., Blodow, N., Marton, Z.C., and Beetz, M. (2008, January 22\u201326). Aligning point cloud views using persistent feature histograms. Proceedings of the 2008 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Nice, France.","DOI":"10.1109\/IROS.2008.4650967"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Li, G., Muller, M., Thabet, A., and Ghanem, B. (2019). DeepGCNs: Can GCNs Go as Deep as CNNs?, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_ICCV_2019\/html\/Li_DeepGCNs_Can_GCNs_Go_As_Deep_As_CNNs_ICCV_2019_paper.html.","DOI":"10.1109\/ICCV.2019.00936"},{"key":"ref_32","unstructured":"Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., and Xiao, J. (2015). 3D ShapeNets: A Deep Representation for Volumetric Shapes, Computer Vision Foundation. Available online: https:\/\/www.cv-foundation.org\/openaccess\/content_cvpr_2015\/html\/Wu_3D_ShapeNets_A_2015_CVPR_paper.html."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhong, Y. (October, January 27). Intrinsic shape signatures: A shape descriptor for 3D object recognition. Proceedings of the 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, Kyoto, Japan.","DOI":"10.1109\/ICCVW.2009.5457637"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Rusu, R.B., Blodow, N., and Beetz, M. (2009, January 12\u201317). Fast Point Feature Histograms (FPFH) for 3D registration. Proceedings of the 2009 IEEE International Conference on Robotics and Automation, Kobe, Japan.","DOI":"10.1109\/ROBOT.2009.5152473"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Armeni, I., Sener, O., Zamir, A.R., Jiang, H., Brilakis, I., Fischer, M., and Savarese, S. (2016). 3D Semantic Parsing of Large-Scale Indoor Spaces, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_cvpr_2016\/html\/Armeni_3D_Semantic_Parsing_CVPR_2016_paper.html.","DOI":"10.1109\/CVPR.2016.170"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., and Niessner, M. (2017). ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_cvpr_2017\/html\/Dai_ScanNet_Richly-Annotated_3D_CVPR_2017_paper.html.","DOI":"10.1109\/CVPR.2017.261"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Zhao, H., Jiang, L., Fu, C.-W., and Jia, J. (2019). PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_CVPR_2019\/html\/Zhao_PointWeb_Enhancing_Local_Neighborhood_Features_for_Point_Cloud_Processing_CVPR_2019_paper.html.","DOI":"10.1109\/CVPR.2019.00571"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Choy, C., Gwak, J., and Savarese, S. (2019). 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks, Computer Vision Foundation. Available online: https:\/\/openaccess.thecvf.com\/content_CVPR_2019\/html\/Choy_4D_Spatio-Temporal_ConvNets_Minkowski_Convolutional_Neural_Networks_CVPR_2019_paper.html.","DOI":"10.1109\/CVPR.2019.00319"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/19\/4820\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T00:40:25Z","timestamp":1760143225000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/19\/4820"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,27]]},"references-count":38,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2022,10]]}},"alternative-id":["rs14194820"],"URL":"https:\/\/doi.org\/10.3390\/rs14194820","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,27]]}}}