{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,1]],"date-time":"2026-04-01T14:46:20Z","timestamp":1775054780226,"version":"3.50.1"},"reference-count":51,"publisher":"MDPI AG","issue":"7","license":[{"start":{"date-parts":[[2019,3,31]],"date-time":"2019-03-31T00:00:00Z","timestamp":1553990400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Depth-based reconstruction of three-dimensional (3D) shape of objects is one of core problems in computer vision with a lot of commercial applications. However, the 3D scanning for point cloud-based video streaming is expensive and is generally unattainable to an average user due to required setup of multiple depth sensors. We propose a novel hybrid modular artificial neural network (ANN) architecture, which can reconstruct smooth polygonal meshes from a single depth frame, using a priori knowledge. The architecture of neural network consists of separate nodes for recognition of object type and reconstruction thus allowing for easy retraining and extension for new object types. We performed recognition of nine real-world objects using the neural network trained on the ShapeNetCore model dataset. The results evaluated quantitatively using the Intersection-over-Union (IoU), Completeness, Correctness and Quality metrics, and qualitative evaluation by visual inspection demonstrate the robustness of the proposed architecture with respect to different viewing angles and illumination conditions.<\/jats:p>","DOI":"10.3390\/s19071553","type":"journal-article","created":{"date-parts":[[2019,4,2]],"date-time":"2019-04-02T03:21:26Z","timestamp":1554175286000},"page":"1553","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":20,"title":["Reconstruction of 3D Object Shape Using Hybrid Modular Neural Network Architecture Trained on 3D Models from ShapeNetCore Dataset"],"prefix":"10.3390","volume":"19","author":[{"given":"Audrius","family":"Kulikajevas","sequence":"first","affiliation":[{"name":"Department of Multimedia Engineering, Kaunas University of Technology, 51368 Kaunas, Lithuania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2809-2213","authenticated-orcid":false,"given":"Rytis","family":"Maskeli\u016bnas","sequence":"additional","affiliation":[{"name":"Centre of Real Time Computer Systems, Kaunas University of Technology, 51368 Kaunas, Lithuania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9990-1084","authenticated-orcid":false,"given":"Robertas","family":"Dama\u0161evi\u010dius","sequence":"additional","affiliation":[{"name":"Department of Software Engineering, Kaunas University of Technology, 51368 Kaunas, Lithuania"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sanjay","family":"Misra","sequence":"additional","affiliation":[{"name":"Department of Electrical and Information Engineering, Covenant University, Ota 1023, Nigeria"},{"name":"Department of Computer Engineering, Atilim University, Ankara 06830, Turkey"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2019,3,31]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Fanini, B., Pagano, A., and Ferdani, D. (2018). A Novel Immersive VR Game Model for Recontextualization in Virtual Environments: The uVRModel. Multimodal Technol. Interact., 2.","DOI":"10.3390\/mti2020020"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Liao, B., Li, J., Ju, Z., and Ouyang, G. (July, January 30). Hand Gesture Recognition with Generalized Hough Transform and DC-CNN Using Realsense. Proceedings of the 2018 Eighth International Conference on Information Science and Technology (ICIST), Cordoba, Spain.","DOI":"10.1109\/ICIST.2018.8426125"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Chen, C., Yang, B., Song, S., Tian, M., Li, J., Dai, W., and Fang, L. (2018). Calibrate Multiple Consumer RGB-D Cameras for Low-Cost and Efficient 3D Indoor Mapping. Remote Sens., 10.","DOI":"10.3390\/rs10020328"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Jusas, V., Birvinskas, D., and Gahramanov, E. (2017). Methods and Tools of Digital Triage in Forensic Context: Survey and Future Directions. Symmetry, 9.","DOI":"10.3390\/sym9040049"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Haleem, A., and Javaid, M. (2018). 3D scanning applications in medical field: A literature-based review. Clin. Epidemiol. Glob. Health.","DOI":"10.1016\/j.cegh.2018.05.006"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Macher, H., Landes, T., and Grussenmeyer, P. (2017). From Point Clouds to Building Information Models: 3D Semi-Automatic Reconstruction of Indoors of Existing Buildings. Appl. Sci., 7.","DOI":"10.3390\/app7101030"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, L., Li, R., Shi, H., Sun, J., Zhao, L., Seah, H., Quah, C., and Tandianus, B. (2019). Multi-Channel Convolutional Neural Network Based 3D Object Detection for Indoor Robot Environmental Perception. Sensors, 19.","DOI":"10.3390\/s19040893"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1104","DOI":"10.3390\/rs3061104","article-title":"Heritage Recording and 3D Modeling with Photogrammetry and 3D Scanning","volume":"3","author":"Remondino","year":"2011","journal-title":"Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"1021","DOI":"10.1109\/ACCESS.2018.2886213","article-title":"Generative Adversarial Network-Based Method for Transforming Single RGB Image into 3D Point Cloud","volume":"7","author":"Chu","year":"2019","journal-title":"IEEE Access"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"3402","DOI":"10.1109\/LRA.2018.2852782","article-title":"Real-Time Fully Incremental Scene Understanding on Mobile Platforms","volume":"3","author":"Wald","year":"2018","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1540","DOI":"10.1109\/LRA.2017.2660769","article-title":"An Adaptable, Probabilistic, Next-Best View Algorithm for Reconstruction of Unknown 3-D Objects","volume":"2","author":"Daudelin","year":"2017","journal-title":"IEEE Robot. Autom. Lett."},{"key":"ref_12","first-page":"55","article-title":"Visual simultaneous localization and mapping: A survey","volume":"43","author":"Ascencio","year":"2012","journal-title":"Artif. Intell. Rev."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"625","DOI":"10.1111\/cgf.13386","article-title":"State of the Art on 3D Reconstruction with RGB-D Cameras","volume":"37","author":"Stotko","year":"2018","journal-title":"Comput. Graph. Forum"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Kutulakos, K.N., and Seitz, S.M. (1999, January 20\u201327). A theory of shape by space carving. Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece.","DOI":"10.1109\/ICCV.1999.791235"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Fan, H., Su, H., and Guibas, L.J. (2017, January 21\u201326). A Point Set Generation Network for 3D Object Reconstruction from a Single Image. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.264"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Li, C., Zia, M.Z., Tran, Q., Yu, X., Hager, G.D., and Chandraker, M. (2017, January 21\u201326). Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.49"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Yang, B., Rosa, S., Markham, A., Trigoni, N., and Wen, H. (arXiv, 2018). Dense 3D Object Reconstruction from a Single Depth View, arXiv.","DOI":"10.1109\/ICCVW.2017.86"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/MMUL.2012.24","article-title":"Microsoft Kinect Sensor and Its Effect","volume":"19","author":"Zhang","year":"2012","journal-title":"IEEE Multimed."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Keselman, L., Woodfill, J.I., Grunnet-Jepsen, A., and Bhowmik, A. (2017, January 21\u201326). Intel(R) RealSense(TM) Stereoscopic Depth Cameras. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.167"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1109\/TCDS.2018.2866587","article-title":"Canonical Correlation Analysis Regularization: An Effective Deep Multi-View Learning Baseline for RGB-D Object Recognition","volume":"11","author":"Tang","year":"2018","journal-title":"IEEE Trans. Cognit. Dev. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zhang, L., Sun, J., and Zheng, Q. (2018). 3D Point Cloud Recognition Based on a Multi-View Convolutional Neural Network. Sensors, 18.","DOI":"10.3390\/s18113681"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Tian, G., Liu, L., Ri, J., Liu, Y., and Sun, Y. (2019). ObjectFusion: An object detection and segmentation framework with RGB-D SLAM and convolutional neural networks. Neurocomputing.","DOI":"10.1016\/j.neucom.2019.01.088"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zeng, H., Yang, B., Wang, X., Liu, J., and Fu, D. (2019). RGB-D Object Recognition Using Multi-Modal Deep Neural Network and DS Evidence Theory. Sensors, 19.","DOI":"10.3390\/s19030529"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Oliveira, F.F., Souza, A.A.S., Fernandes, M.A.C., Gomes, R.B., and Goncalves, L.M.G. (2018). Efficient 3D Objects Recognition Using Multifoveated Point Clouds. Sensors, 18.","DOI":"10.3390\/s18072302"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1437","DOI":"10.3390\/s120201437","article-title":"Accuracy and Resolution of Kinect Depth Data for Indoor Mapping Applications","volume":"12","author":"Khoshelham","year":"2012","journal-title":"Sensors"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"4508","DOI":"10.1109\/JSEN.2017.2703829","article-title":"On the Performance of the Intel SR300 Depth Camera: Metrological and Critical Characterization","volume":"17","author":"Carfagni","year":"2017","journal-title":"IEEE Sens. J."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Stutz, D., and Geiger, A. (2018). Learning 3D Shape Completion Under Weak Supervision. Int. J. Comput. Vis.","DOI":"10.1007\/s11263-018-1126-y"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Wiles, O., and Zisserman, A. (2018). Learning to Predict 3D Surfaces of Sculptures from Single and Multiple Views. Int. J. Comput. Vis.","DOI":"10.1007\/s11263-018-1124-0"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"836","DOI":"10.1109\/TIP.2016.2621673","article-title":"Exploiting Depth From Single Monocular Images for Object Detection and Semantic Segmentation","volume":"26","author":"Cao","year":"2017","journal-title":"IEEE Trans. Image Process."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"2086","DOI":"10.1109\/TCSVT.2016.2555678","article-title":"Depth Estimation Using an Infrared Dot Projector and an Infrared Color Stereo Camera","volume":"27","author":"Hisatomi","year":"2017","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"3010","DOI":"10.1109\/TIP.2016.2552404","article-title":"Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition","volume":"25","author":"Du","year":"2016","journal-title":"IEEE Trans. Image Process."},{"key":"ref_32","unstructured":"Kingma, D.P., and Ba, J. (arXiv, 2015). Adam: A Method for Stochastic Optimization, arXiv."},{"key":"ref_33","unstructured":"Nair, V., and Hinton, G.E. (2010, January 21\u201324). Rectified Linear Units Improve Restricted Boltzmann Machines. Proceedings of the 27th International Conference on Machine Learning (ICML\u201910), Haifa, Israel."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2015, January 13\u201316). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.123"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"1469","DOI":"10.1109\/20.497526","article-title":"Generation of 3D isosurfaces by means of the marching cube algorithm","volume":"32","author":"Bartsch","year":"1996","journal-title":"IEEE Trans. Magn."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"339","DOI":"10.1145\/566654.566586","article-title":"Dual Contouring of Hermite Data","volume":"21","author":"Ju","year":"2002","journal-title":"ACM Trans. Graph."},{"key":"ref_37","unstructured":"Kainz, F., Bogart, R.R., and Hess, D.K. (2004). The OpenEXR Image File Format. GPU Gems: Programming Techniques, Tips and Tricks for Real-Time Graphics, Addison-Wesley Professional."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Pantaleoni, J. (2011, January 5\u20137). VoxelPipe: A programmable pipeline for 3D voxelization. Proceedings of the ACM SIGGRAPH Symposium on High Performance Graphics (HPG \u201911), Vancouver, BC, Canada.","DOI":"10.1145\/2018323.2018339"},{"key":"ref_39","first-page":"39","article-title":"Fast Ray-Triangle Intersections by Coordinate Transformation","volume":"5","author":"Baldwin","year":"2016","journal-title":"J. Comput. Graph. Techol."},{"key":"ref_40","unstructured":"Chang, A.X., Funkhouser, T.A., Guibas, L.J., Hanrahan, P., Huang, Q.X., Li, Z., Savarese, S., Savva, M., Song, S., and Su, H. (arXiv, 2015). ShapeNet: An Information-Rich 3D Model Repository, arXiv."},{"key":"ref_41","first-page":"282","article-title":"Detecting People Looking at Each Other in Videos","volume":"106","author":"Zisserman","year":"2013","journal-title":"Int. J. Comput. Vis."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1109\/JSTARS.2009.2012488","article-title":"A Comparison of Evaluation Techniques for Building Extraction From Airborne Laser Scanning","volume":"2","author":"Rutzinger","year":"2009","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"2012","DOI":"10.1109\/TNNLS.2017.2748585","article-title":"A Novel Pruning Algorithm for Smoothing Feedforward Neural Networks Based on Group Lasso Method","volume":"29","author":"Wang","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"57","DOI":"10.1109\/TNN.2004.836241","article-title":"A generalized growing and pruning RBF (GGAP-RBF) neural network for function approximation","volume":"16","author":"Huang","year":"2005","journal-title":"IEEE Trans. Neural Netw."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"574","DOI":"10.1016\/S0378-4371(00)00479-9","article-title":"Using genetic algorithms to select architecture of a feedforward artificial neural network","volume":"289","author":"Arifovic","year":"2001","journal-title":"Phys. A Stat. Mech. Its Appl."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Po\u0142ap, D., K\u0119sik, K., Wo\u017aniak, M., and Dama\u0161evi\u010dius, R. (2018). Parallel Technique for the Metaheuristic Algorithms Using Devoted Local Search and Manipulating the Solutions Space. Appl. Sci., 8.","DOI":"10.3390\/app8020293"},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Izadi, S., Kim, D., Hilliges, O., Molyneaux, D., Newcombe, R., Kohli, P., Shotton, J., Hodges, S., Freeman, D., and Davison, A. (2011, January 16\u201319). Kinectfusion: Real-time 3D reconstruction and interaction using a moving depth camera. Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology UIST, Santa Barbara, CA, USA.","DOI":"10.1145\/2047196.2047270"},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"1652","DOI":"10.1109\/TNNLS.2017.2677968","article-title":"Recurrent Neural Networks With Auxiliary Memory Units","volume":"29","author":"Wang","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"243","DOI":"10.1109\/TCBB.2005.44","article-title":"The applicability of recurrent neural networks for biological sequence analysis","volume":"2","author":"Hawkins","year":"2005","journal-title":"IEEE\/ACM Trans. Comput. Biol. Bioinform."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Liu, Z., Zhao, C., Wu, X., and Chen, W. (2017). An Effective 3D Shape Descriptor for Object Recognition with RGB-D Sensors. Sensors, 17.","DOI":"10.3390\/s17030451"},{"key":"ref_51","doi-asserted-by":"crossref","first-page":"2110","DOI":"10.1109\/TIFS.2014.2361028","article-title":"RGB-D-Based Face Reconstruction and Recognition","volume":"9","author":"Hsu","year":"2014","journal-title":"IEEE Trans. Inf. Forensics Secur."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/19\/7\/1553\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:41:51Z","timestamp":1760186511000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/19\/7\/1553"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,3,31]]},"references-count":51,"journal-issue":{"issue":"7","published-online":{"date-parts":[[2019,4]]}},"alternative-id":["s19071553"],"URL":"https:\/\/doi.org\/10.3390\/s19071553","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,3,31]]}}}