{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T06:47:23Z","timestamp":1775198843095,"version":"3.50.1"},"reference-count":51,"publisher":"Springer Science and Business Media LLC","issue":"11","license":[{"start":{"date-parts":[[2022,9,15]],"date-time":"2022-09-15T00:00:00Z","timestamp":1663200000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,9,15]],"date-time":"2022-09-15T00:00:00Z","timestamp":1663200000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"European Research Council","award":["742989"],"award-info":[{"award-number":["742989"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Vis Comput"],"published-print":{"date-parts":[[2023,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>This work deals with the automatic 3D reconstruction of objects from frontal RGB images. This aims at a better understanding of the reconstruction of 3D objects from RGB images and their use in immersive virtual environments. We propose a complete workflow that can be easily adapted to almost any other family of rigid objects. To explain and validate our method, we focus on guitars. First, we detect and segment the guitars present in the image using semantic segmentation methods based on convolutional neural networks. In a second step, we perform the final 3D reconstruction of the guitar by warping the rendered depth maps of a fitted 3D template in 2D image space to match the input silhouette. We validated our method by obtaining guitar reconstructions from real input images and renders of all guitar models available in the ShapeNet database. Numerical results for different object families were obtained by computing standard mesh evaluation metrics such as Intersection over Union, Chamfer Distance, and the F-score. The results of this study show that our method can automatically generate high-quality 3D object reconstructions from frontal images using various segmentation and 3D reconstruction techniques.\n<\/jats:p>","DOI":"10.1007\/s00371-022-02669-x","type":"journal-article","created":{"date-parts":[[2022,9,15]],"date-time":"2022-09-15T19:03:26Z","timestamp":1663268606000},"page":"5421-5436","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["3D objects reconstruction from frontal images: an example with guitars"],"prefix":"10.1007","volume":"39","author":[{"given":"Alejandro","family":"Beacco","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3332-619X","authenticated-orcid":false,"given":"Jaime","family":"Gallego","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mel","family":"Slater","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,9,15]]},"reference":[{"key":"2669_CR1","doi-asserted-by":"crossref","unstructured":"Wen, C., Zhang, Y. Li, Z., Fu, Y.: Pixel2mesh++: Multi-view 3d mesh generation via deformation. In: IEEE Int. Conf. on Computer Vision, (2019), pp. 1042\u20131051. arXiv:1908.01491","DOI":"10.1109\/ICCV.2019.00113"},{"key":"2669_CR2","doi-asserted-by":"crossref","unstructured":"Gkioxari, G., Malik, J., Johnson, J.: Mesh r-cnn (2020). arXiv:1906.02739","DOI":"10.1109\/ICCV.2019.00988"},{"key":"2669_CR3","doi-asserted-by":"publisher","unstructured":"Beacco, A., Oliva, R., Cabreira, C., Gallego, J., Slater, M.: Disturbance and plausibility in a virtual rock concert: a pilot study. In: 2021 IEEE virtual reality and 3D user interfaces (VR), (2021), pp. 538\u2013545. https:\/\/doi.org\/10.1109\/VR50410.2021.00078","DOI":"10.1109\/VR50410.2021.00078"},{"issue":"4","key":"2669_CR4","doi-asserted-by":"publisher","first-page":"597","DOI":"10.1109\/TVCG.2013.29","volume":"19","author":"K Kilteni","year":"2013","unstructured":"Kilteni, K., Bergstrom, I., Slater, M.: Drumming in immersive virtual reality: the body shapes the way we play. IEEE Trans. Vis. Comput. Graphics 19(4), 597\u2013605 (2013). https:\/\/doi.org\/10.1109\/TVCG.2013.29","journal-title":"IEEE Trans. Vis. Comput. Graphics"},{"key":"2669_CR5","doi-asserted-by":"crossref","unstructured":"Chen, L., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: European conf. on computer vision (2018). arXiv:1802.02611","DOI":"10.1007\/978-3-030-01234-2_49"},{"key":"2669_CR6","unstructured":"Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation, CoRR abs\/1505.04597. arXiv:1505.04597"},{"key":"2669_CR7","unstructured":"Badrinarayanan, V., Kendall, A., Cipolla, R.: Segnet: A deep convolutional encoder-decoder architecture for image segmentation, CoRR abs\/1511.00561. arXiv:1511.00561"},{"key":"2669_CR8","unstructured":"Chollet, F.: Xception: Deep learning with depthwise separable convolutions. arXiv:1610.02357"},{"key":"2669_CR9","doi-asserted-by":"publisher","unstructured":"Tono, I., Gallego, J., Swiderska-Chadaj, Z., Slater, M.: Guitar segmentation in rgb images using convolutional neural networks, in: IEEE Int, Conf. on computational problems of electrical engineering, (2020), pp. 1\u20134. https:\/\/doi.org\/10.1109\/CPEE50798.2020.9238720","DOI":"10.1109\/CPEE50798.2020.9238720"},{"key":"2669_CR10","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.544","author":"B Zhou","year":"2017","unstructured":"Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., Torralba, A.: Scene parsing through ADE20K dataset. Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (2017). https:\/\/doi.org\/10.1109\/CVPR.2017.544","journal-title":"Proc. IEEE Conf. Comput. Vis. Pattern Recognit."},{"key":"2669_CR11","doi-asserted-by":"crossref","unstructured":"Gong, K., Liang, X., Li, Y., Chen, Y., Yang, M., Lin, L.: Instance-level human parsing via part grouping network. In: European conf. on computer vision, (2018), pp. 770\u2013785. arXiv:1808.00157","DOI":"10.1007\/978-3-030-01225-0_47"},{"key":"2669_CR12","doi-asserted-by":"publisher","unstructured":"Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: IEEE conf. on computer vision and pattern recognition, (2015), pp. 3431\u20133440. https:\/\/doi.org\/10.1109\/CVPR.2015.7298965","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"2669_CR13","unstructured":"Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition, arXiv e-prints (2014) arXiv:1409.1556"},{"key":"2669_CR14","unstructured":"Sun, K., Zhao, Y., Jiang, B., Cheng, T., Xiao, B., Liu, D., Mu, Y., Wang, X., Liu, W., Wang, J.: High-resolution representations for labeling pixels and regions. arXiv:1904.04514"},{"key":"2669_CR15","unstructured":"K.\u00a0Sun, B.\u00a0Xiao, D.\u00a0Liu, J.\u00a0Wang, Deep high-resolution representation learning for human pose estimation, CoRR arXiv:1902.09212"},{"issue":"5","key":"2669_CR16","doi-asserted-by":"publisher","first-page":"1578","DOI":"10.1109\/TPAMI.2019.2954885","volume":"43","author":"XF Han","year":"2019","unstructured":"Han, X.F., Laga, H., Bennamoun, M.: Image-based 3d object reconstruction: state-of-the-art and trends in the deep learning era. IEEE Trans. Pattern Anal. Mach. Intell. 43(5), 1578\u20131604 (2019). https:\/\/doi.org\/10.1109\/TPAMI.2019.2954885","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"issue":"4","key":"2669_CR17","doi-asserted-by":"publisher","first-page":"1227","DOI":"10.1007\/s00371-021-02151-0","volume":"38","author":"D Hepperle","year":"2022","unstructured":"Hepperle, D., Purps, C.F., Deuchler, J., W\u00f6lfel, M.: Aspects of visual avatar appearance: self-representation, display type, and uncanny valley. Vis. Comput. 38(4), 1227\u20131244 (2022)","journal-title":"Vis. Comput."},{"key":"2669_CR18","doi-asserted-by":"publisher","unstructured":"Choy, C.B., Xu, D., Gwak, J., Chen, K., Savarese, S.: 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction, in: European conf. on computer vision, Springer, (2016), pp. 628\u2013644. https:\/\/doi.org\/10.1007\/978-3-319-46484-8_38","DOI":"10.1007\/978-3-319-46484-8_38"},{"key":"2669_CR19","doi-asserted-by":"publisher","unstructured":"Wu, J., Zhang, C., Xue, T., Freeman, W.T., Tenenbaum, J.B.: Learning a probabilistic latent space of object shapes via 3d generative-adversarial modeling. In: Proceedings of the 30th international conference on neural information processing systems, (2016), pp. 82\u201390. https:\/\/doi.org\/10.5555\/3157096.3157106","DOI":"10.5555\/3157096.3157106"},{"key":"2669_CR20","doi-asserted-by":"crossref","unstructured":"Girdhar, R., Fouhey, D.F., Rodriguez, M., Gupta, A.: Learning a predictable and generative vector representation for objects. In: European conference on computer vision, Springer, (2016), pp. 484\u2013499. arXiv:1603.08637","DOI":"10.1007\/978-3-319-46466-4_29"},{"key":"2669_CR21","doi-asserted-by":"publisher","unstructured":"Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: learning 3d reconstruction in function space. In: IEEE\/CVF conference on computer vision and pattern recognition, (2019), pp. 4460\u20134470. https:\/\/doi.org\/10.1109\/CVPR.2019.00459","DOI":"10.1109\/CVPR.2019.00459"},{"key":"2669_CR22","unstructured":"Kato, H., Beker, D., Morariu, M., Ando, T., Matsuoka, T., Kehl, W., Gaidon, A.: Differentiable rendering: a survey. arXiv:2006.12057"},{"key":"2669_CR23","doi-asserted-by":"crossref","unstructured":"Jiang, Y., Ji, D., Han, Z., Zwicker, M.: Sdfdiff: Differentiable rendering of signed distance fields for 3d shape optimization. In: IEEE Conf. on computer vision and pattern recognition, (2020), pp. 1251\u20131261. arXiv:1912.07109","DOI":"10.1109\/CVPR42600.2020.00133"},{"key":"2669_CR24","unstructured":"Sitzmann, V., Zollh\u00f6fer, M., Wetzstein, G.: Scene representation networks: continuous 3d-structure-aware neural scene representations, arXiv preprint arXiv:1906.01618"},{"issue":"6","key":"2669_CR25","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2816795.2818071","volume":"34","author":"Y Li","year":"2015","unstructured":"Li, Y., Su, H., Qi, C.R., Fish, N., Cohen-Or, D., Guibas, L.J.: Joint embeddings of shapes and images via CNN image purification. Trans. graphics 34(6), 1\u201312 (2015). https:\/\/doi.org\/10.1145\/2816795.2818071","journal-title":"Trans. graphics"},{"issue":"4","key":"2669_CR26","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1109\/TPAMI.2016.2574713","volume":"39","author":"S Tulsiani","year":"2016","unstructured":"Tulsiani, S., Kar, A., Carreira, J., Malik, J.: Learning category-specific deformable 3d models for object reconstruction. Trans. Pattern Anal. Mach. Intell. 39(4), 719\u2013731 (2016). https:\/\/doi.org\/10.1109\/TPAMI.2016.2574713","journal-title":"Trans. Pattern Anal. Mach. Intell."},{"key":"2669_CR27","doi-asserted-by":"publisher","unstructured":"Kong, C., Lin, C.H., Lucey, S.: Using locally corresponding cad models for dense 3d reconstructions from a single image. In: IEEE conf. on computer vision and pattern recognition, (2017), pp. 4857\u20134865. https:\/\/doi.org\/10.1109\/CVPR.2017.594","DOI":"10.1109\/CVPR.2017.594"},{"key":"2669_CR28","unstructured":"Pontes, J.K., Kong, C., Eriksson, A., Fookes, C., Sridharan, S., Lucey, S.: Compact model representation for 3d reconstruction. In: Int. Conf. on 3D Vision, (2017), pp. 88\u201396. arXiv:1707.07360"},{"issue":"7","key":"2669_CR29","doi-asserted-by":"publisher","first-page":"1743","DOI":"10.1007\/s00371-020-01935-0","volume":"37","author":"Q-F Zou","year":"2021","unstructured":"Zou, Q.-F., Liu, L., Liu, Y.: Instance-level 3d shape retrieval from a single image by hybrid-representation-assisted joint embedding. Vis. Comput. 37(7), 1743\u20131756 (2021)","journal-title":"Vis. Comput."},{"key":"2669_CR30","doi-asserted-by":"crossref","unstructured":"Pan, J., Han, X., Chen, W., Tang, J., Jia K.: Deep mesh reconstruction from single rgb images via topology modification networks. In: IEEE int. conf. on computer vision, (2019), pp. 9964\u20139973. arXiv:1909.00321","DOI":"10.1109\/ICCV.2019.01006"},{"key":"2669_CR31","doi-asserted-by":"crossref","unstructured":"Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., Jiang, Y.G.: Pixel2mesh: generating 3d mesh models from single rgb images. In: European conf. on computer vision (ECCV), (2018), pp. 52\u201367. arXiv:1804.01654","DOI":"10.1007\/978-3-030-01252-6_4"},{"key":"2669_CR32","doi-asserted-by":"crossref","unstructured":"G\u00fcler, R., Neverova, N., Kokkinos, I.: Densepose: dense human pose estimation in the wild. In: IEEE CVPR 2018 Papers (2018). arXiv:1802.00434","DOI":"10.1109\/CVPR.2018.00762"},{"key":"2669_CR33","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE conf. on computer vision and pattern recognition, (2016), pp. 770\u2013778. arXiv:1512.03385","DOI":"10.1109\/CVPR.2016.90"},{"issue":"5","key":"2669_CR34","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1109\/38.946629","volume":"21","author":"E Reinhard","year":"2001","unstructured":"Reinhard, E., Adhikhmin, M., Gooch, B., Shirley, P.: Color transfer between images. IEEE Comput. Graphics Appl. 21(5), 34\u201341 (2001). https:\/\/doi.org\/10.1109\/38.946629","journal-title":"IEEE Comput. Graphics Appl."},{"key":"2669_CR35","doi-asserted-by":"crossref","unstructured":"Weng, C., Curless, B., Kemelmacher-Shlizerman, I.: Photo wake-up: 3d character animation from a single photo. In: IEEE conf. on computer vision and pattern recognition, (2019), pp. 5908\u20135917. arXiv:1812.02246","DOI":"10.1109\/CVPR.2019.00606"},{"key":"2669_CR36","doi-asserted-by":"publisher","unstructured":"Beacco, A., Gallego, J., Slater, M.: Automatic 3d character reconstruction from frontal and lateral monocular 2d rgb views. In: IEEE int. conf. on image processing (2020), pp. 2785\u20132789. https:\/\/doi.org\/10.1109\/ICIP40778.2020.9191091","DOI":"10.1109\/ICIP40778.2020.9191091"},{"issue":"2","key":"2669_CR37","doi-asserted-by":"publisher","first-page":"1134","DOI":"10.1103\/PhysRevA.33.1134","volume":"33","author":"AM Fraser","year":"1986","unstructured":"Fraser, A.M., Swinney, H.L.: Independent coordinates for strange attractors from mutual information. Phys. Rev. A 33(2), 1134 (1986). https:\/\/doi.org\/10.1103\/PhysRevA.33.1134","journal-title":"Phys. Rev. A"},{"issue":"1","key":"2669_CR38","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1016\/S1361-8415(01)80004-9","volume":"1","author":"WM Wells III","year":"1996","unstructured":"Wells, W.M., III., Viola, P., Atsumi, H., Nakajima, S., Kikinis, R.: Multi-modal volume registration by maximization of mutual information. Med. Image Anal. 1(1), 35\u201351 (1996). https:\/\/doi.org\/10.1016\/S1361-8415(01)80004-9","journal-title":"Med. Image Anal."},{"issue":"11","key":"2669_CR39","doi-asserted-by":"publisher","first-page":"1222","DOI":"10.1109\/34.969114","volume":"23","author":"Y Boykov","year":"2001","unstructured":"Boykov, Y., Veksler, O., Zabih, R.: Fast approximate energy minimization via graph cuts. IEEE Trans. Pattern Anal. Mach. Intell. 23(11), 1222\u20131239 (2001). https:\/\/doi.org\/10.1109\/34.969114","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"2669_CR40","doi-asserted-by":"publisher","unstructured":"Schaefer, S., McPhail, T., Warren, J.: Image deformation using moving least squares. In: ACM SIGGRAPH 2006 Papers, SIGGRAPH \u201906, Association for Computing Machinery, New York, NY, USA, (2006), p. 533\u2013540. https:\/\/doi.org\/10.1145\/1179352.1141920","DOI":"10.1145\/1179352.1141920"},{"key":"2669_CR41","unstructured":"Opara, A., Stachowiak, T.: More like this, please! texture synthesis and remixing from a single example. (2019). https:\/\/github.com\/EmbarkStudios\/texture-synthesis"},{"issue":"9","key":"2669_CR42","doi-asserted-by":"publisher","first-page":"1200","DOI":"10.1109\/TIP.2004.833105","volume":"13","author":"A Criminisi","year":"2004","unstructured":"Criminisi, A., Perez, P., Toyama, K.: Region filling and object removal by exemplar-based image inpainting. IEEE Trans. Image Process 13(9), 1200\u20131212 (2004). https:\/\/doi.org\/10.1109\/TIP.2004.833105","journal-title":"IEEE Trans. Image Process"},{"issue":"10","key":"2669_CR43","doi-asserted-by":"publisher","first-page":"3779","DOI":"10.1109\/TIP.2013.2261308","volume":"22","author":"O Le Meur","year":"2013","unstructured":"Le Meur, O., Ebdelli, M., Guillemot, C.: Hierarchical super-resolution-based inpainting. IEEE Trans. Image Process 22(10), 3779\u20133790 (2013). https:\/\/doi.org\/10.1109\/TIP.2013.2261308","journal-title":"IEEE Trans. Image Process"},{"key":"2669_CR44","unstructured":"Chang, A.X., Funkhouser, T., Guibas, L., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., et\u00a0al.: Shapenet: an information-rich 3d model repository, arXiv arXiv:1512.03012"},{"key":"2669_CR45","doi-asserted-by":"crossref","unstructured":"Tatarchenko, M., Richter, S.R., Ranftl, R., Li, Z., Koltun, V., Brox, T.: What do single-view 3d reconstruction networks learn? (2019). arXiv:1905.03678","DOI":"10.1109\/CVPR.2019.00352"},{"issue":"1145\/3072959","key":"2669_CR46","first-page":"3073599","volume":"10","author":"A Knapitsch","year":"2017","unstructured":"Knapitsch, A., Park, J., Zhou, Q., Koltun, V.: Tanks and temples: benchmarking large-scale scene reconstruction. ACM Trans. Graph. 10(1145\/3072959), 3073599 (2017)","journal-title":"ACM Trans. Graph."},{"issue":"2","key":"2669_CR47","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1109\/34.121791","volume":"14","author":"P Besl","year":"1992","unstructured":"Besl, P., McKay, N.D.: A method for registration of 3-d shapes. IEEE Trans. Pattern Anal. Mach. Intell. 14(2), 239\u2013256 (1992). https:\/\/doi.org\/10.1109\/34.121791","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"2669_CR48","doi-asserted-by":"crossref","unstructured":"Fan, H., Su, H., Guibas, L.: A point set generation network for 3d object reconstruction from a single image (2016). arXiv:1612.00603","DOI":"10.1109\/CVPR.2017.264"},{"key":"2669_CR49","unstructured":"Groueix, T., Fisher, M., Kim, V., Russell, B., Aubry, M.: Atlasnet: A papier-m\u00e2ch\u00e9 approach to learning 3d surface generation. arXiv:1802.05384"},{"key":"2669_CR50","unstructured":"Tatarchenko, M., Dosovitskiy, A., Brox, T.: Octree generating networks: Efficient convolutional architectures for high-resolution 3d outputs. arXiv:1703.09438"},{"key":"2669_CR51","unstructured":"Richter, S., Roth, S.: Matryoshka networks: Predicting 3d geometry via nested shape layers. arXiv:1804.10975"}],"container-title":["The Visual Computer"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00371-022-02669-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00371-022-02669-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00371-022-02669-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,27]],"date-time":"2023-10-27T15:04:36Z","timestamp":1698419076000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00371-022-02669-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,9,15]]},"references-count":51,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2023,11]]}},"alternative-id":["2669"],"URL":"https:\/\/doi.org\/10.1007\/s00371-022-02669-x","relation":{},"ISSN":["0178-2789","1432-2315"],"issn-type":[{"value":"0178-2789","type":"print"},{"value":"1432-2315","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,9,15]]},"assertion":[{"value":"1 September 2022","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 September 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}