{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,13]],"date-time":"2025-11-13T11:56:47Z","timestamp":1763035007386,"version":"build-2065373602"},"reference-count":45,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2018,3,2]],"date-time":"2018-03-02T00:00:00Z","timestamp":1519948800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Multi-spectral photometric stereo can recover pixel-wise surface normal from a single RGB image. The difficulty lies in that the intensity in each channel is the tangle of illumination, albedo and camera response; thus, an initial estimate of the normal is required in optimization-based solutions. In this paper, we propose to make a rough depth estimation using the deep convolutional neural network (CNN) instead of using depth sensors or binocular stereo devices. Since high-resolution ground-truth data is expensive to obtain, we designed a network and trained it with rendered images of synthetic 3D objects. We use the model to predict initial normal of real-world objects and iteratively optimize the fine-scale geometry in the multi-spectral photometric stereo framework. 
The experimental results illustrate the improvement of the proposed method compared with existing methods.<\/jats:p>","DOI":"10.3390\/s18030764","type":"journal-article","created":{"date-parts":[[2018,3,2]],"date-time":"2018-03-02T11:53:40Z","timestamp":1519991620000},"page":"764","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["Three-Dimensional Reconstruction from Single Image Base on Combination of CNN and Multi-Spectral Photometric Stereo"],"prefix":"10.3390","volume":"18","author":[{"given":"Liang","family":"Lu","sequence":"first","affiliation":[{"name":"College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China"}]},{"given":"Lin","family":"Qi","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China"}]},{"given":"Yisong","family":"Luo","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China"}]},{"given":"Hengchao","family":"Jiao","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China"}]},{"given":"Junyu","family":"Dong","sequence":"additional","affiliation":[{"name":"College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China"}]}],"member":"1968","published-online":{"date-parts":[[2018,3,2]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"4534","DOI":"10.1109\/JSEN.2017.2707522","article-title":"Depth Recovery for Kinect Sensor Using Contour-Guided Adaptive Morphology Filter","volume":"17","author":"Ti","year":"2017","journal-title":"IEEE Sens. 
J."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"141","DOI":"10.1016\/0004-3702(81)90023-0","article-title":"Numerical shape from shading and occluding boundaries","volume":"17","author":"Ikeuchi","year":"1981","journal-title":"Artif. Intell."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"815","DOI":"10.1109\/34.236247","article-title":"Shape from shading with a linear triangular element surface model","volume":"15","author":"Lee","year":"1993","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1117\/12.7972479","article-title":"Photometric method for determining surface orientation from multiple images","volume":"19","author":"Woodham","year":"1980","journal-title":"Opt. Eng."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Drew, M.S., and Kontsevich, L.L. (1994). Closed-form Attitude Determination under Spectrally Varying Illumination, Simon Fraser University, Centre for Systems Science. Technical Report CSS\/LCCR TR 94-02.","DOI":"10.1109\/CVPR.1994.323939"},{"key":"ref_6","unstructured":"Thomas, A., Ferrar, V., Leibe, B., Tuytelaars, T., Schiel, B., and van Gool, L. (2006, January 17\u201322). Towards multi-view object class detection. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New York, NY, USA."},{"key":"ref_7","unstructured":"Seitz, S.M., Curless, B., Diebel, J., Scharstein, D., and Szeliski, R. (2006, January 17\u201322). A comparison and evaluation of multiview stereo reconstruction algorithms. Proceedings of the IEEE Computer Society Conference on Computer vision and pattern recognition, New York, NY, USA."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1007\/BF00128525","article-title":"Epipolar-plane image analysis: An approach to determining structure from motion","volume":"1","author":"Bolles","year":"1987","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"835","DOI":"10.1145\/1141911.1141964","article-title":"Photo tourism: Exploring photo collections in 3d","volume":"25","author":"Snavely","year":"2006","journal-title":"ACM Trans. Graph."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1047","DOI":"10.1364\/JOSAA.11.001047","article-title":"Reconstruction of shape from shading in color images","volume":"11","author":"Kontsevich","year":"1994","journal-title":"JOSA A"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"3050","DOI":"10.1364\/JOSAA.11.003050","article-title":"Gradient and Curvature from Photometric Stereo Including Local Confidence Estimation","volume":"11","author":"Woodham","year":"1994","journal-title":"J. Opt. Soc. Am."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Tsiotsios, C., Angelopoulou, M., Kim, T.K., and Davison, A. (2014, January 23\u201328). Backscatter Compensated Photometric Stereo with 3 Sources. Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.289"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Anderson, R., Stenger, B., and Cipolla, R. (2011, January 6\u201313). Color Photometric Stereo for Multicolored Surfaces. Proceedings of the 2011 IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain.","DOI":"10.1109\/ICCV.2011.6126495"},{"key":"ref_14","unstructured":"Decker, D., Kautz, J., Mertens, T., and Bekaert, P. (2009, January 20\u201325). Capturing multiple illumination conditions using time and color multiplexing. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, Miami, FL, USA."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kim, H., Wilburn, B., and Ben-Ezra, M. (2010). Photometric Stereo for Dynamic Surface Orientations. 
Computer Vision \u2014ECCV 2010, Springer.","DOI":"10.1007\/978-3-642-15549-9_5"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Janko, Z., Delaunoy, A., and Prados, E. (2010). Colour dynamic photometric stereo for textured surfaces. Computer Vision\u2014ACCV 2010, Springer.","DOI":"10.1007\/978-3-642-19309-5_5"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Hernandez, C., Vogiatzis, G., Brostow, G.J., Stenger, B., and Cipolla, R. (2007, January 14\u201321). Non-rigid Photometric Stereo with Colored Lights. Proceedings of the IEEE 11th International Conference on Computer Vision, Rio de Janeiro, Brazil.","DOI":"10.1109\/ICCV.2007.4408939"},{"key":"ref_18","unstructured":"Narasimhan, S., and Nayar, S. (2005, January 18\u201323). Structured Light Methods for Underwater Imaging: Light Stripe Scanning and Photometric Stereo. Proceedings of the MTS\/IEEE Oceans, Washington, DC, USA."},{"key":"ref_19","unstructured":"Velikhov, E.P. (1987). Light, Color and Shape. Cognitive Processes and their Simulation, Nauka. (In Russian)."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Ma, W., Jones, A., Chiang, J., Hawkins, T., Frederiksen, S., Peers, P., Vukovic, M., Ouhyoung, M., and Debevec, P. (2008, January 11\u201315). Facial performance synthesis using deformation-driven polynomial displacement maps. Proceedings of the ACM SIGGRAPH Asia2008, Los Angeles, CA, USA.","DOI":"10.1145\/1457515.1409074"},{"key":"ref_21","unstructured":"Eigen, D., Puhrsch, C., and Fergus, R. (2014, January 8\u201313). Depth map prediction from a single image using a multi-scale deep network. Proceedings of the Neural Information Processing Systems Conference, Montreal, QC, Canada."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Liu, F., Shen, C., and Lin, G. (2015, January 8\u201310). Deep convolutional neural fields for depth estimation from a single image. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015, Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7299152"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Cipolla, R., Battiato, S., and Farinella, G.M. (2010). Computer Vision: Detection, Recognition and Reconstruction, Springer.","DOI":"10.1007\/978-3-642-12848-6"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1016\/0146-664X(82)90001-6","article-title":"Obtaining 3-dimensional shape of textured and specular surfaces using foursource photometry","volume":"18","author":"Coleman","year":"1982","journal-title":"Comput. Graph. Image Process."},{"key":"ref_25","unstructured":"Nayar, S.K., Ikeuchi, K., and Kanade, T. (1989). Surface Reflection: Physical and Geometrical Perspectives, DTIC Document. Technical Reports."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Lin, S., and Lee, S.W. (1999, January 20\u201327). Estimation of diffuse and specular appearance. Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece.","DOI":"10.1109\/ICCV.1999.790311"},{"key":"ref_27","unstructured":"Jensen, H.W., Marschner, S.R., Levoy, M., and Hanrahan, P. A practical model for subsurface light transport. Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques;."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Nicodemus, F.E., Richmond, J.C., and Hsia, J.J. (1977). Geometrical Considerations and Nomenclature for Reflectance, National Bureau of Standards, US Department of Commerce.","DOI":"10.6028\/NBS.MONO.160"},{"key":"ref_29","first-page":"290","article-title":"Shadows in three-source photometric stereo","volume":"Volume 5302","author":"Forsyth","year":"2008","journal-title":"Computer Vision\u2014ECCV 2008. ECCV 2008. Lecture Notes in Computer Science"},{"key":"ref_30","unstructured":"Zhang, Q., Ye, M., Yang, R., Matsushita, Y., Wilburn, B., and Yu, H. 
(2012, January 16\u201321). Edge-preserving photometric stereo via depth fusion. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Yu, L.-F., Yeung, S.-K., Tai, Y.-W., and Lin, S. (2013, January 23\u201328). Shading based shape refinement of rgb-d images. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA.","DOI":"10.1109\/CVPR.2013.186"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1145\/2661229.2661263","article-title":"Robust surface reconstruction via dictionary learning","volume":"33","author":"Xiong","year":"2014","journal-title":"ACM Trans. Graph."},{"key":"ref_33","first-page":"1","article-title":"Learning depth from single monocular images","volume":"18","author":"Liu","year":"2005","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ladicky, L., Shi, J., and Pollefeys, M. (2014, January 23\u201328). Pulling things out of perspective. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA.","DOI":"10.1109\/CVPR.2014.19"},{"key":"ref_35","first-page":"486","article-title":"Fine-scale surface normal estimation using a single nir image","volume":"Volume 9907","author":"Leibe","year":"2016","journal-title":"Computer Vision \u2013 ECCV 2016. ECCV 2016. Lecture Notes in Computer Science"},{"key":"ref_36","first-page":"322","article-title":"Multiview 3d models from single images with a convolutional network","volume":"Volume 9911","author":"Leibe","year":"2016","journal-title":"Computer Vision \u2013 ECCV 2016. ECCV 2016. Lecture Notes in Computer Science"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Mousavian, A., and Pirsiavash, H. (2016, January 25\u201328). Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks. 
Proceedings of the Fourth International Conference on 3D Vision, Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.69"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Laina, I., Rupprecht, C., and Belagiannis, V. (2016, January 25\u201328). Deeper Depth Prediction with Fully Convolutional Residual Networks. Proceedings of the Fourth International Conference on 3D Vision, Stanford, CA, USA.","DOI":"10.1109\/3DV.2016.32"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Roy, A., and Todorovic, S. (2016, January 27\u201330). Monocular Depth Estimation Using Neural Regression Forest. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.594"},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Xu, D., Ricci, E., Ouyang, W., Wang, X., and Sebe, N. (2017, January 21\u201326). Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.25"},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"2024","DOI":"10.1109\/TPAMI.2015.2505283","article-title":"Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields","volume":"38","author":"Liu","year":"2016","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Chen, W., Xiang, D., and Deng, J. (2017, January 22\u201329). Surface Normals in the Wild. Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.173"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Li, J., Klein, R., and Yao, A. (2017, January 22\u201329). A Two-Streamed Network for Estimating Fine-Scaled Depth Maps from Single RGB Images. 
Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.365"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Savva, M., Chang, A.X., and Hanrahan, P. (2015, January 7\u201312). Semantically-enriched 3d models for common-sense knowledge. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Boston, MA, USA.","DOI":"10.1109\/CVPRW.2015.7301289"},{"key":"ref_45","unstructured":"(2018, February 27). GitHub. Available online: https:\/\/github.com\/panmari\/stanford-shapenet-renderer."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/18\/3\/764\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T14:57:22Z","timestamp":1760194642000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/18\/3\/764"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,3,2]]},"references-count":45,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2018,3]]}},"alternative-id":["s18030764"],"URL":"https:\/\/doi.org\/10.3390\/s18030764","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2018,3,2]]}}}