{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,24]],"date-time":"2026-02-24T02:54:19Z","timestamp":1771901659866,"version":"3.50.1"},"reference-count":57,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2023,2,25]],"date-time":"2023-02-25T00:00:00Z","timestamp":1677283200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62062056"],"award-info":[{"award-number":["62062056"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Ningxia Graduate Education and Teaching Reform Research and Practice Project 2021"},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61662059"],"award-info":[{"award-number":["61662059"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Multimedia Comput. Commun. Appl."],"published-print":{"date-parts":[[2023,8,31]]},"abstract":"<jats:p>\n            Single-view three-dimensional (3D) object reconstruction has always been a long-term challenging task. Objects with complex topologies are hard to accurately reconstruct, which makes existing methods suffer from blurring of shape boundaries between multiple components in the object. Moreover, most of them cannot balance learning between global geometric structure information and local detail information. In this article, we propose a multi-scale edge-guided learning network (MEGLN) to utilize the global edge information guiding the network to better capture and recover local details. 
The goal is to exploit a multi-scale learning strategy to learn both global edge information and local details, thus achieving robust 3D object reconstruction. First, we design a multi-scale Gaussian difference block (MGDB) that extracts global edge geometry features from input images at different scales and uses an attention mechanism to aggregate them. Second, we design a multi-scale feature interaction block (MFIB) that learns local details by using multi-scale feature interaction to capture the features of multiple objects or components at multiple scales. Guided by the global edge information, the MFIB can capture as much local detail information as possible. Finally, we dynamically fuse the predicted probabilities of the MGDB and MFIB to obtain the final prediction, which enables MEGLN to recover 3D shapes with complex global topological structures and rich local details. Extensive qualitative and quantitative experiments on the ShapeNet dataset demonstrate that our approach achieves competitive performance compared with state-of-the-art methods. 
Code is available at\n            <jats:ext-link xmlns:xlink=\"http:\/\/www.w3.org\/1999\/xlink\" ext-link-type=\"url\" xlink:href=\"https:\/\/github.com\/Ray-tju\/MEGLN\">https:\/\/github.com\/Ray-tju\/MEGLN<\/jats:ext-link>\n            .\n          <\/jats:p>","DOI":"10.1145\/3568678","type":"journal-article","created":{"date-parts":[[2022,10,20]],"date-time":"2022-10-20T11:52:23Z","timestamp":1666266743000},"page":"1-24","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":13,"title":["Multi-scale Edge-guided Learning for 3D Reconstruction"],"prefix":"10.1145","volume":"19","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0253-516X","authenticated-orcid":false,"given":"Lei","family":"Li","sequence":"first","affiliation":[{"name":"Ningxia University, Yinchuan, Ningxia, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6871-6043","authenticated-orcid":false,"given":"Zhiyuan","family":"Zhou","sequence":"additional","affiliation":[{"name":"Ningxia University, Yinchuan, Ningxia, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5207-1802","authenticated-orcid":false,"given":"Suping","family":"Wu","sequence":"additional","affiliation":[{"name":"Ningxia University, Yinchuan, Ningxia, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8533-3864","authenticated-orcid":false,"given":"Yongrong","family":"Cao","sequence":"additional","affiliation":[{"name":"Ningxia University, Yinchuan, Ningxia, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,2,25]]},"reference":[{"key":"e_1_3_1_2_2","article-title":"ShapeNet: An information-rich 3D model repository","author":"Chang Angel X.","year":"2015","unstructured":"Angel X. 
Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et\u00a0al. 2015. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012 (2015).","journal-title":"arXiv preprint arXiv:1512.03012"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2017.2699184"},{"key":"e_1_3_1_4_2","unstructured":"Liang-Chieh Chen George Papandreou Florian Schroff and Hartwig Adam. 2017. Rethinking atrous convolution for semantic image segmentation. 6 (2017). arXiv preprint arXiv:1706.05587 ."},{"key":"e_1_3_1_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00609"},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_38"},{"key":"e_1_3_1_7_2","unstructured":"Harm De Vries Florian Strub J\u00e9r\u00e9mie Mary Hugo Larochelle Olivier Pietquin and Aaron Courville. 2017. Modulating early visual processing by language. Advances in Neural Information Processing Systems 30 (2017)."},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/72.165600"},{"key":"e_1_3_1_9_2","article-title":"Adversarially learned inference","author":"Dumoulin Vincent","year":"2016","unstructured":"Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, and Aaron Courville. 2016. Adversarially learned inference. 
arXiv preprint arXiv:1606.00704 (2016).","journal-title":"arXiv preprint arXiv:1606.00704"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.264"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/VISUAL.1998.745312"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00030"},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.1006\/cviu.1997.0633"},{"key":"e_1_3_1_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2015.2389824"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_3_1_16_2","doi-asserted-by":"crossref","unstructured":"Emmanuel Prados and Olivier Faugeras. 2006. Shape from shading. In Proceeding of the Handbook of Mathematical Models in Computer Vision . Springer 375\u2013388.","DOI":"10.1007\/0-387-28831-7_23"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.354"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298807"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2014.2316835"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00411"},{"key":"e_1_3_1_21_2","article-title":"Adam: A method for stochastic optimization","author":"Kingma Diederik P.","year":"2014","unstructured":"Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).","journal-title":"arXiv preprint arXiv:1412.6980"},{"key":"e_1_3_1_22_2","doi-asserted-by":"crossref","unstructured":"Alex Krizhevsky Ilya Sutskever and Geoffrey E. Hinton. 2017. ImageNet classification with deep convolutional neural networks. Commun. 
ACM 60 6 (2017) 84\u201390.","DOI":"10.1145\/3065386"},{"key":"e_1_3_1_23_2","doi-asserted-by":"publisher","DOI":"10.1038\/nature14539"},{"key":"e_1_3_1_24_2","doi-asserted-by":"publisher","DOI":"10.16383\/j.aas.c200543"},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9413649"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICPR48806.2021.9411960"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01100"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.12278"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.106"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00913"},{"key":"e_1_3_1_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/37402.37422"},{"key":"e_1_3_1_32_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00459"},{"key":"e_1_3_1_33_2","article-title":"A simple and scalable shape representation for 3D reconstruction","author":"Michalkiewicz Mateusz","year":"2020","unstructured":"Mateusz Michalkiewicz, Eugene Belilovsky, Mahsa Baktashmotlagh, and Anders Eriksson. 2020. A simple and scalable shape representation for 3D reconstruction. arXiv preprint arXiv:2005.04623 (2020).","journal-title":"arXiv preprint arXiv:2005.04623"},{"issue":"09","key":"e_1_3_1_34_2","first-page":"941","article-title":"3-D reconstruction using mirror images based on a plane symmetry recovering method","volume":"14","author":"Mitsumoto Hiroshi","year":"1992","unstructured":"Hiroshi Mitsumoto, Shinichi Tamura, Kozo Okazaki, Naoki Kajimi, and Yutaka Fukui. 1992. 3-D reconstruction using mirror images based on a plane symmetry recovering method. 
IEEE Computer Architecture Letters 14, 09 (1992), 941\u2013946.","journal-title":"IEEE Computer Architecture Letters"},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46484-8_29"},{"key":"e_1_3_1_36_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.434"},{"key":"e_1_3_1_37_2","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201912)","author":"Oswald Martin R.","year":"2012","unstructured":"Martin R. Oswald, Eno T\u00f6ppe, and Daniel Cremers. 2012. Fast and globally optimal single view reconstruction of curved objects. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR\u201912). IEEE, Providence, RI, 534\u2013541."},{"key":"e_1_3_1_38_2","volume-title":"Asian Conference on Computer Vision","author":"Pontes Jhony K.","year":"2018","unstructured":"Jhony K. Pontes, Chen Kong, Sridha Sridharan, Simon Lucey, Anders Eriksson, and Clinton Fookes. 2018. Image2Mesh: A learning framework for single image 3D reconstruction. In Asian Conference on Computer Vision. Springer, Perth, 365\u2013381."},{"key":"e_1_3_1_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00207"},{"key":"e_1_3_1_40_2","unstructured":"Ashutosh Saxena Sung H. Chung Andrew Y. Ng et\u00a0al. 2005. Learning depth from single monocular images. In Advances in Neural Information Processing Systems Vol. 18. MIT Press Vancouver 1\u20138."},{"key":"e_1_3_1_41_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNN.2008.2005605"},{"issue":"1","key":"e_1_3_1_42_2","first-page":"118","article-title":"Geographic, geometrical and semantic reconstruction of urban scene from high resolution oblique aerial images","volume":"6","author":"Sun Xiaofeng","year":"2017","unstructured":"Xiaofeng Sun, Shuhan Shen, Hainan Cui, Lihua Hu, and Zhanyi Hu. 2017. Geographic, geometrical and semantic reconstruction of urban scene from high resolution oblique aerial images. 
IEEE\/CAA Journal of Automatica Sinica 6, 1 (2017), 118\u2013130.","journal-title":"IEEE\/CAA Journal of Automatica Sinica"},{"key":"e_1_3_1_43_2","article-title":"Im2Avatar: Colorful 3D reconstruction from a single image","author":"Sun Yongbin","year":"2018","unstructured":"Yongbin Sun, Ziwei Liu, Yue Wang, and Sanjay E. Sarma. 2018. Im2Avatar: Colorful 3D reconstruction from a single image. arXiv preprint arXiv:1804.06375 abs\/1804.06375 (2018).","journal-title":"arXiv preprint arXiv:1804.06375"},{"key":"e_1_3_1_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2017.230"},{"key":"e_1_3_1_45_2","article-title":"Attention is all you need","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_1_46_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-01252-6_4"},{"key":"e_1_3_1_47_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00813"},{"key":"e_1_3_1_48_2","unstructured":"Jiajun Wu Yifan Wang Tianfan Xue Xingyuan Sun William T. Freeman and Joshua B. Tenenbaum. 2017. MarrNet: 3D shape reconstruction via 2.5 D sketches. Advances in Neural Information Processing Systems 30 (2017)."},{"key":"e_1_3_1_49_2","unstructured":"Jiajun Wu Chengkai Zhang Tianfan Xue William T. Freeman and Joshua B. Tenenbaum. 2016. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. Advances in Neural Information Processing Systems 29 (2016)."},{"key":"e_1_3_1_50_2","volume-title":"Proceedings of the European Conference on Computer Vision (ECCV\u201918)","author":"Wu Jiajun","year":"2018","unstructured":"Jiajun Wu, Chengkai Zhang, Xiuming Zhang, Zhoutong Zhang, William T. Freeman, and Joshua B. Tenenbaum. 2018. 
Learning shape priors for single-view 3D completion and reconstruction. In Proceedings of the European Conference on Computer Vision (ECCV\u201918). Springer, Munich, 646\u2013662."},{"key":"e_1_3_1_51_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.729880"},{"key":"e_1_3_1_52_2","unstructured":"Qiangeng Xu Weiyue Wang Duygu Ceylan Radomir Mech and Ulrich Neumann. 2019. DISN: Deep implicit surface network for high-quality single-view 3D reconstruction. Advances in Neural Information Processing Systems 32 (2019)."},{"issue":"4","key":"e_1_3_1_53_2","doi-asserted-by":"crossref","first-page":"991","DOI":"10.1109\/JAS.2020.1003234","article-title":"Concrete defects inspection and 3D mapping using CityFlyer quadrotor robot","volume":"7","author":"Yang Liang","year":"2020","unstructured":"Liang Yang, Bing Li, Wei Li, Howard Brand, Biao Jiang, and Jizhong Xiao. 2020. Concrete defects inspection and 3D mapping using CityFlyer quadrotor robot. IEEE\/CAA Journal of Automatica Sinica 7, 4 (2020), 991\u20131002.","journal-title":"IEEE\/CAA Journal of Automatica Sinica"},{"key":"e_1_3_1_54_2","volume-title":"European Conference on Computer Vision","author":"Zhang Dong","year":"2020","unstructured":"Dong Zhang, Hanwang Zhang, Jinhui Tang, Meng Wang, Xiansheng Hua, and Qianru Sun. 2020. Feature pyramid transformer. In European Conference on Computer Vision. Springer, Glasgow, UK, 323\u2013339."},{"key":"e_1_3_1_55_2","doi-asserted-by":"publisher","DOI":"10.1002\/vis.291"},{"key":"e_1_3_1_56_2","doi-asserted-by":"publisher","DOI":"10.1109\/34.784284"},{"key":"e_1_3_1_57_2","unstructured":"Xiuming Zhang Zhoutong Zhang Chengkai Zhang Joshua B. Tenenbaum William T. Freeman and Jiajun Wu. 2018. Learning to reconstruct shapes from unseen classes. 
Advances in Neural Information Processing Systems (2018) 2263\u20132274."},{"key":"e_1_3_1_58_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.660"}],"container-title":["ACM Transactions on Multimedia Computing, Communications, and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3568678","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3568678","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:08:34Z","timestamp":1750183714000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3568678"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,2,25]]},"references-count":57,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,8,31]]}},"alternative-id":["10.1145\/3568678"],"URL":"https:\/\/doi.org\/10.1145\/3568678","relation":{},"ISSN":["1551-6857","1551-6865"],"issn-type":[{"value":"1551-6857","type":"print"},{"value":"1551-6865","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,2,25]]},"assertion":[{"value":"2021-11-30","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-10-08","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-02-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}