{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,31]],"date-time":"2025-10-31T07:48:39Z","timestamp":1761896919118,"version":"3.41.0"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"6","license":[{"start":{"date-parts":[[2018,12,4]],"date-time":"2018-12-04T00:00:00Z","timestamp":1543881600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Shenzhen Innovation Program","award":["JCYJ20170302153208613, JCYJ20151015151249564"],"award-info":[{"award-number":["JCYJ20170302153208613, JCYJ20151015151249564"]}]},{"DOI":"10.13039\/501100012166","name":"973 Program","doi-asserted-by":"crossref","award":["2015CB352501"],"award-info":[{"award-number":["2015CB352501"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"crossref","award":["61602311, 61522213, 61761146002, 61861130365"],"award-info":[{"award-number":["61602311, 61522213, 61761146002, 61861130365"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"NSERC Canada","award":["2015-05407"],"award-info":[{"award-number":["2015-05407"]}]},{"name":"GD Science and Technology Program","award":["2015A030312015"],"award-info":[{"award-number":["2015A030312015"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Graph."],"published-print":{"date-parts":[[2018,12,31]]},"abstract":"<jats:p>We introduce a learning-based method to reconstruct objects acquired in a casual handheld scanning setting with a depth camera. Our method is based on two core components. First, a deep network that provides a semantic segmentation and labeling of the frames of an input RGBD sequence. 
Second, an alignment and reconstruction method that employs the semantic labeling to reconstruct the acquired object from the frames. We demonstrate that the use of a semantic labeling improves the reconstructions of the objects, when compared to methods that use only the depth information of the frames. Moreover, since training a deep network requires a large amount of labeled data, a key contribution of our work is an active self-learning framework to simplify the creation of the training data. Specifically, we iteratively predict the labeling of frames with the neural network, reconstruct the object from the labeled frames, and evaluate the confidence of the labeling, to incrementally train the neural network while requiring only a small amount of user-provided annotations. We show that this method enables the creation of data for training a neural network with high accuracy, while requiring only little manual effort.<\/jats:p>","DOI":"10.1145\/3272127.3275024","type":"journal-article","created":{"date-parts":[[2018,11,28]],"date-time":"2018-11-28T19:16:10Z","timestamp":1543432570000},"page":"1-12","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Semantic object reconstruction via casual handheld scanning"],"prefix":"10.1145","volume":"37","author":[{"given":"Ruizhen","family":"Hu","sequence":"first","affiliation":[{"name":"Shenzhen University"}]},{"given":"Cheng","family":"Wen","sequence":"additional","affiliation":[{"name":"Shenzhen University"}]},{"given":"Oliver","family":"Van Kaick","sequence":"additional","affiliation":[{"name":"Carleton University"}]},{"given":"Luanmin","family":"Chen","sequence":"additional","affiliation":[{"name":"Shenzhen University"}]},{"given":"Di","family":"Lin","sequence":"additional","affiliation":[{"name":"Shenzhen University"}]},{"given":"Daniel","family":"Cohen-Or","sequence":"additional","affiliation":[{"name":"Shenzhen University and Tel Aviv 
University"}]},{"given":"Hui","family":"Huang","sequence":"additional","affiliation":[{"name":"Shenzhen University"}]}],"member":"320","published-online":{"date-parts":[[2018,12,4]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/1360612.1360684"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2004.60"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1007\/s41095-015-0029-x"},{"volume-title":"Proc. CVPR. IEEE, 5556--5565","author":"Choi Sungjoon","key":"e_1_2_2_4_1","unstructured":"Sungjoon Choi, Q. Y. Zhou, and V. Koltun. 2015. Robust reconstruction of indoor scenes. In Proc. CVPR. IEEE, 5556--5565."},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.261"},{"volume-title":"Proc. CVPR. IEEE, 493--501","author":"Dou M.","key":"e_1_2_2_7_1","unstructured":"M. Dou, J. Taylor, H. Fuchs, A. Fitzgibbon, and S. Izadi. 2015. 3D scanning deformable objects with a single RGBD sensor. In Proc. CVPR. IEEE, 493--501."},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/2980179.2982409"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10462-012-9365-8"},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/2835487"},{"key":"e_1_2_2_11_1","first-page":"1","article-title":"Dense Semantic 3D Reconstruction","volume":"99","author":"H\u00e4ne C.","year":"2016","unstructured":"C. H\u00e4ne, C. Zach, A. Cohen, and M. Pollefeys. 2016. Dense Semantic 3D Reconstruction. IEEE Trans. 
Pattern Analysis & Machine Intelligence PP, 99 (2016), 1--14.","journal-title":"IEEE Trans. Pattern Analysis & Machine Intelligence PP"},{"key":"e_1_2_2_12_1","unstructured":"Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. (2016)."},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2766890"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2015.2459891"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1145\/1778765.1778839"},{"volume-title":"Proc. Int. Conf. on Robotics & Automation. IEEE, 3748--3754","author":"Kerl C.","key":"e_1_2_2_16_1","unstructured":"C. Kerl, J. Sturm, and D. Cremers. 2013. Robust odometry estimation for RGB-D cameras. In Proc. Int. Conf. on Robotics & Automation. IEEE, 3748--3754."},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366157"},{"key":"e_1_2_2_18_1","first-page":"703","article-title":"Joint Semantic Segmentation and 3D Reconstruction from Monocular Video","volume":"8694","author":"Kundu A.","year":"2014","unstructured":"A. Kundu, Y. Li, F. Dellaert, F. Li, and J. M. Rehg. 2014. Joint Semantic Segmentation and 3D Reconstruction from Monocular Video. LNCS (Proc. ECCV) 8694 (2014), 703--718.","journal-title":"LNCS (Proc. ECCV)"},{"key":"e_1_2_2_19_1","volume-title":"Niloy Jyoti Mitra, and Kun Zhou","author":"Lin Minmin","year":"2018","unstructured":"Minmin Lin, Tianjia Shao, Youyi Zheng, Niloy Jyoti Mitra, and Kun Zhou. 2018. 
Recovering Functional Mechanical Assemblies from Raw Scans. IEEE transactions on visualization and computer graphics 24, 3 (2018), 1354--1367."},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2017.7989538"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.3390\/s140508547"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366156"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISMAR.2011.6092378"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2508363.2508374"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.264"},{"volume-title":"Proc. Int. Conf. on Robotics & Automation. 4471--4478","author":"R\u00fcnz M.","key":"e_1_2_2_27_1","unstructured":"M. R\u00fcnz and L. Agapito. 2017. Co-fusion: Real-time segmentation, tracking and fusion of multiple objects. In Proc. Int. Conf. on Robotics & Automation. 
4471--4478."},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.178"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-8659.2007.01103.x"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366155"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366199"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-46448-0_6"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.cagd.2016.02.015"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/2070781.2024160"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11554-013-0379-5"},{"volume-title":"Exploring Artificial Intelligence in the New Millennium","author":"Thrun S.","key":"e_1_2_2_36_1","unstructured":"S. Thrun. 2002. Robotic Mapping: A Survey. In Exploring Artificial Intelligence in the New Millennium, G. Lakemeyer and B. Nebel (Eds.). Morgan Kaufmann, 1--35."},{"key":"e_1_2_2_37_1","volume-title":"Semantic part segmentation with deep learning. arXiv preprint arXiv:1505.02438","author":"Tsogkas Stavros","year":"2015","unstructured":"Stavros Tsogkas, Iasonas Kokkinos, George Papandreou, and Andrea Vedaldi. 2015. Semantic part segmentation with deep learning. 
arXiv preprint arXiv:1505.02438 (2015)."},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-8659.2011.01884.x"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2366145.2366184"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.15607\/RSS.2015.XI.001"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.12574"},{"key":"e_1_2_2_42_1","volume-title":"Proc. Euro. Conf. on Computer Vision","volume":"9909","author":"Xia Fangting","year":"2015","unstructured":"Fangting Xia, Peng Wang, Liang-Chieh Chen, and Alan L Yuille. 2015. Zoom Better to See Clearer: Human and Object Parsing with Hierarchical Auto-Zoom Net. In Proc. Euro. Conf. on Computer Vision, Vol. 9909."},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2013.458"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.1145\/2010324.1964975"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1145\/2980179.2982425"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/2601097.2601191"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/2980179.2980238"}],"container-title":["ACM Transactions on 
Graphics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3272127.3275024","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3272127.3275024","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T00:44:04Z","timestamp":1750207444000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3272127.3275024"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,4]]},"references-count":46,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2018,12,31]]}},"alternative-id":["10.1145\/3272127.3275024"],"URL":"https:\/\/doi.org\/10.1145\/3272127.3275024","relation":{},"ISSN":["0730-0301","1557-7368"],"issn-type":[{"type":"print","value":"0730-0301"},{"type":"electronic","value":"1557-7368"}],"subject":[],"published":{"date-parts":[[2018,12,4]]},"assertion":[{"value":"2018-12-04","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}