{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:59:00Z","timestamp":1750309140179,"version":"3.41.0"},"reference-count":46,"publisher":"Association for Computing Machinery (ACM)","issue":"ISS","license":[{"start":{"date-parts":[[2023,10,31]],"date-time":"2023-10-31T00:00:00Z","timestamp":1698710400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Hum.-Comput. Interact."],"published-print":{"date-parts":[[2023,10,31]]},"abstract":"<jats:p>Segmenting and determining the 3D bounding boxes of objects of interest in RGB videos is an important task for a variety of applications such as augmented reality, navigation, and robotics. Supervised machine learning techniques are commonly used for this, but they need training datasets: sets of images with associated 3D bounding boxes manually defined by human annotators using a labelling tool. However, precisely placing 3D bounding boxes can be difficult using conventional 3D manipulation tools on a 2D interface. To alleviate that burden, we propose a novel technique with which 3D bounding boxes can be created by simply drawing 2D bounding rectangles on multiple frames of a video sequence showing the object from different angles. The method uses reconstructed dense 3D point clouds from the video and computes tightly fitting 3D bounding boxes of desired objects selected by back-projecting the 2D rectangles. We show concrete application scenarios of our interface, including training dataset creation and editing 3D spaces and videos. An evaluation comparing our technique with a conventional 3D annotation tool shows that our method results in higher accuracy. We also confirm that the bounding boxes created with our interface have a lower variance, likely yielding more consistent labels and datasets.<\/jats:p>","DOI":"10.1145\/3626476","type":"journal-article","created":{"date-parts":[[2023,11,1]],"date-time":"2023-11-01T16:26:12Z","timestamp":1698855972000},"page":"309-326","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Interactive 3D Annotation of Objects in Moving Videos from Sparse Multi-view Frames"],"prefix":"10.1145","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-6679-9758","authenticated-orcid":false,"given":"Kotaro","family":"Oomori","sequence":"first","affiliation":[{"name":"University of Tokyo, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5547-3831","authenticated-orcid":false,"given":"Wataru","family":"Kawabe","sequence":"additional","affiliation":[{"name":"University of Tokyo, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1804-631X","authenticated-orcid":false,"given":"Fabrice","family":"Matulic","sequence":"additional","affiliation":[{"name":"Preferred Networks, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5495-6441","authenticated-orcid":false,"given":"Takeo","family":"Igarashi","sequence":"additional","affiliation":[{"name":"University of Tokyo, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-6054-8471","authenticated-orcid":false,"given":"Keita","family":"Higuchi","sequence":"additional","affiliation":[{"name":"Preferred Networks, Tokyo, Japan"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,11]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR46437.2021.00773"},{"key":"e_1_2_2_2_1","volume-title":"Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition. 2614\u20132623","author":"Avetisyan Armen","year":"2019","unstructured":"Armen Avetisyan , Manuel Dahnert , Angela Dai , Manolis Savva , Angel X Chang , and Matthias Nie\u00df ner. 2019 . Scan2cad: Learning cad model alignment in rgb-d scans . In Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition. 2614\u20132623 . Armen Avetisyan, Manuel Dahnert, Angela Dai, Manolis Savva, Angel X Chang, and Matthias Nie\u00df ner. 2019. Scan2cad: Learning cad model alignment in rgb-d scans. In Proceedings of the IEEE\/CVF Conference on computer vision and pattern recognition. 2614\u20132623."},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR42600.2020.01164"},{"key":"e_1_2_2_4_1","unstructured":"Dan Cernea. 2020. OpenMVS: Multi-View Stereo Reconstruction Library. https:\/\/cdcseacave.github.io\/openMVS Dan Cernea. 2020. OpenMVS: Multi-View Stereo Reconstruction Library. https:\/\/cdcseacave.github.io\/openMVS"},{"key":"e_1_2_2_5_1","volume-title":"A coefficient of agreement for nominal scales. Educational and psychological measurement, 20, 1","author":"Cohen Jacob","year":"1960","unstructured":"Jacob Cohen . 1960. A coefficient of agreement for nominal scales. Educational and psychological measurement, 20, 1 ( 1960 ), 37\u201346. Jacob Cohen. 1960. A coefficient of agreement for nominal scales. Educational and psychological measurement, 20, 1 (1960), 37\u201346."},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.5281\/zenodo.8095553"},{"key":"e_1_2_2_7_1","doi-asserted-by":"crossref","first-page":"6662","DOI":"10.1109\/LRA.2021.3093869","article-title":"Adversarial training on point clouds for sim-to-real 3D object detection","volume":"6","author":"DeBortoli Robert","year":"2021","unstructured":"Robert DeBortoli , Li Fuxin , Ashish Kapoor , and Geoffrey A Hollinger . 2021 . Adversarial training on point clouds for sim-to-real 3D object detection . IEEE Robotics and Automation Letters , 6 , 4 (2021), 6662 \u2013 6669 . Robert DeBortoli, Li Fuxin, Ashish Kapoor, and Geoffrey A Hollinger. 2021. Adversarial training on point clouds for sim-to-real 3D object detection. IEEE Robotics and Automation Letters, 6, 4 (2021), 6662\u20136669.","journal-title":"IEEE Robotics and Automation Letters"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/3343031.3350535"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1177\/0278364913491297"},{"key":"e_1_2_2_10_1","volume-title":"Maximilian M\u00fchlegg, and Sebastian Dorn.","author":"Geyer Jakob","year":"2020","unstructured":"Jakob Geyer , Yohannes Kassahun , Mentar Mahmudi , Xavier Ricou , Rupesh Durgesh , Andrew S Chung , Lorenz Hauswald , Viet Hoang Pham , Maximilian M\u00fchlegg, and Sebastian Dorn. 2020 . A2d2: Audi autonomous driving dataset. arXiv preprint arXiv:2004.06320. Jakob Geyer, Yohannes Kassahun, Mentar Mahmudi, Xavier Ricou, Rupesh Durgesh, Andrew S Chung, Lorenz Hauswald, Viet Hoang Pham, Maximilian M\u00fchlegg, and Sebastian Dorn. 2020. A2d2: Audi autonomous driving dataset. arXiv preprint arXiv:2004.06320."},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/3172944.3172960"},{"key":"e_1_2_2_12_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1418\u20131428","author":"Hu Yuan-Ting","year":"2021","unstructured":"Yuan-Ting Hu , Jiahong Wang , Raymond A Yeh , and Alexander G Schwing . 2021 . SAIL-VOS 3D: A synthetic dataset and baselines for object detection and 3d mesh reconstruction from video data . In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1418\u20131428 . Yuan-Ting Hu, Jiahong Wang, Raymond A Yeh, and Alexander G Schwing. 2021. SAIL-VOS 3D: A synthetic dataset and baselines for object detection and 3d mesh reconstruction from video data. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1418\u20131428."},{"key":"e_1_2_2_13_1","volume-title":"2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 475\u2013477","author":"Jalal Mona","year":"2019","unstructured":"Mona Jalal , Josef B. Spjut , Ben Boudaoud , and Margrit Betke . 2019 . SIDOD: A Synthetic Image Dataset for 3D Object Pose Recognition With Distractors . 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 475\u2013477 . Mona Jalal, Josef B. Spjut, Ben Boudaoud, and Margrit Betke. 2019. SIDOD: A Synthetic Image Dataset for 3D Object Pose Recognition With Distractors. 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 475\u2013477."},{"key":"e_1_2_2_14_1","doi-asserted-by":"crossref","unstructured":"Alexander Kirillov Eric Mintun Nikhila Ravi Hanzi Mao Chloe Rolland Laura Gustafson Tete Xiao Spencer Whitehead Alexander C. Berg Wan-Yen Lo Piotr Doll\u00e1r and Ross Girshick. 2023. Segment Anything. arXiv:2304.02643. Alexander Kirillov Eric Mintun Nikhila Ravi Hanzi Mao Chloe Rolland Laura Gustafson Tete Xiao Spencer Whitehead Alexander C. Berg Wan-Yen Lo Piotr Doll\u00e1r and Ross Girshick. 2023. Segment Anything. arXiv:2304.02643.","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"e_1_2_2_15_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 1030\u20131031","author":"Koksal Aybora","year":"2020","unstructured":"Aybora Koksal , Kutalmis Gokalp Ince , and Aydin Alatan . 2020 . Effect of annotation errors on drone detection with YOLOv3 . In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 1030\u20131031 . Aybora Koksal, Kutalmis Gokalp Ince, and Aydin Alatan. 2020. Effect of annotation errors on drone detection with YOLOv3. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops. 1030\u20131031."},{"key":"e_1_2_2_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/3332165.3347927"},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2642918.2647367"},{"volume-title":"HCI and Usability for Education and Work","author":"Laugwitz Bettina","key":"e_1_2_2_18_1","unstructured":"Bettina Laugwitz , Theo Held , and Martin Schrepp . 2008. Construction and Evaluation of a User Experience Questionnaire . In HCI and Usability for Education and Work , Andreas Holzinger (Ed.). Springer Berlin Heidelberg , Berlin, Heidelberg . 63\u201376. isbn:978-3-540-89350-9 Bettina Laugwitz, Theo Held, and Martin Schrepp. 2008. Construction and Evaluation of a User Experience Questionnaire. In HCI and Usability for Education and Work, Andreas Holzinger (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg. 63\u201376. isbn:978-3-540-89350-9"},{"key":"e_1_2_2_19_1","volume-title":"Towards An End-to-End Framework for Flow-Guided Video Inpainting. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Li Zhen","year":"2022","unstructured":"Zhen Li , Cheng-Ze Lu , Jianhua Qin , Chun-Le Guo , and Ming-Ming Cheng . 2022 . Towards An End-to-End Framework for Flow-Guided Video Inpainting. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Zhen Li, Cheng-Ze Lu, Jianhua Qin, Chun-Le Guo, and Ming-Ming Cheng. 2022. Towards An End-to-End Framework for Flow-Guided Video Inpainting. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_20_1","unstructured":"Tzuta Lin. 2015. LabelImg. https:\/\/github.com\/heartexlabs\/labelImg Tzuta Lin. 2015. LabelImg. https:\/\/github.com\/heartexlabs\/labelImg"},{"key":"e_1_2_2_21_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4850\u20134859","author":"Ma Jiaxin","year":"2022","unstructured":"Jiaxin Ma , Yoshitaka Ushiku , and Miori Sagara . 2022 . The Effect of Improving Annotation Quality on Object Detection Datasets: A Preliminary Study . In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4850\u20134859 . Jiaxin Ma, Yoshitaka Ushiku, and Miori Sagara. 2022. The Effect of Improving Annotation Quality on Object Detection Datasets: A Preliminary Study. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4850\u20134859."},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/3503250"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/3528223.3530127"},{"key":"e_1_2_2_24_1","volume-title":"Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 75\u201382","author":"Muratov Oleg","year":"2016","unstructured":"Oleg Muratov , Yury Slynko , Vitaly Chernov , Maria Lyubimtseva , Artem Shamsuarov , and Victor Bucha . 2016 . 3DCapture: 3D Reconstruction for a Smartphone . In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 75\u201382 . Oleg Muratov, Yury Slynko, Vitaly Chernov, Maria Lyubimtseva, Artem Shamsuarov, and Victor Bucha. 2016. 3DCapture: 3D Reconstruction for a Smartphone. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 75\u201382."},{"key":"e_1_2_2_25_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4813\u20134822","author":"Murrugarra-Llerena Jeffri","year":"2022","unstructured":"Jeffri Murrugarra-Llerena , Lucas N Kirsten , and Claudio R Jung . 2022 . Can We Trust Bounding Box Annotations for Object Detection? In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4813\u20134822 . Jeffri Murrugarra-Llerena, Lucas N Kirsten, and Claudio R Jung. 2022. Can We Trust Bounding Box Annotations for Object Detection? In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 4813\u20134822."},{"key":"e_1_2_2_26_1","doi-asserted-by":"crossref","unstructured":"Yutaka Ohtake Alexander Belyaev and Hans-Peter Seidel. 2003. A multi-scale approach to 3D scattered data interpolation with compactly supported basis functions. In 2003 Shape Modeling International.. 153\u2013161. Yutaka Ohtake Alexander Belyaev and Hans-Peter Seidel. 2003. A multi-scale approach to 3D scattered data interpolation with compactly supported basis functions. In 2003 Shape Modeling International.. 153\u2013161.","DOI":"10.1109\/SMI.2003.1199611"},{"key":"e_1_2_2_27_1","volume-title":"International Conference on Computer Vision (ICCV).","author":"Ouyang Hao","year":"2021","unstructured":"Hao Ouyang , Tengfei Wang , and Qifeng Chen . 2021 . Internal Video Inpainting by Implicit Long-range Propagation . In International Conference on Computer Vision (ICCV). Hao Ouyang, Tengfei Wang, and Qifeng Chen. 2021. Internal Video Inpainting by Implicit Long-range Propagation. In International Conference on Computer Vision (ICCV)."},{"key":"e_1_2_2_28_1","volume-title":"Proc. of The International Conference in Robotics and Automation (ICRA). 2267\u20132273","author":"Pham Quang-Hieu","year":"2020","unstructured":"Quang-Hieu Pham , Pierre Sevestre , Ramanpreet Singh Pahwa , Huijing Zhan , Chun Ho Pang , Yuda Chen , Armin Mustafa , Vijay Chandrasekhar , and Jie Lin . 2020 . A 3D dataset: Towards autonomous driving in challenging environments . In Proc. of The International Conference in Robotics and Automation (ICRA). 2267\u20132273 . Quang-Hieu Pham, Pierre Sevestre, Ramanpreet Singh Pahwa, Huijing Zhan, Chun Ho Pang, Yuda Chen, Armin Mustafa, Vijay Chandrasekhar, and Jie Lin. 2020. A 3D dataset: Towards autonomous driving in challenging environments. In Proc. of The International Conference in Robotics and Automation (ICRA). 2267\u20132273."},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/3526113.3545663"},{"key":"e_1_2_2_30_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision. 10912\u201310922","author":"Roberts Mike","year":"2021","unstructured":"Mike Roberts , Jason Ramapuram , Anurag Ranjan , Atulit Kumar , Miguel Angel Bautista , Nathan Paczan , Russ Webb , and Joshua M Susskind . 2021 . Hypersim: A photorealistic synthetic dataset for holistic indoor scene understanding . In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 10912\u201310922 . Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel Angel Bautista, Nathan Paczan, Russ Webb, and Joshua M Susskind. 2021. Hypersim: A photorealistic synthetic dataset for holistic indoor scene understanding. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 10912\u201310922."},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.14733\/cadaps.2022.1191-1206"},{"key":"e_1_2_2_32_1","doi-asserted-by":"crossref","unstructured":"Christoph Sager Patrick Zschech and Niklas K\u00fchl. 2021. labelCloud: A Lightweight Domain-Independent Labeling Tool for 3D Object Detection in Point Clouds. arxiv:2103.04970. Christoph Sager Patrick Zschech and Niklas K\u00fchl. 2021. labelCloud: A Lightweight Domain-Independent Labeling Tool for 3D Object Detection in Point Clouds. arxiv:2103.04970.","DOI":"10.14733\/cadconfP.2021.319-323"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3491102.3502098"},{"key":"e_1_2_2_34_1","volume-title":"Structure-from-Motion Revisited. In Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Sch\u00f6nberger Johannes Lutz","year":"2016","unstructured":"Johannes Lutz Sch\u00f6nberger and Jan-Michael Frahm . 2016 . Structure-from-Motion Revisited. In Conference on Computer Vision and Pattern Recognition (CVPR). Johannes Lutz Sch\u00f6nberger and Jan-Michael Frahm. 2016. Structure-from-Motion Revisited. In Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_35_1","volume-title":"Pixelwise View Selection for Unstructured Multi-View Stereo. In European Conference on Computer Vision (ECCV).","author":"Sch\u00f6nberger Johannes Lutz","year":"2016","unstructured":"Johannes Lutz Sch\u00f6nberger , Enliang Zheng , Marc Pollefeys , and Jan-Michael Frahm . 2016 . Pixelwise View Selection for Unstructured Multi-View Stereo. In European Conference on Computer Vision (ECCV). Johannes Lutz Sch\u00f6nberger, Enliang Zheng, Marc Pollefeys, and Jan-Michael Frahm. 2016. Pixelwise View Selection for Unstructured Multi-View Stereo. In European Conference on Computer Vision (ECCV)."},{"key":"e_1_2_2_36_1","unstructured":"Jiaming Sun Zihao Wang Siyu Zhang Xingyi He Hongcheng Zhao Guofeng Zhang and Xiaowei Zhou. 2022. OnePose: One-Shot Object Pose Estimation without CAD Models. CVPR. Jiaming Sun Zihao Wang Siyu Zhang Xingyi He Hongcheng Zhao Guofeng Zhang and Xiaowei Zhou. 2022. OnePose: One-Shot Object Pose Estimation without CAD Models. CVPR."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3182168"},{"key":"e_1_2_2_38_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision. 7658\u20137667","author":"Wald Johanna","year":"2019","unstructured":"Johanna Wald , Armen Avetisyan , Nassir Navab , Federico Tombari , and Matthias Nie\u00df ner. 2019 . RIO: 3D object instance re-localization in changing indoor environments . In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 7658\u20137667 . Johanna Wald, Armen Avetisyan, Nassir Navab, Federico Tombari, and Matthias Nie\u00df ner. 2019. RIO: 3D object instance re-localization in changing indoor environments. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 7658\u20137667."},{"volume-title":"Beyond pascal: A benchmark for 3d object detection in the wild","author":"Xiang Yu","key":"e_1_2_2_39_1","unstructured":"Yu Xiang , Roozbeh Mottaghi , and Silvio Savarese . 2014. Beyond pascal: A benchmark for 3d object detection in the wild . In IEEE winter conference on applications of computer vision. 75\u201382. Yu Xiang, Roozbeh Mottaghi, and Silvio Savarese. 2014. Beyond pascal: A benchmark for 3d object detection in the wild. In IEEE winter conference on applications of computer vision. 75\u201382."},{"key":"e_1_2_2_40_1","volume-title":"Deep Flow-Guided Video Inpainting. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).","author":"Xu Rui","year":"2019","unstructured":"Rui Xu , Xiaoxiao Li , Bolei Zhou , and Chen Change Loy . 2019 . Deep Flow-Guided Video Inpainting. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Rui Xu, Xiaoxiao Li, Bolei Zhou, and Chen Change Loy. 2019. Deep Flow-Guided Video Inpainting. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)."},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2009.5459289"},{"key":"e_1_2_2_42_1","volume-title":"2020 IEEE Symposium Series on Computational Intelligence (SSCI). 737\u2013744","author":"Zhao Wenshuai","year":"2020","unstructured":"Wenshuai Zhao , Jorge Pe\u00f1a Queralta , and Tomi Westerlund . 2020 . Sim-to-real transfer in deep reinforcement learning for robotics: a survey . In 2020 IEEE Symposium Series on Computational Intelligence (SSCI). 737\u2013744 . Wenshuai Zhao, Jorge Pe\u00f1a Queralta, and Tomi Westerlund. 2020. Sim-to-real transfer in deep reinforcement learning for robotics: a survey. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI). 737\u2013744."},{"key":"e_1_2_2_43_1","volume-title":"Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1009\u20131018","author":"Zhao Yongheng","year":"2019","unstructured":"Yongheng Zhao , Tolga Birdal , Haowen Deng , and Federico Tombari . 2019 . 3D point capsule networks . In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1009\u20131018 . Yongheng Zhao, Tolga Birdal, Haowen Deng, and Federico Tombari. 2019. 3D point capsule networks. In Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition. 1009\u20131018."},{"key":"e_1_2_2_44_1","volume-title":"Davison","author":"Zhi Shuaifeng","year":"2021","unstructured":"Shuaifeng Zhi , Edgar Sucar , Andre Mouton , Iain Haughton , Tristan Laidlow , and Andrew J . Davison . 2021 . iLabel: Interactive Neural Scene Labelling . arXiv. Shuaifeng Zhi, Edgar Sucar, Andre Mouton, Iain Haughton, Tristan Laidlow, and Andrew J. Davison. 2021. iLabel: Interactive Neural Scene Labelling. arXiv."},{"key":"e_1_2_2_45_1","unstructured":"Qian-Yi Zhou Jaesik Park and Vladlen Koltun. 2018. Open3D: A Modern Library for 3D Data Processing. arXiv:1801.09847. Qian-Yi Zhou Jaesik Park and Vladlen Koltun. 2018. Open3D: A Modern Library for 3D Data Processing. arXiv:1801.09847."},{"key":"e_1_2_2_46_1","volume-title":"2019 IEEE Intelligent Vehicles Symposium (IV). 1816\u20131821","author":"Zimmer Walter","year":"2019","unstructured":"Walter Zimmer , Akshay Rangesh , and Mohan Trivedi . 2019 . 3d bat: A semi-automatic, web-based 3d annotation toolbox for full-surround, multi-modal data streams . In 2019 IEEE Intelligent Vehicles Symposium (IV). 1816\u20131821 . Walter Zimmer, Akshay Rangesh, and Mohan Trivedi. 2019. 3d bat: A semi-automatic, web-based 3d annotation toolbox for full-surround, multi-modal data streams. In 2019 IEEE Intelligent Vehicles Symposium (IV). 1816\u20131821."}],"container-title":["Proceedings of the ACM on Human-Computer Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626476","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3626476","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:50:15Z","timestamp":1750287015000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3626476"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,31]]},"references-count":46,"journal-issue":{"issue":"ISS","published-print":{"date-parts":[[2023,10,31]]}},"alternative-id":["10.1145\/3626476"],"URL":"https:\/\/doi.org\/10.1145\/3626476","relation":{},"ISSN":["2573-0142"],"issn-type":[{"type":"electronic","value":"2573-0142"}],"subject":[],"published":{"date-parts":[[2023,10,31]]},"assertion":[{"value":"2023-11-01","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}