{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,22]],"date-time":"2026-06-22T05:48:42Z","timestamp":1782107322248,"version":"3.54.5"},"reference-count":113,"publisher":"Springer Science and Business Media LLC","issue":"2","license":[{"start":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T00:00:00Z","timestamp":1770508800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,2,18]],"date-time":"2026-02-18T00:00:00Z","timestamp":1771372800000},"content-version":"vor","delay-in-days":10,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004435","name":"Universidad de Navarra","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100004435","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Virtual Reality"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Accurate determination of the 6D pose of an object is important in Augmented Reality (AR) to align and anchor virtual elements within the real world. Achieving seamless and proper alignment of virtual objects within RGB image sequences enables AR to provide spatial value. For this purpose, the estimation of the 6D pose is one of the most relevant techniques, yet it remains a significant challenge. While there has been significant research in the field of 6D object estimation from RGB images, many challenges remain unresolved. Our analysis offers a thorough examination of modern methods based on deep learning due to their ability to deliver state-of-the-art results in 6D pose estimation, while addressing challenges such as changes in lighting, occlusions, background clutter and other environmental factors. 
We also consider the standard datasets and metrics to compare performance, perform a qualitative analysis from different perspectives and summarize the main technical challenges and trends in this topic, always from an application-oriented perspective in the context of AR.<\/jats:p>","DOI":"10.1007\/s10055-026-01315-4","type":"journal-article","created":{"date-parts":[[2026,2,8]],"date-time":"2026-02-08T07:05:59Z","timestamp":1770534359000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Monocular RGB 6D object pose estimation for augmented reality: a survey"],"prefix":"10.1007","volume":"30","author":[{"given":"Pablo","family":"Aguirrezabal","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Iker","family":"Aguinaga","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Aitor","family":"Alvarez-Gila","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2026,2,8]]},"reference":[{"issue":"6","key":"1315_CR1","doi-asserted-by":"publisher","first-page":"34","DOI":"10.1109\/38.963459","volume":"21","author":"R Azuma","year":"2001","unstructured":"Azuma R, Baillot Y, Behringer R et al (2001) Recent advances in augmented reality. IEEE Comput Graphics Appl 21(6):34\u201347. https:\/\/doi.org\/10.1109\/38.963459","journal-title":"IEEE Comput Graphics Appl"},{"issue":"4","key":"1315_CR2","doi-asserted-by":"publisher","first-page":"355","DOI":"10.1162\/pres.1997.6.4.355","volume":"6","author":"RT Azuma","year":"1997","unstructured":"Azuma RT (1997) A survey of augmented reality. Presence: Teleoperators and Virtual Environ 6(4):355\u2013385. 
https:\/\/doi.org\/10.1162\/pres.1997.6.4.355","journal-title":"Presence: Teleoperators and Virtual Environ"},{"key":"1315_CR3","doi-asserted-by":"publisher","unstructured":"Banerjee P, Shkodrani S, Moulon P, et\u00a0al (2024) Introducing hot3d: An egocentric dataset for 3d hand and object tracking. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), https:\/\/doi.org\/10.48550\/arXiv.2406.09598","DOI":"10.48550\/arXiv.2406.09598"},{"key":"1315_CR4","doi-asserted-by":"publisher","unstructured":"Barath D, Matas J (2019) Progressive-x: Efficient, anytime, multi-model fitting algorithm. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 8172\u20138180, https:\/\/doi.org\/10.1109\/ICCV.2019.00388","DOI":"10.1109\/ICCV.2019.00388"},{"issue":"3","key":"1315_CR5","doi-asserted-by":"publisher","first-page":"346","DOI":"10.1016\/j.cviu.2007.09.014","volume":"110","author":"H Bay","year":"2008","unstructured":"Bay H, Ess A, Tuytelaars T et al (2008) Speeded-up robust features (surf). Comput Vis Image Underst 110(3):346\u2013359. https:\/\/doi.org\/10.1016\/j.cviu.2007.09.014","journal-title":"Comput Vis Image Underst"},{"issue":"7","key":"1315_CR6","doi-asserted-by":"publisher","first-page":"2047","DOI":"10.1177\/21925682211069321","volume":"13","author":"FR Bhatt","year":"2022","unstructured":"Bhatt FR, Orosz LD, Tewari A et al (2022) Augmented reality-assisted spine surgery: An early experience demonstrating safety and accuracy with 218 screws. Global Spine J 13(7):2047\u20132052. https:\/\/doi.org\/10.1177\/21925682211069321","journal-title":"Global Spine J"},{"key":"1315_CR7","doi-asserted-by":"publisher","unstructured":"Bochkovskiy A, Wang CY, Liao HYM (2020) Yolov4: Optimal speed and accuracy of object detection. 
https:\/\/doi.org\/10.48550\/arXiv.2004.10934","DOI":"10.48550\/arXiv.2004.10934"},{"key":"1315_CR8","doi-asserted-by":"publisher","unstructured":"Brachmann E, Krull A, Michel F, et\u00a0al (2014) Learning 6d object pose estimation using 3d object coordinates. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 536\u2013551, https:\/\/doi.org\/10.1007\/978-3-319-10605-2_35","DOI":"10.1007\/978-3-319-10605-2_35"},{"key":"1315_CR9","doi-asserted-by":"publisher","unstructured":"Brachmann E, Michel F, Krull A, et\u00a0al (2016) Uncertainty-driven 6d pose estimation of objects and scenes from a single rgb image. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 3364\u20133372, https:\/\/doi.org\/10.1109\/cvpr.2016.366","DOI":"10.1109\/cvpr.2016.366"},{"key":"1315_CR10","doi-asserted-by":"publisher","unstructured":"Chen C, Jiang X, Zhou W, et\u00a0al (2019) Pose estimation for texture-less shiny objects in a single rgb image using synthetic training data. https:\/\/doi.org\/10.48550\/arXiv.1909.10270","DOI":"10.48550\/arXiv.1909.10270"},{"key":"1315_CR11","doi-asserted-by":"publisher","unstructured":"Chen H, Manhardt F, Navab N, et\u00a0al (2023) Texpose: Neural texture learning for self-supervised 6d object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 4841\u20134852, https:\/\/doi.org\/10.1109\/cvpr52729.2023.00469","DOI":"10.1109\/cvpr52729.2023.00469"},{"key":"1315_CR12","doi-asserted-by":"publisher","unstructured":"Chum O, Matas J (2003) Locally optimized ransac. In: Proceedings of the Joint pattern recognition symposium, pp 236\u2013243, https:\/\/doi.org\/10.1007\/978-3-540-45243-0_31","DOI":"10.1007\/978-3-540-45243-0_31"},{"key":"1315_CR13","doi-asserted-by":"publisher","unstructured":"Deng J, Dong W, Socher R, et\u00a0al (2009) Imagenet: A large-scale hierarchical image database. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 248\u2013255, https:\/\/doi.org\/10.1109\/cvpr.2009.5206848","DOI":"10.1109\/cvpr.2009.5206848"},{"key":"1315_CR14","doi-asserted-by":"publisher","unstructured":"Deng W, Campbell D, Sun C, et\u00a0al (2025) Pos3r: 6d pose estimation for unseen objects made easy. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 16818\u201316828, https:\/\/doi.org\/10.1109\/cvpr52734.2025.01567","DOI":"10.1109\/cvpr52734.2025.01567"},{"key":"1315_CR15","doi-asserted-by":"publisher","unstructured":"Di Y, Manhardt F, Wang G, et\u00a0al (2021) So-pose: Exploiting self-occlusion for direct 6d pose estimation. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 12376\u201312385, https:\/\/doi.org\/10.1109\/iccv48922.2021.01217","DOI":"10.1109\/iccv48922.2021.01217"},{"key":"1315_CR16","unstructured":"Diebel J, et\u00a0al (2006) Representing attitude: Euler angles, unit quaternions, and rotation vectors. Matrix 58(15-16):1\u201335. https:\/\/api.semanticscholar.org\/CorpusID:16450526"},{"key":"1315_CR17","doi-asserted-by":"publisher","unstructured":"Doumanoglou A, Kouskouridas R, Malassiotis S, et\u00a0al (2016) Recovering 6d object pose and predicting next-best-view in the crowd. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 3583\u20133592, https:\/\/doi.org\/10.1109\/CVPR.2016.390","DOI":"10.1109\/CVPR.2016.390"},{"key":"1315_CR18","doi-asserted-by":"publisher","unstructured":"Drost B, Ulrich M, Bergmann P, et\u00a0al (2017) Introducing mvtec itodd \u2013 a dataset for 3d object recognition in industry. 
In: Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops (ICCVW), pp 2200\u20132208, https:\/\/doi.org\/10.1109\/ICCVW.2017.257","DOI":"10.1109\/ICCVW.2017.257"},{"issue":"6","key":"1315_CR19","doi-asserted-by":"publisher","first-page":"381","DOI":"10.1145\/358669.358692","volume":"24","author":"MA Fischler","year":"1981","unstructured":"Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381\u2013395. https:\/\/doi.org\/10.1145\/358669.358692","journal-title":"Commun ACM"},{"key":"1315_CR20","doi-asserted-by":"publisher","first-page":"1","DOI":"10.5116\/ijme.5e01.eb1a","volume":"11","author":"J Gerup","year":"2020","unstructured":"Gerup J, Soerensen CB, Dieckmann P (2020) Augmented reality and mixed reality for healthcare education beyond surgery: an integrative review. Int J Med Educ 11:1\u201318. https:\/\/doi.org\/10.5116\/ijme.5e01.eb1a","journal-title":"Int J Med Educ"},{"key":"1315_CR21","doi-asserted-by":"publisher","unstructured":"Guan J, Hao Y, Wu Q, et\u00a0al (2024) A survey of 6dof object pose estimation methods for different application scenarios. Sensors 24(4). https:\/\/doi.org\/10.3390\/s24041076","DOI":"10.3390\/s24041076"},{"key":"1315_CR22","doi-asserted-by":"publisher","unstructured":"Guo A, Wen B, Yuan J, et\u00a0al (2023) Handal: A dataset of real-world manipulable object categories with pose annotations, affordances, and reconstructions. In: Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), pp 11428\u201311435, https:\/\/doi.org\/10.1109\/IROS55552.2023.10341672","DOI":"10.1109\/IROS55552.2023.10341672"},{"key":"1315_CR23","doi-asserted-by":"publisher","unstructured":"Hai Y, Song R, Li J, et\u00a0al (2023) Pseudo flow consistency for self-supervised 6d object pose estimation. 
In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 14029\u201314039, https:\/\/doi.org\/10.1109\/iccv51070.2023.01294","DOI":"10.1109\/iccv51070.2023.01294"},{"key":"1315_CR24","doi-asserted-by":"publisher","unstructured":"He K, Zhang X, Ren S, et\u00a0al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 770\u2013778, https:\/\/doi.org\/10.1109\/cvpr.2016.90","DOI":"10.1109\/cvpr.2016.90"},{"issue":"2","key":"1315_CR25","doi-asserted-by":"publisher","first-page":"386","DOI":"10.1109\/TPAMI.2018.2844175","volume":"42","author":"K He","year":"2020","unstructured":"He K, Gkioxari G, Doll\u00e1r P et al (2020) Mask r-cnn. IEEE Trans Pattern Anal Mach Intell 42(2):386\u2013397. https:\/\/doi.org\/10.1109\/TPAMI.2018.2844175","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"5","key":"1315_CR26","doi-asserted-by":"publisher","first-page":"799","DOI":"10.1007\/s11548-021-02369-2","volume":"16","author":"J Hein","year":"2021","unstructured":"Hein J, Seibold M, Bogo F et al (2021) Towards markerless surgical tool and hand pose estimation. Int J Comput Assist Radiol Surg 16(5):799\u2013808. https:\/\/doi.org\/10.1007\/s11548-021-02369-2","journal-title":"Int J Comput Assist Radiol Surg"},{"key":"1315_CR27","doi-asserted-by":"publisher","unstructured":"Hinterstoisser S, Holzer S, Cagniart C, et\u00a0al (2011) Multimodal templates for real-time detection of texture-less objects in heavily cluttered scenes. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp 858\u2013865, https:\/\/doi.org\/10.1109\/ICCV.2011.6126326","DOI":"10.1109\/ICCV.2011.6126326"},{"key":"1315_CR28","doi-asserted-by":"publisher","unstructured":"Hinterstoisser S, Lepetit V, Ilic S, et\u00a0al (2013) Model based training, detection and pose estimation of texture-less 3d objects in heavily cluttered scenes. 
In: Proceedings of the Asian Conference on Computer Vision (ACCV), pp 548\u2013562, https:\/\/doi.org\/10.1007\/978-3-642-37331-2_42","DOI":"10.1007\/978-3-642-37331-2_42"},{"key":"1315_CR29","doi-asserted-by":"publisher","unstructured":"Hoda\u0148 T, Matas J, Obdr\u017e\u00e1lek \u0160 (2016) On evaluation of 6d object pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 606\u2013619, https:\/\/doi.org\/10.1007\/978-3-319-49409-8_52","DOI":"10.1007\/978-3-319-49409-8_52"},{"key":"1315_CR30","doi-asserted-by":"publisher","unstructured":"Hoda\u0148 T, Michel F, Brachmann E, et\u00a0al (2018) Bop: Benchmark for 6d object pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 19\u201335, https:\/\/doi.org\/10.1007\/978-3-030-01249-6_2","DOI":"10.1007\/978-3-030-01249-6_2"},{"key":"1315_CR31","doi-asserted-by":"publisher","unstructured":"Hoda\u0148 T, Sundermeyer M, Drost B, et\u00a0al (2020) Bop challenge 2020 on 6d object localization. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 577\u2013594, https:\/\/doi.org\/10.1007\/978-3-030-66096-3_39","DOI":"10.1007\/978-3-030-66096-3_39"},{"key":"1315_CR32","doi-asserted-by":"publisher","unstructured":"Hoda\u0148 T, Haluza P, Obdrzalek S, et\u00a0al (2017) T-less: An rgb-d dataset for 6d pose estimation of texture-less objects. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), pp 880\u2013888, https:\/\/doi.org\/10.1109\/WACV.2017.103","DOI":"10.1109\/WACV.2017.103"},{"key":"1315_CR33","doi-asserted-by":"publisher","unstructured":"Hoda\u0148 T, Barath D, Matas J (2020) Epos: Estimating 6d pose of objects with symmetries. 
In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 11703\u201311712, https:\/\/doi.org\/10.1109\/CVPR42600.2020.01172, arXiv:2004.00605","DOI":"10.1109\/CVPR42600.2020.01172"},{"key":"1315_CR34","doi-asserted-by":"publisher","unstructured":"Hoda\u0148 T, Sundermeyer M, Labb\u00e9 Y, et\u00a0al (2024) Bop challenge 2023 on detection, segmentation and pose estimation of seen and unseen rigid objects. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp 5610\u20135619, https:\/\/doi.org\/10.1109\/cvprw63382.2024.00570","DOI":"10.1109\/cvprw63382.2024.00570"},{"key":"1315_CR35","doi-asserted-by":"publisher","first-page":"143746","DOI":"10.1109\/access.2021.3114399","volume":"9","author":"S Hoque","year":"2021","unstructured":"Hoque S, Arafat MY, Xu S et al (2021) A comprehensive review on 3d object detection and 6d pose estimation with deep learning. IEEE Access 9:143746\u2013143770. https:\/\/doi.org\/10.1109\/access.2021.3114399","journal-title":"IEEE Access"},{"key":"1315_CR36","doi-asserted-by":"publisher","unstructured":"Hu Y, Hugonot J, Fua P, et\u00a0al (2019) Segmentation-driven 6d object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 3380\u20133389, https:\/\/doi.org\/10.1109\/CVPR.2019.00350, arXiv:1812.02541","DOI":"10.1109\/CVPR.2019.00350"},{"key":"1315_CR37","doi-asserted-by":"publisher","unstructured":"Hu Y, Fua P, Wang W, et\u00a0al (2020) Single-stage 6d object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 2927\u20132936, https:\/\/doi.org\/10.1109\/cvpr42600.2020.00300","DOI":"10.1109\/cvpr42600.2020.00300"},{"key":"1315_CR38","doi-asserted-by":"publisher","unstructured":"Hu Y, Fua P, Salzmann M (2022) Perspective flow aggregation for data-limited 6d object pose estimation. 
In: Proceedings of the European Conference on Computer Vision (ECCV), pp 89\u2013106, https:\/\/doi.org\/10.1007\/978-3-031-20086-1_6","DOI":"10.1007\/978-3-031-20086-1_6"},{"key":"1315_CR39","doi-asserted-by":"publisher","unstructured":"Huang J, Liang J, Hu J, et\u00a0al (2025) Xyz-ibd: A high-precision bin-picking dataset for object 6d pose estimation capturing real-world industrial complexity. https:\/\/doi.org\/10.48550\/arXiv.2506.00599","DOI":"10.48550\/arXiv.2506.00599"},{"key":"1315_CR40","doi-asserted-by":"publisher","unstructured":"Huang L, Hoda\u0148 T, Ma L, et\u00a0al (2022) Neural correspondence field for object pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 585\u2013603, https:\/\/doi.org\/10.1007\/978-3-031-20080-9_34","DOI":"10.1007\/978-3-031-20080-9_34"},{"key":"1315_CR41","doi-asserted-by":"publisher","unstructured":"Jasche F, Hoffmann S, Ludwig T, et\u00a0al (2021) Comparison of different types of augmented reality visualizations for instructions. In: Proceedings of the Conference on Human Factors in Computing Systems (CHI), CHI \u201921, pp 1\u201313, https:\/\/doi.org\/10.1145\/3411764.3445724","DOI":"10.1145\/3411764.3445724"},{"key":"1315_CR42","doi-asserted-by":"publisher","unstructured":"Jung H, Wu SC, Ruhkamp P, et\u00a0al (2024) Housecat6d-a large-scale multi-modal category level 6d object perception dataset with household objects in realistic scenarios. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 22498\u201322508, https:\/\/doi.org\/10.1109\/CVPR52733.2024.02123","DOI":"10.1109\/CVPR52733.2024.02123"},{"issue":"5","key":"1315_CR43","doi-asserted-by":"publisher","first-page":"827","DOI":"10.1107\/S0567739478001680","volume":"34","author":"W Kabsch","year":"1978","unstructured":"Kabsch W (1978) A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallogr A 34(5):827\u2013828. 
https:\/\/doi.org\/10.1107\/S0567739478001680","journal-title":"Acta Crystallogr A"},{"key":"1315_CR44","doi-asserted-by":"publisher","unstructured":"Kalra A, Stoppi G, Marin D, et\u00a0al (2024) Towards co-evaluation of cameras, hdr, and algorithms for industrial-grade 6dof pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 22691\u201322701, https:\/\/doi.org\/10.1109\/CVPR52733.2024.02141","DOI":"10.1109\/CVPR52733.2024.02141"},{"key":"1315_CR45","doi-asserted-by":"publisher","unstructured":"Kaskman R, Zakharov S, Shugurov I, et\u00a0al (2019) Homebreweddb: Rgb-d dataset for 6d pose estimation of 3d objects. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops (ICCVW), pp 2767\u20132776, https:\/\/doi.org\/10.1109\/ICCVW.2019.00338","DOI":"10.1109\/ICCVW.2019.00338"},{"key":"1315_CR46","doi-asserted-by":"publisher","unstructured":"Kehl W, Manhardt F, Tombari F, et\u00a0al (2017) Ssd-6d: Making rgb-based 3d detection and 6d pose estimation great again. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 1521\u20131529, https:\/\/doi.org\/10.1109\/iccv.2017.169","DOI":"10.1109\/iccv.2017.169"},{"key":"1315_CR47","doi-asserted-by":"publisher","unstructured":"Kim J, Park J, Lee K, et\u00a0al (2025) Refpose: Leveraging reference geometric correspondences for accurate 6d pose estimation of unseen objects. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 6447\u20136456, https:\/\/doi.org\/10.1109\/cvpr52734.2025.00604","DOI":"10.1109\/cvpr52734.2025.00604"},{"key":"1315_CR48","doi-asserted-by":"publisher","unstructured":"Labb\u00e9 Y, Carpentier J, Aubry M, et\u00a0al (2020) Cosypose: Consistent multi-view multi-object 6d pose estimation. 
In: Proceedings of the European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, pp 574\u2013591, https:\/\/doi.org\/10.1007\/978-3-030-58520-4_34","DOI":"10.1007\/978-3-030-58520-4_34"},{"key":"1315_CR49","doi-asserted-by":"publisher","unstructured":"Labb\u00e9 Y, Manuelli L, Mousavian A, et\u00a0al (2022) Megapose: 6d pose estimation of novel objects via render & compare. In: Proceedings of the Conference on Robot Learning (CoRL), https:\/\/doi.org\/10.48550\/arXiv.2212.06870","DOI":"10.48550\/arXiv.2212.06870"},{"issue":"2","key":"1315_CR50","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1007\/s11263-008-0152-6","volume":"81","author":"V Lepetit","year":"2009","unstructured":"Lepetit V, Moreno-Noguer F, Fua P (2009) Epnp: an accurate o(n) solution to the pnp problem. Int J Comput Vision 81(2):155\u2013166. https:\/\/doi.org\/10.1007\/s11263-008-0152-6","journal-title":"Int J Comput Vision"},{"key":"1315_CR51","doi-asserted-by":"publisher","unstructured":"Leroy V, Cabon Y, Revaud J (2024) Grounding image matching in 3d with mast3r. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 71\u201391, https:\/\/doi.org\/10.1007\/978-3-031-73220-1_5","DOI":"10.1007\/978-3-031-73220-1_5"},{"key":"1315_CR52","doi-asserted-by":"publisher","unstructured":"Li Y, Wang G, Ji X, et\u00a0al (2018) Deepim: Deep iterative matching for 6d pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 683\u2013698, https:\/\/doi.org\/10.1007\/978-3-030-01231-1_42","DOI":"10.1007\/978-3-030-01231-1_42"},{"key":"1315_CR53","doi-asserted-by":"publisher","unstructured":"Li Z, Wang G, Ji X (2019) Cdpn: Coordinates-based disentangled pose network for real-time rgb-based 6-dof object pose estimation. 
In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 7677\u20137686, https:\/\/doi.org\/10.1109\/ICCV.2019.00777","DOI":"10.1109\/ICCV.2019.00777"},{"key":"1315_CR54","doi-asserted-by":"publisher","unstructured":"Lian R, Ling H (2023) Checkerpose: Progressive dense keypoint localization for object pose estimation with graph neural network. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 13976\u201313987, https:\/\/doi.org\/10.1109\/iccv51070.2023.01289","DOI":"10.1109\/iccv51070.2023.01289"},{"key":"1315_CR55","doi-asserted-by":"publisher","unstructured":"Lin TY, Goyal P, Girshick R, et\u00a0al (2017) Focal loss for dense object detection. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 2980\u20132988, https:\/\/doi.org\/10.1109\/iccv.2017.324","DOI":"10.1109\/iccv.2017.324"},{"key":"1315_CR56","doi-asserted-by":"publisher","unstructured":"Liu W, Anguelov D, Erhan D, et\u00a0al (2016) Ssd: Single shot multibox detector. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 21\u201337, https:\/\/doi.org\/10.1007\/978-3-319-46448-0_2","DOI":"10.1007\/978-3-319-46448-0_2"},{"key":"1315_CR57","doi-asserted-by":"publisher","unstructured":"Liu Y, Wen Y, Peng S, et\u00a0al (2022) Gen6d: Generalizable model-free 6-dof object pose estimation from rgb images. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 298\u2013315, https:\/\/doi.org\/10.1007\/978-3-031-19824-3_18","DOI":"10.1007\/978-3-031-19824-3_18"},{"key":"1315_CR58","doi-asserted-by":"publisher","unstructured":"Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. 
In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 3431\u20133440, https:\/\/doi.org\/10.1109\/CVPR.2015.7298965","DOI":"10.1109\/CVPR.2015.7298965"},{"key":"1315_CR59","doi-asserted-by":"publisher","unstructured":"Lowe DG (1999) Object recognition from local scale-invariant features. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp 1150\u20131157, https:\/\/doi.org\/10.1109\/ICCV.1999.790410","DOI":"10.1109\/ICCV.1999.790410"},{"key":"1315_CR60","doi-asserted-by":"publisher","unstructured":"Manhardt F, Arroyo DM, Rupprecht C, et\u00a0al (2019) Explaining the ambiguity of object detection and 6d pose from visual data. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 6840\u20136849, https:\/\/doi.org\/10.1109\/ICCV.2019.00694","DOI":"10.1109\/ICCV.2019.00694"},{"issue":"12","key":"1315_CR61","doi-asserted-by":"publisher","first-page":"2633","DOI":"10.1109\/TVCG.2015.2513408","volume":"22","author":"E Marchand","year":"2016","unstructured":"Marchand E, Uchiyama H, Spindler F (2016) Pose estimation for augmented reality: a hands-on survey. IEEE Trans Visual Comput Graphics 22(12):2633\u20132651. https:\/\/doi.org\/10.1109\/TVCG.2015.2513408","journal-title":"IEEE Trans Visual Comput Graphics"},{"key":"1315_CR62","doi-asserted-by":"publisher","first-page":"24605","DOI":"10.1007\/s11042-022-14213-z","volume":"82","author":"G Marullo","year":"2023","unstructured":"Marullo G, Livatino S, Gentile C et al (2023) 6d object position estimation from 2d images: a literature review. Multimed Tools Appl 82:24605\u201324643. https:\/\/doi.org\/10.1007\/s11042-022-14213-z","journal-title":"Multimed Tools Appl"},{"key":"1315_CR63","doi-asserted-by":"publisher","unstructured":"Me\u017ea S, Turk \u017d, Dolenc M (2014) Component based engineering of a mobile bim-based augmented reality system. Autom Constr 42:1\u201312. 
https:\/\/doi.org\/10.1016\/j.autcon.2014.02.011","DOI":"10.1016\/j.autcon.2014.02.011"},{"key":"1315_CR64","doi-asserted-by":"publisher","unstructured":"Mildenhall B, Srinivasan PP, Tancik M, et\u00a0al (2020) Nerf: Representing scenes as neural radiance fields for view synthesis. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 405\u2013421, https:\/\/doi.org\/10.1007\/978-3-030-58452-8_24","DOI":"10.1007\/978-3-030-58452-8_24"},{"key":"1315_CR65","doi-asserted-by":"publisher","unstructured":"Moon S, Son H, Hur D, et\u00a0al (2024) Genflow: Generalizable recurrent flow for 6d pose refinement of novel objects. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 10039\u201310049, https:\/\/doi.org\/10.1109\/cvpr52733.2024.00957","DOI":"10.1109\/cvpr52733.2024.00957"},{"key":"1315_CR66","doi-asserted-by":"publisher","unstructured":"Moon S, Son H, Hur D, et\u00a0al (2025) Co-op: Correspondence-based novel object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 11622\u201311632, https:\/\/doi.org\/10.1109\/cvpr52734.2025.01085","DOI":"10.1109\/cvpr52734.2025.01085"},{"key":"1315_CR67","doi-asserted-by":"publisher","unstructured":"Mousavian A, Anguelov D, Flynn J, et\u00a0al (2017) 3d bounding box estimation using deep learning and geometry. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 5632\u20135640, https:\/\/doi.org\/10.1109\/CVPR.2017.597","DOI":"10.1109\/CVPR.2017.597"},{"issue":"1","key":"1315_CR68","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1007\/BF01421486","volume":"14","author":"H Murase","year":"1995","unstructured":"Murase H, Nayar SK (1995) Visual learning and recognition of 3-d objects from appearance. Int J Comput Vision 14(1):5\u201324. 
https:\/\/doi.org\/10.1007\/BF01421486","journal-title":"Int J Comput Vision"},{"key":"1315_CR69","doi-asserted-by":"publisher","unstructured":"Nguyen VN, Groueix T, Ponimatkin G, et\u00a0al (2023) Cnos: A strong baseline for cad-based novel object segmentation. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision Workshops (ICCVW), pp 2126\u20132132, https:\/\/doi.org\/10.1109\/ICCVW60793.2023.00227","DOI":"10.1109\/ICCVW60793.2023.00227"},{"key":"1315_CR70","doi-asserted-by":"publisher","unstructured":"Nguyen VN, Groueix T, Salzmann M, et\u00a0al (2024) Gigapose: Fast and robust novel object pose estimation via one correspondence. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 9903\u20139913, https:\/\/doi.org\/10.1109\/cvpr52733.2024.00945","DOI":"10.1109\/cvpr52733.2024.00945"},{"key":"1315_CR71","doi-asserted-by":"publisher","unstructured":"Oberweger M, Wohlhart P, Lepetit V (2015) Training a feedback loop for hand pose estimation. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 3316\u20133324, https:\/\/doi.org\/10.1109\/ICCV.2015.379","DOI":"10.1109\/ICCV.2015.379"},{"key":"1315_CR72","doi-asserted-by":"publisher","unstructured":"Oberweger M, Rad M, Lepetit V (2018) Making deep heatmaps robust to partial occlusions for 3d object pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, pp 119\u2013134, https:\/\/doi.org\/10.1007\/978-3-030-01267-0_8","DOI":"10.1007\/978-3-030-01267-0_8"},{"key":"1315_CR73","unstructured":"Oquab M, Darcet T, Moutakanni T, et\u00a0al (2024) DINOv2: Learning robust visual features without supervision. 
Transactions on Machine Learning Research https:\/\/openreview.net\/forum?id=a68SUt6zFt"},{"key":"1315_CR74","doi-asserted-by":"publisher","unstructured":"Osokin A, Sumin D, Lomakin V (2020) Os2d: One-stage one-shot object detection by matching anchor features. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 635\u2013652, https:\/\/doi.org\/10.1007\/978-3-030-58555-6_38","DOI":"10.1007\/978-3-030-58555-6_38"},{"key":"1315_CR75","doi-asserted-by":"publisher","first-page":"215","DOI":"10.1016\/j.rcim.2017.06.002","volume":"49","author":"R Palmarini","year":"2018","unstructured":"Palmarini R, Erkoyuncu JA, Roy R et al (2018) A systematic review of augmented reality applications in maintenance. Robot Comput-Integr Manuf 49:215\u2013228. https:\/\/doi.org\/10.1016\/j.rcim.2017.06.002","journal-title":"Robot Comput-Integr Manuf"},{"key":"1315_CR76","doi-asserted-by":"publisher","unstructured":"Park J, Kim J, Cho NI (2024) Leveraging positional encoding for robust multi-reference-based object 6d pose estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) abs\/2401.16284. https:\/\/doi.org\/10.48550\/arXiv.2401.16284","DOI":"10.48550\/arXiv.2401.16284"},{"key":"1315_CR77","doi-asserted-by":"publisher","unstructured":"Park K, Patten T, Vincze M (2019) Pix2pose: Pixel-wise coordinate regression of objects for 6d pose estimation. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 7667\u20137676, https:\/\/doi.org\/10.1109\/ICCV.2019.00776","DOI":"10.1109\/ICCV.2019.00776"},{"key":"1315_CR78","doi-asserted-by":"publisher","unstructured":"Peng S, Liu Y, Huang Q, et\u00a0al (2019) Pvnet: Pixel-wise voting network for 6dof pose estimation. 
In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 4556\u20134565, https:\/\/doi.org\/10.1109\/cvpr.2019.00469","DOI":"10.1109\/cvpr.2019.00469"},{"key":"1315_CR79","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2023.104490","volume":"168","author":"AS Periyasamy","year":"2023","unstructured":"Periyasamy AS, Amini A, Tsaturyan V et al (2023) Yolopose v2: Understanding and improving transformer-based 6d pose estimation. Robot Auton Syst 168:104490. https:\/\/doi.org\/10.1016\/j.robot.2023.104490","journal-title":"Robot Auton Syst"},{"key":"1315_CR80","doi-asserted-by":"publisher","unstructured":"Pitteri G, Bugeau A, Ilic S, et\u00a0al (2020) 3d object detection and pose estimation of unseen objects in color images with local surface embeddings. In: Proceedings of the Asian Conference on Computer Vision (ACCV), pp 38\u201354, https:\/\/doi.org\/10.1007\/978-3-030-69525-5_3","DOI":"10.1007\/978-3-030-69525-5_3"},{"key":"1315_CR81","doi-asserted-by":"publisher","unstructured":"Rad M, Lepetit V (2017) Bb8: A scalable, accurate, robust to partial occlusion method for predicting the 3d poses of challenging objects without using depth. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 3848\u20133856, https:\/\/doi.org\/10.1109\/ICCV.2017.413","DOI":"10.1109\/ICCV.2017.413"},{"key":"1315_CR82","unstructured":"Redmon J, Farhadi A (2018) Yolov3: An incremental improvement. arXiv:1804.02767"},{"key":"1315_CR83","doi-asserted-by":"publisher","unstructured":"Redmon J, Divvala S, Girshick R, et\u00a0al (2016) You only look once: Unified, real-time object detection. 
In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 779\u2013788, https:\/\/doi.org\/10.1109\/cvpr.2016.91","DOI":"10.1109\/cvpr.2016.91"},{"issue":"6","key":"1315_CR84","doi-asserted-by":"publisher","first-page":"1137","DOI":"10.1109\/tpami.2016.2577031","volume":"39","author":"S Ren","year":"2017","unstructured":"Ren S, He K, Girshick R et al (2017) Faster r-cnn: Towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137\u20131149. https:\/\/doi.org\/10.1109\/tpami.2016.2577031","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"issue":"2","key":"1315_CR85","doi-asserted-by":"publisher","first-page":"1179","DOI":"10.1109\/lra.2016.2532924","volume":"1","author":"C Rennie","year":"2016","unstructured":"Rennie C, Shome R, Bekris KE et al (2016) A dataset for improved rgbd-based object detection and pose estimation for warehouse pick-and-place. IEEE Robot Automa Lett 1(2):1179\u20131185. https:\/\/doi.org\/10.1109\/lra.2016.2532924","journal-title":"IEEE Robot Automa Lett"},{"key":"1315_CR86","doi-asserted-by":"publisher","unstructured":"Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation. In: Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI), pp 234\u2013241, https:\/\/doi.org\/10.1007\/978-3-319-24574-4_28","DOI":"10.1007\/978-3-319-24574-4_28"},{"key":"1315_CR87","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2020.103898","volume":"96","author":"C Sahin","year":"2020","unstructured":"Sahin C, Garcia-Hernando G, Sock J et al (2020) A review on object pose recovery: from 3d bounding box detectors to full 6d pose estimators. Image Vis Comput 96:103898. 
https:\/\/doi.org\/10.1016\/j.imavis.2020.103898","journal-title":"Image Vis Comput"},{"key":"1315_CR88","doi-asserted-by":"publisher","unstructured":"Saito S, Huang Z, Natsume R, et\u00a0al (2019) Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 2304\u2013231, https:\/\/doi.org\/10.1109\/ICCV.2019.00239, arXiv:1905.05172","DOI":"10.1109\/ICCV.2019.00239"},{"key":"1315_CR89","doi-asserted-by":"publisher","unstructured":"Shugurov I, Li F, Busam B, et\u00a0al (2022) Osop: A multi-stage one shot object pose estimation framework. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 6835\u20136844, https:\/\/doi.org\/10.1109\/cvpr52688.2022.00671","DOI":"10.1109\/cvpr52688.2022.00671"},{"key":"1315_CR90","doi-asserted-by":"publisher","unstructured":"Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. https:\/\/doi.org\/10.48550\/arXiv.1409.1556","DOI":"10.48550\/arXiv.1409.1556"},{"key":"1315_CR91","doi-asserted-by":"publisher","unstructured":"Song C, Song J, Huang Q (2020) Hybridpose: 6d object pose estimation under hybrid representations. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 428\u2013437, https:\/\/doi.org\/10.1109\/CVPR42600.2020.00051, arXiv:2001.01869","DOI":"10.1109\/CVPR42600.2020.00051"},{"key":"1315_CR92","doi-asserted-by":"publisher","unstructured":"Stapf S, Bauernfeind T, Riboldi M (2023) Pvit-6d: Overclocking vision transformers for 6d pose estimation with confidence-level prediction and pose tokens. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) abs\/2311.17504. 
https:\/\/doi.org\/10.48550\/arXiv.2311.17504","DOI":"10.48550\/arXiv.2311.17504"},{"key":"1315_CR93","doi-asserted-by":"publisher","unstructured":"Su Y, Saleh M, Fetzer T, et\u00a0al (2022) Zebrapose: Coarse to fine surface encoding for 6dof object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 6728\u20136738, https:\/\/doi.org\/10.1109\/cvpr52688.2022.00662","DOI":"10.1109\/cvpr52688.2022.00662"},{"key":"1315_CR94","doi-asserted-by":"publisher","unstructured":"Sun J, Wang Z, Zhang S, et\u00a0al (2022) Onepose: One-shot object pose estimation without cad models. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 6815\u20136824, https:\/\/doi.org\/10.1109\/cvpr52688.2022.00670","DOI":"10.1109\/cvpr52688.2022.00670"},{"key":"1315_CR95","doi-asserted-by":"publisher","unstructured":"Sundermeyer M, Marton ZC, Durner M, et\u00a0al (2018) Implicit 3d orientation learning for 6d object detection from rgb images. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 699\u2013715, https:\/\/doi.org\/10.1007\/978-3-030-01231-1_43","DOI":"10.1007\/978-3-030-01231-1_43"},{"key":"1315_CR96","doi-asserted-by":"publisher","unstructured":"Sundermeyer M, Durner M, Puang EY, et\u00a0al (2020) Multi-path learning for object pose estimation across domains. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 13913\u201313922, https:\/\/doi.org\/10.1109\/cvpr42600.2020.01393","DOI":"10.1109\/cvpr42600.2020.01393"},{"key":"1315_CR97","doi-asserted-by":"publisher","unstructured":"Sundermeyer M, Hoda\u0148 T, Labb\u00e9 Y, et\u00a0al (2023) Bop challenge 2022 on detection, segmentation and pose estimation of specific rigid objects. 
In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp 2785\u20132794, https:\/\/doi.org\/10.1109\/cvprw59228.2023.00279","DOI":"10.1109\/cvprw59228.2023.00279"},{"key":"1315_CR98","doi-asserted-by":"publisher","unstructured":"Tan T, Dong Q (2025) Onda-pose: Occlusion-aware neural domain adaptation for self-supervised 6d object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 16829\u201316838, https:\/\/doi.org\/10.1109\/cvpr52734.2025.01568","DOI":"10.1109\/cvpr52734.2025.01568"},{"key":"1315_CR99","doi-asserted-by":"publisher","unstructured":"Tejani A, Tang D, Kouskouridas R, et\u00a0al (2014) Latent-class hough forests for 3d object detection and pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 462\u2013477, https:\/\/doi.org\/10.1007\/978-3-319-10599-4_30","DOI":"10.1007\/978-3-319-10599-4_30"},{"key":"1315_CR100","doi-asserted-by":"publisher","unstructured":"Tekin B, Sinha SN, Fua P (2018) Real-time seamless single shot 6d object pose prediction. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 292\u2013301, https:\/\/doi.org\/10.1109\/cvpr.2018.00038","DOI":"10.1109\/cvpr.2018.00038"},{"key":"1315_CR101","doi-asserted-by":"publisher","unstructured":"Tian Z, Shen C, Chen H, et\u00a0al (2019) Fcos: Fully convolutional one-stage object detection. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 9627\u20139636, https:\/\/doi.org\/10.1109\/iccv.2019.00972","DOI":"10.1109\/iccv.2019.00972"},{"key":"1315_CR102","doi-asserted-by":"publisher","unstructured":"Tyree S, Tremblay J, To T, et\u00a0al (2022) 6-dof pose estimation of household objects for robotic manipulation: An accessible dataset and benchmark. 
In: Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), https:\/\/doi.org\/10.1109\/iros47612.2022.9981838","DOI":"10.1109\/iros47612.2022.9981838"},{"key":"1315_CR103","doi-asserted-by":"publisher","unstructured":"Valentin JPC, Vineet V, Cheng MM, et\u00a0al (2015) Semanticpaint: interactive 3d labeling and learning at your fingertips. ACM Transactions on Graphics 34(5):154:1\u2013154:17. https:\/\/doi.org\/10.1145\/2751556","DOI":"10.1145\/2751556"},{"key":"1315_CR104","doi-asserted-by":"publisher","unstructured":"Wang C, Xu D, Zhu Y, et\u00a0al (2019a) Densefusion: 6d object pose estimation by iterative dense fusion. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 3338\u20133347, https:\/\/doi.org\/10.1109\/cvpr.2019.00346","DOI":"10.1109\/cvpr.2019.00346"},{"key":"1315_CR105","doi-asserted-by":"publisher","unstructured":"Wang G, Manhardt F, Tombari F, et\u00a0al (2021) Gdr-net: Geometry-guided direct regression network for monocular 6d object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 16611\u201316621, https:\/\/doi.org\/10.1109\/CVPR46437.2021.01634, arXiv:2102.12145","DOI":"10.1109\/CVPR46437.2021.01634"},{"issue":"5","key":"1315_CR106","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3326362","volume":"38","author":"Y Wang","year":"2019","unstructured":"Wang Y, Sun Y, Liu Z et al (2019) Dynamic graph cnn for learning on point clouds. ACM Trans Graphics 38(5):1\u201312. https:\/\/doi.org\/10.1145\/3326362","journal-title":"ACM Trans Graphics"},{"key":"1315_CR107","doi-asserted-by":"publisher","unstructured":"Wu D, Zhuang Z, Xiang C, et\u00a0al (2019) 6d-vnet: End-to-end 6dof vehicle pose estimation from monocular rgb images. 
In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), https:\/\/doi.org\/10.1109\/cvprw.2019.00163","DOI":"10.1109\/cvprw.2019.00163"},{"key":"1315_CR108","doi-asserted-by":"publisher","unstructured":"Xiang Y, Schmidt T, Narayanan V, et\u00a0al (2018) Posecnn: A convolutional neural network for 6d object pose estimation in cluttered scenes. In: Proceedings of the Robotics: Science and Systems XIV, RSS2018, https:\/\/doi.org\/10.15607\/rss.2018.xiv.019","DOI":"10.15607\/rss.2018.xiv.019"},{"key":"1315_CR109","doi-asserted-by":"publisher","unstructured":"Xie Y, Jiang H, Xie J (2024) Mask6d: Masked pose priors for 6d object pose estimation. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp 3545\u20133549, https:\/\/doi.org\/10.1109\/icassp48485.2024.10447716","DOI":"10.1109\/icassp48485.2024.10447716"},{"key":"1315_CR110","doi-asserted-by":"publisher","unstructured":"Xu L, Qu H, Cai Y, et\u00a0al (2024) 6d-diff: A keypoint diffusion framework for 6d object pose estimation. In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 9676\u20139686, https:\/\/doi.org\/10.1109\/cvpr52733.2024.00924","DOI":"10.1109\/cvpr52733.2024.00924"},{"key":"1315_CR111","doi-asserted-by":"publisher","unstructured":"Zakharov S, Shugurov I, Ilic S (2019) Dpod: 6d pose object detector and refiner. In: Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV), pp 2061\u20132070, https:\/\/doi.org\/10.1109\/ICCV.2019.00203","DOI":"10.1109\/ICCV.2019.00203"},{"key":"1315_CR112","doi-asserted-by":"publisher","unstructured":"Zhou Y, Barnes C, Lu J, et\u00a0al (2019) On the continuity of rotation representations in neural networks. 
In: Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 5738\u20135746, https:\/\/doi.org\/10.1109\/cvpr.2019.00589","DOI":"10.1109\/cvpr.2019.00589"},{"key":"1315_CR113","doi-asserted-by":"publisher","unstructured":"\u00d6rnek EP, Labb\u00e9 Y, Tekin B, et\u00a0al (2024) Foundpose: Unseen object pose estimation with foundation features. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 163\u2013182, https:\/\/doi.org\/10.1007\/978-3-031-73347-5_10","DOI":"10.1007\/978-3-031-73347-5_10"}],"container-title":["Virtual Reality"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10055-026-01315-4","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10055-026-01315-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10055-026-01315-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,6,22]],"date-time":"2026-06-22T05:29:28Z","timestamp":1782106168000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10055-026-01315-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,8]]},"references-count":113,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,6]]}},"alternative-id":["1315"],"URL":"https:\/\/doi.org\/10.1007\/s10055-026-01315-4","relation":{},"ISSN":["1434-9957"],"issn-type":[{"value":"1434-9957","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,8]]},"assertion":[{"value":"25 March 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"9 January 
2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 February 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"57"}}