{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,28]],"date-time":"2026-02-28T20:03:26Z","timestamp":1772309006562,"version":"3.50.1"},"reference-count":48,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2023,9,27]],"date-time":"2023-09-27T00:00:00Z","timestamp":1695772800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Robot. AI"],"abstract":"<jats:p>6D pose recognition has been a crucial factor in the success of robotic grasping, and recent deep learning based approaches have achieved remarkable results on benchmarks. However, their generalization capabilities in real-world applications remain unclear. To overcome this gap, we introduce 6IMPOSE, a novel framework for sim-to-real data generation and 6D pose estimation. 6IMPOSE consists of four modules: First, a data generation pipeline that employs the 3D software suite Blender to create synthetic RGBD image datasets with 6D pose annotations. Second, an annotated RGBD dataset of five household objects was generated using the proposed pipeline. Third, a real-time two-stage 6D pose estimation approach that integrates the object detector YOLO-V4 and a streamlined, real-time version of the 6D pose estimation algorithm PVN3D optimized for time-sensitive robotics applications. Fourth, a codebase designed to facilitate the integration of the vision system into a robotic grasping experiment. Our approach demonstrates the efficient generation of large amounts of photo-realistic RGBD images and the successful transfer of the trained inference model to robotic grasping experiments, achieving an overall success rate of 87% in grasping five different household objects from cluttered backgrounds under varying lighting conditions. This is made possible by fine-tuning data generation and domain randomization techniques and optimizing the inference pipeline, overcoming the generalization and performance shortcomings of the original PVN3D algorithm. Finally, we make the code, synthetic dataset, and all the pre-trained models available on GitHub.<\/jats:p>","DOI":"10.3389\/frobt.2023.1176492","type":"journal-article","created":{"date-parts":[[2023,9,28]],"date-time":"2023-09-28T04:15:45Z","timestamp":1695874545000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":17,"title":["6IMPOSE: bridging the reality gap in 6D pose estimation for robotic grasping"],"prefix":"10.3389","volume":"10","author":[{"given":"Hongpeng","family":"Cao","sequence":"first","affiliation":[]},{"given":"Lukas","family":"Dirnberger","sequence":"additional","affiliation":[]},{"given":"Daniele","family":"Bernardini","sequence":"additional","affiliation":[]},{"given":"Cristina","family":"Piazza","sequence":"additional","affiliation":[]},{"given":"Marco","family":"Caccamo","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2023,9,27]]},"reference":[{"key":"B1","first-page":"265","article-title":"TensorFlow: A system for large-scale machine learning","author":"Abadi","year":"2016"},{"key":"B2","first-page":"4233","article-title":"G2l-net: global to local network for real-time 6d pose estimation with embedding vector features","author":"Chen","year":"2020"},{"key":"B3","volume-title":"Blender - a 3D modelling and rendering package","author":"Community","year":"2018"},{"key":"B4","first-page":"3665","article-title":"Self-supervised 6d object pose estimation for robot manipulation","author":"Deng","year":"2020"},{"key":"B5","doi-asserted-by":"crossref","first-page":"890","DOI":"10.1007\/978-3-030-95459-8_55","article-title":"A billion ways to grasp: an evaluation of grasp sampling schemes on a dense, physics-based grasp data set","volume-title":"Robotics research","author":"Eppner","year":"2022"},{"key":"B6","first-page":"6222","article-title":"Acronym: A large-scale grasp dataset based on simulation","author":"Eppner","year":"2021"},{"key":"B7","first-page":"1","article-title":"The pascal visual object classes challenge 2012 (voc2012) development kit","volume":"2007","author":"Everingham","year":"2012","journal-title":"Pattern Anal. Stat. Model. comput. Learn., Tech. Rep."},{"key":"B8","first-page":"345","article-title":"Learning rich features from rgb-d images for object detection and segmentation","volume-title":"European conference on computer vision","author":"Gupta","year":"2014"},{"key":"B9","first-page":"935","article-title":"Bridging the reality gap for pose estimation networks using sensor-based domain randomization","author":"Hagelskj\u00e6r","year":"2021"},{"key":"B10","first-page":"3003","article-title":"Ffb6d: A full flow bidirectional fusion network for 6d pose estimation","author":"He","year":"2021"},{"key":"B11","first-page":"11632","article-title":"Pvn3d: A deep point-wise 3d keypoints voting network for 6dof pose estimation","author":"He","year":""},{"key":"B12","volume-title":"Supplementary material\u2013pvn3d: A deep point-wise 3d keypoints voting network for 6dof pose estimation","author":"He","year":""},{"key":"B13","first-page":"548","article-title":"Model based training, detection and pose estimation of texture-less 3d objects in heavily cluttered scenes","volume-title":"Asian conference on computer vision","author":"Hinterstoisser","year":"2012"},{"key":"B14","first-page":"2684","article-title":"Adaptive neighborhood selection for real-time surface normal estimation from organized point cloud data using integral images","author":"Holzer","year":"2012"},{"key":"B15","first-page":"3225","article-title":"Sim2real instance-level style transfer for 6d pose estimation","author":"Ikeda","year":"2022"},{"key":"B16","doi-asserted-by":"crossref","DOI":"10.1109\/ICCVW.2019.00338","article-title":"Homebreweddb: rgb-d dataset for 6d pose estimation of 3d objects","author":"Kaskman","year":"2019"},{"key":"B17","first-page":"1521","article-title":"Ssd-6d: making rgb-based 3d detection and 6d pose estimation great again","author":"Kehl","year":"2017"},{"key":"B18","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1007\/s43154-020-00021-6","article-title":"A survey on learning-based robotic grasping","volume":"1","author":"Kleeberger","year":"2020","journal-title":"Curr. Robot. Rep."},{"key":"B19","doi-asserted-by":"crossref","first-page":"594","DOI":"10.1007\/978-3-030-95892-3_45","article-title":"Automatic grasp pose generation for parallel jaw grippers","volume-title":"Intelligent autonomous systems 16","author":"Kleeberger","year":"2022"},{"key":"B20","first-page":"9128","article-title":"Towards an intelligent collaborative robotic system for mixed case palletizing","author":"Lamon","year":"2020"},{"key":"B21","first-page":"254","article-title":"A unified framework for multi-view multi-class object pose estimation","author":"Li","year":""},{"key":"B22","doi-asserted-by":"publisher","first-page":"2718","DOI":"10.1109\/tmech.2019.2945135","article-title":"A survey of methods and strategies for high-precision robotic grasping and assembly tasks\u2014Some new trends","volume":"24","author":"Li","year":"2019","journal-title":"IEEE\/ASME Trans. Mechatronics"},{"key":"B23","doi-asserted-by":"crossref","DOI":"10.1145\/3284398.3284408","article-title":"Weakly supervised 6d pose estimation for robotic grasping","author":"Li","year":""},{"key":"B24","doi-asserted-by":"publisher","first-page":"6526","DOI":"10.1109\/LRA.2022.3174261","article-title":"E2ek: end-to-end regression network based on keypoint for 6d pose estimation","volume":"7","author":"Lin","year":"2022","journal-title":"IEEE Robotics Automation Lett."},{"key":"B25","first-page":"740","article-title":"Microsoft coco: common objects in context","volume-title":"European conference on computer vision","author":"Lin","year":"2014"},{"key":"B26","first-page":"21","article-title":"Ssd: single shot multibox detector","volume-title":"European conference on computer vision","author":"Liu","year":"2016"},{"key":"B27","doi-asserted-by":"publisher","first-page":"1731","DOI":"10.1109\/jsen.2014.2309987","article-title":"Characterizations of noise in kinect depth images: A review","volume":"14","author":"Mallick","year":"2014","journal-title":"IEEE Sensors J."},{"key":"B28","first-page":"640","article-title":"Estimating surface normals with depth image gradients for fast and accurate registration","author":"Nakagawa","year":"2015"},{"key":"B29","first-page":"4561","article-title":"Pvnet: pixel-wise voting network for 6dof pose estimation","author":"Peng","year":"2019"},{"key":"B30","first-page":"681","article-title":"Improving noise","author":"Perlin","year":"2002"},{"key":"B31","article-title":"Pointnet++: deep hierarchical feature learning on point sets in a metric space","volume-title":"Advances in neural information processing systems 30","author":"Qi","year":"2017"},{"key":"B32","unstructured":"Darknet: open source neural networks in c\n            RedmonJ.\n          2016"},{"key":"B33","first-page":"188","article-title":"Style-transfer gans for bridging the domain gap in synthetic pose estimator training","author":"Rojtberg","year":"2020"},{"key":"B34","first-page":"699","article-title":"Implicit 3d orientation learning for 6d object detection from rgb images","author":"Sundermeyer","year":"2018"},{"key":"B35","first-page":"14","article-title":"Sydd: synthetic depth data randomization for object detection using domain-relevant background","volume":"2019","author":"Thalhammer","year":"","journal-title":"TUGraz OPEN Libr."},{"key":"B36","first-page":"106","article-title":"Sydpose: object detection and pose estimation in cluttered real-world depth images trained using only synthetic data","author":"Thalhammer","year":""},{"key":"B37","first-page":"23","article-title":"Domain randomization for transferring deep neural networks from simulation to the real world","author":"Tobin","year":"2017"},{"key":"B38","first-page":"13029","article-title":"Scaled-YOLOv4: scaling cross stage partial network","author":"Wang","year":"2021"},{"key":"B39","first-page":"3343","article-title":"Densefusion: 6d object pose estimation by iterative dense fusion","author":"Wang","year":""},{"key":"B40","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3326362","article-title":"Dynamic graph cnn for learning on point clouds","volume":"38","author":"Wang","year":"","journal-title":"Acm Trans. Graph. (tog)"},{"key":"B41","unstructured":"Posecnn: A convolutional neural network for 6d object pose estimation in cluttered scenes\n            XiangY.\n            SchmidtT.\n            NarayananV.\n            FoxD.\n          2018"},{"key":"B42","first-page":"3485","article-title":"Sun database: large-scale scene recognition from abbey to zoo","author":"Xiao","year":"2010"},{"key":"B43","first-page":"244","article-title":"Pointfusion: deep sensor fusion for 3d bounding box estimation","author":"Xu","year":"2018"},{"key":"B44","doi-asserted-by":"crossref","DOI":"10.1109\/3DV.2018.00012","article-title":"Keep it unreal: bridging the realism gap for 2.5d recognition with geometry priors only","author":"Zakharov","year":"2018"},{"key":"B45","first-page":"1941","article-title":"Dpod: 6d pose object detector and refiner","author":"Zakharov","year":"2019"},{"key":"B46","doi-asserted-by":"publisher","first-page":"3876","DOI":"10.1109\/TIE.2021.3075836","article-title":"A practical robotic grasping method by using 6-d pose estimation with protective correction","volume":"69","author":"Zhang","year":"2022","journal-title":"IEEE Trans. Industrial Electron."},{"key":"B47","first-page":"2881","article-title":"Pyramid scene parsing network","author":"Zhao","year":"2017"},{"key":"B48","first-page":"5745","article-title":"On the continuity of rotation representations in neural networks","author":"Zhou","year":"2019"}],"container-title":["Frontiers in Robotics and AI"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frobt.2023.1176492\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,9,28]],"date-time":"2023-09-28T04:15:53Z","timestamp":1695874553000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frobt.2023.1176492\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,27]]},"references-count":48,"alternative-id":["10.3389\/frobt.2023.1176492"],"URL":"https:\/\/doi.org\/10.3389\/frobt.2023.1176492","relation":{},"ISSN":["2296-9144"],"issn-type":[{"value":"2296-9144","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,27]]},"article-number":"1176492"}}