{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T12:23:13Z","timestamp":1774959793337,"version":"3.50.1"},"reference-count":38,"publisher":"MDPI AG","issue":"18","license":[{"start":{"date-parts":[[2021,9,11]],"date-time":"2021-09-11T00:00:00Z","timestamp":1631318400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>This paper addresses the problem of pose estimation from 2D images for textureless industrial metallic parts for a semistructured bin-picking task. The appearance of metallic reflective parts is highly dependent on the camera viewing direction, as well as the distribution of light on the object, making conventional vision-based methods unsuitable for the task. We propose a solution using direct light at a fixed position to the camera, mounted directly on the robot\u2019s gripper, that allows us to take advantage of the reflective properties of the manipulated object. We propose a data-driven approach based on convolutional neural networks (CNN), without the need for a hard-coded geometry of the manipulated object. The solution was modified for an industrial application and extensively tested in a real factory. Our solution uses a cheap 2D camera and allows for a semi-automatic data-gathering process on-site.<\/jats:p>","DOI":"10.3390\/s21186093","type":"journal-article","created":{"date-parts":[[2021,9,12]],"date-time":"2021-09-12T21:48:01Z","timestamp":1631483281000},"page":"6093","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Data-Driven Object Pose Estimation in a Practical Bin-Picking Application"],"prefix":"10.3390","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8405-269X","authenticated-orcid":false,"given":"Viktor","family":"Koz\u00e1k","sequence":"first","affiliation":[{"name":"Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Jugosl\u00e1vsk\u00fdch Partyz\u00e1n\u016f 1580\/3, 160 00 Praha 6, Czech Republic"},{"name":"Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague, Karlovo N\u00e1m\u011bst\u00ed 13, 121 35 Praha 2, Czech Republic"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Roman","family":"Sushkov","sequence":"additional","affiliation":[{"name":"Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Jugosl\u00e1vsk\u00fdch Partyz\u00e1n\u016f 1580\/3, 160 00 Praha 6, Czech Republic"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0997-5889","authenticated-orcid":false,"given":"Miroslav","family":"Kulich","sequence":"additional","affiliation":[{"name":"Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Jugosl\u00e1vsk\u00fdch Partyz\u00e1n\u016f 1580\/3, 160 00 Praha 6, Czech Republic"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Libor","family":"P\u0159eu\u010dil","sequence":"additional","affiliation":[{"name":"Czech Institute of Informatics, Robotics, and Cybernetics, Czech Technical University in Prague, Jugosl\u00e1vsk\u00fdch Partyz\u00e1n\u016f 1580\/3, 160 00 Praha 6, Czech Republic"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,9,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"172","DOI":"10.1109\/TASE.2016.2600527","article-title":"Analysis and Observations From the First Amazon Picking Challenge","volume":"15","author":"Correll","year":"2016","journal-title":"IEEE Trans. Autom. Sci. Eng."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Mart\u00ednez, C., Boca, R., Zhang, B., Chen, H., and Nidamarthi, S. (2015, January 11\u201312). Automated bin-picking system for randomly located industrial parts. Proceedings of the 2015 IEEE International Conference on Technologies for Practical Robot Applications (TePRA), Woburn, MA, USA.","DOI":"10.1109\/TePRA.2015.7219656"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Rodrigues, J.J., Kim, J.S., Furukawa, M., Xavier, J., Aguiar, P., and Kanade, T. (2012, January 7\u201312). 6D pose estimation of textureless shiny objects using random ferns for bin-picking. Proceedings of the 2012 IEEE\/RSJ International Conference on Intelligent Robots and Systems, Vilamoura-Algarve, Portugal.","DOI":"10.1109\/IROS.2012.6385680"},{"key":"ref_4","unstructured":"Shen, X. (2019). A survey of Object Classification and Detection based on 2D\/3D data. arXiv."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"483","DOI":"10.1016\/j.vrih.2019.09.003","article-title":"Survey of 3D modeling using depth cameras","volume":"1","author":"Xu","year":"2019","journal-title":"Virtual Real. Intell. Hardw."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"91","DOI":"10.1023\/B:VISI.0000029664.99615.94","article-title":"Distinctive Image Features from Scale-Invariant Keypoints","volume":"60","author":"Lowe","year":"2004","journal-title":"Int. J. Comput. Vis."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Tola, E., Lepetit, V., and Fua, P. (2008, January 23\u201328). A fast local descriptor for dense matching. Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA.","DOI":"10.1109\/CVPR.2008.4587673"},{"key":"ref_8","unstructured":"Park, K., Patten, T., and Vincze, M. (November, January 27). Pix2pose: Pixel-wise coordinate regression of objects for 6d pose estimation. Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Ulrich, M., Wiedemann, C., and Steger, C. (2009, January 12\u201317). CAD-based recognition of 3D objects in monocular images. Proceedings of the 2009 IEEE International Conference on Robotics and Automation, Kobe, Japan.","DOI":"10.1109\/ROBOT.2009.5152511"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Hinterstoisser, S., Lepetit, V., Ilic, S., Holzer, S., Bradski, G., Konolige, K., and Navab, N. (2012). Model based training, detection and pose estimation of texture-less 3d objects in heavily cluttered scenes. Asian Conference on Computer Vision, Springer.","DOI":"10.1007\/978-3-642-33885-4_60"},{"key":"ref_11","unstructured":"Rodrigues, J.J.M. (2018). 3D Pose Estimation for Bin-Picking: A Data-Driven Approach Using Multi-Light Images. [Ph.D. Thesis, Carnegie Mellon University]."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"3212","DOI":"10.1109\/TNNLS.2018.2876865","article-title":"Object Detection With Deep Learning: A Review","volume":"30","author":"Zhao","year":"2019","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Rad, M., and Lepetit, V. (2017, January 22\u201329). BB8: A Scalable, Accurate, Robust to Partial Occlusion Method for Predicting the 3D Poses of Challenging Objects without Using Depth. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.","DOI":"10.1109\/ICCV.2017.413"},{"key":"ref_14","unstructured":"Zhao, Z., Peng, G., Wang, H., Fang, H.S., Li, C., and Lu, C. (2018). Estimating 6D Pose From Localizing Designated Surface Keypoints. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Hu, Y., Hugonot, J., Fua, P., and Salzmann, M. (2019, January 15\u201320). Segmentation-Driven 6D Object Pose Estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00350"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Peng, S., Liu, Y., Huang, Q., Zhou, X., and Bao, H. (2019, January 15\u201320). PVNet: Pixel-Wise Voting Network for 6DoF Pose Estimation. Proceedings of the 2019 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00469"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Xiang, Y., Schmidt, T., Narayanan, V., and Fox, D. (2018, January 26\u201330). PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes. Proceedings of the Robotics: Science and Systems 2018, Pittsburgh, PA, USA.","DOI":"10.15607\/RSS.2018.XIV.019"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Tekin, B., Sinha, S., and Fua, P. (2018, January 18\u201323). Real-Time Seamless Single Shot 6D Object Pose Prediction. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00038"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Kehl, W., Manhardt, F., Tombari, F., Ilic, S., and Navab, N. (2017, January 22\u201329). SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again. Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCV.2017.169"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Labbe, Y., Carpentier, J., Aubry, M., and Sivic, J. (2020, January 23\u201328). CosyPose: Consistent multi-view multi-object 6D pose estimation. Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK.","DOI":"10.1007\/978-3-030-58520-4_34"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Pinto, L., and Gupta, A. (2016, January 16\u201321). Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours. Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden.","DOI":"10.1109\/ICRA.2016.7487517"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"421","DOI":"10.1177\/0278364917710318","article-title":"Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection","volume":"37","author":"Levine","year":"2016","journal-title":"Int. J. Robot. Res."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Calli, B., Singh, A., Walsman, A., Srinivasa, S., Abbeel, P., and Dollar, A. (2015, January 27\u201331). The YCB object and Model set: Towards common benchmarks for manipulation research. Proceedings of the 2015 International Conference on Advanced Robotics (ICAR), Istanbul, Turkey.","DOI":"10.1109\/ICAR.2015.7251504"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Krull, A., Brachmann, E., Michel, F., Yang, M.Y., Gumhold, S., and Rother, C. (2015, January 7\u201313). Learning Analysis-by-Synthesis for 6D Pose Estimation in RGB-D Images. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile.","DOI":"10.1109\/ICCV.2015.115"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Hoda\u0148, T., Michel, F., Brachmann, E., Kehl, W., Glent Buch, A., Kraft, D., Drost, B., Vidal, J., Ihrke, S., and Zabulis, X. (2018, January 8\u201314). BOP: Benchmark for 6D Object Pose Estimation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01249-6_2"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Hoda\u0148, T., Sundermeyer, M., Drost, B., Labb\u00e9, Y., Brachmann, E., Michel, F., Rother, C., and Matas, J. (2020, January 23\u201328). BOP Challenge 2020 on 6D Object Localization. Proceedings of the European Conference on Computer Vision Workshops (ECCVW), Glasgow, UK.","DOI":"10.1007\/978-3-030-66096-3_39"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Hoda\u0148, T., Haluza, P., Obdr\u017e\u00e1lek, S., Matas, J., Lourakis, M., and Zabulis, X. (2017, January 24\u201331). T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-Less Objects. Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA.","DOI":"10.1109\/WACV.2017.103"},{"key":"ref_28","unstructured":"Dalal, N., and Triggs, B. (2005, January 20\u201325). Histograms of oriented gradients for human detection. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Walsh, J., O\u2019 Mahony, N., Campbell, S., Carvalho, A., Krpalkova, L., Velasco-Hernandez, G., Harapanahalli, S., and Riordan, D. (2019). Deep Learning vs. Traditional Computer Vision. Science and Information Conference, Springer.","DOI":"10.1007\/978-3-030-17795-9_10"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"128837","DOI":"10.1109\/ACCESS.2019.2939201","article-title":"A Survey of Deep Learning-Based Object Detection","volume":"7","author":"Jiao","year":"2019","journal-title":"IEEE Access"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"1137","DOI":"10.1109\/TPAMI.2016.2577031","article-title":"Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks","volume":"39","author":"Ren","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Kao, M.Y. (2016). Sliding Window Algorithms. Encyclopedia of Algorithms, Springer.","DOI":"10.1007\/978-1-4939-2864-4"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"273","DOI":"10.1007\/BF00994018","article-title":"Support-vector networks","volume":"20","author":"Cortes","year":"1995","journal-title":"Mach. Learn."},{"key":"ref_35","unstructured":"Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., and Isard, M. (2016, January 2\u20134). TensorFlow: A System for Large-Scale Machine Learning. Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201916), Savannah, GA, USA."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Sur\u00e1k, M., Ko\u0161nar, K., Kulich, M., Koz\u00e1k, V., and P\u0159eu\u010dil, L. (2019). Visual Data Simulation for Deep Learning in Robot Manipulation Tasks. International Conference on Modelling and Simulation for Autonomous Systems, Springer.","DOI":"10.1007\/978-3-030-14984-0_29"},{"key":"ref_37","unstructured":"Coleman, D., Sucan, I., Chitta, S., and Correll, N. (2014). Reducing the Barrier to Entry of Complex Robotic Software: A MoveIt! Case Study. arXiv."},{"key":"ref_38","first-page":"5","article-title":"ROS: An open-source Robot Operating System","volume":"3","author":"Quigley","year":"2009","journal-title":"ICRA Workshop Open Source Softw."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/18\/6093\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:01:03Z","timestamp":1760166063000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/21\/18\/6093"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,11]]},"references-count":38,"journal-issue":{"issue":"18","published-online":{"date-parts":[[2021,9]]}},"alternative-id":["s21186093"],"URL":"https:\/\/doi.org\/10.3390\/s21186093","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,11]]}}}