{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T08:17:09Z","timestamp":1760170629967,"version":"build-2065373602"},"reference-count":46,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2021,11,10]],"date-time":"2021-11-10T00:00:00Z","timestamp":1636502400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004955","name":"Austrian Research Promotion Agency","doi-asserted-by":"publisher","award":["881082"],"award-info":[{"award-number":["881082"]}],"id":[{"id":"10.13039\/501100004955","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Accurately estimating the six degree of freedom (6-DoF) pose of objects in images is essential for a variety of applications such as robotics, autonomous driving, and autonomous, AI, and vision-based navigation for unmanned aircraft systems (UAS). Developing such algorithms requires large datasets; however, generating those is tedious as it requires annotating the 6-DoF relative pose of each object of interest present in the image w.r.t. the camera. Therefore, this work presents a novel approach that automates the data acquisition and annotation process and thus minimizes the annotation effort to the duration of the recording. To maximize the quality of the resulting annotations, we employ an optimization-based approach for determining the extrinsic calibration parameters of the camera. Our approach can handle multiple objects in the scene, automatically providing ground-truth labeling for each object and taking into account occlusion effects between different objects. 
Moreover, our approach can not only be used to generate data for 6-DoF pose estimation and corresponding 3D-models but can be also extended to automatic dataset generation for object detection, instance segmentation, or volume estimation for any kind of object.<\/jats:p>","DOI":"10.3390\/jimaging7110236","type":"journal-article","created":{"date-parts":[[2021,11,10]],"date-time":"2021-11-10T09:19:21Z","timestamp":1636535961000},"page":"236","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Automated Data Annotation for 6-DoF AI-Based Navigation Algorithm Development"],"prefix":"10.3390","volume":"7","author":[{"given":"Javier Gibran","family":"Apud Baca","sequence":"first","affiliation":[{"name":"Control of Networked Systems Group, University of Klagenfurt, 9020 Klagenfurt am W\u00f6rthersee, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Thomas","family":"Jantos","sequence":"additional","affiliation":[{"name":"Control of Networked Systems Group, University of Klagenfurt, 9020 Klagenfurt am W\u00f6rthersee, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mario","family":"Theuermann","sequence":"additional","affiliation":[{"name":"JOANNEUM RESEARCH Forschungsgesellschaft mbH, DIGITAL, Remote Sensing and Geoinformation, 8010 Graz, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mohamed Amin","family":"Hamdad","sequence":"additional","affiliation":[{"name":"Infineon Technologies Austria AG, 9500 Villach, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2465-2527","authenticated-orcid":false,"given":"Jan","family":"Steinbrener","sequence":"additional","affiliation":[{"name":"Control of Networked Systems Group, University of Klagenfurt, 9020 Klagenfurt am W\u00f6rthersee, 
Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Stephan","family":"Weiss","sequence":"additional","affiliation":[{"name":"Control of Networked Systems Group, University of Klagenfurt, 9020 Klagenfurt am W\u00f6rthersee, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexander","family":"Almer","sequence":"additional","affiliation":[{"name":"JOANNEUM RESEARCH Forschungsgesellschaft mbH, DIGITAL, Remote Sensing and Geoinformation, 8010 Graz, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3374-4201","authenticated-orcid":false,"given":"Roland","family":"Perko","sequence":"additional","affiliation":[{"name":"JOANNEUM RESEARCH Forschungsgesellschaft mbH, DIGITAL, Remote Sensing and Geoinformation, 8010 Graz, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2021,11,10]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"48572","DOI":"10.1109\/ACCESS.2019.2909530","article-title":"Unmanned Aerial Vehicles (UAVs): A Survey on Civil Applications and Key Research Challenges","volume":"7","author":"Shakhatreh","year":"2019","journal-title":"IEEE Access"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Puri, A., Valavanis, K., and Kontitsis, M. (2007, January 27\u201329). Statistical Profile Generation for Traffic Monitoring Using Real-Time UAV Based Video Data. Proceedings of the Mediterranean Conference on Control & Automation, Athens, Greece.","DOI":"10.1109\/MED.2007.4433658"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Scherer, J., Yahyanejad, S., Hayat, S., Yanmaz, E., Andre, T., Khan, A., Vukadinovic, V., Bettstetter, C., Hellwagner, H., and Rinner, B. (2015, January 19). An Autonomous Multi-UAV System for Search and Rescue. 
Proceedings of the Workshop on Micro Aerial Vehicle Networks, Systems, and Applications for Civilian Use, Florence, Italy.","DOI":"10.1145\/2750675.2750683"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Balaban, M.A., Mastaglio, T.W., and Lynch, C.J. (2016, January 11\u201314). Analysis of Future UAS-based Delivery. Proceedings of the Winter Simulation Conference (WSC), Washington, DC, USA.","DOI":"10.1109\/WSC.2016.7822209"},{"key":"ref_5","unstructured":"Lottes, P., Khanna, R., Pfeifer, J., Siegwart, R., and Stachniss, C. (June, January 29). UAV-based Crop and Weed Classification for Smart Farming. Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Singapore."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"14887","DOI":"10.3390\/s150714887","article-title":"Vision and Control for UAVs: A Survey of General Methods and of Inexpensive Platforms for Infrastructure Inspection","volume":"15","year":"2015","journal-title":"Sensors"},{"key":"ref_7","unstructured":"Dovis, F. (2015). GNSS Interference Threats and Countermeasures, Artech House."},{"key":"ref_8","unstructured":"Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning, MIT Press. Available online: http:\/\/www.deeplearningbook.org."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014). Microsoft COCO: Common Objects in Context. Proceedings of the 13th European Conference on Computer Vision, Zurich, Switzerland, 6\u201312 September 2014, Springer.","DOI":"10.1007\/978-3-319-10602-1_48"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"157","DOI":"10.1007\/s11263-007-0090-8","article-title":"LabelMe: A Database and Web-based Tool for Image Annotation","volume":"77","author":"Russell","year":"2008","journal-title":"Int. J. Comput. 
Vis."},{"key":"ref_11","unstructured":"Zhang, C., Loken, K., Chen, Z., Xiao, Z., and Kunkel, G. (2018). Mask Editor: An Image Annotation Tool for Image Segmentation Tasks. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Russakovsky, O., Li, L.J., and Fei-Fei, L. (2015, January 8\u201310). Best of Both Worlds: Human-Machine Collaboration for Object Annotation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298824"},{"key":"ref_13","unstructured":"Papadopoulos, D.P., Uijlings, J.R., Keller, F., and Ferrari, V. (July, January 26). We Don\u2019t Need No Bounding-Boxes: Training Object Class Detectors Using Only Human Verification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Papadopoulos, D.P., Uijlings, J.R., Keller, F., and Ferrari, V. (2017, January 22\u201325). Training Object Class Detectors with Click Supervision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.27"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Adhikari, B., and Huttunen, H. (2021, January 10\u201315). Iterative Bounding Box Annotation for Object Detection. Proceedings of the 25th International Conference on Pattern Recognition (ICPR), Milano, Italy.","DOI":"10.1109\/ICPR48806.2021.9412956"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Adhikari, B., Peltomaki, J., Puura, J., and Huttunen, H. (2018, January 26\u201328). Faster Bounding Box Annotation for Object Detection in Indoor Scenes. 
Proceedings of the 2018 7th European Workshop on Visual Information Processing (EUVIP), Tampere, Finland.","DOI":"10.1109\/EUVIP.2018.8611732"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"200-1","DOI":"10.2352\/ISSN.2470-1173.2020.16.AVM-200","article-title":"A Tool for Semi-Automatic Ground Truth Annotation of Traffic Videos","volume":"29","author":"Groh","year":"2020","journal-title":"Electron. Imaging"},{"key":"ref_18","unstructured":"Hinterstoisser, S., Pauly, O., Heibel, H., Martina, M., and Bokeloh, M. (November, January 27). An Annotation Saved is an Annotation Earned: Using Fully Synthetic Training for Object Detection. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV) Workshops, Seoul, Korea."},{"key":"ref_19","unstructured":"Borkman, S., Crespi, A., Dhakad, S., Ganguly, S., Hogins, J., Jhang, Y.C., Kamalzadeh, M., Li, B., Leal, S., and Parisi, P. (2021). Unity Perception: Generate Synthetic Data for Computer Vision. arXiv."},{"key":"ref_20","unstructured":"Denninger, M., Sundermeyer, M., Winkelbauer, D., Olefir, D., Hodan, T., Zidan, Y., Elbadrawy, M., Knauer, M., Katam, H.T., and Lodhi, A. (2021, October 06). BlenderProc: Reducing the Reality Gap with Photorealistic Rendering. Available online: https:\/\/github.com\/DLR-RM\/BlenderProc."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"714","DOI":"10.1016\/j.procir.2021.05.092","article-title":"Leveraging Synthetic Data from CAD Models for Training Object Detection Models\u2014A VR Industry Application Case","volume":"100","author":"Kohtala","year":"2021","journal-title":"Proc. CIRP"},{"key":"ref_22","unstructured":"Ratner, A., Bach, S., Varma, P., and Re, C. (2021, October 06). Weak Supervised: The New Programming Paradigm for Machine Learning. Available online: http:\/\/ai.stanford.edu\/blog\/weak-supervision\/."},{"key":"ref_23","unstructured":"Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Torralba, A. (July, January 26). 
Learning Deep Features for Discriminative Localization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"De Brabandere, B., Neven, D., and Van Gool, L. (2017, January 22\u201325). Semantic Instance Segmentation for Autonomous Driving. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA.","DOI":"10.1109\/CVPRW.2017.66"},{"key":"ref_25","first-page":"565","article-title":"Instance Segmentation for Autonomous Vehicle","volume":"12","author":"Mohanapriya","year":"2021","journal-title":"Turk. J. Comput. Math. Educ."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2112","DOI":"10.1049\/ipr2.12181","article-title":"An Optimized YOLO-based Object Detection Model for Crop Harvesting System","volume":"15","author":"Junos","year":"2021","journal-title":"IET Image Process."},{"key":"ref_27","unstructured":"Soumya, V., and Sreeraj, M. (2013, January 19\u201321). Object Detection and Classification in Surveillance System. Proceedings of the IEEE Recent Advances in Intelligent Computational Systems (RAICS), Trivandrum, India."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"3981","DOI":"10.1007\/s11042-020-09749-x","article-title":"Real Time Object Detection and Trackingsystem for Video Surveillance System","volume":"80","author":"Jha","year":"2021","journal-title":"Multimed. Tools Appl."},{"key":"ref_29","unstructured":"Bukschat, Y., and Vetter, M. (2020). EfficientPose: An Efficient, Accurate and Scalable End-to-End 6D Multi Object Pose Estimation Approach. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Thalhammer, S., Patten, T., and Vincze, M. (2019, January 16\u201319). SyDPose: Object Detection and Pose Estimation in Cluttered Real-World Depth Images Trained using Only Synthetic Data. 
Proceedings of the International Conference on 3D Vision (3DV), Quebec City, QC, Canada.","DOI":"10.1109\/3DV.2019.00021"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Josifovski, J., Kerzel, M., Pregizer, C., Posniak, L., and Wermter, S. (2018, January 1\u20135). Object Detection and Pose Estimation Based on Convolutional Neural Networks Trained with Synthetic Data. Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain.","DOI":"10.1109\/IROS.2018.8594379"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Hodan, T., Haluza, P., Obdr\u017e\u00e1lek, \u0160., Matas, J., Lourakis, M., and Zabulis, X. (2017, January 27\u201329). T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects. Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA.","DOI":"10.1109\/WACV.2017.103"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Wang, H., Sridhar, S., Huang, J., Valentin, J., Song, S., and Guibas, L.J. (2019, January 16\u201320). Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00275"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Hodan, T., Michel, F., Brachmann, E., Kehl, W., GlentBuch, A., Kraft, D., Drost, B., Vidal, J., Ihrke, S., and Zabulis, X. (2018, January 8\u201314). BOP: Benchmark for 6D Object Pose Estimation. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01249-6_2"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Xiang, Y., Schmidt, T., Narayanan, V., and Fox, D. (2018, January 26\u201330). PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes. 
Proceedings of the Robotics: Science and Systems (RSS), Pittsburgh, PA, USA.","DOI":"10.15607\/RSS.2018.XIV.019"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Hinterstoisser, S., Lepetit, V., Ilic, S., Holzer, S., Bradski, G., Konolige, K., and Navab, N. Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes. Proceedings of the 11th Asian Conference on Computer Vision, Daejeon, Korea, 5\u20139 November 2012.","DOI":"10.1007\/978-3-642-33885-4_60"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Brachmann, E., Krull, A., Michel, F., Gumhold, S., Shotton, J., and Rother, C. Learning 6D Object Pose Estimation Using 3D Object Coordinates. Proceedings of the 13th European Conference on Computer Vision, Zurich, Switzerland, 6\u201312 September 2014.","DOI":"10.1007\/978-3-319-10605-2_35"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Yuan, H., Hoogenkamp, T., and Veltkamp, R.C. (2021). RobotP: A Benchmark Dataset for 6D Object Pose Estimation. Sensors, 21.","DOI":"10.3390\/s21041299"},{"key":"ref_39","unstructured":"Technology, I.R. (2021, October 06). Intel RealSense D400 Series Product Family. Available online: https:\/\/www.intel.com\/content\/dam\/support\/us\/en\/documents\/emerging-technologies\/intel-realsense-technology\/Intel-RealSense-D400-Series-Datasheet.pdf."},{"key":"ref_40","unstructured":"Cignoni, P., Ranzuglia, G., Callieri, M., Corsini, M., Ganovelli, F., Pietroni, N., and Tarini, M. (2008, January 2\u20134). MeshLab: An Open-Source Mesh Processing Tool. Proceedings of the Eurographics Italian Chapter Conference, Salerno, Italy."},{"key":"ref_41","unstructured":"(2021, July 27). OptiTrack, NaturalPoint Inc. 
Available online: https:\/\/optitrack.com\/."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"2280","DOI":"10.1016\/j.patcog.2014.01.005","article-title":"Automatic Generation and Detection of Highly Reliable Fiducial Markers Under Occlusion","volume":"47","year":"2014","journal-title":"Pattern Recognit."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"38","DOI":"10.1016\/j.imavis.2018.05.004","article-title":"Speeded Up Detection of Squared Fiducial Markers","volume":"76","year":"2018","journal-title":"Image Vis. Comput."},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., and Savarese, S. (2019, January 16\u201320). Generalized Intersection over Union: A Metric and a Loss for Bounding Box Regression. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00075"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Li, Z., Wang, G., and Ji, X. (2019, January 16\u201320). CDPN: Coordinates-Based Disentangled Pose Network for Real-Time RGB-Based 6-DoF Object Pose Estimation. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Long Beach, CA, USA.","DOI":"10.1109\/ICCV.2019.00777"},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"3727","DOI":"10.1109\/LRA.2019.2928776","article-title":"SilhoNet: An RGB Method for 6D Object Pose Estimation","volume":"4","author":"Billings","year":"2019","journal-title":"IEEE Robot. Autom. 
Lett."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/7\/11\/236\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:28:12Z","timestamp":1760167692000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/7\/11\/236"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,11,10]]},"references-count":46,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2021,11]]}},"alternative-id":["jimaging7110236"],"URL":"https:\/\/doi.org\/10.3390\/jimaging7110236","relation":{},"ISSN":["2313-433X"],"issn-type":[{"type":"electronic","value":"2313-433X"}],"subject":[],"published":{"date-parts":[[2021,11,10]]}}}