{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T05:30:04Z","timestamp":1770269404915,"version":"3.49.0"},"reference-count":59,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2024,1,26]],"date-time":"2024-01-26T00:00:00Z","timestamp":1706227200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"European Union\u2019s Horizon 2020 Research and Innovation Program","award":["861678"],"award-info":[{"award-number":["861678"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Robotics"],"abstract":"<jats:p>In an effort to improve short-sea shipping in Europe, we present a 3D world interpreter (3DWI) system as part of a robotic container-handling system. The 3DWI is an advanced sensor suite combined with AI-based software and the communication infrastructure to connect to both the crane control and the shore control center. On input of LiDAR data and stereo captures, the 3DWI builds a world model of the operating environment and detects containers. The 3DWI and crane control are the core of an autonomously operating crane that monitors the environment and may trigger an emergency stop while alerting the remote operator of the danger. During container handling, the 3DWI scans for human activity and continuously updates a 3D-Twin model for the operator, enabling situational awareness. The presented methodology includes the sensor suite design, creation of the world model and the 3D-Twin, innovations in AI-detection software, and interaction with the crane and operator. Supporting experiments quantify the performance of the 3DWI, its AI detectors, and safety measures; the detectors reach the top of VisDrone\u2019s leaderboard and the pilot tests show the safe autonomous operation of the crane.<\/jats:p>","DOI":"10.3390\/robotics13020023","type":"journal-article","created":{"date-parts":[[2024,1,26]],"date-time":"2024-01-26T08:56:01Z","timestamp":1706259361000},"page":"23","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["A 3D World Interpreter System for Safe Autonomous Crane Operation"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-5532-0309","authenticated-orcid":false,"given":"Frank Bart","family":"ter Haar","sequence":"first","affiliation":[{"name":"TNO\u2014Intelligent Imaging, Oude Waalsdorperweg 63, 2597 AK The Hague, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-0181-9600","authenticated-orcid":false,"given":"Frank","family":"Ruis","sequence":"additional","affiliation":[{"name":"TNO\u2014Intelligent Imaging, Oude Waalsdorperweg 63, 2597 AK The Hague, The Netherlands"}]},{"ORCID":"https:\/\/orcid.org\/0009-0005-3242-7644","authenticated-orcid":false,"given":"Bastian Thomas","family":"van Manen","sequence":"additional","affiliation":[{"name":"TNO\u2014Intelligent Imaging, Oude Waalsdorperweg 63, 2597 AK The Hague, The Netherlands"}]}],"member":"1968","published-online":{"date-parts":[[2024,1,26]]},"reference":[
{"key":"ref_1","doi-asserted-by":"crossref","first-page":"152","DOI":"10.1080\/01441647.2018.1502834","article-title":"Autonomous technologies in short sea shipping: Trends, feasibility and implications","volume":"39","author":"Ghaderi","year":"2019","journal-title":"Transp. Rev."},
{"key":"ref_2","doi-asserted-by":"crossref","first-page":"665","DOI":"10.1515\/eng-2020-0074","article-title":"An overview of current safety requirements for autonomous machines\u2014Review of standards","volume":"10","author":"Tiusanen","year":"2020","journal-title":"Open Eng."},
{"key":"ref_3","unstructured":"Mohseni, S., Pitale, M., Singh, V., and Wang, Z. (2019). Practical Solutions for Machine Learning Safety in Autonomous Vehicles. arXiv."},
{"key":"ref_4","doi-asserted-by":"crossref","first-page":"8867757","DOI":"10.1155\/2020\/8867757","article-title":"Safety of autonomous vehicles","volume":"2020","author":"Wang","year":"2020","journal-title":"J. Adv. Transp."},
{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Perez-Cerrolaza, J., Abella, J., Borg, M., Donzella, C., Cerquides, J., Cazorla, F.J., Englund, C., Tauber, M., Nikolakopoulos, G., and Flores, J.L. (2023). Artificial Intelligence for Safety-Critical Systems in Industrial and Transportation Domains: A Survey. ACM Comput. Surv., Just Accepted.","DOI":"10.1145\/3626314"},
{"key":"ref_6","unstructured":"Karvonen, H., Heikkil\u00e4, E., and Wahlstr\u00f6m, M. (2020). Engineering Psychology and Cognitive Ergonomics. Cognition and Design, Springer."},
{"key":"ref_7","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1109\/MCOMSTD.011.2100004","article-title":"Digital Twin Analysis to Promote Safety and Security in Autonomous Vehicles","volume":"5","author":"Almeaibed","year":"2021","journal-title":"IEEE Commun. Stand. Mag."},
{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"St\u0105czek, P., Pizo\u0144, J., Danilczuk, W., and Gola, A. (2021). A Digital Twin Approach for the Improvement of an Autonomous Mobile Robots (AMR\u2019s) Operating Environment\u2014A Case Study. Sensors, 21.","DOI":"10.3390\/s21237830"},
{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Geiger, A., Lenz, P., and Urtasun, R. (2012, January 16\u201321). Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite. Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA.","DOI":"10.1109\/CVPR.2012.6248074"},
{"key":"ref_10","doi-asserted-by":"crossref","first-page":"108796","DOI":"10.1016\/j.patcog.2022.108796","article-title":"3D Object Detection for Autonomous Driving: A Survey","volume":"130","author":"Qian","year":"2022","journal-title":"Pattern Recognit."},
{"key":"ref_11","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1016\/j.measurement.2016.10.009","article-title":"Quantifying the influence of rain in LiDAR performance","volume":"95","author":"Filgueira","year":"2017","journal-title":"Measurement"},
{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Jokela, M., Kutila, M., and Pyyk\u00f6nen, P. (2019). Testing and Validation of Automotive Point-Cloud Sensors in Adverse Weather Conditions. Appl. Sci., 9.","DOI":"10.3390\/app9112341"},
{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Abdo, J., Hamblin, S., and Chen, G. (2021, January 1\u20135). Effect of Weather on the Performance of Autonomous Vehicle LiDAR Sensors. Proceedings of the ASME International Mechanical Engineering Congress and Exposition, Virtual.","DOI":"10.1115\/IMECE2021-73770"},
{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Sebastian, G., Vattem, T., Lukic, L., B\u00fcrgy, C., and Schumann, T. (2021, January 11\u201317). RangeWeatherNet for LiDAR-only weather and road condition classification. Proceedings of the 2021 IEEE Intelligent Vehicles Symposium (IV), Nagoya, Japan.","DOI":"10.1109\/IV48863.2021.9575320"},
{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kumar, D., and Muhammad, N. (2023). Object Detection in Adverse Weather for Autonomous Driving through Data Merging and YOLOv8. Sensors, 23.","DOI":"10.20944\/preprints202309.0050.v1"},
{"key":"ref_16","doi-asserted-by":"crossref","first-page":"16219","DOI":"10.1038\/s41598-023-42753-3","article-title":"Improved YOLOv5-based for small traffic sign detection under complex weather","volume":"13","author":"Qu","year":"2023","journal-title":"Sci. Rep."},
{"key":"ref_17","unstructured":"Zhang, H., Li, F., Liu, S., Zhang, L., Su, H., Zhu, J., Ni, L., and Shum, H.Y. (2022, January 25\u201329). DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection. Proceedings of the The Eleventh International Conference on Learning Representations, Virtual."},
{"key":"ref_18","unstructured":"Ren, S., He, K., Girshick, R., and Sun, J. (2015, January 7\u201312). Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Proceedings of the Advances in Neural Information Processing Systems (NIPS), Montreal, QC, Canada."},
{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Cai, Z., and Vasconcelos, N. (2018, January 18\u201322). Cascade R-CNN: Delving into High Quality Object Detection. Proceedings of the CVPR, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00644"},
{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Woo, S., Debnath, S., Hu, R., Chen, X., Liu, Z., Kweon, I.S., and Xie, S. (2023). ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders. arXiv.","DOI":"10.1109\/CVPR52729.2023.01548"},
{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Zong, Z., Song, G., and Liu, Y. (2023, January 4\u20136). DETRs with Collaborative Hybrid Assignments Training. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Paris, France.","DOI":"10.1109\/ICCV51070.2023.00621"},
{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Liu, Z., Hu, H., Lin, Y., Yao, Z., Xie, Z., Wei, Y., Ning, J., Cao, Y., Zhang, Z., and Dong, L. (2022, January 18\u201324). Swin Transformer V2: Scaling Up Capacity and Resolution. Proceedings of the International Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA.","DOI":"10.1109\/CVPR52688.2022.01170"},
{"key":"ref_23","first-page":"1072","article-title":"Transformer-based End-to-End Object Detection in Aerial Images","volume":"14","author":"Vo","year":"2023","journal-title":"Int. J. Adv. Comput. Sci. Appl."},
{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Cao, Y., He, Z., Wang, L., Wang, W., Yuan, Y., Zhang, D., Zhang, J., Zhu, P., Van Gool, L., and Han, J. (2021, January 11\u201317). VisDrone-DET2021: The vision meets drone object detection challenge results. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00319"},
{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhu, X., Lyu, S., Wang, X., and Zhao, Q. (2021, January 11\u201317). TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured Scenarios. Proceedings of the IEEE\/CVF International Conference on Computer Vision (ICCV) Workshops, Montreal, BC, Canada.","DOI":"10.1109\/ICCVW54120.2021.00312"},
{"key":"ref_26","unstructured":"Xu, S., Wang, X., Lv, W., Chang, Q., Cui, C., Deng, K., Wang, G., Dang, Q., Wei, S., and Du, Y. (2022). PP-YOLOE: An evolved version of YOLO. arXiv."},
{"key":"ref_27","unstructured":"Redmon, J., and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv."},
{"key":"ref_28","unstructured":"Bochkovskiy, A., Wang, C.Y., and Liao, H.Y.M. (2020). Yolov4: Optimal speed and accuracy of object detection. arXiv."},
{"key":"ref_29","unstructured":"Jocher, G. (2024, January 20). Software implementation YOLOv5 by Ultralytics. Available online: https:\/\/zenodo.org\/records\/7347926."},
{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv.","DOI":"10.1109\/CVPR52729.2023.00721"},
{"key":"ref_31","doi-asserted-by":"crossref","first-page":"4207","DOI":"10.1007\/s00521-021-06489-3","article-title":"Machine-learning-based top-view safety monitoring of ground workforce on complex industrial sites","volume":"34","author":"Golcarenarenji","year":"2022","journal-title":"Neural Comput. Appl."},
{"key":"ref_32","unstructured":"Sutjaritvorakul, T., Vierling, A., Pawlak, J., and Berns, K. (2020). Advances in Service and Industrial Robotics: Results of RAAD, Springer."},
{"key":"ref_33","first-page":"864","article-title":"Data-driven worker detection from load-view crane camera","volume":"Volume 37","author":"Sutjaritvorakul","year":"2020","journal-title":"Proceedings of the International Symposium on Automation and Robotics in Construction"},
{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Neuhausen, M., Herbers, P., and K\u00f6nig, M. (2020). Using synthetic data to improve and evaluate the tracking performance of construction workers on site. Appl. Sci., 10.","DOI":"10.3390\/app10144948"},
{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"He, T., Zhang, Z., Zhang, H., Zhang, Z., Xie, J., and Li, M. (2019, January 15\u201320). Bag of tricks for image classification with convolutional neural networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA.","DOI":"10.1109\/CVPR.2019.00065"},
{"key":"ref_36","unstructured":"Zhang, Z., He, T., Zhang, H., Zhang, Z., Xie, J., and Li, M. (2019). Bag of freebies for training object detection neural networks. arXiv."},
{"key":"ref_37","unstructured":"Steiner, A., Kolesnikov, A., Zhai, X., Wightman, R., Uszkoreit, J., and Beyer, L. (2021). How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers. arXiv."},
{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Yang, B., Luo, W., and Urtasun, R. (2018, January 18\u201322). PIXOR: Real-time 3D Object Detection from Point Clouds. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00798"},
{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Arikumar, K.S., Deepak Kumar, A., Gadekallu, T.R., Prathiba, S.B., and Tamilarasi, K. (2022). Real-Time 3D Object Detection and Classification in Autonomous Driving Environment Using 3D LiDAR and Camera Sensors. Electronics, 11.","DOI":"10.3390\/electronics11244203"},
{"key":"ref_40","unstructured":"Middelhoek, F. (2023). Stereo Pointclouds for Safety Monitoring of Port Environments. [Master\u2019s Thesis, TUDelft]."},
{"key":"ref_41","doi-asserted-by":"crossref","first-page":"7380","DOI":"10.1109\/TPAMI.2021.3119563","article-title":"Detection and Tracking Meet Drones Challenge","volume":"44","author":"Zhu","year":"2021","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},
{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1256","DOI":"10.1007\/s11263-019-01177-1","article-title":"Deep learning approach in aerial imagery for supporting land search and rescue missions","volume":"127","author":"Gotovac","year":"2019","journal-title":"Int. J. Comput. Vis."},
{"key":"ref_43","doi-asserted-by":"crossref","first-page":"103482","DOI":"10.1016\/j.autcon.2020.103482","article-title":"Dataset and benchmark for detecting moving objects in construction sites","volume":"122","author":"Xuehui","year":"2021","journal-title":"Autom. Constr."},
{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27\u201330). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.91"},
{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Doll\u00e1r, P., and Zitnick, C.L. (2014, January 6\u201312). Microsoft coco: Common objects in context. Proceedings of the European Conference on Computer Vision, Zurich, Switzerland.","DOI":"10.1007\/978-3-319-10602-1_48"},
{"key":"ref_46","unstructured":"Micikevicius, P., Narang, S., Alben, J., Diamos, G., Elsen, E., Garcia, D., Ginsburg, B., Houston, M., Kuchaiev, O., and Venkatesh, G. (May, January 30). Mixed Precision Training. Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada."},
{"key":"ref_47","unstructured":"Zhang, H., Cisse, M., Dauphin, Y.N., and Lopez-Paz, D. (May, January 30). mixup: Beyond Empirical Risk Minimization. Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada."},
{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Liao, H.Y.M., Wu, Y.H., Chen, P.Y., Hsieh, J.W., and Yeh, I.H. (2020, January 14\u201319). CSPNet: A new backbone that can enhance learning capability of CNN. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA.","DOI":"10.1109\/CVPRW50498.2020.00203"},
{"key":"ref_49","unstructured":"Moore, B.E., and Corso, J.J. (2024, January 20). FiftyOne. GitHub. Available online: https:\/\/github.com\/voxel51\/fiftyone."},
{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Tan, M., Pang, R., and Le, Q.V. (2020, January 14\u201319). EfficientDet: Scalable and Efficient Object Detection. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01079"},
{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Fu, X., Wei, G., Yuan, X., Liang, Y., and Bo, Y. (2023). Efficient YOLOv7-Drone: An Enhanced Object Detection Approach for Drone Aerial Imagery. Drones, 7.","DOI":"10.3390\/drones7100616"},
{"key":"ref_52","unstructured":"Northcutt, C.G., Athalye, A., and Mueller, J. (2021, January 7\u201310). Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks. Proceedings of the Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1), Virtual."},
{"key":"ref_53","first-page":"2","article-title":"New stochastic approximation type procedures","volume":"7","author":"Polyak","year":"1990","journal-title":"Automat. Telemekh"},
{"key":"ref_54","unstructured":"Ruppert, D. (1988). Efficient Estimators from a Slowly Convergent Robbins-Monro Procedure, Cornell University Operations Research and Industrial Engineering."},
{"key":"ref_55","unstructured":"Huang, G., Li, Y., Pleiss, G., Liu, Z., Hopcroft, J.E., and Weinberger, K.Q. (2017). Snapshot Ensembles: Train 1, Get M for Free. arXiv."},
{"key":"ref_56","first-page":"1195","article-title":"Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results","volume":"30","author":"Tarvainen","year":"2017","journal-title":"Adv. Neural Inf. Process. Syst."},
{"key":"ref_57","unstructured":"Touvron, H., Vedaldi, A., Douze, M., and J\u00e9gou, H. (2019). Fixing the train-test resolution discrepancy. Adv. Neural Inf. Process. Syst. (NeurIPS), 32."},
{"key":"ref_58","unstructured":"Zhang, R. (2019, January 9\u201315). Making Convolutional Networks Shift-Invariant Again. Proceedings of the ICML, Long Beach, CA, USA."},
{"key":"ref_59","unstructured":"(2015, January 15\u201320). Numba: A LLVM-based Python JIT Compiler. Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, Austin, TX, USA."}],
"container-title":["Robotics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2218-6581\/13\/2\/23\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T13:49:46Z","timestamp":1760104186000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2218-6581\/13\/2\/23"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,1,26]]},"references-count":59,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2024,2]]}},"alternative-id":["robotics13020023"],"URL":"https:\/\/doi.org\/10.3390\/robotics13020023","relation":{},"ISSN":["2218-6581"],"issn-type":[{"value":"2218-6581","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,1,26]]}}}