{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T15:06:07Z","timestamp":1776179167777,"version":"3.50.1"},"reference-count":45,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T00:00:00Z","timestamp":1776124800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Ministry of Science, Technological Development and Innovation of the Republic of Serbia","award":["451-03-34\/2026-03\/200109"],"award-info":[{"award-number":["451-03-34\/2026-03\/200109"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Computers"],"abstract":"<jats:p>This paper presents an edge-deployable vision-based framework for human\u2013robot interaction using an xArm collaborative robot, a single RGB camera mounted on the robot wrist, and lightweight AI-based perception modules. The system enables intuitive, contact-free control by combining hand understanding and object detection within a unified perception\u2013decision\u2013control pipeline. Hand landmarks are extracted using MediaPipe Hands, from which continuous hand trajectories, static gestures, and dynamic gestures are derived. Task objects are detected using a YOLO-based model, and both hand and object observations are mapped into the robot workspace using ArUco-based planar calibration. To ensure stable robot motion, the hand control signal is smoothed using low-pass and Kalman filtering, while dynamic gestures such as waving are recognized using a lightweight LSTM classifier. The complete pipeline runs locally on edge hardware, specifically NVIDIA Jetson Orin Nano and Raspberry Pi 5 with a Hailo AI accelerator. Experimental evaluation includes trajectory stability, gesture recognition reliability, and runtime performance on both platforms. 
Results show that filtering significantly reduces hand-tracking jitter, gesture recognition provides stable command states for control, and both edge devices support real-time operation, with Jetson achieving consistently lower runtime than Raspberry Pi. The proposed system demonstrates the feasibility of low-cost edge AI solutions for responsive and practical human\u2013robot interaction in collaborative industrial environments.<\/jats:p>","DOI":"10.3390\/computers15040241","type":"journal-article","created":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T14:24:58Z","timestamp":1776176698000},"page":"241","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Edge Computing Approach to AI-Based Gesture for Human\u2013Robot Interaction and Control"],"prefix":"10.3390","volume":"15","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4367-119X","authenticated-orcid":false,"given":"Nikola","family":"Iva\u010dko","sequence":"first","affiliation":[{"name":"Faculty of Mechanical Engineering, Department of Mechatronics and Control Systems, University of Ni\u0161, Aleksandra Medvedeva 14, 18000 Ni\u0161, Serbia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0430-8937","authenticated-orcid":false,"given":"Ivan","family":"\u0106iri\u0107","sequence":"additional","affiliation":[{"name":"Faculty of Mechanical Engineering, Department of Mechatronics and Control Systems, University of Ni\u0161, Aleksandra Medvedeva 14, 18000 Ni\u0161, Serbia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1364-7746","authenticated-orcid":false,"given":"Milo\u0161","family":"Simonovi\u0107","sequence":"additional","affiliation":[{"name":"Faculty of Mechanical Engineering, Department of Mechatronics and Control Systems, University of Ni\u0161, Aleksandra Medvedeva 14, 18000 Ni\u0161, 
Serbia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2026,4,14]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Vysock\u00fd, A., Po\u0161tulka, T., Chlebek, J., Kot, T., Maslowski, J., and Grushko, S. (2023). Hand gesture interface for robot path definition in collaborative applications: Implementation and comparative study. Sensors, 23.","DOI":"10.3390\/s23094219"},{"key":"ref_2","first-page":"353","article-title":"Design and implementation of a deep learning-based hand gesture recognition system for rehabilitation internet-of-things (RIoT) environments using MediaPipe","volume":"26","author":"Dhuzuki","year":"2025","journal-title":"Int. Islam. Univ. Malays. Eng. J."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"110","DOI":"10.1109\/JAS.2020.1003465","article-title":"Dynamic hand gesture recognition based on short-term sampling neural networks","volume":"8","author":"Zhang","year":"2021","journal-title":"IEEE\/CAA J. Autom. Sin."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Xie, J., Xu, Z., Zeng, J., Gao, Y., and Hashimoto, K. (2025). Human\u2013Robot Interaction Using Dynamic Hand Gesture for Teleoperation of Quadruped Robots with a Robotic Arm. Electronics, 14.","DOI":"10.3390\/electronics14050860"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Lizo, L., and Estrada, J. (2024, January 12). Lightweight convolutional neural network (CNN) and long short-term memory network (LSTM) for dynamic hand gesture recognition. Proceedings of the 2024 4th International Conference of Science and Information Technology in Smart Administration (ICSINTESA), Balikpapan, Indonesia.","DOI":"10.1109\/ICSINTESA62455.2024.10748189"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Halim, J., Eichler, P., Krusche, S., Bdiwi, M., and Ihlenfeldt, S. (2022). 
No-code robotic programming for agile production: A new markerless-approach for multimodal natural interaction in a human-robot collaboration context. Front. Robot. AI, 9.","DOI":"10.3389\/frobt.2022.1001955"},{"key":"ref_7","unstructured":"Sander, J., Cohen, A., Dasari, V., Venable, B., and Jalaian, B. (2025). On accelerating edge AI: Optimizing resource-constrained environments. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Taing, T., Khan, Y.F., Aliev, H., and Kim, H. (2025, January 27\u201329). Toward efficient deployment of lightweight object detectors for real-time inference on resource-constrained edge devices. Proceedings of the 2025 IEEE\/IEIE International Conference on Consumer Electronics-Asia (ICCE-Asia), Busan, Republic of Korea.","DOI":"10.1109\/ICCE-Asia67487.2025.11263725"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1007\/978-3-032-02106-9_6","article-title":"A Cognitive Robotics Approach for Manipulation of Freeform Objects Using CNN-Based Perception and Soft-Gripping","volume":"Volume 190","year":"2025","journal-title":"Advances in Service and Industrial Robotics: RAAD 2025; Mechanisms and Machine Science"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Castro, A., Silva, F., and Santos, V. (2021). Trends of human-robot collaboration in industry contexts: Handover, learning, and metrics. Sensors, 21.","DOI":"10.3390\/s21124113"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Taesi, C., Aggogeri, F., and Pellegrini, N. (2023). COBOT Applications\u2014Recent advances and challenges. Robotics, 12.","DOI":"10.3390\/robotics12030079"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Choi, H.-H., Kim, K., Lee, K., and Lee, K. (2025, January 8\u201312). Cooperative edge inference and virtual simulation for real-time 3D human pose estimation in safety-critical applications. 
Proceedings of the 2025 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), Daejeon, Republic of Korea.","DOI":"10.1109\/ISMAR-Adjunct68609.2025.00040"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3359","DOI":"10.1109\/ACCESS.2025.3647565","article-title":"A vehicular-edge federated, quantized YOLOv12 system for real-time 3D hand-gestures-based AAV control","volume":"14","author":"Lamaakal","year":"2026","journal-title":"IEEE Access"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Chen, Y., Wei, W., and Xiao, W. (2019, January 19\u201321). Human-computer interaction control of snake-like robot based on gesture recognition. Proceedings of the International Conference on Automation, Control and Robotics Engineering, Shenzhen, China.","DOI":"10.1145\/3351917.3351984"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"de la Cruz, M.H., Solache, U., Luna-\u00c1lvarez, A., Zagal-Barrera, S.R., L\u00f3pez, D.A.M., and M\u00fajica-Vargas, D. (2025). CNN 1D: A robust model for human pose estimation. Information, 16.","DOI":"10.3390\/info16020129"},{"key":"ref_16","unstructured":"Benitez-Garcia, G., Olivares-Mercado, J., S\u00e1nchez-P\u00e9rez, G., and Yanai, K. (2020, January 10\u201315). IPN hand: A video dataset and benchmark for real-time continuous hand gesture recognition. Proceedings of the International Conference on Pattern Recognition, Milan, Italy."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhao, D., Zhang, M., Liang, Y., Wang, S., Song, K.-B., and Kim, D. (2025, January 19\u201325). 3D-AMTA: Occlusion-aware real-time 3D hand pose estimation with auto mask and token-specific attention. 
Proceedings of the IEEE\/RSJ International Conference on Intelligent Robots and Systems, Hangzhou, China.","DOI":"10.1109\/IROS60139.2025.11246826"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"73","DOI":"10.47836\/pjst.33.S2.05","article-title":"Dynamic hand gesture recognition by hand landmark classification using long short-term memory","volume":"33","author":"Ahmad","year":"2025","journal-title":"Pertanika J. Sci. Technol."},{"key":"ref_19","first-page":"5327","article-title":"Intelligent sign language recognition system for e-learning context","volume":"72","author":"Hussain","year":"2022","journal-title":"Comput. Mater. Contin."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Al Koutayni, M.R., Rybalkin, V., Malik, J., Elhayek, A., Weis, C., Reis, G., Wehn, N., and Stricker, D. (2020). Real-time energy efficient hand pose estimation: A case study. Sensors, 20.","DOI":"10.3390\/s20102828"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kwon, O.-J., Kim, J., Jamil, S., Lee, J., and Ullah, F. (2024). Next-gen dynamic hand gesture recognition: MediaPipe, inception-v3 and LSTM-based enhanced deep learning model. Electronics, 13.","DOI":"10.3390\/electronics13163233"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"015228","DOI":"10.1088\/2631-8695\/ada72d","article-title":"Enhanced dynamic hand gesture recognition for finger disabilities using deep learning and an optimized otsu threshold method","volume":"7","author":"Kadhim","year":"2025","journal-title":"Eng. Res. Express"},{"key":"ref_23","unstructured":"Rahim, M., Miah, A.S.M., Akash, H.S., Shin, J., Hossain, M.I., and Hossain, M.N. (2024). An advanced deep learning based three-stream hybrid model for dynamic hand gesture recognition. arXiv."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Zhang, X., Li, S., Zeng, X., Lu, P., and Sun, W. (2025). 
A novel multimodal hand gesture recognition model using combined approach of inter-frame motion and shared attention weights. Computers, 14.","DOI":"10.3390\/computers14100432"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"34094","DOI":"10.1109\/ACCESS.2023.3263812","article-title":"Convolutional transformer fusion blocks for multi-modal gesture recognition","volume":"11","author":"Hampiholi","year":"2023","journal-title":"IEEE Access"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Huang, L., Liu, J., Gu, Y., Jiang, K., and Li, H. (2025, January 13\u201317). Time-channel adaptive fusion and hierarchical attention mechanism for dynamic hand gesture recognition. Proceedings of the International Conference on Multimodal Interaction, Canberra, Australia.","DOI":"10.1145\/3716553.3750780"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Yin, L. (2024, January 23\u201325). A stereo vision-based real-time 3D hand pose estimation system combining nonlinear optimization. Proceedings of the Seventh International Conference on Computer Graphics and Virtuality (ICCGV 2024), Hangzhou, China.","DOI":"10.1117\/12.3030925"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Shakeel, M., Ejaz, A., Ashfaq, H., Ahmed, S., and Shaheen, R. (2025, January 7\u20139). SyncFit: AI-powered mobile eCommerce application utilising AR for virtual clothing fitting. Proceedings of the 2025 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA), Antalya, Turkiye.","DOI":"10.1109\/ACDSA65407.2025.11165966"},{"key":"ref_29","unstructured":"K\u00f6p\u00fckl\u00fc, O., K\u00f6se, N., and Rigoll, G. (2018, January 18\u201322). Motion fused frames: Data level fusion strategy for hand gesture recognition. 
Proceedings of the 2018 IEEE\/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Salt Lake City, UT, USA."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"29227","DOI":"10.1109\/JSEN.2023.3324479","article-title":"Dynamic gesture recognition based on two-scale 3-d-ConvNeXt","volume":"23","author":"Hao","year":"2023","journal-title":"IEEE Sens. J."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"e3275","DOI":"10.7717\/peerj-cs.3275","article-title":"Two-hand static and dynamic arabic sign language recognition using keypoints and shape descriptors with attention-driven feature fusion","volume":"11","author":"Kausar","year":"2025","journal-title":"PeerJ Comput. Sci."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Heo, S., Choi, T., and Choi, W. (2026). Clinical validation of an on-device AI-driven real-time human pose estimation and exercise prescription program; prospective single-arm quasi-experimental study. Healthcare, 14.","DOI":"10.3390\/healthcare14040482"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Heidari, N., Norouzi, J., Helfroush, M., and Danyali, H. (2024, January 19\u201320). Dynamic hand gesture recognition with 2DCNN-LSTM and improved keyframe extraction. Proceedings of the International Conference on Computer and Knowledge Engineering, Mashhad, Iran.","DOI":"10.1109\/ICCKE65377.2024.10874601"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Nathala, S.S., Yakkati, R.R., Breland, D.S., Yeduri, S.R., Manikandan, M.S., Jha, A., Zhou, J., and Cenkeramaddi, L.R. (2024, January 5\u20138). A deep CNN-based hand gestures recognition using high-resolution thermal imaging. Proceedings of the 2024 IEEE 19th Conference on Industrial Electronics and Applications (ICIEA), Kristiansand, Norway.","DOI":"10.1109\/ICIEA61579.2024.10665263"},{"key":"ref_35","unstructured":"Zhao, D., Zhang, M., Liang, Y., Wang, S., Song, K.-B., and Kim, D. (July, January 30). 
Mobile-StereoHPE: Real-time mobile 3D hand pose estimation from stereo gray images. Proceedings of the IEEE International Conference on Multimedia and Expo, Nantes, France."},{"key":"ref_36","unstructured":"Hao, Y.K., Wei, H.T., and Min, S. (2025). SPLite hand: Sparsity-aware lightweight 3D hand pose estimation. arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Chen, J., Jin, F., Jiao, Y., Zhan, Y., and Qin, X. (2025). Improving dynamic gesture recognition with attention-enhanced LSTM and grounding SAM. Electronics, 14.","DOI":"10.3390\/electronics14091793"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Chitra, P., Brindha, P., Srilatha, K., Jegan, G., Raju, R., and R, R. (2025, January 7\u20139). LSTM based pose estimation and sign language translation. Proceedings of the International Conference Trends Material Science and Inventive Materials, Kanyakumari, India.","DOI":"10.1109\/ICTMIM65579.2025.10988129"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Leo, K.D., Biagetti, G., Falaschetti, L., and Crippa, P. (2025). Microcontroller implementation of LSTM neural networks for dynamic hand gesture recognition. Sensors, 25.","DOI":"10.3390\/s25123831"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"887","DOI":"10.30574\/wjaets.2024.13.2.0636","article-title":"Real time AdaptiveAI pipelines for edge cloud systems: Dynamic optimization based on infrastructure feedback","volume":"13","author":"Chinnaraju","year":"2024","journal-title":"World J. Adv. Eng. Technol. 
Sci."},{"key":"ref_41","doi-asserted-by":"crossref","first-page":"161605","DOI":"10.1109\/ACCESS.2025.3609798","article-title":"QuantEdge: A hybrid quantization approach for optimized AI deployment across edge devices","volume":"13","author":"Mahmudov","year":"2025","journal-title":"IEEE Access"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"7","DOI":"10.70315\/uloap.ulete.2026.0301002","article-title":"Edge AI for real-time robotic systems: Architectures, deployment strategies, and performance optimization","volume":"3","author":"Ghosh","year":"2026","journal-title":"Univers. Libr. Eng. Technol."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Horvath, K., Abid, M.R., Merino, T., Zimmerman, R., Peker, Y., and Khan, S. (2024). Cloud-Based Infrastructure and DevOps for Energy Fault Detection in Smart Buildings. Computers, 13.","DOI":"10.3390\/computers13010023"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"873","DOI":"10.1109\/TGCN.2020.2982821","article-title":"Green Cloud Multimedia Networking: NFV\/SDN based Energy-efficient Resource Allocation","volume":"4","author":"Montazerolghaem","year":"2020","journal-title":"IEEE Trans. Green Commun. Netw."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"806","DOI":"10.1109\/TNSM.2016.2572161","article-title":"A Load-Balanced Call Admission Controller for IMS Cloud Computing","volume":"13","author":"Montazerolghaem","year":"2016","journal-title":"IEEE Trans. Netw. Serv. 
Manag."}],"container-title":["Computers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2073-431X\/15\/4\/241\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T14:32:04Z","timestamp":1776177124000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2073-431X\/15\/4\/241"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,4,14]]},"references-count":45,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2026,4]]}},"alternative-id":["computers15040241"],"URL":"https:\/\/doi.org\/10.3390\/computers15040241","relation":{},"ISSN":["2073-431X"],"issn-type":[{"value":"2073-431X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,4,14]]}}}