{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T00:22:14Z","timestamp":1770337334224,"version":"3.49.0"},"reference-count":28,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T00:00:00Z","timestamp":1770249600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Liwa University, United Arab Emirates, through the Faculty Research Incentive Grant"},{"name":"Liwa University"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Informatics"],"abstract":"<jats:p>This paper presents a hybrid deep learning framework for real-time sign language recognition (SLR) tailored to Internet of Things (IoT)-enabled environments, enhancing accessibility for Deaf communities. The proposed system integrates a Long Short-Term Memory (LSTM) network for static gesture recognition and a 3D Convolutional Neural Network (3D CNN) for dynamic gesture recognition. Implemented on a Raspberry Pi device using MediaPipe for landmark extraction, the system supports low-latency, on-device inference suitable for resource-constrained edge computing. Experimental results demonstrate that the LSTM model achieves its highest stability and performance for static signs at 1000 training epochs, yielding an average F1-score of 0.938 and an accuracy of 86.67%. In contrast, at 2000 epochs, the model exhibits a catastrophic performance collapse (F1-score of 0.088) due to overfitting and weight instability, highlighting the necessity of careful training regulation. Despite this, the overall system achieves consistently high classification performance under controlled conditions. In contrast, the 3D CNN component maintains robust and consistent performance across all evaluated training phases (500\u20132000 epochs), achieving up to 99.6% accuracy on dynamic signs. When deployed on a Raspberry Pi platform, the system achieves real-time performance with a frame rate of 12\u201315 FPS and an average inference latency of approximately 65 ms per frame. The hybrid architecture effectively balances recognition accuracy with computational efficiency by routing static gestures to the LSTM and dynamic gestures to the 3D CNN. This work presents a detailed epoch-wise comparative analysis of model stability and computational feasibility, contributing a practical and scalable IoT-enabled solution for inclusive, real-time sign-to-text communication in intelligent environments.<\/jats:p>","DOI":"10.3390\/informatics13020027","type":"journal-article","created":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T11:13:01Z","timestamp":1770289981000},"page":"27","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Comparative Evaluation of LSTM and 3D CNN Models in a Hybrid System for IoT-Enabled Sign-to-Text Translation in Deaf Communities"],"prefix":"10.3390","volume":"13","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6029-0319","authenticated-orcid":false,"given":"Samar","family":"Mouti","sequence":"first","affiliation":[{"name":"College of Engineering and Computing, Liwa University, Abu Dhabi P.O. Box 41009, United Arab Emirates"}]},{"given":"Hani","family":"Al Chalabi","sequence":"additional","affiliation":[{"name":"General Department, Liwa University, Al Ain P.O. Box 68297, United Arab Emirates"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1006-4588","authenticated-orcid":false,"given":"Mohammed","family":"Abushohada","sequence":"additional","affiliation":[{"name":"Thumbay College of Management and AI in Healthcare, Gulf Medical University, Ajman P.O. Box 4184, United Arab Emirates"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1072-6617","authenticated-orcid":false,"given":"Samer","family":"Rihawi","sequence":"additional","affiliation":[{"name":"College of Engineering and Computing, Liwa University, Abu Dhabi P.O. Box 41009, United Arab Emirates"}]},{"given":"Sulafa","family":"Abdalla","sequence":"additional","affiliation":[{"name":"General Department, Liwa University, Abu Dhabi P.O. Box 41009, United Arab Emirates"}]}],"member":"1968","published-online":{"date-parts":[[2026,2,5]]},"reference":[{"key":"ref_1","first-page":"4199","article-title":"IoT and sign language system (SLS)","volume":"13","author":"Mouti","year":"2020","journal-title":"Int. J. Eng. Res. Technol."},{"key":"ref_2","first-page":"e313960","article-title":"Special needs classroom assessment using a sign language communicator (CASC) based on AI techniques","volume":"19","author":"Mouti","year":"2023","journal-title":"Int. J. e-Collab."},{"key":"ref_3","first-page":"e316663","article-title":"Virtual teaching assistant for capturing facial and pose landmarks of the students in the classroom using deep learning","volume":"19","author":"Rihawi","year":"2023","journal-title":"Int. J. e-Collab."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"499","DOI":"10.11591\/eei.v13i1.6059","article-title":"An Adam-based CNN and LSTM approach for sign language recognition in real time for deaf people","volume":"13","author":"Paul","year":"2024","journal-title":"Bull. Electr. Eng. Inform."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"76174","DOI":"10.1038\/s41598-024-76174-7","article-title":"Sign language recognition using modified deep learning network and hybrid optimization","volume":"14","author":"Baihan","year":"2024","journal-title":"Sci. Rep."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Noor, T.H., Noor, A., Alharbi, A.F., Faisal, A., Alrashidi, R., Alsaeedi, A.S., Alharbi, G., Alsanoosy, T., and Alsaeedi, A. (2024). Real-time Arabic sign language recognition using a hybrid deep learning model. Sensors, 24.","DOI":"10.3390\/s24113683"},{"key":"ref_7","first-page":"42","article-title":"Artificial intelligence for sign language translation","volume":"53","author":"Strobel","year":"2023","journal-title":"Commun. Assoc. Inf. Syst."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"113","DOI":"10.61186\/seai.2409-1005","article-title":"A low-cost IoT-based hybrid multiscale CNN\u2013LSTM approach for bearing fault diagnosis using low sampling rate vibration data","volume":"1","author":"Moosavi","year":"2025","journal-title":"Sustain. Energy Artif. Intell."},{"key":"ref_9","first-page":"1775","article-title":"Comparison of machine learning models for stress detection from sensor data using long short-term memory (LSTM) networks and convolutional neural networks (CNNs)","volume":"12","author":"Jain","year":"2024","journal-title":"Int. J. Sci. Res. Manag."},{"key":"ref_10","first-page":"10","article-title":"Real-time sign language recognition and translation using MediaPipe and LSTM-based deep learning","volume":"187","author":"Ravikiran","year":"2025","journal-title":"Int. J. Comput. Appl."},{"key":"ref_11","first-page":"e951","article-title":"Deep learning-driven IoT defence: Comparative analysis of CNN and LSTM for DDoS detection and mitigation","volume":"10","author":"Patil","year":"2025","journal-title":"J. Inf. Syst. Eng. Manag."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Xue, H., Huang, B., Qin, M., Zhou, H., and Yang, H. (2020, January 2\u20136). Edge computing for Internet of Things: A survey. Proceedings of the 2020 International Conferences on Internet of Things (iThings), IEEE Green Computing and Communications (GreenCom), IEEE Cyber, Physical and Social Computing (CPSCom), IEEE Smart Data (SmartData), and IEEE Congress on Cybermatics (Cybermatics), Rhodes, Greece.","DOI":"10.1109\/iThings-GreenCom-CPSCom-SmartData-Cybermatics50389.2020.00130"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhang, L., Zhu, G., Shen, P., Song, J., Shah, S.A., and Bennamoun, M. (2017, January 22\u201329). Learning spatiotemporal features using 3DCNN and convolutional LSTM for gesture recognition. Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy.","DOI":"10.1109\/ICCVW.2017.369"},{"key":"ref_14","unstructured":"Brettmann, A., Gr\u00e4vinghoff, J., R\u00fcschoff, M., and Westhues, M. (2025). Breaking the barriers: Video vision transformers for word-level sign language recognition. arXiv."},{"key":"ref_15","first-page":"4675","article-title":"Dynamic hand gesture recognition using 3D-CNN and LSTM networks","volume":"70","author":"Rehman","year":"2022","journal-title":"Comput. Mater. Contin."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Alsharif, B. (2023). IoT technologies in healthcare for people with hearing impairments. Internet of Things for Smart Healthcare, Springer.","DOI":"10.1007\/978-3-031-33545-7_21"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Cui, R., Liu, H., and Zhang, C. (2017, January 21\u201326). Recurrent convolutional neural networks for continuous sign language recognition by staged optimization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.175"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2641","DOI":"10.1007\/s00521-024-10729-7","article-title":"A hybrid CNN-random forest model with landmark angles for real-time Arabic sign language recognition","volume":"37","author":"Boulesnane","year":"2025","journal-title":"Neural Comput. Appl."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"3794","DOI":"10.1007\/s13198-024-02376-x","article-title":"An open-source MP\u2009+\u2009CNN\u2009+\u2009BiLSTM model-based hybrid model for recognizing sign language on smartphones","volume":"15","author":"Ghanimi","year":"2024","journal-title":"Int. J. Syst. Assur. Eng. Manag."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"34553","DOI":"10.1109\/ACCESS.2024.3372425","article-title":"Sign language recognition using graph and general deep neural network based on a large-scale dataset","volume":"12","author":"Miah","year":"2024","journal-title":"IEEE Access"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Alsharif, B., Alanazi, M., Altaher, A.S., Altaher, A., and Ilyas, M. (2023, January 4\u20136). Deep learning technology to recognize American sign language alphabet using multi-focus image fusion technique. Proceedings of the 2023 IEEE 20th International Conference on Smart Communities: Improving Quality of Life Using AI, Robotics and IoT (HONET), Boca Raton, FL, USA.","DOI":"10.1109\/HONET59747.2023.10374775"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Alsharif, B., Alalwany, E., Ibrahim, A., Mahgoub, I., and Ilyas, M. (2025). Real-Time American Sign Language Interpretation Using Deep Learning and Keypoint Tracking. Sensors, 25.","DOI":"10.3390\/s25072138"},{"key":"ref_23","first-page":"2169","article-title":"Deep Learning Approaches for Continuous Sign Language Recognition: A Comprehensive Review","volume":"13","author":"Khan","year":"2025","journal-title":"IEEE Access"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"6192","DOI":"10.1038\/s41598-025-89975-1","article-title":"IoT-driven smart assistive communication system for the hearing impaired with hybrid deep learning models for sign language recognition","volume":"15","author":"Maashi","year":"2025","journal-title":"Sci. Rep."},{"key":"ref_25","unstructured":"Rihawi, S. (2025, September 15). SIRM-Dynamic: A Dataset for Dynamic Sign Language Recognition. GitHub Repository. Available online: https:\/\/github.com\/srihawi\/SIRM-Dynamic."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"245","DOI":"10.33093\/jiwe.2025.4.3.15","article-title":"Indonesian Language Sign Detection Using MediaPipe with Long Short-Term Memory (LSTM) Algorithm","volume":"4","author":"Utomo","year":"2025","journal-title":"J. Inform. Web Eng."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"7354","DOI":"10.1109\/TCSVT.2025.3544212","article-title":"Cross-modal adaptive prototype learning for continuous sign language recognition","volume":"35","author":"Wei","year":"2025","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"13830","DOI":"10.1038\/s41598-025-98571-2","article-title":"Efficient human activity recognition on edge devices using DeepConv LSTM architectures","volume":"15","author":"Zhou","year":"2025","journal-title":"Sci. Rep."}],"container-title":["Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-9709\/13\/2\/27\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,5]],"date-time":"2026-02-05T11:36:16Z","timestamp":1770291376000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-9709\/13\/2\/27"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,5]]},"references-count":28,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2026,2]]}},"alternative-id":["informatics13020027"],"URL":"https:\/\/doi.org\/10.3390\/informatics13020027","relation":{},"ISSN":["2227-9709"],"issn-type":[{"value":"2227-9709","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,5]]}}}