{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,9]],"date-time":"2026-06-09T07:48:16Z","timestamp":1780991296632,"version":"3.54.1"},"reference-count":31,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2021,12,28]],"date-time":"2021-12-28T00:00:00Z","timestamp":1640649600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003725","name":"National Research Foundation of Korea","doi-asserted-by":"publisher","award":["NRF-2021R1F1A1062181"],"award-info":[{"award-number":["NRF-2021R1F1A1062181"]}],"id":[{"id":"10.13039\/501100003725","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100002631","name":"Gachon University","doi-asserted-by":"publisher","award":["GCU-2019-0386"],"award-info":[{"award-number":["GCU-2019-0386"]}],"id":[{"id":"10.13039\/501100002631","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Studies on deep-learning-based behavioral pattern recognition have recently received considerable attention. However, if there are insufficient data and the activity to be identified is changed, a robust deep learning model cannot be created. This work contributes a generalized deep learning model that is robust to noise not dependent on input signals by extracting features through a deep learning model for each heterogeneous input signal that can maintain performance while minimizing preprocessing of the input signal. We propose a hybrid deep learning model that takes heterogeneous sensor data, an acceleration sensor, and an image as inputs. For accelerometer data, we use a convolutional neural network (CNN) and convolutional block attention module models (CBAM), and apply bidirectional long short-term memory and a residual neural network. 
The overall accuracy was 94.8% with a skeleton image and accelerometer data, and 93.1% with a skeleton image, coordinates, and accelerometer data after evaluating nine behaviors using the Berkeley Multimodal Human Action Database (MHAD). Furthermore, the accuracy of the investigation was revealed to be 93.4% with inverted images and 93.2% with white noise added to the accelerometer data. Testing with data that included inversion and noise data indicated that the suggested model was robust, with a performance deterioration of approximately 1%.<\/jats:p>","DOI":"10.3390\/s22010174","type":"journal-article","created":{"date-parts":[[2021,12,28]],"date-time":"2021-12-28T06:55:03Z","timestamp":1640674503000},"page":"174","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["Robust Human Activity Recognition by Integrating Image and Accelerometer Sensor Data Using Deep Fusion Network"],"prefix":"10.3390","volume":"22","author":[{"given":"Junhyuk","family":"Kang","sequence":"first","affiliation":[{"name":"Department of Software, Gachon University, Seongnam 13120, Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jieun","family":"Shin","sequence":"additional","affiliation":[{"name":"Department of Software, Gachon University, Seongnam 13120, Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jaewon","family":"Shin","sequence":"additional","affiliation":[{"name":"Department of Software, Gachon University, Seongnam 13120, Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8119-9677","authenticated-orcid":false,"given":"Daeho","family":"Lee","sequence":"additional","affiliation":[{"name":"Department of Mechanical Engineering, Gachon University, Seongnam 13120, 
Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7676-9869","authenticated-orcid":false,"given":"Ahyoung","family":"Choi","sequence":"additional","affiliation":[{"name":"Department of Software, Gachon University, Seongnam 13120, Korea"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2021,12,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Bieber, G., Voskamp, J., and Urban, B. (2009, January 19\u201324). Activity Recognition for Everyday Life on Mobile Phones. Proceedings of the International Conference on Universal Access in Human-Computer Interaction, San Diego, CA, USA.","DOI":"10.1007\/978-3-642-02710-9_32"},{"key":"ref_2","first-page":"1625","article-title":"Prediction of Activity Energy Expenditure Using Accelerometers in Children","volume":"36","author":"Puyau","year":"2004","journal-title":"Med. Sci. Sports Exerc."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"801","DOI":"10.1249\/MSS.0000000000001144","article-title":"Activity Recognition in Youth Using Single Accelerometer Placed at Wrist or Ankle","volume":"49","author":"Andrea","year":"2017","journal-title":"Med. Sci. Sports Exerc."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2183","DOI":"10.1088\/0967-3334\/35\/11\/2183","article-title":"Machine learning for activity recognition: Hip versus wrist data","volume":"35","author":"Stewart","year":"2014","journal-title":"Physiol. Meas."},{"key":"ref_5","unstructured":"Anahita, H., Shayan, F., Eleanne, V., Lia, V., Rima, H., Majid, S., and Alex, B. (2018, January 18\u201321). Children Activity Recognition: Challenges and Strategies. 
Proceedings of the International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA."},{"key":"ref_6","first-page":"1","article-title":"Machine learning algorithms for activity recognition in ambulant children and adolescents with cerebral palsy","volume":"15","author":"Ahmadi","year":"2018","journal-title":"J. Neuroeng. Rehabil."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"915","DOI":"10.1016\/j.asoc.2017.09.027","article-title":"Real-time human activity recognition from accelerometer data using convolutional neural networks","volume":"62","author":"Ignatov","year":"2018","journal-title":"Appl. Soft Comput."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Wang, L. (2016). Recognition of human activities using continuous autoencoders with wearable sensors. Sensors, 16.","DOI":"10.3390\/s16020189"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1016\/j.neunet.2014.09.003","article-title":"Deep learning in neural networks: An overview","volume":"61","author":"Schmidhuber","year":"2015","journal-title":"Neural Netw."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"233","DOI":"10.1016\/j.eswa.2018.03.056","article-title":"Deep learning algorithms for human activity recognition using mobile and wearable sensor networks: State of the art and research challenges","volume":"105","author":"Nweke","year":"2018","journal-title":"Expert Syst. Appl."},{"key":"ref_11","unstructured":"Hammerla, N.Y., Halloran, S., and Pl\u00f6tz, T. (2016). Deep, convolutional, and recurrent models for human activity recognition using wearables. arXiv."},{"key":"ref_12","first-page":"1","article-title":"Deep Learning for Sensor-based Human Activity Recognition: Overview, Challenges, and Opportunities","volume":"54","author":"Chen","year":"2021","journal-title":"ACM Comput. Surv. 
(CSUR)"},{"key":"ref_13","first-page":"114","article-title":"Recognition of human hand activities based on a single wrist IMU using recurrent neural networks","volume":"6","author":"River","year":"2017","journal-title":"Int. J. Pharma Med. Biol. Sci."},{"key":"ref_14","first-page":"1","article-title":"Deep residual Bidir-LSTM for human activity recognition using wearable sensors","volume":"7316954","author":"Zhao","year":"2018","journal-title":"Math. Prob. Eng."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"2237","DOI":"10.1007\/s11227-020-03361-4","article-title":"An end-to-end deep learning model for human activity recognition from highly sparse body sensor data in Internet of Medical Things environment","volume":"77","author":"Hassan","year":"2021","journal-title":"J. Supercomput."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"743","DOI":"10.1007\/s11036-019-01445-x","article-title":"Deep learning models for real-time human activity recognition with smartphones","volume":"25","author":"Wan","year":"2020","journal-title":"Mob. Netw. Appl."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"60","DOI":"10.18201\/ijisae.2019151257","article-title":"Human activity recognition on real time and offline dataset","volume":"7","author":"Kale","year":"2019","journal-title":"Int. J. Intell. Syst. Appl. Eng."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"2567","DOI":"10.1007\/s42835-019-00278-8","article-title":"Vision-based human activity recognition system using depth silhouettes: A smart home system for monitoring the residents","volume":"14","author":"Kim","year":"2019","journal-title":"J. Electr. Eng. Technol."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1016\/j.patrec.2018.04.035","article-title":"Combining CNN streams of RGB-D and skeletal data for human activity recognition","volume":"115","author":"Khaire","year":"2018","journal-title":"Pattern Recognit. 
Lett."},{"key":"ref_20","unstructured":"Amir, S., Jun, L., Tian-Tsong, N., and Gang, W. (2016, January 27\u201330). NTU RGB+D: A Large-Scale Dataset for 3D Human Activity Analysis. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2123","DOI":"10.1109\/TPAMI.2015.2505295","article-title":"Multimodal multipart learning for action recognition in depth videos","volume":"38","author":"Shahroudy","year":"2015","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Ord\u00f3\u00f1ez, F., and Roggen, D. (2016). Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition. Sensors, 16.","DOI":"10.3390\/s16010115"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Wei, H., Jafari, R., and Kehtarnavaz, N. (2019). Fusion of Video and Inertial Sensing for Deep Learning\u2013Based Human Action Recognition. Sensors, 19.","DOI":"10.3390\/s19173680"},{"key":"ref_24","first-page":"4061","article-title":"Multi-Layered Deep Learning Features Fusion for Human Action Recognition","volume":"69","author":"Kiran","year":"2021","journal-title":"Comput. Mater. Contin."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1377","DOI":"10.1007\/s10044-018-0688-1","article-title":"An implementation of optimized framework for action classification using multilayers neural network on selected fused features","volume":"22","author":"Khan","year":"2019","journal-title":"Pattern Anal. Appl."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Helmi, A.M., Al-Qaness, M.A., Dahou, A., Dama\u0161evi\u010dius, R., Kavi\u010dius, T., and Elaziz, M.A. (2021). A novel hybrid gradient-based optimizer and grey wolf optimizer feature selection method for human activity recognition using smartphone sensors. 
Entropy, 23.","DOI":"10.3390\/e23081065"},{"key":"ref_27","unstructured":"(2021, December 21). OpenPose API. Available online: https:\/\/github.com\/CMU-Perceptual-Computing-Lab\/openpose."},{"key":"ref_28","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (July, January 26). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","article-title":"Long short-term memory","volume":"9","author":"Hochreiter","year":"1997","journal-title":"Neural Comput."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). CBAM: Convolutional Block Attention Module. Proceedings of the European Conference on Computer Vision, Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_31","unstructured":"(2021, December 21). Berkeley MHAD. Available online: https:\/\/tele-immersion.citris-uc.org\/berkeley_mhad."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/1\/174\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T07:54:34Z","timestamp":1760169274000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/22\/1\/174"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,12,28]]},"references-count":31,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,1]]}},"alternative-id":["s22010174"],"URL":"https:\/\/doi.org\/10.3390\/s22010174","relation":{},"ISSN":["1424-8220"],"issn-type":[{"value":"1424-8220","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,12,28]]}}}