{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T18:56:49Z","timestamp":1769194609459,"version":"3.49.0"},"reference-count":34,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2024,5,7]],"date-time":"2024-05-07T00:00:00Z","timestamp":1715040000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>Situational awareness (SA) is crucial in disaster response, enhancing the understanding of the environment. Social media, with its extensive user base, offers valuable real-time information for such scenarios. Although SA systems excel in extracting disaster-related details from user-generated content, a common limitation in prior approaches is their emphasis on single-modal extraction rather than embracing multi-modalities. This paper proposed a multimodal hierarchical graph-based situational awareness (MHGSA) system for comprehensive disaster event classification. Specifically, the proposed multimodal hierarchical graph contains nodes representing different disaster events and the features of the event nodes are extracted from the corresponding images and acoustic features. The proposed feature extraction modules with multi-branches for vision and audio features provide hierarchical node features for disaster events of different granularities, aiming to build a coarse-granularity classification task to constrain the model and enhance fine-granularity classification. The relationships between different disaster events in multi-modalities are learned by graph convolutional neural networks to enhance the system\u2019s ability to recognize disaster events, thus enabling the system to fuse complex features of vision and audio. Experimental results illustrate the effectiveness of the proposed visual and audio feature extraction modules in single-modal scenarios. Furthermore, the MHGSA successfully fuses visual and audio features, yielding promising results in disaster event classification tasks.<\/jats:p>","DOI":"10.3390\/fi16050161","type":"journal-article","created":{"date-parts":[[2024,5,7]],"date-time":"2024-05-07T06:53:27Z","timestamp":1715064807000},"page":"161","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["AI-Empowered Multimodal Hierarchical Graph-Based Learning for Situation Awareness on Enhancing Disaster Responses"],"prefix":"10.3390","volume":"16","author":[{"given":"Jieli","family":"Chen","sequence":"first","affiliation":[{"name":"XJTLU Entrepreneur College (Taicang), Xian Jiaotong-Liverpool University, Taicang 215400, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kah Phooi","family":"Seng","sequence":"additional","affiliation":[{"name":"XJTLU Entrepreneur College (Taicang), Xian Jiaotong-Liverpool University, Taicang 215400, China"},{"name":"School of Computer Science, Queensland University of Technology, Brisbane, QLD 4000, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Li Minn","family":"Ang","sequence":"additional","affiliation":[{"name":"School of Engineering and Science, University of Sunshine Coast, Petrie, QLD 4502, Australia"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jeremy","family":"Smith","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering & Electronics, University of Liverpool, Liverpool L69 3BX, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-8652-7913","authenticated-orcid":false,"given":"Hanyue","family":"Xu","sequence":"additional","affiliation":[{"name":"XJTLU Entrepreneur College (Taicang), Xian Jiaotong-Liverpool University, Taicang 215400, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2024,5,7]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"389","DOI":"10.1109\/TCSS.2019.2957208","article-title":"TAQE: Tweet Retrieval-Based Infrastructure Damage Assessment During Disasters","volume":"7","author":"Priya","year":"2020","journal-title":"IEEE Trans. Comput. Soc. Syst."},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Sakaki, T., Okazaki, M., and Matsuo, Y. (2010, January 26\u201330). Earthquake Shakes Twitter Users: Real-Time Event Detection by Social Sensors. Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA.","DOI":"10.1145\/1772690.1772777"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"556","DOI":"10.1109\/TCSVT.2014.2347551","article-title":"Effective Multimodality Fusion Framework for Cross-Media Topic Detection","volume":"26","author":"Chu","year":"2016","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Blandfort, P., Patton, D., Frey, W.R., Karaman, S., Bhargava, S., Lee, F.-T., Varia, S., Kedzie, C., Gaskell, M.B., and Schifanella, R. (2018, January 25\u201328). Multimodal Social Media Analysis for Gang Violence Prevention. Proceedings of the International AAAI Conference on Web and Social Media, Palo Alto, CA, USA.","DOI":"10.1609\/icwsm.v13i01.3214"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"105695","DOI":"10.1016\/j.knosys.2020.105695","article-title":"A Survey on Multi-Modal Social Event Detection","volume":"195","author":"Zhou","year":"2020","journal-title":"Knowl. Based Syst."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3113","DOI":"10.1109\/TII.2019.2897594","article-title":"Efficient Fire Detection for Uncertain Surveillance Environment","volume":"15","author":"Muhammad","year":"2019","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1067","DOI":"10.1109\/TII.2019.2915592","article-title":"Edge Intelligence-Assisted Smoke Detection in Foggy Surveillance Environments","volume":"16","author":"Muhammad","year":"2020","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"1687","DOI":"10.1109\/JSTARS.2020.2969809","article-title":"EmergencyNet: Efficient Aerial Image Classification for Drone-Based Emergency Monitoring Using Atrous Convolutional Feature Fusion","volume":"13","author":"Kyrkou","year":"2020","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"878","DOI":"10.1109\/JPROC.2011.2182093","article-title":"Large-Scale Situation Awareness with Camera Networks and Multimodal Sensing","volume":"100","author":"Ramachandran","year":"2012","journal-title":"Proc. IEEE"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"10478","DOI":"10.1109\/ACCESS.2020.2965550","article-title":"A Hybrid Machine Learning Pipeline for Automated Mapping of Events and Locations From Social Media in Disasters","volume":"8","author":"Fan","year":"2020","journal-title":"IEEE Access"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"340","DOI":"10.1038\/s42256-023-00624-6","article-title":"Multimodal Learning with Graphs","volume":"5","author":"Ektefaie","year":"2023","journal-title":"Nat. Mach Intell."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Alam, F., Joty, S., and Imran, M. (2018, January 25\u201328). Graph Based Semi-Supervised Learning with Convolution Neural Networks to Classify Crisis Related Tweets. Proceedings of the International AAAI Conference on Web and Social Media, Palo Alto, CA, USA.","DOI":"10.1609\/icwsm.v12i1.15047"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Luo, C., Song, S., Xie, W., Shen, L., and Gunes, H. (2022, January 23\u201329). Learning Multi-Dimensional Edge Feature-Based AU Relation Graph for Facial Action Unit Recognition. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, Vienna, Austria.","DOI":"10.24963\/ijcai.2022\/173"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"8657","DOI":"10.1109\/TGRS.2020.3037361","article-title":"CNN-Enhanced Graph Convolutional Network with Pixel- and Superpixel-Level Feature Fusion for Hyperspectral Image Classification","volume":"59","author":"Liu","year":"2021","journal-title":"IEEE Trans. Geosci. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"7268","DOI":"10.1109\/TII.2022.3227641","article-title":"Graph Learning Empowered Situation Awareness in Internet of Energy with Graph Digital Twin","volume":"19","author":"Sui","year":"2023","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"2207","DOI":"10.1109\/TMI.2022.3159264","article-title":"Multi-Modal Graph Learning for Disease Prediction","volume":"41","author":"Zheng","year":"2022","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"1382","DOI":"10.1109\/LSP.2023.3319233","article-title":"Audio Event-Relational Graph Representation Learning for Acoustic Scene Classification","volume":"30","author":"Hou","year":"2023","journal-title":"IEEE Signal Process. Lett."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"32","DOI":"10.1518\/001872095779049543","article-title":"Toward a Theory of Situation Awareness in Dynamic Systems","volume":"37","author":"Endsley","year":"1995","journal-title":"Hum Factors"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/j.ins.2021.05.040","article-title":"Dual Attention Guided Multi-Scale CNN for Fine-Grained Image Classification","volume":"573","author":"Liu","year":"2021","journal-title":"Inf. Sci."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"116663","DOI":"10.1109\/ACCESS.2020.3005150","article-title":"Multi-Scale CNN for Fine-Grained Image Recognition","volume":"8","author":"Won","year":"2020","journal-title":"IEEE Access"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.ijar.2022.07.002","article-title":"Hierarchical Classification Based on Coarse- to Fine-Grained Knowledge Transfer","volume":"149","author":"Qiu","year":"2022","journal-title":"Int. J. Approx. Reason."},{"key":"ref_22","unstructured":"Zhu, X., and Bain, M. (2017). B-CNN: Branch Convolutional Neural Network for Hierarchical Classification. arXiv."},{"key":"ref_23","unstructured":"Tan, M., and Le, Q.V. (2019, January 9\u201315). EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009, January 20\u201325). ImageNet: A Large-Scale Hierarchical Image Database. Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA.","DOI":"10.1109\/CVPR.2009.5206848"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-Excitation Networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"104199","DOI":"10.1016\/j.dsp.2023.104199","article-title":"Self-Supervised Learning Representation for Abnormal Acoustic Event Detection Based on Attentional Contrastive Learning","volume":"142","author":"Wei","year":"2023","journal-title":"Digit. Signal Process."},{"key":"ref_28","unstructured":"Bresson, X., and Laurent, T. (2018). Residual Gated Graph ConvNets. arXiv."},{"key":"ref_29","unstructured":"Kipf, T.N., and Welling, M. (2017). Semi-Supervised Classification with Graph Convolutional Networks. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Marcheggiani, D., and Titov, I. (2017). Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. arXiv.","DOI":"10.18653\/v1\/D17-1159"},{"key":"ref_31","unstructured":"Kingma, D.P., and Ba, J. (2017). Adam: A Method for Stochastic Optimization. arXiv."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Chen, H., Xie, W., Vedaldi, A., and Zisserman, A. (2020, January 4\u20138). Vggsound: A Large-Scale Audio-Visual Dataset. Proceedings of the ICASSP 2020\u20132020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain.","DOI":"10.1109\/ICASSP40776.2020.9053174"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., and Paluri, M. (2018, January 18\u201323). A Closer Look at Spatiotemporal Convolutions for Action Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00675"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2880","DOI":"10.1109\/TASLP.2020.3030497","article-title":"PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition","volume":"28","author":"Kong","year":"2020","journal-title":"IEEE\/ACM Trans. Audio Speech Lang. Process."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/16\/5\/161\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T14:40:46Z","timestamp":1760107246000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/16\/5\/161"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,5,7]]},"references-count":34,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,5]]}},"alternative-id":["fi16050161"],"URL":"https:\/\/doi.org\/10.3390\/fi16050161","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,5,7]]}}}