{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T14:10:53Z","timestamp":1760710253738,"version":"build-2065373602"},"reference-count":32,"publisher":"MDPI AG","issue":"10","license":[{"start":{"date-parts":[[2025,10,16]],"date-time":"2025-10-16T00:00:00Z","timestamp":1760572800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Tianshan Talents Cultivation Program\u2014Leading Talents for Scientific and Technological Innovation","award":["2024TSYCLJ0002"],"award-info":[{"award-number":["2024TSYCLJ0002"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Information"],"abstract":"<jats:p>Multi-modal knowledge graph completion (MMKGC) aims to complete knowledge graphs by integrating structural information with multi-modal (e.g., visual, textual, and numerical) features and leveraging cross-modal reasoning within a unified semantic space to infer and supplement missing factual knowledge. Current MMKGC methods have advanced in terms of integrating multi-modal information but have overlooked the imbalance in modality importance for target entities. Treating all modalities equally dilutes critical semantics and amplifies irrelevant information, which in turn limits the semantic understanding and predictive performance of the model. To address these limitations, we proposed a modality information aggregation graph attention network with adversarial training for multi-modal knowledge graph completion (MIAGAT-AT). MIAGAT-AT focuses on hierarchically modeling complex cross-modal interactions. By combining the multi-head attention mechanism with modality-specific projection methods, it precisely captures global semantic dependencies and dynamically adjusts the weight of modality embeddings according to the importance of each modality, thereby optimizing cross-modal information fusion capabilities. Moreover, through the use of random noise and multi-layer residual blocks, the adversarial training generates high-quality multi-modal feature representations, thereby effectively enhancing information from imbalanced modalities. Experimental results demonstrate that our approach significantly outperforms 18 existing baselines and establishes a strong performance baseline across three distinct datasets.<\/jats:p>","DOI":"10.3390\/info16100907","type":"journal-article","created":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T13:07:22Z","timestamp":1760706442000},"page":"907","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Modality Information Aggregation Graph Attention Network with Adversarial Training for Multi-Modal Knowledge Graph Completion"],"prefix":"10.3390","volume":"16","author":[{"given":"Hankiz","family":"Yilahun","sequence":"first","affiliation":[{"name":"School of Computer Science and Technology, Xinjiang University, Urumqi 830017, China"},{"name":"Xinjiang Key Laboratory of Multilingual Information Technology, Urumqi 830017, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Elyar","family":"Aili","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xinjiang University, Urumqi 830017, China"},{"name":"Xinjiang Key Laboratory of Multilingual Information Technology, Urumqi 830017, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Seyyare","family":"Imam","sequence":"additional","affiliation":[{"name":"School of National Security Studies, Xinjiang University, Urumqi 830017, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2321-308X","authenticated-orcid":false,"given":"Askar","family":"Hamdulla","sequence":"additional","affiliation":[{"name":"School of Computer Science and Technology, Xinjiang University, Urumqi 830017, China"},{"name":"Xinjiang Key Laboratory of Multilingual Information Technology, Urumqi 830017, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,10,16]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Xu, Z., Cruz, M.J., Guevara, M., Wang, T., Deshpande, M., Wang, X., and Li, Z. (2024, January 14\u201318). Retrieval-augmented generation with knowledge graphs for customer service question answering. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Washington, DC, USA.","DOI":"10.1145\/3626772.3661370"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1007\/s11633-023-1440-x","article-title":"Ripple knowledge graph convolutional networks for recommendation systems","volume":"21","author":"Li","year":"2024","journal-title":"Mach. Intell. Res."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"104319","DOI":"10.1016\/j.cose.2025.104319","article-title":"Sublinear smart semantic search based on knowledge graph over encrypted database","volume":"151","author":"Li","year":"2025","journal-title":"Comput. Secur."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"9456","DOI":"10.1109\/TPAMI.2024.3417451","article-title":"A survey of knowledge graph reasoning on graph types: Static, dynamic, and multi-modal","volume":"46","author":"Liang","year":"2024","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_5","unstructured":"Chen, Z., Chen, J., Zhang, W., Guo, L., Fang, Y., Huang, Y., Zhang, Y., Geng, Y., Pan, J.Z., and Song, W. (November, January 29). Meaformer: Multi-modal entity alignment transformer for meta modality hybrid. Proceedings of the 31st ACM International Conference on Multimedia, Ottawa, ON, Canada."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3545573","article-title":"Hyper-node relational graph attention network for multi-modal knowledge graph completion","volume":"19","author":"Liang","year":"2023","journal-title":"ACM Trans. Multim. Comput. Commun. Appl."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, S., Wei, X., Nogueira dos Santos, C.N., Wang, Z., Nallapati, R., Arnold, A., Xiang, B., Yu, P.S., and Cruz, I.F. (2021, January 19\u201323). Mixed-curvature multi-relational graph neural network for knowledge graph completion. Proceedings of the Web Conference 2021, Ljubljana, Slovenia.","DOI":"10.1145\/3442381.3450118"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Li, X., Zhao, X., Xu, J., Zhang, Y., and Xing, C. (May, January 30). IMF: Interactive multimodal fusion model for link prediction. Proceedings of the ACM Web Conference 2023, Austin, TX, USA.","DOI":"10.1145\/3543507.3583554"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Mousselly-Sergieh, H., Botschen, T., Gurevych, I., and Roth, S. (2018, January 5\u20136). A multimodal translation-based approach for knowledge graph representation learning. Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics, New Orleans, LA, USA.","DOI":"10.18653\/v1\/S18-2027"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Wang, M., Wang, S., Yang, H., Zhang, Z., Chen, X., and Qi, G. (2021, January 20\u201324). Is visual context really helpful for knowledge graph? A representation learning perspective. Proceedings of the 29th ACM International Conference on Multimedia, Virtual.","DOI":"10.1145\/3474085.3475470"},{"key":"ref_11","unstructured":"Xie, R., Liu, Z., Luan, H., and Sun, M. (2017, January 19\u201325). Image-embodied knowledge representation learning. Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017), Melbourne, Australia."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Wang, Z., Li, L., Li, Q., and Zeng, D. (2019, January 14\u201319). Multimodal data enhanced representation learning for knowledge graphs. Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN 2019), Budapest, Hungary.","DOI":"10.1109\/IJCNN.2019.8852079"},{"key":"ref_13","first-page":"39090","article-title":"Otkge: Multi-modal knowledge graph embeddings via optimal transport","volume":"35","author":"Cao","year":"2022","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_14","unstructured":"Zhang, Y., and Zhang, W. (2022). Knowledge graph completion with pre-trained multimodal transformer and twins negative sampling. arXiv."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Xu, D., Xu, T., Wu, S., Zhou, J., and Chen, E. (2022, January 10\u201314). Relation-enhanced negative sampling for multimodal knowledge graph completion. Proceedings of the 30th ACM International Conference on Multimedia (ACM MM 2022), Lisbon, Portugal.","DOI":"10.1145\/3503161.3548388"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Chen, M., and Zhang, W. (2023, January 18\u201323). Modality-aware negative sampling for multi-modal knowledge graph embedding. Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN 2023), Gold Coast, Australia.","DOI":"10.1109\/IJCNN54540.2023.10191314"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Zhang, H., Han, Q., Sun, H., and Liu, C. (November, January 30). Multi-modal knowledge graph representation based on counterfactual data enhanced learning link prediction. Proceedings of the 2024 11th International Conference on Behavioural and Social Computing (BESC 2024), Okayama, Japan.","DOI":"10.1109\/BESC64747.2024.10780531"},{"key":"ref_18","unstructured":"Zhang, Y., Chen, Z., Liang, L., Chen, H., and Zhang, W. (2024, January 20\u201325). Unleashing the power of imbalanced modality information for multi-modal knowledge graph completion. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Turin, Italy."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Chen, Z., Guo, L., Xu, Y., Hu, B., Liu, Z., Zhang, W., and Chen, H. (2024, January 14\u201318). Native: Multi-modal knowledge graph completion in the wild. Proceedings of the 47th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024), Washington, DC, USA.","DOI":"10.1145\/3626772.3657800"},{"key":"ref_20","unstructured":"Velickovic, P., Cucurull, G., Casanova, A., Romero, A., Li\u00f2, P., and Bengio, Y. (May, January 30). Graph attention networks. Proceedings of the 6th International Conference on Learning Representations (ICLR 2018), Vancouver, BC, Canada."},{"key":"ref_21","unstructured":"Sun, Z., Deng, Z.-H., Nie, J.-Y., and Tang, J. (2019, January 6\u20139). RotatE: Knowledge graph embedding by relational rotation in complex space. Proceedings of the 7th International Conference on Learning Representations (ICLR 2019), New Orleans, LA, USA."},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Liu, Y., Li, H., Garcia-Duran, A., Niepert, M., Onoro-Rubio, D., and Rosenblum, D.S. (2019, January 2\u20136). MMKG: Multi-modal knowledge graphs. Proceedings of the 16th Extended Semantic Web Conference (ESWC 2019), Portoro\u017e, Slovenia.","DOI":"10.1007\/978-3-030-21348-0_30"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., and Ives, Z. (2007, January 11\u201315). Dbpedia: A nucleus for a web of open data. Proceedings of the 6th International Semantic Web Conference (ISWC 2007), Busan, Republic of Korea.","DOI":"10.1007\/978-3-540-76298-0_52"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1145\/2629489","article-title":"Wikidata: A free collaborative knowledgebase","volume":"57","year":"2014","journal-title":"Commun. ACM"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Suchanek, F.M., Kasneci, G., and Weikum, G. (2007, January 8\u201312). Yago: A core of semantic knowledge. Proceedings of the 16th International Conference on World Wide Web (WWW 2007), Banff, AB, Canada.","DOI":"10.1145\/1242572.1242667"},{"key":"ref_26","first-page":"2787","article-title":"Translating embeddings for modeling multi-relational data","volume":"26","author":"Bordes","year":"2013","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Ji, G., He, S., Xu, L., Liu, K., and Zhao, J. (2015, January 26\u201331). Knowledge graph embedding via dynamic mapping matrix. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015), Beijing, China.","DOI":"10.3115\/v1\/P15-1067"},{"key":"ref_28","unstructured":"Yang, B., Yih, S.W.-T., He, X., Gao, J., and Deng, L. (2015, January 7\u20139). Embedding entities and relations for learning and inference in knowledge bases. Proceedings of the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, CA, USA."},{"key":"ref_29","unstructured":"Trouillon, T., Welbl, J., Riedel, S., Gaussier, \u00c9., and Bouchard, G. (2016, January 19\u201324). Complex embeddings for simple link prediction. Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), New York, NY, USA."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"7480","DOI":"10.1007\/s10489-021-02693-9","article-title":"MMKRL: A robust embedding approach for multi-modal knowledge graph representation learning","volume":"52","author":"Lu","year":"2022","journal-title":"Appl. Intell."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Lee, J., Chung, C., Lee, H., Jo, S., and Whang, J. (2023, January 6\u201310). VISTA: Visual-textual knowledge graph representation learning. Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore.","DOI":"10.18653\/v1\/2023.findings-emnlp.488"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Cai, L., and Wang, W.Y. (2018, January 1\u20136). KBGAN: Adversarial learning for knowledge graph embeddings. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018), New Orleans, LA, USA.","DOI":"10.18653\/v1\/N18-1133"}],"container-title":["Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/10\/907\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,17]],"date-time":"2025-10-17T13:42:54Z","timestamp":1760708574000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2078-2489\/16\/10\/907"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,16]]},"references-count":32,"journal-issue":{"issue":"10","published-online":{"date-parts":[[2025,10]]}},"alternative-id":["info16100907"],"URL":"https:\/\/doi.org\/10.3390\/info16100907","relation":{},"ISSN":["2078-2489"],"issn-type":[{"value":"2078-2489","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,16]]}}}