{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T20:56:15Z","timestamp":1770065775021,"version":"3.49.0"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2025,4,7]],"date-time":"2025-04-07T00:00:00Z","timestamp":1743984000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,4,7]],"date-time":"2025-04-07T00:00:00Z","timestamp":1743984000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001659","name":"Deutsche Forschungsgemeinschaft","doi-asserted-by":"publisher","award":["EXC-2075 - 390740016"],"award-info":[{"award-number":["EXC-2075 - 390740016"]}],"id":[{"id":"10.13039\/501100001659","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Vis. Comput. Ind. Biomed. Art"],"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>This study presents a novel visualization approach to explainable artificial intelligence for graph-based visual question answering (VQA) systems. The method focuses on identifying false answer predictions by the model and offers users the opportunity to directly correct mistakes in the input space, thus facilitating dataset curation. The decision-making process of the model is demonstrated by highlighting certain internal states of a graph neural network (GNN). The proposed system is built on top of a GraphVQA framework that implements various GNN-based models for VQA trained on the GQA dataset. 
The authors evaluated their tool through the demonstration of identified use cases, quantitative measures, and a user study conducted with experts from machine learning, visualization, and natural language processing domains. The authors\u2019 findings highlight the prominence of their implemented features in supporting the users with incorrect prediction identification and identifying the underlying issues. Additionally, their approach is easily extendable to similar models aiming at graph-based question answering.<\/jats:p>","DOI":"10.1186\/s42492-025-00185-y","type":"journal-article","created":{"date-parts":[[2025,4,7]],"date-time":"2025-04-07T10:12:36Z","timestamp":1744020756000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Visual explainable artificial intelligence for graph-based visual question answering and scene graph curation"],"prefix":"10.1186","volume":"8","author":[{"given":"Sebastian","family":"K\u00fcnzel","sequence":"first","affiliation":[]},{"given":"Tanja","family":"Munz-K\u00f6rner","sequence":"additional","affiliation":[]},{"given":"Pascal","family":"Tilli","sequence":"additional","affiliation":[]},{"given":"Noel","family":"Sch\u00e4fer","sequence":"additional","affiliation":[]},{"given":"Sandeep","family":"Vidyapu","sequence":"additional","affiliation":[]},{"given":"Ngoc","family":"Thang Vu","sequence":"additional","affiliation":[]},{"given":"Daniel","family":"Weiskopf","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,4,7]]},"reference":[{"key":"185_CR1","doi-asserted-by":"publisher","unstructured":"Chang XJ, Ren PZ, Xu PF, Li ZH, Chen XJ, Hauptmann A (2023) A comprehensive survey of scene graphs: Generation and application. 
IEEE Trans Pattern Anal Mach Intell 45(1):1\u201326.\u00a0https:\/\/doi.org\/10.1109\/TPAMI.2021.3137605","DOI":"10.1109\/TPAMI.2021.3137605"},{"key":"185_CR2","doi-asserted-by":"publisher","unstructured":"Damodaran V, Chakravarthy S, Kumar A, Umapathy A, Mitamura T, Nakashima Y et al (2021) Understanding the role of scene graphs in visual question answering. arXiv preprint arXiv: 2101.05479. https:\/\/doi.org\/10.48550\/arXiv.2101.05479","DOI":"10.48550\/arXiv.2101.05479"},{"key":"185_CR3","doi-asserted-by":"publisher","unstructured":"Liang WX, Jiang YH, Liu ZX (2021) GraghVQA: language-guided graph neural networks for graph-based visual question answering. In: Proceedings of the 3rd workshop on multimodal artificial intelligence, Association for Computational Linguistics, Mexico, 6 June 2021. https:\/\/doi.org\/10.18653\/v1\/2021.maiworkshop-1.12","DOI":"10.18653\/v1\/2021.maiworkshop-1.12"},{"key":"185_CR4","doi-asserted-by":"publisher","unstructured":"Hudson DA, Manning CD (2019) GQA: a new dataset for real-world visual reasoning and compositional question answering. In: Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition, IEEE Computer Society, Long Beach, 15\u201320 June 2019. https:\/\/doi.org\/10.1109\/CVPR.2019.00686","DOI":"10.1109\/CVPR.2019.00686"},{"key":"185_CR5","doi-asserted-by":"publisher","unstructured":"V\u00e4th D, Tilli P, Vu NT (2021) Beyond accuracy: a consolidated tool for visual question answering benchmarking. In: Proceedings of the 2021 conference on empirical methods in natural language processing: system demonstrations, Association for Computational Linguistics, Punta Cana, 7\u201311 November 2021. https:\/\/doi.org\/10.18653\/v1\/2021.emnlp-demo.14","DOI":"10.18653\/v1\/2021.emnlp-demo.14"},{"key":"185_CR6","doi-asserted-by":"publisher","unstructured":"Danilevsky M, Qian K, Aharonov R, Katsis Y, Kawas B, Sen P (2020) A survey of the state of explainable AI for natural language processing. 
In: Proceedings of the 1st conference of the Asia-Pacific chapter of the association for computational linguistics and the 10th international joint conference on natural language processing, Association for Computational Linguistics, Suzhou, 4\u20137 December 2020. https:\/\/doi.org\/10.18653\/v1\/2020.aacl-main.46","DOI":"10.18653\/v1\/2020.aacl-main.46"},{"key":"185_CR7","doi-asserted-by":"publisher","unstructured":"Do\u0161ilovi\u0107 FK, Br\u010di\u0107 M, Hlupi\u0107 N (2018) Explainable artificial intelligence: a survey. In: Proceedings of the 41st international convention on information and communication technology, electronics and microelectronics, IEEE, Opatija, 21\u201325 May 2018. https:\/\/doi.org\/10.23919\/MIPRO.2018.8400040","DOI":"10.23919\/MIPRO.2018.8400040"},{"key":"185_CR8","doi-asserted-by":"publisher","unstructured":"Xu FY, Uszkoreit H, Du YZ, Fan W, Zhao DY, Zhu J (2019) Explainable AI: a brief survey on history, research areas, approaches and challenges. In: Tang J, Kan MY, Zhao DY, Li SJ, Zan HY (eds) Natural language processing and Chinese computing. 8th CCF international conference, NLPCC 2019, Dunhuang, October 2019. Lecture notes in computer science (Lecture notes in artificial intelligence), vol 11839. Springer, Cham, pp 563\u2013574. https:\/\/doi.org\/10.1007\/978-3-030-32236-6_51","DOI":"10.1007\/978-3-030-32236-6_51"},{"key":"185_CR9","doi-asserted-by":"publisher","unstructured":"Sch\u00e4fer N, K\u00fcnzel S, Munz-K\u00f6rner T, Tilli P, Vidyapu S, Thang Vu N et al (2023) Visual analysis of scene-graph-based visual question answering. In: Proceedings of the 16th international symposium on visual information communication and interaction, Association for Computing Machinery, Guangzhou, 22\u201324 September 2023. 
https:\/\/doi.org\/10.1145\/3615522.3615547","DOI":"10.1145\/3615522.3615547"},{"key":"185_CR10","doi-asserted-by":"publisher","unstructured":"Sch\u00e4fer N, K\u00fcnzel S, Tilli P, Munz-K\u00f6rner T, Vidyapu S, Vu NT et al (2024) Extended visual analysis system for scene-graph-based visual question answering. DaRUS, V1. https:\/\/doi.org\/10.18419\/darus-3909","DOI":"10.18419\/darus-3909"},{"key":"185_CR11","unstructured":"Cook KA, Thomas JJ (2005) Illuminating the path: the research and development agenda for visual analytics. Pacific Northwest National Laboratory, Richland"},{"key":"185_CR12","doi-asserted-by":"publisher","unstructured":"Keim DA, Mansmann F, Schneidewind J, Thomas J, Ziegler H (2008) Visual analytics: scope and challenges. In: Simoff SJ, B\u00f6hlen MH, Mazeika A (eds) Visual data mining: theory, techniques and tools for visual analytics, vol 4404. Springer, Heidelberg, pp 76\u201390. https:\/\/doi.org\/10.1007\/978-3-540-71080-6_6","DOI":"10.1007\/978-3-540-71080-6_6"},{"key":"185_CR13","doi-asserted-by":"publisher","unstructured":"Garcia R, Telea AC, da Silva BC, T\u00f8rresen J, Comba JLD (2018) A task-and-technique centered survey on visual analytics for deep learning model engineering. Comput Graph 77:30\u201349. https:\/\/doi.org\/10.1016\/j.cag.2018.09.018","DOI":"10.1016\/j.cag.2018.09.018"},{"key":"185_CR14","doi-asserted-by":"publisher","unstructured":"Hohman F, Kahng M, Pienta R, Chau DH (2019) Visual analytics in deep learning: An interrogative survey for the next frontiers. IEEE Trans Vis Comput Graph 25(8):2674\u20132693.\u00a0https:\/\/doi.org\/10.1109\/TVCG.2018.2843369","DOI":"10.1109\/TVCG.2018.2843369"},{"key":"185_CR15","doi-asserted-by":"publisher","unstructured":"Liu SX, Wang XT, Liu MC, Zhu J (2017) Towards better analysis of machine learning models: a visual analytics perspective. 
Vis Inf 1(1):48\u201356.\nhttps:\/\/doi.org\/10.1016\/j.visinf.2017.01.006","DOI":"10.1016\/j.visinf.2017.01.006"},{"key":"185_CR16","doi-asserted-by":"publisher","unstructured":"Yuan J, Chen CJ, Yang WK, Liu MC, Xia JZ, Liu SX (2021) A survey of visual analytics techniques for machine learning. Comput Vis Media 7(1):3\u201336.\u00a0https:\/\/doi.org\/10.1007\/s41095-020-0191-7","DOI":"10.1007\/s41095-020-0191-7"},{"key":"185_CR17","doi-asserted-by":"publisher","unstructured":"Choo J, Liu SX (2018) Visual analytics for explainable deep learning. IEEE Comput Graph Appl 38(4):84\u201392. https:\/\/doi.org\/10.1109\/MCG.2018.042731661","DOI":"10.1109\/MCG.2018.042731661"},{"key":"185_CR18","doi-asserted-by":"publisher","unstructured":"Kahng M, Andrews PY, Kalro A, Chau DH (2018) ActiVis: visual exploration of industry-scale deep neural network models. IEEE Trans Vis Comput Graph 24(1):88\u201397.\u00a0https:\/\/doi.org\/10.1109\/TVCG.2017.2744718","DOI":"10.1109\/TVCG.2017.2744718"},{"key":"185_CR19","doi-asserted-by":"publisher","unstructured":"Strobelt H, Gehrmann S, Pfister H, Rush AM (2018) LSTMVis: a tool for visual analysis of hidden state dynamics in recurrent neural networks. IEEE Trans Vis Comput Graph 24(1):667\u2013676. https:\/\/doi.org\/10.1109\/TVCG.2017.2744158","DOI":"10.1109\/TVCG.2017.2744158"},{"key":"185_CR20","doi-asserted-by":"publisher","unstructured":"Ming Y, Cao SZ, Zhang RX, Li Z, Chen YZ, Song YQ et al (2017) Understanding hidden memories of recurrent neural networks. In: Proceedings of the 2017 IEEE conference on visual analytics science and technology, IEEE, Phoenix, 3\u20136 October 2017. https:\/\/doi.org\/10.1109\/VAST.2017.8585721","DOI":"10.1109\/VAST.2017.8585721"},{"key":"185_CR21","doi-asserted-by":"publisher","unstructured":"Munz T, V\u00e4th D, Kuznecov P, Vu NT, Weiskopf D (2022) Visualization-based improvement of neural machine translation. 
Comput Graph 103:45\u201360.\u00a0https:\/\/doi.org\/10.1016\/j.cag.2021.12.003","DOI":"10.1016\/j.cag.2021.12.003"},{"key":"185_CR22","doi-asserted-by":"publisher","unstructured":"Garcia R, Munz T, Weiskopf D (2021) Visual analytics tool for the interpretation of hidden states in recurrent neural networks. Vis Comput Ind Biomed Art 4(1):24.\u00a0https:\/\/doi.org\/10.1186\/s42492-021-00090-0","DOI":"10.1186\/s42492-021-00090-0"},{"key":"185_CR23","doi-asserted-by":"publisher","unstructured":"Antol S, Agrawal A, Lu JS, Mitchell M, Batra D, Zitnick CL et al (2015) VQA: visual question answering. In: Proceedings of the IEEE international conference on computer vision, IEEE, Santiago, 7\u201313 December 2015. https:\/\/doi.org\/10.1109\/ICCV.2015.279","DOI":"10.1109\/ICCV.2015.279"},{"key":"185_CR24","doi-asserted-by":"publisher","unstructured":"Yusuf R, Owusu JW, Wang HL, Qin K, Lawal ZK, Dong YZ (2022) VQA and visual reasoning: An overview of recent datasets, methods and challenges. arXiv:2212.13296 [cs.CV]. https:\/\/doi.org\/10.48550\/arXiv.2212.13296","DOI":"10.48550\/arXiv.2212.13296"},{"key":"185_CR25","doi-asserted-by":"publisher","unstructured":"Li QF, Tang XY, Jian Y (2021) Adversarial learning with bidirectional attention for visual question answering. Sensors 21(21):7164.\u00a0https:\/\/doi.org\/10.3390\/s21217164\u00a0","DOI":"10.3390\/s21217164"},{"key":"185_CR26","doi-asserted-by":"publisher","unstructured":"Huang ZC, Zeng ZY, Liu B, Fu DM, Fu JL (2020) Pixel-BERT: aligning image pixels with text by deep multi-modal transformers. arXiv preprint arXiv: 2004.00849. https:\/\/doi.org\/10.48550\/arXiv.2004.00849","DOI":"10.48550\/arXiv.2004.00849"},{"key":"185_CR27","doi-asserted-by":"publisher","unstructured":"Goyal Y, Mohapatra A, Parikh D, Batra D (2016) Towards transparent AI systems: interpreting visual question answering models. arXiv preprint arXiv: 1608.08974. 
https:\/\/doi.org\/10.48550\/arXiv.1608.08974","DOI":"10.48550\/arXiv.1608.08974"},{"key":"185_CR28","doi-asserted-by":"publisher","unstructured":"Vu MH, L\u00f6fstedt T, Nyholm T, Sznitman R (2020) A question-centric model for visual question answering in medical imaging. IEEE Trans Med Imaging 39(9):2856\u20132868.\u00a0https:\/\/doi.org\/10.1109\/TMI.2020.2978284","DOI":"10.1109\/TMI.2020.2978284"},{"key":"185_CR29","doi-asserted-by":"publisher","unstructured":"Ray A, Cogswell M, Lin X, Alipour K, Divakaran A, Yao Y et al (2021) Generating and evaluating explanations of attended and error-inducing input regions for VQA models. Appl AI Lett 2(4):e51. https:\/\/doi.org\/10.1002\/ail2.51","DOI":"10.1002\/ail2.51"},{"key":"185_CR30","unstructured":"Rajani NF, Mooney RJ (2017) Ensembling visual explanations for VQA. In: Proceedings of the 31st conference on neural information processing systems, Curran Associates Inc., Long Beach, 4\u20139 December 2017"},{"key":"185_CR31","unstructured":"Norcliffe-Brown W, Vafeias S, Parisot S (2018) Learning conditioned graph structures for interpretable visual question answering. In: Proceedings of the 32nd international conference on neural information processing systems, Curran Associates Inc., Montr\u00e9al, 3\u20138 December 2018"},{"key":"185_CR32","doi-asserted-by":"publisher","unstructured":"Johnson J, Krishna R, Stark M, Li LJ, Shamma DA, Bernstein MS et al (2015) Image retrieval using scene graphs. In: Proceedings of the 2015 IEEE conference on computer vision and pattern recognition, IEEE, Boston, 7\u201312 June 2015. https:\/\/doi.org\/10.1109\/CVPR.2015.7298990","DOI":"10.1109\/CVPR.2015.7298990"},{"key":"185_CR33","doi-asserted-by":"publisher","unstructured":"Ghosh S, Burachas G, Ray A, Ziskind A (2019) Generating natural language explanations for visual question answering using scene graphs and visual attention. arXiv preprint arXiv: 1902.05715. 
https:\/\/doi.org\/10.48550\/arXiv.1902.05715","DOI":"10.48550\/arXiv.1902.05715"},{"key":"185_CR34","doi-asserted-by":"publisher","unstructured":"Sch\u00e4fer N (2022) Visual analytics f\u00fcr visual-reasoning-aufgaben. Dissertation, Universit\u00e4t Stuttgart. https:\/\/doi.org\/10.18419\/opus-12677","DOI":"10.18419\/opus-12677"},{"key":"185_CR35","doi-asserted-by":"publisher","unstructured":"Wu Q, Teney D, Wang P, Shen CH, Dick A, van den Hengel A (2017) Visual\u00a0question answering: a survey of methods and datasets. Comput Vis Image Und 163:21\u201340.\u00a0https:\/\/doi.org\/10.1016\/j.cviu.2017.05.001","DOI":"10.1016\/j.cviu.2017.05.001"},{"key":"185_CR36","doi-asserted-by":"publisher","unstructured":"Ishmam MF, Shovon MSH, Mridha MF, Dey N (2024) From image to language: A critical analysis of Visual Question Answering (VQA) approaches, challenges, and opportunities. Inform Fusion 106:102270. https:\/\/doi.org\/10.1016\/j.inffus.2024.102270","DOI":"10.1016\/j.inffus.2024.102270"},{"key":"185_CR37","doi-asserted-by":"publisher","unstructured":"Waikhom L, Patgiri R (2023) A survey of graph neural networks in various learning paradigms: methods, applications, and challenges. Artif Intell Rev 56(7):6295\u20136364.\u00a0https:\/\/doi.org\/10.1007\/s10462-022-10321-2","DOI":"10.1007\/s10462-022-10321-2"},{"key":"185_CR38","unstructured":"Kipf TN, Welling M (2017) Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th international conference on learning representations, OpenReview.net, Toulon, 24\u201326 April 2017"},{"key":"185_CR39","unstructured":"Brody S, Alon U, Yahav E (2022) How attentive are graph attention networks? In: Proceedings of the 10th international conference on learning representations, ICLR, online, 25\u201329 April 2022"},{"key":"185_CR40","doi-asserted-by":"publisher","unstructured":"Hagberg A, Swart PJ, Schult DA (2008) Exploring network structure, dynamics, and function using NetworkX. 
In: proceedings of the 7th python in science conference, SciPy, Pasadena, 19\u201324 August 2008. https:\/\/doi.org\/10.25080\/TCWV9851","DOI":"10.25080\/TCWV9851"},{"key":"185_CR41","unstructured":"Tilli P, Vu NT (2024) Intrinsic subgraph generation for interpretable graph based visual question answering. In: Proceedings of the 2024 joint international conference on computational linguistics, language resources and evaluation (LREC-COLING 2024), ELRA and ICCL, Torino, 20\u201325 May 2024"},{"key":"185_CR42","doi-asserted-by":"publisher","unstructured":"Gervautz M, Purgathofer W (1988) A simple method for color quantization: octree quantization. In: Magnenat-Thalmann N, Thalmann D (eds) New trends in computer graphics: proceedings of CG International\u201988. Springer, Heidelberg, pp 219\u2013231. https:\/\/doi.org\/10.1007\/978-3-642-83492-9_20","DOI":"10.1007\/978-3-642-83492-9_20"},{"key":"185_CR43","doi-asserted-by":"publisher","unstructured":"Pennington J, Socher R, Manning CD (2014) GloVe: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing, Association for Computational Linguistics, Doha, 25\u201329 October 2014. https:\/\/doi.org\/10.3115\/v1\/D14-1162","DOI":"10.3115\/v1\/D14-1162"},{"key":"185_CR44","doi-asserted-by":"publisher","unstructured":"Ericsson KA, Simon HA (1993) Protocol analysis. MIT Press, Cambridge.\u00a0https:\/\/doi.org\/10.7551\/mitpress\/5657.001.0001","DOI":"10.7551\/mitpress\/5657.001.0001"},{"key":"185_CR45","doi-asserted-by":"publisher","unstructured":"Richer G, Pister A, Abdelaal M, Fekete JD, Sedlmair M, Weiskopf D (2024) Scalability in visualization. 
IEEE Trans Vis Comput Graph 30(7):3314\u20133330.\u00a0https:\/\/doi.org\/10.1109\/TVCG.2022.3231230","DOI":"10.1109\/TVCG.2022.3231230"}],"container-title":["Visual Computing for Industry, Biomedicine, and Art"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42492-025-00185-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s42492-025-00185-y\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s42492-025-00185-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,7]],"date-time":"2025-04-07T10:12:55Z","timestamp":1744020775000},"score":1,"resource":{"primary":{"URL":"https:\/\/vciba.springeropen.com\/articles\/10.1186\/s42492-025-00185-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,7]]},"references-count":45,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2025,12]]}},"alternative-id":["185"],"URL":"https:\/\/doi.org\/10.1186\/s42492-025-00185-y","relation":{},"ISSN":["2524-4442"],"issn-type":[{"value":"2524-4442","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,4,7]]},"assertion":[{"value":"20 March 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 January 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 April 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"All participants gave written consent to participate in the study. 
The study was approved by the ethics commission of the University of Stuttgart.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"9"}}