{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,31]],"date-time":"2026-03-31T11:49:54Z","timestamp":1774957794066,"version":"3.50.1"},"reference-count":17,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T00:00:00Z","timestamp":1736726400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100010661","name":"Horizon 2020 Framework Programme","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100010661","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"<jats:p>Many different methods for prompting large language models have been developed since the emergence of OpenAI's ChatGPT in November 2022. In this work, we evaluate six different few-shot prompting methods. The first set of experiments evaluates three frameworks that focus on the quantity or type of shots in a prompt: a baseline method with a simple prompt and a small number of shots, random few-shot prompting with 10, 20, and 30 shots, and similarity-based few-shot prompting. The second set of experiments target optimizing the prompt or enhancing shots through Large Language Model (LLM)-generated explanations, using three prompting frameworks: Explain then Translate, Question Decomposition Meaning Representation, and Optimization by Prompting. We evaluate these six prompting methods on the newly created Spider4SPARQL benchmark, as it is the most complex SPARQL-based Knowledge Graph Question Answering (KGQA) benchmark to date. Across the various prompting frameworks used, the commercial model is unable to achieve a score over 51%, indicating that KGQA, especially for complex queries, with multiple hops, set operations and filters remains a challenging task for LLMs. Our experiments find that the most successful prompting framework for KGQA is a simple prompt combined with an ontology and five random shots.<\/jats:p>","DOI":"10.3389\/frai.2024.1454258","type":"journal-article","created":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T06:12:17Z","timestamp":1736748737000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Evaluating the effectiveness of prompt engineering for knowledge graph question answering"],"prefix":"10.3389","volume":"7","author":[{"given":"Catherine","family":"Kosten","sequence":"first","affiliation":[]},{"given":"Farhad","family":"Nooralahzadeh","sequence":"additional","affiliation":[]},{"given":"Kurt","family":"Stockinger","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2025,1,13]]},"reference":[{"key":"B1","doi-asserted-by":"crossref","DOI":"10.1109\/SIEDS58326.2023.10137850","article-title":"\u201cChatgpt: applications, opportunities, and threats,\u201d","volume-title":"Proceedings of the IEEE Information Engineering Design Symposium (SIEDS)","author":"Bahrini","year":"2023"},{"key":"B2","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2110.14168","article-title":"Training verifiers to solve math word problems","author":"Cobbe","year":"2021","journal-title":"arXiv [Preprint]"},{"key":"B3","doi-asserted-by":"publisher","first-page":"158","DOI":"10.48786\/EDBT.2025.13","article-title":"\u201cEvaluating the data model robustness of text-to-SQL systems based on real user queries,\u201d","author":"F\u00fcrst","year":"2025","journal-title":"Proceedings 28th International Conference on Extending Database Technology, EDBT 2025, Barcelona, Spain, March 25-28, 2025"},{"key":"B4","article-title":"\u201cLarge language models are zero-shot reasoners,\u201d","volume-title":"Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS '22","author":"Kojima","year":"2024"},{"key":"B5","doi-asserted-by":"crossref","first-page":"5272","DOI":"10.1109\/BigData59044.2023.10386182","article-title":"\u201cSpider4sparql: a complex benchmark for evaluating knowledge graph question answering systems,\u201d","volume-title":"2023 IEEE International Conference on Big Data (BigData)","author":"Kosten","year":"2023"},{"key":"B6","first-page":"3","article-title":"\u201cA description logic primer,\u201d","author":"Krotzsch","year":"2014","journal-title":"Perspectives on Ontology Learning, Vol. 18"},{"key":"B7","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40537-020-00383-w","article-title":"Querying knowledge graphs in natural language","volume":"8","author":"Liang","year":"2021","journal-title":"J. Big Data"},{"key":"B8","doi-asserted-by":"publisher","first-page":"5486","DOI":"10.18653\/v1\/2024.findings-acl.326","article-title":"\u201cStatBot.Swiss: bilingual open data exploration in natural language,\u201d","author":"Nooralahzadeh","year":"2024","journal-title":"Findings of the Association for Computational Linguistics: ACL 2024"},{"key":"B9","doi-asserted-by":"publisher","first-page":"16452","DOI":"10.48550\/arXiv\/2311:16452","article-title":"Can generalist foundation models outcompete special-purpose tuning? Case study in medicine","volume":"2311","author":"Nori","year":"2023","journal-title":"arXiv [Preprint]"},{"key":"B10","article-title":"Code llama: open foundation models for code","author":"Rozi\u00e9re","year":"2024","journal-title":"arXiv [Preprint]"},{"key":"B11","first-page":"61","article-title":"\u201cBio-soda: enabling natural language question answering over knowledge graphs without training data,\u201d","volume-title":"Proceedings of the 33rd International Conference on Scientific and Statistical Database Management, SSDBM '21","author":"Sima","year":"2021"},{"key":"B12","first-page":"1247","article-title":"\u201cWhy reinvent the wheel: let's build question answering systems together,\u201d","volume-title":"Proceedings of the 2018 World Wide Web Conference, WWW '18","author":"Singh","year":"2018"},{"key":"B13","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-emnlp.119","article-title":"\u201cExplain-then-translate: An analysis on improving program translation with self-generated explanations,\u201d","author":"Tang","year":"2023","journal-title":"Findings of the Association for Computational Linguistics: EMNLP"},{"key":"B14","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1162\/tacl_a_00309","article-title":"Break it down: a question understanding benchmark","volume":"8","author":"Wolfson","year":"2020","journal-title":"Trans. Assoc. Comput. Linguist"},{"key":"B15","unstructured":"\u201cLarge language models as optimizers,\u201d\n          \n          \n            \n              Yang\n              C.\n            \n            \n              Wang\n              X.\n            \n            \n              Lu\n              Y.\n            \n            \n              Liu\n              H.\n            \n            \n              Le\n              Q. V.\n            \n            \n              Zhou\n              D.\n            \n          \n          OpenReview.net\n          The Twelfth International Conference on Learning Representations\n          \n          2024"},{"key":"B16","doi-asserted-by":"crossref","first-page":"3911","DOI":"10.18653\/v1\/D18-1425","article-title":"\u201cSpider: a large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task,\u201d","volume-title":"Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing","author":"Yu","year":"2018"},{"key":"B17","doi-asserted-by":"publisher","first-page":"685","DOI":"10.14778\/3636218.3636225","article-title":"Sciencebenchmark: a complex real-world benchmark for evaluating natural language to SQL systems","volume":"17","author":"Zhang","year":"2023","journal-title":"Proc. VLDB Endow"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2024.1454258\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,1,13]],"date-time":"2025-01-13T06:12:29Z","timestamp":1736748749000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2024.1454258\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,13]]},"references-count":17,"alternative-id":["10.3389\/frai.2024.1454258"],"URL":"https:\/\/doi.org\/10.3389\/frai.2024.1454258","relation":{},"ISSN":["2624-8212"],"issn-type":[{"value":"2624-8212","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,13]]},"article-number":"1454258"}}