{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T05:25:27Z","timestamp":1777526727839,"version":"3.51.4"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"6","funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62090025"],"award-info":[{"award-number":["62090025"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"National Key Research and Development Program of China","award":["2023YFB4405101"],"award-info":[{"award-number":["2023YFB4405101"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Des. Autom. Electron. Syst."],"published-print":{"date-parts":[[2025,11,30]]},"abstract":"<jats:p>Customization is a fundamental aspect of embedded system development, requiring developers to acquire extensive domain-specific knowledge from technical documents. However, the sheer volume of these documents, coupled with the intricate relationships between their contents, makes it challenging to efficiently retrieve the necessary information. This challenge highlights the need for a structured approach to organize domain knowledge, with explicit representation of interrelationships. Moreover, while advanced large language models (LLMs) show promise in aiding embedded system development, they often lack the specialized knowledge needed to address domain-specific queries effectively. In this article, we present HSG-RAG, a knowledge base construction and retrieval method tailored for embedded system development that leverages knowledge graphs to represent the hierarchical structure within technical documentation. 
Unlike prior retrieval-augmented generation (RAG) or GraphRAG-based approaches, which build their indexes on either semantic similarity or keyword co-occurrence, HSG-RAG captures the inherited dependency and hierarchical relationships in the documents, which benefits retrieval in both performance and efficiency. We also introduce a benchmark for evaluating the effectiveness of RAG systems in solving real-world challenges in embedded systems, particularly for multi-hop question answering. Experimental results show that HSG-RAG outperforms both RAG and GraphRAG, generating more specific and concise responses.<\/jats:p>","DOI":"10.1145\/3731680","type":"journal-article","created":{"date-parts":[[2025,4,21]],"date-time":"2025-04-21T07:07:26Z","timestamp":1745219246000},"page":"1-21","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["HSG-RAG: Hierarchical Knowledge Base Construction for Embedded System Development"],"prefix":"10.1145","volume":"30","author":[{"ORCID":"https:\/\/orcid.org\/0009-0005-6159-6914","authenticated-orcid":false,"given":"Zhouyang","family":"Lu","sequence":"first","affiliation":[{"name":"School of Computer Science, Fudan University","place":["Shanghai, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0000-7570-4078","authenticated-orcid":false,"given":"Hailin","family":"Xu","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University","place":["Shanghai, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0004-9712-0275","authenticated-orcid":false,"given":"Anrui","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University","place":["Shanghai, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0009-6582-6691","authenticated-orcid":false,"given":"Siyuan","family":"Tang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University","place":["Shanghai, 
China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0003-9924-7790","authenticated-orcid":false,"given":"Junyi","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University","place":["Shanghai, China"]}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-7188-4225","authenticated-orcid":false,"given":"Yifei","family":"Feng","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University","place":["Shanghai, China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7868-5829","authenticated-orcid":false,"given":"Wentao","family":"Pan","sequence":"additional","affiliation":[{"name":"School of Computer Science, Fudan University","place":["Shanghai, China"]}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9111-8474","authenticated-orcid":false,"given":"Jiangli","family":"Huang","sequence":"additional","affiliation":[{"name":"Fudan University","place":["Shanghai, China"]}]}],"member":"320","published-online":{"date-parts":[[2025,10,21]]},"reference":[{"key":"e_1_3_3_2_2","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et\u00a0al. 2020. Language models are few-shot learners. Advances in Neural Information Processing Systems 33 (2020), 1877\u20131901.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/3641289"},{"key":"e_1_3_3_4_2","doi-asserted-by":"crossref","unstructured":"Jianlyu Chen Shitao Xiao Peitian Zhang Kun Luo Defu Lian and Zheng Liu. 2024. M3-embedding: Multi-linguality multi-functionality multi-granularity text embeddings through self-knowledge distillation. In Findings of the Association for Computational Linguistics ACL 2024. 
2318\u20132335.","DOI":"10.18653\/v1\/2024.findings-acl.137"},{"key":"e_1_3_3_5_2","unstructured":"Xin Cheng Di Luo Xiuying Chen Lemao Liu Dongyan Zhao and Rui Yan. 2023. Lift yourself up: Retrieval-augmented text generation with self-memory. Advances in Neural Information Processing Systems 36 (2023) 43780\u201343799."},{"key":"e_1_3_3_6_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.acl-long.870"},{"key":"e_1_3_3_7_2","doi-asserted-by":"publisher","DOI":"10.3115\/1654679.1654689"},{"key":"e_1_3_3_8_2","unstructured":"Darren Edge Ha Trinh Newman Cheng Joshua Bradley Alex Chao Apurva Mody Steven Truitt and Jonathan Larson. 2024. From local to global: A graph rag approach to query-focused summarization. arXiv preprint arXiv:2404.16130 (2024)."},{"key":"e_1_3_3_9_2","doi-asserted-by":"crossref","unstructured":"Wenqi Fan Yujuan Ding Liangbo Ning Shijie Wang Hengyun Li Dawei Yin Tat-Seng Chua and Qing Li. 2024. A survey on rag meeting llms: Towards retrieval-augmented large language models. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 6491\u20136501.","DOI":"10.1145\/3637528.3671470"},{"key":"e_1_3_3_10_2","unstructured":"Yunfan Gao Yun Xiong Xinyu Gao Kangxiang Jia Jinliu Pan Yuxi Bi Yi Dai Jiawei Sun Haofen Wang and Haofen Wang. 2023. Retrieval-augmented generation for large language models: A survey. arXiv preprint arXiv:2312.10997 2 (2023)."},{"key":"e_1_3_3_11_2","unstructured":"Bernal Jim\u00e9nez Guti\u00e9rrez Yiheng Shu Yu Gu Michihiro Yasunaga and Yu Su. 2024. Hipporag: Neurobiologically inspired long-term memory for large language models. In The Thirty-eighth Annual Conference on Neural Information Processing Systems."},{"key":"e_1_3_3_12_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.coling-main.580"},{"key":"e_1_3_3_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3447772"},{"key":"e_1_3_3_14_2","unstructured":"Yuntong Hu Zhihan Lei Zheng Zhang Bo Pan Chen Ling and Liang Zhao. 2024. 
Grag: Graph retrieval-augmented generation. arXiv preprint arXiv:2405.16506 (2024)."},{"key":"e_1_3_3_15_2","unstructured":"Yucheng Hu and Yuxing Lu. 2024. RAG and RAU: A survey on retrieval-augmented language model in natural language processing. arXiv preprint arXiv:2404.19543 (2024)."},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3383123"},{"key":"e_1_3_3_17_2","unstructured":"Yizheng Huang and Jimmy Huang. 2024. A survey on retrieval-augmented text generation for large language models. arXiv preprint arXiv:2404.10981 (2024)."},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2021.3070843"},{"key":"e_1_3_3_19_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2023.findings-emnlp.123"},{"key":"e_1_3_3_20_2","doi-asserted-by":"crossref","unstructured":"Zhengbao Jiang Frank F. Xu Luyu Gao Zhiqing Sun Qian Liu Jane Dwivedi-Yu Yiming Yang Jamie Callan and Graham Neubig. 2023. Active retrieval augmented generation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 7969\u20137992.","DOI":"10.18653\/v1\/2023.emnlp-main.495"},{"key":"e_1_3_3_21_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2024.acl-long.562"},{"key":"e_1_3_3_22_2","volume-title":"Proceedings of the North American Chapter of the Association for Computational Linguistics","author":"Khashabi Daniel","year":"2018","unstructured":"Daniel Khashabi, Snigdha Chaturvedi, Michael Roth, Shyam Upadhyay, and Dan Roth. 2018. Looking beyond the surface:A challenge set for reading comprehension over multiple sentences. In Proceedings of the North American Chapter of the Association for Computational Linguistics."},{"key":"e_1_3_3_23_2","unstructured":"Xiangyang Li Kuicai Dong Yi Quan Lee Wei Xia Hao Zhang Xinyi Dai Yasheng Wang and Ruiming Tang. 2024. Coir: A comprehensive benchmark for code information retrieval models. 
arXiv preprint arXiv:2407.02883 (2024)."},{"key":"e_1_3_3_24_2","unstructured":"Xi Victoria Lin Xilun Chen Mingda Chen Weijia Shi Maria Lomeli Richard James Pedro Rodriguez Jacob Kahn Gergely Szilvasy Mike Lewis et\u00a0al. 2023. RA-DIT: Retrieval-augmented dual instruction tuning. In The Twelfth International Conference on Learning Representations."},{"key":"e_1_3_3_25_2","unstructured":"Xinbei Ma Yeyun Gong Pengcheng He Hai Zhao and Nan Duan. 2023. Query rewriting in retrieval-augmented large language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 5303\u20135315."},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.316"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","unstructured":"Shervin Minaee Tomas Mikolov Narjes Nikzad Meysam Chenaghlu Richard Socher Xavier Amatriain and Jianfeng Gao. 2024. Large language models: A survey. 10.48550\/arXiv.2402.06196","DOI":"10.48550\/arXiv.2402.06196"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3597503.3639187"},{"key":"e_1_3_3_29_2","doi-asserted-by":"crossref","unstructured":"James Jie Pan Jianguo Wang and Guoliang Li. 2024. Survey of vector database management systems. The VLDB Journal 33 5 (2024) 1591\u20131615.","DOI":"10.1007\/s00778-024-00864-x"},{"key":"e_1_3_3_30_2","unstructured":"Boci Peng Yun Zhu Yongchao Liu Xiaohe Bo Haizhou Shi Chuntao Hong Yan Zhang and Siliang Tang. 2024. Graph retrieval-augmented generation: A survey. arXiv preprint arXiv:2408.08921 (2024)."},{"key":"e_1_3_3_31_2","doi-asserted-by":"crossref","unstructured":"Ciyuan Peng Feng Xia Mehdi Naseriparsa and Francesco Osborne. 2023. Knowledge graphs: Opportunities and challenges. 
Artificial Intelligence Review 56 11 (2023) 13071\u201313102.","DOI":"10.1007\/s10462-023-10465-9"},{"key":"e_1_3_3_32_2","doi-asserted-by":"crossref","unstructured":"Wenjun Peng Guiyang Li Yue Jiang Zilong Wang Dan Ou Xiaoyi Zeng Derong Xu Tong Xu and Enhong Chen. 2024. Large language model based long-tail query rewriting in taobao search. In Companion Proceedings of the ACM Web Conference 2024. 20\u201328.","DOI":"10.1145\/3589335.3648298"},{"key":"e_1_3_3_33_2","unstructured":"Libo Qin Qiguang Chen Xiachong Feng Yang Wu Yongheng Zhang Yinghui Li Min Li Wanxiang Che and Philip S. Yu. 2024. Large language models meet nlp: A survey. arXiv preprint arXiv:2405.12819 (2024)."},{"key":"e_1_3_3_34_2","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2024.3365742"},{"key":"e_1_3_3_35_2","doi-asserted-by":"crossref","unstructured":"Matthew Renze and Erhan Guven. 2024. Self-reflection in LLM agents: Effects on problem-solving performance. arXiv preprint arXiv:2405.06682 (2024).","DOI":"10.1109\/FLLM63129.2024.10852426"},{"key":"e_1_3_3_36_2","doi-asserted-by":"crossref","unstructured":"Bhaskarjit Sarmah Dhagash Mehta Benika Hall Rohan Rao Sunil Patel and Stefano Pasquali. 2024. Hybridrag: Integrating knowledge graphs and vector retrieval augmented generation for efficient information extraction. In Proceedings of the 5th ACM International Conference on AI in Finance. 608\u2013616.","DOI":"10.1145\/3677052.3698671"},{"key":"e_1_3_3_37_2","unstructured":"Milena Trajanoska Riste Stojanov and Dimitar Trajanov. 2023. Enhancing knowledge graph construction using large language models. arXiv preprint arXiv:2305.04676 (2023)."},{"key":"e_1_3_3_38_2","doi-asserted-by":"crossref","first-page":"355","DOI":"10.18653\/v1\/W19-8643","volume-title":"Proceedings of the 12th International Conference on Natural Language Generation","author":"Lee Chris Van Der","year":"2019","unstructured":"Chris Van Der Lee, Albert Gatt, Emiel Van Miltenburg, Sander Wubben, and Emiel Krahmer. 2019. 
Best practices for the human evaluation of automatically generated text. In Proceedings of the 12th International Conference on Natural Language Generation. 355\u2013368."},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.1145\/3583758"},{"key":"e_1_3_3_40_2","doi-asserted-by":"crossref","unstructured":"Yu Wang Nedim Lipka Ryan A. Rossi Alexa Siu Ruiyi Zhang and Tyler Derr. 2024. Knowledge graph prompting for multi-document question answering. In Proceedings of the AAAI Conference on Artificial Intelligence 38 (2024) 19206\u201319214.","DOI":"10.1609\/aaai.v38i17.29889"},{"key":"e_1_3_3_41_2","unstructured":"Shangyu Wu Ying Xiong Yufei Cui Haolun Wu Can Chen Ye Yuan Lianming Huang Xue Liu Tei-Wei Kuo Nan Guan et\u00a0al. 2024. Retrieval-augmented generation for natural language processing: A survey. arXiv preprint arXiv:2407.13193 (2024)."},{"key":"e_1_3_3_42_2","doi-asserted-by":"crossref","unstructured":"Zhilin Yang Peng Qi Saizheng Zhang Yoshua Bengio William W. Cohen Ruslan Salakhutdinov and Christopher D. Manning. 2018. HotpotQA: A dataset for diverse explainable multi-hop question answering. In Conference on Empirical Methods in Natural Language Processing (EMNLP).","DOI":"10.18653\/v1\/D18-1259"},{"key":"e_1_3_3_43_2","doi-asserted-by":"crossref","unstructured":"Liang Yao Jiazhen Peng Chengsheng Mao and Yuan Luo. 2025. Exploring large language models for knowledge graph completion. In ICASSP 2025-2025 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP). IEEE 1\u20135.","DOI":"10.1109\/ICASSP49660.2025.10889242"},{"key":"e_1_3_3_44_2","doi-asserted-by":"crossref","unstructured":"Hao Yu Aoran Gan Kai Zhang Shiwei Tong Qi Liu and Zhaofeng Liu. 2024. Evaluation of retrieval-augmented generation: A survey. In CCF Conference on Big Data. 
Springer 102\u2013120.","DOI":"10.1007\/978-981-96-1024-2_8"},{"key":"e_1_3_3_45_2","unstructured":"Penghao Zhao Hailin Zhang Qinhan Yu Zhengren Wang Yunteng Geng Fangcheng Fu Ling Yang Wentao Zhang Jie Jiang and Bin Cui. 2024. Retrieval-augmented generation for ai-generated content: A survey. arXiv preprint arXiv:2402.19473 (2024)."},{"key":"e_1_3_3_46_2","unstructured":"Wayne Xin Zhao Kun Zhou Junyi Li Tianyi Tang Xiaolei Wang Yupeng Hou Yingqian Min Beichen Zhang Junjie Zhang Zican Dong Yifan Du Chen Yang Yushuo Chen Zhipeng Chen Jinhao Jiang Ruiyang Ren Yifan Li Xinyu Tang Zikang Liu Peiyu Liu Jian-Yun Nie and Ji-Rong Wen. 2024. A Survey of Large Language Models. arxiv:2303.18223. Retrieved from https:\/\/arxiv.org\/abs\/2303.18223"},{"key":"e_1_3_3_47_2","first-page":"46595","article-title":"Judging llm-as-a-judge with mt-bench and chatbot arena","volume":"36","author":"Zheng Lianmin","year":"2023","unstructured":"Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, et\u00a0al. 2023. Judging llm-as-a-judge with mt-bench and chatbot arena. Advances in Neural Information Processing Systems 36 (2023), 46595\u201346623.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_48_2","doi-asserted-by":"crossref","unstructured":"Ming Zhong Yang Liu Da Yin Yuning Mao Yizhu Jiao Pengfei Liu Chenguang Zhu Heng Ji and Jiawei Han. 2022. Towards a unified multi-dimensional evaluator for text generation. arXiv:2210.07197. 
Retrieved from https:\/\/arxiv.org\/abs\/2210.07197","DOI":"10.18653\/v1\/2022.emnlp-main.131"},{"key":"e_1_3_3_49_2","doi-asserted-by":"publisher","DOI":"10.1088\/1742-6596\/1487\/1\/012016"}],"container-title":["ACM Transactions on Design Automation of Electronic Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3731680","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,22]],"date-time":"2025-10-22T12:22:15Z","timestamp":1761135735000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3731680"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,21]]},"references-count":48,"journal-issue":{"issue":"6","published-print":{"date-parts":[[2025,11,30]]}},"alternative-id":["10.1145\/3731680"],"URL":"https:\/\/doi.org\/10.1145\/3731680","relation":{},"ISSN":["1084-4309","1557-7309"],"issn-type":[{"value":"1084-4309","type":"print"},{"value":"1557-7309","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,21]]},"assertion":[{"value":"2024-09-30","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-04-16","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}