{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T03:37:01Z","timestamp":1773805021038,"version":"3.50.1"},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"38","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"<jats:p>Large language models (LLMs) concentrate substantial knowledge in specialized domains due to extensive pretraining and instruction tuning, and they are now central to commercial and scientific practice. Yet access is usually limited to costly, rate-limited interfaces, which motivates methods that can extract targeted domain knowledge with minimal querying effort. A further challenge is that the target domain may be unknown in advance, so naive or generic prompts waste queries and fail to expose the underlying concepts and relations that structure the domain.\nIn this work, we introduce a query-efficient approach for domain-specific knowledge stealing from black-box language models. Rather than issuing random questions or generic templates, our framework performs self-directed exploration that lets the model find the direction and mine domain knowledge by itself. Starting from a small and diverse seed, it discovers salient domain entities and induces their relations through structured question families that elicit definitional, functional, and compositional information. A feedback-driven controller analyzes the errors and uncertainty of the extracted surrogate model and uses this signal to refine subsequent queries, all without relying on prior domain knowledge or external resources.\nWe evaluate the method in two expert-centric settings, medicine and finance, and observe consistently better performance while requiring significantly fewer queries.<\/jats:p>","DOI":"10.1609\/aaai.v40i38.40456","type":"journal-article","created":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T02:52:30Z","timestamp":1773802350000},"page":"31870-31878","source":"Crossref","is-referenced-by-count":0,"title":["Query-Efficient Domain Knowledge Stealing Against Large Language Models"],"prefix":"10.1609","volume":"40","author":[{"given":"Zhengao","family":"Li","sequence":"first","affiliation":[]},{"given":"Xiaopeng","family":"Yuan","sequence":"additional","affiliation":[]},{"given":"Bolin","family":"Shen","sequence":"additional","affiliation":[]},{"given":"Kien","family":"Le","sequence":"additional","affiliation":[]},{"given":"Haohan","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Xugui","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Shangqian","family":"Gao","sequence":"additional","affiliation":[]},{"given":"Yushun","family":"Dong","sequence":"additional","affiliation":[]}],"member":"9382","published-online":{"date-parts":[[2026,3,14]]},"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/40456\/44417","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/40456\/44417","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T02:52:30Z","timestamp":1773802350000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/40456"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,14]]},"references-count":0,"journal-issue":{"issue":"38","published-online":{"date-parts":[[2026,3,17]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v40i38.40456","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2026,3,14]]}}}