{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T03:29:13Z","timestamp":1773804553090,"version":"3.50.1"},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"34","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"<jats:p>Text-attributed heterogeneous graphs (TAHGs), characterized by nodes interconnected through diverse relationships and enriched with textual descriptions, are prevalent in numerous real-world applications. Recent advancements in integrating pre-trained language models (PLMs) and large language models (LLMs) with heterogeneous graph neural networks (HGNNs) have enhanced learning on TAHGs. However, the absence of standardized benchmark datasets tailored to TAHGs has impeded further progress. To bridge this gap, we propose the Text-attributed Heterogeneous Graphs Benchmark (THGB), a comprehensive collection of heterogeneous graphs from diverse domains, with each node enriched by relevant text attributes. Alongside dataset construction, we conduct extensive benchmark experiments using various graph learning methods, including GNN, PLM-GNN, and LLM-GNN approaches, for node classification and link prediction tasks. We evaluated model performance across supervised, few-shot, and zero-shot learning scenarios to assess their ability to leverage limited and unseen data. Our experiments highlight THGB's potential to improve the integration of heterogeneous structural and textual information. By providing curated datasets, robust evaluation protocols, and baseline implementations, THGB introduces a standardized benchmark and solid groundwork for TAHGs research.<\/jats:p>","DOI":"10.1609\/aaai.v40i34.40133","type":"journal-article","created":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T02:28:22Z","timestamp":1773800902000},"page":"28973-28981","source":"Crossref","is-referenced-by-count":0,"title":["THGB: A Comprehensive Benchmark for Text-attributed Heterogeneous Graphs"],"prefix":"10.1609","volume":"40","author":[{"given":"Lixin","family":"Zhou","sequence":"first","affiliation":[]},{"given":"Zemin","family":"Liu","sequence":"additional","affiliation":[]},{"given":"Yuan","family":"Fang","sequence":"additional","affiliation":[]},{"given":"Dan","family":"Niu","sequence":"additional","affiliation":[]},{"given":"Jing","family":"Ying","sequence":"additional","affiliation":[]}],"member":"9382","published-online":{"date-parts":[[2026,3,14]]},"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/40133\/44094","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/40133\/44094","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T02:28:22Z","timestamp":1773800902000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/40133"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,14]]},"references-count":0,"journal-issue":{"issue":"34","published-online":{"date-parts":[[2026,3,17]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v40i34.40133","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2026,3,14]]}}}