{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,24]],"date-time":"2025-09-24T00:14:50Z","timestamp":1758672890844,"version":"3.44.0"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:p>The remarkable success of Large Language Models (LLMs) across diverse tasks has driven the research community to extend their capabilities to molecular applications. However, most molecular LLMs employ adapter-based architectures that fail to equally integrate molecule and text modalities and lack explicit supervision signals for the molecular modality. To address these issues, we introduce UniMoT, a Unified Molecule-Text LLM adopting a tokenizer-based architecture that expands the vocabulary of LLMs with molecule tokens. Specifically, we introduce a Vector Quantization-driven tokenizer that incorporates a Q-Former to bridge the modality gap between molecule and text. This tokenizer transforms molecular structures into sequences of tokens exhibiting causal dependency, thereby encapsulating both high-level molecular features and textual information. Equipped with this tokenizer, UniMoT unifies molecule and text modalities under a shared token representation and an autoregressive training paradigm. This enables the model to process molecular structures as a distinct linguistic system and generate them in textual form. Through a four-stage training scheme, UniMoT functions as a multi-modal generalist capable of performing both molecule-to-text and text-to-molecule tasks. Extensive experiments demonstrate that UniMoT achieves state-of-the-art performance across a wide range of molecule comprehension and generation tasks.<\/jats:p>","DOI":"10.24963\/ijcai.2025\/1023","type":"proceedings-article","created":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:10:40Z","timestamp":1758269440000},"page":"9205-9213","source":"Crossref","is-referenced-by-count":0,"title":["Unified Molecule-Text Language Model with Discrete Token Representation"],"prefix":"10.24963","author":[{"given":"Shuhan","family":"Guo","sequence":"first","affiliation":[{"name":"Tsinghua University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yatao","family":"Bian","sequence":"additional","affiliation":[{"name":"Tencent AI Lab"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ruibing","family":"Wang","sequence":"additional","affiliation":[{"name":"Northwestern Polytechnical University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nan","family":"Yin","sequence":"additional","affiliation":[{"name":"Hong Kong University of Science and Technology"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhen","family":"Wang","sequence":"additional","affiliation":[{"name":"Northwestern Polytechnical University"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Quanming","family":"Yao","sequence":"additional","affiliation":[{"name":"Tsinghua University"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"10584","event":{"number":"34","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2025","name":"Thirty-Fourth International Joint Conference on Artificial Intelligence {IJCAI-25}","start":{"date-parts":[[2025,8,16]]},"theme":"Artificial Intelligence","location":"Montreal, Canada","end":{"date-parts":[[2025,8,22]]}},"container-title":["Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T11:35:53Z","timestamp":1758627353000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2025\/1023"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2025,9]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2025\/1023","relation":{},"subject":[],"published":{"date-parts":[[2025,9]]}}}