{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,18]],"date-time":"2026-03-18T02:37:54Z","timestamp":1773801474955,"version":"3.50.1"},"reference-count":0,"publisher":"Association for the Advancement of Artificial Intelligence (AAAI)","issue":"8","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AAAI"],"abstract":"<jats:p>Recent advances in vision-language-action (VLA) models\nhave demonstrated impressive generalization for robotic manipulation.\nHowever, these models often operate by directly mapping visual and linguistic inputs to subsequent actions, lacking intermediate task planning, along with failure detection and recovery ability.\nThese limitations prevent them from effectively decomposing complex tasks, recognizing problems, and correcting erroneous actions, ultimately resulting in complete task failure. This significantly hinders their ability to perform long-horizon tasks and generalization ability.\nTo this end, we introduce TCoT: Trajectory Chain-of-Thought, a unified VLA framework that enhances this direct mapping with trajectory planning as well as failure detection and recovery.\nTCoT leverages hierarchy trajectories as a precise and compact representation of CoT reasoning for manipulation: global planning provides a high-level, goal-oriented trajectory to guide the robot toward its task objective, while local planning focuses on real-time adjustments to address dynamic changes.\nMoreover, we designed the Global-Local Switching Recovery algorithm that detects and effectively recovers from failures. \nExperimental results reveal that TCoT surpasses the state-of-the-art methods across both real and simulated scenarios and exhibits superior generalization capabilities.<\/jats:p>","DOI":"10.1609\/aaai.v40i8.37577","type":"journal-article","created":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T23:28:16Z","timestamp":1773790096000},"page":"6486-6494","source":"Crossref","is-referenced-by-count":0,"title":["TCoT: Trajectory Chain-of-Thoughts for Robotic Manipulation with Failure Recovery in Vision-Language-Action Model"],"prefix":"10.1609","volume":"40","author":[{"given":"Xiang","family":"Li","sequence":"first","affiliation":[]},{"given":"Ya-Li","family":"Li","sequence":"additional","affiliation":[]},{"given":"Yuan","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Huaqiang","family":"Wang","sequence":"additional","affiliation":[]},{"given":"Shengjin","family":"Wang","sequence":"additional","affiliation":[]}],"member":"9382","published-online":{"date-parts":[[2026,3,14]]},"container-title":["Proceedings of the AAAI Conference on Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/37577\/41539","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/download\/37577\/41539","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,3,17]],"date-time":"2026-03-17T23:28:16Z","timestamp":1773790096000},"score":1,"resource":{"primary":{"URL":"https:\/\/ojs.aaai.org\/index.php\/AAAI\/article\/view\/37577"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,14]]},"references-count":0,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2026,3,17]]}},"URL":"https:\/\/doi.org\/10.1609\/aaai.v40i8.37577","relation":{},"ISSN":["2374-3468","2159-5399"],"issn-type":[{"value":"2374-3468","type":"electronic"},{"value":"2159-5399","type":"print"}],"subject":[],"published":{"date-parts":[[2026,3,14]]}}}