{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,5]],"date-time":"2026-07-05T21:54:36Z","timestamp":1783288476966,"version":"3.54.6"},"reference-count":99,"publisher":"Association for Computing Machinery (ACM)","issue":"FSE","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. ACM Softw. Eng."],"published-print":{"date-parts":[[2025,6,19]]},"abstract":"<jats:p>The rapid advancement of generative AI and multi-modal foundation models has shown significant potential in advancing robotic manipulation. Vision-language-action (VLA) models, in particular, have emerged as a promising approach for visuomotor control by leveraging large-scale vision-language data and robot demonstrations. However, current VLA models are typically evaluated using a limited set of hand-crafted scenes, leaving their general performance and robustness in diverse scenarios largely unexplored. To address this gap, we present VLATest, a fuzzing framework designed to generate robotic manipulation scenes for testing VLA models. Based on VLATest, we conducted an empirical study to assess the performance of seven representative VLA models. Our study results revealed that current VLA models lack the robustness necessary for practical deployment. Additionally, we investigated the impact of various factors, including the number of confounding objects, lighting conditions, camera poses, unseen objects, and task instruction mutations, on the VLA model's performance. 
Our findings highlight the limitations of existing VLA models, emphasizing the need for further research to develop reliable and trustworthy VLA applications.<\/jats:p>","DOI":"10.1145\/3729343","type":"journal-article","created":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:15:34Z","timestamp":1750346134000},"page":"1615-1638","source":"Crossref","is-referenced-by-count":9,"title":["VLATest: Testing and Evaluating Vision-Language-Action Models for Robotic Manipulation"],"prefix":"10.1145","volume":"2","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4559-5426","authenticated-orcid":false,"given":"Zhijie","family":"Wang","sequence":"first","affiliation":[{"name":"University of Alberta, Edmonton, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-9542-4858","authenticated-orcid":false,"given":"Zhehua","family":"Zhou","sequence":"additional","affiliation":[{"name":"University of Alberta, Edmonton, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-7093-9781","authenticated-orcid":false,"given":"Jiayang","family":"Song","sequence":"additional","affiliation":[{"name":"University of Alberta, Edmonton, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3666-4020","authenticated-orcid":false,"given":"Yuheng","family":"Huang","sequence":"additional","affiliation":[{"name":"The University of Tokyo, Tokyo, Japan"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5933-254X","authenticated-orcid":false,"given":"Zhan","family":"Shu","sequence":"additional","affiliation":[{"name":"University of Alberta, Edmonton, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8621-2420","authenticated-orcid":false,"given":"Lei","family":"Ma","sequence":"additional","affiliation":[{"name":"The University of Tokyo, Tokyo, 
Japan"},{"name":"University of Alberta, Edmonton, Canada"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"320","published-online":{"date-parts":[[2025,6,19]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3644388","article-title":"DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep Neural Networks","volume":"33","author":"Aghababaeyan Zohreh","year":"2024","unstructured":"Zohreh Aghababaeyan, Manel Abdellatif, Mahboubeh Dadkhah, and Lionel Briand. 2024. DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep Neural Networks. ACM Transactions on Software Engineering and Methodology, 33, 6 (2024), 1\u201329.","journal-title":"ACM Transactions on Software Engineering and Methodology"},{"key":"e_1_2_1_2_1","doi-asserted-by":"crossref","first-page":"46","DOI":"10.3390\/jimaging9020046","article-title":"Data Augmentation in Classification and Segmentation: A Survey and New Strategies","volume":"9","author":"Alomar Khaled","year":"2023","unstructured":"Khaled Alomar, Halil Ibrahim Aysel, and Xiaohao Cai. 2023. Data Augmentation in Classification and Segmentation: A Survey and New Strategies. Journal of Imaging, 9, 2 (2023), 46.","journal-title":"Journal of Imaging"},{"key":"e_1_2_1_3_1","volume-title":"Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1264\u20131274","author":"Ayerdi Jon","year":"2021","unstructured":"Jon Ayerdi and Valerio Terragni. 2021. Generating metamorphic relations for cyber-physical systems with genetic programming: an industrial case study. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 1264\u20131274."},{"key":"e_1_2_1_4_1","unstructured":"Yuntao Bai Andy Jones and Kamal Ndousse. 2022. 
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback. arXiv preprint arXiv:2204.05862."},{"key":"e_1_2_1_5_1","volume-title":"2024 IEEE International Conference on Robotics and Automation (ICRA). 4788\u20134795","author":"Bharadhwaj Homanga","year":"2024","unstructured":"Homanga Bharadhwaj, Jay Vakil, Mohit Sharma, Abhinav Gupta, Shubham Tulsiani, and Vikash Kumar. 2024. RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking. In 2024 IEEE International Conference on Robotics and Automation (ICRA). 4788\u20134795."},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3631970","article-title":"Testing of Deep Reinforcement Learning Agents with Surrogate Models","volume":"33","author":"Biagiola Matteo","year":"2024","unstructured":"Matteo Biagiola and Paolo Tonella. 2024. Testing of Deep Reinforcement Learning Agents with Surrogate Models. ACM Transactions on Software Engineering and Methodology, 33, 3 (2024), 1\u201333.","journal-title":"ACM Transactions on Software Engineering and Methodology"},{"key":"e_1_2_1_7_1","unstructured":"Anthony Brohan Noah Brown and Justice Carbajal. 2022. RT-1: Robotics Transformer for Real-World Control at Scale. arXiv preprint arXiv:2212.06817."},{"key":"e_1_2_1_8_1","unstructured":"Anthony Brohan Noah Brown and Justice Carbajal. 2023. RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control. arXiv preprint arXiv:2307.15818."},{"key":"e_1_2_1_9_1","volume-title":"2015 international conference on advanced robotics (ICAR). 510\u2013517","author":"Calli Berk","year":"2015","unstructured":"Berk Calli, Arjun Singh, Aaron Walsman, Siddhartha Srinivasa, Pieter Abbeel, and Aaron M Dollar. 2015. The YCB object and Model set: Towards common benchmarks for manipulation research. In 2015 international conference on advanced robotics (ICAR). 
510\u2013517."},{"key":"e_1_2_1_10_1","doi-asserted-by":"crossref","first-page":"3675","DOI":"10.1109\/TSE.2023.3267446","article-title":"MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation","volume":"49","author":"Cassano Federico","year":"2023","unstructured":"Federico Cassano, John Gouwar, and Daniel Nguyen. 2023. MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation. IEEE Transactions on Software Engineering, 49, 7 (2023), 3675\u20133691.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_11_1","volume-title":"The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=KuPixIqPiq","author":"Chen Xinyun","year":"2024","unstructured":"Xinyun Chen, Maxwell Lin, Nathanael Sch\u00e4rli, and Denny Zhou. 2024. Teaching Large Language Models to Self-Debug. In The Twelfth International Conference on Learning Representations. https:\/\/openreview.net\/forum?id=KuPixIqPiq"},{"key":"e_1_2_1_12_1","volume-title":"Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 14\u201326","author":"Chen Yuqi","year":"2020","unstructured":"Yuqi Chen, Bohan Xuan, Christopher M Poskitt, Jun Sun, and Fan Zhang. 2020. Active Fuzzing for Testing and Securing Cyber-Physical Systems. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 14\u201326."},{"key":"e_1_2_1_13_1","volume-title":"Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. 488\u2013500","author":"Cheng Mingfei","year":"2023","unstructured":"Mingfei Cheng, Yuan Zhou, and Xiaofei Xie. 2023. BehAVExplor: Behavior Diversity Guided Testing for Autonomous Driving Systems. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. 
488\u2013500."},{"key":"e_1_2_1_14_1","article-title":"Test Input Prioritization for Machine Learning Classifiers","author":"Dang Xueqi","year":"2024","unstructured":"Xueqi Dang, Yinghua Li, Mike Papadakis, Jacques Klein, Tegawend\u00e9 F Bissyand\u00e9, and Yves Le Traon. 2024. Test Input Prioritization for Machine Learning Classifiers. IEEE Transactions on Software Engineering.","journal-title":"IEEE Transactions on Software Engineering."},{"key":"e_1_2_1_15_1","volume-title":"Transforming Management Using Artificial Intelligence Techniques","author":"Dhaliwal Amandeep","unstructured":"Amandeep Dhaliwal. 2020. The Rise of Automation and Robotics in Warehouse Management. In Transforming Management Using Artificial Intelligence Techniques. CRC Press, 63\u201372."},{"key":"e_1_2_1_16_1","volume-title":"Task and Motion Planning with Large Language Models for Object Rearrangement. In 2023 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS). 2086\u20132092","author":"Ding Yan","year":"2023","unstructured":"Yan Ding, Xiaohan Zhang, Chris Paxton, and Shiqi Zhang. 2023. Task and Motion Planning with Large Language Models for Object Rearrangement. In 2023 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS). 2086\u20132092."},{"key":"e_1_2_1_17_1","volume-title":"International Conference on Learning Representations.","author":"Dosovitskiy Alexey","year":"2020","unstructured":"Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, and Sylvain Gelly. 2020. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations."},{"key":"e_1_2_1_18_1","volume-title":"PaLM-E: An Embodied Multimodal Language Model. 
In Proceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research","volume":"8488","author":"Driess Danny","unstructured":"Danny Driess, Fei Xia, and Mehdi S. M. Sajjadi. 2023. PaLM-E: An Embodied Multimodal Language Model. In Proceedings of the 40th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 202). PMLR, 8469\u20138488."},{"key":"e_1_2_1_19_1","volume-title":"Proceedings of The 2nd Conference on Lifelong Learning Agents (Proceedings of Machine Learning Research","volume":"136","author":"Du Yuqing","year":"2023","unstructured":"Yuqing Du, Ksenia Konyushkova, Misha Denil, Akhil Raju, Jessica Landon, Felix Hill, Nando de Freitas, and Serkan Cabi. 2023. Vision-Language Models as Success Detectors. In Proceedings of The 2nd Conference on Lifelong Learning Agents (Proceedings of Machine Learning Research, Vol. 232). PMLR, 120\u2013136."},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","first-page":"135","DOI":"10.3390\/app12010135","article-title":"Advanced applications of industrial robotics: New trends and possibilities","volume":"12","author":"Dzedzickis Andrius","year":"2021","unstructured":"Andrius Dzedzickis, Jurga Suba\u010di\u016bt\u0117-\u017demaitien\u0117, Ernestas \u0160utinys, Urt\u0117 Samukait\u0117-Bubnien\u0117, and Vytautas Bu\u010dinskas. 2021. Advanced applications of industrial robotics: New trends and possibilities. Applied Sciences, 12, 1 (2021), 135.","journal-title":"Applied Sciences"},{"key":"e_1_2_1_21_1","doi-asserted-by":"crossref","first-page":"1473","DOI":"10.1162\/tacl_a_00529","article-title":"FaithDial: A Faithful Benchmark for Information-Seeking Dialogue","volume":"10","author":"Dziri Nouha","year":"2022","unstructured":"Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, Edoardo M Ponti, and Siva Reddy. 2022. FaithDial: A Faithful Benchmark for Information-Seeking Dialogue. 
Transactions of the Association for Computational Linguistics, 10 (2022), 1473\u20131490.","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"e_1_2_1_22_1","volume-title":"Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 177\u2013188","author":"Feng Yang","year":"2020","unstructured":"Yang Feng, Qingkai Shi, Xinyu Gao, Jun Wan, Chunrong Fang, and Zhenyu Chen. 2020. DeepGini: prioritizing massive tests to enhance the robustness of deep neural networks. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 177\u2013188."},{"key":"e_1_2_1_23_1","volume-title":"Proceedings of the 44th International Conference on Software Engineering. 73\u201385","author":"Gao Xinyu","year":"2022","unstructured":"Xinyu Gao, Yang Feng, Yining Yin, Zixi Liu, Zhenyu Chen, and Baowen Xu. 2022. Adaptive test selection for deep neural networks. In Proceedings of the 44th International Conference on Software Engineering. 73\u201385."},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering. 1\u201313","author":"Gao Xinyu","year":"2024","unstructured":"Xinyu Gao, Zhijie Wang, Yang Feng, Lei Ma, Zhenyu Chen, and Baowen Xu. 2024. MultiTest: Physical-Aware Object Insertion for Testing Multi-sensor Fusion Perception Systems. In Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering. 1\u201313."},{"key":"e_1_2_1_25_1","doi-asserted-by":"crossref","unstructured":"Ruchi Goel and Pooja Gupta. 2020. Robotics and Industry 4.0. Springer International Publishing Cham. 157\u2013169.","DOI":"10.1007\/978-3-030-14544-6_9"},{"key":"e_1_2_1_26_1","unstructured":"Jiayuan Gu Fanbo Xiang and Xuanlin Li. 2023. ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills. 
arXiv preprint arXiv:2302.04659."},{"key":"e_1_2_1_27_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3583136","article-title":"Recent Trends in Task and Motion Planning for Robotics: A Survey","volume":"55","author":"Guo Huihui","year":"2023","unstructured":"Huihui Guo, Fan Wu, Yunchuan Qin, Ruihui Li, Keqin Li, and Kenli Li. 2023. Recent Trends in Task and Motion Planning for Robotics: A Survey. Comput. Surveys, 55, 13s (2023), 1\u201336.","journal-title":"Comput. Surveys"},{"key":"e_1_2_1_28_1","volume-title":"CPS-Based Self-Adaptive Collaborative Control for Smart Production-Logistics Systems","author":"Guo Zhengang","year":"2020","unstructured":"Zhengang Guo, Yingfeng Zhang, Xibin Zhao, and Xiaoyu Song. 2020. CPS-Based Self-Adaptive Collaborative Control for Smart Production-Logistics Systems. IEEE transactions on cybernetics, 51, 1 (2020), 188\u2013198."},{"key":"e_1_2_1_29_1","volume-title":"Tim Finin, Primal Pappachan, and Roberto Yus.","author":"Hamid Aamir","year":"2023","unstructured":"Aamir Hamid, Hemanth Reddy Samidi, Tim Finin, Primal Pappachan, and Roberto Yus. 2023. GenAIPABench: A Benchmark for Generative AI-based Privacy Assistants. arXiv preprint arXiv:2309.05138."},{"key":"e_1_2_1_30_1","doi-asserted-by":"crossref","first-page":"3762","DOI":"10.3390\/s23073762","article-title":"A Survey on Deep Reinforcement Learning Algorithms for Robotic Manipulation","volume":"23","author":"Han Dong","year":"2023","unstructured":"Dong Han, Beni Mulyana, Vladimir Stankovic, and Samuel Cheng. 2023. A Survey on Deep Reinforcement Learning Algorithms for Robotic Manipulation. Sensors, 23, 7 (2023), 3762.","journal-title":"Sensors"},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering. 961\u2013973","author":"He Pinjia","year":"2020","unstructured":"Pinjia He, Clara Meister, and Zhendong Su. 2020. Structure-Invariant Testing for Machine Translation. 
In Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering. 961\u2013973."},{"key":"e_1_2_1_32_1","doi-asserted-by":"crossref","first-page":"398","DOI":"10.1109\/TASE.2021.3064065","article-title":"Sim2Real in Robotics and Automation: Applications and Challenges","volume":"18","author":"H\u00f6fer Sebastian","year":"2021","unstructured":"Sebastian H\u00f6fer, Kostas Bekris, and Ankur Handa. 2021. Sim2Real in Robotics and Automation: Applications and Challenges. IEEE Transactions on Automation Science and Engineering, 18, 2 (2021), 398\u2013400.","journal-title":"IEEE Transactions on Automation Science and Engineering"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","first-page":"47","DOI":"10.3390\/robotics10010047","article-title":"Service Robots in the Healthcare Sector","volume":"10","author":"Holland Jane","year":"2021","unstructured":"Jane Holland, Liz Kingston, Conor McCarthy, Eddie Armstrong, Peter O\u2019Dwyer, Fionn Merz, and Mark McConnell. 2021. Service Robots in the Healthcare Sector. Robotics, 10, 1 (2021), 47.","journal-title":"Robotics"},{"key":"e_1_2_1_34_1","volume-title":"2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). 1776\u20131787","author":"Hu Qiang","year":"2023","unstructured":"Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Mike Papadakis, Lei Ma, and Yves Le Traon. 2023. Aries: Efficient Testing of Deep Neural Networks via Labeling-Free Accuracy Estimation. In 2023 IEEE\/ACM 45th International Conference on Software Engineering (ICSE). 1776\u20131787."},{"key":"e_1_2_1_35_1","unstructured":"Zhisheng Hu Shengjian Guo Zhenyu Zhong and Kang Li. 2021. Coverage-based Scene Fuzzing for Virtual Autonomous Driving Testing. arXiv preprint arXiv:2106.00873."},{"key":"e_1_2_1_36_1","unstructured":"Siyuan Huang Zhengkai Jiang Hao Dong Yu Qiao Peng Gao and Hongsheng Li. 2023. Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model. 
arXiv preprint arXiv:2305.11176."},{"key":"e_1_2_1_37_1","unstructured":"Yuheng Huang Jiayang Song Qiang Hu Felix Juefei-Xu and Lei Ma. 2024. Active Testing of Large Language Model via Multi-Stage Sampling. arXiv preprint arXiv:2408.03573."},{"key":"e_1_2_1_38_1","doi-asserted-by":"crossref","first-page":"413","DOI":"10.1109\/TSE.2024.3519464","article-title":"Look Before You Leap: An Exploratory Study of Uncertainty Analysis for Large Language Models","volume":"51","author":"Huang Yuheng","year":"2025","unstructured":"Yuheng Huang, Jiayang Song, and Zhijie Wang. 2025. Look Before You Leap: An Exploratory Study of Uncertainty Analysis for Large Language Models. IEEE Transactions on Software Engineering, 51, 2 (2025), 413\u2013429.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_39_1","volume-title":"The Twelfth International Conference on Learning Representations.","author":"Jimenez Carlos E","year":"2024","unstructured":"Carlos E Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, and Karthik Narasimhan. 2024. SWE-bench: Can Language Models Resolve Real-World GitHub Issues? In The Twelfth International Conference on Learning Representations."},{"key":"e_1_2_1_40_1","unstructured":"Moo Jin Kim Karl Pertsch and Siddharth Karamcheti. 2024. OpenVLA: An Open-Source Vision-Language-Action Model. arXiv preprint arXiv:2406.09246."},{"key":"e_1_2_1_41_1","doi-asserted-by":"crossref","first-page":"8","DOI":"10.3390\/technologies9010008","article-title":"A Survey of Robots in Healthcare","volume":"9","author":"Kyrarini Maria","year":"2021","unstructured":"Maria Kyrarini, Fotios Lygerakis, Akilesh Rajavenkatanarayanan, Christos Sevastopoulos, Harish Ram Nambiappan, Kodur Krishna Chaitanya, Ashwin Ramesh Babu, Joanne Mathew, and Fillia Makedon. 2021. A Survey of Robots in Healthcare. 
Technologies, 9, 1 (2021), 8.","journal-title":"Technologies"},{"key":"e_1_2_1_42_1","volume-title":"Cong Liu, Wei Yang, and Shiyi Wei.","author":"Lee Jaeseong","year":"2024","unstructured":"Jaeseong Lee, Simin Chen, Austin Mordahl, Cong Liu, Wei Yang, and Shiyi Wei. 2024. Automated Testing Linguistic Capabilities of NLP Models. ACM Transactions on Software Engineering and Methodology."},{"key":"e_1_2_1_43_1","volume-title":"Fuzzing for CPS Mutation Testing. In 2023 38th IEEE\/ACM International Conference on Automated Software Engineering (ASE). 1377\u20131389","author":"Lee Jaekwon","year":"2023","unstructured":"Jaekwon Lee, Enrico Vigan\u00f2, Oscar Cornejo, Fabrizio Pastore, and Lionel Briand. 2023. Fuzzing for CPS Mutation Testing. In 2023 38th IEEE\/ACM International Conference on Automated Software Engineering (ASE). 1377\u20131389."},{"key":"e_1_2_1_44_1","volume-title":"Evaluating Real-World Robot Manipulation Policies in Simulation. In 8th Annual Conference on Robot Learning.","author":"Li Xuanlin","year":"2024","unstructured":"Xuanlin Li, Kyle Hsu, and Jiayuan Gu. 2024. Evaluating Real-World Robot Manipulation Policies in Simulation. In 8th Annual Conference on Robot Learning."},{"key":"e_1_2_1_45_1","volume-title":"The Twelfth International Conference on Learning Representations.","author":"Li Xinghang","year":"2024","unstructured":"Xinghang Li, Minghuan Liu, and Hanbo Zhang. 2024. Vision-Language Foundation Models as Effective Robot Imitators. In The Twelfth International Conference on Learning Representations."},{"key":"e_1_2_1_46_1","volume-title":"LLaRA: Supercharging Robot Learning Data for Vision-Language Policy. In The Thirteenth International Conference on Learning Representations.","author":"Li Xiang","year":"2025","unstructured":"Xiang Li, Cristina Mata, and Jongwoo Park. 2025. LLaRA: Supercharging Robot Learning Data for Vision-Language Policy. 
In The Thirteenth International Conference on Learning Representations."},{"key":"e_1_2_1_47_1","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1007\/s10664-024-10520-1","article-title":"Prioritizing test cases for deep learning-based video classifiers","volume":"29","author":"Li Yinghua","year":"2024","unstructured":"Yinghua Li, Xueqi Dang, Lei Ma, Jacques Klein, and Tegawend\u00e9 F Bissyand\u00e9. 2024. Prioritizing test cases for deep learning-based video classifiers. Empirical Software Engineering, 29, 5 (2024), 111.","journal-title":"Empirical Software Engineering"},{"key":"e_1_2_1_48_1","volume-title":"2023 IEEE International Conference on Robotics and Automation (ICRA). 9493\u20139500","author":"Liang Jacky","year":"2023","unstructured":"Jacky Liang and Wenlong Huang. 2023. Code as Policies: Language Model Programs for Embodied Control. In 2023 IEEE International Conference on Robotics and Automation (ICRA). 9493\u20139500."},{"key":"e_1_2_1_49_1","article-title":"Holistic Evaluation of Language Models","author":"Liang Percy","year":"2023","unstructured":"Percy Liang, Rishi Bommasani, and Tony Lee. 2023. Holistic Evaluation of Language Models. Transactions on Machine Learning Research, issn:2835-8856","journal-title":"Transactions on Machine Learning Research, issn:2835-8856"},{"key":"e_1_2_1_50_1","volume-title":"Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 3214\u20133252","author":"Lin Stephanie","year":"2022","unstructured":"Stephanie Lin, Jacob Hilton, and Owain Evans. 2022. TruthfulQA: Measuring How Models Mimic Human Falsehoods. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 3214\u20133252."},{"key":"e_1_2_1_51_1","volume-title":"Visual Instruction Tuning. Advances in neural information processing systems, 36","author":"Liu Haotian","year":"2024","unstructured":"Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. 2024. Visual Instruction Tuning. 
Advances in neural information processing systems, 36 (2024)."},{"key":"e_1_2_1_52_1","volume-title":"Yuyao Wang, and Lingming Zhang.","author":"Liu Jiawei","year":"2024","unstructured":"Jiawei Liu, Chunqiu Steven Xia, Yuyao Wang, and Lingming Zhang. 2024. Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation. Advances in Neural Information Processing Systems, 36 (2024)."},{"key":"e_1_2_1_53_1","volume-title":"Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering. 372\u2013384","author":"Menghi Claudio","year":"2020","unstructured":"Claudio Menghi, Shiva Nejati, Lionel Briand, and Yago Isasi Parache. 2020. Approximation-Refinement Testing of Compute-Intensive Cyber-Physical Models: An Approach Based on System Identification. In Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering. 372\u2013384."},{"key":"e_1_2_1_54_1","first-page":"21199","article-title":"Uncertainty-aware Self-training for Few-shot Text Classification","author":"Mukherjee Subhabrata","year":"2020","unstructured":"Subhabrata Mukherjee and Ahmed Awadallah. 2020. Uncertainty-aware Self-training for Few-shot Text Classification. In Advances in Neural Information Processing Systems. 33, Curran Associates, Inc., 21199\u201321212.","journal-title":"Advances in Neural Information Processing Systems. 33, Curran Associates, Inc."},{"key":"e_1_2_1_55_1","volume-title":"A Mathematical Introduction to Robotic Manipulation","author":"Murray Richard M","unstructured":"Richard M Murray, Zexiang Li, and S Shankar Sastry. 2017. A Mathematical Introduction to Robotic Manipulation. CRC press."},{"key":"e_1_2_1_56_1","volume-title":"Proceedings of The 6th Conference on Robot Learning (Proceedings of Machine Learning Research","volume":"909","author":"Nair Suraj","year":"2023","unstructured":"Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, and Abhinav Gupta. 2023. 
R3M: A Universal Visual Representation for Robot Manipulation. In Proceedings of The 6th Conference on Robot Learning (Proceedings of Machine Learning Research, Vol. 205). PMLR, 892\u2013909."},{"key":"e_1_2_1_57_1","article-title":"DINOv2: Learning Robust Visual Features without Supervision","author":"Oquab Maxime","year":"2023","unstructured":"Maxime Oquab, Timoth\u00e9e Darcet, and Th\u00e9o Moutakanni. 2023. DINOv2: Learning Robust Visual Features without Supervision. Transactions on Machine Learning Research.","journal-title":"Transactions on Machine Learning Research."},{"key":"e_1_2_1_58_1","doi-asserted-by":"crossref","first-page":"3625","DOI":"10.3390\/s23073625","article-title":"Multi-Agent Deep Reinforcement Learning for Multi-Robot Applications: A Survey","volume":"23","author":"Orr James","year":"2023","unstructured":"James Orr and Ayan Dutta. 2023. Multi-Agent Deep Reinforcement Learning for Multi-Robot Applications: A Survey. Sensors, 23, 7 (2023), 3625.","journal-title":"Sensors"},{"key":"e_1_2_1_59_1","unstructured":"Abhishek Padalkar Acorn Pooley and Ajinkya Jain. 2023. Open X-Embodiment: Robotic Learning Datasets and RT-X Models. arXiv preprint arXiv:2310.08864."},{"key":"e_1_2_1_60_1","volume-title":"Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering.","author":"Pan Rangeet","year":"2024","unstructured":"Rangeet Pan and Ali Reza Ibrahimzada. 2024. Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code. In Proceedings of the IEEE\/ACM 46th International Conference on Software Engineering."},{"key":"e_1_2_1_61_1","volume-title":"Proceedings of the AAAI conference on artificial intelligence. 32","author":"Perez Ethan","year":"2018","unstructured":"Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. 2018. FiLM: Visual Reasoning with a General Conditioning Layer. In Proceedings of the AAAI conference on artificial intelligence. 
32."},{"key":"e_1_2_1_62_1","unstructured":"Abu Rayhan. 2023. Artificial intelligence in robotics: From automation to autonomous systems."},{"key":"e_1_2_1_63_1","volume-title":"International journal of computer vision, 115","author":"Russakovsky Olga","year":"2015","unstructured":"Olga Russakovsky, Jia Deng, and Hao Su. 2015. ImageNet Large Scale Visual Recognition Challenge. International journal of computer vision, 115 (2015), 211\u2013252."},{"key":"e_1_2_1_64_1","doi-asserted-by":"crossref","first-page":"577","DOI":"10.5267\/j.uscm.2021.11.006","article-title":"A Conceptual Model for the Adoption of Autonomous Robots in the Supply Chain and Logistics Industry","volume":"10","author":"Shamout Mohamed","year":"2022","unstructured":"Mohamed Shamout and Rabeb Ben-Abdallah. 2022. A Conceptual Model for the Adoption of Autonomous Robots in the Supply Chain and Logistics Industry. Uncertain Supply Chain Management, 10, 2 (2022), 577\u2013592.","journal-title":"Uncertain Supply Chain Management"},{"key":"e_1_2_1_65_1","volume-title":"Springer Handbook of Robotics","author":"Siciliano B","unstructured":"B Siciliano. 2008. Springer Handbook of Robotics. Springer."},{"key":"e_1_2_1_66_1","volume-title":"2023 IEEE International Conference on Robotics and Automation (ICRA). 11523\u201311530","author":"Singh Ishika","year":"2023","unstructured":"Ishika Singh, Valts Blukis, and Arsalan Mousavian. 2023. ProgPrompt: Generating Situated Robot Task Plans using Large Language Models. In 2023 IEEE International Conference on Robotics and Automation (ICRA). 11523\u201311530."},{"key":"e_1_2_1_67_1","doi-asserted-by":"crossref","first-page":"4058","DOI":"10.1109\/TSE.2023.3282981","article-title":"SIEGE: A Semantics-Guided Safety Enhancement Framework for AI-Enabled Cyber-Physical Systems","volume":"49","author":"Song Jiayang","year":"2023","unstructured":"Jiayang Song, Xuan Xie, and Lei Ma. 2023. 
SIEGE: A Semantics-Guided Safety Enhancement Framework for AI-Enabled Cyber-Physical Systems. IEEE Transactions on Software Engineering, 49, 8 (2023), 4058\u20134080.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_68_1","unstructured":"Jiayang Song Zhehua Zhou Jiawei Liu Chunrong Fang Zhan Shu and Lei Ma. 2023. Self-Refined Large Language Model as Automated Reward Function Designer for Deep Reinforcement Learning in Robotics. arXiv preprint arXiv:2309.06687."},{"key":"e_1_2_1_69_1","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1016\/j.cogr.2023.04.001","article-title":"Artificial intelligence, machine learning and deep learning in advanced robotics, a review","volume":"3","author":"Soori Mohsen","year":"2023","unstructured":"Mohsen Soori, Behrooz Arezoo, and Roza Dastres. 2023. Artificial intelligence, machine learning and deep learning in advanced robotics, a review. Cognitive Robotics, 3 (2023), 54\u201370.","journal-title":"Cognitive Robotics"},{"key":"e_1_2_1_70_1","first-page":"e2386","article-title":"Confidence-driven weighted retraining for predicting safety-critical failures in autonomous driving systems","volume":"34","author":"Stocco Andrea","year":"2022","unstructured":"Andrea Stocco and Paolo Tonella. 2022. Confidence-driven weighted retraining for predicting safety-critical failures in autonomous driving systems. Journal of Software: Evolution and Process, 34, 10 (2022), e2386.","journal-title":"Journal of Software: Evolution and Process"},{"key":"e_1_2_1_71_1","volume-title":"Proceedings of The 7th Conference on Robot Learning (Proceedings of Machine Learning Research","volume":"3417","author":"Stone Austin","year":"2023","unstructured":"Austin Stone. 2023. Open-World Object Manipulation using Pre-trained Vision-Language Models. In Proceedings of The 7th Conference on Robot Learning (Proceedings of Machine Learning Research, Vol. 229). 
PMLR, 3397\u20133417."},{"key":"e_1_2_1_72_1","volume-title":"Proceedings of the 41st International Conference on Machine Learning (Proceedings of Machine Learning Research","volume":"20270","author":"Sun Lichao","year":"2024","unstructured":"Lichao Sun and Yue Huang. 2024. TrustLLM: Trustworthiness in Large Language Models. In Proceedings of the 41st International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 235). PMLR, 20166\u201320270."},{"key":"e_1_2_1_73_1","volume-title":"Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering. 974\u2013985","author":"Sun Zeyu","year":"2020","unstructured":"Zeyu Sun, Jie M Zhang, Mark Harman, Mike Papadakis, and Lu Zhang. 2020. Automatic Testing and Improvement of Machine Translation. In Proceedings of the ACM\/IEEE 42nd International Conference on Software Engineering. 974\u2013985."},{"key":"e_1_2_1_74_1","volume-title":"Proceedings of the 36th International Conference on Machine Learning, Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.) (Proceedings of Machine Learning Research","volume":"6114","author":"Tan Mingxing","year":"2019","unstructured":"Mingxing Tan and Quoc Le. 2019. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the 36th International Conference on Machine Learning, Kamalika Chaudhuri and Ruslan Salakhutdinov (Eds.) (Proceedings of Machine Learning Research, Vol. 97). PMLR, 6105\u20136114."},{"key":"e_1_2_1_75_1","volume-title":"Octo: An Open-Source Generalist Robot Policy. arXiv preprint arXiv:2405.12213.","author":"Team Octo Model","year":"2024","unstructured":"Octo Model Team, Dibya Ghosh, and Homer Walke. 2024. Octo: An Open-Source Generalist Robot Policy. arXiv preprint arXiv:2405.12213."},{"key":"e_1_2_1_76_1","unstructured":"Hugo Touvron, Louis Martin, and Kevin Stone. 2023. Llama 2: Open Foundation and Fine-Tuned Chat Models. 
arXiv preprint arXiv:2307.09288."},{"key":"e_1_2_1_77_1","doi-asserted-by":"crossref","first-page":"100057","DOI":"10.1016\/j.ailsci.2023.100057","article-title":"Application of AI techniques and robotics in agriculture: A review","volume":"3","author":"Wakchaure Manas","year":"2023","unstructured":"Manas Wakchaure, BK Patle, and AK Mahindrakar. 2023. Application of AI techniques and robotics in agriculture: A review. Artificial Intelligence in the Life Sciences, 3 (2023), 100057.","journal-title":"Artificial Intelligence in the Life Sciences"},{"key":"e_1_2_1_78_1","volume-title":"Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 515\u2013527","author":"Wan Yuxuan","year":"2023","unstructured":"Yuxuan Wan, Wenxuan Wang, Pinjia He, Jiazhen Gu, Haonan Bai, and Michael R Lyu. 2023. BiasAsker: Measuring the Bias in Conversational AI System. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 515\u2013527."},{"key":"e_1_2_1_79_1","volume-title":"DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models. In Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track.","author":"Wang Boxin","year":"2023","unstructured":"Boxin Wang, Weixin Chen, and Hengzhi Pei. 2023. DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models. In Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track."},{"key":"e_1_2_1_80_1","volume-title":"MORTAR: A Model-based Runtime Action Repair Framework for AI-enabled Cyber-Physical Systems. arXiv preprint arXiv:2408.03892.","author":"Wang Renzhi","year":"2024","unstructured":"Renzhi Wang, Zhehua Zhou, Jiayang Song, Xuan Xie, Xiaofei Xie, and Lei Ma. 2024. MORTAR: A Model-based Runtime Action Repair Framework for AI-enabled Cyber-Physical Systems. 
arXiv preprint arXiv:2408.03892."},{"key":"e_1_2_1_81_1","volume-title":"Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering. 1053\u20131065","author":"Wang Shuai","year":"2020","unstructured":"Shuai Wang and Zhendong Su. 2020. Metamorphic Object Insertion for Testing Object Detection Systems. In Proceedings of the 35th IEEE\/ACM International Conference on Automated Software Engineering. 1053\u20131065."},{"key":"e_1_2_1_82_1","volume-title":"Proceedings of the IEEE\/ACM 47th International Conference on Software Engineering (ICSE \u201925)","author":"Wang Zhijie","year":"2025","unstructured":"Zhijie Wang, Zijie Zhou, Da Song, Yuheng Huang, Shengmai Chen, Lei Ma, and Tianyi Zhang. 2025. Towards Understanding the Characteristics of Code Generation Errors Made by Large Language Models. In Proceedings of the IEEE\/ACM 47th International Conference on Software Engineering (ICSE \u201925)."},{"key":"e_1_2_1_83_1","first-page":"24824","article-title":"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Xuezhi Wang, and Dale Schuurmans. 2022. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. In Advances in Neural Information Processing Systems. 35, Curran Associates, Inc., 24824\u201324837.","journal-title":"Advances in Neural Information Processing Systems. 35, Curran Associates, Inc."},{"key":"e_1_2_1_84_1","volume-title":"Agentless: Demystifying LLM-based Software Engineering Agents. arXiv preprint arXiv:2407.01489.","author":"Xia Chunqiu Steven","year":"2024","unstructured":"Chunqiu Steven Xia, Yinlin Deng, Soren Dunn, and Lingming Zhang. 2024. Agentless: Demystifying LLM-based Software Engineering Agents. arXiv preprint arXiv:2407.01489."},{"key":"e_1_2_1_85_1","volume-title":"Proceedings of the 38th IEEE\/ACM International Conference on Automated Software Engineering. 
1136\u20131148","author":"Xiao Mingxuan","year":"2023","unstructured":"Mingxuan Xiao and Yan Xiao. 2023. LEAP: Efficient and Automated Test Method for NLP Software. In Proceedings of the 38th IEEE\/ACM International Conference on Automated Software Engineering. 1136\u20131148."},{"key":"e_1_2_1_86_1","unstructured":"Xuan Xie, Jiayang Song, Zhehua Zhou, Yuheng Huang, Da Song, and Lei Ma. 2024. Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward. arXiv preprint arXiv:2404.08517."},{"key":"e_1_2_1_87_1","first-page":"50528","article-title":"SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering","author":"Yang John","year":"2024","unstructured":"John Yang and Carlos E. Jimenez. 2024. SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering. In Advances in Neural Information Processing Systems. 37, Curran Associates, Inc., 50528\u201350652.","journal-title":"Advances in Neural Information Processing Systems. 37, Curran Associates, Inc."},{"key":"e_1_2_1_88_1","first-page":"11809","article-title":"Tree of Thoughts: Deliberate Problem Solving with Large Language Models","author":"Yao Shunyu","year":"2023","unstructured":"Shunyu Yao, Dian Yu, and Jeffrey Zhao. 2023. Tree of Thoughts: Deliberate Problem Solving with Large Language Models. In Advances in Neural Information Processing Systems. 36, Curran Associates, Inc., 11809\u201311822.","journal-title":"Advances in Neural Information Processing Systems. 36, Curran Associates, Inc."},{"key":"e_1_2_1_89_1","doi-asserted-by":"crossref","first-page":"151019","DOI":"10.1109\/ACCESS.2020.3016826","article-title":"Cyber-Physical Power System (CPPS): A Review on Modeling, Simulation, and Analysis With Cyber Security Applications","volume":"8","author":"Yohanandhan Rajaa Vikhram","year":"2020","unstructured":"Rajaa Vikhram Yohanandhan and Rajvikram Madurai Elavarasan. 2020. 
Cyber-Physical Power System (CPPS): A Review on Modeling, Simulation, and Analysis With Cyber Security Applications. IEEE Access, 8 (2020), 151019\u2013151064.","journal-title":"IEEE Access"},{"key":"e_1_2_1_90_1","volume-title":"Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. 467\u2013479","author":"Yu Boxi","year":"2022","unstructured":"Boxi Yu, Zhiqing Zhong, Xinran Qin, Jiayi Yao, Yuancheng Wang, and Pinjia He. 2022. Automated testing of image captioning systems. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. 467\u2013479."},{"key":"e_1_2_1_91_1","volume-title":"Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering. 1\u201312","author":"Yu Hao","year":"2024","unstructured":"Hao Yu, Bo Shen, Dezhi Ran, Jiaxin Zhang, Qi Zhang, Yuchi Ma, Guangtai Liang, Ying Li, Qianxiang Wang, and Tao Xie. 2024. CoderEval: A Benchmark of Pragmatic Code Generation with Generative Pre-trained Models. In Proceedings of the 46th IEEE\/ACM International Conference on Software Engineering. 1\u201312."},{"key":"e_1_2_1_92_1","volume-title":"Proceedings of the IEEE\/CVF International Conference on Computer Vision. 11975\u201311986","author":"Zhai Xiaohua","year":"2023","unstructured":"Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, and Lucas Beyer. 2023. Sigmoid Loss for Language Image Pre-Training. In Proceedings of the IEEE\/CVF International Conference on Computer Vision. 11975\u201311986."},{"key":"e_1_2_1_93_1","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers. Association for Computational Linguistics, 769\u2013787","author":"Zhang Kechi","year":"2023","unstructured":"Kechi Zhang, Zhuo Li, Jia Li, Ge Li, and Zhi Jin. 2023. Self-Edit: Fault-Aware Code Editor for Code Generation. 
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, 769\u2013787."},{"key":"e_1_2_1_94_1","volume-title":"Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis. 1592\u20131604","author":"Zhang Yuntong","year":"2024","unstructured":"Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, and Abhik Roychoudhury. 2024. AutoCodeRover: Autonomous Program Improvement. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis. 1592\u20131604."},{"key":"e_1_2_1_95_1","doi-asserted-by":"crossref","first-page":"1842","DOI":"10.1109\/TSE.2022.3194640","article-title":"FalsifAI: Falsification of AI-Enabled Hybrid Control Systems Guided by Time-Aware Coverage Criteria","volume":"49","author":"Zhang Zhenya","year":"2022","unstructured":"Zhenya Zhang, Deyun Lyu, Paolo Arcaini, Lei Ma, Ichiro Hasuo, and Jianjun Zhao. 2022. FalsifAI: Falsification of AI-Enabled Hybrid Control Systems Guided by Time-Aware Coverage Criteria. IEEE Transactions on Software Engineering, 49, 4 (2022), 1842\u20131859.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_96_1","volume-title":"2020 IEEE Symposium Series on Computational Intelligence (SSCI). 737\u2013744","author":"Zhao Wenshuai","year":"2020","unstructured":"Wenshuai Zhao, Jorge Pe\u00f1a Queralta, and Tomi Westerlund. 2020. Sim-to-Real Transfer in Deep Reinforcement Learning for Robotics: a Survey. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI). 737\u2013744."},{"key":"e_1_2_1_97_1","first-page":"3391","article-title":"Specification-Based Autonomous Driving System Testing","volume":"49","author":"Zhou Yuan","year":"2023","unstructured":"Yuan Zhou, Yang Sun, and Yun Tang. 2023. Specification-Based Autonomous Driving System Testing. 
IEEE Transactions on Software Engineering, 49, 6 (2023), 3391\u20133410.","journal-title":"IEEE Transactions on Software Engineering"},{"key":"e_1_2_1_98_1","volume-title":"ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning. In 2024 IEEE International Conference on Robotics and Automation (ICRA). 2081\u20132088","author":"Zhou Zhehua","year":"2024","unstructured":"Zhehua Zhou, Jiayang Song, Kunpeng Yao, Zhan Shu, and Lei Ma. 2024. ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning. In 2024 IEEE International Conference on Robotics and Automation (ICRA). 2081\u20132088."},{"key":"e_1_2_1_99_1","volume-title":"Proceedings of the 1st ACM Workshop on Large AI Systems and Models with Privacy and Safety Analysis (LAMPS \u201924)","author":"Zhu Kaijie","year":"2024","unstructured":"Kaijie Zhu, Jindong Wang, and Jiaheng Zhou. 2024. PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts. In Proceedings of the 1st ACM Workshop on Large AI Systems and Models with Privacy and Safety Analysis (LAMPS \u201924). 
57\u201368."}],"container-title":["Proceedings of the ACM on Software Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3729343","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T15:35:02Z","timestamp":1750347302000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3729343"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,6,19]]},"references-count":99,"journal-issue":{"issue":"FSE","published-print":{"date-parts":[[2025,6,19]]}},"alternative-id":["10.1145\/3729343"],"URL":"https:\/\/doi.org\/10.1145\/3729343","relation":{},"ISSN":["2994-970X"],"issn-type":[{"value":"2994-970X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,6,19]]}}}