{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T16:20:49Z","timestamp":1777738849967,"version":"3.51.4"},"reference-count":70,"publisher":"Springer Science and Business Media LLC","issue":"4","license":[{"start":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T00:00:00Z","timestamp":1742342400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,3,19]],"date-time":"2025-03-19T00:00:00Z","timestamp":1742342400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000266","name":"RCUK | Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/L016834\/1"],"award-info":[{"award-number":["EP\/L016834\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000266","name":"RCUK | Engineering and Physical Sciences Research Council","doi-asserted-by":"publisher","award":["EP\/S023208\/1"],"award-info":[{"award-number":["EP\/S023208\/1"]}],"id":[{"id":"10.13039\/501100000266","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Nat Mach Intell"],"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Completing complex tasks in unpredictable settings challenges robotic systems, requiring a step change in machine intelligence. Sensorimotor abilities are considered integral to human intelligence. Thus, biologically inspired machine intelligence might usefully combine artificial intelligence with robotic sensorimotor capabilities. 
Here we report an embodied large-language-model-enabled robot (ELLMER) framework, utilizing GPT-4 and a retrieval-augmented generation infrastructure, to enable robots to complete long-horizon tasks in unpredictable settings. The method extracts contextually relevant examples from a knowledge base, producing action plans that incorporate force and visual feedback and enabling adaptation to changing conditions. We tested ELLMER on a robot tasked with coffee making and plate decoration; these tasks consist of a sequence of sub-tasks from drawer opening to pouring, each benefiting from distinct feedback types and methods. We show that the ELLMER framework allows the robot to complete the tasks. This demonstration marks progress towards scalable, efficient and \u2018intelligent robots\u2019 able to complete complex tasks in uncertain environments.<\/jats:p>","DOI":"10.1038\/s42256-025-01005-x","type":"journal-article","created":{"date-parts":[[2025,3,18]],"date-time":"2025-03-18T20:02:18Z","timestamp":1742328138000},"page":"592-601","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":52,"title":["Embodied large language models enable robots to complete complex tasks in unpredictable environments"],"prefix":"10.1038","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0009-0001-3583-8247","authenticated-orcid":false,"given":"Ruaridh","family":"Mon-Williams","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6636-1106","authenticated-orcid":false,"given":"Gen","family":"Li","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2094-0652","authenticated-orcid":false,"given":"Ran","family":"Long","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3352-0809","authenticated-orcid":false,"given":"Wenqian","family":"Du","sequence":"additional","affiliation":[]},{"given":"Christopher 
G.","family":"Lucas","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,3,19]]},"reference":[{"key":"1005_CR1","doi-asserted-by":"crossref","unstructured":"Intelligence research should not be held back by its past. Nature 545, 385\u2013386 (2017).","DOI":"10.1038\/nature.2017.22021"},{"key":"1005_CR2","doi-asserted-by":"publisher","first-page":"497","DOI":"10.1007\/s10339-012-0519-z","volume":"13","author":"K Friston","year":"2012","unstructured":"Friston, K. Embodied inference and spatial cognition. Cogn. Process. 13, 497\u2013514 (2012).","journal-title":"Cogn. Process."},{"key":"1005_CR3","doi-asserted-by":"publisher","first-page":"625","DOI":"10.3758\/BF03196322","volume":"9","author":"M Wilson","year":"2002","unstructured":"Wilson, M. Six views of embodied cognition. Psychon. Bull. Rev. 9, 625\u2013636 (2002).","journal-title":"Psychon. Bull. Rev."},{"key":"1005_CR4","doi-asserted-by":"publisher","first-page":"345","DOI":"10.1016\/S1364-6613(99)01361-3","volume":"3","author":"A Clark","year":"1999","unstructured":"Clark, A. An embodied cognitive science. Trends Cogn. Sci. 3, 345\u2013351 (1999).","journal-title":"Trends Cogn. Sci."},{"key":"1005_CR5","doi-asserted-by":"publisher","first-page":"561","DOI":"10.1038\/s42256-023-00669-7","volume":"5","author":"F Stella","year":"2023","unstructured":"Stella, F., Della Santina, C. & Hughes, J. How can LLMs transform the robotic design process? Nat. Mach. Intell. 5, 561\u2013564 (2023).","journal-title":"Nat. Mach. Intell."},{"key":"1005_CR6","doi-asserted-by":"crossref","unstructured":"Miriyev, A. & Kovac, M. Skills for physical artificial intelligence. Nat. Mach. Intell. 2, 658\u2013660 (2020).","DOI":"10.1038\/s42256-020-00258-y"},{"key":"1005_CR7","doi-asserted-by":"crossref","unstructured":"Cui, J. & Trinkle, J. Toward next-generation learned robot manipulation. Sci. Robot. 
6, eabd9461 (2021).","DOI":"10.1126\/scirobotics.abd9461"},{"key":"1005_CR8","doi-asserted-by":"crossref","unstructured":"Arents, J. & Greitans, M. Smart industrial robot control trends, challenges and opportunities within manufacturing. Appl. Sci. 12, 937 (2022).","DOI":"10.3390\/app12020937"},{"key":"1005_CR9","doi-asserted-by":"crossref","unstructured":"Billard, A. & Kragic, D. Trends and challenges in robot manipulation. Science 364, eaat8414 (2019).","DOI":"10.1126\/science.aat8414"},{"key":"1005_CR10","doi-asserted-by":"crossref","unstructured":"Yang, G.-Z. et al. The grand challenges of Science Robotics. Sci. Robot. 3, eaar7650 (2018).","DOI":"10.1126\/scirobotics.aar7650"},{"key":"1005_CR11","doi-asserted-by":"crossref","unstructured":"Buchanan, R., Rofer, A., Moura, J., Valada, A. & Vijayakumar, S. Online estimation of articulated objects with factor graphs using vision and proprioceptive sensing. In 2024 IEEE International Conference on Robotics and Automation (ICRA) 16111\u201316117 (IEEE, 2024).","DOI":"10.1109\/ICRA57147.2024.10610590"},{"key":"1005_CR12","doi-asserted-by":"crossref","unstructured":"Nikolaidis, S., Ramakrishnan, R., Gu, K. & Shah, J. Efficient model learning from joint-action demonstrations for human-robot collaborative tasks. In 2015 10th ACM\/IEEE International Conference on Human-Robot Interaction (HRI) 189\u2013196 (IEEE, 2015).","DOI":"10.1145\/2696454.2696455"},{"key":"1005_CR13","doi-asserted-by":"crossref","unstructured":"Saveriano, M., Abu-Dakka, F. J., Kramberger, A. & Peternel, L. Dynamic movement primitives in robotics: a tutorial survey. Int. J. Robot. Res. 42, 1133\u20131184 (2023).","DOI":"10.1177\/02783649231201196"},{"key":"1005_CR14","doi-asserted-by":"crossref","unstructured":"Kober, J. et al. Movement templates for learning of hitting and batting. 
In 2010 IEEE International Conference on Robotics and Automation 853\u2013858 (IEEE, 2010).","DOI":"10.1109\/ROBOT.2010.5509672"},{"key":"1005_CR15","unstructured":"Huang, W. et al. VoxPoser: composable 3D value maps for robotic manipulation with language models. In Proc. 7th Conference on Robot Learning 540\u2013562 (PMLR, 2023)."},{"key":"1005_CR16","doi-asserted-by":"crossref","unstructured":"Zhang, D. et al. Explainable hierarchical imitation learning for robotic drink pouring. In IEEE Transactions on Automation Science and Engineering 3871\u20133887 (2022).","DOI":"10.1109\/TASE.2021.3138280"},{"key":"1005_CR17","first-page":"21:1\u201321:35","volume":"50","author":"A Hussein","year":"2017","unstructured":"Hussein, A., Gaber, M. M., Elyan, E. & Jayne, C. Imitation learning: a survey of learning methods. ACM Comput. Surv. 50, 21:1\u201321:35 (2017).","journal-title":"ACM Comput. Surv."},{"key":"1005_CR18","doi-asserted-by":"crossref","unstructured":"Di Palo, N. & Johns, E. DINOBot: robot manipulation via retrieval and alignment with vision foundation models. In International Conference on Robotics and Automation (ICRA) 2798\u2013805 (IEEE, 2024).","DOI":"10.1109\/ICRA57147.2024.10610923"},{"key":"1005_CR19","unstructured":"Shridhar, M., Manuelli, L. & Fox, D. CLIPort: what and where pathways for robotic manipulation. In Proc. 5th Conference on Robot Learning 894\u2013906 (PMLR, 2022)."},{"key":"1005_CR20","unstructured":"Shridhar, M., Manuelli, L. & Fox, D. Perceiver-Actor: a multi-task transformer for robotic manipulation. In Proc. 6th Conference on Robot Learning 785\u2013799 (PMLR, 2023)."},{"key":"1005_CR21","doi-asserted-by":"crossref","unstructured":"Mees, O., Hermann, L. & Burgard, W. What matters in language conditioned robotic imitation learning over unstructured data. IEEE Robot. Autom. Lett. 7, 11205\u201311212 (2022).","DOI":"10.1109\/LRA.2022.3196123"},{"key":"1005_CR22","doi-asserted-by":"crossref","unstructured":"Mees, O., Borja-Diaz, J. 
& Burgard, W. Grounding language with visual affordances over unstructured data. In 2023 IEEE International Conference on Robotics and Automation (ICRA) 11576\u201311582 (IEEE, 2023).","DOI":"10.1109\/ICRA48891.2023.10160396"},{"key":"1005_CR23","doi-asserted-by":"crossref","unstructured":"Shao, L., Migimatsu, T., Zhang, Q., Yang, K. & Bohg, J. Concept2Robot: learning manipulation concepts from instructions and human demonstrations. Int. J. Robot. Res. 40, 1419\u20131434 (2021).","DOI":"10.1177\/02783649211046285"},{"key":"1005_CR24","unstructured":"Ichter, B. et al. Do as I can, not as I say: grounding language in robotic affordances. In Proc. 6th Conference on Robot Learning 287\u2013318 (PMLR, 2023)."},{"key":"1005_CR25","unstructured":"Driess, D. et al. PaLM-E: an embodied multimodal language model. In Proc. 40th International Conference on Machine Learning 8469\u20138488 (PMLR, 2023)."},{"key":"1005_CR26","doi-asserted-by":"crossref","unstructured":"Peng, A. et al. Preference-conditioned language-guided abstraction. In Proc. 2024 ACM\/IEEE International Conference on Human-Robot Interaction, HRI \u201924 572\u2013581 (Association for Computing Machinery, 2024).","DOI":"10.1145\/3610977.3634930"},{"key":"1005_CR27","unstructured":"Huang, W., Abbeel, P., Pathak, D. & Mordatch, I. Language models as zero-shot planners: extracting actionable knowledge for embodied agents. In Proc. 39th International Conference on Machine Learning 9118\u20139147 (PMLR, 2022)."},{"key":"1005_CR28","doi-asserted-by":"crossref","unstructured":"Huang, J. & Chang, K. C.-C. Towards reasoning in large language models: a survey. In Findings of the Association for Computational Linguistics: ACL 2023 1049\u20131065 (Association for Computational Linguistics, 2023).","DOI":"10.18653\/v1\/2023.findings-acl.67"},{"key":"1005_CR29","unstructured":"Zitkovich, B. et al. RT-2: vision-language-action models transfer web knowledge to robotic control. In Proc. 
7th Conference on Robot Learning 2165\u20132183 (PMLR, 2023)."},{"key":"1005_CR30","doi-asserted-by":"crossref","unstructured":"Ma, X., Patidar, S., Haughton, I. & James, S. Hierarchical diffusion policy for kinematics-aware multi-task robotic manipulation. In Proc. IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 18081\u201318090 (IEEE, 2024).","DOI":"10.1109\/CVPR52733.2024.01712"},{"key":"1005_CR31","doi-asserted-by":"crossref","unstructured":"Zhang, C., Chen, J., Li, J., Peng, Y. & Mao, Z. Large language models for human-robot interaction: a review. Biomimetic Intell. Robot. 3, 100131 (2023).","DOI":"10.1016\/j.birob.2023.100131"},{"key":"1005_CR32","unstructured":"Lewis, P. et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Advances in Neural Information Processing Systems 9459\u20139474 (Curran Associates, 2020)."},{"key":"1005_CR33","doi-asserted-by":"crossref","unstructured":"Raiaan, M. et al. A review on large language models: architectures, applications, taxonomies, open issues and challenges. IEEE Access 12, 26839\u201326874 (2024).","DOI":"10.1109\/ACCESS.2024.3365742"},{"key":"1005_CR34","doi-asserted-by":"crossref","unstructured":"Rozo, L., Jimenez, P. & Torras, C. Force-based robot learning of pouring skills using parametric hidden Markov models. In 9th International Workshop on Robot Motion and Control 227\u2013232 (IEEE, 2013).","DOI":"10.1109\/RoMoCo.2013.6614613"},{"key":"1005_CR35","doi-asserted-by":"publisher","first-page":"103692","DOI":"10.1016\/j.robot.2020.103692","volume":"136","author":"Y Huang","year":"2021","unstructured":"Huang, Y., Wilches, J. & Sun, Y. Robot gaining accurate pouring skills through self-supervised learning and generalization. Robot. Auton. Syst. 136, 103692 (2021).","journal-title":"Robot. Auton. Syst."},{"key":"1005_CR36","doi-asserted-by":"crossref","unstructured":"Mon-Williams, R., Stouraitis, T. & Vijayakumar, S. 
A behavioural transformer for effective collaboration between a robot and a non-stationary human. In 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) 1150\u20131157 (IEEE, 2023).","DOI":"10.1109\/RO-MAN57019.2023.10309643"},{"key":"1005_CR37","unstructured":"Belkhale, S., Cui, Y. & Sadigh, D. Data quality in imitation learning. In Advances in Neural Information Processing Systems (NeurIPS) 80375\u201380395 (Curran Associates, 2024)."},{"key":"1005_CR38","unstructured":"Khazatsky, A. et al. DROID: a large-scale in-the-wild robot manipulation dataset. Robotics: Science and Systems; https:\/\/www.roboticsproceedings.org\/rss20\/p120.pdf (2024)."},{"key":"1005_CR39","doi-asserted-by":"crossref","unstructured":"Acosta, B., Yang, W. & Posa, M. Validating robotics simulators on real-world impacts. IEEE Robot. Autom. Lett. 7, 6471\u20136478 (2022).","DOI":"10.1109\/LRA.2022.3174367"},{"key":"1005_CR40","unstructured":"Alomar, A. et al. CausalSim: a causal framework for unbiased trace-driven simulation. In 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23) 1115\u20131147 (USENIX Association, 2023)."},{"key":"1005_CR41","doi-asserted-by":"crossref","unstructured":"Choi, H. et al. On the use of simulation in robotics: opportunities, challenges, and suggestions for moving forward. Proc. Natl Acad. Sci. USA 118, e1907856118 (2021).","DOI":"10.1073\/pnas.1907856118"},{"key":"1005_CR42","doi-asserted-by":"crossref","unstructured":"Del Aguila Ferrandis, J., Moura, J. & Vijayakumar, S. Nonprehensile planar manipulation through reinforcement learning with multimodal categorical exploration.
In 2023 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS) 5606\u20135613 (IEEE, 2023).","DOI":"10.1109\/IROS55552.2023.10341629"},{"key":"1005_CR43","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1613\/jair.1.14174","volume":"76","author":"R Kirk","year":"2023","unstructured":"Kirk, R., Zhang, A., Grefenstette, E. & Rockt\u00e4schel, T. A survey of zero-shot generalisation in deep reinforcement learning. J. Artific. Intell. Res. 76, 201\u2013264 (2023).","journal-title":"J. Artific. Intell. Res."},{"key":"1005_CR44","doi-asserted-by":"publisher","first-page":"143","DOI":"10.1016\/j.neucom.2022.04.005","volume":"493","author":"T Dai","year":"2022","unstructured":"Dai, T. et al. Analysing deep reinforcement learning agents trained with domain randomisation. Neurocomputing 493, 143\u2013165 (2022).","journal-title":"Neurocomputing"},{"key":"1005_CR45","unstructured":"Chang, J., Uehara, M., Sreenivas, D., Kidambi, R. & Sun, W. Mitigating covariate shift in imitation learning via offline data with partial coverage. In Advances in Neural Information Processing Systems 965\u2013979 (Curran Associates, 2021)."},{"key":"1005_CR46","unstructured":"Huang, W. et al. Inner monologue: embodied reasoning through planning with language models. In Proc. 6th Conference on Robot Learning 1769\u20131782 (PMLR, 2023)."},{"key":"1005_CR47","unstructured":"Nair, S., Rajeswaran, A., Kumar, V., Finn, C. & Gupta, A. R3M: a universal visual representation for robot manipulation. In Proc. 6th Conference on Robot Learning Vol. 205, 892\u2013909 (PMLR, 2022)."},{"key":"1005_CR48","doi-asserted-by":"crossref","unstructured":"Singh, I. et al. ProgPrompt: generating situated robot task plans using large language models. In Proc. IEEE\/CVF International Conference on Robotics and Automation (ICRA) 11523\u201311530 (IEEE, 2023).","DOI":"10.1109\/ICRA48891.2023.10161317"},{"key":"1005_CR49","doi-asserted-by":"crossref","unstructured":"Song, C. H. et al. 
LLM-Planner: few-shot grounded planning for embodied agents with large language models. In Proc. IEEE\/CVF International Conference on Computer Vision (ICCV) 2998\u20133009 (IEEE\/CVF, 2023).","DOI":"10.1109\/ICCV51070.2023.00280"},{"key":"1005_CR50","doi-asserted-by":"publisher","first-page":"55682","DOI":"10.1109\/ACCESS.2024.3387941","volume":"12","author":"SH Vemprala","year":"2024","unstructured":"Vemprala, S. H., Bonatti, R., Bucker, A. & Kapoor, A. ChatGPT for robotics: design principles and model abilities. IEEE Access 12, 55682\u201355696 (2024).","journal-title":"IEEE Access"},{"key":"1005_CR51","doi-asserted-by":"crossref","unstructured":"Ding, Y., Zhang, X., Paxton, C. & Zhang, S. Task and motion planning with large language models for object rearrangement. In 2023 IEEE\/RSJ International Conference on Intelligent Robots and Systems (IROS) 2086\u20132092 (IEEE, 2023).","DOI":"10.1109\/IROS55552.2023.10342169"},{"key":"1005_CR52","doi-asserted-by":"crossref","unstructured":"Kwon, M. et al. Toward grounded commonsense reasoning. In Proc. International Conference on Robotics and Automation (ICRA) 5463\u20135470 (IEEE, 2024).","DOI":"10.1109\/ICRA57147.2024.10611218"},{"key":"1005_CR53","unstructured":"Hong, J., Levine, S. & Dragan, A. Learning to influence human behavior with offline reinforcement learning. In Advances in Neural Information Processing Systems (NeurIPS) 36094\u201336105 (Curran Associates, 2024)."},{"key":"1005_CR54","unstructured":"OpenAI. GPT-4 technical report. Preprint at http:\/\/arxiv.org\/abs\/2303.08774 (2024)."},{"key":"1005_CR55","unstructured":"OpenAI. Custom models program: fine-tuning GPT-4 for specific domains (2023); https:\/\/platform.openai.com\/docs\/guides\/fine-tuning\/"},{"key":"1005_CR56","unstructured":"Pietsch, M. et al. Haystack: the end-to-end nlp framework for pragmatic builders. GitHub https:\/\/github.com\/deepset-ai\/haystack (2019)."},{"key":"1005_CR57","unstructured":"Weaviate. Verba: the golden RAGtriever. 
GitHub https:\/\/github.com\/weaviate\/Verba (2023)."},{"key":"1005_CR58","doi-asserted-by":"crossref","unstructured":"Kirillov, A. et al. Segment anything. In Proc. IEEE\/CVF International Conference on Computer Vision (ICCV) 4015\u20134026 (IEEE, 2023).","DOI":"10.1109\/ICCV51070.2023.00371"},{"key":"1005_CR59","unstructured":"Ramesh, A. et al. Zero-shot text-to-image generation. In Proc. 38th International Conference on Machine Learning 8821\u20138831 (PMLR, 2021)."},{"key":"1005_CR60","unstructured":"Zeng, A. et al. Socratic models: composing zero-shot multimodal reasoning with language. In Proc. International Conference on Learning Representations (ICLR, 2023)."},{"key":"1005_CR61","doi-asserted-by":"crossref","unstructured":"Cui, Y. et al. No, to the right: online language corrections for robotic manipulation via shared autonomy. In Proc. 2023 ACM\/IEEE International Conference on Human-Robot Interaction, HRI \u201923 93\u2013101 (Association for Computing Machinery, 2023).","DOI":"10.1145\/3568162.3578623"},{"key":"1005_CR62","doi-asserted-by":"crossref","unstructured":"Bengio, Y. et al. Managing extreme AI risks amid rapid progress. Science 384, 842\u2013845 (2024).","DOI":"10.1126\/science.adn0117"},{"key":"1005_CR63","doi-asserted-by":"crossref","unstructured":"Li, G., Jampani, V., Sun, D. & Sevilla-Lara, L. Locate: localize and transfer object parts for weakly supervised affordance grounding. In Proc. IEEE\/CVF Conference on Computer Vision and Pattern Recognition 10922\u201310931 (IEEE, 2023).","DOI":"10.1109\/CVPR52729.2023.01051"},{"key":"1005_CR64","doi-asserted-by":"crossref","unstructured":"Li, G., Sun, D., Sevilla-Lara, L. & Jampani, V. One-shot open affordance learning with foundation models. In Proc. IEEE\/CVF Conference on Computer Vision and Pattern Recognition 3086\u20133096 (IEEE, 2024).","DOI":"10.1109\/CVPR52733.2024.00298"},{"key":"1005_CR65","doi-asserted-by":"crossref","unstructured":"Liang, J. et al. 
Code as policies: language model programs for embodied control. In 2023 IEEE International Conference on Robotics and Automation (ICRA) 9493\u20139500 (IEEE, 2023).","DOI":"10.1109\/ICRA48891.2023.10160591"},{"key":"1005_CR66","doi-asserted-by":"crossref","unstructured":"Hong, S. & Kim, H. An integrated GPU power and performance model. In Proc. 37th Annual International Symposium on Computer Architecture 280\u2013289 (Association for Computing Machinery, 2010).","DOI":"10.1145\/1815961.1815998"},{"key":"1005_CR67","unstructured":"Kinova Robotics. Kinova Gen3 Ultra-Lightweight Robotic Arm User Guide (2023); https:\/\/assets.iqr-robot.com\/wp-content\/uploads\/2023\/08\/20230814163651088831.pdf"},{"key":"1005_CR68","unstructured":"US Environmental Protection Agency. GHG emission factors hub (2024); https:\/\/www.epa.gov\/climateleadership\/ghg-emission-factors-hub"},{"key":"1005_CR69","doi-asserted-by":"crossref","unstructured":"Liu, S. et al. Grounding DINO: marrying DINO with grounded pre-training for open-set object detection. In 2024 European Conference on Computer Vision (eds Leonardis, A. et al.) Vol. 15105 (Springer, 2023).","DOI":"10.1007\/978-3-031-72970-6_3"},{"key":"1005_CR70","doi-asserted-by":"publisher","unstructured":"ruaridhmon. ruaridhmon\/ELLMER: v1.0.0: Initial Release. 
Zenodo https:\/\/doi.org\/10.5281\/zenodo.14483539 (2024).","DOI":"10.5281\/zenodo.14483539"}],"container-title":["Nature Machine Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.nature.com\/articles\/s42256-025-01005-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s42256-025-01005-x","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.nature.com\/articles\/s42256-025-01005-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,4,22]],"date-time":"2025-04-22T18:03:19Z","timestamp":1745344999000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.nature.com\/articles\/s42256-025-01005-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,3,19]]},"references-count":70,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,4]]}},"alternative-id":["1005"],"URL":"https:\/\/doi.org\/10.1038\/s42256-025-01005-x","relation":{"has-review":[{"id-type":"doi","id":"10.14293\/S2199-1006.1.SOR-UNCAT.A12088599.v1.RHQNFS","asserted-by":"object"}],"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-4622857\/v1","asserted-by":"object"}]},"ISSN":["2522-5839"],"issn-type":[{"value":"2522-5839","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,3,19]]},"assertion":[{"value":"22 June 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 January 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 March 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The authors declare no competing 
interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}]}}