{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,22]],"date-time":"2026-01-22T08:35:39Z","timestamp":1769070939419,"version":"3.49.0"},"reference-count":45,"publisher":"Wiley","issue":"1","license":[{"start":{"date-parts":[[2025,7,20]],"date-time":"2025-07-20T00:00:00Z","timestamp":1752969600000},"content-version":"vor","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2025,7,20]],"date-time":"2025-07-20T00:00:00Z","timestamp":1752969600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/doi.wiley.com\/10.1002\/tdm_license_1.1"}],"content-domain":{"domain":["advanced.onlinelibrary.wiley.com"],"crossmark-restriction":true},"short-container-title":["Advanced Intelligent Systems"],"published-print":{"date-parts":[[2026,1]]},"abstract":"<jats:p>Scrub nurses have crucial responsibilities, particularly in handling instrument\u2010related tasks. However, significant mental burdens and unfamiliarity with instruments can lead to various human errors. Consequently, the research community has explored robotic prototypes. Unfortunately, these prototypes often focus on specific instrument\u2010handling tasks or offer non\u2010intuitive interaction methods, hindering social acceptance. This article proposes a surgeon\u2010friendly robotic scrub nurse platform that addresses multiple instrument\u2010related tasks, including grasping and transferring, automatic sorting, and counting. To the best of the authors\u2019 knowledge, this is the first prototype to incorporate audiovisual input modalities and a large language model (LLM) for smooth and intuitive interaction between the surgeon and the robot. Specifically, vision artificial intelligence (AI) provides accurate instrument detection results using oriented bounding boxes with an average precision of 97.6%, guiding robot motion planning. The speech AI recognizes the surgeon's voice commands. The LLM further interprets multimodal information to trigger different robot actions via the \u201ctool use\u201d capability, achieving a standalone success rate of 94% with an average action latency of less than 1\u2009s on a real robotic scrub nurse hardware platform. Physical validation demonstrated that the proposed prototype successfully completed all assigned tasks, proving its feasibility and effectiveness.<\/jats:p>","DOI":"10.1002\/aisy.202500483","type":"journal-article","created":{"date-parts":[[2025,7,21]],"date-time":"2025-07-21T04:25:21Z","timestamp":1753071921000},"update-policy":"https:\/\/doi.org\/10.1002\/crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Large Language Model\u2010Embedded Intelligent Robotic Scrub Nurse with Multimodal Input for Enhancing Surgeon\u2013Robot Interaction"],"prefix":"10.1002","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4913-8292","authenticated-orcid":false,"given":"Wing Yin","family":"Ng","sequence":"first","affiliation":[{"name":"Department of Surgery The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"}]},{"given":"Wanyu","family":"Ma","sequence":"additional","affiliation":[{"name":"Department of Surgery The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"}]},{"given":"Pheng Ann","family":"Heng","sequence":"additional","affiliation":[{"name":"Institute of Medical Intelligence and XR The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"},{"name":"Department of Computer Science and Engineering The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"}]},{"given":"Philip Wai Yan","family":"Chiu","sequence":"additional","affiliation":[{"name":"Department of Surgery The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"},{"name":"Chow Yuk Ho Technology Center for Innovative Medicine The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"},{"name":"Multi\u2010scale Medical Robotics Center Limited  Shatin N.T.,Hong Kong SAR China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-4455-0808","authenticated-orcid":false,"given":"Zheng","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Surgery The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"},{"name":"Chow Yuk Ho Technology Center for Innovative Medicine The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"},{"name":"Multi\u2010scale Medical Robotics Center Limited  Shatin N.T.,Hong Kong SAR China"},{"name":"Li Ka Shing Institute of Health Science The Chinese University of Hong Kong  Shatin N.T.,Hong Kong SAR China"}]}],"member":"311","published-online":{"date-parts":[[2025,7,20]]},"reference":[{"key":"e_1_2_9_2_1","doi-asserted-by":"publisher","DOI":"10.1186\/s13037-023-00388-3"},{"key":"e_1_2_9_3_1","first-page":"16","volume":"27","author":"Kang E.","year":"2014","journal-title":"ACORN"},{"key":"e_1_2_9_4_1","doi-asserted-by":"publisher","DOI":"10.1038\/s43856-024-00581-0"},{"key":"e_1_2_9_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/RFID.2008.4519358"},{"key":"e_1_2_9_6_1","doi-asserted-by":"publisher","DOI":"10.4293\/JSLS.2020.00076"},{"key":"e_1_2_9_7_1","doi-asserted-by":"publisher","DOI":"10.1002\/aorn.12805"},{"key":"e_1_2_9_8_1","volume-title":"SPIE","author":"Wachs J. P.","year":"2012"},{"key":"e_1_2_9_9_1","doi-asserted-by":"publisher","DOI":"10.1515\/cdbme-2021-1035"},{"key":"e_1_2_9_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSMC.2011.6083972"},{"key":"e_1_2_9_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2447976.2447993"},{"key":"e_1_2_9_12_1","doi-asserted-by":"publisher","DOI":"10.1515\/cdbme-2023-1044"},{"key":"e_1_2_9_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-32254-0_19"},{"key":"e_1_2_9_14_1","author":"Ezzat A.","year":"2021","journal-title":"Surg. Endosc."},{"key":"e_1_2_9_15_1","doi-asserted-by":"publisher","DOI":"10.1108\/01439910510629136"},{"key":"e_1_2_9_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIE.2005.855692"},{"key":"e_1_2_9_17_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00464-005-0511-0"},{"key":"e_1_2_9_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/IROS.2007.4399359"},{"key":"e_1_2_9_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/BIOROB.2010.5626941"},{"key":"e_1_2_9_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.robot.2012.01.005"},{"key":"e_1_2_9_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/375360.375365"},{"key":"e_1_2_9_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2014.6907366"},{"key":"e_1_2_9_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICInfA.2017.8078917"},{"key":"e_1_2_9_24_1","volume-title":"Sorting Surgical Tools from a Cluttered Tray \u2010 Object Detection and Occlusion Reasoning","author":"Lavado D. M.","year":"2018"},{"key":"e_1_2_9_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICEAST52143.2021.9426258"},{"key":"e_1_2_9_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/AIM52237.2022.9863381"},{"key":"e_1_2_9_27_1","doi-asserted-by":"crossref","unstructured":"A.Wilcox J.Kerr B.Thananjeyan J.Ichnowski M.Hwang S.Paradis D.Fer K.Goldberg Learning to localize grasp and hand over unmodified surgical needles2021 https:\/\/arxiv.org\/abs\/2112.04071.","DOI":"10.1109\/ICRA46639.2022.9812393"},{"key":"e_1_2_9_28_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11012-022-01594-6"},{"key":"e_1_2_9_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11548-024-03245-5"},{"key":"e_1_2_9_30_1","first-page":"1","volume-title":"2024 13th International Conference on Control, Automation and Information Sciences (ICCAIS)","author":"Hwang B.","year":"2024"},{"key":"e_1_2_9_31_1","unstructured":"cgvict rolabelimg2018 https:\/\/github.com\/cgvict\/roLabelImg."},{"key":"e_1_2_9_32_1","unstructured":"Github \u2010 hukaixuan19970627\/yolov5_obb: yolov5 + csl_label https:\/\/github.com\/hukaixuan19970627\/yolov5_obb [Accessed 09\u201004\u20102025]."},{"key":"e_1_2_9_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/ROBIO64047.2024.10907620"},{"key":"e_1_2_9_34_1","unstructured":"GitHub \u2010 ultralytics\/ultralytics: Ultralytics YOLO11\u2014github.com https:\/\/github.com\/ultralytics\/ultralytics [Accessed 12\u201001\u20102025]."},{"key":"e_1_2_9_35_1","doi-asserted-by":"crossref","unstructured":"C.\u2010Y.Wang I.\u2010H.Yeh H.\u2010Y. M.Liao Yolov9: Learning what you want to learn using programmable gradient information2024 https:\/\/arxiv.org\/abs\/2402.13616.","DOI":"10.1007\/978-3-031-72751-1_1"},{"key":"e_1_2_9_36_1","doi-asserted-by":"crossref","unstructured":"A.Wang H.Chen L.Liu K.Chen Z.Lin J.Han G.Ding Yolov10: Real\u2010time end\u2010to\u2010end object detection2024 https:\/\/arxiv.org\/abs\/2405.14458.","DOI":"10.52202\/079017-3429"},{"key":"e_1_2_9_37_1","unstructured":"Y.Tian Q.Ye D.Doermann Yolov12: Attention\u2010centric real\u2010time object detectors2025 https:\/\/arxiv.org\/abs\/2502.12524."},{"key":"e_1_2_9_38_1","unstructured":"R.Microphones Wireless go ii | dual wireless mic system | rode\u2014rode.com https:\/\/rode.com\/cn\/microphones\/wireless\/ [Accessed 26\u201001\u20102025]."},{"key":"e_1_2_9_39_1","unstructured":"GitHub \u2010 openai\/whisper: Robust Speech Recognition via Large\u2010Scale Weak Supervision\u2014github.com https:\/\/github.com\/openai\/whisper [Accessed 03\u201007\u20102024]."},{"key":"e_1_2_9_40_1","unstructured":"Gladia I Audio Transcription API\u2014gladia.io https:\/\/www.gladia.io\/ [Accessed 26\u201001\u20102025]."},{"key":"e_1_2_9_41_1","unstructured":"OpenAI Gpt\u20104 technical report 2023 https:\/\/arxiv.org\/abs\/2303.08774."},{"key":"e_1_2_9_42_1","unstructured":"DeepSeek\u2010AI Deepseek\u2010v3 technical report 2024 https:\/\/arxiv.org\/abs\/2412.19437"},{"key":"e_1_2_9_43_1","unstructured":"C.Lugaresi J.Tang H.Nash C.McClanahan E.Uboweja M.Hays F.Zhang C.\u2010L.Chang M. G.Yong J.Lee W.\u2010T.Chang W.Hua M.Georg M.Grundmann Mediapipe: A framework for building perception pipelines2019 https:\/\/arxiv.org\/abs\/1906.08172."},{"key":"e_1_2_9_44_1","unstructured":"Gladia \u2010 Word Error Rate Speed & Price Analysis | Artificial Analysis\u2014 artificialanalysis.ai https:\/\/artificialanalysis.ai\/speech\u2010to\u2010text\/models\/gladia [Accessed 02\u201006\u20102025]."},{"key":"e_1_2_9_45_1","first-page":"189","author":"Brooke J.","year":"1995","journal-title":"Usabil. Eval. Ind."},{"key":"e_1_2_9_46_1","doi-asserted-by":"publisher","DOI":"10.21037\/acs.2016.03.05"}],"container-title":["Advanced Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/advanced.onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/aisy.202500483","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/advanced.onlinelibrary.wiley.com\/doi\/full-xml\/10.1002\/aisy.202500483","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/advanced.onlinelibrary.wiley.com\/doi\/pdf\/10.1002\/aisy.202500483","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,21]],"date-time":"2026-01-21T16:50:22Z","timestamp":1769014222000},"score":1,"resource":{"primary":{"URL":"https:\/\/advanced.onlinelibrary.wiley.com\/doi\/10.1002\/aisy.202500483"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,20]]},"references-count":45,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2026,1]]}},"alternative-id":["10.1002\/aisy.202500483"],"URL":"https:\/\/doi.org\/10.1002\/aisy.202500483","archive":["Portico"],"relation":{},"ISSN":["2640-4567","2640-4567"],"issn-type":[{"value":"2640-4567","type":"print"},{"value":"2640-4567","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,7,20]]},"assertion":[{"value":"2025-04-29","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-07-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}],"article-number":"2500483"}}