{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,15]],"date-time":"2026-06-15T14:12:22Z","timestamp":1781532742869,"version":"3.54.5"},"reference-count":25,"publisher":"Optica Publishing Group","issue":"9","license":[{"start":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T00:00:00Z","timestamp":1774569600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/doi.org\/10.1364\/OA_License_v2#VOR"},{"start":{"date-parts":[[2026,3,27]],"date-time":"2026-03-27T00:00:00Z","timestamp":1774569600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/opg.optica.org\/policies\/opg-tdm-policy.json"}],"funder":[{"DOI":"10.13039\/501100004921","name":"Shanghai Jiao Tong University","doi-asserted-by":"crossref","award":["21TQ1400213"],"award-info":[{"award-number":["21TQ1400213"]}],"id":[{"id":"10.13039\/501100004921","id-type":"DOI","asserted-by":"crossref"},{"id":"https:\/\/ror.org\/0220qvk04","id-type":"ROR","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["62175145"],"award-info":[{"award-number":["62175145"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"},{"id":"https:\/\/ror.org\/01h0zpd94","id-type":"ROR","asserted-by":"publisher"}]}],"content-domain":{"domain":["opg.optica.org"],"crossmark-restriction":false},"short-container-title":["J. Opt. Commun. Netw."],"published-print":{"date-parts":[[2026,9,1]]},"abstract":"<jats:p>The integration of large language models (LLMs) into autonomous optical networks (AONs) promises to revolutionize network management. However, the advancement of this field is currently hindered by the lack of a standardized evaluation framework. To bridge this gap, we introduce AutoONBench v1.0, the inaugural version of a comprehensive benchmark designed to assess LLM-based agentic systems within the optical network domain. AutoONBench constructs an evaluation environment incorporating a field-trial dataset, a neural-network-based digital twin (DT), domain-specific operational tools, and documents. It encompasses five task categories covering the complete network lifecycle: service management, network maintenance, failure handling, physical-layer modeling, and network optimization. We further propose a hybrid evaluation methodology that combines quantitative metrics with an LLM-as-a-judge mechanism to provide a multidimensional performance assessment. Extensive evaluations of modern commercial LLMs reveal several problems. While current agents demonstrate proficiency in following standard operating procedures for routine tasks, they exhibit significant limitations in context-aware execution, complex context retrieval, and numerical analysis. AutoONBench establishes a baseline for future research and facilitates the industrial deployment of agentic AON systems.<\/jats:p>","DOI":"10.1364\/jocn.589201","type":"journal-article","created":{"date-parts":[[2026,2,20]],"date-time":"2026-02-20T14:00:10Z","timestamp":1771596010000},"page":"D1","update-policy":"https:\/\/doi.org\/10.1364\/crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["AutoONBench: a benchmark for large language model agents in autonomous optical networks"],"prefix":"10.1364","volume":"18","author":[{"ORCID":"https:\/\/orcid.org\/0009-0003-2161-9945","authenticated-orcid":true,"given":"Yihao","family":"Zhang","sequence":"first","affiliation":[{"id":[{"id":"https:\/\/ror.org\/0220qvk04","id-type":"ROR","asserted-by":"publisher"}]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Qizhi","family":"Qiu","sequence":"additional","affiliation":[{"id":[{"id":"https:\/\/ror.org\/0220qvk04","id-type":"ROR","asserted-by":"publisher"}]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jiaping","family":"Wu","sequence":"additional","affiliation":[{"id":[{"id":"https:\/\/ror.org\/0220qvk04","id-type":"ROR","asserted-by":"publisher"}]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Xiaomin","family":"Liu","sequence":"additional","affiliation":[{"id":[{"id":"https:\/\/ror.org\/0220qvk04","id-type":"ROR","asserted-by":"publisher"}]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6168-2688","authenticated-orcid":true,"given":"Weisheng","family":"Hu","sequence":"additional","affiliation":[{"id":[{"id":"https:\/\/ror.org\/0220qvk04","id-type":"ROR","asserted-by":"publisher"}]}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2437-0220","authenticated-orcid":true,"given":"Qunbi","family":"Zhuge","sequence":"additional","affiliation":[{"id":[{"id":"https:\/\/ror.org\/0220qvk04","id-type":"ROR","asserted-by":"publisher"}]}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"285","published-online":{"date-parts":[[2026,3,27]]},"reference":[{"key":"jocn-18-9-D1-R1","first-page":"929","article-title":"Check-N-Run: a checkpointing system for training deep learning recommendation models","volume-title":"19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)","author":"Eisenman","year":"2022"},{"key":"jocn-18-9-D1-R2","doi-asserted-by":"publisher","first-page":"C10","DOI":"10.1364\/JOCN.11.000C10","type":"journal-article","volume":"11","author":"Christodoulopoulos","year":"2019","journal-title":"J. Opt. Commun. Netw."},{"key":"jocn-18-9-D1-R3","doi-asserted-by":"publisher","first-page":"A159","DOI":"10.1364\/JOCN.576017","type":"journal-article","volume":"18","author":"Zhang","year":"2026","journal-title":"J. Opt. Commun. Netw."},{"key":"jocn-18-9-D1-R4","volume-title":"Artificial Intelligence: A Modern Approach","author":"Russell","year":"2020"},{"key":"jocn-18-9-D1-R5","first-page":"1595","article-title":"Large language model-driven AI agent in SDN controller towards intent-based management of optical networks","volume-title":"European Conference on Optical Communication (ECOC)","author":"Zhou","year":"2024"},{"key":"jocn-18-9-D1-R6","first-page":"1591","article-title":"Open implementation of a large language model pipeline for automated configuration of software-defined optical networks","volume-title":"European Conference on Optical Communication (ECOC)","author":"Cicco","year":"2024"},{"key":"jocn-18-9-D1-R7","doi-asserted-by":"publisher","first-page":"C82","DOI":"10.1364\/JOCN.550286","type":"journal-article","volume":"17","author":"Sun","year":"2025","journal-title":"J. Opt. Commun. Netw."},{"key":"jocn-18-9-D1-R8","doi-asserted-by":"publisher","first-page":"1116","DOI":"10.1364\/JOCN.527874","type":"journal-article","volume":"16","author":"Pang","year":"2024","journal-title":"J. Opt. Commun. Netw."},{"key":"jocn-18-9-D1-R9","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1364\/JOCN.521913","type":"journal-article","volume":"16","author":"Wang","year":"2024","journal-title":"J. Opt. Commun. Netw."},{"key":"jocn-18-9-D1-R10","first-page":"Th1A.2","article-title":"First field trial of LLM-powered AI agent for lifecycle management of autonomous driving optical networks","volume-title":"Optical Fiber Communication Conference (OFC)","author":"Liu","year":"2025"},{"key":"jocn-18-9-D1-R11","doi-asserted-by":"publisher","first-page":"90","DOI":"10.1109\/MCOM.001.2400342","type":"journal-article","volume":"63","author":"Song","year":"2025","journal-title":"IEEE Commun. Mag."},{"key":"jocn-18-9-D1-R12","first-page":"W.02.01.177","article-title":"First field-trial demonstration of L4 autonomous optical network for distributed AI training communication: an LLM-powered multi-AI-agent solution","volume-title":"European Conference on Optical Communication (ECOC)","author":"Zhang","year":"2025"},{"key":"jocn-18-9-D1-R13","doi-asserted-by":"publisher","author":"Brockman","year":"2016","DOI":"10.48550\/arXiv.1606.01540","type":"preprint"},{"key":"jocn-18-9-D1-R14","article-title":"GLUE: a multi-task benchmark and analysis platform for natural language understanding","volume-title":"International Conference on Learning Representations (ICLR)","author":"Wang","year":"2019"},{"key":"jocn-18-9-D1-R15","article-title":"AgentBench: evaluating LLMs as agents","volume-title":"International Conference on Learning Representations (ICLR)","author":"Liu","year":"2024"},{"key":"jocn-18-9-D1-R16","unstructured":"Zhang Y. , \u201c AutoONBench ,\u201d GitHub ( 2025 ), https:\/\/github.com\/Project-Loong\/AutoONBench ."},{"key":"jocn-18-9-D1-R17","article-title":"ReAct: synergizing reasoning and acting in language models","volume-title":"International Conference on Learning Representations (ICLR)","author":"Yao","year":"2023"},{"key":"jocn-18-9-D1-R18","first-page":"9459","article-title":"Retrieval-augmented generation for knowledge-intensive NLP tasks","volume-title":"Advances in Neural Information Processing Systems (NeurIPS)","volume":"33","author":"Lewis","year":"2020"},{"key":"jocn-18-9-D1-R19","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1016\/S0951-8320(03)00058-9","type":"journal-article","volume":"81","author":"Helton","year":"2003","journal-title":"Reliab. Eng. Syst. Saf."},{"key":"jocn-18-9-D1-R20","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1186\/1471-2288-14-135","type":"journal-article","volume":"14","author":"Wan","year":"2014","journal-title":"BMC Med. Res. Methodol."},{"key":"jocn-18-9-D1-R21","doi-asserted-by":"publisher","first-page":"A44","DOI":"10.1364\/JOCN.572249","type":"journal-article","volume":"18","author":"Qiu","year":"2026","journal-title":"J. Opt. Commun. Netw."},{"key":"jocn-18-9-D1-R22","doi-asserted-by":"crossref","first-page":"2757","DOI":"10.18653\/v1\/2025.emnlp-main.138","article-title":"From generation to judgment: opportunities and challenges of LLM-as-a-judge","volume-title":"Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (Association for Computational Linguistics)","author":"Li","year":"2025"},{"key":"jocn-18-9-D1-R23","first-page":"46595","article-title":"Judging LLM-as-a-judge with MT-bench and Chatbot Arena","volume-title":"Advances in Neural Information Processing Systems (NeurIPS)","volume":"36","author":"Zheng","year":"2023"},{"key":"jocn-18-9-D1-R25","doi-asserted-by":"publisher","first-page":"84","DOI":"10.1038\/s41524-025-01564-y","type":"journal-article","volume":"11","author":"Lu","year":"2025","journal-title":"Npj Comput. Mater."},{"key":"jocn-18-9-D1-R26","doi-asserted-by":"publisher","author":"Gao","year":"2025","DOI":"10.48550\/arXiv.2504.15909","type":"preprint"}],"container-title":["Journal of Optical Communications and Networking"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/opg.optica.org\/viewmedia.cfm?URI=jocn-18-9-D1&seq=0","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,13]],"date-time":"2026-05-13T13:59:43Z","timestamp":1778680783000},"score":1,"resource":{"primary":{"URL":"https:\/\/opg.optica.org\/abstract.cfm?URI=jocn-18-9-D1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3,27]]},"references-count":25,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2026]]},"published-print":{"date-parts":[[2026]]}},"URL":"https:\/\/doi.org\/10.1364\/jocn.589201","relation":{},"ISSN":["1943-0620","1943-0639"],"issn-type":[{"value":"1943-0620","type":"print"},{"value":"1943-0639","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3,27]]},"assertion":[{"value":"Optica Publishing Group","name":"publisher","label":"This article is maintained by"},{"value":"https:\/\/doi.org\/10.1364\/JOCN.589201","name":"articlelink","label":"Crossref DOI link to publisher maintained version"},{"value":"research-article","name":"content_type","label":"Article type"},{"value":"Screened by Similarity Check","name":"cross_check","label":"Similarity check"},{"value":"Yes","order":0,"name":"peer_reviewed","label":"Peer reviewed","group":{"name":"peer_review","label":"Peer review"}},{"value":"Single blind","order":1,"name":"review_process","label":"Review process","group":{"name":"peer_review","label":"Peer review"}},{"value":"5 January 2026","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"19 February 2026","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"27 March 2026","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}},{"value":"\u00a9 2026 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.","name":"copyright","label":"Copyright"}]}}