{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T17:11:07Z","timestamp":1770052267676,"version":"3.49.0"},"reference-count":41,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T00:00:00Z","timestamp":1769990400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"<jats:sec>\n                    <jats:title>Introduction<\/jats:title>\n                    <jats:p>Multi-agent\/ensemble approaches can improve discrete-choice reasoning with large language models, but common orchestration methods are often non-deterministic, expensive, and difficult to reproduce. We propose ORCH, a deterministic multi-agent orchestrator that targets higher accuracy and better cost\u2013performance via stable routing.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Methods<\/jats:title>\n                    <jats:p>ORCH uses a pool of heterogeneous LLM agents and a deterministic routing mechanism based on exponential moving average (EMA) performance tracking. For each question, ORCH selects a small subset of agents, obtains candidate answers, and merges them through a controlled aggregation procedure. We evaluate ORCH on multiple discrete-choice benchmarks and compare against single-model baselines and non-routed ensemble strategies under consistent prompting and scoring.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>ORCH delivers consistent accuracy improvements over the best low-cost single model and provides additional gains over high-cost single-model baselines on several tasks, while reducing reliance on always-invoking expensive models. The deterministic routing and merge pipeline improves stability across runs.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Discussion<\/jats:title>\n                    <jats:p>ORCH demonstrates that deterministic EMA-guided routing can offer a practical and reproducible orchestration strategy for discrete-choice reasoning. This framework can be extended to additional tasks, agent pools, and preference-aware routing policies in future work.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.3389\/frai.2026.1748735","type":"journal-article","created":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T06:31:17Z","timestamp":1770013877000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["ORCH: many analyses, one merge\u2014a deterministic multi-agent orchestrator for discrete-choice reasoning with EMA-guided routing"],"prefix":"10.3389","volume":"9","author":[{"given":"Hanlin","family":"Zhou","sequence":"first","affiliation":[{"name":"School of Computer Sciences, Universiti Sains Malaysia","place":["Gelugor, Malaysia"]},{"name":"Xiamen Institute of Software Technology","place":["Xiamen, China"]}]},{"given":"Huah Yong","family":"Chan","sequence":"additional","affiliation":[{"name":"School of Computer Sciences, Universiti Sains Malaysia","place":["Gelugor, Malaysia"]}]}],"member":"1965","published-online":{"date-parts":[[2026,2,2]]},"reference":[{"key":"ref1","author":"Amirhossein","year":"2023"},{"key":"ref2","article-title":"PBFT-backed semantic voting for multi-agent memory pruning","author":"Bach","year":"2025"},{"key":"ref3","doi-asserted-by":"crossref","DOI":"10.1109\/NOMS47738.2020.9110428","article-title":"Adaptive scaling of Kubernetes pods","author":"Balla","year":"2020"},{"key":"ref4","first-page":"11","article-title":"Emergent language-based coordination in deep multi-agent systems","author":"Baroni","year":"2022"},{"key":"ref5","article-title":"Think you have solved direct-answer question answering? Try arc-da, the direct-answer AI2 reasoning challenge","author":"Bhakthavatsalam","year":"2021"},{"key":"ref6","doi-asserted-by":"publisher","first-page":"4948","DOI":"10.3390\/app11114948","article-title":"Multi-agent reinforcement learning: a review of challenges and applications","volume":"11","author":"Canese","year":"2021","journal-title":"Appl. Sci."},{"key":"ref7","article-title":"Universal self-consistency for large language model generation","author":"Chen","year":"2023"},{"key":"ref8","article-title":"Agentverse: facilitating multi-agent collaboration and exploring emergent behaviors. The twelfth international conference on learning representations","author":"Chen","year":"2023"},{"key":"ref9","doi-asserted-by":"crossref","DOI":"10.4018\/979-8-3693-1552-1.ch001","article-title":"Machine learning and deep learning algorithms for green computing","volume-title":"Advances in computational intelligence and robotics book series","author":"Choudhury","year":"2024"},{"key":"ref10","doi-asserted-by":"publisher","first-page":"e74","DOI":"10.1017\/s0266462322000551","article-title":"Model for assessing the value of artificial intelligence in medical imaging (MAS-AI)","volume":"38","author":"Fasterholdt","year":"2022","journal-title":"Int. J. Technol. Assess. Health Care"},{"key":"ref11","doi-asserted-by":"publisher","first-page":"681","DOI":"10.1007\/s11023-020-09548-1","article-title":"GPT-3: its nature, scope, limits, and consequences","volume":"30","author":"Floridi","year":"2020","journal-title":"Minds Mach."},{"key":"ref12","article-title":"Agentscope: a flexible yet robust multi-agent platform","author":"Gao","year":"2024"},{"key":"ref13","article-title":"Metagpt: meta programming for a multi-agent collaborative framework","author":"Hong","year":"2023"},{"key":"ref14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3708806","article-title":"LLM-based multi-agent systems for software engineering: literature review, vision and the road ahead","volume":"34","author":"Junda","year":"2025","journal-title":"ACM Trans. Softw. Eng. Methodol."},{"key":"ref15","doi-asserted-by":"publisher","first-page":"858","DOI":"10.1093\/postmj\/qgae065","article-title":"ChatGPT prompts for generating multiple-choice questions in medical education and evidence on their validity: a literature review","volume":"100","author":"K\u0131yak","year":"2024","journal-title":"Postgrad. Med. J."},{"key":"ref16","first-page":"346","article-title":"XAI: esclarecendo o problema da caixa preta com transpar\u00eancia e interpretabilidade","volume":"40","author":"Levy","year":"2024","journal-title":"Rev. Terra Cult.: Cad. Ensino Pesqui."},{"key":"ref17","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2025.findings-acl.472","article-title":"HellaSwag-Pro: a large-scale bilingual benchmark for evaluating the robustness of LLMs in commonsense reasoning","author":"Li","year":"2025"},{"key":"ref18","author":"Liu","year":"2025"},{"key":"ref19","first-page":"13390","article-title":"Advances in neural information processing systems","volume-title":"Adv Neural Inf Process Syst.","author":"Lyu","year":"2019"},{"key":"ref20","article-title":"Self-adaptive large language model (LLM)-based multiagent systems","author":"Nathalia","year":"2023"},{"key":"ref21","article-title":"ROUTELLM: learning to route LLMs with preference data","author":"Ong","year":"2024"},{"key":"ref22","first-page":"27730","article-title":"Training language models to follow instructions with human feedback","volume-title":"Adv Neural Inf Process Syst.","author":"Ouyang","year":"2022"},{"key":"ref23","article-title":"Reasoning capacity in multi-agent systems: limitations, challenges and human-centered solutions","author":"Pezeshkpour","year":"2024"},{"key":"ref24","first-page":"15174","article-title":"Chatdev: communicative agents for software development","author":"Qian","year":"2024"},{"key":"ref25","doi-asserted-by":"publisher","first-page":"328","DOI":"10.1007\/978-3-031-21689-3_24","article-title":"Detecting malicious http requests without log parser using RequestBERT-BiLSTM","volume":"13654","author":"Ramos","year":"2022","journal-title":"Intell. Syst."},{"key":"ref26","doi-asserted-by":"publisher","first-page":"103858","DOI":"10.1016\/j.jnca.2024.103858","article-title":"AI-enhanced blockchain technology: a review of advancements and opportunities","volume":"225","author":"Ressi","year":"2024","journal-title":"J. Netw. Comput. Appl."},{"key":"ref27","doi-asserted-by":"crossref","DOI":"10.1145\/3603166.3632165","article-title":"Machine learning for predictive resource scaling of microservices on Kubernetes platforms","author":"Rubak","year":"2023"},{"key":"ref28","article-title":"Multi-agent collaboration: harnessing the power of intelligent LLM agents","author":"Talebirad","year":"2023"},{"key":"ref29","article-title":"LLM merging: building LLMs efficiently through merging","author":"Tam","year":"2024"},{"key":"ref30","first-page":"413","article-title":"Scientific machine learning benchmarks","author":"Thiyagalingam","year":"2022"},{"key":"ref31","first-page":"95266","article-title":"MMLU-Pro: a more robust and challenging multi-task language understanding benchmark","volume-title":"Adv Neural Inf Process Syst.","author":"Wang","year":"2024"},{"key":"ref32","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2402.11795","article-title":"Megaagent: a practical framework for autonomous cooperation in large-scale LLM agent systems","author":"Wang","year":"2024","journal-title":"arXiv"},{"key":"ref33","article-title":"Mas2: self-generative, self-configuring, self-rectifying multi-agent systems","author":"Wang","year":"2025"},{"key":"ref34","article-title":"Large language models are biased reinforcement learners","author":"William","year":"2024"},{"key":"ref35","doi-asserted-by":"crossref","DOI":"10.1109\/ETFA61755.2024.10710900","article-title":"LLM experiments with simulation: large language model multi-agent system for process simulation parametrization in digital twins","author":"Xia","year":"2024"},{"key":"ref36","doi-asserted-by":"crossref","DOI":"10.1109\/ETFA61755.2024.10710900","article-title":"LLM experiments with simulation: large language model multi-agent system for simulation model parametrization in digital twins","author":"Xia","year":"2024"},{"key":"ref37","first-page":"114872","article-title":"Calibrating reasoning in language models with internal consistency","author":"Xie","year":"2024"},{"key":"ref38","doi-asserted-by":"crossref","DOI":"10.18653\/v1\/2025.emnlp-main.79","article-title":"MMLU-Prox: a multilingual benchmark for advanced large language model evaluation","author":"Xuan","year":"2025"},{"key":"ref39","article-title":"Auto-GPT for online decision making: benchmarks and additional opinions","author":"Yang","year":"2023"},{"key":"ref40","first-page":"13371","article-title":"MMLU-CF: a contamination-free multi-task language understanding benchmark","author":"Zhao","year":"2025"},{"key":"ref41","first-page":"1","article-title":"Achieving >97% on GSM8K: deeply understanding the problems makes LLMs better solvers for math word problems.","volume-title":"Front. Comput. Sci","author":"Zhong","year":"2026"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2026.1748735\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,2,2]],"date-time":"2026-02-02T06:31:20Z","timestamp":1770013880000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2026.1748735\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,2,2]]},"references-count":41,"alternative-id":["10.3389\/frai.2026.1748735"],"URL":"https:\/\/doi.org\/10.3389\/frai.2026.1748735","relation":{},"ISSN":["2624-8212"],"issn-type":[{"value":"2624-8212","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,2,2]]},"article-number":"1748735"}}