{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T19:14:23Z","timestamp":1773342863750,"version":"3.50.1"},"reference-count":42,"publisher":"Verein zur Forderung des Open Access Publizierens in den Quantenwissenschaften","license":[{"start":{"date-parts":[[2025,5,28]],"date-time":"2025-05-28T00:00:00Z","timestamp":1748390400000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100011033\/feder,","name":"Ministry of Science, Innovation & Universities \/ Spanish Research Agency \/ European Regional Development Fund of the EU","doi-asserted-by":"publisher","award":["CPP2023-010459"],"award-info":[{"award-number":["CPP2023-010459"]}],"id":[{"id":"10.13039\/501100011033\/feder,","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003030","name":"Ag\u00e8ncia de Gesti\u00f3 d\u2019Ajuts Universitaris i de Recerca","doi-asserted-by":"crossref","award":["020-DI00063"],"award-info":[{"award-number":["020-DI00063"]}],"id":[{"id":"10.13039\/501100003030","id-type":"DOI","asserted-by":"crossref"}]},{"name":"European Union\u2019s Horizon 2020 research and innovation programme","award":["951911 (AI4Media)"],"award-info":[{"award-number":["951911 (AI4Media)"]}]}],"content-domain":{"domain":["quantum-journal.org"],"crossmark-restriction":false},"short-container-title":["Quantum"],"abstract":"<jats:p>We propose a novel Reinforcement Learning (RL) method for optimizing quantum circuits using graph-theoretic simplification rules of ZX-diagrams. The agent, trained using the Proximal Policy Optimization (PPO) algorithm, employs Graph Neural Networks to approximate the policy and value functions. We demonstrate the capacity of our approach by comparing it against the best performing ZX-Calculus-based algorithm for the problem in hand. 
After training on small Clifford+T circuits of 5 qubits and a few tens of gates, the agent consistently improves the state-of-the-art for this type of circuits, for at least up to 80 qubits and 2100 gates, whilst remaining competitive in terms of computational performance. Additionally, we illustrate the versatility of the agent by incorporating additional optimization routines into the workflow during training, improving the two-qubit gate count state-of-the-art on multiple structured quantum circuits for relevant applications of much larger dimension and different gate distributions than the circuits the agent trains on. This conveys the potential of tailoring the reward function to the specific characteristics of each application and hardware backend. Our approach is a valuable tool for the implementation of quantum algorithms in the near-term intermediate-scale range (NISQ).<\/jats:p>","DOI":"10.22331\/q-2025-05-28-1758","type":"journal-article","created":{"date-parts":[[2025,5,28]],"date-time":"2025-05-28T16:44:26Z","timestamp":1748450666000},"page":"1758","update-policy":"https:\/\/doi.org\/10.22331\/q-crossmark-policy-page","source":"Crossref","is-referenced-by-count":9,"title":["Reinforcement Learning Based Quantum Circuit Optimization via ZX-Calculus"],"prefix":"10.22331","volume":"9","author":[{"given":"Jordi","family":"Riu","sequence":"first","affiliation":[{"name":"Qilimanjaro Quantum Tech, Carrer de Vene\u00e7uela, 74, Sant Mart\u00ed, 08019, Barcelona, Spain"},{"name":"Universitat Polit\u00e8cnica de Catalunya, Carrer de Jordi Girona, 3, 08034 Barcelona, Spain"}]},{"given":"Jan","family":"Nogu\u00e9","sequence":"additional","affiliation":[{"name":"Qilimanjaro Quantum Tech, Carrer de Vene\u00e7uela, 74, Sant Mart\u00ed, 08019, Barcelona, Spain"},{"name":"Universitat Polit\u00e8cnica de Catalunya, Carrer de Jordi Girona, 3, 08034 Barcelona, Spain"}]},{"given":"Gerard","family":"Vilaplana","sequence":"additional","affiliation":[{"name":"Qilimanjaro Quantum 
Tech, Carrer de Vene\u00e7uela, 74, Sant Mart\u00ed, 08019, Barcelona, Spain"}]},{"given":"Artur","family":"Garcia-Saez","sequence":"additional","affiliation":[{"name":"Qilimanjaro Quantum Tech, Carrer de Vene\u00e7uela, 74, Sant Mart\u00ed, 08019, Barcelona, Spain"},{"name":"Barcelona Supercomputing Center, Pla\u00e7a Eusebi G\u00fcell, 1-3, 08034 Barcelona, Spain"}]},{"given":"Marta P.","family":"Estarellas","sequence":"additional","affiliation":[{"name":"Qilimanjaro Quantum Tech, Carrer de Vene\u00e7uela, 74, Sant Mart\u00ed, 08019, Barcelona, Spain"}]}],"member":"9598","published-online":{"date-parts":[[2025,5,28]]},"reference":[{"key":"0","doi-asserted-by":"publisher","unstructured":"John Preskill. ``Quantum computing in the NISQ era and beyond&apos;&apos;. Quantum 2, 79 (2018).","DOI":"10.22331\/q-2018-08-06-79"},{"key":"1","doi-asserted-by":"publisher","unstructured":"Beatrice Nash, Vlad Gheorghiu, and Michele Mosca. ``Quantum circuit optimizations for nisq architectures&apos;&apos;. Quantum Science and Technology 5, 025010 (2020).","DOI":"10.1088\/2058-9565\/ab79b1"},{"key":"2","doi-asserted-by":"publisher","unstructured":"Scott Aaronson and Daniel Gottesman. ``Improved simulation of stabilizer circuits&apos;&apos;. Phys. Rev. A 70, 052328 (2004).","DOI":"10.1103\/PhysRevA.70.052328"},{"key":"3","doi-asserted-by":"publisher","unstructured":"Vadym Kliuchnikov and Dmitri Maslov. ``Optimization of clifford circuits&apos;&apos;. Phys. Rev. A 88, 052307 (2013).","DOI":"10.1103\/PhysRevA.88.052307"},{"key":"4","unstructured":"Richard S. Sutton and Andrew G. Barto. ``Reinforcement learning: An introduction&apos;&apos;. The MIT Press. (2018). Second edition. url: http:\/\/incompleteideas.net\/book\/the-book-2nd.html."},{"key":"5","unstructured":"Thomas F\u00f6sel, Murphy Yuezhen Niu, Florian Marquardt, and Li Li. ``Quantum circuit optimization with deep reinforcement learning&apos;&apos; (2021). 
arXiv:2103.07585."},{"key":"6","unstructured":"Zikun Li, Jinjun Peng, Yixuan Mei, Sina Lin, Yi Wu, Oded Padon, and Zhihao Jia. ``Quarl: A learning-based quantum circuit optimizer&apos;&apos; (2023). arXiv:2307.10120."},{"key":"7","doi-asserted-by":"publisher","unstructured":"Bob Coecke and Ross Duncan. ``Interacting quantum observables: categorical algebra and diagrammatics&apos;&apos;. New Journal of Physics 13, 043016 (2011).","DOI":"10.1088\/1367-2630\/13\/4\/043016"},{"key":"8","unstructured":"John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. ``Proximal policy optimization algorithms&apos;&apos; (2017). arXiv:1707.06347."},{"key":"9","unstructured":"Jan Nogu\u00e9. ``Reinforcement Learning based Circuit Compilation via ZX-calculus&apos;&apos;. Master&apos;s thesis. Universitat de Barcelona. (2023). url: https:\/\/diposit.ub.edu\/dspace\/bitstream\/2445\/202911\/1\/Memoria_TFM-JanNogue.pdf."},{"key":"10","doi-asserted-by":"publisher","unstructured":"Daniel Gottesman. ``Theory of fault-tolerant quantum computation&apos;&apos;. Phys. Rev. A 57, 127\u2013137 (1998).","DOI":"10.1103\/PhysRevA.57.127"},{"key":"11","unstructured":"Daniel Gottesman. ``The heisenberg representation of quantum computers&apos;&apos; (1998). arXiv:quant-ph\/9807006."},{"key":"12","doi-asserted-by":"publisher","unstructured":"Aleks Kissinger and John van de Wetering. ``Pyzx: Large scale automated diagrammatic reasoning&apos;&apos;. Electronic Proceedings in Theoretical Computer Science 318, 229\u2013241 (2020).","DOI":"10.4204\/eptcs.318.14"},{"key":"13","doi-asserted-by":"publisher","unstructured":"Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. ``A comprehensive survey on graph neural networks&apos;&apos;. IEEE Transactions on Neural Networks and Learning Systems 32, 4\u201324 (2021).","DOI":"10.1109\/tnnls.2020.2978386"},{"key":"14","doi-asserted-by":"publisher","unstructured":"Bob Coecke and Aleks Kissinger. 
``Picturing quantum processes: A first course in quantum theory and diagrammatic reasoning&apos;&apos;. Cambridge University Press. (2017).","DOI":"10.1017\/9781316219317"},{"key":"15","unstructured":"John van de Wetering. ``Zx-calculus for the working quantum computer scientist&apos;&apos; (2020). arXiv:2012.13966."},{"key":"16","doi-asserted-by":"publisher","unstructured":"Ross Duncan, Aleks Kissinger, Simon Perdrix, and John van de Wetering. ``Graph-theoretic simplification of quantum circuits with the ZX-calculus&apos;&apos;. Quantum 4, 279 (2020).","DOI":"10.22331\/q-2020-06-04-279"},{"key":"17","doi-asserted-by":"publisher","unstructured":"Niel de Beaudrap, Aleks Kissinger, and John van de Wetering. ``Circuit extraction for zx-diagrams can be p-hard&apos;&apos;. In Schloss Dagstuhl \u2013 Leibniz-Zentrum f\u00fcr Informatik. (2022). arXiv:2202.09194.","DOI":"10.4230\/LIPIcs.ICALP.2022.119"},{"key":"18","unstructured":"Calum Holker. ``Causal flow preserving optimisation of quantum circuits in the zx-calculus&apos;&apos; (2024). arXiv:2312.02793."},{"key":"19","doi-asserted-by":"crossref","unstructured":"Korbinian Staudacher, Tobias Guggemos, Wolfgang Gehrke, and Sophia Grundner-Culemann. ``Reducing 2-qubit gate count for zx-calculus based quantum circuit optimization&apos;&apos;. In Quantum Processing and Languages (QPL22). Pages 1\u201317. (2022). url: https:\/\/elib.dlr.de\/188470\/.","DOI":"10.4204\/EPTCS.394.3"},{"key":"20","unstructured":"Anton Kotzig. ``Eulerian lines in finite 4-valent graphs and their transformations&apos;&apos;. In Colloquium on Graph Theory Tihany 1966. Pages 219\u2013230. (Academic Press, 1968)."},{"key":"21","doi-asserted-by":"crossref","unstructured":"Andr\u00e9 Bouchet. ``Graphic presentations of isotropic systems&apos;&apos;. J. Comb. Theory Ser. 
A 45, 58\u201376 (1987).","DOI":"10.1016\/0095-8956(88)90055-X"},{"key":"22","doi-asserted-by":"publisher","unstructured":"Daniel E Browne, Elham Kashefi, Mehdi Mhalla, and Simon Perdrix. ``Generalized flow and determinism in measurement-based quantum computation&apos;&apos;. New Journal of Physics 9, 250 (2007).","DOI":"10.1088\/1367-2630\/9\/8\/250"},{"key":"23","doi-asserted-by":"publisher","unstructured":"Robert Raussendorf, Daniel Browne, and Hans Briegel. ``The one-way quantum computer\u2013a non-network model of quantum computation&apos;&apos;. Journal of Modern Optics 49, 1299\u20131306 (2002).","DOI":"10.1080\/09500340110107487"},{"key":"24","doi-asserted-by":"publisher","unstructured":"Miriam Backens, Hector Miller-Bakewell, Giovanni de Felice, Leo Lobski, and John van de Wetering. ``There and back again: A circuit extraction tale&apos;&apos;. Quantum 5, 421 (2021).","DOI":"10.22331\/q-2021-03-25-421"},{"key":"25","doi-asserted-by":"publisher","unstructured":"Aleks Kissinger and John van de Wetering. ``Reducing the number of non-clifford gates in quantum circuits&apos;&apos;. Phys. Rev. A 102, 022406 (2020).","DOI":"10.1103\/PhysRevA.102.022406"},{"key":"26","doi-asserted-by":"publisher","unstructured":"Vincent Danos and Elham Kashefi. ``Determinism in the one-way model&apos;&apos;. Phys. Rev. A 74, 052310 (2006).","DOI":"10.1103\/PhysRevA.74.052310"},{"key":"27","doi-asserted-by":"publisher","unstructured":"Mehdi Mhalla and Simon Perdrix. ``Finding optimal flows efficiently&apos;&apos;. Page 857\u2013868. Springer Berlin Heidelberg. (2008).","DOI":"10.1007\/978-3-540-70575-8_70"},{"key":"28","unstructured":"John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, and Pieter Abbeel. ``High-dimensional continuous control using generalized advantage estimation&apos;&apos; (2018). arXiv:1506.02438."},{"key":"29","unstructured":"Petar Veli\u010dkovi\u0107, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Li\u00f2, and Yoshua Bengio. 
``Graph attention networks&apos;&apos; (2018). arXiv:1710.10903."},{"key":"30","unstructured":"Shaked Brody, Uri Alon, and Eran Yahav. ``How attentive are graph attention networks?&apos;&apos; (2022). arXiv:2105.14491."},{"key":"31","unstructured":"Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas K\u00f6pf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. ``Pytorch: An imperative style, high-performance deep learning library&apos;&apos; (2019). arXiv:1912.01703."},{"key":"32","unstructured":"Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. ``Openai gym&apos;&apos; (2016) arXiv:1606.01540."},{"key":"33","doi-asserted-by":"publisher","unstructured":"Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, and Pushmeet Kohli. ``Graph matching networks for learning the similarity of graph structured objects&apos;&apos;. In International conference on machine learning. Pages 3835\u20133845. PMLR (2019).","DOI":"10.48550\/arXiv.1904.12787"},{"key":"34","unstructured":"Neil J Ross. code: njross\/optimizer."},{"key":"35","doi-asserted-by":"publisher","unstructured":"Yunseong Nam, Neil J. Ross, Yuan Su, Andrew M. Childs, and Dmitri Maslov. ``Automated optimization of large quantum circuits with continuous parameters&apos;&apos;. npj Quantum Information 4 (2018).","DOI":"10.1038\/s41534-018-0072-4"},{"key":"36","doi-asserted-by":"publisher","unstructured":"Palak Chawla, Shweta, K. R. Swain, Tushti Patel, Renu Bala, Disha Shetty, Kenji Sugisaki, Sudhindu Bikash Mandal, Jordi Riu, Jan Nogu\u00e9, V. S. Prasannaa, and B. P. Das. ``Relativistic variational-quantum-eigensolver calculations of molecular electric dipole moments on quantum hardware&apos;&apos;. Phys. Rev. 
A 111, 022817 (2025).","DOI":"10.1103\/PhysRevA.111.022817"},{"key":"37","unstructured":"Palak Chawla, Disha Shetty, Peniel Bertrand Tsemo, Kenji Sugisaki, Jordi Riu, Jan Nogue, Debashis Mukherjee, and V. S. Prasannaa. ``Vqe calculations on a nisq era trapped ion quantum computer using a multireference unitary coupled cluster ansatz: application to the beh$_2$ insertion problem&apos;&apos; (2025). arXiv:2504.07037."},{"key":"38","unstructured":"Aashna Anil Zade, Kenji Sugisaki, Matthias Werner, Ana Palacios, Jordi Riu, Jan Nogue, Artur Garcia-Saez, Arnau Riera, and V. S. Prasannaa. ``Capturing strong correlation effects on a quantum annealer: calculation of avoided crossing in the h$_4$ molecule using the quantum annealer eigensolver&apos;&apos; (2025). arXiv:2412.20464."},{"key":"39","doi-asserted-by":"publisher","unstructured":"Maximilian N\u00e4gele and Florian Marquardt. ``Optimizing zx-diagrams with deep reinforcement learning&apos;&apos; (2023). arXiv:2311.18588.","DOI":"10.1088\/2632-2153\/ad76f7"},{"key":"40","unstructured":"Francois Charton, Alexandre Krajenbrink, Konstantinos Meichanetzidis, and Richie Yeung. ``Teaching small transformers to rewrite ZX diagrams&apos;&apos;. In The 3rd Workshop on Mathematical Reasoning and AI at NeurIPS&apos;23. (2023). url: https:\/\/openreview.net\/forum?id=btQ7Bt1NLF."},{"key":"41","unstructured":"Jordi Riu and Jan Nogu\u00e9. ``Code for quantum circuit optimization with rl via zx-calculus&apos;&apos;. Github Repository (2023). 
url: https:\/\/github.com\/qilimanjaro-tech\/Circopt-RL-ZXCalc."}],"container-title":["Quantum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/quantum-journal.org\/papers\/q-2025-05-28-1758\/pdf\/","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"}],"deposited":{"date-parts":[[2025,5,28]],"date-time":"2025-05-28T16:44:30Z","timestamp":1748450670000},"score":1,"resource":{"primary":{"URL":"https:\/\/quantum-journal.org\/papers\/q-2025-05-28-1758\/"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,5,28]]},"references-count":42,"URL":"https:\/\/doi.org\/10.22331\/q-2025-05-28-1758","archive":["CLOCKSS"],"relation":{},"ISSN":["2521-327X"],"issn-type":[{"value":"2521-327X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,5,28]]},"article-number":"1758"}}