{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,30]],"date-time":"2026-04-30T14:39:48Z","timestamp":1777559988910,"version":"3.51.4"},"reference-count":34,"publisher":"SAGE Publications","issue":"1","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AIC"],"published-print":{"date-parts":[[2021,2,15]]},
"abstract":"<jats:p>Reinforcement learning is an efficient, widely used machine learning technique that performs well when the state and action spaces have a reasonable size. This is rarely the case regarding control-related problems, as for instance controlling traffic signals. Here, the state space can be very large. In order to deal with the curse of dimensionality, a rough discretization of such space can be employed. However, this is effective just up to a certain point. A way to mitigate this is to use techniques that generalize the state space such as function approximation. In this paper, a linear function approximation is used. Specifically, SARSA ( \u03bb ) with Fourier basis features is implemented to control traffic signals in the agent-based transport simulation MATSim. The results are compared not only to trivial controllers such as fixed-time, but also to state-of-the-art rule-based adaptive methods. It is concluded that SARSA ( \u03bb ) with Fourier basis features is able to outperform such methods, especially in scenarios with varying traffic demands or unexpected events.<\/jats:p>",
"DOI":"10.3233\/aic-201580","type":"journal-article","created":{"date-parts":[[2021,1,5]],"date-time":"2021-01-05T12:19:02Z","timestamp":1609849142000},"page":"89-103","source":"Crossref","is-referenced-by-count":6,"title":["Reinforcement learning vs.\u00a0rule-based adaptive traffic signal control: A Fourier basis linear function approximation for traffic signal control"],"prefix":"10.1177","volume":"34",
"author":[{"given":"Theresa","family":"Ziemke","sequence":"first","affiliation":[{"name":"Transport Systems Planning and Transport Telematics, Technische Universit\u00e4t Berlin, Germany. E-mail:\u00a0tziemke@vsp.tu-berlin.de"}]},{"given":"Lucas N.","family":"Alegre","sequence":"additional","affiliation":[{"name":"Instituto da Inform\u00e1tica, Universidade Federal do Rio Grande do Sul (UFRGS), Brazil. E-mails:\u00a0lnalegre@inf.ufrgs.br,\u00a0bazzan@inf.ufrgs.br"}]},{"given":"Ana L.C.","family":"Bazzan","sequence":"additional","affiliation":[{"name":"Instituto da Inform\u00e1tica, Universidade Federal do Rio Grande do Sul (UFRGS), Brazil. E-mails:\u00a0lnalegre@inf.ufrgs.br,\u00a0bazzan@inf.ufrgs.br"}]}],"member":"179",
"reference":[{"issue":"2","key":"10.3233\/AIC-201580_ref1","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1007\/s10489-013-0455-3","article-title":"Hierarchical control of traffic signals using Q-learning with tile coding","volume":"40","author":"Abdoos","year":"2014","journal-title":"Appl. Intell."},
{"key":"10.3233\/AIC-201580_ref3","doi-asserted-by":"publisher","DOI":"10.1016\/B978-1-55860-377-6.50013-X"},
{"issue":"3","key":"10.3233\/AIC-201580_ref4","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1007\/s10458-008-9062-9","article-title":"Opportunities for multiagent systems and multiagent reinforcement learning in traffic control","volume":"18","author":"Bazzan","year":"2009","journal-title":"Autonomous Agents and Multiagent Systems"},
{"issue":"1","key":"10.3233\/AIC-201580_ref5","doi-asserted-by":"publisher","first-page":"120","DOI":"10.1016\/j.trc.2009.04.022","article-title":"Multi-agent model predictive control of signaling split in urban traffic networks","volume":"18","author":"de\u00a0Oliveira","year":"2010","journal-title":"Transportation Research Part C: Emerging Technologies"},
{"key":"10.3233\/AIC-201580_ref6","doi-asserted-by":"crossref","unstructured":"M.\u00a0Di Taranto, UTOPIA, in: Proc. of the IFAC-IFIP-IFORS Conference on Control, Computers, Communication in Transportation, International Federation of Automatic Control, Paris, 1989, pp.\u00a0245\u2013252.","DOI":"10.1016\/B978-0-08-037025-5.50042-6"},
{"issue":"2","key":"10.3233\/AIC-201580_ref7","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1016\/S0967-0661(01)00121-6","article-title":"A multivariable regulator approach to traffic-responsive network-wide signal control","volume":"10","author":"Diakaki","year":"2002","journal-title":"Control Engineering Practice"},
{"key":"10.3233\/AIC-201580_ref8","unstructured":"B.\u00a0Friedrich, Adaptive signal control\u00a0\u2013 an overview, in: Proc. of the 9th Meeting of the Euro Working Group Transportation, Bari, Italy, 2002."},
{"key":"10.3233\/AIC-201580_ref9","first-page":"75","article-title":"OPAC\u00a0\u2013 a demand-responsive strategy for traffic signal control","volume":"906","author":"Gartner","year":"1983","journal-title":"Transportation Research Record"},
{"key":"10.3233\/AIC-201580_ref10","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1016\/j.procs.2018.04.008","article-title":"Evaluating reinforcement learning state representations for adaptive traffic signal control","volume":"130","author":"Genders","year":"2018","journal-title":"Procedia Computer Science"},
{"key":"10.3233\/AIC-201580_ref11","unstructured":"D.\u00a0Grether, J.\u00a0Bischoff and K.\u00a0Nagel, Traffic-actuated signal control: Simulation of the user benefits in a big event real-world scenario, in: 2nd International Conference on Models and Technologies for ITS, Leuven, Belgium, 2011."},
{"key":"10.3233\/AIC-201580_ref12","doi-asserted-by":"publisher","DOI":"10.5334\/baw"},
{"key":"10.3233\/AIC-201580_ref14","unstructured":"R.\u00a0Grunitzki, B.C.\u00a0da Silva and A.L.C.\u00a0Bazzan, A flexible approach for designing optimal reward functions, in: Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2017), S.\u00a0Das, E.\u00a0Durfee, K.\u00a0Larson and M.\u00a0Winikoff, eds, IFAAMAS, S\u00e3o Paulo, 2017, pp.\u00a01559\u20131560, http:\/\/ifaamas.org\/Proceedings\/aamas2017\/pdfs\/p1559.pdf."},
{"key":"10.3233\/AIC-201580_ref15","doi-asserted-by":"crossref","unstructured":"R.\u00a0Grunitzki, B.C.\u00a0da Silva and A.L.C.\u00a0Bazzan, Towards designing optimal reward functions in multi-agent reinforcement learning problems, in: Proc. of the 2018 International Joint Conference on Neural Networks (IJCNN 2018), Rio de Janeiro, 2018.","DOI":"10.1109\/IJCNN.2018.8489029"},
{"key":"10.3233\/AIC-201580_ref16","doi-asserted-by":"crossref","unstructured":"J.\u00a0Henry, J.L.\u00a0Farges and J.\u00a0Tuffal, The PRODYN real time traffic algorithm, in: Proceedings of the Int. Fed. of Aut. Control, I.F.A.C.\u00a0Conf and R.\u00a0Isermann, eds, IFAC, Baden-Baden, 1983, pp.\u00a0307\u2013312.","DOI":"10.1016\/S1474-6670(17)62577-1"},
{"key":"10.3233\/AIC-201580_ref17","doi-asserted-by":"publisher","DOI":"10.5334\/baw"},
{"key":"10.3233\/AIC-201580_ref18","unstructured":"P.B.\u00a0Hunt, D.I.\u00a0Robertson, R.D.\u00a0Bretherton and R.I.\u00a0Winton, SCOOT\u00a0\u2013 a traffic responsive method of coordinating signals, in: TRRL Laboratory Report, 1014, TRRL, Crowthorne, Berkshire, UK, 1981."},
{"key":"10.3233\/AIC-201580_ref19","doi-asserted-by":"crossref","unstructured":"G.\u00a0Konidaris, S.\u00a0Osentoski and P.\u00a0Thomas, Value function approximation in reinforcement learning using the Fourier basis, in: Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, AAAI\u201911, AAAI Press, 2011, pp.\u00a0380\u2013385.","DOI":"10.1609\/aaai.v25i1.7903"},
{"key":"10.3233\/AIC-201580_ref20","doi-asserted-by":"publisher","first-page":"894","DOI":"10.1016\/j.procs.2018.04.086","article-title":"Implementing an adaptive traffic signal control algorithm in an agent-based transport simulation","volume":"130","author":"K\u00fchnel","year":"2018","journal-title":"Procedia Computer Science"},
{"key":"10.3233\/AIC-201580_ref21","first-page":"143","article-title":"Die Selbst-Steuerung im Praxistest","volume":"3","author":"L\u00e4mmer","year":"2016","journal-title":"Stra\u00dfenverkehrstechnik"},
{"key":"10.3233\/AIC-201580_ref22","doi-asserted-by":"publisher","DOI":"10.1088\/1742-5468\/2008\/04\/P04019"},
{"key":"10.3233\/AIC-201580_ref23","unstructured":"P.\u00a0Lowrie, The Sydney coordinate adaptive traffic system\u00a0\u2013 principles, methodology, algorithms, in: Proceedings of the International Conference on Road Traffic Signalling, Sydney, Australia, 1982."},
{"key":"10.3233\/AIC-201580_ref24","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-25808-9_4"},
{"key":"10.3233\/AIC-201580_ref25","unstructured":"V.\u00a0Mnih, K.\u00a0Kavukcuoglu, D.\u00a0Silver, A.\u00a0Graves, I.\u00a0Antonoglou, D.\u00a0Wierstra and M.\u00a0Riedmiller, Playing atari with deep reinforcement learning, in: NIPS Deep Learning Workshop, 2013."},
{"key":"10.3233\/AIC-201580_ref26","doi-asserted-by":"crossref","unstructured":"K.J.\u00a0Prabuchandran, A.N.H.\u00a0Kumar and S.\u00a0Bhatnagar, Decentralized learning for traffic signal control, in: Proceedings of the 7th International Conference on Communication Systems and Networks (COMSNETS), 2015, pp. 1\u20136. ISBN 9781479984398.","DOI":"10.1109\/COMSNETS.2015.7098712"},
{"key":"10.3233\/AIC-201580_ref27","doi-asserted-by":"publisher","DOI":"10.1109\/ITSC.2011.6082823"},
{"key":"10.3233\/AIC-201580_ref28","doi-asserted-by":"crossref","unstructured":"R.S.\u00a0Sutton and A.G.\u00a0Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.","DOI":"10.1109\/TNN.1998.712192"},
{"key":"10.3233\/AIC-201580_ref29","unstructured":"R.S.\u00a0Sutton and A.G.\u00a0Barto, Reinforcement Learning: An Introduction, 2nd edn, The MIT Press, 2018."},
{"key":"10.3233\/AIC-201580_ref30","doi-asserted-by":"publisher","first-page":"481","DOI":"10.1016\/j.trpro.2018.12.215","article-title":"Adaptive traffic signal control for real-world scenarios in agent-based transport simulations","volume":"37","author":"Thunig","year":"2019","journal-title":"Transportation Research Procedia"},
{"key":"10.3233\/AIC-201580_ref31","doi-asserted-by":"publisher","DOI":"10.1109\/MTITS.2019.8883373"},
{"key":"10.3233\/AIC-201580_ref33","first-page":"1","article-title":"True online temporal-difference learning","volume":"17","author":"van Seijen","year":"2016","journal-title":"Journal of Machine Learning Research"},
{"key":"10.3233\/AIC-201580_ref34","unstructured":"H.\u00a0Van Seijen and R.S.\u00a0Sutton, True online TD(\u03bb), in: Proceedings of the 31st International Conference on International Conference on Machine Learning, ICML\u201914, Vol.\u00a032, JMLR.org, 2014, pp.\u00a0I-692\u2013I-700."},
{"issue":"3","key":"10.3233\/AIC-201580_ref35","first-page":"279","volume":"8","author":"Watkins","year":"1992","journal-title":"Q-learning, Machine Learning"},
{"key":"10.3233\/AIC-201580_ref38","doi-asserted-by":"publisher","DOI":"10.1145\/3068287"},
{"key":"10.3233\/AIC-201580_ref40","doi-asserted-by":"publisher","first-page":"870","DOI":"10.1016\/j.procs.2019.04.120","article-title":"The MATSim open Berlin scenario: A multimodal agent-based transport simulation scenario based on synthetic demand modeling and open data","volume":"151","author":"Ziemke","year":"2019","journal-title":"Procedia Computer Science"}],
"container-title":["AI Communications"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/AIC-201580","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,28]],"date-time":"2026-04-28T18:27:58Z","timestamp":1777400878000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.medra.org\/servlet\/aliasResolver?alias=iospress&doi=10.3233\/AIC-201580"}},"subtitle":[],"editor":[{"given":"Marin","family":"Lujak","sequence":"additional","affiliation":[]},{"given":"Ivana","family":"Dusparic","sequence":"additional","affiliation":[]},{"given":"Franziska","family":"Kl\u00fcgl","sequence":"additional","affiliation":[]},{"given":"Giuseppe","family":"Vizzari","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2021,2,15]]},"references-count":34,"journal-issue":{"issue":"1"},"URL":"https:\/\/doi.org\/10.3233\/aic-201580","relation":{},"ISSN":["1875-8452","0921-7126"],"issn-type":[{"value":"1875-8452","type":"electronic"},{"value":"0921-7126","type":"print"}],"subject":[],"published":{"date-parts":[[2021,2,15]]}}}