{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,11]],"date-time":"2025-03-11T04:15:31Z","timestamp":1741666531376,"version":"3.38.0"},"reference-count":29,"publisher":"SAGE Publications","issue":"2","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["AIC"],"published-print":{"date-parts":[[2024,4,17]]},"abstract":"<jats:p>The increasing demand for mobility in our society poses various challenges to traffic engineering, computer science in general, and artificial intelligence in particular. Increasing the capacity of road networks is not always possible, thus a more efficient use of the available transportation infrastructure is required. Another issue is that many problems in traffic management and control are inherently decentralized and\/or require adaptation to the traffic situation. Hence, there is a close relationship to multiagent reinforcement learning. However, using reinforcement learning poses the challenge that the state space is normally large and continuous, thus it is necessary to find appropriate schemes to deal with discretization of the state space. To address these issues, a multiagent system with agents learning independently via a learning algorithm was proposed, which is based on estimating Q-values from k-nearest neighbors. In the present paper, we extend this approach and include transfer of experiences among the agents, especially when an agent does not have a good set of k experiences. We deal with traffic signal control, running experiments on a traffic network in which we vary the traffic situation over time, and compare our approach to two baselines (one involving reinforcement learning and one based on fixed times). 
Our results show that the extended method pays off when an agent returns to an already experienced traffic situation.<\/jats:p>","DOI":"10.3233\/aic-220305","type":"journal-article","created":{"date-parts":[[2023,9,29]],"date-time":"2023-09-29T15:00:00Z","timestamp":1695999600000},"page":"247-259","source":"Crossref","is-referenced-by-count":0,"title":["Transferring experiences in k-nearest neighbors based multiagent reinforcement learning: an application to traffic signal control"],"prefix":"10.1177","volume":"37","author":[{"given":"Ana Lucia C.","family":"Bazzan","sequence":"first","affiliation":[{"name":"Instituto da Inform\u00e1tica, Universidade Federal do Rio Grande do Sul (UFRGS), Brazil"}]},{"given":"Vicente N.","family":"de\u00a0Almeida","sequence":"additional","affiliation":[{"name":"Instituto da Inform\u00e1tica, Universidade Federal do Rio Grande do Sul (UFRGS), Brazil"}]},{"given":"Monireh","family":"Abdoos","sequence":"additional","affiliation":[{"name":"Faculty of Computer Science and Engineering, Shahid Beheshti University, Iran"}]}],"member":"179","reference":[{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref1","DOI":"10.1016\/j.eswa.2021.114580"},{"issue":"2","key":"10.3233\/AIC-220305_ref2","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1007\/s10489-013-0455-3","article-title":"Hierarchical control of traffic signals using Q-learning with tile coding","volume":"40","author":"Abdoos","year":"2014","journal-title":"Appl. 
Intell."},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref4","DOI":"10.7717\/peerj-cs.575"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref5","DOI":"10.1109\/TITS.2021.3091014"},{"key":"10.3233\/AIC-220305_ref6","doi-asserted-by":"publisher","first-page":"732","DOI":"10.1016\/j.trc.2017.09.020","article-title":"Adaptive traffic signal control with actor-critic methods in a real-world traffic network with different traffic disruption events","volume":"85","author":"Aslani","year":"2017","journal-title":"Transportation Research Part C: Emerging Technologies"},{"doi-asserted-by":"crossref","unstructured":"L.\u00a0Baird, Residual algorithms: Reinforcement learning with function approximation, in: Proceedings of the Twelfth International Conference on Machine Learning, Morgan Kaufmann, 1995, pp.\u00a030\u201337.","key":"10.3233\/AIC-220305_ref7","DOI":"10.1016\/B978-1-55860-377-6.50013-X"},{"issue":"3","key":"10.3233\/AIC-220305_ref8","doi-asserted-by":"publisher","first-page":"177","DOI":"10.1049\/iet-its.2009.0096","article-title":"Urban traffic signal control using reinforcement learning agents","volume":"4","author":"Balaji","year":"2010","journal-title":"IET Intelligent Transportation Systems"},{"issue":"3","key":"10.3233\/AIC-220305_ref9","doi-asserted-by":"publisher","first-page":"342","DOI":"10.1007\/s10458-008-9062-9","article-title":"Opportunities for multiagent systems and multiagent reinforcement learning in traffic control","volume":"18","author":"Bazzan","year":"2009","journal-title":"Autonomous Agents and Multiagent Systems"},{"unstructured":"V.N.\u00a0de\u00a0Almeida, A.L.C.\u00a0Bazzan and M.\u00a0Abdoos, Multiagent reinforcement learning for traffic signal control: A k-nearest neighbors based approach, in: Twelfth International Workshop on Agents in Traffic and Transportation, A.L.C.\u00a0Bazzan, I.\u00a0Dusparic, M.\u00a0Lujak and G.\u00a0Vizzari, eds, CEUR Workshop Proceedings, Vol.\u00a03173, CEUR-WS.org, 2022, 
pp.\u00a032\u201346, http:\/\/ceur-ws.org\/Vol-3173\/3.pdf.","key":"10.3233\/AIC-220305_ref11"},{"issue":"3","key":"10.3233\/AIC-220305_ref12","doi-asserted-by":"publisher","first-page":"1140","DOI":"10.1109\/TITS.2013.2255286","article-title":"Multiagent reinforcement learning for integrated network of adaptive traffic signal controllers (MARLIN-ATSC): Methodology and large-scale application on downtown Toronto","volume":"14","author":"El-Tantawy","year":"2013","journal-title":"Intelligent Transportation Systems, IEEE Transactions on"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref13","DOI":"10.1109\/WCICA.2012.6359080"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref14","DOI":"10.1109\/ICTC49638.2020.9123312"},{"doi-asserted-by":"crossref","unstructured":"P.A.\u00a0Lopez, M.\u00a0Behrisch, L.\u00a0Bieker-Walz, J.\u00a0Erdmann, Y.-P.\u00a0Fl\u00f6tter\u00f6d, R.\u00a0Hilbrich, L.\u00a0L\u00fccken, J.\u00a0Rummel, P.\u00a0Wagner and E.\u00a0Wie\u00dfner, Microscopic traffic simulation using SUMO, in: The 21st IEEE International Conference on Intelligent Transportation Systems, 2018.","key":"10.3233\/AIC-220305_ref15","DOI":"10.1109\/ITSC.2018.8569938"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref16","DOI":"10.1007\/978-3-319-25808-9_4"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref17","DOI":"10.1007\/978-3-540-75867-9_18"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref18","DOI":"10.1007\/978-3-642-02264-7_32"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref19","DOI":"10.31224\/osf.io\/ewxrj"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref20","DOI":"10.1007\/978-3-319-63342-8_4"},{"issue":"12","key":"10.3233\/AIC-220305_ref21","doi-asserted-by":"crossref","first-page":"2043","DOI":"10.1109\/JPROC.2003.819610","article-title":"Review of road traffic control strategies","volume":"91","author":"Papageorgiou","year":"2003","journal-title":"Proceedings of the 
IEEE"},{"key":"10.3233\/AIC-220305_ref22","first-page":"816","volume-title":"Traffic Engineering","author":"Roess","year":"2004"},{"unstructured":"F.L.D.\u00a0Silva, Integrating agent advice and previous task solutions in multiagent reinforcement learning, in: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019, pp.\u00a02447\u20132448.","key":"10.3233\/AIC-220305_ref23"},{"key":"10.3233\/AIC-220305_ref24","doi-asserted-by":"publisher","first-page":"645","DOI":"10.1613\/jair.1.11396","article-title":"A survey on transfer learning for multiagent reinforcement learning systems","volume":"64","author":"Silva","year":"2019","journal-title":"Journal of Artificial Intelligence Research"},{"doi-asserted-by":"crossref","unstructured":"F.L.D.\u00a0Silva, M.E.\u00a0Taylor and A.H.R.\u00a0Costa, Autonomously reusing knowledge in multiagent reinforcement learning, in: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18), IJCAI, 2018, pp.\u00a05487\u20135493.","key":"10.3233\/AIC-220305_ref25","DOI":"10.24963\/ijcai.2018\/774"},{"unstructured":"R.S.\u00a0Sutton and A.G.\u00a0Barto, Reinforcement Learning: An Introduction, 2nd edn, The MIT Press, 2018.","key":"10.3233\/AIC-220305_ref26"},{"issue":"56","key":"10.3233\/AIC-220305_ref27","first-page":"1633","article-title":"Transfer learning for reinforcement learning domains: A survey","volume":"10","author":"Taylor","year":"2009","journal-title":"Journal of Machine Learning Research"},{"issue":"2","key":"10.3233\/AIC-220305_ref31","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1145\/3447556.3447565","article-title":"Recent advances in reinforcement learning for traffic signal control: A survey of models and evaluation","volume":"22","author":"Wei","year":"2021","journal-title":"SIGKDD Explor. 
Newsl."},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref33","DOI":"10.1145\/3219819.3220096"},{"doi-asserted-by":"publisher","key":"10.3233\/AIC-220305_ref34","DOI":"10.1145\/3068287"},{"issue":"6","key":"10.3233\/AIC-220305_ref35","doi-asserted-by":"publisher","first-page":"5508","DOI":"10.1109\/TCYB.2020.3034424","article-title":"Differential advising in multiagent reinforcement learning","volume":"52","author":"Ye","year":"2020","journal-title":"IEEE Transactions on Cybernetics"}],"container-title":["AI Communications"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/AIC-220305","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,10]],"date-time":"2025-03-10T13:38:41Z","timestamp":1741613921000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/AIC-220305"}},"subtitle":[],"editor":[{"given":"Ana L.C.","family":"Bazzan","sequence":"additional","affiliation":[]},{"given":"Ivana","family":"Dusparic","sequence":"additional","affiliation":[]},{"given":"Marin","family":"Lujak","sequence":"additional","affiliation":[]},{"given":"Giuseppe","family":"Vizzari","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2024,4,17]]},"references-count":29,"journal-issue":{"issue":"2"},"URL":"https:\/\/doi.org\/10.3233\/aic-220305","relation":{},"ISSN":["1875-8452","0921-7126"],"issn-type":[{"type":"electronic","value":"1875-8452"},{"type":"print","value":"0921-7126"}],"subject":[],"published":{"date-parts":[[2024,4,17]]}}}