{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,19]],"date-time":"2025-10-19T06:06:51Z","timestamp":1760854011039,"version":"3.41.0"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2018,1,3]],"date-time":"2018-01-03T00:00:00Z","timestamp":1514937600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Cyber-Phys. Syst."],"published-print":{"date-parts":[[2018,1,31]]},"abstract":"<jats:p>The routing game models congestion in transportation networks, communication networks, and other cyber-physical systems in which agents compete for shared resources. We consider an online learning model of player dynamics: at each iteration, every player chooses a route (or a probability distribution over routes, which corresponds to a flow allocation over the physical network), then the joint decision of all players determines the costs of each path, which are then revealed to the players.<\/jats:p>\n          <jats:p>We pose the following estimation problem: given a sequence of player decisions and the corresponding costs, we would like to estimate the parameters of the learning model. We consider, in particular, entropic mirror descent dynamics and reduce the problem to estimating the learning rates of each player.<\/jats:p>\n          <jats:p>In order to demonstrate our methods, we developed a web application that allows players to participate in a distributed, online routing game, and we deployed the application on Amazon Mechanical Turk. When players log in, they are assigned an origin and destination on a shared network. They can choose, at each iteration, a distribution over their available routes, and each player seeks to minimize her own cost. We collect a dataset using this platform, then apply the proposed method to estimate the learning rates of each player. We observe, in particular, that after an exploration phase, the joint decision of the players remains within a small distance of the set of equilibria. We also use the estimated model parameters to predict the flow distribution over routes, and compare our predictions to the actual distributions, showing that the online learning model can be used as a predictive model over short horizons. Finally, we discuss some of the qualitative insights from the experiments, and give directions for future research.<\/jats:p>","DOI":"10.1145\/3078620","type":"journal-article","created":{"date-parts":[[2018,1,4]],"date-time":"2018-01-04T16:27:31Z","timestamp":1515083251000},"page":"1-23","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":3,"title":["On Learning How Players Learn"],"prefix":"10.1145","volume":"2","author":[{"given":"Walid","family":"Krichene","sequence":"first","affiliation":[{"name":"University of California, Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Mohamed Chedhli","family":"Bourguiba","sequence":"additional","affiliation":[{"name":"Ecole Polytechnique, Palaiseau, France"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Kiet","family":"Tlam","sequence":"additional","affiliation":[{"name":"University of California, Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Alexandre","family":"Bayen","sequence":"additional","affiliation":[{"name":"University of California, Berkeley"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,1,3]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"43rd IEEE Conference on Decision and Control","volume":"2","author":"Arslan G.","unstructured":"G. Arslan and J. S. Shamma . 2004. Distributed convergence to Nash equilibria with local utility measurements . In 43rd IEEE Conference on Decision and Control , Vol. 2 . 1538--1543. G. Arslan and J. S. Shamma. 2004. Distributed convergence to Nash equilibria with local utility measurements. In 43rd IEEE Conference on Decision and Control, Vol. 2. 1538--1543."},{"key":"e_1_2_1_2_1","volume-title":"Clustering with Bregman divergences. J. Mach. Learn. Res. 6 (Dec","author":"Banerjee Arindam","year":"2005","unstructured":"Arindam Banerjee , Srujana Merugu , Inderjit S. Dhillon , and Joydeep Ghosh . 2005. Clustering with Bregman divergences. J. Mach. Learn. Res. 6 (Dec . 2005 ), 1705--1749. Arindam Banerjee, Srujana Merugu, Inderjit S. Dhillon, and Joydeep Ghosh. 2005. Clustering with Bregman divergences. J. Mach. Learn. Res. 6 (Dec. 2005), 1705--1749."},{"key":"e_1_2_1_3_1","volume-title":"Division of Research and Innovation.","author":"Bayen A. M.","year":"2011","unstructured":"A. M. Bayen , J. Butler , A. D. Patire , CCIT , UC Berkeley ITS, and California Dpartment of Transportation , Division of Research and Innovation. 2011 . Mobile Millennium Final Report . A. M. Bayen, J. Butler, A. D. Patire, CCIT, UC Berkeley ITS, and California Dpartment of Transportation, Division of Research and Innovation. 2011. Mobile Millennium Final Report."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-6377(02)00231-6"},{"key":"e_1_2_1_5_1","volume-title":"Winsten","author":"Beckmann Martin J.","year":"1955","unstructured":"Martin J. Beckmann , Charles B. McGuire , and Christopher B . Winsten . 1955 . Studies in the economics of transportation. Cowles Commission for Research in Economics at Yale University . Martin J. Beckmann, Charles B. McGuire, and Christopher B. Winsten. 1955. Studies in the economics of transportation. Cowles Commission for Research in Economics at Yale University."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1137\/S1052623499354564"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.2140\/pjm.1956.6.1"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1145\/1146381.1146392"},{"volume-title":"Convex Optimization","author":"Boyd Stephen","key":"e_1_2_1_9_1","unstructured":"Stephen Boyd and Lieven Vandenberghe . 2010. Convex Optimization . Vol. 25 . Cambridge University Press . Stephen Boyd and Lieven Vandenberghe. 2010. Convex Optimization. Vol. 25. Cambridge University Press."},{"key":"e_1_2_1_10_1","first-page":"3","article-title":"Grenoble traffic lab: An experimental platform for advanced traffic monitoring and forecasting","volume":"35","author":"De Wit Carlos Canudas","year":"2015","unstructured":"Carlos Canudas De Wit , Fabio Morbidi , Luis Leon Ojeda , Alain Y. Kibangou , Iker Bellicot , and Pascal Bellemain . 2015 . Grenoble traffic lab: An experimental platform for advanced traffic monitoring and forecasting . IEEE Control Syst. 35 , 3 (June 2015), 23--39. Carlos Canudas De Wit, Fabio Morbidi, Luis Leon Ojeda, Alain Y. Kibangou, Iker Bellicot, and Pascal Bellemain. 2015. Grenoble traffic lab: An experimental platform for advanced traffic monitoring and forecasting. IEEE Control Syst. 35, 3 (June 2015), 23--39.","journal-title":"IEEE Control Syst."},{"key":"e_1_2_1_11_1","volume-title":"Parallel Optimization: Theory, Algorithms and Applications","author":"Censor Yair","year":"1997","unstructured":"Yair Censor and Stavros Zenios . 1997 . Parallel Optimization: Theory, Algorithms and Applications . Oxford University Press . Yair Censor and Stavros Zenios. 1997. Parallel Optimization: Theory, Algorithms and Applications. Oxford University Press."},{"volume-title":"Prediction, Learning, and Games","author":"Cesa-Bianchi Nicol\u00f2","key":"e_1_2_1_12_1","unstructured":"Nicol\u00f2 Cesa-Bianchi and G\u00e1bor Lugosi . 2006. Prediction, Learning, and Games . Cambridge University Press . Nicol\u00f2 Cesa-Bianchi and G\u00e1bor Lugosi. 2006. Prediction, Learning, and Games. Cambridge University Press."},{"key":"e_1_2_1_13_1","volume-title":"Algorithms--ESA","author":"Fischer Simon","year":"2004","unstructured":"Simon Fischer and Berthold V\u00f6cking . 2004. On the evolution of selfish routing . In Algorithms--ESA 2004 . Springer , 323--334. Simon Fischer and Berthold V\u00f6cking. 2004. On the evolution of selfish routing. In Algorithms--ESA 2004. Springer, 323--334."},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.3390\/g4040561"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1006\/game.1999.0738"},{"key":"e_1_2_1_16_1","volume-title":"Levine","author":"Fudenberg Drew","year":"1998","unstructured":"Drew Fudenberg and David K . Levine . 1998 . The Theory of Learning in Games. Vol. 2 . MIT Press . Drew Fudenberg and David K. Levine. 1998. The Theory of Learning in Games. Vol. 2. MIT Press."},{"key":"e_1_2_1_17_1","volume-title":"Approximation to Bayes risk in repeated plays. Contributions to the Theory of Games 3","author":"Hannan James","year":"1957","unstructured":"James Hannan . 1957. Approximation to Bayes risk in repeated plays. Contributions to the Theory of Games 3 ( 1957 ), 97--139. James Hannan. 1957. Approximation to Bayes risk in repeated plays. Contributions to the Theory of Games 3 (1957), 97--139."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1468-0262.2005.00625.x"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1006\/jeth.2000.2746"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. JMLR, 102--110","author":"Herman Michael","year":"2016","unstructured":"Michael Herman , Tobias Gindele , J\u00f6rg Wagner , Felix Schmitt , and Wolfram Burgard . 2016 . Inverse reinforcement learning with simultaneous estimation of rewards and dynamics . In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. JMLR, 102--110 . Michael Herman, Tobias Gindele, J\u00f6rg Wagner, Felix Schmitt, and Wolfram Burgard. 2016. Inverse reinforcement learning with simultaneous estimation of rewards and dynamics. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. JMLR, 102--110."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1006\/inco.1996.2612"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/1536414.1536487"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.5555\/1764891.1764944"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1137\/140980685"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/ECC.2015.7330604"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1109\/CDC.2015.7402714"},{"key":"e_1_2_1_27_1","volume-title":"Handbook of Game Theory","volume":"4","author":"Marden J. R.","unstructured":"J. R. Marden and J. S. Shamma . 2013. Game theory and distributed control . In Handbook of Game Theory , Vol. 4 , H.P. Young and S. Zamir (Eds.). Elsevier Science. J. R. Marden and J. S. Shamma. 2013. Game theory and distributed control. In Handbook of Game Theory, Vol. 4, H.P. Young and S. Zamir (Eds.). Elsevier Science."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCST.2013.2257780"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.3758\/s13428-011-0124-6"},{"key":"e_1_2_1_30_1","unstructured":"A. S. Nemirovsky and D. B. Yudin. 1983. Problem Complexity and Method Efficiency in Optimization. Wiley.  A. S. Nemirovsky and D. B. Yudin. 1983. Problem Complexity and Method Efficiency in Optimization. Wiley."},{"volume-title":"Proceedings of the 17th International Conference on Machine Learning (ICML\u201900)","author":"Andrew","key":"e_1_2_1_31_1","unstructured":"Andrew Y. Ng and Stuart J. Russell. 2000. Algorithms for inverse reinforcement learning . In Proceedings of the 17th International Conference on Machine Learning (ICML\u201900) . Morgan Kaufmann Publishers Inc., San Francisco, CA, 663--670. Andrew Y. Ng and Stuart J. Russell. 2000. Algorithms for inverse reinforcement learning. In Proceedings of the 17th International Conference on Machine Learning (ICML\u201900). Morgan Kaufmann Publishers Inc., San Francisco, CA, 663--670."},{"key":"e_1_2_1_32_1","volume-title":"Incentives and pricing in communication networks. Algorithmic Game Theory","author":"Ozdaglar Asuman","year":"2007","unstructured":"Asuman Ozdaglar and R Srikant . 2007. Incentives and pricing in communication networks. Algorithmic Game Theory ( 2007 ), 571--591. Asuman Ozdaglar and R Srikant. 2007. Incentives and pricing in communication networks. Algorithmic Game Theory (2007), 571--591."},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF01737559"},{"volume-title":"Algorithmic Game Theory","author":"Roughgarden Tim","key":"e_1_2_1_34_1","unstructured":"Tim Roughgarden . 2007. Routing games . In Algorithmic Game Theory . Cambridge University Press , Chap . 18, 461--486. Tim Roughgarden. 2007. Routing games. In Algorithmic Game Theory. Cambridge University Press, Chap. 18, 461--486."},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1006\/jeth.2000.2696"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.2307\/1884852"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1680\/ipeds.1952.11259"}],"container-title":["ACM Transactions on Cyber-Physical Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3078620","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3078620","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:30:31Z","timestamp":1750217431000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3078620"}},"subtitle":["Estimation of Learning Dynamics in the Routing Game"],"short-title":[],"issued":{"date-parts":[[2018,1,3]]},"references-count":37,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2018,1,31]]}},"alternative-id":["10.1145\/3078620"],"URL":"https:\/\/doi.org\/10.1145\/3078620","relation":{},"ISSN":["2378-962X","2378-9638"],"issn-type":[{"type":"print","value":"2378-962X"},{"type":"electronic","value":"2378-9638"}],"subject":[],"published":{"date-parts":[[2018,1,3]]},"assertion":[{"value":"2016-07-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2017-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-01-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}