{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,20]],"date-time":"2026-01-20T09:21:33Z","timestamp":1768900893255,"version":"3.49.0"},"reference-count":34,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,2,22]],"date-time":"2022-02-22T00:00:00Z","timestamp":1645488000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,2,22]],"date-time":"2022-02-22T00:00:00Z","timestamp":1645488000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Ann Oper Res"],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>This paper presents a multi-agent reinforcement learning algorithm to represent strategic bidding behavior by carriers and shippers in freight transport markets. We investigate whether feasible market equilibriums arise without central control or communication between agents. Observed behavior in such environments serves as a stepping stone towards self-organizing logistics systems like the Physical Internet, while also offering valuable insights for the design of contemporary transport brokerage platforms. We model an agent-based environment in which shipper and carrier actively learn bidding strategies using policy gradient methods, posing bid- and ask prices at the individual container level. Both agents aim to learn the best response given the expected behavior of the opposing agent. Inspired by financial markets, a neutral broker allocates jobs based on bid-ask spreads. Our game-theoretical analysis and numerical experiments focus on behavioral insights. 
To evaluate system performance, we measure adherence to Nash equilibria, fairness of reward division and utilization of transport capacity. We observe good performance both in predictable, deterministic settings (<jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$\\sim $$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mo>\u223c<\/mml:mo>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula>\u00a095% adherence to Nash equilibria) and highly stochastic environments (<jats:inline-formula>\n              <jats:alternatives>\n                <jats:tex-math>$$\\sim $$<\/jats:tex-math>\n                <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:mo>\u223c<\/mml:mo>\n                <\/mml:math>\n              <\/jats:alternatives>\n            <\/jats:inline-formula>\u00a085% adherence). Risk-seeking behavior may increase an agent\u2019s reward share, yet overly aggressive strategies destabilize the system. The results suggest a potential for full automation and decentralization of freight transport markets. These insights ease the design of real-world market platforms, suggesting an innate tendency of markets to reach equilibria without behavioral models, information sharing or explicit incentives.<\/jats:p>","DOI":"10.1007\/s10479-022-04572-z","type":"journal-article","created":{"date-parts":[[2022,2,21]],"date-time":"2022-02-21T23:35:50Z","timestamp":1645486550000},"page":"131-168","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Strategic bidding in freight transport using deep reinforcement learning"],"prefix":"10.1007","volume":"350","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5413-9660","authenticated-orcid":false,"given":"W. J. 
A.","family":"van Heeswijk","sequence":"first","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,2,22]]},"reference":[{"issue":"6","key":"4572_CR1","doi-asserted-by":"publisher","first-page":"1606","DOI":"10.1080\/00207543.2018.1494392","volume":"57","author":"T Ambra","year":"2019","unstructured":"Ambra, T., Caris, A., & Macharis, C. (2019). Towards freight transport system unification: Reviewing and combining the advancements in the physical internet and synchromodal transport research. International Journal of Production Research, 57(6), 1606\u20131623.","journal-title":"International Journal of Production Research"},{"issue":"10","key":"4572_CR2","doi-asserted-by":"publisher","first-page":"670","DOI":"10.1177\/0361198120935116","volume":"2674","author":"B Atasoy","year":"2020","unstructured":"Atasoy, B., Schulte, F., & Steenkamp, A. (2020). Platform-based collaborative routing using dynamic prices as incentives. Transportation Research Record, 2674(10), 670\u2013679.","journal-title":"Transportation Research Record"},{"key":"4572_CR3","doi-asserted-by":"crossref","unstructured":"Aumann, R. J., & Shapley, L. S. (1994). Long-term competition-a game-theoretic analysis. In N. Megiddo (Ed.), Essays in game theory (pp. 1\u201315). Springer.","DOI":"10.1007\/978-1-4612-2648-2_1"},{"key":"4572_CR4","doi-asserted-by":"crossref","unstructured":"Cruijssen, F. (2020). Cross-chain collaboration in logistics. In International series in operations research and management science (ISOR) vol. 297.","DOI":"10.1007\/978-3-030-57093-4"},{"issue":"1","key":"4572_CR5","doi-asserted-by":"publisher","first-page":"1","DOI":"10.2307\/2296617","volume":"38","author":"JW Friedman","year":"1971","unstructured":"Friedman, J. W. (1971). A non-cooperative equilibrium for supergames. 
The Review of Economic Studies, 38(1), 1\u201312.","journal-title":"The Review of Economic Studies"},{"issue":"6","key":"4572_CR6","doi-asserted-by":"publisher","first-page":"1291","DOI":"10.1109\/TSMCC.2012.2218595","volume":"42","author":"I Grondman","year":"2012","unstructured":"Grondman, I., Busoniu, L., Lopes, G. A., & Babuska, R. (2012). A survey of actor-critic reinforcement learning: Standard and natural policy gradients. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42(6), 1291\u20131307.","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews)"},{"key":"4572_CR7","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision (pp. 1026\u20131034).","DOI":"10.1109\/ICCV.2015.123"},{"issue":"7","key":"4572_CR8","doi-asserted-by":"publisher","first-page":"1623","DOI":"10.2307\/1913954","volume":"45","author":"E Kalai","year":"1977","unstructured":"Kalai, E. (1977). Proportional solutions to bargaining situations: Interpersonal utility comparisons. Econometrica: Journal of the Econometric Society, 45(7), 1623\u20131630.","journal-title":"Econometrica: Journal of the Econometric Society"},{"key":"4572_CR9","doi-asserted-by":"crossref","unstructured":"Kalai, E., & Smorodinsky, M. (1975). Other solutions to Nash\u2019s bargaining problem. Econometrica: Journal of the Econometric Society, 43(3), 513\u2013518.","DOI":"10.2307\/1914280"},{"issue":"1","key":"4572_CR10","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12544-021-00512-3","volume":"13","author":"A Karam","year":"2021","unstructured":"Karam, A., Reinau, K. H., & \u00d8stergaard, C. R. (2021). Horizontal collaboration in the freight transport sector: Barrier and decision-making frameworks. 
European Transport Research Review, 13(1), 1\u201322.","journal-title":"European Transport Research Review"},{"key":"4572_CR11","doi-asserted-by":"crossref","unstructured":"Kellerer, H., Pferschy, U., & Pisinger, D. (Eds.). (2004). Multidimensional knapsack problems. In Knapsack problems (pp. 235\u2013283). Springer.","DOI":"10.1007\/978-3-540-24777-7_9"},{"issue":"3","key":"4572_CR12","first-page":"61","volume":"32","author":"M Kenney","year":"2016","unstructured":"Kenney, M., & Zysman, J. (2016). The rise of the platform economy. Issues in Science and Technology, 32(3), 61.","journal-title":"Issues in Science and Technology"},{"issue":"2","key":"4572_CR13","doi-asserted-by":"publisher","first-page":"402","DOI":"10.1287\/trsc.2016.0682","volume":"52","author":"MA Klapp","year":"2018","unstructured":"Klapp, M. A., Erera, A. L., & Toriello, A. (2018). The one-dimensional dynamic dispatch waves problem. Transportation Science, 52(2), 402\u2013415.","journal-title":"Transportation Science"},{"key":"4572_CR14","doi-asserted-by":"publisher","first-page":"193","DOI":"10.1016\/j.trc.2019.05.026","volume":"113","author":"J Miller","year":"2020","unstructured":"Miller, J., & Nie, Y. M. (2020). Dynamic trucking equilibrium through a freight exchange. Transportation Research Part C: Emerging Technologies, 113, 193\u2013212.","journal-title":"Transportation Research Part C: Emerging Technologies"},{"issue":"1","key":"4572_CR15","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1287\/opre.41.1.77","volume":"41","author":"AS Minkoff","year":"1993","unstructured":"Minkoff, A. S. (1993). A Markov decision model and decomposition heuristic for dynamic vehicle dispatching. Operations Research, 41(1), 77\u201390.","journal-title":"Operations Research"},{"issue":"2\u20133","key":"4572_CR16","doi-asserted-by":"publisher","first-page":"71","DOI":"10.1007\/s12159-011-0045-x","volume":"3","author":"B Montreuil","year":"2011","unstructured":"Montreuil, B. (2011). 
Toward a Physical Internet: Meeting the global logistics sustainability grand challenge. Logistics Research, 3(2\u20133), 71\u201387.","journal-title":"Logistics Research"},{"key":"4572_CR17","first-page":"151","volume-title":"Physical internet foundations","author":"B Montreuil","year":"2013","unstructured":"Montreuil, B., Meller, R. D., & Ballot, E. (2013). Physical internet foundations (pp. 151\u2013166). Berlin Heidelberg: Springer."},{"key":"4572_CR18","doi-asserted-by":"publisher","first-page":"155","DOI":"10.2307\/1907266","volume":"18","author":"J Nash","year":"1950","unstructured":"Nash, J. (1950). The bargaining problem. Econometrica, 18, 155\u2013162.","journal-title":"Econometrica"},{"key":"4572_CR19","doi-asserted-by":"crossref","unstructured":"Nash, J. (1953). Two-person cooperative games. Econometrica: Journal of the Econometric Society, 21(1), 128\u2013140.","DOI":"10.2307\/1906951"},{"issue":"7","key":"4572_CR20","doi-asserted-by":"publisher","first-page":"2631","DOI":"10.1007\/s10845-016-1289-8","volume":"30","author":"B Qiao","year":"2019","unstructured":"Qiao, B., Pan, S., & Ballot, E. (2019). Dynamic pricing model for less-than-truckload carriers in the Physical Internet. Journal of Intelligent Manufacturing, 30(7), 2631\u20132643.","journal-title":"Journal of Intelligent Manufacturing"},{"key":"4572_CR21","unstructured":"Rolnick, D., & Tegmark, M. (2017). The power of deeper networks for expressing natural functions. arXiv preprint arXiv:1705.05502."},{"issue":"1","key":"4572_CR22","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/BF01784792","volume":"9","author":"A Rubinstein","year":"1980","unstructured":"Rubinstein, A. (1980). Strong perfect equilibrium in supergames. International Journal of Game Theory, 9(1), 1\u201312.","journal-title":"International Journal of Game Theory"},{"key":"4572_CR23","doi-asserted-by":"crossref","unstructured":"Rubinstein, A. (1994). Equilibrium in supergames. In N. 
Megiddo (Ed.), Essays in game theory (pp. 17\u201327). Springer.","DOI":"10.1007\/978-1-4612-2648-2_2"},{"key":"4572_CR24","doi-asserted-by":"publisher","first-page":"96","DOI":"10.1016\/j.compind.2015.12.006","volume":"81","author":"Y Sallez","year":"2016","unstructured":"Sallez, Y., Pan, S., Montreuil, B., Berger, T., & Ballot, E. (2016). On the activeness of intelligent physical internet containers. Computers in Industry, 81, 96\u2013104.","journal-title":"Computers in Industry"},{"key":"4572_CR25","volume-title":"Reinforcement learning: An introduction","author":"RS Sutton","year":"2018","unstructured":"Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). Cambridge, Massachusetts: The MIT Press.","edition":"2"},{"key":"4572_CR26","first-page":"469","volume-title":"Lecture notes in computer science","author":"WJA Van Heeswijk","year":"2019","unstructured":"Van Heeswijk, W. J. A. (2019). Smart containers with bidding capacity: A policy gradient algorithm for semi-cooperative learning. In E. Lalla-Ruiz, M. R. K. Mes, & S. Vo\u00df (Eds.), Lecture notes in computer science (Vol. 12433, pp. 469\u2013491). Cham: Springer."},{"key":"4572_CR27","doi-asserted-by":"crossref","unstructured":"Van Heeswijk, W. J. A., & La Poutr\u00e9, H. (2018). Scalability and performance of decentralized planning in flexible transport networks. In 2018 IEEE International Conference on Systems Man, and Cybernetics (pp. 292\u2013297). Piscataway, New Jersey: Institute of Electrical and Electronics Engineers Inc.","DOI":"10.1109\/SMC.2018.00060"},{"key":"4572_CR28","doi-asserted-by":"crossref","unstructured":"Van Heeswijk, W. J. A., Mes, M. R. K., & Schutten, J. M. J. (2019a). The delivery dispatching problem with time windows for urban consolidation centers. Transportation Science, 53(1), 203\u2013221.","DOI":"10.1287\/trsc.2017.0773"},{"key":"4572_CR29","doi-asserted-by":"crossref","unstructured":"Van Heeswijk, W. J. A., Mes, M. R. K., & Schutten, J. M. 
J. (2019). Transportation management. In H. Zijm, M. Klumpp, A. Regattieri, & S. Heragu (Eds.), Operations, Logistics and supply chain management (pp. 469\u2013491). Cham: Springer.","DOI":"10.1007\/978-3-319-92447-2_21"},{"issue":"1","key":"4572_CR30","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1287\/trsc.2016.0732","volume":"53","author":"SA Voccia","year":"2019","unstructured":"Voccia, S. A., Campbell, A. M., & Thomas, B. W. (2019). The same-day delivery problem for online purchases. Transportation Science, 53(1), 167\u2013184.","journal-title":"Transportation Science"},{"key":"4572_CR31","unstructured":"Wang, Y., Nascimento, J. M. D., & Powell, W. B. (2018). Reinforcement learning for dynamic bidding in truckload markets: an application to large-scale fleet management with advance commitments. arXiv preprint arXiv:1802.08976."},{"issue":"3\u20134","key":"4572_CR32","doi-asserted-by":"publisher","first-page":"229","DOI":"10.1023\/A:1022672621406","volume":"8","author":"RJ Williams","year":"1992","unstructured":"Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8(3\u20134), 229\u2013256.","journal-title":"Machine Learning"},{"key":"4572_CR33","first-page":"1","volume":"2018","author":"F Yan","year":"2018","unstructured":"Yan, F., Ma, Y., Xu, M., & Ge, X. (2018). Transportation service procurement bid construction problem from less than truckload perspective. Mathematical Problems in Engineering, 2018, 1\u201317.","journal-title":"Mathematical Problems in Engineering"},{"key":"4572_CR34","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1016\/j.trpro.2017.05.002","volume":"23","author":"L Zha","year":"2017","unstructured":"Zha, L., Yin, Y., & Du, Y. (2017). Surge pricing and labor supply in the ride-sourcing market. 
Transportation Research Procedia, 23, 2\u201321.","journal-title":"Transportation Research Procedia"}],"container-title":["Annals of Operations Research"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10479-022-04572-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10479-022-04572-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10479-022-04572-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T18:03:07Z","timestamp":1757095387000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10479-022-04572-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,22]]},"references-count":34,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["4572"],"URL":"https:\/\/doi.org\/10.1007\/s10479-022-04572-z","relation":{},"ISSN":["0254-5330","1572-9338"],"issn-type":[{"value":"0254-5330","type":"print"},{"value":"1572-9338","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,22]]},"assertion":[{"value":"21 January 2022","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2022","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}