{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,18]],"date-time":"2026-04-18T03:43:40Z","timestamp":1776483820485,"version":"3.51.2"},"reference-count":41,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2021,8,8]],"date-time":"2021-08-08T00:00:00Z","timestamp":1628380800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,8,8]],"date-time":"2021-08-08T00:00:00Z","timestamp":1628380800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Knowl Inf Syst"],"published-print":{"date-parts":[[2021,9]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>With the availability of significant amount of data, data-driven decision making becomes an alternative way for solving complex multiagent decision problems. Instead of using domain knowledge to explicitly build decision models, the data-driven approach learns decisions\u00a0(probably optimal ones) from available data. This removes the knowledge bottleneck in the traditional knowledge-driven decision making, which requires a strong support from domain experts. In this paper, we study data-driven decision making in the context of interactive dynamic influence diagrams\u00a0(I-DIDs)\u2014a general framework for multiagent sequential decision making under uncertainty. We propose a data-driven framework to solve the I-DIDs model and focus on learning the behavior of other agents in problem domains. The challenge is on learning a complete policy tree that will be embedded in the I-DIDs models due to limited data. We propose two new methods to develop complete policy trees for the other agents in the I-DIDs. The first method uses a simple clustering process, while the second one employs sophisticated statistical checks. We analyze the proposed algorithms in a theoretical way and experiment them over two problem domains.<\/jats:p>","DOI":"10.1007\/s10115-021-01600-5","type":"journal-article","created":{"date-parts":[[2021,8,8]],"date-time":"2021-08-08T18:02:26Z","timestamp":1628445746000},"page":"2431-2453","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Toward data-driven solutions to interactive dynamic influence diagrams"],"prefix":"10.1007","volume":"63","author":[{"given":"Yinghui","family":"Pan","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jing","family":"Tang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Biyang","family":"Ma","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yifeng","family":"Zeng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhong","family":"Ming","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,8,8]]},"reference":[{"key":"1600_CR1","doi-asserted-by":"publisher","first-page":"66","DOI":"10.1016\/j.artint.2018.01.002","volume":"258","author":"SV Albrecht","year":"2018","unstructured":"Albrecht SV, Stone P (2018) Autonomous agents modelling other agents: A comprehensive survey and open problems. Artif Intell 258:66\u201395","journal-title":"Artif Intell"},{"key":"1600_CR2","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1613\/jair.1.11418","volume":"64","author":"C Amato","year":"2019","unstructured":"Amato C, Konidaris G, Kaelbling LP, How JP (2019) Modeling and planning with macro-actions in decentralized pomdps. J Artif Intell Res (JAIR) 64:817\u2013859","journal-title":"J Artif Intell Res (JAIR)"},{"key":"1600_CR3","doi-asserted-by":"crossref","unstructured":"Barrett S, Stone P (2015) Cooperating with unknown teammates in complex domains: A robot soccer case study of ad hoc teamwork. In: Proceedings of the 29th international conference on association for the advancement of artificial intelligence (AAAI), pp 2010\u20132016","DOI":"10.1609\/aaai.v29i1.9428"},{"key":"1600_CR4","unstructured":"Carmel D, Markovitch S (1996) Learning models of intelligent agents. In: Proceedings of the 13th international conference on association for the advancement of artificial intelligence (AAAI), vol\u00a01, pp 62\u201367 (1996)"},{"key":"1600_CR5","unstructured":"Chandrasekaran M, Doshi P, Zeng Y, Chen Y (2014) Team behavior in interactive dynamic influence diagrams with applications to ad hoc teams. In: Proceedings of the 13th international conference on autonomous agents and multiagent systems (AAMAS), pp 1559\u20131560"},{"key":"1600_CR6","unstructured":"Chandrasekaran M, Zhang J, Doshi P, Zeng Y (2017) Robust model equivalence using stochastic bisimulation for n-agent interactive DIDs. In: Proceedings of the thirty-third conference on uncertainty in artificial intelligence, UAI 2017, Sydney, Australia, August 11\u201315, 2017. AUAI Press"},{"key":"1600_CR7","unstructured":"Chen Y, Doshi P, Zeng Y (2015) Iterative online planning in multiagent settings with limited model spaces and PAC guarantees. In: Proceedings of the 14th international conference on autonomous agents and multiagent systems (AAMAS), pp 1161\u20131169"},{"key":"1600_CR8","unstructured":"Conroy R, Zeng Y, Cavazza M, Tang J, Pan Y (2016) A value equivalence approach for solving interactive dynamic influence diagrams. In: Proceedings of the 15th international conference on autonomous agents & multiagent systems (AAMAS), Singapore, May 9\u201313, 2016, pp 1162\u20131170"},{"key":"1600_CR9","unstructured":"Delle\u00a0Fave FM, Brown M, Zhang C, Shieh E, Jiang AX, Rosoff H, Tambe M, Sullivan J (2014)Security games in the field: an initial study on a transit system. In: Proceedings of the 13th international conference on autonomous agents and multi-agent systems (AAMAS), pp 1363\u20131364"},{"issue":"3","key":"1600_CR10","doi-asserted-by":"publisher","first-page":"376","DOI":"10.1007\/s10458-008-9064-7","volume":"18","author":"P Doshi","year":"2009","unstructured":"Doshi P, Zeng Y, Chen Q (2009) Graphical models for interactive pomdps: representations and solutions. J Auton Agents Multi-Agent Syst (JAAMAS) 18(3):376\u2013416","journal-title":"J Auton Agents Multi-Agent Syst (JAAMAS)"},{"key":"1600_CR11","unstructured":"Ford B, Kar D, Delle\u00a0Fave FM, Yang R, Tambe M (2014) Paws: Adaptive game-theoretic patrolling for wildlife protection (demonstration). In: Proceedings of the 13th international conference on autonomous agents and multi-agent systems (AAMAS), pp 1641\u20131642"},{"key":"1600_CR12","doi-asserted-by":"crossref","unstructured":"Gal Y, Pfeffer A (2003) A language for modeling agents\u2019 decision making processes in games. In: Proceedings of the 2nd international joint conference on autonomous agents and multiagent systems (AAMAS), pp 265\u2013272","DOI":"10.1145\/860575.860618"},{"key":"1600_CR13","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1613\/jair.1579","volume":"24","author":"PJ Gmytrasiewicz","year":"2005","unstructured":"Gmytrasiewicz PJ, Doshi P (2005) A framework for sequential planning in multiagent settings. J Artif Intell Res (JAIR) 24:49\u201379","journal-title":"J Artif Intell Res (JAIR)"},{"issue":"3","key":"1600_CR14","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1287\/mnsc.14.3.159","volume":"14","author":"JC Harsanyi","year":"1967","unstructured":"Harsanyi JC (1967) Games with incomplete information played by bayesian players. Manage Sci 14(3):159\u2013182","journal-title":"Manage Sci"},{"key":"1600_CR15","volume-title":"Grammatical inference: learning automata and grammar","author":"Cdl Higuera","year":"2003","unstructured":"Higuera Cdl (2003) Grammatical inference: learning automata and grammar. Cambridge University Press, Cambridge"},{"key":"1600_CR16","doi-asserted-by":"publisher","first-page":"13","DOI":"10.1080\/01621459.1963.10500830","volume":"58","author":"W Hoeffding","year":"1963","unstructured":"Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J Am Stat Assoc (JASA) 58:13\u201330","journal-title":"J Am Stat Assoc (JASA)"},{"issue":"3","key":"1600_CR17","doi-asserted-by":"publisher","first-page":"127","DOI":"10.1287\/deca.1050.0020","volume":"2","author":"RA Howard","year":"2005","unstructured":"Howard RA, Matheson JE (2005) Influence diagrams. Decis Anal 2(3):127\u2013143","journal-title":"Decis Anal"},{"key":"1600_CR18","unstructured":"Katt S, Oliehoek FA, Amato C (2017) Learning in pomdps with monte Carlo tree search. In: Proceedings of the 34th international conference on machine learning (ICML), pp 1819\u20131827"},{"key":"1600_CR19","unstructured":"Katt S, Oliehoek FA, Amato C (2019) Bayesian reinforcement learning in factored pomdps. In: Proceedings of the 18th international conference on autonomous agents and multiagent systems (AAMAS), pp 7\u201315"},{"key":"1600_CR20","unstructured":"Khandelwal P, Stone PH (2014) Multi-robot human guidance using topological graphs. In: Proceedings of the 28th international conference on association for the advancement of artificial intelligence (AAAI), pp 65\u201372"},{"issue":"1","key":"1600_CR21","doi-asserted-by":"publisher","first-page":"181","DOI":"10.1016\/S0899-8256(02)00544-4","volume":"45","author":"D Koller","year":"2003","unstructured":"Koller D, Milch B (2003) Multi-agent influence diagrams for representing and solving games. Games Econom Behav 45(1):181\u2013221","journal-title":"Games Econom Behav"},{"key":"1600_CR22","doi-asserted-by":"crossref","unstructured":"Lewis M, Sycara K(2011) Network-centric control for multirobot teams in urban search and rescue. In: The 44th 2011 Hawaii international conference on systems sciences (HICSS). IEEE, pp 1\u201310","DOI":"10.1109\/HICSS.2011.315"},{"key":"1600_CR23","doi-asserted-by":"crossref","unstructured":"Loftin RT, MacGlashan J, Peng B, Taylor ME, Littman ML, Huang J, Roberts DL (2014) A strategy-aware technique for learning behaviors from discrete human feedback. In: Proceedings of the 28th international conference on association for the advancement of artificial intelligence (AAAI), pp 937\u2013943","DOI":"10.1609\/aaai.v28i1.8839"},{"key":"1600_CR24","unstructured":"Marecki J, Gupta T, Varakantham P, Tambe M, Yokoo M (2008) Not all agents are equal: Scaling up distributed pomdps for agent networks. In: Proceedings of the 7th international conference on autonomous agents and multi-agent systems (AAMAS), pp 485\u2013492"},{"key":"1600_CR25","unstructured":"Panella A, Gmytrasiewicz P (2015) Nonparametric bayesian learning of other agents\u2019 policies in multiagent pomdps. In: Proceedings of the 29th international conference on association for the advancement of artificial intelligence(AAAI), pp 1875\u20131876"},{"key":"1600_CR26","doi-asserted-by":"crossref","unstructured":"Robu V, Vinyals M, Rogers A, Jennings NR (2014) Efficient buyer groups for prediction-of-use electricity tariffs. In: Proceedings of the 28th international conference on association for the advancement of artificial intelligence (AAAI), pp 451\u2013457","DOI":"10.1609\/aaai.v28i1.8764"},{"key":"1600_CR27","doi-asserted-by":"crossref","unstructured":"Salah AA, Hung H, Aran O, Gunes H (2013) Creative applications of human behavior understanding. In: International workshop on human behavior understanding (HBU). Springer, pp 1\u201314","DOI":"10.1007\/978-3-319-02714-2_1"},{"key":"1600_CR28","unstructured":"Schlenker A, Thakoor O, Xu H, Fang F, Tambe M, Tran-Thanh L, Vayanos P, Vorobeychik Y(2018) Deceiving cyber adversaries: A game theoretic approach. In: Proceedings of the 17th international conference on autonomous agents and multiagent systems (AAMAS), vol\u00a02, pp 892\u2013900"},{"issue":"2","key":"1600_CR29","doi-asserted-by":"publisher","first-page":"190","DOI":"10.1007\/s10458-007-9026-5","volume":"17","author":"S Seuken","year":"2008","unstructured":"Seuken S, Zilberstein S (2008) Formal models and algorithms for decentralized decision making under uncertainty. J Auton Agents Multi-Agent Syst 17(2):190\u2013250","journal-title":"J Auton Agents Multi-Agent Syst"},{"key":"1600_CR30","doi-asserted-by":"crossref","unstructured":"Simao TD, Spaan MTJ (2019)Structure learning for safe policy improvement. In: Proceedings of the 28th international joint conference on artificial intelligence (IJCAI), pp 3453\u20133459","DOI":"10.24963\/ijcai.2019\/479"},{"issue":"5","key":"1600_CR31","doi-asserted-by":"publisher","first-page":"1071","DOI":"10.1287\/opre.21.5.1071","volume":"21","author":"RD Smallwood","year":"1973","unstructured":"Smallwood RD, Sondik EJ (1973) The optimal control of partially observable Markov processes over a finite horizon. Oper Res (OR) 21(5):1071\u20131088","journal-title":"Oper Res (OR)"},{"key":"1600_CR32","doi-asserted-by":"crossref","unstructured":"Stone P, Kaminka GA, Kraus S, Rosenschein JS (2010) Ad hoc autonomous agent teams: Collaboration without pre-coordination. In: Proceedings of the 24th international conference on association for the advancement of artificial intelligence (AAAI), pp 1504\u20131509","DOI":"10.1609\/aaai.v24i1.7529"},{"key":"1600_CR33","doi-asserted-by":"crossref","unstructured":"Suryadi D, Gmytrasiewicz PJ (1999) Learning models of other agents using influence diagrams. In: International conference on user modeling. Springer, pp 223\u2013232","DOI":"10.1007\/978-3-7091-2490-1_22"},{"key":"1600_CR34","unstructured":"Velagapudi P, Varakantham P, Sycara K, Scerri P (2011) Distributed model shaping for scaling to decentralized pomdps with hundreds of agents. In: Proceedings of the 10th international conference on autonomous agents and multi-agent systems (AAMAS), pp 955\u2013962"},{"key":"1600_CR35","unstructured":"Wu F, Zilberstein S, Jennings NR (2013) Monte-carlo expectation maximization for decentralized pomdps. In: Proceedings of the 23rd international joint conference on artificial intelligence (IJCAI), pp 397\u2013403"},{"key":"1600_CR36","unstructured":"Zeng Y, Doshi P(2009) Speeding up exact solutions of interactive influence diagrams using action equivalence. In: Proceedings of the 21st international joint conference on artificial intelligence (IJCAI), pp 1996\u20132001"},{"key":"1600_CR37","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1613\/jair.3461","volume":"43","author":"Y Zeng","year":"2012","unstructured":"Zeng Y, Doshi P (2012) Exploiting model equivalences for solving interactive dynamic influence diagrams. J Artif Intell Res (JAIR) 43:211\u2013255","journal-title":"J Artif Intell Res (JAIR)"},{"issue":"2","key":"1600_CR38","doi-asserted-by":"publisher","first-page":"511","DOI":"10.1007\/s10115-015-0912-x","volume":"49","author":"Y Zeng","year":"2016","unstructured":"Zeng Y, Doshi P, Chen Y, Pan Y, Mao H, Chandrasekaran M (2016) Approximating behavioral equivalence for scaling solutions of i-dids. Knowl Inf Syst 49(2):511\u2013552","journal-title":"Knowl Inf Syst"},{"key":"1600_CR39","unstructured":"Zeng Y, Mao H, Pan Y, Luo J(2012) Improved use of partial policies for identifying behavioral equivalences. In: Proceedings of the 11th international conference on autonomous agents and multiagent systems (AAMAS), pp 1015\u20131022"},{"key":"1600_CR40","doi-asserted-by":"publisher","first-page":"80","DOI":"10.1016\/j.artint.2014.03.004","volume":"212","author":"HH Zhuo","year":"2014","unstructured":"Zhuo HH, Yang Q (2014) Action-model acquisition for planning via transfer learning. Artif Intell 212:80\u2013103","journal-title":"Artif Intell"},{"key":"1600_CR41","doi-asserted-by":"crossref","unstructured":"Zilberstein S (2015) Building strong semi-autonomous systems. In: Proceedings of the 29th international conference on association for the advancement of artificial intelligence (AAAI), pp 4088\u20134092","DOI":"10.1609\/aaai.v29i1.9773"}],"container-title":["Knowledge and Information Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10115-021-01600-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10115-021-01600-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10115-021-01600-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,6]],"date-time":"2023-01-06T22:37:02Z","timestamp":1673044622000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10115-021-01600-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,8,8]]},"references-count":41,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2021,9]]}},"alternative-id":["1600"],"URL":"https:\/\/doi.org\/10.1007\/s10115-021-01600-5","relation":{},"ISSN":["0219-1377","0219-3116"],"issn-type":[{"value":"0219-1377","type":"print"},{"value":"0219-3116","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,8,8]]},"assertion":[{"value":"5 March 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 July 2021","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 July 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 August 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}