{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T07:43:39Z","timestamp":1740123819731,"version":"3.37.3"},"reference-count":42,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,4,20]],"date-time":"2022-04-20T00:00:00Z","timestamp":1650412800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,4,20]],"date-time":"2022-04-20T00:00:00Z","timestamp":1650412800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100000923","name":"Australian Research Council","doi-asserted-by":"crossref","award":["DP200103718"],"award-info":[{"award-number":["DP200103718"]}],"id":[{"id":"10.13039\/501100000923","id-type":"DOI","asserted-by":"crossref"}]},{"DOI":"10.13039\/501100001774","name":"University of Sydney","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100001774","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["World Wide Web"],"published-print":{"date-parts":[[2023,1]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We present a novel online decision-making solution, where the optimal path of a given decision tree is dynamically found based on the contextual bandits analysis. At each round, the learner finds a path in the decision tree by making a sequence of decisions following the tree structure and receives an <jats:italic>outcome<\/jats:italic> when a terminal node is reached. At each decision node, the environment information is observed to hint on which child node to visit, resulting in a better outcome. The objective is to learn the context-specific optimal decision for each decision node to maximize the accumulated outcome. In this paper, we propose <jats:italic>Dynamic Path Identifier (DPI)<\/jats:italic>, a learning algorithm where the <jats:italic>contextual bandit<\/jats:italic> is applied to every decision node, and the observed outcome is used as the <jats:italic>reward<\/jats:italic> of the previous decisions of the same round. The technical difficulty of DPI is the <jats:italic>high exploration challenge<\/jats:italic> caused by the width (i.e., the number of paths) of the tree as well as the large context space. We mathematically prove that DPI\u2019s regret per round approached zero as the number of the rounds approaches infinity. We also prove that the regret is not a function of the number of paths in the tree. Numerical evaluations are provided to complement the theoretical analysis.<\/jats:p>","DOI":"10.1007\/s11280-022-01043-0","type":"journal-article","created":{"date-parts":[[2022,4,20]],"date-time":"2022-04-20T11:06:24Z","timestamp":1650452784000},"page":"271-296","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Dynamic path learning in decision trees using contextual bandits"],"prefix":"10.1007","volume":"26","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-8411-2402","authenticated-orcid":false,"given":"Weiyu","family":"Ju","sequence":"first","affiliation":[]},{"given":"Dong","family":"Yuan","sequence":"additional","affiliation":[]},{"given":"Wei","family":"Bao","sequence":"additional","affiliation":[]},{"given":"Liming","family":"Ge","sequence":"additional","affiliation":[]},{"given":"Bing Bing","family":"Zhou","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,4,20]]},"reference":[{"key":"1043_CR1","volume-title":"Decision trees for decision making","author":"JF Magee","year":"1964","unstructured":"Magee, J.F.: Decision trees for decision making. Harvard Business Review, Boston (1964)"},{"key":"1043_CR2","volume-title":"Classification and regression trees","author":"L Breiman","year":"1984","unstructured":"Breiman, L., Friedman, J., Stone, C.J., Olshen, R.A.: Classification and regression trees. CRC Press, United States (1984)"},{"issue":"3","key":"1043_CR3","doi-asserted-by":"publisher","first-page":"660","DOI":"10.1109\/21.97458","volume":"21","author":"SR Safavian","year":"1991","unstructured":"Safavian, S.R., Landgrebe, D.: A survey of decision tree classifier methodology. IEEE Transactions on Systems, Man, and Cybernetics 21(3), 660\u2013674 (1991)","journal-title":"IEEE Transactions on Systems, Man, and Cybernetics"},{"issue":"6","key":"1043_CR4","doi-asserted-by":"publisher","first-page":"1787","DOI":"10.1007\/s11280-018-0619-5","volume":"21","author":"S Zhang","year":"2018","unstructured":"Zhang, S.: Multiple-scale cost sensitive decision tree learning. World Wide Web 21(6), 1787\u20131800 (2018)","journal-title":"World Wide Web"},{"key":"1043_CR5","doi-asserted-by":"publisher","unstructured":"Huntley, N., Troffaes, M.: Normal form backward induction for decision trees with coherent lower previsions. Annals of Operations Research, 195 (2011). https:\/\/doi.org\/10.1007\/s10479-011-0968-2","DOI":"10.1007\/s10479-011-0968-2"},{"key":"1043_CR6","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1007\/s10100-017-0479-6","volume":"26","author":"B Kaminski","year":"2018","unstructured":"Kaminski, B., Jakubczyk, M., Szufel, P.: A framework for sensitivity analysis of decision trees. Central European Journal of Operations Research 26, 135\u2013159 (2018)","journal-title":"Central European Journal of Operations Research"},{"key":"1043_CR7","unstructured":"Coquelin, P., Munos, R.: Bandit algorithms for tree search. (2007) arXiv:cs\/0703062"},{"issue":"1","key":"1043_CR8","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1137\/S0097539701398375","volume":"32","author":"P Auer","year":"2003","unstructured":"Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: The nonstochastic multiarmed bandit problem. SIAM J. Comput. 32(1), 48\u201377 (2003)","journal-title":"SIAM J. Comput."},{"issue":"2","key":"1043_CR9","doi-asserted-by":"publisher","first-page":"913","DOI":"10.1016\/j.amc.2007.07.043","volume":"196","author":"DE Koulouriotis","year":"2008","unstructured":"Koulouriotis, D.E., Xanthopoulos, A.: Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems. Applied Mathematics and Computation 196(2), 913\u2013922 (2008)","journal-title":"Applied Mathematics and Computation"},{"key":"1043_CR10","doi-asserted-by":"crossref","unstructured":"Bouneffouf, D., Rish, I.: A survey on practical applications of multi-armed and contextual bandits. (2019) arxiv:1904.10040","DOI":"10.1109\/CEC48606.2020.9185782"},{"key":"1043_CR11","doi-asserted-by":"publisher","unstructured":"Wang, S., Liu, Q., Ge, T., Lian, D., Zhang, Z.: A hybrid bandit model with visual priors for creative ranking in display advertising. In: Proceedings of the Web Conference 2021. WWW \u201921, pp. 2324\u20132334. Association for Computing Machinery, New York, NY, USA (2021). https:\/\/doi.org\/10.1145\/3442381.3449910","DOI":"10.1145\/3442381.3449910"},{"key":"1043_CR12","doi-asserted-by":"publisher","unstructured":"Li, L., Chu, W., Langford, J., Schapire, R.E.: A contextual-bandit approach to personalized news article recommendation. In: Proceedings of the 19th International Conference on World Wide Web. WWW \u201910, pp. 661\u2013670. Association for Computing Machinery, New York, NY, USA (2010). https:\/\/doi.org\/10.1145\/1772690.1772758","DOI":"10.1145\/1772690.1772758"},{"key":"1043_CR13","doi-asserted-by":"publisher","unstructured":"Chan, H.P., Zhao, T., King, I.: Trust-aware peer assessment using multiarmed bandit algorithms. In: Proceedings of the 25th International Conference Companion on World Wide Web. WWW \u201916 Companion, pp. 899\u2013903. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE (2016). https:\/\/doi.org\/10.1145\/2872518.2891080","DOI":"10.1145\/2872518.2891080"},{"key":"1043_CR14","doi-asserted-by":"crossref","unstructured":"Oswal, U., Bhargava, A., Nowak, R.: Linear bandits with feature feedback. In: AAAI, pp. 5331\u20135338 (2020)","DOI":"10.1609\/aaai.v34i04.5980"},{"key":"1043_CR15","unstructured":"Russac, Y., Vernade, C., Capp\u00e9, O.: Weighted linear bandits for nonstationary environments. In: Wallach, H., Larochelle, H., Beygelzimer, A., d\u2019 Alch\u00e9-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32, pp. 12040\u201312049. Curran Associates, Inc., Vancouver, BC, Canada (2019)"},{"key":"1043_CR16","unstructured":"Ito, S., Hirahara, S., Soma, T., Yoshida, Y.: Tight first- and second-order regret bounds for adversarial linear bandits. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 2028\u20132038. Curran Associates, Inc., virtual (2020)"},{"key":"1043_CR17","unstructured":"Levine, N., Crammer, K., Mannor, S.: Rotting bandits. In: Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R. (eds.) Advances in Neural Information Processing Systems 30, pp. 3074\u20133083. Curran Associates Inc, California (2017)"},{"key":"1043_CR18","unstructured":"Seznec, J., Locatelli, A., Carpentier, A., Lazaric, A., Valko, M.: Rotting bandits are not harder than stochastic ones (2020)"},{"key":"1043_CR19","doi-asserted-by":"crossref","unstructured":"Li, F., Liu, J., Ji, B.: Combinatorial sleeping bandits with fairness constraints. IEEE Transactions on Network Science and Engineering, 1\u20131 (2019)","DOI":"10.1109\/INFOCOM.2019.8737461"},{"key":"1043_CR20","unstructured":"Saha, A., Gaillard, P., Valko, M.: Improved sleeping bandits with stochastic actions sets and adversarial rewards. (2020) arXiv:2004.06248"},{"issue":"1","key":"1043_CR21","first-page":"1","volume":"4","author":"R Combes","year":"2020","unstructured":"Combes, R., Prouti\u00e8re, A., Fauquette, A.: Unimodal bandits with continuous arms: Order-optimal regret without smoothness. POMACS 4(1), 1\u201328 (2020)","journal-title":"POMACS"},{"key":"1043_CR22","doi-asserted-by":"publisher","first-page":"6070","DOI":"10.1609\/aaai.v33i01.33016070","volume":"33","author":"Z Hu","year":"2019","unstructured":"Hu, Z., Zhang, J., Li, Z.: General robustness evaluation of incentive mechanism against bounded rationality using continuum-armed bandits. Proceedings of the AAAI Conference on Artificial Intelligence 33, 6070\u20136078 (2019)","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"1043_CR23","unstructured":"Wang, Y., Audibert, J.-Y., Munos, R.: Algorithms for infinitely many-armed bandits, 1729\u20131736 (2008)"},{"key":"1043_CR24","unstructured":"Kveton, B., Szepesvari, C., Wen, Z., Ashkan, A.: Cascading bandits: Learning to rank in the cascade model. In: ICML, Lille, France, pp. 767\u2013776 (2015)"},{"key":"1043_CR25","doi-asserted-by":"crossref","unstructured":"Li, S., Zhang, S.: Online clustering of contextual cascading bandits. In: Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA (2018)","DOI":"10.1609\/aaai.v32i1.11763"},{"key":"1043_CR26","unstructured":"Br\u00e9g\u00e8re, M., Gaillard, P., Goude, Y., Stoltz, G.: Target tracking for contextual bandits: Application to demand side management. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th ICML. Proceedings of Machine Learning Research, vol. 97, pp. 754\u2013763. PMLR, Long Beach, California, USA (2019)"},{"key":"1043_CR27","doi-asserted-by":"crossref","unstructured":"Dimakopoulou, M., Zhou, Z., Athey, S., Imbens, G.: Balanced linear contextual bandits. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33. Honolulu, HI, USA, pp. 3445\u20133453 (2019)","DOI":"10.1609\/aaai.v33i01.33013445"},{"key":"1043_CR28","unstructured":"Zhou, Z., Xu, R., Blanchet, J.: Learning in generalized linear contextual bandits with stochastic delays. In: Wallach, H., Larochelle, H., Beygelzimer, A., d\u2019 Alch\u00e8-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32, pp. 5197\u20135208. Curran Associates, Inc., Red Hook, NY, USA, Vancouver, BC (2019)"},{"key":"1043_CR29","unstructured":"Zhang, C., Agarwal, A., Iii, H.D., Langford, J., Negahban, S.: Warmstarting contextual bandits: Robustly combining supervised and bandit feedback. In: Chaudhuri, K., Salakhutdinov, R. (eds.) Proceedings of the 36th ICML. Proceedings of Machine Learning Research, vol. 97, pp. 7335\u20137344. PMLR, Long Beach, California, USA (2019)"},{"key":"1043_CR30","unstructured":"Tirinzoni, A., Pirotta, M., Restelli, M., Lazaric, A.: An asymptotically optimal primal-dual incremental algorithm for contextual linear bandits. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M.F., Lin, H. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 1417\u20131427. Curran Associates, Inc., virtual (2020)"},{"key":"1043_CR31","unstructured":"Agarwal, A., Hsu, D., Kale, S., Langford, J., Li, L., Schapire, R.: Taming the monster: A fast and simple algorithm for contextual bandits. In: ICML, Beijing, China, pp. 1638\u20131646 (2014)"},{"key":"1043_CR32","unstructured":"Foster, D.J., Krishnamurthy, A., Luo, H.: Model selection for contextual bandits. In: Wallach, H., Larochelle, H., Beygelzimer, A., d\u2019 Alch\u00e9-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32, pp. 14741\u201314752. Curran Associates, Inc., Red Hook, NY, USA, Vancouver, BC (2019)"},{"key":"1043_CR33","unstructured":"Dudik, M., Hsu, D., Kale, S., Karampatziakis, N., Langford, J., Reyzin, L., Zhang, T.: Efficient optimal learning for contextual bandits. In: 27th UAI. UAI\u201911, pp. 169\u2013178. AUAI Press, Arlington, Virginia, USA (2011)"},{"key":"1043_CR34","doi-asserted-by":"crossref","unstructured":"Hazan, E., Megiddo, N.: Online learning with prior knowledge. In: International Conference on Computational Learning Theory, Berlin, Heidelberg, pp. 499\u2013513. Springer (2007)","DOI":"10.1007\/978-3-540-72927-3_36"},{"key":"1043_CR35","unstructured":"Slivkins, A.: Contextual bandits with similarity information. In: Proceedings of the 24th Annual Conference On Learning Theory, Budapest, Hungary, pp. 679\u2013702 (2011)"},{"key":"1043_CR36","unstructured":"Wanigasekara, N., Yu, C.: Nonparametric contextual bandits in metric spaces with unknown metric. In: Wallach, H., Larochelle, H., Beygelzimer, A., d\u2019 Alch\u00e9-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32, pp. 14684\u201314694. Curran Associates, Inc., Red Hook, NY, USA, Vancouver, BC (2019)"},{"key":"1043_CR37","doi-asserted-by":"crossref","unstructured":"Chen, J., Zhong, M., Li, J., Wang, D., Qian, T., Tu, H.: Effective deep attributed network representation learning with topology adapted smoothing. IEEE Transactions on Cybernetics (2021)","DOI":"10.1109\/TCYB.2021.3064092"},{"key":"1043_CR38","doi-asserted-by":"publisher","first-page":"106618","DOI":"10.1016\/j.knosys.2020.106618","volume":"212","author":"Z Li","year":"2021","unstructured":"Li, Z., Wang, X., Li, J., Zhang, Q.: Deep attributed network representation learning of complex coupling and interaction. Knowledge-Based Systems 212, 106618 (2021)","journal-title":"Knowledge-Based Systems"},{"key":"1043_CR39","doi-asserted-by":"crossref","unstructured":"Yang, Y., Guan, Z., Li, J., Zhao, W., Cui, J., Wang, Q.: Interpretable and efficient heterogeneous graph convolutional network. IEEE Transactions on Knowledge and Data Engineering (2021)","DOI":"10.1109\/TKDE.2021.3101356"},{"key":"1043_CR40","doi-asserted-by":"crossref","unstructured":"Cai, T., Li, J., Mian, A.S., Sellis, T., Yu, J.X., et al.: Target-aware holistic influence maximization in spatial social networks. IEEE Transactions on Knowledge and Data Engineering (2020)","DOI":"10.1109\/TKDE.2020.3003047"},{"key":"1043_CR41","unstructured":"Auer, P., Cesa-Bianchi, N., Freund, Y., Schapire, R.E.: Gambling in arigged casino: The adversarial multi-armed bandit problem. In: Proceedings of IEEE 36th Annual Foundations of Computer Science, pp. 322\u2013331 (1995)"},{"key":"1043_CR42","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511813658","volume-title":"Probability with Martingales","author":"D Williams","year":"1991","unstructured":"Williams, D.: Probability with Martingales. Cambridge University Press, Cambridge, United Kingdom (1991)"}],"container-title":["World Wide Web"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11280-022-01043-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11280-022-01043-0\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11280-022-01043-0.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,20]],"date-time":"2023-01-20T22:08:50Z","timestamp":1674252530000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11280-022-01043-0"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,4,20]]},"references-count":42,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2023,1]]}},"alternative-id":["1043"],"URL":"https:\/\/doi.org\/10.1007\/s11280-022-01043-0","relation":{},"ISSN":["1386-145X","1573-1413"],"issn-type":[{"type":"print","value":"1386-145X"},{"type":"electronic","value":"1573-1413"}],"subject":[],"published":{"date-parts":[[2022,4,20]]},"assertion":[{"value":"26 October 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 March 2022","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"8 March 2022","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 April 2022","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}