{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T05:30:57Z","timestamp":1774330257773,"version":"3.50.1"},"reference-count":67,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2021,9,3]],"date-time":"2021-09-03T00:00:00Z","timestamp":1630627200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Shenzhen Basic Research Fund","award":["JCYJ20200813091134001"],"award-info":[{"award-number":["JCYJ20200813091134001"]}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61972261"],"award-info":[{"award-number":["61972261"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Knowl. Discov. Data"],"published-print":{"date-parts":[[2022,4,30]]},"abstract":"<jats:p>This article proposes a deep learning solution to the online portfolio selection problem based on learning a latent structure directly from a price time series. It introduces a novel wealth flow matrix for representing a latent structure that has special regular conditions to encode the knowledge about the relative strengths of assets in portfolios. Therefore, a wealth flow model (WFM) is proposed to learn wealth flow matrices and maximize portfolio wealth simultaneously. Compared with existing approaches, our work has several distinctive benefits: (1) the learning of wealth flow matrices makes our model more generalizable than models that only predict wealth proportion vectors, and (2) the exploitation of wealth flow matrices and the exploration of wealth growth are integrated into our deep reinforcement algorithm for the WFM. 
These benefits, in combination, lead to a highly-effective approach for generating reasonable investment behavior, including short-term trend following, the following of a few losers, no self-investment, and sparse portfolios. Extensive experiments on five benchmark datasets from real-world stock markets confirm the theoretical advantage of the WFM, which achieves the Pareto improvements in terms of multiple performance indicators and the steady growth of wealth over the state-of-the-art algorithms.<\/jats:p>","DOI":"10.1145\/3464308","type":"journal-article","created":{"date-parts":[[2021,9,4]],"date-time":"2021-09-04T04:05:52Z","timestamp":1630728352000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Wealth Flow Model: Online Portfolio Selection Based on Learning Wealth Flow Matrices"],"prefix":"10.1145","volume":"16","author":[{"given":"Jianfei","family":"Yin","sequence":"first","affiliation":[{"name":"College of Computer Science and Software Engineering and National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ruili","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Natural and Computational Sciences, Massey University, Auckland, New Zealand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yeqing","family":"Guo","sequence":"additional","affiliation":[{"name":"Tisson Regaltc Communication Technology, Guangzhou, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yizhe","family":"Bai","sequence":"additional","affiliation":[{"name":"Shenzhen University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shunda","family":"Ju","sequence":"additional","affiliation":[{"name":"Shenzhen University, Shenzhen, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weili","family":"Liu","sequence":"additional","affiliation":[{"name":"Shenzhen University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Joshua Zhexue","family":"Huang","sequence":"additional","affiliation":[{"name":"Shenzhen University, Shenzhen, China"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,9,3]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation. 265\u2013283","author":"Abadi M.","unstructured":"M. Abadi , P. Barham , J. Chen , Z. Chen , A. Davis , J. Dean , M. Devin , S. Ghemawat , G. Irving , M. Isard , M. Kudlur , J. Levenberg , R. Monga , S. Moore , D. G. Murray , B. Steiner , P. Tucker , V. Vasudevan , P. Warden , M. Wicke , Y. Yu , and X. Zheng . 2016. Tensorflow: A system for large-scale machine learning . In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation. 265\u2013283 . M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, M. Kudlur, J. Levenberg, R. Monga, S. Moore, D. G. Murray, B. Steiner, P. Tucker, V. Vasudevan, P. Warden, M. Wicke, Y. Yu, and X. Zheng. 2016. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation. 265\u2013283."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of the 23rd International Conference on Machine Learning. 148","author":"Agarwal A.","unstructured":"A. Agarwal , E. Hazan , S. Kale , and R. E. Schapire . 2006. Algorithms for portfolio management based on the newton method . In Proceedings of the 23rd International Conference on Machine Learning. 148 , 9\u201316. A. Agarwal, E. Hazan, S. Kale, and R. E. Schapire. 2006. Algorithms for portfolio management based on the newton method. 
In Proceedings of the 23rd International Conference on Machine Learning. 148, 9\u201316."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.3905\/jfds.2019.1.045"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.4086\/toc.2012.v008a006"},{"key":"e_1_2_1_5_1","volume-title":"Proceedings of the International Conference on Knowledge Science, Engineering and Management. Springer, 472\u2013480","author":"Bai Y.","unstructured":"Y. Bai, J. Yin, S. Ju, Z. Chen, and J. Z. Huang. 2020. Long and short term risk control for online portfolio selection. In Proceedings of the International Conference on Knowledge Science, Engineering and Management. Springer, 472\u2013480."},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1080\/01621459.2017.1285773"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1007530728748"},{"key":"e_1_2_1_8_1","volume-title":"Matrix Calculus","author":"Bodewig E.","unstructured":"E. Bodewig. 2014. Matrix Calculus.
Elsevier."},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.5555\/1622467.1622484"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSP.2019.2941067"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2016.31"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00500-018-3281-z"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.physa.2019.04.185"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCNS.2019.2906916"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-9965.1991.tb00002.x"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/18.485708"},{"key":"e_1_2_1_17_1","volume-title":"Proceedings of the 28th AAAI Conference on Artificial Intelligence. 1185\u20131191","author":"Das P.","unstructured":"P. Das , N. Johnson , and A. Banerjee . 2014. Online portfolio selection with group sparsity . In Proceedings of the 28th AAAI Conference on Artificial Intelligence. 1185\u20131191 . P. Das, N. Johnson, and A. Banerjee. 2014. Online portfolio selection with group sparsity. In Proceedings of the 28th AAAI Conference on Artificial Intelligence. 1185\u20131191."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.conb.2010.02.008"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.2012.2205131"},{"key":"e_1_2_1_20_1","volume-title":"Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2094\u20132100","author":"Guez Van Hasselt H., A.","unstructured":"Van Hasselt H., A. Guez , and D. Silver . 2016. Deep reinforcement learning with double q-learning . In Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2094\u20132100 . Van Hasselt H., A. Guez, and D. Silver. 2016. Deep reinforcement learning with double q-learning. In Proceedings of the 30th AAAI Conference on Artificial Intelligence. 
2094\u20132100."},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-007-5016-8"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1111\/1467-9965.00058"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_1_24_1","volume-title":"Proceedings of the 2014 IEEE International Conference on Robotics and Automation. IEEE, 1048\u20131053","author":"Hu N.","unstructured":"N. Hu, G. Englebienne, Z. Lou, and B. Kr\u00f6se. 2014. Learning latent structure for activity recognition. In Proceedings of the 2014 IEEE International Conference on Robotics and Automation. IEEE, 1048\u20131053."},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 24th International Joint Conference on Artificial Intelligence.","author":"Huang D.","unstructured":"D. Huang, Y. Zhu, B. Li, S. Zhou, and S. C. Hoi. 2015. Semi-universal portfolios with transaction costs. In Proceedings of the 24th International Joint Conference on Artificial Intelligence."},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.3905\/jpm.2017.44.1.015"},{"key":"e_1_2_1_27_1","first-page":"1563","article-title":"Near-optimal regret bounds for reinforcement learning","author":"Jaksch T.","year":"2010","unstructured":"T. Jaksch, R. Ortner, and P. Auer. 2010. Near-optimal regret bounds for reinforcement learning.
Journal of Machine Learning Research 11, Apr (2010), 1563\u20131600.","journal-title":"Journal of Machine Learning Research 11"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2018.09.036"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.frl.2017.12.009"},{"key":"e_1_2_1_30_1","volume-title":"Proceedings of the 2017 Intelligent Systems Conference. IEEE, 905\u2013913","author":"Jiang Z.","unstructured":"Z. Jiang and J. Liang . 2017. Cryptocurrency portfolio management with deep reinforcement learning . In Proceedings of the 2017 Intelligent Systems Conference. IEEE, 905\u2013913 . Z. Jiang and J. Liang. 2017. Cryptocurrency portfolio management with deep reinforcement learning. In Proceedings of the 2017 Intelligent Systems Conference. IEEE, 905\u2013913."},{"key":"e_1_2_1_31_1","volume-title":"Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM","author":"Kaplan H.","unstructured":"H. Kaplan , D. Naori , and D. Raz . 2020. Competitive analysis with a sample and the secretary problem . In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM , 2082\u20132095. H. Kaplan, D. Naori, and D. Raz. 2020. Competitive analysis with a sample and the secretary problem. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM, 2082\u20132095."},{"key":"e_1_2_1_32_1","first-page":"2469","article-title":"Deep reinforcement learning for sequence-to-sequence models","volume":"31","author":"Keneshloo Y.","year":"2019","unstructured":"Y. Keneshloo , T. Shi , N. Ramakrishnan , and C. K. Reddy . 2019 . Deep reinforcement learning for sequence-to-sequence models . IEEE Transactions on Neural Networks and Learning Systems 31 , 7 (2019), 2469 \u2013 2489 . Y. Keneshloo, T. Shi, N. Ramakrishnan, and C. K. Reddy. 2019. Deep reinforcement learning for sequence-to-sequence models. 
IEEE Transactions on Neural Networks and Learning Systems 31, 7 (2019), 2469\u20132489.","journal-title":"IEEE Transactions on Neural Networks and Learning Systems"},{"key":"e_1_2_1_33_1","doi-asserted-by":"crossref","first-page":"293","DOI":"10.1111\/j.1467-9965.2010.00430.x","article-title":"Universal semiconstant rebalanced portfolios","volume":"21","author":"Kozat S. S.","year":"2011","unstructured":"S. S. Kozat and A. C. Singer . 2011 . Universal semiconstant rebalanced portfolios . Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics 21 , 2 (2011), 293 \u2013 311 . S. S. Kozat and A. C. Singer. 2011. Universal semiconstant rebalanced portfolios. Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics 21, 2 (2011), 293\u2013311.","journal-title":"Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.5555\/3291125.3309625"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1093\/rfs\/12.5.1113"},{"key":"e_1_2_1_36_1","volume-title":"Proceedings of the 29th International Conference on Machine Learning.","author":"Li B.","unstructured":"B. Li and S. C. Hoi . 2012. On-line portfolio selection with moving average reversion . In Proceedings of the 29th International Conference on Machine Learning. B. Li and S. C. Hoi. 2012. On-line portfolio selection with moving average reversion. In Proceedings of the 29th International Conference on Machine Learning."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/2541315"},{"key":"e_1_2_1_38_1","doi-asserted-by":"crossref","unstructured":"B. Li S. C. Hoi P. Zhao and V. Gopalkrishnan. 2013. Confidence weighted mean reversion strategy for online portfolio selection. ACM Transactions on Knowledge Discovery from Data 7 1 (2013) 1\u201338. B. Li S. C. Hoi P. Zhao and V. Gopalkrishnan. 2013. 
Confidence weighted mean reversion strategy for online portfolio selection. ACM Transactions on Knowledge Discovery from Data 7 1 (2013) 1\u201338.","DOI":"10.1145\/2435209.2435213"},{"key":"e_1_2_1_39_1","first-page":"1","article-title":"OLPS: A toolbox for on-line portfolio selection","volume":"17","author":"Li B.","year":"2016","unstructured":"B. Li , D. Sahoo , and S. C. Hoi . 2016 . OLPS: A toolbox for on-line portfolio selection . Journal of Machine Learning Research 17 , 35 (2016), 1 \u2013 5 . B. Li, D. Sahoo, and S. C. Hoi. 2016. OLPS: A toolbox for on-line portfolio selection. Journal of Machine Learning Research 17, 35 (2016), 1\u20135.","journal-title":"Journal of Machine Learning Research"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1080\/14697688.2017.1357831"},{"key":"e_1_2_1_41_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-012-5281-z"},{"key":"e_1_2_1_42_1","volume-title":"Proceedings of the 2018 11th International Symposium on Computational Intelligence and Design","volume":"02","author":"Li J.","unstructured":"J. Li , R. Rao , and J. Shi . 2018. Learning to trade with deep actor critic methods . In Proceedings of the 2018 11th International Symposium on Computational Intelligence and Design , Vol. 02 . 66\u201371. J. Li, R. Rao, and J. Shi. 2018. Learning to trade with deep actor critic methods. In Proceedings of the 2018 11th International Symposium on Computational Intelligence and Design, Vol. 02. 66\u201371."},{"key":"e_1_2_1_43_1","volume-title":"Proceedings of the AAAI Conference on Artificial Intelligence","volume":"33","author":"Li S.","unstructured":"S. Li , Y. Wu , X. Cui , and H. Dong . 2019. Robust multi-agent reinforcement learning via minimax deep deterministic policy gradient . In Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 33 . 4213\u20134220. S. Li, Y. Wu, X. Cui, and H. Dong. 2019. Robust multi-agent reinforcement learning via minimax deep deterministic policy gradient. 
In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 4213\u20134220."},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ejor.2010.07.004"},{"key":"e_1_2_1_45_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2018.10.034"},{"key":"e_1_2_1_46_1","volume-title":"Proceedings of the 32nd International Conference on Neural Information Processing Systems. 8235\u20138245","author":"Luo H.","unstructured":"H. Luo , C. Y. Wei , and K. Zheng . 2018. Efficient online portfolio with logarithmic regret . In Proceedings of the 32nd International Conference on Neural Information Processing Systems. 8235\u20138245 . H. Luo, C. Y. Wei, and K. Zheng. 2018. Efficient online portfolio with logarithmic regret. In Proceedings of the 32nd International Conference on Neural Information Processing Systems. 8235\u20138245."},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1007\/s12559-018-9609-2"},{"key":"e_1_2_1_48_1","doi-asserted-by":"crossref","unstructured":"V. Mnih K. Kavukcuoglu D. Silver and A. A. Rusu. 2015. Human-level control through deep reinforcement learning. Nature 518 7540 (2015) 529\u2013533. V. Mnih K. Kavukcuoglu D. Silver and A. A. Rusu. 2015. Human-level control through deep reinforcement learning. Nature 518 7540 (2015) 529\u2013533.","DOI":"10.1038\/nature14236"},{"key":"e_1_2_1_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10479-016-2176-6"},{"key":"e_1_2_1_50_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patcog.2017.05.025"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-014-5474-8"},{"key":"e_1_2_1_52_1","volume-title":"Proceedings of the 35th Uncertainty in Artificial Intelligence. PMLR, 81\u201390","author":"Ortner R.","unstructured":"R. Ortner , P. Gajane , and P. Auer . 2020. Variational regret bounds for reinforcement learning . In Proceedings of the 35th Uncertainty in Artificial Intelligence. PMLR, 81\u201390 . R. Ortner, P. Gajane, and P. Auer. 2020. 
Variational regret bounds for reinforcement learning. In Proceedings of the 35th Uncertainty in Artificial Intelligence. PMLR, 81\u201390."},{"key":"e_1_2_1_53_1","doi-asserted-by":"publisher","DOI":"10.1137\/18M1174076"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISIT.2008.4595271"},{"key":"e_1_2_1_55_1","doi-asserted-by":"crossref","unstructured":"I. Rahwan M. Cebrian N. Obradovich and J. Bongard. 2019. Machine behaviour. Nature 568 7753 (2019) 477\u2013486. I. Rahwan M. Cebrian N. Obradovich and J. Bongard. 2019. Machine behaviour. Nature 568 7753 (2019) 477\u2013486.","DOI":"10.1038\/s41586-019-1138-y"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.chaos.2007.09.085"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-019-05815-0"},{"key":"e_1_2_1_58_1","doi-asserted-by":"crossref","unstructured":"D. Silver J. Schrittwieser K. Simonyan L. Antonoglou A. Huang A. Guez T. Hubert L. Baker M. Lai A. Bolton Y. Chen T. Lillicrap F. Hui L. Sifre G. Driessche T. Graepel and D. Hassabis. 2017. Mastering the game of Go without human knowledge. Nature 550 7676 (2017) 354\u2013359. D. Silver J. Schrittwieser K. Simonyan L. Antonoglou A. Huang A. Guez T. Hubert L. Baker M. Lai A. Bolton Y. Chen T. Lillicrap F. Hui L. Sifre G. Driessche T. Graepel and D. Hassabis. 2017. Mastering the game of Go without human knowledge. Nature 550 7676 (2017) 354\u2013359.","DOI":"10.1038\/nature24270"},{"key":"e_1_2_1_59_1","volume-title":"Proceedings of the Advances in Neural Information Processing Systems. 5998\u20136008","author":"Vaswani A.","unstructured":"A. Vaswani , N. Shazeer , N. Parmar , J. Uszkoreit , L. Jones , A. N. Gomez , L. Kaiser , and I. Polosukhin . 2017. Attention is all you need . In Proceedings of the Advances in Neural Information Processing Systems. 5998\u20136008 . A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. 2017. 
Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems. 5998\u20136008."},{"key":"e_1_2_1_60_1","volume-title":"Proceedings of the 28th International Conference on Computational Linguistics. 978\u2013987","author":"Xing F.","unstructured":"F. Xing , L. Malandri , Y. Zhang , and E. Cambria . 2020. Financial sentiment analysis: An investigation into common mistakes and silver bullets . In Proceedings of the 28th International Conference on Computational Linguistics. 978\u2013987 . F. Xing, L. Malandri, Y. Zhang, and E. Cambria. 2020. Financial sentiment analysis: An investigation into common mistakes and silver bullets. In Proceedings of the 28th International Conference on Computational Linguistics. 978\u2013987."},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCI.2018.2866727"},{"key":"e_1_2_1_62_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2019.03.029"},{"key":"e_1_2_1_63_1","volume-title":"Proceedings of the 37th International Conference on Machine Learning. PMLR, 10746\u201310756","author":"Yang L.","unstructured":"L. Yang and M. Wang . 2020. Reinforcement learning in feature space: Matrix bandit, kernels, and regret bound . In Proceedings of the 37th International Conference on Machine Learning. PMLR, 10746\u201310756 . L. Yang and M. Wang. 2020. Reinforcement learning in feature space: Matrix bandit, kernels, and regret bound. In Proceedings of the 37th International Conference on Machine Learning. 
PMLR, 10746\u201310756."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.imavis.2019.103867"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00500-019-04039-7"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2020.2977590"},{"key":"e_1_2_1_67_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.automatica.2019.108651"}],"container-title":["ACM Transactions on Knowledge Discovery from Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3464308","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3464308","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T20:12:16Z","timestamp":1750191136000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3464308"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,9,3]]},"references-count":67,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2022,4,30]]}},"alternative-id":["10.1145\/3464308"],"URL":"https:\/\/doi.org\/10.1145\/3464308","relation":{},"ISSN":["1556-4681","1556-472X"],"issn-type":[{"value":"1556-4681","type":"print"},{"value":"1556-472X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,9,3]]},"assertion":[{"value":"2019-06-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-04-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-09-03","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}