{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,15]],"date-time":"2026-03-15T02:59:09Z","timestamp":1773543549164,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":42,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,2,18]],"date-time":"2022-02-18T00:00:00Z","timestamp":1645142400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,2,18]]},"DOI":"10.1145\/3529836.3529857","type":"proceedings-article","created":{"date-parts":[[2022,6,21]],"date-time":"2022-06-21T20:27:55Z","timestamp":1655843275000},"page":"34-43","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Single stock trading with deep reinforcement learning: A comparative study"],"prefix":"10.1145","author":[{"given":"Jun","family":"GE","sequence":"first","affiliation":[{"name":"Research Center for Intelligent Social Governance, Zhejiang lab, China"}]},{"given":"Yuanqi","family":"QIN","sequence":"additional","affiliation":[{"name":"Research Center for Intelligent Social Governance, Zhejiang lab, China"}]},{"given":"Yaling","family":"Li","sequence":"additional","affiliation":[{"name":"Research Center for Intelligent Social Governance, Zhejiang lab, China"}]},{"given":"yanjia","family":"Huang","sequence":"additional","affiliation":[{"name":"School of Mathematics, Xi'an Jiaotong-Liverpool University, China"}]},{"given":"Hao","family":"Hu","sequence":"additional","affiliation":[{"name":"School of Engineering, Westlake University, China"}]}],"member":"320","published-online":{"date-parts":[[2022,6,21]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2547339"},{"key":"e_1_3_2_1_2_1","first-page":"18","volume":"2018","author":"Huang C Y","unstructured":"Huang C Y . Financial Trading as a Game: a Deep Reinforcement Learning Approach[J]. Quantitative Finance , 2018 ,06(01): 18 - 20 . Huang C Y. Financial Trading as a Game: a Deep Reinforcement Learning Approach[J]. Quantitative Finance,2018,06(01):18-20.","journal-title":"Quantitative Finance"},{"issue":"31","key":"e_1_3_2_1_3_1","first-page":"041","volume":"03","author":"Jo U","year":"2019","unstructured":"Jo U , Jo T , Kim W , Cooperative Multi -agent Reinforcement Learning Framework for Scalping Trading[J]. Computer Science , 2019 , 03 ( 31 ): 041 - 045 . Jo U,Jo T,Kim W,et al. Cooperative Multi-agent Reinforcement Learning Framework for Scalping Trading[J].Computer Science, 2019,03(31):041-045.","journal-title":"Computer Science"},{"key":"e_1_3_2_1_4_1","first-page":"59","volume-title":"Quantitative Finance","author":"Jiang Z","year":"2017","unstructured":"Jiang Z , Xu D , Liang J . A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem[J] . Quantitative Finance , 2017 .06(30): 59 - 63 . Jiang Z,Xu D,Liang J . A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem[J]. Quantitative Finance, 2017.06(30):59-63."},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCA.2007.904825"},{"key":"e_1_3_2_1_6_1","unstructured":"Lee J W. [IEEE ISIE 2001.2001 IEEE International Symposium on Industrial Electronics Proceedings - Pusan South Korea (12-16 June 2001)] ISIE 2001.2001 IEEE International Symposium on Industrial Electronics Proceedings (Cat. No. 01TH8570) - Stock price prediction using reinforcement learning [C]\/\/ IEEE International Symposium on Industrial Electronics. IEEE 2001:690-695.  Lee J W. [IEEE ISIE 2001.2001 IEEE International Symposium on Industrial Electronics Proceedings - Pusan South Korea (12-16 June 2001)] ISIE 2001.2001 IEEE International Symposium on Industrial Electronics Proceedings (Cat. No. 01TH8570) - Stock price prediction using reinforcement learning [C]\/\/ IEEE International Symposium on Industrial Electronics. IEEE 2001:690-695."},{"key":"e_1_3_2_1_7_1","first-page":"296","volume":"86","author":"Lee J W","year":"2003","unstructured":"Lee J W , Kim S D , Lee J , An Intelligent Stock Trading System Based on Reinforcement Learning[J]. ICE Transactions on Information and Systems , 2003 , E86 -D(2):p. 296 - 305 . Lee J W,Kim S D, Lee J, An Intelligent Stock Trading System Based on Reinforcement Learning[J]. ICE Transactions on Information and Systems, 2003, E86-D(2):p.296-305.","journal-title":"ICE Transactions on Information and Systems"},{"issue":"5","key":"e_1_3_2_1_8_1","first-page":"917","volume":"17","author":"Moody J","year":"1998","unstructured":"Moody J , Saffell M. Reinforcement learning for trading [J]. Advances in Neural Information Processing Systems , 1998 , 17 ( 5-6 ): 917 - 923 . Moody J, Saffell M. Reinforcement learning for trading [J]. Advances in Neural Information Processing Systems, 1998, 17(5-6):917-923.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_2_1_9_1","unstructured":"Ryan Lowe. Multi-agent Actor-Critic for Mixed Cooperative-Competitive Environments[J]. Computer Science 2017.06(07):00275-00276  Ryan Lowe. Multi-agent Actor-Critic for Mixed Cooperative-Competitive Environments[J]. Computer Science 2017.06(07):00275-00276"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.2006.18.7.1527"},{"key":"e_1_3_2_1_11_1","first-page":"252","volume":"2018","author":"Patel Y.","unstructured":"Patel Y. Optimizing Market Making using Multi-agent Reinforcement Learning [J]. Quantitative Finance , 2018 ,12(26): 252 - 259 . Patel Y. Optimizing Market Making using Multi-agent Reinforcement Learning [J]. Quantitative Finance,2018,12(26):252-259.","journal-title":"Quantitative Finance"},{"key":"e_1_3_2_1_12_1","first-page":"497","volume":"2019","author":"Sato Y.","unstructured":"Sato Y. Model-Free Reinforcement Learning for Financial Portfolios: a Brief Survey[J]. Quantitative Finance , 2019 ,05(03): 497 - 499 Sato Y. Model-Free Reinforcement Learning for Financial Portfolios: a Brief Survey[J]. Quantitative Finance,2019,05(03):497-499","journal-title":"Quantitative Finance"},{"key":"e_1_3_2_1_13_1","first-page":"1108","volume":"2019","author":"Mao H","unstructured":"Mao H , Zhang Z , Xiao Z , Modelling the Dynamic Joint Policy of Teammates with Attention Multi-agent DDPG[J]. AAMAS , 2019 ,05: 1108 - 1116 . Mao H,Zhang Z,Xiao Z,et al. Modelling the Dynamic Joint Policy of Teammates with Attention Multi-agent DDPG[J].AAMAS,2019,05:1108-1116.","journal-title":"AAMAS"},{"key":"e_1_3_2_1_14_1","first-page":"2267","volume":"333","author":"Lai S","year":"2015","unstructured":"Lai S , Xu L , Liu K , Recurrent Convolutional Neural Networks for Text Classification[C]\/\/ AAAI . 2015 , 333 : 2267 - 2273 . Lai S, Xu L, Liu K, Recurrent Convolutional Neural Networks for Text Classification[C]\/\/AAAI. 2015, 333: 2267-2273.","journal-title":"AAAI"},{"key":"e_1_3_2_1_15_1","volume-title":"Semantic parsing via staged query graph generation: question answering with knowledge base [J]","author":"Yih S W","year":"2015","unstructured":"Yih S W , Chang M W , He X , Semantic parsing via staged query graph generation: question answering with knowledge base [J] . 2015 . Yih S W, Chang M W, He X, Semantic parsing via staged query graph generation: question answering with knowledge base [J]. 2015."},{"key":"e_1_3_2_1_16_1","first-page":"123","volume":"2015","author":"Bogdanova D","unstructured":"Bogdanova D , dos Santos C , Barbosa L , Detecting semantically equivalent questions in online user forums[C]\/\/ Proceedings of the Nineteenth Conference on Computational Natural Language Learning. 2015 : 123 - 131 . Bogdanova D, dos Santos C, Barbosa L, Detecting semantically equivalent questions in online user forums[C]\/\/Proceedings of the Nineteenth Conference on Computational Natural Language Learning. 2015: 123-131.","journal-title":"Proceedings of the Nineteenth Conference on Computational Natural Language Learning."},{"key":"e_1_3_2_1_17_1","volume-title":"Paraphrase detection using recursive autoencoder [J]. Source: [http:\/\/nlp. stanford. edu\/courses\/cs224n\/2011\/reports\/ehhuang. pdf]","author":"Huang E.","year":"2011","unstructured":"Huang E. Paraphrase detection using recursive autoencoder [J]. Source: [http:\/\/nlp. stanford. edu\/courses\/cs224n\/2011\/reports\/ehhuang. pdf] , 2011 . Huang E. Paraphrase detection using recursive autoencoder [J]. Source: [http:\/\/nlp. stanford. edu\/courses\/cs224n\/2011\/reports\/ehhuang. pdf], 2011."},{"key":"e_1_3_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2015.08.456"},{"key":"e_1_3_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASLP.2014.2339736"},{"key":"e_1_3_2_1_20_1","volume-title":"A LSTM-based method for stock returns prediction: a case study of China stock market[C]\/\/2015 IEEE International Conference on Big Data (Big Data)","author":"Chen K","year":"2015","unstructured":"Chen K , Zhou Y , Dai F. A LSTM-based method for stock returns prediction: a case study of China stock market[C]\/\/2015 IEEE International Conference on Big Data (Big Data) . IEEE , 2015 : 2823-2824. Chen K, Zhou Y, Dai F. A LSTM-based method for stock returns prediction: a case study of China stock market[C]\/\/2015 IEEE International Conference on Big Data (Big Data). IEEE, 2015: 2823-2824."},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2547339"},{"issue":"2","key":"e_1_3_2_1_22_1","first-page":"1","volume":"5","author":"Dunlavy T. G.","year":"2011","unstructured":"D. M. Dunlavy , T. G. Kolda , E. Acar . Temporal Link Prediction Using Matrix and Tensor Factorizations[J]. ACM Transactions on Knowledge Discovery from Data , 2011 , 5 ( 2 ): 1 - 27 . D. M. Dunlavy, T. G. Kolda, E. Acar. Temporal Link Prediction Using Matrix and Tensor Factorizations[J]. ACM Transactions on Knowledge Discovery from Data, 2011, 5(2): 1-27.","journal-title":"Data"},{"key":"e_1_3_2_1_23_1","volume-title":"Mastering the game of Go with deep neural networks and tree search [J]. nature","author":"Silver D","year":"2016","unstructured":"Silver D , Huang A , Maddison C J , Mastering the game of Go with deep neural networks and tree search [J]. nature , 2016 , 529(7587): 484. Silver D, Huang A, Maddison C J, Mastering the game of Go with deep neural networks and tree search [J]. nature, 2016, 529(7587): 484."},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1038\/nature14236"},{"key":"e_1_3_2_1_25_1","first-page":"431","volume-title":"2017 10th International Symposium on. IEEE","author":"Si W","year":"2017","unstructured":"Si W , Li J , Ding P, A Multi -objective Deep Reinforcement Learning Approach for Stock Index Future's Intraday Trading [C]\/\/ Computational Intelligence and Design (ISCID) , 2017 10th International Symposium on. IEEE , 2017 , 2: 431 - 436 . Si W, Li J, Ding P, A Multi-objective Deep Reinforcement Learning Approach for Stock Index Future's Intraday Trading [C]\/\/ Computational Intelligence and Design (ISCID), 2017 10th International Symposium on. IEEE, 2017, 2: 431-436."},{"key":"e_1_3_2_1_26_1","volume-title":"Cryptocurrency portfolio management with deep reinforcement learning [C]\/\/Intelligent Systems Conference (IntelliSys)","author":"Jiang Z","year":"2017","unstructured":"Jiang Z , Liang J. Cryptocurrency portfolio management with deep reinforcement learning [C]\/\/Intelligent Systems Conference (IntelliSys) , 2017 . IEEE , 2017: 905-913. Jiang Z, Liang J. Cryptocurrency portfolio management with deep reinforcement learning [C]\/\/Intelligent Systems Conference (IntelliSys), 2017 . IEEE, 2017: 905-913."},{"key":"e_1_3_2_1_27_1","unstructured":"Hwang T Norris S Su H Deep Reinforcement Learning for Pairs Trading [J].  Hwang T Norris S Su H Deep Reinforcement Learning for Pairs Trading [J]."},{"key":"e_1_3_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/72.935097"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSMCA.2007.904825"},{"key":"e_1_3_2_1_30_1","volume-title":"2003 IEEE International Conference on Computational Intelligence for Financial Engineering, 2003. Proceedings. IEEE","author":"Gold C.","unstructured":"Gold C. FX trading via recurrent reinforcement learning [C] , 2003 IEEE International Conference on Computational Intelligence for Financial Engineering, 2003. Proceedings. IEEE , 2003: 363-370. Gold C. FX trading via recurrent reinforcement learning [C], 2003 IEEE International Conference on Computational Intelligence for Financial Engineering, 2003. Proceedings. IEEE, 2003: 363-370."},{"key":"e_1_3_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2005.10.012"},{"key":"e_1_3_2_1_32_1","volume-title":"An investigation into the use of reinforcement learning techniques within the algorithmic trading domain [D]. Master's thesis","author":"Cumming J","year":"2015","unstructured":"Cumming J , Alrajeh D D , Dickens L. An investigation into the use of reinforcement learning techniques within the algorithmic trading domain [D]. Master's thesis , Imperial College London , United Kingdoms , 2015 . Cumming J, Alrajeh D D, Dickens L. An investigation into the use of reinforcement learning techniques within the algorithmic trading domain [D]. Master's thesis, Imperial College London, United Kingdoms, 2015."},{"key":"e_1_3_2_1_33_1","volume-title":"Algorithm trading using q-learning and recurrent reinforcement learning [J]. positions","author":"Du X","year":"2016","unstructured":"Du X , Zhai J , Lv K. Algorithm trading using q-learning and recurrent reinforcement learning [J]. positions , 2016 , 1: 1. Du X, Zhai J, Lv K. Algorithm trading using q-learning and recurrent reinforcement learning [J]. positions, 2016, 1: 1."},{"key":"e_1_3_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2547339"},{"key":"e_1_3_2_1_35_1","doi-asserted-by":"crossref","unstructured":"Ardi T Tambet M Dorian K etal Multiagent cooperation and competition with deep reinforcement learning [J]. PLOS ONE 2017 12(4):  e0172395.  Ardi T Tambet M Dorian K et al. Multiagent cooperation and competition with deep reinforcement learning [J]. PLOS ONE 2017 12(4): e0172395.","DOI":"10.1371\/journal.pone.0172395"},{"key":"e_1_3_2_1_36_1","first-page":"18","volume":"2018","author":"Huang C Y","unstructured":"Huang C Y . Financial Trading as a Game: A Deep Reinforcement Learning Approach [J]. Quantitative Finance , 2018 ,06(01): 18 - 20 . Huang C Y . Financial Trading as a Game: A Deep Reinforcement Learning Approach[J]. Quantitative Finance,2018,06(01):18-20.","journal-title":"Quantitative Finance"},{"issue":"31","key":"e_1_3_2_1_37_1","first-page":"041","volume":"03","author":"Jo U","year":"2019","unstructured":"Jo U , Jo T , Kim W , Cooperative Multi -agent Reinforcement Learning Framework for Scalping Trading[J]. Computer Science , 2019 , 03 ( 31 ): 041 - 045 . Jo U,Jo T,Kim W,et al. Cooperative Multi-agent Reinforcement Learning Framework for Scalping Trading[J].Computer Science, 2019,03(31):041- 045.","journal-title":"Computer Science"},{"key":"e_1_3_2_1_38_1","volume-title":"International conference on machine learning (pp. 1928-1937)","author":"Mnih V.","year":"2016","unstructured":"Mnih , V. , Badia , A. P. , Mirza , M. , Graves , A. , Lillicrap , T. , Harley , T. , ... & Kavukcuoglu , K. ( 2016 , June). Asynchronous methods for deep reinforcement learning . In International conference on machine learning (pp. 1928-1937) . PMLR. Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., ... & Kavukcuoglu, K. (2016, June). Asynchronous methods for deep reinforcement learning. In International conference on machine learning (pp. 1928-1937). PMLR."},{"key":"e_1_3_2_1_39_1","unstructured":"Schulman John Filip Wolski Prafulla Dhariwal Alec Radford and Oleg Klimov. \"Proximal policy optimization algorithms.\" arXiv preprint arXiv:1707.06347 (2017).  Schulman John Filip Wolski Prafulla Dhariwal Alec Radford and Oleg Klimov. \"Proximal policy optimization algorithms.\" arXiv preprint arXiv:1707.06347 (2017)."},{"key":"e_1_3_2_1_40_1","volume-title":"Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971","author":"Lillicrap T. P.","year":"2015","unstructured":"Lillicrap , T. P. , Hunt , J. J. , Pritzel , A. , Heess , N. , Erez , T. , Tassa , Y. , ... & Wierstra , D. ( 2015 ). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 . Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., ... & Wierstra, D. (2015). Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971."},{"key":"e_1_3_2_1_41_1","volume-title":"International Conference on Machine Learning (pp. 1587-1596)","author":"Fujimoto S.","year":"2018","unstructured":"Fujimoto , S. , Hoof , H. , & Meger , D. ( 2018 , July). Addressing function approximation error in actor-critic methods . In International Conference on Machine Learning (pp. 1587-1596) . PMLR. Fujimoto, S., Hoof, H., & Meger, D. (2018, July). Addressing function approximation error in actor-critic methods. In International Conference on Machine Learning (pp. 1587-1596). PMLR."},{"key":"e_1_3_2_1_42_1","volume-title":"International conference on machine learning (pp. 1861-1870)","author":"Haarnoja T.","year":"2018","unstructured":"Haarnoja , T. , Zhou , A. , Abbeel , P. , & Levine , S. ( 2018 , July). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor . In International conference on machine learning (pp. 1861-1870) . PMLR. Haarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018, July). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International conference on machine learning (pp. 1861-1870). PMLR."}],"event":{"name":"ICMLC 2022: 2022 14th International Conference on Machine Learning and Computing","location":"Guangzhou China","acronym":"ICMLC 2022"},"container-title":["2022 14th International Conference on Machine Learning and Computing (ICMLC)"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3529836.3529857","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3529836.3529857","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:31:25Z","timestamp":1750188685000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3529836.3529857"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,18]]},"references-count":42,"alternative-id":["10.1145\/3529836.3529857","10.1145\/3529836"],"URL":"https:\/\/doi.org\/10.1145\/3529836.3529857","relation":{},"subject":[],"published":{"date-parts":[[2022,2,18]]},"assertion":[{"value":"2022-06-21","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}