{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,2]],"date-time":"2026-04-02T21:30:59Z","timestamp":1775165459364,"version":"3.50.1"},"reference-count":22,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2021,3,31]],"date-time":"2021-03-31T00:00:00Z","timestamp":1617148800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,3,31]],"date-time":"2021-03-31T00:00:00Z","timestamp":1617148800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Complex Intell. Syst."],"published-print":{"date-parts":[[2022,6]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The digital curling game is a two-player zero-sum extensive game in a continuous action space. Several challenging problems remain unsolved, such as strategy uncertainty, large game-tree search, and the need for large amounts of supervised data. In this work, we combine NFSP and KR-UCT for digital curling games: NFSP uses two adversarial learning networks and can automatically produce supervised data, while KR-UCT handles large game-tree search in the continuous action space. We propose two reward mechanisms to make reinforcement learning converge quickly. 
Experimental results validate the proposed method and show that the strategy model can reach the Nash equilibrium.<\/jats:p>","DOI":"10.1007\/s40747-021-00345-6","type":"journal-article","created":{"date-parts":[[2021,3,31]],"date-time":"2021-03-31T13:03:48Z","timestamp":1617195828000},"page":"1857-1863","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":13,"title":["A game strategy model in the digital curling system based on NFSP"],"prefix":"10.1007","volume":"8","author":[{"given":"Yuntao","family":"Han","sequence":"first","affiliation":[]},{"given":"Qibin","family":"Zhou","sequence":"additional","affiliation":[]},{"given":"Fuqing","family":"Duan","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,3,31]]},"reference":[{"issue":"7","key":"345_CR1","doi-asserted-by":"publisher","first-page":"78","DOI":"10.1145\/1785414.1785439","volume":"53","author":"T Roughgarden","year":"2010","unstructured":"Roughgarden T (2010) Algorithmic game theory. Commun ACM 53(7):78\u201386","journal-title":"Commun ACM"},{"issue":"2","key":"345_CR2","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1109\/JAS.2016.7471613","volume":"3","author":"FY Wang","year":"2016","unstructured":"Wang FY, Zhang JJ, Zheng X et al (2016) Where does AlphaGo go: from Church-Turing thesis to AlphaGo thesis and beyond. IEEE CAA J Autom Sin 3(2):113\u2013120","journal-title":"IEEE CAA J Autom Sin"},{"key":"345_CR3","first-page":"1729","volume":"20","author":"M Zinkevich","year":"2007","unstructured":"Zinkevich M, Johanson M, Bowling M et al (2007) Regret minimization in games with incomplete information. Adv Neural Inf Process Syst 20:1729\u20131736","journal-title":"Adv Neural Inf Process Syst"},{"key":"345_CR4","unstructured":"Heinrich J, Silver D (2016) Deep reinforcement learning from self-play in imperfect-information games. 
arXiv preprint arXiv:1603.01121"},{"key":"345_CR5","unstructured":"Heinrich J, Lanctot M, Silver D (2015) Fictitious self-play in extensive-form games. In: International conference on machine learning. pp 805\u2013813"},{"issue":"1","key":"345_CR6","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1111\/1467-937X.00076","volume":"66","author":"E Maskin","year":"1999","unstructured":"Maskin E (1999) Nash equilibrium and welfare optimality. Rev Econ Stud 66(1):23\u201338","journal-title":"Rev Econ Stud"},{"key":"345_CR7","first-page":"282","volume-title":"Bandit based monte-carlo planning European conference on machine learning","author":"L Kocsis","year":"2006","unstructured":"Kocsis L, Szepesvari C (2006) Bandit based Monte-Carlo planning. In: European conference on machine learning. Springer, Berlin, pp 282\u2013293"},{"key":"345_CR8","unstructured":"Dulac-Arnold G, Evans R, van Hasselt H et al (2015) Deep reinforcement learning in large discrete action spaces. arXiv preprint arXiv:1512.07679"},{"key":"345_CR9","unstructured":"Yee T, Lisy V, Bowling MH, Kambhampati S (2016) Monte Carlo tree search in continuous action spaces with execution uncertainty. In: IJCAI. pp 690\u2013697"},{"issue":"1","key":"345_CR10","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1007\/s12283-013-0129-8","volume":"17","author":"N Maeno","year":"2014","unstructured":"Maeno N (2014) Dynamics and curl ratio of a curling stone. Sports Eng 17(1):33\u201341","journal-title":"Sports Eng"},{"key":"345_CR11","doi-asserted-by":"publisher","first-page":"596","DOI":"10.1016\/j.proeng.2016.06.246","volume":"147","author":"E Lozowski","year":"2016","unstructured":"Lozowski E et al (2016) Comparison of IMU measurements of curling stone dynamics with a numerical model. Procedia Eng 147:596\u2013601","journal-title":"Procedia Eng"},{"key":"345_CR12","doi-asserted-by":"crossref","unstructured":"Yamamoto M, Kato S, Iizuka H (2015) Digital curling strategy based on game tree search. 
In: 2015 IEEE conference on computational intelligence and games (CIG). IEEE, pp 474\u2013480","DOI":"10.1109\/CIG.2015.7317931"},{"key":"345_CR13","unstructured":"Ito T, Kitasei Y (2015) Proposal and implementation of digital curling. In: Proceedings of the IEEE conference on computational intelligence and games, CIG, pp 469\u2013473"},{"key":"345_CR14","unstructured":"Yamamoto M, Kato S, Iizuka H (2018) Learning of expected scores distribution for positions of digital curling. In: Proceedings of workshop on curling informatics (WCI2018). pp 8\u20139"},{"key":"345_CR15","unstructured":"Myungpyo H, Yoon KK, Sanghoon S (2018) Camera pose estimation based on concentric circles and parallel lines of a curling sheet, WCI2018, 12\u201315"},{"key":"345_CR16","doi-asserted-by":"crossref","unstructured":"Won D-O et al. (2018) Curly: an AI-based curling robot successfully competing in the olympic discipline of curling. In: IJCAI","DOI":"10.24963\/ijcai.2018\/870"},{"key":"345_CR17","first-page":"151","volume-title":"A curling agent based on the Monte-Carlo tree search considering the similarity of the best action among similar states advances in computer games","author":"K Ohto","year":"2017","unstructured":"Ohto K, Tanaka T (2017) A curling agent based on the Monte-Carlo tree search considering the similarity of the best action among similar states. In: Advances in computer games. Springer, Cham, pp 151\u2013164"},{"key":"345_CR18","unstructured":"Ahmad ZF, Holte RC, Bowling M (2016) Action selection for hammer shots in curling. In: IJCAI. pp 561\u2013567"},{"key":"345_CR19","unstructured":"Ahmad ZF (2017) Action selection for hammer shots in curling: optimization of non-convex continuous actions with stochastic action outcomes"},{"key":"345_CR20","unstructured":"Lee K, Kim SA, Choi J et al (2018) Deep reinforcement learning in continuous action spaces: a case study in the game of simulated curling. 
In: International conference on machine learning, pp 2937\u20132946"},{"key":"345_CR21","unstructured":"Mnih V, Kavukcuoglu K, Silver D et al. (2013) Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602"},{"key":"345_CR22","unstructured":"Schulman J, Wolski F, Dhariwal P et al. (2017) Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347"}],"container-title":["Complex &amp; Intelligent Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-021-00345-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s40747-021-00345-6\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s40747-021-00345-6.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,5,30]],"date-time":"2022-05-30T01:06:25Z","timestamp":1653872785000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s40747-021-00345-6"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,3,31]]},"references-count":22,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2022,6]]}},"alternative-id":["345"],"URL":"https:\/\/doi.org\/10.1007\/s40747-021-00345-6","relation":{},"ISSN":["2199-4536","2198-6053"],"issn-type":[{"value":"2199-4536","type":"print"},{"value":"2198-6053","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,3,31]]},"assertion":[{"value":"29 November 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"17 March 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 March 
2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"No conflict.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Code will be available after paper publication.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}}]}}