{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,21]],"date-time":"2025-06-21T04:05:22Z","timestamp":1750478722055,"version":"3.41.0"},"reference-count":25,"publisher":"Cambridge University Press (CUP)","issue":"3","license":[{"start":{"date-parts":[[2009,7,1]],"date-time":"2009-07-01T00:00:00Z","timestamp":1246406400000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/www.cambridge.org\/core\/terms"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2009,7]]},"abstract":"<jats:title>Abstract<\/jats:title>\n\t  <jats:p>We describe a system that automatically learns effective and engaging dialogue strategies, generated from a library of dialogue content, using reinforcement learning from user feedback. Besides the more usual clarification and verification components of dialogue, this library contains various social elements like greetings, apologies, small talk, relational questions and jokes. We tested the method through an experimental dialogue system that encourages take-up of exercise and shows that the learned dialogue policy performs as well as one built by human experts for this system.<\/jats:p>","DOI":"10.1017\/s1351324908004956","type":"journal-article","created":{"date-parts":[[2008,12,18]],"date-time":"2008-12-18T10:08:52Z","timestamp":1229594932000},"page":"355-378","source":"Crossref","is-referenced-by-count":3,"title":["Learning effective and engaging strategies for advice-giving human-machine dialogue"],"prefix":"10.1017","volume":"15","author":[{"given":"MARTIJN","family":"SPITTERS","sequence":"first","affiliation":[]},{"given":"MARCO","family":"DE BONI","sequence":"additional","affiliation":[]},{"given":"JAKUB","family":"ZAVREL","sequence":"additional","affiliation":[]},{"given":"REMKO","family":"BONNEMA","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2009,7,1]]},"reference":[{"volume-title":"Reinforcement Learning","year":"1998","author":"Sutton","key":"S1351324908004956_manual_ref-20"},{"key":"S1351324908004956_manual_ref-25","first-page":"33","volume-title":"Text Mining, Theoretical Aspects and Applications","author":"Zavrel","year":"2003"},{"key":"S1351324908004956_manual_ref-17","doi-asserted-by":"crossref","unstructured":"Rudary, M. , Singh, S. , and Pollack, M. E. 2004. Adaptive cognitive orthotics: Combining reinforcement learning and constraint-based temporal reasoning. In Proceedings of the 21st International Conference on Machine Learning, Banff, Alberta, Canada, ACM, New York, USA.","DOI":"10.1145\/1015330.1015411"},{"key":"S1351324908004956_manual_ref-19","doi-asserted-by":"crossref","first-page":"105","DOI":"10.1613\/jair.859","article-title":"Optimizing dialogue management with reinforcement learning: experiments with the NJFun system","volume":"16","author":"Singh","year":"2002","journal-title":"Journal of Artificial Intelligence Research (JAIR)"},{"key":"S1351324908004956_manual_ref-4","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511486579","volume-title":"Memory-Based Language Processing","author":"Daelemans","year":"2005"},{"key":"S1351324908004956_manual_ref-6","doi-asserted-by":"crossref","unstructured":"English, M. S. and Heeman, P. A. 2005. Learning mixed initiative dialog strategies by using reinforcement learning on both conversants. In Proceedings of HLT\/NAACL, Vancouver, British Columbia, Canada, ACL, Morristown, NJ, USA.","DOI":"10.3115\/1220575.1220702"},{"key":"S1351324908004956_manual_ref-7","doi-asserted-by":"crossref","unstructured":"Frampton, M. and Lemon, O. 2006. Learning more effective dialogue strategies using limited dialogue move features. In Proceedings of the Annual Meeting of the ACL, Sydney, Australia, ACL, Morristown, NJ, USA.","DOI":"10.3115\/1220175.1220199"},{"key":"S1351324908004956_manual_ref-5","unstructured":"Daelemans, W. , Buchholz, S. , and Veenstra, J. 1999. Memory-Based Shallow Parsing. In Proceedings of CoNLL-99, Bergen, Norway."},{"key":"S1351324908004956_manual_ref-2","doi-asserted-by":"publisher","DOI":"10.1023\/A:1024026532471"},{"key":"S1351324908004956_manual_ref-12","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1023\/A:1015036910358","article-title":"Designing and evaluating an adaptive spoken dialogue system","volume":"12","author":"Litman","year":"2002","journal-title":"User Modeling and User-Adapted Interaction"},{"key":"S1351324908004956_manual_ref-13","unstructured":"Liu, K. K. and Picard, R. W. 2005. Embedded empathy in continuous, interactive health assessment. CHI Workshop on HCI Challenges in Health Assessment, Portland, Oregon."},{"key":"S1351324908004956_manual_ref-14","doi-asserted-by":"crossref","unstructured":"Maloor, P. and Chai, J. 2000. Dynamic user level and utility measurement for adaptive dialog in a help-desk system. In Proceedings of the 1st Sigdial Workshop, Hong Kong, China.","DOI":"10.3115\/1117736.1117747"},{"key":"S1351324908004956_manual_ref-8","unstructured":"Georgila, K. and Lemon, O. 2004. Adaptive multimodal dialogue management based on the information state update approach. W3C Workshop on Multimodal Interaction, Sophia-Antipolis, France."},{"key":"S1351324908004956_manual_ref-23","doi-asserted-by":"publisher","DOI":"10.1613\/jair.713"},{"key":"S1351324908004956_manual_ref-11","first-page":"55","article-title":"A technique for the measurement of attitudes","volume":"140","author":"Likert","year":"1932","journal-title":"Archives of Psychology"},{"key":"S1351324908004956_manual_ref-21","unstructured":"Stock, O. 1996. Password Swordfish: Verbal humour in the interface. In Proceedings of the International Workshop on Computational Humour, TWLT-12, Enschede."},{"key":"S1351324908004956_manual_ref-22","doi-asserted-by":"publisher","DOI":"10.3115\/1220835.1220870"},{"key":"S1351324908004956_manual_ref-10","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1109\/89.817450","article-title":"A stochastic model of human-machine interaction for learning dialog strategies","volume":"8","author":"Levin","year":"2000","journal-title":"IEEE Trans. on Speech and Audio Processing"},{"key":"S1351324908004956_manual_ref-16","doi-asserted-by":"publisher","DOI":"10.3115\/1273073.1273158"},{"key":"S1351324908004956_manual_ref-24","doi-asserted-by":"crossref","unstructured":"Williams, J. D. , Poupart, P. , and Young, S. 2005. Partially observable markov decision processes with continuous observations for dialogue management. In Proceedings of the 6th SigDial Workshop, September 2005, Lisbon.","DOI":"10.18653\/v1\/2005.sigdial-1.4"},{"key":"S1351324908004956_manual_ref-3","doi-asserted-by":"crossref","unstructured":"Cuay\u00e1huitl, H. , Renals, S. , Lemon, O. and Shimodaira, H. 2006. Learning multi-goal dialogue strategies using reinforcement learning with reduced state-action spaces. In Proceedings of Interspeech-ICSLP, Pittsburgh, Pennsylvania, USA.","DOI":"10.21437\/Interspeech.2006-149"},{"key":"S1351324908004956_manual_ref-1","unstructured":"Bickmore, T. W. 2003. Relational Agents: Effecting Change through Human-Computer Relationships. Ph.D. Thesis, MIT, Cambridge, MA."},{"key":"S1351324908004956_manual_ref-9","unstructured":"Henderson, J. , Lemon, O. , and Georgila, K. 2005. Hybrid reinforcement\/supervised learning for dialogue policies from COMMUNICATOR data. IJCAI workshop on Knowledge and Reasoning in Practical Dialogue Systems, Edinburgh, UK."},{"key":"S1351324908004956_manual_ref-15","doi-asserted-by":"crossref","unstructured":"Paek, T. and Chickering, D. M. 2005. The markov assumption in spoken dialogue management. Proceedings of SIGDIAL 2005, Lisbon, Portugal.","DOI":"10.18653\/v1\/2005.sigdial-1.5"},{"key":"S1351324908004956_manual_ref-18","doi-asserted-by":"crossref","unstructured":"Scheffler, K. and Young, S. 2002. Automatic learning of dialogue strategy using dialogue simulation and reinforcement learning. In Proceedings of HLT-2002, San Diego, California, Morgan Kaufmann.","DOI":"10.3115\/1289189.1289246"}],"container-title":["Natural Language Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324908004956","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,20]],"date-time":"2025-06-20T17:39:59Z","timestamp":1750441199000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324908004956\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2009,7]]},"references-count":25,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2009,10]]}},"alternative-id":["S1351324908004956"],"URL":"https:\/\/doi.org\/10.1017\/s1351324908004956","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"type":"print","value":"1351-3249"},{"type":"electronic","value":"1469-8110"}],"subject":[],"published":{"date-parts":[[2009,7]]}}}