{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,4]],"date-time":"2026-04-04T06:15:53Z","timestamp":1775283353907,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":39,"publisher":"ACM","license":[{"start":{"date-parts":[[2022,8,23]],"date-time":"2022-08-23T00:00:00Z","timestamp":1661212800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"EPSRC","award":["EP\/P024289\/1"],"award-info":[{"award-number":["EP\/P024289\/1"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2022,8,23]]},"DOI":"10.1145\/3539813.3545126","type":"proceedings-article","created":{"date-parts":[[2022,8,25]],"date-time":"2022-08-25T22:18:32Z","timestamp":1661465912000},"page":"275-280","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Evaluating the Cranfield Paradigm for Conversational Search Systems"],"prefix":"10.1145","author":[{"given":"Xiao","family":"Fu","sequence":"first","affiliation":[{"name":"University College London, London, United Kingdom"}]},{"given":"Emine","family":"Yilmaz","sequence":"additional","affiliation":[{"name":"University College London & Amazon, London, United Kingdom"}]},{"given":"Aldo","family":"Lipani","sequence":"additional","affiliation":[{"name":"University College London, London, United Kingdom"}]}],"member":"320","published-online":{"date-parts":[[2022,8,25]]},"reference":[{"key":"e_1_3_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2009965"},{"key":"e_1_3_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/1277741.1277902"},{"key":"e_1_3_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300233"},{"key":"e_1_3_2_1_4_1","volume-title":"Conversational Search (Dagstuhl Seminar 19461). 
Dagstuhl Reports 9, 11","author":"Anand Avishek","year":"2020","unstructured":"Avishek Anand , Lawrence Cavedon , Hideo Joho , Mark Sanderson , and Benno Stein . 2020. Conversational Search (Dagstuhl Seminar 19461). Dagstuhl Reports 9, 11 ( 2020 ). https:\/\/doi.org\/10.4230\/DagRep.9.11.34 Avishek Anand, Lawrence Cavedon, Hideo Joho, Mark Sanderson, and Benno Stein. 2020. Conversational Search (Dagstuhl Seminar 19461). Dagstuhl Reports 9, 11 (2020). https:\/\/doi.org\/10.4230\/DagRep.9.11.34"},{"key":"e_1_3_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/2009916.2010037"},{"key":"e_1_3_2_1_6_1","volume-title":"Embodied Conversational Agents: Representation and Intelligence in User Interfaces. AI Magazine 22, 4","author":"Cassell Justine","year":"2001","unstructured":"Justine Cassell . 2001. Embodied Conversational Agents: Representation and Intelligence in User Interfaces. AI Magazine 22, 4 ( 2001 ). https:\/\/doi.org\/10.1609\/aimag.v22i4.1593 Justine Cassell. 2001. Embodied Conversational Agents: Representation and Intelligence in User Interfaces. AI Magazine 22, 4 (2001). https:\/\/doi.org\/10.1609\/aimag.v22i4.1593"},{"key":"e_1_3_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358047"},{"key":"e_1_3_2_1_8_1","first-page":"1","volume":"20","author":"Cohen Jacob","year":"1960","unstructured":"Jacob Cohen. 1960. A Coefficient of Agreement for Nominal Scales . Educational and Psychological Measurement 20 , 1 ( 1960 ). https:\/\/doi.org\/10.1177\/001316446002000104 Jacob Cohen. 1960. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement 20, 1 (1960). https:\/\/doi.org\/10.1177\/001316446002000104","journal-title":"Educational and Psychological Measurement"},{"key":"e_1_3_2_1_9_1","doi-asserted-by":"crossref","unstructured":"Jeffrey Dalton Chenyan Xiong and Jamie Callan. 2020. TREC CAsT 2019: The Conversational Assistance Track Overview. Jeffrey Dalton Chenyan Xiong and Jamie Callan. 2020. 
TREC CAsT 2019: The Conversational Assistance Track Overview.","DOI":"10.6028\/NIST.SP.1266.cast-overview"},{"key":"e_1_3_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.10"},{"key":"e_1_3_2_1_11_1","volume-title":"Evaluating Implicit Measures to Improve Web Search. ACM Trans. Inf. Syst. 23, 2","author":"Fox Steve","year":"2005","unstructured":"Steve Fox , Kuldeep Karnawat , Mark Mydland , Susan Dumais , and Thomas White . 2005. Evaluating Implicit Measures to Improve Web Search. ACM Trans. Inf. Syst. 23, 2 ( 2005 ). https:\/\/doi.org\/10.1145\/1059981.1059982 Steve Fox, Kuldeep Karnawat, Mark Mydland, Susan Dumais, and Thomas White. 2005. Evaluating Implicit Measures to Improve Web Search. ACM Trans. Inf. Syst. 23, 2 (2005). https:\/\/doi.org\/10.1145\/1059981.1059982"},{"key":"e_1_3_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/2348283.2348323"},{"key":"e_1_3_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/1718487.1718515"},{"key":"e_1_3_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/2063576.2063599"},{"key":"e_1_3_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/N15-1086"},{"key":"e_1_3_2_1_16_1","volume-title":"Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval","author":"Scott","unstructured":"Scott B. Huffman and Michael Hochster. 2007. How Well Does Result Relevance Predict Session Satisfaction? . In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval ( Amsterdam, The Netherlands) (SIGIR '07). Association for Computing Machinery, New York, NY, USA. https:\/\/doi.org\/10.1145\/1277741.1277839 Scott B. Huffman and Michael Hochster. 2007. How Well Does Result Relevance Predict Session Satisfaction?. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (Amsterdam, The Netherlands) (SIGIR '07). 
Association for Computing Machinery, New York, NY, USA. https:\/\/doi.org\/10.1145\/1277741.1277839"},{"key":"e_1_3_2_1_17_1","volume-title":"Cumulated Gain-Based Evaluation of IR Techniques. ACM Trans. Inf. Syst. 20, 4","author":"J\u00e4rvelin Kalervo","year":"2002","unstructured":"Kalervo J\u00e4rvelin and Jaana Kek\u00e4l\u00e4inen . 2002. Cumulated Gain-Based Evaluation of IR Techniques. ACM Trans. Inf. Syst. 20, 4 ( 2002 ). https:\/\/doi.org\/10.1145\/582415.582418 Kalervo J\u00e4rvelin and Jaana Kek\u00e4l\u00e4inen. 2002. Cumulated Gain-Based Evaluation of IR Techniques. ACM Trans. Inf. Syst. 20, 4 (2002). https:\/\/doi.org\/10.1145\/582415.582418"},{"key":"e_1_3_2_1_18_1","volume-title":"Advances in Information Retrieval, Craig Macdonald, Iadh Ounis, Vassilis Plachouras, Ian Ruthven, and Ryen W","author":"J\u00e4rvelin Kalervo","unstructured":"Kalervo J\u00e4rvelin , Susan L. Price , Lois M. L. Delcambre , and Marianne Lykke Nielsen . 2008. Discounted Cumulated Gain Based Evaluation of Multiple-Query IR Sessions . In Advances in Information Retrieval, Craig Macdonald, Iadh Ounis, Vassilis Plachouras, Ian Ruthven, and Ryen W . White (Eds.). Springer Berlin Heidelberg, Berlin , Heidelberg . Kalervo J\u00e4rvelin, Susan L. Price, Lois M. L. Delcambre, and Marianne Lykke Nielsen. 2008. Discounted Cumulated Gain Based Evaluation of Multiple-Query IR Sessions. In Advances in Information Retrieval, Craig Macdonald, Iadh Ounis, Vassilis Plachouras, Ian Ruthven, and Ryen W. White (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg."},{"key":"e_1_3_2_1_19_1","volume-title":"Rosie Jones, Umut Ozertem, Imed Zitouni, Ranjitha Gurunath Kulkarni, and Omar Zia Khan.","author":"Jiang Jiepu","year":"2015","unstructured":"Jiepu Jiang , Ahmed Hassan Awadallah , Rosie Jones, Umut Ozertem, Imed Zitouni, Ranjitha Gurunath Kulkarni, and Omar Zia Khan. 2015 . Automatic Online Evaluation of Intelligent Assistants. 
In Proceedings of the 24th International Conference on World Wide Web (Florence, Italy) (WWW '15). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE. https:\/\/doi.org\/10.1145\/2736277.2741669 Jiepu Jiang, Ahmed Hassan Awadallah, Rosie Jones, Umut Ozertem, Imed Zitouni, Ranjitha Gurunath Kulkarni, and Omar Zia Khan. 2015. Automatic Online Evaluation of Intelligent Assistants. In Proceedings of the 24th International Conference on World Wide Web (Florence, Italy) (WWW '15). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE. https:\/\/doi.org\/10.1145\/2736277.2741669"},{"key":"e_1_3_2_1_20_1","volume-title":"Methods for Evaluating Interactive Information Retrieval Systems with Users. Foundations and Trends\u00ae in Information Retrieval 3, 1--2","author":"Kelly Diane","year":"2009","unstructured":"Diane Kelly . 2009. Methods for Evaluating Interactive Information Retrieval Systems with Users. Foundations and Trends\u00ae in Information Retrieval 3, 1--2 ( 2009 ). https:\/\/doi.org\/10.1561\/1500000012 Diane Kelly. 2009. Methods for Evaluating Interactive Information Retrieval Systems with Users. Foundations and Trends\u00ae in Information Retrieval 3, 1--2 (2009). https:\/\/doi.org\/10.1561\/1500000012"},{"key":"e_1_3_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531814"},{"key":"e_1_3_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661829.2661960"},{"key":"e_1_3_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/2806416.2806483"},{"key":"e_1_3_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/2911451.2911521"},{"key":"e_1_3_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/2854946.2854961"},{"key":"e_1_3_2_1_26_1","volume-title":"Commentary: A Dissenting View on So-Called Paradoxes of Reliability Coefficients. 
Annals of the International Communication Association 36, 1","author":"Krippendorff Klaus","year":"2013","unstructured":"Klaus Krippendorff . 2013 . Commentary: A Dissenting View on So-Called Paradoxes of Reliability Coefficients. Annals of the International Communication Association 36, 1 (2013). https:\/\/doi.org\/10.1080\/23808985.2013.11679143 Klaus Krippendorff. 2013. Commentary: A Dissenting View on So-Called Paradoxes of Reliability Coefficients. Annals of the International Communication Association 36, 1 (2013). https:\/\/doi.org\/10.1080\/23808985.2013.11679143"},{"key":"e_1_3_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341981.3344216"},{"key":"e_1_3_2_1_28_1","volume-title":"Article 51","author":"Lipani Aldo","year":"2021","unstructured":"Aldo Lipani , Ben Carterette , and Emine Yilmaz . 2021. How Am I Doing?: Evaluating Conversational Search Systems Offline. ACM Trans. Inf. Syst. 39, 4 , Article 51 ( 2021 ). https:\/\/doi.org\/10.1145\/3451160 Aldo Lipani, Ben Carterette, and Emine Yilmaz. 2021. How Am I Doing?: Evaluating Conversational Search Systems Offline. ACM Trans. Inf. Syst. 39, 4, Article 51 (2021). https:\/\/doi.org\/10.1145\/3451160"},{"key":"e_1_3_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1230"},{"key":"e_1_3_2_1_30_1","doi-asserted-by":"crossref","unstructured":"Alistair Moffat Falk Scholer and Paul Thomas. 2012. Models and Metrics: IR Evaluation as a User Process. Alistair Moffat Falk Scholer and Paul Thomas. 2012. Models and Metrics: IR Evaluation as a User Process.","DOI":"10.1145\/2407085.2407092"},{"key":"e_1_3_2_1_31_1","volume-title":"Article 2","author":"Moffat Alistair","year":"2008","unstructured":"Alistair Moffat and Justin Zobel . 2008. Rank-Biased Precision for Measurement of Retrieval Effectiveness. ACM Trans. Inf. Syst. 27, 1 , Article 2 ( 2008 ). https:\/\/doi.org\/10.1145\/1416950.1416952 Alistair Moffat and Justin Zobel. 2008. 
Rank-Biased Precision for Measurement of Retrieval Effectiveness. ACM Trans. Inf. Syst. 27, 1, Article 2 (2008). https:\/\/doi.org\/10.1145\/1416950.1416952"},{"key":"e_1_3_2_1_32_1","volume-title":"Proceedings of the 40th Annual Meeting on Association for Computational Linguistics","author":"Papineni Kishore","year":"2002","unstructured":"Kishore Papineni , Salim Roukos , Todd Ward , and Wei-Jing Zhu . 2002 . BLEU: A Method for Automatic Evaluation of Machine Translation . In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics ( Philadelphia, Pennsylvania) (ACL '02). Association for Computational Linguistics, USA. https:\/\/doi.org\/10.3115\/1073083.1073135 Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (Philadelphia, Pennsylvania) (ACL '02). Association for Computational Linguistics, USA. https:\/\/doi.org\/10.3115\/1073083.1073135"},{"key":"e_1_3_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1264"},{"key":"e_1_3_2_1_34_1","unstructured":"Justus J Randolph. 200"},{"key":"e_1_3_2_1_35_1","volume-title":"A Structured Review of the Validity of BLEU. Computational Linguistics 44, 3","author":"Reiter Ehud","year":"2018","unstructured":"Ehud Reiter . 2018. A Structured Review of the Validity of BLEU. Computational Linguistics 44, 3 ( 2018 ). https:\/\/doi.org\/10.1162\/coli_a_00322 Ehud Reiter. 2018. A Structured Review of the Validity of BLEU. Computational Linguistics 44, 3 (2018). https:\/\/doi.org\/10.1162\/coli_a_00322"},{"key":"e_1_3_2_1_36_1","unstructured":"Zhengxiang Shi Yue Feng and Aldo Lipani. 2022. Learning to Execute Actions or Ask Clarification Questions. In Findings of NAACL (2022-01-01) (NAACL '22). Zhengxiang Shi Yue Feng and Aldo Lipani. 2022. Learning to Execute Actions or Ask Clarification Questions. 
In Findings of NAACL (2022-01-01) (NAACL '22)."},{"key":"e_1_3_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v36i10.21383"},{"key":"e_1_3_2_1_38_1","unstructured":"Matthijs J Warrens. 201"},{"key":"e_1_3_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2661829.2661953"}],"event":{"name":"ICTIR '22: The 2022 ACM SIGIR International Conference on the Theory of Information Retrieval","location":"Madrid Spain","acronym":"ICTIR '22","sponsor":["SIGIR ACM Special Interest Group on Information Retrieval"]},"container-title":["Proceedings of the 2022 ACM SIGIR International Conference on Theory of Information Retrieval"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3539813.3545126","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3539813.3545126","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T18:10:03Z","timestamp":1750183803000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3539813.3545126"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,23]]},"references-count":39,"alternative-id":["10.1145\/3539813.3545126","10.1145\/3539813"],"URL":"https:\/\/doi.org\/10.1145\/3539813.3545126","relation":{},"subject":[],"published":{"date-parts":[[2022,8,23]]},"assertion":[{"value":"2022-08-25","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}