{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,15]],"date-time":"2026-04-15T20:49:37Z","timestamp":1776286177365,"version":"3.50.1"},"publisher-location":"Cham","reference-count":20,"publisher":"Springer Nature Switzerland","isbn-type":[{"value":"9783031485497","type":"print"},{"value":"9783031485503","type":"electronic"}],"license":[{"start":{"date-parts":[[2023,12,28]],"date-time":"2023-12-28T00:00:00Z","timestamp":1703721600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,12,28]],"date-time":"2023-12-28T00:00:00Z","timestamp":1703721600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2024]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In Agile software development, user stories play a vital role in capturing and conveying end-user needs, prioritizing features, and facilitating communication and collaboration within development teams. However, automated methods for evaluating user stories require training in NLP tools and can be time-consuming to develop and integrate. This study explores using ChatGPT for user story quality evaluation and compares its performance with an existing benchmark. Our study shows that ChatGPT\u2019s evaluation aligns well with human evaluation, and we propose a \u201cbest of three\u201d strategy to improve its output stability. We also discuss the concept of trustworthiness in AI and its implications for non-experts using ChatGPT\u2019s unprocessed outputs. Our research contributes to understanding the reliability and applicability of Generative AI in user story evaluation and offers recommendations for future research.<\/jats:p>","DOI":"10.1007\/978-3-031-48550-3_17","type":"book-chapter","created":{"date-parts":[[2023,12,27]],"date-time":"2023-12-27T11:02:10Z","timestamp":1703674930000},"page":"173-181","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":35,"title":["ChatGPT as\u00a0a\u00a0Tool for\u00a0User Story Quality Evaluation: Trustworthy Out of\u00a0the\u00a0Box?"],"prefix":"10.1007","author":[{"given":"Krishna","family":"Ronanki","sequence":"first","affiliation":[]},{"given":"Beatriz","family":"Cabrero-Daniel","sequence":"additional","affiliation":[]},{"given":"Christian","family":"Berger","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,12,28]]},"reference":[{"key":"17_CR1","doi-asserted-by":"crossref","unstructured":"Lucassen, G., Dalpiaz, F., Van Der Werf, J.M.E., Brinkkemper, S.: Forging high-quality user stories: towards a discipline for agile requirements. In: IEEE International Requirements Engineering Conference (RE), pp. 126\u2013135. IEEE (2015)","DOI":"10.1109\/RE.2015.7320415"},{"key":"17_CR2","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"205","DOI":"10.1007\/978-3-319-30282-9_14","volume-title":"Requirements Engineering: Foundation for Software Quality","author":"G Lucassen","year":"2016","unstructured":"Lucassen, G., Dalpiaz, F., Werf, J.M.E.M., Brinkkemper, S.: The use and effectiveness of user stories in practice. In: Daneva, M., Pastor, O. (eds.) REFSQ 2016. LNCS, vol. 9619, pp. 205\u2013222. Springer, Cham (2016). https:\/\/doi.org\/10.1007\/978-3-319-30282-9_14"},{"key":"17_CR3","volume-title":"User Stories Applied: For Agile Software Development","author":"M Cohn","year":"2004","unstructured":"Cohn, M.: User Stories Applied: For Agile Software Development. Addison-Wesley Professional, Boston (2004)"},{"key":"17_CR4","doi-asserted-by":"publisher","first-page":"51723","DOI":"10.1109\/ACCESS.2022.3173745","volume":"10","author":"AR Amna","year":"2022","unstructured":"Amna, A.R., Poels, G.: Systematic literature mapping of user story research. IEEE Access 10, 51723\u201351746 (2022)","journal-title":"IEEE Access"},{"key":"17_CR5","doi-asserted-by":"crossref","unstructured":"Mustaffa, S.N.F.N.B., Sallim, J.B., Mohamed, R.B.: Enhancing high-quality user stories with AQUSA: an overview study of data cleaning process. In: 2021 International Conference on Software Engineering & Computer Systems and 4th International Conference on Computational Science and Information Management (ICSECS-ICOCSIM), pp. 295\u2013300 (2021)","DOI":"10.1109\/ICSECS52883.2021.00060"},{"key":"17_CR6","series-title":"Lecture Notes in Information Systems and Organisation","doi-asserted-by":"publisher","first-page":"183","DOI":"10.1007\/978-3-319-07040-7_18","volume-title":"Smart Organizations and Smart Artifacts","author":"SR Humayoun","year":"2014","unstructured":"Humayoun, S.R., Dubinsky, Y., Catarci, T.: User evaluation support through development environment for agile software teams. In: Caporarello, L., Di Martino, B., Martinez, M. (eds.) Smart Organizations and Smart Artifacts. LNISO, vol. 7, pp. 183\u2013191. Springer, Cham (2014). https:\/\/doi.org\/10.1007\/978-3-319-07040-7_18"},{"key":"17_CR7","unstructured":"Jurisch, M., Lusky, M., Igler, B., B\u00f6hm, S.: Evaluating a recommendation system for user stories in mobile enterprise application development. Int. J. Adv. Intell. Syst. (2017)"},{"key":"17_CR8","doi-asserted-by":"crossref","unstructured":"Pe\u00f1a, F.J., Rold\u00e1n, L., Vegetti, M.: User stories identification in software\u2019s issues records using natural language processing. In: 2020 IEEE Congreso Bienal de Argentina (ARGENCON), pp. 1\u20137 (2020)","DOI":"10.1109\/ARGENCON49523.2020.9505355"},{"key":"17_CR9","unstructured":"Sharir, O., Peleg, B., Shoham, Y.: The cost of training NLP models: a concise overview. arXiv:2004.08900 (2020)"},{"key":"17_CR10","unstructured":"Zhang, B., Ding, D., Jing, L.: How would stance detection techniques evolve after the launch of ChatGPT? arXiv preprint: arXiv:2212.14548 (2022)"},{"key":"17_CR11","doi-asserted-by":"crossref","unstructured":"Shen, Y., et al.: ChatGPT and other large language models are double-edged swords (2023)","DOI":"10.1148\/radiol.230163"},{"key":"17_CR12","doi-asserted-by":"crossref","unstructured":"Choi, J.H., Hickman, K.E., Monahan, A., Schwarcz, D.: ChatGPT goes to law school. Available at SSRN (2023)","DOI":"10.2139\/ssrn.4335905"},{"issue":"8","key":"17_CR13","first-page":"9","volume":"1","author":"A Radford","year":"2019","unstructured":"Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)","journal-title":"OpenAI Blog"},{"key":"17_CR14","unstructured":"Perez, E., Kiela, D., Cho, K.: True few-shot learning with language models. In: Advances in Neural Information Processing Systems, vol. 34, pp. 11054\u201311070 (2021)"},{"issue":"9","key":"17_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3560815","volume":"55","author":"P Liu","year":"2023","unstructured":"Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., Neubig, G.: Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 55(9), 1\u201335 (2023)","journal-title":"ACM Comput. Surv."},{"key":"17_CR16","doi-asserted-by":"publisher","first-page":"383","DOI":"10.1007\/s00766-016-0250-x","volume":"21","author":"G Lucassen","year":"2015","unstructured":"Lucassen, G., Dalpiaz, F., van der Werf, J.M.E., Brinkkemper, S.: improving agile requirements: the quality user story framework and tool. Requirements Eng. 21, 383\u2013403 (2015)","journal-title":"Requirements Eng."},{"key":"17_CR17","unstructured":"T\u00f5emets, T.: Analysing the quality of user stories in open source projects. PhD thesis, University of Tartu (2020)"},{"key":"17_CR18","doi-asserted-by":"crossref","unstructured":"Borji, A.: A categorical archive of ChatGPT failures (2023)","DOI":"10.21203\/rs.3.rs-2895792\/v1"},{"key":"17_CR19","doi-asserted-by":"publisher","unstructured":"Koubaa, A.: GPT-4 vs. GPT-3.5: a concise showdown. Available in https:\/\/doi.org\/10.36227\/techrxiv.22312330 (2023)","DOI":"10.36227\/techrxiv.22312330"},{"key":"17_CR20","doi-asserted-by":"crossref","unstructured":"Bang, Y., et al.: A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity. arXiv preprint: arXiv:2302.04023 (2023)","DOI":"10.18653\/v1\/2023.ijcnlp-main.45"}],"container-title":["Lecture Notes in Business Information Processing","Agile Processes in Software Engineering and Extreme Programming \u2013 Workshops"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-48550-3_17","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,6]],"date-time":"2024-11-06T22:27:48Z","timestamp":1730932068000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-48550-3_17"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,28]]},"ISBN":["9783031485497","9783031485503"],"references-count":20,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-48550-3_17","relation":{},"ISSN":["1865-1348","1865-1356"],"issn-type":[{"value":"1865-1348","type":"print"},{"value":"1865-1356","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,28]]},"assertion":[{"value":"28 December 2023","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"XP","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on Agile Software Development","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Amsterdam","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"The Netherlands","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2023","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"13 June 2023","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"16 June 2023","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"24","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"xpu2023","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/www.agilealliance.org\/xp2023\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Single-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"EasyChair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"40","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"11","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"1","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"28% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"2.6","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"2.1","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Yes","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"For the workshops 15 papers have been accepted from 38 submissions.","order":10,"name":"additional_info_on_review_process","label":"Additional Info on Review Process","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}