{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,22]],"date-time":"2026-04-22T10:05:56Z","timestamp":1776852356099,"version":"3.51.2"},"reference-count":109,"publisher":"Association for Computing Machinery (ACM)","issue":"5","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Comput.-Hum. Interact."],"published-print":{"date-parts":[[2025,10,31]]},"abstract":"<jats:p>\n            Data visualization creators often lack formal training, resulting in a knowledge gap in design practice. Large-language models such as\n            <jats:sc>ChatGPT<\/jats:sc>\n            , with their vast internet-scale training data, offer transformative potential to address this gap. In this study, we used both qualitative and quantitative methods to investigate how well\n            <jats:sc>ChatGPT<\/jats:sc>\n            can address visualization design questions. First, we quantitatively compared the\n            <jats:sc>ChatGPT<\/jats:sc>\n            -generated responses with anonymous online\n            <jats:sc>Human<\/jats:sc>\n            replies to data visualization questions on the VisGuides user forum. Next, we conducted a qualitative user study examining the reactions and attitudes of practitioners toward\n            <jats:sc>ChatGPT<\/jats:sc>\n            as a visualization design assistant. Participants were asked to bring their visualizations and design questions and received feedback from both\n            <jats:sc>Human<\/jats:sc>\n            experts and\n            <jats:sc>ChatGPT<\/jats:sc>\n            in randomized order. 
Our findings from both studies underscore\n            <jats:sc>ChatGPT<\/jats:sc>\n            \u2019s strengths\u2014particularly its ability to rapidly generate diverse design options\u2014while also highlighting areas for improvement, such as nuanced contextual understanding and fluid interaction dynamics beyond the chat interface. Drawing on these insights, we discuss design considerations for future LLM-based design feedback systems.\n          <\/jats:p>","DOI":"10.1145\/3745768","type":"journal-article","created":{"date-parts":[[2025,6,25]],"date-time":"2025-06-25T11:59:42Z","timestamp":1750852782000},"page":"1-33","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":12,"title":["How Good Is\n            <scp>ChatGPT<\/scp>\n            in Giving Advice on Your Visualization Design?"],"prefix":"10.1145","volume":"32","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4899-6671","authenticated-orcid":false,"given":"Nam Wook","family":"Kim","sequence":"first","affiliation":[{"name":"Computer Science, Boston College, Chestnut Hill, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5797-5445","authenticated-orcid":false,"given":"Yongsu","family":"Ahn","sequence":"additional","affiliation":[{"name":"Computer Science, Boston College, Chestnut Hill, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-6862-5521","authenticated-orcid":false,"given":"Grace","family":"Myers","sequence":"additional","affiliation":[{"name":"Information Science, Cornell University Cornell Tech, New York, New York, USA and Computer Science, Boston College, Chestnut Hill, Massachusetts, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9201-7744","authenticated-orcid":false,"given":"Benjamin","family":"Bach","sequence":"additional","affiliation":[{"name":"Bivwac, Inria Bordeaux, Talence, 
France"}]}],"member":"320","published-online":{"date-parts":[[2025,10,14]]},"reference":[{"key":"e_1_3_3_2_2","unstructured":"Data Visualization Society. 2023. Data Visualization Society. Retrieved March 28 2023 from https:\/\/www.datavisualizationsociety.org\/"},{"key":"e_1_3_3_3_2","unstructured":"Data Visualization Society. 2023. Data Visualization Society Surveys. Retrieved September 9 2023 from https:\/\/www.datavisualizationsociety.org\/survey-history"},{"key":"e_1_3_3_4_2","unstructured":"Science Technology and Public Policy Program. 2025. What\u2019s in the Chatterbox? Large Language Models Why They Matter and What We Should Do About Them\u2014stpp.fordschool.umich.edu. Retrieved May 1 2025 from https:\/\/stpp.fordschool.umich.edu\/research\/research-report\/whats-in-the-chatterbox"},{"key":"e_1_3_3_5_2","doi-asserted-by":"publisher","DOI":"10.1002\/eng2.12890"},{"key":"e_1_3_3_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.tjem.2018.08.001"},{"key":"e_1_3_3_7_2","doi-asserted-by":"crossref","first-page":"106","DOI":"10.1109\/VIS55277.2024.00029","volume-title":"Proceedings of the 2024 IEEE Visualization and Visual Analytics (VIS)","author":"Alexander Jason","year":"2024","unstructured":"Jason Alexander, Priyal Nanda, Kai-Cheng Yang, and Ali Sarvghad. 2024. Can GPT-4 models detect misleading visualizations? In Proceedings of the 2024 IEEE Visualization and Visual Analytics (VIS), 106\u2013110. DOI: 10.1109\/VIS55277.2024.00029"},{"key":"e_1_3_3_8_2","doi-asserted-by":"publisher","unstructured":"Raghav Awasthi Shreya Mishra Dwarikanath Mahapatra Ashish Khanna Kamal Maheshwari Jacek Cywinski Frank Papay and Piyush Mathur. 2023. Humanely: Human evaluation of LLM yield using a novel web based evaluation TOOL. medRxiv. DOI: 10.1101\/2023.12.22.23300458","DOI":"10.1101\/2023.12.22.23300458"},{"key":"e_1_3_3_9_2","doi-asserted-by":"crossref","unstructured":"Benjamin Bach Mandy Keck Fateme Rajabiyazdi Tatiana Losev Isabel Meirelles Jason Dykes Robert S. 
Laramee Mashael AlKadi Christina Stoiber Samuel Huron et al. 2023. Challenges and opportunities in data visualization education: A call to action. arXiv:2308.07703. Retrieved from https:\/\/arxiv.org\/abs\/2308.07703","DOI":"10.1109\/TVCG.2023.3327378"},{"key":"e_1_3_3_10_2","doi-asserted-by":"crossref","unstructured":"David Baidoo-Anu and Leticia Owusu Ansah. 2023. Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning. Journal of AI 7 1 (2023) 52\u201362.","DOI":"10.61969\/jai.1337500"},{"key":"e_1_3_3_11_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2022.3209490"},{"key":"e_1_3_3_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2024.3456155"},{"key":"e_1_3_3_13_2","doi-asserted-by":"crossref","first-page":"610","DOI":"10.1145\/3442188.3445922","volume-title":"Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Virtual Event) (FAccT \u201921)","author":"Bender Emily M.","year":"2021","unstructured":"Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (Virtual Event) (FAccT \u201921). ACM, New York, NY, 610\u2013623. DOI: 10.1145\/3442188.3445922"},{"key":"e_1_3_3_14_2","unstructured":"Rishi Bommasani Drew A. Hudson Ehsan Adeli Russ Altman Simran Arora Sydney von Arx Michael S. Bernstein Jeannette Bohg Antoine Bosselut Emma Brunskill et al. 2021. On the opportunities and risks of foundation models. arXiv:2108.07258. Retrieved from https:\/\/crfm.stanford.edu\/assets\/report.pdf"},{"key":"e_1_3_3_15_2","doi-asserted-by":"publisher","unstructured":"Ali Borji and Mehrdad Mohammadian. 2023. Battle of the Wordsmiths: Comparing ChatGPT GPT-4 Claude and Bard. SSRN. 
Retrieved from https:\/\/ssrn.com\/abstract=4476855 or 10.2139\/ssrn.4476855","DOI":"10.2139\/ssrn.4476855"},{"key":"e_1_3_3_16_2","first-page":"1877","volume-title":"Advances in Neural Information Processing Systems","author":"Brown Tom","year":"2020","unstructured":"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. In Advances in Neural Information Processing Systems. H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin (Eds.), Vol. 33, Curran Associates, Inc., 1877\u20131901. Retrieved from https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf"},{"key":"e_1_3_3_17_2","doi-asserted-by":"publisher","DOI":"10.1177\/1745691610393980"},{"key":"e_1_3_3_18_2","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1424"},{"key":"e_1_3_3_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/3641289"},{"key":"e_1_3_3_20_2","first-page":"713","volume-title":"Computer Graphics Forum","author":"Chatzimparmpas Angelos","year":"2020","unstructured":"Angelos Chatzimparmpas, Rafael Messias Martins, Ilir Jusufi, Kostiantyn Kucher, Fabrice Rossi, and Andreas Kerren. 2020. The state of the art in enhancing trust in machine learning models with the use of visualizations. In Computer Graphics Forum, Vol. 39, Wiley Online Library, 713\u2013756."},{"key":"e_1_3_3_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2024.3456320"},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2024.3413195"},{"key":"e_1_3_3_23_2","volume-title":"Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI \u201923)","author":"Choi Jinhan","year":"2023","unstructured":"Jinhan Choi, Changhoon Oh, Yea-Seul Kim, and Nam Wook Kim. 2023. VisLab: Enabling visualization designers to gather empirically informed design feedback. 
In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI \u201923). ACM, New York, NY, Article 813, 18 pages. DOI: 10.1145\/3544548.3581132"},{"key":"e_1_3_3_24_2","volume-title":"Proceedings of the Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems (CHI EA \u201921)","author":"Choi Jinhan","year":"2021","unstructured":"Jinhan Choi, Changhoon Oh, Bongwon Suh, and Nam Wook Kim. 2021. Toward a unified framework for visualization design guidelines. In Proceedings of the Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems (CHI EA \u201921). ACM, New York, NY, Article 240, 7 pages. DOI: 10.1145\/3411763.3451702"},{"key":"e_1_3_3_25_2","doi-asserted-by":"publisher","unstructured":"Jonathan H. Choi Kristin E. Hickman Amy Monahan and Daniel Schwarcz. 2023. ChatGPT goes to law school.\u00a071 Journal of Legal Education 387 (2022). Retrieved from https:\/\/ssrn.com\/abstract=4335905 or 10.2139\/ssrn.4335905","DOI":"10.2139\/ssrn.4335905"},{"key":"e_1_3_3_26_2","volume-title":"Statistics Without Maths for Psychology: Using SPSS for Windows","author":"Dancey Christine","year":"2008","unstructured":"Christine Dancey and John Reidy. 2008. Statistics Without Maths for Psychology: Using SPSS for Windows (4th ed.). Prentice Hall International (UK) Ltd.","edition":"4"},{"key":"e_1_3_3_27_2","unstructured":"Jacob Devlin Ming-Wei Chang Kenton Lee and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805. Retrieved from https:\/\/arxiv.org\/abs\/1810.04805"},{"key":"e_1_3_3_28_2","doi-asserted-by":"crossref","first-page":"113","DOI":"10.18653\/v1\/2023.acl-demo.11","volume-title":"Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)","author":"Dibia Victor","year":"2023","unstructured":"Victor Dibia. 2023. 
LIDA: A tool for automatic generation of grammar-agnostic visualizations and infographics using large language models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations). Association for Computational Linguistics, 113\u2013126. DOI: 10.18653\/v1\/2023.acl-demo.11"},{"key":"e_1_3_3_29_2","doi-asserted-by":"publisher","DOI":"10.2312\/eurovisshort.20181079"},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.2312\/eged.20211003"},{"key":"e_1_3_3_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3377325.3377501"},{"key":"e_1_3_3_32_2","volume-title":"Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST \u201924)","author":"Duan Peitong","year":"2024","unstructured":"Peitong Duan, Chin-Yi Cheng, Gang Li, Bjoern Hartmann, and Yang Li. 2024. UICrit: Enhancing automated design evaluation with a UI critique dataset. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST \u201924). ACM, New York, NY, Article 46, 17 pages. DOI: 10.1145\/3654777.3676381"},{"key":"e_1_3_3_33_2","volume-title":"Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI \u201924)","author":"Duan Peitong","year":"2024","unstructured":"Peitong Duan, Jeremy Warner, Yang Li, and Bjoern Hartmann. 2024. Generating automatic feedback on UI mockups with large language models. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI \u201924). ACM, New York, NY, Article 6, 20 pages. DOI: 10.1145\/3613904.3642782"},{"key":"e_1_3_3_34_2","doi-asserted-by":"publisher","DOI":"10.1075\/idj.22004.est"},{"key":"e_1_3_3_35_2","unstructured":"Zhiwei Fei Xiaoyu Shen Dawei Zhu Fengzhe Zhou Zhuo Han Songyang Zhang Kai Chen Zongwen Shen and Jidong Ge. 2023. LawBench: Benchmarking legal knowledge of large language models. arXiv:2309.16289. 
Retrieved from https:\/\/arxiv.org\/abs\/2309.16289"},{"key":"e_1_3_3_36_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10439-023-03248-4"},{"key":"e_1_3_3_37_2","unstructured":"Jinlan Fu See-Kiong Ng Zhengbao Jiang and Pengfei Liu. 2023. GPTScore: Evaluate as you desire. arXiv:2302.04166. Retrieved from https:\/\/arxiv.org\/abs\/2302.04166"},{"key":"e_1_3_3_38_2","doi-asserted-by":"crossref","first-page":"3356","DOI":"10.18653\/v1\/2020.findings-emnlp.301","volume-title":"Findings of the Association for Computational Linguistics: EMNLP 2020","author":"Gehman Samuel","year":"2020","unstructured":"Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, and Noah A. Smith. 2020. RealToxicityPrompts: Evaluating neural toxic degeneration in language models. In Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, 3356\u20133369. DOI: 10.18653\/v1\/2020.findings-emnlp.301"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.2196\/51580"},{"key":"e_1_3_3_40_2","doi-asserted-by":"publisher","DOI":"10.1101\/2022.12.23.22283901"},{"key":"e_1_3_3_41_2","volume-title":"Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI \u201924)","author":"Gu Ken","year":"2024","unstructured":"Ken Gu, Madeleine Grunde-McLaughlin, Andrew McNutt, Jeffrey Heer, and Tim Althoff. 2024. How do data analysts respond to AI assistance? A Wizard-of-Oz study. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI \u201924). ACM, New York, NY, Article 1015, 22 pages. DOI: 10.1145\/3613904.3641891"},{"key":"e_1_3_3_42_2","first-page":"223","volume-title":"Handbook of Inter-Rater Reliability","author":"Kilem Gwet","year":"2001","unstructured":"Kilem Gwet. 2001. Handbook of Inter-Rater Reliability. 
STATAXIS Publishing Company, Gaithersburg, MD, 223\u2013246."},{"key":"e_1_3_3_43_2","volume-title":"Proceedings of the International Conference on Intelligence Science (ICIS \u201921)","author":"Haug Saskia","year":"2021","unstructured":"Saskia Haug and Alexander M\u00e4dche. 2021. Crowd-feedback in information systems development: A state-of-the-art review. In Proceedings of the International Conference on Intelligence Science (ICIS \u201921). Association for Information Systems (AIS). DOI: 10.5445\/IR\/1000139669"},{"key":"e_1_3_3_44_2","unstructured":"Xinmeng Hou. 2024. Mitigating biases to embrace diversity: A comprehensive annotation benchmark for toxic language. arXiv:2410.13313. Retrieved from https:\/\/arxiv.org\/abs\/2410.13313"},{"key":"e_1_3_3_45_2","unstructured":"Aris Ihwan. 2023. Role Prompting. Retrieved September 8 2023 from https:\/\/www.linkedin.com\/pulse\/role-prompting-aris-ihwan\/"},{"key":"e_1_3_3_46_2","unstructured":"Jeevana Priya Inala Chenglong Wang Steven Drucker Gonzalo Ramos Victor Dibia Nathalie Riche Dave Brown Dan Marshall and Jianfeng Gao. 2024. Data analysis in the era of generative AI. arXiv:2409.18475. Retrieved from https:\/\/arxiv.org\/abs\/2409.18475"},{"key":"e_1_3_3_47_2","unstructured":"Che Jiang Biqing Qi Xiangyu Hong Dayuan Fu Yang Cheng Fandong Meng Mo Yu Bowen Zhou and Jie Zhou. 2024. On large language models\u2019 hallucination with regard to known facts. arXiv:2403.20009. Retrieved from https:\/\/arxiv.org\/abs\/2403.20009"},{"key":"e_1_3_3_48_2","first-page":"1","volume-title":"Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI \u201918)","author":"Hyeonsu","year":"2018","unstructured":"Hyeonsu B. Kang, Gabriel Amoako, Neil Sengupta, and Steven P. Dow. 2018. Paragon: An online gallery for enhancing design feedback with visual examples. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI \u201918). ACM, New York, NY, 1\u201313. 
DOI: 10.1145\/3173574.3174180"},{"key":"e_1_3_3_49_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.lindif.2023.102274"},{"key":"e_1_3_3_50_2","doi-asserted-by":"publisher","DOI":"10.1109\/VL\/HCC60511.2024.00022"},{"key":"e_1_3_3_51_2","unstructured":"Seungone Kim Jamin Shin Yejin Cho Joel Jang Shayne Longpre Hwaran Lee Sangdoo Yun Seongjin Shin Sungdong Kim James Thorne and Minjoon Seo. 2024. Prometheus: Inducing fine-grained evaluation capability in language models. arXiv:2310.08491. Retrieved from https:\/\/arxiv.org\/abs\/2310.08491"},{"key":"e_1_3_3_52_2","first-page":"4627","volume-title":"Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI \u201917)","author":"Krause Markus","year":"2017","unstructured":"Markus Krause, Tom Garncarz, JiaoJiao Song, Elizabeth M. Gerber, Brian P. Bailey, and Steven P. Dow. 2017. Critique style guide: Improving crowdsourced design feedback with a natural language model. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI \u201917). ACM, New York, NY, 4627\u20134639. DOI: 10.1145\/3025453.3025883"},{"key":"e_1_3_3_53_2","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-023-31412-2"},{"key":"e_1_3_3_54_2","doi-asserted-by":"publisher","DOI":"10.1056\/NEJMsr2214184"},{"key":"e_1_3_3_55_2","unstructured":"Patrick Lewis Ethan Perez Aleksandra Piktus Fabio Petroni Vladimir Karpukhin Naman Goyal Heinrich K\u00fcttler Mike Lewis Wen-Tau Yih Tim Rockt\u00e4schel et al. 2020. Retrieval-augmented generation for knowledge-intensive NLP tasks Vol. 33 9459\u20139474. Retrieved from https:\/\/proceedings.neurips.cc\/paper_files\/paper\/2020\/file\/6b493230205f780e1bc26945df7481e5-Paper.pdf"},{"key":"e_1_3_3_56_2","unstructured":"Dawei Li Bohan Jiang Liangjie Huang Alimohammad Beigi Chengshuai Zhao Zhen Tan Amrita Bhattacharjee Yuxuan Jiang Canyu Chen Tianhao Wu et al. 2025. From generation to judgment: Opportunities and challenges of LLM-as-a-judge. 2025. arXiv:2411.16594. 
Retrieved from https:\/\/arxiv.org\/abs\/2411.16594"},{"key":"e_1_3_3_57_2","unstructured":"Percy Liang Rishi Bommasani Tony Lee Dimitris Tsipras Dilara Soylu Michihiro Yasunaga Yian Zhang Deepak Narayanan Yuhuai Wu Ananya Kumar et al. 2022. Holistic evaluation of language models. arXiv:2211.09110. Retrieved from https:\/\/arxiv.org\/abs\/2211.09110"},{"key":"e_1_3_3_58_2","doi-asserted-by":"crossref","first-page":"104770","DOI":"10.1016\/j.ebiom.2023.104770","article-title":"Benchmarking large language models\u2019 performances for myopia care: A comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google Bard","volume":"95","author":"Lim Zhi Wei","year":"2023","unstructured":"Zhi Wei Lim, Krithi Pushpanathan, Samantha Min Er Yew, Yien Lai, Chen-Hsin Sun, Janice Sing Harn Lam, David Ziyou Chen, Jocelyn Hui Lin Goh, Marcus Chun Jin Tan, Bin Sheng, et al. 2023. Benchmarking large language models\u2019 performances for myopia care: A comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google Bard. eBioMedicine 95 (2023), 104770.","journal-title":"eBioMedicine"},{"issue":"2","key":"e_1_3_3_59_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3436755","article-title":"When machine learning meets privacy: A survey and outlook","volume":"54","author":"Liu Bo","year":"2021","unstructured":"Bo Liu, Ming Ding, Sina Shaham, Wenny Rahayu, Farhad Farokhi, and Zihuai Lin. 2021. When machine learning meets privacy: A survey and outlook. ACM Computing Surveys 54, 2 (2021), 1\u201336.","journal-title":"ACM Computing Surveys"},{"key":"e_1_3_3_60_2","unstructured":"Pengfei Liu Weizhe Yuan Jinlan Fu Zhengbao Jiang Hiroaki Hayashi and Graham Neubig. 2021. Pre-train prompt and predict: A systematic survey of prompting methods in natural language processing. arXiv:2107.13586. Retrieved from https:\/\/arxiv.org\/abs\/2107.13586"},{"key":"e_1_3_3_61_2","doi-asserted-by":"crossref","unstructured":"Yang Liu Dan Iter Yichong Xu Shuohang Wang Ruochen Xu and Chenguang Zhu. 
2023. G-Eval: NLG evaluation using GPT-4 with better human alignment. arXiv:2303.16634. Retrieved from https:\/\/arxiv.org\/abs\/2303.16634","DOI":"10.18653\/v1\/2023.emnlp-main.153"},{"key":"e_1_3_3_62_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2024.3456333"},{"key":"e_1_3_3_63_2","first-page":"21","volume-title":"Proceedings of the Companion Publication of the 17th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW Companion \u201914)","author":"Luther Kurt","year":"2014","unstructured":"Kurt Luther, Amy Pavel, Wei Wu, Jari-Lee Tolentino, Maneesh Agrawala, Bj\u00f6rn Hartmann, and Steven P. Dow. 2014. CrowdCrit: Crowdsourcing and aggregating visual design critique. In Proceedings of the Companion Publication of the 17th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW Companion \u201914). ACM, New York, NY, 21\u201324. DOI: 10.1145\/2556420.2556788"},{"key":"e_1_3_3_64_2","first-page":"473","volume-title":"Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW \u201915)","author":"Luther Kurt","year":"2015","unstructured":"Kurt Luther, Jari-Lee Tolentino, Wei Wu, Amy Pavel, Brian P. Bailey, Maneesh Agrawala, Bj\u00f6rn Hartmann, and Steven P. Dow. 2015. Structuring, aggregating, and evaluating crowdsourced design critique. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW \u201915). ACM, New York, NY, 473\u2013485. DOI: 10.1145\/2675133.2675283"},{"key":"e_1_3_3_65_2","unstructured":"Steve Mollman. 2023. ChatGPT Passed a Wharton MBA Exam and It\u2019s Still in Its Infancy. One Professor Is Sounding the Alarm. 
Retrieved September 7 2023 from https:\/\/fortune.com\/2023\/01\/21\/chatgpt-passed-wharton-mba-exam-one-professor-is-sounding-alarm-artificial-intelligence\/"},{"key":"e_1_3_3_66_2","first-page":"5356","volume-title":"Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)","author":"Nadeem Moin","year":"2021","unstructured":"Moin Nadeem, Anna Bethke, and Siva Reddy. 2021. StereoSet: Measuring stereotypical bias in pretrained language models. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Association for Computational Linguistics, 5356\u20135371. DOI: 10.18653\/v1\/2021.acl-long.416"},{"key":"e_1_3_3_67_2","unstructured":"OpenAI. 2023. ChatGPT. Retrieved September 7 2023 from https:\/\/openai.com\/chatgpt"},{"key":"e_1_3_3_68_2","unstructured":"OpenAI Josh Achiam Steven Adler Sandhini Agarwal Lama Ahmad Ilge Akkaya Florencia Leoni Aleman Diogo Almeida Janko Altenschmidt Sam Altman et al. 2024. GPT-4 technical report. arXiv:2303.08774. Retrieved from https:\/\/arxiv.org\/abs\/2303.08774"},{"key":"e_1_3_3_69_2","first-page":"27730","article-title":"Training language models to follow instructions with human feedback","volume":"35","author":"Ouyang Long","year":"2022","unstructured":"Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. 2022. Training language models to follow instructions with human feedback. In Advances in Neural Information Processing Systems, Vol. 
35, 27730\u201327744.","journal-title":"Advances in Neural Information Processing Systems"},{"key":"e_1_3_3_70_2","doi-asserted-by":"publisher","DOI":"10.1038\/s43588-023-00399-1"},{"key":"e_1_3_3_71_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2021.3114959"},{"key":"e_1_3_3_72_2","doi-asserted-by":"crossref","first-page":"176","DOI":"10.1109\/VIS47514.2020.00042","volume-title":"Proceedings of the 2020 IEEE Visualization Conference (VIS)","author":"Parsons Paul","year":"2020","unstructured":"Paul Parsons, Colin M. Gray, Ali Baigelenov, and Ian Carr. 2020. Design judgment in data visualization practice. In Proceedings of the 2020 IEEE Visualization Conference (VIS), 176\u2013180. DOI: 10.1109\/VIS47514.2020.00042"},{"key":"e_1_3_3_73_2","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1109\/VIS47514.2020.00049","volume-title":"Proceedings of the 2020 IEEE Visualization Conference (VIS)","author":"Parsons Paul","year":"2020","unstructured":"Paul Parsons and Prakash Shukla. 2020. Data visualization practitioners\u2019 perspectives on chartjunk. In Proceedings of the 2020 IEEE Visualization Conference (VIS), 211\u2013215. DOI: 10.1109\/VIS47514.2020.00049"},{"key":"e_1_3_3_74_2","unstructured":"Chengwei Qin Aston Zhang Zhuosheng Zhang Jiaao Chen Michihiro Yasunaga and Diyi Yang. 2023. Is ChatGPT a general-purpose natural language processing task solver? arXiv:2302.06476. Retrieved from https:\/\/arxiv.org\/abs\/2302.06476"},{"key":"e_1_3_3_75_2","unstructured":"Alec Radford Wu Jeffrey Child Rewon Luan David Amodei Dario and Sutskever Ilya. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1 8 (2019) 9."},{"issue":"8","key":"e_1_3_3_76_2","first-page":"9","article-title":"Language models are unsupervised multitask learners","volume":"1","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. 2019. 
Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.","journal-title":"OpenAI Blog"},{"key":"e_1_3_3_77_2","doi-asserted-by":"publisher","DOI":"10.1162\/coli_a_00322"},{"key":"e_1_3_3_78_2","doi-asserted-by":"publisher","unstructured":"V. Schetinger S. Di Bartolomeo M. El-Assady A. McNutt M. Miller J. P. A. Passos and J. L. Adams. 2023. Doom or deliciousness: Challenges and opportunities for visualization in the age of generative models. Computer Graphics Forum 42 3 (2023) 423\u2013435. DOI: 10.1111\/cgf.14841","DOI":"10.1111\/cgf.14841"},{"key":"e_1_3_3_79_2","doi-asserted-by":"publisher","unstructured":"Donia Scott and Johanna Moore. 2006. An NLG evaluation competition? Eight reasons to be cautious. DOI: 10.21954\/ou.ro.00016050","DOI":"10.21954\/ou.ro.00016050"},{"key":"e_1_3_3_80_2","unstructured":"Shuyu Shen Sirong Lu Leixian Shen Zhonghua Sheng Nan Tang and Yuyu Luo. 2024. Ask humans or AI? Exploring their roles in visualization troubleshooting. arXiv:2412.07673. Retrieved from https:\/\/arxiv.org\/abs\/2412.07673"},{"key":"e_1_3_3_81_2","volume-title":"Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI \u201923)","volume":"811","author":"Shin Sungbok","year":"2023","unstructured":"Sungbok Shin, Sanghyun Hong, and Niklas Elmqvist. 2023. Perceptual pat: A virtual human visual system for iterative visualization design. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI \u201923). ACM, New York, NY, Article 811, 17 pages. DOI: 10.1145\/3544548.3580974"},{"key":"e_1_3_3_82_2","doi-asserted-by":"crossref","unstructured":"Makesh Narsimhan Sreedhar Traian Rebedea Shaona Ghosh Jiaqi Zeng and Christopher Parisien. 2024. CantTalkAboutThis: Aligning language models to stay on topic in dialogues. arXiv:2404.03820. 
Retrieved from https:\/\/arxiv.org\/abs\/2404.03820","DOI":"10.18653\/v1\/2024.findings-emnlp.713"},{"key":"e_1_3_3_83_2","doi-asserted-by":"publisher","unstructured":"Sangho Suh Meng Chen Bryan Min Toby Jia-Jun Li and Haijun Xia. 2024. Luminate: Structured generation and exploration of design space with large language models for human-AI co-creation. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems 1\u201326. DOI: 10.1145\/3613904.3642400","DOI":"10.1145\/3613904.3642400"},{"key":"e_1_3_3_84_2","volume-title":"Assessing Social and Intersectional Biases in Contextualized Word Representations","author":"Chern Tan Yi","year":"2019","unstructured":"Yi Chern Tan and L. Elisa Celis. 2019. Assessing Social and Intersectional Biases in Contextualized Word Representations. Curran Associates Inc., Red Hook, NY."},{"issue":"3","key":"e_1_3_3_85_2","doi-asserted-by":"crossref","first-page":"1731","DOI":"10.1109\/TVCG.2024.3368621","article-title":"ChartGPT: Leveraging LLMs to generate charts from abstract natural language","volume":"31","author":"Tian Yuan","year":"2025","unstructured":"Yuan Tian, Weiwei Cui, Dazhen Deng, Xinjing Yi, Yurun Yang, Haidong Zhang, and Yingcai Wu. 2025. ChartGPT: Leveraging LLMs to generate charts from abstract natural language. IEEE Transactions on Visualization and Computer Graphics 31, 3 (2025), 1731\u20131745.","journal-title":"IEEE Transactions on Visualization and Computer Graphics"},{"key":"e_1_3_3_86_2","doi-asserted-by":"publisher","DOI":"10.1186\/s40561-023-00237-x"},{"key":"e_1_3_3_87_2","doi-asserted-by":"crossref","first-page":"272","DOI":"10.1145\/3351095.3372834","volume-title":"Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* \u201920)","author":"Toreini Ehsan","year":"2020","unstructured":"Ehsan Toreini, Mhairi Aitken, Kovila Coopamootoo, Karen Elliott, Carlos Gonzalez Zelaya, and Aad van Moorsel. 2020. 
The relationship between trust in AI and trustworthy machine learning technologies. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* \u201920). ACM, New York, NY, 272\u2013283. DOI: 10.1145\/3351095.3372834"},{"key":"e_1_3_3_88_2","unstructured":"Hugo Touvron Thibaut Lavril Gautier Izacard Xavier Martinet Marie-Anne Lachaux Timoth\u00e9e Lacroix Baptiste Rozi\u00e8re Naman Goyal Eric Hambro Faisal Azhar et al. 2023. LLaMA: Open and efficient foundation language models. arXiv:2302.13971. Retrieved from https:\/\/arxiv.org\/abs\/2302.13971"},{"key":"e_1_3_3_89_2","volume-title":"Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI \u201924)","author":"Vaithilingam Priyan","year":"2024","unstructured":"Priyan Vaithilingam, Elena L. Glassman, Jeevana Priya Inala, and Chenglong Wang. 2024. DynaVis: Dynamically synthesized UI widgets for visualization editing. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI \u201924). ACM, New York, NY, Article 985, 17 pages. DOI: 10.1145\/3613904.3642639"},{"key":"e_1_3_3_90_2","volume-title":"Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (CHI EA \u201922)","volume":"332","author":"Vaithilingam Priyan","year":"2022","unstructured":"Priyan Vaithilingam, Tianyi Zhang, and Elena L. Glassman. 2022. Expectation vs. experience: Evaluating the usability of code generation tools powered by large language models. In Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (CHI EA \u201922). ACM, New York, NY, Article 332, 7 pages. 
DOI: 10.1145\/3491101.3519665"},{"key":"e_1_3_3_91_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2020.101151"},{"key":"e_1_3_3_92_2","first-page":"1730","volume-title":"Proceedings of the 27th International Conference on Computational Linguistics","author":"van Miltenburg Emiel","year":"2018","unstructured":"Emiel van Miltenburg, Desmond Elliott, and Piek Vossen. 2018. Measuring the diversity of automatic image descriptions. In Proceedings of the 27th International Conference on Computational Linguistics. Emily M. Bender, Leon Derczynski, and Pierre Isabelle (Eds.), Association for Computational Linguistics, 1730\u20131741. Retrieved from https:\/\/aclanthology.org\/C18-1147\/"},{"key":"e_1_3_3_93_2","unstructured":"Chenglong Wang Bongshin Lee Steven Drucker Dan Marshall and Jianfeng Gao. 2025. Data Formulator 2: Iterative creation of data visualizations with AI transforming data along the way. arXiv:2408.16119. Retrieved from https:\/\/arxiv.org\/abs\/2408.16119"},{"issue":"1","key":"e_1_3_3_94_2","first-page":"1128","article-title":"Data Formulator: AI-powered concept-driven visualization authoring","volume":"30","author":"Wang Chenglong","year":"2023","unstructured":"Chenglong Wang, John Thompson, and Bongshin Lee. 2023. Data Formulator: AI-powered concept-driven visualization authoring. IEEE Transactions on Visualization and Computer Graphics 30, 1 (2023), 1128\u20131138.","journal-title":"IEEE Transactions on Visualization and Computer Graphics"},{"key":"e_1_3_3_95_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2024.3456350"},{"key":"e_1_3_3_96_2","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2024.3456378"},{"key":"e_1_3_3_97_2","doi-asserted-by":"crossref","unstructured":"Jiaan Wang Yunlong Liang Fandong Meng Zengkui Sun Haoxiang Shi Zhixu Li Jinan Xu Jianfeng Qu and Jie Zhou. 2023. Is ChatGPT a good NLG evaluator? A preliminary study. arXiv:2303.04048. 
Retrieved from https:\/\/arxiv.org\/abs\/2303.04048","DOI":"10.18653\/v1\/2023.newsum-1.1"},{"key":"e_1_3_3_98_2","unstructured":"Ke Wang Houxing Ren Aojun Zhou Zimu Lu Sichun Luo Weikang Shi Renrui Zhang Linqi Song Mingjie Zhan and Hongsheng Li. 2023. MathCoder: Seamless code integration in LLMs for enhanced mathematical reasoning. arXiv:2310.03731. Retrieved from https:\/\/arxiv.org\/abs\/2310.03731"},{"key":"e_1_3_3_99_2","unstructured":"Xuezhi Wang Jason Wei Dale Schuurmans Quoc Le Ed Chi Sharan Narang Aakanksha Chowdhery and Denny Zhou. 2022. Self-consistency improves chain of thought reasoning in language models. arXiv:2203.11171. Retrieved from https:\/\/arxiv.org\/abs\/2203.11171"},{"key":"e_1_3_3_100_2","volume-title":"Transactions on Machine Learning Research","author":"Wei Jason","year":"2022","unstructured":"Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, et al. 2022. Emergent abilities of large language models. Transactions on Machine Learning Research (2022)."},{"key":"e_1_3_3_101_2","unstructured":"Jason Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Fei Xia Ed H. Chi Quoc V. Le and Denny Zhou. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022) 24824\u201324837."},{"key":"e_1_3_3_102_2","doi-asserted-by":"publisher","DOI":"10.1109\/JAS.2023.123618"},{"key":"e_1_3_3_103_2","first-page":"1433","volume-title":"Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW \u201914)","author":"Xu Anbang","year":"2014","unstructured":"Anbang Xu, Shih-Wen Huang, and Brian Bailey. 2014. Voyant: Generating structured feedback on visual designs using a crowd of non-experts. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW \u201914). ACM, New York, NY, 1433\u20131444. 
DOI: 10.1145\/2531602.2531604"},{"key":"e_1_3_3_104_2","first-page":"189","volume-title":"Proceedings of the 25th International Conference on Intelligent User Interfaces (IUI \u201920)","author":"Yang Fumeng","year":"2020","unstructured":"Fumeng Yang, Zhuanyi Huang, Jean Scholtz, and Dustin L. Arendt. 2020. How do visual explanations foster end users\u2019 appropriate trust in machine learning? In Proceedings of the 25th International Conference on Intelligent User Interfaces (IUI \u201920). ACM, New York, NY, 189\u2013201. DOI: 10.1145\/3377325.3377480"},{"key":"e_1_3_3_105_2","doi-asserted-by":"crossref","first-page":"773","DOI":"10.1145\/2901790.2901820","volume-title":"Proceedings of the 2016 ACM Conference on Designing Interactive Systems","author":"Yen Yu-Chun","year":"2016","unstructured":"Yu-Chun Yen, Steven P. Dow, Elizabeth Gerber, and Brian P. Bailey. 2016. Social network, web forum, or task market? Comparing different crowd genres for design feedback exchange. In Proceedings of the 2016 ACM Conference on Designing Interactive Systems, 773\u2013784."},{"key":"e_1_3_3_106_2","first-page":"1","volume-title":"Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems","author":"Yin Ming","year":"2019","unstructured":"Ming Yin, Jennifer Wortman Vaughan, and Hanna Wallach. 2019. Understanding the effect of accuracy on trust in machine learning models. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1\u201312."},{"key":"e_1_3_3_107_2","first-page":"1005","volume-title":"Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW \u201916)","author":"Yuan Alvin","year":"2016","unstructured":"Alvin Yuan, Kurt Luther, Markus Krause, Sophie Isabel Vennix, Steven P. Dow, and Bjorn Hartmann. 2016. Almost an expert: The effects of rubrics and expertise on perceived value of crowdsourced design critiques. 
In Proceedings of the 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW \u201916). ACM, New York, NY, 1005\u20131017. DOI: 10.1145\/2818048.2819953"},{"key":"e_1_3_3_108_2","unstructured":"Zhuosheng Zhang Aston Zhang Mu Li Hai Zhao George Karypis and Alex Smola. 2023. Multimodal chain-of-thought reasoning in language models. arXiv:2302.00923. Retrieved from https:\/\/arxiv.org\/abs\/2302.00923"},{"key":"e_1_3_3_109_2","unstructured":"Wayne Xin Zhao Kun Zhou Junyi Li Tianyi Tang Xiaolei Wang Yupeng Hou Yingqian Min Beichen Zhang Junjie Zhang Zican Dong et al. 2023. A survey of large language models. arXiv:2303.18223. Retrieved from https:\/\/arxiv.org\/abs\/2303.18223"},{"key":"e_1_3_3_110_2","unstructured":"Lianmin Zheng Wei-Lin Chiang Ying Sheng Siyuan Zhuang Zhanghao Wu Yonghao Zhuang Zi Lin Zhuohan Li Dacheng Li Eric Xing et al. 2023. Judging LLM-as-a-judge with MT-bench and chatbot Arena. arXiv:2306.05685. Retrieved from https:\/\/arxiv.org\/abs\/2306.05685"}],"container-title":["ACM Transactions on Computer-Human 
Interaction"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3745768","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,15]],"date-time":"2025-10-15T06:54:13Z","timestamp":1760511253000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3745768"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,10,14]]},"references-count":109,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2025,10,31]]}},"alternative-id":["10.1145\/3745768"],"URL":"https:\/\/doi.org\/10.1145\/3745768","relation":{},"ISSN":["1073-0516","1557-7325"],"issn-type":[{"value":"1073-0516","type":"print"},{"value":"1557-7325","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,10,14]]},"assertion":[{"value":"2024-03-09","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-05-02","order":2,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2025-10-14","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}