{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T19:59:30Z","timestamp":1776110370096,"version":"3.50.1"},"publisher-location":"New York, NY, USA","reference-count":85,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,7,5]]},"DOI":"10.1145\/3715336.3735754","type":"proceedings-article","created":{"date-parts":[[2025,7,4]],"date-time":"2025-07-04T10:12:39Z","timestamp":1751623959000},"page":"999-1019","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["IKIWISI: An Interactive Visual Pattern Generator for Evaluating the Reliability of Vision-Language Models Without Ground Truth"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6075-2832","authenticated-orcid":false,"given":"Md Touhidul","family":"Islam","sequence":"first","affiliation":[{"name":"College of Information Sciences and Technology, Pennsylvania State University, University Park, Pennsylvania, USA"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-8145-9328","authenticated-orcid":false,"given":"Imran","family":"Kabir","sequence":"additional","affiliation":[{"name":"College of Information Sciences and Technology, Pennsylvania State University, State College, Pennsylvania, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7692-817X","authenticated-orcid":false,"given":"Md Alimoor","family":"Reza","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, Drake University, Des Moines, Iowa, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5063-3808","authenticated-orcid":false,"given":"Syed Masum","family":"Billah","sequence":"additional","affiliation":[{"name":"College of Information Sciences and Technology, Pennsylvania State University, University Park, Pennsylvania, 
USA"}]}],"member":"320","published-online":{"date-parts":[[2025,7,4]]},"reference":[{"key":"e_1_3_3_2_2_2","unstructured":"[n.d.]. Aira. https:\/\/aira.io\/."},{"key":"e_1_3_3_2_3_2","doi-asserted-by":"crossref","unstructured":"Bilal Alsallakh Allan Hanbury Helwig Hauser Silvia Miksch and Andreas Rauber. 2014. Visual methods for analyzing probabilistic classification data. IEEE transactions on visualization and computer graphics 20 12 (2014) 1703\u20131712.","DOI":"10.1109\/TVCG.2014.2346660"},{"key":"e_1_3_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2015.279"},{"key":"e_1_3_3_2_5_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV.2019.00939"},{"key":"e_1_3_3_2_6_2","unstructured":"BeMyEyes. 2021. Be My Eyes. https:\/\/www.bemyeyes.com\/."},{"key":"e_1_3_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3132525.3132531"},{"key":"e_1_3_3_2_8_2","first-page":"27","volume-title":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"1","author":"Billah Syed\u00a0Masum","year":"2015","unstructured":"Syed\u00a0Masum Billah and Susan Gauch. 2015. Social network analysis for predicting emerging researchers. In 2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K), Vol.\u00a01. IEEE, 27\u201335."},{"key":"e_1_3_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/2901318.2901335"},{"key":"e_1_3_3_2_10_2","doi-asserted-by":"publisher","DOI":"10.1109\/WACVW60836.2024.00020"},{"key":"e_1_3_3_2_11_2","volume-title":"Machines like us: toward AI with common sense","author":"Brachman Ronald\u00a0J","year":"2023","unstructured":"Ronald\u00a0J Brachman and Hector\u00a0J Levesque. 2023. Machines like us: toward AI with common sense. 
MIT Press."},{"key":"e_1_3_3_2_12_2","unstructured":"S\u00e9bastien Bubeck Varun Chandrasekaran Ronen Eldan Johannes Gehrke Eric Horvitz Ece Kamar Peter Lee Yin\u00a0Tat Lee Yuanzhi Li Scott Lundberg et\u00a0al. 2023. Sparks of artificial general intelligence: Early experiments with gpt-4. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2303.12712 (2023)."},{"key":"e_1_3_3_2_13_2","doi-asserted-by":"crossref","unstructured":"John\u00a0M Carroll and Judith\u00a0Reitman Olson. 1988. Mental models in human-computer interaction. Handbook of human-computer interaction (1988) 45\u201365.","DOI":"10.1016\/B978-0-444-70536-5.50007-5"},{"key":"e_1_3_3_2_14_2","doi-asserted-by":"crossref","unstructured":"Sonia Castelo Joao Rulff Erin McGowan Bea Steers Guande Wu Shaoyu Chen Iran Roman Roque Lopez Ethan Brewer Chen Zhao et\u00a0al. 2023. Argus: Visualization of ai-assisted task guidance in ar. IEEE Transactions on Visualization and Computer Graphics (2023).","DOI":"10.1109\/TVCG.2023.3327396"},{"key":"e_1_3_3_2_15_2","doi-asserted-by":"crossref","unstructured":"David Chan Suzanne Petryk Joseph\u00a0E Gonzalez Trevor Darrell and John Canny. 2023. Clair: Evaluating image captions with large language models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2310.12971 (2023).","DOI":"10.18653\/v1\/2023.emnlp-main.841"},{"key":"e_1_3_3_2_16_2","doi-asserted-by":"crossref","unstructured":"Long Chen Oleg Sinavski Jan H\u00fcnermann Alice Karnsund Andrew\u00a0James Willmott Danny Birch Daniel Maund and Jamie Shotton. 2023. Driving with llms: Fusing object-level vector modality for explainable autonomous driving. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2310.01957 (2023).","DOI":"10.1109\/ICRA57147.2024.10611018"},{"key":"e_1_3_3_2_17_2","unstructured":"Teresa Datta and John\u00a0P Dickerson. 2023. Who\u2019s Thinking? A Push for Human-Centered Evaluation of LLMs using the XAI Playbook. 
arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2303.06223 (2023)."},{"key":"e_1_3_3_2_18_2","doi-asserted-by":"crossref","unstructured":"Di Feng Christian Haase-Sch\u00fctz Lars Rosenbaum Heinz Hertlein Claudius Glaeser Fabian Timm Werner Wiesbeck and Klaus Dietmayer. 2020. Deep multi-modal object detection and semantic segmentation for autonomous driving: Datasets methods and challenges. IEEE Transactions on Intelligent Transportation Systems 22 3 (2020) 1341\u20131360.","DOI":"10.1109\/TITS.2020.2972974"},{"key":"e_1_3_3_2_19_2","doi-asserted-by":"crossref","unstructured":"Baltasar Fernandez-Manjon and Alfredo Fernandez-Valmayor. 1998. Building educational tools based on formal concept analysis. Education and Information Technologies 3 3 (1998) 187\u2013201.","DOI":"10.1023\/A:1009641330050"},{"key":"e_1_3_3_2_20_2","unstructured":"Maxwell Forbes Ari Holtzman and Yejin Choi. 2019. Do neural language representations learn physical commonsense? arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/1908.02899 (2019)."},{"key":"e_1_3_3_2_21_2","first-page":"2672","volume-title":"Advances in Neural Information Processing Systems","author":"Goodfellow Ian\u00a0J.","year":"2014","unstructured":"Ian\u00a0J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems, Vol.\u00a027. Curran Associates, Inc., 2672\u20132680."},{"key":"e_1_3_3_2_22_2","doi-asserted-by":"publisher","DOI":"10.1177\/1541931215591373"},{"key":"e_1_3_3_2_23_2","unstructured":"Tanmay Gupta A. Kamath Aniruddha Kembhavi and Derek Hoiem. 2021. Towards General Purpose Vision Systems. ArXiv abs\/2104.00743 (2021)."},{"key":"e_1_3_3_2_24_2","unstructured":"Tanmay Gupta A. Kamath Aniruddha Kembhavi and Derek Hoiem. 2022. Towards General Purpose Vision Systems. 
Conference of Computer Vision and Pattern Recognition (CVPR) (2022)."},{"key":"e_1_3_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.01591"},{"key":"e_1_3_3_2_26_2","doi-asserted-by":"publisher","DOI":"10.1145\/3613904.3642817"},{"key":"e_1_3_3_2_27_2","doi-asserted-by":"crossref","unstructured":"David Hand and Peter Christen. 2018. A note on using the F-measure for evaluating record linkage algorithms. Statistics and Computing 28 (2018) 539\u2013547.","DOI":"10.1007\/s11222-017-9746-6"},{"key":"e_1_3_3_2_28_2","doi-asserted-by":"crossref","unstructured":"Andreas Hinterreiter Peter Ruch Holger Stitz Martin Ennemoser J\u00fcrgen Bernard Hendrik Strobelt and Marc Streit. 2020. ConfusionFlow: A model-agnostic visualization for temporal analysis of classifier confusion. IEEE Transactions on Visualization and Computer Graphics 28 2 (2020) 1222\u20131236.","DOI":"10.1109\/TVCG.2020.3012063"},{"key":"e_1_3_3_2_29_2","doi-asserted-by":"crossref","unstructured":"Fred Hohman Minsuk Kahng Robert Pienta and Duen\u00a0Horng Chau. 2018. Visual analytics in deep learning: An interrogative survey for the next frontiers. IEEE transactions on visualization and computer graphics 25 8 (2018) 2674\u20132693.","DOI":"10.1109\/TVCG.2018.2843369"},{"key":"e_1_3_3_2_30_2","doi-asserted-by":"publisher","unstructured":"Md\u00a0Naimul Hoque Nazmus Saquib Syed\u00a0Masum Billah and Klaus Mueller. 2020. Toward Interactively Balancing the Screen Time of Actors Based on Observable Phenotypic Traits in Live Telecast. 4 CSCW2 Article 154 (oct 2020) 18\u00a0pages. 10.1145\/3415225","DOI":"10.1145\/3415225"},{"key":"e_1_3_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/1458082.1458345"},{"key":"e_1_3_3_2_32_2","doi-asserted-by":"crossref","unstructured":"Md\u00a0Touhidul Islam and Syed\u00a0Masum Billah. 2023. SpaceX Mag: An Automatic Scalable and Rapid Space Compactor for Optimizing Smartphone App Interfaces for Low-Vision Users. 
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies 7 2 (2023) 1\u201336.","DOI":"10.1145\/3596253"},{"key":"e_1_3_3_2_33_2","unstructured":"Md\u00a0Touhidul Islam Imran Kabir Elena\u00a0Ariel Pearce Md\u00a0Alimoor Reza and Syed\u00a0Masum Billah. 2024. A Dataset for Crucial Object Recognition in Blind and Low-Vision Individuals\u2019 Navigation. arxiv:https:\/\/arXiv.org\/abs\/2407.16777\u00a0[cs.CV] https:\/\/arxiv.org\/abs\/2407.16777"},{"key":"e_1_3_3_2_34_2","doi-asserted-by":"publisher","DOI":"10.1145\/3663548.3688538"},{"key":"e_1_3_3_2_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/3672539.3686749"},{"key":"e_1_3_3_2_36_2","doi-asserted-by":"publisher","DOI":"10.1145\/3654777.3676396"},{"key":"e_1_3_3_2_37_2","unstructured":"Zhaoyin Jia Andy Gallagher Ashutosh Saxena and Tsuhan Chen. 2014. 3D Reasoning from Blocks to Stability. IEEE Trans PAMI (2014)."},{"key":"e_1_3_3_2_38_2","doi-asserted-by":"publisher","DOI":"10.1145\/1008992.1009124"},{"key":"e_1_3_3_2_39_2","unstructured":"Roger\u00a0T Johnson and David\u00a0W Johnson. 1986. Cooperative learning in the science classroom. Science and children 24 2 (1986) 31\u201332."},{"key":"e_1_3_3_2_40_2","volume-title":"2025 IEEE International Conference on Robotics and Automation (ICRA)","author":"Kabir Imran","year":"2025","unstructured":"Imran Kabir, Md\u00a0Alimoor Reza, and Syed Billah. 2025. Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding. In 2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE."},{"key":"e_1_3_3_2_41_2","doi-asserted-by":"crossref","unstructured":"Minsuk Kahng Pierre\u00a0Y Andrews Aditya Kalro and Duen\u00a0Horng Chau. 2017. ActiVis: Visual exploration of industry-scale deep neural network models. 
IEEE transactions on visualization and computer graphics 24 1 (2017) 88\u201397.","DOI":"10.1109\/TVCG.2017.2744718"},{"key":"e_1_3_3_2_42_2","doi-asserted-by":"crossref","unstructured":"Amita Kamath Jack Hessel and Kai-Wei Chang. 2023. What\u2019s\" up\" with vision-language models? Investigating their struggle with spatial reasoning. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2310.19785 (2023).","DOI":"10.18653\/v1\/2023.emnlp-main.568"},{"key":"e_1_3_3_2_43_2","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2019.8793751"},{"key":"e_1_3_3_2_44_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2019.00111"},{"key":"e_1_3_3_2_45_2","unstructured":"Dongxu Li Junnan Li Hung Le Guangsen Wang Silvio Savarese and Steven\u00a0CH Hoi. 2022. Lavis: A library for language-vision intelligence. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2209.09019 (2022)."},{"key":"e_1_3_3_2_46_2","first-page":"12888","volume-title":"International Conference on Machine Learning","author":"Li Junnan","year":"2022","unstructured":"Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. 2022. Blip: Bootstrapping language-image pre-training for unified vision-language understanding and generation. In International Conference on Machine Learning. PMLR, 12888\u201312900."},{"key":"e_1_3_3_2_47_2","doi-asserted-by":"crossref","unstructured":"Fangyu Liu Guy Emerson and Nigel Collier. 2023. Visual spatial reasoning. Transactions of the Association for Computational Linguistics 11 (2023) 635\u2013651.","DOI":"10.1162\/tacl_a_00566"},{"key":"e_1_3_3_2_48_2","unstructured":"Haotian Liu Chunyuan Li Qingyang Wu and Yong\u00a0Jae Lee. 2023. Visual instruction tuning. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2304.08485 (2023)."},{"key":"e_1_3_3_2_49_2","unstructured":"Haotian Liu Chunyuan Li Qingyang Wu and Yong\u00a0Jae Lee. 2024. Visual instruction tuning. 
Advances in neural information processing systems 36 (2024)."},{"key":"e_1_3_3_2_50_2","doi-asserted-by":"crossref","unstructured":"Mengchen Liu Jiaxin Shi Kelei Cao Jun Zhu and Shixia Liu. 2017. Analyzing the training processes of deep generative models. IEEE transactions on visualization and computer graphics 24 1 (2017) 77\u201387.","DOI":"10.1109\/TVCG.2017.2744938"},{"key":"e_1_3_3_2_51_2","doi-asserted-by":"crossref","unstructured":"Shixia Liu Xiting Wang Mengchen Liu and Jun Zhu. 2017. Towards better analysis of machine learning models: A visual analytics perspective. Visual Informatics 1 1 (2017) 48\u201356.","DOI":"10.1016\/j.visinf.2017.01.006"},{"key":"e_1_3_3_2_52_2","unstructured":"Muhammad Maaz Hanoona Rasheed Salman Khan and Fahad\u00a0Shahbaz Khan. 2023. Video-chatgpt: Towards detailed video understanding via large vision and language models. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2306.05424 (2023)."},{"key":"e_1_3_3_2_53_2","unstructured":"Microsoft. 2017. Surface Dial. https:\/\/www.microsoft.com\/en-us\/surface\/accessories\/surface-dial"},{"key":"e_1_3_3_2_54_2","doi-asserted-by":"crossref","unstructured":"Mark\u00a0EJ Newman. 2004. Coauthorship networks and patterns of scientific collaboration. Proceedings of the national academy of sciences 101 suppl_1 (2004) 5200\u20135205.","DOI":"10.1073\/pnas.0307545100"},{"key":"e_1_3_3_2_55_2","volume-title":"Japanese candlestick charting techniques: a contemporary guide to the ancient investment techniques of the Far East","author":"Nison Steve","year":"2001","unstructured":"Steve Nison. 2001. Japanese candlestick charting techniques: a contemporary guide to the ancient investment techniques of the Far East. Penguin."},{"key":"e_1_3_3_2_56_2","doi-asserted-by":"publisher","DOI":"10.1201\/b15703"},{"key":"e_1_3_3_2_57_2","unstructured":"OpenAI. [n.d.]. GPTV System Card. 
https:\/\/cdn.openai.com\/papers\/GPTV_System_Card.pdf"},{"key":"e_1_3_3_2_58_2","volume-title":"GPT-4 Technical Report","year":"2023","unstructured":"OpenAI. 2023. GPT-4 Technical Report. arxiv:https:\/\/arXiv.org\/abs\/2303.08774v2https:\/\/arxiv.org\/abs\/2303.08774v2"},{"key":"e_1_3_3_2_59_2","volume-title":"GPT-4V(ision) System Card","year":"2023","unstructured":"OpenAI. 2023. GPT-4V(ision) System Card. https:\/\/cdn.openai.com\/papers\/GPTV_System_Card.pdf"},{"key":"e_1_3_3_2_60_2","volume-title":"GPT-4V(ision) technical work and authors","year":"2023","unstructured":"OpenAI. 2023. GPT-4V(ision) technical work and authors. https:\/\/openai.com\/contributions\/gpt-4v"},{"key":"e_1_3_3_2_61_2","volume-title":"The PageRank citation ranking: Bringing order to the web.","author":"Page Lawrence","year":"1999","unstructured":"Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank citation ranking: Bringing order to the web. Technical Report. Stanford infolab."},{"key":"e_1_3_3_2_62_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-71051-4_10"},{"key":"e_1_3_3_2_63_2","unstructured":"David\u00a0MW Powers. 2020. Evaluation: from precision recall and F-measure to ROC informedness markedness and correlation. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2010.16061 (2020)."},{"key":"e_1_3_3_2_64_2","doi-asserted-by":"crossref","unstructured":"Junaid Qadir Mohammad\u00a0Qamar Islam and Ala Al-Fuqaha. 2022. Toward accountable human-centered AI: rationale and promising directions. Journal of Information Communication and Ethics in Society 20 2 (2022) 329\u2013342.","DOI":"10.1108\/JICES-06-2021-0059"},{"key":"e_1_3_3_2_65_2","unstructured":"Chenyang Qi Xiaodong Cun Yong Zhang Chenyang Lei Xintao Wang Ying Shan and Qifeng Chen. 2023. Fatezero: Fusing attentions for zero-shot text-based video editing. 
arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2303.09535 (2023)."},{"key":"e_1_3_3_2_66_2","doi-asserted-by":"crossref","unstructured":"Donghao Ren Saleema Amershi Bongshin Lee Jina Suh and Jason\u00a0D Williams. 2016. Squares: Supporting interactive performance analysis for multiclass classifiers. IEEE transactions on visualization and computer graphics 23 1 (2016) 61\u201370.","DOI":"10.1109\/TVCG.2016.2598828"},{"key":"e_1_3_3_2_67_2","doi-asserted-by":"crossref","unstructured":"Mark\u00a0O Riedl. 2019. Human-centered artificial intelligence and machine learning. Human behavior and emerging technologies 1 1 (2019) 33\u201336.","DOI":"10.1002\/hbe2.117"},{"key":"e_1_3_3_2_68_2","doi-asserted-by":"crossref","unstructured":"Christopher\u00a0A Sanchez and Jennifer Wiley. 2009. To scroll or not to scroll: Scrolling working memory capacity and comprehending complex texts. Human Factors 51 5 (2009) 730\u2013738.","DOI":"10.1177\/0018720809352788"},{"key":"e_1_3_3_2_69_2","doi-asserted-by":"crossref","unstructured":"Daniel\u00a0J Simons and Daniel\u00a0T Levin. 1997. Change blindness. Trends in cognitive sciences 1 7 (1997) 261\u2013267.","DOI":"10.1016\/S1364-6613(97)01080-2"},{"key":"e_1_3_3_2_70_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52688.2022.00517"},{"key":"e_1_3_3_2_71_2","doi-asserted-by":"crossref","unstructured":"Dejan Todorovic. 2008. Gestalt principles. Scholarpedia 3 12 (2008) 5345.","DOI":"10.4249\/scholarpedia.5345"},{"key":"e_1_3_3_2_72_2","doi-asserted-by":"crossref","unstructured":"Anne Treisman. 1985. Preattentive processing in vision. Computer vision graphics and image processing 31 2 (1985) 156\u2013177.","DOI":"10.1016\/S0734-189X(85)80004-9"},{"key":"e_1_3_3_2_73_2","doi-asserted-by":"crossref","unstructured":"Anne\u00a0M Treisman and Garry Gelade. 1980. A feature-integration theory of attention. 
Cognitive psychology 12 1 (1980) 97\u2013136.","DOI":"10.1016\/0010-0285(80)90005-5"},{"key":"e_1_3_3_2_74_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"e_1_3_3_2_75_2","doi-asserted-by":"crossref","unstructured":"Liuping Wang Zhan Zhang Dakuo Wang Weidan Cao Xiaomu Zhou Ping Zhang Jianxing Liu Xiangmin Fan and Feng Tian. 2023. Human-centered design and evaluation of AI-empowered clinical decision support systems: a systematic review. Frontiers in Computer Science 5 (2023) 1187299.","DOI":"10.3389\/fcomp.2023.1187299"},{"key":"e_1_3_3_2_76_2","volume-title":"Interactive Data Visualization: Foundations, Techniques, and Applications","author":"Ward Matthew\u00a0O","year":"2010","unstructured":"Matthew\u00a0O Ward, Georges Grinstein, and Daniel Keim. 2010. Interactive Data Visualization: Foundations, Techniques, and Applications. CRC Press."},{"key":"e_1_3_3_2_77_2","volume-title":"The Twelfth International Conference on Learning Representations","author":"West Peter","year":"2023","unstructured":"Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena\u00a0D Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, et\u00a0al. 2023. THE GENERATIVE AI PARADOX:\u201cWhat It Can Create, It May Not Understand\u201d. In The Twelfth International Conference on Learning Representations."},{"key":"e_1_3_3_2_78_2","doi-asserted-by":"publisher","DOI":"10.1145\/3706598.3714210"},{"key":"e_1_3_3_2_79_2","unstructured":"Jingyi Xie Rui Yu He Zhang Sooyeon Lee Syed\u00a0Masum Billah and John\u00a0M Carroll. 2024. Emerging practices for large multimodal model (lmm) assistance for people with visual impairments: Implications for design. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2407.08882 (2024)."},{"key":"e_1_3_3_2_80_2","unstructured":"Zhenjie Yang Xiaosong Jia Hongyang Li and Junchi Yan. 2023. A survey of large language models for autonomous driving. 
arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2311.01043 (2023)."},{"key":"e_1_3_3_2_81_2","unstructured":"Mert Yuksekgonul Federico Bianchi Pratyusha Kalluri Dan Jurafsky and James Zou. 2022. When and why vision-language models behave like bag-of-words models and what to do about it. arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2210.01936 5 (2022)."},{"key":"e_1_3_3_2_82_2","doi-asserted-by":"publisher","DOI":"10.1145\/3688828.3699636"},{"key":"e_1_3_3_2_83_2","doi-asserted-by":"crossref","unstructured":"Jiawei Zhang Yang Wang Piero Molino Lezhi Li and David\u00a0S Ebert. 2018. Manifold: A model-agnostic framework for interpretation and diagnosis of machine learning models. IEEE transactions on visualization and computer graphics 25 1 (2018) 364\u2013373.","DOI":"10.1109\/TVCG.2018.2864499"},{"key":"e_1_3_3_2_84_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2018.00068"},{"key":"e_1_3_3_2_85_2","unstructured":"Xingcheng Zhou Mingyu Liu Bare\u00a0Luka Zagar Ekim Yurtsever and Alois\u00a0C Knoll. 2023. Vision language models in autonomous driving and intelligent transportation systems. 
arXiv preprint arXiv:https:\/\/arXiv.org\/abs\/2310.14414 (2023)."},{"key":"e_1_3_3_2_86_2","doi-asserted-by":"publisher","DOI":"10.1145\/1240624.1240704"}],"event":{"name":"DIS '25: Designing Interactive Systems Conference","location":"Madeira Portugal","acronym":"DIS '25","sponsor":["SIGCHI ACM Special Interest Group on Computer-Human Interaction"]},"container-title":["Proceedings of the 2025 ACM Designing Interactive Systems Conference"],"original-title":[],"deposited":{"date-parts":[[2025,7,4]],"date-time":"2025-07-04T11:26:39Z","timestamp":1751628399000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3715336.3735754"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,4]]},"references-count":85,"alternative-id":["10.1145\/3715336.3735754","10.1145\/3715336"],"URL":"https:\/\/doi.org\/10.1145\/3715336.3735754","relation":{},"subject":[],"published":{"date-parts":[[2025,7,4]]},"assertion":[{"value":"2025-07-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}