{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T15:39:53Z","timestamp":1775230793338,"version":"3.50.1"},"reference-count":81,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2024,11,21]],"date-time":"2024-11-21T00:00:00Z","timestamp":1732147200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100006374","name":"NSF","doi-asserted-by":"publisher","award":["SCH-2406099"],"award-info":[{"award-number":["SCH-2406099"]}],"id":[{"id":"10.13039\/501100006374","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. ACM Interact. Mob. Wearable Ubiquitous Technol."],"published-print":{"date-parts":[[2024,11,21]]},"abstract":"<jats:p>Voice assistants capable of answering user queries during various physical tasks have shown promise in guiding users through complex procedures. However, users often find it challenging to articulate their queries precisely, especially when unfamiliar with the specific terminologies required for machine-oriented tasks. We introduce PrISM-Q&amp;A, a novel question-answering (Q&amp;A) interaction termed step-aware Q&amp;A, which enhances the functionality of voice assistants on smartwatches by incorporating Human Activity Recognition (HAR) and providing the system with user context. It continuously monitors user behavior during procedural tasks via audio and motion sensors on the watch and estimates which step the user is performing. When a question is posed, this contextual information is supplied to Large Language Models (LLMs) as part of the context used to generate a response, even in the case of inherently vague questions like \"What should I do next with this?\" Our studies confirmed that users preferred the convenience of our approach compared to existing voice assistants. 
Our real-time assistant represents the first Q&amp;A system that provides contextually situated support during tasks without camera use, paving the way for the ubiquitous, intelligent assistant.<\/jats:p>","DOI":"10.1145\/3699759","type":"journal-article","created":{"date-parts":[[2024,11,21]],"date-time":"2024-11-21T12:23:32Z","timestamp":1732191812000},"page":"1-26","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":15,"title":["PrISM-Q&amp;A: Step-Aware Voice Assistant on a Smartwatch Enabled by Multimodal Procedure Tracking and Large Language Models"],"prefix":"10.1145","volume":"8","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-7868-4754","authenticated-orcid":false,"given":"Riku","family":"Arakawa","sequence":"first","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7906-4899","authenticated-orcid":false,"given":"Jill Fain","family":"Lehman","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, United States"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1237-7545","authenticated-orcid":false,"given":"Mayank","family":"Goel","sequence":"additional","affiliation":[{"name":"Carnegie Mellon University, Pittsburgh, United States"}]}],"member":"320","published-online":{"date-parts":[[2024,11,21]]},"reference":[{"key":"e_1_2_2_1_1","doi-asserted-by":"publisher","DOI":"10.1145\/3290605.3300233"},{"key":"e_1_2_2_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/3654777.3676350"},{"key":"e_1_2_2_3_1","doi-asserted-by":"publisher","DOI":"10.1145\/3569504"},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/2806416.2806491"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1145\/3341163.3347735"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/2023.FINDINGS-EMNLP.1036"},{"key":"e_1_2_2_7_1","doi-asserted-by":"publisher","DOI":"10.1145\/3264901"},{"key":"e_1_2_2_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/TMM.2017.2726187"},{"key":"e_1_2_2_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3534582"},{"key":"e_1_2_2_10_1","volume-title":"SUS: A 'Quick and Dirty' Usability Scale. In Usability Evaluation In Industry, Patrick W","author":"Brooke John","year":"1996","unstructured":"John Brooke. 1996. SUS: A 'Quick and Dirty' Usability Scale. In Usability Evaluation In Industry, Patrick W. Jordan, B. Thomas, Ian Lyall McClelland, and Bernard Weerdmeester (Eds.). CRC Press, London, UK, 207--212."},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-39229-0_51"},{"key":"e_1_2_2_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/169891.169968"},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1007\/S40593-018-0166-3"},{"key":"e_1_2_2_14_1","first-page":"9","article-title":"On the Design of and Interaction with Conversational Agents: An Organizing and Assessing Review of Human-Computer Interaction Research","volume":"23","author":"Diederich Stephan","year":"2022","unstructured":"Stephan Diederich, Alfred Benedikt Brendel, Stefan Morana, and Lutz M. Kolbe. 2022. On the Design of and Interaction with Conversational Agents: An Organizing and Assessing Review of Human-Computer Interaction Research. J. Assoc. Inf. Syst. 23, 1 (2022), 9. https:\/\/aisel.aisnet.org\/jais\/vol23\/iss1\/9","journal-title":"J. Assoc. Inf. 
Syst."},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1038\/s41431-023-01396-8"},{"key":"e_1_2_2_16_1","volume-title":"Ragas: Automated evaluation of retrieval augmented generation. arXiv preprint arXiv:2309.15217","author":"Es Shahul","year":"2023","unstructured":"Shahul Es, Jithin James, Luis Espinosa-Anke, and Steven Schockaert. 2023. Ragas: Automated evaluation of retrieval augmented generation. arXiv preprint arXiv:2309.15217 (2023)."},{"key":"e_1_2_2_17_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2312.11805"},{"key":"e_1_2_2_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3649500"},{"key":"e_1_2_2_19_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2312.10997"},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1145\/3090076"},{"key":"e_1_2_2_21_1","doi-asserted-by":"publisher","DOI":"10.3115\/981175.981181"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/2022.SIGDIAL-1.61"},{"key":"e_1_2_2_23_1","doi-asserted-by":"publisher","DOI":"10.1145\/1101149.1101228"},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRA.2018.8460699"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1145\/302979.303030"},{"key":"e_1_2_2_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3411764.3445283"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1145\/3563657.3596059"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1587\/TRANSINF.2016SLP0004"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/W19-8663"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3613904.3642183"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3287048"},{"key":"e_1_2_2_32_1","unstructured":"Jane Huang. 2024. Evaluating Large Language Model (LLM) systems: Metrics challenges and best practices. https:\/\/medium.com\/data-science-at-microsoft\/evaluating-llm-systems-metrics-challenges-and-best-practices-664ac25be7e5"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3571730"},{"key":"e_1_2_2_34_1","volume-title":"International Conference on Machine Learning, ICML 2023","volume":"15707","author":"Kandpal Nikhil","year":"2023","unstructured":"Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, and Colin Raffel. 2023. Large Language Models Struggle to Learn Long-Tail Knowledge. In International Conference on Machine Learning, ICML 2023, 23--29 July 2023, Honolulu, Hawaii, USA (Proceedings of Machine Learning Research, Vol. 202). PMLR, 15696--15707."},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.CHB.2021.106914"},{"key":"e_1_2_2_36_1","doi-asserted-by":"publisher","DOI":"10.1145\/3326458.3326932"},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3242587.3242609"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2984511.2984582"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3613904.3642910"},{"key":"e_1_2_2_40_1","volume-title":"Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. ACM","author":"Lee Jaewook","unstructured":"Jaewook Lee, Jun Wang, Elizabeth Brown, Liam Chu, Sebastian S. Rodriguez, and Jon E. Froehlich. 2024. GazePointAR: A Context-Aware Multimodal Voice Assistant for Pronoun Disambiguation in Wearable Augmented Reality. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems. 
ACM, New York, NY."},{"key":"e_1_2_2_41_1","volume-title":"Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020","author":"Lewis Patrick S. H.","year":"2020","unstructured":"Patrick S. H. Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K\u00fcttler, Mike Lewis, Wen-tau Yih, Tim Rockt\u00e4schel, Sebastian Riedel, and Douwe Kiela. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 2020, virtual."},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1145\/3544548.3581006"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2858036.2858288"},{"key":"e_1_2_2_44_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/2023.ACL-LONG.546"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/2021.ACL-LONG.316"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376147"},{"key":"e_1_2_2_47_1","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376479"},{"key":"e_1_2_2_48_1","unstructured":"Meta Platforms. 2023. Smart glasses for living all in. https:\/\/www.meta.com\/smart-glasses\/"},{"key":"e_1_2_2_49_1","unstructured":"Microsoft. 2024. Tiny but mighty: The Phi-3 small language models with big potential. https:\/\/news.microsoft.com\/source\/features\/ai\/the-phi-3-small-language-models-with-big-potential\/"},{"key":"e_1_2_2_50_1","doi-asserted-by":"publisher","DOI":"10.1145\/3550284"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.3390\/S16010072"},{"key":"e_1_2_2_52_1","doi-asserted-by":"publisher","DOI":"10.1207\/s15327566ijce0403_2"},{"key":"e_1_2_2_53_1","unstructured":"OpenAI. 2022. Introducing Whisper. https:\/\/openai.com\/research\/whisper"},{"key":"e_1_2_2_54_1","unstructured":"OpenAI. 2023. GPT-4 Technical Report. CoRR abs\/2303.08774 (2023). https:\/\/doi.org\/10.48550\/ARXIV.2303.08774 arXiv:2303.08774"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2306.08302"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2406.10750"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173574.3174214"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/3359316"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.1101\/2023.11.20.23298784"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1080\/00207548008919653"},{"key":"e_1_2_2_61_1","doi-asserted-by":"publisher","unstructured":"Jorge Rodrguez Teresa Gutirrez Emilio J. Sara Casado and Iker Aguinag. 2012. Training of Procedural Tasks Through the Use of Virtual Reality and Direct Aids. InTech. https:\/\/doi.org\/10.5772\/36650","DOI":"10.5772\/36650"},{"key":"e_1_2_2_62_1","volume-title":"Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023","author":"Schick Timo","year":"2023","unstructured":"Timo Schick, Jane Dwivedi-Yu, Roberto Dess\u00ec, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. 2023. Toolformer: Language Models Can Teach Themselves to Use Tools. 
In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, New Orleans, LA, USA, December 10-16, 2023."},{"key":"e_1_2_2_63_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPRW.2009.5204354"},{"key":"e_1_2_2_64_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00403-024-03025-w"},{"key":"e_1_2_2_65_1","doi-asserted-by":"publisher","DOI":"10.1145\/3411764.3445536"},{"key":"e_1_2_2_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/3173574.3173782"},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.01854"},{"key":"e_1_2_2_68_1","doi-asserted-by":"publisher","DOI":"10.2139\/ssrn.3718008"},{"key":"e_1_2_2_69_1","doi-asserted-by":"publisher","DOI":"10.1145\/3313831.3376875"},{"key":"e_1_2_2_70_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2402.02008"},{"key":"e_1_2_2_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP49357.2023.10095969"},{"key":"e_1_2_2_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/3328931"},{"key":"e_1_2_2_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/3638550.3641130"},{"key":"e_1_2_2_74_1","doi-asserted-by":"publisher","DOI":"10.1609\/AAAI.V38I17.29906"},{"key":"e_1_2_2_75_1","volume-title":"ReAct: Synergizing Reasoning and Acting in Language Models. In The Eleventh International Conference on Learning Representations, ICLR 2023","author":"Yao Shunyu","year":"2023","unstructured":"Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik R. Narasimhan, and Yuan Cao. 2023. ReAct: Synergizing Reasoning and Acting in Language Models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net."},{"key":"e_1_2_2_76_1","doi-asserted-by":"publisher","DOI":"10.1016\/J.IJCCI.2019.04.005"},{"key":"e_1_2_2_77_1","doi-asserted-by":"publisher","DOI":"10.48550\/ARXIV.2303.18223"},{"key":"e_1_2_2_78_1","volume-title":"Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023","author":"Zheng Lianmin","year":"2023","unstructured":"Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. 2023. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 -16, 2023."},{"key":"e_1_2_2_79_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR52729.2023.01033"},{"key":"e_1_2_2_80_1","doi-asserted-by":"publisher","DOI":"10.18653\/V1\/2022.ACL-LONG.214"},{"key":"e_1_2_2_81_1","volume-title":"Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering. CoRR abs\/2101.00774","author":"Zhu Fengbin","year":"2021","unstructured":"Fengbin Zhu, Wenqiang Lei, Chao Wang, Jianming Zheng, Soujanya Poria, and Tat-Seng Chua. 2021. Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering. 
CoRR abs\/2101.00774 (2021)."}],"container-title":["Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3699759","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3699759","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,25]],"date-time":"2025-09-25T16:30:31Z","timestamp":1758817831000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3699759"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,21]]},"references-count":81,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2024,11,21]]}},"alternative-id":["10.1145\/3699759"],"URL":"https:\/\/doi.org\/10.1145\/3699759","relation":{},"ISSN":["2474-9567"],"issn-type":[{"value":"2474-9567","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,11,21]]},"assertion":[{"value":"2024-11-21","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}
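
Note: the abstract in this record describes step-aware Q&A as a pipeline: a HAR model over watch audio/motion estimates the user's current step, and that estimate is injected into the LLM prompt so that vague questions ("What should I do next with this?") can be grounded in task context. Purely as an illustration of that prompt-construction idea, and not the authors' implementation, here is a minimal Python sketch; the procedure list, estimate_current_step stub, and all names are hypothetical placeholders.

# Illustrative sketch of step-aware context injection (hypothetical; a real
# system would replace estimate_current_step with a HAR model over watch
# audio/IMU features, per the paper's abstract).

PROCEDURE = [
    "Fill the reservoir with water",
    "Insert a paper filter into the basket",
    "Add ground coffee to the filter",
    "Start the brew cycle",
]

def estimate_current_step(sensor_window) -> int:
    """Stub for the HAR step tracker; returns an index into PROCEDURE."""
    # Placeholder: a real tracker would classify sensor features here.
    return 2

def build_prompt(question: str, step_idx: int) -> str:
    """Augment a possibly vague user question with the estimated step."""
    steps = "\n".join(
        f"{i + 1}. {s}" + ("  <-- user is here" if i == step_idx else "")
        for i, s in enumerate(PROCEDURE)
    )
    return (
        "You are a voice assistant guiding a procedural task.\n"
        f"Procedure:\n{steps}\n\n"
        f"User question: {question}\n"
        "Answer concisely, resolving pronouns against the current step."
    )

if __name__ == "__main__":
    idx = estimate_current_step(sensor_window=None)
    print(build_prompt("What should I do next with this?", idx))

The point of the sketch is the context injection: because the prompt marks the estimated step, an LLM can resolve "this" and "next" without any camera input, which is the behavior the abstract claims.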