{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,31]],"date-time":"2026-01-31T00:13:24Z","timestamp":1769818404426,"version":"3.49.0"},"reference-count":48,"publisher":"Cambridge University Press (CUP)","issue":"3","license":[{"start":{"date-parts":[[2021,10,29]],"date-time":"2021-10-29T00:00:00Z","timestamp":1635465600000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["cambridge.org"],"crossmark-restriction":true},"short-container-title":["Nat. Lang. Eng."],"published-print":{"date-parts":[[2023,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>We propose a new setting for question answering (QA) in which users can query the system using both natural language and direct interactions within a graphical user interface that displays multiple time series associated with an entity of interest. The user interacts with the interface in order to understand the entity\u2019s state and behavior, entailing sequences of actions and questions whose answers may depend on previous factual or navigational interactions. We describe a pipeline implementation where spoken questions are first transcribed into text which is then semantically parsed into logical forms that can be used to automatically extract the answer from the underlying database. The speech recognition module is implemented by adapting a pre-trained long short-term memory (LSTM)-based architecture to the user\u2019s speech, whereas for the semantic parsing component we introduce an LSTM-based encoder\u2013decoder architecture that models context dependency through copying mechanisms and multiple levels of attention over inputs and previous outputs. When evaluated separately, with and without data augmentation, both models are shown to substantially outperform several strong baselines. 
Furthermore, the full pipeline evaluation shows only a small degradation in semantic parsing accuracy, demonstrating that the semantic parser is robust to mistakes in the speech recognition output. The new QA paradigm proposed in this paper has the potential to improve the presentation and navigation of the large amounts of sensor data and life events that are generated in many areas of medicine.<\/jats:p>","DOI":"10.1017\/s1351324921000292","type":"journal-article","created":{"date-parts":[[2021,10,29]],"date-time":"2021-10-29T07:42:45Z","timestamp":1635493365000},"page":"769-793","update-policy":"https:\/\/doi.org\/10.1017\/policypage","source":"Crossref","is-referenced-by-count":3,"title":["A semantic parsing pipeline for context-dependent question answering over temporally structured data"],"prefix":"10.1017","volume":"29","author":[{"given":"Charles","family":"Chen","sequence":"first","affiliation":[]},{"given":"Razvan","family":"Bunescu","sequence":"additional","affiliation":[]},{"given":"Cindy","family":"Marling","sequence":"additional","affiliation":[]}],"member":"56","published-online":{"date-parts":[[2021,10,29]]},"reference":[{"key":"S1351324921000292_ref43","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D16-1137"},{"key":"S1351324921000292_ref48","unstructured":"Zhong, V. , Xiong, C. and Socher, R. (2017). Seq2SQL: generating structured queries from natural language using reinforcement learning. 
CoRR, abs\/1709.00103."},{"key":"S1351324921000292_ref13","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1004"},{"key":"S1351324921000292_ref17","doi-asserted-by":"publisher","DOI":"10.3115\/1219840.1219866"},{"key":"S1351324921000292_ref36","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2017.131"},{"key":"S1351324921000292_ref23","doi-asserted-by":"publisher","DOI":"10.1145\/1273221.1273231"},{"key":"S1351324921000292_ref20","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"S1351324921000292_ref11","doi-asserted-by":"publisher","DOI":"10.1109\/JSTSP.2017.2752691"},{"key":"S1351324921000292_ref18","doi-asserted-by":"publisher","DOI":"10.1109\/MSP.2012.2205597"},{"key":"S1351324921000292_ref22","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1002"},{"key":"S1351324921000292_ref33","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-2702"},{"key":"S1351324921000292_ref8","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/D14-1179"},{"key":"S1351324921000292_ref7","doi-asserted-by":"publisher","DOI":"10.1109\/ICOSC.2019.8665509"},{"key":"S1351324921000292_ref45","unstructured":"Zeng, W. , Luo, W. , Fidler, S. and Urtasun, R. (2016). Efficient summarization with read-again and copy mechanism. CoRR, abs\/1611.03382."},{"key":"S1351324921000292_ref29","first-page":"71","article-title":"Speech recognition systems: a comparative review","volume":"19","author":"Matarneh","year":"2017","journal-title":"IOSR Journal of Computer Engineering"},{"key":"S1351324921000292_ref6","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2016.7472621"},{"key":"S1351324921000292_ref37","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2017-1705"},{"key":"S1351324921000292_ref2","unstructured":"Bahdanau, D. , Cho, K. and Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Bengio, Y. and LeCun, Y. 
, (eds), 3rd International Conference on Learning Representations, ICLR, San Diego, CA."},{"key":"S1351324921000292_ref28","first-page":"1","article-title":"A unified approach to interpreting model predictions","volume":"30","author":"Lundberg","year":"2017","journal-title":"In Advances in Neural Information Processing Systems"},{"key":"S1351324921000292_ref39","unstructured":"Sigurdsson, S. , Petersen, K.B. and Lehn-Schi\u00f8ler, T. (2006). Mel frequency cepstral coefficients: an evaluation of robustness of MP3 encoded music. In International Conference on Music Information Retrieval (ISMIR), Victoria, Canada, pp. 286\u2013289."},{"key":"S1351324921000292_ref35","unstructured":"Ranzato, M. , Chopra, S. , Auli, M. and Zaremba, W. (2016). Sequence level training with recurrent neural networks. In The 4th International Conference on Learning Representations (ICLR), Conference Track Proceedings, San Juan, Puerto Rico."},{"key":"S1351324921000292_ref46","doi-asserted-by":"publisher","DOI":"10.3115\/1690219.1690283"},{"key":"S1351324921000292_ref44","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-1447"},{"key":"S1351324921000292_ref19","doi-asserted-by":"publisher","DOI":"10.1126\/science.1127647"},{"key":"S1351324921000292_ref10","unstructured":"Devlin, J. , Chang, M.-W. , Lee, K. and Toutanova, K. (2019). BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota. Association for Computational Linguistics, pp. 4171\u20134186."},{"key":"S1351324921000292_ref1","unstructured":"Artzi, Y. and Zettlemoyer, L. (2011). Bootstrapping semantic parsers from conversations. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, UK, pp. 
421\u2013432."},{"key":"S1351324921000292_ref14","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1068"},{"key":"S1351324921000292_ref24","unstructured":"Kingma, D.P. and Ba, J. (2015). Adam: a method for stochastic optimization. In ICLR 2015, the 3rd International Conference on Learning Representations, Conference Track Proceedings, San Diego, CA."},{"key":"S1351324921000292_ref26","doi-asserted-by":"publisher","DOI":"10.1145\/2866568"},{"key":"S1351324921000292_ref9","unstructured":"Corona, R. , Thomason, J. and Mooney, R. (2017). Improving black-box speech recognition using semantic parsing. In Proceedings of the 8th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Taipei, Taiwan, pp. 122\u2013127."},{"key":"S1351324921000292_ref25","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-1819"},{"key":"S1351324921000292_ref21","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P17-1167"},{"key":"S1351324921000292_ref5","doi-asserted-by":"publisher","DOI":"10.1561\/2200000006"},{"key":"S1351324921000292_ref4","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-3167"},{"key":"S1351324921000292_ref27","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1138"},{"key":"S1351324921000292_ref30","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-2661"},{"key":"S1351324921000292_ref31","unstructured":"Norouzi, M. , Bengio, S. , Jaitly, N. , Schuster, M. , Wu, Y. , Schuurmans, D. et al. (2016). Reward augmented maximum likelihood for neural structured prediction. In Advances in Neural Information Processing Systems 29, Barcelona, Spain, pp. 1723\u20131731."},{"key":"S1351324921000292_ref32","unstructured":"Paulus, R. , Xiong, C. and Socher, R. (2018). A deep reinforced model for abstractive summarization. 
In The 6th International Conference on Learning Representations (ICLR), Conference Track Proceedings, Vancouver, Canada."},{"key":"S1351324921000292_ref34","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-3060"},{"key":"S1351324921000292_ref38","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1162"},{"key":"S1351324921000292_ref3","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-1554"},{"key":"S1351324921000292_ref41","unstructured":"Weston, J. , Bordes, A. , Chopra, S. and Mikolov, T. (2016). Towards AI-complete question answering: a set of prerequisite toy tasks. In The 4th International Conference on Learning Representations (ICLR), Conference Track Proceedings, San Juan, Puerto Rico."},{"key":"S1351324921000292_ref47","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2018-1616"},{"key":"S1351324921000292_ref12","doi-asserted-by":"publisher","DOI":"10.1109\/ICFHR.2016.0074"},{"key":"S1351324921000292_ref15","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1154"},{"key":"S1351324921000292_ref40","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2017-1118"},{"key":"S1351324921000292_ref16","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P16-1014"},{"key":"S1351324921000292_ref42","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1989.1.2.270"}],"container-title":["Natural Language 
Engineering"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.cambridge.org\/core\/services\/aop-cambridge-core\/content\/view\/S1351324921000292","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,5,19]],"date-time":"2023-05-19T07:31:29Z","timestamp":1684481489000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.cambridge.org\/core\/product\/identifier\/S1351324921000292\/type\/journal_article"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,29]]},"references-count":48,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,5]]}},"alternative-id":["S1351324921000292"],"URL":"https:\/\/doi.org\/10.1017\/s1351324921000292","relation":{},"ISSN":["1351-3249","1469-8110"],"issn-type":[{"value":"1351-3249","type":"print"},{"value":"1469-8110","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,29]]},"assertion":[{"value":"\u00a9 The Author(s), 2021. Published by Cambridge University Press","name":"copyright","label":"Copyright","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https:\/\/creativecommons.org\/licenses\/by\/4.0\/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.","name":"license","label":"License","group":{"name":"copyright_and_licensing","label":"Copyright and Licensing"}},{"value":"This content has been made available to all.","name":"free","label":"Free to read"}]}}