{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,10]],"date-time":"2026-04-10T00:18:33Z","timestamp":1775780313404,"version":"3.50.1"},"reference-count":31,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2021,7]]},"abstract":"<jats:p>Many navigation applications take natural language speech as input, which avoids users typing in words and thus improves traffic safety. However, navigation applications often fail to understand a user's free-form description of a route. In addition, they only support input of a specific source or destination, which does not enable users to specify additional route requirements. We propose a SpeakNav framework that enables users to describe intended routes via speech and then recommends appropriate routes. Specifically, we propose a novel Route Template based Bidirectional Encoder Representation from Transformers (RT-BERT) model that supports the understanding of natural language route descriptions. The model enables extraction of information of intended POI keywords and related distances. Then we formalize a template-driven path query that uses the extracted information. To enable efficient query processing, we develop a hybrid label index for computing network distances between POIs, and we propose a branch-and-bound algorithm along with a pivot reverse B-tree (PB-tree) index. 
Experiments with real and synthetic data indicate that RT-BERT offers high accuracy and that the proposed algorithm is capable of outperforming baseline algorithms.<\/jats:p>","DOI":"10.14778\/3476311.3476383","type":"journal-article","created":{"date-parts":[[2021,10,28]],"date-time":"2021-10-28T22:48:43Z","timestamp":1635461323000},"page":"3056-3068","source":"Crossref","is-referenced-by-count":9,"title":["SpeakNav"],"prefix":"10.14778","volume":"14","author":[{"given":"Bolong","family":"Zheng","sequence":"first","affiliation":[{"name":"Huazhong University of Science and Technology"}]},{"given":"Lei","family":"Bi","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology"}]},{"given":"Juan","family":"Cao","sequence":"additional","affiliation":[{"name":"Huazhong University of Science and Technology"}]},{"given":"Hua","family":"Chai","sequence":"additional","affiliation":[{"name":"Didi Chuxing"}]},{"given":"Jun","family":"Fang","sequence":"additional","affiliation":[{"name":"Didi Chuxing"}]},{"given":"Lu","family":"Chen","sequence":"additional","affiliation":[{"name":"Zhejiang University"}]},{"given":"Yunjun","family":"Gao","sequence":"additional","affiliation":[{"name":"Zhejiang University"}]},{"given":"Xiaofang","family":"Zhou","sequence":"additional","affiliation":[{"name":"Hong Kong University of Science and Technology"}]},{"given":"Christian S.","family":"Jensen","sequence":"additional","affiliation":[{"name":"Aalborg University"}]}],"member":"320","published-online":{"date-parts":[[2021,10,28]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.5555\/2008623.2008645"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463676.2465315"},{"key":"e_1_2_1_3_1","volume-title":"Christian S. Jensen, and Bolong Zheng.","author":"Bi Lei","year":"2021","unstructured":"Lei Bi , Juan Cao , GuoHui Li , Nguyen Quoc Viet Hung , Christian S. Jensen, and Bolong Zheng. 2021 . 
SpeakNav: A Voice-based Navigation System via Route Description Language Understanding. In ICDE. IEEE, 2669--2672."},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.14778\/2350229.2350234"},{"key":"e_1_2_1_5_1","volume-title":"BERT for Joint Intent Classification and Slot Filling. CoRR abs\/1902.10909","author":"Chen Qian","year":"2019","unstructured":"Qian Chen, Zhu Zhuo, and Wen Wang. 2019. BERT for Joint Intent Classification and Slot Filling. CoRR abs\/1902.10909 (2019)."},{"key":"e_1_2_1_6_1","volume-title":"Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces. CoRR abs\/1805.10190","author":"Coucke Alice","year":"2018","unstructured":"Alice Coucke, Alaa Saade, Adrien Ball, Th\u00e9odore Bluche, Alexandre Caulier, David Leroy, Cl\u00e9ment Doumouro, Thibault Gisselbrecht, Francesco Caltagirone, Thibaut Lavril, Ma\u00ebl Primet, and Joseph Dureau. 2018. Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces. CoRR abs\/1805.10190 (2018)."},{"key":"e_1_2_1_7_1","volume-title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT. 
4171--4186.","author":"Devlin Jacob","year":"2019","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT. 4171--4186."},{"key":"e_1_2_1_8_1","volume-title":"ACL (1)","author":"E Haihong","unstructured":"Haihong E, Peiqing Niu, Zhongfu Chen, and Meina Song. 2019. A Novel Bidirectional Interrelated Model for Joint Intent Detection and Slot Filling. In ACL (1). Association for Computational Linguistics, 5467--5471."},{"key":"e_1_2_1_9_1","volume-title":"Attention Branch Network: Learning of Attention Mechanism for Visual Explanation","author":"Fukui Hiroshi","unstructured":"Hiroshi Fukui, Tsubasa Hirakawa, Takayoshi Yamashita, and Hironobu Fujiyoshi. 2019. Attention Branch Network: Learning of Attention Mechanism for Visual Explanation. In CVPR. Computer Vision Foundation \/ IEEE, 10705--10714."},{"key":"e_1_2_1_10_1","volume-title":"NAACL-HLT (2)","author":"Goo Chih-Wen","unstructured":"
Chih-Wen Goo, Guang Gao, Yun-Kai Hsu, Chih-Li Huo, Tsung-Chieh Chen, Keng-Wei Hsu, and Yun-Nung Chen. 2018. Slot-Gated Modeling for Joint Slot Filling and Intent Prediction. In NAACL-HLT (2). Association for Computational Linguistics, 753--757."},{"key":"e_1_2_1_11_1","volume-title":"Joint semantic utterance classification and slot filling with recursive neural networks","author":"Guo Daniel","unstructured":"Daniel Guo, G\u00f6khan T\u00fcr, Wen-tau Yih, and Geoffrey Zweig. 2014. Joint semantic utterance classification and slot filling with recursive neural networks. In SLT. IEEE, 554--559."},{"key":"e_1_2_1_12_1","volume-title":"Optimizing SVMs for complex call classification","author":"Haffner Patrick","year":"2003","unstructured":"Patrick Haffner, G\u00f6khan T\u00fcr, and Jerry H. Wright. 2003. Optimizing SVMs for complex call classification. In ICASSP (1). IEEE, 632--635."},{"key":"e_1_2_1_13_1","doi-asserted-by":"crossref","unstructured":"Dilek Hakkani-T\u00fcr G\u00f6khan T\u00fcr Asli \u00c7elikyilmaz Yun-Nung Chen Jianfeng Gao Li Deng and Ye-Yi Wang. 2016. Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM. In INTERSPEECH. 
ISCA 715--719.","DOI":"10.21437\/Interspeech.2016-402"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.3115\/116580.116613"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_1_16_1","doi-asserted-by":"crossref","unstructured":"Yoon Kim. 2014. Convolutional Neural Networks for Sentence Classification. In EMNLP. ACL 1746--1751.","DOI":"10.3115\/v1\/D14-1181"},{"key":"e_1_2_1_17_1","volume-title":"Adam: A Method for Stochastic Optimization","author":"Kingma Diederik P.","year":"2015","unstructured":"Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In ICLR (Poster)."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1145\/3186728.3164141"},{"key":"e_1_2_1_19_1","doi-asserted-by":"crossref","unstructured":"Bing Liu and Ian Lane. 2016. Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling. In INTERSPEECH. ISCA 685--689.","DOI":"10.21437\/Interspeech.2016-1352"},{"key":"e_1_2_1_20_1","volume-title":"Language models are unsupervised multitask learners. OpenAI blog 1, 8","author":"Radford Alec","year":"2019","unstructured":"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI blog 1, 8 (2019), 9."},{"key":"e_1_2_1_21_1","volume-title":"Recurrent neural network and LSTM models for lexical utterance classification","author":"Ravuri Suman V.","year":"2015","unstructured":"Suman V. 
Ravuri and Andreas Stolcke. 2015. Recurrent neural network and LSTM models for lexical utterance classification. In INTERSPEECH. ISCA, 135--139."},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","unstructured":"Christian Raymond and Giuseppe Riccardi. 2007. Generative and discriminative algorithms for spoken language understanding. In INTERSPEECH. ISCA 1605--1608.","DOI":"10.21437\/Interspeech.2007-448"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/78.650093"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/3295222.3295349"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1054010"},{"key":"e_1_2_1_26_1","volume-title":"Self-Supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation","author":"Wang Yude","unstructured":"Yude Wang, Jie Zhang, Meina Kan, Shiguang Shan, and Xilin Chen. 2020. Self-Supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation. In CVPR. IEEE, 12272--12281."},{"key":"e_1_2_1_27_1","volume-title":"Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. CoRR abs\/1609.08144","author":"Wu Yonghui","year":"2016","unstructured":"Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. 
Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2016. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. CoRR abs\/1609.08144 (2016)."},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1145\/2093973.2094001"},{"key":"e_1_2_1_29_1","unstructured":"Kaisheng Yao Geoffrey Zweig Mei-Yuh Hwang Yangyang Shi and Dong Yu. 2013. Recurrent neural networks for language understanding. In INTERSPEECH. 
ISCA 2524--2528."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2018.03.058"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2017.2703848"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3476311.3476383","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:38:01Z","timestamp":1672227481000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3476311.3476383"}},"subtitle":["voice-based route description language understanding for template-driven path search"],"short-title":[],"issued":{"date-parts":[[2021,7]]},"references-count":31,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2021,7]]}},"alternative-id":["10.14778\/3476311.3476383"],"URL":"https:\/\/doi.org\/10.14778\/3476311.3476383","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2021,7]]}}}