{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T21:17:48Z","timestamp":1762377468064,"version":"3.41.0"},"reference-count":48,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2023,12,29]],"date-time":"2023-12-29T00:00:00Z","timestamp":1703808000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100001809","name":"Natural Science Foundation of China","doi-asserted-by":"crossref","award":["61902209, U2001212"],"award-info":[{"award-number":["61902209, U2001212"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Beijing Outstanding Young Scientist Program","award":["BJJWZYJH012019100020098"],"award-info":[{"award-number":["BJJWZYJH012019100020098"]}]},{"name":"Intelligent Social Governance Platform, Major Innovation Planning Interdisciplinary Platform for the \u201cDouble-First Class\u201d Initiative, Renmin University of China"},{"name":"Fundamental Research Funds for the Central Universities, and the Research Funds of Renmin University of China"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Inf. Syst."],"published-print":{"date-parts":[[2024,5,31]]},"abstract":"<jats:p>Point-of-interest (POI) search is important for location-based services, such as navigation and online ride-hailing service. The goal of POI search is to find the most relevant destinations from a large-scale POI database given a text query. To improve the effectiveness and efficiency of POI search, most existing approaches are based on a multi-stage pipeline that consists of an efficiency-oriented retrieval stage and one or more effectiveness-oriented re-rank stages. 
In this article, we focus on the first efficiency-oriented retrieval stage of the POI search. We first identify the limitations of existing first-stage POI retrieval models in capturing the semantic-geography relationship and modeling the fine-grained geographical context information. Then, we propose a Geo-Enhanced Dense Retrieval framework for POI search to alleviate the above problems. Specifically, the proposed framework leverages the capacity of pre-trained language models (e.g., BERT) and designs a pre-training approach to better model the semantic match between the query prefix and POIs. With the POI collection, we first perform a token-level pre-training task based on a geographical-sensitive masked language prediction and design two retrieval-oriented pre-training tasks that link the address of each POI to its name and geo-location. With the user behavior logs collected from an online POI search system, we design two additional pre-training tasks based on users\u2019 query reformulation behavior and the transitions between POIs. We also utilize a late-interaction network structure to model the fine-grained interactions between the text and geographical context information within an acceptable query latency. 
Extensive experiments on the real-world datasets collected from the Didichuxing application demonstrate that the proposed framework can achieve superior retrieval performance over existing first-stage POI retrieval methods.<\/jats:p>","DOI":"10.1145\/3631937","type":"journal-article","created":{"date-parts":[[2023,11,7]],"date-time":"2023-11-07T12:19:44Z","timestamp":1699359584000},"page":"1-27","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":4,"title":["Improving First-stage Retrieval of Point-of-interest Search by Pre-training Models"],"prefix":"10.1145","volume":"42","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7960-3036","authenticated-orcid":false,"given":"Lang","family":"Mei","sequence":"first","affiliation":[{"name":"Beijing Key Laboratory of Big Data Management and Analysis Methods, Gaoling School of Artificial Intelligence, Renmin University of China, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9257-5498","authenticated-orcid":false,"given":"Jiaxin","family":"Mao","sequence":"additional","affiliation":[{"name":"Didi Chuxing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0007-2886-3966","authenticated-orcid":false,"given":"Juan","family":"Hu","sequence":"additional","affiliation":[{"name":"Didi Chuxing, China"}]},{"ORCID":"https:\/\/orcid.org\/0009-0008-4687-5212","authenticated-orcid":false,"given":"Naiqiang","family":"Tan","sequence":"additional","affiliation":[{"name":"Didi Chuxing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8351-1935","authenticated-orcid":false,"given":"Hua","family":"Chai","sequence":"additional","affiliation":[{"name":"Didi Chuxing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9777-9676","authenticated-orcid":false,"given":"Ji-Rong","family":"Wen","sequence":"additional","affiliation":[{"name":"Beijing Key Laboratory of Big Data Management and Analysis Methods, Gaoling School of Artificial Intelligence, Renmin University of China, 
China"}]}],"member":"320","published-online":{"date-parts":[[2023,12,29]]},"reference":[{"key":"e_1_3_2_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531943"},{"key":"e_1_3_2_3_2","volume-title":"Convolutional Neural Network for Sentence Classification","author":"Chen Yahui","year":"2015","unstructured":"Yahui Chen. 2015. Convolutional Neural Network for Sentence Classification. Master\u2019s thesis. University of Waterloo."},{"key":"e_1_3_2_4_2","doi-asserted-by":"publisher","DOI":"10.1145\/3331184.3331303"},{"key":"e_1_3_2_5_2","article-title":"Improving local identifiability in probabilistic box embeddings","author":"Dasgupta Shib Sankar","year":"2020","unstructured":"Shib Sankar Dasgupta, Michael Boratko, Dongxu Zhang, Luke Vilnis, Xiang Lorraine Li, and Andrew McCallum. 2020. Improving local identifiability in probabilistic box embeddings. arXiv preprint arXiv:2010.04831 (2020).","journal-title":"arXiv preprint arXiv:2010.04831"},{"key":"e_1_3_2_6_2","article-title":"BERT: Pre-training of deep bidirectional transformers for language understanding","author":"Devlin Jacob","year":"2018","unstructured":"Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).","journal-title":"arXiv preprint arXiv:1810.04805"},{"key":"e_1_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3447548.3467058"},{"key":"e_1_3_2_8_2","article-title":"Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity","author":"Fedus William","year":"2021","unstructured":"William Fedus, Barret Zoph, and Noam Shazeer. 2021. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity. 
arXiv preprint arXiv:2101.03961 (2021).","journal-title":"arXiv preprint arXiv:2101.03961"},{"key":"e_1_3_2_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/2766462.2767780"},{"key":"e_1_3_2_10_2","article-title":"Unsupervised corpus aware language model pre-training for dense passage retrieval","author":"Gao Luyu","year":"2021","unstructured":"Luyu Gao and Jamie Callan. 2021. Unsupervised corpus aware language model pre-training for dense passage retrieval. arXiv preprint arXiv:2108.05540 (2021).","journal-title":"arXiv preprint arXiv:2108.05540"},{"key":"e_1_3_2_11_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-72113-8_10"},{"key":"e_1_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2013.379"},{"key":"e_1_3_2_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3532086"},{"key":"e_1_3_2_14_2","article-title":"Universal language model fine-tuning for text classification","author":"Howard Jeremy","year":"2018","unstructured":"Jeremy Howard and Sebastian Ruder. 2018. Universal language model fine-tuning for text classification. 
arXiv preprint arXiv:1801.06146 (2018).","journal-title":"arXiv preprint arXiv:1801.06146"},{"key":"e_1_3_2_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/3366423.3380027"},{"key":"e_1_3_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403318"},{"key":"e_1_3_2_17_2","doi-asserted-by":"publisher","DOI":"10.1145\/3534678.3539021"},{"key":"e_1_3_2_18_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403305"},{"key":"e_1_3_2_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/2505515.2505665"},{"key":"e_1_3_2_20_2","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2010.57"},{"key":"e_1_3_2_21_2","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2019.2921572"},{"key":"e_1_3_2_22_2","article-title":"Dense passage retrieval for open-domain question answering","author":"Karpukhin Vladimir","year":"2020","unstructured":"Vladimir Karpukhin, Barlas O\u011fuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih. 2020. Dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2004.04906 (2020).","journal-title":"arXiv preprint arXiv:2004.04906"},{"key":"e_1_3_2_23_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401075"},{"key":"e_1_3_2_24_2","volume-title":"International Conference on Learning Representations","author":"Li Xiang","year":"2018","unstructured":"Xiang Li, Luke Vilnis, Dongxu Zhang, Michael Boratko, and Andrew McCallum. 2018. Smoothing the geometry of probabilistic box embeddings. In International Conference on Learning Representations."},{"key":"e_1_3_2_25_2","doi-asserted-by":"publisher","DOI":"10.1145\/3394486.3403149"},{"key":"e_1_3_2_26_2","first-page":"1","volume-title":"ACM SIGIR Forum","author":"Lin Jimmy","year":"2022","unstructured":"Jimmy Lin. 2022. A proposed conceptual framework for a representational approach to information retrieval. In ACM SIGIR Forum, Vol. 55. 
ACM New York, NY, 1\u201329."},{"key":"e_1_3_2_27_2","first-page":"2209","volume-title":"Findings of the Association for Computational Linguistics: EMNLP\u201921","author":"Liu Xiao","year":"2021","unstructured":"Xiao Liu, Juan Hu, Qi Shen, and Huan Chen. 2021. Geo-BERT Pre-training model for query rewriting in POI search. In Findings of the Association for Computational Linguistics: EMNLP\u201921. 2209\u20132214."},{"key":"e_1_3_2_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477495.3531772"},{"key":"e_1_3_2_29_2","doi-asserted-by":"publisher","DOI":"10.1145\/3437963.3441777"},{"key":"e_1_3_2_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462869"},{"key":"e_1_3_2_31_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401093"},{"key":"e_1_3_2_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/3485447.3512073"},{"key":"e_1_3_2_33_2","unstructured":"Guy M. Morton. 1966. A computer oriented geodetic data base and a new technique in file sequencing. https:\/\/dominoweb.draco.res.ibm.com\/0dabf9473b9c86d48525779800566a39.html"},{"key":"e_1_3_2_34_2","article-title":"Passage re-ranking with BERT","author":"Nogueira Rodrigo","year":"2019","unstructured":"Rodrigo Nogueira and Kyunghyun Cho. 2019. Passage re-ranking with BERT. arXiv preprint arXiv:1901.04085 (2019).","journal-title":"arXiv preprint arXiv:1901.04085"},{"key":"e_1_3_2_35_2","article-title":"Semantic modelling with long-short-term memory for information retrieval","author":"Palangi Hamid","year":"2014","unstructured":"Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, and R. Ward. 2014. Semantic modelling with long-short-term memory for information retrieval. arXiv preprint arXiv:1412.6629 (2014).","journal-title":"arXiv preprint arXiv:1412.6629"},{"key":"e_1_3_2_36_2","doi-asserted-by":"crossref","unstructured":"M. E. Peters M. Neumann M. Iyyer M. Gardner C. Clark K. Lee and L. Zettlemoyer. 2018. Deep contextualized word representations. 
arXiv preprint arXiv:1802.05365 (2018).","DOI":"10.18653\/v1\/N18-1202"},{"key":"e_1_3_2_37_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11431-020-1647-3"},{"key":"e_1_3_2_38_2","article-title":"RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering","author":"Qu Yingqi","year":"2020","unstructured":"Yingqi Qu, Yuchen Ding, Jing Liu, Kai Liu, Ruiyang Ren, Wayne Xin Zhao, Daxiang Dong, Hua Wu, and Haifeng Wang. 2020. RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering. arXiv preprint arXiv:2010.08191 (2020).","journal-title":"arXiv preprint arXiv:2010.08191"},{"key":"e_1_3_2_39_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4471-2099-5_24"},{"key":"e_1_3_2_40_2","article-title":"ColBERTv2: Effective and efficient retrieval via lightweight late interaction","author":"Santhanam Keshav","year":"2021","unstructured":"Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, and Matei Zaharia. 2021. ColBERTv2: Effective and efficient retrieval via lightweight late interaction. arXiv preprint arXiv:2112.01488 (2021).","journal-title":"arXiv preprint arXiv:2112.01488"},{"key":"e_1_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1145\/2567948.2577348"},{"key":"e_1_3_2_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3459637.3481924"},{"issue":"11","key":"e_1_3_2_43_2","article-title":"Visualizing data using t-SNE.","volume":"9","author":"Maaten Laurens Van der","year":"2008","unstructured":"Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 11 (2008).","journal-title":"J. Mach. Learn. Res."},{"key":"e_1_3_2_44_2","article-title":"Attention is all you need","volume":"30","author":"Vaswani Ashish","year":"2017","unstructured":"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, \u0141ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. Adv. 
Neural Inf. Process. Syst. 30 (2017).","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"e_1_3_2_45_2","article-title":"Graph attention networks","author":"Veli\u010dkovi\u0107 Petar","year":"2017","unstructured":"Petar Veli\u010dkovi\u0107, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).","journal-title":"arXiv preprint arXiv:1710.10903"},{"key":"e_1_3_2_46_2","article-title":"Approximate nearest neighbor negative contrastive learning for dense text retrieval","author":"Xiong Lee","year":"2020","unstructured":"Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul Bennett, Junaid Ahmed, and Arnold Overwijk. 2020. Approximate nearest neighbor negative contrastive learning for dense text retrieval. arXiv preprint arXiv:2007.00808 (2020).","journal-title":"arXiv preprint arXiv:2007.00808"},{"key":"e_1_3_2_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397271.3401159"},{"key":"e_1_3_2_48_2","doi-asserted-by":"publisher","DOI":"10.1145\/3404835.3462880"},{"key":"e_1_3_2_49_2","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v33i01.33011270"}],"container-title":["ACM Transactions on Information 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3631937","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3631937","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T22:51:02Z","timestamp":1750287062000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3631937"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,29]]},"references-count":48,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2024,5,31]]}},"alternative-id":["10.1145\/3631937"],"URL":"https:\/\/doi.org\/10.1145\/3631937","relation":{},"ISSN":["1046-8188","1558-2868"],"issn-type":[{"type":"print","value":"1046-8188"},{"type":"electronic","value":"1558-2868"}],"subject":[],"published":{"date-parts":[[2023,12,29]]},"assertion":[{"value":"2023-02-27","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-10-29","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-12-29","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}