{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,17]],"date-time":"2025-11-17T21:40:22Z","timestamp":1763415622428,"version":"3.41.0"},"reference-count":55,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2023,4,12]],"date-time":"2023-04-12T00:00:00Z","timestamp":1681257600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"Hong Kong Polytechnic University and Didi Chuxing"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Spatial Algorithms Syst."],"published-print":{"date-parts":[[2023,6,30]]},"abstract":"<jats:p>Cities are very complex systems. Representing urban regions are essential for exploring, understanding, and predicting properties and features of cities. The enrichment of multi-modal urban big data has provided opportunities for researchers to enhance urban region embedding. However, existing works failed to develop an integrated pipeline that fully utilizes effective and informative data sources within geographic units. In this article, we regard a geo-tile as a geographic unit and propose a multi-modal and multi-stage representation learning framework, namely Geo-Tile2Vec, for urban analytics, especially for urban region properties identification. Specifically, in the early stage, geo-tile embeddings are firstly inferred through dynamic mobility events which are combinations of point-of-interest (POI) data and trajectory data by a Word2Vec-like model and metric learning. Then, in the latter stage, we use static street-level imagery to further enrich the embedding information by metric learning. Lastly, the framework learns distributed geo-tile embeddings for the given multi-modal data. We conduct experiments on real-world urban datasets. Four downstream tasks, i.e., main POI category classification task, main land use category classification task, restaurant average price regression task, and firm number regression task, are adopted for validating the effectiveness of the proposed framework in representing geo-tiles. Our proposed framework can significantly improve the performances of all downstream tasks. In addition, we also demonstrate that geo-tiles with similar urban region properties are geometrically closer in the vector space.<\/jats:p>","DOI":"10.1145\/3571741","type":"journal-article","created":{"date-parts":[[2022,11,18]],"date-time":"2022-11-18T12:04:42Z","timestamp":1668773082000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":6,"title":["Geo-Tile2Vec: A Multi-Modal and Multi-Stage Embedding Framework for Urban Analytics"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9533-6070","authenticated-orcid":false,"given":"Yan","family":"Luo","sequence":"first","affiliation":[{"name":"Department of Computing, the Hong Kong Polytechnic University, Hung Hom, KLN, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6124-1890","authenticated-orcid":false,"given":"Chak-Tou","family":"Leong","sequence":"additional","affiliation":[{"name":"Department of Computing, the Hong Kong Polytechnic University, Hung Hom, KLN, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8584-0276","authenticated-orcid":false,"given":"Shuhai","family":"Jiao","sequence":"additional","affiliation":[{"name":"Didi Chuxing, Beijing, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5294-8168","authenticated-orcid":false,"given":"Fu-Lai","family":"Chung","sequence":"additional","affiliation":[{"name":"Department of Computing, the Hong Kong Polytechnic University, Hung Hom, KLN, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7360-8864","authenticated-orcid":false,"given":"Wenjie","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Computing, the Hong Kong Polytechnic University, Hung Hom, KLN, Hong Kong"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3734-1346","authenticated-orcid":false,"given":"Guoping","family":"Liu","sequence":"additional","affiliation":[{"name":"Didi Chuxing, Beijing, China"}]}],"member":"320","published-online":{"date-parts":[[2023,4,12]]},"reference":[{"key":"e_1_3_3_2_2","doi-asserted-by":"publisher","DOI":"10.1145\/3183344"},{"key":"e_1_3_3_3_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.pmcj.2017.09.006"},{"key":"e_1_3_3_4_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304222.3304226"},{"key":"e_1_3_3_5_2","first-page":"785","volume-title":"Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining","author":"Chen Tianqi","year":"2016","unstructured":"Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A scalable tree boosting system. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining. Association for Computing Machinery, New York, NY, 785\u2013794."},{"key":"e_1_3_3_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2020.106205"},{"issue":"1","key":"e_1_3_3_7_2","doi-asserted-by":"crossref","first-page":"19","DOI":"10.1007\/s43762-022-00049-8","article-title":"Investigating functional consistency of mobility-related urban zones via motion-driven embedding vectors and local POI-type distributions","volume":"2","author":"Crivellari Alessandro","year":"2022","unstructured":"Alessandro Crivellari and Bernd Resch. 2022. Investigating functional consistency of mobility-related urban zones via motion-driven embedding vectors and local POI-type distributions. Computational Urban Science 2, 1 (2022), 19.","journal-title":"Computational Urban Science"},{"key":"e_1_3_3_8_2","doi-asserted-by":"publisher","DOI":"10.1073\/pnas.1903064116"},{"key":"e_1_3_3_9_2","article-title":"A gridded establishment dataset as a proxy for economic activity in China","volume":"8","author":"Dong Lei","year":"2021","unstructured":"Lei Dong, Xiao-Hui Yuan, Meng Li, Carlo Ratti, and Yu Liu. 2021. A gridded establishment dataset as a proxy for economic activity in China. Scientific Data 8, 1 (2021), 1\u20139.","journal-title":"Scientific Data"},{"key":"e_1_3_3_10_2","doi-asserted-by":"publisher","DOI":"10.1145\/3380970"},{"issue":"01","key":"e_1_3_3_11_2","doi-asserted-by":"crossref","first-page":"906","DOI":"10.1609\/aaai.v33i01.3301906","article-title":"Efficient region embedding with multi-view spatial networks: A perspective of locality-constrained spatial autocorrelations","volume":"33","author":"Fu Yanjie","year":"2019","unstructured":"Yanjie Fu, Pengyang Wang, Jiadi Du, Le Wu, and Xiaolin Li. 2019. Efficient region embedding with multi-view spatial networks: A perspective of locality-constrained spatial autocorrelations. Proceedings of the AAAI Conference on Artificial Intelligence 33, 01 (July2019), 906\u2013913.","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"e_1_3_3_12_2","first-page":"201700035","article-title":"Using deep learning and Google street view to estimate the demographic makeup of the US","volume":"114","author":"Gebru Timnit","year":"2017","unstructured":"Timnit Gebru, Jonathan Krause, Yilun Wang, Duyun Chen, Jia Deng, Erez Lieberman Aiden, and Li Fei-Fei. 2017. Using deep learning and Google street view to estimate the demographic makeup of the US. Proceedings of the National Academy of Sciences 114 (112017), 201700035.","journal-title":"Proceedings of the National Academy of Sciences"},{"key":"e_1_3_3_13_2","doi-asserted-by":"publisher","DOI":"10.1145\/3477314.3506992"},{"key":"e_1_3_3_14_2","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"issue":"1","key":"e_1_3_3_15_2","first-page":"1","article-title":"Accelerating land cover change in West Africa over four decades as population pressure increased","volume":"1","author":"Herrmann Stefanie M.","year":"2020","unstructured":"Stefanie M. Herrmann, Martin Brandt, Kjeld Rasmussen, and Rasmus Fensholt. 2020. Accelerating land cover change in West Africa over four decades as population pressure increased. Communications Earth & Environment 1, 1 (2020), 1\u201310.","journal-title":"Communications Earth & Environment"},{"key":"e_1_3_3_16_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compenvurbsys.2021.101619"},{"key":"e_1_3_3_17_2","volume-title":"Proceedings of the DeepSpatial 2021: 2nd ACM KDD Workshop on Deep Learning for Spatio-Temporal Data, Applications and Systems","author":"Huang Tianyuan","year":"2021","unstructured":"Tianyuan Huang, Zhecheng Wang, Hao Sheng, Andrew Y. Ng, and Ram Rajagopal. 2021. M3G: Learning urban neighborhood representation from multi-modal multi-graph. In Proceedings of the DeepSpatial 2021: 2nd ACM KDD Workshop on Deep Learning for Spatio-Temporal Data, Applications and Systems."},{"issue":"0","key":"e_1_3_3_18_2","first-page":"1","article-title":"Estimating urban functional distributions with semantics preserved POI embedding","volume":"0","author":"Huang Weiming","year":"2022","unstructured":"Weiming Huang, Lizhen Cui, Meng Chen, Daokun Zhang, and Yao Yao. 2022. Estimating urban functional distributions with semantics preserved POI embedding. International Journal of Geographical Information Science 0, 0 (2022), 1\u201326.","journal-title":"International Journal of Geographical Information Science"},{"issue":"01","key":"e_1_3_3_19_2","doi-asserted-by":"crossref","first-page":"3967","DOI":"10.1609\/aaai.v33i01.33013967","article-title":"Tile2Vec: Unsupervised representation learning for spatially distributed data","volume":"33","author":"Jean Neal","year":"2019","unstructured":"Neal Jean, Sherrie Wang, Anshul Samar, George Azzari, David Lobell, and Stefano Ermon. 2019. Tile2Vec: Unsupervised representation learning for spatially distributed data. Proceedings of the AAAI Conference on Artificial Intelligence 33, 01 (July2019), 3967\u20133974.","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"e_1_3_3_20_2","doi-asserted-by":"publisher","DOI":"10.1145\/3357384.3358001"},{"issue":"1","key":"e_1_3_3_21_2","first-page":"4:1\u20134:26","article-title":"Transfer urban human mobility via POI embedding over multiple cities","volume":"2","author":"Jiang Renhe","year":"2021","unstructured":"Renhe Jiang, Xuan Song, Zipei Fan, Tianqi Xia, Zhaonan Wang, Quanjun Chen, Zekun Cai, and Ryosuke Shibasaki. 2021. Transfer urban human mobility via POI embedding over multiple cities. ACM\/IMS Transactions on Data Science 2, 1 (2021), 4:1\u20134:26.","journal-title":"ACM\/IMS Transactions on Data Science"},{"key":"e_1_3_3_22_2","doi-asserted-by":"publisher","DOI":"10.1145\/3356471.3365238"},{"key":"e_1_3_3_23_2","article-title":"Estimation of regional economic development indicator from transportation network analytics","volume":"10","author":"Li Bin","year":"2020","unstructured":"Bin Li, Song Gao, Yunlei Liang, Yuhao Kang, Timothy Prestby, Yuqi Gao, and Run mou Xiao. 2020. Estimation of regional economic development indicator from transportation network analytics. Scientific Reports 10, 1 (2020), 1\u201315.","journal-title":"Scientific Reports"},{"key":"e_1_3_3_24_2","first-page":"138","volume-title":"Proceedings of the Annual Conference of the Association for Computational Linguistics","author":"Li Shen","year":"2018","unstructured":"Shen Li, Zhe Zhao, Renfen Hu, Wensi Li, Tao Liu, and Xiaoyong Du. 2018. Analogical reasoning on chinese morphological and semantic relations. In Proceedings of the Annual Conference of the Association for Computational Linguistics. Association for Computational Linguistics, 138\u2013143."},{"issue":"5","key":"e_1_3_3_25_2","doi-asserted-by":"crossref","first-page":"4241","DOI":"10.1609\/aaai.v35i5.16548","article-title":"Pre-training context and time aware location embeddings from spatial-temporal trajectories for user next location prediction","volume":"35","author":"Lin Yan","year":"2021","unstructured":"Yan Lin, Huaiyu Wan, Shengnan Guo, and Youfang Lin. 2021. Pre-training context and time aware location embeddings from spatial-temporal trajectories for user next location prediction. Proceedings of the AAAI Conference on Artificial Intelligence 35, 5 (May2021), 4241\u20134248.","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"e_1_3_3_26_2","doi-asserted-by":"publisher","DOI":"10.1080\/13658816.2015.1086923"},{"key":"e_1_3_3_27_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.landurbplan.2012.02.012"},{"key":"e_1_3_3_28_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.rse.2006.02.010"},{"key":"e_1_3_3_29_2","unstructured":"Tomas Mikolov Kai Chen Gregory S. Corrado and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. International Conference on Learning Representations 1\u201312."},{"key":"e_1_3_3_30_2","doi-asserted-by":"publisher","DOI":"10.5555\/2999792.2999959"},{"key":"e_1_3_3_31_2","doi-asserted-by":"publisher","DOI":"10.1109\/TITS.2012.2209201"},{"key":"e_1_3_3_32_2","doi-asserted-by":"publisher","DOI":"10.1080\/14786440109462720"},{"key":"e_1_3_3_33_2","doi-asserted-by":"publisher","DOI":"10.1080\/13658816.2014.913794"},{"key":"e_1_3_3_34_2","doi-asserted-by":"crossref","DOI":"10.4324\/9781315618159","volume-title":"The Geography of Transport Systems","author":"Rodrigue Jean-Paul","year":"2016","unstructured":"Jean-Paul Rodrigue, Claude Comtois, and Brian Slack. 2016. The Geography of Transport Systems."},{"key":"e_1_3_3_35_2","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(88)90021-0"},{"key":"e_1_3_3_36_2","doi-asserted-by":"crossref","first-page":"815","DOI":"10.1109\/CVPR.2015.7298682","volume-title":"Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","author":"Schroff Florian","year":"2015","unstructured":"Florian Schroff, Dmitry Kalenichenko, and James Philbin. 2015. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 815\u2013823."},{"key":"e_1_3_3_37_2","doi-asserted-by":"publisher","DOI":"10.1145\/3397536.3422229"},{"key":"e_1_3_3_38_2","doi-asserted-by":"publisher","DOI":"10.3390\/ijgi10050339"},{"key":"e_1_3_3_39_2","doi-asserted-by":"publisher","DOI":"10.2307\/143141"},{"issue":"4","key":"e_1_3_3_40_2","first-page":"Article 22","article-title":"SeqST-GAN: Seq2Seq generative adversarial nets for multi-step urban crowd flow prediction","volume":"6","author":"Wang Senzhang","year":"2020","unstructured":"Senzhang Wang, Jiannong Cao, Hao Chen, Hao Peng, and Zhiqiu Huang. 2020. SeqST-GAN: Seq2Seq generative adversarial nets for multi-step urban crowd flow prediction. ACM Transactions on Spatial Algorithms and Systems 6, 4 (2020), Article 22.","journal-title":"ACM Transactions on Spatial Algorithms and Systems"},{"issue":"01","key":"e_1_3_3_41_2","doi-asserted-by":"crossref","first-page":"1013","DOI":"10.1609\/aaai.v34i01.5450","article-title":"Urban2Vec: Incorporating street view imagery and POIs for multi-modal urban neighborhood embedding","volume":"34","author":"Wang Zhecheng","year":"2020","unstructured":"Zhecheng Wang, Haoyuan Li, and Ram Rajagopal. 2020. Urban2Vec: Incorporating street view imagery and POIs for multi-modal urban neighborhood embedding. Proceedings of the AAAI Conference on Artificial Intelligence 34, 01 (April2020), 1013\u20131020.","journal-title":"Proceedings of the AAAI Conference on Artificial Intelligence"},{"key":"e_1_3_3_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/3486183.3491001"},{"key":"e_1_3_3_43_2","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1145\/3347146.3359063","volume-title":"Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems","author":"Yabe Takahiro","year":"2019","unstructured":"Takahiro Yabe, Kota Tsubouchi, Toru Shimizu, Yoshihide Sekimoto, and Satish V. Ukkusuri. 2019. City2City: Translating place representations across cities. In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (2019-11-05). 412\u2013415."},{"key":"e_1_3_3_44_2","doi-asserted-by":"publisher","DOI":"10.1145\/3139958.3140054"},{"key":"e_1_3_3_45_2","doi-asserted-by":"publisher","DOI":"10.5555\/3304222.3304314"},{"key":"e_1_3_3_46_2","article-title":"Urban function recognition by integrating social media and street-level imagery","author":"Ye Chao","year":"2020","unstructured":"Chao Ye, Fan Zhang, Lan Mu, Yong Gao, and Yu Liu. 2020. Urban function recognition by integrating social media and street-level imagery. Environment and Planning B: Urban Analytics and City Science 48, 6 (2020), 1430\u20131444.","journal-title":"Environment and Planning B: Urban Analytics and City Science"},{"key":"e_1_3_3_47_2","doi-asserted-by":"publisher","DOI":"10.1145\/2339530.2339561"},{"key":"e_1_3_3_48_2","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2014.2345405"},{"key":"e_1_3_3_49_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.compenvurbsys.2018.11.008"},{"key":"e_1_3_3_50_2","doi-asserted-by":"publisher","DOI":"10.3390\/ijgi10060372"},{"key":"e_1_3_3_51_2","doi-asserted-by":"crossref","first-page":"361","DOI":"10.1145\/3038912.3052601","volume-title":"Proceedings of the 26th International Conference on World Wide Web","author":"Zhang Chao","year":"2017","unstructured":"Chao Zhang, Keyang Zhang, Quan Yuan, Haoruo Peng, Yu Zheng, Tim Hanratty, Shaowen Wang, and Jiawei Han. 2017. Regions, periods, activities: Uncovering urban dynamics via cross-modal representation learning. In Proceedings of the 26th International Conference on World Wide Web (2017-04-03). 361\u2013370."},{"key":"e_1_3_3_52_2","first-page":"4431","volume-title":"Proceedings of the 29th International Joint Conference on Artificial Intelligence","volume":"5","author":"Zhang Mingyang","year":"2020","unstructured":"Mingyang Zhang, Tong Li, Yong Li, and Pan Hui. 2020. Multi-view joint graph representation learning for urban region embedding. In Proceedings of the 29th International Joint Conference on Artificial Intelligence, Vol. 5. 4431\u20134437."},{"key":"e_1_3_3_53_2","doi-asserted-by":"publisher","DOI":"10.1145\/3292500.3330972"},{"key":"e_1_3_3_54_2","first-page":"487","volume-title":"Proceedings of the Conference on Neural Information Processing Systems","author":"Zhou Bolei","year":"2014","unstructured":"Bolei Zhou, Agata Lapedriza, Jianxiong Xiao, Antonio Torralba, and Aude Oliva. 2014. Learning deep features for scene recognition using places database. In Proceedings of the Conference on Neural Information Processing Systems. MIT Press, Cambridge, MA, 487\u2013495."},{"key":"e_1_3_3_55_2","article-title":"Places: A 10 million image database for scene recognition","author":"Zhou Bolei","year":"2017","unstructured":"Bolei Zhou, \u00c0gata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba. 2017. Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 40, 6 (2017), 1452\u20131464.","journal-title":"IEEE Transactions on Pattern Analysis and Machine Intelligence"},{"key":"e_1_3_3_56_2","doi-asserted-by":"crossref","first-page":"2403","DOI":"10.1109\/BigData.2018.8622444","volume-title":"Proceedings of the 2018 IEEE International Conference on Big Data (Big Data)","author":"Zhou Yang","year":"2018","unstructured":"Yang Zhou and Yan Huang. 2018. DeepMove: Learning place representations through large scale movement data. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data). 2403\u20132412."}],"container-title":["ACM Transactions on Spatial Algorithms and Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3571741","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3571741","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:33Z","timestamp":1750182573000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3571741"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,12]]},"references-count":55,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2023,6,30]]}},"alternative-id":["10.1145\/3571741"],"URL":"https:\/\/doi.org\/10.1145\/3571741","relation":{},"ISSN":["2374-0353","2374-0361"],"issn-type":[{"type":"print","value":"2374-0353"},{"type":"electronic","value":"2374-0361"}],"subject":[],"published":{"date-parts":[[2023,4,12]]},"assertion":[{"value":"2021-08-22","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-11-07","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-04-12","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}