{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,12]],"date-time":"2025-12-12T13:48:54Z","timestamp":1765547334510,"version":"3.32.0"},"reference-count":73,"publisher":"Association for Computing Machinery (ACM)","issue":"12","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2024,8]]},"abstract":"<jats:p>We released open-source software Hadoop-GIS in 2011, and presented and published the work in VLDB 2013. This work initiated the development of a new spatial data analytical ecosystem characterized by its large-scale capacity in both computing and data storage, high scalability, compatibility with low-cost commodity processors in clusters and open-source software. After more than a decade of research and development, this ecosystem has matured and is now serving many applications across various fields. In this paper, we provide the background on why we started this project and give an overview of the original Hadoop-GIS software architecture, along with its unique technical contributions and legacy. We present the evolution of the ecosystem and its current state-of-the-art, which has been influenced by the Hadoop-GIS project. We also describe the ongoing efforts to further enhance this ecosystem with hardware accelerations to meet the increasing demands for low latency and high throughput in various spatial data analysis tasks. 
Finally, we will summarize the insights gained and lessons learned over more than a decade in pursuing high-performance spatial data analytics.<\/jats:p>","DOI":"10.14778\/3685800.3685912","type":"journal-article","created":{"date-parts":[[2024,11,8]],"date-time":"2024-11-08T17:25:21Z","timestamp":1731086721000},"page":"4507-4520","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["High-Performance Spatial Data Analytics: Systematic R&D for Scale-Out and Scale-Up Solutions from the Past to Now"],"prefix":"10.14778","volume":"17","author":[{"given":"Fusheng","family":"Wang","sequence":"first","affiliation":[{"name":"Stony Brook University"}]},{"given":"Rubao","family":"Lee","sequence":"additional","affiliation":[{"name":"Freelance"}]},{"given":"Dejun","family":"Teng","sequence":"additional","affiliation":[{"name":"Shandong University"}]},{"given":"Xiaodong","family":"Zhang","sequence":"additional","affiliation":[{"name":"The Ohio State University"}]},{"given":"Joel","family":"Saltz","sequence":"additional","affiliation":[{"name":"Stony Brook University"}]}],"member":"320","published-online":{"date-parts":[[2024,11,8]]},"reference":[{"unstructured":"2011. Hadoop-GIS. https:\/\/github.com\/StonyBrookDB\/hadoopgis","key":"e_1_2_1_1_1"},{"unstructured":"2024. https:\/\/hive.apache.org\/","key":"e_1_2_1_2_1"},{"unstructured":"2024. https:\/\/www.precisely.com\/product\/data-integrity\/precisely-data-integrity-suite\/spatial-analytics","key":"e_1_2_1_3_1"},{"unstructured":"2024. https:\/\/www.precisely.com\/product\/precisely-spectrum-spatial\/spectrum-spatial","key":"e_1_2_1_4_1"},{"unstructured":"2024. https:\/\/www.precisely.com\/data-guide\/products\/dynamic-weather-data","key":"e_1_2_1_5_1"},{"unstructured":"2024. https:\/\/www.statista.com\/statistics\/276306\/global-apple-iphone-sales-since-fiscal-year-2007\/","key":"e_1_2_1_6_1"},{"unstructured":"2024. 
https:\/\/backlinko.com\/uber-users","key":"e_1_2_1_7_1"},{"unstructured":"2024. https:\/\/www.techpowerup.com\/gpu-specs\/","key":"e_1_2_1_8_1"},{"unstructured":"2024. https:\/\/en.wikipedia.org\/wiki\/Transistor_count","key":"e_1_2_1_9_1"},{"unstructured":"2024. https:\/\/www.starsolutions.com\/","key":"e_1_2_1_10_1"},{"unstructured":"2024. https:\/\/developer.nvidia.com\/geometric-performance-primitives-gpp\/","key":"e_1_2_1_11_1"},{"unstructured":"2024. https:\/\/developer.nvidia.com\/","key":"e_1_2_1_12_1"},{"unstructured":"2024. https:\/\/en.wikipedia.org\/wiki\/Worse_is_better","key":"e_1_2_1_13_1"},{"unstructured":"2024. https:\/\/www.dreamsongs.com\/WIB.html","key":"e_1_2_1_14_1"},{"unstructured":"2024. Apache Hadoop. https:\/\/hadoop.apache.org\/.","key":"e_1_2_1_15_1"},{"unstructured":"2024. Apache Sedona. https:\/\/sedona.apache.org\/1.6.0\/","key":"e_1_2_1_16_1"},{"unstructured":"2024. Magellan: Geospatial Analytics Using Spark. https:\/\/github.com\/harsha2010\/magellan","key":"e_1_2_1_17_1"},{"unstructured":"2024. VLDB Test of Time Award. https:\/\/www.vldb.org\/awards_10year.html","key":"e_1_2_1_18_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_19_1","DOI":"10.14778\/2536222.2536227"},{"doi-asserted-by":"publisher","key":"e_1_2_1_20_1","DOI":"10.1145\/116873.116880"},{"key":"e_1_2_1_21_1","volume-title":"Accelerating spatial cross-matching on cpu-gpu hybrid platform with cuda and openacc. Frontiers in big Data 3","author":"Baig Furqan","year":"2020","unstructured":"Furqan Baig, Chao Gao, Dejun Teng, Jun Kong, and Fusheng Wang. 2020. Accelerating spatial cross-matching on cpu-gpu hybrid platform with cuda and openacc. Frontiers in big Data 3 (2020), 14."},{"doi-asserted-by":"publisher","key":"e_1_2_1_22_1","DOI":"10.1145\/3139958.3140019"},{"doi-asserted-by":"publisher","key":"e_1_2_1_23_1","DOI":"10.1145\/93597.98741"},{"unstructured":"Lauren Bennett and Trisalyn Nelson. 2022. Five Reasons Every Data Science Team Needs a Geographer. 
https:\/\/www.esri.com\/about\/newsroom\/arcuser\/five-reasons-every-data-science-team-needs-a-geographer\/","key":"e_1_2_1_24_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_25_1","DOI":"10.1016\/S0167-8191(02)00097-2"},{"doi-asserted-by":"publisher","key":"e_1_2_1_26_1","DOI":"10.1016\/S0167-8191(01)00099-0"},{"unstructured":"CGAL. 2024. The Computational Geometry Algorithms Library. http:\/\/www.sfcgal.org\/","key":"e_1_2_1_27_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_28_1","DOI":"10.1145\/273244.273264"},{"doi-asserted-by":"publisher","key":"e_1_2_1_29_1","DOI":"10.1145\/3557917.3567618"},{"doi-asserted-by":"publisher","key":"e_1_2_1_30_1","DOI":"10.1109\/JPROC.2011.2182074"},{"key":"e_1_2_1_31_1","volume-title":"MapReduce: Simplified Data Processing on Large Clusters. In 6th Symposium on Operating System Design and Implementation (OSDI 2004","author":"Dean Jeffrey","year":"2004","unstructured":"Jeffrey Dean and Sanjay Ghemawat. 2004. MapReduce: Simplified Data Processing on Large Clusters. In 6th Symposium on Operating System Design and Implementation (OSDI 2004), San Francisco, California, USA, December 6--8, 2004, Eric A. Brewer and Peter Chen (Eds.). USENIX Association, 137--150. http:\/\/www.usenix.org\/events\/osdi04\/tech\/dean.html"},{"doi-asserted-by":"publisher","key":"e_1_2_1_32_1","DOI":"10.1093\/bioinformatics\/btab418"},{"doi-asserted-by":"publisher","key":"e_1_2_1_33_1","DOI":"10.1109\/ICDE.2015.7113382"},{"doi-asserted-by":"publisher","key":"e_1_2_1_34_1","DOI":"10.1109\/ICDE51399.2021.00092"},{"unstructured":"OSM Foundation. 2024. OpenStreetMap. http:\/\/www.openstreetmap.org.","key":"e_1_2_1_35_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_36_1","DOI":"10.1145\/3650200.3656610"},{"volume-title":"Using MPI: portable parallel programming with the message-passing interface","author":"Gropp William","unstructured":"William Gropp, Ewing Lusk, and Anthony Skjellum. 1999. 
Using MPI: portable parallel programming with the message-passing interface. Vol. 1. MIT press.","key":"e_1_2_1_37_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_38_1","DOI":"10.1109\/ICDE.2011.5767933"},{"doi-asserted-by":"publisher","key":"e_1_2_1_39_1","DOI":"10.1145\/2588555.2595630"},{"doi-asserted-by":"publisher","key":"e_1_2_1_40_1","DOI":"10.1145\/2038916.2038920"},{"doi-asserted-by":"publisher","key":"e_1_2_1_41_1","DOI":"10.14778\/2556549.2556559"},{"key":"e_1_2_1_42_1","volume-title":"VLDB","volume":"94","author":"Kamel Ibrahim","year":"1994","unstructured":"Ibrahim Kamel and Christos Faloutsos. 1994. Hilbert r-tree: An improved rtree using fractals. In VLDB, Vol. 94. Citeseer, 500--509."},{"doi-asserted-by":"publisher","key":"e_1_2_1_43_1","DOI":"10.1109\/38.933521"},{"doi-asserted-by":"publisher","key":"e_1_2_1_44_1","DOI":"10.1145\/331532.331544"},{"doi-asserted-by":"publisher","key":"e_1_2_1_45_1","DOI":"10.1109\/ICDCS.2011.26"},{"doi-asserted-by":"publisher","key":"e_1_2_1_46_1","DOI":"10.14778\/3476311.3476378"},{"key":"e_1_2_1_47_1","volume-title":"Proceedings of the 40th IEEE International Conference on Data Engineering (ICDE","author":"Dongxiao Yu Rubao Mengbai Xiao","year":"2024","unstructured":"Mengbai Xiao Dongxiao Yu Rubao Lee Li, Xin and Xiaodong Zhang. 2024. Ultra-Precise: A GPU-based Framework for Arbitrary-Precision Arithmetic in Database Systems. In Proceedings of the 40th IEEE International Conference on Data Engineering (ICDE 2024)."},{"doi-asserted-by":"publisher","key":"e_1_2_1_48_1","DOI":"10.1145\/3139958.3139961"},{"unstructured":"Nvidia. 2018. Nvidia Turing GPU Architecture. (2018).","key":"e_1_2_1_49_1"},{"unstructured":"Nvidia. 2021. Nvidia Ampere GA102 GPU Architecture. (2021).","key":"e_1_2_1_50_1"},{"unstructured":"Nvidia. 2022. Nvidia Ada GPU Architecture. 
(2022).","key":"e_1_2_1_51_1"},{"key":"e_1_2_1_52_1","first-page":"53","article-title":"A conversation with Jim Gray","volume":"1","author":"Patterson Dave","year":"2003","unstructured":"Dave Patterson. 2003. A conversation with Jim Gray. ACM Queue 1, 4 (2003), 53--56.","journal-title":"ACM Queue"},{"doi-asserted-by":"publisher","key":"e_1_2_1_53_1","DOI":"10.1145\/1559845.1559865"},{"doi-asserted-by":"publisher","key":"e_1_2_1_54_1","DOI":"10.1038\/s41374-020-0463-y"},{"doi-asserted-by":"publisher","key":"e_1_2_1_55_1","DOI":"10.1145\/1629175.1629197"},{"key":"e_1_2_1_56_1","volume-title":"Efficient spatial queries over complex polygons with hybrid representations. GeoInformatica","author":"Teng Dejun","year":"2023","unstructured":"Dejun Teng, Furqan Baig, Zhaohui Peng, Jun Kong, and Fusheng Wang. 2023. Efficient spatial queries over complex polygons with hybrid representations. GeoInformatica (2023), 1--39."},{"doi-asserted-by":"publisher","key":"e_1_2_1_57_1","DOI":"10.1109\/MDM52706.2021.00024"},{"key":"e_1_2_1_58_1","volume-title":"Advances in Database Technology: proceedings. International Conference on Extending Database Technology","volume":"25","author":"Teng D","year":"2022","unstructured":"D Teng, Y Liang, F Baig, J Kong, V Hoang, and F Wang. 2022. 3DPro: Querying Complex Three-Dimensional Data with Progressive Compression and Refinement.. In Advances in Database Technology: proceedings. International Conference on Extending Database Technology, Vol. 25. 
104--117."},{"doi-asserted-by":"publisher","key":"e_1_2_1_59_1","DOI":"10.1145\/3502221"},{"doi-asserted-by":"publisher","key":"e_1_2_1_60_1","DOI":"10.1145\/2666310.2666365"},{"doi-asserted-by":"publisher","key":"e_1_2_1_61_1","DOI":"10.1145\/2666310.2666365"},{"doi-asserted-by":"publisher","key":"e_1_2_1_62_1","DOI":"10.14778\/3229863.3236264"},{"doi-asserted-by":"publisher","key":"e_1_2_1_63_1","DOI":"10.4103\/2153-3539.83192"},{"doi-asserted-by":"publisher","key":"e_1_2_1_64_1","DOI":"10.1145\/2544105"},{"doi-asserted-by":"publisher","key":"e_1_2_1_65_1","DOI":"10.14778\/2350229.2350268"},{"doi-asserted-by":"publisher","key":"e_1_2_1_66_1","DOI":"10.14778\/2732967.2732976"},{"doi-asserted-by":"publisher","key":"e_1_2_1_67_1","DOI":"10.1145\/3325135"},{"doi-asserted-by":"publisher","key":"e_1_2_1_68_1","DOI":"10.1109\/ICDCS.2019.00025"},{"doi-asserted-by":"publisher","key":"e_1_2_1_69_1","DOI":"10.1145\/3503513"},{"doi-asserted-by":"publisher","key":"e_1_2_1_70_1","DOI":"10.1109\/ICDEW.2015.7129541"},{"doi-asserted-by":"publisher","key":"e_1_2_1_71_1","DOI":"10.1145\/2820783.2820860"},{"doi-asserted-by":"publisher","key":"e_1_2_1_72_1","DOI":"10.14778\/2536206.2536210"},{"doi-asserted-by":"publisher","key":"e_1_2_1_73_1","DOI":"10.14778\/2809974.2809984"}],"container-title":["Proceedings of the VLDB 
Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3685800.3685912","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,12,31]],"date-time":"2024-12-31T05:26:13Z","timestamp":1735622773000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3685800.3685912"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8]]},"references-count":73,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2024,8]]}},"alternative-id":["10.14778\/3685800.3685912"],"URL":"https:\/\/doi.org\/10.14778\/3685800.3685912","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2024,8]]},"assertion":[{"value":"2024-11-08","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}