{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T18:07:52Z","timestamp":1757614072206,"version":"3.44.0"},"reference-count":50,"publisher":"Association for Computing Machinery (ACM)","issue":"11","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2025,7]]},"abstract":"<jats:p>Though polar scientists entertain having huge amounts of publicly available datasets, they face the challenge that working with such data is a cumbersome process that requires downloading tons of unnecessary data and writing various scripts on top of it. This hinders their ability to perform any kind of interactive analysis. This paper presents Polaris; a novel open-source system infrastructure for Polar science that is highly Interactive and Scalable. Polaris is designed based on three observations that distinguish the query workload of polar scientists, namely, all queries are spatio-temporal, not all data are equal, and the large majority of queries are aggregates. Polaris is equipped with a hierarchical spatio-temporal index structure that stores precomputed aggregates for data of interest. 
Experimental results with a real Polaris prototype and real scientific data show that it achieves highly interactive and scalable data access, enabling interactive analysis of polar science data.<\/jats:p>","DOI":"10.14778\/3749646.3749719","type":"journal-article","created":{"date-parts":[[2025,9,4]],"date-time":"2025-09-04T17:55:06Z","timestamp":1757008506000},"page":"4644-4652","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["Polaris: An Interactive and Scalable Data Infrastructure for Polar Science"],"prefix":"10.14778","volume":"18","author":[{"given":"Yuchuan","family":"Huang","sequence":"first","affiliation":[{"name":"University of Minnesota, USA"}]},{"given":"Ana Elena","family":"Uribe","sequence":"additional","affiliation":[{"name":"University of Minnesota, USA"}]},{"given":"Youssef","family":"Hussein","sequence":"additional","affiliation":[{"name":"University of Minnesota, USA"}]},{"given":"Kareem","family":"Eldahshoury","sequence":"additional","affiliation":[{"name":"University of Minnesota, USA"}]},{"given":"Grant","family":"Ogren","sequence":"additional","affiliation":[{"name":"University of Minnesota, USA"}]},{"given":"Mohamed F.","family":"Mokbel","sequence":"additional","affiliation":[{"name":"University of Minnesota, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,9,4]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCSE.2021.3059437"},{"key":"e_1_2_1_2_1","first-page":"1961","article-title":"A Demonstration of ST-Hadoop: A MapReduce Framework for Big Spatio-temporal Data","volume":"10","author":"Alarabi Louai","year":"2017","unstructured":"Louai Alarabi and Mohamed F. Mokbel. 2017. A Demonstration of ST-Hadoop: A MapReduce Framework for Big Spatio-temporal Data. 
PVLDB 10, 12 (2017), 1961\u20131964.","journal-title":"PVLDB"},{"key":"e_1_2_1_3_1","volume-title":"Aref and Hanan Samet","author":"Walid","year":"1990","unstructured":"Walid G. Aref and Hanan Samet. 1990. Efficient Processing of Window Queries in The Pyramid Data Structure. In PODS (Nashville, TN, USA)."},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","unstructured":"Peter Baumann Andreas Dehmel Paula Furtado Roland Ritsch and Norbert Widmann. 1998. The Multidimensional Database System RasDaMan. In SIGMOD.","DOI":"10.1145\/276304.276386"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1186\/s40537-020-00399-2"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","unstructured":"Paul G. Brown. 2010. Overview of sciDB: large scale array storage processing and analysis. In SIGMOD (Indianapolis Indiana USA).","DOI":"10.1145\/1807167.1807271"},{"key":"e_1_2_1_7_1","unstructured":"CARRA [n.d.]. Arctic regional reanalysis on single levels from 1991 to present. https:\/\/cds.climate.copernicus.eu\/datasets\/reanalysis-carra-single-levels."},{"key":"e_1_2_1_8_1","unstructured":"Robert W. Carver and Alex Merose. [n.d.]. ARCO-ERA5: An Analysis-Ready Cloud-Optimized Reanalysis Dataset. https:\/\/github.com\/google-research\/arco-era5."},{"key":"e_1_2_1_9_1","unstructured":"CESM [n.d.]. NCAR Community Earth System Model. https:\/\/www.cesm.ucar.edu\/."},{"volume-title":"SIGMOD (New York","author":"Chen Lisi","key":"e_1_2_1_10_1","unstructured":"Lisi Chen, Gao Cong, and Xin Cao. 2013. An efficient query indexing mechanism for filtering geo-textual data. In SIGMOD (New York, NY, USA)."},{"key":"e_1_2_1_11_1","volume-title":"Kian-Lee Tan, and Mario A. Nascimento.","author":"Chen Su","year":"2008","unstructured":"Su Chen, Beng Chin Ooi, Kian-Lee Tan, and Mario A. Nascimento. 2008. ST2Btree: a self-tunable spatio-temporal b+-tree index for moving objects. In SIGMOD (Vancouver, BC, Canada)."},{"key":"e_1_2_1_12_1","unstructured":"CryoCloud [n.d.]. 
CryoCloud - Accelerating discovery and enhancing collaboration for NASA Cryosphere communities. https:\/\/cryointhecloud.com\/."},{"key":"e_1_2_1_13_1","unstructured":"Dask [n.d.]. Dask. https:\/\/www.dask.org\/."},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","unstructured":"Harish Doraiswamy Huy T. Vo Cl\u00e1udio T. Silva and Juliana Freire. 2016. A GPU-based index to support interactive spatio-temporal queries over historical data. In ICDE (Helsinki Finland).","DOI":"10.1109\/ICDE.2016.7498315"},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00382-019-05044-0"},{"key":"e_1_2_1_16_1","volume-title":"Mokbel","author":"Eldawy Ahmed","year":"2015","unstructured":"Ahmed Eldawy and Mohamed F. Mokbel. 2015. SpatialHadoop: A MapReduce framework for spatial data. In ICDE (Seoul, South Korea)."},{"key":"e_1_2_1_17_1","volume-title":"SHAHED: A MapReduce-based system for querying and visualizing spatio-temporal satellite data. In ICDE (Seoul, South Korea).","author":"Eldawy Ahmed","year":"2015","unstructured":"Ahmed Eldawy, Mohamed F. Mokbel, Saif Al-Harthi, Abdulhadi Alzaidy, Kareem Tarek, and Sohaib Ghani. 2015. SHAHED: A MapReduce-based system for querying and visualizing spatio-temporal satellite data. In ICDE (Seoul, South Korea)."},{"key":"e_1_2_1_18_1","unstructured":"ERA5 [n.d.]. ERA5 hourly data on single levels from 1940 to present. https:\/\/cds.climate.copernicus.eu\/datasets\/reanalysis-era5-single-levels."},{"key":"e_1_2_1_19_1","unstructured":"GeoTrellis [n.d.]. GeoTrellis. A geographic data processing engine for high performance applications. https:\/\/geotrellis.io\/."},{"key":"e_1_2_1_20_1","doi-asserted-by":"crossref","unstructured":"Yang Guo Zhiqi Wang Jin Xue and Zili Shao. 2024. A Spatio-Temporal Series Data Model with Efficient Indexing and Layout for Cloud-Based Trajectory Data Management. 
In ICDE (Utrecht The Netherlands).","DOI":"10.1109\/ICDE60146.2024.00313"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-004-0151-3"},{"key":"e_1_2_1_22_1","unstructured":"Hadoop [n.d.]. Apache Hadoop. https:\/\/hadoop.apache.org\/."},{"key":"e_1_2_1_23_1","unstructured":"Hans Hersbach Bill Bell Paul Berrisford Shoji Hirahara Andr\u00e1s Hor\u00e1nyi Joaqu\u00edn Mu\u00f1oz-Sabater Julien Nicolas Carole Peubey Raluca Radu Dinand Schepers et al. 2020. The ERA5 global reanalysis. Quarterly journal of the royal meteorological society 146 730 (2020) 1999\u20132049."},{"key":"e_1_2_1_24_1","unstructured":"ICESat-2 [n.d.]. Ice Cloud and land Elevation Satellite-2 (ICESat-2). https:\/\/icesat-2.gsfc.nasa.gov\/."},{"key":"e_1_2_1_25_1","unstructured":"iharp [n.d.]. iHARP: NSF HDR Institute for Harnessing Data and Model Revolution in the Polar Regions. https:\/\/iharp.umbc.edu\/."},{"key":"e_1_2_1_26_1","doi-asserted-by":"crossref","unstructured":"Lubos Krc\u00e1l and Shen-Shyang Ho. 2015. A SciDB-based Framework for Efficient Satellite Data Storage and Query based on Dynamic Atmospheric Event Trajectory. In SIGSPATIAL (Bellevue WA USA).","DOI":"10.1145\/2835185.2835190"},{"key":"e_1_2_1_27_1","volume-title":"JUST: JD Urban Spatio-Temporal Data Engine. In ICDE (Dallas, TX, USA).","author":"Li Ruiyuan","year":"2020","unstructured":"Ruiyuan Li, Huajun He, Rubin Wang, Yuchuan Huang, Junwen Liu, Sijie Ruan, Tianfu He, Jie Bao, and Yu Zheng. 2020. JUST: JD Urban Spatio-Temporal Data Engine. In ICDE (Dallas, TX, USA)."},{"key":"e_1_2_1_28_1","volume-title":"Mokbel","author":"Magdy Amr","year":"2017","unstructured":"Amr Magdy and Mohamed F. Mokbel. 2017. Demonstration of Kite: A Scalable System for Microblogs Data Management. 
In ICDE (San Diego, CA, USA)."},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10707-018-0329-2"},{"key":"e_1_2_1_30_1","first-page":"1098","article-title":"An ERA5-Based Hourly Global Pressure and Temperature (HGPT) Model. Remote","volume":"12","author":"Mateus Pedro","year":"2020","unstructured":"Pedro Mateus, Jo\u00e3o Catal\u00e3o Fernandes, Virg\u00edlio B. Mendes, and Giovanni Nico. 2020. An ERA5-Based Hourly Global Pressure and Temperature (HGPT) Model. Remote. Sens. 12, 7 (2020), 1098.","journal-title":"Sens."},{"key":"e_1_2_1_31_1","unstructured":"MERRA-2 [n.d.]. Modern-Era Retrospective analysis for Research and Applications Version 2. https:\/\/gmao.gsfc.nasa.gov\/reanalysis\/MERRA-2\/."},{"key":"e_1_2_1_32_1","first-page":"40","article-title":"Spatio-Temporal Access Methods","volume":"26","author":"Mokbel Mohamed F.","year":"2003","unstructured":"Mohamed F. Mokbel, Thanaa M. Ghanem, and Walid G. Aref. 2003. Spatio-Temporal Access Methods. IEEE Data Engineering Bulletin 26, 2 (2003), 40\u201349.","journal-title":"IEEE Data Engineering Bulletin"},{"key":"e_1_2_1_33_1","first-page":"46","article-title":"Spatio-Temporal Access Methods: Part 2 (2003 \u2013 2010)","volume":"33","author":"Nguyen-Dinh Long-Van","year":"2010","unstructured":"Long-Van Nguyen-Dinh, Walid G. Aref, and Mohamed F. Mokbel. 2010. Spatio-Temporal Access Methods: Part 2 (2003 \u2013 2010). IEEE Data Engineering Bulletin 33, 2 (2010), 46\u201355.","journal-title":"IEEE Data Engineering Bulletin"},{"key":"e_1_2_1_34_1","unstructured":"OpenDataCube [n.d.]. OpenDataCube - Open Source Earth Observation at Scale. https:\/\/www.opendatacube.org\/."},{"key":"e_1_2_1_35_1","unstructured":"Pangeo [n.d.]. Pangeo: A community for open reproducible scalable geoscience. https:\/\/www.pangeo.io\/."},{"key":"e_1_2_1_36_1","unstructured":"Dimitris Papadias Yufei Tao Panos Kalnis and Jun Zhang. 2002. Indexing Spatio-Temporal Data Warehouses. 
In ICDE (San Jose CA USA)."},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.14778\/3025111.3025117"},{"key":"e_1_2_1_38_1","doi-asserted-by":"crossref","unstructured":"Gary Planthaber Michael Stonebraker and James Frew. 2012. EarthDB: scalable analysis of MODIS data using SciDB. In SIGSPATIAL (Redondo Beach CA USA).","DOI":"10.1145\/2447481.2447483"},{"key":"e_1_2_1_39_1","unstructured":"POLARIS [n.d.]. POLARIS first System release. https:\/\/iharpv.cs.umn.edu\/."},{"key":"e_1_2_1_40_1","doi-asserted-by":"crossref","unstructured":"Keven Richly Rainer Schlosser and Martin Boissier. 2021. Joint Index Sorting and Compression Optimization for Memory-Efficient Spatio-Temporal Data Management. In ICDE (Chania Greece).","DOI":"10.1109\/ICDE51399.2021.00174"},{"key":"e_1_2_1_41_1","unstructured":"Sedona [n.d.]. Apache Sedona. https:\/\/sedona.apache.org\/."},{"key":"e_1_2_1_42_1","unstructured":"Spark [n.d.]. Apache Spark - Unified Engine for large-scale data analytics. https:\/\/spark.apache.org\/."},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.3389\/fclim.2021.782909"},{"key":"e_1_2_1_44_1","unstructured":"TileDB [n.d.]. TileDB - The Universal Storage Engine. https:\/\/docs.tiledb.com\/main."},{"key":"e_1_2_1_45_1","volume-title":"Christopher Barnard, Ruth Coughlan, Jesus San-Miguel-Ayanz, Giorgio Libert\u00e1, and Blazej Krzeminski.","author":"Vitolo Claudia","year":"2020","unstructured":"Claudia Vitolo, Francesca Di Giuseppe, Christopher Barnard, Ruth Coughlan, Jesus San-Miguel-Ayanz, Giorgio Libert\u00e1, and Blazej Krzeminski. 2020. ERA5-based global meteorological wildfire danger maps. Scientific data 7, 1 (2020), 216."},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.rse.2022.113181"},{"key":"e_1_2_1_47_1","unstructured":"Xarray [n.d.]. Xarray. https:\/\/xarray.dev\/."},{"key":"e_1_2_1_48_1","doi-asserted-by":"crossref","unstructured":"Jia Yu Jinxuan Wu and Mohamed Sarwat. 2016. 
A demonstration of GeoSpark: A cluster computing framework for processing big spatial data. In ICDE (Helsinki Finland).","DOI":"10.1109\/ICDE.2016.7498357"},{"key":"e_1_2_1_49_1","first-page":"1247","article-title":"ChronosDB: Distributed, File Based","volume":"11","author":"Rodriges Zalipynis Ramon Antonio","year":"2018","unstructured":"Ramon Antonio Rodriges Zalipynis. 2018. ChronosDB: Distributed, File Based, Geospatial Array DBMS. PVLDB 11, 10 (2018), 1247\u20131261.","journal-title":"Geospatial Array DBMS. PVLDB"},{"key":"e_1_2_1_50_1","first-page":"3186","article-title":"Array DBMS: Past, Present, and (Near) Future","volume":"14","author":"Rodriges Zalipynis Ramon Antonio","year":"2021","unstructured":"Ramon Antonio Rodriges Zalipynis. 2021. Array DBMS: Past, Present, and (Near) Future. PVLDB 14, 12 (2021), 3186\u20133189.","journal-title":"PVLDB"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/3749646.3749719","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,5]],"date-time":"2025-09-05T03:31:17Z","timestamp":1757043077000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/3749646.3749719"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7]]},"references-count":50,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2025,7]]}},"alternative-id":["10.14778\/3749646.3749719"],"URL":"https:\/\/doi.org\/10.14778\/3749646.3749719","relation":{},"ISSN":["2150-8097"],"issn-type":[{"type":"print","value":"2150-8097"}],"subject":[],"published":{"date-parts":[[2025,7]]},"assertion":[{"value":"2025-09-04","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}