{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T15:34:30Z","timestamp":1760369670410,"version":"build-2065373602"},"reference-count":36,"publisher":"MDPI AG","issue":"11","license":[{"start":{"date-parts":[[2020,11,19]],"date-time":"2020-11-19T00:00:00Z","timestamp":1605744000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100000002","name":"National Institutes of Health","doi-asserted-by":"publisher","award":["T32CA163184"],"award-info":[{"award-number":["T32CA163184"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"<jats:p>Technologies around the world produce and interact with geospatial data instantaneously, from mobile web applications to satellite imagery that is collected and processed across the globe daily. Big raster data allow researchers to integrate and uncover new knowledge about geospatial patterns and processes. However, we are at a critical moment, as we have an ever-growing number of big data platforms that are being co-opted to support spatial analysis. A gap in the literature is the lack of a robust assessment comparing the efficiency of raster data analysis on big data platforms. This research begins to address this issue by establishing a raster data benchmark that employs freely accessible datasets to provide a comprehensive performance evaluation and comparison of raster operations on big data platforms. The benchmark is critical for evaluating the performance of spatial operations on big data platforms. The benchmarking datasets and operations are applied to three big data platforms. We report computing times and performance bottlenecks so that GIScientists can make informed choices regarding the performance of each platform. 
Each platform is evaluated for five raster operations: pixel count, reclassification, raster add, focal averaging, and zonal statistics using three different raster datasets.<\/jats:p>","DOI":"10.3390\/ijgi9110690","type":"journal-article","created":{"date-parts":[[2020,11,19]],"date-time":"2020-11-19T06:23:52Z","timestamp":1605767032000},"page":"690","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Developing the Raster Big Data Benchmark: A Comparison of Raster Analysis on Big Data Platforms"],"prefix":"10.3390","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-6858-428X","authenticated-orcid":false,"given":"David","family":"Haynes","sequence":"first","affiliation":[{"name":"Institute for Health Informatics, University of Minnesota, Minneapolis, MN 55455, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0146-2624","authenticated-orcid":false,"given":"Philip","family":"Mitchell","sequence":"additional","affiliation":[{"name":"Ali I. Al-Naimi Petroleum Engineering Research Center, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia"}]},{"given":"Eric","family":"Shook","sequence":"additional","affiliation":[{"name":"Geography Environment and Society, University of Minnesota, Minneapolis, MN 55455, USA"}]}],"member":"1968","published-online":{"date-parts":[[2020,11,19]]},"reference":[{"key":"ref_1","unstructured":"Boshuizen, C., Mason, J., Klupar, P., and Spanhake, S. (2014). Results from the planet labs flock constellation."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1080\/17538947.2016.1239771","article-title":"Big Data and cloud computing: Innovation opportunities and challenges","volume":"10","author":"Yang","year":"2017","journal-title":"Int. J. Digit. Earth"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Haynes, D. (2020, November 19). Array Databases. 
Geographic Information Science Technologies Body of Knowledge. Available online: https:\/\/gistbok.ucgis.org\/bok-topics\/array-databases.","DOI":"10.22224\/gistbok\/2019.3.2"},{"key":"ref_4","unstructured":"Ding, M., Yang, M., and Chen, S. (2019). Storing and Querying Large-Scale Spatio-Temporal Graphs with High-Throughput Edge Insertions. arXiv."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Arnold, J., Glavic, B., and Raicu, I. (2019, January 20\u201324). A High-Performance Distributed Relational Database System for Scalable OLAP Processing. Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil.","DOI":"10.1109\/IPDPS.2019.00083"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Palamuttam, R., Mogrovejo, R.M., Mattmann, C., Wilson, B., Whitehall, K., Verma, R., McGibbney, L., and Ramirez, P. (November, January 29). SciSpark: Applying in-memory distributed computing to weather event detection and tracking. Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA.","DOI":"10.1109\/BigData.2015.7363983"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Wang, W., Liu, T., Tang, D., Liu, H., Li, W., and Lee, R. (2016, January 8\u201310). SparkArray: An Array-Based Scientific Data Management System Built on Apache Spark. Proceedings of the 2016 IEEE International Conference on Networking, Architecture and Storage (NAS), Long Beach, CA, USA.","DOI":"10.1109\/NAS.2016.7549422"},{"key":"ref_8","unstructured":"Wang, G., Zomaya, A., Martinez, G., and Li, K. (2015, January 18\u201320). FASTDB: An Array Database System for Efficient Storing and Analyzing Massive Scientific Data. Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing, Zhangjiajie, China."},{"key":"ref_9","unstructured":"Appel, M., Lahn, F., Pebesma, E., Buytaert, W., and Moulds, S. (2016, January 17\u201322). 
Scalable earth-observation analytics for geoscientists: Spacetime extensions to the array database SciDB. Proceedings of the EGU General Assembly Conference Abstracts, Vienna, Austria."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Jiang, L., Kawashima, H., and Tatebe, O. (2016, January 23\u201327). Fast window aggregate on array database by recursive incremental computation. Proceedings of the 2016 IEEE 12th International Conference on e-Science (e-Science), Baltimore, MD, USA.","DOI":"10.1109\/eScience.2016.7870890"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Lu, M., Appel, M., and Pebesma, E.J. (2018). Multidimensional Arrays for Analysing Geoscientific Data. ISPRS Int. J. Geo-Information, 7.","DOI":"10.3390\/ijgi7080313"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Planthaber, G., Stonebraker, M., and Frew, J. (2012, January 6\u20139). EarthDB: Scalable analysis of MODIS data using SciDB. Proceedings of the 1st ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, Redondo Beach, CA, USA.","DOI":"10.1145\/2447481.2447483"},{"key":"ref_13","unstructured":"Karmas, A., Karantzalos, K., and Athanasiou, S. (2014, January 15). Online analysis of remote sensing data for agricultural applications. Proceedings of the OSGeo\u2019s European Conference on Free and Open Source Software for Geospatial, Bremen, Germany."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"328","DOI":"10.1016\/j.isprsjprs.2018.08.007","article-title":"Big earth observation time series analysis for monitoring Brazilian agriculture","volume":"145","author":"Picoli","year":"2018","journal-title":"ISPRS J. Photogramm. Remote Sens."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"486","DOI":"10.1080\/22797254.2018.1451782","article-title":"Using Google Earth Engine to detect land cover change: Singapore as a use case","volume":"51","author":"Sidhu","year":"2018","journal-title":"Eur. J. Remote. 
Sens."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Eldawy, A., and Mokbel, M.F. (2015, January 15\u201318). The era of big spatial data. Proceedings of the 2015 31st IEEE International Conference on Data Engineering Workshops, Pittsburgh, PA, USA.","DOI":"10.1109\/ICDEW.2015.7129542"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Doan, K., Oloso, A.O., Kuo, K.-S., Clune, T.L., Yu, H., Nelson, B., and Zhang, J. (2016, January 5\u20138). Evaluating the impact of data placement to spark and SciDB with an Earth Science use case. Proceedings of the 2016 IEEE International Conference on Big Data (Big Data), Washington, DC, USA.","DOI":"10.1109\/BigData.2016.7840621"},{"key":"ref_18","first-page":"111","article-title":"A New Initiative for Tiling, Stitching and Processing Geospatial Big Data in Distributed Computing Environments. ISPRS Ann. Photogramm","volume":"3","author":"Olasz","year":"2016","journal-title":"Remote Sens. Spat. Inf. Sci."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"546","DOI":"10.1111\/tgis.12286","article-title":"Terra Populus\u2019 Architecture for Integrated Big Geospatial Services","volume":"21","author":"Haynes","year":"2017","journal-title":"Trans. GIS"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Wiener, P., Simko, V., and Nimis, J. (2017, January 27\u201328). Taming the Evolution of Big Data and its Technologies in BigGIS A Conceptual Architectural Framework for Spatio-Temporal Analytics at Scale. Proceedings of the 3rd International Conference on Geographical Information Systems Theory, Applications and Management, Porto, Portugal.","DOI":"10.5220\/0006334200900101"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Ray, S., Simion, B., and Brown, A.D. (2011, January 11\u201316). Jackpine: A benchmark to evaluate spatial database performance. 
Proceedings of the 2011 IEEE 27th International Conference on Data Engineering, Hannover, Germany.","DOI":"10.1109\/ICDE.2011.5767929"},{"key":"ref_22","unstructured":"Baru, C., Bhandarkar, M., Nambiar, R., Poess, M., and Rabl, T. (2015, January 14\u201315). Big data benchmarking. Proceedings of the 6th International Workshop, WBDB 2015, Toronto, ON, Canada, 16\u201317 June 2015 and 7th International Workshop, WBDB 2015, New Delhi, India. Revised Selected Papers."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2629","DOI":"10.3390\/rs2112629","article-title":"DEM Development from Ground-Based LiDAR Data: A Method to Remove Non-Surface Objects","volume":"2","author":"Sharma","year":"2010","journal-title":"Remote. Sens."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"669","DOI":"10.1080\/02693799608902104","article-title":"Spatial strategies for parallel spatial modelling","volume":"10","author":"Ding","year":"1996","journal-title":"Int. J. Geogr. Inf. Syst."},{"key":"ref_25","unstructured":"Stonebraker, M., Brown, P., Poliakov, A., and Raman, S. (2018, January 25\u201329). The Architecture of SciDB. Proceedings of the Public-Key Cryptography PKC 2018, Janeiro, Brazil."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Camara, G., Assis, L.F., Ribeiro, G., Ferreira, K.R., Llapa, E., and Vinhas, L. (2016, January 31). Big earth observation data analytics: Matching requirements to system architectures. Proceedings of the 5th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data\u2014BigSpatial\u201916, San Francisco, CA, USA.","DOI":"10.1145\/3006386.3006393"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"227","DOI":"10.1016\/j.isprsjprs.2016.03.007","article-title":"Spatio-temporal change detection from multidimensional arrays: Detecting deforestation from MODIS time series","volume":"117","author":"Lu","year":"2016","journal-title":"ISPRS J. Photogramm. 
Remote Sens."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1080\/17538947.2014.1003106","article-title":"Big Data Analytics for Earth Sciences: The EarthServer approach","volume":"9","author":"Baumann","year":"2015","journal-title":"Int. J. Digit. Earth"},{"key":"ref_29","unstructured":"National Institute of Space Research (2019, June 01). E-Sensing: Big Earth Observation Data Analytics for LUCC. Available online: http:\/\/esensing.org\/."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Gu, L., and Li, H. (2013, January 13\u201315). Memory or Time: Performance Evaluation for Iterative Operation on Hadoop and Spark. Proceedings of the 2013 IEEE 10th International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC_EUC), Zhangjiajie, China.","DOI":"10.1109\/HPCC.and.EUC.2013.106"},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"62","DOI":"10.1109\/MCSE.2014.80","article-title":"XSEDE: Accelerating Scientific Discovery","volume":"16","author":"Towns","year":"2014","journal-title":"Comput. Sci. Eng."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"31","DOI":"10.1016\/j.cageo.2013.05.005","article-title":"Parallel scanline algorithm for rapid rasterization of vector geographic data","volume":"59","author":"Wang","year":"2013","journal-title":"Comput. Geosci."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Eldawy, A., Niu, L., Haynes, D., and Su, Z. (2017, January 7\u201310). Large Scale Analytics of Vector+Raster Big Spatial Data. Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems\u2014SIGSPATIAL\u201917, Redondo Beach, CA, USA.","DOI":"10.1145\/3139958.3140042"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Yang, H.-C., Dasdan, A., Hsiao, R.-L., and Parker, D.S. (2007, January 12\u201314). Map-reduce-merge. 
Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data\u2014SIGMOD\u201907, Beijing, China.","DOI":"10.1145\/1247480.1247602"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Afrati, F.N., and Ullman, J.D. (2010, January 22\u201326). Optimizing joins in a map-reduce environment. Proceedings of the 13th International Conference on Extending Database Technology\u2014EDBT\u201910, Lausanne, Switzerland.","DOI":"10.1145\/1739041.1739056"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1007\/s10707-018-0330-9","article-title":"Spatial data management in apache spark: The GeoSpark perspective and beyond","volume":"23","author":"Yu","year":"2019","journal-title":"GeoInformatica"}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/9\/11\/690\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T10:34:17Z","timestamp":1760178857000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/9\/11\/690"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020,11,19]]},"references-count":36,"journal-issue":{"issue":"11","published-online":{"date-parts":[[2020,11]]}},"alternative-id":["ijgi9110690"],"URL":"https:\/\/doi.org\/10.3390\/ijgi9110690","relation":{},"ISSN":["2220-9964"],"issn-type":[{"type":"electronic","value":"2220-9964"}],"subject":[],"published":{"date-parts":[[2020,11,19]]}}}