{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,15]],"date-time":"2026-01-15T00:24:31Z","timestamp":1768436671371,"version":"3.49.0"},"reference-count":62,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2022,1,22]],"date-time":"2022-01-22T00:00:00Z","timestamp":1642809600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["42130113"],"award-info":[{"award-number":["42130113"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Strategic Priority Research Program of the Chinese Academy of Sciences","award":["XDA19040504"],"award-info":[{"award-number":["XDA19040504"]}]},{"name":"Basic Research Innovative Groups of Gansu Province, China","award":["21JR7RA068"],"award-info":[{"award-number":["21JR7RA068"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>As a result of Earth observation (EO) entering the era of big data, a significant challenge relating to the storage, analysis, and visualization of a massive amount of remote sensing (RS) data must be addressed. In this paper, we propose a novel scalable computing resources system to achieve high-speed processing of RS big data in a parallel distributed architecture. To reduce data movement among computing nodes, the Hadoop Distributed File System (HDFS) is established on nodes of K8s, which are also used for computing. In the process of RS data analysis, we innovatively use the tile-oriented programming model instead of the traditional strip-oriented or pixel-oriented approach to better implement parallel computing in a Spark on Kubernetes (K8s) cluster. 
A large RS raster layer can be abstracted as a user-defined tile format of any size, so that a whole computing task can be divided into multiple distributed parallel tasks. The computing resources applied by users would be immediately assigned in the Spark on K8s cluster by simply configuring and initializing SparkContext through a web-based Jupyter notebook console. Users can easily query, write, or visualize data in any box size from the catalog module in GeoPySpark. In summary, the system proposed in this study can provide a distributed scalable resources system for assembling big data storage, parallel computing, and real-time visualization.<\/jats:p>","DOI":"10.3390\/rs14030521","type":"journal-article","created":{"date-parts":[[2022,1,23]],"date-time":"2022-01-23T20:34:40Z","timestamp":1642970080000},"page":"521","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":18,"title":["A Scalable Computing Resources System for Remote Sensing Big Data Processing Using GeoPySpark Based on Spark on K8s"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2329-5668","authenticated-orcid":false,"given":"Jifu","family":"Guo","sequence":"first","affiliation":[{"name":"Key Laboratory of Remote Sensing of Gansu Province, Heihe Remote Sensing Experimental Research Station, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China"},{"name":"University of Chinese Academy of Sciences, Beijing 100049, China"},{"name":"College of Information Science and Technology, Gansu Agricultural University, Lanzhou 730070, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1366-5170","authenticated-orcid":false,"given":"Chunlin","family":"Huang","sequence":"additional","affiliation":[{"name":"Key Laboratory of Remote Sensing of Gansu Province, Heihe Remote Sensing Experimental Research Station, Northwest Institute of Eco-Environment and Resources, Chinese 
Academy of Sciences, Lanzhou 730000, China"}]},{"given":"Jinliang","family":"Hou","sequence":"additional","affiliation":[{"name":"Key Laboratory of Remote Sensing of Gansu Province, Heihe Remote Sensing Experimental Research Station, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China"}]}],"member":"1968","published-online":{"date-parts":[[2022,1,22]]},"reference":[{"key":"ref_1","first-page":"1211","article-title":"Automatic analysis and mining of remote sensing big data","volume":"43","author":"Deren","year":"2014","journal-title":"Acta Geod. Cartogr. Sin."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"47","DOI":"10.1016\/j.future.2014.10.029","article-title":"Remote sensing big data computing: Challenges and opportunities","volume":"51","author":"Ma","year":"2015","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_3","unstructured":"Skytland, N. (2012). Big data: What is nasa doing with big data today. Open. Gov. Open Access Artic., Available online: https:\/\/www.opennasa.org\/what-is-nasa-doing-with-big-data-today.html."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1109\/JSTARS.2011.2106332","article-title":"Foreword to the special issue on \u201chuman settlements: A global remote sensing challenge\u201d","volume":"4","author":"Gamba","year":"2011","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Stromann, O., Nascetti, A., Yousif, O., and Ban, Y. (2020). Dimensionality Reduction and Feature Selection for Object-Based Land Cover Classification based on Sentinel-1 and Sentinel-2 Time Series Using Google Earth Engine. 
Remote Sens., 12.","DOI":"10.3390\/rs12010076"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"101","DOI":"10.1111\/j.1467-9671.2010.01205.x","article-title":"Moving code in spatial data infrastructures\u2013web service based deployment of geoprocessing algorithms","volume":"14","author":"Bernard","year":"2010","journal-title":"Trans. GIS"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Camara, G., Assis, L.F., Ribeiro, G., Ferreira, K.R., Llapa, E., and Vinhas, L. (2016, January 31). Big earth observation data analytics: Matching requirements to system architectures. Proceedings of the 5th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, Burlingame, CA, USA.","DOI":"10.1145\/3006386.3006393"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Gomes, V.C.F., Queiroz, G.R., and Ferreira, K.R. (2020). An Overview of Platforms for Big Earth Observation Data Management and Analysis. Remote Sens., 12.","DOI":"10.3390\/rs12081253"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Mell, P., and Grance, T. (2011). The NIST Definition of Cloud Computing.","DOI":"10.6028\/NIST.SP.800-145"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Mutanga, O., and Kumar, L. (2019). Google Earth Engine Applications. Remote Sens., 11.","DOI":"10.3390\/rs11050591"},{"key":"ref_11","unstructured":"White, T. (2012). Hadoop: The Definitive Guide, O\u2019Reilly Media, Inc."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Jo, J., and Lee, K.-W. (2018). High-performance geospatial big data processing system based on MapReduce. ISPRS Int. J. Geo-Inf., 7.","DOI":"10.3390\/ijgi7100399"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Cary, A., Sun, Z., Hristidis, V., and Rishe, N. (2009, January 2\u20134). Experiences on processing spatial data with mapreduce. 
Proceedings of the International Conference on Scientific and Statistical Database Management, New Orleans, LA, USA.","DOI":"10.1007\/978-3-642-02279-1_24"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1230","DOI":"10.14778\/2536274.2536283","article-title":"A demonstration of spatialhadoop: An efficient mapreduce framework for spatial data","volume":"6","author":"Eldawy","year":"2013","journal-title":"Proc. VLDB Endow."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1016\/j.cag.2015.03.003","article-title":"A framework for processing large scale geospatial and remote sensing data in MapReduce environment","volume":"49","author":"Giachetta","year":"2015","journal-title":"Comput. Graph."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Aji, A., Wang, F., Vo, H., Lee, R., Liu, Q., Zhang, X., and Saltz, J. (2013, January 26\u201330). Hadoop-GIS: A high performance spatial data warehousing system over MapReduce. Proceedings of the VLDB Endowment International Conference on Very Large Data Bases, Trento, Italy.","DOI":"10.14778\/2536222.2536227"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1109\/JSTARS.2016.2603120","article-title":"A new cloud computing architecture for the classification of remote sensing data","volume":"10","author":"Quirita","year":"2016","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1109\/JSTARS.2016.2547020","article-title":"In-memory parallel processing of massive remotely sensed data using an apache spark on hadoop yarn model","volume":"10","author":"Huang","year":"2016","journal-title":"IEEE J. Sel. Top. Appl. Earth Obs. 
Remote Sens."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1016\/j.future.2016.06.009","article-title":"pipsCloud: High performance cloud computing for remote sensing big data management and processing","volume":"78","author":"Wang","year":"2018","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Warmerdam, F. (2008). The geospatial data abstraction library. Open Source Approaches in Spatial Data Handling, Springer.","DOI":"10.1007\/978-3-540-74831-1_5"},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2075057","DOI":"10.1155\/2018\/2075057","article-title":"Spark Sensing: A Cloud Computing Framework to Unfold Processing Efficiencies for Large and Multiscale Remotely Sensed Data, with Examples on Landsat 8 and MODIS Data","volume":"2018","author":"Lan","year":"2018","journal-title":"J. Sens."},{"key":"ref_22","first-page":"93","article-title":"A review study of apache spark in big data processing","volume":"4","author":"Jonnalagadda","year":"2016","journal-title":"Int. J. Comput. Sci. Trends Technol. IJCST"},{"key":"ref_23","first-page":"301","article-title":"Apache spark and big data analytics for solving real world problems","volume":"4","author":"Ghatge","year":"2016","journal-title":"Int. J. Comput. Sci. Trends Technol."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"630","DOI":"10.1007\/s10766-017-0513-2","article-title":"Real-time big data stream processing using GPU with spark over hadoop ecosystem","volume":"46","author":"Rathore","year":"2018","journal-title":"Int. J. Parallel Program."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Tian, F., Wu, B., Zeng, H., Zhang, X., and Xu, J. (2019). Efficient identification of corn cultivation area with multitemporal synthetic aperture radar and optical images in the google earth engine cloud platform. 
Remote Sens., 11.","DOI":"10.3390\/rs11060629"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Sun, Z., Chen, F., Chi, M., and Zhu, Y. (2015, January 8\u20139). A spark-based big data platform for massive remote sensing data processing. Proceedings of the International Conference on Data Science, Sydney, Australia.","DOI":"10.1007\/978-3-319-24474-7_17"},{"key":"ref_27","unstructured":"Docker (2021, November 19). Docker Overview. Available online: https:\/\/docs.docker.com\/get-started\/overview."},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Bhimani, J., Yang, Z., Leeser, M., and Mi, N. (2017, January 12\u201314). Accelerating big data applications using lightweight virtualization framework on enterprise cloud. Proceedings of the 2017 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA.","DOI":"10.1109\/HPEC.2017.8091086"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"3566","DOI":"10.1109\/TII.2020.3022843","article-title":"Evaluating docker for lightweight virtualization of distributed and time-sensitive applications in industrial automation","volume":"17","author":"Sollfrank","year":"2020","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Zhang, Q., Liu, L., Pu, C., Dou, Q., Wu, L., and Zhou, W. (2018, January 2\u20137). A comparative study of containers and virtual machines in big data environment. Proceedings of the 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), San Francisco, CA, USA.","DOI":"10.1109\/CLOUD.2018.00030"},{"key":"ref_31","unstructured":"Cloud Native Computing Foundation (2021, November 19). Overview. Available online: https:\/\/kubernetes.io."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Thurgood, B., and Lennon, R.G. (2019, January 1\u20132). Cloud computing with Kubernetes cluster elastic scaling. 
Proceedings of the 3rd International Conference on Future Networks and Distributed Systems, Paris, France.","DOI":"10.1145\/3341325.3341995"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Vithlani, H.N., Dogotari, M., Lam, O.H.Y., Pr\u00fcm, M., Melville, B., Zimmer, F., and Becker, R. (2020, January 7\u20139). Scale Drone Mapping on K8S: Auto-scale Drone Imagery Processing on Kubernetes-orchestrated On-premise Cloud-computing Platform. Proceedings of the GISTAM, Prague, Czech Republic.","DOI":"10.5220\/0009816003180325"},{"key":"ref_34","unstructured":"Jacob, A., Vicente-Guijalba, F., Kristen, H., Costa, A., Ventura, B., Monsorno, R., and Notarnicola, C. (2017, January 28\u201330). Organizing Access to Complex Multi-Dimensional Data: An Example From The Esa Seom Sincohmap Project. Proceedings of the 2017 Conference on Big Data from Space, Toulouse, France."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Huang, W., Zhou, J., and Zhang, D. (2021). On-the-Fly Fusion of Remotely-Sensed Big Data Using an Elastic Computing Paradigm with a Containerized Spark Engine on Kubernetes. Sensors, 21.","DOI":"10.3390\/s21092971"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Guo, Z., Fox, G., and Zhou, M. (2012, January 13\u201316). Investigation of data locality in mapreduce. Proceedings of the 2012 12th IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), Washington, DC, USA.","DOI":"10.1109\/CCGrid.2012.42"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"10","DOI":"10.5334\/jors.148","article-title":"xarray: ND labeled arrays and datasets in Python","volume":"5","author":"Hoyer","year":"2017","journal-title":"J. Open Res. 
Softw."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1016\/j.future.2017.11.007","article-title":"A versatile data-intensive computing platform for information retrieval from big geospatial data","volume":"81","author":"Soille","year":"2018","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_39","unstructured":"Open Data Cube (2022, January 02). Available online: https:\/\/www.sentinel-hub.com\/."},{"key":"ref_40","doi-asserted-by":"crossref","unstructured":"Eldawy, A. (2014, January 22\u201327). SpatialHadoop: Towards flexible and scalable spatial processing using mapreduce. Proceedings of the 2014 SIGMOD PhD Symposium, Snowbird, UT, USA.","DOI":"10.1145\/2602622.2602625"},{"key":"ref_41","unstructured":"AS Foundation (2020, September 10). Running Spark on Kubernetes. Available online: http:\/\/spark.apache.org\/docs\/latest\/running-on-kubernetes.html."},{"key":"ref_42","unstructured":"Bouffard, J., and McClean, J. (2021, November 19). What Is GeoPySpark?. Available online: https:\/\/geopyspark.readthedocs.io\/en\/latest\/."},{"key":"ref_43","unstructured":"Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauly, M., Franklin, M.J., Shenker, S., and Stoica, I. (2012, January 25\u201327). Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. Proceedings of the 9th {USENIX} Symposium on Networked Systems Design and Implementation ({NSDI} 12), San Jose, CA, USA."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"100","DOI":"10.5623\/cig2017-203","article-title":"Web Mercator and raster tile maps: Two cornerstones of online map service providers","volume":"71","author":"Stefanakis","year":"2017","journal-title":"Geomatica"},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Dungan, W., Stenger, A., and Sutty, G. (1978, January 23\u201325). Texture tile considerations for raster graphics. 
Proceedings of the 5th Annual Conference on Computer Graphics and Interactive Techniques, New York, NY, USA.","DOI":"10.1145\/800248.807383"},{"key":"ref_46","unstructured":"C Foundation (2021, November 29). Intro to Ceph. Available online: https:\/\/docs.ceph.com\/en\/latest\/cephfs\/index.html."},{"key":"ref_47","unstructured":"TL Foundation (2021, November 29). Storage Classes. Available online: https:\/\/kubernetes.io\/docs\/concepts\/storage\/storage-classes\/."},{"key":"ref_48","unstructured":"TL Foundation (2021, November 29). Persistent Volumes. Available online: https:\/\/kubernetes.io\/docs\/concepts\/storage\/persistent-volumes\/."},{"key":"ref_49","unstructured":"AS Foundation (2021, September 16). HDFS Architecture Guide. Available online: https:\/\/hadoop.apache.org\/docs\/r1.2.1\/-hdfs_design.pdf."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1016\/j.ins.2014.01.015","article-title":"Data-intensive applications, challenges, techniques and technologies: A survey on Big Data","volume":"275","author":"Chen","year":"2014","journal-title":"Inf. Sci."},{"key":"ref_51","unstructured":"Azavea Inc. (2019, December 20). What Is GeoTrellis?. Available online: https:\/\/geotrellis.io\/documentation."},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1109\/MCSE.2007.55","article-title":"Matplotlib: A 2D graphics environment","volume":"9","author":"Hunter","year":"2007","journal-title":"Comput. Sci. Eng."},{"key":"ref_53","unstructured":"TL Foundation (2021, November 29). What Is Helm?. Available online: https:\/\/helm.sh\/docs."},{"key":"ref_54","unstructured":"Pete, L. (2021, November 28). Haproxy Ingress. 
Available online: https:\/\/haproxy-ingress.github.io\/."},{"key":"ref_55","doi-asserted-by":"crossref","first-page":"2374","DOI":"10.1080\/01431161.2019.1688419","article-title":"Non-stationary and unequally spaced NDVI time series analyses by the LSWAVE software","volume":"41","author":"Ghaderpour","year":"2020","journal-title":"Int. J. Remote Sens."},{"key":"ref_56","unstructured":"Zhao, Y. (2003). Principles and Methods of Remote Sensing Application Analysis, Science Press."},{"key":"ref_57","unstructured":"Vermote, E.F., Roger, J.C., and Ray, J.P. (2021, November 29). MODIS Surface Reflectance User\u2019s Guide, Available online: https:\/\/lpdaac.usgs.gov\/documents\/306\/MOD09_User_Guide_V6.pdf."},{"key":"ref_58","unstructured":"Ackerman, S., and Frey, R. (2015). MODIS atmosphere L2 cloud mask product, NASA MODIS Adaptive Processing System."},{"key":"ref_59","first-page":"309","article-title":"Monitoring vegetation systems in the Great Plains with ERTS","volume":"351","author":"Rouse","year":"1974","journal-title":"NASA Spec. Publ."},{"key":"ref_60","first-page":"357","article-title":"The conceptual model of the hybrid geographic information system based on kubernetes containers and cloud computing","volume":"20","author":"Gazul","year":"2020","journal-title":"Int. Multidiscip. Sci. GeoConference SGEM"},{"key":"ref_61","unstructured":"Aliyun (2021, November 28). Container repository service. Available online: https:\/\/cn.aliyun.com."},{"key":"ref_62","unstructured":"Foundation, A.S. (2021, September 15). Tuning Spark. 
Available online: http:\/\/spark.apache.org\/docs\/latest\/tuning.html#tuning-spark."}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/3\/521\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T22:05:50Z","timestamp":1760133950000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/14\/3\/521"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,22]]},"references-count":62,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,2]]}},"alternative-id":["rs14030521"],"URL":"https:\/\/doi.org\/10.3390\/rs14030521","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,22]]}}}