{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:01:29Z","timestamp":1760241689293,"version":"build-2065373602"},"reference-count":27,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2018,7,12]],"date-time":"2018-07-12T00:00:00Z","timestamp":1531353600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Efficient utilization of resources plays an important role in the performance of large scale task processing. In cases where heterogeneous types of resources are used within the same application, it is hard to achieve good utilization of all of the different types of resources. By taking advantage of recent developments in cloud infrastructure that enable the use of dynamic clusters of resources, and by dynamically altering the size of the available resources for all the different resource types, the overall utilization of resources, however, can be improved. Starting from this premise, this paper discusses a solution that aims to provide a generic algorithm to estimate the desired ratios of instance processing tasks as well as ratios of the resources that are used by these instances, without the necessity for trial runs or a priori knowledge of the execution steps. These ratios are then used as part of an adaptive system that is able to reconfigure itself to maximize utilization. To verify the solution, a reference framework which adaptively manages clusters of functionally different VMs to host a calculation scenario is implemented. Experiments are conducted based on a compute-heavy use case in which the probability of underground pipeline failures is determined based on the settlement of soils. 
These experiments show that the solution is capable of eliminating large amounts of under-utilization, resulting in increased throughput and lower lead times.<\/jats:p>","DOI":"10.3390\/bdcc2030015","type":"journal-article","created":{"date-parts":[[2018,7,12]],"date-time":"2018-07-12T06:39:59Z","timestamp":1531377599000},"page":"15","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Adaptive Provisioning of Heterogeneous Cloud Resources for Big Data Processing"],"prefix":"10.3390","volume":"2","author":[{"given":"Maarten","family":"Kollenstart","sequence":"first","affiliation":[{"name":"Monitoring and Control Systems, TNO Groningen, Eemsgolaan 3, 9727 DW Groningen, The Netherlands"}]},{"given":"Edwin","family":"Harmsma","sequence":"additional","affiliation":[{"name":"Monitoring and Control Systems, TNO Groningen, Eemsgolaan 3, 9727 DW Groningen, The Netherlands"}]},{"given":"Erik","family":"Langius","sequence":"additional","affiliation":[{"name":"Monitoring and Control Systems, TNO Groningen, Eemsgolaan 3, 9727 DW Groningen, The Netherlands"}]},{"given":"Vasilios","family":"Andrikopoulos","sequence":"additional","affiliation":[{"name":"Faculty of Science and Engineering, University of Groningen, Nijenborgh 9, 9747 AG Groningen, The Netherlands"}]},{"given":"Alexander","family":"Lazovik","sequence":"additional","affiliation":[{"name":"Faculty of Science and Engineering, University of Groningen, Nijenborgh 9, 9747 AG Groningen, The Netherlands"}]}],"member":"1968","published-online":{"date-parts":[[2018,7,12]]},"reference":[{"key":"ref_1","unstructured":"Carolina Donnelly (2018, May 31). Public Cloud Competition Prompts 66 Research Reveals. Available online: https:\/\/www.computerweekly.com\/news\/4500270463\/Public-cloud-competition-results-in-66-drop-in-prices-since-2013-research-reveals."},{"key":"ref_2","unstructured":"Toyota (2018, May 31). Toyota Production System. 
Available online: https:\/\/www.toyota-europe.com\/world-of-toyota\/this-is-toyota\/toyota-production-system."},{"key":"ref_3","unstructured":"Microsoft Azure (2018, May 31). Overview of Autoscale with Azure Virtual Machine Scale Sets. Available online: https:\/\/docs.microsoft.com\/en-us\/azure\/virtual-machine-scale-sets\/virtual-machine-scale-sets-autoscale-overview."},{"key":"ref_4","unstructured":"Amazon Web Services (2018, May 31). Overview of Autoscale with Azure Virtual Machine Scale Sets. Available online: https:\/\/docs.aws.amazon.com\/autoscaling\/ec2\/userguide\/what-is-amazon-ec2-auto-scaling.html."},{"key":"ref_5","first-page":"10","article-title":"MapReduce: Simplified Data Processing on Large Clusters","volume":"Volume 6","author":"Dean","year":"2004","journal-title":"Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation (OSDI\u201904)"},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ahmad, F., Chakradhar, S.T., Raghunathan, A., and Vijaykumar, T.N. (2012, January 3\u20137). Tarazu. Proceedings of the Seventeenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS \u201912), London, UK.","DOI":"10.1145\/2150976.2150984"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1341","DOI":"10.1109\/TC.2017.2669964","article-title":"Cross-Platform Resource Scheduling for Spark and MapReduce on YARN","volume":"66","author":"Cheng","year":"2017","journal-title":"IEEE Trans. Comput."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Burton, F.W., and Sleep, M.R. (1981, January 18\u201322). Executing functional programs on a virtual tree of processors. Proceedings of the 1981 Conference on Functional Programming Languages and Computer Architecture (FPCA \u201981), Portsmouth, NH, USA.","DOI":"10.1145\/800223.806778"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Acar, U.A., Chargueraud, A., and Rainey, M. 
(2013, January 23\u201327). Scheduling parallel programs by work stealing with private deques. Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP \u201913), Shenzhen, China.","DOI":"10.1145\/2442516.2442538"},{"key":"ref_10","unstructured":"Zaharia, M., Konwinski, A., Joseph, A.D., Katz, R., and Stoica, I. (2008, January 8\u201310). Improving MapReduce Performance in Heterogeneous Environments. Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI\u201908), San Diego, CA, USA."},{"key":"ref_11","unstructured":"Xing, Y., Zdonik, S., and Hwang, J.H. (2005, January 5\u20138). Dynamic load distribution in the Borealis stream processor. Proceedings of the 21st International Conference on Data Engineering (ICDE\u201905), Tokyo, Japan."},{"key":"ref_12","unstructured":"Shah, M.A., Hellerstein, J.M., Chandrasekaran, S., and Franklin, M.J. (2003, January 5\u20138). Flux: An adaptive partitioning operator for continuous query systems. Proceedings of the 19th International Conference on Data Engineering (Cat. No.03CH37405), Bangalore, India."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Collins, R.L., and Carloni, L.P. (2009, January 11\u201316). Flexible filters. Proceedings of the Seventh ACM International Conference on Embedded Software (EMSOFT \u201909), Grenoble, France.","DOI":"10.1145\/1629335.1629363"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Cheng, D., Chen, Y., Zhou, X., Gmach, D., and Milojicic, D. (2017, January 1\u20134). Adaptive scheduling of parallel jobs in spark streaming. Proceedings of the Conference on Computer Communications (INFOCOM 2017), Atlanta, GA, USA.","DOI":"10.1109\/INFOCOM.2017.8057206"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Pace, F., Venzano, D., Carra, D., and Michiardi, P. (2017, January 14\u201317). Flexible Scheduling of Distributed Analytic Applications. 
Proceedings of the 2017 17th IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Madrid, Spain.","DOI":"10.1109\/CCGRID.2017.52"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ostermann, S., Prodan, R., and Fahringer, T. (2010, January 25\u201328). Dynamic Cloud provisioning for scientific Grid workflows. Proceedings of the 2010 11th IEEE\/ACM International Conference on Grid Computing, Brussels, Belgium.","DOI":"10.1109\/GRID.2010.5697953"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Buyya, R., and Barreto, D. (2015, January 16\u201319). Multi-cloud resource provisioning with Aneka: A unified and integrated utilisation of microsoft azure and amazon EC2 instances. Proceedings of the 2015 International Conference on Computing and Network Communications (CoCoNet), Kerala, India.","DOI":"10.1109\/CoCoNet.2015.7411190"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Zhang, Q., Cherkasova, L., and Smirni, E. (2007, January 11\u201315). A Regression-Based Analytic Model for Dynamic Resource Provisioning of Multi-Tier Applications. Proceedings of the Fourth International Conference on Autonomic Computing (ICAC\u201907), Jacksonville, FL, USA.","DOI":"10.1109\/ICAC.2007.1"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"14","DOI":"10.1109\/TCC.2014.2306427","article-title":"Dynamic Heterogeneity-Aware Resource Provisioning in the Cloud","volume":"2","author":"Zhang","year":"2014","journal-title":"IEEE Trans. Cloud Comput."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"682","DOI":"10.1109\/TC.2013.2295797","article-title":"Efficient Server Provisioning and Offloading Policies for Internet Data Centers with Dynamic Load-Demand","volume":"64","author":"Xu","year":"2015","journal-title":"IEEE Trans. Comput."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Maroulis, S., Zacheilas, N., and Kalogeraki, V. (2017, January 5\u20138). 
A Framework for Efficient Energy Scheduling of Spark Workloads. Proceedings of the 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), Atlanta, GA, USA.","DOI":"10.1109\/ICDCS.2017.179"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"265","DOI":"10.1007\/s10723-016-9366-y","article-title":"Docker Cluster Management for the Cloud\u2014Survey Results and Own Solution","volume":"14","author":"Peinl","year":"2016","journal-title":"J. Grid Comput."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Verma, A., Pedrosa, L., Korupolu, M., Oppenheimer, D., Tune, E., and Wilkes, J. (2015, January 21\u201324). Large-scale cluster management at Google with Borg. Proceedings of the Tenth European Conference on Computer Systems (EuroSys \u201915), Bordeaux, France.","DOI":"10.1145\/2741948.2741964"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Zheng, C., and Thain, D. (2015, January 15\u201316). Integrating Containers into Workflows. Proceedings of the 8th International Workshop on Virtualization Technologies in Distributed Computing (VTDC \u201915), Portland, OR, USA.","DOI":"10.1145\/2755979.2755984"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Liu, K., Aida, K., Yokoyama, S., and Masatani, Y. (2016, January 4\u20135). Flexible Container-Based Computing Platform on Cloud for Scientific Workflows. Proceedings of the 2016 International Conference on Cloud Computing Research and Innovations (ICCCRI), Singapore.","DOI":"10.1109\/ICCCRI.2016.17"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.future.2014.10.023","article-title":"Enabling scalable scientific workflow management in the Cloud","volume":"46","author":"Zhao","year":"2015","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_27","unstructured":"TNO (2018, May 31). Innovative Techniques for Monitoring Infrastructures. 
Available online: https:\/\/www.tno.nl\/en\/focus-areas\/information-communication-technology\/roadmaps\/information-creation-from-data-to-information\/innovative-techniques-for-monitoring-infrastructures\/."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/2\/3\/15\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:11:48Z","timestamp":1760195508000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/2\/3\/15"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,7,12]]},"references-count":27,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2018,9]]}},"alternative-id":["bdcc2030015"],"URL":"https:\/\/doi.org\/10.3390\/bdcc2030015","relation":{},"ISSN":["2504-2289"],"issn-type":[{"type":"electronic","value":"2504-2289"}],"subject":[],"published":{"date-parts":[[2018,7,12]]}}}