{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T17:41:56Z","timestamp":1776447716542,"version":"3.51.2"},"reference-count":151,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2019,9,13]],"date-time":"2019-09-13T00:00:00Z","timestamp":1568332800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/501100000270","name":"Natural Environment Research Council","doi-asserted-by":"publisher","award":["LANDSLIP:NE\/P000681\/1,FloodPrep:NE\/P017134\/1"],"award-info":[{"award-number":["LANDSLIP:NE\/P000681\/1,FloodPrep:NE\/P017134\/1"]}],"id":[{"id":"10.13039\/501100000270","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Comput. Surv."],"published-print":{"date-parts":[[2020,9,30]]},"abstract":"<jats:p>Interest in processing big data has increased rapidly to gain insights that can transform businesses, government policies, and research outcomes. This has led to advancement in communication, programming, and processing technologies, including cloud computing services and technologies such as Hadoop, Spark, and Storm. This trend also affects the needs of analytical applications, which are no longer monolithic but composed of several individual analytical steps running in the form of a workflow. These big data workflows are vastly different in nature from traditional workflows. Researchers are currently facing the challenge of how to orchestrate and manage the execution of such workflows. In this article, we discuss in detail orchestration requirements of these workflows as well as the challenges in achieving these requirements. 
We also survey current trends and research that supports orchestration of big data workflows and identify open research challenges to guide future developments in this area.<\/jats:p>","DOI":"10.1145\/3332301","type":"journal-article","created":{"date-parts":[[2019,9,13]],"date-time":"2019-09-13T12:28:56Z","timestamp":1568377736000},"page":"1-41","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":53,"title":["Orchestrating Big Data Analysis Workflows in the Cloud"],"prefix":"10.1145","volume":"52","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-9146-2459","authenticated-orcid":false,"given":"Mutaz","family":"Barika","sequence":"first","affiliation":[{"name":"University of Tasmania, Tasmania, Australia"}]},{"given":"Saurabh","family":"Garg","sequence":"additional","affiliation":[{"name":"University of Tasmania, Tasmania, Australia"}]},{"given":"Albert Y.","family":"Zomaya","sequence":"additional","affiliation":[{"name":"University of Sydney, New South Wales, Australia"}]},{"given":"Lizhe","family":"Wang","sequence":"additional","affiliation":[{"name":"China University of Geoscience (Wuhan), Wuhan, P. R China"}]},{"given":"Aad Van","family":"Moorsel","sequence":"additional","affiliation":[{"name":"Newcastle University, United Kingdom"}]},{"given":"Rajiv","family":"Ranjan","sequence":"additional","affiliation":[{"name":"China University of Geoscience (Wuhan) and Newcastle University, United Kingdom"}]}],"member":"320","published-online":{"date-parts":[[2019,9,13]]},"reference":[{"key":"e_1_2_2_1_1","unstructured":"{n.d.}. Chapter 15 - A taxonomy and survey of fault-tolerant workflow manag. sys. in cloud and dist. computing env. In Software Architecture for Big Data and the Cloud Ivan Mistrik Rami Bahsoon Nour Ali Maritta Heisel and Bruce Maxim (Eds.). Morgan Kaufmann.  {n.d.}. Chapter 15 - A taxonomy and survey of fault-tolerant workflow manag. sys. in cloud and dist. computing env. 
In Software Architecture for Big Data and the Cloud Ivan Mistrik Rami Bahsoon Nour Ali Maritta Heisel and Bruce Maxim (Eds.). Morgan Kaufmann."},{"key":"e_1_2_2_2_1","unstructured":"2015. Anomaly Detection over Sensor Data Streams. Retrieved from http:\/\/wiki.clommunity-project.eu\/pilots:and.  2015. Anomaly Detection over Sensor Data Streams. Retrieved from http:\/\/wiki.clommunity-project.eu\/pilots:and."},{"key":"e_1_2_2_3_1","unstructured":"Adamu et al. 2016. A Survey on Big Data Indexing Strategies. Technical Report. SLAC National Accelerator Lab. Menlo Park CA.  Adamu et al. 2016. A Survey on Big Data Indexing Strategies. Technical Report. SLAC National Accelerator Lab. Menlo Park CA."},{"key":"e_1_2_2_4_1","doi-asserted-by":"publisher","DOI":"10.1109\/BDCloud.2014.63"},{"key":"e_1_2_2_5_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-017-1991-0"},{"key":"e_1_2_2_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2443416.2443417"},{"key":"e_1_2_2_7_1","doi-asserted-by":"crossref","unstructured":"Alrokayan et al. 2014. Sla-aware provisioning and scheduling of cloud resources for big data analytics. In CCEM. IEEE 1--8.  Alrokayan et al. 2014. Sla-aware provisioning and scheduling of cloud resources for big data analytics. In CCEM. IEEE 1--8.","DOI":"10.1109\/CCEM.2014.7015497"},{"key":"e_1_2_2_8_1","unstructured":"Amazon. 2017. AWS Lambda. Retrieved from https:\/\/aws.amazon.com\/lambda\/details\/.  Amazon. 2017. AWS Lambda. Retrieved from https:\/\/aws.amazon.com\/lambda\/details\/."},{"key":"e_1_2_2_9_1","unstructured":"Amstutz et al. 2016. Common workflow language draft 3.  Amstutz et al. 2016. Common workflow language draft 3."},{"key":"e_1_2_2_10_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2011.04.017"},{"key":"e_1_2_2_11_1","doi-asserted-by":"publisher","DOI":"10.1145\/2535929"},{"key":"e_1_2_2_12_1","volume-title":"USENIX Annual Technical Conference.","author":"Bessani","year":"2014","unstructured":"Bessani et al. 2014 . 
SCFS: A shared cloud-backed file system . In USENIX Annual Technical Conference. Bessani et al. 2014. SCFS: A shared cloud-backed file system. In USENIX Annual Technical Conference."},{"key":"e_1_2_2_13_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.csbj.2014.11.001"},{"key":"e_1_2_2_14_1","doi-asserted-by":"publisher","DOI":"10.1109\/IPDPS.2013.81"},{"key":"e_1_2_2_15_1","doi-asserted-by":"publisher","DOI":"10.1109\/TDSC.2013.6"},{"key":"e_1_2_2_16_1","volume-title":"Parallelization in scientific workflow management systems. arXiv preprint arXiv:1303.7195","author":"Bux Marc","year":"2013","unstructured":"Marc Bux and Ulf Leser . 2013. Parallelization in scientific workflow management systems. arXiv preprint arXiv:1303.7195 ( 2013 ). Marc Bux and Ulf Leser. 2013. Parallelization in scientific workflow management systems. arXiv preprint arXiv:1303.7195 (2013)."},{"key":"e_1_2_2_17_1","volume-title":"Grids, Clouds and Virtualization","author":"Cafaro Massimo","unstructured":"Massimo Cafaro and Giovanni Aloisio . 2011. Grids , clouds, and virtualization . In Grids, Clouds and Virtualization . Springer , 1--21. Massimo Cafaro and Giovanni Aloisio. 2011. Grids, clouds, and virtualization. In Grids, Clouds and Virtualization. Springer, 1--21."},{"key":"e_1_2_2_18_1","first-page":"75","article-title":"IoT-based big data storage systems in cloud comp.: Perspectives and challenges","volume":"4","author":"Cai","year":"2017","unstructured":"Cai et al. 2017 . IoT-based big data storage systems in cloud comp.: Perspectives and challenges . IEEE IoT J. 4 , 1 (2017), 75 -- 87 . Cai et al. 2017. IoT-based big data storage systems in cloud comp.: Perspectives and challenges. IEEE IoT J. 4, 1 (2017), 75--87.","journal-title":"IEEE IoT J."},{"key":"e_1_2_2_19_1","unstructured":"Cao et al. 2016. A resource provisioning strategy for elastic analytical workflows in the cloud. 
In Proceedings of the 18th International Conference on High-Performance Computing and Communications 14th International Conference on Smart City and 2nd International Conference on Data Science and Systems (HPCC\/SmartCity\/DSS). IEEE 538--545.  Cao et al. 2016. A resource provisioning strategy for elastic analytical workflows in the cloud. In Proceedings of the 18th International Conference on High-Performance Computing and Communications 14th International Conference on Smart City and 2nd International Conference on Data Science and Systems (HPCC\/SmartCity\/DSS). IEEE 538--545."},{"key":"e_1_2_2_20_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-013-3903-7"},{"key":"e_1_2_2_21_1","doi-asserted-by":"crossref","unstructured":"Chen et al. 2018. Scheduling jobs across geo-distributed datacenters with max-min fairness. IEEE Trans. Network Sci. Eng. (2018). PrePrints.  Chen et al. 2018. Scheduling jobs across geo-distributed datacenters with max-min fairness. IEEE Trans. Network Sci. Eng. (2018). PrePrints.","DOI":"10.1109\/INFOCOM.2017.8056949"},{"key":"e_1_2_2_22_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2014.01.015"},{"key":"e_1_2_2_23_1","unstructured":"Peng Chen. 2016. Big data analytics in static and streaming provenance.  Peng Chen. 2016. Big data analytics in static and streaming provenance."},{"key":"e_1_2_2_24_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-31500-8_2"},{"key":"e_1_2_2_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCGrid.2012.57"},{"key":"e_1_2_2_26_1","first-page":"20","article-title":"MapReduce online","volume":"10","author":"Condie","year":"2010","unstructured":"Condie et al. 2010 . MapReduce online . In NSDI , Vol. 10. 20 . Condie et al. 2010. MapReduce online. In NSDI, Vol. 10. 
20.","journal-title":"NSDI"},{"key":"e_1_2_2_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00607-017-0564-7"},{"key":"e_1_2_2_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/CloudCom.2011.15"},{"key":"e_1_2_2_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/eScience.2014.59"},{"key":"e_1_2_2_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2663715.2669614"},{"key":"e_1_2_2_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_2_2_32_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCGRID.2017.144"},{"key":"e_1_2_2_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2523616.2523630"},{"key":"e_1_2_2_34_1","doi-asserted-by":"publisher","DOI":"10.1145\/3133956.3134032"},{"key":"e_1_2_2_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2015.7363795"},{"key":"e_1_2_2_36_1","volume-title":"Proceedings of the IEEE 31st International Conference on Data Engineering (ICDE\u201915)","author":"Eldawy Ahmed","unstructured":"Ahmed Eldawy and Mohamed F. Mokbel . 2015. Spatialhadoop: A mapreduce framework for spatial data . In Proceedings of the IEEE 31st International Conference on Data Engineering (ICDE\u201915) . IEEE, 1352--1363. Ahmed Eldawy and Mohamed F. Mokbel. 2015. Spatialhadoop: A mapreduce framework for spatial data. In Proceedings of the IEEE 31st International Conference on Data Engineering (ICDE\u201915). 
IEEE, 1352--1363."},{"key":"e_1_2_2_37_1","doi-asserted-by":"publisher","DOI":"10.1109\/COMPSAC.2018.00115"},{"key":"e_1_2_2_38_1","doi-asserted-by":"publisher","DOI":"10.5555\/3018100.3018101"},{"key":"e_1_2_2_39_1","doi-asserted-by":"publisher","DOI":"10.1109\/eScience.2015.40"},{"key":"e_1_2_2_40_1","doi-asserted-by":"publisher","DOI":"10.1177\/1094342016649766"},{"key":"e_1_2_2_41_1","doi-asserted-by":"publisher","DOI":"10.1109\/MIS.2010.131"},{"key":"e_1_2_2_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TFUZZ.2010.2041008"},{"key":"e_1_2_2_43_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-015-0830-y"},{"key":"e_1_2_2_44_1","doi-asserted-by":"crossref","unstructured":"Garg et al. 2018. Orchestration Tools for Big Data. Springer International Publishing 1--9.  Garg et al. 2018. Orchestration Tools for Big Data. Springer International Publishing 1--9.","DOI":"10.1007\/978-3-319-63962-8_43-1"},{"key":"e_1_2_2_45_1","doi-asserted-by":"publisher","DOI":"10.1109\/DataCloud.2014.6"},{"key":"e_1_2_2_46_1","doi-asserted-by":"publisher","DOI":"10.1023\/A:1011139719773"},{"key":"e_1_2_2_47_1","volume-title":"BTW Workshops","volume":"11","author":"Glavic","year":"2011","unstructured":"Glavic et al. 2011 . The case for fine-grained stream provenance . In BTW Workshops , Vol. 11 . Glavic et al. 2011. The case for fine-grained stream provenance. In BTW Workshops, Vol. 11."},{"key":"e_1_2_2_48_1","doi-asserted-by":"publisher","DOI":"10.1145\/2633689"},{"key":"e_1_2_2_49_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-53974-9_7"},{"key":"e_1_2_2_50_1","doi-asserted-by":"crossref","unstructured":"Gomes et al. 2018. Enabling rootless Linux containers in multi-user envin.: The udocker tool. Computer Physics Communications (2018).  Gomes et al. 2018. Enabling rootless Linux containers in multi-user envin.: The udocker tool. 
Computer Physics Communications (2018).","DOI":"10.1016\/j.cpc.2018.05.021"},{"key":"e_1_2_2_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/2490257.2490290"},{"key":"e_1_2_2_52_1","volume-title":"Future: Architectures, Technologies, and Implementations","author":"Hassan","year":"2017","unstructured":"Hassan et al. 2017 . Networks of the Future: Architectures, Technologies, and Implementations . Chapman and Hall\/CRC. Hassan et al. 2017. Networks of the Future: Architectures, Technologies, and Implementations. Chapman and Hall\/CRC."},{"key":"e_1_2_2_53_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIFS.2016.2573746"},{"key":"e_1_2_2_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/TDSC.2016.2596286"},{"key":"e_1_2_2_55_1","doi-asserted-by":"publisher","DOI":"10.1147\/JRD.2013.2243535"},{"key":"e_1_2_2_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2014.2332453"},{"key":"e_1_2_2_57_1","doi-asserted-by":"publisher","DOI":"10.1109\/INFOCOM.2016.7524469"},{"key":"e_1_2_2_58_1","doi-asserted-by":"publisher","DOI":"10.1145\/2806777.2806780"},{"key":"e_1_2_2_59_1","doi-asserted-by":"publisher","DOI":"10.5555\/2033546.2033560"},{"key":"e_1_2_2_60_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-017-0474-5"},{"key":"e_1_2_2_61_1","unstructured":"Matteo Interlandi and Tyson Condie. 2018. Supporting data provenance in data-intensive scalable comp. sys. Data Eng. (2018) 63.  Matteo Interlandi and Tyson Condie. 2018. Supporting data provenance in data-intensive scalable comp. sys. Data Eng. (2018) 63."},{"key":"e_1_2_2_62_1","volume-title":"Falkirk wheel: Rollback recovery for dataflow systems. arXiv preprint arXiv:1503.08877","author":"Isard Michael","year":"2015","unstructured":"Michael Isard and Mart\u00edn Abadi . 2015. Falkirk wheel: Rollback recovery for dataflow systems. arXiv preprint arXiv:1503.08877 ( 2015 ). Michael Isard and Mart\u00edn Abadi. 2015. Falkirk wheel: Rollback recovery for dataflow systems. 
arXiv preprint arXiv:1503.08877 (2015)."},{"key":"e_1_2_2_63_1","doi-asserted-by":"crossref","unstructured":"Jin et al. 2016. Workload-aware scheduling across geo-distributed data centers. In Trustcom\/BigDataSE\/ISPA. IEEE 1455--1462.  Jin et al. 2016. Workload-aware scheduling across geo-distributed data centers. In Trustcom\/BigDataSE\/ISPA. IEEE 1455--1462.","DOI":"10.1109\/TrustCom.2016.0228"},{"key":"e_1_2_2_64_1","first-page":"684","article-title":"Data analytics computing resource provisioning based on computed cost and time parameters for proposed computing resource configurations","volume":"9","author":"Todd Jr.","year":"2017","unstructured":"Todd Jr. et al. 2017 . Data analytics computing resource provisioning based on computed cost and time parameters for proposed computing resource configurations . US Patent 9 , 684 ,866. Todd Jr. et al. 2017. Data analytics computing resource provisioning based on computed cost and time parameters for proposed computing resource configurations. US Patent 9,684,866.","journal-title":"US Patent"},{"key":"e_1_2_2_65_1","volume-title":"CLOSER 2012","author":"Jrad","year":"2012","unstructured":"Jrad et al. 2012 . SLA based service brokering in intercloud environments . CLOSER 2012 (2012), 76--81. Jrad et al. 2012. SLA based service brokering in intercloud environments. CLOSER 2012 (2012), 76--81."},{"key":"e_1_2_2_66_1","doi-asserted-by":"publisher","DOI":"10.1145\/2462326.2462339"},{"key":"e_1_2_2_67_1","doi-asserted-by":"publisher","DOI":"10.1109\/SCC.2014.16"},{"key":"e_1_2_2_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/MWC.2017.1600427"},{"key":"e_1_2_2_69_1","volume-title":"Streaming Data: Big Data at High Velocity.","author":"Keenan Tyler","year":"2016","unstructured":"Tyler Keenan . 2016 . Streaming Data: Big Data at High Velocity. Retrieved from https:\/\/www.upwork.com\/hiring\/data\/streaming-data-high-velocity\/. Tyler Keenan. 2016. Streaming Data: Big Data at High Velocity. 
Retrieved from https:\/\/www.upwork.com\/hiring\/data\/streaming-data-high-velocity\/."},{"key":"e_1_2_2_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2015.7364082"},{"key":"e_1_2_2_71_1","doi-asserted-by":"publisher","DOI":"10.1080\/03081079.2012.710437"},{"key":"e_1_2_2_72_1","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pone.0177459"},{"key":"e_1_2_2_73_1","doi-asserted-by":"publisher","DOI":"10.1145\/2371536.2371547"},{"key":"e_1_2_2_74_1","doi-asserted-by":"publisher","DOI":"10.25103\/jestr.105.05"},{"key":"e_1_2_2_75_1","unstructured":"Lin et al. 2016. StreamScope: Continuous reliable distributed processing of big data streams. In NSDI. 439--453.   Lin et al. 2016. StreamScope: Continuous reliable distributed processing of big data streams. In NSDI. 439--453."},{"key":"e_1_2_2_76_1","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-14325-5_10"},{"key":"e_1_2_2_77_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10723-015-9329-8"},{"key":"e_1_2_2_78_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.isprsjprs.2015.11.006"},{"key":"e_1_2_2_79_1","unstructured":"Liu et al. 2018. A survey of scheduling frameworks in big data systems. Int. J. Cloud Comput. (2018) 1--27.  Liu et al. 2018. A survey of scheduling frameworks in big data systems. Int. J. Cloud Comput. (2018) 1--27."},{"key":"e_1_2_2_80_1","volume-title":"A replication-based mechanism for fault tolerance in mapreduce framework. Math. Prob. Eng. 2015","author":"Liu Yang","year":"2015","unstructured":"Yang Liu and Wei Wei . 2015. A replication-based mechanism for fault tolerance in mapreduce framework. Math. Prob. Eng. 2015 ( 2015 ). Yang Liu and Wei Wei. 2015. A replication-based mechanism for fault tolerance in mapreduce framework. Math. Prob. Eng. 2015 (2015)."},{"key":"e_1_2_2_81_1","unstructured":"Rachel Kempf. 2017. Open Source Data Pipeline\u2014Luigi vs Azkaban vs Oozie vs Airflow. 
Retrieved from https:\/\/www.bizety.com\/2017\/06\/05\/open-source-data-pipeline-luigi-vs-azkaban-vs-oozie-vs-airflow\/.  Rachel Kempf. 2017. Open Source Data Pipeline\u2014Luigi vs Azkaban vs Oozie vs Airflow. Retrieved from https:\/\/www.bizety.com\/2017\/06\/05\/open-source-data-pipeline-luigi-vs-azkaban-vs-oozie-vs-airflow\/."},{"key":"e_1_2_2_82_1","doi-asserted-by":"publisher","DOI":"10.1109\/GLOCOM.2016.7841533"},{"key":"e_1_2_2_83_1","unstructured":"Dan Lynn. 2016. Apache Spark Cluster Managers: YARN Mesos or Standalone? Retrieved from http:\/\/www.agildata.com\/apache-spark-cluster-managers-yarn-mesos-or-standalone\/.  Dan Lynn. 2016. Apache Spark Cluster Managers: YARN Mesos or Standalone? Retrieved from http:\/\/www.agildata.com\/apache-spark-cluster-managers-yarn-mesos-or-standalone\/."},{"key":"e_1_2_2_84_1","doi-asserted-by":"publisher","DOI":"10.1145\/2396761.2398587"},{"key":"e_1_2_2_85_1","doi-asserted-by":"publisher","DOI":"10.1109\/DSNW.2011.5958795"},{"key":"e_1_2_2_86_1","doi-asserted-by":"publisher","DOI":"10.1109\/eScience.2010.51"},{"key":"e_1_2_2_87_1","doi-asserted-by":"publisher","DOI":"10.1145\/3136623"},{"key":"e_1_2_2_88_1","doi-asserted-by":"crossref","unstructured":"Di Martino et al. 2015. Cross-platform cloud APIs. In Cloud Portability and Interoperability. Springer 45--57.  Di Martino et al. 2015. Cross-platform cloud APIs. In Cloud Portability and Interoperability. Springer 45--57.","DOI":"10.1007\/978-3-319-13701-8_3"},{"key":"e_1_2_2_89_1","unstructured":"Ulf Mattsson. 2016. Data centric security key to cloud and digital business. Retrieved from https:\/\/www.helpnetsecurity.com\/2016\/03\/22\/data-centric-security\/.  Ulf Mattsson. 2016. Data centric security key to cloud and digital business. 
Retrieved from https:\/\/www.helpnetsecurity.com\/2016\/03\/22\/data-centric-security\/."},{"key":"e_1_2_2_90_1","doi-asserted-by":"publisher","DOI":"10.1109\/Grid.2011.31"},{"key":"e_1_2_2_91_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigDataCongress.2016.15"},{"key":"e_1_2_2_92_1","doi-asserted-by":"publisher","DOI":"10.5555\/3019078.3019082"},{"key":"e_1_2_2_93_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jnca.2017.08.011"},{"key":"e_1_2_2_94_1","unstructured":"Matri et al. 2016. T\u1ef3r: Efficient Transactional Storage for Data-Intensive Applications. Ph.D. Dissertation. Inria Rennes Bretagne Atlantique; Universidad Polit\u00e9cnica de Madrid.  Matri et al. 2016. T\u1ef3r: Efficient Transactional Storage for Data-Intensive Applications. Ph.D. Dissertation. Inria Rennes Bretagne Atlantique; Universidad Polit\u00e9cnica de Madrid."},{"key":"e_1_2_2_95_1","doi-asserted-by":"crossref","unstructured":"Suraj Pandey and Rajkumar Buyya. 2012. A survey of scheduling and management techniques for data-intensive application workflows. In Data Intensive Distributed Computing: Challenges and Solutions for Large-scale Information Management. IGI Global 156--176.  Suraj Pandey and Rajkumar Buyya. 2012. A survey of scheduling and management techniques for data-intensive application workflows. In Data Intensive Distributed Computing: Challenges and Solutions for Large-scale Information Management. IGI Global 156--176.","DOI":"10.4018\/978-1-61520-971-2.ch007"},{"key":"e_1_2_2_96_1","doi-asserted-by":"publisher","DOI":"10.14778\/3402755.3402768"},{"key":"e_1_2_2_97_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLOUD.2012.24"},{"key":"e_1_2_2_98_1","volume-title":"Proceedings of the Science and Information Conference (SAI). IEEE.","author":"Peoples","year":"2013","unstructured":"Peoples et al. 2013 . The standardisation of cloud computing: Trends in the state-of-the-art and management issues for the next generation of cloud . 
In Proceedings of the Science and Information Conference (SAI). IEEE. Peoples et al. 2013. The standardisation of cloud computing: Trends in the state-of-the-art and management issues for the next generation of cloud. In Proceedings of the Science and Information Conference (SAI). IEEE."},{"key":"e_1_2_2_99_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.procs.2014.05.047"},{"key":"e_1_2_2_100_1","doi-asserted-by":"publisher","DOI":"10.1145\/2815624"},{"key":"e_1_2_2_101_1","doi-asserted-by":"publisher","DOI":"10.1109\/CloudCom.2016.0052"},{"key":"e_1_2_2_102_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.1734"},{"key":"e_1_2_2_103_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCC.2015.64"},{"key":"e_1_2_2_104_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCC.2017.55"},{"key":"e_1_2_2_105_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10115-018-1248-0"},{"key":"e_1_2_2_106_1","first-page":"64","article-title":"Dppacs: A novel data partitioning and placement aware computation scheduling scheme for data-intensive cloud applications","volume":"59","author":"Reddy K. H. K.","year":"2015","unstructured":"K. H. K. Reddy and D. S. Roy . 2015 . Dppacs: A novel data partitioning and placement aware computation scheduling scheme for data-intensive cloud applications . Comput. J. 59 , 1 (2015), 64 -- 82 . K. H. K. Reddy and D. S. Roy. 2015. Dppacs: A novel data partitioning and placement aware computation scheduling scheme for data-intensive cloud applications. Comput. J. 59, 1 (2015), 64--82.","journal-title":"Comput. J."},{"key":"e_1_2_2_107_1","volume-title":"A taxonomy and survey on scheduling algorithms for scientific workflows in IaaS cloud computing environments. Concurrency Comput. Pract. Experience 29, 8","author":"Rodriguez Maria Alejandra","year":"2017","unstructured":"Maria Alejandra Rodriguez and Rajkumar Buyya . 2017. A taxonomy and survey on scheduling algorithms for scientific workflows in IaaS cloud computing environments. 
Concurrency Comput. Pract. Experience 29, 8 ( 2017 ). Maria Alejandra Rodriguez and Rajkumar Buyya. 2017. A taxonomy and survey on scheduling algorithms for scientific workflows in IaaS cloud computing environments. Concurrency Comput. Pract. Experience 29, 8 (2017)."},{"key":"e_1_2_2_108_1","doi-asserted-by":"publisher","DOI":"10.5555\/2748143.2748384"},{"key":"e_1_2_2_109_1","doi-asserted-by":"publisher","DOI":"10.1109\/SURV.2011.032211.00087"},{"key":"e_1_2_2_110_1","doi-asserted-by":"publisher","DOI":"10.1145\/2522968.2522979"},{"key":"e_1_2_2_111_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICIS.2013.6607885"},{"key":"e_1_2_2_112_1","doi-asserted-by":"publisher","DOI":"10.1007\/s10270-016-0551-z"},{"key":"e_1_2_2_113_1","doi-asserted-by":"crossref","unstructured":"Shishido et al. 2018. (WIP) tasks selection policies for securing sensitive data on workflow scheduling in clouds. In IEEE SCC.  Shishido et al. 2018. (WIP) tasks selection policies for securing sensitive data on workflow scheduling in clouds. In IEEE SCC.","DOI":"10.1109\/SCC.2018.00037"},{"key":"e_1_2_2_114_1","doi-asserted-by":"publisher","DOI":"10.14778\/3229863.3236265"},{"key":"e_1_2_2_115_1","volume-title":"Proceedings of the International Conference on Cloud Engineering (IC2E).","author":"Souza","year":"2018","unstructured":"Souza et al. 2018 . Hybrid adaptive checkpointing for VM fault tolerance . In Proceedings of the International Conference on Cloud Engineering (IC2E). Souza et al. 2018. Hybrid adaptive checkpointing for VM fault tolerance. In Proceedings of the International Conference on Cloud Engineering (IC2E)."},{"key":"e_1_2_2_116_1","unstructured":"Mesos Sphere. 2017. Apache Mesos. Retrieved from https:\/\/mesosphere.com\/why-mesos\/?utm_source=adwords&utm_medium=g&utm_campaign=43843512431&utm_term=mesos&utm_content=190805957225&gclid=CLqw8o6J6dMCFdkGKgodYlsD_A.  Mesos Sphere. 2017. Apache Mesos. 
Retrieved from https:\/\/mesosphere.com\/why-mesos\/?utm_source=adwords&utm_medium=g&utm_campaign=43843512431&utm_term=mesos&utm_content=190805957225&gclid=CLqw8o6J6dMCFdkGKgodYlsD_A."},{"key":"e_1_2_2_117_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2016.10.010"},{"key":"e_1_2_2_118_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-017-2151-2"},{"key":"e_1_2_2_119_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2016.2634557"},{"key":"e_1_2_2_120_1","doi-asserted-by":"publisher","DOI":"10.1111\/j.1475-3995.2011.00808.x"},{"key":"e_1_2_2_121_1","unstructured":"Tan et al. 2014. Diff-Index: Differentiated index in distributed log-structured data stores. In EDBT. 700--711.  Tan et al. 2014. Diff-Index: Differentiated index in distributed log-structured data stores. In EDBT. 700--711."},{"key":"e_1_2_2_122_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2017.05.042"},{"key":"e_1_2_2_123_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCC.2015.2440254"},{"key":"e_1_2_2_124_1","doi-asserted-by":"publisher","DOI":"10.1145\/3217880.3217888"},{"key":"e_1_2_2_125_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.is.2004.02.002"},{"key":"e_1_2_2_126_1","doi-asserted-by":"publisher","DOI":"10.1145\/2523616.2523633"},{"key":"e_1_2_2_127_1","doi-asserted-by":"publisher","DOI":"10.1145\/3132747.3132750"},{"key":"e_1_2_2_128_1","volume-title":"Proceedings of the EDA-PS Workshop.","author":"Vijayakumar Nithya","year":"2007","unstructured":"Nithya Vijayakumar and Beth Plale . 2007 . Tracking stream provenance in complex event processing systems for workflow-driven computing . In Proceedings of the EDA-PS Workshop. Nithya Vijayakumar and Beth Plale. 2007. Tracking stream provenance in complex event processing systems for workflow-driven computing. 
In Proceedings of the EDA-PS Workshop."},{"key":"e_1_2_2_129_1","doi-asserted-by":"publisher","DOI":"10.1109\/CSNT.2014.103"},{"key":"e_1_2_2_130_1","doi-asserted-by":"crossref","unstructured":"von Leon et al. 2019. A lightweight container middleware for edge cloud architectures. Fog and Edge Computing: Principles and Paradigms (2019) 145--170.  von Leon et al. 2019. A lightweight container middleware for edge cloud architectures. Fog and Edge Computing: Principles and Paradigms (2019) 145--170.","DOI":"10.1002\/9781119525080.ch7"},{"key":"e_1_2_2_131_1","volume-title":"Proceedings of the 10th USENIX Conference on File and Storage Technologies.","author":"Vrable","year":"2012","unstructured":"Vrable et al. 2012 . BlueSky: A cloud-backed file system for the enterprise . In Proceedings of the 10th USENIX Conference on File and Storage Technologies. Vrable et al. 2012. BlueSky: A cloud-backed file system for the enterprise. In Proceedings of the 10th USENIX Conference on File and Storage Technologies."},{"key":"e_1_2_2_132_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigData.2014.7004220"},{"key":"e_1_2_2_133_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2014.2375195"},{"key":"e_1_2_2_134_1","doi-asserted-by":"publisher","DOI":"10.1002\/cpe.3617"},{"key":"e_1_2_2_135_1","first-page":"929","article-title":"Cost effective, reliable and secure workflow deployment over federated clouds","volume":"10","author":"Wen","year":"2017","unstructured":"Wen et al. 2017 . Cost effective, reliable and secure workflow deployment over federated clouds . IEEE TSC. 10 , 6 (2017), 929 -- 941 . Wen et al. 2017. Cost effective, reliable and secure workflow deployment over federated clouds. IEEE TSC. 
10, 6 (2017), 929--941.","journal-title":"IEEE TSC."},{"key":"e_1_2_2_136_1","doi-asserted-by":"publisher","DOI":"10.1145\/1670243.1670245"},{"key":"e_1_2_2_137_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11227-015-1438-4"},{"key":"e_1_2_2_138_1","first-page":"1709","article-title":"On fault tolerance for distributed iterative dataflow processing","volume":"29","author":"Xu","year":"2017","unstructured":"Xu et al. 2017 . On fault tolerance for distributed iterative dataflow processing . IEEE Trans. KDE 29 , 8 (2017), 1709 -- 1722 . Xu et al. 2017. On fault tolerance for distributed iterative dataflow processing. IEEE Trans. KDE 29, 8 (2017), 1709--1722.","journal-title":"IEEE Trans. KDE"},{"key":"e_1_2_2_139_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00778-011-0256-4"},{"key":"e_1_2_2_140_1","doi-asserted-by":"publisher","DOI":"10.1109\/CloudCom.2014.88"},{"key":"e_1_2_2_141_1","doi-asserted-by":"publisher","DOI":"10.1145\/1084805.1084814"},{"key":"e_1_2_2_142_1","doi-asserted-by":"publisher","DOI":"10.1145\/2479942.2479945"},{"key":"e_1_2_2_143_1","doi-asserted-by":"publisher","DOI":"10.1109\/SERVICES.2014.75"},{"key":"e_1_2_2_144_1","doi-asserted-by":"publisher","DOI":"10.1109\/CCGrid.2015.72"},{"key":"e_1_2_2_145_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2014.10.023"},{"key":"e_1_2_2_146_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICPP.2015.60"},{"key":"e_1_2_2_147_1","doi-asserted-by":"crossref","unstructured":"Zhao et al. 2016. Heuristic data placement for data-intensive applications in heterogeneous cloud. JECE (2016).  Zhao et al. 2016. Heuristic data placement for data-intensive applications in heterogeneous cloud. 
JECE (2016).","DOI":"10.1155\/2016\/3516358"},{"key":"e_1_2_2_148_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jnca.2015.05.001"},{"key":"e_1_2_2_149_1","doi-asserted-by":"publisher","DOI":"10.1145\/2755979.2755984"},{"key":"e_1_2_2_150_1","doi-asserted-by":"publisher","DOI":"10.1109\/BigDataCongress.2015.39"},{"key":"e_1_2_2_151_1","doi-asserted-by":"publisher","DOI":"10.1109\/TPDS.2016.2543731"}],"container-title":["ACM Computing Surveys"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3332301","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3332301","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T23:54:37Z","timestamp":1750204477000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3332301"}},"subtitle":["Research Challenges, Survey, and Future Directions"],"short-title":[],"issued":{"date-parts":[[2019,9,13]]},"references-count":151,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2020,9,30]]}},"alternative-id":["10.1145\/3332301"],"URL":"https:\/\/doi.org\/10.1145\/3332301","relation":{},"ISSN":["0360-0300","1557-7341"],"issn-type":[{"value":"0360-0300","type":"print"},{"value":"1557-7341","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,9,13]]},"assertion":[{"value":"2018-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2019-09-13","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}