{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:08:05Z","timestamp":1750306085724,"version":"3.41.0"},"reference-count":42,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2018,1,20]],"date-time":"2018-01-20T00:00:00Z","timestamp":1516406400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"name":"University of Leeds and CSC joint scholarship program"},{"name":"China National Key Research and Development Program","award":["2016YFB1000101 and 2016YFB1000103"],"award-info":[{"award-number":["2016YFB1000101 and 2016YFB1000103"]}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Internet Technol."],"published-print":{"date-parts":[[2018,5,31]]},"abstract":"<jats:p>Modern Cloud computing systems are massive in scale, featuring environments that can execute highly dynamic Internetware applications with huge numbers of interacting tasks. This has led to a substantial challenge\u2014the straggler problem, whereby a small subset of slow tasks significantly impede parallel job completion. This problem results in longer service responses, degraded system performance, and late timing failures that can easily threaten Quality of Service (QoS) compliance. Speculative execution (or speculation) is the prominent method deployed in Clouds to tolerate stragglers by creating task replicas at runtime. The method detects stragglers by specifying a predefined threshold to calculate the difference between individual tasks and the average task progression within a job. 
However, such a static threshold debilitates speculation effectiveness as it fails to capture the intrinsic diversity of timing constraints in Internetware applications, as well as dynamic environmental factors, such as resource utilization. By considering such characteristics, different levels of strictness for replica creation can be imposed to adaptively achieve specified levels of QoS for different applications. In this article, we present an algorithm to improve the execution efficiency of Internetware applications by dynamically calculating the straggler threshold, considering key parameters including job QoS timing constraints, task execution progress, and optimal system resource utilization. We implement this dynamic straggler threshold into the YARN architecture to evaluate its effectiveness against existing state-of-the-art solutions. Results demonstrate that the proposed approach is capable of reducing parallel job response time by up to 20% compared to the static threshold, and of achieving a higher speculation success rate, up to 66.67% against 16.67% for the static method.<\/jats:p>","DOI":"10.1145\/3093896","type":"journal-article","created":{"date-parts":[[2018,1,22]],"date-time":"2018-01-22T13:22:59Z","timestamp":1516627379000},"page":"1-22","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":10,"title":["Adaptive Speculation for Efficient Internetware Application Execution in Clouds"],"prefix":"10.1145","volume":"18","author":[{"given":"Xue","family":"Ouyang","sequence":"first","affiliation":[{"name":"University of Leeds, and National University of Defense Technology, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter","family":"Garraghan","sequence":"additional","affiliation":[{"name":"Lancaster University, 
UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bernhard","family":"Primas","sequence":"additional","affiliation":[{"name":"University of Leeds, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"David","family":"Mckee","sequence":"additional","affiliation":[{"name":"University of Leeds, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Paul","family":"Townend","sequence":"additional","affiliation":[{"name":"University of Leeds, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jie","family":"Xu","sequence":"additional","affiliation":[{"name":"University of Leeds, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2018,1,20]]},"reference":[{"key":"e_1_2_1_1_1","volume-title":"Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. 185--198","author":"Ananthanarayanan Ganesh","year":"2013","unstructured":"Ganesh Ananthanarayanan , Ali Ghodsi , Scott Shenker , and Ion Stoica . 2013 . Effective straggler mitigation: Attack of the clones . In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. 185--198 . Ganesh Ananthanarayanan, Ali Ghodsi, Scott Shenker, and Ion Stoica. 2013. Effective straggler mitigation: Attack of the clones. In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. 185--198."},{"key":"e_1_2_1_2_1","volume-title":"Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201910)","volume":"10","author":"Ananthanarayanan Ganesh","year":"2010","unstructured":"Ganesh Ananthanarayanan , Srikanth Kandula , Albert G. Greenberg , Ion Stoica , Yi Lu , Bikas Saha , and Edward Harris . 2010 . Reining in the outliers in map-reduce clusters using mantri . In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201910) , Vol. 10 . 24--37. 
Ganesh Ananthanarayanan, Srikanth Kandula, Albert G. Greenberg, Ion Stoica, Yi Lu, Bikas Saha, and Edward Harris. 2010. Reining in the outliers in map-reduce clusters using mantri. In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201910), Vol. 10. 24--37."},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/TDSC.2004.2"},{"key":"e_1_2_1_4_1","unstructured":"G. E. Blelloch L. Dagum S. J. Smith K. Thearling and M. Zagha. 1993. An evaluation of sorting as a supercomputer benchmark. Int. J. High Speed Comput. (1993).  G. E. Blelloch L. Dagum S. J. Smith K. Thearling and M. Zagha. 1993. An evaluation of sorting as a supercomputer benchmark. Int. J. High Speed Comput. (1993)."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.future.2008.12.001"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/2856127"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2013.15"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/CIT.2010.458"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/2408776.2408794"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.sysarc.2014.07.004"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1109\/TSC.2015.2491287"},{"key":"e_1_2_1_13_1","volume-title":"Straggler root-cause and impact analysis for massive-scale virtualized cloud datacenters","author":"Garraghan Peter","year":"2016","unstructured":"Peter Garraghan , Xue Ouyang , Renyu Yang , David McKee , and Jie Xu. 2016b. Straggler root-cause and impact analysis for massive-scale virtualized cloud datacenters . IEEE Trans. Services Comput . ( 2016 ). Peter Garraghan, Xue Ouyang, Renyu Yang, David McKee, and Jie Xu. 2016b. Straggler root-cause and impact analysis for massive-scale virtualized cloud datacenters. IEEE Trans. Services Comput. 
(2016)."},{"key":"e_1_2_1_14_1","unstructured":"Hadoop. 2016. {Online}. Available: http:\/\/hadoop.apache.org\/.  Hadoop. 2016. {Online}. Available: http:\/\/hadoop.apache.org\/."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.14257\/ijgdc.2014.7.4.13"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2213836.2213840"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2670979.2670988"},{"volume-title":"Theories of Programming and Formal Methods","author":"L\u00fc Jian","key":"e_1_2_1_18_1","unstructured":"Jian L\u00fc , Yu Huang , Chang Xu , and Xiaoxing Ma. 2013. Managing environment and adaptation risks for the internetware paradigm . In Theories of Programming and Formal Methods . Springer , 271--284. Jian L\u00fc, Yu Huang, Chang Xu, and Xiaoxing Ma. 2013. Managing environment and adaptation risks for the internetware paradigm. In Theories of Programming and Formal Methods. Springer, 271--284."},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/COMPSAC.2010.90"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1109\/MC.2012.189"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-011-1159-y"},{"key":"e_1_2_1_22_1","unstructured":"OpenCloud. 2016. OpenCloud hadoop cluster trace. {Online}. Available: http:\/\/ftp.pdl.cmu.edu\/pub\/datasets\/hla\/dataset.html.  OpenCloud. 2016. OpenCloud hadoop cluster trace. {Online}. Available: http:\/\/ftp.pdl.cmu.edu\/pub\/datasets\/hla\/dataset.html."},{"key":"e_1_2_1_23_1","unstructured":"OpenNebula. 2016. Flexible enterprise cloud made simple. {Online}. Available: https:\/\/opennebula.org\/.  OpenNebula. 2016. Flexible enterprise cloud made simple. {Online}. 
Available: https:\/\/opennebula.org\/."},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/AINA.2016.84"},{"key":"e_1_2_1_25_1","volume-title":"Proceedings of the 46th Annual IEEE\/IFIP International Conference on Dependable Systems and Networks (DSN\u201916)","author":"Ouyang Xue","year":"2016","unstructured":"Xue Ouyang , Peter Garraghan , Renyu Yang , Paul Townend , and Jie Xu . 2016 b. Reducing late-timing failure at scale: Straggler root-cause analysis in cloud datacenters . In Proceedings of the 46th Annual IEEE\/IFIP International Conference on Dependable Systems and Networks (DSN\u201916) . Xue Ouyang, Peter Garraghan, Renyu Yang, Paul Townend, and Jie Xu. 2016b. Reducing late-timing failure at scale: Straggler root-cause analysis in cloud datacenters. In Proceedings of the 46th Annual IEEE\/IFIP International Conference on Dependable Systems and Networks (DSN\u201916)."},{"key":"e_1_2_1_26_1","volume-title":"Sheth","author":"Patel Pankesh","year":"2009","unstructured":"Pankesh Patel , Ajith H. Ranabahu , and Amit P . Sheth . 2009 . Service level agreement in cloud computing. {Online}. Available: http:\/\/corescholar.libraries.wright.edu\/knoesis\/78. Pankesh Patel, Ajith H. Ranabahu, and Amit P. Sheth. 2009. Service level agreement in cloud computing. {Online}. Available: http:\/\/corescholar.libraries.wright.edu\/knoesis\/78."},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.5555\/876891.880582"},{"key":"e_1_2_1_28_1","volume-title":"Google cluster-usage traces: Format+ schema","author":"Reiss Charles","year":"2011","unstructured":"Charles Reiss and John Wilkes . 2011. Google cluster-usage traces: Format+ schema . Google Inc., White Paper ( 2011 ), 1--14. Charles Reiss and John Wilkes. 2011. Google cluster-usage traces: Format+ schema. Google Inc., White Paper (2011), 1--14."},{"volume-title":"Fine-grained micro-tasks for mapreduce skew-handling. 
White Paper","author":"Rosen Josh","key":"e_1_2_1_29_1","unstructured":"Josh Rosen . 2012. Fine-grained micro-tasks for mapreduce skew-handling. White Paper , University of Berkeley . Josh Rosen. 2012. Fine-grained micro-tasks for mapreduce skew-handling. White Paper, University of Berkeley."},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1007\/s11390-012-1221-4"},{"key":"e_1_2_1_31_1","unstructured":"Google Cluster Data V2. 2016. {Online}. Available: https:\/\/github.com\/google\/cluster-data.  Google Cluster Data V2. 2016. {Online}. Available: https:\/\/github.com\/google\/cluster-data."},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/2523616.2523633"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/2377978.2377982"},{"key":"e_1_2_1_34_1","first-page":"1","article-title":"Towards context consistency by concurrent checking for internetware applications. Sci","volume":"56","author":"Xu Chang","year":"2013","unstructured":"Chang Xu , YePang Liu , Shing Chi Cheung , Chun Cao , and Jian Lv . 2013 a. Towards context consistency by concurrent checking for internetware applications. Sci . China Info. Sci. 56 , 8 (2013), 1 -- 20 . Chang Xu, YePang Liu, Shing Chi Cheung, Chun Cao, and Jian Lv. 2013a. Towards context consistency by concurrent checking for internetware applications. Sci. China Info. Sci. 56, 8 (2013), 1--20.","journal-title":"China Info. Sci."},{"key":"e_1_2_1_35_1","volume-title":"Proceedings of the 21st IEEE International Conference on Network Protocols (ICNP\u201913)","author":"Xu Huanle","year":"2013","unstructured":"Huanle Xu and Wing Cheong Lau . 2013 . Resource optimization for speculative execution in a mapreduce cluster . In Proceedings of the 21st IEEE International Conference on Network Protocols (ICNP\u201913) . IEEE, 1--3. Huanle Xu and Wing Cheong Lau. 2013. Resource optimization for speculative execution in a mapreduce cluster. 
In Proceedings of the 21st IEEE International Conference on Network Protocols (ICNP\u201913). IEEE, 1--3."},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/TR.2015.2464075"},{"key":"e_1_2_1_37_1","volume-title":"Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. 329--341","author":"Xu Yunjing","year":"2013","unstructured":"Yunjing Xu , Zachary Musgrave , Brian Noble , and Michael Bailey . 2013 b. Bobtail: Avoiding long tails in the cloud . In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. 329--341 . Yunjing Xu, Zachary Musgrave, Brian Noble, and Michael Bailey. 2013b. Bobtail: Avoiding long tails in the cloud. In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation. 329--341."},{"volume-title":"Proactive straggler avoidance using machine learning. White Paper","author":"Wontae Yadwadkar","key":"e_1_2_1_38_1","unstructured":"Yadwadkar and Wontae . 2012. Proactive straggler avoidance using machine learning. White Paper , University of Berkeley . Yadwadkar and Wontae. 2012. Proactive straggler avoidance using machine learning. White Paper, University of Berkeley."},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/2020723.2020727"},{"key":"e_1_2_1_40_1","volume-title":"Spark: Cluster computing with working sets. In HotCloud\u201910. 10--16.","author":"Zaharia Matei","year":"2010","unstructured":"Matei Zaharia , Mosharaf Chowdhury , Michael J. Franklin , Scott Shenker , and Ion Stoica . 2010 . Spark: Cluster computing with working sets. In HotCloud\u201910. 10--16. Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica. 2010. Spark: Cluster computing with working sets. In HotCloud\u201910. 
10--16."},{"key":"e_1_2_1_41_1","volume-title":"Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201908)","volume":"8","author":"Zaharia Matei","year":"2008","unstructured":"Matei Zaharia , Andy Konwinski , Anthony D. Joseph , Randy H. Katz , and Ion Stoica . 2008 . Improving mapreduce performance in heterogeneous environments . In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201908) , Vol. 8 . 7--20. Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy H. Katz, and Ion Stoica. 2008. Improving mapreduce performance in heterogeneous environments. In Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI\u201908), Vol. 8. 7--20."},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.14778\/2733004.2733012"}],"container-title":["ACM Transactions on Internet Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3093896","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3093896","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T03:30:16Z","timestamp":1750217416000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3093896"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,1,20]]},"references-count":42,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2018,5,31]]}},"alternative-id":["10.1145\/3093896"],"URL":"https:\/\/doi.org\/10.1145\/3093896","relation":{},"ISSN":["1533-5399","1557-6051"],"issn-type":[{"type":"print","value":"1533-5399"},{"type":"electronic","value":"1557-6051"}],"subject":[],"published":{"date-parts":[[2018,1,20]]},"assertion":[{"value":"2016-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","la
bel":"Publication History"}},{"value":"2017-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2018-01-20","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}