{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:20:44Z","timestamp":1760242844205,"version":"build-2065373602"},"reference-count":40,"publisher":"MDPI AG","issue":"9","license":[{"start":{"date-parts":[[2016,8,30]],"date-time":"2016-08-30T00:00:00Z","timestamp":1472515200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"NSFC","doi-asserted-by":"publisher","award":["61300238, 61300237, 61232016, 1405254, 61373133"],"award-info":[{"award-number":["61300238, 61300237, 61232016, 1405254, 61373133"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Marie Curie Fellowship","award":["701697-CAR-MSCA-IFEF-ST"],"award-info":[{"award-number":["701697-CAR-MSCA-IFEF-ST"]}]},{"name":"the 2014 Project of six personnel in Jiangsu Province","award":["2014-WLW-013"],"award-info":[{"award-number":["2014-WLW-013"]}]},{"name":"the 2015 Project of six personnel in Jiangsu Province","award":["R2015L06"],"award-info":[{"award-number":["R2015L06"]}]},{"name":"Basic Research Programs (Natural Science Foundation) of Jiangsu Province","award":["BK20131004"],"award-info":[{"award-number":["BK20131004"]}]},{"name":"the PAPD fund"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Sensors"],"abstract":"<jats:p>Distributed Computing has achieved tremendous development since cloud computing was proposed in 2006, and played a vital role promoting rapid growth of data collecting and analysis models, e.g., Internet of things, Cyber-Physical Systems, Big Data Analytics, etc. Hadoop has become a data convergence platform for sensor networks. As one of the core components, MapReduce facilitates allocating, processing and mining of collected large-scale data, where speculative execution strategies help solve straggler problems. 
However, there is still no efficient solution for accurate estimation on execution time of run-time tasks, which can affect task allocation and distribution in MapReduce. In this paper, task execution data have been collected and employed for the estimation. A two-phase regression (TPR) method is proposed to predict the finishing time of each task accurately. Detailed data of each task have drawn interests with detailed analysis report being made. According to the results, the prediction accuracy of concurrent tasks\u2019 execution time can be improved, in particular for some regular jobs.<\/jats:p>","DOI":"10.3390\/s16091386","type":"journal-article","created":{"date-parts":[[2016,8,30]],"date-time":"2016-08-30T09:56:03Z","timestamp":1472550963000},"page":"1386","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":15,"title":["Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment"],"prefix":"10.3390","volume":"16","author":[{"given":"Qi","family":"Liu","sequence":"first","affiliation":[{"name":"Jiangsu Collaborative Innovation Centre of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science & Technology, Nanjing 210044, China"},{"name":"School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Weidong","family":"Cai","sequence":"additional","affiliation":[{"name":"School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Dandan","family":"Jin","sequence":"additional","affiliation":[{"name":"School of Computer & Software, Nanjing University of Information Science & Technology, Nanjing 210044, 
China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Jian","family":"Shen","sequence":"additional","affiliation":[{"name":"Jiangsu Engineering Centre of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing 210044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Zhangjie","family":"Fu","sequence":"additional","affiliation":[{"name":"Jiangsu Engineering Centre of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing 210044, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Xiaodong","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Computing, Edinburgh Napier University, 10 Colinton Road, Edinburgh EH10 5DT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nigel","family":"Linge","sequence":"additional","affiliation":[{"name":"Computer Networking and Telecommunications Research Centre, University of Salford, Salford, Greater Manchester M5 4WT, UK"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2016,8,30]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1145\/1327452.1327492","article-title":"MapReduce: Simplified data processing on large clusters","volume":"51","author":"Dean","year":"2008","journal-title":"Commun. ACM"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"190","DOI":"10.1587\/transcom.E98.B.190","article-title":"Achieving efficient cloud search services: Multi-keyword ranked search over encrypted cloud data supporting parallel computing","volume":"E98B","author":"Fu","year":"2015","journal-title":"IEICE Trans. Commun."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Toshniwal, A., Taneja, S., Shukla, A., Ramasamy, K., Patel, J.M., Kulkarni, S., and Bhagat, N. (2014, January 22\u201327). Storm@ twitter. 
Proceedings of the 2014 ACM International Conference on Management of Data(SIGMOD), Snowbird, UT, USA.","DOI":"10.1145\/2588555.2595641"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Stoica, I. (2014, January 16\u201320). Conquering big data with spark and BDAS. Proceedings of the 2014 ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), New York, NY, USA.","DOI":"10.1145\/2591971.2611389"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Rodr\u00edguez-Mazahua, L., S\u00e1nchez-Cervantes, J.L., Cervantes, J., Garc\u00eda-Alcaraz, J.L., and Alor-Hern\u00e1ndez, G. (2015). A general perspective of big data: Applications, tools, challenges and trends. J. Supercomput.","DOI":"10.1007\/s11227-015-1501-1"},{"key":"ref_6","first-page":"48","article-title":"On big data stream processing","volume":"3","author":"Namiot","year":"2015","journal-title":"Int. J. Open Inf. Technol."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Zaharia, M., Das, T., Li, H., Hunter, T., Shenker, S., and Stoica, I. (2013, January 3\u20136). Discretized streams: Fault-tolerant streaming computation at scale. Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, Farmington, PA, USA.","DOI":"10.1145\/2517349.2522737"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Armbrust, M., Xin, R.S., Lian, C., Huai, Y., Liu, D., Bradley, J.K., Meng, X., Kaftan, T., Franklin, M., and Zaharia, M. (June, January 31). Spark SQL: Relational data processing in Spark. Proceedings of the 2015 ACM International Conference on Management of Data (SIGMOD), Melbourne, Australia.","DOI":"10.1145\/2723372.2742797"},{"key":"ref_9","first-page":"1","article-title":"MLlib: Machine learning in apache spark","volume":"17","author":"Meng","year":"2016","journal-title":"J. Mach. Learn. Res."},{"key":"ref_10","unstructured":"Gonzalez, J.E., Xin, R.S., Dave, A., Crankshaw, D., Franklin, M.J., and Stoica, I. 
(2014, January 6\u20138). Graphx: Graph processing in a distributed dataflow framework. Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), Broomfield, Denver, CO, USA."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Gu, L., and Li, H. (2013, January 13\u201315). Memory or time: Performance evaluation for iterative operation on Hadoop and Spark. Proceedings of the 2013 IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing (HPCC and EUC 2013), Zhangjiajie, China.","DOI":"10.1109\/HPCC.and.EUC.2013.106"},{"key":"ref_12","first-page":"637","article-title":"Cloud Hadoop Map Reduce for Remote Sensing Image Analysis","volume":"3","author":"Almeer","year":"2012","journal-title":"J. Emerg. Trends Comput. Inf. Sci."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"151564","DOI":"10.1155\/2014\/151564","article-title":"Design and experiment analysis of a Hadoop-based video transcoding system for next-generation wireless sensor networks","volume":"2014","author":"Xu","year":"2014","journal-title":"Int. J. Distrib. Sensor Netw."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"601868","DOI":"10.1155\/2014\/601868","article-title":"Hadoop-based distributed sensor node management system","volume":"2014","author":"Jung","year":"2014","journal-title":"Int. J. Distrib. Sensor Netw."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"22001","DOI":"10.3390\/s141122001","article-title":"Behavior life style analysis for mobile sensory data in cloud computing through MapReduce","volume":"14","author":"Hussain","year":"2014","journal-title":"Sensors"},{"key":"ref_16","first-page":"22","article-title":"Anomaly detection using Hadoop and MapReduce technique in cloud with sensor data","volume":"125","author":"Alghussein","year":"2015","journal-title":"Int. J. Comput. 
Appl."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Ibrahim, S., Jin, H., Lu, L., He, B., Antoniu, G., and Wu, S. (2012, January 13\u201316). Maestro: Replica-aware map scheduling for MapReduce. Proceedings of the 12th IEEE\/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Ottawa, ON, Canada.","DOI":"10.1109\/CCGrid.2012.122"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"314","DOI":"10.1109\/TDSC.2013.14","article-title":"Orchestrating an ensemble of MapReduce jobs for minimizing their makespan","volume":"10","author":"Verma","year":"2013","journal-title":"IEEE Trans. Dependable Secure Comput."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"333","DOI":"10.1109\/TCC.2014.2329299","article-title":"Dynamic MR: A dynamic slot allocation optimization framework for MapReduce clusters","volume":"2","author":"Tang","year":"2014","journal-title":"IEEE Trans. Cloud Comput."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1109\/TCC.2014.2338291","article-title":"LsPS: A job size-based scheduler for efficient task assignments in Hadoop","volume":"3","author":"Yao","year":"2015","journal-title":"IEEE Trans. Cloud Comput."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"2300","DOI":"10.1109\/TPDS.2014.2345068","article-title":"Mammoth: Gearing Hadoop towards memory-intensive MapReduce applications","volume":"26","author":"Shi","year":"2015","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1109\/TCC.2014.2379096","article-title":"PRISM: Fine-grained resource-aware scheduling for MapReduce","volume":"3","author":"Zhang","year":"2015","journal-title":"IEEE Trans. 
Cloud Comput."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"2720","DOI":"10.1109\/TPDS.2014.2358556","article-title":"Energy-aware scheduling of MapReduce jobs for big data applications","volume":"26","author":"Mashayekhy","year":"2016","journal-title":"IEEE Trans. Parallel Distrib. Syst."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1109\/TSC.2015.2426186","article-title":"Dynamic job ordering and slot configurations for MapReduce workloads","volume":"9","author":"Tang","year":"2016","journal-title":"IEEE Trans. Serv. Comput."},{"key":"ref_25","unstructured":"Zaharia, M., Konwinski, A., Joseph, A.D., Katz, R.H., and Stoica, I. (2008, January 8\u201310). Improving MapReduce performance in heterogeneous environments. Proceedings of the 8th USENIX Symposium on Operating Systems Design and Implementation (OSDI), San Diego, CA, USA."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"954","DOI":"10.1109\/TC.2013.15","article-title":"Improving MapReduce performance using smart speculative execution strategy","volume":"63","author":"Chen","year":"2014","journal-title":"IEEE Trans. Comput."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Wu, H., Li, K., Tang, Z., and Zhang, L. (2014, January 13\u201315). A heuristic speculative execution strategy in heterogeneous distributed environments. Proceedings of the 2014 Sixth International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), Beijing, China.","DOI":"10.1109\/PAAP.2014.29"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"166","DOI":"10.1016\/j.compeleceng.2015.06.013","article-title":"Novel heuristic speculative execution strategies in heterogeneous distributed environments","volume":"50","author":"Huang","year":"2016","journal-title":"Comput. 
Electrical Eng."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"203","DOI":"10.14257\/ijgdc.2016.9.2.18","article-title":"A smart strategy for speculative execution based on hardware Resource in a heterogeneous distributed environment","volume":"9","author":"Liu","year":"2016","journal-title":"Int. J. Grid Distrib. Comput."},{"key":"ref_30","unstructured":"Liu, Q., Cai, W., Shen, J., Fu, Z., and Linge, N. (February, January 31). A smart speculative execution strategy based on node classification for heterogeneous hadoop systems. Proceedings of the 18th International Conference on Advanced Communication Technology (ICACT), PyeongChang, Korea."},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Raju, R., Amudhavel, J., Pavithra, M., and Anuja, S. (2014, January 6\u20138). A heuristic fault tolerant MapReduce framework for minimizing makespan in hybrid cloud environment. Proceedings of the International Conference on Green Computing Communication and Electrical Engineering, Coimbatore, India.","DOI":"10.1109\/ICGCCEE.2014.6922462"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Li, Y., Yang, Q., Lai, S., and Li, B. (2015, January 10\u201312). A new speculative execution algorithm based on C4.5 decision tree for Hadoop. Proceedings of the International Conference of Young Computer Scientists, Engineers and Educators (ICYCSEE 2015), Harbin, China.","DOI":"10.1007\/978-3-662-46248-5_35"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"587","DOI":"10.1007\/s10723-015-9350-y","article-title":"Improving MapReduce performance with partial speculative execution","volume":"11","author":"Wang","year":"2015","journal-title":"J. Grid Comput."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1016\/j.jnca.2015.07.012","article-title":"Design adaptive task allocation scheduler to improve MapReduce performance in heterogeneous Clouds","volume":"57","author":"Yang","year":"2013","journal-title":"J. Netw. Comput. 
Appl."},{"key":"ref_35","unstructured":"Ananthanarayanan, G., Kandula, S., Greenberg, A., Stoica, I., Lu, Y., Saha, B., and Harris, E. (2010, January 4\u20136). Reining in the outliers in Map-Reduce clusters using Mantri. Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI), Vancouver, BC, Canada."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Xu, H., and Lau, W.C. (May, January 26). Optimization for Speculative Execution in a MapReduce-Like Cluster. Proceedings of the 2015 IEEE Conference on Computer Communications (INFOCOM), Hong Kong, China.","DOI":"10.1109\/INFOCOM.2015.7218480"},{"key":"ref_37","unstructured":"Xu, H., and Lau, W.C. (July, January 29). Task-cloning algorithms in a MapReduce cluster with competitive performance bounds. Proceedings of the IEEE 35th International Conference on Distributed Computing Systems (ICDCS), Columbus, OH, USA."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"431","DOI":"10.1137\/0111030","article-title":"An algorithm for least-squares estimation of nonlinear parameters","volume":"11","author":"Marquardt","year":"1963","journal-title":"J. Soc. Ind. Appl. Math."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1145\/2189750.2150984","article-title":"Tarazu: Optimizing MapReduce on heterogeneous clusters","volume":"40","author":"Ahmad","year":"2012","journal-title":"ACM SIGARCH Comput. Archit. 
News"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1109\/CC.2014.6911091","article-title":"Improving MapReduce performance by balancing skewed loads","volume":"11","author":"Fan","year":"2014","journal-title":"China Commun."}],"container-title":["Sensors"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1424-8220\/16\/9\/1386\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T19:29:35Z","timestamp":1760210975000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1424-8220\/16\/9\/1386"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,8,30]]},"references-count":40,"journal-issue":{"issue":"9","published-online":{"date-parts":[[2016,9]]}},"alternative-id":["s16091386"],"URL":"https:\/\/doi.org\/10.3390\/s16091386","relation":{},"ISSN":["1424-8220"],"issn-type":[{"type":"electronic","value":"1424-8220"}],"subject":[],"published":{"date-parts":[[2016,8,30]]}}}