{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T16:11:45Z","timestamp":1760199105305,"version":"build-2065373602"},"reference-count":22,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2018,12,24]],"date-time":"2018-12-24T00:00:00Z","timestamp":1545609600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Algorithms"],"abstract":"<jats:p>MapReduce is a parallel computing model in which a large dataset is split into smaller parts and executed on multiple machines. Due to its simplicity, MapReduce has been widely used in various applications domains. MapReduce can significantly reduce the processing time of a large amount of data by dividing the dataset into smaller parts and processing them in parallel in multiple machines. However, when data are not uniformly distributed, we have the so called partitioning skew, where the allocation of tasks to machines becomes unbalanced, either by the distribution function splitting the dataset unevenly or because a part of the data is more complex and requires greater computational effort. To solve this problem, we propose an approach based on metaheuristics. For evaluating purposes, three metaheuristics were implemented: Simulated Annealing, Local Beam Search and Stochastic Beam Search. Our experimental evaluation, using a MapReduce implementation of the Bron-Kerbosch Clique Algorithm, shows that the proposed method can find good partitionings while better balancing data among machines.<\/jats:p>","DOI":"10.3390\/a12010005","type":"journal-article","created":{"date-parts":[[2018,12,24]],"date-time":"2018-12-24T10:37:49Z","timestamp":1545647869000},"page":"5","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["MAPSkew: Metaheuristic Approaches for Partitioning Skew in MapReduce"],"prefix":"10.3390","volume":"12","author":[{"given":"Matheus H. M.","family":"Pericini","sequence":"first","affiliation":[{"name":"Departamento de P\u00f3s Gradua\u00e7\u00e3o em Ci\u00eancias da Computa\u00e7\u00e3o (MDCC), Universidade Federal do Cear\u00e1, Fortaleza, CE 60020-181, Brazil"}]},{"given":"Lucas G. M.","family":"Leite","sequence":"additional","affiliation":[{"name":"Departamento de P\u00f3s Gradua\u00e7\u00e3o em Ci\u00eancias da Computa\u00e7\u00e3o (MDCC), Universidade Federal do Cear\u00e1, Fortaleza, CE 60020-181, Brazil"}]},{"given":"Francisco H.","family":"De Carvalho-Junior","sequence":"additional","affiliation":[{"name":"Departamento de P\u00f3s Gradua\u00e7\u00e3o em Ci\u00eancias da Computa\u00e7\u00e3o (MDCC), Universidade Federal do Cear\u00e1, Fortaleza, CE 60020-181, Brazil"}]},{"given":"Javam C.","family":"Machado","sequence":"additional","affiliation":[{"name":"Departamento de P\u00f3s Gradua\u00e7\u00e3o em Ci\u00eancias da Computa\u00e7\u00e3o (MDCC), Universidade Federal do Cear\u00e1, Fortaleza, CE 60020-181, Brazil"}]},{"given":"Cenez A.","family":"Rezende","sequence":"additional","affiliation":[{"name":"Centro de Ci\u00eancias e Tecnologia (CCT), Universidade de Fortaleza (UNIFOR), Fortaleza, CE 60020-181, Brazil"}]}],"member":"1968","published-online":{"date-parts":[[2018,12,24]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1145\/1327452.1327492","article-title":"MapReduce: Simplified data processing on large clusters","volume":"51","author":"Dean","year":"2008","journal-title":"Commun. ACM"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1007\/s10922-015-9362-8","article-title":"OPTIMA: On-Line Partitioning Skew Mitigation for MapReduce with Resource Adjustment","volume":"24","author":"Liu","year":"2016","journal-title":"J. Netw. Syst. Manag."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"575","DOI":"10.1145\/362342.362367","article-title":"Algorithm 457: Finding all cliques of an undirected graph","volume":"16","author":"Bron","year":"1973","journal-title":"Commun. ACM"},{"key":"ref_4","unstructured":"Gufler, B., Augsten, N., Reiser, A., and Kemper, A. (2011, January 7\u20139). Handing data skew in mapreduce. Proceedings of the 1st International Conference on Cloud Computing and Services Science, Noordwijkerhout, The Netherlands."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"409","DOI":"10.1007\/s12083-013-0213-7","article-title":"Handling partitioning skew in mapreduce using leen","volume":"6","author":"Ibrahim","year":"2013","journal-title":"Peer-to-Peer Netw. Appl."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Ramakrishnan, S.R., Swart, G., and Urmanov, A. (2012, January 14\u201317). Balancing reducer skew in MapReduce workloads using progressive sampling. Proceedings of the Third ACM Symposium on Cloud Computing, San Jose, CA, USA.","DOI":"10.1145\/2391229.2391245"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Yan, W., Xue, Y., and Malin, B. (2013, January 6\u20139). Scalable and robust key group size estimation for reducer load balancing in MapReduce. Proceedings of the 2013 IEEE International Conference on Big Data, Silicon Valley, CA, USA.","DOI":"10.1109\/BigData.2013.6691568"},{"key":"ref_8","unstructured":"Zacheilas, N., and Kalogeraki, V. (2014, January 18\u201320). Real-Time Scheduling of Skewed MapReduce Jobs in Heterogeneous Environments. Proceedings of the 11th International Conference on Autonomic Computing (ICAC), Philadelphia, PA, USA."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Kwon, Y., Balazinska, M., Howe, B., and Rolia, J. (2012, January 20\u201324). Skewtune: Mitigating skew in mapreduce applications. Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, Scottsdale, AZ, USA.","DOI":"10.1145\/2213836.2213840"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Kumaresan, V., Baskaran, R., and Dhavachelvan, P. (2017). AEGEUS++: An energy-aware online partition skew mitigation algorithm for mapreduce in cloud. Cluster Computing, Springer.","DOI":"10.1145\/2980258.2980461"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"224","DOI":"10.1002\/nem.1928","article-title":"ROUTE: Run-time robust reducer workload estimation for MapReduce","volume":"26","author":"Liu","year":"2016","journal-title":"Int. J. Netw. Manag."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Liu, Z., Zhang, Q., Zhani, M.F., Boutaba, R., Liu, Y., and Gong, Z. (2015, January 11\u201315). Dreams: Dynamic resource allocation for mapreduce with data skew. Proceedings of the 2015 IFIP\/IEEE International Symposium on Integrated Network Management (IM), Ottawa, ON, Canada.","DOI":"10.1109\/INM.2015.7140272"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"59","DOI":"10.1145\/1272998.1273005","article-title":"Dryad: Distributed data-parallel programs from sequential building blocks","volume":"Volume 41","author":"Isard","year":"2007","journal-title":"Proceedings of the ACM SIGOPS Operating Systems Review\u2014EuroSys\u201907 Conference Proceedings"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"He, B., Fang, W., Luo, Q., Govindaraju, N.K., and Wang, T. (2008, January 25\u201329). Mars: A MapReduce framework on graphics processors. Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, Toronto, ON, Canada.","DOI":"10.1145\/1454115.1454152"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Ranger, C., Raghuraman, R., Penmetsa, A., Bradski, G., and Kozyrakis, C. (2007, January 10\u201314). Evaluating mapreduce for multi-core and multiprocessor systems. Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture, Phoenix, AZ, USA.","DOI":"10.1109\/HPCA.2007.346181"},{"key":"ref_16","unstructured":"Kwon, Y., Balazinska, M., Howe, B., and Rolia, J. (2011, January 12\u201313). A study of skew in mapreduce applications. Proceedings of the Open Cirrus Summit, Atlanta, GA, USA."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Okcan, A., and Riedewald, M. (2011, January 12\u201316). Processing theta-joins using MapReduce. Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, Athens, Greece.","DOI":"10.1145\/1989323.1989423"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Ibrahim, S., Jin, H., Lu, L., Wu, S., He, B., and Qi, L. (December, January 30). Leen: Locality\/fairness-aware key partitioning for mapreduce in the cloud. Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom), Indianapolis, IN, USA.","DOI":"10.1109\/CloudCom.2010.25"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Atta, F., Viglas, S.D., and Niazi, S. (2011, January 22\u201324). SAND Join\u2014A skew handling join algorithm for Google\u2019s MapReduce framework. Proceedings of the 2011 IEEE 14th International Multitopic Conference (INMIC), Karachi, Pakistan.","DOI":"10.1109\/INMIC.2011.6151466"},{"key":"ref_20","unstructured":"Russell, S.J., Norvig, P., Canny, J.F., Malik, J.M., and Edwards, D.D. (2003). Artificial Intelligence: A Modern Approach, Prentice Hall."},{"key":"ref_21","unstructured":"(2018, December 11). Dropboxlink. Available online: https:\/\/www.dropbox.com\/sh\/bfq9g83pmbt68su\/AABtXoEm2bXGD9BgX98WGyR_a?dl=0."},{"key":"ref_22","unstructured":"(2018, December 11). Facebook Egonet. Available online: https:\/\/snap.stanford.edu\/data\/egonets-Facebook.html."}],"container-title":["Algorithms"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-4893\/12\/1\/5\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:35:52Z","timestamp":1760196952000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-4893\/12\/1\/5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,24]]},"references-count":22,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,1]]}},"alternative-id":["a12010005"],"URL":"https:\/\/doi.org\/10.3390\/a12010005","relation":{},"ISSN":["1999-4893"],"issn-type":[{"type":"electronic","value":"1999-4893"}],"subject":[],"published":{"date-parts":[[2018,12,24]]}}}