{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,14]],"date-time":"2025-05-14T02:38:32Z","timestamp":1747190312695,"version":"3.40.5"},"reference-count":61,"publisher":"Wiley","license":[{"start":{"date-parts":[[2020,12,4]],"date-time":"2020-12-04T00:00:00Z","timestamp":1607040000000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Yulin University-Industry Collaboration Project","award":["2019-75-3","2019-ZJ-7078"],"award-info":[{"award-number":["2019-75-3","2019-ZJ-7078"]}]},{"name":"Applied Basic Research Program Funded by Qinghai Province","award":["2019-75-3","2019-ZJ-7078"],"award-info":[{"award-number":["2019-75-3","2019-ZJ-7078"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Complexity"],"published-print":{"date-parts":[[2020,12,4]]},"abstract":"<jats:p>Multiway join queries incur high-cost I\/Os operations over large-scale data. Exploiting sharing join opportunities among multiple multiway joins could be beneficial to reduce query execution time and shuffled intermediate data. Although multiway join optimization has been carried out in MapReduce, different design principles (i.e., in-memory Big Data platforms, Flink) are not considered. To bridge the gap of not considering the optimization of Big Data platforms, an end-to-end multiway join over Flink, which is called Join-MOTH system (J-MOTH), is proposed to exploit sharing data granularity, sharing join granularity, and sharing implicit sorts within multiple join queries. For sharing data, our previous work, Multiquery Optimization using Tuple Size and Histogram (MOTH) system, has been introduced to consider the granularity of sharing data opportunities among multiple queries. 
For sharing sort, our previous work, Sort-Based Optimizer for Big Data Multiquery (SOOM), has been introduced to consider the implicit sorts among join queries. For sharing join, additional modules have been tailored to the J-MOTH optimizer to optimize sharing work by exploiting shared pipelined multiway join among multiple multiway join queries. The experimental evaluation has demonstrated that the J-MOTH system outperforms the naive and the state-of-the-art techniques by 44% for query execution time using TPC-H queries. Also, the proposed J-MOTH system introduces maximal intermediate data size reduction by 30% on average over Hadoop-like infrastructures.<\/jats:p>","DOI":"10.1155\/2020\/6617149","type":"journal-article","created":{"date-parts":[[2020,12,8]],"date-time":"2020-12-08T19:14:01Z","timestamp":1607454841000},"page":"1-25","source":"Crossref","is-referenced-by-count":3,"title":["Exploiting Sharing Join Opportunities in Big Data Multiquery Optimization with Flink"],"prefix":"10.1155","volume":"2020","author":[{"given":"Xiao-Yan","family":"Gao","sequence":"first","affiliation":[{"name":"School of Mathematics and Statistics, Yulin University, Yulin 719000, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8019-9069","authenticated-orcid":true,"given":"Radhya","family":"Sahal","sequence":"additional","affiliation":[{"name":"Faculty of Computers and Information, Cairo University, Cairo, Egypt"},{"name":"Faculty of Computer Science and Engineering, Hodeidah University, Hodeidah, Yemen"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6590-791X","authenticated-orcid":true,"given":"Gui-Xiu","family":"Chen","sequence":"additional","affiliation":[{"name":"School of Mathematics and Statistics, Qinghai Normal University, 810008 Xining, China"}]},{"given":"Mohammed H.","family":"Khafagy","sequence":"additional","affiliation":[{"name":"Faculty of Computers and Information, Fayoum University, Fayoum, Egypt"}]},{"given":"Fatma 
A.","family":"Omara","sequence":"additional","affiliation":[{"name":"Faculty of Computers and Information, Cairo University, Cairo, Egypt"}]}],"member":"311","reference":[{"first-page":"62","article-title":"Smart governance through bigdata: digital transformation of public agencies","author":"M. N. I. Sarker","key":"1"},{"first-page":"169","article-title":"Productivity improvement in agriculture sector using big data tools","author":"C. C. Sekhar","key":"2"},{"key":"3","doi-asserted-by":"publisher","DOI":"10.1109\/access.2017.2776400"},{"key":"4","article-title":"Intelligent equipment design assisted by Cognitive Internet of Things and industrial big data","volume":"32","author":"J. Wan","year":"2018","journal-title":"Neural Computing and Applications"},{"first-page":"196","article-title":"Big data collection and analysis framework research for public digital culture sharing service","author":"G. Zhang","key":"5"},{"key":"6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1109\/JIOT.2018.2855937","article-title":"Towards improving robotic-assisted gait training: can big data analysis help us?","volume":"6","author":"L. Carnevale","year":"2019","journal-title":"IEEE Internet of Things Journal"},{"key":"7","doi-asserted-by":"publisher","DOI":"10.1109\/jbhi.2017.2681126"},{"key":"8","doi-asserted-by":"crossref","DOI":"10.1201\/b16014","volume-title":"Big Data Computing","author":"R. Akerkar","year":"2013"},{"volume-title":"Large-Scale Data Analytics","year":"2016","author":"A. Gkoulalas-Divanis","key":"9"},{"key":"10","doi-asserted-by":"publisher","DOI":"10.1145\/1327452.1327492"},{"key":"11","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1561\/1900000036","article-title":"Massively parallel databases and MapReduce systems","volume":"5","author":"S. Babu","year":"2013","journal-title":"Foundations and Trends in Databases"},{"first-page":"28","article-title":"Evaluating new approaches of big data analytics frameworks","author":"N. 
Spangenberg","key":"12"},{"key":"13","doi-asserted-by":"publisher","DOI":"10.1016\/j.jocs.2016.07.002"},{"key":"14","doi-asserted-by":"publisher","DOI":"10.1145\/42201.42203"},{"first-page":"311","article-title":"Using common subexpressions to optimize multiple queries","author":"J. Park","key":"15"},{"key":"16","doi-asserted-by":"publisher","DOI":"10.1080\/00207727408920138"},{"key":"17","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2018.09.050"},{"key":"18","doi-asserted-by":"publisher","DOI":"10.14778\/1920841.1920906"},{"key":"19","doi-asserted-by":"publisher","DOI":"10.14778\/2168651.2168659"},{"key":"20","doi-asserted-by":"crossref","first-page":"416","DOI":"10.1109\/69.506709","article-title":"Optimization of parallel execution for multi-join queries","volume":"8","author":"C. Ming-Syan","year":"1996","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"21","doi-asserted-by":"publisher","DOI":"10.1016\/j.jocs.2017.05.023"},{"key":"22","doi-asserted-by":"publisher","DOI":"10.1089\/big.2019.0023"},{"key":"23","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2015.10.041"},{"key":"24","first-page":"111","article-title":"JOUM: an indexing methodology for improving join in hive star schema","volume":"6","author":"H. S. A. Azez","year":"2015","journal-title":"International Journal of Scientific & Engineering Research"},{"key":"25","doi-asserted-by":"publisher","DOI":"10.14778\/2350229.2350238"},{"key":"26","doi-asserted-by":"publisher","DOI":"10.1007\/s10723-018-9431-9"},{"key":"27","doi-asserted-by":"publisher","DOI":"10.14778\/2536360.2536364"},{"key":"28","doi-asserted-by":"publisher","DOI":"10.1016\/j.jmsy.2019.11.004"},{"article-title":"On evaluating the impact of changes in IoT data streams rate over query window configurations","author":"R. 
Sahal","key":"29","doi-asserted-by":"crossref","DOI":"10.1145\/3328905.3332509"},{"key":"30","doi-asserted-by":"publisher","DOI":"10.1504\/ijwet.2018.092401"},{"key":"31","doi-asserted-by":"publisher","DOI":"10.14778\/2732232.2732234"},{"key":"32","doi-asserted-by":"publisher","DOI":"10.14257\/ijgdc.2016.9.5.20"},{"key":"33","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2010.04.006"},{"first-page":"851","article-title":"Opportunistic physical design for big data analytics","author":"J. LeFevre","key":"34"},{"first-page":"1591","article-title":"MISO: souping up big data query processing with a multistore system","author":"J. LeFevre","key":"35"},{"key":"36","doi-asserted-by":"publisher","DOI":"10.14778\/3192965.3192971"},{"first-page":"164","article-title":"MapReduce join strategies for key-value storage","author":"D. Van Hieu","key":"37"},{"key":"38","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2014.11.012"},{"first-page":"12","article-title":"Query optimization for massively parallel data processing","author":"S. Wu","key":"39"},{"key":"40","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1007\/978-3-319-32055-7_27","article-title":"Join query processing in data quality management","volume-title":"Database Systems for Advanced Applications","author":"M. Yue","year":"2016"},{"key":"41","unstructured":"Abdel AzezH. S.KhafagyM. H.OmaraF. A.Optimizing join in HIVE star schema using key\/facts indexing2017112IETE Technical Report"},{"key":"42","doi-asserted-by":"publisher","DOI":"10.14778\/3151106.3151110"},{"first-page":"80","article-title":"JOMR: multi-join optimizer technique to enhance map-reduce job","author":"S. S. Mina Shanoda","key":"43"},{"first-page":"355","article-title":"Hash semi cascade join for joining multi-way map reduce","author":"M. H. Mohamed","key":"44"},{"first-page":"383","article-title":"QPipe: a simultaneously pipelined relational query engine","author":"S. 
Harizopoulos","key":"45"},{"key":"46","doi-asserted-by":"publisher","DOI":"10.1016\/s0022-0000(03)00031-x"},{"key":"47","doi-asserted-by":"publisher","DOI":"10.1145\/2560796"},{"volume-title":"In-Memory Caching For Multi-Query Optimization Of Data-Intensive Scalable Computing Workloads","year":"2019","author":"P. Michiardi","key":"48"},{"key":"49","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2018.12.031"},{"key":"50","doi-asserted-by":"publisher","DOI":"10.1109\/access.2019.2891285"},{"first-page":"80","article-title":"HOME: HiveQL optimization in multi-session environment","author":"M. N. Lu","key":"51"},{"key":"52","doi-asserted-by":"publisher","DOI":"10.1186\/s13677-014-0012-6"},{"first-page":"2215","article-title":"Reuse-based optimization for Pig Latin","author":"J. Camacho-Rodr\u00edguez","key":"53"},{"key":"54","doi-asserted-by":"publisher","DOI":"10.14778\/2994509.2994519"},{"first-page":"299","article-title":"Wide table layout optimization based on column ordering and duplication","author":"H. Bian","key":"55"},{"key":"56","doi-asserted-by":"publisher","DOI":"10.14778\/3236187.3236215"},{"first-page":"63","article-title":"From theory to practice: efficient join query evaluation in a parallel database system","author":"S. Chu","key":"57"},{"key":"58"},{"key":"59","first-page":"20","volume-title":"MapReduce Online","author":"T. Condie","year":"2010"},{"first-page":"681","article-title":"Estimating the progress of MapReduce pipelines","author":"K. 
Morton","key":"60"},{"key":"61"}],"container-title":["Complexity"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2020\/6617149.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2020\/6617149.xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/downloads.hindawi.com\/journals\/complexity\/2020\/6617149.pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2020,12,8]],"date-time":"2020-12-08T19:14:21Z","timestamp":1607454861000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.hindawi.com\/journals\/complexity\/2020\/6617149\/"}},"subtitle":[],"editor":[{"given":"Ahmed Mostafa","family":"Khalil","sequence":"additional","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2020,12,4]]},"references-count":61,"alternative-id":["6617149","6617149"],"URL":"https:\/\/doi.org\/10.1155\/2020\/6617149","relation":{},"ISSN":["1099-0526","1076-2787"],"issn-type":[{"type":"electronic","value":"1099-0526"},{"type":"print","value":"1076-2787"}],"subject":[],"published":{"date-parts":[[2020,12,4]]}}}