{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,17]],"date-time":"2025-05-17T05:14:15Z","timestamp":1747458855330,"version":"3.38.0"},"reference-count":41,"publisher":"SAGE Publications","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IDA"],"published-print":{"date-parts":[[2023,10,6]]},"abstract":"<jats:p>When the concentration focuses on data mining, frequent itemset mining (FIM) and high-utility itemset mining (HUIM) are commonly addressed and researched. Many related algorithms are proposed to reveal the general relationship between utility, frequency, and items in transaction databases. Although these algorithms can mine FIMs or HUIMs quickly, these algorithms merely take into account frequency or utility as a unilateral criterion for itemsets but the other factors (e.g., distance, price) could be also valuable for decision-making. A new skyline framework has been presented to mine frequent high utility patterns (SFUPs) to better support user decision-making. Several new algorithms have been proposed one after another. However, the Internet of Things (IoT), mobile Internet, and traditional Internet are generating massive amounts of data every day, and these cutting-edge standalone algorithms can not satisfy the new challenge of finding interesting patterns from this data. Big Data uses a distributed architecture in the form of cloud computing to filter and process this data to extract useful information. This paper proposes a novel parallel algorithm on Hadoop as a three-stage iterative algorithm based on MapReduce. MapReduce is used to divide the mining tasks of the whole large data set into multiple independent sub-tasks to find frequent and high utility patterns in parallel. 
Numerous experiments were done in this paper, and from the results, the algorithm can handle large datasets and show good performance on Hadoop clusters.<\/jats:p>","DOI":"10.3233\/ida-220756","type":"journal-article","created":{"date-parts":[[2023,8,13]],"date-time":"2023-08-13T19:06:44Z","timestamp":1691953604000},"page":"1359-1377","source":"Crossref","is-referenced-by-count":2,"title":["Mining skyline frequent-utility patterns from big data environment based on MapReduce framework"],"prefix":"10.1177","volume":"27","author":[{"given":"Jimmy Ming-Tai","family":"Wu","sequence":"first","affiliation":[{"name":"Department of Information Management, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan"}]},{"given":"Ranran","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Information Management, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan"}]},{"given":"Mu-En","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of Information and Finance Management, National Taipei University of Technology, Taipei, Taiwan"}]},{"given":"Jerry Chun-Wei","family":"Lin","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Western Norway University of Applied Sciences, Bergen, Norway"}]}],"member":"179","reference":[{"key":"10.3233\/IDA-220756_ref1","doi-asserted-by":"crossref","first-page":"98","DOI":"10.1016\/j.is.2014.07.006","article-title":"The rise of \u201cbig data\u201d on cloud computing: Review and open research issues","author":"Hashem","year":"2015","journal-title":"Information Systems"},{"issue":"1","key":"10.3233\/IDA-220756_ref2","doi-asserted-by":"crossref","first-page":"13","DOI":"10.1080\/17538947.2016.1239771","article-title":"Big Data and cloud computing: innovation opportunities and challenges","author":"Yang","year":"2017","journal-title":"International Journal of Digital 
Earth"},{"key":"10.3233\/IDA-220756_ref3","first-page":"1","article-title":"Big data using cloud computing","author":"Purcell","year":"2014","journal-title":"Journal of Technology Research"},{"issue":"6","key":"10.3233\/IDA-220756_ref4","doi-asserted-by":"crossref","first-page":"914","DOI":"10.1109\/69.250074","article-title":"Database mining: A performance perspective","author":"Agrawal","year":"1993","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"10.3233\/IDA-220756_ref5","doi-asserted-by":"crossref","unstructured":"R. Agrawal, T. Imieli\u0144ski and A. Swami, Mining association rules between sets of items in large databases, in: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, 1993, pp.\u00a0207\u2013216.","DOI":"10.1145\/170035.170072"},{"issue":"2","key":"10.3233\/IDA-220756_ref6","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/335191.335372","article-title":"Mining frequent patterns without candidate generation","author":"Han","year":"2000","journal-title":"ACM Sigmod Record"},{"issue":"2","key":"10.3233\/IDA-220756_ref7","doi-asserted-by":"crossref","first-page":"175","DOI":"10.1145\/568271.223813","article-title":"An effective hash-based algorithm for mining association rules","author":"Park","year":"1995","journal-title":"Acm Sigmod Record"},{"issue":"4","key":"10.3233\/IDA-220756_ref8","doi-asserted-by":"crossref","first-page":"343","DOI":"10.1023\/A:1009773317876","article-title":"Parallel algorithms for discovery of association rules","author":"Zaki","year":"1997","journal-title":"Data Mining and Knowledge Discovery"},{"key":"10.3233\/IDA-220756_ref9","unstructured":"R. Agrawal, R. Srikant et al., Fast algorithms for mining association rules, in: Proc. 20th Int. Conf. Very Large Data Bases, VLDB, Vol. 1215, Citeseer, 1994, pp.\u00a0487\u2013499."},{"key":"10.3233\/IDA-220756_ref10","unstructured":"Z.P. Ogihara, M. Zaki, S. Parthasarathy, M. Ogihara and W. 
Li, New algorithms for fast discovery of association rules, in: In 3rd Intl. Conf. on Knowledge Discovery and Data Mining, Citeseer, 1997."},{"key":"10.3233\/IDA-220756_ref11","unstructured":"R. Chan, Q. Yang and Y.-D. Shen, Mining high utility itemsets, in: Third IEEE International Conference on Data Mining, IEEE Computer Society, 2003, pp.\u00a019\u201319."},{"key":"10.3233\/IDA-220756_ref12","doi-asserted-by":"crossref","unstructured":"Y. Liu, W.-k. Liao and A. Choudhary, A two-phase algorithm for fast discovery of high utility itemsets, in: Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, 2005, pp.\u00a0689\u2013695.","DOI":"10.1007\/11430919_79"},{"issue":"3","key":"10.3233\/IDA-220756_ref13","doi-asserted-by":"crossref","first-page":"603","DOI":"10.1016\/j.datak.2005.10.004","article-title":"Mining itemset utilities from transaction databases","author":"Yao","year":"2006","journal-title":"Data & Knowledge Engineering"},{"issue":"12","key":"10.3233\/IDA-220756_ref14","doi-asserted-by":"crossref","first-page":"1708","DOI":"10.1109\/TKDE.2009.46","article-title":"Efficient tree structures for high utility pattern mining in incremental databases","author":"Ahmed","year":"2009","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"issue":"6","key":"10.3233\/IDA-220756_ref15","doi-asserted-by":"crossref","first-page":"7419","DOI":"10.1016\/j.eswa.2010.12.082","article-title":"An effective tree structure for mining high utility itemsets","author":"Lin","year":"2011","journal-title":"Expert Systems with Applications"},{"key":"10.3233\/IDA-220756_ref16","doi-asserted-by":"crossref","unstructured":"M. Liu and J. Qu, Mining high utility itemsets without candidate generation, in: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012, pp.\u00a055\u201364.","DOI":"10.1145\/2396761.2396773"},{"key":"10.3233\/IDA-220756_ref17","doi-asserted-by":"crossref","unstructured":"V.S. 
Tseng, C.-W. Wu, B.-E. Shie and P.S. Yu, UP-Growth: an efficient algorithm for high utility itemset mining, in: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2010, pp.\u00a0253\u2013262.","DOI":"10.1145\/1835804.1835839"},{"key":"10.3233\/IDA-220756_ref18","doi-asserted-by":"crossref","unstructured":"P. Fournier-Viger, C.-W. Wu, S. Zida and V.S. Tseng, FHM: Faster high-utility itemset mining using estimated utility co-occurrence pruning, in: International Symposium on Methodologies for Intelligent Systems, Springer, 2014, pp.\u00a083\u201392.","DOI":"10.1007\/978-3-319-08326-1_9"},{"key":"10.3233\/IDA-220756_ref19","doi-asserted-by":"crossref","unstructured":"S. Zida, P. Fournier-Viger, J.C.-W. Lin, C.-W. Wu and V.S. Tseng, EFIM: a highly efficient algorithm for high-utility itemset mining, in: Mexican International Conference on Artificial Intelligence, Springer, 2015, pp.\u00a0530\u2013546.","DOI":"10.1007\/978-3-319-27060-9_44"},{"key":"10.3233\/IDA-220756_ref20","doi-asserted-by":"crossref","unstructured":"P. Fournier-Viger, C.-W. Wu and V.S. Tseng, Mining top-k association rules, in: Canadian Conference on Artificial Intelligence, Springer, 2012, pp.\u00a061\u201373.","DOI":"10.1007\/978-3-642-30353-1_6"},{"issue":"1","key":"10.3233\/IDA-220756_ref21","first-page":"54","article-title":"Efficient algorithms for mining top-k high utility itemsets","author":"Tseng","year":"2015","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"key":"10.3233\/IDA-220756_ref22","doi-asserted-by":"crossref","unstructured":"K. Wang, J.M.-T. Wu, B. Cui and J.C.-W. Lin, Revealing Top-k Dominant Individuals in Incomplete Data Based on Spark Environment, in: International Conference on Genetic and Evolutionary Computing, Springer, 2021, pp.\u00a0471\u2013480.","DOI":"10.1007\/978-981-16-8430-2_43"},{"key":"10.3233\/IDA-220756_ref23","doi-asserted-by":"crossref","unstructured":"V. Goyal, A. Sureka and D.
Patel, Efficient skyline itemsets mining, in: Proceedings of the Eighth International C* Conference on Computer Science & Software Engineering, 2015, pp.\u00a0119\u2013124.","DOI":"10.1145\/2790798.2790816"},{"key":"10.3233\/IDA-220756_ref24","doi-asserted-by":"crossref","unstructured":"J.C.-W. Lin, L. Yang, P. Fournier-Viger, S. Dawar, V. Goyal, A. Sureka and B. Vo, A more efficient algorithm to mine skyline frequent-utility patterns, in: International Conference on Genetic and Evolutionary Computing, Springer, 2016, pp.\u00a0127\u2013135.","DOI":"10.1007\/978-3-319-48490-7_16"},{"key":"10.3233\/IDA-220756_ref25","doi-asserted-by":"crossref","first-page":"229","DOI":"10.1016\/j.engappai.2018.10.010","article-title":"Mining of skyline patterns by considering both frequent and utility constraints","author":"Lin","year":"2019","journal-title":"Engineering Applications of Artificial Intelligence"},{"issue":"1","key":"10.3233\/IDA-220756_ref26","doi-asserted-by":"crossref","first-page":"107","DOI":"10.1145\/1327452.1327492","article-title":"MapReduce: simplified data processing on large clusters","author":"Dean","year":"2008","journal-title":"Communications of the ACM"},{"key":"10.3233\/IDA-220756_ref27","doi-asserted-by":"crossref","unstructured":"J. Liu, K. Wang and B.C. 
Fung, Direct discovery of high utility itemsets without candidate generation, in: 2012 IEEE 12th International Conference on Data Mining, IEEE, 2012, pp.\u00a0984\u2013989.","DOI":"10.1109\/ICDM.2012.20"},{"issue":"5","key":"10.3233\/IDA-220756_ref28","doi-asserted-by":"crossref","first-page":"1245","DOI":"10.1109\/TKDE.2015.2510012","article-title":"Mining high utility patterns in one phase without generating candidates","author":"Liu","year":"2015","journal-title":"IEEE Transactions on Knowledge and Data Engineering"},{"issue":"6","key":"10.3233\/IDA-220756_ref29","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3363571","article-title":"High-utility itemset mining with effective pruning strategies","author":"Wu","year":"2019","journal-title":"ACM Transactions on Knowledge Discovery from Data (TKDD)"},{"key":"10.3233\/IDA-220756_ref30","doi-asserted-by":"crossref","first-page":"66788","DOI":"10.1109\/ACCESS.2020.2982415","article-title":"Incrementally updating the discovered high average-utility patterns with the pre-large concept","author":"Wu","year":"2020","journal-title":"IEEE Access"},{"key":"10.3233\/IDA-220756_ref31","doi-asserted-by":"crossref","unstructured":"H. Yao, H.J. Hamilton and C.J. Butz, A foundational approach to mining itemset utilities from databases, in: Proceedings of the 2004 SIAM International Conference on Data Mining, SIAM, 2004, pp.\u00a0482\u2013486.","DOI":"10.1137\/1.9781611972740.51"},{"issue":"2","key":"10.3233\/IDA-220756_ref32","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1007\/s10115-016-0986-0","article-title":"EFIM: A fast and memory efficient algorithm for high-utility itemset mining","author":"Zida","year":"2017","journal-title":"Knowledge and Information Systems"},{"key":"10.3233\/IDA-220756_ref33","doi-asserted-by":"crossref","unstructured":"C.-W. Lin, T.-P. Hong and W.-H. 
Lu, Efficiently mining high average utility itemsets with a tree structure, in: Asian Conference on Intelligent Information and Database Systems, Springer, 2010, pp.\u00a0131\u2013139.","DOI":"10.1007\/978-3-642-12145-6_14"},{"issue":"5","key":"10.3233\/IDA-220756_ref34","doi-asserted-by":"crossref","first-page":"2371","DOI":"10.1016\/j.eswa.2014.11.001","article-title":"Pruning strategies for mining high utility itemsets","author":"Krishnamoorthy","year":"2015","journal-title":"Expert Systems with Applications"},{"issue":"2","key":"10.3233\/IDA-220756_ref35","doi-asserted-by":"crossref","first-page":"357","DOI":"10.1007\/s11704-016-6245-4","article-title":"CLS-Miner: Efficient and effective closed high-utility itemset mining","author":"Dam","year":"2019","journal-title":"Frontiers of Computer Science"},{"key":"10.3233\/IDA-220756_ref37","doi-asserted-by":"crossref","unstructured":"Y.C. Lin, C.-W. Wu and V.S. Tseng, Mining high utility itemsets in big data, in: Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, 2015, pp.\u00a0649\u2013661.","DOI":"10.1007\/978-3-319-18032-8_51"},{"issue":"1","key":"10.3233\/IDA-220756_ref38","doi-asserted-by":"crossref","first-page":"216","DOI":"10.1007\/s11036-020-01701-5","article-title":"Mining of High-Utility Patterns in Big IoT-based Databases","author":"Wu","year":"2021","journal-title":"Mobile Networks and Applications"},{"key":"10.3233\/IDA-220756_ref39","doi-asserted-by":"crossref","unstructured":"S.-J. Yen and Y.-S. 
Lee, Mining high utility quantitative association rules, in: International Conference on Data Warehousing and Knowledge Discovery, Springer, 2007, pp.\u00a0283\u2013292.","DOI":"10.1007\/978-3-540-74553-2_26"},{"issue":"4","key":"10.3233\/IDA-220756_ref41","first-page":"1","article-title":"The efficient mining of skyline patterns from a volunteer computing network","author":"Wu","year":"2021","journal-title":"ACM Transactions on Internet Technology (TOIT)"},{"key":"10.3233\/IDA-220756_ref42","doi-asserted-by":"crossref","unstructured":"W. Song, C. Zheng and P. Fournier-Viger, Mining Skyline Frequent-Utility Itemsets with Utility Filtering, in: Pacific Rim International Conference on Artificial Intelligence, Springer, 2021, pp.\u00a0411\u2013424.","DOI":"10.1007\/978-3-030-89188-6_31"},{"key":"10.3233\/IDA-220756_ref43","doi-asserted-by":"crossref","unstructured":"P. Fournier-Viger, J.C.-W. Lin, A. Gomariz, T. Gueniche, A. Soltani, Z. Deng and H.T. Lam, The SPMF open-source data mining library version 2, in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, 2016, pp.\u00a036\u201340.","DOI":"10.1007\/978-3-319-46131-1_8"}],"container-title":["Intelligent Data 
Analysis"],"original-title":[],"link":[{"URL":"https:\/\/content.iospress.com\/download?id=10.3233\/IDA-220756","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,11]],"date-time":"2025-03-11T07:54:26Z","timestamp":1741679666000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/full\/10.3233\/IDA-220756"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,6]]},"references-count":41,"journal-issue":{"issue":"5"},"URL":"https:\/\/doi.org\/10.3233\/ida-220756","relation":{},"ISSN":["1088-467X","1571-4128"],"issn-type":[{"type":"print","value":"1088-467X"},{"type":"electronic","value":"1571-4128"}],"subject":[],"published":{"date-parts":[[2023,10,6]]}}}