{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T04:02:05Z","timestamp":1760241725990,"version":"build-2065373602"},"reference-count":30,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2018,9,3]],"date-time":"2018-09-03T00:00:00Z","timestamp":1535932800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>In large organizations, it is often required to collect data from the different geographic branches spread over different locations. Extensive amounts of data may be gathered at the centralized location in order to generate interesting patterns via mono-mining the amassed database. However, it is feasible to mine the useful patterns at the data source itself and forward only these patterns to the centralized company, rather than the entire original database. These patterns also exist in huge numbers, and different sources calculate different utility values for each pattern. This paper proposes a weighted model for aggregating the high-utility patterns from different data sources. The procedure of pattern selection was also proposed to efficiently extract high-utility patterns in our weighted model by discarding low-utility patterns. Meanwhile, the synthesizing model yielded high-utility patterns, unlike association rule mining, in which frequent itemsets are generated by considering each item with equal utility, which is not true in real life applications such as sales transactions. Extensive experiments performed on the datasets with varied characteristics show that the proposed algorithm will be effective for mining very sparse and sparse databases with a huge number of transactions. 
Our proposed model also outperforms various state-of-the-art distributed models of mining in terms of running time.<\/jats:p>","DOI":"10.3390\/data3030032","type":"journal-article","created":{"date-parts":[[2018,9,3]],"date-time":"2018-09-03T10:50:51Z","timestamp":1535971851000},"page":"32","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Synthesizing High-Utility Patterns from Different Data Sources"],"prefix":"10.3390","volume":"3","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7965-4001","authenticated-orcid":false,"given":"Abhinav","family":"Muley","sequence":"first","affiliation":[{"name":"Department of Computer Engineering, St. Vincent Pallotti College of Engineering & Technology, Nagpur 441108, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Manish","family":"Gudadhe","sequence":"additional","affiliation":[{"name":"Department of Computer Engineering, St. Vincent Pallotti College of Engineering & Technology, Nagpur 441108, India"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2018,9,3]]},"reference":[{"key":"ref_1","unstructured":"Pujari, A. (2013). Data Mining Techniques, Universities Press (India) Private Limited."},{"key":"ref_2","unstructured":"Marr, B. (2018, August 21). Really Big Data at Walmart Real Time Insights from Their 40-Petabyte Data Cloud. Available online: www.forbes.com\/sites\/bernardmarr\/2017\/01\/23\/really-big-data-at-walmart-real-time-insights-from-their-40-petabyte-data-cloud\/."},{"key":"ref_3","unstructured":"(2018, August 23). Taste of Efficiency: How Swiggy Is Disrupting Food Delivery with ML, AI. 
Available online: https:\/\/www.financialexpress.com\/industry\/taste-of-efficiency-how-swiggy-is-disrupting-food-delivery-with-ml-ai\/1288840\/."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"962","DOI":"10.1109\/69.553164","article-title":"Parallel mining of association rules","volume":"8","author":"Agrawal","year":"1996","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_5","unstructured":"Chattratichat, J., Darlington, J., Ghanem, M., Guo, Y., H\u00fcning, H.F., K\u00f6hler, M., and Yang, D. (1997, January 14\u201317). Large Scale Data Mining: Challenges and Responses. Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD-97), Newport Beach, CA, USA."},{"key":"ref_6","unstructured":"Cheung, D.W., Han, J., Ng, V.T., and Wong, C.Y. (March, January 26). Maintenance of discovered association rules in large databases: An incremental updating technique. Proceedings of the Twelfth International Conference on Data Engineering, New Orleans, LA, USA."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/PL00011656","article-title":"Parallel data mining for association rules on shared-memory systems","volume":"3","author":"Parthasarathy","year":"2001","journal-title":"Knowl. Inf. Syst."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Shintani, T., and Kitsuregawa, M. (1998, January 1\u20134). Parallel mining algorithms for generalized association rules with classification hierarchy. Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data, Seattle, WA, USA.","DOI":"10.1145\/276304.276308"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1109\/TSMC.2015.2437327","article-title":"Fidoop: Parallel mining of frequent itemsets using mapreduce","volume":"46","author":"Xun","year":"2016","journal-title":"IEEE Trans. Syst. Man Cybern. 
Syst."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"1493","DOI":"10.1007\/s10586-015-0477-1","article-title":"A distributed frequent itemset mining algorithm using Spark for Big Data analytics","volume":"18","author":"Zhang","year":"2015","journal-title":"Clust. Comput."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"5247","DOI":"10.1109\/ACCESS.2017.2689040","article-title":"Big IoT data analytics: architecture, opportunities, and open research challenges","volume":"5","author":"Marjani","year":"2017","journal-title":"IEEE Access"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1016\/j.patrec.2018.01.013","article-title":"Review on mining data from multiple data sources","volume":"109","author":"Wang","year":"2018","journal-title":"Pattern Recognit. Lett."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Adhikari, A., and Adhikari, J. (2014). Mining Patterns of Select Items in Different Data Sources, Springer.","DOI":"10.1007\/978-3-319-13212-9_12"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s13042-018-0791-z","article-title":"Synthesizing decision rules from multiple information sources: A neighborhood granulation viewpoint","volume":"9","author":"Lin","year":"2018","journal-title":"Int. J. Mach. Learn. Cybern."},{"key":"ref_15","first-page":"5","article-title":"Multi-database mining","volume":"2","author":"Zhang","year":"2003","journal-title":"IEEE Comput. Intell. Bull."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"410","DOI":"10.1016\/j.ins.2016.04.009","article-title":"A novel approach to information fusion in multi-source datasets: A granular computing viewpoint","volume":"378","author":"Xu","year":"2017","journal-title":"Inf. 
Sci."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"353","DOI":"10.1109\/TKDE.2003.1185839","article-title":"Synthesizing high-frequency rules from different data sources","volume":"15","author":"Wu","year":"2003","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"313","DOI":"10.1007\/s10115-008-0126-6","article-title":"Modified algorithms for synthesizing high-frequency rules from different data sources","volume":"17","author":"Ramkumar","year":"2008","journal-title":"Knowl. Inf. Syst."},{"key":"ref_19","unstructured":"Savasere, A., Omiecinski, E.R., and Navathe, S.B. (1995, January 11\u201315). An efficient algorithm for mining association rules in large databases. Proceedings of the 21th International Conference on Very Large Data Bases, San Francisco, CA, USA."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"595","DOI":"10.1007\/s10115-016-0986-0","article-title":"EFIM: a fast and memory efficient algorithm for high-utility itemset mining","volume":"51","author":"Zida","year":"2017","journal-title":"Knowl. Inf. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Liu, Y., Liao, W.K., and Choudhary, A. (2005, January 21). A fast high-utility itemsets mining algorithm. Proceedings of the 1st International Workshop on Utility-Based Data Mining, Chicago, IL, USA.","DOI":"10.1145\/1089827.1089839"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Agrawal, R., Imieli\u0144ski, T., and Swami, A. (1993, January 25\u201328). Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, DC, USA.","DOI":"10.1145\/170035.170072"},{"key":"ref_23","unstructured":"Yao, H., Hamilton, H.J., and Geng, L. (2006, January 20\u201323). A unified framework for utility-based measures for mining itemsets. 
Proceedings of the ACM SIGKDD 2nd Workshop on Utility-Based Data Mining, Philadelphia, PA, USA."},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"1708","DOI":"10.1109\/TKDE.2009.46","article-title":"Efficient tree structures for high-utility pattern mining in incremental databases","volume":"21","author":"Ahmed","year":"2009","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"30","DOI":"10.1007\/978-3-319-14717-8_3","article-title":"Novel Concise Representations of High-Utility Itemsets Using Generator Patterns","volume":"Volume 8933","author":"Luo","year":"2014","journal-title":"Advanced Data Mining and Applications"},{"key":"ref_26","unstructured":"Good, I.J. (2007). Probability and the Weighting of Evidence, Charles Griffin."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Fournier-Viger, P., Lin, C.W., Gomariz, A., Gueniche, T., Soltani, A., Deng, Z., and Lam, H.T. (2016, January 19\u201323). The SPMF Open-Source Data Mining Library Version 2. Proceedings of the 19th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2016) Part III, Riva del Garda, Italy.","DOI":"10.1007\/978-3-319-46131-1_8"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"1245","DOI":"10.1109\/TKDE.2015.2510012","article-title":"Mining high-utility patterns in one phase without generating candidates","volume":"28","author":"Liu","year":"2016","journal-title":"IEEE Trans. Knowl. 
Data Eng."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.bdr.2016.07.001","article-title":"Approximate parallel high-utility itemset mining","volume":"6","author":"Chen","year":"2016","journal-title":"Big Data Res."},{"key":"ref_30","first-page":"251","article-title":"Parallel Method for Mining High-Utility Itemsets from Vertically Partitioned Distributed Databases","volume":"Volume 5711","author":"Howlett","year":"2009","journal-title":"Knowledge-Based and Intelligent Information and Engineering Systems. KES 2009. Lecture Notes in Computer Science"}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/3\/3\/32\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T15:18:33Z","timestamp":1760195913000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/3\/3\/32"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,9,3]]},"references-count":30,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2018,9]]}},"alternative-id":["data3030032"],"URL":"https:\/\/doi.org\/10.3390\/data3030032","relation":{},"ISSN":["2306-5729"],"issn-type":[{"type":"electronic","value":"2306-5729"}],"subject":[],"published":{"date-parts":[[2018,9,3]]}}}