{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,4]],"date-time":"2026-06-04T15:46:56Z","timestamp":1780588016502,"version":"3.54.1"},"reference-count":34,"publisher":"MDPI AG","issue":"21","license":[{"start":{"date-parts":[[2022,10,28]],"date-time":"2022-10-28T00:00:00Z","timestamp":1666915200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100003141","name":"National Council of Science and Technology (CONACYT)","doi-asserted-by":"publisher","award":["A1-S-51808"],"award-info":[{"award-number":["A1-S-51808"]}],"id":[{"id":"10.13039\/501100003141","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Applied Sciences"],"abstract":"<jats:p>Data warehousing gives frameworks and means for enterprise administrators to methodically prepare, comprehend, and utilize the data to improve strategic decision-making skills. One of the principal challenges to data warehouse designers is fragmentation. Currently, several fragmentation approaches for data warehouses have been developed since this technique can decrease the OLAP (online analytical processing) query response time and it provides considerable benefits in table loading and maintenance tasks. In this paper, a horizontal fragmentation method, called FTree, that uses decision trees to fragment data warehouses is presented to take advantage of the effectiveness that this technique provides in classification. FTree determines the OLAP queries with major relevance, evaluates the predicates found in the workload, and according to this, builds the decision tree to select the horizontal fragmentation scheme. To verify that the design is correct, the SSB (star schema benchmark) was used in the first instance; later, a tourist data warehouse was built, and the fragmentation method was tested on it. 
The results of the experiments proved the efficacy of the method.<\/jats:p>","DOI":"10.3390\/app122110942","type":"journal-article","created":{"date-parts":[[2022,10,29]],"date-time":"2022-10-29T23:45:00Z","timestamp":1667087100000},"page":"10942","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Decision-Tree-Based Horizontal Fragmentation Method for Data Warehouses"],"prefix":"10.3390","volume":"12","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8227-9476","authenticated-orcid":false,"given":"Nidia","family":"Rodr\u00edguez-Mazahua","sequence":"first","affiliation":[{"name":"Tecnol\u00f3gico Nacional de M\u00e9xico\/I. T., Av. Oriente 9 852, Col. Emiliano Zapata, Orizaba C.P. 94320, Veracruz, Mexico"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9861-3993","authenticated-orcid":false,"given":"Lisbeth","family":"Rodr\u00edguez-Mazahua","sequence":"additional","affiliation":[{"name":"Tecnol\u00f3gico Nacional de M\u00e9xico\/I. T., Av. Oriente 9 852, Col. Emiliano Zapata, Orizaba C.P. 94320, Veracruz, Mexico"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5254-0939","authenticated-orcid":false,"given":"Asdr\u00fabal","family":"L\u00f3pez-Chau","sequence":"additional","affiliation":[{"name":"Universidad Aut\u00f3noma del Estado de M\u00e9xico, Centro Universitario UAEM Zumpango, Camino Viejo a Jilotzingo Continuaci\u00f3n Calle Ray\u00f3n, Valle Hermoso, Zumpango C.P. 55600, Estado de M\u00e9xico, Mexico"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3296-0981","authenticated-orcid":false,"given":"Giner","family":"Alor-Hern\u00e1ndez","sequence":"additional","affiliation":[{"name":"Tecnol\u00f3gico Nacional de M\u00e9xico\/I. T., Av. Oriente 9 852, Col. Emiliano Zapata, Orizaba C.P. 
94320, Veracruz, Mexico"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-3822-4478","authenticated-orcid":false,"given":"Isaac","family":"Machorro-Cano","sequence":"additional","affiliation":[{"name":"Universidad del Papaloapan, Calle Circuito Central #200, Col. Parque Industrial, Tuxtepec C.P. 68301, Oaxaca, Mexico"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2022,10,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Ozsu, M.T., and Valduriez, P. (2020). Principles of Distributed Database Systems, 4th ed, Springer Nature Switzerland AG.","DOI":"10.1007\/978-3-030-26253-2"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Daniel, C., Salamanca, E., and Nordlinger, B. (2020). Hospital Databases: AP-HP Clinical Data Warehouse. Healthcare and Artificial Intelligence, Springer.","DOI":"10.1007\/978-3-030-32161-1_8"},{"key":"ref_3","unstructured":"Melton, J.E., Go, S., Zilliac, G.G., and Zhang, B.Z. (2022). Greenhouse Gas Emission Estimations for 2016\u20132020 using the Sherlock Air Traffic Data Warehouse, Report NASA\/TM-202220007609."},{"key":"ref_4","unstructured":"Janzen, T.J., and Ristino, L. (2018). USDA and Agriculture Data: Improving Productivity while Protecting Privacy, SSRN."},{"key":"ref_5","unstructured":"Han, J., Kamber, M., and Pei, J. (2012). Data Mining Concepts and Techniques, 3rd ed, Morgan Kaufmann Publishers."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Furtado, P. (2004, January 12\u201313). Experimental Evidence on Partitioning in Parallel Data Warehouses. Proceedings of the 7th ACM International Workshop on Data Warehousing and OLAP, Washington, DC, USA.","DOI":"10.1145\/1031763.1031769"},{"key":"ref_7","unstructured":"Kimball, R., Ross, M., Thornthwaite, W., Mundy, J., and Becker, B. (2008). The Data Warehouse Lifecycle Toolkit, Wiley Publishing, Inc.. 
[2nd ed.]."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Noaman, A.Y., and Barker, K. (1999, January 2\u20136). A Horizontal Fragmentation Algorithm for the Fact Relation in a Distributed Data Warehouse. Proceedings of the Eighth International Conference on Information and Knowledge Management, CIKM \u201999, Kansas City, MI, USA.","DOI":"10.1145\/319950.319972"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Ordonez, C., Song, I.Y., Anderst-Kotsis, G., Tjoa, A.M., and Khalil, I. (2019). SDWP: A New Data Placement Strategy for Distributed Big Data Warehouses in Hadoop. Big Data Analytics and Knowledge Discovery, Springer International Publishing.","DOI":"10.1007\/978-3-030-27520-4"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"48","DOI":"10.14778\/1920841.1920853","article-title":"Schism: A workload-driven approach to database replication and partitioning","volume":"3","author":"Curino","year":"2010","journal-title":"Proc. VLDB Endow."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Mahboubi, H., and Darmont, J. (2008, January 30). Data mining-based fragmentation of XML data warehouses. Proceedings of the ACM 11th international workshop on Data warehousing and OLAP-DOLAP \u201908, Napa Valley, CA, USA. Available online: http:\/\/portal.acm.org\/citation.cfm?doid=1458432.1458435.","DOI":"10.1145\/1458432.1458435"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1080\/08839514.2018.1519096","article-title":"Bi-Objective Optimization Method for Horizontal Fragmentation Problem in Relational Data Warehouses as a Linear Programming Problem","volume":"32","author":"Barr","year":"2018","journal-title":"Appl. Artif. 
Intell."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3320","DOI":"10.4028\/www.scientific.net\/AMM.284-287.3320","article-title":"An Efficient Partitioning for Object-Relational Data Warehouses","volume":"284\u2013287","author":"Liu","year":"2013","journal-title":"Appl. Mech. Mater."},{"key":"ref_14","first-page":"506","article-title":"Performance optimisation of the decision-support queries by the horizontal fragmentation of the data warehouse","volume":"26","author":"Kechar","year":"2017","journal-title":"Int. J. Bus. Inf. Syst."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Kechar, M., and Nait-Bahloul, S. (2019, January 23\u201324). Bringing Together Physical Design and Fast Querying of Large Data Warehouses: A New Data Partitioning Strategy. Proceedings of the 4th International Conference on Big Data and Internet of Things, Rabat Morocco.","DOI":"10.1145\/3372938.3372947"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ramdane, Y., Boussaid, O., Kabachi, N., and Bentayeb, F. (2018, January 11\u201313). Partitioning and Bucketing Techniques to Speed up Query Processing in Spark-SQL. Proceedings of the 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS), Singapore. Available online: https:\/\/ieeexplore.ieee.org\/document\/8644891\/.","DOI":"10.1109\/PADSW.2018.8644891"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2411","DOI":"10.14778\/3407790.3407834","article-title":"Fast and effective distribution-key recommendation for amazon redshift","volume":"13","author":"Parchas","year":"2020","journal-title":"Proc. VLDB Endow."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s40537-018-0144-5","article-title":"Chabok: A Map-Reduce based method to solve data warehouse problems","volume":"5","author":"Barkhordari","year":"2018","journal-title":"J. Big. Data."},{"key":"ref_19","unstructured":"Song, I.Y., Eder, J., and Nguyen, T.M. (2008). 
Data Partitioning in Data Warehouses: Hardness Study, Heuristics and ORACLE Validation. Data Warehousing and Knowledge Discovery, Springer. Available online: http:\/\/link.springer.com\/10.1007\/978-3-540-85836-2_9."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Barr, M., and Bellatreche, L. (2010, January 3\u20135). A New Approach Based on Ants for Solving the Problem of Horizontal Fragmentation in Relational Data Warehouses. Proceedings of the 2010 International Conference on Machine and Web Intelligence, Algiers, Algeria. Available online: http:\/\/ieeexplore.ieee.org\/document\/5648104\/.","DOI":"10.1109\/ICMWI.2010.5648104"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Laender, A.H.F., Pernici, B., Lim, E.P., and de Oliveira, J.P.M. (2019). SkipSJoin: A New Physical Design for Distributed Big Data Warehouses in Hadoop. Conceptual Modeling, Springer International Publishing.","DOI":"10.1007\/978-3-030-33223-5"},{"key":"ref_22","first-page":"1","article-title":"Web Service for Incremental and Automatic Data Warehouses Fragmentation","volume":"8","author":"Ettaoufik","year":"2017","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"012037","DOI":"10.1088\/1742-6596\/1743\/1\/012037","article-title":"Big-Parallel-ETL: New ETL for Multidimensional NoSQL Graph Oriented Data","volume":"1743","author":"Soussi","year":"2021","journal-title":"J. Phys. Conf. Ser."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Munerman, V., Munerman, D., and Samoilova, T. (2021, January 26\u201329). The Heuristic Algorithm for Symmetric Horizontal Data Distribution. Proceedings of the 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus). St. 
Petersburg, Moscow, Russia.","DOI":"10.1109\/ElConRus51938.2021.9396510"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Jaziri, R., Martin, A., Rousset, M.C., Boudjeloud-Assala, L., and Guillet, F. (2022). A Data Mining Approach to Guide the Physical Design of Distributed Big Data Warehouses. Advances in Knowledge Discovery and Management: Volume 9, Springer International Publishing.","DOI":"10.1007\/978-3-030-90287-2"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"102918","DOI":"10.1016\/j.parco.2022.102918","article-title":"Building a novel physical design of a distributed big data warehouse over a Hadoop cluster to enhance OLAP cube query performance","volume":"111","author":"Ramdane","year":"2022","journal-title":"Parallel. Comput."},{"key":"ref_27","unstructured":"O\u2019neil, P., O\u2019neil, B., and Chen, X. (2009). The Star Schema Benchmark (SSB), UMass."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"337","DOI":"10.1007\/978-3-030-71115-3_15","article-title":"Comparative Analysis of Decision Tree Algorithms for Data Warehouse Fragmentation","volume":"Volume 966","year":"2021","journal-title":"New Perspectives on Enterprise Decision-Making Applying Artificial Intelligence Techniques"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"551","DOI":"10.1016\/j.jss.2003.04.002","article-title":"An adaptable vertical partitioning method in distributed systems","volume":"73","author":"Son","year":"2004","journal-title":"J. Syst. Softw."},{"key":"ref_30","unstructured":"Rodr\u00edguez, L., Alor-Hern\u00e1ndez, G., Abud-Figueroa, M.A., and Pel\u00e1ez-Camarena, S.G. (2014, January 16\u201322). Horizontal Partitioning of Multimedia Databases Using Hierarchical Agglomerative Clustering. Proceedings of the Mexican International Conference on Artificial Intelligence, MICAI 2014: Nature-Inspired Computation and Machine Learning, Tuxtla, Mexico."},{"key":"ref_31","unstructured":"Satapathy, S.C. (2022). 
Classification of VASA Dataset Using J48, Random Forest, and Naive Bayes. Intelligent Data Engineering and Analytics Smart Innovation, Systems, and Technologies, Springer."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Razdan, S., Gupta, H., and Seth, A. (2021, January 2\u20134). Performance Analysis of Network Intrusion Systems using J48 and Naive Bayes Algorithm. Proceedings of the 6th International Conference for Convergence in Technology (I2CT), Maharashtra, India.","DOI":"10.1109\/I2CT51068.2021.9417971"},{"key":"ref_33","unstructured":"Tan, P.N., Steinbach, M., Karpatne, A., and Kumar, V. (2019). Introduction to Data Mining, Pearson. [2nd ed.]."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Kimball, R., and Ross, M. (2016). The Kimball Group Reader: Relentlessly Practical Tools for Data Warehousing and Business Intelligence, John Wiley & Sons, Inc.. [2nd ed.].","DOI":"10.1002\/9781119228912"}],"container-title":["Applied Sciences"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2076-3417\/12\/21\/10942\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:05:09Z","timestamp":1760144709000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2076-3417\/12\/21\/10942"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,10,28]]},"references-count":34,"journal-issue":{"issue":"21","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["app122110942"],"URL":"https:\/\/doi.org\/10.3390\/app122110942","relation":{},"ISSN":["2076-3417"],"issn-type":[{"value":"2076-3417","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,10,28]]}}}