{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T02:05:20Z","timestamp":1760148320734,"version":"build-2065373602"},"reference-count":28,"publisher":"MDPI AG","issue":"2","license":[{"start":{"date-parts":[[2023,4,18]],"date-time":"2023-04-18T00:00:00Z","timestamp":1681776000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>The term \u201cbig data\u201d refers to the vast amount of structured and unstructured data generated by businesses, organizations, and individuals on a daily basis. The rapid growth of big data has led to the development of new technologies and techniques for storing, processing, and analyzing these data in order to extract valuable information. This study examines some of these technologies, compares their pros and cons, and provides solutions for handling specific types of reporting using big data tools. In addition, this paper discusses some of the challenges associated with big data and suggests approaches that could be used to manage and analyze these data. The findings demonstrate the benefits of efficiently managing the datasets and choosing the appropriate tools, as well as the efficiency of the proposed solution with hands-on examples.<\/jats:p>","DOI":"10.3390\/bdcc7020078","type":"journal-article","created":{"date-parts":[[2023,4,19]],"date-time":"2023-04-19T01:39:05Z","timestamp":1681868345000},"page":"78","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["Managing and Optimizing Big Data Workloads for On-Demand User Centric Reports"],"prefix":"10.3390","volume":"7","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1264-3404","authenticated-orcid":false,"given":"Alexandra","family":"B\u0103icoianu","sequence":"first","affiliation":[{"name":"Department of Mathematics and Informatics, Faculty of Mathematics and Informatics, Transilvania University of Bra\u015fov, Iuliu Maniu 50, 500090 Bra\u015fov, Romania"}]},{"given":"Ion Valentin","family":"Scheianu","sequence":"additional","affiliation":[{"name":"Faculty of Mathematics and Informatics, Transilvania University of Bra\u015fov, Iuliu Maniu 50, 500090 Bra\u015fov, Romania"}]}],"member":"1968","published-online":{"date-parts":[[2023,4,18]]},"reference":[{"key":"ref_1","unstructured":"Sumits, A. (2022, December 17). The History and Future of Internet Traffic. Available online: https:\/\/blogs.cisco.com\/sp\/the-history-and-future-of-internet-traffic."},{"key":"ref_2","unstructured":"O\u2019Dea, S. (2022, December 17). Monthly Internet Traffic in the U.S. 2018\u20132023. Available online: https:\/\/www.statista.com\/statistics\/216335\/data-usage-per-month-in-the-us-by-age\/."},{"key":"ref_3","unstructured":"Heffernan, V. (2022, December 27). Is Moore\u2019s Law Really Dead?. Available online: https:\/\/www.wired.com\/story\/moores-law-really-dead\/."},{"key":"ref_4","unstructured":"Agrawal, D., Bernstein, P., Bertino, E., Davidson, S., Dayal, U., Franklin, M., Gehrke, J., Haas, L., Halevy, A., and Han, J. (2011). Challenges and Opportunities with Big Data 2011-1, Purdue University Libraries."},{"key":"ref_5","unstructured":"Takahashi, D. (2022, December 27). Intel: Moore\u2019s Law Isn\u2019t Slowing Down. Available online: https:\/\/venturebeat.com\/business\/intel-moores-law-isnt-slowing-down\/."},{"key":"ref_6","first-page":"4","article-title":"Is Moore\u2019s Law Slowing Down? What\u2019s Next?","volume":"37","author":"Eeckhout","year":"2017","journal-title":"IEEE Micro"},{"key":"ref_7","unstructured":"Collet, Y., and Kucherawy, M. (2023, March 05). Zstandard Compression and the Application\/zstd Media Type. Available online: https:\/\/www.rfc-editor.org\/rfc\/rfc8478."},{"key":"ref_8","first-page":"1","article-title":"3D data management: Controlling data volume, velocity and variety","volume":"6","author":"Laney","year":"2001","journal-title":"META Group Res. Note"},{"key":"ref_9","first-page":"31","article-title":"Big data challenges","volume":"4","author":"Tole","year":"2013","journal-title":"Database Syst. J."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1016\/j.procs.2015.04.188","article-title":"A Brief Introduction on Big Data 5Vs Characteristics and Hadoop Technology","volume":"48","author":"Ishwarappa","year":"2015","journal-title":"Procedia Comput. Sci."},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Awan, M.J., Bilal, M.H., Yasin, A., Nobanee, H., Khan, N.S., and Zain, A.M. (2021). Detection of COVID-19 in Chest X-ray Images: A Big Data Enabled Deep Learning Approach. Int. J. Environ. Res. Public Health, 18.","DOI":"10.3390\/ijerph181910147"},{"key":"ref_12","first-page":"9307","article-title":"Big data challenges","volume":"4","author":"Nasser","year":"2015","journal-title":"J. Comput. Eng. Inf. Technol."},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Haafza, L.A., Awan, M.J., Abid, A., Yasin, A., Nobanee, H., and Farooq, M.S. (2021). Big data COVID-19 systematic literature review: Pandemic crisis. Electronics, 10.","DOI":"10.3390\/electronics10243125"},{"key":"ref_14","unstructured":"Self-Service BI (2023, April 01). Available online: https:\/\/learn.microsoft.com\/."},{"key":"ref_15","unstructured":"(2023, April 02). Druid Use-Cases. Available online: https:\/\/druid.apache.org\/use-cases."},{"key":"ref_16","unstructured":"(2022, December 29). AWS Druid Costs. Available online: https:\/\/aws.amazon.com\/marketplace\/pp\/prodview-4n6wdupx4okgw."},{"key":"ref_17","unstructured":"Roginski, M. (2022, December 29). When Should I Use Apache Druid? Try This Checklist. Available online: https:\/\/www.rilldata.com\/blog\/when-should-i-use-apache-druid."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.jnca.2014.07.022","article-title":"A comprehensive view of Hadoop research\u2014A systematic literature review","volume":"46","author":"Polato","year":"2014","journal-title":"J. Netw. Comput. Appl."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Wang, K., and Khan, M.M.H. (2015, January 24\u201326). Performance prediction for apache spark platform. Proceedings of the 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems, New York, NY, USA.","DOI":"10.1109\/HPCC-CSS-ICESS.2015.246"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Thusoo, A., Sarma, J.S., Jain, N., Shao, Z., Chakka, P., Zhang, N., Antony, S., Liu, H., and Murthy, R. (2010, January 1\u20136). Hive-a petabyte scale data warehouse using hadoop. Proceedings of the 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), Long Beach, CA, USA.","DOI":"10.1109\/ICDE.2010.5447738"},{"key":"ref_21","unstructured":"Foundation, T.A.S. (2023, January 20). Apache Oozie Workflow Scheduler for Hadoop. Available online: https:\/\/oozie.apache.org\/."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"849","DOI":"10.1142\/S0219622020500182","article-title":"A lightweight approach to extract interschema properties from structured, semi-structured and unstructured sources in a big data scenario","volume":"19","author":"Cauteruccio","year":"2020","journal-title":"Int. J. Inf. Technol. Decis. Mak."},{"key":"ref_23","unstructured":"Arnholt, A.T. (2023, January 02). Passion Driven Statistics. Available online: https:\/\/alanarnholt.github.io\/PDS-Bookdown2\/skewed-right-distributions.html."},{"key":"ref_24","unstructured":"Statz, D. (2023, January 02). Handling Data Skew in Apache Spark. Available online: https:\/\/itnext.io\/handling-data-skew-in-apache-spark-9f56343e58e8."},{"key":"ref_25","unstructured":"Reursora, K. (2023, January 02). Generating Random Numbers with Uniform Distribution in Python. Available online: https:\/\/linuxhint.com\/generating-random-numbers-with-uniform-distribution-in-python\/."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"104820","DOI":"10.1016\/j.knosys.2019.06.028","article-title":"Generalizing identity-based string comparison metrics: Framework and techniques","volume":"187","author":"Cauteruccio","year":"2020","journal-title":"Knowl.-Based Syst."},{"key":"ref_27","unstructured":"(2023, February 16). Open Air Quality. Available online: https:\/\/openaq.org\/."},{"key":"ref_28","unstructured":"(2023, February 16). OpenAQ Amazon S3 Bucket. Available online: https:\/\/registry.opendata.aws\/openaq\/."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/7\/2\/78\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:18:16Z","timestamp":1760123896000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/7\/2\/78"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,4,18]]},"references-count":28,"journal-issue":{"issue":"2","published-online":{"date-parts":[[2023,6]]}},"alternative-id":["bdcc7020078"],"URL":"https:\/\/doi.org\/10.3390\/bdcc7020078","relation":{},"ISSN":["2504-2289"],"issn-type":[{"type":"electronic","value":"2504-2289"}],"subject":[],"published":{"date-parts":[[2023,4,18]]}}}