{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,18]],"date-time":"2025-10-18T15:15:56Z","timestamp":1760800556226,"version":"build-2065373602"},"reference-count":45,"publisher":"MDPI AG","issue":"3","license":[{"start":{"date-parts":[[2022,6,27]],"date-time":"2022-06-27T00:00:00Z","timestamp":1656288000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>Currently, the continuous massive growth in the size, variety, and velocity of data is defined as big data. Relational databases have a limited ability to work with big data. Consequently, not only structured query language (NoSQL) databases were utilized to handle big data because NoSQL represents data in diverse models and uses a variety of query languages, unlike traditional relational databases. Therefore, using NoSQL has become essential, and many studies have attempted to propose different layers to convert relational databases to NoSQL; however, most of them targeted only one or two models of NoSQL, and evaluated their layers on a single node, not in a distributed environment. This study proposes a Spark-based layer for mapping relational databases to NoSQL models, focusing on the document, column, and key\u2013value databases of NoSQL models. The proposed Spark-based layer comprises of two parts. The first part is concerned with converting relational databases to document, column, and key\u2013value databases, and encompasses two phases: a metadata analyzer of relational databases and Spark-based transformation and migration. The second part focuses on executing a structured query language (SQL) on the NoSQL. The suggested layer was applied and compared with Unity, as it has similar components and features and supports sub-queries and join operations in a single-node environment. 
The experimental results show that the proposed layer outperformed Unity in terms of the query execution time by a factor of three. In addition, the proposed layer was applied to multi-node clusters using different scenarios, and the results show that the integration between the Spark cluster and NoSQL databases on multi-node clusters provided better performance in reading and writing while increasing the dataset size than using a single node.<\/jats:p>","DOI":"10.3390\/bdcc6030071","type":"journal-article","created":{"date-parts":[[2022,6,27]],"date-time":"2022-06-27T22:31:14Z","timestamp":1656369074000},"page":"71","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["A Comprehensive Spark-Based Layer for Converting Relational Databases to NoSQL"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2888-0367","authenticated-orcid":false,"given":"Manal A.","family":"Abdel-Fattah","sequence":"first","affiliation":[{"name":"Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo 11795, Egypt"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2142-6444","authenticated-orcid":false,"given":"Wael","family":"Mohamed","sequence":"additional","affiliation":[{"name":"Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo 11795, Egypt"}]},{"given":"Sayed","family":"Abdelgaber","sequence":"additional","affiliation":[{"name":"Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo 11795, Egypt"}]}],"member":"1968","published-online":{"date-parts":[[2022,6,27]]},"reference":[{"key":"ref_1","first-page":"1","article-title":"Data Modeling and NoSQL Databases\u2014A Systematic Mapping Review","volume":"54","author":"Guo","year":"2021","journal-title":"ACM Comput. 
Surv."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"128889","DOI":"10.1109\/ACCESS.2021.3112880","article-title":"A Systematic Review of Data Models for the Big Data Problem","volume":"9","author":"Mostajabi","year":"2021","journal-title":"IEEE Access"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"103149","DOI":"10.1016\/j.csi.2016.10.003","article-title":"Data Modeling in the NoSQL World","volume":"67","author":"Atzeni","year":"2020","journal-title":"Comput. Stand. Interfaces"},{"key":"ref_4","unstructured":"Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., and Stoica, I. (2010, January 1). Spark: Cluster Computing with Working Sets. Proceedings of the 2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 10), Boston, MA, USA."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Armbrust, M., Xin, R.S., Lian, C., Huai, Y., Liu, D., Bradley, J.K., Meng, X., Kaftan, T., Franklin, M.J., and Ghodsi, A. (June, January 31). Spark SQL: Relational Data Processing in Spark. Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD \u201915), Melbourne, Australia.","DOI":"10.1145\/2723372.2742797"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"111","DOI":"10.1016\/j.future.2016.02.002","article-title":"Data Adapter for Querying and Transformation between SQL and NoSQL Database","volume":"65","author":"Liao","year":"2016","journal-title":"Future Gener. Comput. 
Syst."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"221","DOI":"10.1007\/s00607-019-00736-1","article-title":"Bringing SQL Databases to Key-Based NoSQL Databases: A Canonical Approach","volume":"102","author":"Schreiner","year":"2020","journal-title":"Computing"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"69042","DOI":"10.1109\/ACCESS.2019.2916912","article-title":"Intelligent Data Engineering for Migration to NoSQL Based Secure Environments","volume":"7","author":"Ramzan","year":"2019","journal-title":"IEEE Access"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Kuszera, E.M., Peres, L.M., Didonet, M., and Fabro, D. (2019, January 8\u201312). Toward RDB to NoSQL: Transforming Data with Metamorfose Framework. Proceedings of the 34th ACM\/SIGAPP Symposium on Applied Computing, New York, NY, USA.","DOI":"10.1145\/3297280.3299734"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Schreiner, G.A., Duarte, D., and Dos Santos Mello, R. (2015, January 11\u201313). SQLtoKeyNoSQL: A Layer for Relational to Key-Based NoSQL Database Mapping. Proceedings of the 17th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2015\u2014Proceedings, Brussels, Belgium.","DOI":"10.1145\/2837185.2837224"},{"key":"ref_11","unstructured":"(2018, January 22). MongoDB. Available online: https:\/\/www.mongodb.com\/."},{"key":"ref_12","unstructured":"(2018, January 22). Apache Cassandra. Available online: http:\/\/cassandra.apache.org\/."},{"key":"ref_13","unstructured":"(2022, January 15). Redis. Available online: https:\/\/redis.io\/."},{"key":"ref_14","unstructured":"(2021, February 01). neo4j. Available online: https:\/\/neo4j.com\/."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"489","DOI":"10.1007\/s10515-013-0135-x","article-title":"JackHare: A Framework for SQL to NoSQL Translation Using MapReduce","volume":"21","author":"Chung","year":"2014","journal-title":"Autom. Softw. 
Eng."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1002\/spe.2666","article-title":"An Integration Approach of Hybrid Databases Based on SQL in Cloud Computing Environment","volume":"49","author":"Li","year":"2019","journal-title":"Softw. Pract. Exp."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Lawrence, R. (2014, January 10\u201313). Integration and Virtualization of Relational SQL and NoSQL Systems Including MySQL and MongoDB. Proceedings of the 2014 International Conference on Computational Science and Computational Intelligence, Las Vegas, NV, USA.","DOI":"10.1109\/CSCI.2014.56"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Schreiner, G.A., Duarte, D., and dos Santos Mello, R. (2019). When Relational-Based Applications Go to NoSQL Databases: A Survey. Information, 10.","DOI":"10.3390\/info10070241"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Morzy, T., H\u00e4rder, T., and Wrembel, R. (2012). SimpleSQL: A Relational Layer for SimpleDB. Advances in Databases and Information Systems, Proceedings of the 16th East European Conference, ADBIS, Poznan, Poland, 18\u201321 October 2012, Springer.","DOI":"10.1007\/978-3-642-33074-2"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Jia, T., Zhao, X., Wang, Z., Gong, D., and Ding, G. (2016, January 5\u20138). Model Transformation and Data Migration from Relational Database to MongoDB. Proceedings of the 2016 IEEE International Congress on Big Data (BigData Congress), Washington, DC, USA.","DOI":"10.1109\/BigDataCongress.2016.16"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Rith, J., Lehmayr, P.S., and Meyer-Wegener, K. (2014, January 24\u201328). Speaking in Tongues: SQL Access to NoSQL Systems. 
Proceedings of the ACM Symposium on Applied Computing, Salamanca, Spain.","DOI":"10.1145\/2554850.2555099"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"798","DOI":"10.1016\/j.procs.2018.08.014","article-title":"BigDimETL with NoSQL Database","volume":"126","author":"Mallek","year":"2018","journal-title":"Procedia Comput. Sci."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhao, G., Lin, Q., Li, L., and Li, Z. (2014, January 8\u201310). Schema Conversion Model of SQL Database to NoSQL. Proceedings of the 2014 9th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, 3PGCIC 2014, Guangdong, China.","DOI":"10.1109\/3PGCIC.2014.137"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Ramzan, S., Bajwa, I.S., and Kazmi, R. (2018). An Intelligent Approach for Handling Complexity by Migrating from Conventional Databases to Big Data. Symmetry, 10.","DOI":"10.3390\/sym10120698"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Adriana, J., and Holanda, M. (2018, January 27\u201329). NoSQL2: SQL to NoSQL Databases. Proceedings of the World Conference on Information Systems and Technologies, Naples, Italy.","DOI":"10.1007\/978-3-319-77712-2_89"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"455","DOI":"10.1016\/j.future.2019.11.032","article-title":"Mortadelo: Automatic Generation of NoSQL Stores from Platform-Independent Data Models","volume":"105","author":"Blanco","year":"2020","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_27","first-page":"2972","article-title":"A Comprehensive Approach for Converting Relational to Graph Database Using Spark","volume":"99","author":"Mohamed","year":"2021","journal-title":"J. Theor. Appl. Inf. 
Technol."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"101941","DOI":"10.1016\/j.is.2021.101941","article-title":"Exploring Data Structure Alternatives in the RDB to NoSQL Document Store Conversion Process","volume":"105","author":"Kuszera","year":"2022","journal-title":"Inf. Syst."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"7936","DOI":"10.1007\/s11227-018-2361-2","article-title":"Techniques and Guidelines for Effective Migration from RDBMS to NoSQL","volume":"76","author":"Kim","year":"2020","journal-title":"J. Supercomput."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"93","DOI":"10.1016\/j.is.2017.04.002","article-title":"Uniform Data Access Platform for SQL and NoSQL Database Systems","volume":"69","year":"2017","journal-title":"Inf. Syst."},{"key":"ref_31","unstructured":"(2021, November 24). Schemacrawler. Available online: https:\/\/www.schemacrawler.com\/."},{"key":"ref_32","first-page":"544","article-title":"Data Modeling Guidelines for NoSQL Document-Store Databases","volume":"9","author":"Imam","year":"2018","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liyanaarachchi, G., Kasun, L., Nimesha, M., Lahiru, K., and Karunasena, A. (2016, January 16\u201319). MigDB\u2014Relational to NoSQL Mapper. Proceedings of the 2016 IEEE International Conference on Information and Automation for Sustainability: Interoperable Sustainable Smart Systems for Next Generation, ICIAfS 2016, Galle, Sri Lanka.","DOI":"10.1109\/ICIAFS.2016.7946576"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Imam, A.A., Basri, S., Ahmad, R., Aziz, N., and Gonz\u00e1lez-Aparicio, M.T. (2017, January 5\u20138). New Cardinality Notations and Styles for Modeling NoSQL Document-Store Databases. 
Proceedings of the TENCON 2017\u20132017 IEEE Region 10 Conference, Penang, Malaysia.","DOI":"10.1109\/TENCON.2017.8228332"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"46","DOI":"10.1186\/s40537-018-0156-1","article-title":"Automatic Schema Suggestion Model for NoSQL Document-Stores Databases","volume":"5","author":"Imam","year":"2018","journal-title":"J. Big Data"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"2275","DOI":"10.1109\/TKDE.2017.2722412","article-title":"NoSE: Schema Design for NoSQL Applications","volume":"29","author":"Mior","year":"2017","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"10","DOI":"10.1186\/s40537-017-0071-x","article-title":"Toward Building RDB to HBase Conversion Rules","volume":"4","author":"Ouanouki","year":"2017","journal-title":"J. Big Data"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Alotaibi, O., and Pardede, E. (2019). Transformation of Schema from Relational Database (RDB) to NoSQL Databases. Data, 4.","DOI":"10.3390\/data4040148"},{"key":"ref_39","unstructured":"(2022, January 07). Apache Phoenix. Available online: https:\/\/phoenix.apache.org\/."},{"key":"ref_40","unstructured":"(2020, December 08). JSqlParser. Available online: https:\/\/github.com\/JSQLParser\/JSqlParser."},{"key":"ref_41","unstructured":"(2021, October 29). MongoDB Connector for Spark. Available online: https:\/\/docs.mongodb.com\/spark-connector\/current\/."},{"key":"ref_42","unstructured":"(2021, January 01). Spark Redis Connector. Available online: https:\/\/github.com\/RedisLabs\/spark-redis."},{"key":"ref_43","unstructured":"(2022, January 13). Spark Cassandra Connector. Available online: https:\/\/github.com\/datastax\/spark-cassandra-connector."},{"key":"ref_44","unstructured":"IMDb (2020, December 20). Internet Movie Database. Available online: www.imdb.com\/interfaces."},{"key":"ref_45","unstructured":"(2022, January 04). 
Stack Exchange Data. Available online: https:\/\/archive.org\/details\/stackexchange."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/6\/3\/71\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T23:39:17Z","timestamp":1760139557000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/6\/3\/71"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,6,27]]},"references-count":45,"journal-issue":{"issue":"3","published-online":{"date-parts":[[2022,9]]}},"alternative-id":["bdcc6030071"],"URL":"https:\/\/doi.org\/10.3390\/bdcc6030071","relation":{},"ISSN":["2504-2289"],"issn-type":[{"type":"electronic","value":"2504-2289"}],"subject":[],"published":{"date-parts":[[2022,6,27]]}}}