{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,11]],"date-time":"2026-01-11T16:13:25Z","timestamp":1768148005748,"version":"3.49.0"},"reference-count":42,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2022,12,9]],"date-time":"2022-12-09T00:00:00Z","timestamp":1670544000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["BDCC"],"abstract":"<jats:p>While big data benefits are numerous, the use of big data requires, however, addressing new challenges related to data processing, data security, and especially degradation of data quality. Despite the increased importance of data quality for big data, data quality measurement is actually limited to few metrics. Indeed, while more than 50 data quality dimensions have been defined in the literature, the number of measured dimensions is limited to 11 dimensions. Therefore, this paper aims to extend the measured dimensions by defining four new data quality metrics: Integrity, Accessibility, Ease of manipulation, and Security. Thus, we propose a comprehensive Big Data Quality Assessment Framework based on 12 metrics: Completeness, Timeliness, Volatility, Uniqueness, Conformity, Consistency, Ease of manipulation, Relevancy, Readability, Security, Accessibility, and Integrity. In addition, to ensure accurate data quality assessment, we apply data weights at three data unit levels: data fields, quality metrics, and quality aspects. Furthermore, we define and measure five quality aspects to provide a macro-view of data quality. Finally, an experiment is performed to implement the defined measures. The results show that the suggested methodology allows a more exhaustive and accurate big data quality assessment, with a more extensive methodology defining a weighted quality score based on 12 metrics and achieving a best quality model score of 9\/10.<\/jats:p>","DOI":"10.3390\/bdcc6040153","type":"journal-article","created":{"date-parts":[[2022,12,9]],"date-time":"2022-12-09T06:14:00Z","timestamp":1670566440000},"page":"153","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":26,"title":["An Advanced Big Data Quality Framework Based on Weighted Metrics"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2968-2389","authenticated-orcid":false,"given":"Widad","family":"Elouataoui","sequence":"first","affiliation":[{"name":"Laboratory of Engineering Sciences, National School of Applied Sciences, Ibn Tofail University, Kenitra 14000, Morocco"}]},{"given":"Imane","family":"El Alaoui","sequence":"additional","affiliation":[{"name":"Telecommunications Systems and Decision Engineering Laboratory, Ibn Tofail University, Kenitra 14000, Morocco"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1938-621X","authenticated-orcid":false,"given":"Saida","family":"El Mendili","sequence":"additional","affiliation":[{"name":"Laboratory of Engineering Sciences, National School of Applied Sciences, Ibn Tofail University, Kenitra 14000, Morocco"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8010-9206","authenticated-orcid":false,"given":"Youssef","family":"Gahi","sequence":"additional","affiliation":[{"name":"Laboratory of Engineering Sciences, National School of Applied Sciences, Ibn Tofail University, Kenitra 14000, Morocco"}]}],"member":"1968","published-online":{"date-parts":[[2022,12,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Baddi, Y., Gahi, Y., Maleh, Y., Alazab, M., and Tawalbeh, L. (2022). Data Quality in the Era of Big Data: A Global Review. Big Data Intelligence for Smart Applications, Springer International Publishing.","DOI":"10.1007\/978-3-030-87954-9"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"841","DOI":"10.1109\/TII.2022.3190405","article-title":"Healthcare Data Quality Assessment for Cybersecurity Intelligence","volume":"19","author":"Li","year":"2022","journal-title":"IEEE Trans. Ind. Inform."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Elouataoui, W., El Alaoui, I., and Gahi, Y. (2022, January 6). Metadata Quality Dimensions for Big Data Use Cases. Proceedings of the International Conference on Big Data, Modelling and Machine Learning (BML), Kenitra, Morocco.","DOI":"10.5220\/0010737400003101"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Kapil, G., Agrawal, A., and Khan, R.A. (2016, January 21\u201322). A study of big data characteristics. Proceedings of the 2016 International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India.","DOI":"10.1109\/CESYS.2016.7889917"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Faroukhi, A.Z., El Alaoui, I., Gahi, Y., and Amine, A. (2020). An Adaptable Big Data Value Chain Framework for End-to-End Big Data Monetization. Big Data Cogn. Comput., 4.","DOI":"10.3390\/bdcc4040034"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1186\/s40537-019-0281-5","article-title":"Big data monetization throughout Big Data Value Chain: A comprehensive review","volume":"7","author":"Faroukhi","year":"2020","journal-title":"J. Big Data"},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Juddoo, S. (2015, January 4\u20135). Overview of data quality challenges in the context of Big Data. Proceedings of the 2015 International Conference on Computing, Communication and Security (ICCCS), Pointe aux Piments, Mauritius.","DOI":"10.1109\/CCCS.2015.7374131"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Maleh, Y., Alazab, M., Gherabi, N., Tawalbeh, L., and Abd El-Latif, A.A. (2021). Metadata Quality in the Era of Big Data and Unstructured Content. Advances in Information, Communication and Cybersecurity, Springer. Advances in Information, Communication and Cybersecurity. Lecture Notes in Networks and Systems.","DOI":"10.1007\/978-3-030-91738-8"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Ben Ahmed, M., and Boudhir, A. (2018). Big Data Analytics: A Comparison of Tools and Applications. Innovations in Smart Cities and Applications, Springer. Lecture Notes in Networks and Systems.","DOI":"10.1007\/978-3-319-74500-8"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Alaoui, I.E., Gahi, Y., and Messoussi, R. (2019, January 12\u201315). Full Consideration of Big Data Characteristics in Sentiment Analysis Context. Proceedings of the 4th International Conference on Cloud Computing and Big Data Analysis (ICCCBDA), Chengdu, China.","DOI":"10.1109\/ICCCBDA.2019.8725728"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Sidi, F., Shariat Panahy, P.H., Affendey, L.S., Jabar, M.A., Ibrahim, H., and Mustapha, A. (2012, January 13\u201315). Data quality: A survey of data quality dimensions. Proceedings of the 2012 International Conference on Information Retrieval Knowledge Management, Kuala Lumpur, Malaysia.","DOI":"10.1109\/InfRKM.2012.6204995"},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"El Alaoui, I., Gahi, Y., and Messoussi, R. (2019, January 11). Big Data Quality Metrics for Sentiment Analysis Approaches. Proceedings of the 2019 International Conference on Big Data Engineering, New York, NY, USA.","DOI":"10.1145\/3341620.3341629"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1080\/07421222.1996.11518099","article-title":"Beyond Accuracy: What Data Quality Means to Data Consumers","volume":"12","author":"Wang","year":"1996","journal-title":"J. Manag. Inf. Syst."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"803","DOI":"10.1016\/j.procs.2019.11.007","article-title":"The Impact of Big Data Quality on Sentiment Analysis Approaches","volume":"160","author":"Alaoui","year":"2019","journal-title":"Procedia Comput. Sci."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"207","DOI":"10.1111\/1467-8551.00375","article-title":"Towards a Methodology for Developing Evidence-Informed Management Knowledge by Means of Systematic Review","volume":"14","author":"Tranfield","year":"2003","journal-title":"Br. J. Manag."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"58","DOI":"10.1145\/269012.269022","article-title":"A product perspective on total data quality management","volume":"41","author":"Wang","year":"1998","journal-title":"Commun. ACM"},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1016\/S0378-7206(02)00043-5","article-title":"AIMQ: A methodology for information quality assessment","volume":"40","author":"Lee","year":"2002","journal-title":"Inf. Manag."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3190578","article-title":"Visual Interactive Creation, Customization, and Analysis of Data Quality Metrics","volume":"10","author":"Bors","year":"2018","journal-title":"J. Data Inf. Qual."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"708","DOI":"10.1080\/14783363.2017.1332954","article-title":"Measuring data quality with weighted metrics","volume":"30","author":"Vaziri","year":"2019","journal-title":"Total Qual. Manag. Bus. Excell."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"60","DOI":"10.5121\/ijdms.2011.3105","article-title":"A Data Quality Methodology for Heterogeneous Data","volume":"3","author":"Batini","year":"2011","journal-title":"Int. J. Database Manag. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"3","DOI":"10.1016\/j.icte.2022.01.006","article-title":"Disturbed-entropy: A simple data quality assessment approach","volume":"8","author":"Li","year":"2022","journal-title":"ICT Express"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"76","DOI":"10.1186\/s40537-021-00468-0","article-title":"Big data quality framework: A holistic approach to continuous quality management","volume":"8","author":"Taleb","year":"2021","journal-title":"J. Big Data"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"147","DOI":"10.1007\/s41060-021-00257-1","article-title":"Big data quality prediction informed by banking regulation","volume":"12","author":"Wong","year":"2021","journal-title":"Int. J. Data Sci. Anal."},{"key":"ref_24","unstructured":"Azeroual, O., Saake, G., and Abuosba, M. (2019). Data Quality Measures and Data Cleansing for Research Information Systems. arXiv, Available online: http:\/\/arxiv.org\/abs\/1901.06208."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"113138","DOI":"10.1016\/j.dss.2019.113138","article-title":"Measuring data quality in information systems research","volume":"126","author":"Timmerman","year":"2019","journal-title":"Decis. Support Syst."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Mylavarapu, G., Thomas, J.P., and Viswanathan, K.A. (2019, January 15\u201318). An Automated Big Data Accuracy Assessment Tool. Proceedings of the 2019 IEEE 4th International Conference on Big Data Analytics (ICBDA), Suzhou, China.","DOI":"10.1109\/ICBDA.2019.8713218"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Taleb, I., Serhani, M.A., and Dssouli, R. (2019). Big Data Quality: A Data Quality Profiling Model. Services\u2014SERVICES 2019, Springer.","DOI":"10.1007\/978-3-030-23381-5_5"},{"key":"ref_28","first-page":"1","article-title":"Requirements for Data Quality Metrics","volume":"9","author":"Heinrich","year":"2018","journal-title":"J. Data Inf. Qual."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Bencz\u00far, A., Thalheim, B., and Horv\u00e1th, T. (2018). Data Quality in a Big Data Context. Advances in Databases and Information Systems, Springer. Lecture Notes in Computer Science.","DOI":"10.1007\/978-3-319-98398-1"},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Micic, N., Neagu, D., Campean, F., and Zadeh, E.H. (2017, January 21\u201323). Towards a Data Quality Framework for Heterogeneous Data. Proceedings of the 2017 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), Exeter, UK.","DOI":"10.1109\/iThings-GreenCom-CPSCom-SmartData.2017.28"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Taleb, I., Kassabi, H.T.E., Serhani, M.A., Dssouli, R., and Bouhaddioui, C. (2016, January 18\u201321). Big Data Quality: A Quality Dimensions Evaluation. Proceedings of the 2016 Intelligence IEEE Conferences on Ubiquitous Intelligence Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC\/ATC\/ScalCom\/CBDCom\/IoP\/SmartWorld), Toulouse, France.","DOI":"10.1109\/UIC-ATC-ScalCom-CBDCom-IoP-SmartWorld.2016.0122"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Serhani, M.A., El Kassabi, H.T., Taleb, I., and Nujum, A. (2016, January 5\u20138). An Hybrid Approach to Quality Evaluation across Big Data Value Chain. IEEE. Proceedings of the 2016 IEEE International Congress on Big Data (BigData Congress), Washington, DC, USA.","DOI":"10.1109\/BigDataCongress.2016.65"},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1007\/s41019-015-0004-7","article-title":"On the Meaningfulness of \u201cBig Data Quality\u201d (Invited Paper)","volume":"1","author":"Firmani","year":"2016","journal-title":"Data Sci. Eng."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"2","DOI":"10.5334\/dsj-2015-002","article-title":"The Challenges of Data Quality and Data Quality Assessment in the Big Data Era","volume":"14","author":"Cai","year":"2015","journal-title":"Data Sci. J."},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Zhang, P., Xiong, F., Gao, J., and Wang, J. (2017, January 4\u20138). Data quality in big data processing: Issues, solutions and open problems. Proceedings of the 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld\/SCALCOM\/UIC\/ATC\/CBDCom\/IOP\/SCI), San Francisco, CA, USA.","DOI":"10.1109\/UIC-ATC.2017.8397554"},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"86","DOI":"10.1145\/240455.240479","article-title":"Anchoring data quality dimensions in ontological foundations","volume":"39","author":"Wand","year":"1996","journal-title":"Commun. ACM"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Maleh, Y., Shojafar, M., Alazab, M., and Baddi, Y. (2021). Machine Learning and Deep Learning Models for Big Data Issues. Machine Intelligence and Big Data Analytics for Cybersecurity Applications, Springer. Studies in Computational Intelligence.","DOI":"10.1007\/978-3-030-57024-8"},{"key":"ref_38","first-page":"33","article-title":"An End-to-End Big Data Deduplication Framework based on Online Continuous Learning","volume":"13","author":"Elouataoui","year":"2022","journal-title":"Int. J. Adv. Comput. Sci. Appl."},{"key":"ref_39","unstructured":"(2021, October 07). COVID-19: Twitter Dataset Of 100+ Million Tweets. Available online: https:\/\/kaggle.com\/adarshsng\/covid19-twitter-dataset-of-100-million-tweets."},{"key":"ref_40","unstructured":"(2022, August 24). Great Expectations Home Page. Available online: https:\/\/www.greatexpectations.io\/."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Reda, O., Sassi, I., Zellou, A., and Anter, S. (2020, January 23\u201324). Towards a Data Quality Assessment in Big Data. Proceedings of the 13th International Conference on Intelligent Systems: Theories and Applications, New York, NY, USA.","DOI":"10.1145\/3419604.3419803"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"730","DOI":"10.1016\/j.procs.2020.07.108","article-title":"Network Security Strategies in Big Data Context","volume":"175","author":"Alaoui","year":"2020","journal-title":"Procedia Comput. Sci."}],"container-title":["Big Data and Cognitive Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2504-2289\/6\/4\/153\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:37:12Z","timestamp":1760146632000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2504-2289\/6\/4\/153"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,12,9]]},"references-count":42,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["bdcc6040153"],"URL":"https:\/\/doi.org\/10.3390\/bdcc6040153","relation":{},"ISSN":["2504-2289"],"issn-type":[{"value":"2504-2289","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,12,9]]}}}