{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T01:11:55Z","timestamp":1760058715490,"version":"build-2065373602"},"reference-count":57,"publisher":"MDPI AG","issue":"4","license":[{"start":{"date-parts":[[2025,4,19]],"date-time":"2025-04-19T00:00:00Z","timestamp":1745020800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100004955","name":"Austrian Research Promotion Agency (FFG)","doi-asserted-by":"publisher","award":["FO999899544"],"award-info":[{"award-number":["FO999899544"]}],"id":[{"id":"10.13039\/501100004955","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Data"],"abstract":"<jats:p>Data analysis is widely used in research and industry where there is a need to extract information from data. A significant amount of time within a data analysis project is required to prepare the data for subsequent analysis. This paper presents a comprehensive weighted maturity model to estimate the readiness of data for subsequent data analysis, with the goal of avoiding delays due to data quality problems. The maturity model uses a questionnaire with nine criteria to determine the maturity level of data preparation. The maturity model is integrated into a web application that provides an automated evaluation of maturity and a novel visualization approach that combines a modified spider chart and augmented chord diagrams. The comprehensive weighted maturity model is a ready-to-use application that provides prospective users with an easy and quick way to check their data for maturity for subsequent data analysis, with the goal of improving the data preparation process. 
The weighted maturity model is applicable to all types of data analysis, regardless of the domain of the data.<\/jats:p>","DOI":"10.3390\/data10040055","type":"journal-article","created":{"date-parts":[[2025,4,20]],"date-time":"2025-04-20T20:31:36Z","timestamp":1745181096000},"page":"55","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["A Comprehensive Data Maturity Model for Data Pre-Analysis"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-7164-2114","authenticated-orcid":false,"given":"Lukas","family":"Knoflach","sequence":"first","affiliation":[{"name":"Institute of Visual Computing, Graz University of Technology, 8010 Graz, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5118-9515","authenticated-orcid":false,"given":"Lin","family":"Shao","sequence":"additional","affiliation":[{"name":"Fraunhofer Austria Research GmbH, 8010 Graz, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7866-9762","authenticated-orcid":false,"given":"Torsten","family":"Ullrich","sequence":"additional","affiliation":[{"name":"Institute of Visual Computing, Graz University of Technology, 8010 Graz, Austria"},{"name":"Fraunhofer Austria Research GmbH, 8010 Graz, Austria"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,4,19]]},"reference":[{"key":"ref_1","unstructured":"CrowdFlower (2017). 2017 Data Scientist Report, CrowdFlower."},{"key":"ref_2","unstructured":"Anaconda (2025, February 09). 2022 State of Data Science. Available online: https:\/\/www.anaconda.com\/resources\/whitepapers\/state-of-data-science-report-2022."},{"key":"ref_3","unstructured":"Woodie, A. (2025, February 09). Data Prep Still Dominates Data Scientists\u2019 Time, Survey Finds. 
Available online: https:\/\/www.bigdatawire.com\/2020\/07\/06\/data-prep-still-dominates-data-scientists-time-survey-finds\/."},{"key":"ref_4","unstructured":"Anwar, M. (2025, February 09). Was ist Datenbereinigung?. Ein vollst\u00e4ndiger Leitfaden., Available online: https:\/\/www.astera.com\/de\/type\/blog\/data-cleansing\/."},{"key":"ref_5","unstructured":"Documentation, I. (2025, February 09). SPSS Modeler. Available online: https:\/\/www.ibm.com\/docs\/de\/spss-modeler\/18.4.0?topic=preparation-data-overview."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Dasu, T., and Johnson, T. (2003). Exploratory Data Mining and Data Cleaning, John Wiley & Sons.","DOI":"10.1002\/0471448354"},{"key":"ref_7","unstructured":"Matzer, M. (2025, February 09). Datenaufbereitung ist ein Untersch\u00e4tzter Prozess. Available online: https:\/\/www.bigdata-insider.de\/datenaufbereitung-ist-ein-unterschaetzter-prozess-a-803469\/."},{"key":"ref_8","unstructured":"Institute, P. (2025, February 09). Overcoming the 80\/20 Rule in Data Science. Available online: https:\/\/www.pragmaticinstitute.com\/resources\/articles\/data\/overcoming-the-80-20-rule-in-data-science\/."},{"key":"ref_9","unstructured":"Lohr, S. (2025, February 09). For Big-Data Scientists, \u2018Janitor Work\u2019 Is Key Hurdle to Insights. The New York Times, Available online: https:\/\/www.nytimes.com\/2014\/08\/18\/technology\/for-big-data-scientists-hurdle-to-insights-is-janitor-work.html."},{"key":"ref_10","unstructured":"Wirth, R., and Hipp, J. (2000, January 11\u201313). CRISP-DM: Towards a Standard Process Model for Data Mining. Proceedings of the 4th International Conference on the Practical Applications of Knowledge Discovery and Data Mining, Manchester, UK."},{"key":"ref_11","first-page":"13","article-title":"The CRISP-DM Model: The New Blueprint for Data Mining","volume":"5","author":"Shearer","year":"2000","journal-title":"J. 
Data Warehous."},{"key":"ref_12","first-page":"37","article-title":"From Data Mining to Knowledge Discovery in Databases","volume":"17","author":"Fayyad","year":"1996","journal-title":"AI Mag."},{"key":"ref_13","unstructured":"Azevedo, A., and Santos, M.F. (2008, January 24\u201326). KDD, SEMMA and CRISP-DM: A Parallel Overview. Proceedings of the IADIS European Conference Data Mining, Amsterdam, The Netherlands."},{"key":"ref_14","unstructured":"D\u00e5derman, A., and Rosander, S. (2018). Evaluating Frameworks for Implementing Machine Learning in Signal Processing: A Comparative Study of CRISP-DM, SEMMA and KDD. [Master\u2019s Thesis, KTH Royal Institute of Technology]."},{"key":"ref_15","unstructured":"Center, S.H. (2025, February 09). Introduction to SEMMA. Available online: https:\/\/documentation.sas.com\/doc\/en\/emref\/14.3\/n061bzurmej4j3n1jnj8bbjjm1a2.htm."},{"key":"ref_16","first-page":"217","article-title":"A Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA)","volume":"12","author":"Shafique","year":"2014","journal-title":"Int. J. Innov. Sci. Res."},{"key":"ref_17","first-page":"19","article-title":"Data Mining; A Conceptual Overview","volume":"8","author":"Jackson","year":"2002","journal-title":"Commun. Assoc. Inf. Syst."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"1317","DOI":"10.1016\/j.infsof.2012.07.007","article-title":"The Maturity of Maturity Model Research: A Systematic Mapping Study","volume":"54","author":"Wendler","year":"2012","journal-title":"Inf. Softw. Technol."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1109\/52.219617","article-title":"Capability Maturity Model, Version 1.1","volume":"10","author":"Paulk","year":"1993","journal-title":"IEEE Softw."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Al-Sai, Z.A., Abdullah, R., and Husin, M.H. (2019, January 9\u201311). A Review on Big Data Maturity Models. 
Proceedings of the 2019 IEEE Jordan International Joint Conference on Electrical Engineering and Information Technology (JEEIT), Amman, Jordan.","DOI":"10.1109\/JEEIT.2019.8717398"},{"key":"ref_21","unstructured":"Moore, D.T. (2014, January 6\u20139). Roadmaps and Maturity Models: Pathways toward Adopting Big Data. Proceedings of the Conference for Information Systems Applied Research, Baltimore, MD, USA."},{"key":"ref_22","first-page":"547","article-title":"The Development of the Data Science Capability Maturity Model: A Survey-Based Research","volume":"46","author":"Kayabay","year":"2021","journal-title":"Online Inf. Rev."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Al-Sai, Z.A., Husin, M.H., Syed-Mohamad, S.M., Abdullah, R., Zitar, R.A., Abualigah, L., and Gandomi, A.H. (2023). Big Data Maturity Assessment Models: A Systematic Literature Review. Big Data Cogn. Comput., 7.","DOI":"10.3390\/bdcc7010002"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"2151","DOI":"10.1002\/qre.2008","article-title":"How Can SMEs Benefit from Big Data? Challenges and a Path Forward","volume":"32","author":"Coleman","year":"2016","journal-title":"Qual. Reliab. Eng. Int."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"1468","DOI":"10.1108\/IMDS-12-2015-0495","article-title":"How Organisations Leverage Big Data: A Maturity Model","volume":"116","author":"Comuzzi","year":"2016","journal-title":"Ind. Manag. Data Syst."},{"key":"ref_26","unstructured":"Davenport, T., and Harris, J. (2017). Competing on Analytics: Updated, with a New Introduction: The New Science of Winning, Harvard Business Press."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Kr\u00f3l, K., and Zdonek, D. (2020). Analytics Maturity Models: An Overview. 
Information, 11.","DOI":"10.3390\/info11030142"},{"key":"ref_28","first-page":"22","article-title":"TDWI Analytics Maturity Model","volume":"22","author":"Halper","year":"2020","journal-title":"TDWI Res."},{"key":"ref_29","first-page":"1","article-title":"A Capability Maturity Model for Developing and Improving Advanced Data Analytics Capabilities","volume":"16","author":"Korsten","year":"2024","journal-title":"Pac. Asia J. Assoc. Inf. Syst."},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1068","DOI":"10.1016\/j.chb.2014.09.030","article-title":"MD3M: The Master Data Management Maturity Model","volume":"51","author":"Spruit","year":"2015","journal-title":"Comput. Hum. Behav."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1002\/meet.2011.14504801036","article-title":"A Capability Maturity Model for Scientific Data Management: Evidence from the Literature","volume":"48","author":"Crowston","year":"2011","journal-title":"Proc. Am. Soc. Inf. Sci. Technol."},{"key":"ref_32","unstructured":"CMMI Institute (2025, April 14). Data Management Maturity (DMM) Model. Available online: https:\/\/stage.cmmiinstitute.com\/getattachment\/cb35800b-720f-4afe-93bf-86ccefb1fb17\/attachment.aspx."},{"key":"ref_33","unstructured":"Yang, B., Wu, H., and Zhang, H. (2018, January 26\u201328). Research and Application of Data Management Based on Data Management Maturity Model (DMM). Proceedings of the ICMLC 2018: 2018 10th International Conference on Machine Learning and Computing, Macau, China."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Belghith, O., Skhiri, S., Zitoun, S., and Ferjaoui, S. (2021, January 13\u201315). A Survey of Maturity Models in Data Management. 
Proceedings of the 2021 IEEE 12th International Conference on Mechanical and Intelligent Manufacturing Technologies (ICMIMT), Cape Town, South Africa.","DOI":"10.1109\/ICMIMT52186.2021.9476197"},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"191","DOI":"10.4218\/etrij.06.0105.0026","article-title":"A Data Quality Management Maturity Model","volume":"28","author":"Ryu","year":"2006","journal-title":"ETRI J."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"H\u00fcner, K.M., Ofner, M., and Otto, B. (2009, January 8\u201312). Towards a Maturity Model for Corporate Data Quality Management. Proceedings of the 2009 ACM Symposium on Applied Computing, Honolulu, HI, USA. SAC \u201909.","DOI":"10.1145\/1529282.1529334"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1007\/s40786-013-0002-z","article-title":"A Maturity Model for Enterprise Data Quality Management","volume":"8","author":"Ofner","year":"2013","journal-title":"Enterp. Model. Inf. Syst. Archit."},{"key":"ref_38","first-page":"100256","article-title":"Organizational Process Maturity Model for IoT Data Quality Management","volume":"26","author":"Kim","year":"2022","journal-title":"J. Ind. Inf. Integr."},{"key":"ref_39","unstructured":"Kirikoglu, O. (2017). A Maturity Model for Improving Data Quality Management. [Master\u2019s Thesis, University of Twente]."},{"key":"ref_40","unstructured":"Twilt, S. (2023). A Data Analytics Maturity Assessment Model for Data-Intensive Organizations. [Master\u2019s Thesis, Utrecht University]."},{"key":"ref_41","unstructured":"Giovannini, E., and Ward, D. (2004, January 27\u201328). Quality Framework for OECD Statistics. Proceedings of the Conference on Data Quality for International Organizations, Wiesbaden, Germany."},{"key":"ref_42","unstructured":"Durand, M. (2012). 
Quality Framework and Guidelines for OECD Statistical Activities, Version 2011\/1, OECD."},{"key":"ref_43","unstructured":"Askham, N., Cook, D., Doyle, M., Fereday, H., Gibson, M., Landbeck, U., Lee, R., Maynard, C., Palmer, G., and Schwarzenbach, J. (2013). The Six Primary Dimensions for Data Quality Assessment, Data Management Association."},{"key":"ref_44","unstructured":"RDA FAIR Data Maturity Model Working Group (2025, April 14). FAIR Data Maturity Model: Specification and Guidelines. Research Data Alliance. Available online: https:\/\/zenodo.org\/records\/3909563#.YGRNnq8za70."},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"18","DOI":"10.1145\/3444831.3444835","article-title":"Data Preparation: A Survey of Commercial Tools","volume":"49","author":"Hameed","year":"2020","journal-title":"ACM SIGMOD Rec."},{"key":"ref_46","unstructured":"Kuckartz, U., and R\u00e4diker, S. (2024). Qualitative Inhaltsanalyse. Methoden, Praxis, Umsetzung Mit Software Und K\u00fcnstlicher Intelligenz, Beltz Juventa. [6th ed.]."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Schultz, D., and Cook, C. (2007). Client-Side Scripting Basics. Beginning HTML with CSS and XHTML: Modern Guide and Reference, Apress.","DOI":"10.1007\/978-1-4302-0350-6"},{"key":"ref_48","unstructured":"Marcotte, E. (2025, February 09). Responsive Web Design. Available online: https:\/\/alistapart.com\/article\/responsive-web-design\/."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"37","DOI":"10.1515\/kbo-2017-0153","article-title":"Responsive Web Design Techniques","volume":"23","author":"Giurgiu","year":"2017","journal-title":"International Conference Knowledge-Based Organization"},{"key":"ref_50","unstructured":"Spurlock, J. (2013). 
Bootstrap: Responsive Web Development, O\u2019Reilly Media, Inc."},{"key":"ref_51","first-page":"349","article-title":"A Review Paper On Bootstrap Framework","volume":"2","author":"Gaikwad","year":"2019","journal-title":"IRE J."},{"key":"ref_52","unstructured":"Liu, W.Y., Wang, B.W., Yu, J.X., Li, F., Wang, S.X., and Hong, W.X. (2008, January 12\u201315). Visualization Classification Method of Multi-Dimensional Data Based on Radar Chart Mapping. Proceedings of the 2008 International Conference on Machine Learning and Cybernetics, Kunming, China."},{"key":"ref_53","doi-asserted-by":"crossref","unstructured":"Porter, M.M., and Niksiar, P. (2018). Multidimensional Mechanics: Performance Mapping of Natural Biological Systems Using Permutated Radar Charts. PLoS ONE, 13.","DOI":"10.1371\/journal.pone.0204309"},{"key":"ref_54","first-page":"1","article-title":"A Colour Alphabet and the Limits of Colour Coding","volume":"5","year":"2010","journal-title":"JAIC-J. Int. Colour Assoc."},{"key":"ref_55","doi-asserted-by":"crossref","unstructured":"Mazel, J., Fontugne, R., and Fukuda, K. (2014, January 24\u201328). Visual Comparison of Network Anomaly Detectors with Chord Diagrams. Proceedings of the SAC 2014: Symposium on Applied Computing, Gyeongju, Republic of Korea.","DOI":"10.1145\/2554850.2554886"},{"key":"ref_56","unstructured":"Keahey, T.A. (2013). Using Visualization to Understand Big Data. IBM Business Analytics Advanced Visualisation, IBM Corporation."},{"key":"ref_57","unstructured":"Teller, S. (2013). 
Data Visualization with D3.Js, Packt Publishing."}],"container-title":["Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/4\/55\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T17:17:52Z","timestamp":1760030272000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2306-5729\/10\/4\/55"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,19]]},"references-count":57,"journal-issue":{"issue":"4","published-online":{"date-parts":[[2025,4]]}},"alternative-id":["data10040055"],"URL":"https:\/\/doi.org\/10.3390\/data10040055","relation":{},"ISSN":["2306-5729"],"issn-type":[{"type":"electronic","value":"2306-5729"}],"subject":[],"published":{"date-parts":[[2025,4,19]]}}}