{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,13]],"date-time":"2026-04-13T11:59:02Z","timestamp":1776081542393,"version":"3.50.1"},"reference-count":25,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2019,3,5]],"date-time":"2019-03-05T00:00:00Z","timestamp":1551744000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Informatics"],"abstract":"<jats:p>The topic of data integration from external data sources or independent IT-systems has received increasing attention recently in IT departments as well as at management level, in particular concerning data integration in federated database systems. An example of the latter are commercial research information systems (RIS), which regularly import, cleanse, transform and prepare the analysis research information of the institutions of a variety of databases. In addition, all these so-called steps must be provided in a secured quality. As several internal and external data sources are loaded for integration into the RIS, ensuring information quality is becoming increasingly challenging for the research institutions. Before the research information is transferred to a RIS, it must be checked and cleaned up. An important factor for successful or competent data integration is therefore always the data quality. The removal of data errors (such as duplicates and harmonization of the data structure, inconsistent data and outdated data, etc.) are essential tasks of data integration using extract, transform, and load (ETL) processes. Data is extracted from the source systems, transformed and loaded into the RIS. At this point conflicts between different data sources are controlled and solved, as well as data quality issues during data integration are eliminated. Against this background, our paper presents the process of data transformation in the context of RIS which gains an overview of the quality of research information in an institution\u2019s internal and external data sources during its integration into RIS. In addition, the question of how to control and improve the quality issues during the integration process in RIS will be addressed.<\/jats:p>","DOI":"10.3390\/informatics6010010","type":"journal-article","created":{"date-parts":[[2019,3,5]],"date-time":"2019-03-05T11:19:50Z","timestamp":1551784790000},"page":"10","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":21,"title":["ETL Best Practices for Data Quality Checks in RIS Databases"],"prefix":"10.3390","volume":"6","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-5225-389X","authenticated-orcid":false,"given":"Otmane","family":"Azeroual","sequence":"first","affiliation":[{"name":"German Center for Higher Education Research and Science Studies (DZHW), Sch\u00fctzenstra\u00dfe 6a, 10117 Berlin, Germany"},{"name":"Institute for Technical and Business Information Systems\u2014Database Research Group, Otto-von-Guericke-University Magdeburg, Universit\u00e4tsplatz 2, 39106 Magdeburg, Germany"},{"name":"Department of Computer Science and Engineering, University of Applied Sciences\u2014HTW Berlin, Wilhelminenhofstra\u00dfe 75 A, 12459 Berlin, Germany"}]},{"given":"Gunter","family":"Saake","sequence":"additional","affiliation":[{"name":"Institute for Technical and Business Information Systems\u2014Database Research Group, Otto-von-Guericke-University Magdeburg, Universit\u00e4tsplatz 2, 39106 Magdeburg, Germany"}]},{"given":"Mohammad","family":"Abuosba","sequence":"additional","affiliation":[{"name":"Department of Computer Science and Engineering, University of Applied Sciences\u2014HTW Berlin, Wilhelminenhofstra\u00dfe 75 A, 12459 Berlin, Germany"}]}],"member":"1968","published-online":{"date-parts":[[2019,3,5]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1271","DOI":"10.1007\/s11192-018-2735-5","article-title":"Data measurement in research information systems: Metrics for the evaluation of data quality","volume":"115","author":"Azeroual","year":"2018","journal-title":"Scientometrics"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Azeroual, O., and Sch\u00f6pfel, J. (2019). Quality issues of CRIS data: An exploratory investigation with universities from twelve countries. Publications, 7.","DOI":"10.3390\/publications7010014"},{"key":"ref_3","first-page":"12","article-title":"Data quality measures and data cleansing for research information systems","volume":"16","author":"Azeroual","year":"2018","journal-title":"J. Digit. Inf. Manag."},{"key":"ref_4","unstructured":"Jeffery, K.G. (2004, January 13\u201315). The new technologies: Can CRISs benefit?. Proceedings of the CRIS2004: 7th International Conference on Current Research Information Systems, Antwerp, Belgium."},{"key":"ref_5","unstructured":"J\u00f6rg, B., Cutting-Decelle, A.F., Houssos, N., Sicilia, M.A., and Jeffery, K.G. (2012, January 28\u201331). CERIF-CRIS, a research information model for decision support: Use and trends for the future. Proceedings of the 23rd International CODATA Conference, Taipei, Taiwan."},{"key":"ref_6","unstructured":"Sch\u00f6pfel, J., Prost, H., and Rebouillat, V. (2016, January 9\u201311). Research data in current research information systems. Proceedings of the CRIS2016: 13th International Conference on Current Research Information Systems, St Andrews, UK."},{"key":"ref_7","first-page":"1","article-title":"Overview and framework for data and information quality research","volume":"1","author":"Madnick","year":"2009","journal-title":"J. Data Inf. Qual."},{"key":"ref_8","unstructured":"Naumann, F. (2018, December 25). Informationsintegration: Schema Mapping. Available online: https:\/\/www.informatik.hu-berlin.de\/de\/forschung\/gebiete\/wbi\/ii\/folien\/InfoInt_15_SchemaMapping.ppt\/at_download\/file."},{"key":"ref_9","unstructured":"Van den Berghe, S., and Van Gaeveren, K. (2016, January 9\u201311). Data quality assessment and improvement: A Vrije Universiteit Brussel case study. Proceedings of the CRIS2016: 13th International Conference on Current Research Information Systems, St Andrews, UK."},{"key":"ref_10","first-page":"30","article-title":"The effects of using business intelligence systems on an excellence management and decision-making process by start-up companies: A case study","volume":"4","author":"Azeroual","year":"2018","journal-title":"Int. J. Manag. Sci. and Bus. Adm."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"79","DOI":"10.1145\/269012.269025","article-title":"The impact of poor data quality on typical enterprise","volume":"41","author":"Redman","year":"1998","journal-title":"Commun. ACM"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"54","DOI":"10.1145\/269012.269021","article-title":"Examining data quality","volume":"41","author":"Ballou","year":"1998","journal-title":"Commun. ACM"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1080\/07421222.1996.11518099","article-title":"Beyond accuracy: What data quality means to data consumers?","volume":"12","author":"Wang","year":"1996","journal-title":"J. Manag. Inf. Syst."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1145\/505248.506010","article-title":"Data quality assessment","volume":"45","author":"Pipino","year":"2002","journal-title":"Commun. ACM"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/1805286.1805291","article-title":"A survey on uncertainty management in data integration","volume":"2","author":"Magnani","year":"2010","journal-title":"J. Data Inf. Qual."},{"key":"ref_16","unstructured":"Berkhoff, K., Ebeling, B., and L\u00fcbbe, S. (2012, January 6\u20139). Integrating research information into a software for higher education administration\u2014Benefits for data quality and accessibility. Proceedings of the CRIS2012: 11th International Conference on Current Research Information Systems, Prague, Czech Republic."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"109","DOI":"10.7494\/csci.2014.15.2.109","article-title":"Integration of data from heterogeneous sources using ETL technology","volume":"15","author":"Macura","year":"2014","journal-title":"Comput. Sci."},{"key":"ref_18","unstructured":"Quix, C., and Jarke, M. (2016, January 13\u201315). Information integration in research information systems. Proceedings of the CRIS2014: 12th International Conference on Current Research Information Systems, Rome, Italy."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Azeroual, O., Saake, G., Abuosba, M., and Sch\u00f6pfel, J. (2019). Integrating quality of research information into research information management systems\u2014Using the European CERIF and German RCD standards as examples. Inf. Ser. Use, Forthcoming.","DOI":"10.3233\/ISU-180030"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1007\/s13740-012-0006-9","article-title":"Metrics for the prediction of evolution impact in ETL ecosystems: A case study","volume":"1","author":"Papastefanatos","year":"2012","journal-title":"J. Data Semantics"},{"key":"ref_21","unstructured":"Helmis, S., and Hollmann, R. (2009). Webbasierte Datenintegration\u2014Ans\u00e4tze zur Messung und Sicherung der Informationsqualit\u00e4t in heterogenen Datenbest\u00e4nden unter Verwendung eines vollst\u00e4ndig webbasierten Werkzeuges, Vieweg+Teubner\/GWV Fachverlage GmbH."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1","DOI":"10.4018\/jdwm.2009070101","article-title":"A survey of extract transform load technology","volume":"5","author":"Vassiliadis","year":"2009","journal-title":"Int. J. Data Warehous. Min."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Vassiliadis, P., and Simitsis, A. (2009). Extraction, transformation, and loading. Encyclopedia of Database Systems, Springer.","DOI":"10.1007\/978-0-387-39940-9_158"},{"key":"ref_24","unstructured":"Kimball, R., and Caserta, J. (2004). The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data, Wiley Publishing, Inc."},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1016\/j.ijinfomgt.2018.02.007","article-title":"Analyzing data quality issues in research information systems via data profiling","volume":"41","author":"Azeroual","year":"2018","journal-title":"Int. J. Inf. Manag."}],"container-title":["Informatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-9709\/6\/1\/10\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:36:25Z","timestamp":1760186185000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-9709\/6\/1\/10"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,3,5]]},"references-count":25,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,3]]}},"alternative-id":["informatics6010010"],"URL":"https:\/\/doi.org\/10.3390\/informatics6010010","relation":{},"ISSN":["2227-9709"],"issn-type":[{"value":"2227-9709","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,3,5]]}}}