Abstract:
Making informed, accurate decisions based on data is a crucial challenge for organizations across all industries. With the rise of data collection and storage tools, organizations have access to vast amounts of data, yet the challenge remains how to use this data effectively for decision-making. Data quality issues persist during the extract, transform, load (ETL) process, which is responsible for shaping and moving data for decision-making purposes. The main goal of this master's thesis is to address the question of how to preserve and enhance data quality during the ETL process. The study will begin with an overview of data warehouse concepts, including the data lifecycle, followed by a literature review to identify best practices in data quality management. Next, the study will assess the difficulties and challenges related to data quality in a specific company. The outcome will be a custom framework that is observable, monitorable, and maintainable, designed to ensure data quality during the transformation phase of ETL. This framework will provide organizations with a structured approach to data quality management, allowing them to identify and correct data quality issues and, in turn, make better decisions.