Successful data quality and master data management initiatives require a holistic approach.
Organizations need to address people, processes, and technology to meet the business demands on data quality:
The organization should define clear responsibilities for data domains (e.g., customer, product, financial figures) as well as roles (data owner, data stewards for operational data quality assurance).
Processes for data quality assurance can be defined by adopting best practices like the data quality cycle.
Technology supports people in their processes via software features and the requisite IT architecture.
It is important to consider business requirements first and to keep in mind that the organization and its processes always take precedence over the technology, since they are defined according to company strategy; technology is simply there to support them.
Organization of Data Quality and Master Data Management
When it comes to improving data quality, a company culture that recognizes data as a key production factor for generating insights is essential.
In the context of data quality and master data management, the responsibility for data plays a crucial role.
Role concepts help with the definition and assignment of tasks and competencies to specific employees. By assigning defined roles, the company can ensure that responsibility for accurate data and its upkeep is clear and enduring.
Typical roles for ensuring data quality and master data management are:
The data owner is the central contact person for certain data domains. They define requirements, ensure data quality and accessibility, assign access rights, and authorize data stewards to manage data.
The data steward defines rules, plans requirements, and coordinates data delivery. They are also responsible for operational data quality, for example checking for duplicate records.
The data manager is usually a member of the IT department. They implement the requirements of the data owner, manage the technological infrastructure, and ensure access protection.
Ideally, data users from business departments or IT then have access to reliable and understandable data.
Each role involves clear tasks that are geared towards company-specific goals.
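To make such a role concept tangible, the following is a minimal, purely hypothetical sketch (in Python) of how role assignments per data domain could be recorded explicitly, so that responsibility stays visible and easy to look up; all domain and role-holder names are invented examples.

    # Hypothetical example: an explicit registry of role assignments per domain.
    DATA_GOVERNANCE = {
        "customer": {"data_owner": "Head of Sales",
                     "data_steward": "CRM Team Lead",
                     "data_manager": "IT Data Services"},
        "product": {"data_owner": "Head of Product Management",
                    "data_steward": "PIM Specialist",
                    "data_manager": "IT Data Services"},
    }

    def responsible_for(domain, role):
        """Look up who currently holds a given role for a data domain."""
        return DATA_GOVERNANCE[domain][role]

    print(responsible_for("customer", "data_steward"))  # -> CRM Team Lead

Whether such assignments live in code, a governance tool, or a simple document matters less than the fact that they are written down and maintained.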
Processes for Data Quality and Master Data Management
The best practice process for improving and ensuring high data quality follows the so-called data quality cycle.
The cycle is made up of an iterative process of analyzing, cleansing and monitoring data quality. The concept of a cycle emphasizes that data quality is not a one-time project but an ongoing undertaking.
The data quality cycle is made up of the following phases:
First, data quality goals and metrics need to be defined according to business needs. These goals should also form part of the overall data quality strategy. There should be a clear understanding of which data should be analyzed: Does a lack of completeness for some data really matter? Which attributes are required for data to be considered complete? How can a data domain (e.g., a customer) be defined?
Then the data is analyzed: questions such as "Which values can the data take?" or "Is the data valid and accurate?" need to be addressed.
Cleansing of the (master) data is normally carried out according to individually engineered business rules.
Enrichment of the data (with, for example, geo-data or socio-demographic information) can add value for downstream systems and business processes.
To ensure that the targeted (master) data quality is maintained, continuous monitoring and checking of the data is essential. This can be automated via software that applies the defined business rules. At the end of the cycle, the original data quality initiative thus transitions seamlessly into its second phase: the ongoing safeguarding of data quality.
The different phases are typically assigned to the aforementioned roles.
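To illustrate, here is a minimal sketch of such a cycle in Python, assuming pandas is available; all column names, targets, and rules are hypothetical examples, and the enrichment phase is omitted for brevity.

    import pandas as pd

    # Phase 1: define goals -- hypothetical completeness targets per attribute.
    COMPLETENESS_TARGETS = {"customer_id": 1.00, "email": 0.98, "postal_code": 0.95}

    def analyze(df):
        """Phase 2: profile the data -- measure completeness per attribute."""
        return {col: 1.0 - df[col].isna().mean() for col in COMPLETENESS_TARGETS}

    def cleanse(df):
        """Phase 3: apply business rules -- standardize, then de-duplicate."""
        df = df.copy()
        df["email"] = df["email"].str.strip().str.lower()  # standardization
        return df.drop_duplicates(subset=["customer_id"])  # duplicate removal

    def monitor(df):
        """Phase 4: check metrics against targets; violations need follow-up."""
        measured = analyze(df)
        return [f"{col}: {measured[col]:.0%} below target {target:.0%}"
                for col, target in COMPLETENESS_TARGETS.items()
                if measured[col] < target]

    customers = pd.DataFrame({
        "customer_id": [1, 2, 2, 3],
        "email": ["  Anna@Example.COM", None, "b@example.com", "c@example.com"],
        "postal_code": ["10115", "80331", "80331", None],
    })
    print(monitor(cleanse(customers)))
    # e.g. ['email: 67% below target 98%', 'postal_code: 67% below target 95%']

In practice, the monitoring step would run on a schedule and feed violations into the data stewards' workflow rather than printing them.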
Data Quality and Master Data Management Software
Most of the technologies on the market today are aligned with the data quality cycle and provide in-depth functionality to assist various user roles in their processes.
The best way of achieving high data quality with technology is to integrate the different phases of the data quality cycle into operational processes and match them with individual roles. Software tools assist in different ways by providing:
Data profiling functions
Data quality functions like cleansing, standardization, parsing, de-duplication, matching, hierarchy management, identity resolution
User-specific interfaces/workflow support
Integration and synchronization with application models
Data cleansing, enrichment and removal
Data distribution and synchronization with data stores
Definition of metrics, monitoring components
Data lifecycle management
Reporting components, dashboarding
Versioning functionality for datasets, issue tracking, collaboration
This list of functions is intended to give an overview of the functional range offered by current data quality and master data management tools.
Every organization should define and prioritize, based on its individual requirements, which specific functions are relevant to it and which will have a significant impact on the business.
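As a concrete taste of the matching and identity-resolution functions listed above, the following toy sketch compares records pairwise using Python's standard-library difflib; the similarity threshold and the record fields are arbitrary assumptions, and real tools use far more sophisticated techniques (phonetic encoding, blocking, machine-learning-based matching).

    from difflib import SequenceMatcher

    def similarity(a, b):
        """Normalized string similarity in [0, 1]."""
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def duplicate_candidates(records, threshold=0.85):
        """Compare records pairwise on name + city; return likely duplicates."""
        pairs = []
        for i in range(len(records)):
            for j in range(i + 1, len(records)):
                key_i = f"{records[i]['name']} {records[i]['city']}"
                key_j = f"{records[j]['name']} {records[j]['city']}"
                score = similarity(key_i, key_j)
                if score >= threshold:
                    pairs.append((records[i]["name"], records[j]["name"],
                                  round(score, 2)))
        return pairs

    crm = [
        {"name": "Mueller GmbH", "city": "Berlin"},
        {"name": "Müller GmbH", "city": "Berlin"},
        {"name": "Schmidt AG", "city": "Hamburg"},
    ]
    print(duplicate_candidates(crm))  # -> [('Mueller GmbH', 'Müller GmbH', 0.92)]

Note the quadratic pairwise comparison: production tools avoid this with blocking or indexing so that only plausible candidate pairs are compared at all.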
The market for data quality and master data management tools is comparatively heterogeneous. Providers can be classified according to their focus or their history in the following groups:
Business intelligence and data management generalists have a broad software portfolio that can also be used for data quality and master data management tasks.
Specialists in service-oriented infrastructures also offer software platforms that can be used to manage master data.
Data quality specialists focus mainly on ensuring data quality and provide tools that can be integrated into existing infrastructures and systems.
Data integration specialists offer tools that are especially useful for matching and integrating data from different systems.
For master data management specialists, master data management is a strategic topic. These providers offer explicit solutions and services for the management of master data.
Data pre-processing for data discovery is a relatively new trend in the business intelligence and data management market. In the context of data quality, data discovery software can be a flexible (though not truly durable) means for business users to address data quality problems.