18 Oct '21

Why Data Quality Matters to Every Organisation 

You know that you can only build a strong house on strong foundations, and so it is that in business, you can only build strong decisions on strong data.   Unless you trust your data, how can you ever be confident of your strategic decisions?    

At DTSQUARED, we appreciate and understand that data quality will always be of primary importance to your business.  Think about it; you may deal with data from multiple sources, data created on different systems or data imported by multiple users. Each of these factors will ring alarm bells when you come to take decisions based on data that could potentially be riddled with errors. 

Sir Tim Berners-Lee highlighted more than a decade ago that data is precious and will last longer than the systems themselves – another reason why you should strive to achieve the best possible data quality.

To demonstrate the importance of data quality for every organisation, let’s start by considering each of the main dimensions that define data quality. 

Six Dimensions of Data Quality

  1. Data Completeness is about the availability of your data. For example, if you are looking at customer data entries you would expect to see a customer’s name included for each entry, and this will form part of any completeness check. 
  1. Data Conformity is about ensuring that your data conforms to a standard format.  For example, if looking at data concerning an employee’s salary, you would expect to see a numerical, not alphabetical, entry.
  1. Data Consistency is about making sure that data values do not have conflicting information.  Each user should see a consistent view of the data, including any visible changes made by the user’s own transactions or the transactions of other users.
  1. Data Accuracy is generally achieved by comparing data against an approved source. For example, you might check an address and postcode against a Royal Mail address database.
  1. Data Duplicates can occur when the same data is repeated in your data set.  Running a uniqueness or duplicate check helps you to find and remove these repeats.
  1. Data Timeliness is probably the most difficult to check and is generally linked to data functionality.  For example, if you are to generate a report on a specific data set you expect your data to be available early that day to allow you to create the report.  If it is not there, that is a timeliness issue.
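
To make a few of these dimensions concrete, here is a minimal sketch of completeness, conformity and duplicate checks in Python. The sample records, the field names (`id`, `name`, `salary`) and the function names are illustrative assumptions, not part of any particular tool, and real checks would normally run against your own systems and rules.

```python
from collections import Counter

# Hypothetical sample records; field names are assumptions for illustration.
records = [
    {"id": 1, "name": "Ada Smith", "salary": "42000"},
    {"id": 2, "name": "", "salary": "38000"},
    {"id": 3, "name": "Raj Patel", "salary": "forty thousand"},
    {"id": 1, "name": "Ada Smith", "salary": "42000"},
]


def completeness(rows, field):
    """Share of rows where `field` is present and not blank."""
    def is_filled(value):
        return value is not None and str(value).strip() != ""
    return sum(1 for r in rows if is_filled(r.get(field))) / len(rows)


def conformity(rows, field):
    """Share of rows where `field` can be read as a number."""
    def is_numeric(value):
        try:
            float(value)
            return True
        except (TypeError, ValueError):
            return False
    return sum(1 for r in rows if is_numeric(r.get(field))) / len(rows)


def duplicate_keys(rows, key):
    """Key values that appear on more than one row."""
    counts = Counter(r[key] for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)


print(completeness(records, "name"))   # 0.75: one name is blank
print(conformity(records, "salary"))   # 0.75: one salary is not numeric
print(duplicate_keys(records, "id"))   # [1]: id 1 appears twice
```

Each check returns a simple score or list, which makes it easy to compare results against whatever quality levels you decide are acceptable.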

Apart from these ‘Big Six’ there are other dimensions that you may need to consider depending on how your data is stored.  One of them is an integrity check, which compares data across multiple tables to confirm that related records are present and that the primary relationships within a data set exist.  Another is traceability, which checks whether your data can be traced all the way back to its source.
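
As a rough illustration of an integrity check, the sketch below looks for rows in one table that point to a record missing from another. The `customers` and `orders` tables, the `customer_id` key and the `orphaned_rows` function are hypothetical names chosen for the example.

```python
# Hypothetical parent and child tables, held as lists of dictionaries.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # no matching customer
]


def orphaned_rows(child_rows, parent_rows, key):
    """Return child rows whose `key` value has no match in the parent table."""
    parent_keys = {p[key] for p in parent_rows}
    return [c for c in child_rows if c[key] not in parent_keys]


for row in orphaned_rows(orders, customers, "customer_id"):
    print("Order without a customer:", row["order_id"])
```

An empty result would suggest that the relationship between the two tables holds for every row checked.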

Why do you need high quality data?

The DQ dimensions listed above show clearly how integral the quality of your data is to your company’s success.  Everyone within your organisation needs access to quality data to make informed decisions, do their job efficiently and benefit the business more widely.  Accurately forecasting a return on investment is only possible when the forecast is based on a good-quality data set.

What are the problems and risks associated with poor quality data?

The problems of relying on poor quality data are numerous.  There may be regulatory and compliance problems.  There is also the danger of failing to deliver on SLAs (service level agreements) when poor quality data forces you to rerun a report because of either missing or incorrect data values.

Poor quality data, including data that is inaccurate, incomplete, or out of date, is data that is not fit for purpose and can ultimately cost you time and money.

This is an avoidable risk that you cannot afford to take.  

How do you manage data quality?

Managing data quality is achieved through a cycle of four steps – discover, define, standardise and monitor.  You run through each step of the cycle, and then repeat it.   

  1. Discover and analyse.  In this first step you load and profile your source data to assess its quality.  Profiling your data means getting to know its system-level details and characteristics.  You must do this first to uncover any inconsistencies or strange patterns.
  1. Define.  Accepting that there is no such thing as 100% perfection, your next step will be to define your data quality target and threshold levels.  A data quality target is the quality level you aspire to achieve – if you want your data quality to be 95%, that is your target.  The threshold is the minimum data quality level you will accept. Anything between threshold and target is acceptable.  Now you define the rules.
  1. Apply. At this stage of the process you apply the rules to standardise and cleanse your data. How complex this stage is will depend on how good your data is and just how many changes need to be made to enrich your data. As you complete this phase you may decide to overwrite the data on the source system, or to store it elsewhere for comparison.
  1. Monitor. Hopefully, the steps you have already taken in this cycle will be reflected in an upward data performance trend.  In this phase you will monitor the impact of these changes and be ready to target and repair any data set where the trend is going down, not up. Measuring and monitoring is about understanding your data trends and keeping them in check to ensure that they conform to your data quality rules.  
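
The relationship between target, threshold and trend described in the cycle above can be sketched in a few lines of Python. The 95% target comes from the example in the Define step; the 90% threshold, the sample scores and the function names are assumptions made purely for illustration.

```python
def classify(score, threshold, target):
    """Place a quality score (0 to 1) relative to the threshold and target."""
    if not 0 <= threshold <= target <= 1:
        raise ValueError("expected 0 <= threshold <= target <= 1")
    if score >= target:
        return "meets target"
    if score >= threshold:
        return "acceptable"
    return "below threshold"


def is_declining(scores):
    """True when the latest score is lower than the one before it."""
    return len(scores) >= 2 and scores[-1] < scores[-2]


TARGET = 0.95     # the 95% target used as an example in the Define step
THRESHOLD = 0.90  # assumed minimum acceptable level

# Hypothetical quality scores from three successive runs of the cycle.
history = [0.93, 0.96, 0.91]

print(classify(history[-1], THRESHOLD, TARGET))  # acceptable
print(is_declining(history))                     # True
```

In this example the latest score is still acceptable, but the downward trend is the signal to investigate that data set before it drops below the threshold.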

Data quality is of huge importance to your business because it underpins good decisions, helps you to make sure that you are meeting your regulatory and compliance obligations, and gives you confidence that you can use your data to achieve the best possible return on investment.

Implementing the DQ dimensions required to achieve data quality and continuing to monitor the quality of your data by following the cycle of discovery, definition, standardisation, and monitoring will help to ensure that you keep your business in the best possible shape. 

Think of it as a regular health check for your data.  

Want to find out how your data stands up? Get in touch with us today to discuss your data challenges and the solutions we can implement to make the most out of your data. We would be delighted to work with you on an initial Data Maturity Assessment too.

https://www.bcs.org/articles-opinion-and-research/isnt-it-semantic/
