We need to start with a few definitions. The EDM Council is doing a great job in clarifying and socialising some key terms that are often interchanged. We’re seeing increased adoption of these definitions; however, until they are used industry-wide, a cautious approach is required.
The EDM Council defines Data Management as ‘the development, execution and supervision of plans, policies, programs and practices which deliver control and protection, and enhance the value of data and information assets throughout their lifecycles’.
This definition implies that Data Management incorporates a number of sub-disciplines. Data Governance is one of the key execution components; however, to successfully deliver a Data Management programme, Data Quality and Data Architecture must also be considered.
For the sake of brevity, I’m focusing solely on Data Governance in this blog. In a future blog we’ll come back to the broader Data Management topic and discuss the ways in which Snowflake can facilitate the establishment of the Data Quality and Data Architecture disciplines too.
The Data Governance discipline is still climbing its own maturity curve, and the processes and technology support available to Data Governance professionals are evolving at speed. At its core, Data Governance sets the standards, defines the rules and establishes the policies that help an organisation create value and insight from its data and comply with regulatory requirements. One of DTSQUARED’s key partners, Collibra, developer of an industry-leading Data Governance platform, provides an article on Data Governance that expands considerably on this definition.
When starting from a clean slate, establishing Data Governance in an organisation requires data owners to be identified, and those owners then need to work together to catalogue and manage the organisation’s critical data. What is critical data? It is data that is critical to a business process or outcome; for example, regulatory reporting requirements will define a subset of critical data elements (CDEs) in an organisation.
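To make the idea of a catalogue of critical data more concrete, here is a minimal Python sketch. It is illustrative only: the `DataElement` class, its field names, the example owners and the rule that an element counts as a CDE when at least one business process depends on it are all assumptions made for this example, not part of any EDM Council or Collibra definition.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataElement:
    """One catalogued data element (all field names are hypothetical)."""

    name: str
    owner: str
    # Business processes or outcomes that depend on this element.
    critical_to: set[str] = field(default_factory=set)

    @property
    def is_critical(self) -> bool:
        # Assumption: an element is a CDE if at least one process depends on it.
        return bool(self.critical_to)


def critical_elements_by_owner(catalogue):
    """Group the names of critical data elements under their data owner."""
    grouped: dict[str, list[str]] = {}
    for element in catalogue:
        if element.is_critical:
            grouped.setdefault(element.owner, []).append(element.name)
    return grouped


# Made-up catalogue entries, purely for illustration.
catalogue = [
    DataElement("counterparty_id", "Risk", {"regulatory reporting"}),
    DataElement("trade_notional", "Finance", {"regulatory reporting", "P&L"}),
    DataElement("marketing_opt_in", "Marketing"),
]
cdes = critical_elements_by_owner(catalogue)
```

Grouping by owner reflects the point made above: each identified data owner ends up with a clear list of the critical elements they are accountable for.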
Cloud data warehouses have seen a significant rise in popularity recently and the Snowflake Data Cloud is leading this charge. Designed as cloud native from the outset, it enables fast delivery and is cheaper to operate and easier to use than legacy on-premises data warehouses. Indeed, this cost reduction (i.e. pay-per-use) puts the collation of all operational data into a single platform within reach of organisations that previously didn’t have the resources required to take operational advantage of a data management platform.
Snowflake’s fundamental design philosophy, the separation of storage and compute, allows compute to be dedicated to individuals or teams. This means that if one team is maxing out its allocated compute, other teams will still see a consistent and unaffected service. The same architectural philosophy also underpins many other capabilities, such as ‘Zero Copy Clone’ (the ability to copy an entire database in seconds without increasing storage costs) and DevOps or DataOps practices, which are now viable on large monolithic data stores for the first time.
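The reason a clone can be created in seconds without adding storage can be illustrated with a toy Python model. This is a conceptual sketch under stated assumptions, not Snowflake’s implementation: the `Table` class, its methods and the `stored_blocks` helper are hypothetical names, and the model simply treats a table as a list of references to immutable storage blocks, so that cloning copies the references rather than the data.

```python
from __future__ import annotations


class Table:
    """Toy table: an ordered list of references to immutable storage blocks."""

    def __init__(self, blocks=None):
        self.blocks = list(blocks or [])

    def clone(self) -> Table:
        # Copies only the list of references; the blocks themselves are shared.
        return Table(self.blocks)

    def append_rows(self, rows) -> None:
        # New data lands in a new block that only this table references.
        self.blocks.append(tuple(rows))

    def rows(self):
        return [row for block in self.blocks for row in block]


def stored_blocks(*tables):
    """Count distinct blocks across tables (a stand-in for storage cost)."""
    return len({id(block) for table in tables for block in table.blocks})


prod = Table()
prod.append_rows([1, 2, 3])

dev = prod.clone()                      # no data is copied at this point
blocks_after_clone = stored_blocks(prod, dev)

dev.append_rows([4])                    # only the changed data adds storage
blocks_after_write = stored_blocks(prod, dev)
```

In this model the clone costs nothing extra until one side changes, and a change in the clone never alters the original, which is what makes disposable development and test copies practical.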
Snowflake not only provides capabilities that equal or exceed those offered by its competitors, it also includes capabilities that those competitors, especially those hosted on-premises, will likely never be able to add to their offerings. These include ‘Time Travel’, ‘Secure Data Sharing’ and the ‘Data Marketplace’, and the list goes on. This great article provides further details of Snowflake’s offering, and one of our own blogs provides a view on the Snowflake product pipeline as it further widens the gap with its competitors.
Chicken or Egg?
Pulling both these strands together, we’re often asked one fundamental question: what should you do if you are considering a Snowflake implementation but don’t currently have Data Governance in place? Is Data Governance a prerequisite to Snowflake adoption, or do we need data to be migrated to Snowflake before it can be subject to Data Governance? Every business and every Snowflake adoption is different but, as a general rule, here’s what we believe you should do.
I’m reminded at this point of the Chinese proverb, ‘the best time to plant a tree was 20 years ago, the second-best time is now’. Establishing Data Governance takes time and it can and should be started straight away.
Data Governance isn’t strictly a prerequisite to a Snowflake implementation, and making it one would delay the realisation of the benefits of Snowflake. However, we would not recommend any significant re-architecture or implementation without also introducing Data Governance at the same time. A lot of information that is valuable to both the Snowflake implementation and the Data Governance introduction can be uncovered in a single exercise, benefiting the organisation on both fronts.
The ideal approach for an organisation without either Snowflake or Data Governance is for them to be introduced in parallel, with data being governed as part of the process of migrating it to Snowflake. If the governance work runs short of migrated data, it can always work ahead of the Snowflake migration. Equally, if the governance work falls behind the Snowflake adoption, then resourcing and/or prioritisation should be reviewed so that the governance work can close the gap. Starting and finishing both at around the same time is an effective way for an organisation to deliver and leverage the benefits of a modern cloud data platform containing a governed data set.
For the sake of balance, this same timing conundrum applies equally to any combination of Data Governance and a (cloud) data management platform. Given the current capabilities and cost of the Snowflake Data Cloud and its product roadmap, why would you even consider any other platform to store your organisation’s critical data?
For your Data Governance needs or your Snowflake needs or, dare we say it, a little of both together, please speak to our team of experts at DTSQUARED. We would be happy to offer a complimentary tailored introduction on how your organisation can better leverage Data Governance and the Snowflake Data Cloud.