17 Sep '21

Looking back on Collibra’s Data Citizens Conference 2021

In June this year Collibra hosted their annual Data Citizens conference with more sessions than ever covering a host of different topics. In case you didn’t manage to catch them all, here we have selected three stand-out sessions that cover the product vision and strategy at Collibra, how data has the power to unite us, as well as insights into the future of data management.

1) The product vision and strategy at Collibra with Jim Cushman

In the first part of this session, Jim explained the core product vision for a typical Collibra user: to easily gain access to, and get valuable insights from, their data. Jim demonstrated that when a user simply searches for a term, the platform displays related data sets, quality scores, certifications and diagrams for additional context. Where the quality score is low, users can raise suggestions to correct the data. What’s more, Jim showed how it is possible to ‘shop’ for data and check it out, meaning that access is granted through workflows, with links to BI reports provided.

Jim went on to explain Collibra’s strategy for managing ever-growing volumes of metadata, showcasing an example of how their new data quality tool can be used to locate high quality sources of data. Collibra have set themselves high targets for performance and scaling, aiming to ingest at a speed of 50k records per hour and double this performance every 6 months. This would mean managing 1 billion technical assets by the end of 2022, demonstrating that Data Catalogue performance is a top priority for the company.
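As a rough back-of-the-envelope illustration of what that doubling target implies, the sketch below projects the ingest rate over 18 months and estimates how long one billion records would take at each rate. The starting month, the whole-step doubling, the treatment of ‘technical assets’ as records and the function names are all assumptions made for illustration, not figures or methods from the session.

```python
def projected_rate(base_rate_per_hour, months_elapsed, doubling_period_months=6):
    """Ingest rate after a number of whole doubling periods (assumed step-wise doubling)."""
    doublings = months_elapsed // doubling_period_months
    return base_rate_per_hour * 2 ** doublings


def hours_to_ingest(total_records, rate_per_hour):
    """Hours needed to ingest total_records at a constant rate."""
    return total_records / rate_per_hour


BASE_RATE = 50_000            # records per hour, the target quoted in the session
TARGET_ASSETS = 1_000_000_000  # one billion, treated here as records (assumption)

for months in (0, 6, 12, 18):
    rate = projected_rate(BASE_RATE, months)
    days = hours_to_ingest(TARGET_ASSETS, rate) / 24
    print(f"month {months:>2}: {rate:>9,} records/hour, 1bn records in about {days:,.0f} days")
```

Under these assumptions, three doublings take the rate from 50k to 400k records per hour, which is the kind of throughput that makes a billion-asset catalogue plausible within months rather than years.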

Jim then delved deeper into Collibra’s integration with their new data quality tool. He demonstrated the impressive progress that has already been made in only a couple of months, including being able to set up rules with one click and see immediate results. Perhaps even more impressive was the fact that the results were reflected immediately in Collibra, showing that the integration between the tools is already seamless.

With Collibra now available on the marketplaces of all three main cloud partners, including Azure, Jim shared his goals for the coming year, which include bringing data quality controls into the Collibra platform and introducing what he calls ‘continuous data profiling’.

2) How data has the power to unite us 

This session was broken down into three smaller talks hosted by a range of experts on managed data, trusted data and accessible data.

Managed Data – Felix Van de Maele (Founder & CEO, Collibra) and Zhamak Dehghani (Director of Emerging Technologies, ThoughtWorks)

In this session Collibra’s founder and CEO spoke to Zhamak Dehghani about her work on Data Mesh, a new approach to analytical data management, with a forward-looking view of what the data landscape might look like as the volume and use of data expand massively. Together they discussed the four main principles of Data Mesh:

  • Domain-orientated decentralised data ownership and architecture – breaking down data by domain and giving ownership of each domain to a group so they become experts in this data.
  • Data as a product – treating data as a product offered to its consumers, much as goods are made available for a customer to purchase.
  • Self-serve data infrastructure as a platform – making data available on a self-serve platform, rather like infrastructure is available as a service on cloud platforms.
  • Federated computational governance – applying governance to data computationally, rather than relying on data stewards to manually approve and discuss data issues.

Trusted Data – Kirk Haslbeck (VP, Engineering, Collibra and Founder, OwlDQ) and Viktor Mayer-Schönberger (Author of DELETE and BIG DATA)

The focus of this talk was data quality and how it can enable continuous delivery of trusted data. The key to getting your data model right lies not in the model itself, but in the quality of the data that you put into it.

The speakers made an interesting point about self-driving cars, which improved only after being fed generated predictive data rather than the massive amounts of real data already gathered from sensors. This alluded to OwlDQ’s predictive data capabilities.

Finally, the conversation moved on to the data revolution, drawing comparisons with the industrial revolution. Kirk and Viktor highlighted that it was not a single invention that made the industrial revolution but thousands, and the same could be said about data. The incremental improvements are what to look out for; there is no ‘silver bullet’.

With OwlDQ now acquired by Collibra, we should expect a lot more data quality initiatives to be presented in the future.

Accessible Data – Paul Zikopoulos (VP of Big Data Cognitive Systems, IBM) and Stijn ‘Stan’ Christiaens (Founder & Chief Data Citizen, Collibra)

This session was an honest discussion between two data experts: Stijn, a founder and Chief Data Citizen at Collibra, and Paul, a VP at IBM (also an author of Hadoop for Dummies). They spoke about the three ‘layers of mud’ around data:

1) The sheer volume of data and getting a handle on it

2) Data solutions are not a magic wand; expertise and skill are needed

3) Governance is essential, and rather than doing the ‘least amount to comply’, organisations should do everything they can to optimise their use of data.

To drive this process of optimisation, teams must be curious and use data exhaust to their advantage. Their top three tips were:

1) Keep learning 

2) Outperform with data by investing in it 

3) Use governance to explain your data with confidence. 

When it comes to making data accessible to everyone in an organisation, it is essential to make it understandable and to have clear rules on what you can and cannot do with it.

3) A conversation with Neil deGrasse Tyson

This was a fascinating and entertaining talk from the astrophysicist, which put many of the data challenges our clients face today into context by considering the history of collecting data about the universe. For example, an enormous amount of data has already been collected, but unless it is both precise and accurate (not one or the other), you cannot draw meaningful conclusions. This has implications for Artificial Intelligence and Machine Learning, and for the vast training data sets they depend on.
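To make the distinction between precise and accurate concrete, here is a minimal sketch using made-up measurements of a quantity whose true value is assumed to be 10.0. The sample values and function names are hypothetical and purely illustrative: bias (how far the average is from the truth) stands in for accuracy, and spread (how tightly the readings cluster) stands in for precision.

```python
from statistics import mean, stdev


def bias(measurements, true_value):
    """Accuracy proxy: how far the average reading sits from the true value."""
    return mean(measurements) - true_value


def spread(measurements):
    """Precision proxy: sample standard deviation of the readings."""
    return stdev(measurements)


TRUE_VALUE = 10.0  # assumed ground truth for the illustration

# Tightly clustered readings that are all off target: precise but not accurate.
precise_but_inaccurate = [12.0, 12.1, 11.9, 12.0]
# Readings centred on the truth but widely scattered: accurate but not precise.
accurate_but_imprecise = [7.0, 13.0, 8.5, 11.5]

for name, data in [
    ("precise but inaccurate", precise_but_inaccurate),
    ("accurate but imprecise", accurate_but_imprecise),
]:
    print(f"{name}: bias={bias(data, TRUE_VALUE):+.2f}, spread={spread(data):.2f}")
```

Neither data set supports a confident conclusion on its own: the first is consistently wrong, and the second is right only on average.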

Neil shared that his biggest concern going forward is the growth of data surpassing people’s ability to interpret it fast enough, alluding to the use of machines in breaking down and presenting data in the most efficient and effective way. This could be another hint that OwlDQ will become more than just a data quality tool, also taking over some of the manual work currently done by people.

Overall, it was an incredibly informative and interesting event and, as a Collibra partner, we were excited to see what the future holds as Collibra’s capabilities develop. The main takeaway for us was that, as the data landscape continues to evolve at an exponential rate, the market importance of data will only increase as organisations rely more heavily on, and extract more value from, their data assets. From our years of experience helping clients with every kind of data strategy, we know all too well how important it is to have the right technologies in place, so it was encouraging to see Collibra facilitate this debate with industry leaders as well as showcase their new data quality tool.

At DTSQUARED, we are committed to helping our clients fully harness the power of data by implementing and utilising tools like Collibra to unlock the value of their data and turn it into a strategic, competitive asset. If you would like to know more about any of the topics covered, or to find out how we can help your organisation, please get in touch with our team, who will be happy to discuss them.

Get in touch for a free session with our data experts