Enterprise applications often generate volumes of data that proliferate throughout an organisation. Because these applications have a responsibility to produce quality data as well as the associated metadata, the software should do its best to ensure this data is fit for ongoing use and integrates well within the data landscape. Legacy applications are often blamed for flouting these basic principles, but do modern applications always fare much better? Here are five things an application should do to integrate well within a data-centric enterprise:
1. Enforce tight rules on data capture. Avoid capturing general data when bespoke data types can be configured.
Data quality starts at source. Do not give opportunist users the ability to proliferate junk data that is persisted, moved downstream, side-lined from dashboards, and hidden from the eyes of decision makers. Take, for example, the comment field: sometimes abused but normally misused.
Do not allow this field to hold business critical information. Identify the real need for generic fields and allow users to configure their own customised data entry points. If it is a type of data that can be defined more precisely, make it a type that a customer can configure. The comment field may not be obsolete, but it should not exist only because the other choices were too difficult to imagine.
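The idea of replacing the catch-all comment field with customer-configured types can be sketched as follows. This is a minimal illustration, not any specific product's API; the `FieldType` registry, the `po_reference` field, and its pattern are all hypothetical examples of data that might otherwise be buried in a comment.

```python
import re
from dataclasses import dataclass
from typing import Callable

@dataclass
class FieldType:
    """A customer-configurable data entry point with its own validation rule."""
    name: str
    validate: Callable[[str], bool]

# A customer-defined field for purchase-order references, rather than
# letting users bury "PO-12345" somewhere inside a free-text comment.
po_reference = FieldType(
    name="po_reference",
    validate=lambda value: re.fullmatch(r"PO-\d{5}", value) is not None,
)

def capture(field: FieldType, value: str) -> str:
    """Reject junk at the point of entry; data quality starts at source."""
    if not field.validate(value):
        raise ValueError(f"{value!r} is not a valid {field.name}")
    return value
```

A free-text comment would accept anything; a configured type rejects malformed input before it is ever persisted.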
2. Expose a stable schema against which data-intensive consumers can integrate. Isolate and protect consumers from changes to the core schema while simplifying upgrades.
Application upgrades need to be delivered with minimal interruption and manual intervention. Whether it is integrating with other cloud services or with other components on-premises, if a core schema is accessible, it will sooner or later somehow become an integral part of the enterprise’s data empire.
Core schemas evolve and change, which means no-one should rely on a model they think they understand from reading the core schema. A better approach is to expose a stable integration endpoint, whether via web services or directly from a database.
Managing changes to an application’s core schema is a problem for the software vendor to solve, not the data consumer. Not only does this give the vendor more freedom to evolve the core schema, it also lets customers trust the stability of the integration layer during upgrades.
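One way to picture this separation is an adapter the vendor owns: the internal core schema can change freely while consumers only ever see a stable, versioned contract. All the names below (`CoreCustomerRow`, `CustomerV1`) are illustrative assumptions, not any real product's schema.

```python
from dataclasses import dataclass

@dataclass
class CoreCustomerRow:
    """Internal core schema: the vendor is free to rename or restructure this."""
    cust_id: int
    given_nm: str
    family_nm: str
    risk_band_cd: str  # internal code that consumers should never see

@dataclass(frozen=True)
class CustomerV1:
    """Stable integration contract, version 1: what consumers depend on."""
    customer_id: int
    full_name: str

def to_integration_schema(row: CoreCustomerRow) -> CustomerV1:
    """Adapter owned by the vendor: an upgrade changes this mapping,
    never the shape of CustomerV1 that consumers integrate against."""
    return CustomerV1(
        customer_id=row.cust_id,
        full_name=f"{row.given_nm} {row.family_nm}",
    )
```

When the core schema changes, only the adapter is rewritten; consumer integrations keep working unchanged.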
3. Use common business rules to ensure all data consumed is of high quality. Reverse engineer data quality rules and apply validation on data from any channel.
Give the end business users confidence their rules are enforced no matter where the data comes from. Applications typically ingest data from different channels and generate data internally based on that ingested data.
Whether data is ingested via custom channels or the user interface, or is generated from a combination of other data, the same rules should apply. It is not only the functional definition of the rules that should be identical; the implementation should be shared too.
What rules should be enforced, for example, across multiple tables? An ideal starting reference point is the set of rules in the data quality tool currently running over the data the application generates. Reverse engineer the data quality checks so that the application is the real guardian of all data entering its realm.
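The shared-implementation point can be sketched as one rule set consumed by every ingestion path. The rule names and record shapes below are illustrative assumptions; the point is that each channel calls the same `validate` function rather than maintaining its own copy of the rules.

```python
# One rule set, defined once and applied identically to every channel.
RULES = {
    "amount_non_negative": lambda rec: rec.get("amount", 0) >= 0,
    "currency_present": lambda rec: bool(rec.get("currency")),
}

def validate(record: dict) -> list[str]:
    """Return the names of every rule the record violates."""
    return [name for name, rule in RULES.items() if not rule(record)]

def ingest_from_api(record: dict) -> dict:
    failures = validate(record)  # same rules as the UI path
    if failures:
        raise ValueError(f"rejected by rules: {failures}")
    return record

def ingest_from_ui(record: dict) -> dict:
    failures = validate(record)  # identical implementation, not a re-coding
    if failures:
        raise ValueError(f"rejected by rules: {failures}")
    return record
```

Reverse-engineered data quality checks would be added to `RULES` once, and every channel picks them up automatically.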
4. Provide transparent data lineage. Give consumers the opportunity to track the origin of data and what changes might affect it.
Data lineage should not be a privilege for the technical few with knowledge and access; knowing data provenance is a right for all business users who need to trust their data. The traditional black-box conceptual model of an application need not apply to data lineage, so expose transformation rules and the passage of data.
Give fields universal identifiers so that their use can be monitored and the complex relationships between fields can be discovered. If the machine can read it, the machine will do the work of joining the pieces together, learning patterns and inferring dependencies. A data governance tool can then visualise the data lineage and provide business users with evidence to help them source data they trust.
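A minimal sketch of what machine-readable lineage might look like: each field carries a universal identifier, and every transformation records which inputs produced which output, so a governance tool can join the pieces together. The URN scheme and field names here are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LineageEdge:
    """One recorded transformation: which fields fed which output, and how."""
    output_field: str              # universal identifier of the derived field
    input_fields: tuple[str, ...]  # universal identifiers of its inputs
    rule: str                      # human-readable transformation rule

LINEAGE: list[LineageEdge] = []

def derive_total(net_urn: str, tax_urn: str, net: float, tax: float) -> float:
    """Derive a value and record its provenance as a side effect."""
    total = net + tax
    LINEAGE.append(LineageEdge(
        output_field="urn:field:billing.invoice.total",
        input_fields=(net_urn, tax_urn),
        rule="total = net + tax",
    ))
    return total
```

Because the lineage records are structured data rather than documentation, a downstream tool can read them, infer dependencies between fields, and visualise them for business users.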
5. Expose a lightweight API for metadata. Treat metadata integration as being as important as data integration when designing APIs.
The data about your data is just as important as the data itself, so do not make metadata any harder to obtain than it needs to be. This is not just about the machine-readable metadata already on your API or schema; it is also about the business metadata.
Applications are used in different ways, and with each setup and configuration, the metadata should be captured from the users who configure the application. Take, for example, a customer who has configured their application to serve a particular department because its owner needs the data each morning.
A business user would like to refer to their metadata management tool to look up that customer, that department, that owner and that time. Expose custom metadata through APIs and make it possible for customers to define it based on application usage and configuration, not only out-of-the-box defaults.
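The department/owner example could surface through a lightweight metadata endpoint along these lines. The payload shape, field names, and example values are all hypothetical; the point is that customer-configured business metadata is exposed as plainly as the data itself.

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class DatasetMetadata:
    """Business metadata captured when the customer configured the application."""
    dataset: str
    department: str    # customer-configured: who the data serves
    owner: str         # customer-configured: who relies on it
    refreshed_by: str  # customer-configured schedule the owner depends on

def metadata_endpoint(meta: DatasetMetadata) -> str:
    """What a GET on a hypothetical /metadata/<dataset> route might return."""
    return json.dumps(asdict(meta))
```

A metadata management tool can then harvest this JSON and let a business user look up that department, that owner, and that time without asking the technical team.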
Each of these factors is equally important in ensuring that the required data is fit for purpose and integrates seamlessly within the customer’s data landscape. By providing all five of these elements, applications stand the best chance of thriving within a data-centric enterprise. For more information on how you can pursue these goals in your specific landscape, please get in touch; we would be happy to help with a complimentary tailored introduction.