The Top Five Things to Consider When Selecting Your Data Storage Solution
Do you need a new (or better) data storage solution? Here are the top five things to consider when selecting one.
Data is redefining the landscape of every sector, presenting challenges but, more importantly, big opportunities when used in the right way. This has made it arguably the most powerful tool for companies, and for industries more widely, to leverage going forward. A key factor to consider when embarking on a data strategy is data storage, so in this blog we have broken down the essential considerations to keep in mind when selecting the best data storage solution for you.
Cost Effectiveness
One of the considerations when selecting a repository solution is its projected cost effectiveness. Generally speaking, on-premises storage solutions require substantial upfront investment and involve ongoing maintenance costs that may render such options uneconomic. It is therefore not surprising to see many smaller-scale organisations select off-premises cloud solutions where the vendor provides a Data Warehouse as a Service (DWaaS) solution.
One of the differentiators between DWaaS providers is the cost structure. Pricing is primarily divided into two categories: package subscription and pay-per-use. Subscription-based services require the customer to specify the amount of storage and compute power that will be provided during a defined time period. A pay-per-use pricing structure is often more efficient for variable workloads, as the user pays only for utilised resources.
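To make the difference between the two cost structures concrete, the following minimal sketch compares them for a workload whose usage varies from month to month. All prices and usage figures are hypothetical, chosen purely for illustration, and do not reflect any vendor's actual pricing.

```python
# Hypothetical figures for illustration only; not real vendor pricing.
MONTHLY_SUBSCRIPTION_FEE = 1000   # flat fee for a fixed package of storage and compute
RATE_PER_COMPUTE_HOUR = 4         # pay-per-use price for one hour of compute


def subscription_cost(months: int, monthly_fee: int) -> int:
    """Total cost when a fixed package is paid for every month, used or not."""
    return months * monthly_fee


def pay_per_use_cost(hours_used_per_month: list[int], rate_per_hour: int) -> int:
    """Total cost when only the compute hours actually consumed are billed."""
    return sum(hours * rate_per_hour for hours in hours_used_per_month)


# Assumed usage pattern: one busy month between two quiet ones.
usage = [100, 400, 50]

subscription_total = subscription_cost(len(usage), MONTHLY_SUBSCRIPTION_FEE)
pay_per_use_total = pay_per_use_cost(usage, RATE_PER_COMPUTE_HOUR)

print(f"Subscription: {subscription_total}")
print(f"Pay-per-use:  {pay_per_use_total}")
```

With steady, heavy usage the comparison can reverse, so the calculation is worth repeating with an organisation's own usage profile.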
Scalability
Scalability refers to a system’s ability to operate at greater capacity and accommodate a larger workload when necessary. For example, in one of our recent Social Housing blogs we discussed the growing importance and potential of ‘smart’ appliances to monitor clients’ wellbeing. The rate of data collection from these may vary throughout the year, so the storage capacity and compute power required to interpret this data may not be constant. In this instance, using a DWaaS provider capable of allocating additional storage and compute resources on an automated basis reduces the cost otherwise spent on reserve capacity.
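The saving from automated scaling can be sketched as follows, assuming a hypothetical quarterly storage demand and a provider that allocates capacity in fixed increments. The figures and function names are illustrative assumptions rather than a description of any provider's behaviour.

```python
import math


def fixed_provisioning(demand_tb: list[int]) -> list[int]:
    """Provision for the peak in every period, as reserve capacity would require."""
    peak = max(demand_tb)
    return [peak for _ in demand_tb]


def elastic_provisioning(demand_tb: list[int], increment_tb: int) -> list[int]:
    """Provision each period's demand, rounded up to the next allocation increment."""
    return [math.ceil(d / increment_tb) * increment_tb for d in demand_tb]


def unused_capacity(provisioned_tb: list[int], demand_tb: list[int]) -> int:
    """Total capacity paid for but not used across all periods."""
    return sum(p - d for p, d in zip(provisioned_tb, demand_tb))


# Assumed demand in TB per quarter (hypothetical values).
demand = [12, 7, 5, 18]

fixed = fixed_provisioning(demand)
elastic = elastic_provisioning(demand, increment_tb=5)

print("Unused with fixed provisioning:  ", unused_capacity(fixed, demand))
print("Unused with elastic provisioning:", unused_capacity(elastic, demand))
```

In this made-up scenario the fixed approach leaves 30 TB of capacity idle across the year, against 8 TB for the elastic approach.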
Ease of Upload
In terms of ease of upload, particular attention should be paid to two capabilities: compatibility with ETL tools and support for the organisation’s bespoke data structuring needs (i.e. the ability to store structured and/or semi-structured data). While there may be an immediate need to store only structured data, organisations also need to consider future-proofing and long-term ambitions.
When assessing ETL support, organisations need to consider both their current and their desired future-state compatibility requirements. While still important, this is becoming less of an issue, since all major DWaaS solutions integrate with most popular ETL tools (e.g. Fivetran, DataStage).
Competing DWaaS providers offer varying levels of support for uploading and manipulating semi-structured and unstructured data. Returning to the social housing example, we highlighted the importance of allowing clients to proactively submit feedback to the Housing Association through a self-service portal. The majority of submitted data would most likely be unstructured or semi-structured, such as photos of appliances in need of maintenance or emails describing issues or emerging needs. Housing Associations may therefore need a DWaaS solution which offers best-of-breed native support for semi-structured data.
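To illustrate what semi-structured data looks like in practice, here is a minimal sketch of tenant feedback records whose fields vary from one submission to the next. The field names and records are hypothetical; the sketch uses only Python's standard json module and is not tied to any particular DWaaS product.

```python
import json

# Hypothetical feedback submissions; note that the fields differ per record.
raw_submissions = """
[
  {"tenant_id": 101, "channel": "portal", "issue": "Boiler not heating",
   "attachments": ["boiler.jpg"]},
  {"tenant_id": 102, "channel": "email", "issue": "Damp in bathroom",
   "appliance": {"type": "extractor fan", "age_years": 6}}
]
"""


def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested objects into dotted column names, e.g. appliance.type."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


rows = [flatten(record) for record in json.loads(raw_submissions)]

# The union of all keys forms the column set; fields a record lacks come back as None.
columns = sorted({column for row in rows for column in row})
print(columns)
for row in rows:
    print([row.get(column) for column in columns])
```

A platform with native support for semi-structured data can reduce the need for this kind of flattening step before loading.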
Ease of Access
Considerations around access should focus on the platform’s ability to provide authenticated applications with uninterrupted access to stored data, whilst also overcoming the challenge of concurrency.
Concurrency refers to the platform’s ability to run multiple disparate processes on the same data. Traditional warehouses had processes carefully scheduled so as not to interfere with each other. Finding “free periods” becomes increasingly difficult with growing levels of data utilisation. Any delayed or ad-hoc query may also cause a considerable knock-on effect for subsequent processes.
To give an example, banks operating from within eurozone states are required to submit the AnaCredit report, containing information regarding individual loans, to the ECB by the end of every month. The same datasets used to generate these reports are likely to be used to derive incurred interest, exposure to risk and other figures necessary for banks’ operations. Reporting and operational processes therefore compete for the same data, often at the same time.
Some modern DWaaS providers are built on a shared-data architecture which allows them to run multiple processes whenever required, without restricting others from accessing the data being processed.
Data Sharing Capabilities
Sharing data with external partners is often accompanied by two challenges: how do I ensure that data is shared securely, and how can I protect sensitive data from being accidentally revealed?
Data in transit is data at risk, and while encryption and similar methods mitigate that risk, shared data may still be retained by the recipient beyond the intended period.
Particularly within the consulting industry, client organisations tend to expose vast amounts of proprietary information to contracted consultancies in order to allow for the development of the most effective solutions. Often, it is only the integrity of the consultancy that guarantees the appropriate retention of shared data.
Modern DWaaS solutions tackle this challenge at its foundation by providing the recipient with governed access as opposed to sending an entire dataset. This access is only valid for as long as the sharing party allows.
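The idea of governed, time-limited access can be sketched in a few lines. The class below is a hypothetical model, not any provider's API: the data stays with the owner, and every read is checked against an expiry time and a revocation flag. It models the access check only and is not a real security boundary.

```python
from datetime import datetime, timedelta, timezone


class GovernedShare:
    """Hypothetical model of read access that the data owner can expire or revoke."""

    def __init__(self, rows: list[dict], valid_for: timedelta):
        self._rows = rows  # the data stays with the owner
        self._expires_at = datetime.now(timezone.utc) + valid_for
        self._revoked = False

    def revoke(self) -> None:
        """The owner withdraws access immediately."""
        self._revoked = True

    def read(self) -> list[dict]:
        """The recipient queries the data; this fails once access has ended."""
        if self._revoked or datetime.now(timezone.utc) >= self._expires_at:
            raise PermissionError("Access to the shared data has ended")
        return [dict(row) for row in self._rows]


# Illustrative usage with made-up data.
share = GovernedShare([{"loan_id": 1, "balance": 2500}], valid_for=timedelta(days=30))
print(share.read())

share.revoke()
try:
    share.read()
except PermissionError as error:
    print(error)
```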
The other challenge revolves around unauthorised users accessing sensitive data or personally identifiable information (PII), such as an email address. Securing sensitive data can be accomplished through “masking”, a process in which data is obfuscated by replacing it with modified content.
Modern DWaaS solutions allow administrators to configure dynamic data masking that hides sensitive data from users depending on the authorisation granted by their roles.
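As a minimal sketch of role-based dynamic masking, the example below returns either the stored email address or a masked version depending on the caller's role. The role names and the masking rule are assumptions made for illustration; real platforms configure this through their own policy mechanisms.

```python
# Hypothetical set of roles permitted to see unmasked email addresses.
ROLES_WITH_FULL_ACCESS = {"data_protection_officer"}


def mask_email(email: str) -> str:
    """Replace the local part of an email address, keeping only the domain."""
    _, _, domain = email.partition("@")
    return f"*****@{domain}"


def read_email(email: str, role: str) -> str:
    """Return the stored value or a masked version, depending on the caller's role."""
    if role in ROLES_WITH_FULL_ACCESS:
        return email
    return mask_email(email)


stored = "jane.doe@example.org"
print(read_email(stored, "data_protection_officer"))  # jane.doe@example.org
print(read_email(stored, "analyst"))                  # *****@example.org
```

The stored value is never altered; only what each role sees at query time changes.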
Over the last few years, we have witnessed the rising popularity of DWaaS providers. One of the most popular choices is Snowflake, an industry-leading solution provider that recognises DTSQUARED as a Select Service Partner. Snowflake excels in the capabilities highlighted above and offers a wide range of sophisticated functionality relevant to Housing Associations, such as instant dynamic upscaling, native support for unstructured and semi-structured data, and support for concurrency. DTSQUARED has successfully helped many of our clients to implement and enhance their data warehousing solutions, whether that involved Snowflake or other leading DWaaS technologies. If you’d like to talk to us about how we can help you prepare your data and infrastructure to deliver their full potential, please speak to one of our experts.