The Data Daily

What is a cloud data warehouse and why is it important?

The benefits of cloud technology are indisputable and we are rapidly witnessing the advantages it can offer to businesses. Rob Mellor, VP and GM EMEA, WhereScape, discusses five benefits of using a cloud data warehouse in today’s digital landscape.

We are seeing business expectations for on-demand data explode, with many data warehousing teams beginning to transition their data warehousing efforts to the cloud. The need to efficiently pull together data from a wide range of ever-evolving data sources and present it in a consumable way to a broadening audience of decision-makers means cloud data warehousing is proving invaluable.

Here, we cover the basics of cloud data warehousing: how the cloud data warehouse compares to the traditional data warehouse, and the benefits of a cross-cloud solution.

A cloud data warehouse is a database service hosted online by a public cloud company. It has the functionality of an on-premises database but is managed by a third party, can be accessed remotely and its memory and compute power can be shrunk or grown instantly.

A traditional data warehouse is an architecture for organising, storing and accessing ordered data, hosted in an on-premises data centre owned by the organisation whose data is stored within it. It has a finite size and power.

A cloud data warehouse is a flexible volume of storage and compute power, which is part of a much bigger public cloud data centre and is accessed and managed online. Storage and compute power are merely rented. Its physical location is largely irrelevant, except in countries and/or industries whose regulations dictate that data must be stored in the same country.

The benefits of a cloud data warehouse can be summarised by five key points:

Where an on-premises database offers only physical access in the data centre, a cloud data warehouse can be accessed remotely from anywhere. Even staff who live near the data centre benefit, as they can now troubleshoot out of hours from home or wherever they happen to be. Remote access also means companies can hire staff based anywhere, which opens up talent pools that were previously unavailable. Cloud data warehousing is self-service, so provisioning it does not depend on the availability of specialist staff.

Data centres are expensive to buy and maintain. Property to store them in needs to be properly cooled, insured and expertly staffed, and the databases themselves come at a huge cost. Cloud data warehousing allows the same service to be enjoyed, but you only pay for the computing and storage power you need, when you need it. Now, with elastic cloud services such as Snowflake, compute and storage can be bought separately, in different amounts. You really only have to pay for what you’re using and you can instantly close or downsize capabilities you no longer need.
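To make the pay-for-what-you-use arithmetic concrete, here is a minimal sketch of a monthly bill where storage and compute are charged separately. The rates and usage figures are hypothetical, chosen only for illustration; they are not any provider's real prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical prices for illustration, not a real provider tariff."""
    storage_per_tb_month: float  # charge per terabyte stored for a month
    compute_per_hour: float      # charge per hour that compute is running


def monthly_cost(rates: Rates, stored_tb: float, compute_hours: float) -> float:
    """Bill storage and compute independently, then add them together."""
    storage = rates.storage_per_tb_month * stored_tb
    compute = rates.compute_per_hour * compute_hours
    return storage + compute


rates = Rates(storage_per_tb_month=20.0, compute_per_hour=2.0)

# Same 10 TB of data in both months; only the compute usage differs.
busy_month = monthly_cost(rates, stored_tb=10, compute_hours=300)
quiet_month = monthly_cost(rates, stored_tb=10, compute_hours=50)

print(f"Busy month:  {busy_month:.2f}")
print(f"Quiet month: {quiet_month:.2f}")
```

Under these assumed figures the storage charge stays the same from month to month, while the compute charge falls as soon as less processing is needed.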

Cloud service providers compete to offer use of the most performant hardware for a fraction of the cost that would be incurred to reproduce such power on-premises.

Upgrades are performed automatically, so you always have the latest capabilities and do not experience downtime in upgrading to the latest ‘version’. Some on-premises databases offer faster performance, but not at the cost and availability of the Infrastructure-as-a-Service (IaaS) that cloud providers offer.

Opening a cloud data warehouse is as simple as opening an account with a provider such as Microsoft Azure, Amazon Redshift, Google BigQuery or Snowflake. The account can be grown, shrunk or even closed instantly. Users can see the costs involved before they change the amount of compute or storage they rent. This scalability has led to the coining of the phrase ‘Elastic Cloud’.

Hosting data in a cloud data warehouse means you can switch providers if and when it suits changes in business strategy. Staying database-agnostic means you have the agility to upsize, downsize or switch completely. Metadata-driven automation software allows you to lift and shift entire data infrastructures on and off the cloud data warehouse if desired. It also allows different teams within the same company to work with the database and hybrid cloud structure that best suits their needs.

A cost analysis is vital in estimating how much money a cloud data warehouse would save the business. Different cloud providers have different pricing structures that need bearing in mind. More established providers such as Amazon and Microsoft rent nodes and clusters, so your company uses a defined section of the server. This makes pricing predictable and constant, but sometimes maintenance of your particular node is needed.

Snowflake and Google offer a serverless system, which means the cluster locations and numbers are not defined and so are irrelevant. Instead, the customer is charged for exactly the amount of compute or processing power it consumes. However, in bigger companies it is often difficult to predict the number of users and the size of a process before it runs. It is possible for queries to be much bigger, and to cost much more, than expected.
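As a rough starting point for such a cost analysis, the two pricing styles can be compared with a few lines of arithmetic. The sketch below assumes a flat hourly price per rented node and a flat price per unit of compute consumed; every function name and figure is hypothetical and real tariffs are more detailed.

```python
HOURS_IN_MONTH = 730.0  # assumed average number of hours in a month


def provisioned_cost(nodes: int, price_per_node_hour: float) -> float:
    """Node-based pricing: the rented nodes are paid for all month, used or not."""
    return nodes * price_per_node_hour * HOURS_IN_MONTH


def consumption_cost(units_consumed: float, price_per_unit: float) -> float:
    """Consumption pricing: the bill follows the compute actually used."""
    return units_consumed * price_per_unit


def break_even_units(nodes: int, price_per_node_hour: float,
                     price_per_unit: float) -> float:
    """Usage level at which both models cost the same for the month."""
    return provisioned_cost(nodes, price_per_node_hour) / price_per_unit


# Hypothetical figures: two nodes at 1.00 per hour versus 5.00 per unit consumed.
fixed = provisioned_cost(nodes=2, price_per_node_hour=1.0)
threshold = break_even_units(nodes=2, price_per_node_hour=1.0, price_per_unit=5.0)

for units in (100, 400):
    variable = consumption_cost(units, price_per_unit=5.0)
    cheaper = "consumption" if variable < fixed else "provisioned"
    print(f"{units} units: fixed={fixed:.0f}, variable={variable:.0f}, cheaper={cheaper}")

print(f"Break-even at {threshold:.0f} units per month")
```

Below the break-even point the consumption model is cheaper under these assumptions; above it, the predictable node-based bill wins, which is why an unexpectedly large query matters more on a consumption tariff.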

Each cloud provider has its own suite of supporting tools for functions such as data management, visualisation and predictive analytics, so these particular needs should be factored in when deciding on a provider.

Using a cloud-based data warehouse platform means you can gather even more data from a multitude of data sources and scale instantly and elastically to support virtually unlimited users and workloads. With automation helping to deliver return on investment, businesses will be able to manage the influx of Big Data, automate manual processes and maximise the return on cloud.
