Amazon Redshift Pricing
Amazon Web Services (AWS) was the first public cloud provider to offer a cloud-native, petabyte-scale service dedicated to data warehousing. The service is called Redshift, and it is the most popular cloud data warehouse, counting thousands of companies, small and large, among its customers. Competition in this space is heating up, though, with Google BigQuery, Snowflake, and Oracle Autonomous Data Warehouse now vying for a share of the growing cloud data warehouse market.
Redshift has been around since early 2013 and has undergone many improvements over the last six years. Built on PostgreSQL, it provides a query engine familiar to most users and supports SQL-based querying. Although AWS fully manages the infrastructure that underpins Redshift, some functions still have to be handled by Redshift administrators; the tools Redshift provides make this management a breeze. Redshift is built on a scalable infrastructure and supports big data and massive workloads that span many nodes and multiple petabytes of data. It also provides a robust management console for loading data, allows connections from any SQL client, and supports a host of business intelligence tools. The service exposes REST APIs as well, allowing developers to manage a cluster in real time with simple API calls.
Redshift Spectrum, AWS Athena, and the omnipresent data storage solution Amazon S3 complement Redshift and together offer all the necessary technologies to build a data warehouse or an enterprise-scale data lake.
Pricing any cloud service requires a deep understanding of the software and some architecture know-how, and Redshift is no different. Sometimes people are misled by the ease of use of these services and get a surprise at the end of the month when the invoice shows up and the amount due is more than they originally budgeted for the service. Billing surprises are not unique to AWS; this is just how the cloud works. Just because spinning up services is simple doesn't mean you can ignore the specifics. Wouldn't it be nice if these services optimized themselves for performance and price? Well, that is a topic for another day. The good news is that Redshift can be sized to fit both your technical requirements and your budget.
Let’s look at the pricing components for AWS Redshift.
Some basic parameters guide the pricing of most cloud services, and that includes Redshift.
Consider the following parameters while tuning a Redshift cluster.
- Contract type
- On-Demand Pricing: Suitable for workloads that aren't predictable or don't have to be up at all times. You pay an hourly rate for the duration you use the service. In this model, you pay a premium in exchange for flexibility.
- Reserved Instance Pricing: This pricing model is well suited to predictable, consistent workloads. By reserving instances, in other words committing resource usage to AWS, you make their operations and revenue more predictable, and that automatically translates into lower prices for you.
- Redshift Spectrum: This service allows Redshift users to run queries directly on data in S3 buckets. In a traditional Redshift setup, the data gets loaded into the drives attached to the cluster. Spectrum opens up new opportunities for Redshift users to query massive datasets stored in a data lake without having to increase the cluster size to accommodate large storage volumes.
- Compute shape or node type
- It is essential to note that compute cost in AWS doesn't scale linearly; that is, the cost of a 2 vCPU/14 GB shape may not always be twice that of a 1 vCPU/7 GB shape. It is therefore crucial to model the pricing, account for potential growth, and budget for it to avoid surprises.
- Pricing varies based on whether the data warehouse runs at all times or follows an approach that allows stops and restarts.
- You will be charged for some resources even if you stop the Redshift cluster, because AWS cannot release those resources to other customers.
- Reserved instances offer the best possible pricing for the Redshift service.
- Depending on the performance requirements, you may be able to choose different types of nodes.
- DS stands for Dense Storage.
- These shapes give you more storage capacity per node, but the storage is on HDDs. Review the Redshift documentation for the IOPS you can expect from each of these shapes.
- DC stands for Dense Compute.
- These shapes come with attached SSD storage and consequently offer higher IOPS than DS shapes.
- At the time of writing this article, AWS does not offer memory-optimized shapes for Redshift.
- The AWS documentation provides a detailed overview of the underlying hardware for these nodes.
- The amount of storage used by the Redshift cluster
- Don’t forget to account for the storage required to host the backups of your database.
- If you have a multi-region service, then you may notice network charges. Unless you are moving large volumes of data between regions or back to your data center, this cost is usually negligible.
- Prices for cloud resources vary by region. Residency requirements, latency requirements, and availability of features or services you require are some factors that may influence your choice of region. Selecting the data center or region closest to your data sources offers the best possible performance.
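To make the parameters above concrete, here is a minimal cost-model sketch in Python. The node types are real Redshift shapes, but the hourly rates and the reserved-instance discount are illustrative placeholder assumptions, not current AWS prices; always pull real numbers from the AWS pricing page for your region.

```python
# Rough monthly compute cost model for a Redshift cluster.
# NOTE: the rates and discount below are illustrative assumptions only;
# look up current per-node prices on the AWS pricing page.
HOURS_PER_MONTH = 730  # average hours in a month

# Assumed on-demand hourly rates per node, by node type (USD).
ON_DEMAND_RATES = {
    "dc2.large": 0.25,
    "ds2.xlarge": 0.85,
}

# Assumed effective discount for a 1-year reserved commitment.
RESERVED_DISCOUNT = 0.35  # i.e., ~35% cheaper than on-demand


def monthly_cost(node_type, node_count, reserved=False):
    """Estimate the monthly compute cost of a cluster."""
    rate = ON_DEMAND_RATES[node_type]
    if reserved:
        rate *= 1 - RESERVED_DISCOUNT
    return rate * node_count * HOURS_PER_MONTH


on_demand = monthly_cost("dc2.large", 4)
reserved = monthly_cost("dc2.large", 4, reserved=True)
print(f"4x dc2.large on-demand: ${on_demand:,.2f}/month")
print(f"4x dc2.large reserved:  ${reserved:,.2f}/month")
```

Extending a model like this with storage, backup, and network line items makes it easy to compare node types and commitment levels before you spin anything up.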
Pricing for cloud services changes often, so it is prudent to use the pricing calculator tool provided by AWS to assess the costs of your Redshift cluster. Make sure you consider the factors mentioned above while sizing your cluster.
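Redshift Spectrum, discussed earlier, is billed separately from the cluster: you pay by the amount of S3 data each query scans. The sketch below estimates that cost; the per-terabyte rate is an assumption for illustration, so confirm the current rate on the AWS pricing page.

```python
# Estimate monthly Redshift Spectrum cost from data scanned.
# NOTE: the per-terabyte rate is an assumed figure for illustration;
# confirm the current rate on the AWS pricing page.
SPECTRUM_RATE_PER_TB = 5.00  # assumed USD per TB scanned


def spectrum_cost(tb_scanned_per_query, queries_per_month):
    """Monthly Spectrum cost for a given scan volume."""
    return SPECTRUM_RATE_PER_TB * tb_scanned_per_query * queries_per_month


# e.g., 200 monthly queries that each scan 0.5 TB of data
print(f"${spectrum_cost(0.5, 200):,.2f}/month")
```

Because the charge is per byte scanned, storing data in a compressed columnar format such as Parquet and partitioning it sensibly can cut the scanned volume, and therefore the bill, substantially.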
Having supported many customers with their data replication to Redshift, our team is adept at helping you size your Redshift environment. Feel free to send us a note if you have any questions about pricing or architecting your data warehouse.
Because public clouds refresh their hardware often, new node types may be introduced regularly. It is also common practice to reduce the pricing of older hardware, although AWS does not contractually guarantee price reductions.
Scaling Redshift Nodes:
With every new release of Redshift, efforts are being made to reduce the downtime associated with resizing a cluster. Regardless of how you resize the cluster, whether by adding more nodes or changing existing nodes to a different node type, it is reasonable to expect some disruption for existing users and for users issuing new connection requests to Redshift. The need for downtime may change in the future, but it is a factor to consider while sizing your Redshift environment today.
Now that you are ready to spin up your Redshift cluster, your next task is to figure out how to move data into it. You can leverage the native tools provided by AWS to fulfill some of your data replication requirements.
Our cloud-based data pipeline, Daton, provides a simple yet cost-effective way to replicate your data to Redshift. Daton has a variety of pre-built adapters for databases, SaaS applications, files, webhooks, marketing applications, and more. Replicate your data from any source to Redshift in three simple steps, without writing any code, in a matter of minutes.
Sign up for a free trial today to kick-start your data warehousing initiative.
Not ready yet? Talk to our data architects, who are happy to answer your questions. Send us a note; we love to hear from you!