BigQuery vs Snowflake – 3 Important Influences on Price

Posted By: administrator
Posted On: 07 Jul 2020

Before you decide whether Snowflake or BigQuery is the data warehouse best suited for your workloads, it is critical to understand the nuances of BigQuery vs Snowflake pricing. In this article, we compare only the on-demand pricing of the two data warehouses, using simple examples to explain the key aspects of pricing for these two popular cloud data warehouses.

For any data warehouse, two basic elements primarily contribute to price: Storage and Compute.  

Any cloud data warehouse vendor, whether Google with its BigQuery data warehouse or Snowflake with its own, uses these two metrics as the basis for pricing. Any other driver of cost beyond these two variables is usually related to the internals of the database or to making certain features available in certain plans. A thorough understanding of how storage and compute are priced helps analysts control the costs of their data warehouse and avoid unpleasant surprises caused by overages and a call from their CTO or CFO.

Let’s look at how these two variables, Storage and Compute, drive the price of Snowflake vs BigQuery. We will start with compute.

BigQuery vs Snowflake – How do they handle compute pricing?

Compute is the CPU, memory, and associated hardware resources that a database uses to process your data. For any data warehouse, there are two kinds of data processing that occur: writes and reads.

For an OLTP system, workloads are mostly write-heavy, i.e., database writes happen more often than reads as transactions get recorded in the database.

For data warehouses, which are OLAP systems, the workload is usually write once, read many times. It makes sense if you think about it – we replicate data from a source system to a data warehouse and perform data modelling on it, and then many users may run queries or reports on that data over a long period.

When it comes to pricing compute, BigQuery and Snowflake take different approaches to the write activity in the data warehouse, and that has interesting pricing implications for users.

Feature                 | BigQuery  | Snowflake
Batch Data Ingest       | No Charge | Charged
Incremental Data Ingest | No Charge | Charged
Streaming Data Ingest   | Charged   | Charged
Table Updates           | No Charge | Charged

In layman’s terms, BigQuery does not charge customers for bringing data into the data warehouse if you are using batch data ingest. Most businesses attempting to automate business reporting can get by without streaming data ingest. Snowflake, on the other hand, charges users for the compute used for the duration of the ingest. Let’s look at a few common scenarios to understand this better:

Example 1: Batch load into a staging area running every hour for 5 minutes

Scenario: You have a data replication job set up to replicate data from your MySQL data source to your data warehouse.

What is the cost incurred for each replication job?

  • BigQuery doesn’t charge you anything 
  • Snowflake will charge you for the compute used during the 5 mins that it took to process the data. 

What is the cost incurred by the end of the day?

  • BigQuery compute cost for writing data is zero. 
  • Snowflake would have charged you for 24 runs * 5 mins = 120 mins of compute usage. 
  • Depending on the size of your Snowflake warehouse, you can take the per-second unit price and multiply it by 120 mins * 60 secs to get your write cost for the day, as sketched below. 
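
To make that concrete, here is a minimal back-of-the-envelope calculation you can run as a plain SQL statement. The hourly rate used below (roughly $2.00 per hour for an X-Small warehouse) is an assumption for illustration only; substitute the credit price for your own edition and warehouse size.

-- Rough estimate of the daily Snowflake write cost for Example 1.
-- Assumption: an X-Small warehouse billed at about $2.00 per hour; adjust to your actual rate.
SELECT
  24 * 5 * 60                  AS write_seconds_per_day,          -- 24 loads x 5 minutes x 60 seconds
  24 * 5 * 60 * 2.00 / 3600    AS estimated_daily_write_cost_usd; -- about $4 per day at the assumed rate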

Example 2: Data loaded into aggregated tables based on queries on staging table data  

Scenario: You are inserting data into an aggregated table based on data in a staging table every hour. The insertion of data into the aggregated table takes 5 minutes.

A sample script could be something like: 

INSERT INTO employee_salary_agg
SELECT SUM(salary), employee_name
FROM employee_staging
GROUP BY employee_name;

Whenever the above statement is executed, two distinct operations are performed:

  • Reading of data in the employee_staging table 
  • Writing of data into the employee_salary_agg table. 

We will get to the pricing of the reading part of the operation in the next section. But as far as the writing goes, 

  • BigQuery doesn’t charge you anything 
  • Snowflake will charge you for the 5 minutes that it took to insert data into the employee_salary_agg table.  

At the end of the day, if this operation is performed once every hour:

  • BigQuery compute cost for writing data is zero. 
  • Snowflake would have charged you for 24 runs * 5 minutes = 120 mins of compute usage. 
  • As in Example 1, take the per-second unit price for your Snowflake warehouse and multiply it by 120 mins * 60 secs to get your write cost for the day. 

Verdict:  BigQuery 1     Snowflake 0

The other driver of compute usage in a database is read workloads. The unit of measure for pricing read workloads differs between BigQuery and Snowflake. The table below highlights how each handles read workloads.

Read            | BigQuery                                        | Snowflake
Query Execution | Size of the data processed to fulfil the query  | Amount of time taken by the database to process the query and return results

Let’s take a few examples and look at how pricing is calculated.

Example 1: Query runs for 5 minutes and processes 1 GB of data before results are returned to the user.

How does BigQuery charge for query execution?

BigQuery will charge you for the 1 GB of data processed according to its on-demand pricing sheet. Google BigQuery offers 1 TB of query data processing free per month.

Assuming you fall in that bucket, you just got yourself a fully managed cloud data warehouse for pennies (remember, you still pay for storage at roughly $0.02 per GB!).

If your 1 GB of query data processed falls outside the Google BigQuery free tier for the month, you will be charged 1 GB * $0.005 (roughly $5 per TB) to execute your request.
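
A minimal sketch of that arithmetic, runnable as a plain SQL statement, is shown below. The $5 per TB rate and the 1 TB monthly free tier simply restate the figures above; adjust them if your pricing differs.

-- Estimate the BigQuery on-demand query cost after the monthly free tier.
-- Assumption: $5.00 per TB processed with the first 1 TB free each month, as described above.
SELECT
  GREATEST(monthly_tb_processed - 1, 0) * 5.00 AS estimated_query_cost_usd  -- $0 while inside the free tier
FROM (SELECT 0.001 AS monthly_tb_processed) AS usage_estimate;              -- 1 GB is roughly 0.001 TB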

How does Snowflake charge for query execution?

In Snowflake, compute usage is priced the same way for read and write workloads. In the example above, since the query ran for 5 minutes, you will be charged for 5 minutes of compute usage.

BigQuery vs Snowflake – How do they handle storage pricing?

Storage costs have plummeted over the years. When it comes to storage used for data warehouses, how the data is stored and organized can have a massive impact on your monthly bills. But in terms of the raw storage needed merely to hold your data, the cost differences are generally not significant, at least not for data warehouses of a few terabytes, which describes most of them.

Fortunately or unfortunately, Snowflake doesn’t give you any levers when it comes to organizing how your data resides on disk. You only get to choose which region/zone those storage disks reside in; Snowflake handles storage management and optimization. This approach has obvious benefits: customers don’t have to worry about managing storage and optimizing it for query performance, because Snowflake does that job for you, and you can focus on building your models and doing the analysis. How well Snowflake manages the storage will have an impact on your price, but since you can’t control it, let’s assume it does a good job for the time being. How Snowflake’s storage management compares to other databases like Oracle is a matter of perception and workload type, but that is beyond the scope of this article and perhaps a topic for another day.

BigQuery, on the other hand, gives you some control in terms of partitioning and clustering the data in the data warehouse. If you understand your data and know the query patterns on it, then with a little effort you can get a high-functioning, fully managed, and really affordable data warehouse. A simple example of what that looks like is sketched below.
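
As a rough illustration (the dataset, table, and column names here are hypothetical), partitioning and clustering in BigQuery amounts to a couple of extra clauses on the table definition, which limits how many bytes each query has to scan and therefore what it costs:

-- Illustrative BigQuery DDL: partition by day and cluster by a commonly filtered column.
-- Dataset, table, and column names are placeholders; adapt them to your own schema.
CREATE TABLE mydataset.orders_partitioned
PARTITION BY DATE(order_timestamp)
CLUSTER BY customer_id
AS
SELECT * FROM mydataset.orders_staging;

-- A query that filters on the partitioning column scans only the matching partitions,
-- which directly reduces the bytes processed and, under on-demand pricing, the cost.
SELECT customer_id, SUM(order_total) AS daily_total
FROM mydataset.orders_partitioned
WHERE DATE(order_timestamp) = DATE '2020-07-01'
GROUP BY customer_id;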

As far as the cost of Snowflake and BigQuery storage is concerned, 

Snowflake charges a monthly fee for the data stored in Snowflake. The average amount of storage used per month, after compression, is used to determine the storage cost. Snowflake charges $23/TB/month.

BigQuery has two storage tiers. Active storage is any table or partition that has been modified in the preceding 90 days. Any data that falls outside the 90-day range is considered long-term storage.

Assuming you have a 1 TB database and all of that data is active, you are looking at roughly $20/month for storage.

Storage tier      | Price         | Free allowance
Active storage    | $0.020 per GB | The first 10 GB is free each month
Long-term storage | $0.010 per GB | The first 10 GB is free each month
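
For a rough side-by-side using the figures above (a sketch only, since both vendors’ list prices change over time):

-- Approximate monthly storage cost for a 1 TB warehouse, using the rates quoted in this article.
SELECT
  1024 * 0.020 AS bigquery_active_storage_usd,  -- 1 TB of active storage at $0.020/GB, about $20
  1    * 23.00 AS snowflake_storage_usd;        -- 1 TB at $23/TB/month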

Verdict: BigQuery 1 Snowflake 1

BigQuery vs Snowflake – What is best for my workload?

Scenario 1: 

  • Are you a customer with a small userbase asking for a consolidated data store, who may not use it frequently, but when they do, they want to see the most recent data? 
  • Is your total data warehouse size less than a few terabytes? 

Examples: Small eCommerce customers 

Translates to: 

  • You have to keep the data warehouse up to date (write frequently) 
  • Relatively low read volume 

Recommendation: BigQuery ought to be given serious thought. 

Scenario 2: 

  • Are you a customer with a small userbase asking for a consolidated data store, who use it frequently and want to see the most recent data? 
  • Is your total data warehouse size less than a few terabytes? 

Examples: Medium-sized eCommerce customers 

Translates to: 

  • You have to keep the data warehouse up to date (write frequently) 
  • Relatively high read volume 
  • Low storage requirement 

Recommendation: BigQuery ought to be given serious thought.

However, you must have someone get the partitioning and clustering right so that there are no surprising spikes in costs as the data warehouse grows. Since you have control over data storage, getting an expert to review storage allocation and usage can help contain the costs. It is not hard, but it is typically something an analyst may not always know how to do.

Scenario 3: 

  • Are you a customer with a large userbase asking for a consolidated data store, who will use the warehouse frequently, and when they do, they want to see the most recent data? 
  • Is your total data warehouse size more than a few terabytes? 
  • Is your data hard to partition, or do you not have the people available to help? 

Examples: Large eCommerce customers

Recommendation: Snowflake may be a good option for you

The money that you save in ingest costs in BigQuery might get consumed as you pay more for reading data from the underlying tables. However, a thorough comparison of both is always the best option.

Round-Up: BigQuery vs Snowflake Pricing

This article by no means covers the entire gamut of pricing variables for these two data warehouses. We are just trying to cover the basics, because we have seen many businesses struggle with a sudden increase in costs, and in most cases, understanding the contents of this article and making the recommended changes to compute and storage utilization is all that is needed to bring the cost down significantly.

If you see an unexpected increase in the costs of your data warehousing environment, you are not alone. Because data warehousing has become easy, companies have adopted the technology at a rapid pace. However, the skills needed to optimize a data warehouse are still required, although for a shorter duration, to ensure your warehouse and data models are optimally designed and costs are under control. Get an expert to review your data warehouse setup so that you don’t regret the decision to operate one. Reach out to us if you have this issue, and we will be glad to help.
