BigQuery has become popular with marketers thanks to its ability to store and analyze data from sources like Google Analytics or Facebook Ads. The aim of this article is to provide a comprehensive BigQuery tutorial. Here you’ll learn what Google BigQuery is, how to set it up, how to use it, and most importantly, why you should use BigQuery in the first place.
Google BigQuery is a fully-managed data warehouse that runs on the Google Cloud and allows users to store and analyze huge datasets.
Let’s now expand this answer with more accurate definitions.
BigQuery is a database in the broadest sense. Databases are collections of related data, and BigQuery allows you to store petabytes of data.
The official title of BigQuery is a serverless data warehouse. Data warehouses are types of data storage where you can not only store structured data from multiple sources but also analyze it. In other words, BigQuery is an analytics database you can use to run queries and get insights from your business data using built-in features like machine learning, geospatial analysis, and business intelligence.
BigQuery features a columnar storage system that supports nested and repeated columns as semi-structured data. This is mostly a technical definition that will appeal to data-savvy professionals.
This is the most basic definition of BigQuery. BigQuery combines the features of both spreadsheet software like Google Sheets, and a database management system like MySQL.
BigQuery allows you to collaborate with partners, analysts, or other stakeholders in a familiar spreadsheet interface but also to retrieve information from structured datasets.
A big selling point of BigQuery is that it is a cloud service. You don’t have to install any software and worry about compatibility or updates. Google handles the infrastructure, and you only need to set up BigQuery.
If this is a first-time visit, you need to select your country and agree to the Terms of Service.
You can find it either through the search bar or by selecting it manually in the left menu.
This is how BigQuery looks when you open it for the first time.
Click the Create Project button to start. Name your project, choose an organization, if needed, and click Create.
You’ll get an official welcome window to BigQuery.
BigQuery Sandbox is a BigQuery initiative that allows you to explore the capabilities of this data warehouse at no cost, to determine whether BigQuery fits your needs.
Once you start a BigQuery project, you’ll see two messages at the top of the BigQuery console:
1. SANDBOX: Set up billing to upgrade to the full BigQuery experience.
2. Your free trial is waiting: activate now to get a $300 credit to explore Google Cloud products.
SANDBOX is a free account option that doesn’t require entering payment information. The sandbox gives you 10 GB of active storage and 1 TB of processed query data per month.
With your sandbox account, your tables expire in 60 days.
The second banner offers you to activate a free trial. The difference from the sandbox is that once you activate the trial, you’ll need to enter your payment details. Do it, and you get $300 of cloud credits for free.
For now, you can dismiss both options, because, in this tutorial, we’ll only be using the sandbox option.
First, we’re going to add some data into BigQuery to see how it works.
1. Click the project you want and then Create Dataset.
2. Assign a Dataset ID. You can enter letters and numbers. If needed, you can select the Data location as well as a table expiration (no longer than 60 days in the sandbox) and encryption. You’ve created a new dataset. Find it by clicking the Expand node button next to your project name.
3. Create a table. Click the Create Table button, and you have 5 options:
You can load your table data into BigQuery in these formats:
Click the Create table butting and follow these steps:
1. Choose source — Upload
2. Select file — Click Browse and choose the CSV file from your device
3. File format — Choose CSV, although the system will probably detect the file format
4. Table name — Name your table
5. Tick the Auto detect checkbox
6. Click Create table
This is what the basic flow looks like if you're uploading a CSV file from your device.
This is what your BigQuery table looks like:
BigQuery also allows you to upload Google Sheets tables manually. To do this, click the Create table button and:
1. Choose source — Drive
2. Select Drive URL — Insert the URL of your spreadsheet
3. File format — Choose Google Sheets
4. Sheet range — Specify the sheet and data range for import
5. Table name — Enter the table name
6. Check the Auto detect checkbox
7. Click Create table
Advanced options let you skip rows with column values that don’t match the schema, skip a specific number of rows from the top, and select an encryption key.
BigQuery will import the specified sheet from your spreadsheet.
You may have a dataset in Google Sheets you need to load data to BigQuery every day. You can do this manually, or automate the import using Whatagraph.
If you need to store data from multiple marketing tools in BigQuery, it makes much more sense to use Whatagraph.
Let’s say you want to transfer your Google Ads data to BigQuery for analysis:
1. Open Whatagraph and click the Data transfer tab. Click the Create new transfer button.
2. Enter the Project ID and click Test connection. If everything is OK, click Save and continue.
3. Select the integration from the dropdown. Click Google Ads.
4. Select an account and click Save and continue.
5. Name your transfer, select frequency, and click Create transfer.
All done! Your data from Google Ads is on its way to BigQuery.
When you click on new transfer details, you get a quick overview of the transfer you’ve just created.
BigQuery allows you to query tables in your database using the standard SQL dialect. You can also use non-standard or legacy SQL, but, BigQuery recommends standard SQL.
If you understand Google Sheets QUERY functions, you should understand how queries work:
=query(Deals!A:EU,"select E, N, T order by T Desc")
"select E, N, T order by T Desc" — this query retrieves three columns of the entire data set and orders the results in descending order.
Click Save => Save Query to save your queries for later use.
In the Save query window, name your query and choose its visibility:
Click on Save.
Right next to the Save button, you’ll find a Schedule button you can use to schedule queries.
So why would you run SQL queries on schedule in the first place?
There are at least two reasons:
When you click the Schedule button, you’ll get a message that you need to enable the BigQuery Data Transfer API first.
Click Enable API, and you’ll be able to use the Schedule button to create scheduled queries.
To proceed, click Create new scheduled query and define parameters like
You can also set up advanced options and notifications. When you finish setting up, click Schedule.
Now select your Google account to continue to BigQuery Data Transfer Service.
Users often need to export the results of their queries outside BigQuery. The common destinations are spreadsheet apps like Google Sheets and Excel, visualization and dashboarding apps like Google Data Studio and Tableau, and other business intelligence tools.
To export query results, click the Save Results button and select one of the options:
In this case, we’ll choose the BigQuery table option.
Name your table and choose the project and BigQuery dataset. Click Save and you’re all done.
Traditional relational databases store analytics data in rows. BigQuery, on the other hand, stores big data column by column. This columnar system gives BigQuery very fast output, which is essential for online data analysis.
BigQuery is able to analyze large volumes of data with SQL, which is a great format for analysis but is still a bit raw. To visualize your data, you can use Whatagraph.
You can use Whatagraph to visualize any data that is available in your BigQuery account, This not only includes data created by the Whatagraph transfer function but any data source you may have.
Just select your BigQuery account, connect your sources, and create report.
Now you can drag-and-drop dimensions or metrics from your dataset to organize your data display.
You can use the filter function at the top of your dashboard to filter your dataset to show only a specific area or subset. Drag fields into the filter as you’d drag them into your metrics or dimensions.
Whatagraph allows you to create calculated fields, plant data together with other data, or use custom functions with custom metrics that you create.
BigQuery has a serverless architecture that is fully managed by Google Cloud which means you don’t have to invest in additional storage and computing power.
With BigQuery, you don’t have to download your data for analysis. The computer engine and data source are separated so that you can analyze your data from any of your storage choices. You can store and analyze data within BigQuery or use BigQuery to analyze your data where it lives.
Explore large, real-world data from the Public Dataset Program. Access over 40 public datasets and integrate them into your applications as needed. Google pays for the storage of these data sets, and you, as the end user, only pay for the queries you perform.
BigQuery offers both on-demand and flat-rate subscription models. Loading, copying, and exporting data are free, and you only pay for data storage and querying, where it charges the quantity of processed data during each query.
By now, you should have the answers every beginner has when coming face to face with a new tool. If you have more questions, stay tuned. In the following articles, we’ll cover other BigQuery-related topics, such as BigQuery SQL syntax and using client libraries to start the BigQuery API in your preferred programming language.
In the meantime, book a demo call with our product manager and learn how Whatagraph can save you time using BigQuery.
Published on Jan 17, 2023
WRITTEN BYNikola Gemes
Enter your email and get curated content straight to your inbox!
Only the best content & no spam.