Understanding BigQuery data types is essential if you want to maximize the on-demand offerings and capabilities of this data warehouse. Many of our customers would like to know more about the different data types provided by BigQuery and how to map the data from their source to BigQuery data types. Let’s dive in and simplify BigQuery data types.
Table of Contents
What is BigQuery?
BigQuery is a highly-scalable data warehouse managed by the Google Cloud Platform, similar to Redshift and Azure. Businesses use the BigQuery BI functionality to store, query, ingest and retrieve insights from their schema datasets.
Unlike SQL servers, BigQuery operates as a serverless REST API SaaS solution, so your data teams can analyze data using standard SQL.
That was one of the reasons why we choose BigQuery as a dedicated data warehouse for Whatagraph data transfers.
We wanted to make the data migration process so simple that small non-tech teams can perform it.
Over time, we noticed that our customers are mainly marketing agencies that handle hundreds of clients, each with their own data stacks.
In both cases, Whatagraph saves those people's time needed to load data from multiple sources to Google BigQuery.
What is more, we simplified the data transfer process to the point-and-click level so anyone can do it.
No code knowledge or data engineering is required.
What are the different BigQuery data types?
BigQuery supports different data types, each with several functions and operations, as well as restrictions:
5. TIME AND DATE
7. GEOGRAPHY (GIS)
Let’s now explain each of these data types.
1. BigQuery NUMERIC data type
Numeric data types supported by BigQuery are:
- Integer (INT64): This represents numbers within the range of 9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 and numbers that don’t have fractional components.
- Float (Float64): Numbers with approximate numeric values and fractions.
- Numeric: A data type called NUMERIC is similar to DECIMAL which can store values with 38 decimal digits of precision and nine decimal digits of scale. Used for exact calculations.
Numeric BigQuery data types support the following functions:
Numeric types support basic arithmetic operations like addition, subtraction, multiplication, etc. You can always use mathematical functions on columns and rows that contain numeric data types.
Standard compliant floating point division
If the denominator is zero, applying the division operator will return a divide-by-zero error.
You should use specific functions to execute the operation to avoid these errors from crashing your SQL. For example, the function IEEE_divide returns NaN (not-a-number) when a division by zero is attempted.
This allows you to run queries without returning any errors.
With Safe functions, you can return NULL instead of an error for undefined mathematical operations. You can apply the Safe function by prefixing your operation with the SAFE command.
Precise decimal calculations with numeric
In financing and account applications, you often need exact values for calculations. The numeric data type is ideal for this use. Numeric data types in BigQuery use 16 bytes for storage and can accurately represent decimal figures, which makes them perfect for financial calculations.
2. BigQuery BOOLEAN data type
Boolean data types are variables defined by the keyword TRUE or FALSE (both case insensitive).
Bool data types enable the following operations.
Logical operators are commands that help filter results based on a particular set of conditions. Filtering is achieved using a WHERE clause, while the logical operators set the conditions. Logical operators on BigQuery include AND, OR, and NOT. The result of using a WHERE clause and a logical operator is based on whether the WHERE condition is TRUE or FALSE.
Boolean data types are also helpful in conditional clauses. These clauses simplify queries by giving an option and alternative if the clause is not satisfied. IF clauses work in a way that if the first parameter in the condition is TRUE, it returns the second parameter, and if FALSE, it returns the third parameter.
3. BigQuery STRINGS data type
Strings are Unicode characters that have variable lengths. They must have UTF-8 encodings. Mastering string manipulation is an essential condition for working with data.
The good news is that BigQuery provides an array of built-in functions that help improve your string manipulation task.
You must quote strings with single, double, or triple quotation marks. The string functions that BigQuery supports are:
Casting and coercion
You can CAST a string as an INT64 or FLOAT64 to parse it and cast an integer or float back a string. You can convert from one data type to another using the CAST command.
Even if the CAST operation fails and the data conversion is successful, SAFE_CAST will return NULL, and the query will run successfully.
This type of conversion using SAFE_CAST is called coercion.
You can use the CONCAT command to manipulate strings, for example, to concatenate (link together in a chain or series) separate parts of different strings into one.
Regular expressions allow you to find specific characters and patterns and perform powerful searches on your string data type. Using this BigQuery function, you can find a regex pattern character, remove the regex pattern, and replace the regex pattern.
If you have little experience manipulating strings, you can use Whatagraph.
Whatagraph allows customers to move data from the most popular marketing sources to Google BigQuery without coding.
Whatagraph can fully automate and schedule your transfers. Connect your destination source, and your data starts loading in a few clicks.
4. BigQuery BYTES data type
Although most functions that apply to strings also apply to bytes, bytes are different from strings and should not be used interchangeably.
Bytes are also represented by single, double, or triple quotation marks, but with bytes, you need to make sure it begins with the prefix letter B(b).
5. BigQuery TIME AND DATE data type
BigQuery also supports different types of time and date functionality, including the following.
- DATE: A logical calendar date independent of a time zone. A certain DATE represents a different period when recorded in different time zones. It has the format of YYYY : [ M ] M : [ D ] D.
- TIME: Time independent of a specific date.
- DATETIME: A point in time, like the one you see when you look at your watch. When an event occurs, any other precision is irrelevant.
- TIMESTAMP: An exact point in time with a microsecond precision independent of location. Datetime is part of a timestamp with a lower degree of precision.
It’s important to notice that these data types are interchangeable but with different levels of precision. For example, you can extract datetime from a timestamp and cast a timestamp to a date time.
Also, keep in mind that While DATETIME is supported on BigQuery Connector, TIMESTAMP, which contains the timezone, is not supported by the connector.
6. BigQuery TIMEZONE data type
BigQuery’s timezone format is used when parsing or formatting timestamps for display. Timezones are represented by these formats:
- Offset from Coordinated Universal time (UTC): In this format, no spaces are allowed between the timezone and the rest of the timestamp.
Time zone name from TZ database: When using this format, you need space between the timezone name and the rest of the timestamp.
7. BigQuery GEOGRAPHY (GIS) data type
The geography data type represents points, lines, and polygons on the surface of the earth. In BigQuery, all geographical positions are represented in the WGS84 reference ellipsoid, which is the same ellipsoid used in the GPS.
This way, the longitude and latitude can be used directly in BigQuery to find the location of a place. Geography functions in BigQuery are prefixed with ST_. signature and are classified based on their use cases:
Constructors: Create new geography values from coordinates or existing geography data.
Parsers: Build geography data types from external formats like WKT and Geo JSON.
Formatters: The opposite of parsers, they extract geography data types to an external format such as WKT and Geo JSON.
Transformations: Generate new geography data types from other geography data types.
Predicates: Filter clauses in geography data types. These return TRUE or FALSE for relationships between geographies or geography data types properties.
Accessors: Provide unlimited access to the property of geography.
Measures: These functions compute the measurements of one or more geography data types.
Aggregate functions: Execute geography-specific aggregate functions.
8. BigQuery ARRAY data type
ARRAY data types are ordered lists of zero or more elements of any non-array type. Arrays are represented by the angle brackets (</>). The following table shows declarations that arrays on BigQuery support or don’t support.
|ARRAY<Float 64>, ARRAY<Int 64>
|ARRAYs support mostly all data types.
|BigQuery encounters an error if a query is supposed to return an ARRAY that contains NULL, but you’re free to use such ARRAYs inside your query.
|ARRAYs of ARRAYs are not supported on BigQuery.
|You can declare an ARRAY of an ARRAY by juxtaposing a Struct in between the ARRAYs.
9. BigQuery STRUCT data type
STRUCT types are containers of ordered fields with a required data type and an optional field name. You can also declare structs using the (</>) brackets. The following table below illustrates declarations that are possible with structs.
|STRUCT supports mostly all data types.
|STRUCT <x STRUCT <’y’ String, z INT 64>>
|A STRUCT embedded inside another STRUCT named x. The STRUCT x has two fields: y, which is a string, and z, which is an integer.
|STRUCT <ARRAY<INT 64>>
|A STRUCT contains an ARRAY, and the ARRAY holds an integer.
|STRUCT supports NULL values of fields that have NULL values.
This concludes an overview of Google BigQuery data types, their variants, and usage. Whatagraph offers a much faster way to move data from your marketing analytics tools to BigQuery, which you can visualize using the Whatagraph integration.
Book a demo call with our product experts to learn more about how Whatagraph can automate and speed up your reporting.
Published on Jan 17 2023
WRITTEN BYNikola Gemes
Nikola is a content marketer at Whatagraph with extensive writing experience in SaaS and tech niches. With a background in content management apps and composable architectures, it's his job to educate readers about the latest developments in the world of marketing data, data warehousing, headless architectures, and federated content platforms.
Get marketing insights direct to your inbox