Understanding BigQuery data types is essential if you want to maximize the on-demand offerings and capabilities of this data warehouse. Many of our customers would like to know more about the different data types provided by BigQuery and how to map the data from their source to BigQuery data types. Let’s dive in and simplify BigQuery data types.
BigQuery is a highly-scalable data warehouse managed by the Google Cloud Platform, similar to Redshift and Azure. Businesses use the BigQuery BI functionality to store, query, ingest and retrieve insights from their schema datasets.
Unlike SQL servers, BigQuery operates as a serverless REST API SaaS solution, so your data teams can analyze data using standard SQL.
BigQuery supports different data types, each with several functions and operations, as well as restrictions:
5. TIME AND DATE
7. GEOGRAPHY (GIS)
Let’s now explain each of these data types.
Numeric data types supported by BigQuery are:
Numeric BigQuery data types support the following functions:
Numeric types support basic arithmetic operations like addition, subtraction, multiplication, etc. You can always use mathematical functions on columns and rows that contain numeric data types.
If the denominator is zero, applying the division operator will return a divide-by-zero error.
You should use specific functions to execute the operation to avoid these errors from crashing your SQL. For example, the function IEEE_divide returns NaN (not-a-number) when a division by zero is attempted.
This allows you to run queries without returning any errors.
With Safe functions, you can return NULL instead of an error for undefined mathematical operations. You can apply the Safe function by prefixing your operation with the SAFE command.
In financing and account applications, you often need exact values for calculations. The numeric data type is ideal for this use. Numeric data types in BigQuery use 16 bytes for storage and can accurately represent decimal figures, which makes them perfect for financial calculations.
Boolean data types are variables defined by the keyword TRUE or FALSE (both case insensitive).
Bool data types enable the following operations.
Logical operators are commands that help filter results based on a particular set of conditions. Filtering is achieved using a WHERE clause, while the logical operators set the conditions. Logical operators on BigQuery include AND, OR, and NOT. The result you get when using a WHERE clause and a logical operator is based on if the WHERE condition is TRUE or FALSE.
Boolean data types are also helpful in conditional clauses. These clauses simplify queries by giving an option and alternative if the clause is not satisfied. IF clauses work in a way that if the first parameter in the condition is TRUE, it returns the second parameter, and if FALSE, it returns the third parameter.
Strings are Unicode characters that have variable lengths. They must have UTF-8 encodings. Mastering string manipulation is an essential condition for working with data.
The good news is that BigQuery provides an array of built-in functions that help improve your string manipulation task.
You must quote strings with single, double, or triple quotation marks. The string functions that BigQuery supports are:
You can CAST a string as an INT64 or FLOAT64 to parse it and cast an integer or float back a string. You can convert from one data type to another using the CAST command.
Even if the CAST operation fails and the data conversion is successful, SAFE_CAST will return NULL, and the query will run successfully.
This type of conversion using SAFE_CAST is called coercion.
You can use the CONCAT command to manipulate strings, for example, to concatenate (link together in a chain or series) separate parts of different strings into one.
Regular expressions allow you to find specific characters and patterns and perform powerful searches on your string data type. Using this BigQuery function, you can find a regex pattern character, remove the regex pattern, and replace the regex pattern.
If you have little experience with the manipulation of strings, you can use Whatagraph.
Whatagraph allows customers to move data from the most popular marketing sources to Google BigQuery without coding.
Whatagraph can fully automate and schedule your transfers. Connect your destination source, and your data starts loading in a few clicks.
Although most functions that apply to strings also apply to bytes, bytes are different from strings and should not be used interchangeably.
Bytes are also represented by single, double, or triple quotation marks, but with bytes, you need to make sure it begins with the prefix letter B(b).
BigQuery also supports different types of time and date functionality, including the following.
It’s important to notice that these data types are interchangeable but with different levels of precision. For example, you can extract datetime from a timestamp and cast a timestamp to a date time.
Also, keep in mind that While DATETIME is supported on BigQuery Connector, TIMESTAMP, which contains the timezone, is not supported by the connector.
BigQuery’s timezone format is used when parsing or formatting timestamps for display. Timezones are represented by these formats:
Time zone name from TZ database: When using this format, you need space between the timezone name and the rest of the timestamp.
The geography data type represents points, lines, and polygons on the surface of the earth. In BigQuery, all geographical positions are represented in the WGS84 reference ellipsoid, which is the same ellipsoid used in the GPS.
This way, the longitude and latitude can be used directly in BigQuery to find the location of a place. Geography functions in BigQuery are prefixed with ST_. signature and are classified based on their use cases:
Constructors: Create new geography values from coordinates or existing geography data.
Parsers: Build geography data types from external formats like WKT and Geo JSON.
Formatters: The opposite of parsers, they extract geography data types to an external format such as WKT and Geo JSON.
Transformations: Generate new geography data types from other geography data types.
Predicates: Filter clauses in geography data types. These return TRUE or FALSE for relationships between geographies or geography data types properties.
Accessors: Provide unlimited access to the property of geography.
Measures: These functions compute the measurements of one or more geography data types.
Aggregate functions: Execute geography-specific aggregate functions.
ARRAY data types are ordered lists of zero or more elements of any non-array type. Arrays are represented by the angle brackets (</>). The following table shows declarations that arrays on BigQuery support or don’t support.
|ARRAY<Float 64>, ARRAY<Int 64>||ARRAYs support mostly all data types.|
|ARRAY<NULL>||BigQuery encounters an error if a query is supposed to return an ARRAY that contains NULL, but you’re free to use such ARRAYs inside your query.|
|ARRAY<ARRAY<INT 64>>||ARRAYs of ARRAYs are not supported on BigQuery.|
|ARRAY<Struct<ARRAY<INT 64>>>||You can declare an ARRAY of an ARRAY by juxtaposing a Struct in between the ARRAYs.|
STRUCT types are containers of ordered fields with a required data type and an optional field name. You can also declare structs using the (</>) brackets. The following table below illustrates declarations that are possible with structs.
|STRUCT<INT 64>||STRUCT supports mostly all data types.|
|STRUCT <x STRUCT <’y’ String, z INT 64>>||A STRUCT embedded inside another STRUCT named x. The STRUCT x has two fields: y, which is a string, and z, which is an integer.|
|STRUCT <ARRAY<INT 64>>||A STRUCT contains an ARRAY, and the ARRAY holds an integer.|
|STRUCT<NULL>||STRUCT supports NULL values of fields that have NULL values.|
This concludes an overview of Google BigQuery data types, their variants, and usage. Whatagraph offers a much faster way to move data from your marketing analytics tools to BigQuery, which you can visualize using the Whatagraph integration.
Book a demo call with our product experts to learn more about how Whatagraph can automate and speed up your reporting.
Published on Jan 17, 2023
WRITTEN BYNikola Gemes
Enter your email and get curated content straight to your inbox!
Only the best content & no spam.