Data analytics

9 Most Important BigQuery Data Types

Understanding BigQuery data types is essential if you want to maximize the on-demand offerings and capabilities of this data warehouse. Many of our customers would like to know more about the different data types provided by BigQuery and how to map the data from their source to BigQuery data types. Let’s dive in and simplify BigQuery data types.

Whatagraph marketing reporting tool
Nikola Gemes

Jan 17 2023 5 min read

Whatagraph marketing reporting tool

Table of Contents

    What is BigQuery?

    BigQuery is a highly-scalable data warehouse managed by the Google Cloud Platform, similar to Redshift and Azure. Businesses use the BigQuery BI functionality to store, query, ingest and retrieve insights from their schema datasets.  

    Unlike SQL servers, BigQuery operates as a serverless REST API SaaS solution, so your data teams can analyze data using standard SQL.  

    That was one of the reasons why we choose BigQuery as a dedicated data warehouse for Whatagraph data transfers.

    We wanted to make the data migration process so simple that small non-tech teams can perform it.

    Over time, we noticed that our customers are mainly marketing agencies that handle hundreds of clients, each with their own data stacks.  

    In both cases, Whatagraph saves those people's time needed to load data from multiple sources to Google BigQuery.

    What is more, we simplified the data transfer process to the point-and-click level so anyone can do it. 

    No code knowledge or data engineering is required.   

    What are the different BigQuery data types?

    BigQuery supports different data types, each with several functions and operations, as well as restrictions:

    1. NUMERIC

    2. BOOLEAN

    3. STRINGS

    4. BYTES

    5. TIME AND DATE

    6. TIMEZONE

    7. GEOGRAPHY (GIS)

    8. ARRAY

    9. STRUCT

    Let’s now explain each of these data types.

    1. BigQuery NUMERIC data type

    Numeric data types supported by BigQuery are:

    • Integer (INT64): This represents numbers within the range of 9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 and numbers that don’t have fractional components.
    • Float (Float64): Numbers with approximate numeric values and fractions.
    • Numeric: A data type called NUMERIC is similar to DECIMAL which can store values with 38 decimal digits of precision and nine decimal digits of scale. Used for exact calculations.

    Numeric BigQuery data types support the following functions:

    Arithmetical/mathematical

    Numeric types support basic arithmetic operations like addition, subtraction, multiplication, etc. You can always use mathematical functions on columns and rows that contain numeric data types.

    Standard compliant floating point division

    If the denominator is zero, applying the division operator will return a divide-by-zero error.

    You should use specific functions to execute the operation to avoid these errors from crashing your SQL. For example, the function IEEE_divide returns NaN (not-a-number) when a division by zero is attempted.

    This allows you to run queries without returning any errors.

    Easily build a marketing data warehouse with Whatagraph

    Safe functions

    With Safe functions, you can return NULL instead of an error for undefined mathematical operations. You can apply the Safe function by prefixing your operation with the SAFE command.

    Precise decimal calculations with numeric

    In financing and account applications, you often need exact values for calculations. The numeric data type is ideal for this use. Numeric data types in BigQuery use 16 bytes for storage and can accurately represent decimal figures, which makes them perfect for financial calculations.

    2. BigQuery BOOLEAN data type

    Boolean data types are variables defined by the keyword TRUE or FALSE (both case insensitive).

    boolean-data

    Source

    Bool data types enable the following operations.

    Logical operations

    Logical operators are commands that help filter results based on a particular set of conditions. Filtering is achieved using a WHERE clause, while the logical operators set the conditions. Logical operators on BigQuery include AND, OR, and NOT. The result of using a WHERE clause and a logical operator is based on whether the WHERE condition is TRUE or FALSE.

    Conditional operations

    Boolean data types are also helpful in conditional clauses. These clauses simplify queries by giving an option and alternative if the clause is not satisfied. IF clauses work in a way that if the first parameter in the condition is TRUE, it returns the second parameter, and if FALSE, it returns the third parameter.

    3. BigQuery STRINGS data type

    Strings are Unicode characters that have variable lengths. They must have UTF-8 encodings. Mastering string manipulation is an essential condition for working with data. 

    The good news is that BigQuery provides an array of built-in functions that help improve your string manipulation task. 

    You must quote strings with single, double, or triple quotation marks. The string functions that BigQuery supports are:

    Casting and coercion

    You can CAST a string as an INT64 or FLOAT64 to parse it and cast an integer or float back a string. You can convert from one data type to another using the CAST command.

    Even if the CAST operation fails and the data conversion is successful, SAFE_CAST will return NULL, and the query will run successfully.

    This type of conversion using SAFE_CAST is called coercion.

    Concatenation

    You can use the CONCAT command to manipulate strings, for example, to concatenate (link together in a chain or series) separate parts of different strings into one.

    Regular expressions

    Regular expressions allow you to find specific characters and patterns and perform powerful searches on your string data type. Using this BigQuery function, you can find a regex pattern character, remove the regex pattern, and replace the regex pattern.

    If you have little experience manipulating strings, you can use Whatagraph. 

    Whatagraph allows customers to move data from the most popular marketing sources to Google BigQuery without coding. 

    Whatagraph can fully automate and schedule your transfers. Connect your destination source, and your data starts loading in a few clicks. 

    4. BigQuery BYTES data type

    Although most functions that apply to strings also apply to bytes, bytes are different from strings and should not be used interchangeably.

    Bytes are also represented by single, double, or triple quotation marks, but with bytes, you need to make sure it begins with the prefix letter B(b).

    Store your data sets securely in a BigQuery warehouse with Whatagraph

    5. BigQuery TIME AND DATE data type

    BigQuery also supports different types of time and date functionality, including the following.

    • DATE: A logical calendar date independent of a time zone. A certain DATE represents a different period when recorded in different time zones. It has the format of YYYY : [ M ] M : [ D ] D.  
    • TIME: Time independent of a specific date.
    • DATETIME: A point in time, like the one you see when you look at your watch. When an event occurs, any other precision is irrelevant.
    • TIMESTAMP: An exact point in time with a microsecond precision independent of location. Datetime is part of a timestamp with a lower degree of precision.

    It’s important to notice that these data types are interchangeable but with different levels of precision. For example, you can extract datetime from a timestamp and cast a timestamp to a date time.

    Also, keep in mind that While DATETIME is supported on BigQuery Connector, TIMESTAMP, which contains the timezone, is not supported by the connector.

    6. BigQuery TIMEZONE data type

    BigQuery’s timezone format is used when parsing or formatting timestamps for display. Timezones are represented by these formats:

    • Offset from Coordinated Universal time (UTC): In this format, no spaces are allowed between the timezone and the rest of the timestamp.

    Time zone name from TZ database: When using this format, you need space between the timezone name and the rest of the timestamp.

    timezone-data

    Source

    7. BigQuery GEOGRAPHY (GIS) data type

    The geography data type represents points, lines, and polygons on the surface of the earth. In BigQuery, all geographical positions are represented in the WGS84 reference ellipsoid, which is the same ellipsoid used in the GPS. 

    This way, the longitude and latitude can be used directly in BigQuery to find the location of a place. Geography functions in BigQuery are prefixed with ST_. signature and are classified based on their use cases:

    Constructors: Create new geography values from coordinates or existing geography data.

    Parsers: Build geography data types from external formats like WKT and Geo JSON.

    Formatters: The opposite of parsers, they extract geography data types to an external format such as WKT and Geo JSON.

    Transformations: Generate new geography data types from other geography data types.

    Predicates: Filter clauses in geography data types. These return TRUE or FALSE for relationships between geographies or geography data types properties.

    Accessors: Provide unlimited access to the property of geography.

    Measures: These functions compute the measurements of one or more geography data types.

    Aggregate functions: Execute geography-specific aggregate functions.

    8. BigQuery ARRAY data type

    ARRAY data types are ordered lists of zero or more elements of any non-array type. Arrays are represented by the angle brackets (</>). The following table shows declarations that arrays on BigQuery support or don’t support.

    Array DeclarationMeaning
    ARRAY<Float 64>, ARRAY<Int 64>ARRAYs support mostly all data types.
    ARRAY<NULL>BigQuery encounters an error if a query is supposed to return an ARRAY that contains NULL, but you’re free to use such ARRAYs inside your query.
    ARRAY<ARRAY<INT 64>>ARRAYs of ARRAYs are not supported on BigQuery.
    ARRAY<Struct<ARRAY<INT 64>>>You can declare an ARRAY of an ARRAY by juxtaposing a Struct in between the ARRAYs.

    9. BigQuery STRUCT data type

    STRUCT types are containers of ordered fields with a required data type and an optional field name.  You can also declare structs using the (</>) brackets. The following table below illustrates declarations that are possible with structs.

    Struct DeclarationMeaning
    STRUCT<INT 64>STRUCT supports mostly all data types.
    STRUCT <x STRUCT <’y’ String, z INT 64>>A STRUCT embedded inside another STRUCT named x. The STRUCT x has two fields: y, which is a string, and z, which is an integer. 
    STRUCT <ARRAY<INT 64>>A STRUCT contains an ARRAY, and the ARRAY holds an integer.
    STRUCT<NULL>STRUCT supports NULL values of fields that have NULL values.

     

    Wrapping up

    This concludes an overview of Google BigQuery data types, their variants, and usage. Whatagraph offers a much faster way to move data from your marketing analytics tools to BigQuery, which you can visualize using the Whatagraph integration.  

    Book a demo call with our product experts to learn more about how Whatagraph can automate and speed up your reporting. 

    Try Whatagraph and transfer data to a warehouse in 4 simple steps
     

    Published on Jan 17 2023

    Whatagraph marketing reporting tool

    WRITTEN BY

    Nikola Gemes

    Nikola is a content marketer at Whatagraph with extensive writing experience in SaaS and tech niches. With a background in content management apps and composable architectures, it's his job to educate readers about the latest developments in the world of marketing data, data warehousing, headless architectures, and federated content platforms.