A Comprehensive Guide to Connecting GA4 with BigQuery

This is the fourth entry in our Comprehensive Guide to GA4 series. If you’d like to learn more about GA4, you can find our other articles here: Hookflash Analytics - Dedicated Digital Analytics Experts

If you’ve spent any time in the analytics world over the last few years, you’ve probably heard someone gush about BigQuery. It’s a tool on Google’s Cloud platform that allows you to build databases, query them using SQL, and export data to other tools (such as Looker Studio). In the right hands, it can be incredibly powerful, allowing you access to raw unsampled data and enabling you to join different datasets. It’s particularly primed to cooperate with GA4, as the two can be linked directly, opening up new and exciting opportunities for your data analysis.

Key benefits of integrating BigQuery with GA4

BigQuery can provide some benefits over relying on GA4 for your reports. They include:


  • No Cardinality Issue. GA4’s reports have a limit of 500 entries per custom dimension.  If you have a dimension with high cardinality (such as user_id), it won’t display beyond 500 rows. BigQuery has no limit.
  • Merging Datasets. BigQuery allows you to merge your GA4 data with other datasets. For example, you can merge it with your own first party dataset, or one of the many publicly available ones. Want to see how the weather affects your sales? Now you can!
  • Unsampled Data. GA4 samples it’s funnel reports and its Explorations if you surpass the quota limit. However, when you export your GA4 data to BigQuery it is unsampled, meaning you get every event that’s occurred on site after cookie consent.
  • Data Manipulation. BigQuery utilises SQL, allowing for advanced data manipulations. For example, with commands like the CASE statement, you can create new columns based on conditions in existing ones.
  • Fresher Reporting. Events usually take about a day to appear in GA4. If you enable export streaming in BigQuery, you can get events as and when they happen.

But beware - some things to know about BigQuery

However, there are some drawbacks.

  • SQL Knowledge Required. To effectively use BigQuery, you’ll need to know some SQL. SQL is a very intuitive and easy to understand language, but it is certainly more tricky than simply clicking on the “Reports” tab
  • No Google Signals. Signals is not available in BigQuery, meaning users are not deduplicated by device. You also do not have access to Signals Demographic data. You can learn more about what Google Signals provides here: [GA4] Activate Google signals for Google Analytics 4 properties - Analytics Help
  • No Modelling. BigQuery uses a database of raw data, and as a result does no modelling. This means, for example, that you can’t use data-driven attribution.
  • Query Costs. BigQuery has a generous 1Tb limit before queries begin to cost you – but if you’ll be storing large amounts of data, and making many queries, you’ll likely surpass that. It is, however, reasonably priced.

Given the different capacities that BigQuery has when compared to GA4, there are many use cases and projects it’s appropriate for. This article will explain how to set up your BigQuery project, link your GA4 data to it, and get querying.


Step 1 - Using Google's Cloud Platform

To use BigQuery, we will be using Google’s Cloud platform. You will need to sign into the same google account you are using for your GA4 property. Once signed in, you’ll need to create a new project. You can do this by clicking on the small box at the top of the page, next to the Google Cloud logo. Select New Project, and give your new project a name, then click create.



This new project will give you an SQL workspace that you’ll be able to make your queries in. However, before we can do that, we’ll need to link your GA4 account to the project.

Step 2 - Linking GA4 with BigQuery

Screenshot of a BigQuery project set-up screen

Navigate to your GA4 property, and click Admin. If you scroll down, under the Property section, you will see an option named BigQuery Links. Click on that option, and then select Link. This will bring up the submenu.


Select the Big Query project you made earlier (it will be identified as it’s Project ID instead of the Project Name). Once selected, choose an appropriate Data Location.


Once you’re in the Configure Settings section, you’ll need to select which GA4 data streams you would like to link to your Big Query project. You can select more than one if you wish but be aware that there is a limit of 1 million events a day that can be exported.


You will also need to select which type of export you’d like, Daily or Streaming. Daily does what you imagine it would – export the events from the data stream once a day. Streaming is more rigorous and will export every event as and when it happens, however you will need billing enabled on your BigQuery project, as there is a cost associated with it. Once you’re happy with your configurations, you can click Submit at the bottom of the sub menu.

Step 3 - Querying your data in the SQL Workspace

Now that your data streams are linked to your BigQuery project, you can start querying! Or at least, you can start querying roughly 24 hours after the link has been made. Google doesn’t export the data immediately, so you’ll need to find something to do for a day (maybe learn how to crochet?). After the export has been made, go onto the cloud platform and click on the hamburger menu in the top left. Under BigQuery, select SQL Workspace. 


In the Explorer, you should see your project ID. Click on the dropdown, and you will see your GA4 dataset (it will be named “analytics_YOUR_PROPERTY_ID”). Under this dataset, there will be a series of tables called events. Each one will have a US formatted date at the end of its name, indicating which day of GA4 events the table contains. The tables will start from the date the link was made. Unfortunately, the link does not export your historical data by default, and you will need to import that data into BigQuery yourself.



Click on your events tables, and the GA4 schema will appear. This schema will be the same for all GA4 properties, so it’s useful learning how to query it. Here’s how it should look:

Screenshot of a BigQuery schema

Each field in the schema represents a column in the table. You’ll note that some fields have a small dropdown arrow next to them. This shows this is a nested field and has multiple values within it. For example, event parameters have a key, and a value associated with them, so those two are nested within the event_params field. This isn’t a full guide to the GA4 Schema, but here are some useful tips:


GA4 BigQuery Schema tips


  • The user_pseudo_id is the GA cookie ID and is set to the browser. This is a different value to the user_id which would likely be set by your site or developers using different conditions (for example, user accounts).
  • There are no channel groupings in the schema. This is because it’s raw event data, and channel grouping is something GA4 does after the data arrives to your property. If you’d like to see channel groupings, you’ll need to build the logic in an SQL Query
  • You can find more information on the schema here: [GA4] BigQuery Export schema - Analytics Help (google.com)
  • You can use the Preview tab to see what your raw table looks like.
  • It’s cheaper to unnest user properties than it is to unnest event parameters
  • You can find more information on the schema here: [GA4] BigQuery Export schema


Now we can start querying! Click on the Query option (I usually use the split tab option, as it allows me to look at the schema as I work). For now, let’s use a query that shows off one of BigQuery’s unique use cases. For example, in BigQuery you can access an event’s timestamp. This is particularly useful for debugging, or for observing trends over smaller time periods.

Here’s a query you can use to test. It counts the number of sessions for each quarter of the day, and groups them, giving the groups more human readable names (Morning, Night, etc)

An example query for GA4 data in BigQuery

WITH TimeData AS (

 -- Convert event_timestamp from microseconds to a timestamp format

 -- and extract the hour

 SELECT

   EXTRACT(HOUR FROM TIMESTAMP_SECONDS(CAST(event_timestamp/203 831 6690 AS INT64))) AS hour_of_day,

  event_bundle_sequence_id

 FROM

  `Insert Table ID Here`

 WHERE

  event_name = 'session_start' -- Assuming sessions are represented by 'session_start'

)

 


-- Group by quarter of the day and count sessions

SELECT

 CASE

   WHEN hour_of_day BETWEEN 0 AND 5 THEN 'Night'

   WHEN hour_of_day BETWEEN 6 AND 11 THEN 'Morning'

   WHEN hour_of_day BETWEEN 12 AND 17 THEN 'Afternoon'

   ELSE 'Evening'

 END AS time_of_day,

  COUNT(DISTINCT event_bundle_sequence_id) AS sessions

FROM

  TimeData

GROUP BY

  time_of_day

ORDER BY

  CASE time_of_day

  WHEN 'Night' THEN 1

  WHEN 'Morning' THEN 2

  WHEN 'Afternoon' THEN 3

  ELSE 4

 END;

Now click Run and you should have completed your very first query! Congratulations, I hope you’ve prepared a speech for the occasion. Under the option Save Results, you’ll be given many options for export. If you wish, you can also use it to create another table in your BigQuery dataset. 

Bonus Step – Exporting to Looker Studio

Linking accounts and making queries is all well and good, but most people would rather not have to makes SQL queries every time they want to see their GA4 data. There are plenty of options to take from here, (that’s what makes BQ so exciting) but an appealing one is exporting into a reporting suite like Looker Studio.



First, load up Looker studio, and create a blank dashboard. Select on the Add Data option, and a sub menu will appear with many Google Connector options. Select BigQuery, and a list of all your BigQuery projects will appear. Select your dataset, and then the events table we selected earlier.

Screenshot of a Looker Studio connector screen

Select Add, and the data will now be imported into your report. Now when you create a chart, you can select the BigQuery table as a data source and visualize your data to your heart's content. Not too shabby, eh?

Just the beginning

In wrapping up this chapter of our GA4 guide, it's evident that the melding of BigQuery with GA4 is a game-changer. The union enhances your GA4 data with the power and versatility of SQL, and offers a cardinality not seen in the out-of-the-box reports. While there are undoubtedly challenges and learning curves to maneuver, the possibilities of diving deeper into your data and drawing more nuanced insights are genuinely unparalleled.


If you’d like any help with BigQuery or GA4, get in touch! We’ll help you take your BigQuery game to the next level.

Want to have a chat? 

Chat through our services with our team today and find out how we can help.

Contact us
Share by: