GA4 Traffic Allocation and Conversion Attribution (Part II: GA4 BigQuery)

In a previous blog post, we discussed how the GA4 UI attributes conversions and user/session acquisition. In this article, I want to show you how traffic source fields are recorded in the GA4 BigQuery export. In BigQuery, there are four sets of traffic source fields collected directly from the web. If you want to understand how they differ and how you can use them to build your own channel data model in BigQuery, read on.

(If you are new to BigQuery GA4 export, you may want to start with Connor’s BigQuery intro blog)

GA4 BigQuery: Different Types of Traffic Source Records

At the time of writing, the following four groups of records log traffic source information in BigQuery’s GA4 event table:

  • traffic_source
  • collected_traffic_source
  • session_traffic_source_last_click.manual_campaign
  • session_traffic_source_last_click.cross_channel_campaign

Each of these records contains a list of traffic source values GA4 collected from the URL or from platform integrations. 

Here is a summary of these four records and their value persistency:

In the next three sections, we are going to go through these fields in further detail. 

User level traffic source

The three fields under ‘traffic_source’ store user level traffic source data: 

A common mistake among new GA4 BigQuery users is to query the traffic_source fields for source, medium and campaign values. It is important to highlight that these are user level traffic sources. As mentioned in my previous article, user level campaign name, source and medium are collected from the URL when a new user first lands on the site. These values are assigned to the first_visit event and persist throughout a user’s lifetime on your website, as long as the user_pseudo_id is retained.

Use these fields in your query only when you are analysing user acquisition, NOT when you analyse session or conversion channel attribution.
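As a rough illustration of a user acquisition query, the sketch below counts new users by their user level source, medium and campaign. It assumes the same placeholder table name used in the query at the end of this article, and it assumes a web stream where new users fire first_visit (app streams fire first_open instead), so adjust both to your own setup.

```sql
-- Sketch: new users by user level (first-touch) source / medium / campaign.
-- Assumption: `ga_dataset_1234567.events_20250224` is a placeholder table name;
-- replace it with your own GA4 export table.
SELECT
  traffic_source.source AS user_source,
  traffic_source.medium AS user_medium,
  traffic_source.name AS user_campaign_name,
  COUNT(DISTINCT user_pseudo_id) AS new_users
FROM `ga_dataset_1234567.events_20250224`
-- first_visit marks a user's first recorded visit on a web stream
WHERE event_name = 'first_visit'
GROUP BY user_source, user_medium, user_campaign_name
ORDER BY new_users DESC
```

Because the filter keeps only first_visit events, each user is counted once, against the values they arrived with on their first visit.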

Hit level traffic source

The fields under ‘collected_traffic_source’ contain a lot more traffic source information. They are collected at every single hit, based on the URL each event is tied to.

As you can see, on top of the traditional utm fields, there are other query strings used by ads platforms such as ‘gclid’.

The values of these fields are assigned from the URL if the corresponding query parameters are present on that event hit, which means two things:

1) most of the events do not get these values.

In the example below, not all events in each session contain the campaign name, source and medium values, because these query parameters don’t exist in the URL when, for example, a scroll event happens:

2) if utm values change in the middle of a site visit, that page view event will record the new values in these fields.

Unlike Universal Analytics, the GA4 UI does not restart a session if the utm values change in the middle of a session. So if a user lands on the site via organic search and, a few minutes later, clicks an Instagram link to your site, the GA4 UI counts only one session, credited to organic search. If you want to count two sessions in this scenario, you can use these hit level fields in BigQuery to build a session view that differs from the UI.
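One possible starting point for such a session view is to count how many distinct tagged sources appear within each GA4 session. The query below is a minimal sketch, not a full re-sessionisation: the table name is a placeholder, and it only sees hits that carry utm values, so an untagged organic landing will not show up as a separate source here.

```sql
-- Sketch: distinct hit level source / medium pairs seen inside each GA4 session.
-- Assumption: placeholder table name; only hits with utm values are considered.
WITH tagged_hits AS (
  SELECT
    CONCAT(
      user_pseudo_id,
      CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)
    ) AS session_id,
    CONCAT(
      collected_traffic_source.manual_source,
      ' / ',
      IFNULL(collected_traffic_source.manual_medium, '(not set)')
    ) AS hit_source_medium
  FROM `ga_dataset_1234567.events_20250224`
  WHERE user_pseudo_id IS NOT NULL
    AND collected_traffic_source.manual_source IS NOT NULL
)
SELECT
  session_id,
  COUNT(DISTINCT hit_source_medium) AS distinct_tagged_sources,
  STRING_AGG(DISTINCT hit_source_medium, ' | ') AS tagged_sources
FROM tagged_hits
GROUP BY session_id
ORDER BY distinct_tagged_sources DESC
```

Sessions with a count above 1 are the ones where the utm values changed mid-visit, and they are candidates for being split into separate sessions in your own model.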

Session level traffic source

The sub-records under ‘session_traffic_source_last_click’ were added to GA4’s event tables from July 2024. The values under these records are populated for all hits within the same session. Some of these records are gathered by connecting with third party platforms such as SA360. These additions make campaign analysis much easier and more flexible for paid search ads.

The difference between these two sets is that manual_campaign sets its values based on the current session only, whereas cross_channel_campaign sets them using the last non-direct click model (explained in this article).

For most sessions, these two sets of fields have the same values. However, when a session has no source, medium and campaign values, the cross_channel_campaign fields look back to the previous session and carry its last UTM values forward. Here is an example we found where this happened:
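To look for such sessions in your own data, a query along these lines may help. It is a sketch that assumes a placeholder table name and a daily table from July 2024 or later (older tables do not have these columns); it lists sessions where the manual and cross channel source or medium disagree.

```sql
-- Sketch: sessions where the single-session and last non-direct values differ.
-- Assumption: placeholder table name; requires a table from July 2024 onwards.
SELECT DISTINCT
  CONCAT(
    user_pseudo_id,
    CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)
  ) AS session_id,
  session_traffic_source_last_click.manual_campaign.`source` AS session_manual_source,
  session_traffic_source_last_click.manual_campaign.medium AS session_manual_medium,
  session_traffic_source_last_click.cross_channel_campaign.`source` AS session_cross_source,
  session_traffic_source_last_click.cross_channel_campaign.medium AS session_cross_medium
FROM `ga_dataset_1234567.events_20250224`
WHERE user_pseudo_id IS NOT NULL
  AND (
    -- IS DISTINCT FROM treats NULL as a comparable value, so NULL vs 'google' counts as a difference
    session_traffic_source_last_click.manual_campaign.`source`
      IS DISTINCT FROM session_traffic_source_last_click.cross_channel_campaign.`source`
    OR session_traffic_source_last_click.manual_campaign.medium
      IS DISTINCT FROM session_traffic_source_last_click.cross_channel_campaign.medium
  )
```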

An important correction by cross channel fields

BigQuery traffic fields have a known error: the source/medium values are incorrectly assigned as ‘google/organic’ for paid search traffic when auto-tagging is switched on in Google Ads. This is because auto-tagged links have no utm query parameters; there is only a ‘gclid’ parameter in the URL, so BigQuery cannot derive the correct values from the URL.

However, the newer session_traffic_source_last_click.cross_channel_campaign fields fix this issue and now pass on the correct values:
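You can check this against your own export with something like the sketch below. It takes every event that arrived with a gclid and shows the hit level utm values next to the cross channel values. The table name is a placeholder, and what you see will depend on your own Google Ads linking and tagging setup.

```sql
-- Sketch: for hits carrying a gclid, compare hit level and cross channel values.
-- Assumption: placeholder table name; requires a table from July 2024 onwards.
SELECT
  collected_traffic_source.manual_source AS collect_manual_source,
  collected_traffic_source.manual_medium AS collect_manual_medium,
  session_traffic_source_last_click.cross_channel_campaign.`source` AS session_cross_source,
  session_traffic_source_last_click.cross_channel_campaign.medium AS session_cross_medium,
  COUNT(*) AS events_with_gclid
FROM `ga_dataset_1234567.events_20250224`
WHERE collected_traffic_source.gclid IS NOT NULL
GROUP BY collect_manual_source, collect_manual_medium, session_cross_source, session_cross_medium
ORDER BY events_with_gclid DESC
```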

Try it yourself

The best way of understanding how these fields work is to try it yourself. Here is a query for you to kick off your own exploration. You can use this to compare these outputs for your consented events grouped by each session. If you have any questions, contact us!

SELECT
-- Session ID and Event Name
-- (ga_session_id is an integer, so cast it before concatenating)
CONCAT(user_pseudo_id, CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS STRING)) AS session_id,
event_name,
-- User level traffic source
traffic_source.name AS user_campaign_name,
traffic_source.medium AS user_medium,
traffic_source.source AS user_source,
-- Hit level traffic source
collected_traffic_source.manual_campaign_name AS collect_manual_campaign_name,
collected_traffic_source.manual_source AS collect_manual_source,
collected_traffic_source.manual_medium AS collect_manual_medium,
-- Session level traffic source (single session)
session_traffic_source_last_click.manual_campaign.campaign_name AS session_manual_campaign,
session_traffic_source_last_click.manual_campaign.`source` AS session_manual_source,
session_traffic_source_last_click.manual_campaign.medium AS session_manual_medium,
-- Session level traffic source (cross session last click)
session_traffic_source_last_click.cross_channel_campaign.campaign_name AS session_cross_campaign,
session_traffic_source_last_click.cross_channel_campaign.`source` AS session_cross_source,
session_traffic_source_last_click.cross_channel_campaign.medium AS session_cross_medium
-- Replace the table name below with your own GA4 export table
FROM `ga_dataset_1234567.events_20250224`
WHERE user_pseudo_id IS NOT NULL
-- GROUP BY ALL collapses identical rows into one row per distinct combination
GROUP BY ALL



GA4’s BigQuery export offers powerful flexibility for traffic source analysis beyond the GA4 UI. By understanding how different fields capture user, session, and hit-level data, you can create more tailored attribution models. Whether you need to refine session definitions, improve campaign analysis, or leverage raw data to build advanced machine learning models, BigQuery provides the foundation to do so. If you need expert guidance in using BigQuery for GA4 channel data modelling, reach out to Hookflash. We’re here to help you get the most from your raw data.
