Pack's A/B testing calculates a "Chance to Win" metric for each variant — the Bayesian probability that a variant outperforms the control. For revenue-based goals, this tells you the likelihood that a variant generates more revenue per user than the control.
The calculation happens in two stages:
- BigQuery aggregates raw event data (total revenue, visitor counts, etc.) per variant
- The Bayesian engine computes the probability that each variant beats the control
The query joins two event streams from GA4: experiment views (which variant a user saw) and goal events (e.g., purchases). It then aggregates the results per variant.
```sql
-- CTE 1: Get all users who were assigned to a variant
WITH experiment_viewed AS (
  SELECT
    user_pseudo_id AS user,
    (SELECT value.string_value FROM UNNEST(event_params)
     WHERE key = "experiment_id") AS testId,
    (SELECT value.string_value FROM UNNEST(event_params)
     WHERE key = "experiment_name") AS testHandle,
    COALESCE(
      (SELECT value.string_value FROM UNNEST(event_params)
       WHERE key = "experiment_variant_id"),
      (SELECT value.string_value FROM UNNEST(event_params)
       WHERE key = "variation_id")
    ) AS testVariantId,
    COALESCE(
      (SELECT value.string_value FROM UNNEST(event_params)
       WHERE key = "experiment_variation"),
      (SELECT value.string_value FROM UNNEST(event_params)
       WHERE key = "variation")
    ) AS testVariantHandle,
    CAST(CAST(TIMESTAMP_MICROS(event_timestamp) AS DATE) AS STRING) AS eventDate
  FROM `<dataset>.events_*`
  WHERE _TABLE_SUFFIX BETWEEN @partitionStart AND @partitionEnd
    AND event_name = "view_experiment"
    AND (SELECT value.string_value FROM UNNEST(event_params)
         WHERE key = "experiment_id") = @testId
    AND (TIMESTAMP_MICROS(event_timestamp)
         BETWEEN TIMESTAMP(@startDate) AND TIMESTAMP(@endDate))
  GROUP BY user, testId, testHandle, testVariantId, testVariantHandle, eventDate
),

-- CTE 2: Get all goal events (e.g., purchases) with their revenue value
event AS (
  SELECT
    user_pseudo_id AS user,
    (SELECT
       CASE
         WHEN value.string_value IS NOT NULL THEN value.string_value
         WHEN value.int_value IS NOT NULL THEN CAST(value.int_value AS STRING)
         WHEN value.float_value IS NOT NULL THEN CAST(value.float_value AS STRING)
         WHEN value.double_value IS NOT NULL THEN CAST(value.double_value AS STRING)
       END
     FROM UNNEST(event_params) WHERE key = @aggregationColumnName
    ) AS value,
    CAST(CAST(TIMESTAMP_MICROS(event_timestamp) AS DATE) AS STRING) AS eventDate
  FROM `<dataset>.events_*`
  WHERE _TABLE_SUFFIX BETWEEN @partitionStart AND @partitionEnd
    AND event_name = @eventName
    AND (TIMESTAMP_MICROS(event_timestamp)
         BETWEEN TIMESTAMP(@startDate) AND TIMESTAMP(@endDate))
),

-- CTE 3: Join goal events to experiment assignments
event_experiment AS (
  SELECT
    event.user,
    event.value,
    COALESCE(event.eventDate, ev.eventDate) AS eventDate,
    ev.user AS visitor,
    testId, testHandle, testVariantId, testVariantHandle
  FROM event
  FULL JOIN experiment_viewed ev ON ev.user = event.user
  WHERE testId IS NOT NULL
),

-- CTE 4: Aggregate per variant
aggregation AS (
  SELECT
    testId, testHandle, testVariantId, testVariantHandle,
    COUNT(DISTINCT visitor) AS visitors,
    COUNT(DISTINCT user) AS users,
    COUNT(user) AS count,
    SUM(CAST(value AS FLOAT64)) AS value,
    SUM(POWER(COALESCE(CAST(value AS FLOAT64), 0), 2)) AS valueSquare
  FROM event_experiment
  GROUP BY testId, testHandle, testVariantId, testVariantHandle
)

SELECT testVariantHandle, visitors, users, count, value, valueSquare
FROM aggregation
ORDER BY testVariantHandle
```

| Column | Description |
|---|---|
| `visitors` | Unique users who saw this variant |
| `users` | Unique users who triggered the goal event |
| `count` | Total goal event occurrences |
| `value` | Total revenue (sum of the event parameter, e.g., purchase value) |
| `valueSquare` | Sum of squared values (needed for variance calculation) |
Using the BigQuery output, the Bayesian engine calculates the probability that each variant beats the control.
For revenue (a continuous "mean" metric):

```
mean     = value / visitors                                    -- revenue per visitor
variance = (valueSquare - value² / visitors) / (visitors - 1)  -- sample variance
```
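These two formulas can be sketched in Python, assuming the aggregation row's `value`, `valueSquare`, and `visitors` columns arrive as plain numbers (the function name is illustrative, not the engine's actual API):

```python
def summarize_variant(value: float, value_square: float, visitors: int):
    """Revenue-per-visitor mean and sample variance from the BigQuery
    aggregates: value = sum(x), value_square = sum(x**2), visitors = n."""
    mean = value / visitors
    # Sample variance from running sums: (sum(x**2) - (sum(x))**2 / n) / (n - 1)
    variance = (value_square - value ** 2 / visitors) / (visitors - 1)
    return mean, variance

# Four visitors whose revenues sum to 10, with squared revenues summing to 30:
mean, variance = summarize_variant(value=10.0, value_square=30.0, visitors=4)
# mean = 2.5, variance = (30 - 25) / 3 ≈ 1.67
```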
The relative lift measures how much the variant differs from the control:

```
lift = (meanVariant - meanControl) / meanControl
```

The variance of this lift estimate quantifies its uncertainty:

```
dataVariance = varianceVariant / (meanControl² × visitorsVariant)
             + (varianceControl × meanVariant²) / (meanControl⁴ × visitorsControl)
```
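A sketch of both formulas, assuming the per-variant means, variances, and visitor counts from the previous step (names are illustrative):

```python
def lift_and_variance(mean_v: float, var_v: float, n_v: int,
                      mean_c: float, var_c: float, n_c: int):
    """Relative lift of the variant over the control, and the variance
    of that lift estimate (the dataVariance formula above)."""
    lift = (mean_v - mean_c) / mean_c
    data_variance = (var_v / (mean_c ** 2 * n_v)
                     + (var_c * mean_v ** 2) / (mean_c ** 4 * n_c))
    return lift, data_variance

# Control at $5.00/visitor, variant at $5.50/visitor, 10,000 visitors each,
# with an assumed per-visitor revenue variance of 4.0 on both sides:
lift, dv = lift_and_variance(5.5, 4.0, 10_000, 5.0, 4.0, 10_000)
# lift = 0.10 (+10%)
```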
The Chance to Win is the probability that the true lift is greater than zero, i.e., that the variant actually beats the control:

```
chanceToWin = 1 - NormalCDF(0, meanDiff, stdDiff)
```

Where:

- `meanDiff`: the estimated lift (from Step 2)
- `stdDiff`: √(1 / posteriorPrecision), the uncertainty in that estimate
- `NormalCDF`: the cumulative distribution function of the normal distribution
Interpretation:
- > 50% — Variant is likely beating the control (displayed in green)
- ≤ 50% — Variant is likely losing to the control (displayed in red)
- > 95% — Strong confidence the variant wins
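The document does not define `posteriorPrecision`, so the sketch below assumes a flat prior (posteriorPrecision = 1 / dataVariance); the actual engine may fold in an informative prior. It uses Python's stdlib `statistics.NormalDist`:

```python
from statistics import NormalDist

def chance_to_win(mean_diff: float, data_variance: float) -> float:
    """P(true lift > 0) under a normal posterior on the lift.
    Flat-prior assumption: posteriorPrecision = 1 / dataVariance."""
    posterior_precision = 1.0 / data_variance
    std_diff = (1.0 / posterior_precision) ** 0.5  # sqrt(1 / posteriorPrecision)
    return 1.0 - NormalDist(mean_diff, std_diff).cdf(0.0)

# A +10% estimated lift with stdDiff ≈ 0.0877:
print(round(chance_to_win(0.10, 0.0877 ** 2), 3))  # ≈ 0.873
```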
A range we're 95% confident the true lift falls within:

```
lower = NormalInverse(0.025, meanDiff, stdDiff)
upper = NormalInverse(0.975, meanDiff, stdDiff)
```
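Under the same normal-posterior assumption, the interval can be sketched with `NormalDist.inv_cdf`:

```python
from statistics import NormalDist

def credible_interval(mean_diff: float, std_diff: float, level: float = 0.95):
    """Central credible interval for the lift under a normal posterior."""
    tail = (1.0 - level) / 2.0  # 0.025 on each side for a 95% interval
    posterior = NormalDist(mean_diff, std_diff)
    return posterior.inv_cdf(tail), posterior.inv_cdf(1.0 - tail)

# A +10% lift with stdDiff ≈ 0.0877:
low, high = credible_interval(0.10, 0.0877)
# low ≈ -0.072, high ≈ +0.272
```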
Quantifies the potential downside of choosing each option:
- Risk to Control — expected loss if you pick the variant but control was actually better
- Risk to Variant — expected loss if you pick control but the variant was actually better
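The document does not spell out the risk formulas; one standard closed form for expected loss under a normal posterior (an assumption here, not necessarily the engine's exact computation) looks like this, with the risk-to-X naming following the definitions above:

```python
from statistics import NormalDist

def risks(mean_diff: float, std_diff: float):
    """Expected-loss sketch under a normal posterior on the lift.
    risk_to_control = E[max(-lift, 0)]: expected loss if you ship the
    variant but the control was actually better.
    risk_to_variant = E[max(lift, 0)]: expected loss if you keep the
    control but the variant was actually better."""
    z = mean_diff / std_diff
    std_normal = NormalDist()
    risk_to_control = std_diff * std_normal.pdf(z) - mean_diff * std_normal.cdf(-z)
    risk_to_variant = std_diff * std_normal.pdf(z) + mean_diff * std_normal.cdf(z)
    return risk_to_control, risk_to_variant
```

The two risks always differ by exactly the estimated lift; as the lift's uncertainty shrinks, one of them approaches zero.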
| Variant | Visitors | Revenue | Rev/User | Chance to Win | Lift | 95% CI |
|---|---|---|---|---|---|---|
| Control | 10,000 | $50,000 | $5.00 | - | - | - |
| Variant A | 10,000 | $55,000 | $5.50 | 87.3% | +10.0% | [-7.2%, +27.2%] |
In this example, there's an 87.3% probability that Variant A generates more revenue per user than the Control.