Metric types at a glance
| Type | Answers | Formula | Typical use |
|---|---|---|---|
| Conversion | Did the user trigger the event at least once? | triggered ÷ exposed | Activation, signup, first action |
| Events per user | How many events per user? | Σ events ÷ exposed | Volume: clicks, views, purchases |
| Events per user per active day | How intensely when engaged? | Σ(events ÷ active days) ÷ exposed | Engagement depth, session quality |
| Retention | Did the user clear a threshold within a window? | threshold met ÷ exposed | Habit formation, stickiness |
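As a rough sketch (not the platform's actual implementation), all four formulas can be computed from per-user event data. The user names, the `events` mapping, and the retention threshold below are illustrative assumptions:

```python
# Hypothetical raw data: per-user {day: event_count} mappings.
# All names here are illustrative, not the platform's API.
exposed = ["alice", "bob", "carol", "dave"]   # every user assigned to the variant
events = {
    "alice": {0: 10},                      # 10 events, all on day 0
    "bob": {d: 1 for d in range(10)},      # 1 event on each of 10 days
    "carol": {2: 1},
    # dave fired nothing: he still counts in every denominator (zero-filling)
}

n = len(exposed)

# Conversion: triggered at least once ÷ exposed
conversion = sum(1 for u in exposed if events.get(u)) / n

# Events per user: Σ events ÷ exposed
events_per_user = sum(sum(days.values()) for days in events.values()) / n

# Events per user per active day: Σ(events ÷ active days) ÷ exposed
rate = sum(sum(days.values()) / len(days) for days in events.values()) / n

# Retention: threshold met (here: ≥2 events in days 0–7) ÷ exposed
threshold_met = sum(
    1 for u in exposed
    if sum(c for d, c in events.get(u, {}).items() if 0 <= d <= 7) >= 2
) / n

print(conversion, events_per_user, rate, threshold_met)
# → 0.75 5.25 3.0 0.5
```

Note that every denominator is the full exposed count, so dave's silence drags each metric down rather than being ignored.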
Conversion
Measures the fraction of exposed users who fired the event at least once.
Formula
triggered ÷ exposed
Example
A variant exposes 1,200 users. 340 of them fire `signup_completed` at least once. Conversion = 340 ÷ 1,200 ≈ 28.3%.
When to use
Activation funnels, first-time actions, any binary “did it happen” outcome. Don’t use it when volume matters: a user firing the event ten times contributes the same as a user firing it once. For volume, use Events per user.
Events per user
Measures the average number of events each exposed user fired.
Formula
Σ events ÷ exposed
Example
A variant exposes 1,200 users. 200 fire 600 events total; the remaining 1,000 fire none. Events per user = 600 ÷ 1,200 = 0.5.
When to use
Volume metrics: clicks, page views, messages sent, purchases. A user firing the event ten times contributes ten times as much as a user firing it once. Don’t use it when engagement intensity matters more than cumulative volume: a user firing ten events on one day and a user firing one event on each of ten days contribute the same here. For intensity, use Events per user per active day.
Events per user per active day
Measures the average per-active-day event rate, computed per user, then averaged across the exposed population.
Formula
Σ(events ÷ active days) ÷ exposed
Example
Two users, same event, same experiment window:
| User | Events | Active days | Per-user rate |
|---|---|---|---|
| Alice | 10 | 1 | 10 |
| Bob | 10 | 10 | 1 |
Metric value: (10 + 1) ÷ 2 = 5.5, even though both users fired the same 10 events.
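The table's arithmetic can be spelled out in a few lines. This is a sketch of the two computations side by side, with the per-user counts above hard-coded as plain data:

```python
# Alice and Bob from the table: same event total, different spread across days.
users = {
    "alice": {"events": 10, "active_days": 1},
    "bob": {"events": 10, "active_days": 10},
}

# Events per user: both contribute 10, so the mean is 10.
epu = sum(u["events"] for u in users.values()) / len(users)

# Events per user per active day: Alice rates 10/1, Bob 10/10.
epupad = sum(u["events"] / u["active_days"] for u in users.values()) / len(users)

print(epu, epupad)  # → 10.0 5.5
```

Same total volume, very different intensity: only the per-active-day metric separates Alice's one burst from Bob's steady habit.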
When to use
Engagement intensity, session quality, and any question where “when users come, they come hard” matters more than “how many total interactions.” Don’t use it when you care about total volume. Use Events per user.
Retention
Measures the fraction of exposed users who met a frequency threshold within a specified post-exposure window.
Formula
threshold met ÷ exposed
Example
Threshold: at least 2 events in days 0–7. Out of 1,200 exposed users, 180 meet it. Retention = 180 ÷ 1,200 = 15%.
When to use
Habit formation, stickiness, and any question about whether users keep engaging within a specific time window. Don’t use it for one-time actions (Conversion is simpler) or when you need the number of interactions (Events per user).
How results are computed
The sections below document the statistical machinery. Read them to interpret edge cases; skip them if you just want to use the numbers.
Zero-filling
All four metric types divide by the total exposed user count, not just participants. Exposed users who never trigger the event contribute 0 to the numerator. This is intent-to-treat (ITT) analysis: the experiment measures the effect of assigning users to a variant, not the effect on users who engaged. Restricting to engaged users selects on outcome and biases the result.
Practical consequence: with low participation (say 3%), the median pins to 0 because more than half of the variant contributes 0. The Participating column shows how many users contributed non-zero values.
Winsorization
Events per user and Events per user per active day cap each user’s per-user value at the variant’s 99.9th percentile before summing. This bounds the influence of extreme outliers without removing them entirely.
Conversion and Retention are booleans per user; there’s no value to cap.
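The cap itself is a one-liner. A minimal sketch using NumPy's default percentile interpolation (the platform's exact interpolation rule is an assumption here):

```python
import numpy as np

def winsorize_upper(values, pct=99.9):
    """Cap each per-user value at the variant's pct-th percentile.

    Sketch only: clips the upper tail without dropping any user.
    """
    cap = np.percentile(values, pct)
    return np.minimum(values, cap)

per_user = np.array([1.0, 2.0, 3.0, 2.0, 500.0])  # one extreme outlier
capped = winsorize_upper(per_user)
print(per_user.mean(), capped.mean())
```

The outlier is pulled down toward the 99.9th percentile, so the variant mean is no longer dominated by a single user, while the ordinary values pass through untouched.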
Significance testing
| Metric type | Test |
|---|---|
| Conversion, Retention | Two-proportion Z-test (two-sided) |
| Events per user, Events per user per active day | Welch’s t-test (two-sided) |
Results with q < 0.05 are reported as statistically significant.
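For intuition, here is a minimal sketch of both tests: a hand-rolled pooled two-proportion Z-test and SciPy's Welch t-test. The counts and per-user arrays are made-up illustrations, and this is not the platform's own code:

```python
import math
from scipy.stats import ttest_ind

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided two-proportion Z-test with a pooled variance estimate."""
    p_a, p_b = success_a / n_a, success_b / n_b
    p = (success_a + success_b) / (n_a + n_b)          # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF.
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Conversion / Retention: compare the two variants' proportions.
p_val = two_proportion_z(340, 1200, 290, 1200)

# Events per user (zero-filled per-user values): Welch's t-test
# does not assume the two variants have equal variances.
a = [0, 0, 3, 1, 0, 2, 5, 0]
b = [0, 1, 0, 0, 1, 0, 2, 0]
t_stat, t_p = ttest_ind(a, b, equal_var=False)
```

The p-values these tests produce are then adjusted into the q-values shown in the UI.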
CUPED variance reduction
CUPED (Controlled-experiment Using Pre-Experiment data) uses each user’s pre-exposure behavior as a covariate to shrink variance. Typical reduction is 10–40%, which means experiments reach significance with fewer samples. Enable CUPED per metric by setting `cuped_pre_exposure_days` to the number of pre-exposure days to use (e.g. 7 or 14). Only applies to Events per user and Events per user per active day.
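The core adjustment is small. A sketch on simulated data (the correlation strength and distributions are assumptions, chosen so pre-exposure behavior predicts in-experiment behavior):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-user values: pre-exposure behavior (x) correlates
# with in-experiment behavior (y).
x = rng.poisson(5, size=2000).astype(float)   # e.g. events in the prior 7 days
y = 0.8 * x + rng.normal(0, 1, size=2000)     # in-experiment events

# CUPED: subtract the component of y predicted by pre-exposure data.
theta = np.cov(x, y)[0, 1] / np.var(x)
y_cuped = y - theta * (x - x.mean())

print(np.var(y), np.var(y_cuped))  # adjusted variance is far smaller
```

The adjustment leaves each variant's mean unchanged (the subtracted term has mean zero) but removes the variance explained by pre-existing behavior, which is what lets the t-test reach significance sooner.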
Troubleshooting
My mean and median look suspiciously small
Low participation. If 3% of exposed users engaged, the remaining 97% contribute 0. The mean is diluted and the median pins to 0.
Check the Participating column. When it’s a small fraction of Users, the metric is dominated by zero-filled non-participants. Options:
- Run the experiment longer to accumulate more participants.
- Switch to Retention if the question is “how many users crossed a threshold.”
- Switch to Conversion if a binary “did they engage” is enough.
Events per user and Events per user per active day show nearly identical numbers
The two metrics differ only when users vary in how many days they were active. If every engaged user fires the event on exactly one day, the two reduce to the same quantity (events ÷ 1 = events).
To see a meaningful difference, pick an event that can repeat across days — session-level activity, daily check-ins, repeated clicks.
The q-value never drops below 0.05
The experiment is underpowered: either the sample is too small or the effect size is too small relative to variance. Options:
- Increase traffic allocation or extend the run to accumulate more exposures.
- Enable CUPED on mean metrics to reduce variance.
- Pick a metric with less variance — Conversion is typically less noisy than Events per user.
- Verify the metric is one the treatment is expected to move.