Retention Queries
Build typed retention analysis against Mixpanel's Insights engine — define born/return event pairs, retention periods, custom buckets, and segmentation inline without creating saved reports first.
New in v0.4
Workspace.query_retention() is the typed way to run retention analysis programmatically. It supports capabilities not available through the legacy retention() method, including per-event filters, custom retention buckets, alignment modes, display modes, and typed breakdowns.
When to Use query_retention()
query_retention() builds retention bookmark params and posts them to the Insights engine. The legacy retention() method queries the older Retention API endpoint. Use query_retention() when you need any of the capabilities in the right column:
| Capability | Legacy retention() | query_retention() |
|---|---|---|
| Basic cohort retention | retention(born_event=..., return_event=...) | query_retention("Signup", "Login") |
| Per-event filters | Expression strings only | RetentionEvent("Signup", filters=[...]) |
| Custom retention buckets | Not available | bucket_sizes=[1, 3, 7, 14, 30] |
| Alignment modes | Not available | alignment="birth" or "interval_start" |
| Display modes | Not available | mode="curve", "trends", or "table" |
| Typed filters | Expression strings | where=Filter.equals("country", "US") |
| Property breakdowns | on="country" | group_by="country" |
| Math types | Not available | math="retention_rate" or "unique" |
| Save query as a report | N/A | result.params → create_bookmark() |
Use the legacy retention() when:
- You need the older Query API response format → retention(born_event=..., return_event=...)
- You need born_where/return_where expression-string filters → retention(born_where='...')
Getting Started
The simplest possible retention query — weekly retention over the last 30 days:
import mixpanel_data as mp
ws = mp.Workspace()
result = ws.query_retention("Signup", "Login")
print(result.average) # synthetic average across all cohorts
print(result.df.head())
# cohort_date bucket count rate
# 0 2025-01-01 0 1000 1.000000
# 1 2025-01-01 1 800 0.800000
# 2 2025-01-01 2 650 0.650000
Add a time range and retention unit:
# Daily retention over the last 14 days
result = ws.query_retention("Signup", "Login", retention_unit="day", last=14)
# Monthly retention over the last 180 days
result = ws.query_retention("Signup", "Login", retention_unit="month", last=180)
# Specific date range
result = ws.query_retention(
"Signup",
"Login",
from_date="2025-01-01",
to_date="2025-03-31",
retention_unit="week",
)
Events
Plain Strings
The simplest way to define born and return events is to pass event names as strings, for example ws.query_retention("Signup", "Login"). The first argument is the born event (defines cohort membership) and the second is the return event (defines what counts as returning).
The RetentionEvent Class
For per-event configuration with filters, use RetentionEvent objects:
from mixpanel_data import RetentionEvent, Filter
result = ws.query_retention(
RetentionEvent("Signup", filters=[Filter.equals("source", "organic")]),
RetentionEvent("Login"),
)
RetentionEvent fields:
| Field | Type | Default | Description |
|---|---|---|---|
| event | str | (required) | Mixpanel event name |
| filters | list[Filter] \| None | None | Per-event filter conditions |
| filters_combinator | "all" \| "any" | "all" | How per-event filters combine (AND/OR) |
Plain strings and RetentionEvent objects can be mixed freely:
result = ws.query_retention(
"Signup", # plain string — no filters needed
RetentionEvent("Purchase", filters=[Filter.greater_than("amount", 0)]),
)
Per-Event Filters
Apply filters to individual events using RetentionEvent.filters. These restrict which events count for that specific role (born or return):
from mixpanel_data import RetentionEvent, Filter
result = ws.query_retention(
RetentionEvent("Signup", filters=[Filter.equals("source", "organic")]),
RetentionEvent("Purchase", filters=[
Filter.equals("country", "US"),
Filter.greater_than("amount", 25),
]),
)
By default, multiple per-event filters combine with AND logic. Use filters_combinator="any" for OR logic:
result = ws.query_retention(
"Signup",
RetentionEvent(
"Purchase",
filters=[
Filter.equals("country", "US"),
Filter.equals("country", "CA"),
],
filters_combinator="any", # match US OR CA
),
)
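The AND/OR behaviour of filters_combinator can be pictured with plain Python predicates. This is a minimal sketch that does not use mixpanel_data: property_equals and event_matches are hypothetical helpers introduced only for illustration, and the real filtering happens server-side.

```python
from typing import Any, Callable, Mapping

# Hypothetical stand-in for a per-event filter: a predicate over event properties.
Predicate = Callable[[Mapping[str, Any]], bool]


def property_equals(name: str, expected: Any) -> Predicate:
    """Return a predicate that is true when properties[name] == expected."""
    return lambda props: props.get(name) == expected


def event_matches(
    props: Mapping[str, Any],
    predicates: list[Predicate],
    combinator: str = "all",
) -> bool:
    """Combine predicates with AND ("all") or OR ("any")."""
    if combinator not in ("all", "any"):
        raise ValueError(f"combinator must be 'all' or 'any', got {combinator!r}")
    results = (predicate(props) for predicate in predicates)
    return all(results) if combinator == "all" else any(results)


country_filters = [property_equals("country", "US"), property_equals("country", "CA")]
purchase = {"country": "CA", "amount": 40}

print(event_matches(purchase, country_filters, "any"))  # True: CA satisfies one filter
print(event_matches(purchase, country_filters, "all"))  # False: country cannot equal both
```

With the default "all", two equality filters on the same property can never both match, which is why the US-or-CA example needs "any".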
See Insights Queries — Filters for the full list of Filter factory methods.
Retention Unit
Control the retention period granularity:
| Unit | Description |
|---|---|
| "day" | Daily retention buckets |
| "week" (default) | Weekly retention buckets |
| "month" | Monthly retention buckets |
# Daily retention
result = ws.query_retention("Signup", "Login", retention_unit="day", last=14)
# Weekly retention (default)
result = ws.query_retention("Signup", "Login", retention_unit="week", last=90)
# Monthly retention
result = ws.query_retention("Signup", "Login", retention_unit="month", last=180)
Alignment
The alignment parameter controls how retention periods are anchored:
| Alignment | Behavior |
|---|---|
| "birth" (default) | Each user's retention clock starts from their born event |
| "interval_start" | Retention periods align to calendar boundaries (start of day/week/month) |
# Birth-aligned (default) — each user's clock starts individually
result = ws.query_retention("Signup", "Login", alignment="birth")
# Interval-aligned — retention periods snap to calendar boundaries
result = ws.query_retention("Signup", "Login", alignment="interval_start")
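To make the difference between the two alignment modes concrete, here is a rough sketch of how a return event could be assigned to a weekly bucket under each mode. It is an illustration only: weekly_bucket and week_start are hypothetical helpers, the Monday week start is an assumption, and the exact server-side computation is not documented here.

```python
from datetime import datetime, timedelta

WEEK = timedelta(days=7)


def week_start(ts: datetime) -> datetime:
    """Midnight on the Monday of the week containing ts (Monday start is an assumption)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=ts.weekday())


def weekly_bucket(born: datetime, returned: datetime, alignment: str = "birth") -> int:
    """Return the weekly bucket index of a return event under the given alignment."""
    if alignment == "birth":
        # The clock starts at the user's own born event.
        return (returned - born) // WEEK
    if alignment == "interval_start":
        # Both timestamps snap to the start of their calendar week first.
        return (week_start(returned) - week_start(born)) // WEEK
    raise ValueError(f"unknown alignment: {alignment!r}")


born = datetime(2025, 1, 2, 10, 0)      # Thursday
returned = datetime(2025, 1, 6, 9, 0)   # the following Monday

print(weekly_bucket(born, returned, "birth"))           # 0: fewer than 7 days elapsed
print(weekly_bucket(born, returned, "interval_start"))  # 1: falls in the next calendar week
```

The same pair of events lands in different buckets depending on the mode, so curves computed with different alignment settings are not directly comparable.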
Custom Buckets
By default, retention uses uniform bucket sizes (bucket 0, 1, 2, ...). Use bucket_sizes for non-uniform retention periods:
# Custom day-based buckets: day 1, 3, 7, 14, 30
result = ws.query_retention(
"Signup",
"Login",
retention_unit="day",
bucket_sizes=[1, 3, 7, 14, 30],
)
Bucket sizes must satisfy these rules:
- Each value is a positive integer
- Values are in strictly ascending order
- The list contains at most 730 values
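The rules above can be checked locally before building a query. The following is a minimal sketch, not the library's validator: check_bucket_sizes is a hypothetical helper and its messages are illustrative.

```python
def check_bucket_sizes(bucket_sizes: list, max_values: int = 730) -> list[str]:
    """Return a list of rule violations; an empty list means the input is acceptable."""
    problems = []
    if len(bucket_sizes) > max_values:
        problems.append(f"at most {max_values} bucket sizes are allowed")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if any(not isinstance(size, int) or isinstance(size, bool) for size in bucket_sizes):
        problems.append("bucket sizes must be integers")
        return problems  # the ordering checks below assume integers
    if any(size <= 0 for size in bucket_sizes):
        problems.append("each bucket size must be a positive integer")
    if any(later <= earlier for earlier, later in zip(bucket_sizes, bucket_sizes[1:])):
        problems.append("bucket sizes must be in strictly ascending order")
    return problems


print(check_bucket_sizes([1, 3, 7, 14, 30]))  # []
print(check_bucket_sizes([5, 3, 1]))          # ['bucket sizes must be in strictly ascending order']
```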
Aggregation
The math parameter controls what metric is computed:
| Math type | What it measures |
|---|---|
| "retention_rate" (default) | Fraction of the cohort retained per bucket (0.0–1.0) |
| "unique" | Raw unique user count per bucket |
# Retention rate (default)
result = ws.query_retention("Signup", "Login", math="retention_rate")
# Raw unique user counts
result = ws.query_retention("Signup", "Login", math="unique")
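The two math types are related through the cohort size: as described under DataFrame Structure, a bucket's rate is its retained count divided by the cohort size. A small sketch of that arithmetic, using a hypothetical helper rates_from_counts and the sample numbers from this page:

```python
def rates_from_counts(first: int, counts: list[int]) -> list[float]:
    """Convert per-bucket unique counts into retention rates for one cohort."""
    if first <= 0:
        # An empty cohort has no meaningful rate; report zeros rather than dividing by zero.
        return [0.0 for _ in counts]
    return [count / first for count in counts]


print(rates_from_counts(1000, [1000, 800, 650]))  # [1.0, 0.8, 0.65]
```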
Filters
Global Filters
Apply filters across the entire query with where=:
from mixpanel_data import Filter
# Single filter
result = ws.query_retention(
"Signup",
"Login",
where=Filter.equals("country", "US"),
)
# Multiple filters (AND logic)
result = ws.query_retention(
"Signup",
"Login",
where=[
Filter.equals("platform", "web"),
Filter.is_true("is_premium"),
],
)
Global filters apply to the overall query. For event-specific filtering, use RetentionEvent.filters (see Events — Per-Event Filters).
See Insights Queries — Available Filter Methods for the complete filter reference.
Cohort Filters
Restrict retention analysis to users in a cohort:
from mixpanel_data import Filter, CohortCriteria, CohortDefinition
# Do power users retain better?
result = ws.query_retention(
"Signup",
"Login",
where=Filter.in_cohort(123, "Power Users"),
retention_unit="week",
last=90,
)
# Inline cohort — define the segment on the fly
organic = CohortDefinition(
CohortCriteria.did_event("Signup", at_least=1, within_days=90,
where=Filter.equals("source", "organic"))
)
result = ws.query_retention(
"Signup", "Purchase",
where=Filter.in_cohort(organic, name="Organic Signups"),
)
See Insights Queries — Cohort Filters for the full cohort filter reference.
Custom Property Filters
Use saved or inline custom properties in retention filters:
from mixpanel_data import Filter, CustomPropertyRef
result = ws.query_retention(
"Signup",
"Login",
where=Filter.greater_than(property=CustomPropertyRef(42), value=100),
retention_unit="week",
last=90,
)
Warning
Custom property filters in retention where= may cause server errors due to a known Mixpanel API bug. Custom property breakdowns work reliably.
See Insights Queries — Custom Properties in Queries for InlineCustomProperty, validation rules, and full options.
Breakdowns
Break down retention results by property values with group_by:
from mixpanel_data import GroupBy
# Simple string breakdown
result = ws.query_retention("Signup", "Login", group_by="platform")
# Multiple breakdowns
result = ws.query_retention("Signup", "Login", group_by=["country", "platform"])
# Numeric bucketing
result = ws.query_retention(
"Signup",
"Purchase",
group_by=GroupBy("amount", property_type="number", bucket_size=50),
)
Cohort Breakdowns
Segment retention by cohort membership — compare how a cohort retains vs. everyone else:
from mixpanel_data import CohortBreakdown
result = ws.query_retention(
"Signup",
"Login",
group_by=CohortBreakdown(123, "Power Users"),
retention_unit="week",
last=90,
)
Note
query_retention() does not support mixing CohortBreakdown with property GroupBy in the same group_by list. Use one or the other.
See Insights Queries — Cohort Breakdowns for inline definitions and options.
Custom Property Breakdowns
Break down retention results by a saved or inline custom property:
from mixpanel_data import GroupBy, CustomPropertyRef
result = ws.query_retention(
"Signup",
"Login",
group_by=GroupBy(property=CustomPropertyRef(42), property_type="number"),
retention_unit="week",
last=90,
)
See Insights Queries — Custom Properties in Queries for InlineCustomProperty and full options.
See Insights Queries — Breakdowns for the full GroupBy reference.
Time Ranges
Relative (Default)
By default, query_retention() covers the last 30 days. Customize the window with last (always in days) and unit (aggregation granularity):
# Last 7 days
result = ws.query_retention("Signup", "Login", last=7)
# Last 90 days, weekly granularity
result = ws.query_retention("Signup", "Login", last=90, unit="week")
# Last 180 days, monthly granularity
result = ws.query_retention("Signup", "Login", last=180, unit="month")
Absolute
Specify explicit start and end dates:
# Q1 2025
result = ws.query_retention(
"Signup",
"Login",
from_date="2025-01-01",
to_date="2025-03-31",
)
Dates must be in YYYY-MM-DD format.
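If dates come from user input, it can help to check the format before calling the API. Below is a minimal sketch using only the standard library. parse_query_date and check_date_range are hypothetical helpers, and the rule that from_date must not be later than to_date is an assumption rather than something this page states.

```python
import re
from datetime import date

# Strict, zero-padded YYYY-MM-DD (date.fromisoformat alone accepts other ISO forms on newer Pythons).
_YMD = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_query_date(value: str) -> date:
    """Parse a strict YYYY-MM-DD string into a date, raising ValueError otherwise."""
    if not _YMD.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    # fromisoformat rejects impossible dates such as 2025-02-30.
    return date.fromisoformat(value)


def check_date_range(from_date: str, to_date: str) -> tuple[date, date]:
    """Parse both ends of a range and check their order (ordering rule is an assumption)."""
    start, end = parse_query_date(from_date), parse_query_date(to_date)
    if start > end:
        raise ValueError("from_date must not be later than to_date")
    return start, end


start, end = check_date_range("2025-01-01", "2025-03-31")
print((end - start).days + 1)  # 90 calendar days, counting both endpoints
```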
Display Modes
The mode parameter controls result presentation:
| Mode | Chart type | Use case |
|---|---|---|
| "curve" (default) | Retention curve | Standard retention analysis |
| "trends" | Line chart | Track retention performance over time |
| "table" | Table | Detailed cohort-level comparison |
# Retention curve (default)
result = ws.query_retention("Signup", "Login", mode="curve")
# Trends over time
result = ws.query_retention(
"Signup",
"Login",
mode="trends",
last=90,
unit="week",
)
# Tabular format
result = ws.query_retention("Signup", "Login", mode="table")
Working with Results
RetentionQueryResult
query_retention() returns a RetentionQueryResult with the following attributes:
result = ws.query_retention("Signup", "Login", retention_unit="week", last=90)
# Cohort data — keyed by cohort date
for date, data in result.cohorts.items():
print(f"{date}: {data['first']} users born")
print(f" Retention: {data['rates']}") # [1.0, 0.8, 0.65, ...]
# Synthetic average across all cohorts
result.average # {"first": 500, "counts": [...], "rates": [...]}
# DataFrame (lazy, cached)
result.df
# cohort_date bucket count rate
# 0 2025-01-01 0 1000 1.000000
# 1 2025-01-01 1 800 0.800000
# 2 2025-01-01 2 650 0.650000
# 3 2025-01-08 0 950 1.000000
# 4 2025-01-08 1 760 0.800000
# Time range
result.from_date # "2025-01-01"
result.to_date # "2025-03-31"
# Metadata
result.computed_at # "2025-03-31T12:00:00.000000+00:00"
result.meta # {"sampling_factor": 1.0, ...}
# Generated bookmark params (for debugging or persistence)
result.params # dict — the full bookmark JSON sent to API
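When comparing cohorts of different sizes, a size-weighted (pooled) rate per bucket is often more informative than a plain mean of the per-cohort rates. The sketch below computes one from a dict shaped like result.cohorts. It is an illustration under assumptions: pooled_rates is a hypothetical helper, and it is not claimed to reproduce how result.average is computed.

```python
def pooled_rates(cohorts: dict[str, dict]) -> list[float]:
    """Size-weighted retention rate per bucket across cohorts.

    Cohorts may have different numbers of buckets (recent cohorts have fewer),
    so each bucket only pools the cohorts that actually reached it.
    """
    max_buckets = max((len(c["counts"]) for c in cohorts.values()), default=0)
    retained = [0] * max_buckets
    eligible = [0] * max_buckets
    for cohort in cohorts.values():
        for bucket, count in enumerate(cohort["counts"]):
            retained[bucket] += count
            eligible[bucket] += cohort["first"]
    return [r / e if e else 0.0 for r, e in zip(retained, eligible)]


# Sample data mirroring the values shown on this page.
sample_cohorts = {
    "2025-01-01": {"first": 1000, "counts": [1000, 800, 650], "rates": [1.0, 0.8, 0.65]},
    "2025-01-08": {"first": 950, "counts": [950, 760], "rates": [1.0, 0.8]},
}

for bucket, rate in enumerate(pooled_rates(sample_cohorts)):
    print(f"bucket {bucket}: {rate:.1%}")
```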
DataFrame Structure
The DataFrame has one row per (cohort_date, bucket) pair:
| Column | Description |
|---|---|
| cohort_date | Date string identifying the cohort (users born on this date) |
| bucket | Retention bucket index (0 = born period, 1 = first return period, ...) |
| count | Number of users retained in this bucket |
| rate | Retention rate for this bucket (count / cohort size, 0.0–1.0) |
Cohort Data Structure
Each entry in result.cohorts is a dict with:
| Key | Type | Description |
|---|---|---|
| first | int | Cohort size: users who did the born event on this date |
| counts | list[int] | User counts retained per bucket |
| rates | list[float] | Retention rates per bucket (0.0–1.0) |
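The cohort dicts and the DataFrame carry the same information in two shapes. As a rough sketch of the relationship, the hypothetical helper cohort_rows below flattens a dict shaped like result.cohorts into one row per (cohort_date, bucket) pair, matching the columns described under DataFrame Structure. The sample data is copied from this page; the library's own conversion may differ in detail.

```python
def cohort_rows(cohorts: dict[str, dict]) -> list[dict]:
    """Flatten cohort dicts into rows with cohort_date, bucket, count, and rate."""
    rows = []
    for cohort_date in sorted(cohorts):  # YYYY-MM-DD strings sort chronologically
        data = cohorts[cohort_date]
        for bucket, (count, rate) in enumerate(zip(data["counts"], data["rates"])):
            rows.append(
                {"cohort_date": cohort_date, "bucket": bucket, "count": count, "rate": rate}
            )
    return rows


example_cohorts = {
    "2025-01-01": {"first": 1000, "counts": [1000, 800, 650], "rates": [1.0, 0.8, 0.65]},
    "2025-01-08": {"first": 950, "counts": [950, 760], "rates": [1.0, 0.8]},
}

for row in cohort_rows(example_cohorts):
    print(row)
```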
Persisting as a Saved Report
The generated bookmark params can be saved as a Mixpanel report:
from mixpanel_data import CreateBookmarkParams
result = ws.query_retention("Signup", "Login", retention_unit="week", last=90)
ws.create_bookmark(CreateBookmarkParams(
name="Signup → Login Retention (Weekly)",
bookmark_type="retention",
params=result.params,
))
Debugging
Inspect result.params to see the exact bookmark JSON sent to the API:
import json
result = ws.query_retention("Signup", "Login")
print(json.dumps(result.params, indent=2))
Validation
query_retention() validates all parameter combinations before making an API call and raises BookmarkValidationError with descriptive messages:
| Rule | Error code | Error message |
|---|---|---|
| Empty born event name | R1_EMPTY_BORN_EVENT | Born event name must be a non-empty string |
| Control chars in born event | R1_CONTROL_CHAR_BORN_EVENT | Born event name contains control characters |
| Empty return event name | R2_EMPTY_RETURN_EVENT | Return event name must be a non-empty string |
| Control chars in return event | R2_CONTROL_CHAR_RETURN_EVENT | Return event name contains control characters |
| Non-positive bucket sizes | R5_BUCKET_SIZES_POSITIVE | Each bucket size must be a positive integer |
| Float bucket sizes | R5_BUCKET_SIZES_INTEGER | Bucket sizes must be integers, not floats |
| Buckets not ascending | R6_BUCKET_SIZES_ASCENDING | Bucket sizes must be in strictly ascending order |
| Invalid retention unit | R7_INVALID_RETENTION_UNIT | Must be one of: day, week, month |
| Invalid alignment | R8_INVALID_ALIGNMENT | Must be one of: birth, interval_start |
| Invalid math | R9_INVALID_MATH | Must be one of: retention_rate, unique |
| Invalid mode | R10_INVALID_MODE | Must be one of: curve, trends, table |
| Invalid unit | R11_INVALID_UNIT | Must be one of: day, week, month |
Errors are collected — all validation issues are reported at once, not just the first:
from mixpanel_data import BookmarkValidationError
try:
ws.query_retention("", "Login", bucket_sizes=[5, 3, 1])
except BookmarkValidationError as e:
for error in e.errors:
print(f"[{error.code}] {error.path}: {error.message}")
# [R1_EMPTY_BORN_EVENT] born_event: Born event name must be a non-empty string
# [R6_BUCKET_SIZES_ASCENDING] bucket_sizes: Bucket sizes must be in strictly ascending order
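The collect-then-raise pattern behind this behaviour can be sketched without the library. Everything below is a hypothetical stand-in: SketchError, SketchValidationError, and validate are illustrative names, only three of the rules are covered, and the return_event path is an assumption. The error codes and messages are taken from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchError:
    code: str
    path: str
    message: str


class SketchValidationError(Exception):
    """Carries every collected error instead of only the first one."""

    def __init__(self, errors: list[SketchError]):
        super().__init__("; ".join(error.message for error in errors))
        self.errors = errors


def validate(born_event: str, return_event: str, bucket_sizes: Optional[list[int]] = None) -> None:
    errors = []
    if not born_event:
        errors.append(SketchError(
            "R1_EMPTY_BORN_EVENT", "born_event", "Born event name must be a non-empty string"))
    if not return_event:
        errors.append(SketchError(
            "R2_EMPTY_RETURN_EVENT", "return_event", "Return event name must be a non-empty string"))
    if bucket_sizes and any(b <= a for a, b in zip(bucket_sizes, bucket_sizes[1:])):
        errors.append(SketchError(
            "R6_BUCKET_SIZES_ASCENDING", "bucket_sizes",
            "Bucket sizes must be in strictly ascending order"))
    # Raise once, after every rule has run, so callers see all problems together.
    if errors:
        raise SketchValidationError(errors)


try:
    validate("", "Login", bucket_sizes=[5, 3, 1])
except SketchValidationError as exc:
    for error in exc.errors:
        print(f"[{error.code}] {error.path}: {error.message}")
```

Collecting errors this way lets a caller fix every invalid argument in one pass instead of discovering them one request at a time.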
Complete Examples
User Onboarding Retention
import mixpanel_data as mp
from mixpanel_data import RetentionEvent, Filter
ws = mp.Workspace()
# Weekly retention: do new signups come back?
result = ws.query_retention(
RetentionEvent("Signup", filters=[Filter.equals("source", "organic")]),
"Login",
retention_unit="week",
last=90,
group_by="platform",
)
# Inspect average retention curve
avg = result.average
print(f"Cohort size: {avg['first']}")
for i, rate in enumerate(avg['rates']):
print(f" Week {i}: {rate:.1%}")
# Export to DataFrame for further analysis
print(result.df)
Product Engagement
# Do users who complete onboarding keep making purchases?
result = ws.query_retention(
"Complete Onboarding",
"Purchase",
retention_unit="month",
last=180,
where=Filter.is_true("is_premium"),
)
# Custom bucket sizes for key retention milestones
milestone_retention = ws.query_retention(
"Signup",
"Login",
retention_unit="day",
bucket_sizes=[1, 3, 7, 14, 30, 60, 90],
last=90,
)
Retention Trends
# Track how retention is changing over time
result = ws.query_retention(
"Signup",
"Login",
mode="trends",
unit="week",
retention_unit="week",
last=180,
)
print(result.df)
Generating Params Without Querying
Use build_retention_params() to generate bookmark params without making an API call — useful for debugging, inspecting the generated JSON, or saving queries as reports:
# Same arguments as query_retention(), returns dict instead of RetentionQueryResult
params = ws.build_retention_params(
"Signup",
"Login",
retention_unit="week",
bucket_sizes=[1, 3, 7, 14, 30],
last=90,
)
import json
print(json.dumps(params, indent=2)) # inspect the generated bookmark JSON
# Save as a report directly from params
from mixpanel_data import CreateBookmarkParams
ws.create_bookmark(CreateBookmarkParams(
name="Signup → Login Retention (Custom Buckets)",
bookmark_type="retention",
params=params,
))
Next Steps
- Insights Queries — Typed analytics with DAU, formulas, filters, and breakdowns
- Funnel Queries — Typed funnel conversion analysis with steps, exclusions, and conversion windows
- Flow Queries — Typed flow path analysis with steps, directions, and graph output
- Live Analytics — Retention — Legacy retention method
- API Reference — Workspace — Full method signatures
- API Reference — Types — RetentionEvent, RetentionQueryResult, RetentionAlignment, RetentionMode, RetentionMathType details