Result Types

All result types are frozen (immutable) dataclasses with:

  • Lazy DataFrame conversion via the .df property
  • JSON serialization via the .to_dict() method
  • Full type hints for IDE/mypy support

Fetch Results

mixpanel_data.FetchResult dataclass

FetchResult(
    table: str,
    rows: int,
    type: Literal["events", "profiles"],
    duration_seconds: float,
    date_range: tuple[str, str] | None,
    fetched_at: datetime,
    _data: list[dict[str, Any]] = list(),
    _df_cache: DataFrame | None = None,
)

Result of a data fetch operation.

Represents the outcome of fetching events or profiles from Mixpanel and storing them in the local database.

table instance-attribute

table: str

Name of the created table.

rows instance-attribute

rows: int

Number of rows fetched.

type instance-attribute

type: Literal['events', 'profiles']

Type of data fetched.

duration_seconds instance-attribute

duration_seconds: float

Time taken to complete the fetch.

date_range instance-attribute

date_range: tuple[str, str] | None

Date range for events (None for profiles).

fetched_at instance-attribute

fetched_at: datetime

Timestamp when fetch completed.

df property

df: DataFrame

Convert result data to pandas DataFrame.

Conversion is lazy - computed on first access and cached.

RETURNS DESCRIPTION
DataFrame

DataFrame with fetched data.

to_dict

to_dict() -> dict[str, Any]

Serialize result for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary representation (excludes raw data). datetime values are converted to ISO format strings.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize result for JSON output.

    Returns:
        Dictionary representation (excludes raw data).
        datetime values are converted to ISO format strings.
    """
    return {
        "table": self.table,
        "rows": self.rows,
        "type": self.type,
        "duration_seconds": self.duration_seconds,
        "date_range": self.date_range,
        "fetched_at": self.fetched_at.isoformat(),
    }
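
As a sketch of what the serialized form looks like after a JSON round trip, the snippet below uses a hand-written payload shaped like the dictionary returned above. The values are made up for illustration. Two details are worth noticing: JSON has no tuple type, so `date_range` comes back as a list, and the ISO string can be parsed back with `datetime.fromisoformat`.

```python
import json
from datetime import datetime, timezone

# Hypothetical payload with the same keys as FetchResult.to_dict().
payload = {
    "table": "events_q1",
    "rows": 1200,
    "type": "events",
    "duration_seconds": 4.2,
    "date_range": ("2024-01-01", "2024-03-31"),
    "fetched_at": datetime(2024, 4, 1, 12, 0, tzinfo=timezone.utc).isoformat(),
}

encoded = json.dumps(payload)
decoded = json.loads(encoded)

# The tuple was encoded as a JSON array and decodes as a list.
print(decoded["date_range"])

# The ISO 8601 string converts back into a timezone-aware datetime.
fetched_at = datetime.fromisoformat(decoded["fetched_at"])
print(fetched_at)
```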

Parallel Fetch Types

Types for parallel event fetching with progress tracking and failure handling.

mixpanel_data.ParallelFetchResult dataclass

ParallelFetchResult(
    table: str,
    total_rows: int,
    successful_batches: int,
    failed_batches: int,
    failed_date_ranges: tuple[tuple[str, str], ...],
    duration_seconds: float,
    fetched_at: datetime,
)

Result of a parallel fetch operation.

Aggregates results from all batches, providing summary statistics and information about any failures for retry.

ATTRIBUTE DESCRIPTION
table

Name of the created/appended table.

TYPE: str

total_rows

Total number of rows fetched across all batches.

TYPE: int

successful_batches

Number of batches that completed successfully.

TYPE: int

failed_batches

Number of batches that failed.

TYPE: int

failed_date_ranges

Date ranges (from_date, to_date) of failed batches.

TYPE: tuple[tuple[str, str], ...]

duration_seconds

Total time taken for the parallel fetch.

TYPE: float

fetched_at

Timestamp when fetch completed.

TYPE: datetime

Example
result = ws.fetch_events(
    name="events",
    from_date="2024-01-01",
    to_date="2024-03-31",
    parallel=True,
)

if result.has_failures:
    print(f"Warning: {result.failed_batches} batches failed")
    for from_date, to_date in result.failed_date_ranges:
        print(f"  {from_date} to {to_date}")

table instance-attribute

table: str

Name of the created/appended table.

total_rows instance-attribute

total_rows: int

Total number of rows fetched across all batches.

successful_batches instance-attribute

successful_batches: int

Number of batches that completed successfully.

failed_batches instance-attribute

failed_batches: int

Number of batches that failed.

failed_date_ranges instance-attribute

failed_date_ranges: tuple[tuple[str, str], ...]

Date ranges (from_date, to_date) of failed batches for retry.
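
One way to use these ranges for a retry is sketched below. The helper `retry_failed_ranges` and the `fake_fetch` function are hypothetical and exist only for this illustration; in practice the callable would wrap whatever fetch call you use for a single date range.

```python
from collections.abc import Callable


def retry_failed_ranges(
    failed_date_ranges: tuple[tuple[str, str], ...],
    fetch_range: Callable[[str, str], int],
) -> tuple[int, list[tuple[str, str]]]:
    """Retry each failed range once; return (rows recovered, ranges still failing)."""
    recovered_rows = 0
    still_failing: list[tuple[str, str]] = []
    for from_date, to_date in failed_date_ranges:
        try:
            recovered_rows += fetch_range(from_date, to_date)
        except Exception:  # a real caller would catch a narrower error type
            still_failing.append((from_date, to_date))
    return recovered_rows, still_failing


def fake_fetch(from_date: str, to_date: str) -> int:
    """Hypothetical fetcher: fails for February, returns 100 rows otherwise."""
    if from_date.startswith("2024-02"):
        raise RuntimeError("rate limited")
    return 100


failed = (("2024-01-08", "2024-01-14"), ("2024-02-05", "2024-02-11"))
rows, remaining = retry_failed_ranges(failed, fake_fetch)
print(rows, remaining)
```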

duration_seconds instance-attribute

duration_seconds: float

Total time taken for the parallel fetch.

fetched_at instance-attribute

fetched_at: datetime

Timestamp when fetch completed.

has_failures property

has_failures: bool

Check if any batches failed.

RETURNS DESCRIPTION
bool

True if at least one batch failed, False otherwise.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all result fields including has_failures.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all result fields including has_failures.
    """
    return {
        "table": self.table,
        "total_rows": self.total_rows,
        "successful_batches": self.successful_batches,
        "failed_batches": self.failed_batches,
        "failed_date_ranges": [list(dr) for dr in self.failed_date_ranges],
        "duration_seconds": self.duration_seconds,
        "fetched_at": self.fetched_at.isoformat(),
        "has_failures": self.has_failures,
    }

mixpanel_data.BatchProgress dataclass

BatchProgress(
    from_date: str,
    to_date: str,
    batch_index: int,
    total_batches: int,
    rows: int,
    success: bool,
    error: str | None = None,
)

Progress update for a parallel fetch batch.

Sent to the on_batch_complete callback when a batch finishes (successfully or with error).

ATTRIBUTE DESCRIPTION
from_date

Start date of this batch (YYYY-MM-DD).

TYPE: str

to_date

End date of this batch (YYYY-MM-DD).

TYPE: str

batch_index

Zero-based index of this batch.

TYPE: int

total_batches

Total number of batches in the parallel fetch.

TYPE: int

rows

Number of rows fetched in this batch (0 if failed).

TYPE: int

success

Whether this batch completed successfully.

TYPE: bool

error

Error message if failed, None if successful.

TYPE: str | None

Example
def on_batch(progress: BatchProgress) -> None:
    status = "✓" if progress.success else "✗"
    print(f"[{status}] Batch {progress.batch_index + 1}/{progress.total_batches}")

result = ws.fetch_events(
    name="events",
    from_date="2024-01-01",
    to_date="2024-03-31",
    parallel=True,
    on_batch_complete=on_batch,
)

from_date instance-attribute

from_date: str

Start date of this batch (YYYY-MM-DD).

to_date instance-attribute

to_date: str

End date of this batch (YYYY-MM-DD).

batch_index instance-attribute

batch_index: int

Zero-based index of this batch.

total_batches instance-attribute

total_batches: int

Total number of batches in the parallel fetch.

rows instance-attribute

rows: int

Number of rows fetched in this batch (0 if failed).

success instance-attribute

success: bool

Whether this batch completed successfully.

error class-attribute instance-attribute

error: str | None = None

Error message if failed, None if successful.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all batch progress fields.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all batch progress fields.
    """
    return {
        "from_date": self.from_date,
        "to_date": self.to_date,
        "batch_index": self.batch_index,
        "total_batches": self.total_batches,
        "rows": self.rows,
        "success": self.success,
        "error": self.error,
    }

mixpanel_data.BatchResult dataclass

BatchResult(
    from_date: str,
    to_date: str,
    rows: int,
    success: bool,
    error: str | None = None,
)

Result of fetching a single date range chunk.

Internal type used by ParallelFetcherService to track batch outcomes. Contains either the fetched data (on success) or error info (on failure).

ATTRIBUTE DESCRIPTION
from_date

Start date of this batch (YYYY-MM-DD).

TYPE: str

to_date

End date of this batch (YYYY-MM-DD).

TYPE: str

rows

Number of rows fetched (0 if failed).

TYPE: int

success

Whether the batch completed successfully.

TYPE: bool

error

Exception message if failed, None if successful.

TYPE: str | None

Note

Data is not included in to_dict() as it's consumed by the writer thread and is not JSON-serializable (iterator of dicts).

from_date instance-attribute

from_date: str

Start date of this batch (YYYY-MM-DD).

to_date instance-attribute

to_date: str

End date of this batch (YYYY-MM-DD).

rows instance-attribute

rows: int

Number of rows fetched (0 if failed).

success instance-attribute

success: bool

Whether the batch completed successfully.

error class-attribute instance-attribute

error: str | None = None

Exception message if failed, None if successful.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output (excludes data).

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with batch result fields (excluding data).

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output (excludes data).

    Returns:
        Dictionary with batch result fields (excluding data).
    """
    return {
        "from_date": self.from_date,
        "to_date": self.to_date,
        "rows": self.rows,
        "success": self.success,
        "error": self.error,
    }

Parallel Profile Fetch Types

Types for parallel profile fetching with page-based progress tracking.

mixpanel_data.ParallelProfileResult dataclass

ParallelProfileResult(
    table: str,
    total_rows: int,
    successful_pages: int,
    failed_pages: int,
    failed_page_indices: tuple[int, ...],
    duration_seconds: float,
    fetched_at: datetime,
)

Result of a parallel profile fetch operation.

Aggregates results from all pages, providing summary statistics and information about any failures for retry.

ATTRIBUTE DESCRIPTION
table

Name of the created/appended table.

TYPE: str

total_rows

Total number of rows fetched across all pages.

TYPE: int

successful_pages

Number of pages that completed successfully.

TYPE: int

failed_pages

Number of pages that failed.

TYPE: int

failed_page_indices

Page indices of failed pages for retry.

TYPE: tuple[int, ...]

duration_seconds

Total time taken for the parallel fetch.

TYPE: float

fetched_at

Timestamp when fetch completed.

TYPE: datetime

Example
result = ws.fetch_profiles(
    name="users",
    parallel=True,
)

if result.has_failures:
    print(f"Warning: {result.failed_pages} pages failed")
    for idx in result.failed_page_indices:
        print(f"  Page {idx}")

table instance-attribute

table: str

Name of the created/appended table.

total_rows instance-attribute

total_rows: int

Total number of rows fetched across all pages.

successful_pages instance-attribute

successful_pages: int

Number of pages that completed successfully.

failed_pages instance-attribute

failed_pages: int

Number of pages that failed.

failed_page_indices instance-attribute

failed_page_indices: tuple[int, ...]

Page indices of failed pages for retry.

duration_seconds instance-attribute

duration_seconds: float

Total time taken for the parallel fetch.

fetched_at instance-attribute

fetched_at: datetime

Timestamp when fetch completed.

has_failures property

has_failures: bool

Check if any pages failed.

RETURNS DESCRIPTION
bool

True if at least one page failed, False otherwise.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all result fields including has_failures.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all result fields including has_failures.
    """
    return {
        "table": self.table,
        "total_rows": self.total_rows,
        "successful_pages": self.successful_pages,
        "failed_pages": self.failed_pages,
        "failed_page_indices": list(self.failed_page_indices),
        "duration_seconds": self.duration_seconds,
        "fetched_at": self.fetched_at.isoformat(),
        "has_failures": self.has_failures,
    }

mixpanel_data.ProfileProgress dataclass

ProfileProgress(
    page_index: int,
    total_pages: int | None,
    rows: int,
    success: bool,
    error: str | None,
    cumulative_rows: int,
)

Progress update for a parallel profile fetch page.

Sent to the on_page_complete callback when a page finishes (successfully or with error). Used for progress visibility during parallel profile fetching operations.

ATTRIBUTE DESCRIPTION
page_index

Zero-based index of this page.

TYPE: int

total_pages

Total pages if known, None if not yet determined.

TYPE: int | None

rows

Number of rows fetched in this page (0 if failed).

TYPE: int

success

Whether this page completed successfully.

TYPE: bool

error

Error message if failed, None if successful.

TYPE: str | None

cumulative_rows

Total rows fetched so far across all pages.

TYPE: int

Example
def on_page(progress: ProfileProgress) -> None:
    status = "✓" if progress.success else "✗"
    pct = f"{progress.page_index + 1}/{progress.total_pages}" if progress.total_pages else f"{progress.page_index + 1}/?"
    print(f"[{status}] Page {pct}: {progress.cumulative_rows} total rows")

result = ws.fetch_profiles(
    name="users",
    parallel=True,
    on_page_complete=on_page,
)

page_index instance-attribute

page_index: int

Zero-based index of this page.

total_pages instance-attribute

total_pages: int | None

Total pages if known, None if not yet determined.

rows instance-attribute

rows: int

Number of rows fetched in this page (0 if failed).

success instance-attribute

success: bool

Whether this page completed successfully.

error instance-attribute

error: str | None

Error message if failed, None if successful.

cumulative_rows instance-attribute

cumulative_rows: int

Total rows fetched so far across all pages.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all profile progress fields.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all profile progress fields.
    """
    return {
        "page_index": self.page_index,
        "total_pages": self.total_pages,
        "rows": self.rows,
        "success": self.success,
        "error": self.error,
        "cumulative_rows": self.cumulative_rows,
    }

mixpanel_data.ProfilePageResult dataclass

ProfilePageResult(
    profiles: list[dict[str, Any]],
    session_id: str | None,
    page: int,
    has_more: bool,
    total: int,
    page_size: int,
)

Result from fetching a single page of profiles.

Contains the profiles from one page of the Engage API along with pagination metadata for fetching subsequent pages.

ATTRIBUTE DESCRIPTION
profiles

List of profile dictionaries from this page.

TYPE: list[dict[str, Any]]

session_id

Session ID for fetching next page, None if no more pages.

TYPE: str | None

page

Zero-based page index that was fetched.

TYPE: int

has_more

True if there are more pages to fetch.

TYPE: bool

total

Total number of profiles matching the query across all pages.

TYPE: int

page_size

Number of profiles per page (typically 1000).

TYPE: int

Example
# Fetch first page to get pagination metadata
result = api_client.export_profiles_page(page=0)
all_profiles = list(result.profiles)

# Pre-compute total pages for parallel fetching
total_pages = result.num_pages
print(f"Fetching {total_pages} pages ({result.total} profiles)")

# Continue fetching if more pages
while result.has_more:
    result = api_client.export_profiles_page(
        page=result.page + 1,
        session_id=result.session_id,
    )
    all_profiles.extend(result.profiles)

profiles instance-attribute

profiles: list[dict[str, Any]]

List of profile dictionaries from this page.

session_id instance-attribute

session_id: str | None

Session ID for fetching next page, None if no more pages.

page instance-attribute

page: int

Zero-based page index that was fetched.

has_more instance-attribute

has_more: bool

True if there are more pages to fetch.

total instance-attribute

total: int

Total number of profiles matching the query across all pages.

page_size instance-attribute

page_size: int

Number of profiles per page (typically 1000).

num_pages property

num_pages: int

Calculate total number of pages needed.

Uses ceiling division to ensure partial pages are counted.

RETURNS DESCRIPTION
int

Total pages needed to fetch all profiles. Returns 0 if total is 0 (empty result set).

Example
result = api_client.export_profiles_page(page=0)
# If total=5432 and page_size=1000, num_pages=6
for page_idx in range(1, result.num_pages):
    ...  # Fetch remaining pages
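
The ceiling division described for `num_pages` can be written with integer arithmetic only. The standalone function below is a hypothetical sketch of that calculation, not the library's property; the exact expression used in `types.py` is not shown here.

```python
def num_pages(total: int, page_size: int) -> int:
    """Ceiling division without floats.

    For total == 0 the expression evaluates to 0, so an empty
    result set needs no special case.
    """
    return (total + page_size - 1) // page_size


print(num_pages(5432, 1000))  # 6: five full pages plus one partial page
print(num_pages(5000, 1000))  # 5: exact multiple, no extra page
print(num_pages(0, 1000))     # 0: empty result set
```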

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all page result fields including pagination metadata.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all page result fields including pagination metadata.
    """
    return {
        "profiles": self.profiles,
        "session_id": self.session_id,
        "page": self.page,
        "has_more": self.has_more,
        "profile_count": len(self.profiles),
        "total": self.total,
        "page_size": self.page_size,
        "num_pages": self.num_pages,
    }

Query Results

mixpanel_data.SegmentationResult dataclass

SegmentationResult(
    event: str,
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    segment_property: str | None,
    total: int,
    series: dict[str, dict[str, int]] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Result of a segmentation query.

Contains time-series data for an event, optionally segmented by a property.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

event instance-attribute

event: str

Queried event name.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

unit instance-attribute

unit: Literal['day', 'week', 'month']

Time unit for aggregation.

segment_property instance-attribute

segment_property: str | None

Property used for segmentation (None if total only).

total instance-attribute

total: int

Total count across all segments and time periods.

series class-attribute instance-attribute

series: dict[str, dict[str, int]] = field(default_factory=dict)

Time series data by segment.

Structure: {segment_name: {date_string: count}}

Example: {"US": {"2024-01-01": 150, "2024-01-02": 200}, "EU": {...}}

For unsegmented queries, segment_name is "total".

df property

df: DataFrame

Convert to DataFrame with columns: date, segment, count.

For unsegmented queries, segment column is 'total'.
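
To make the reshaping concrete, here is a sketch that flattens a `series` mapping into long-format rows with the documented column names. The function `series_to_rows` is hypothetical, and the row order is an assumption; the library's own ordering is not specified here. Passing such rows to `pandas.DataFrame` would give the date, segment and count columns.

```python
from typing import Any


def series_to_rows(series: dict[str, dict[str, int]]) -> list[dict[str, Any]]:
    """Flatten {segment: {date: count}} into one row per (date, segment)."""
    rows: list[dict[str, Any]] = []
    for segment, counts in series.items():
        for date, count in counts.items():
            rows.append({"date": date, "segment": segment, "count": count})
    return rows


segmented = {
    "US": {"2024-01-01": 150, "2024-01-02": 200},
    "EU": {"2024-01-01": 90},
}
rows = series_to_rows(segmented)

# An unsegmented query uses the single segment name "total".
unsegmented_rows = series_to_rows({"total": {"2024-01-01": 240}})
```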

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "segment_property": self.segment_property,
        "total": self.total,
        "series": self.series,
    }

mixpanel_data.FunnelResult dataclass

FunnelResult(
    funnel_id: int,
    funnel_name: str,
    from_date: str,
    to_date: str,
    conversion_rate: float,
    steps: list[FunnelStep] = list(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Result of a funnel query.

Contains step-by-step conversion data for a funnel.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

funnel_id instance-attribute

funnel_id: int

Funnel identifier.

funnel_name instance-attribute

funnel_name: str

Funnel display name.

from_date instance-attribute

from_date: str

Query start date.

to_date instance-attribute

to_date: str

Query end date.

conversion_rate instance-attribute

conversion_rate: float

Overall conversion rate (0.0 to 1.0).

steps class-attribute instance-attribute

steps: list[FunnelStep] = field(default_factory=list)

Step-by-step breakdown.

df property

df: DataFrame

Convert to DataFrame with columns: step, event, count, conversion_rate.
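
The sketch below shows one plausible shape for those rows and how step-to-step rates relate to the overall rate. It rests on assumptions that are not confirmed here: steps are numbered from 1, the first step converts at 1.0, and each rate is the step count divided by the previous step's count. `funnel_rows` is a hypothetical helper.

```python
from typing import Any


def funnel_rows(events: list[str], counts: list[int]) -> list[dict[str, Any]]:
    """Build one row per funnel step with conversion from the previous step."""
    rows: list[dict[str, Any]] = []
    previous: int | None = None
    for step, (event, count) in enumerate(zip(events, counts), start=1):
        if previous is None:
            rate = 1.0  # assumption: the first step converts at 1.0
        elif previous == 0:
            rate = 0.0  # avoid dividing by zero when nobody reached the prior step
        else:
            rate = count / previous
        rows.append(
            {"step": step, "event": event, "count": count, "conversion_rate": rate}
        )
        previous = count
    return rows


rows = funnel_rows(["Signup", "Activate", "Purchase"], [1000, 400, 100])
# Overall conversion compares the last step with the first.
overall = rows[-1]["count"] / rows[0]["count"]
print(overall)
```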

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "funnel_id": self.funnel_id,
        "funnel_name": self.funnel_name,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "conversion_rate": self.conversion_rate,
        "steps": [step.to_dict() for step in self.steps],
    }

mixpanel_data.FunnelStep dataclass

FunnelStep(event: str, count: int, conversion_rate: float)

Single step in a funnel.

event instance-attribute

event: str

Event name for this step.

count instance-attribute

count: int

Number of users at this step.

conversion_rate instance-attribute

conversion_rate: float

Conversion rate from previous step (0.0 to 1.0).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "count": self.count,
        "conversion_rate": self.conversion_rate,
    }

mixpanel_data.RetentionResult dataclass

RetentionResult(
    born_event: str,
    return_event: str,
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    cohorts: list[CohortInfo] = list(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Result of a retention query.

Contains cohort-based retention data.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

born_event instance-attribute

born_event: str

Event that defines cohort membership.

return_event instance-attribute

return_event: str

Event that defines return.

from_date instance-attribute

from_date: str

Query start date.

to_date instance-attribute

to_date: str

Query end date.

unit instance-attribute

unit: Literal['day', 'week', 'month']

Time unit for retention periods.

cohorts class-attribute instance-attribute

cohorts: list[CohortInfo] = field(default_factory=list)

Cohort retention data.

df property

df: DataFrame

Convert to DataFrame with columns: cohort_date, cohort_size, period_N.
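
A sketch of the wide layout implied by `period_N` follows. It takes cohorts in the dictionary shape produced by `CohortInfo.to_dict()` and assumes periods are numbered from 0, which is not confirmed here. `cohorts_to_rows` is a hypothetical helper. More recent cohorts have fewer observed periods, so their rows carry fewer `period_N` keys; a DataFrame built from such rows would show those cells as missing.

```python
from typing import Any


def cohorts_to_rows(cohorts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Turn cohort dicts into wide rows: cohort_date, cohort_size, period_N."""
    rows: list[dict[str, Any]] = []
    for cohort in cohorts:
        row: dict[str, Any] = {
            "cohort_date": cohort["date"],
            "cohort_size": cohort["size"],
        }
        # Assumption: period numbering starts at 0.
        for period, rate in enumerate(cohort["retention"]):
            row[f"period_{period}"] = rate
        rows.append(row)
    return rows


cohorts = [
    {"date": "2024-01-01", "size": 200, "retention": [1.0, 0.45]},
    {"date": "2024-01-08", "size": 180, "retention": [1.0]},
]
rows = cohorts_to_rows(cohorts)
```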

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "born_event": self.born_event,
        "return_event": self.return_event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "cohorts": [cohort.to_dict() for cohort in self.cohorts],
    }

mixpanel_data.CohortInfo dataclass

CohortInfo(date: str, size: int, retention: list[float] = list())

Retention data for a single cohort.

date instance-attribute

date: str

Cohort date (when users were 'born').

size instance-attribute

size: int

Number of users in cohort.

retention class-attribute instance-attribute

retention: list[float] = field(default_factory=list)

Retention rates by period, expressed as fractions (0.0 to 1.0).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "date": self.date,
        "size": self.size,
        "retention": self.retention,
    }

mixpanel_data.JQLResult dataclass

JQLResult(_raw: list[Any] = list(), *, _df_cache: DataFrame | None = None)

Bases: ResultWithDataFrame

Result of a JQL query.

JQL (JavaScript Query Language) allows custom queries against Mixpanel data.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

The df property intelligently detects JQL result patterns (groupBy, percentiles, simple dicts) and converts them to clean tabular format.

raw property

raw: list[Any]

Raw result data from JQL execution.

df property

df: DataFrame

Convert result to DataFrame with intelligent structure detection.

The conversion strategy depends on the detected JQL result pattern:

groupBy results (detected by {key: [...], value: X} structure):

  • Keys expanded to columns: key_0, key_1, key_2, ...
  • Single value: "value" column
  • Multiple reducers (value array): value_0, value_1, value_2, ...
  • Additional fields (from .map()): preserved as-is
  • Example: {"key": ["US"], "value": 100, "name": "USA"} -> columns: key_0, value, name

Nested percentile results ([[{percentile: X, value: Y}, ...]]):

  • Outer list unwrapped, inner dicts converted directly

Simple list of dicts (already well-structured):

  • Converted directly to DataFrame preserving all fields

Fallback for other structures (scalars, mixed types, incompatible dicts):

  • Safely wrapped in single "value" column to prevent data loss
  • Used when structure doesn't match known patterns

RAISES DESCRIPTION
ValueError

If groupBy structure has inconsistent value types across rows (some scalar, some array) which indicates malformed query results.

RETURNS DESCRIPTION
DataFrame

DataFrame representation, cached after first access.
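
The groupBy rule can be sketched as a plain function over dictionaries. This is an illustration of the documented column naming, not the library's detection code: `expand_group_by` is hypothetical and assumes every row already has the `key` and `value` fields.

```python
from typing import Any


def expand_group_by(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten {key: [...], value: X} rows into column-style dicts."""
    value_is_list = {isinstance(row["value"], list) for row in rows}
    if len(value_is_list) > 1:
        # Some rows hold a scalar and others an array.
        raise ValueError("inconsistent value types across groupBy rows")

    flat_rows: list[dict[str, Any]] = []
    for row in rows:
        flat: dict[str, Any] = {}
        for i, key_part in enumerate(row["key"]):
            flat[f"key_{i}"] = key_part
        if isinstance(row["value"], list):
            # Multiple reducers: one column per reducer result.
            for i, reducer_value in enumerate(row["value"]):
                flat[f"value_{i}"] = reducer_value
        else:
            flat["value"] = row["value"]
        # Extra fields (for example from .map()) pass through unchanged.
        for name, extra in row.items():
            if name not in ("key", "value"):
                flat[name] = extra
        flat_rows.append(flat)
    return flat_rows


single = expand_group_by([{"key": ["US"], "value": 100, "name": "USA"}])
multi = expand_group_by([{"key": ["US", "iOS"], "value": [3, 7]}])

raised = False
try:
    expand_group_by([{"key": ["US"], "value": 1}, {"key": ["EU"], "value": [1, 2]}])
except ValueError:
    raised = True
```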

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "raw": self._raw,
        "row_count": len(self._raw),
    }

Discovery Types

mixpanel_data.FunnelInfo dataclass

FunnelInfo(funnel_id: int, name: str)

A saved funnel definition.

Represents a funnel saved in Mixpanel that can be queried using the funnel() method.

funnel_id instance-attribute

funnel_id: int

Unique identifier for funnel queries.

name instance-attribute

name: str

Human-readable funnel name.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "funnel_id": self.funnel_id,
        "name": self.name,
    }

mixpanel_data.SavedCohort dataclass

SavedCohort(
    id: int,
    name: str,
    count: int,
    description: str,
    created: str,
    is_visible: bool,
)

A saved cohort definition.

Represents a user cohort saved in Mixpanel for profile filtering.

id instance-attribute

id: int

Unique identifier for profile filtering.

name instance-attribute

name: str

Human-readable cohort name.

count instance-attribute

count: int

Current number of users in cohort.

description instance-attribute

description: str

Optional description (may be empty string).

created instance-attribute

created: str

Creation timestamp (YYYY-MM-DD HH:mm:ss).

is_visible instance-attribute

is_visible: bool

Whether cohort is visible in Mixpanel UI.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "id": self.id,
        "name": self.name,
        "count": self.count,
        "description": self.description,
        "created": self.created,
        "is_visible": self.is_visible,
    }

mixpanel_data.TopEvent dataclass

TopEvent(event: str, count: int, percent_change: float)

Today's event activity data.

Represents an event's current activity including count and trend.

event instance-attribute

event: str

Event name.

count instance-attribute

count: int

Today's event count.

percent_change instance-attribute

percent_change: float

Change vs yesterday (-1.0 to +infinity).
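
The stated range is consistent with a relative change of the form (today - yesterday) / yesterday, where -1.0 means the event dropped to zero and there is no upper bound on growth. That formula is an assumption; only the range is given here. A worked sketch under that assumption, using a hypothetical `relative_change` helper:

```python
def relative_change(today: int, yesterday: int) -> float:
    """Relative change vs yesterday (assumed formula, see the note above)."""
    if yesterday == 0:
        # The ratio is undefined with no baseline; how the API reports this
        # case is not documented here.
        raise ValueError("no baseline: yesterday's count is zero")
    return (today - yesterday) / yesterday


print(relative_change(150, 100))  # 0.5, a 50% increase
print(relative_change(0, 100))    # -1.0, the event stopped entirely
print(relative_change(300, 100))  # 2.0, growth has no upper bound
```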

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "count": self.count,
        "percent_change": self.percent_change,
    }

Lexicon Types

mixpanel_data.LexiconSchema dataclass

LexiconSchema(entity_type: str, name: str, schema_json: LexiconDefinition)

Complete schema definition from Mixpanel Lexicon.

Represents a documented event or profile property definition from the Mixpanel data dictionary.

entity_type instance-attribute

entity_type: str

Type of entity (e.g., 'event', 'profile', 'custom_event', 'group').

name instance-attribute

name: str

Name of the event or profile property.

schema_json instance-attribute

schema_json: LexiconDefinition

Full schema definition.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with entity_type, name, and schema_json.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with entity_type, name, and schema_json.
    """
    return {
        "entity_type": self.entity_type,
        "name": self.name,
        "schema_json": self.schema_json.to_dict(),
    }

mixpanel_data.LexiconDefinition dataclass

LexiconDefinition(
    description: str | None,
    properties: dict[str, LexiconProperty],
    metadata: LexiconMetadata | None,
)

Full schema definition for an event or profile property in Lexicon.

Contains the structural definition including description, properties, and platform-specific metadata.

description instance-attribute

description: str | None

Human-readable description of the entity.

properties instance-attribute

properties: dict[str, LexiconProperty]

Property definitions keyed by property name.

metadata instance-attribute

metadata: LexiconMetadata | None

Optional Mixpanel-specific metadata for the entity.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with properties, and optionally description and metadata.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with properties, and optionally description and metadata.
    """
    result: dict[str, Any] = {
        "properties": {k: v.to_dict() for k, v in self.properties.items()},
    }
    if self.description is not None:
        result["description"] = self.description
    if self.metadata is not None:
        result["metadata"] = self.metadata.to_dict()
    return result

mixpanel_data.LexiconProperty dataclass

LexiconProperty(
    type: str, description: str | None, metadata: LexiconMetadata | None
)

Schema definition for a single property in a Lexicon schema.

Describes the type and metadata for an event or profile property.

type instance-attribute

type: str

JSON Schema type (string, number, boolean, array, object, integer, null).

description instance-attribute

description: str | None

Human-readable description of the property.

metadata instance-attribute

metadata: LexiconMetadata | None

Optional Mixpanel-specific metadata.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with type, and optionally description and metadata.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with type, and optionally description and metadata.
    """
    result: dict[str, Any] = {"type": self.type}
    if self.description is not None:
        result["description"] = self.description
    if self.metadata is not None:
        result["metadata"] = self.metadata.to_dict()
    return result

mixpanel_data.LexiconMetadata dataclass

LexiconMetadata(
    source: str | None,
    display_name: str | None,
    tags: list[str],
    hidden: bool,
    dropped: bool,
    contacts: list[str],
    team_contacts: list[str],
)

Mixpanel-specific metadata for Lexicon schemas and properties.

Contains platform-specific information about how schemas and properties are displayed and organized in the Mixpanel UI.

source instance-attribute

source: str | None

Origin of the schema definition (e.g., 'api', 'csv', 'ui').

display_name instance-attribute

display_name: str | None

Human-readable display name in Mixpanel UI.

tags instance-attribute

tags: list[str]

Categorization tags for organization.

hidden instance-attribute

hidden: bool

Whether hidden from Mixpanel UI.

dropped instance-attribute

dropped: bool

Whether data is dropped/ignored.

contacts instance-attribute

contacts: list[str]

Owner email addresses.

team_contacts instance-attribute

team_contacts: list[str]

Team ownership labels.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all metadata fields.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all metadata fields.
    """
    return {
        "source": self.source,
        "display_name": self.display_name,
        "tags": self.tags,
        "hidden": self.hidden,
        "dropped": self.dropped,
        "contacts": self.contacts,
        "team_contacts": self.team_contacts,
    }
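As a rough illustration of consuming this output, the sketch below walks a dictionary shaped like the to_dict() results above. The sample payload and the helper name describe_properties are invented for the example; only the key names come from the source shown here.

```python
from typing import Any


def describe_properties(schema: dict[str, Any]) -> list[str]:
    """Return one 'name: type' line per property, sorted by name."""
    lines = []
    for name, prop in sorted(schema["properties"].items()):
        line = f"{name}: {prop['type']}"
        # "description" and "metadata" are omitted when None, so use .get().
        description = prop.get("description")
        if description is not None:
            line += f" ({description})"
        metadata = prop.get("metadata")
        if metadata is not None and metadata["hidden"]:
            line += " [hidden]"
        lines.append(line)
    return lines


# Hypothetical payload in the shape produced by to_dict().
sample_schema = {
    "properties": {
        "plan": {"type": "string", "description": "Subscription tier"},
        "seats": {
            "type": "integer",
            "metadata": {
                "source": "api",
                "display_name": "Seats",
                "tags": [],
                "hidden": True,
                "dropped": False,
                "contacts": [],
                "team_contacts": [],
            },
        },
    },
}

summary = describe_properties(sample_schema)
```

Because optional keys are left out rather than set to null, readers of the serialized form need .get() (or an explicit membership test) for description and metadata, while every key inside a metadata dictionary is always present.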

Event Analytics Results

mixpanel_data.EventCountsResult dataclass

EventCountsResult(
    events: list[str],
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    type: Literal["general", "unique", "average"],
    series: dict[str, dict[str, int]],
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Time-series event count data.

Contains aggregate counts for multiple events over time with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method
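How a frozen dataclass can still cache a lazily computed value is not shown on this page. One common approach, sketched here with a hypothetical LazyTotal class rather than the library's actual ResultWithDataFrame code, is to write the cache slot through object.__setattr__ on first access.

```python
from dataclasses import dataclass, field
from typing import Optional

build_calls = 0  # counts how often the expensive step actually runs


@dataclass(frozen=True)
class LazyTotal:
    """Hypothetical stand-in for a frozen result with a lazily cached view."""

    values: tuple[int, ...]
    _total_cache: Optional[int] = field(default=None, repr=False, compare=False)

    @property
    def total(self) -> int:
        global build_calls
        if self._total_cache is None:
            build_calls += 1
            # Frozen dataclasses reject normal attribute assignment, so the
            # cache slot is written through object.__setattr__ instead.
            object.__setattr__(self, "_total_cache", sum(self.values))
        return self._total_cache


result = LazyTotal(values=(3, 4, 5))
first = result.total   # computed on first access
second = result.total  # served from the cache
```

The public fields stay immutable from the caller's point of view; only the private cache slot changes, and only once.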

events instance-attribute

events: list[str]

Queried event names.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

unit instance-attribute

unit: Literal['day', 'week', 'month']

Time unit for aggregation.

type instance-attribute

type: Literal['general', 'unique', 'average']

Counting method used.

series instance-attribute

series: dict[str, dict[str, int]]

Time series data: {event_name: {date: count}}.

df property

df: DataFrame

Convert to DataFrame with columns: date, event, count.

Conversion is lazy - computed on first access and cached.
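To make the date, event, count layout concrete, here is a minimal sketch that flattens a series mapping into long-format rows using only the standard library. The helper name series_to_rows, the sample data, and the sort order are assumptions for illustration; the library's own conversion may order rows or type the date column differently.

```python
def series_to_rows(series):
    """Flatten {event_name: {date: count}} into date/event/count rows."""
    rows = [
        {"date": date, "event": event, "count": count}
        for event, by_date in series.items()
        for date, count in by_date.items()
    ]
    # Sort by date, then event, for a stable chronological order.
    rows.sort(key=lambda row: (row["date"], row["event"]))
    return rows


# Hypothetical series in the documented {event_name: {date: count}} shape.
sample_series = {
    "Signup": {"2024-01-01": 10, "2024-01-02": 12},
    "Purchase": {"2024-01-01": 2},
}
rows = series_to_rows(sample_series)
```

Passing such a list of dictionaries to pandas.DataFrame should give a frame with the same three columns.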

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "events": self.events,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "type": self.type,
        "series": self.series,
    }

mixpanel_data.PropertyCountsResult dataclass

PropertyCountsResult(
    event: str,
    property_name: str,
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    type: Literal["general", "unique", "average"],
    series: dict[str, dict[str, int]],
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Time-series property value distribution data.

Contains aggregate counts by property values over time with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

event instance-attribute

event: str

Queried event name.

property_name instance-attribute

property_name: str

Property used for segmentation.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

unit instance-attribute

unit: Literal['day', 'week', 'month']

Time unit for aggregation.

type instance-attribute

type: Literal['general', 'unique', 'average']

Counting method used.

series instance-attribute

series: dict[str, dict[str, int]]

Time series data by property value.

Structure: {property_value: {date: count}}

Example: {"US": {"2024-01-01": 150, "2024-01-02": 200}, "EU": {...}}
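A small sketch of working with this structure directly: summing each property value's counts across the date range and picking the largest. The helper name totals_by_value and the "EU" figures are made up for the example; the "US" figures repeat the example above.

```python
def totals_by_value(series):
    """Sum counts over all dates for each property value."""
    return {value: sum(by_date.values()) for value, by_date in series.items()}


# Hypothetical series in the {property_value: {date: count}} shape.
sample_series = {
    "US": {"2024-01-01": 150, "2024-01-02": 200},
    "EU": {"2024-01-01": 40, "2024-01-02": 60},
}
totals = totals_by_value(sample_series)
top_value = max(totals, key=totals.get)
```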

df property

df: DataFrame

Convert to DataFrame with columns: date, value, count.

Conversion is lazy - computed on first access and cached.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "property_name": self.property_name,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "type": self.type,
        "series": self.series,
    }

Advanced Query Results

mixpanel_data.UserEvent dataclass

UserEvent(event: str, time: datetime, properties: dict[str, Any] = dict())

Single event in a user's activity feed.

Represents one event from a user's event history with timestamp and all associated properties.

event instance-attribute

event: str

Event name.

time instance-attribute

time: datetime

Event timestamp (UTC).

properties class-attribute instance-attribute

properties: dict[str, Any] = field(default_factory=dict)

All event properties including system properties.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "time": self.time.isoformat(),
        "properties": self.properties,
    }
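Because time is serialized with isoformat(), the resulting dictionary can go straight through json.dumps and the timestamp can be parsed back. The sketch below uses a hand-built dictionary in the same shape; the event name and property are invented.

```python
import json
from datetime import datetime, timezone

event_time = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)

# Hypothetical payload mirroring the keys returned by UserEvent.to_dict().
payload = {
    "event": "Signup",
    "time": event_time.isoformat(),
    "properties": {"plan": "pro"},
}

encoded = json.dumps(payload)
# ISO 8601 strings written by isoformat() round-trip via fromisoformat().
restored_time = datetime.fromisoformat(json.loads(encoded)["time"])
```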

mixpanel_data.ActivityFeedResult dataclass

ActivityFeedResult(
    distinct_ids: list[str],
    from_date: str | None,
    to_date: str | None,
    events: list[UserEvent] = list(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Collection of user events from activity feed query.

Contains chronological event history for one or more users with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

distinct_ids instance-attribute

distinct_ids: list[str]

Queried user identifiers.

from_date instance-attribute

from_date: str | None

Start date filter (YYYY-MM-DD), None if not specified.

to_date instance-attribute

to_date: str | None

End date filter (YYYY-MM-DD), None if not specified.

events class-attribute instance-attribute

events: list[UserEvent] = field(default_factory=list)

Event history (chronological order).

df property

df: DataFrame

Convert to DataFrame with columns: event, time, distinct_id, plus one column per event property.

Flattens event properties into individual columns. Conversion is lazy - computed on first access and cached.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "distinct_ids": self.distinct_ids,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "event_count": len(self.events),
        "events": [e.to_dict() for e in self.events],
    }

mixpanel_data.FrequencyResult dataclass

FrequencyResult(
    event: str | None,
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    addiction_unit: Literal["hour", "day"],
    data: dict[str, list[int]] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Event frequency distribution (addiction analysis).

Contains frequency arrays showing how many users performed events in N time periods, with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

event instance-attribute

event: str | None

Filtered event name (None = all events).

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

unit instance-attribute

unit: Literal['day', 'week', 'month']

Overall time period.

addiction_unit instance-attribute

addiction_unit: Literal['hour', 'day']

Measurement granularity.

data class-attribute instance-attribute

data: dict[str, list[int]] = field(default_factory=dict)

Frequency arrays by date.

Structure: {date: [count_1, count_2, ...]}

Example: {"2024-01-01": [100, 50, 25, 10]}

Each array shows user counts by frequency:

  • Index 0: users active exactly 1 time
  • Index 1: users active exactly 2 times
  • Index N: users active exactly N+1 times

df property

df: DataFrame

Convert to DataFrame with columns: date, period_1, period_2, ...

Each period_N column shows users active in at least N time periods. Conversion is lazy - computed on first access and cached.
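The date, period_1, period_2, ... layout can be pictured with a short standard-library sketch that turns each array position into a numbered column. The helper name frequency_rows is hypothetical, and the real conversion may handle arrays of unequal length differently.

```python
def frequency_rows(data):
    """Turn {date: [count_1, count_2, ...]} into one row per date."""
    rows = []
    for date in sorted(data):
        row = {"date": date}
        # Array index 0 becomes period_1, index 1 becomes period_2, and so on.
        for period, users in enumerate(data[date], start=1):
            row[f"period_{period}"] = users
        rows.append(row)
    return rows


# Same shape as the documented example for the data field.
sample_data = {"2024-01-01": [100, 50, 25, 10]}
rows = frequency_rows(sample_data)
```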

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "addiction_unit": self.addiction_unit,
        "data": self.data,
    }

mixpanel_data.NumericBucketResult dataclass

NumericBucketResult(
    event: str,
    from_date: str,
    to_date: str,
    property_expr: str,
    unit: Literal["hour", "day"],
    series: dict[str, dict[str, int]] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Events segmented into numeric property ranges.

Contains time-series data bucketed by automatically determined numeric ranges, with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

event instance-attribute

event: str

Queried event name.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

property_expr instance-attribute

property_expr: str

The 'on' expression used for bucketing.

unit instance-attribute

unit: Literal['hour', 'day']

Time aggregation unit.

series class-attribute instance-attribute

series: dict[str, dict[str, int]] = field(default_factory=dict)

Bucket data: {range_string: {date: count}}.

df property

df: DataFrame

Convert to DataFrame with columns: date, bucket, count.

Conversion is lazy - computed on first access and cached.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "property_expr": self.property_expr,
        "unit": self.unit,
        "series": self.series,
    }

mixpanel_data.NumericSumResult dataclass

NumericSumResult(
    event: str,
    from_date: str,
    to_date: str,
    property_expr: str,
    unit: Literal["hour", "day"],
    results: dict[str, float] = dict(),
    computed_at: str | None = None,
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Sum of numeric property values per time unit.

Contains daily or hourly sum totals for a numeric property with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

event instance-attribute

event: str

Queried event name.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

property_expr instance-attribute

property_expr: str

The 'on' expression summed.

unit instance-attribute

unit: Literal['hour', 'day']

Time aggregation unit.

results class-attribute instance-attribute

results: dict[str, float] = field(default_factory=dict)

Sum values: {date: sum}.

computed_at class-attribute instance-attribute

computed_at: str | None = None

Computation timestamp (if provided by API).

df property

df: DataFrame

Convert to DataFrame with columns: date, sum.

Conversion is lazy - computed on first access and cached.
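As a sketch of the date, sum layout, the snippet below orders a {date: sum} mapping chronologically and adds up the period total. The helper name sum_rows and the figures are illustrative only, and sorting keys as strings assumes YYYY-MM-DD date keys; hourly keys may need different handling.

```python
def sum_rows(results):
    """Turn {date: sum} into date/sum rows in chronological order."""
    # ISO dates (YYYY-MM-DD) sort chronologically when compared as strings.
    return [{"date": date, "sum": results[date]} for date in sorted(results)]


# Hypothetical results mapping in the documented {date: sum} shape.
sample_results = {"2024-01-02": 80.5, "2024-01-01": 120.0}
rows = sum_rows(sample_results)
grand_total = sum(row["sum"] for row in rows)
```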

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    result: dict[str, Any] = {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "property_expr": self.property_expr,
        "unit": self.unit,
        "results": self.results,
    }
    if self.computed_at is not None:
        result["computed_at"] = self.computed_at
    return result

mixpanel_data.NumericAverageResult dataclass

NumericAverageResult(
    event: str,
    from_date: str,
    to_date: str,
    property_expr: str,
    unit: Literal["hour", "day"],
    results: dict[str, float] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Average of numeric property values per time unit.

Contains daily or hourly average values for a numeric property with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

event instance-attribute

event: str

Queried event name.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

property_expr instance-attribute

property_expr: str

The 'on' expression averaged.

unit instance-attribute

unit: Literal['hour', 'day']

Time aggregation unit.

results class-attribute instance-attribute

results: dict[str, float] = field(default_factory=dict)

Average values: {date: average}.

df property

df: DataFrame

Convert to DataFrame with columns: date, average.

Conversion is lazy - computed on first access and cached.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "property_expr": self.property_expr,
        "unit": self.unit,
        "results": self.results,
    }

Bookmark Types

mixpanel_data.BookmarkInfo dataclass

BookmarkInfo(
    id: int,
    name: str,
    type: BookmarkType,
    project_id: int,
    created: str,
    modified: str,
    workspace_id: int | None = None,
    dashboard_id: int | None = None,
    description: str | None = None,
    creator_id: int | None = None,
    creator_name: str | None = None,
)

Metadata for a saved report (bookmark) from the Mixpanel Bookmarks API.

Represents a saved Insights, Funnel, Retention, or Flows report that can be queried using query_saved_report() or query_flows().

ATTRIBUTE DESCRIPTION
id

Unique bookmark identifier.

TYPE: int

name

User-defined report name.

TYPE: str

type

Report type (insights, funnels, retention, flows, launch-analysis).

TYPE: BookmarkType

project_id

Parent Mixpanel project ID.

TYPE: int

created

Creation timestamp (ISO format).

TYPE: str

modified

Last modification timestamp (ISO format).

TYPE: str

workspace_id

Optional workspace ID if scoped to a workspace.

TYPE: int | None

dashboard_id

Optional parent dashboard ID if linked to a dashboard.

TYPE: int | None

description

Optional user-provided description.

TYPE: str | None

creator_id

Optional creator's user ID.

TYPE: int | None

creator_name

Optional creator's display name.

TYPE: str | None

id instance-attribute

id: int

Unique bookmark identifier.

name instance-attribute

name: str

User-defined report name.

type instance-attribute

type: BookmarkType

Report type.

project_id instance-attribute

project_id: int

Parent Mixpanel project ID.

created instance-attribute

created: str

Creation timestamp (ISO format).

modified instance-attribute

modified: str

Last modification timestamp (ISO format).

workspace_id class-attribute instance-attribute

workspace_id: int | None = None

Workspace ID if scoped to a workspace.

dashboard_id class-attribute instance-attribute

dashboard_id: int | None = None

Parent dashboard ID if linked to a dashboard.

description class-attribute instance-attribute

description: str | None = None

User-provided description.

creator_id class-attribute instance-attribute

creator_id: int | None = None

Creator's user ID.

creator_name class-attribute instance-attribute

creator_name: str | None = None

Creator's display name.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all bookmark metadata fields.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all bookmark metadata fields.
    """
    result: dict[str, Any] = {
        "id": self.id,
        "name": self.name,
        "type": self.type,
        "project_id": self.project_id,
        "created": self.created,
        "modified": self.modified,
    }
    if self.workspace_id is not None:
        result["workspace_id"] = self.workspace_id
    if self.dashboard_id is not None:
        result["dashboard_id"] = self.dashboard_id
    if self.description is not None:
        result["description"] = self.description
    if self.creator_id is not None:
        result["creator_id"] = self.creator_id
    if self.creator_name is not None:
        result["creator_name"] = self.creator_name
    return result
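The following sketch shows one way to read serialized bookmarks and decide which query method suits each report. The mapping is an assumption: this page only says saved reports are queried with query_saved_report() or query_flows(), so routing "flows" to query_flows() and the other three named report types to query_saved_report() is a guess, and types outside that list are left unrouted. The sample bookmarks and helper names are invented.

```python
from typing import Any, Optional

# Assumed mapping from bookmark type to query method name (not confirmed
# by the source; see the note above).
QUERY_METHOD_BY_TYPE = {
    "insights": "query_saved_report",
    "funnels": "query_saved_report",
    "retention": "query_saved_report",
    "flows": "query_flows",
}


def query_method_for(bookmark: dict[str, Any]) -> Optional[str]:
    """Return the assumed query method name, or None if the type is unrouted."""
    return QUERY_METHOD_BY_TYPE.get(bookmark["type"])


def label(bookmark: dict[str, Any]) -> str:
    """Build a display label; creator_name is omitted from to_dict() when None."""
    creator = bookmark.get("creator_name", "unknown creator")
    return f"{bookmark['name']} ({bookmark['type']}) by {creator}"


# Hypothetical dictionaries in the shape produced by BookmarkInfo.to_dict().
bookmarks = [
    {"id": 1, "name": "Weekly signups", "type": "insights", "project_id": 42,
     "created": "2024-01-01T00:00:00", "modified": "2024-01-02T00:00:00",
     "creator_name": "Ada"},
    {"id": 2, "name": "Onboarding paths", "type": "flows", "project_id": 42,
     "created": "2024-01-03T00:00:00", "modified": "2024-01-03T00:00:00"},
    {"id": 3, "name": "Spring launch", "type": "launch-analysis", "project_id": 42,
     "created": "2024-01-04T00:00:00", "modified": "2024-01-04T00:00:00"},
]
plan = [(b["name"], query_method_for(b)) for b in bookmarks]
labels = [label(b) for b in bookmarks]
```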

mixpanel_data.SavedReportResult dataclass

SavedReportResult(
    bookmark_id: int,
    computed_at: str,
    from_date: str,
    to_date: str,
    headers: list[str] = list(),
    series: dict[str, Any] = dict(),
    _df_cache: DataFrame | None = None,
)

Data from a saved report (Insights, Retention, or Funnel).

Contains data from a pre-configured saved report with automatic report type detection and lazy DataFrame conversion support.

The report_type property automatically detects the report type based on headers: "$retention" indicates retention, "$funnel" indicates funnel, otherwise it's an insights report.

ATTRIBUTE DESCRIPTION
bookmark_id

Saved report identifier.

TYPE: int

computed_at

When report was computed (ISO format).

TYPE: str

from_date

Report start date.

TYPE: str

to_date

Report end date.

TYPE: str

headers

Report column headers (used for type detection).

TYPE: list[str]

series

Report data (structure varies by report type).

TYPE: dict[str, Any]

bookmark_id instance-attribute

bookmark_id: int

Saved report identifier.

computed_at instance-attribute

computed_at: str

When report was computed (ISO format).

from_date instance-attribute

from_date: str

Report start date.

to_date instance-attribute

to_date: str

Report end date.

headers class-attribute instance-attribute

headers: list[str] = field(default_factory=list)

Report column headers (used for type detection).

series class-attribute instance-attribute

series: dict[str, Any] = field(default_factory=dict)

Report data (structure varies by report type).

  • For Insights reports: {event_name: {date: count}}
  • For Retention reports: {series_name: {date: {segment: {first, counts, rates}}}}
  • For Funnel reports: {count: {...}, overall_conv_ratio: {...}, ...}

report_type property

report_type: SavedReportType

Detect the report type from headers.

RETURNS DESCRIPTION
SavedReportType

'retention' if headers contain '$retention', 'funnel' if headers contain '$funnel', 'insights' otherwise.
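The detection rule can be sketched as a plain function. The property's actual source is not shown on this page, so this is an approximation: in particular, treating "contain" as a substring match within any header string is an assumption, and the function name is hypothetical.

```python
def detect_report_type(headers):
    """Guess the report type from headers, checking retention before funnel."""
    # Assumption: a marker counts if it appears anywhere inside a header string.
    if any("$retention" in header for header in headers):
        return "retention"
    if any("$funnel" in header for header in headers):
        return "funnel"
    return "insights"


retention_type = detect_report_type(["$retention"])
funnel_type = detect_report_type(["$funnel"])
default_type = detect_report_type(["$event"])
empty_type = detect_report_type([])
```

An empty headers list, which is the field's default, falls through to 'insights'.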

df property

df: DataFrame

Convert to DataFrame.

For Insights reports: columns are date, event, count. For Retention/Funnel reports: flattens the nested structure.

Conversion is lazy - computed on first access and cached.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all report fields including detected report_type.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all report fields including detected report_type.
    """
    return {
        "bookmark_id": self.bookmark_id,
        "computed_at": self.computed_at,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "headers": self.headers,
        "series": self.series,
        "report_type": self.report_type,
    }

mixpanel_data.FlowsResult dataclass

FlowsResult(
    bookmark_id: int,
    computed_at: str,
    steps: list[dict[str, Any]] = list(),
    breakdowns: list[dict[str, Any]] = list(),
    overall_conversion_rate: float = 0.0,
    metadata: dict[str, Any] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Data from a saved Flows report.

Contains user path/navigation data from a pre-configured Flows report with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

  • Lazy DataFrame caching via the _df_cache field
  • Normalized table output via the to_table_dict() method

ATTRIBUTE DESCRIPTION
bookmark_id

Saved report identifier.

TYPE: int

computed_at

When report was computed (ISO format).

TYPE: str

steps

Flow step data with event sequences and counts.

TYPE: list[dict[str, Any]]

breakdowns

Path breakdown data showing user flow distribution.

TYPE: list[dict[str, Any]]

overall_conversion_rate

End-to-end conversion rate (0.0 to 1.0).

TYPE: float

metadata

Additional API metadata from the response.

TYPE: dict[str, Any]

bookmark_id instance-attribute

bookmark_id: int

Saved report identifier.

computed_at instance-attribute

computed_at: str

When report was computed (ISO format).

steps class-attribute instance-attribute

steps: list[dict[str, Any]] = field(default_factory=list)

Flow step data with event sequences and counts.

breakdowns class-attribute instance-attribute

breakdowns: list[dict[str, Any]] = field(default_factory=list)

Path breakdown data showing user flow distribution.

overall_conversion_rate class-attribute instance-attribute

overall_conversion_rate: float = 0.0

End-to-end conversion rate (0.0 to 1.0).

metadata class-attribute instance-attribute

metadata: dict[str, Any] = field(default_factory=dict)

Additional API metadata from the response.

df property

df: DataFrame

Convert steps to DataFrame.

Returns DataFrame with columns derived from step data structure. Conversion is lazy - computed on first access and cached.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all flows report fields.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all flows report fields.
    """
    return {
        "bookmark_id": self.bookmark_id,
        "computed_at": self.computed_at,
        "steps": self.steps,
        "breakdowns": self.breakdowns,
        "overall_conversion_rate": self.overall_conversion_rate,
        "metadata": self.metadata,
    }

JQL Discovery Types

mixpanel_data.PropertyDistributionResult dataclass

PropertyDistributionResult(
    event: str,
    property_name: str,
    from_date: str,
    to_date: str,
    total_count: int,
    values: tuple[PropertyValueCount, ...],
    _df_cache: DataFrame | None = None,
)

Distribution of values for a property from JQL analysis.

Contains the top N values for a property with their counts and percentages, enabling quick understanding of property value distribution without fetching all data locally.

ATTRIBUTE DESCRIPTION
event

The event type analyzed.

TYPE: str

property_name

The property name analyzed.

TYPE: str

from_date

Query start date (YYYY-MM-DD).

TYPE: str

to_date

Query end date (YYYY-MM-DD).

TYPE: str

total_count

Total number of events with this property defined.

TYPE: int

values

Top values with counts and percentages.

TYPE: tuple[PropertyValueCount, ...]

event instance-attribute

event: str

Event type analyzed.

property_name instance-attribute

property_name: str

Property name analyzed.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

total_count instance-attribute

total_count: int

Total events with this property defined.

values instance-attribute

values: tuple[PropertyValueCount, ...]

Top values with counts and percentages.

df property

df: DataFrame

Convert to DataFrame with columns: value, count, percentage.

Conversion is lazy - computed on first access and cached.

RETURNS DESCRIPTION
DataFrame

DataFrame with value distribution data.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all distribution data.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all distribution data.
    """
    return {
        "event": self.event,
        "property_name": self.property_name,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "total_count": self.total_count,
        "values": [v.to_dict() for v in self.values],
    }

mixpanel_data.PropertyValueCount dataclass

PropertyValueCount(
    value: str | int | float | bool | None, count: int, percentage: float
)

A single value and its count from property distribution analysis.

Represents one row in a property value distribution, showing the value, its occurrence count, and percentage of total.

ATTRIBUTE DESCRIPTION
value

The property value (can be string, number, bool, or None).

TYPE: str | int | float | bool | None

count

Number of occurrences of this value.

TYPE: int

percentage

Percentage of total events (0.0 to 100.0).

TYPE: float

value instance-attribute

value: str | int | float | bool | None

The property value.

count instance-attribute

count: int

Number of occurrences.

percentage instance-attribute

percentage: float

Percentage of total (0.0 to 100.0).
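To show how value, count, and percentage relate, here is a minimal sketch that derives percentages on the 0.0 to 100.0 scale from raw counts. It assumes the denominator is total_count and applies no rounding; the page does not state either detail, and the helper name and figures are invented.

```python
def value_rows(counts, total_count):
    """Build value/count/percentage rows, largest count first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    for value, count in ordered:
        # Guard against an empty result set to avoid dividing by zero.
        percentage = count / total_count * 100.0 if total_count else 0.0
        rows.append({"value": value, "count": count, "percentage": percentage})
    return rows


# Hypothetical counts for a "country" property.
rows = value_rows({"EU": 50, "US": 150}, total_count=200)
```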

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with value, count, and percentage.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with value, count, and percentage.
    """
    return {
        "value": self.value,
        "count": self.count,
        "percentage": self.percentage,
    }

mixpanel_data.NumericPropertySummaryResult dataclass

NumericPropertySummaryResult(
    event: str,
    property_name: str,
    from_date: str,
    to_date: str,
    count: int,
    min: float,
    max: float,
    sum: float,
    avg: float,
    stddev: float,
    percentiles: dict[int, float],
)

Statistical summary of a numeric property from JQL analysis.

Contains min, max, sum, average, standard deviation, and percentiles for a numeric property, enabling understanding of value distributions without fetching all data locally.

ATTRIBUTE DESCRIPTION
event

The event type analyzed.

TYPE: str

property_name

The property name analyzed.

TYPE: str

from_date

Query start date (YYYY-MM-DD).

TYPE: str

to_date

Query end date (YYYY-MM-DD).

TYPE: str

count

Number of events with this property defined.

TYPE: int

min

Minimum value.

TYPE: float

max

Maximum value.

TYPE: float

sum

Sum of all values.

TYPE: float

avg

Average value.

TYPE: float

stddev

Standard deviation.

TYPE: float

percentiles

Percentile values keyed by percentile number.

TYPE: dict[int, float]

event instance-attribute

event: str

Event type analyzed.

property_name instance-attribute

property_name: str

Property name analyzed.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

count instance-attribute

count: int

Number of events with this property defined.

min instance-attribute

min: float

Minimum value.

max instance-attribute

max: float

Maximum value.

sum instance-attribute

sum: float

Sum of all values.

avg instance-attribute

avg: float

Average value.

stddev instance-attribute

stddev: float

Standard deviation.

percentiles instance-attribute

percentiles: dict[int, float]

Percentile values keyed by percentile number (e.g., {50: 98.0}).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all numeric summary data.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all numeric summary data.
    """
    return {
        "event": self.event,
        "property_name": self.property_name,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "count": self.count,
        "min": self.min,
        "max": self.max,
        "sum": self.sum,
        "avg": self.avg,
        "stddev": self.stddev,
        "percentiles": {str(k): v for k, v in self.percentiles.items()},
    }
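Since to_dict() converts the integer percentile keys to strings, code that reads the JSON back needs to convert them again before looking up a percentile by number. A small sketch with an invented, trimmed-down payload (the 50th percentile value repeats the example above):

```python
import json

# Hypothetical subset of a serialized summary; keys are strings after to_dict().
summary = {"count": 4, "percentiles": {"50": 98.0, "90": 120.5}}

encoded = json.dumps(summary)
decoded = json.loads(encoded)

# Restore integer keys so lookups match the in-memory dict[int, float] form.
percentiles = {int(key): value for key, value in decoded["percentiles"].items()}
median = percentiles[50]
```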

mixpanel_data.DailyCountsResult dataclass

DailyCountsResult(
    from_date: str,
    to_date: str,
    events: tuple[str, ...] | None,
    counts: tuple[DailyCount, ...],
    _df_cache: DataFrame | None = None,
)

Time-series event counts by day from JQL analysis.

Contains daily event counts for quick activity trend analysis without complex segmentation setup.

ATTRIBUTE DESCRIPTION
from_date

Query start date (YYYY-MM-DD).

TYPE: str

to_date

Query end date (YYYY-MM-DD).

TYPE: str

events

Event types included (None for all events).

TYPE: tuple[str, ...] | None

counts

Daily counts for each event.

TYPE: tuple[DailyCount, ...]

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

events instance-attribute

events: tuple[str, ...] | None

Event types included (None for all events).

counts instance-attribute

counts: tuple[DailyCount, ...]

Daily counts for each event.

df property

df: DataFrame

Convert to DataFrame with columns: date, event, count.

Conversion is lazy - computed on first access and cached.

RETURNS DESCRIPTION
DataFrame

DataFrame with daily counts data.
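The flat date, event, count rows can also be regrouped into the nested {event: {date: count}} shape used by other result types on this page. The sketch below works on plain dictionaries like those produced by DailyCount.to_dict(); the helper name pivot_daily_counts and the sample rows are illustrative.

```python
from collections import defaultdict


def pivot_daily_counts(counts):
    """Group date/event/count rows into {event: {date: count}}."""
    by_event = defaultdict(dict)
    for row in counts:
        by_event[row["event"]][row["date"]] = row["count"]
    return dict(by_event)


# Hypothetical rows in the shape of DailyCount.to_dict().
sample_counts = [
    {"date": "2024-01-01", "event": "Signup", "count": 10},
    {"date": "2024-01-01", "event": "Purchase", "count": 2},
    {"date": "2024-01-02", "event": "Signup", "count": 12},
]
nested = pivot_daily_counts(sample_counts)
```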

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all daily counts data.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all daily counts data.
    """
    return {
        "from_date": self.from_date,
        "to_date": self.to_date,
        "events": list(self.events) if self.events else None,
        "counts": [c.to_dict() for c in self.counts],
    }

mixpanel_data.DailyCount dataclass

DailyCount(date: str, event: str, count: int)

Event count for a single date from daily counts analysis.

Represents one row in a daily counts result, showing date, event, and count.

ATTRIBUTE DESCRIPTION
date

Date string (YYYY-MM-DD).

TYPE: str

event

Event name.

TYPE: str

count

Number of occurrences on this date.

TYPE: int

date instance-attribute

date: str

Date string (YYYY-MM-DD).

event instance-attribute

event: str

Event name.

count instance-attribute

count: int

Number of occurrences on this date.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with date, event, and count.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with date, event, and count.
    """
    return {
        "date": self.date,
        "event": self.event,
        "count": self.count,
    }

mixpanel_data.EngagementDistributionResult dataclass

EngagementDistributionResult(
    from_date: str,
    to_date: str,
    events: tuple[str, ...] | None,
    total_users: int,
    buckets: tuple[EngagementBucket, ...],
    _df_cache: DataFrame | None = None,
)

User engagement distribution from JQL analysis.

Shows how many users performed a given number of events, which helps characterize engagement patterns without fetching all data locally.

ATTRIBUTE DESCRIPTION
from_date

Query start date (YYYY-MM-DD).

TYPE: str

to_date

Query end date (YYYY-MM-DD).

TYPE: str

events

Event types included (None for all events).

TYPE: tuple[str, ...] | None

total_users

Total number of distinct users.

TYPE: int

buckets

Engagement buckets with user counts.

TYPE: tuple[EngagementBucket, ...]

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

events instance-attribute

events: tuple[str, ...] | None

Event types included (None for all events).

total_users instance-attribute

total_users: int

Total number of distinct users.

buckets instance-attribute

buckets: tuple[EngagementBucket, ...]

Engagement buckets with user counts.

df property

df: DataFrame

Convert to DataFrame with engagement bucket columns.

Conversion is lazy - computed on first access and cached.

RETURNS DESCRIPTION
DataFrame

DataFrame with engagement distribution data.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all engagement distribution data.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all engagement distribution data.
    """
    return {
        "from_date": self.from_date,
        "to_date": self.to_date,
        "events": list(self.events) if self.events else None,
        "total_users": self.total_users,
        "buckets": [b.to_dict() for b in self.buckets],
    }
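
The bucket list in this dictionary can be aggregated further. The sketch below counts users at or above an engagement threshold; the payload values and the `users_with_at_least` helper are assumptions made for illustration only.

```python
from typing import Any

# Illustrative payload shaped like EngagementDistributionResult.to_dict().
payload: dict[str, Any] = {
    "from_date": "2024-01-01",
    "to_date": "2024-01-31",
    "events": None,
    "total_users": 200,
    "buckets": [
        {"bucket_min": 1, "bucket_label": "1", "user_count": 100, "percentage": 50.0},
        {"bucket_min": 2, "bucket_label": "2-5", "user_count": 60, "percentage": 30.0},
        {"bucket_min": 6, "bucket_label": "6+", "user_count": 40, "percentage": 20.0},
    ],
}


def users_with_at_least(data: dict[str, Any], min_events: int) -> int:
    """Count users in buckets whose lower bound is >= min_events."""
    return sum(
        b["user_count"] for b in data["buckets"] if b["bucket_min"] >= min_events
    )


repeat_users = users_with_at_least(payload, 2)
repeat_share = 100.0 * repeat_users / payload["total_users"]
```

The helper is exact only when `min_events` coincides with a bucket's lower bound. A threshold that falls inside a bucket such as "2-5" cannot be resolved from bucketed data.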

mixpanel_data.EngagementBucket dataclass

EngagementBucket(
    bucket_min: int, bucket_label: str, user_count: int, percentage: float
)

User count in an engagement bucket from engagement analysis.

Represents one bucket in a user engagement distribution, showing how many users performed events in a certain frequency range.

ATTRIBUTE DESCRIPTION
bucket_min

Minimum events in this bucket.

TYPE: int

bucket_label

Human-readable label (e.g., "1", "2-5", "100+").

TYPE: str

user_count

Number of users in this bucket.

TYPE: int

percentage

Percentage of total users (0.0 to 100.0).

TYPE: float

bucket_min instance-attribute

bucket_min: int

Minimum events in this bucket.

bucket_label instance-attribute

bucket_label: str

Human-readable label (e.g., '1', '2-5', '100+').

user_count instance-attribute

user_count: int

Number of users in this bucket.

percentage instance-attribute

percentage: float

Percentage of total users (0.0 to 100.0).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with bucket data.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with bucket data.
    """
    return {
        "bucket_min": self.bucket_min,
        "bucket_label": self.bucket_label,
        "user_count": self.user_count,
        "percentage": self.percentage,
    }

mixpanel_data.PropertyCoverageResult dataclass

PropertyCoverageResult(
    event: str,
    from_date: str,
    to_date: str,
    total_events: int,
    coverage: tuple[PropertyCoverage, ...],
    _df_cache: DataFrame | None = None,
)

Property coverage analysis result from JQL.

Shows which properties are consistently populated and which are sparse, which helps assess data quality before writing queries.

ATTRIBUTE DESCRIPTION
event

The event type analyzed.

TYPE: str

from_date

Query start date (YYYY-MM-DD).

TYPE: str

to_date

Query end date (YYYY-MM-DD).

TYPE: str

total_events

Total number of events analyzed.

TYPE: int

coverage

Coverage statistics for each property.

TYPE: tuple[PropertyCoverage, ...]

event instance-attribute

event: str

The event type analyzed.

from_date instance-attribute

from_date: str

Query start date (YYYY-MM-DD).

to_date instance-attribute

to_date: str

Query end date (YYYY-MM-DD).

total_events instance-attribute

total_events: int

Total number of events analyzed.

coverage instance-attribute

coverage: tuple[PropertyCoverage, ...]

Coverage statistics for each property.

df property

df: DataFrame

Convert to DataFrame with property coverage columns.

Conversion is lazy - computed on first access and cached.

RETURNS DESCRIPTION
DataFrame

DataFrame with property coverage data.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all coverage data.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all coverage data.
    """
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "total_events": self.total_events,
        "coverage": [c.to_dict() for c in self.coverage],
    }
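
One way to act on this output is to flag properties whose coverage falls below a cutoff before relying on them in queries. The following is a sketch under stated assumptions: the payload values, the `sparse_properties` helper, and the 50% default threshold are all illustrative choices, not library behavior.

```python
from typing import Any

# Illustrative payload shaped like PropertyCoverageResult.to_dict() output.
payload: dict[str, Any] = {
    "event": "Purchase",
    "from_date": "2024-01-01",
    "to_date": "2024-01-31",
    "total_events": 1000,
    "coverage": [
        {"property": "amount", "defined_count": 1000, "null_count": 0,
         "coverage_percentage": 100.0},
        {"property": "coupon", "defined_count": 150, "null_count": 850,
         "coverage_percentage": 15.0},
        {"property": "currency", "defined_count": 900, "null_count": 100,
         "coverage_percentage": 90.0},
    ],
}


def sparse_properties(data: dict[str, Any], threshold: float = 50.0) -> list[str]:
    """Return names of properties defined on fewer than `threshold` percent of events."""
    return sorted(
        c["property"]
        for c in data["coverage"]
        if c["coverage_percentage"] < threshold
    )


sparse = sparse_properties(payload)
strict = sparse_properties(payload, threshold=95.0)
```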

mixpanel_data.PropertyCoverage dataclass

PropertyCoverage(
    property: str,
    defined_count: int,
    null_count: int,
    coverage_percentage: float,
)

Coverage statistics for a single property from coverage analysis.

Shows how often a property is defined vs null for a given event type.

ATTRIBUTE DESCRIPTION
property

Property name.

TYPE: str

defined_count

Number of events with this property defined.

TYPE: int

null_count

Number of events with this property null/undefined.

TYPE: int

coverage_percentage

Percentage of events with this property defined (0.0 to 100.0).

TYPE: float

property instance-attribute

property: str

Property name.

defined_count instance-attribute

defined_count: int

Number of events with this property defined.

null_count instance-attribute

null_count: int

Number of events with this property null/undefined.

coverage_percentage instance-attribute

coverage_percentage: float

Percentage of events with this property defined (0.0 to 100.0).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with coverage data.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with coverage data.
    """
    return {
        "property": self.property,
        "defined_count": self.defined_count,
        "null_count": self.null_count,
        "coverage_percentage": self.coverage_percentage,
    }

Introspection Types

mixpanel_data.ColumnSummary dataclass

ColumnSummary(
    column_name: str,
    column_type: str,
    min: Any,
    max: Any,
    approx_unique: int,
    avg: float | None,
    std: float | None,
    q25: Any,
    q50: Any,
    q75: Any,
    count: int,
    null_percentage: float,
)

Statistical summary of a single column from DuckDB's SUMMARIZE command.

Contains per-column statistics including min/max, quartiles, null percentage, and approximate distinct counts. Numeric columns include additional stats like average and standard deviation.

column_name instance-attribute

column_name: str

Name of the column.

column_type instance-attribute

column_type: str

DuckDB data type (VARCHAR, TIMESTAMP, INTEGER, JSON, etc.).

min instance-attribute

min: Any

Minimum value (type varies by column type).

max instance-attribute

max: Any

Maximum value (type varies by column type).

approx_unique instance-attribute

approx_unique: int

Approximate count of distinct values (HyperLogLog).

avg instance-attribute

avg: float | None

Mean value (None for non-numeric columns).

std instance-attribute

std: float | None

Standard deviation (None for non-numeric columns).

q25 instance-attribute

q25: Any

25th percentile value (None for non-numeric).

q50 instance-attribute

q50: Any

Median / 50th percentile (None for non-numeric).

q75 instance-attribute

q75: Any

75th percentile value (None for non-numeric).

count instance-attribute

count: int

Number of non-null values.

null_percentage instance-attribute

null_percentage: float

Percentage of null values (0.0 to 100.0).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all column statistics.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all column statistics.
    """
    return {
        "column_name": self.column_name,
        "column_type": self.column_type,
        "min": self.min,
        "max": self.max,
        "approx_unique": self.approx_unique,
        "avg": self.avg,
        "std": self.std,
        "q25": self.q25,
        "q50": self.q50,
        "q75": self.q75,
        "count": self.count,
        "null_percentage": self.null_percentage,
    }
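
Note that `min`, `max`, and the quartile fields are typed `Any` and are passed through unchanged, so the dictionary is not guaranteed to contain only JSON-native values. The sketch below shows one possible way to handle that with `json.dumps(..., default=str)`. The sample values are invented, and the assumption that a TIMESTAMP column yields `datetime` objects is not confirmed by the source.

```python
import json
from datetime import datetime
from typing import Any

# Illustrative dictionary shaped like ColumnSummary.to_dict() output for a
# TIMESTAMP column. The datetime values are an assumption about what the
# Any-typed fields might hold.
summary: dict[str, Any] = {
    "column_name": "event_time",
    "column_type": "TIMESTAMP",
    "min": datetime(2024, 1, 1, 0, 0, 0),
    "max": datetime(2024, 1, 31, 23, 59, 59),
    "approx_unique": 9500,
    "avg": None,
    "std": None,
    "q25": None,
    "q50": None,
    "q75": None,
    "count": 10000,
    "null_percentage": 0.0,
}

# default=str is called only for values json cannot encode natively.
encoded = json.dumps(summary, default=str)
decoded = json.loads(encoded)
```

Using `default=str` trades fidelity for convenience: the values come back as strings, so a consumer that needs typed values has to parse them again.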

mixpanel_data.SummaryResult dataclass

SummaryResult(
    table: str,
    row_count: int,
    columns: list[ColumnSummary] = list(),
    _df_cache: DataFrame | None = None,
)

Statistical summary of all columns in a table.

Contains row count and per-column statistics from DuckDB's SUMMARIZE command. Provides both structured access via the columns list and DataFrame conversion via the df property.

table instance-attribute

table: str

Name of the summarized table.

row_count instance-attribute

row_count: int

Total number of rows in the table.

columns class-attribute instance-attribute

columns: list[ColumnSummary] = field(default_factory=list)

Per-column statistics.

df property

df: DataFrame

Convert to DataFrame with one row per column.

Conversion is lazy - computed on first access and cached.

RETURNS DESCRIPTION
DataFrame

DataFrame with column statistics.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with table name, row count, and column statistics.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with table name, row count, and column statistics.
    """
    return {
        "table": self.table,
        "row_count": self.row_count,
        "columns": [col.to_dict() for col in self.columns],
    }

mixpanel_data.EventStats dataclass

EventStats(
    event_name: str,
    count: int,
    unique_users: int,
    first_seen: datetime,
    last_seen: datetime,
    pct_of_total: float,
)

Statistics for a single event type.

Contains count, unique users, date range, and percentage of total for a specific event in an events table.

event_name instance-attribute

event_name: str

Name of the event.

count instance-attribute

count: int

Total occurrences of this event.

unique_users instance-attribute

unique_users: int

Count of distinct users who triggered this event.

first_seen instance-attribute

first_seen: datetime

Earliest occurrence timestamp.

last_seen instance-attribute

last_seen: datetime

Latest occurrence timestamp.

pct_of_total instance-attribute

pct_of_total: float

Percentage of all events (0.0 to 100.0).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with event statistics (datetimes as ISO strings).

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with event statistics (datetimes as ISO strings).
    """
    return {
        "event_name": self.event_name,
        "count": self.count,
        "unique_users": self.unique_users,
        "first_seen": self.first_seen.isoformat(),
        "last_seen": self.last_seen.isoformat(),
        "pct_of_total": self.pct_of_total,
    }
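
Since `first_seen` and `last_seen` are emitted as ISO 8601 strings, a consumer can restore them with `datetime.fromisoformat`. The sketch below uses invented sample values, and treating the timestamps as timezone-aware UTC is an assumption made for the example.

```python
from datetime import datetime, timezone
from typing import Any

# Illustrative payload shaped like EventStats.to_dict() output.
payload: dict[str, Any] = {
    "event_name": "Login",
    "count": 120,
    "unique_users": 45,
    "first_seen": datetime(2024, 1, 1, 8, 30, tzinfo=timezone.utc).isoformat(),
    "last_seen": datetime(2024, 1, 31, 22, 15, tzinfo=timezone.utc).isoformat(),
    "pct_of_total": 12.5,
}

# Parse the ISO strings back into datetime objects.
first_seen = datetime.fromisoformat(payload["first_seen"])
last_seen = datetime.fromisoformat(payload["last_seen"])

# Time between the earliest and latest occurrence of the event.
active_span = last_seen - first_seen
```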

mixpanel_data.EventBreakdownResult dataclass

EventBreakdownResult(
    table: str,
    total_events: int,
    total_users: int,
    date_range: tuple[datetime, datetime],
    events: list[EventStats] = list(),
    _df_cache: DataFrame | None = None,
)

Distribution of events in a table.

Contains aggregate statistics and per-event breakdown with counts, unique users, date ranges, and percentages.

table instance-attribute

table: str

Name of the analyzed table.

total_events instance-attribute

total_events: int

Total number of events in the table.

total_users instance-attribute

total_users: int

Total distinct users across all events.

date_range instance-attribute

date_range: tuple[datetime, datetime]

(earliest, latest) event timestamps.

events class-attribute instance-attribute

events: list[EventStats] = field(default_factory=list)

Per-event statistics, ordered by count descending.

df property

df: DataFrame

Convert to DataFrame with one row per event type.

Conversion is lazy - computed on first access and cached.

RETURNS DESCRIPTION
DataFrame

DataFrame with event statistics.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with table info and event statistics.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with table info and event statistics.
    """
    return {
        "table": self.table,
        "total_events": self.total_events,
        "total_users": self.total_users,
        "date_range": [
            self.date_range[0].isoformat(),
            self.date_range[1].isoformat(),
        ],
        "events": [event.to_dict() for event in self.events],
    }

mixpanel_data.ColumnStatsResult dataclass

ColumnStatsResult(
    table: str,
    column: str,
    dtype: str,
    count: int,
    null_count: int,
    null_pct: float,
    unique_count: int,
    unique_pct: float,
    top_values: list[tuple[Any, int]] = list(),
    min: float | None = None,
    max: float | None = None,
    mean: float | None = None,
    std: float | None = None,
    _df_cache: DataFrame | None = None,
)

Deep statistical analysis of a single column.

Provides detailed statistics including null rates, cardinality, top values, and numeric statistics (for numeric columns). Supports JSON path expressions for analyzing properties.

table instance-attribute

table: str

Name of the source table.

column instance-attribute

column: str

Column expression analyzed (may include JSON path).

dtype instance-attribute

dtype: str

DuckDB data type of the column.

count instance-attribute

count: int

Number of non-null values.

null_count instance-attribute

null_count: int

Number of null values.

null_pct instance-attribute

null_pct: float

Percentage of null values (0.0 to 100.0).

unique_count instance-attribute

unique_count: int

Approximate count of distinct values.

unique_pct instance-attribute

unique_pct: float

Percentage of values that are unique (0.0 to 100.0).

top_values class-attribute instance-attribute

top_values: list[tuple[Any, int]] = field(default_factory=list)

Most frequent (value, count) pairs.

min class-attribute instance-attribute

min: float | None = None

Minimum value (None for non-numeric).

max class-attribute instance-attribute

max: float | None = None

Maximum value (None for non-numeric).

mean class-attribute instance-attribute

mean: float | None = None

Mean value (None for non-numeric).

std class-attribute instance-attribute

std: float | None = None

Standard deviation (None for non-numeric).

df property

df: DataFrame

Convert top values to DataFrame with columns: value, count.

Conversion is lazy - computed on first access and cached.

RETURNS DESCRIPTION
DataFrame

DataFrame with top values and their counts.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS DESCRIPTION
dict[str, Any]

Dictionary with all column statistics.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all column statistics.
    """
    return {
        "table": self.table,
        "column": self.column,
        "dtype": self.dtype,
        "count": self.count,
        "null_count": self.null_count,
        "null_pct": self.null_pct,
        "unique_count": self.unique_count,
        "unique_pct": self.unique_pct,
        "top_values": [[value, count] for value, count in self.top_values],
        "min": self.min,
        "max": self.max,
        "mean": self.mean,
        "std": self.std,
    }
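
In the serialized form each `top_values` entry is a two-element list rather than a tuple. A consumer might rebuild the pairs and derive each value's share of non-null rows as sketched here. The payload is an illustrative fragment showing only the keys the example uses, with invented values.

```python
import json
from typing import Any

# Illustrative fragment shaped like ColumnStatsResult.to_dict() output;
# only the keys used below are included.
payload: dict[str, Any] = {
    "table": "events",
    "column": "country",
    "count": 1000,  # number of non-null values
    "top_values": [["US", 600], ["DE", 250], ["FR", 150]],
}

# After a JSON round trip the pairs are still two-element lists.
decoded = json.loads(json.dumps(payload))

# Restore (value, count) tuples and compute each value's share of non-null rows.
top_values = [(value, n) for value, n in decoded["top_values"]]
shares = {
    value: round(100.0 * n / decoded["count"], 1) for value, n in top_values
}
```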

Storage Types

mixpanel_data.TableMetadata dataclass

TableMetadata(
    type: Literal["events", "profiles"],
    fetched_at: datetime,
    from_date: str | None = None,
    to_date: str | None = None,
    filter_events: list[str] | None = None,
    filter_where: str | None = None,
    filter_cohort_id: str | None = None,
    filter_output_properties: list[str] | None = None,
    filter_group_id: str | None = None,
    filter_behaviors: str | None = None,
)

Metadata for a data fetch operation.

This metadata is passed to table creation methods and stored in the database's internal _metadata table for tracking fetch operations.

type instance-attribute

type: Literal['events', 'profiles']

Type of data fetched.

fetched_at instance-attribute

fetched_at: datetime

When the fetch completed (UTC).

from_date class-attribute instance-attribute

from_date: str | None = None

Start date for events (YYYY-MM-DD), None for profiles.

to_date class-attribute instance-attribute

to_date: str | None = None

End date for events (YYYY-MM-DD), None for profiles.

filter_events class-attribute instance-attribute

filter_events: list[str] | None = None

Event names filtered (if applicable).

filter_where class-attribute instance-attribute

filter_where: str | None = None

WHERE clause filter (if applicable).

filter_cohort_id class-attribute instance-attribute

filter_cohort_id: str | None = None

Cohort ID filter for profiles (if applicable).

filter_output_properties class-attribute instance-attribute

filter_output_properties: list[str] | None = None

Property names to include in output (if applicable).

filter_group_id class-attribute instance-attribute

filter_group_id: str | None = None

Group ID for group profile queries (if applicable).

filter_behaviors class-attribute instance-attribute

filter_behaviors: str | None = None

Serialized behaviors filter for behavioral profile queries (if applicable).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "type": self.type,
        "fetched_at": self.fetched_at.isoformat(),
        "from_date": self.from_date,
        "to_date": self.to_date,
        "filter_events": self.filter_events,
        "filter_where": self.filter_where,
        "filter_cohort_id": self.filter_cohort_id,
        "filter_output_properties": self.filter_output_properties,
        "filter_group_id": self.filter_group_id,
        "filter_behaviors": self.filter_behaviors,
    }

mixpanel_data.TableInfo dataclass

TableInfo(
    name: str,
    type: Literal["events", "profiles"],
    row_count: int,
    fetched_at: datetime,
)

Information about a table in the database.

Returned by list_tables() to provide summary information about available tables without retrieving full schemas.

name instance-attribute

name: str

Table name.

type instance-attribute

type: Literal['events', 'profiles']

Table type.

row_count instance-attribute

row_count: int

Number of rows.

fetched_at instance-attribute

fetched_at: datetime

When data was fetched (UTC).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "name": self.name,
        "type": self.type,
        "row_count": self.row_count,
        "fetched_at": self.fetched_at.isoformat(),
    }

mixpanel_data.ColumnInfo dataclass

ColumnInfo(name: str, type: str, nullable: bool, primary_key: bool = False)

Information about a table column.

Describes a single column's schema, including name, type, nullability constraints, and primary key status.

name instance-attribute

name: str

Column name.

type instance-attribute

type: str

DuckDB type (VARCHAR, TIMESTAMP, JSON, INTEGER, etc.).

nullable instance-attribute

nullable: bool

Whether column allows NULL values.

primary_key class-attribute instance-attribute

primary_key: bool = False

Whether column is a primary key.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "name": self.name,
        "type": self.type,
        "nullable": self.nullable,
        "primary_key": self.primary_key,
    }

mixpanel_data.TableSchema dataclass

TableSchema(table_name: str, columns: list[ColumnInfo])

Schema information for a table.

Returned by get_schema() to describe the structure of a table, including all column definitions.

table_name instance-attribute

table_name: str

Table name.

columns instance-attribute

columns: list[ColumnInfo]

Column definitions.

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "table_name": self.table_name,
        "columns": [col.to_dict() for col in self.columns],
    }
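
The nested column dictionaries make simple schema checks straightforward. The following sketch pulls out primary-key and nullable columns from a `to_dict()`-shaped payload. The table and column names are hypothetical and do not describe the schema the library actually creates.

```python
from typing import Any

# Illustrative payload shaped like TableSchema.to_dict() output.
# Table and column names are invented for this example.
payload: dict[str, Any] = {
    "table_name": "events",
    "columns": [
        {"name": "insert_id", "type": "VARCHAR", "nullable": False, "primary_key": True},
        {"name": "event_time", "type": "TIMESTAMP", "nullable": False, "primary_key": False},
        {"name": "properties", "type": "JSON", "nullable": True, "primary_key": False},
    ],
}

# Names of columns that form the primary key.
primary_keys = [c["name"] for c in payload["columns"] if c["primary_key"]]

# Names of columns that may contain NULL values.
nullable_columns = [c["name"] for c in payload["columns"] if c["nullable"]]

# Mapping from column name to DuckDB type for quick lookups.
types_by_name = {c["name"]: c["type"] for c in payload["columns"]}
```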

mixpanel_data.WorkspaceInfo dataclass

WorkspaceInfo(
    path: Path | None,
    project_id: str,
    region: str,
    account: str | None,
    tables: list[str],
    size_mb: float,
    created_at: datetime | None,
)

Information about a Workspace instance.

Returned by Workspace.info() to provide metadata about the workspace including database location, connection details, and table summary.

path instance-attribute

path: Path | None

Database file path (None for ephemeral or in-memory workspaces).

project_id instance-attribute

project_id: str

Mixpanel project ID.

region instance-attribute

region: str

Data residency region (us, eu, in).

account instance-attribute

account: str | None

Named account used (None if credentials come from the environment).

tables instance-attribute

tables: list[str]

Names of tables in the database.

size_mb instance-attribute

size_mb: float

Database file size in megabytes (0.0 for in-memory workspaces).

created_at instance-attribute

created_at: datetime | None

When the database was created (None if unknown).

to_dict

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py
def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "path": str(self.path) if self.path else None,
        "project_id": self.project_id,
        "region": self.region,
        "account": self.account,
        "tables": self.tables,
        "size_mb": self.size_mb,
        "created_at": self.created_at.isoformat() if self.created_at else None,
    }
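
Three of the serialized fields (`path`, `account`, `created_at`) can be `None`, so consumers should handle both cases. The sketch below renders a one-line summary from two invented payloads. `describe` is a hypothetical helper, and the sample path, project ID, and table names are placeholders.

```python
from typing import Any

# Two illustrative payloads shaped like WorkspaceInfo.to_dict() output.
on_disk: dict[str, Any] = {
    "path": "/tmp/analytics.db",
    "project_id": "12345",
    "region": "us",
    "account": "production",
    "tables": ["events_jan", "profiles"],
    "size_mb": 12.5,
    "created_at": "2024-01-01T09:00:00",
}
in_memory: dict[str, Any] = {
    "path": None,
    "project_id": "12345",
    "region": "eu",
    "account": None,
    "tables": [],
    "size_mb": 0.0,
    "created_at": None,
}


def describe(info: dict[str, Any]) -> str:
    """Render a one-line summary, tolerating the nullable fields."""
    location = info["path"] if info["path"] is not None else "in-memory"
    account = (
        info["account"] if info["account"] is not None else "environment credentials"
    )
    return (
        f"project {info['project_id']} [{info['region']}] via {account}: "
        f"{len(info['tables'])} table(s), {info['size_mb']:.1f} MB at {location}"
    )


disk_line = describe(on_disk)
memory_line = describe(in_memory)
```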