Result Types¶

All result types are frozen (immutable) dataclasses with:

Lazy DataFrame conversion via the .df property (on types that hold tabular data)
JSON serialization via the .to_dict() method
Full type hints for IDE/mypy support

Fetch Results¶

mixpanel_data.FetchResult `dataclass` ¶

FetchResult(
    table: str,
    rows: int,
    type: Literal["events", "profiles"],
    duration_seconds: float,
    date_range: tuple[str, str] | None,
    fetched_at: datetime,
    _data: list[dict[str, Any]] = list(),
    _df_cache: DataFrame | None = None,
)

Result of a data fetch operation.

Represents the outcome of fetching events or profiles from Mixpanel and storing them in the local database.

table `instance-attribute` ¶

table: str

Name of the created table.

rows `instance-attribute` ¶

rows: int

Number of rows fetched.

type `instance-attribute` ¶

type: Literal['events', 'profiles']

Type of data fetched.

duration_seconds `instance-attribute` ¶

duration_seconds: float

Time taken to complete the fetch.

date_range `instance-attribute` ¶

date_range: tuple[str, str] | None

Date range for events (None for profiles).

fetched_at `instance-attribute` ¶

fetched_at: datetime

Timestamp when fetch completed.

df `property` ¶

df: DataFrame

Convert result data to pandas DataFrame.

Conversion is lazy: the DataFrame is computed on first access and then cached.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with fetched data.
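
Lazy caching on a frozen dataclass has to get around the ban on attribute assignment. The following is a minimal sketch of one common way to do that, using object.__setattr__. `LazyResult`, `rows_view`, `_cache`, and `build_log` are hypothetical names, and a plain list of tuples stands in for the pandas DataFrame so the example runs with the standard library only. The library's actual implementation may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

# Records each conversion, only to make the caching visible.
build_log: list[str] = []


@dataclass(frozen=True)
class LazyResult:
    """Hypothetical stand-in for a result type with a lazily built view."""

    _data: list[dict[str, Any]] = field(default_factory=list)
    _cache: list[tuple[Any, ...]] | None = field(default=None, compare=False)

    @property
    def rows_view(self) -> list[tuple[Any, ...]]:
        """Build the tabular view on first access, then reuse it."""
        if self._cache is None:
            build_log.append("built")
            built = [tuple(row.values()) for row in self._data]
            # Frozen dataclasses block normal assignment, so bypass it.
            object.__setattr__(self, "_cache", built)
        assert self._cache is not None
        return self._cache


result = LazyResult(_data=[{"event": "signup", "count": 3}])
first = result.rows_view   # triggers the conversion
second = result.rows_view  # served from the cache
```

Both accesses return the same object, and the conversion runs only once.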

to_dict ¶

to_dict() -> dict[str, Any]

Serialize result for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary representation (excludes raw data). datetime values are converted to ISO format strings.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize result for JSON output.

    Returns:
        Dictionary representation (excludes raw data).
        datetime values are converted to ISO format strings.
    """
    return {
        "table": self.table,
        "rows": self.rows,
        "type": self.type,
        "duration_seconds": self.duration_seconds,
        "date_range": self.date_range,
        "fetched_at": self.fetched_at.isoformat(),
    }
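
Since the dictionary contains only JSON-friendly values, it can be passed straight to json.dumps. The sketch below uses `FetchSummary`, a hypothetical stand-in that mirrors the public fields and the to_dict() body shown above. It is not the library class itself.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class FetchSummary:
    """Hypothetical stand-in mirroring FetchResult's public fields."""

    table: str
    rows: int
    type: str
    duration_seconds: float
    date_range: tuple[str, str] | None
    fetched_at: datetime

    def to_dict(self) -> dict[str, Any]:
        return {
            "table": self.table,
            "rows": self.rows,
            "type": self.type,
            "duration_seconds": self.duration_seconds,
            "date_range": self.date_range,
            "fetched_at": self.fetched_at.isoformat(),
        }


summary = FetchSummary(
    table="events",
    rows=1200,
    type="events",
    duration_seconds=4.2,
    date_range=("2024-01-01", "2024-01-31"),
    fetched_at=datetime(2024, 2, 1, 12, 0, tzinfo=timezone.utc),
)

payload = json.dumps(summary.to_dict())
decoded = json.loads(payload)
```

After a JSON round trip, the date_range tuple comes back as a list and fetched_at as an ISO 8601 string.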

Parallel Fetch Types¶

Types for parallel event fetching with progress tracking and failure handling.

mixpanel_data.ParallelFetchResult `dataclass` ¶

ParallelFetchResult(
    table: str,
    total_rows: int,
    successful_batches: int,
    failed_batches: int,
    failed_date_ranges: tuple[tuple[str, str], ...],
    duration_seconds: float,
    fetched_at: datetime,
)

Result of a parallel fetch operation.

Aggregates results from all batches, providing summary statistics and information about any failures for retry.

ATTRIBUTE	DESCRIPTION
`table`	Name of the created/appended table. TYPE: `str`
`total_rows`	Total number of rows fetched across all batches. TYPE: `int`
`successful_batches`	Number of batches that completed successfully. TYPE: `int`
`failed_batches`	Number of batches that failed. TYPE: `int`
`failed_date_ranges`	Date ranges (from_date, to_date) of failed batches. TYPE: `tuple[tuple[str, str], ...]`
`duration_seconds`	Total time taken for the parallel fetch. TYPE: `float`
`fetched_at`	Timestamp when fetch completed. TYPE: `datetime`

Example

result = ws.fetch_events(
    name="events",
    from_date="2024-01-01",
    to_date="2024-03-31",
    parallel=True,
)

if result.has_failures:
    print(f"Warning: {result.failed_batches} batches failed")
    for from_date, to_date in result.failed_date_ranges:
        print(f"  {from_date} to {to_date}")

table `instance-attribute` ¶

table: str

Name of the created/appended table.

total_rows `instance-attribute` ¶

total_rows: int

Total number of rows fetched across all batches.

successful_batches `instance-attribute` ¶

successful_batches: int

Number of batches that completed successfully.

failed_batches `instance-attribute` ¶

failed_batches: int

Number of batches that failed.

failed_date_ranges `instance-attribute` ¶

failed_date_ranges: tuple[tuple[str, str], ...]

Date ranges (from_date, to_date) of failed batches for retry.
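
One way to act on these ranges is a small retry loop. In this sketch, `retry_failed_ranges` and the `fetch_range` callable are hypothetical. How a single range is re-fetched against the workspace is left to the caller, and `fake_fetch` exists only to make the example runnable.

```python
from __future__ import annotations

from typing import Callable, Iterable


def retry_failed_ranges(
    failed_date_ranges: Iterable[tuple[str, str]],
    fetch_range: Callable[[str, str], bool],
) -> list[tuple[str, str]]:
    """Retry each failed range once and return the ranges that failed again."""
    still_failed: list[tuple[str, str]] = []
    for from_date, to_date in failed_date_ranges:
        # fetch_range is expected to return True on success.
        if not fetch_range(from_date, to_date):
            still_failed.append((from_date, to_date))
    return still_failed


def fake_fetch(from_date: str, to_date: str) -> bool:
    """Pretend that February ranges keep failing."""
    return not from_date.startswith("2024-02")


failed = (("2024-01-08", "2024-01-14"), ("2024-02-05", "2024-02-11"))
remaining = retry_failed_ranges(failed, fake_fetch)
```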

duration_seconds `instance-attribute` ¶

duration_seconds: float

Total time taken for the parallel fetch.

fetched_at `instance-attribute` ¶

fetched_at: datetime

Timestamp when fetch completed.

has_failures `property` ¶

has_failures: bool

Check if any batches failed.

RETURNS	DESCRIPTION
`bool`	True if at least one batch failed, False otherwise.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all result fields including has_failures.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all result fields including has_failures.
    """
    return {
        "table": self.table,
        "total_rows": self.total_rows,
        "successful_batches": self.successful_batches,
        "failed_batches": self.failed_batches,
        "failed_date_ranges": [list(dr) for dr in self.failed_date_ranges],
        "duration_seconds": self.duration_seconds,
        "fetched_at": self.fetched_at.isoformat(),
        "has_failures": self.has_failures,
    }

mixpanel_data.BatchProgress `dataclass` ¶

BatchProgress(
    from_date: str,
    to_date: str,
    batch_index: int,
    total_batches: int,
    rows: int,
    success: bool,
    error: str | None = None,
)

Progress update for a parallel fetch batch.

Sent to the on_batch_complete callback when a batch finishes (successfully or with error).

ATTRIBUTE	DESCRIPTION
`from_date`	Start date of this batch (YYYY-MM-DD). TYPE: `str`
`to_date`	End date of this batch (YYYY-MM-DD). TYPE: `str`
`batch_index`	Zero-based index of this batch. TYPE: `int`
`total_batches`	Total number of batches in the parallel fetch. TYPE: `int`
`rows`	Number of rows fetched in this batch (0 if failed). TYPE: `int`
`success`	Whether this batch completed successfully. TYPE: `bool`
`error`	Error message if failed, None if successful. TYPE: `str \| None`

Example

def on_batch(progress: BatchProgress) -> None:
    status = "✓" if progress.success else "✗"
    print(f"[{status}] Batch {progress.batch_index + 1}/{progress.total_batches}")

result = ws.fetch_events(
    name="events",
    from_date="2024-01-01",
    to_date="2024-03-31",
    parallel=True,
    on_batch_complete=on_batch,
)
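
A callback can also accumulate state instead of printing. The sketch below uses a callable object for that. `Progress` is a hypothetical stand-in with the same fields as BatchProgress, and `ProgressTracker` is an illustrative name. This page does not say whether callbacks run on worker threads, so the lock is a precaution, not a documented requirement.

```python
from __future__ import annotations

import threading
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Progress:
    """Hypothetical stand-in with the same fields as BatchProgress."""

    from_date: str
    to_date: str
    batch_index: int
    total_batches: int
    rows: int
    success: bool
    error: str | None = None


@dataclass
class ProgressTracker:
    """Collects totals and failures across batch callbacks."""

    completed: int = 0
    rows: int = 0
    errors: list[tuple[str, str, str]] = field(default_factory=list)
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def __call__(self, progress: Progress) -> None:
        # Guard the counters in case callbacks arrive concurrently.
        with self._lock:
            self.completed += 1
            self.rows += progress.rows
            if not progress.success:
                message = progress.error or "unknown error"
                self.errors.append((progress.from_date, progress.to_date, message))


tracker = ProgressTracker()
tracker(Progress("2024-01-01", "2024-01-07", 0, 2, 500, True))
tracker(Progress("2024-01-08", "2024-01-14", 1, 2, 0, False, "timeout"))
```

An instance like `tracker` could then be passed wherever a plain function callback is accepted.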

from_date `instance-attribute` ¶

from_date: str

Start date of this batch (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

End date of this batch (YYYY-MM-DD).

batch_index `instance-attribute` ¶

batch_index: int

Zero-based index of this batch.

total_batches `instance-attribute` ¶

total_batches: int

Total number of batches in the parallel fetch.

rows `instance-attribute` ¶

rows: int

Number of rows fetched in this batch (0 if failed).

success `instance-attribute` ¶

success: bool

Whether this batch completed successfully.

error `class-attribute` `instance-attribute` ¶

error: str | None = None

Error message if failed, None if successful.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all batch progress fields.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all batch progress fields.
    """
    return {
        "from_date": self.from_date,
        "to_date": self.to_date,
        "batch_index": self.batch_index,
        "total_batches": self.total_batches,
        "rows": self.rows,
        "success": self.success,
        "error": self.error,
    }

mixpanel_data.BatchResult `dataclass` ¶

BatchResult(
    from_date: str,
    to_date: str,
    rows: int,
    success: bool,
    error: str | None = None,
)

Result of fetching a single date range chunk.

Internal type used by ParallelFetcherService to track batch outcomes. Contains either the fetched data (on success) or error info (on failure).

ATTRIBUTE	DESCRIPTION
`from_date`	Start date of this batch (YYYY-MM-DD). TYPE: `str`
`to_date`	End date of this batch (YYYY-MM-DD). TYPE: `str`
`rows`	Number of rows fetched (0 if failed). TYPE: `int`
`success`	Whether the batch completed successfully. TYPE: `bool`
`error`	Exception message if failed, None if successful. TYPE: `str \| None`

Note

Data is not included in to_dict() as it's consumed by the writer thread and is not JSON-serializable (iterator of dicts).

from_date `instance-attribute` ¶

from_date: str

Start date of this batch (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

End date of this batch (YYYY-MM-DD).

rows `instance-attribute` ¶

rows: int

Number of rows fetched (0 if failed).

success `instance-attribute` ¶

success: bool

Whether the batch completed successfully.

error `class-attribute` `instance-attribute` ¶

error: str | None = None

Exception message if failed, None if successful.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output (excludes data).

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with batch result fields (excluding data).

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output (excludes data).

    Returns:
        Dictionary with batch result fields (excluding data).
    """
    return {
        "from_date": self.from_date,
        "to_date": self.to_date,
        "rows": self.rows,
        "success": self.success,
        "error": self.error,
    }

Parallel Profile Fetch Types¶

Types for parallel profile fetching with page-based progress tracking.

mixpanel_data.ParallelProfileResult `dataclass` ¶

ParallelProfileResult(
    table: str,
    total_rows: int,
    successful_pages: int,
    failed_pages: int,
    failed_page_indices: tuple[int, ...],
    duration_seconds: float,
    fetched_at: datetime,
)

Result of a parallel profile fetch operation.

Aggregates results from all pages, providing summary statistics and information about any failures for retry.

ATTRIBUTE	DESCRIPTION
`table`	Name of the created/appended table. TYPE: `str`
`total_rows`	Total number of rows fetched across all pages. TYPE: `int`
`successful_pages`	Number of pages that completed successfully. TYPE: `int`
`failed_pages`	Number of pages that failed. TYPE: `int`
`failed_page_indices`	Page indices of failed pages for retry. TYPE: `tuple[int, ...]`
`duration_seconds`	Total time taken for the parallel fetch. TYPE: `float`
`fetched_at`	Timestamp when fetch completed. TYPE: `datetime`

Example

result = ws.fetch_profiles(
    name="users",
    parallel=True,
)

if result.has_failures:
    print(f"Warning: {result.failed_pages} pages failed")
    for idx in result.failed_page_indices:
        print(f"  Page {idx}")

table `instance-attribute` ¶

table: str

Name of the created/appended table.

total_rows `instance-attribute` ¶

total_rows: int

Total number of rows fetched across all pages.

successful_pages `instance-attribute` ¶

successful_pages: int

Number of pages that completed successfully.

failed_pages `instance-attribute` ¶

failed_pages: int

Number of pages that failed.

failed_page_indices `instance-attribute` ¶

failed_page_indices: tuple[int, ...]

Page indices of failed pages for retry.

duration_seconds `instance-attribute` ¶

duration_seconds: float

Total time taken for the parallel fetch.

fetched_at `instance-attribute` ¶

fetched_at: datetime

Timestamp when fetch completed.

has_failures `property` ¶

has_failures: bool

Check if any pages failed.

RETURNS	DESCRIPTION
`bool`	True if at least one page failed, False otherwise.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all result fields including has_failures.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all result fields including has_failures.
    """
    return {
        "table": self.table,
        "total_rows": self.total_rows,
        "successful_pages": self.successful_pages,
        "failed_pages": self.failed_pages,
        "failed_page_indices": list(self.failed_page_indices),
        "duration_seconds": self.duration_seconds,
        "fetched_at": self.fetched_at.isoformat(),
        "has_failures": self.has_failures,
    }

mixpanel_data.ProfileProgress `dataclass` ¶

ProfileProgress(
    page_index: int,
    total_pages: int | None,
    rows: int,
    success: bool,
    error: str | None,
    cumulative_rows: int,
)

Progress update for a parallel profile fetch page.

Sent to the on_page_complete callback when a page finishes (successfully or with error). Used for progress visibility during parallel profile fetching operations.

ATTRIBUTE	DESCRIPTION
`page_index`	Zero-based index of this page. TYPE: `int`
`total_pages`	Total pages if known, None if not yet determined. TYPE: `int \| None`
`rows`	Number of rows fetched in this page (0 if failed). TYPE: `int`
`success`	Whether this page completed successfully. TYPE: `bool`
`error`	Error message if failed, None if successful. TYPE: `str \| None`
`cumulative_rows`	Total rows fetched so far across all pages. TYPE: `int`

Example

def on_page(progress: ProfileProgress) -> None:
    status = "✓" if progress.success else "✗"
    pct = f"{progress.page_index + 1}/{progress.total_pages}" if progress.total_pages else f"{progress.page_index + 1}/?"
    print(f"[{status}] Page {pct}: {progress.cumulative_rows} total rows")

result = ws.fetch_profiles(
    name="users",
    parallel=True,
    on_page_complete=on_page,
)

page_index `instance-attribute` ¶

page_index: int

Zero-based index of this page.

total_pages `instance-attribute` ¶

total_pages: int | None

Total pages if known, None if not yet determined.

rows `instance-attribute` ¶

rows: int

Number of rows fetched in this page (0 if failed).

success `instance-attribute` ¶

success: bool

Whether this page completed successfully.

error `instance-attribute` ¶

error: str | None

Error message if failed, None if successful.

cumulative_rows `instance-attribute` ¶

cumulative_rows: int

Total rows fetched so far across all pages.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all profile progress fields.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all profile progress fields.
    """
    return {
        "page_index": self.page_index,
        "total_pages": self.total_pages,
        "rows": self.rows,
        "success": self.success,
        "error": self.error,
        "cumulative_rows": self.cumulative_rows,
    }

mixpanel_data.ProfilePageResult `dataclass` ¶

ProfilePageResult(
    profiles: list[dict[str, Any]],
    session_id: str | None,
    page: int,
    has_more: bool,
    total: int,
    page_size: int,
)

Result from fetching a single page of profiles.

Contains the profiles from one page of the Engage API along with pagination metadata for fetching subsequent pages.

ATTRIBUTE	DESCRIPTION
`profiles`	List of profile dictionaries from this page. TYPE: `list[dict[str, Any]]`
`session_id`	Session ID for fetching next page, None if no more pages. TYPE: `str \| None`
`page`	Zero-based page index that was fetched. TYPE: `int`
`has_more`	True if there are more pages to fetch. TYPE: `bool`
`total`	Total number of profiles matching the query across all pages. TYPE: `int`
`page_size`	Number of profiles per page (typically 1000). TYPE: `int`

Example

# Fetch first page to get pagination metadata
result = api_client.export_profiles_page(page=0)
all_profiles = list(result.profiles)

# Pre-compute total pages for parallel fetching
total_pages = result.num_pages
print(f"Fetching {total_pages} pages ({result.total} profiles)")

# Continue fetching if more pages
while result.has_more:
    result = api_client.export_profiles_page(
        page=result.page + 1,
        session_id=result.session_id,
    )
    all_profiles.extend(result.profiles)

profiles `instance-attribute` ¶

profiles: list[dict[str, Any]]

List of profile dictionaries from this page.

session_id `instance-attribute` ¶

session_id: str | None

Session ID for fetching next page, None if no more pages.

page `instance-attribute` ¶

page: int

Zero-based page index that was fetched.

has_more `instance-attribute` ¶

has_more: bool

True if there are more pages to fetch.

total `instance-attribute` ¶

total: int

Total number of profiles matching the query across all pages.

page_size `instance-attribute` ¶

page_size: int

Number of profiles per page (typically 1000).

num_pages `property` ¶

num_pages: int

Calculate total number of pages needed.

Uses ceiling division to ensure partial pages are counted.

RETURNS	DESCRIPTION
`int`	Total pages needed to fetch all profiles. Returns 0 if total is 0 (empty result set).

Example

result = api_client.export_profiles_page(page=0)
# If total=5432 and page_size=1000, num_pages=6
for page_idx in range(1, result.num_pages):
    ...  # Fetch remaining pages
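
The ceiling division behind num_pages can be written with integer arithmetic alone. The standalone function below is a sketch of that calculation, not the library's source. The name `pages_needed` is hypothetical.

```python
def pages_needed(total: int, page_size: int) -> int:
    """Ceiling division: count a partially filled last page as a full page."""
    if total <= 0:
        return 0
    # Adding page_size - 1 rounds any remainder up to the next page.
    return (total + page_size - 1) // page_size


example = pages_needed(5432, 1000)  # five full pages plus one partial page
```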

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all page result fields including pagination metadata.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all page result fields including pagination metadata.
    """
    return {
        "profiles": self.profiles,
        "session_id": self.session_id,
        "page": self.page,
        "has_more": self.has_more,
        "profile_count": len(self.profiles),
        "total": self.total,
        "page_size": self.page_size,
        "num_pages": self.num_pages,
    }

Query Results¶

mixpanel_data.SegmentationResult `dataclass` ¶

SegmentationResult(
    event: str,
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    segment_property: str | None,
    total: int,
    series: dict[str, dict[str, int]] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Result of a segmentation query.

Contains time-series data for an event, optionally segmented by a property.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

event `instance-attribute` ¶

event: str

Queried event name.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

unit `instance-attribute` ¶

unit: Literal['day', 'week', 'month']

Time unit for aggregation.

segment_property `instance-attribute` ¶

segment_property: str | None

Property used for segmentation (None if total only).

total `instance-attribute` ¶

total: int

Total count across all segments and time periods.

series `class-attribute` `instance-attribute` ¶

series: dict[str, dict[str, int]] = field(default_factory=dict)

Time series data by segment.

Structure: {segment_name: {date_string: count}}

Example: {"US": {"2024-01-01": 150, "2024-01-02": 200}, "EU": {...}}

For unsegmented queries, segment_name is "total".

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: date, segment, count.

For unsegmented queries, segment column is 'total'.
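
To show how the nested series mapping relates to the date, segment, count columns, the sketch below flattens it into one row per (segment, date) pair. `series_to_rows` is a hypothetical helper. The row order and dtypes of the library's real DataFrame are not specified here and may differ.

```python
from __future__ import annotations


def series_to_rows(series: dict[str, dict[str, int]]) -> list[dict[str, object]]:
    """Flatten {segment: {date: count}} into one row per segment and date."""
    rows: list[dict[str, object]] = []
    for segment, counts in series.items():
        for date, count in counts.items():
            rows.append({"date": date, "segment": segment, "count": count})
    return rows


series = {
    "US": {"2024-01-01": 150, "2024-01-02": 200},
    "EU": {"2024-01-01": 90},
}
rows = series_to_rows(series)
```

Passing a list of dicts like `rows` to pandas.DataFrame would give a frame with those three columns.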

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "segment_property": self.segment_property,
        "total": self.total,
        "series": self.series,
    }

mixpanel_data.FunnelResult `dataclass` ¶

FunnelResult(
    funnel_id: int,
    funnel_name: str,
    from_date: str,
    to_date: str,
    conversion_rate: float,
    steps: list[FunnelStep] = list(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Result of a funnel query.

Contains step-by-step conversion data for a funnel.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

funnel_id `instance-attribute` ¶

funnel_id: int

Funnel identifier.

funnel_name `instance-attribute` ¶

funnel_name: str

Funnel display name.

from_date `instance-attribute` ¶

from_date: str

Query start date.

to_date `instance-attribute` ¶

to_date: str

Query end date.

conversion_rate `instance-attribute` ¶

conversion_rate: float

Overall conversion rate (0.0 to 1.0).

steps `class-attribute` `instance-attribute` ¶

steps: list[FunnelStep] = field(default_factory=list)

Step-by-step breakdown.

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: step, event, count, conversion_rate.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "funnel_id": self.funnel_id,
        "funnel_name": self.funnel_name,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "conversion_rate": self.conversion_rate,
        "steps": [step.to_dict() for step in self.steps],
    }

mixpanel_data.FunnelStep `dataclass` ¶

FunnelStep(event: str, count: int, conversion_rate: float)

Single step in a funnel.

event `instance-attribute` ¶

event: str

Event name for this step.

count `instance-attribute` ¶

count: int

Number of users at this step.

conversion_rate `instance-attribute` ¶

conversion_rate: float

Conversion rate from previous step (0.0 to 1.0).
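
As a worked example of step-to-step rates, the sketch below derives them from raw step counts. Two conventions here are assumptions rather than documented behaviour: the first step gets a rate of 1.0, and the overall rate is the last count divided by the first. `step_rates` is a hypothetical helper.

```python
from __future__ import annotations


def step_rates(counts: list[int]) -> list[float]:
    """Conversion rate of each step relative to the step before it."""
    rates: list[float] = []
    for index, count in enumerate(counts):
        if index == 0:
            rates.append(1.0)  # assumed convention for the entry step
        else:
            previous = counts[index - 1]
            rates.append(count / previous if previous else 0.0)
    return rates


counts = [1000, 400, 100]
rates = step_rates(counts)
overall = counts[-1] / counts[0]  # assumed definition of the overall rate
```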

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "count": self.count,
        "conversion_rate": self.conversion_rate,
    }

mixpanel_data.RetentionResult `dataclass` ¶

RetentionResult(
    born_event: str,
    return_event: str,
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    cohorts: list[CohortInfo] = list(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Result of a retention query.

Contains cohort-based retention data.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

born_event `instance-attribute` ¶

born_event: str

Event that defines cohort membership.

return_event `instance-attribute` ¶

return_event: str

Event that defines return.

from_date `instance-attribute` ¶

from_date: str

Query start date.

to_date `instance-attribute` ¶

to_date: str

Query end date.

unit `instance-attribute` ¶

unit: Literal['day', 'week', 'month']

Time unit for retention periods.

cohorts `class-attribute` `instance-attribute` ¶

cohorts: list[CohortInfo] = field(default_factory=list)

Cohort retention data.

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: cohort_date, cohort_size, period_N.
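
The period_N columns can be pictured as one column per retention period. The sketch below builds such rows from dictionaries shaped like CohortInfo.to_dict() output. `cohorts_to_rows` is a hypothetical helper, and numbering periods from 0 is an assumption.

```python
from __future__ import annotations

from typing import Any


def cohorts_to_rows(cohorts: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Spread each cohort's retention list across period_N keys."""
    rows: list[dict[str, Any]] = []
    for cohort in cohorts:
        row: dict[str, Any] = {
            "cohort_date": cohort["date"],
            "cohort_size": cohort["size"],
        }
        for period, rate in enumerate(cohort["retention"]):
            row[f"period_{period}"] = rate
        rows.append(row)
    return rows


cohorts = [
    {"date": "2024-01-01", "size": 200, "retention": [1.0, 0.5]},
    {"date": "2024-01-02", "size": 180, "retention": [1.0]},
]
rows = cohorts_to_rows(cohorts)
```

Newer cohorts have fewer elapsed periods, so their rows carry fewer period_N keys.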

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "born_event": self.born_event,
        "return_event": self.return_event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "cohorts": [cohort.to_dict() for cohort in self.cohorts],
    }

mixpanel_data.CohortInfo `dataclass` ¶

CohortInfo(date: str, size: int, retention: list[float] = list())

Retention data for a single cohort.

date `instance-attribute` ¶

date: str

Cohort date (when users were 'born').

size `instance-attribute` ¶

size: int

Number of users in cohort.

retention `class-attribute` `instance-attribute` ¶

retention: list[float] = field(default_factory=list)

Retention rates by period, expressed as fractions (0.0 to 1.0).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "date": self.date,
        "size": self.size,
        "retention": self.retention,
    }

mixpanel_data.JQLResult `dataclass` ¶

JQLResult(_raw: list[Any] = list(), *, _df_cache: DataFrame | None = None)

Bases: ResultWithDataFrame

Result of a JQL query.

JQL (JavaScript Query Language) allows custom queries against Mixpanel data.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

The df property intelligently detects JQL result patterns (groupBy, percentiles, simple dicts) and converts them to clean tabular format.

raw `property` ¶

raw: list[Any]

Raw result data from JQL execution.

df `property` ¶

df: DataFrame

Convert result to DataFrame with intelligent structure detection.

The conversion strategy depends on the detected JQL result pattern:

groupBy results (detected by {key: [...], value: X} structure):

- Keys expanded to columns: key_0, key_1, key_2, ...
- Single value: "value" column
- Multiple reducers (value array): value_0, value_1, value_2, ...
- Additional fields (from .map()): preserved as-is
- Example: {"key": ["US"], "value": 100, "name": "USA"} -> columns: key_0, value, name

Nested percentile results ([[{percentile: X, value: Y}, ...]]):

- Outer list unwrapped, inner dicts converted directly

Simple list of dicts (already well-structured):

- Converted directly to DataFrame preserving all fields

Fallback for other structures (scalars, mixed types, incompatible dicts):

- Safely wrapped in single "value" column to prevent data loss
- Used when structure doesn't match known patterns

RAISES	DESCRIPTION
`ValueError`	If groupBy structure has inconsistent value types across rows (some scalar, some array) which indicates malformed query results.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame representation, cached after first access.
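
The groupBy expansion rules can be illustrated with a small standalone function. It is a sketch of the column naming described for the df property, not the library's detection logic. `expand_group_by` is a hypothetical name, and it returns plain dicts instead of a DataFrame.

```python
from __future__ import annotations

from typing import Any


def expand_group_by(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Flatten {key: [...], value: X} rows into key_N / value columns."""
    kinds = {isinstance(row["value"], list) for row in rows}
    if len(kinds) > 1:
        # Some rows have a scalar value and others an array.
        raise ValueError("inconsistent value types across groupBy rows")

    flattened: list[dict[str, Any]] = []
    for row in rows:
        flat: dict[str, Any] = {f"key_{i}": k for i, k in enumerate(row["key"])}
        value = row["value"]
        if isinstance(value, list):
            # Multiple reducers: one column per reducer result.
            flat.update({f"value_{i}": v for i, v in enumerate(value)})
        else:
            flat["value"] = value
        # Extra fields (for example from .map()) are kept unchanged.
        flat.update({k: v for k, v in row.items() if k not in ("key", "value")})
        flattened.append(flat)
    return flattened


single = expand_group_by([{"key": ["US"], "value": 100, "name": "USA"}])
multi = expand_group_by([{"key": ["US", "web"], "value": [3, 4.5]}])
```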

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "raw": self._raw,
        "row_count": len(self._raw),
    }

Discovery Types¶

mixpanel_data.FunnelInfo `dataclass` ¶

FunnelInfo(funnel_id: int, name: str)

A saved funnel definition.

Represents a funnel saved in Mixpanel that can be queried using the funnel() method.

funnel_id `instance-attribute` ¶

funnel_id: int

Unique identifier for funnel queries.

name `instance-attribute` ¶

name: str

Human-readable funnel name.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "funnel_id": self.funnel_id,
        "name": self.name,
    }

mixpanel_data.SavedCohort `dataclass` ¶

SavedCohort(
    id: int,
    name: str,
    count: int,
    description: str,
    created: str,
    is_visible: bool,
)

A saved cohort definition.

Represents a user cohort saved in Mixpanel for profile filtering.

id `instance-attribute` ¶

id: int

Unique identifier for profile filtering.

name `instance-attribute` ¶

name: str

Human-readable cohort name.

count `instance-attribute` ¶

count: int

Current number of users in cohort.

description `instance-attribute` ¶

description: str

Optional description (may be empty string).

created `instance-attribute` ¶

created: str

Creation timestamp (YYYY-MM-DD HH:mm:ss).

is_visible `instance-attribute` ¶

is_visible: bool

Whether cohort is visible in Mixpanel UI.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "id": self.id,
        "name": self.name,
        "count": self.count,
        "description": self.description,
        "created": self.created,
        "is_visible": self.is_visible,
    }

mixpanel_data.TopEvent `dataclass` ¶

TopEvent(event: str, count: int, percent_change: float)

Today's event activity data.

Represents an event's current activity including count and trend.

event `instance-attribute` ¶

event: str

Event name.

count `instance-attribute` ¶

count: int

Today's event count.

percent_change `instance-attribute` ¶

percent_change: float

Change vs yesterday (-1.0 to +infinity).
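
The stated range (-1.0 to +infinity) is consistent with a fractional change relative to yesterday's count. The formula below is an assumption made for illustration, not something this page confirms. `fractional_change` is a hypothetical helper.

```python
def fractional_change(today: int, yesterday: int) -> float:
    """Assumed formula: change relative to yesterday's count."""
    if yesterday == 0:
        # The ratio is undefined with no baseline; the sketch refuses to guess.
        raise ValueError("change is undefined when yesterday's count is 0")
    return (today - yesterday) / yesterday


growth = fractional_change(150, 100)   # count rose by half
collapse = fractional_change(0, 100)   # count dropped to zero
```

Under this reading, -1.0 means the event stopped entirely and 0.5 means a 50% increase.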

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "count": self.count,
        "percent_change": self.percent_change,
    }

Lexicon Types¶

mixpanel_data.LexiconSchema `dataclass` ¶

LexiconSchema(entity_type: str, name: str, schema_json: LexiconDefinition)

Complete schema definition from Mixpanel Lexicon.

Represents a documented event or profile property definition from the Mixpanel data dictionary.

entity_type `instance-attribute` ¶

entity_type: str

Type of entity (e.g., 'event', 'profile', 'custom_event', 'group', etc.).

name `instance-attribute` ¶

name: str

Name of the event or profile property.

schema_json `instance-attribute` ¶

schema_json: LexiconDefinition

Full schema definition.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with entity_type, name, and schema_json.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with entity_type, name, and schema_json.
    """
    return {
        "entity_type": self.entity_type,
        "name": self.name,
        "schema_json": self.schema_json.to_dict(),
    }

mixpanel_data.LexiconDefinition `dataclass` ¶

LexiconDefinition(
    description: str | None,
    properties: dict[str, LexiconProperty],
    metadata: LexiconMetadata | None,
)

Full schema definition for an event or profile property in Lexicon.

Contains the structural definition including description, properties, and platform-specific metadata.

description `instance-attribute` ¶

description: str | None

Human-readable description of the entity.

properties `instance-attribute` ¶

properties: dict[str, LexiconProperty]

Property definitions keyed by property name.

metadata `instance-attribute` ¶

metadata: LexiconMetadata | None

Optional Mixpanel-specific metadata for the entity.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with properties, and optionally description and metadata.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with properties, and optionally description and metadata.
    """
    result: dict[str, Any] = {
        "properties": {k: v.to_dict() for k, v in self.properties.items()},
    }
    if self.description is not None:
        result["description"] = self.description
    if self.metadata is not None:
        result["metadata"] = self.metadata.to_dict()
    return result
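
Because description and metadata are omitted when they are None, code that reads the serialized form should not assume those keys exist. The sketch below is a consumer-side example. `describe` and the sample dictionaries are hypothetical.

```python
from __future__ import annotations

from typing import Any


def describe(definition: dict[str, Any]) -> str:
    """Summarize a serialized definition, tolerating missing optional keys."""
    description = definition.get("description", "(no description)")
    names = ", ".join(sorted(definition["properties"]))
    return f"{description}: {names}"


documented = {
    "description": "User completed signup",
    "properties": {"plan": {"type": "string"}, "age": {"type": "integer"}},
}
undocumented = {"properties": {"plan": {"type": "string"}}}

with_text = describe(documented)
without_text = describe(undocumented)
```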

mixpanel_data.LexiconProperty `dataclass` ¶

LexiconProperty(
    type: str, description: str | None, metadata: LexiconMetadata | None
)

Schema definition for a single property in a Lexicon schema.

Describes the type and metadata for an event or profile property.

type `instance-attribute` ¶

type: str

JSON Schema type (string, number, boolean, array, object, integer, null).

description `instance-attribute` ¶

description: str | None

Human-readable description of the property.

metadata `instance-attribute` ¶

metadata: LexiconMetadata | None

Optional Mixpanel-specific metadata.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with type, and optionally description and metadata.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with type, and optionally description and metadata.
    """
    result: dict[str, Any] = {"type": self.type}
    if self.description is not None:
        result["description"] = self.description
    if self.metadata is not None:
        result["metadata"] = self.metadata.to_dict()
    return result
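
The effect of the optional-key handling above can be seen in a minimal sketch. `PropertySketch` is a hypothetical stand-in that mirrors the documented fields of LexiconProperty; it is not the library class, and its metadata field is simplified to a plain dict.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class PropertySketch:
    """Hypothetical stand-in mirroring the documented LexiconProperty fields."""

    type: str
    description: Optional[str] = None
    # The documented field holds a LexiconMetadata; a dict keeps the sketch short.
    metadata: Optional[dict[str, Any]] = None

    def to_dict(self) -> dict[str, Any]:
        # "type" is always present; optional keys appear only when not None.
        result: dict[str, Any] = {"type": self.type}
        if self.description is not None:
            result["description"] = self.description
        if self.metadata is not None:
            result["metadata"] = self.metadata
        return result


bare = PropertySketch(type="string").to_dict()
described = PropertySketch(type="number", description="Order total").to_dict()

print(bare)       # {'type': 'string'}
print(described)  # {'type': 'number', 'description': 'Order total'}
```

Consumers of the serialized form should therefore test for key presence (for example with `dict.get`) rather than expect a `None` value.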

mixpanel_data.LexiconMetadata `dataclass` ¶

LexiconMetadata(
    source: str | None,
    display_name: str | None,
    tags: list[str],
    hidden: bool,
    dropped: bool,
    contacts: list[str],
    team_contacts: list[str],
)

Mixpanel-specific metadata for Lexicon schemas and properties.

Contains platform-specific information about how schemas and properties are displayed and organized in the Mixpanel UI.

source `instance-attribute` ¶

source: str | None

Origin of the schema definition (e.g., 'api', 'csv', 'ui').

display_name `instance-attribute` ¶

display_name: str | None

Human-readable display name in Mixpanel UI.

tags `instance-attribute` ¶

tags: list[str]

Categorization tags for organization.

hidden `instance-attribute` ¶

hidden: bool

Whether hidden from Mixpanel UI.

dropped `instance-attribute` ¶

dropped: bool

Whether data is dropped/ignored.

contacts `instance-attribute` ¶

contacts: list[str]

Owner email addresses.

team_contacts `instance-attribute` ¶

team_contacts: list[str]

Team ownership labels.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all metadata fields.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all metadata fields.
    """
    return {
        "source": self.source,
        "display_name": self.display_name,
        "tags": self.tags,
        "hidden": self.hidden,
        "dropped": self.dropped,
        "contacts": self.contacts,
        "team_contacts": self.team_contacts,
    }

Event Analytics Results¶

mixpanel_data.EventCountsResult `dataclass` ¶

EventCountsResult(
    events: list[str],
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    type: Literal["general", "unique", "average"],
    series: dict[str, dict[str, int]],
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Time-series event count data.

Contains aggregate counts for multiple events over time with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

events `instance-attribute` ¶

events: list[str]

Queried event names.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

unit `instance-attribute` ¶

unit: Literal['day', 'week', 'month']

Time unit for aggregation.

type `instance-attribute` ¶

type: Literal['general', 'unique', 'average']

Counting method used.

series `instance-attribute` ¶

series: dict[str, dict[str, int]]

Time series data: {event_name: {date: count}}.

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: date, event, count.

Conversion is lazy - computed on first access and cached.
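
The relationship between the nested series mapping and the date, event, count columns can be sketched with plain Python. The helper name `series_to_rows` and the sample data are illustrative assumptions, and the sort order is a choice made for this sketch; the library's own conversion may order rows differently.

```python
from typing import Any


def series_to_rows(series: dict[str, dict[str, int]]) -> list[dict[str, Any]]:
    """Flatten {event_name: {date: count}} into one row per (date, event)."""
    rows = [
        {"date": date, "event": event, "count": count}
        for event, by_date in series.items()
        for date, count in by_date.items()
    ]
    # Sorting is only for readable output in this sketch.
    return sorted(rows, key=lambda row: (row["date"], row["event"]))


sample_series = {
    "Signup": {"2024-01-01": 10, "2024-01-02": 12},
    "Purchase": {"2024-01-01": 3},
}
rows = series_to_rows(sample_series)
for row in rows:
    print(row)
```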

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "events": self.events,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "type": self.type,
        "series": self.series,
    }

mixpanel_data.PropertyCountsResult `dataclass` ¶

PropertyCountsResult(
    event: str,
    property_name: str,
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    type: Literal["general", "unique", "average"],
    series: dict[str, dict[str, int]],
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Time-series property value distribution data.

Contains aggregate counts by property values over time with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

event `instance-attribute` ¶

event: str

Queried event name.

property_name `instance-attribute` ¶

property_name: str

Property used for segmentation.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

unit `instance-attribute` ¶

unit: Literal['day', 'week', 'month']

Time unit for aggregation.

type `instance-attribute` ¶

type: Literal['general', 'unique', 'average']

Counting method used.

series `instance-attribute` ¶

series: dict[str, dict[str, int]]

Time series data by property value.

Structure: {property_value: {date: count}}

Example: {"US": {"2024-01-01": 150, "2024-01-02": 200}, "EU": {...}}

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: date, value, count.

Conversion is lazy - computed on first access and cached.
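
Because series is keyed by property value first, per-value totals over the whole date range can be computed without building a DataFrame. This is a minimal sketch on a plain dict; `totals_by_value` is a hypothetical helper and the "EU" numbers are made up for illustration.

```python
def totals_by_value(series: dict[str, dict[str, int]]) -> dict[str, int]:
    """Sum the per-date counts for each property value."""
    return {value: sum(by_date.values()) for value, by_date in series.items()}


sample_series = {
    "US": {"2024-01-01": 150, "2024-01-02": 200},
    "EU": {"2024-01-01": 90, "2024-01-02": 110},  # illustrative values
}
totals = totals_by_value(sample_series)
print(totals)  # {'US': 350, 'EU': 200}
```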

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "property_name": self.property_name,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "type": self.type,
        "series": self.series,
    }

Advanced Query Results¶

mixpanel_data.UserEvent `dataclass` ¶

UserEvent(event: str, time: datetime, properties: dict[str, Any] = dict())

Single event in a user's activity feed.

Represents one event from a user's event history with timestamp and all associated properties.

event `instance-attribute` ¶

event: str

Event name.

time `instance-attribute` ¶

time: datetime

Event timestamp (UTC).

properties `class-attribute` `instance-attribute` ¶

properties: dict[str, Any] = field(default_factory=dict)

All event properties including system properties.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "time": self.time.isoformat(),
        "properties": self.properties,
    }
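
Since to_dict() emits time through `datetime.isoformat()`, the value survives a JSON round trip and can be parsed back with `datetime.fromisoformat()`. The sketch below uses a timezone-aware UTC datetime; whether the library's timestamps carry tzinfo is an assumption here, and the payload is built by hand rather than from a real UserEvent.

```python
import json
from datetime import datetime, timezone

# Assumed: a timezone-aware UTC timestamp, as suggested by "Event timestamp (UTC)".
event_time = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)

# Same shape as the documented to_dict() output.
payload = {
    "event": "Signup",
    "time": event_time.isoformat(),
    "properties": {"plan": "pro"},
}

encoded = json.dumps(payload)
restored = datetime.fromisoformat(json.loads(encoded)["time"])

print(payload["time"])         # 2024-01-15T09:30:00+00:00
print(restored == event_time)  # True
```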

mixpanel_data.ActivityFeedResult `dataclass` ¶

ActivityFeedResult(
    distinct_ids: list[str],
    from_date: str | None,
    to_date: str | None,
    events: list[UserEvent] = list(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Collection of user events from activity feed query.

Contains chronological event history for one or more users with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

distinct_ids `instance-attribute` ¶

distinct_ids: list[str]

Queried user identifiers.

from_date `instance-attribute` ¶

from_date: str | None

Start date filter (YYYY-MM-DD), None if not specified.

to_date `instance-attribute` ¶

to_date: str | None

End date filter (YYYY-MM-DD), None if not specified.

events `class-attribute` `instance-attribute` ¶

events: list[UserEvent] = field(default_factory=list)

Event history (chronological order).

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: event, time, distinct_id, plus one column per event property.

Flattens event properties into individual columns. Conversion is lazy - computed on first access and cached.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "distinct_ids": self.distinct_ids,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "event_count": len(self.events),
        "events": [e.to_dict() for e in self.events],
    }

mixpanel_data.FrequencyResult `dataclass` ¶

FrequencyResult(
    event: str | None,
    from_date: str,
    to_date: str,
    unit: Literal["day", "week", "month"],
    addiction_unit: Literal["hour", "day"],
    data: dict[str, list[int]] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Event frequency distribution (addiction analysis).

Contains frequency arrays showing how many users performed events in N time periods, with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

event `instance-attribute` ¶

event: str | None

Filtered event name (None = all events).

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

unit `instance-attribute` ¶

unit: Literal['day', 'week', 'month']

Overall time period.

addiction_unit `instance-attribute` ¶

addiction_unit: Literal['hour', 'day']

Measurement granularity.

data `class-attribute` `instance-attribute` ¶

data: dict[str, list[int]] = field(default_factory=dict)

Frequency arrays by date.

Structure: {date: [count_1, count_2, ...]}

Example: {"2024-01-01": [100, 50, 25, 10]}

Each array shows user counts by frequency:

- Index 0: users active exactly 1 time
- Index 1: users active exactly 2 times
- Index N: users active exactly N+1 times

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: date, period_1, period_2, ...

Each period_N column shows users active in at least N time periods. Conversion is lazy - computed on first access and cached.
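
The mapping from the data arrays to the date, period_1, period_2, ... columns can be sketched as follows. `frequency_rows` is a hypothetical helper operating on a plain dict; it only shows how array positions line up with column names and makes no claim about how the library builds its DataFrame.

```python
from typing import Any


def frequency_rows(data: dict[str, list[int]]) -> list[dict[str, Any]]:
    """Turn {date: [count_1, count_2, ...]} into one wide row per date."""
    rows = []
    for date, counts in sorted(data.items()):
        row: dict[str, Any] = {"date": date}
        # Array index 0 becomes period_1, index 1 becomes period_2, and so on.
        for position, users in enumerate(counts, start=1):
            row[f"period_{position}"] = users
        rows.append(row)
    return rows


rows = frequency_rows({"2024-01-01": [100, 50, 25, 10]})
print(rows[0])
```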

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "unit": self.unit,
        "addiction_unit": self.addiction_unit,
        "data": self.data,
    }

mixpanel_data.NumericBucketResult `dataclass` ¶

NumericBucketResult(
    event: str,
    from_date: str,
    to_date: str,
    property_expr: str,
    unit: Literal["hour", "day"],
    series: dict[str, dict[str, int]] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Events segmented into numeric property ranges.

Contains time-series data bucketed by automatically determined numeric ranges, with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

event `instance-attribute` ¶

event: str

Queried event name.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

property_expr `instance-attribute` ¶

property_expr: str

The 'on' expression used for bucketing.

unit `instance-attribute` ¶

unit: Literal['hour', 'day']

Time aggregation unit.

series `class-attribute` `instance-attribute` ¶

series: dict[str, dict[str, int]] = field(default_factory=dict)

Bucket data: {range_string: {date: count}}.

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: date, bucket, count.

Conversion is lazy - computed on first access and cached.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "property_expr": self.property_expr,
        "unit": self.unit,
        "series": self.series,
    }

mixpanel_data.NumericSumResult `dataclass` ¶

NumericSumResult(
    event: str,
    from_date: str,
    to_date: str,
    property_expr: str,
    unit: Literal["hour", "day"],
    results: dict[str, float] = dict(),
    computed_at: str | None = None,
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Sum of numeric property values per time unit.

Contains daily or hourly sum totals for a numeric property with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

event `instance-attribute` ¶

event: str

Queried event name.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

property_expr `instance-attribute` ¶

property_expr: str

The 'on' expression summed.

unit `instance-attribute` ¶

unit: Literal['hour', 'day']

Time aggregation unit.

results `class-attribute` `instance-attribute` ¶

results: dict[str, float] = field(default_factory=dict)

Sum values: {date: sum}.

computed_at `class-attribute` `instance-attribute` ¶

computed_at: str | None = None

Computation timestamp (if provided by API).

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: date, sum.

Conversion is lazy - computed on first access and cached.
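
For quick checks, the results mapping can also be used directly. The sketch below works on a hand-written dict with the documented {date: sum} shape; the values are illustrative, not real query output.

```python
# Illustrative data in the documented {date: sum} shape.
results = {"2024-01-01": 120.5, "2024-01-02": 98.0, "2024-01-03": 143.25}

# Grand total across the queried range.
total = sum(results.values())

# Date with the largest sum.
peak_date = max(results, key=results.get)

print(total)      # 361.75
print(peak_date)  # 2024-01-03
```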

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    result: dict[str, Any] = {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "property_expr": self.property_expr,
        "unit": self.unit,
        "results": self.results,
    }
    if self.computed_at is not None:
        result["computed_at"] = self.computed_at
    return result

mixpanel_data.NumericAverageResult `dataclass` ¶

NumericAverageResult(
    event: str,
    from_date: str,
    to_date: str,
    property_expr: str,
    unit: Literal["hour", "day"],
    results: dict[str, float] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Average of numeric property values per time unit.

Contains daily or hourly average values for a numeric property with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

event `instance-attribute` ¶

event: str

Queried event name.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

property_expr `instance-attribute` ¶

property_expr: str

The 'on' expression averaged.

unit `instance-attribute` ¶

unit: Literal['hour', 'day']

Time aggregation unit.

results `class-attribute` `instance-attribute` ¶

results: dict[str, float] = field(default_factory=dict)

Average values: {date: average}.

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: date, average.

Conversion is lazy - computed on first access and cached.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "property_expr": self.property_expr,
        "unit": self.unit,
        "results": self.results,
    }

Bookmark Types¶

mixpanel_data.BookmarkInfo `dataclass` ¶

BookmarkInfo(
    id: int,
    name: str,
    type: BookmarkType,
    project_id: int,
    created: str,
    modified: str,
    workspace_id: int | None = None,
    dashboard_id: int | None = None,
    description: str | None = None,
    creator_id: int | None = None,
    creator_name: str | None = None,
)

Metadata for a saved report (bookmark) from the Mixpanel Bookmarks API.

Represents a saved Insights, Funnel, Retention, or Flows report that can be queried using query_saved_report() or query_flows().

ATTRIBUTE	DESCRIPTION
`id`	Unique bookmark identifier. TYPE: `int`
`name`	User-defined report name. TYPE: `str`
`type`	Report type (insights, funnels, retention, flows, launch-analysis). TYPE: `BookmarkType`
`project_id`	Parent Mixpanel project ID. TYPE: `int`
`created`	Creation timestamp (ISO format). TYPE: `str`
`modified`	Last modification timestamp (ISO format). TYPE: `str`
`workspace_id`	Optional workspace ID if scoped to a workspace. TYPE: `int \| None`
`dashboard_id`	Optional parent dashboard ID if linked to a dashboard. TYPE: `int \| None`
`description`	Optional user-provided description. TYPE: `str \| None`
`creator_id`	Optional creator's user ID. TYPE: `int \| None`
`creator_name`	Optional creator's display name. TYPE: `str \| None`

id `instance-attribute` ¶

id: int

Unique bookmark identifier.

name `instance-attribute` ¶

name: str

User-defined report name.

type `instance-attribute` ¶

type: BookmarkType

Report type.

project_id `instance-attribute` ¶

project_id: int

Parent Mixpanel project ID.

created `instance-attribute` ¶

created: str

Creation timestamp (ISO format).

modified `instance-attribute` ¶

modified: str

Last modification timestamp (ISO format).

workspace_id `class-attribute` `instance-attribute` ¶

workspace_id: int | None = None

Workspace ID if scoped to a workspace.

dashboard_id `class-attribute` `instance-attribute` ¶

dashboard_id: int | None = None

Parent dashboard ID if linked to a dashboard.

description `class-attribute` `instance-attribute` ¶

description: str | None = None

User-provided description.

creator_id `class-attribute` `instance-attribute` ¶

creator_id: int | None = None

Creator's user ID.

creator_name `class-attribute` `instance-attribute` ¶

creator_name: str | None = None

Creator's display name.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all bookmark metadata fields.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all bookmark metadata fields.
    """
    result: dict[str, Any] = {
        "id": self.id,
        "name": self.name,
        "type": self.type,
        "project_id": self.project_id,
        "created": self.created,
        "modified": self.modified,
    }
    if self.workspace_id is not None:
        result["workspace_id"] = self.workspace_id
    if self.dashboard_id is not None:
        result["dashboard_id"] = self.dashboard_id
    if self.description is not None:
        result["description"] = self.description
    if self.creator_id is not None:
        result["creator_id"] = self.creator_id
    if self.creator_name is not None:
        result["creator_name"] = self.creator_name
    return result

mixpanel_data.SavedReportResult `dataclass` ¶

SavedReportResult(
    bookmark_id: int,
    computed_at: str,
    from_date: str,
    to_date: str,
    headers: list[str] = list(),
    series: dict[str, Any] = dict(),
    _df_cache: DataFrame | None = None,
)

Data from a saved report (Insights, Retention, or Funnel).

Contains data from a pre-configured saved report with automatic report type detection and lazy DataFrame conversion support.

The report_type property detects the report type from headers: "$retention" indicates a retention report, "$funnel" a funnel report, and "$flows" a flows report; anything else is treated as an insights report.

ATTRIBUTE	DESCRIPTION
`bookmark_id`	Saved report identifier. TYPE: `int`
`computed_at`	When report was computed (ISO format). TYPE: `str`
`from_date`	Report start date. TYPE: `str`
`to_date`	Report end date. TYPE: `str`
`headers`	Report column headers (used for type detection). TYPE: `list[str]`
`series`	Report data (structure varies by report type). TYPE: `dict[str, Any]`

bookmark_id `instance-attribute` ¶

bookmark_id: int

Saved report identifier.

computed_at `instance-attribute` ¶

computed_at: str

When report was computed (ISO format).

from_date `instance-attribute` ¶

from_date: str

Report start date.

to_date `instance-attribute` ¶

to_date: str

Report end date.

headers `class-attribute` `instance-attribute` ¶

headers: list[str] = field(default_factory=list)

Report column headers (used for type detection).

series `class-attribute` `instance-attribute` ¶

series: dict[str, Any] = field(default_factory=dict)

Report data (structure varies by report type).

- For Insights reports: {event_name: {date: count}}
- For Retention reports: {series_name: {date: {segment: {first, counts, rates}}}}
- For Funnel reports: {count: {...}, overall_conv_ratio: {...}, ...}

report_type `property` ¶

report_type: SavedReportType

Detect the report type from headers.

RETURNS	DESCRIPTION
`SavedReportType`	'retention' if headers contain '$retention',
`SavedReportType`	'funnel' if headers contain '$funnel',
`SavedReportType`	'flows' if headers contain '$flows',
`SavedReportType`	'insights' otherwise.
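
The detection rules in the table above can be sketched as a small function. `detect_report_type` is a hypothetical name, and the sketch assumes exact membership in the headers list with the rules checked in the order listed; the library's actual matching logic may differ.

```python
def detect_report_type(headers: list[str]) -> str:
    """Classify a saved report from its headers (assumed exact matches)."""
    if "$retention" in headers:
        return "retention"
    if "$funnel" in headers:
        return "funnel"
    if "$flows" in headers:
        return "flows"
    # No marker header found: treat the report as an Insights report.
    return "insights"


print(detect_report_type(["$retention", "$event"]))  # retention
print(detect_report_type(["$event", "$time"]))       # insights
```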

df `property` ¶

df: DataFrame

Convert to DataFrame.

For Insights reports: columns are date, event, count. For Retention/Funnel reports: flattens the nested structure.

Conversion is lazy - computed on first access and cached.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all report fields including detected report_type.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all report fields including detected report_type.
    """
    return {
        "bookmark_id": self.bookmark_id,
        "computed_at": self.computed_at,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "headers": self.headers,
        "series": self.series,
        "report_type": self.report_type,
    }

mixpanel_data.FlowsResult `dataclass` ¶

FlowsResult(
    bookmark_id: int,
    computed_at: str,
    steps: list[dict[str, Any]] = list(),
    breakdowns: list[dict[str, Any]] = list(),
    overall_conversion_rate: float = 0.0,
    metadata: dict[str, Any] = dict(),
    *,
    _df_cache: DataFrame | None = None,
)

Bases: ResultWithDataFrame

Data from a saved Flows report.

Contains user path/navigation data from a pre-configured Flows report with lazy DataFrame conversion support.

Inherits from ResultWithDataFrame to provide:

- Lazy DataFrame caching via _df_cache field
- Normalized table output via to_table_dict() method

ATTRIBUTE	DESCRIPTION
`bookmark_id`	Saved report identifier. TYPE: `int`
`computed_at`	When report was computed (ISO format). TYPE: `str`
`steps`	Flow step data with event sequences and counts. TYPE: `list[dict[str, Any]]`
`breakdowns`	Path breakdown data showing user flow distribution. TYPE: `list[dict[str, Any]]`
`overall_conversion_rate`	End-to-end conversion rate (0.0 to 1.0). TYPE: `float`
`metadata`	Additional API metadata from the response. TYPE: `dict[str, Any]`

bookmark_id `instance-attribute` ¶

bookmark_id: int

Saved report identifier.

computed_at `instance-attribute` ¶

computed_at: str

When report was computed (ISO format).

steps `class-attribute` `instance-attribute` ¶

steps: list[dict[str, Any]] = field(default_factory=list)

Flow step data with event sequences and counts.

breakdowns `class-attribute` `instance-attribute` ¶

breakdowns: list[dict[str, Any]] = field(default_factory=list)

Path breakdown data showing user flow distribution.

overall_conversion_rate `class-attribute` `instance-attribute` ¶

overall_conversion_rate: float = 0.0

End-to-end conversion rate (0.0 to 1.0).

metadata `class-attribute` `instance-attribute` ¶

metadata: dict[str, Any] = field(default_factory=dict)

Additional API metadata from the response.

df `property` ¶

df: DataFrame

Convert steps to DataFrame.

Returns DataFrame with columns derived from step data structure. Conversion is lazy - computed on first access and cached.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all flows report fields.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all flows report fields.
    """
    return {
        "bookmark_id": self.bookmark_id,
        "computed_at": self.computed_at,
        "steps": self.steps,
        "breakdowns": self.breakdowns,
        "overall_conversion_rate": self.overall_conversion_rate,
        "metadata": self.metadata,
    }

JQL Discovery Types¶

mixpanel_data.PropertyDistributionResult `dataclass` ¶

PropertyDistributionResult(
    event: str,
    property_name: str,
    from_date: str,
    to_date: str,
    total_count: int,
    values: tuple[PropertyValueCount, ...],
    _df_cache: DataFrame | None = None,
)

Distribution of values for a property from JQL analysis.

Contains the top N values for a property with their counts and percentages, enabling quick understanding of property value distribution without fetching all data locally.

ATTRIBUTE	DESCRIPTION
`event`	The event type analyzed. TYPE: `str`
`property_name`	The property name analyzed. TYPE: `str`
`from_date`	Query start date (YYYY-MM-DD). TYPE: `str`
`to_date`	Query end date (YYYY-MM-DD). TYPE: `str`
`total_count`	Total number of events with this property defined. TYPE: `int`
`values`	Top values with counts and percentages. TYPE: `tuple[PropertyValueCount, ...]`

event `instance-attribute` ¶

event: str

Event type analyzed.

property_name `instance-attribute` ¶

property_name: str

Property name analyzed.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

total_count `instance-attribute` ¶

total_count: int

Total events with this property defined.

values `instance-attribute` ¶

values: tuple[PropertyValueCount, ...]

Top values with counts and percentages.

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: value, count, percentage.

Conversion is lazy - computed on first access and cached.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with value distribution data.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all distribution data.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all distribution data.
    """
    return {
        "event": self.event,
        "property_name": self.property_name,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "total_count": self.total_count,
        "values": [v.to_dict() for v in self.values],
    }

mixpanel_data.PropertyValueCount `dataclass` ¶

PropertyValueCount(
    value: str | int | float | bool | None, count: int, percentage: float
)

A single value and its count from property distribution analysis.

Represents one row in a property value distribution, showing the value, its occurrence count, and percentage of total.

ATTRIBUTE	DESCRIPTION
`value`	The property value (can be string, number, bool, or None). TYPE: `str \| int \| float \| bool \| None`
`count`	Number of occurrences of this value. TYPE: `int`
`percentage`	Percentage of total events (0.0 to 100.0). TYPE: `float`

value `instance-attribute` ¶

value: str | int | float | bool | None

The property value.

count `instance-attribute` ¶

count: int

Number of occurrences.

percentage `instance-attribute` ¶

percentage: float

Percentage of total (0.0 to 100.0).
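
How count and percentage relate can be illustrated with a stand-in dataclass. `ValueCountSketch` and `build_distribution` are hypothetical names, and the sketch assumes percentage is computed as count divided by the total count, scaled to the 0.0 to 100.0 range; the rounding to two decimals is a choice made here, not documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueCountSketch:
    """Hypothetical stand-in with the documented PropertyValueCount fields."""

    value: object
    count: int
    percentage: float


def build_distribution(
    counts: dict[str, int], total_count: int
) -> tuple[ValueCountSketch, ...]:
    # Rank values by count, highest first, and attach a 0-100 percentage.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return tuple(
        ValueCountSketch(value, count, round(100.0 * count / total_count, 2))
        for value, count in ranked
    )


values = build_distribution({"DE": 250, "US": 600, "FR": 150}, total_count=1000)
for entry in values:
    print(entry.value, entry.count, entry.percentage)
```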

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with value, count, and percentage.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with value, count, and percentage.
    """
    return {
        "value": self.value,
        "count": self.count,
        "percentage": self.percentage,
    }

mixpanel_data.NumericPropertySummaryResult `dataclass` ¶

NumericPropertySummaryResult(
    event: str,
    property_name: str,
    from_date: str,
    to_date: str,
    count: int,
    min: float,
    max: float,
    sum: float,
    avg: float,
    stddev: float,
    percentiles: dict[int, float],
)

Statistical summary of a numeric property from JQL analysis.

Contains min, max, sum, average, standard deviation, and percentiles for a numeric property, enabling understanding of value distributions without fetching all data locally.

ATTRIBUTE	DESCRIPTION
`event`	The event type analyzed. TYPE: `str`
`property_name`	The property name analyzed. TYPE: `str`
`from_date`	Query start date (YYYY-MM-DD). TYPE: `str`
`to_date`	Query end date (YYYY-MM-DD). TYPE: `str`
`count`	Number of events with this property defined. TYPE: `int`
`min`	Minimum value. TYPE: `float`
`max`	Maximum value. TYPE: `float`
`sum`	Sum of all values. TYPE: `float`
`avg`	Average value. TYPE: `float`
`stddev`	Standard deviation. TYPE: `float`
`percentiles`	Percentile values keyed by percentile number. TYPE: `dict[int, float]`

event `instance-attribute` ¶

event: str

Event type analyzed.

property_name `instance-attribute` ¶

property_name: str

Property name analyzed.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

count `instance-attribute` ¶

count: int

Number of events with this property defined.

min `instance-attribute` ¶

min: float

Minimum value.

max `instance-attribute` ¶

max: float

Maximum value.

sum `instance-attribute` ¶

sum: float

Sum of all values.

avg `instance-attribute` ¶

avg: float

Average value.

stddev `instance-attribute` ¶

stddev: float

Standard deviation.

percentiles `instance-attribute` ¶

percentiles: dict[int, float]

Percentile values keyed by percentile number (e.g., {50: 98.0}).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all numeric summary data.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all numeric summary data.
    """
    return {
        "event": self.event,
        "property_name": self.property_name,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "count": self.count,
        "min": self.min,
        "max": self.max,
        "sum": self.sum,
        "avg": self.avg,
        "stddev": self.stddev,
        "percentiles": {str(k): v for k, v in self.percentiles.items()},
    }
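
The `str(k)` conversion above matters on the way back: JSON object keys are always strings, so a consumer that wants integer percentile keys has to convert them after loading. A minimal round-trip sketch on a plain dict, with illustrative percentile values:

```python
import json

# Illustrative percentiles keyed by integer, as in the documented field.
percentiles = {25: 12.0, 50: 98.0, 95: 410.5}

# Same transformation as in the documented to_dict().
serialized = {"percentiles": {str(k): v for k, v in percentiles.items()}}

decoded = json.loads(json.dumps(serialized))

# Keys come back as strings; convert them to recover the original mapping.
restored = {int(k): v for k, v in decoded["percentiles"].items()}

print(sorted(decoded["percentiles"]))  # ['25', '50', '95']
print(restored == percentiles)         # True
```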

mixpanel_data.DailyCountsResult `dataclass` ¶

DailyCountsResult(
    from_date: str,
    to_date: str,
    events: tuple[str, ...] | None,
    counts: tuple[DailyCount, ...],
    _df_cache: DataFrame | None = None,
)

Time-series event counts by day from JQL analysis.

Contains daily event counts for quick activity trend analysis without complex segmentation setup.

ATTRIBUTE	DESCRIPTION
`from_date`	Query start date (YYYY-MM-DD). TYPE: `str`
`to_date`	Query end date (YYYY-MM-DD). TYPE: `str`
`events`	Event types included (None for all events). TYPE: `tuple[str, ...] \| None`
`counts`	Daily counts for each event. TYPE: `tuple[DailyCount, ...]`

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

events `instance-attribute` ¶

events: tuple[str, ...] | None

Event types included (None for all events).

counts `instance-attribute` ¶

counts: tuple[DailyCount, ...]

Daily counts for each event.

df `property` ¶

df: DataFrame

Convert to DataFrame with columns: date, event, count.

Conversion is lazy - computed on first access and cached.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with daily counts data.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all daily counts data.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all daily counts data.
    """
    return {
        "from_date": self.from_date,
        "to_date": self.to_date,
        "events": list(self.events) if self.events else None,
        "counts": [c.to_dict() for c in self.counts],
    }

mixpanel_data.DailyCount `dataclass` ¶

DailyCount(date: str, event: str, count: int)

Event count for a single date from daily counts analysis.

Represents one row in a daily counts result, showing date, event, and count.

ATTRIBUTE	DESCRIPTION
`date`	Date string (YYYY-MM-DD). TYPE: `str`
`event`	Event name. TYPE: `str`
`count`	Number of occurrences on this date. TYPE: `int`

date `instance-attribute` ¶

date: str

Date string (YYYY-MM-DD).

event `instance-attribute` ¶

event: str

Event name.

count `instance-attribute` ¶

count: int

Number of occurrences on this date.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with date, event, and count.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with date, event, and count.
    """
    return {
        "date": self.date,
        "event": self.event,
        "count": self.count,
    }
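
The serialized `counts` list is in long format: one entry per date and event. As a rough sketch, it can be pivoted with the standard library alone. The rows below are hypothetical stand-ins for `DailyCount.to_dict()` output, not real query results.

```python
from collections import defaultdict

# Hypothetical entries shaped like DailyCount.to_dict(); values are made up.
counts = [
    {"date": "2024-01-01", "event": "Login", "count": 10},
    {"date": "2024-01-01", "event": "Purchase", "count": 2},
    {"date": "2024-01-02", "event": "Login", "count": 7},
]

# Pivot to date -> {event: count}.
by_date: dict[str, dict[str, int]] = defaultdict(dict)
for row in counts:
    by_date[row["date"]][row["event"]] = row["count"]

# Total per event across the whole date range.
totals: dict[str, int] = defaultdict(int)
for row in counts:
    totals[row["event"]] += row["count"]
```

For larger results, the `df` property (columns `date`, `event`, `count`) is likely the more convenient starting point.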

mixpanel_data.EngagementDistributionResult `dataclass` ¶

EngagementDistributionResult(
    from_date: str,
    to_date: str,
    events: tuple[str, ...] | None,
    total_users: int,
    buckets: tuple[EngagementBucket, ...],
    _df_cache: DataFrame | None = None,
)

User engagement distribution from JQL analysis.

Shows how many users performed N events, helping understand user engagement patterns without fetching all data locally.

ATTRIBUTE	DESCRIPTION
`from_date`	Query start date (YYYY-MM-DD). TYPE: `str`
`to_date`	Query end date (YYYY-MM-DD). TYPE: `str`
`events`	Event types included (None for all events). TYPE: `tuple[str, ...] \| None`
`total_users`	Total number of distinct users. TYPE: `int`
`buckets`	Engagement buckets with user counts. TYPE: `tuple[EngagementBucket, ...]`

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

events `instance-attribute` ¶

events: tuple[str, ...] | None

Event types included (None for all events).

total_users `instance-attribute` ¶

total_users: int

Total number of distinct users.

buckets `instance-attribute` ¶

buckets: tuple[EngagementBucket, ...]

Engagement buckets with user counts.

df `property` ¶

df: DataFrame

Convert to DataFrame with engagement bucket columns.

Conversion is lazy - computed on first access and cached.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with engagement distribution data.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all engagement distribution data.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all engagement distribution data.
    """
    return {
        "from_date": self.from_date,
        "to_date": self.to_date,
        "events": list(self.events) if self.events else None,
        "total_users": self.total_users,
        "buckets": [b.to_dict() for b in self.buckets],
    }

mixpanel_data.EngagementBucket `dataclass` ¶

EngagementBucket(
    bucket_min: int, bucket_label: str, user_count: int, percentage: float
)

User count in an engagement bucket from engagement analysis.

Represents one bucket in a user engagement distribution, showing how many users performed events in a certain frequency range.

ATTRIBUTE	DESCRIPTION
`bucket_min`	Minimum events in this bucket. TYPE: `int`
`bucket_label`	Human-readable label (e.g., "1", "2-5", "100+"). TYPE: `str`
`user_count`	Number of users in this bucket. TYPE: `int`
`percentage`	Percentage of total users (0.0 to 100.0). TYPE: `float`

bucket_min `instance-attribute` ¶

bucket_min: int

Minimum events in this bucket.

bucket_label `instance-attribute` ¶

bucket_label: str

Human-readable label (e.g., '1', '2-5', '100+').

user_count `instance-attribute` ¶

user_count: int

Number of users in this bucket.

percentage `instance-attribute` ¶

percentage: float

Percentage of total users (0.0 to 100.0).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with bucket data.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with bucket data.
    """
    return {
        "bucket_min": self.bucket_min,
        "bucket_label": self.bucket_label,
        "user_count": self.user_count,
        "percentage": self.percentage,
    }
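
To illustrate how the bucket fields relate, here is a sketch with a local stand-in dataclass that mirrors the documented `EngagementBucket` fields. It assumes `percentage` is derived as `user_count / total_users * 100`; the library's actual computation and rounding are not shown on this page, and the bucket boundaries are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    """Local stand-in mirroring the documented EngagementBucket fields."""

    bucket_min: int
    bucket_label: str
    user_count: int
    percentage: float


def make_buckets(raw: list[tuple[int, str, int]]) -> tuple[Bucket, ...]:
    """Build buckets from (bucket_min, label, user_count) triples.

    Assumption: percentage = user_count / total users * 100, rounded to 2 places.
    """
    total = sum(n for _, _, n in raw)
    return tuple(
        Bucket(lo, label, n, round(100.0 * n / total, 2) if total else 0.0)
        for lo, label, n in raw
    )


# Illustrative data: 100 users split across three buckets.
buckets = make_buckets([(1, "1", 50), (2, "2-5", 30), (6, "6+", 20)])
total_users = sum(b.user_count for b in buckets)
```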

mixpanel_data.PropertyCoverageResult `dataclass` ¶

PropertyCoverageResult(
    event: str,
    from_date: str,
    to_date: str,
    total_events: int,
    coverage: tuple[PropertyCoverage, ...],
    _df_cache: DataFrame | None = None,
)

Property coverage analysis result from JQL.

Shows which properties are consistently populated vs sparse, helping understand data quality before writing queries.

ATTRIBUTE	DESCRIPTION
`event`	The event type analyzed. TYPE: `str`
`from_date`	Query start date (YYYY-MM-DD). TYPE: `str`
`to_date`	Query end date (YYYY-MM-DD). TYPE: `str`
`total_events`	Total number of events analyzed. TYPE: `int`
`coverage`	Coverage statistics for each property. TYPE: `tuple[PropertyCoverage, ...]`

event `instance-attribute` ¶

event: str

Event type analyzed.

from_date `instance-attribute` ¶

from_date: str

Query start date (YYYY-MM-DD).

to_date `instance-attribute` ¶

to_date: str

Query end date (YYYY-MM-DD).

total_events `instance-attribute` ¶

total_events: int

Total number of events analyzed.

coverage `instance-attribute` ¶

coverage: tuple[PropertyCoverage, ...]

Coverage statistics for each property.

df `property` ¶

df: DataFrame

Convert to DataFrame with property coverage columns.

Conversion is lazy - computed on first access and cached.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with property coverage data.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all coverage data.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all coverage data.
    """
    return {
        "event": self.event,
        "from_date": self.from_date,
        "to_date": self.to_date,
        "total_events": self.total_events,
        "coverage": [c.to_dict() for c in self.coverage],
    }

mixpanel_data.PropertyCoverage `dataclass` ¶

PropertyCoverage(
    property: str,
    defined_count: int,
    null_count: int,
    coverage_percentage: float,
)

Coverage statistics for a single property from coverage analysis.

Shows how often a property is defined vs null for a given event type.

ATTRIBUTE	DESCRIPTION
`property`	Property name. TYPE: `str`
`defined_count`	Number of events with this property defined. TYPE: `int`
`null_count`	Number of events with this property null/undefined. TYPE: `int`
`coverage_percentage`	Percentage of events with property defined (0.0 to 100.0). TYPE: `float`

property `instance-attribute` ¶

property: str

Property name.

defined_count `instance-attribute` ¶

defined_count: int

Number of events with this property defined.

null_count `instance-attribute` ¶

null_count: int

Number of events with this property null/undefined.

coverage_percentage `instance-attribute` ¶

coverage_percentage: float

Percentage of events with this property defined (0.0 to 100.0).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with coverage data.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with coverage data.
    """
    return {
        "property": self.property,
        "defined_count": self.defined_count,
        "null_count": self.null_count,
        "coverage_percentage": self.coverage_percentage,
    }
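
A common use of coverage data is to flag sparse properties before writing queries against them. The sketch below works on hypothetical entries shaped like `PropertyCoverage.to_dict()` output; the property names, counts, and the 80% threshold are all made up for the example.

```python
# Hypothetical entries shaped like PropertyCoverage.to_dict(); values are made up.
coverage = [
    {"property": "plan", "defined_count": 950, "null_count": 50, "coverage_percentage": 95.0},
    {"property": "coupon", "defined_count": 120, "null_count": 880, "coverage_percentage": 12.0},
    {"property": "referrer", "defined_count": 600, "null_count": 400, "coverage_percentage": 60.0},
]

THRESHOLD = 80.0  # arbitrary cut-off chosen for this example

# Keep properties below the threshold, sparsest first.
sparse = sorted(
    (c for c in coverage if c["coverage_percentage"] < THRESHOLD),
    key=lambda c: c["coverage_percentage"],
)
sparse_names = [c["property"] for c in sparse]
```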

Introspection Types¶

mixpanel_data.ColumnSummary `dataclass` ¶

ColumnSummary(
    column_name: str,
    column_type: str,
    min: Any,
    max: Any,
    approx_unique: int,
    avg: float | None,
    std: float | None,
    q25: Any,
    q50: Any,
    q75: Any,
    count: int,
    null_percentage: float,
)

Statistical summary of a single column from DuckDB's SUMMARIZE command.

Contains per-column statistics including min/max, quartiles, null percentage, and approximate distinct counts. Numeric columns include additional stats like average and standard deviation.

column_name `instance-attribute` ¶

column_name: str

Name of the column.

column_type `instance-attribute` ¶

column_type: str

DuckDB data type (VARCHAR, TIMESTAMP, INTEGER, JSON, etc.).

min `instance-attribute` ¶

min: Any

Minimum value (type varies by column type).

max `instance-attribute` ¶

max: Any

Maximum value (type varies by column type).

approx_unique `instance-attribute` ¶

approx_unique: int

Approximate count of distinct values (HyperLogLog).

avg `instance-attribute` ¶

avg: float | None

Mean value (None for non-numeric columns).

std `instance-attribute` ¶

std: float | None

Standard deviation (None for non-numeric columns).

q25 `instance-attribute` ¶

q25: Any

25th percentile value (None for non-numeric columns).

q50 `instance-attribute` ¶

q50: Any

Median / 50th percentile (None for non-numeric columns).

q75 `instance-attribute` ¶

q75: Any

75th percentile value (None for non-numeric columns).

count `instance-attribute` ¶

count: int

Number of non-null values.

null_percentage `instance-attribute` ¶

null_percentage: float

Percentage of null values (0.0 to 100.0).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all column statistics.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all column statistics.
    """
    return {
        "column_name": self.column_name,
        "column_type": self.column_type,
        "min": self.min,
        "max": self.max,
        "approx_unique": self.approx_unique,
        "avg": self.avg,
        "std": self.std,
        "q25": self.q25,
        "q50": self.q50,
        "q75": self.q75,
        "count": self.count,
        "null_percentage": self.null_percentage,
    }
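
Because `min`, `max`, and the quartiles are typed `Any` and passed through unchanged, the resulting dict may hold values that `json.dumps` cannot encode on its own. Whether that happens depends on what the library stores in these fields, which this page does not specify. The sketch below assumes, purely for illustration, that a TIMESTAMP column carries `datetime` objects and shows the `default=str` fallback.

```python
import json
from datetime import datetime

# Hypothetical ColumnSummary.to_dict()-style output; the datetime values are an assumption.
summary = {
    "column_name": "event_time",
    "column_type": "TIMESTAMP",
    "min": datetime(2024, 1, 1, 0, 0, 0),
    "max": datetime(2024, 1, 31, 23, 59, 59),
    "avg": None,
    "count": 1000,
}

# json.dumps raises TypeError on datetime objects unless a fallback is supplied.
try:
    json.dumps(summary)
    needs_fallback = False
except TypeError:
    needs_fallback = True

# default=str is called for any value the encoder cannot handle natively.
payload = json.dumps(summary, default=str)
```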

mixpanel_data.SummaryResult `dataclass` ¶

SummaryResult(
    table: str,
    row_count: int,
    columns: list[ColumnSummary] = list(),
    _df_cache: DataFrame | None = None,
)

Statistical summary of all columns in a table.

Contains row count and per-column statistics from DuckDB's SUMMARIZE command. Provides both structured access via the columns list and DataFrame conversion via the df property.

table `instance-attribute` ¶

table: str

Name of the summarized table.

row_count `instance-attribute` ¶

row_count: int

Total number of rows in the table.

columns `class-attribute` `instance-attribute` ¶

columns: list[ColumnSummary] = field(default_factory=list)

Per-column statistics.

df `property` ¶

df: DataFrame

Convert to DataFrame with one row per column.

Conversion is lazy - computed on first access and cached.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with column statistics.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with table name, row count, and column statistics.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with table name, row count, and column statistics.
    """
    return {
        "table": self.table,
        "row_count": self.row_count,
        "columns": [col.to_dict() for col in self.columns],
    }

mixpanel_data.EventStats `dataclass` ¶

EventStats(
    event_name: str,
    count: int,
    unique_users: int,
    first_seen: datetime,
    last_seen: datetime,
    pct_of_total: float,
)

Statistics for a single event type.

Contains count, unique users, date range, and percentage of total for a specific event in an events table.

event_name `instance-attribute` ¶

event_name: str

Name of the event.

count `instance-attribute` ¶

count: int

Total occurrences of this event.

unique_users `instance-attribute` ¶

unique_users: int

Count of distinct users who triggered this event.

first_seen `instance-attribute` ¶

first_seen: datetime

Earliest occurrence timestamp.

last_seen `instance-attribute` ¶

last_seen: datetime

Latest occurrence timestamp.

pct_of_total `instance-attribute` ¶

pct_of_total: float

Percentage of all events (0.0 to 100.0).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with event statistics (datetimes as ISO strings).

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with event statistics (datetimes as ISO strings).
    """
    return {
        "event_name": self.event_name,
        "count": self.count,
        "unique_users": self.unique_users,
        "first_seen": self.first_seen.isoformat(),
        "last_seen": self.last_seen.isoformat(),
        "pct_of_total": self.pct_of_total,
    }
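
`to_dict()` emits `first_seen` and `last_seen` through `datetime.isoformat()`. A consumer reading the JSON back can recover the value with `datetime.fromisoformat()`, as in this small sketch. The timestamp is made up, and the UTC timezone on it is an assumption for the example rather than something this page states for `EventStats`.

```python
from datetime import datetime, timezone

# Illustrative timestamp; timezone-awareness is an assumption for this example.
first_seen = datetime(2024, 1, 1, 8, 30, tzinfo=timezone.utc)

# What to_dict() would place in the "first_seen" field.
iso = first_seen.isoformat()

# Reverse step for a consumer of the serialized output.
parsed = datetime.fromisoformat(iso)
```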

mixpanel_data.EventBreakdownResult `dataclass` ¶

EventBreakdownResult(
    table: str,
    total_events: int,
    total_users: int,
    date_range: tuple[datetime, datetime],
    events: list[EventStats] = list(),
    _df_cache: DataFrame | None = None,
)

Distribution of events in a table.

Contains aggregate statistics and per-event breakdown with counts, unique users, date ranges, and percentages.

table `instance-attribute` ¶

table: str

Name of the analyzed table.

total_events `instance-attribute` ¶

total_events: int

Total number of events in the table.

total_users `instance-attribute` ¶

total_users: int

Total distinct users across all events.

date_range `instance-attribute` ¶

date_range: tuple[datetime, datetime]

(earliest, latest) event timestamps.

events `class-attribute` `instance-attribute` ¶

events: list[EventStats] = field(default_factory=list)

Per-event statistics, ordered by count descending.

df `property` ¶

df: DataFrame

Convert to DataFrame with one row per event type.

Conversion is lazy - computed on first access and cached.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with event statistics.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with table info and event statistics.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with table info and event statistics.
    """
    return {
        "table": self.table,
        "total_events": self.total_events,
        "total_users": self.total_users,
        "date_range": [
            self.date_range[0].isoformat(),
            self.date_range[1].isoformat(),
        ],
        "events": [event.to_dict() for event in self.events],
    }

mixpanel_data.ColumnStatsResult `dataclass` ¶

ColumnStatsResult(
    table: str,
    column: str,
    dtype: str,
    count: int,
    null_count: int,
    null_pct: float,
    unique_count: int,
    unique_pct: float,
    top_values: list[tuple[Any, int]] = list(),
    min: float | None = None,
    max: float | None = None,
    mean: float | None = None,
    std: float | None = None,
    _df_cache: DataFrame | None = None,
)

Deep statistical analysis of a single column.

Provides detailed statistics including null rates, cardinality, top values, and numeric statistics (for numeric columns). Supports JSON path expressions for analyzing properties.

table `instance-attribute` ¶

table: str

Name of the source table.

column `instance-attribute` ¶

column: str

Column expression analyzed (may include JSON path).

dtype `instance-attribute` ¶

dtype: str

DuckDB data type of the column.

count `instance-attribute` ¶

count: int

Number of non-null values.

null_count `instance-attribute` ¶

null_count: int

Number of null values.

null_pct `instance-attribute` ¶

null_pct: float

Percentage of null values (0.0 to 100.0).

unique_count `instance-attribute` ¶

unique_count: int

Approximate count of distinct values.

unique_pct `instance-attribute` ¶

unique_pct: float

Percentage of values that are unique (0.0 to 100.0).

top_values `class-attribute` `instance-attribute` ¶

top_values: list[tuple[Any, int]] = field(default_factory=list)

Most frequent (value, count) pairs.

min `class-attribute` `instance-attribute` ¶

min: float | None = None

Minimum value (None for non-numeric).

max `class-attribute` `instance-attribute` ¶

max: float | None = None

Maximum value (None for non-numeric).

mean `class-attribute` `instance-attribute` ¶

mean: float | None = None

Mean value (None for non-numeric).

std `class-attribute` `instance-attribute` ¶

std: float | None = None

Standard deviation (None for non-numeric).

df `property` ¶

df: DataFrame

Convert top values to DataFrame with columns: value, count.

Conversion is lazy - computed on first access and cached.

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with top values and their counts.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

RETURNS	DESCRIPTION
`dict[str, Any]`	Dictionary with all column statistics.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output.

    Returns:
        Dictionary with all column statistics.
    """
    return {
        "table": self.table,
        "column": self.column,
        "dtype": self.dtype,
        "count": self.count,
        "null_count": self.null_count,
        "null_pct": self.null_pct,
        "unique_count": self.unique_count,
        "unique_pct": self.unique_pct,
        "top_values": [[value, count] for value, count in self.top_values],
        "min": self.min,
        "max": self.max,
        "mean": self.mean,
        "std": self.std,
    }
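
JSON has no tuple type, so `to_dict()` writes each `(value, count)` pair in `top_values` as a two-element list. A minimal sketch of the round trip, with made-up values standing in for a real column's top values:

```python
import json

# Stand-in for the `top_values` attribute; values are illustrative only.
top_values = [("US", 120), ("DE", 45), (None, 3)]

# Same conversion that to_dict() applies above.
as_lists = [[value, count] for value, count in top_values]
payload = json.dumps(as_lists)

# Convert back to tuples after loading if the original shape is needed.
restored_pairs = [tuple(pair) for pair in json.loads(payload)]
```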

Storage Types¶

mixpanel_data.TableMetadata `dataclass` ¶

TableMetadata(
    type: Literal["events", "profiles"],
    fetched_at: datetime,
    from_date: str | None = None,
    to_date: str | None = None,
    filter_events: list[str] | None = None,
    filter_where: str | None = None,
    filter_cohort_id: str | None = None,
    filter_output_properties: list[str] | None = None,
    filter_group_id: str | None = None,
    filter_behaviors: str | None = None,
)

Metadata for a data fetch operation.

This metadata is passed to table creation methods and stored in the database's internal _metadata table for tracking fetch operations.

type `instance-attribute` ¶

type: Literal['events', 'profiles']

Type of data fetched.

fetched_at `instance-attribute` ¶

fetched_at: datetime

When the fetch completed (UTC).

from_date `class-attribute` `instance-attribute` ¶

from_date: str | None = None

Start date for events (YYYY-MM-DD), None for profiles.

to_date `class-attribute` `instance-attribute` ¶

to_date: str | None = None

End date for events (YYYY-MM-DD), None for profiles.

filter_events `class-attribute` `instance-attribute` ¶

filter_events: list[str] | None = None

Event names filtered (if applicable).

filter_where `class-attribute` `instance-attribute` ¶

filter_where: str | None = None

WHERE clause filter (if applicable).

filter_cohort_id `class-attribute` `instance-attribute` ¶

filter_cohort_id: str | None = None

Cohort ID filter for profiles (if applicable).

filter_output_properties `class-attribute` `instance-attribute` ¶

filter_output_properties: list[str] | None = None

Property names to include in output (if applicable).

filter_group_id `class-attribute` `instance-attribute` ¶

filter_group_id: str | None = None

Group ID for group profile queries (if applicable).

filter_behaviors `class-attribute` `instance-attribute` ¶

filter_behaviors: str | None = None

Serialized behaviors filter for behavioral profile queries (if applicable).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "type": self.type,
        "fetched_at": self.fetched_at.isoformat(),
        "from_date": self.from_date,
        "to_date": self.to_date,
        "filter_events": self.filter_events,
        "filter_where": self.filter_where,
        "filter_cohort_id": self.filter_cohort_id,
        "filter_output_properties": self.filter_output_properties,
        "filter_group_id": self.filter_group_id,
        "filter_behaviors": self.filter_behaviors,
    }

mixpanel_data.TableInfo `dataclass` ¶

TableInfo(
    name: str,
    type: Literal["events", "profiles"],
    row_count: int,
    fetched_at: datetime,
)

Information about a table in the database.

Returned by list_tables() to provide summary information about available tables without retrieving full schemas.

name `instance-attribute` ¶

name: str

Table name.

type `instance-attribute` ¶

type: Literal['events', 'profiles']

Table type.

row_count `instance-attribute` ¶

row_count: int

Number of rows.

fetched_at `instance-attribute` ¶

fetched_at: datetime

When data was fetched (UTC).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "name": self.name,
        "type": self.type,
        "row_count": self.row_count,
        "fetched_at": self.fetched_at.isoformat(),
    }

mixpanel_data.ColumnInfo `dataclass` ¶

ColumnInfo(name: str, type: str, nullable: bool, primary_key: bool = False)

Information about a table column.

Describes a single column's schema, including name, type, nullability constraints, and primary key status.

name `instance-attribute` ¶

name: str

Column name.

type `instance-attribute` ¶

type: str

DuckDB type (VARCHAR, TIMESTAMP, JSON, INTEGER, etc.).

nullable `instance-attribute` ¶

nullable: bool

Whether column allows NULL values.

primary_key `class-attribute` `instance-attribute` ¶

primary_key: bool = False

Whether column is a primary key.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "name": self.name,
        "type": self.type,
        "nullable": self.nullable,
        "primary_key": self.primary_key,
    }

mixpanel_data.TableSchema `dataclass` ¶

TableSchema(table_name: str, columns: list[ColumnInfo])

Schema information for a table.

Returned by get_schema() to describe the structure of a table, including all column definitions.

table_name `instance-attribute` ¶

table_name: str

Table name.

columns `instance-attribute` ¶

columns: list[ColumnInfo]

Column definitions.

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "table_name": self.table_name,
        "columns": [col.to_dict() for col in self.columns],
    }
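
Container types serialize by delegating to each child's `to_dict()`. The sketch below reproduces that pattern with local stand-in dataclasses that mirror the documented `ColumnInfo` and `TableSchema` fields. The class names `Column` and `Schema` and the sample columns are hypothetical; nothing here imports the library itself.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Column:
    """Local stand-in mirroring the documented ColumnInfo fields."""

    name: str
    type: str
    nullable: bool
    primary_key: bool = False

    def to_dict(self) -> dict[str, Any]:
        return {
            "name": self.name,
            "type": self.type,
            "nullable": self.nullable,
            "primary_key": self.primary_key,
        }


@dataclass(frozen=True)
class Schema:
    """Local stand-in mirroring the documented TableSchema fields."""

    table_name: str
    columns: list[Column]

    def to_dict(self) -> dict[str, Any]:
        # Delegate to each column, as in the source shown above.
        return {
            "table_name": self.table_name,
            "columns": [col.to_dict() for col in self.columns],
        }


# Illustrative schema; the column names are made up.
schema = Schema(
    "events",
    [Column("insert_id", "VARCHAR", False, True), Column("properties", "JSON", True)],
)
schema_dict = schema.to_dict()
```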

mixpanel_data.WorkspaceInfo `dataclass` ¶

WorkspaceInfo(
    path: Path | None,
    project_id: str,
    region: str,
    account: str | None,
    tables: list[str],
    size_mb: float,
    created_at: datetime | None,
)

Information about a Workspace instance.

Returned by Workspace.info() to provide metadata about the workspace including database location, connection details, and table summary.

path `instance-attribute` ¶

path: Path | None

Database file path (None for ephemeral or in-memory workspaces).

project_id `instance-attribute` ¶

project_id: str

Mixpanel project ID.

region `instance-attribute` ¶

region: str

Data residency region (us, eu, in).

account `instance-attribute` ¶

account: str | None

Named account used (None if credentials from environment).

tables `instance-attribute` ¶

tables: list[str]

Names of tables in the database.

size_mb `instance-attribute` ¶

size_mb: float

Database file size in megabytes (0.0 for in-memory workspaces).

created_at `instance-attribute` ¶

created_at: datetime | None

When database was created (None if unknown).

to_dict ¶

to_dict() -> dict[str, Any]

Serialize for JSON output.

Source code in src/mixpanel_data/types.py

def to_dict(self) -> dict[str, Any]:
    """Serialize for JSON output."""
    return {
        "path": str(self.path) if self.path else None,
        "project_id": self.project_id,
        "region": self.region,
        "account": self.account,
        "tables": self.tables,
        "size_mb": self.size_mb,
        "created_at": self.created_at.isoformat() if self.created_at else None,
    }
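
The optional `path` and `created_at` fields serialize to `None` when unset, so consumers should expect `null` for in-memory workspaces. A minimal sketch of the same conditional logic in isolation; the helper name `serialize_optional` and the sample path are hypothetical.

```python
from datetime import datetime
from pathlib import Path
from typing import Any, Optional


def serialize_optional(path: Optional[Path], created_at: Optional[datetime]) -> dict[str, Any]:
    """Mirror the None handling used for `path` and `created_at` above."""
    return {
        "path": str(path) if path else None,
        "created_at": created_at.isoformat() if created_at else None,
    }


# A persistent workspace (hypothetical path) versus an in-memory one.
persistent = serialize_optional(Path("data") / "mixpanel.db", datetime(2024, 1, 1, 12, 0, 0))
in_memory = serialize_optional(None, None)
```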
