Architecture

mixpanel_data follows a layered architecture with clear separation of concerns.

Layer Diagram

┌─────────────────────────────────────────────────────────────┐
│                      CLI Layer (Typer)                      │
│         Argument parsing, output formatting, progress       │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                   Public API Layer                          │
│   Workspace · Account/Session · accounts/session/targets    │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                     Service Layer                           │
│            DiscoveryService, LiveQueryService               │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                  Infrastructure Layer                       │
│            ConfigManager, MixpanelAPIClient                 │
└─────────────────────────────────────────────────────────────┘

Components

Workspace (Facade)

The Workspace class is the unified entry point that coordinates all services:

  • Session resolution: Three independent axes (account, project, workspace) resolved via env > param > target > bridge > [active] > default_project. A single resolver lives in _internal/auth/resolver.py, with no silent cross-axis fallback.
  • In-session switching: Workspace.use(account=, project=, workspace=, target=) returns self for chaining and preserves the underlying httpx.Client and per-account /me cache (O(1) per swap).
  • Service orchestration: Creates and manages service instances
  • Entity CRUD: Direct App API access for dashboards, reports, cohorts (workspace-scoped) and feature flags, experiments (project-scoped)
  • Data governance: Schema registry, enforcement, auditing, volume anomalies, event deletion requests, Lexicon definitions, drop filters, custom properties, custom events, and lookup tables
  • Resource management: Context manager support for cleanup
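
The resolution order in the first bullet can be pictured as a first-match lookup over an ordered list of sources. The following is a minimal sketch under stated assumptions, not the code in _internal/auth/resolver.py: the function name resolve_axis, the candidates mapping, and the LookupError are all hypothetical, and the sketch treats every source as applicable to the axis being resolved.

```python
from typing import Mapping, Optional, Sequence, Tuple

# Source names mirror the documented priority order; the tuple itself is illustrative.
PRIORITY: Sequence[str] = (
    "env", "param", "target", "bridge", "active", "default_project",
)


def resolve_axis(candidates: Mapping[str, Optional[str]]) -> Tuple[str, str]:
    """Return (source, value) for the highest-priority source that supplies a value."""
    for source in PRIORITY:
        value = candidates.get(source)
        if value:
            return source, value
    # Assumption: an unresolved axis is an error, never a silent fallback
    # to a value taken from a different axis.
    raise LookupError("axis could not be resolved from any source")


# An explicit parameter outranks the persisted [active] value.
resolved = resolve_axis({"param": "3018488", "active": "1111111"})
print(resolved)
```

Keeping the order in one tuple means a single place defines precedence, which is the property the single-resolver design is aiming for.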

Services

DiscoveryService

Schema introspection with session-scoped caching:

  • list_events(): All event names (cached)
  • list_properties(event): Properties for an event (cached per event)
  • list_property_values(property, event): Sample values (cached)
  • list_funnels(): Saved funnels (cached)
  • list_cohorts(): Saved cohorts (cached)
  • list_top_events(): Today's top events (NOT cached, real-time)
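
Session-scoped, per-event caching of the kind listed above can be sketched with a plain dictionary that lives as long as the service object. This is an illustration only: CachedDiscovery and fake_fetch are hypothetical names, and the real DiscoveryService may key and store its cache differently.

```python
from typing import Callable


class CachedDiscovery:
    """Hypothetical stand-in showing a cache that lasts for one session."""

    def __init__(self, fetch: Callable[[str], list[str]]) -> None:
        self._fetch = fetch                      # function that performs the API call
        self._cache: dict[str, list[str]] = {}   # keyed by event name

    def list_properties(self, event: str) -> list[str]:
        # The first call for an event fetches; later calls reuse the stored result.
        if event not in self._cache:
            self._cache[event] = self._fetch(event)
        return self._cache[event]


calls: list[str] = []


def fake_fetch(event: str) -> list[str]:
    calls.append(event)  # record each simulated network round trip
    return [f"{event}.prop_a", f"{event}.prop_b"]


discovery = CachedDiscovery(fake_fetch)
discovery.list_properties("Sign Up")
discovery.list_properties("Sign Up")  # served from the cache
print(len(calls))  # 1
```

A real-time method such as list_top_events() would simply bypass the dictionary and call the fetcher every time.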

LiveQueryService

Executes live analytics queries against the Mixpanel Query API:

  • Segmentation, funnels, retention, JQL
  • Event counts, property counts
  • Activity feed, saved reports, flows, frequency
  • Numeric aggregations (bucket, sum, average)

Infrastructure

ConfigManager

TOML-based account management at ~/.mp/config.toml (single schema; legacy v1/v2 files do not load):

  • Account CRUD over [accounts.NAME] blocks
  • Target CRUD over [targets.NAME] blocks
  • Active-session read/write over the [active] block (account + optional workspace)
  • Atomic writes via temp-file + rename
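
The temp-file + rename pattern in the last bullet can be sketched with the standard library alone. This is a generic illustration rather than ConfigManager's actual code: atomic_write_text is a hypothetical helper, and the TOML keys in the demo are assumptions. The demo writes into a throwaway directory instead of ~/.mp.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, text: str) -> None:
    """Replace path so readers see the old or the new content, never a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file sits in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)     # atomic rename over the destination on POSIX
    except BaseException:
        # Remove the partial temp file, then re-raise the original error.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


with tempfile.TemporaryDirectory() as demo_dir:
    config_path = Path(demo_dir) / "config.toml"
    atomic_write_text(config_path, '[active]\naccount = "team"\n')
    written = config_path.read_text(encoding="utf-8")

print(written)
```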

MixpanelAPIClient

HTTP client with Mixpanel-specific features:

  • Service account authentication
  • Regional endpoint routing (US, EU, India)
  • Automatic rate limit handling with exponential backoff
  • Streaming JSONL parsing for large exports
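
Rate limit handling with exponential backoff, as named in the list above, usually means doubling the wait after each rejected attempt up to a cap. The sketch below shows that idea in isolation. RateLimited, with_backoff, and the default delays are hypothetical and are not taken from MixpanelAPIClient; the sleep function is injectable so the demo runs without actually waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error standing in for an HTTP 429 response."""


def with_backoff(
    call: Callable[[], T],
    *,
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Retry call() on RateLimited, doubling the delay after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error to the caller
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")


attempts = {"count": 0}
delays: list[float] = []


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] <= 2:  # fail twice, then succeed
        raise RateLimited()
    return "ok"


result = with_backoff(flaky, sleep=delays.append)
print(result, delays)  # ok [1.0, 2.0]
```

Production clients often add random jitter to each delay and honor a Retry-After header when the server sends one; both are omitted here to keep the sketch deterministic.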

Three-Axis Hierarchy

The 0.4.0 redesign organizes auth around three independent axes:

  • Account: who is authenticating. Three first-class types managed through one surface: service_account (Basic Auth), oauth_browser (PKCE flow, tokens auto-refreshed), oauth_token (static bearer for CI/agents).
  • Project: which Mixpanel project the calls run against. Lives on the active account as default_project; can be overridden per call.
  • Workspace: which workspace inside the project. Optional; lazy-resolves to the project's default workspace on first workspace-scoped call.

Persisted (account, project, optional workspace) bundles are called targets and act as named cursor positions. For example, mp target add ecom --account team --project 3018488 saves a target, and mp target use ecom then switches to it.

Data Paths

Live Query Path

User Request → Workspace → LiveQueryService → MixpanelAPIClient → Mixpanel API
                                                      ↓
                                              Typed Result (e.g., SegmentationResult)

Best for:

  • Real-time data needs
  • One-off analysis
  • Pre-computed Mixpanel reports

Streaming Path

User Request → Workspace → MixpanelAPIClient → Mixpanel Export API
                                    ↓
                          Iterator[dict] (no storage)
                                    ↓
                          Process each record inline

Best for:

  • ETL pipelines to external systems
  • One-time processing without storage
  • Memory-constrained environments
  • Unix pipeline integration (CLI --stdout)

Key Design Decisions

Streaming Data Access

The API client returns iterators, so large datasets can be processed record by record without loading everything into memory.
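
A generator over JSONL lines is one common way to get this behavior. The sketch below is an assumption about the general technique, not the client's actual parser: iter_jsonl is a hypothetical helper, and an in-memory buffer stands in for the HTTP response body.

```python
import io
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line, holding only one line at a time."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Stand-in for a streamed export body; the record shape is illustrative.
body = io.StringIO(
    '{"event": "Sign Up", "properties": {"plan": "free"}}\n'
    "\n"
    '{"event": "Purchase", "properties": {"plan": "pro"}}\n'
)

names = [record["event"] for record in iter_jsonl(body)]
print(names)  # ['Sign Up', 'Purchase']
```

With httpx, the lines argument could come from response.iter_lines() inside a client.stream(...) block, so records are parsed as bytes arrive rather than after the full download.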

Immutable Session

A Session (account + project + optional workspace) is resolved once at Workspace construction and is not mutated afterwards; Workspace.use() swaps in a new Session atomically. The httpx.Client and per-account /me cache are preserved across swaps, so cross-project iteration costs O(1) per swap.
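
One way to picture this design is a frozen dataclass that is replaced wholesale on each switch while long-lived resources stay attached to the facade. SessionSketch and WorkspaceSketch below are hypothetical stand-ins, not the library's Session or Workspace classes, and a plain object takes the place of the shared httpx.Client.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SessionSketch:
    account: str
    project: str
    workspace: Optional[str] = None


class WorkspaceSketch:
    def __init__(self, session: SessionSketch) -> None:
        self._session = session
        self.http = object()  # stand-in for a long-lived HTTP client

    @property
    def session(self) -> SessionSketch:
        return self._session

    def use(
        self,
        *,
        account: Optional[str] = None,
        project: Optional[str] = None,
        workspace: Optional[str] = None,
    ) -> "WorkspaceSketch":
        requested = {"account": account, "project": project, "workspace": workspace}
        changes = {name: value for name, value in requested.items() if value is not None}
        # Build a new frozen Session and swap it in with one reference assignment.
        self._session = replace(self._session, **changes)
        return self  # returning self allows chaining


ws = WorkspaceSketch(SessionSketch(account="team", project="3018488"))
http_before = ws.http
ws.use(project="42").use(workspace="7")
print(ws.session)
```

Because the swap only rebinds one attribute, its cost does not depend on how many projects are visited, and the HTTP client object is the same before and after.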

Dependency Injection

All services accept their dependencies as constructor arguments. This enables:

  • Easy testing with mocks
  • Flexible composition
  • Clear dependency relationships
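
Constructor injection of this kind can be sketched as follows. The names EventsClient, DiscoverySketch, and FakeClient are hypothetical, and the sorting step is an assumption added only to give the service some behavior worth testing.

```python
from typing import Protocol


class EventsClient(Protocol):
    """The only capability this service needs from its dependency."""

    def get_events(self) -> list[str]: ...


class DiscoverySketch:
    def __init__(self, client: EventsClient) -> None:
        self._client = client  # dependency handed in, never constructed here

    def list_events(self) -> list[str]:
        return sorted(self._client.get_events())


class FakeClient:
    """Test double: no network access, fixed answer."""

    def get_events(self) -> list[str]:
        return ["Sign Up", "Purchase"]


service = DiscoverySketch(FakeClient())
print(service.list_events())  # ['Purchase', 'Sign Up']
```

Swapping FakeClient for a real API client requires no change to the service, which is what makes the dependency relationships explicit and the tests cheap.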

Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Language | Python 3.10+ | Type hints, modern syntax |
| CLI Framework | Typer | Declarative CLI building |
| Output Formatting | Rich | Tables, progress bars, colors |
| Validation | Pydantic | Data validation, settings |
| HTTP Client | httpx | Async-capable HTTP |

Package Structure

src/mixpanel_data/
├── __init__.py              # Public exports (Workspace, Account, Session, namespaces, exceptions, types)
├── workspace.py             # Workspace facade with Workspace.use()
├── auth_types.py            # Public auth surface (Account union, Session, Region, OAuthTokens, BridgeFile, ...)
├── accounts.py              # mp.accounts namespace (add/list/use/login/test/...)
├── session.py               # mp.session namespace (show/use)
├── targets.py               # mp.targets namespace (saved cursors)
├── exceptions.py            # Exception hierarchy (incl. AccountInUseError, WorkspaceScopeError)
├── types.py                 # Result dataclasses (SegmentationResult, AccountSummary, Target, ...)
├── py.typed                 # PEP 561 marker
├── _internal/
│   ├── config.py            # ConfigManager (single TOML schema)
│   ├── api_client.py        # MixpanelAPIClient
│   ├── me.py                # MeService + per-account MeCache
│   ├── pagination.py        # Cursor-based App API pagination
│   ├── auth/
│   │   ├── account.py       # Account variants (ServiceAccount/OAuthBrowserAccount/OAuthTokenAccount)
│   │   ├── session.py       # Session, Project, WorkspaceRef, ActiveSession
│   │   ├── resolver.py      # env > param > target > bridge > [active] resolver
│   │   ├── token_resolver.py # OnDiskTokenResolver
│   │   ├── token.py         # OAuthTokens, OAuthClientInfo
│   │   ├── flow.py          # OAuth PKCE browser flow
│   │   ├── bridge.py        # Cowork bridge file v2
│   │   ├── storage.py       # account_dir, ensure_account_dir (atomic 0o600 writes)
│   │   ├── pkce.py          # PKCE challenge generation (RFC 7636)
│   │   ├── callback_server.py # Local HTTP callback server
│   │   └── client_registration.py # Dynamic Client Registration (RFC 7591)
│   └── services/
│       ├── discovery.py     # DiscoveryService
│       └── live_query.py    # LiveQueryService
└── cli/
    ├── main.py              # Typer app + global flags (-a / -p / -w / -t)
    ├── commands/            # account / project / workspace / target / session + query / inspect / ...
    ├── formatters.py        # Output formatters
    └── utils.py             # CLI utilities