diff --git a/Change-V04.0.0%3A-Proposal.md b/Change-V04.0.0%3A-Proposal.md deleted file mode 100644 index 58157cf..0000000 --- a/Change-V04.0.0%3A-Proposal.md +++ /dev/null @@ -1,638 +0,0 @@ -# MCP Data Platform — Architecture Reference - -*Plugin taxonomy, server responsibilities, and interaction patterns for Leo's data marketplace* - ---- - -## Overview - -Two plugins serving distinct domains, designed for independent or combined use. - -| Plugin | Servers | Domain | -|--------|---------|--------| -| **data-platform** | pandas-mcp, postgres-mcp, dbt-mcp | Ingestion, storage, transformation | -| **viz-platform** | dmc-mcp, dash-mcp | Component validation, dashboards, theming | - -**Key principles:** -- MCP servers are independent processes—they don't import each other -- Claude orchestrates cross-server data flow at runtime -- Plugins ship multiple servers; projects load only what they need -- Claude.md defines project-specific workflows spanning plugins - ---- - -## Component Definitions - -| Component Type | Definition | Runtime Context | -|----------------|------------|-----------------| -| **MCP Server** | Standalone service exposing tools via Model Context Protocol. One server = one domain responsibility. | Long-running process, spawned by Claude Desktop/Code | -| **Tool** | Single callable function within an MCP server. Atomic operation with defined input schema and output. | Invoked per-request by LLM | -| **Resource** | Read-only data exposed by MCP server (files, schemas, configs). Discoverable but not executable. | Static or cached | -| **Agent** | Orchestration layer that chains multiple tool calls across servers. Lives in Claude's reasoning, not in MCP servers. | LLM-driven, multi-step | -| **Command** | User-facing shortcut (e.g., `/ingest`) that triggers predefined tool sequences. | Chat interface trigger | - ---- - -## Plugin: data-platform - -### Server Loading - -Single plugin ships all three servers. 
Which servers load is determined by project config—not environment variables. - -| Server | Default | Optional | -|--------|---------|----------| -| pandas-mcp | ✓ | — | -| postgres-mcp | ✓ | — | -| dbt-mcp | — | ✓ | - -**Example project configs:** - -```yaml -# Web app project (no dbt) -mcp_servers: - - pandas-mcp - - postgres-mcp -``` - -```yaml -# Data engineering project (full stack) -mcp_servers: - - pandas-mcp - - postgres-mcp - - dbt-mcp -``` - -Agents check server availability at runtime. If dbt-mcp isn't loaded, dbt-related steps are skipped or surface "not available for this project." - ---- - -### Server: pandas-mcp (Data Shaping Layer) - -**Responsibility:** File ingestion, data profiling, schema inference, and utility shaping operations. - -**Philosophy:** SQL-first for persistent transforms (use dbt). Pandas for: -- Pre-database ingestion (profiling, validation, schema inference) -- Visualization prep (reshaping query results for chart formats) -- Ad-hoc operations (prototyping, merging with local files) - -#### Tool Categories - -| Category | Tools | Description | -|----------|-------|-------------| -| Ingestion | `read_file`, `write_file`, `detect_encoding` | File I/O with format auto-detection | -| Profiling | `profile`, `validate`, `sample` | Data quality assessment | -| Schema | `infer_schema` | Generate DDL from data structure | -| Shaping | `reshape`, `pivot`, `melt`, `merge`, `add_columns`, `filter_rows` | Transform any data reference | - -#### Data Reference Sources - -pandas-mcp accepts `data_ref` from multiple origins: - -| Source | How It Arrives | -|--------|----------------| -| Local file | `read_file` tool | -| Query result | Passed from postgres-mcp | -| dbt model output | Passed from dbt-mcp | -| Previous transform | Chained from shaping tool | - -#### When to Use Shaping Tools - -| Scenario | Use pandas-mcp | Use SQL/dbt | -|----------|----------------|-------------| -| Pivot for heatmap chart | ✓ | — | -| Join query result with 
local CSV | ✓ | — | -| Prototype transform before formalizing | ✓ | — | -| Persistent aggregation in pipeline | — | ✓ | -| Reusable business logic | — | ✓ | -| Needs version control + testing | — | ✓ | - ---- - -### Server: postgres-mcp (Database Layer) - -**Responsibility:** Data loading, querying, schema management, performance analysis, and geospatial operations. - -#### Tool Categories - -| Category | Tools | Description | -|----------|-------|-------------| -| Query | `list_schemas`, `list_tables`, `get_table_schema`, `execute_query`, `query_geometry` | Read operations | -| Analysis | `explain_query`, `recommend_indexes`, `health_check` | Performance insights | -| Write | `execute_write`, `load_dataframe` | Data modification | -| DDL | `execute_ddl`, `get_schema_snapshot` | Schema management with change tracking | - -#### DDL Change Tracking - -`execute_ddl` returns structured output for downstream automation: - -```json -{ - "success": true, - "operation": "CREATE TABLE", - "affected_objects": [ - { - "type": "table", - "schema": "public", - "name": "customer_orders", - "change": "created" - } - ], - "timestamp": "2025-01-22T14:30:00Z" -} -``` - -This enables documentation updates, ERD regeneration (via Mermaid Chart MCP), or other automated responses. - ---- - -### Server: dbt-mcp (Transform Layer) - -**Responsibility:** Model execution, lineage, documentation, and YAML generation for local dbt-core projects. - -**Note:** Official dbt-mcp is Cloud-only. This server wraps local dbt-core CLI. 
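A minimal sketch of what wrapping the local dbt-core CLI could look like, assuming the `dbt` executable is on `PATH`. The helper names `build_dbt_command` and `run_model_cli` and the shape of the returned dict are hypothetical and not part of the proposal; `--select` and `--project-dir` are standard dbt-core flags.

```python
import shutil
import subprocess

# Actions this hypothetical wrapper is willing to forward to the dbt CLI.
ALLOWED_ACTIONS = {"run", "test", "compile"}


def build_dbt_command(action: str, model: str, project_dir: str = ".") -> list[str]:
    """Build the argument list for one dbt-core invocation."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported dbt action: {action}")
    return ["dbt", action, "--select", model, "--project-dir", project_dir]


def run_model_cli(model: str, project_dir: str = ".") -> dict:
    """Run one model through the local CLI and return a structured result."""
    if shutil.which("dbt") is None:
        # Graceful failure instead of an exception when dbt is not installed.
        return {"success": False, "error": "dbt executable not found on PATH"}
    proc = subprocess.run(
        build_dbt_command("run", model, project_dir),
        capture_output=True,
        text=True,
        check=False,
    )
    return {"success": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
```

Building the argument list separately from executing it keeps the command construction testable without a dbt project on disk.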
- -#### Tool Categories - -| Category | Tools | Description | -|----------|-------|-------------| -| Discovery | `parse_manifest`, `list_models`, `list_sources` | Project exploration | -| Model | `get_model`, `get_lineage`, `compile_sql` | Model inspection | -| Execution | `run_model`, `test_model`, `get_run_results` | dbt CLI wrapper | -| Documentation | `generate_yaml` | Auto-generate schema.yml | - -#### Lineage Output - -`get_lineage` outputs Mermaid-formatted DAG, compatible with existing Mermaid Chart MCP for rendering. - ---- - -### Internal Dependency Flow (data-platform) - -``` -files → pandas-mcp → postgres-mcp ↔ dbt-mcp - ↑______________| - (query results for reshaping) -``` - -| Flow | Description | -|------|-------------| -| files → pandas | Entry point for raw data | -| pandas → postgres | Schema inference, bulk loading | -| postgres ↔ dbt | dbt queries marts, postgres executes | -| postgres → pandas | Query results for reshaping | -| dbt → pandas | Model outputs for visualization prep | - ---- - -### Agents (data-platform) - -| Agent | Trigger | Sequence | -|-------|---------|----------| -| `data_ingestion` | User provides file | read_file → profile → infer_schema → execute_ddl → load_dataframe → validate | -| `model_analysis` | User asks about dbt model | get_model → get_lineage → explain_query → test_model → synthesize | -| `full_pipeline` | File to materialized model | data_ingestion → create dbt model → run_model | - -**Behavior when dbt-mcp absent:** - -| Agent | Behavior | -|-------|----------| -| `data_ingestion` | Runs fully (no dbt steps) | -| `model_analysis` | Skipped—surfaces "dbt not configured" | -| `full_pipeline` | Stops after load, prompts user | - ---- - -### Commands (data-platform) - -| Command | Maps To | -|---------|---------| -| `/ingest {file}` | `data_ingestion` agent | -| `/profile {file}` | `pandas-mcp.profile` | -| `/pivot {data} by {cols}` | `pandas-mcp.pivot` | -| `/merge {left} {right} on {key}` | `pandas-mcp.merge` | 
-| `/explain {query}` | `postgres-mcp.explain_query` | -| `/schema {table}` | `postgres-mcp.get_table_schema` | -| `/lineage {model}` | `dbt-mcp.get_lineage` | -| `/run {model}` | `dbt-mcp.run_model` | -| `/test {model}` | `dbt-mcp.test_model` | - -dbt commands return graceful "dbt-mcp not loaded" when unavailable. - ---- - -## Plugin: viz-platform - -### Servers - -| Server | Responsibility | -|--------|----------------| -| dmc-mcp | Version-locked component registry, prop validation | -| dash-mcp | Charts, layouts, pages, theming—validates against dmc-mcp | - ---- - -### Server: dmc-mcp (Component Constraint Layer) - -**Responsibility:** Single source of truth for Dash Mantine Components API. Prevents Claude from hallucinating deprecated props or non-existent components. - -**Problem solved:** DMC versions introduce breaking changes. Claude's training data mixes versions. Runtime errors from invalid props waste cycles. - -#### Tool Categories - -| Category | Tools | Description | -|----------|-------|-------------| -| Discovery | `list_components` | What exists in installed version | -| Introspection | `get_component_props` | Valid props, types, defaults | -| Validation | `validate_component` | Check component definition before use | - -#### Usage Pattern - -Claude queries dmc-mcp first: -1. "What props does `dmc.Select` accept?" → `get_component_props` -2. Build component with valid props -3. Pass to dash-mcp for rendering - -dash-mcp validates against dmc-mcp before rendering. Invalid components fail fast with actionable errors. - ---- - -### Server: dash-mcp (Visualization Layer) - -**Responsibility:** Chart generation, dashboard layouts, page structure, theming system, and export. - -**Philosophy:** Single server, multiple concerns. Tools are namespaced but share context (theme tokens flow to charts automatically). 
- -#### Tool Categories - -| Category | Tools | Description | -|----------|-------|-------------| -| `chart_*` | `chart_create`, `chart_configure_interaction` | Data visualization (Plotly) | -| `layout_*` | `layout_create`, `layout_add_filter`, `layout_set_grid` | Dashboard composition | -| `page_*` | `page_create`, `page_add_navbar`, `page_set_auth` | App-level structure | -| `theme_*` | `theme_create`, `theme_extend`, `theme_validate`, `theme_export_css` | Design tokens, component styles | - -#### Design Token Structure - -Themes are built from design tokens—single source of truth for visual consistency: - -```yaml -tokens: - colors: - primary: "#228be6" - secondary: "#868e96" - background: - base: "#ffffff" - subtle: "#f8f9fa" - text: - primary: "#212529" - muted: "#868e96" - - spacing: - xs: "4px" - sm: "8px" - md: "16px" - lg: "24px" - - typography: - fontFamily: "Inter, sans-serif" - fontSize: - sm: "14px" - md: "16px" - - radii: - sm: "4px" - md: "8px" -``` - -#### Component Style Registry - -Per-component overrides ensuring consistency: - -| Component | Registered Style | Purpose | -|-----------|------------------|---------| -| `kpi_card` | Shadow, padding, border-radius | All KPIs look identical | -| `data_table` | Header bg, row hover, border | Tables share appearance | -| `filter_panel` | Background, spacing, alignment | Filters positioned consistently | -| `chart_card` | Title typography, padding | Chart containers unified | - ---- - -### Internal Dependency Flow (viz-platform) - -``` -dmc-mcp ← dash-mcp - ↑ | - └──────────┘ - (validation before render) -``` - -dash-mcp always validates component definitions against dmc-mcp. No direct data dependency—data comes from external sources. 
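The validate-before-render rule above can be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation: `REGISTRY`, `validate_component`, and `render` are hypothetical names, and the prop sets are placeholders rather than an authoritative listing of the Dash Mantine Components API.

```python
# Hypothetical registry snapshot. A real dmc-mcp would derive this from the
# installed dash-mantine-components version instead of hard-coding it.
REGISTRY = {
    "Select": {"data", "value", "label", "placeholder"},
    "Button": {"children", "variant", "color"},
}


def validate_component(name: str, props: dict) -> list[str]:
    """Return a list of problems; an empty list means the definition is valid."""
    if name not in REGISTRY:
        return [f"unknown component: {name}"]
    unknown = sorted(set(props) - REGISTRY[name])
    return [f"{name}: unknown prop '{p}'" for p in unknown]


def render(name: str, props: dict) -> dict:
    """Fail fast on invalid definitions, otherwise return a render request."""
    errors = validate_component(name, props)
    if errors:
        raise ValueError("; ".join(errors))
    return {"component": name, "props": props}
```

Returning a list of messages rather than a boolean is one way to make the failure actionable: the caller sees which prop was rejected.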
- ---- - -### Agents (viz-platform) - -| Agent | Trigger | Sequence | -|-------|---------|----------| -| `theme_setup` | New project or brand consistency | list_themes → create_theme → register_component_style → validate_theme | -| `layout_builder` | User wants dashboard structure | create_layout → add_filter → apply_theme → preview | -| `component_check` | Before rendering any DMC component | get_component_props → validate_component → proceed or error | - ---- - -### Commands (viz-platform) - -| Command | Maps To | -|---------|---------| -| `/chart {type}` | `dash-mcp.chart_create` (expects data input) | -| `/dashboard {template}` | `layout_builder` agent | -| `/theme {name}` | `dash-mcp.theme_apply` | -| `/theme new {name}` | `dash-mcp.theme_create` | -| `/theme css {name}` | `dash-mcp.theme_export_css` | -| `/component {name}` | `dmc-mcp.get_component_props` | - ---- - -## Cross-Plugin Interactions - -### How It Works - -MCP servers don't call each other. Claude orchestrates: - -1. Server A returns output to Claude -2. Claude interprets and determines next step -3. 
Claude passes relevant data to Server B - -### Documentation Layers - -| Layer | Location | Purpose | -|-------|----------|---------| -| Plugin docs | Each plugin's README.md | Declares inputs/outputs | -| Claude.md | Project root | Cross-plugin agents for this project | -| contract-validator | Separate plugin | Validates compatibility | -| doc-guardian | Separate plugin | Catches drift within each project | - -### Interface Contracts - -Each plugin declares what it produces and accepts: - -**data-platform outputs:** -- `data_ref`: In-memory DataFrame reference -- `query_result`: Row set from postgres-mcp -- `model_output`: Materialized table reference from dbt-mcp -- `schema_snapshot`: Full schema state for documentation - -**viz-platform inputs:** -- Accepts `data_ref`, `query_result`, or `model_output` as data source -- Validates all DMC components against dmc-mcp before rendering - -### Cross-Plugin Agents (defined in Claude.md) - -| Agent | Trigger | Sequence | -|-------|---------|----------| -| `dashboard_builder` | User requests visualization of database content | postgres-mcp.execute_query → pandas-mcp.pivot (if needed) → dmc-mcp.validate → dash-mcp.chart_create → dash-mcp.layout_create | -| `visualization_prep` | Query result needs reshaping | postgres-mcp.execute_query → pandas-mcp.reshape → dash-mcp.chart_create | - -### Validation: contract-validator - -Separate plugin for cross-plugin validation. See **Plugin: contract-validator** section for full specification. - -**Key distinction from doc-guardian:** -- doc-guardian: "did code change break docs?" (within a project) -- contract-validator: "do plugins work together?" (across plugins) - ---- - -## Plugin: contract-validator - -### Purpose - -Validates cross-plugin compatibility and Claude.md agent definitions. Ensures plugins can actually work together before runtime failures occur. - -**Problem solved:** Plugins declare interfaces in README. Claude.md references tools across plugins. 
Without validation: -- Agents reference tools that don't exist -- viz-platform expects input format data-platform doesn't produce -- Plugin updates break workflows silently - ---- - -### What It Reads - -| Source | Purpose | -|--------|---------| -| Plugin README.md | Extract declared inputs/outputs | -| Claude.md | Extract agent definitions and tool references | -| MCP server schemas | Verify tools actually exist with expected signatures | - ---- - -### Tool Categories - -| Category | Tools | Description | -|----------|-------|-------------| -| Parse | `parse_plugin_interface`, `parse_claude_md_agents` | Extract structured data from docs | -| Validate | `validate_compatibility`, `validate_agent_refs`, `validate_data_flow` | Check contracts match | -| Report | `generate_compatibility_report`, `list_issues` | Output findings | - -#### Tool Details - -**`parse_plugin_interface`** -- Input: Plugin path or README content -- Output: Structured interface (inputs accepted, outputs produced, tool names) - -**`parse_claude_md_agents`** -- Input: Claude.md path or content -- Output: List of agents with their tool sequences - -**`validate_compatibility`** -- Input: Two plugin interfaces -- Output: Compatibility report (what A produces that B accepts, gaps) - -**`validate_agent_refs`** -- Input: Agent definition, list of available plugins -- Output: Missing tools, invalid sequences - -**`validate_data_flow`** -- Input: Agent sequence -- Output: Verification that each step's output matches next step's expected input - ---- - -### Agents (contract-validator) - -| Agent | Trigger | Sequence | -|-------|---------|----------| -| `full_validation` | User runs `/validate-contracts` | parse all plugin interfaces → parse Claude.md → validate_compatibility for each pair → validate_agent_refs for each agent → generate_compatibility_report | -| `agent_check` | User runs `/check-agent {name}` | parse_claude_md_agents → find agent → validate_agent_refs → validate_data_flow → report issues | 
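As an illustration of the `validate_agent_refs` check described above, the sketch below assumes agent steps are written as `server.tool` strings and that the set of tools per loaded server is already known. The function signature, the `available` mapping, and the issue dict layout are assumptions made for this example.

```python
def validate_agent_refs(sequence: list[str], available: dict[str, set[str]]) -> list[dict]:
    """Check each 'server.tool' step against the tools the loaded servers expose."""
    issues = []
    for step in sequence:
        # Split only on the first dot so tool names may contain further dots.
        server, _, tool = step.partition(".")
        if server not in available:
            issues.append({"step": step, "severity": "warning",
                           "issue": f"server '{server}' is not loaded"})
        elif tool not in available[server]:
            issues.append({"step": step, "severity": "error",
                           "issue": f"tool '{tool}' not found on '{server}'"})
    return issues


# Hypothetical inputs: one step names a tool that does not exist.
available = {
    "postgres-mcp": {"execute_query"},
    "pandas-mcp": {"reshape", "pivot"},
    "dash-mcp": {"chart_create"},
}
sequence = ["postgres-mcp.execute_query", "pandas-mcp.transform", "dash-mcp.chart_create"]
issues = validate_agent_refs(sequence, available)
```

Here `issues` holds a single error entry for `pandas-mcp.transform`, since only `reshape` and `pivot` are registered for that server.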
- ---- - -### Commands - -| Command | Maps To | Description | -|---------|---------|-------------| -| `/validate-contracts` | `full_validation` agent | Full project validation | -| `/check-agent {name}` | `agent_check` agent | Validate single agent definition | -| `/list-interfaces` | `parse_plugin_interface` for all plugins | Show what each plugin produces/accepts | - ---- - -### Output Format - -**Compatibility Report:** - -``` -## Contract Validation Report - -### Plugin Interfaces -- data-platform: produces [data_ref, query_result, model_output, schema_snapshot] -- viz-platform: accepts [data_ref, query_result, model_output] - -### Compatibility Matrix -| Producer → Consumer | Status | Notes | -|---------------------|--------|-------| -| data-platform → viz-platform | ✓ Compatible | All accepted inputs are produced | - -### Agent Validation -| Agent | Status | Issues | -|-------|--------|--------| -| dashboard_builder | ✓ Valid | — | -| model_analysis | ⚠ Warning | dbt-mcp optional; agent fails if not loaded | - -### Issues Found -- None - -### Warnings -- Agent `model_analysis` depends on optional server `dbt-mcp` -``` - -**Issue Types:** - -| Type | Severity | Example | -|------|----------|---------| -| Missing tool | Error | Agent references `pandas-mcp.transform` but tool is `pandas-mcp.reshape` | -| Interface mismatch | Error | viz-platform expects `chart_data` but data-platform produces `data_ref` | -| Optional dependency | Warning | Agent uses dbt-mcp which may not be loaded | -| Undeclared output | Warning | Plugin produces output not listed in README | - ---- - -### Integration with doc-guardian - -**Separation of concerns:** - -| Plugin | Scope | Trigger | -|--------|-------|---------| -| doc-guardian | Code ↔ docs drift within a project | PostToolUse (Write/Edit) | -| contract-validator | Plugin ↔ plugin compatibility | On-demand or CI hook | - -contract-validator does NOT watch for file changes. It runs on demand or as a CI step.
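For the CI case, one possible reduction of a report to a pass/fail signal is sketched below. The severity mapping mirrors the Issue Types table; the function name `ci_exit_code` and the choice to treat unrecognized issue types as errors are assumptions, not something the proposal specifies.

```python
# Severity per issue type, copied from the Issue Types table.
SEVERITY_BY_TYPE = {
    "Missing tool": "Error",
    "Interface mismatch": "Error",
    "Optional dependency": "Warning",
    "Undeclared output": "Warning",
}


def ci_exit_code(issue_types: list[str]) -> int:
    """Return 1 if any issue is an Error, else 0.

    Unknown issue types are treated as errors so that a new, unclassified
    finding cannot pass CI silently (an assumption of this sketch).
    """
    severities = [SEVERITY_BY_TYPE.get(t, "Error") for t in issue_types]
    return 1 if "Error" in severities else 0
```

Under this scheme a report containing only the `model_analysis` optional-dependency warning would pass, while any missing tool or interface mismatch would fail the build.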
- -**Potential future integration:** doc-guardian could trigger contract-validator when Claude.md or plugin README changes. Not required for v1. - ---- - -## Diagramming Approach - -No diagram-mcp server. Use existing Mermaid Chart MCP. - -**For ERDs:** -- postgres-mcp exposes schema metadata via `get_schema_snapshot` -- Claude generates Mermaid syntax -- Mermaid Chart MCP renders - -**For dbt lineage:** -- dbt-mcp.get_lineage outputs Mermaid-formatted DAG -- Mermaid Chart MCP renders - -This avoids the complexity of draw.io XML generation while maintaining documentation capability. - ---- - -## Implementation Order - -| Phase | Plugin | Server | Rationale | -|-------|--------|--------|-----------| -| 1 | data-platform | pandas-mcp | Entry point, no dependencies | -| 2 | data-platform | postgres-mcp | Load from Phase 1, query capabilities | -| 3 | data-platform | dbt-mcp | Transform layer, requires postgres-mcp | -| 4 | viz-platform | dmc-mcp | Constraint layer, no dependencies | -| 5 | viz-platform | dash-mcp | Visualization, validates against dmc-mcp | -| 6 | contract-validator | — | Validates all above, requires stable interfaces | - -**Notes:** -- Phases 1-3 (data-platform) and 4-5 (viz-platform) can proceed in parallel -- contract-validator (Phase 6) should wait until plugin interfaces stabilize -- doc-guardian already exists; update scope documentation only - ---- - -## Open Questions - -### Data Reference Passing - -How do servers share `data_ref` objects? Options: -- **Temporary files with URIs**: Portable but I/O overhead -- **Arrow IPC**: Efficient but requires both servers to support -- **Recommendation**: Arrow IPC for efficiency, file fallback for compatibility - -### Authentication - -Should postgres-mcp handle connection strings directly, or use a secrets manager pattern? - -### Theme Storage - -Where do custom themes persist? 
-- Local config file (`~/.dash-mcp/themes/`) -- Project-level (alongside dbt_project.yml) -- Database table (for shared team themes) - -### dbt Project Discovery - -Auto-detect `dbt_project.yml` in common locations, or require explicit path? - ---- - -## Technology Stack - -| Layer | Technology | Notes | -|-------|------------|-------| -| MCP Framework | FastMCP | Or manual MCP SDK | -| Python | 3.11+ | Type hints, async support | -| Data Processing | pandas | Core DataFrame ops | -| Arrow | pyarrow | Parquet, efficient memory | -| Database | psycopg | Async-ready Postgres driver | -| Geospatial | geoalchemy2 | PostGIS integration | -| dbt | dbt-core | CLI wrapper | -| Visualization | plotly | Figure generation | -| UI Components | dash-mantine-components | Version-locked via dmc-mcp | - ---- - -## Summary - -### Core Plugins - -| Plugin | Servers/Scope | Key Characteristic | -|--------|---------------|-------------------| -| data-platform | pandas-mcp, postgres-mcp, dbt-mcp | Optional server loading per project | -| viz-platform | dmc-mcp, dash-mcp | dmc-mcp validates before dash-mcp renders | -| contract-validator | Interface parsing, compatibility checks | Validates cross-plugin contracts and agent definitions | - -### Supporting Plugins (Existing) - -| Plugin | Purpose | -|--------|---------| -| doc-guardian | Code-to-docs drift (unchanged scope) | -| Mermaid Chart MCP | Diagram rendering | - -### Interaction Model - -``` -Plugin READMEs → declare inputs/outputs -Claude.md → define cross-plugin agents -contract-validator → validate compatibility -doc-guardian → catch drift within projects -``` - -**Flow:** Plugins declare interfaces. Claude.md defines workflows. contract-validator enforces compatibility. doc-guardian handles internal drift.